How to listen to Playwright page.on("request") or page.on("response") events in scrapy-playwright

I am using scrapy-playwright. How do I listen to page events the way page.on("...") works in plain Playwright?

def start_requests(self):
    # GET request
    yield scrapy.Request(
        url,
        meta={
            "playwright": True,
            "playwright_include_page": True,
        },
    )

def parse(self, response):
    print("parse")
    page = response.meta["playwright_page"]
    print(page, "printing")

What I hope to achieve is to capture a POST request: the base URL sends multiple background requests to the server, and one of them returns a JSON response. I am only interested in that JSON response.

EDIT

2023-06-24 12:39:35 [scrapy-playwright] DEBUG: [Context=default] Request: <POST https://turo.com/api/bulk-quotes/v2> (resource type: fetch, referrer: https://turo.com/gb/en/search?country=US&defaultZoomLevel=11&delivery=false&deliveryLocationType=city&endDate=06%2F30%2F2023&endTime=10%3A00&isMapSearch=false&itemsPerPage=200&latitude=37.7749295&location=San%20Francisco%2C%20CA&locationType=CITY&longitude=-122.41941550000001&pickupType=ALL&placeId=ChIJIQBpAG2ahYAR_6128GcTUEo&region=CA&sortType=RELEVANCE&startDate=06%2F25%2F2023&startTime=10%3A00&useDefaultMaximumDistance=true)
2023-06-24 12:39:35 [scrapy-playwright] DEBUG: [Context=default] Response: <200 https://turo.com/api/bulk-quotes/v2> (referrer: None)

I can see the request and response in the terminal. How do I capture the response and pass it to parse for further processing?

There is 1 answer

Martins On

How to access those was literally in the scrapy-playwright documentation.

All I needed to do was add playwright_page_event_handlers to the request meta:

"playwright_page_event_handlers": {
    "response": "handle_response",
},

So start_requests now looks like this, with handle_response defined as a method on the spider:

def start_requests(self):
    yield scrapy.Request(
        url,
        meta={
            "playwright": True,
            "playwright_include_page": True,
            "playwright_page_event_handlers": {
                "response": "handle_response",
            },
        },
        callback=self.parse,
    )

async def handle_response(self, response):
    # Called for every response the Playwright page receives; needs `import logging`.
    logging.info(f"Received response with URL {response.url}")
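To keep only the JSON from the POST request in the question, the handler can filter on URL and method before reading the body. This is a minimal sketch, assuming the response object behaves like Playwright's async Response (`url`, `status`, `request.method`, and an awaitable `json()`). The names `collect_json`, `TARGET_PATH`, and `captured` are hypothetical and not part of scrapy-playwright.

```python
import asyncio

# Path taken from the log output in the question; adjust for other sites.
TARGET_PATH = "/api/bulk-quotes/v2"


async def collect_json(response, captured, target=TARGET_PATH):
    """Append the decoded JSON body of matching POST responses to `captured`.

    `response` is assumed to expose `.url`, `.status`, `.request.method`
    and an awaitable `.json()`, as Playwright's async Response does.
    """
    # Ignore every background request except the one we care about.
    if target not in response.url:
        return
    if response.request.method != "POST" or response.status != 200:
        return
    try:
        captured.append(await response.json())
    except Exception:
        # Body was not valid JSON or could not be read; skip it.
        pass
```

Inside the spider, handle_response could then call `await collect_json(response, self.captured)`, where `self.captured` is a list created on the spider. The handler may fire after parse has started, so parse may need to wait for the list to be filled before reading it.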
