I am looking to return an embedded response from a website. The site makes it very difficult to reach this embedded response without JavaScript, so I am hoping to use Splash. I am not interested in the rendered HTML, only in one embedded response. Below is a screenshot of the exact response that I am looking to get back from Splash.
This response returns a JSON object that the site then renders. I would like the raw JSON from this response. How do I do this in Lua?
Turns out this is a bit tricky. The following is the kludge I have found to do this:
Splash call with a Lua script, called from Scrapy:
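A minimal sketch of such a call, not the exact script from the original answer. The wait time, the timeout, the helper names `build_splash_args` and `fetch_har_text`, and the Splash address `http://localhost:8050` are all assumptions for illustration. It posts the Lua script straight to Splash's `execute` endpoint with the standard library so that it runs without Scrapy installed.

```python
import json
import urllib.request

# Lua script run by Splash. Enabling both body flags asks Splash to record
# request and response bodies in the HAR it returns.
LUA_SOURCE = """
function main(splash, args)
  splash.request_body_enabled = true
  splash.response_body_enabled = true
  assert(splash:go(args.url))
  assert(splash:wait(args.wait))
  return {har = splash:har()}
end
"""


def build_splash_args(url, wait=5.0, timeout=60):
    """Build the arguments sent to Splash's `execute` endpoint.

    `wait` and `timeout` are illustrative values; tune them for the target page.
    """
    return {"lua_source": LUA_SOURCE, "url": url, "wait": wait, "timeout": timeout}


def fetch_har_text(url, splash_url="http://localhost:8050"):
    """POST the script to a running Splash instance and return the raw JSON text.

    `splash_url` is an assumed local address, not something from the question.
    """
    payload = json.dumps(build_splash_args(url)).encode("utf-8")
    request = urllib.request.Request(
        splash_url + "/execute",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

From a Scrapy spider using scrapy-splash, the same script would typically be passed as `args={'lua_source': LUA_SOURCE, 'wait': 5.0}` to `SplashRequest(url, callback, endpoint='execute', args=...)`, which fills in `args.url` itself.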
This script works by returning the HAR file of the whole page load. The key is to set
splash.request_body_enabled = true
and
splash.response_body_enabled = true
to get the actual response content in the HAR file.
The HAR file is just a glorified JSON object with a different name, so:
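As a rough sketch, the text Splash sends back can be parsed with the standard `json` module. This assumes the Lua script returned a table with a `har` key (for example `return {har = splash:har()}`); the helper names `har_entries` and `requested_urls` are hypothetical.

```python
import json


def har_entries(response_text):
    """Parse the text returned by Splash and return the list of HAR entries."""
    data = json.loads(response_text)
    # Assumption: the Lua script returned {har = splash:har()}.
    # A HAR document keeps one entry per network request under log.entries.
    return data["har"]["log"]["entries"]


def requested_urls(entries):
    """List the URL of every request recorded during the page load."""
    return [entry["request"]["url"] for entry in entries]
```

Printing `requested_urls(har_entries(response.text))` in the Scrapy callback is a quick way to see which entry holds the embedded response.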
From there you can search the JSON object for the exact embedded response.
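One way to do that search, as a sketch under assumptions: match the entry by a fragment of its request URL, then decode the body. In the HAR format the body sits in `response.content.text` and is base64-encoded when `response.content.encoding` is `"base64"`; the code handles both cases. `find_json_response` and the URL fragment are hypothetical names.

```python
import base64
import json


def find_json_response(entries, url_fragment):
    """Return the parsed JSON body of the first entry whose URL matches.

    Returns None when no matching entry with a body is found.
    """
    for entry in entries:
        if url_fragment not in entry["request"]["url"]:
            continue
        content = entry["response"]["content"]
        text = content.get("text")
        if not text:
            # Body was not recorded for this entry; keep looking.
            continue
        if content.get("encoding") == "base64":
            text = base64.b64decode(text).decode("utf-8")
        return json.loads(text)
    return None
```

Usage would look like `find_json_response(entries, "/api/data")`, where `entries` is the `log.entries` list from the HAR and the fragment is whatever uniquely identifies the embedded request.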
I really don't think this is a very efficient method, but it works.