How to use Scrapy to crawl AngularJS websites?


I need a way to get ALL the odds for ALL the events from a bookmaker's site.

I am using Scrapy+Splash to get the initial JavaScript-rendered content of the site. But to get all the other odds, I have to click "Spagna->LigaSpagnola", "Italia->Serie A", etc.

How can I do that?


1 answer

Answer by Adrien Blanquer (accepted):

You can emulate behaviors, like a scroll or a click, by writing a Lua script (optionally embedding JavaScript) and telling Splash to execute that script when it renders your page.

A small example:

You define a JavaScript function that locates an element in the page, and then have Splash click on its coordinates:

(source: Splash documentation)

_script = """
-- Get the button element's dimensions with JavaScript and perform a mouse click.
function main(splash)
    assert(splash:go(splash.args.url))
    local get_dimensions = splash:jsfunc([[
        function () {
            var rect = document.getElementById('button').getClientRects()[0];
            return {"x": rect.left, "y": rect.top}
        }
    ]])
    -- Render the full page so the element is laid out before measuring it.
    splash:set_viewport_full()
    splash:wait(0.1)
    local dimensions = get_dimensions()
    -- Click at the coordinates returned by the JavaScript function.
    splash:mouse_click(dimensions.x, dimensions.y)

    -- Wait split second to allow event to propagate.
    splash:wait(0.1)
    return splash:html()
end
"""

Then, when you make the request, you set the endpoint to "execute" and add "lua_source": _script to the args.

from scrapy_splash import SplashRequest

def parse(self, response):
    yield SplashRequest(response.url, self.parse_elem,
                        endpoint="execute",
                        args={"lua_source": _script})

You will find all the information about Splash scripting in the Splash scripting documentation.
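Since scrolling was mentioned above as well, here is a sketch of how it can be emulated the same way, using splash:runjs to run plain browser JavaScript inside the page; the wait durations are guesses that you should tune for your site:

scroll_script = """
-- Scroll to the bottom of the page, then return the rendered HTML.
function main(splash)
    assert(splash:go(splash.args.url))
    splash:wait(0.5)
    -- window.scrollTo is plain browser JavaScript executed in the page.
    splash:runjs("window.scrollTo(0, document.body.scrollHeight);")
    -- Give the page time to load any content triggered by the scroll.
    splash:wait(1.0)
    return splash:html()
end
"""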