Use a specific Scrapy downloader middleware per request


I use Crawlera as an IP-rotating service to crawl a specific website that bans my IP quickly, but I have this problem with only one website out of a dozen.

Since it is possible to register multiple middlewares in a Scrapy project, I wanted to know whether it is possible to choose the downloader middleware to use PER REQUEST.

That way I would spend my Crawlera quota only on the problematic website and not on all my requests.


1 Answer

Answered by Georgiy

One possible solution is to use the spider's custom_settings attribute and remove CrawleraMiddleware from the project-wide settings
(this assumes you have one spider per website and currently have CrawleraMiddleware enabled in the project settings):

class ProblemSpider(scrapy.Spider):

    custom_settings = {
        'DOWNLOADER_MIDDLEWARES': {'scrapy_crawlera.CrawleraMiddleware': 610},
        'CRAWLERA_ENABLED': True,
        'CRAWLERA_APIKEY': '<API key>',
    }

    def parse(self, response):
        ...

In this case, CrawleraMiddleware will be used only by spiders that define it in their custom_settings attribute.
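If you want finer control than one-spider-per-site, the scrapy-crawlera middleware also appears to support a per-request opt-out via the `dont_proxy` request meta key (check the scrapy-crawlera docs for your version). The routing decision can be sketched in plain Python without a Scrapy dependency; `Request` below is a stand-in for `scrapy.Request`, and `problematic_domains` is a hypothetical set of sites that ban you quickly:

```python
class Request:
    """Minimal stand-in for scrapy.Request: just a URL and a meta dict."""
    def __init__(self, url, meta=None):
        self.url = url
        self.meta = meta or {}


def should_use_crawlera(request, problematic_domains):
    """Route through Crawlera only for the sites that ban us quickly.

    Mirrors the idea of a per-request switch: an explicit
    ``dont_proxy`` meta key always wins, otherwise we proxy
    only requests to the problematic domains.
    """
    if request.meta.get("dont_proxy"):
        return False
    return any(domain in request.url for domain in problematic_domains)


problematic = {"banning-site.example"}  # hypothetical problem site

r1 = Request("https://banning-site.example/page/1")
r2 = Request("https://friendly-site.example/page/1")
r3 = Request("https://banning-site.example/login", meta={"dont_proxy": True})

print(should_use_crawlera(r1, problematic))  # True  -> goes through Crawlera
print(should_use_crawlera(r2, problematic))  # False -> direct request
print(should_use_crawlera(r3, problematic))  # False -> explicitly opted out
```

With the real middleware enabled project-wide, the equivalent move is to set `meta={'dont_proxy': True}` on the requests you do not want to spend quota on.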