How does Scrapy write to the log while running a spider?


While running a Scrapy spider, I see log messages at the DEBUG level such as: 1. DEBUG: Crawled (200) <GET http://www.example.com> (referer: None) 2. DEBUG: Scraped from <200 http://www.example.com>

I want to know: 1. What do "Crawled" and "Scraped from" mean? 2. Where do those two URLs come from (i.e., while scraping a page, which variable/argument holds those URLs)?

Frank Martin (best answer)

Let me try to explain, based on the Scrapy sample code shown on the Scrapy website. I saved this in a file called scrapy_example.py.

from scrapy import Spider, Item, Field

class Post(Item):
    # a scraped item with a single field: the post title
    title = Field()

class BlogSpider(Spider):
    # the spider's name and the list of URLs it starts crawling from
    name, start_urls = 'blogspider', ['http://blog.scrapinghub.com']

    def parse(self, response):
        # called once per downloaded response; returns one Post item
        # for the text of each <h2><a> element on the page
        return [Post(title=e.extract()) for e in response.css("h2 a::text")]

Executing this with the command scrapy runspider scrapy_example.py produces output like the following:

(...)
DEBUG: Crawled (200) <GET http://blog.scrapinghub.com> (referer: None) ['partial']
DEBUG: Scraped from <200 http://blog.scrapinghub.com>
    {'title': u'Using git to manage vacations in a large distributed\xa0team'}
DEBUG: Scraped from <200 http://blog.scrapinghub.com>
    {'title': u'Gender Inequality Across Programming\xa0Languages'}
(...)

Crawled means: Scrapy has downloaded that webpage.

Scraped means: Scrapy has extracted some data from that webpage.
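
Both lines are logged at DEBUG level, which is why they carry the "DEBUG:" prefix you asked about. If you drive the spider from a script instead of the command line, you can control whether they appear via Scrapy's LOG_LEVEL setting. A minimal sketch, assuming the BlogSpider above lives in scrapy_example.py:

from scrapy.crawler import CrawlerProcess
from scrapy_example import BlogSpider

# LOG_LEVEL is a standard Scrapy setting; 'DEBUG' shows the
# "Crawled"/"Scraped from" lines, while 'INFO' suppresses them.
process = CrawlerProcess(settings={'LOG_LEVEL': 'DEBUG'})
process.crawl(BlogSpider)
process.start()  # blocks until the crawl finishes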

The URL is given in the script in the start_urls attribute.

Your output must have been generated by running a spider. Search for the file where that spider is defined and you should be able to spot the place where the URL is set.
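
At runtime you can also see which object carries those URLs: inside parse, the response object holds the URL that was just crawled. A minimal sketch (the spider name urlecho is just a placeholder; response.url, response.status, and self.logger are standard Scrapy APIs):

from scrapy import Spider

class UrlEchoSpider(Spider):
    name = 'urlecho'
    start_urls = ['http://blog.scrapinghub.com']

    def parse(self, response):
        # response.url is the URL shown in the "Crawled (...)" log line,
        # and response.status is the 200 shown in parentheses
        self.logger.info('crawled %s with status %d', response.url, response.status)
        # each item yielded here produces one "Scraped from <...>" line
        yield {'url': response.url}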