AttributeError: 'module' object has no attribute 'Scraper'


I am using Python 2.7 to scrape and import articles from the NYT. This has worked before with no problems, both for a single article and for several at once, but now I get the error AttributeError: 'module' object has no attribute 'Scraper'.

I am using the newspaper package, and it has worked well until this error. It works for some URLs and not for others, even though all of the URLs are valid. Any ideas on a solution?

Here is my code:

import pandas as pd
import newspaper
from newspaper import Article

url3 = 'http://www.nytimes.com/2010/08/04/nyregion/04shooting.html'
url4 = 'http://www.nytimes.com/2010/08/04/nyregion/04gunman.html'
url5 = 'http://www.nytimes.com/2010/08/05/nyregion/05shooting.html'
url6 = 'http://www.nytimes.com/2010/08/05/nyregion/05vics.html'
urls = [url3, url4, url5, url6]
Nyt_HBC = pd.DataFrame()
for i in urls:
    a = Article(i, language='en')
    a.download()
    a.parse()
    Nyt_HBC = Nyt_HBC.append([[a.title, a.text]], ignore_index=True)
Nyt_HBC.columns = ['Title', 'Article']
Nyt_HBC

Here is my full error message (quick note: the code does not work without .parse()):

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-47-12545a6e9854> in <module>()
      9     a=Article(i, language='en')
     10     a.download()
---> 11     a.parse()
     12     Nyt_HBC= Nyt_HBC.append([[a.title, a.text]], ignore_index=True)
     13 Nyt_HBC.columns=['Title','Article']

/Users/ThomasPLapinger/anaconda/lib/python2.7/site-packages/newspaper/article.pyc in parse(self)
    226 
    227         if self.config.fetch_images:
--> 228             self.fetch_images()
    229 
    230         self.is_parsed = True

/Users/ThomasPLapinger/anaconda/lib/python2.7/site-packages/newspaper/article.pyc in fetch_images(self)
    245             first_img = self.extractor.get_first_img_url(
    246                 self.url, self.clean_top_node)
--> 247             self.set_top_img(first_img)
    248 
    249         if not self.has_top_image():

/Users/ThomasPLapinger/anaconda/lib/python2.7/site-packages/newspaper/article.pyc in set_top_img(self, src_url)
    399     def set_top_img(self, src_url):
    400         if src_url is not None:
--> 401             s = images.Scraper(self)
    402             if s.satisfies_requirements(src_url):
    403                 self.set_top_img_no_check(src_url)

AttributeError: 'module' object has no attribute 'Scraper'
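
One way to narrow this down is to check what the `images` module in the traceback actually contains at runtime. The sketch below is only a diagnostic, not a fix. It uses the standard library alone, and `inspect_module_attr` is a hypothetical helper name. The module path `newspaper.images` is an assumption inferred from the `images.Scraper(self)` call in `article.pyc`. If `has_attr` comes back False, comparing the reported file path and public names against the expected package contents may show whether a different or incomplete module is being picked up.

```python
import importlib


def inspect_module_attr(module_name, attr_name):
    """Report where a module was loaded from and whether it defines attr_name.

    Hypothetical helper for diagnosis only; it does not change any state.
    """
    module = importlib.import_module(module_name)
    return {
        # Built-in modules may not have a __file__, hence the default.
        "file": getattr(module, "__file__", None),
        "has_attr": hasattr(module, attr_name),
        "public_names": sorted(
            name for name in dir(module) if not name.startswith("_")
        ),
    }


if __name__ == "__main__":
    # Assumption: the module behind `images` in the traceback is newspaper.images.
    try:
        info = inspect_module_attr("newspaper.images", "Scraper")
    except ImportError as exc:
        print("could not import newspaper.images: %s" % exc)
    else:
        print(info["file"])
        print(info["has_attr"])
        print(info["public_names"])
```

Separately, the traceback shows that parse() only reaches set_top_img when self.config.fetch_images is true, so turning that setting off might sidestep the error, assuming the installed version allows the config to be overridden. That would hide the problem rather than explain it.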