Python Goose not able to extract mashable / usatoday / politicalwire articles

Question

582 views Asked by Pratik Poddar At 28 January 2014 at 05:51

I am using the Python Goose extractor, and it is failing for every article on mashable.com and usatoday.com: the extracted text comes back empty. Can someone suggest a fix for the problem?

For usatoday.com article:

from goose import Goose

g = Goose()
article = g.extract(url='http://www.usatoday.com/story/tech/columnist/talkingtech/2014/01/25/namm-2014---ik-multimedias-rings-to-make-music/4863193/')
# This assertion passes, meaning no text was extracted
assert article.cleaned_text == ''

For mashable article:

g = Goose()
article = g.extract(url='http://mashable.com/2014/01/26/square-cofounder-jim-mckelvey/')
assert article.cleaned_text == ''

For politicalwire article:

g = Goose()
article = g.extract(url='http://politicalwire.com/archives/2014/01/27/some_republicans_go_off_script_in_sotu_response.html')
assert article.cleaned_text == ''

I assume these are fairly important websites for text extraction. Can someone please suggest a fix? Thanks.

Original Q&A

There is 1 answer

**Ankush Shah** · Answer 1 · 2014-06-07T16:39:09+00:00

The latest version of Goose from here is able to extract text from usatoday.com and mashable.com.
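
One way to check this after upgrading is to re-run the URLs from the question and list the ones that still come back empty. The following is only a minimal sketch: it assumes the `extract(url=...)` call and the `cleaned_text` attribute used in the question, and `find_empty_extractions` is a hypothetical helper name, not part of Goose. The extractor is passed in as a callable so the helper does not depend on a particular Goose version.

```python
def find_empty_extractions(extract, urls):
    """Return the URLs for which extraction yields no text.

    `extract` is any callable that takes a URL and returns an object
    with a `cleaned_text` attribute (assumed, as in the question).
    """
    empty = []
    for url in urls:
        try:
            text = extract(url).cleaned_text
        except Exception:
            # Treat network or parsing errors the same as an empty result.
            text = ''
        # Whitespace-only output counts as empty too.
        if not text or not text.strip():
            empty.append(url)
    return empty
```

With Goose installed, it could be called as `find_empty_extractions(lambda u: Goose().extract(url=u), urls)`, where `urls` holds the three article URLs from the question. An empty list back would suggest that the upgrade fixed extraction for all of them.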
