What is wrong with the bs4 documentation? I can't run unwrap() sample code

483 views Asked by At

I'm trying to strip out some fussy text from pages like this. I want to preserve the anchored links but lose the breaks and the a.intro. I thought I could use something like unwrap() to strip off layers but I'm getting an error: TypeError: 'NoneType' object is not callable

For kicks, I tried running the documentation sample code itself, since I couldn't see how my version differed.

markup = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
soup = BeautifulSoup(markup)
a_tag = soup.a

a_tag.i.unwrap()
a_tag
# <a href="http://example.com/">I linked to example.com</a>

I'm getting the exact same error. What am I missing here? I'm working in Scraperwiki, fwiw.

3

There are 3 answers

0
Amanda On BEST ANSWER

Seems to be a scraperwiki issue. Works fine in ipython console.

0
Brian Cain On

I get this error too.

In [27]: type(a_tag.i.unwrap)
Out[27]: NoneType

In [28]: 'unwrap' in dir(a_tag.i)
Out[28]: False

FWIW, replace_with_children yields the same results:

In [29]: type(a_tag.i.replace_with_children)
Out[29]: NoneType

Looks like a bug to me.

In [13]: import BeautifulSoup as Bs

In [16]: Bs.__version__
Out[16]: '3.2.1'
0
Suzana On

I had the same error message with soup.select(). The reason was an old version of the BeautifulSoup4 library. Somebody at ScraperWiki fixed it (see this conversation at the ScraperWiki Google Group).