Despite the "similarly phrased" warning, I don't think this has been asked.
I'm starting to use BeautifulSoup (v4) and, for example, to get the href from an A-link you might do this:
    for a_link in soup.html.body.select( 'a' ):
        print( a_link )
        if a_link.has_attr( 'href' ):
            print( a_link[ 'href' ])
        if a_link.has_attr( 'hrefXXX' ):
            print( "... also hrefXXX")
        print( hasattr( a_link, 'href' ) )
        print( hasattr( a_link, 'hrefXXX' ) )
... what happens here is that the "also" line is never printed, but the final 2 lines always print `True`! In fact it appears not to matter what you put as the 2nd argument in `hasattr`: it always seems to return `True`.
Without being able to explain `hasattr`'s behaviour, my first thought was that `has_attr` might be very specific to BeautifulSoup. From searching, this does appear to be the case: in other words, it checks whether an HTML tag has a given "tag attribute".
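A minimal sketch of that reading of `has_attr`, assuming `bs4` is installed and using Python's built-in `html.parser`. The HTML snippet and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration: one link with an href, one without.
html = '<body><a href="https://example.com">with href</a><a name="top">no href</a></body>'
soup = BeautifulSoup(html, "html.parser")

with_href, without_href = soup.select("a")

# has_attr() reports whether the HTML tag carries the given attribute.
print(with_href.has_attr("href"))     # True
print(without_href.has_attr("href"))  # False

# The same information lives in the tag's attrs dict ...
print("href" in with_href.attrs)      # True
# ... and get() returns None instead of raising when the attribute is absent.
print(without_href.get("href"))       # None
```

So `has_attr` only ever looks at the tag's HTML attributes, never at the attributes of the Python object.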
On the other hand, I have a slight suspicion that `has_attr` may have wider application than BeautifulSoup. Years ago I used Jython and I have a feeling that there may have been both a `has_attr` and a `hasattr`.
Can someone explain why `hasattr` is always returning `True`?
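`has_attr()` is part of the bs4 API.
`hasattr()` is always `True` for ordinary names, because you can select tags just with dot syntax (`.`): bs4 treats `tag.some_name` as a search for a child tag called `some_name`, and gives back `None` when there is no such tag rather than raising `AttributeError`.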
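The mechanism can be sketched in plain Python. `MiniTag` below is a hypothetical stand-in written for this illustration, not bs4's actual code (for one thing, it only searches direct children); it just shows why a `__getattr__` that falls back to a tag search makes `hasattr()` succeed for almost any name.

```python
class MiniTag:
    """Hypothetical stand-in for a bs4 Tag; not bs4's actual code."""

    def __init__(self, name, attrs=None, children=()):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = list(children)

    def find(self, name):
        # Return the first direct child tag with this name, or None.
        for child in self.children:
            if child.name == name:
                return child
        return None

    def has_attr(self, key):
        # Looks at the HTML attributes only.
        return key in self.attrs

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # The unknown name is treated as a child-tag search; a miss yields
        # None, not AttributeError, so hasattr() sees a successful lookup.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.find(name)


a_link = MiniTag("a", attrs={"href": "https://example.com"}, children=[MiniTag("b")])

print(hasattr(a_link, "hrefXXX"))  # True: the lookup returned None, it did not raise
print(a_link.hrefXXX)              # None
print(a_link.b.name)               # b
print(a_link.has_attr("hrefXXX"))  # False
```

Because `hasattr()` only checks whether the attribute lookup raises `AttributeError`, it cannot tell you anything about HTML attributes here; `has_attr()` (or `tag.get(...)`) is the check to use for those.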
Note:
will try to find all `<hrefXXX>` tags under `<has_attr>` tags, but returns an empty `ResultSet`.