Python difference between hasattr and has_attr


Despite the "similarly phrased" warning, I don't think this has been asked.

I'm starting to use BeautifulSoup (v4) and, for example, to get the href from an A-link you might do this:

for a_link in soup.html.body.select( 'a' ):
    print( a_link )
    if a_link.has_attr( 'href' ):
        print( a_link[ 'href' ])
    if a_link.has_attr( 'hrefXXX' ):
        print( "... also hrefXXX")
    print( hasattr( a_link, 'href' ) )
    print( hasattr( a_link, 'hrefXXX' ) )

What happens here is that the "also" line is never printed, but the final two lines always print True! In fact it does not seem to matter what you pass as the second argument to hasattr: it always returns True.

I can't explain hasattr's behaviour. My first thought was that has_attr might be specific to BeautifulSoup, and searching suggests this is the case: it checks whether an HTML tag has a given "tag attribute".

On the other hand, I have a slight suspicion that has_attr may have wider application than BeautifulSoup. Years ago I used Jython and I have a feeling that there may have been a has_attr and a hasattr.

Can someone explain why hasattr is always returning True?

There is 1 answer

Answer by Andrej Kesely:

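This is part of the bs4 API. hasattr() is effectively always True, because a Tag lets you select child tags with dot syntax (.): a name that is not a real Python attribute is treated as a child-tag lookup, and that lookup returns None instead of raising AttributeError when no such tag exists. For example: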

from bs4 import BeautifulSoup


txt = '''
<body>
    <hrefyyy>This is hrefyyy</hrefyyy>
</body>'''

soup = BeautifulSoup(txt, 'html.parser')

body = soup.find('body')

print( hasattr(body, 'hrefxxx' ))  # True
print( hasattr(body, 'hrefyyy' ))  # True

print( body.hrefxxx )     # <--- this is not error, it just returns `None`
print( body.hrefyyy )     # <--- returns <hrefyyy> tag

Prints:

True
True
None
<hrefyyy>This is hrefyyy</hrefyyy>
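
The same effect can be reproduced without bs4. The following is a simplified sketch, not bs4's actual code: LenientNode and its children dictionary are hypothetical names used only for illustration. It assumes the relevant mechanism is a `__getattr__` fallback that returns None on a miss, which is enough to make hasattr() report True.

```python
class LenientNode:
    """Hypothetical stand-in for a tag object; not bs4's implementation."""

    def __init__(self, children):
        # Maps a child name to its value (a plain string in this sketch).
        self.children = children

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup has failed.
        # Dunder names still raise, so copy/pickle-style protocols behave.
        if name.startswith("__"):
            raise AttributeError(name)
        # Any other unknown name is treated as a child lookup; a miss
        # yields None instead of raising AttributeError.
        return self.children.get(name)


node = LenientNode({"hrefyyy": "<hrefyyy>This is hrefyyy</hrefyyy>"})

print(hasattr(node, "hrefyyy"))  # a child with this name exists
print(hasattr(node, "hrefxxx"))  # no such child, yet no AttributeError
print(node.hrefxxx)              # the fallback returned None
```

hasattr() only asks whether getattr() raises AttributeError, so an object that answers every lookup with a value (even None) appears to have every attribute.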

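Note:

a_link.has_attr( 'hrefXXX' )

behaves differently: has_attr() is a real method of bs4's Tag, so no child-tag lookup takes place. It checks whether the tag itself carries an HTML attribute named hrefXXX and returns False when it does not, which is why the "also" line is never printed.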
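
To test for HTML attributes, the built-in hasattr() is best avoided on bs4 tags. A minimal sketch of the usual alternatives, assuming bs4 is installed; the sample HTML is made up for illustration:

```python
from bs4 import BeautifulSoup

# Made-up markup: the first link has an href, the second does not.
html = '<body><a href="/home">Home</a><a name="top">Top</a></body>'
soup = BeautifulSoup(html, "html.parser")

for a_link in soup.find_all("a"):
    # has_attr() looks only at the tag's HTML attributes.
    print(a_link.has_attr("href"))
    # Equivalent membership test on the tag's attribute dictionary.
    print("href" in a_link.attrs)
    # get() returns the attribute value, or None when it is absent.
    print(a_link.get("href"))
```

Using get() avoids the KeyError that a_link['href'] raises when the attribute is missing.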