How to find a specific "term/text" in the HTML tags using Beautiful Soup?


This is my code. I loop over URLs stored in a list and parse each page with 'html.parser'. I am looking for the term "livefyre".

for page in links:
    req = requests.get(page, headers=hdr)
    soup = BeautifulSoup(req.text, "html.parser")
    for link in soup.find_all('div', attrs={"id" : "livefyre-comments"}):
        print(len(link.get_text()))

This only outputs the div elements matching the specific "id" : "livefyre-comments". I want to search for all/any occurrences of "livefyre" anywhere on the HTML page. Please help.

There is 1 answer

Jacob Lee

You can use the tag[attr*='val'] CSS selector, which matches every tag element whose attr attribute value contains val as a substring. For example, it would match all of the following elements:

  • <tag attr="value">
  • <tag attr="values">
  • <tag attr="valuables">
  • <tag attr="invalid">
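To check the substring behaviour yourself, here is a minimal sketch that runs the selector against the four example elements above. The extra `<tag attr="other">` element and the variable names are my own additions for illustration; it is there only to show a value that does not match.

```python
from bs4 import BeautifulSoup

# The four example elements from the list, plus one hypothetical
# non-matching element ("other" does not contain "val").
html = """
<tag attr="value"></tag>
<tag attr="values"></tag>
<tag attr="valuables"></tag>
<tag attr="invalid"></tag>
<tag attr="other"></tag>
"""

soup = BeautifulSoup(html, "html.parser")

# [attr*='val'] keeps elements whose attr value contains "val" anywhere.
matched = [el["attr"] for el in soup.select("tag[attr*='val']")]
print(matched)  # ['value', 'values', 'valuables', 'invalid']
```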

So, using this CSS selector, you can modify the code using the bs4.BeautifulSoup.select() method:

for page in links:
    req = requests.get(page, headers=hdr)
    soup = BeautifulSoup(req.text, "html.parser")
    for elem in soup.select("div[id*='livefyre']"):
        print(len(elem.get_text()))
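Note that the selector above only looks at the id attribute of div elements. If you really want any occurrence of the term anywhere on the page, one possible approach is to scan every attribute of every tag and every text node. The sketch below assumes a case-insensitive match is what you want; `find_term` and the sample markup are hypothetical names and data made up for illustration, not part of Beautiful Soup.

```python
import re

from bs4 import BeautifulSoup


def find_term(html, term):
    """Return (attribute hits, text hits) for a case-insensitive term."""
    soup = BeautifulSoup(html, "html.parser")
    needle = term.lower()

    # Check every attribute value of every tag.
    attr_hits = []
    for tag in soup.find_all(True):
        for name, value in tag.attrs.items():
            # Multi-valued attributes such as class are parsed as lists.
            text = " ".join(value) if isinstance(value, list) else str(value)
            if needle in text.lower():
                attr_hits.append((tag.name, name, text))

    # Check every text node.
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    text_hits = [str(s) for s in soup.find_all(string=pattern)]
    return attr_hits, text_hits


# Hypothetical sample page standing in for req.text.
sample = (
    '<div id="livefyre-comments" class="box lf">Hello</div>'
    '<p>Powered by Livefyre</p>'
    '<a href="/about">About</a>'
)
attr_hits, text_hits = find_term(sample, "livefyre")
print(attr_hits)
print(text_hits)
```

In your loop you would call find_term(req.text, "livefyre") for each page. If you only need a yes/no answer per page, checking "livefyre" in req.text.lower() on the raw response is simpler and skips parsing entirely.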