Beautifulsoup selector in Python returns blank result set for valid selector

Question

42 views Asked by NedStarkOfWinterfell At 03 January 2024 at 17:36

We want to scrape some content from a webpage (the URL is in the code below). The element we are interested in is matched by the selector div.white-bg-border-radius-kousik.shadow-kousik-effect.mb-2.

For this, we are trying to use this selector in BeautifulSoup (Python), but it does not work. I tried three or four variants, and they did not work either. The HTML shows that this element is present 36 times in the page, yet the selectors return either an empty set or 2-3 results, so I am obviously missing something. What is the right way to do this?

from bs4 import BeautifulSoup
import os
import urllib.request

url = "https://bankcodesfinder.com/world-postal-codes/india"
with urllib.request.urlopen(url) as response:
        html = str(response.read())
        soup = BeautifulSoup(html, 'html.parser')
        elements = soup.find_all('div.white-bg-border-radius-kousik.shadow-kousik-effect.mb-2') # This returns blank set
        elements2 = soup.findAll('div', class_=['shadow-kousik-effect', 'mb-2']) #returns just 3 elements, whereas this is a subset class search of the original list of 3 classes, so this should return at least 36 elements
        elements3 = soup.select('div.shadow-kousik-effect') # returns just 3 results
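
Independently of how the page is fetched, the three calls above do not ask the same question. As a small sketch on made-up markup (the class names `card`, `shadow`, and `mb-2` in the sample are illustrative only): `find_all()` treats its first argument as a tag name rather than a CSS selector, a list passed to `class_` matches a tag carrying any one of the listed classes, and only `select()` requires every class in the selector.

```python
from bs4 import BeautifulSoup  # third-party: beautifulsoup4

# Illustrative markup, not taken from the real page
html = """
<div class="card shadow mb-2">A</div>
<div class="shadow">B</div>
<div class="mb-2">C</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The string is used as a tag NAME, and no tag is called "div.card.shadow.mb-2"
by_name = soup.find_all("div.card.shadow.mb-2")

# A list for class_ means "any of these classes", so all three divs match
any_class = soup.find_all("div", class_=["shadow", "mb-2"])

# A CSS selector requires every listed class on the same element
all_classes = soup.select("div.shadow.mb-2")

print(len(by_name), len(any_class), [t.get_text() for t in all_classes])
```

So a list passed to `class_` is not a "subset" search, and the CSS form belongs in `select()`.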

There is 1 answer

**Timeless** · Accepted Answer · 2024-01-03T18:26:32+00:00

I think it has to do with your response, which on my machine gives tags with a trailing literal \r\n:

<div\r\n class="white-bg-border-radius-kousik shadow-kousik-effect mb-2">

 <a \r\n="" class="nounderline" href="/world...>
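
One plausible source of those literal \r\n sequences, offered as a sketch rather than a confirmed diagnosis: the question's code passes `str(response.read())` to the parser, and `str()` on a bytes object returns its repr (`b'...'`), in which CR and LF are spelled out as backslash escapes. BeautifulSoup's "html.parser" backend builds on the standard library's `HTMLParser`, so the effect can be reproduced without network access. The sample markup and the `TagNames` helper below are made up for illustration.

```python
from html.parser import HTMLParser


class TagNames(HTMLParser):
    """Collects the name of every start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)


def start_tags(markup):
    parser = TagNames()
    parser.feed(markup)
    parser.close()
    return parser.names


# Illustrative bytes with a CRLF inside the tag, as in the snippets above
raw = b'<div\r\n class="mb-2">ASSAM</div>'

as_repr = str(raw)             # repr of the bytes: starts with b' and escapes CR/LF
as_text = raw.decode("utf-8")  # real text with a real line break

repr_tags = start_tags(as_repr)
text_tags = start_tags(as_text)

print(repr_tags)
print(text_tags)
```

Here `repr_tags` holds a single name made of `div` followed by the literal characters `\r\n`, which a `div...` selector cannot match, while `text_tags` holds a plain `div`.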

With requests, your CSS selector returns 35 elements (the search box excluded).

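from bs4 import BeautifulSoup
import requests

url = "https://bankcodesfinder.com/world-postal-codes/india"

# .text is already decoded to str, so the parser sees real line breaks
soup = BeautifulSoup(requests.get(url).text, "html.parser")

css = "div.white-bg-border-radius-kousik.shadow-kousik-effect.mb-2"

# One list of text fragments per matched region card
regions = [list(tag.stripped_strings) for tag in soup.select(css)]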

Output:

# len(regions) == 35
[
    ['ANDAMAN & NICOBAR ISLANDS', '102 Branches'],
    ['ANDHRA PRADESH', '10493 Branches'],
    ['ARUNACHAL PRADESH', '302 Branches'],
    ['ASSAM', '4022 Branches'],
    ['BIHAR', '9113 Branches'],
    ...
]
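
If you would rather keep `urllib.request` than add requests, a minimal sketch along the same lines is to decode the response bytes instead of wrapping them in `str()`. The helper names (`decode_body`, `fetch_text`, `select_regions`) are hypothetical, and the UTF-8 fallback is an assumption about the page, not something stated above.

```python
import urllib.request
from typing import List, Optional


def decode_body(raw: bytes, charset: Optional[str] = None) -> str:
    """Turn response bytes into text; str(raw) would return the bytes repr instead."""
    # Assumption: fall back to UTF-8 when the server declares no charset
    return raw.decode(charset or "utf-8", errors="replace")


def fetch_text(url: str) -> str:
    """Download a page with urllib and return it as decoded text."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset()
        return decode_body(response.read(), charset)


def select_regions(url: str, css: str) -> List[List[str]]:
    """Parse the page and return the stripped strings of each matching element."""
    from bs4 import BeautifulSoup  # third-party: beautifulsoup4

    soup = BeautifulSoup(fetch_text(url), "html.parser")
    return [list(tag.stripped_strings) for tag in soup.select(css)]
```

Calling `select_regions(url, css)` with the `url` and `css` values from the answer would be the urllib counterpart of the requests version. It has not been run against the live site here, so treat the result count as unverified.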
