I want to scrape some content from this webpage. The element I am interested in matches the CSS selector div.white-bg-border-radius-kousik.shadow-kousik-effect.mb-2.
I am trying to use this selector with BeautifulSoup (Python), but it does not work. I tried three or four variants and none of them worked either. The page's HTML shows that this element is present 36 times, yet my attempts return either an empty set or only 2-3 results, so I am obviously missing something. What is the right way to do this?
from bs4 import BeautifulSoup
import os
import urllib.request
url = "https://bankcodesfinder.com/world-postal-codes/india"
with urllib.request.urlopen(url) as response:
    html = str(response.read())
soup = BeautifulSoup(html, 'html.parser')
elements = soup.find_all('div.white-bg-border-radius-kousik.shadow-kousik-effect.mb-2') # This returns blank set
elements2 = soup.findAll('div', class_=['shadow-kousik-effect', 'mb-2']) #returns just 3 elements, whereas this is a subset class search of the original list of 3 classes, so this should return at least 36 elements
elements3 = soup.select('div.shadow-kousik-effect') # returns just 3 results
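For reference, the three calls above do not ask the same question. The sketch below uses a small made-up HTML snippet (the class names card, shadow and mb-2 are placeholders, not the real page's markup) to show how each call matches: find_all treats its first string argument as a tag name rather than a CSS selector, a list passed to class_ matches tags that have any of the listed classes, and select with a compound selector requires all of them.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not taken from the real page.
SAMPLE_HTML = """
<div class="card shadow mb-2">one</div>
<div class="card shadow mb-2">two</div>
<div class="shadow">three</div>
<div class="mb-2">four</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() reads the string as a tag NAME, so it looks for a tag
# literally called "div.card.shadow.mb-2" and finds nothing.
by_tag_name = soup.find_all("div.card.shadow.mb-2")

# A list for class_ means "any of these classes", not "all of them".
any_class = soup.find_all("div", class_=["shadow", "mb-2"])

# select() understands CSS selectors; chained classes must all be present.
all_classes = soup.select("div.card.shadow.mb-2")

print(len(by_tag_name), len(any_class), len(all_classes))
```

So select (or select_one) is the call to use with a CSS selector string; a low count from it points to the parsed HTML rather than the selector syntax.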

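One thing worth checking is str(response.read()): calling str() on a bytes object returns its printable representation, so every CR/LF byte turns into the literal characters \r\n in the text that gets parsed. The sketch below is an offline illustration under an assumption, namely that a line break sits inside the class attribute; the markup is made up and the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical response body: a CRLF sits just before the closing quote
# of the class attribute (an assumption, not the real page's markup).
raw = b'<div class="card shadow mb-2\r\n">x</div>'

# str(bytes) gives the repr, e.g. b'<div ...>' with backslash-r-backslash-n
# as four literal characters instead of a real line break.
broken_html = str(raw)

# decode() turns the bytes into proper text; CRLF stays real whitespace.
fixed_html = raw.decode("utf-8")

selector = "div.card.shadow.mb-2"
broken_matches = BeautifulSoup(broken_html, "html.parser").select(selector)
fixed_matches = BeautifulSoup(fixed_html, "html.parser").select(selector)

# In the broken text the last class token carries the literal backslash
# characters, so it is no longer equal to "mb-2".
print(len(broken_matches), len(fixed_matches))
```

With urllib, response.read().decode("utf-8") in place of str(response.read()) avoids the problem, assuming the page is served as UTF-8.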
I think it has to do with your response, which on my machine gives tags with trailing \r\n. Using requests, your CSS selector returns the 35 elements (search box excluded).
Output: