Getting specific images from a page

Question

110 views Asked by nlper At 09 June 2015 at 12:19

I am pretty new to BeautifulSoup. I am trying to print the image links from http://www.bing.com/images?q=owl:

import urllib2
from bs4 import BeautifulSoup

# Download the search results page (Python 2)
redditFile = urllib2.urlopen("http://www.bing.com/images?q=owl")
redditHtml = redditFile.read()
redditFile.close()

soup = BeautifulSoup(redditHtml)

# Each search result sits in a <div class="dg_u">
productDivs = soup.findAll('div', attrs={'class' : 'dg_u'})
for div in productDivs:
    print div.find('a')['t1']  # works fine
    print div.find('img')['src']  # raises KeyError: 'src'

But this gives only the title, not the image source. Is there anything wrong?

Edit: I have edited my source, but I still cannot get the image URL.

There are 2 answers

alecxe On 09 June 2015 at 12:46

If you open up the browser developer tools, you'll see that an additional asynchronous XHR request is issued to the http://www.bing.com/images/async endpoint, whose response contains the image search results.

This leaves you with three main options:

1. Simulate that XHR request in your code. You might want to use something more suitable for humans than urllib2; see the requests module. This is the so-called "low-level" approach: it goes down to the bare metal and depends on the website-specific implementation, which makes this option unreliable, difficult, "heavy", error-prone and fragile.
2. Automate a real browser using selenium and stay at the high level. In other words, you don't care how the results are retrieved, what requests are made, or what JavaScript needs to be executed. You just wait for the search results to appear and extract them.
3. Use the Bing Search API (this should probably be option #1).
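As a rough illustration of the first option, here is a minimal sketch using requests (written for Python 3, unlike the Python 2 code in the question). The endpoint URL comes from the answer above; the assumption that it accepts the same `q` parameter as the /images page, the browser-like User-Agent header, and the helper names `build_async_request` and `fetch_async_html` are illustrative guesses, so compare against the real request in the browser's network tab before relying on it.

```python
import requests

# Endpoint named in the answer above.
ASYNC_URL = "http://www.bing.com/images/async"


def build_async_request(query):
    """Prepare (but do not send) a GET request for the async endpoint.

    ASSUMPTION: the endpoint takes the same `q` parameter as the
    /images page. The real XHR may carry more parameters.
    """
    request = requests.Request(
        "GET",
        ASYNC_URL,
        params={"q": query},
        # ASSUMPTION: a browser-like User-Agent may be required.
        headers={"User-Agent": "Mozilla/5.0"},
    )
    return request.prepare()


def fetch_async_html(query, timeout=10):
    """Send the prepared request and return the response body as text."""
    prepared = build_async_request(query)
    with requests.Session() as session:
        response = session.send(prepared, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


if __name__ == "__main__":
    # Inspect the URL that would be requested, without touching the network.
    print(build_async_request("owl").url)
```

Building the request separately from sending it makes it easy to compare the generated URL with what the browser actually sends, which is where this approach tends to break when the site changes.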

Vikas Ojha On 09 June 2015 at 13:08 BEST ANSWER

Bing uses some techniques to block automated scrapers. I tried printing

div.find('img')

and found that the image source is sent in an attribute named src2, so the following should work:

div.find('img')['src2']

This works for me. Hope it helps.
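Since the attribute name is not guaranteed to stay the same, a slightly more defensive variant could look like the sketch below (Python 3 with bs4). The sample markup is made up to imitate the structure described in this thread, and `image_url` and `extract_results` are hypothetical helper names. It uses `Tag.get()`, which returns None instead of raising KeyError when an attribute is missing; printing `img.attrs` is a quick way to see which attributes a tag really carries.

```python
from bs4 import BeautifulSoup

# Made-up markup imitating the structure described in the thread.
SAMPLE_HTML = """
<div class="dg_u"><a t1="Barn owl"><img src2="http://example.com/owl1.jpg"></a></div>
<div class="dg_u"><a t1="Snowy owl"><img src="http://example.com/owl2.jpg"></a></div>
<div class="dg_u"><a t1="No image"></a></div>
"""


def image_url(img, attr_names=("src", "src2")):
    """Return the first non-empty attribute from attr_names, or None."""
    if img is None:
        return None
    for name in attr_names:
        value = img.get(name)  # .get() avoids KeyError on missing attributes
        if value:
            return value
    return None


def extract_results(html):
    """Return a list of (title, image URL) pairs, one per result div."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for div in soup.find_all("div", class_="dg_u"):
        link = div.find("a")
        title = link.get("t1") if link is not None else None
        results.append((title, image_url(div.find("img"))))
    return results


if __name__ == "__main__":
    for title, url in extract_results(SAMPLE_HTML):
        print(title, url)
```

Results without a usable image come back with None as the URL, so the caller can decide whether to skip them.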
