I am trying to scrape URLs from a webpage. I am using this code:
from bs4 import BeautifulSoup
import urllib2
url = urllib2.urlopen("http://www.barneys.com/barneys-new-york/men/clothing/shirts/dress/classic#sz=176&pageviewchange=true")
content = url.read()
soup = BeautifulSoup(content, "html.parser")  # specify a parser to avoid the BeautifulSoup warning
links = soup.find_all("a", {"class": "thumb-link"})
for link in links:
    print(link.get('href'))
But the output is only 48 links instead of 176. What am I doing wrong?
So what I did was use Postman's interceptor feature to look at the call the website makes each time it loads the next set of 36 shirts, and then replicated those calls in code. The site won't return all 176 items in one response, so I fetched them 36 at a time, just as the page does.
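Here is a minimal sketch of that approach. The endpoint and the query-parameter names (`start`, `sz`) are assumptions for illustration — check the actual request you captured in Postman and substitute the real URL and parameters:

```python
# Sketch: replicate the site's paginated calls, fetching one page of
# results at a time instead of expecting all 176 links in one response.
import requests
from bs4 import BeautifulSoup

# Assumed base URL; replace with the endpoint Postman shows you.
BASE = "http://www.barneys.com/barneys-new-york/men/clothing/shirts/dress/classic"

def page_offsets(total, page_size):
    """Start offsets for each paginated request: 0, 36, 72, ..."""
    return list(range(0, total, page_size))

def scrape_all(total=176, page_size=36):
    links = []
    for start in page_offsets(total, page_size):
        # Hypothetical query parameters -- confirm the real names
        # ("start"/"sz" here) against the intercepted request.
        resp = requests.get(BASE, params={"start": start, "sz": page_size})
        soup = BeautifulSoup(resp.text, "html.parser")
        for a in soup.find_all("a", {"class": "thumb-link"}):
            links.append(a.get("href"))
    return links
```

The key idea is the offset loop: each request asks for one window of `page_size` items, and the results are accumulated across requests.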