Extract "Liked" songs from Pandora using Python

I am attempting to use Python's urllib2 to extract info on my "liked" tracks in Pandora. I'm getting discrepancies between the HTML returned by the following code and the HTML shown in Chrome's Inspect Element:

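import urllib2

# Send a browser-like User-Agent with the request
headers = {'User-Agent': 'Mozilla/5.0'}

url = 'http://www.pandora.com/profile/likes/myusername'

# Fetch the raw HTML payload returned by the server
request = urllib2.Request(url, None, headers)
response = urllib2.urlopen(request)
html = response.read()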

I'm thinking this might be due to the lack of authentication even though I'm still able to load the same page logged out using Chrome's incognito mode.

So I added the following lines to attempt to use basic authentication on my request:

import urllib2

# The host registered here must match the host in the requested URL
SERVER = 'www.pandora.com'
authinfo = urllib2.HTTPPasswordMgrWithDefaultRealm()
authinfo.add_password(None, SERVER, "login", "password")
handler = urllib2.HTTPBasicAuthHandler(authinfo)
myopener = urllib2.build_opener(handler)
# install_opener() returns None, so there is nothing to assign
urllib2.install_opener(myopener)

headers = {'User-Agent': 'Mozilla/5.0'}

url = 'http://www.pandora.com/profile/likes/chris.r.armstrong'

request = urllib2.Request(url, None, headers)
response = urllib2.urlopen(request)
html = response.read()

Still not getting the right HTML response back. Any suggestions?

There is 1 answer

karlcow On

The DOM (the HTML page you see inside the browser) is not the payload of the HTTP response. Once the browser has made an HTTP request, a number of transformations happen, depending on how complex the page is. At the basic level, the parser might reorder or reorganize the content as mandated by the HTML5 parsing algorithm. Then JS scripts and XMLHttpRequests modify the DOM and add content to it.
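To make that difference concrete, here is a minimal sketch (Python 3, standard library only) that inspects a raw payload the way an HTTP client sees it. The HTML string, the `loadLikes` call, and the `PayloadSummary` and `summarize` names are all made up for illustration; this is not Pandora's actual markup.

```python
from html.parser import HTMLParser

# Hypothetical payload: roughly what an HTTP client might receive for a
# script-driven page. The "likes" container is empty until scripts run.
RAW_PAYLOAD = """<html><body>
<div id="likes"></div>
<script src="/static/app.js"></script>
<script>loadLikes("likes");</script>
</body></html>"""


class PayloadSummary(HTMLParser):
    """Counts <script> tags and collects text found outside of scripts."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.in_script = False
        self.visible_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Ignore script source and whitespace-only text nodes
        if not self.in_script and data.strip():
            self.visible_text.append(data.strip())


def summarize(html):
    """Return (number of script tags, list of non-script text) for html."""
    parser = PayloadSummary()
    parser.feed(html)
    parser.close()
    return parser.script_count, parser.visible_text


script_count, visible_text = summarize(RAW_PAYLOAD)
print(script_count, visible_text)
```

A payload with several scripts and little or no text of its own suggests that the content you see in the browser is filled in afterwards by JavaScript, which an HTTP client such as urllib2 never executes.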

If you really need the DOM as seen in the browser, you might want to use a WebDriver, which drives a real browser and so gives you back what the browser sees, not only what the HTTP client receives.
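As a rough sketch of the WebDriver approach, assuming Python 3 with the third-party `selenium` package plus Chrome and a matching driver installed (none of which the question mentions), something like the following could work. The helper names `rendered_html`, `make_headless_chrome`, and `fetch_rendered_page` are hypothetical.

```python
def rendered_html(driver, url):
    """Load url in the given browser driver and return the current DOM as HTML."""
    driver.get(url)
    # page_source reflects the DOM after the browser has parsed the page
    # and run its scripts up to this point
    return driver.page_source


def make_headless_chrome():
    """Create a headless Chrome driver (assumes selenium and Chrome are installed)."""
    # Imported lazily so the helper above can be used without selenium
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fetch_rendered_page(url):
    """Open a browser, return the rendered HTML for url, then close the browser."""
    driver = make_headless_chrome()
    try:
        return rendered_html(driver, url)
    finally:
        driver.quit()
```

Calling `fetch_rendered_page('http://www.pandora.com/profile/likes/myusername')` should then return markup much closer to what Inspect Element shows. Content loaded asynchronously may not be present immediately after `get()` returns, so an explicit wait might be needed before reading `page_source`.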

Hope it helps.