I am now working on a Sina Weibo crawler using its API. In order to use the API, I have to access the OAuth2 authorization page to retrieve the code from the URL.
This is exactly what I do:
Use my app_key and app_secret (both known).
Get the URL of the OAuth2 authorization page.
Copy and paste the code from the redirect URL manually.
This is my code:
import webbrowser

# call the official SDK (sinaweibopy)
from weibo import APIClient

client = APIClient(app_key=APP_KEY, app_secret=APP_SECRET, redirect_uri=CALLBACK_URL)
# get the URL of the OAuth2 authorization page
url = client.get_authorize_url()
print url
# open the page in a browser
webbrowser.open_new(url)
# after the page redirects, parse the code part of the URL manually
print "parse the string after 'code=' in the url:"
code = raw_input()
My question is: how exactly do I get rid of the manual parsing part?
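The closest I have got is pasting the whole redirect URL instead of the code itself, and letting urlparse extract the code parameter. A minimal sketch of that idea:

import urlparse

# paste the whole redirect URL; parse_qs pulls 'code' out of the query string
print 'paste the full redirect url:'
redirect_url = raw_input()
query = urlparse.urlparse(redirect_url).query
code = urlparse.parse_qs(query)['code'][0]

But this still needs the copy-and-paste step, which I would like to remove entirely.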
Reference: http://blog.csdn.net/liuxuejiang158blog/article/details/30042493
To get the contents of a page using requests, you can do something like this (a minimal sketch; the URL is only a placeholder):
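import requests

# fetch a page; substitute the URL you actually want to crawl
r = requests.get('http://example.com')
print r.status_code   # HTTP status code, e.g. 200
print r.text          # response body as a unicode string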
You can see the details of the requests library in its documentation. You can use pip to install it into your virtualenv / Python distribution. For writing a crawler, you can also use scrapy.
And finally, there is one thing I did not understand: if you have an official client, why do you need to parse the contents of a URL to get data? Doesn't the client give you the data through some nice, easy-to-use functions?
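As for getting rid of the manual step entirely: if you control the redirect_uri, one option is to run a tiny local HTTP server and let it capture the code query parameter when Weibo redirects the browser. A minimal sketch, assuming CALLBACK_URL is registered with Weibo as http://127.0.0.1:8080/callback (the address and port here are assumptions, not something your app is known to use):

import urlparse
import webbrowser
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

class CallbackHandler(BaseHTTPRequestHandler):
    code = None

    def do_GET(self):
        # the browser is redirected here with ?code=... in the query string
        query = urlparse.urlparse(self.path).query
        CallbackHandler.code = urlparse.parse_qs(query).get('code', [None])[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write('Authorized. You can close this window.')

webbrowser.open_new(url)                       # url from client.get_authorize_url()
server = HTTPServer(('127.0.0.1', 8080), CallbackHandler)
server.handle_request()                        # serve exactly one request: the redirect
code = CallbackHandler.code

After that, you can exchange code for an access token with the SDK as before, with no copying and pasting involved.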