I need to write a piece of code that will download data files from a website that requires a login.
I expected this to be quite easy, but I'm having difficulty logging in programmatically.
I tried using the steps outlined in this post:
How to login and then download a file from aspx web pages with R
But when I get to the second-to-last step in the top answer, I get an error message:
Error: Internal Server Error
So I am trying to write RCurl code to log in to the site and then download the files. Here is what I have tried:
install.packages("RCurl")  # only needed once
library(RCurl)

# The export URL; it is fetched first and then reused as the login POST target
export_url <- 'https://research.valueline.com/secure/f2/export?params=[{appId:%27com_2_4%27,%20context:{%22Symbol%22:%22GT%22,%22ListId%22:%22recent%22}}]'

# Reuse one handle so that cookies persist between the GET and the POST
curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', .opts = list(ssl.verifypeer = FALSE), followlocation = TRUE, autoreferer = TRUE, curl = curl)

# Fetch the page and pull the __VIEWSTATE hidden field out of the HTML
html <- getURL(export_url, curl = curl)
viewstate <- as.character(sub('.*id="__VIEWSTATE" value="([0-9a-zA-Z+/=]*).*', '\\1', html))

# Login form fields, plus the view state taken from the page above
params <- list(
  'ctl00$ContentPlaceHolder$LoginControl$txtUserID' = '<myusername>',
  'ctl00$ContentPlaceHolder$LoginControl$txtUserPw' = '<mypassword>',
  'ctl00$ContentPlaceHolder$LoginControl$btnLogin' = 'Sign In',
  '__VIEWSTATE' = viewstate
)

# Submit the login form, then check whether the response contains a 'Logout' link
html <- postForm(export_url, .params = params, curl = curl)
grepl('Logout', html)
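
One step that may be worth checking in isolation is the __VIEWSTATE extraction: sub() returns its input unchanged when the pattern does not match, so a failed match would silently post the whole page as the view state. Below is a minimal sketch of a stricter extraction, assuming a hypothetical helper named extract_hidden_field and made-up sample HTML (neither comes from RCurl or from the site). It returns NA when a field is missing; __EVENTVALIDATION is used only as an example of an absent field.

```r
# Hypothetical helper (not part of RCurl): return the value of a hidden
# <input> field identified by its id, or NA if the field is not in the HTML.
extract_hidden_field <- function(html, field) {
  pattern <- paste0('id="', field, '"[^>]*value="([^"]*)"')
  m <- regmatches(html, regexec(pattern, html))[[1]]
  # m[1] is the full match, m[2] the captured value; empty when no match
  if (length(m) < 2) NA_character_ else m[2]
}

# Made-up sample markup, used only to check the helper offline.
sample_html <- paste0(
  '<form><input type="hidden" name="__VIEWSTATE" ',
  'id="__VIEWSTATE" value="dDwtMTA4Nzc=" /></form>'
)

viewstate <- extract_hidden_field(sample_html, "__VIEWSTATE")
absent    <- extract_hidden_field(sample_html, "__EVENTVALIDATION")

print(viewstate)
print(is.na(absent))
```

With something like this in place, the POST could be skipped (or an error raised) whenever the extracted value is NA, rather than sending a malformed form.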