I am trying to write a code that will allow me to download a .xls file from a secured https website which requires a login. This is very difficult for me, as i have no experience with web-coding--all my R experience comes from econometric work with readily available datasets.
i followed this thread to help write some code, but i think im running into trouble because the example is http, and i need https.
this is my code:
install.packages("RCurl")
library(RCurl)
curl = getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', followlocation = TRUE, autoreferer = TRUE, curl = curl)
html <- getURL('https://jump.valueline.com/login.aspx', curl = curl)
viewstate <- as.character(sub('.*id="_VIEWSTATE" value="([0-9a-zA-Z+/=]*).*', '\\1', html))
params <- list(
'ct100$ContentPlaceHolder$LoginControl$txtUserID' = 'MY USERNAME',
'ct100$ContentPlaceHolder$LoginControl$txtUserPw' = 'MY PASSWORD',
'ct100$ContentPlaceHolder$LoginControl$btnLogin' = 'Sign In',
'_VIEWSTATE' = viewstate)
html <- postForm('https://jump.valueline.com/login.aspx', .params = params, curl = curl)
when i get to running the piece that starts "html <- getURL(..." i get:
> html <- getURL('https://jump.valueline.com/login.aspx', curl = curl)
Error in function (type, msg, asError = TRUE) :
SSL certificate problem: unable to get local issuer certificate
is there a workaround for this? how am i able to access the local issuer certificate?
I read that adding '.opts = list(ssl.verifypeer = FALSE)' into the curlSetOpt would remedy this, but when i add that, the getURL runs, but then postForm line gives me
> html <- postForm('https://jump.valueline.com/login.aspx', .params = params, curl = curl)
Error: Internal Server Error
Besides that, does this code look like it will work given the website i am trying to access? I went into the inspector, and changed all the params to be correct for my webpage, but since i'm not well versed in webcoding i'm not 100% i caught the right things (particularly the VIEWSTATE). Also, is there a better, more efficient way i could approach this?
automating this process would be huge for me, so your help is greatly appreciated.
Try httr:
That still gives me a server error, but that's probably because you're not sending the right fields in the body.