How to locate my next URL in R?


I'm trying to log in to a secure (HTTPS) server. When I enter the user name and password in a web browser, the browser takes me to the page at the next URL past the login screen. When I supply the same credentials via getURL(), I can't locate that next URL. Below are the R script and the exchange between the client and the server. Where is that hidden next URL?

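library(RCurl)
library(XML)
library(bitops)
# Use the CA bundle shipped with RCurl to verify the server certificate
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
# Login page URL (host and path as they appear in the verbose log below)
url1 <- "https://bjm.ordernet.co.il/Login.aspx?lang=en-US"
html_form <- getURL(url1, userpwd = "my_userID:my_password", verbose = TRUE)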

The verbose option prints the following log of the client/server exchange:

* Hostname was NOT found in DNS cache
* Trying 192.118.93.82...
* Connected to bjm.ordernet.co.il (192.118.93.82) port 443 (#0)
* successfully set certificate verify locations:
* CAfile: C:/Program Files/R/R-3.2.0/library/RCurl/CurlSSL/cacert.pem
* CApath: none
* SSL connection using TLSv1.0 / AES128-SHA
* Server certificate:
* subject: OU=GT38954200; OU=See www.rapidssl.com/resources/cps (c)14; OU=Domain Control Validated - RapidSSL(R); CN=*.ordernet.co.il
* start date: 2014-11-29 21:53:16 GMT
* expire date: 2017-01-31 01:33:56 GMT
* subjectAltName: bjm.ordernet.co.il matched
* issuer: C=US; O=GeoTrust Inc.; CN=RapidSSL SHA256 CA - G3
* SSL certificate verify ok.

* GET /Login.aspx?lang=en-US HTTP/1.1
* Host: bjm.ordernet.co.il
* Accept: */*

* HTTP/1.1 302 Found
* Cache-Control: no-cache
* Pragma: no-cache
* Content-Type: text/html; charset=utf-8
* Expires: -1
* Location: /Logout.aspx?aspxerrorpath=/Login.aspx
* X-AspNet-Version: 4.0.30319
* Set-Cookie: LANG=en-US; expires=Fri, 24-Jul-2015 18:50:55 GMT; path=/
* Set-Cookie: .FMRAUTH1=Ubb1R/LBTivtXIwlo/FwbBb0w4Av8TVjvR9XMCcPsVKl2V3RFizDEnZoqdiN6Zis; path=/
* Set-Cookie: .FMRAUTH2=zve5LoqIhZR7tmL0h6ztFG1chyGqCCxBn8kyUqumGgfZAupZTzwjRVW5D459hgLOYX7kZP73HwMOI0nGW4hktdzrp5X8aqrQ8DXvYMNqPAk=; path=/
* Node: 2
* Date: Wed, 24 Jun 2015 18:50:55 GMT
* Content-Length: 155
* Set-Cookie: BIGipServerJer-Pool-V2=2778245312.20480.0000; path=/
* Set-Cookie: TS0176e599=017770e57706811e1a74d02f40c933078588699851800fa95b7e2d9ab5cef9c90b29384338790f603f7ab7d7265e70d83de27057cf874a94f040693278e3f249610f5e940b3fe48c1073207fe646e08ea53ca44f9951e3be19facd19c146fc095fef78e672; Path=/

* Connection #0 to host bjm.ordernet.co.il left intact

1 answer

Nick Kennedy

You're passing the username and password via HTTP authentication (the userpwd option), whereas the website you're trying to log into uses an HTML form for login. It also uses JavaScript to do the POSTing of the login. To get this to work, you'd probably be best off opening the developer tools of your favourite web browser, logging in, and looking in the Network tab for what was POSTed during login. You could then use the httr package to POST a similar login.
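As a rough sketch of that last step, something along these lines might work with httr. The endpoint and the form field names (username, password) are placeholders I made up; the real ones have to be copied from the POST request you see in the browser's developer tools, and since the site posts via JavaScript the target URL may well differ from the login page.

```r
library(httr)

# Assumed endpoint: replace with the URL the browser actually POSTs to.
login_url <- "https://bjm.ordernet.co.il/Login.aspx?lang=en-US"

login <- function(user, password, url = login_url) {
  resp <- POST(
    url,
    # Hypothetical field names; use the ones shown in the browser's request.
    body = list(username = user, password = password),
    encode = "form"
  )
  # httr follows redirects by default, so resp$url is the page finally reached.
  list(status = status_code(resp), final_url = resp$url)
}

# Example call (not run here):
# res <- login("my_userID", "my_password")
# res$final_url
```

httr keeps cookies for the same host across requests within a session, so any authentication cookies set by a successful login should be sent along with later GET calls to that host.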

As for your original question: the target of a 302 redirect is given in the Location header of the response, here /Logout.aspx?aspxerrorpath=/Login.aspx. But all that's doing is asking you to log in again!
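To make that concrete, here is a small base-R sketch that pulls the Location value out of a header block and turns it into an absolute URL. The header lines are abridged from the verbose log in the question; the variable names are only illustrative, and simply prepending the scheme and host assumes the Location value is a host-relative path, as it is in this log.

```r
# Response headers, abridged from the verbose log in the question.
raw_headers <- c(
  "HTTP/1.1 302 Found",
  "Cache-Control: no-cache",
  "Location: /Logout.aspx?aspxerrorpath=/Login.aspx",
  "Content-Length: 155"
)

# Pick out the Location header (header names are case-insensitive).
loc_line <- grep("^location:", raw_headers, ignore.case = TRUE, value = TRUE)

# Drop the header name and surrounding whitespace, keeping only the value.
location <- trimws(sub("^[^:]*:", "", loc_line))

# The value is a path relative to the host, so prepend scheme and host.
base_url <- "https://bjm.ordernet.co.il"
next_url <- paste0(base_url, location)
print(next_url)
```

In a live RCurl session the headers can be collected by passing the update function of basicHeaderGatherer() as the headerfunction option to getURL(), and setting followlocation = TRUE makes curl follow the redirect itself.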