Get API response after logging in using rvest


Using a browser, I can navigate to https://myaccount.draftkings.com/login and log in with a username and password. If I then navigate to https://api.draftkings.com/scores/v1/leaderboards/9000?format=json&embed=leaderboard in another tab, I can see the API response.

I want to do this programmatically.

There is a Python package that I previously used successfully for this purpose, but for some reason I can't get it to work now.

I believe the Python package authenticates by sending a POST request to an API endpoint and saves the resulting cookies to a file, which it then reuses for later requests.
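As far as I can tell, the rough httr2 equivalent of that cookie-file approach is `req_cookie_preserve()`. This is only a sketch: the cookie file path and the `with_cookie_jar()` helper are names I made up, and it assumes the login endpoint sets ordinary cookies.

```r
# Hypothetical location for the cookie file; any writable path works.
cookie_path <- file.path(tempdir(), "dk_cookies.txt")

# Hypothetical helper: attach the same cookie file to every request, so
# cookies set by one response are sent along with later requests.
with_cookie_jar <- function(url, path = cookie_path) {
  req <- httr2::request(url)
  httr2::req_cookie_preserve(req, path)
}

login_req <- with_cookie_jar(
  "https://api.draftkings.com/users/v3/providers/draftkings/logins?format=json"
)
api_req <- with_cookie_jar(
  "https://api.draftkings.com/scores/v1/leaderboards/9000?format=json&embed=leaderboard"
)
```

Nothing is sent until `httr2::req_perform()` is called on each request; the two requests share cookies only because they point at the same file.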


Using httr2, I tried sending a POST request to the login endpoint that the Python package uses:

req <- httr2::request("https://api.draftkings.com/users/v3/providers/draftkings/logins?format=json") %>% 
  httr2::req_method("POST") %>% 
  httr2::req_headers(
    "Content-Type" = "application/json",
    "Accept" = "*/*",
    "Accept-Encoding" = "gzip, deflate, br"
  ) %>% 
  httr2::req_body_json(
   list(
     "login" = "email",
     "password" = "password",
     "host" = "api.draftkings.com",
     "challengeResponse" = list("solution" = "", "type" = "Recaptcha")
   )
  )

req %>% 
  httr2::req_perform()

but I get a 403 error.
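To see what the server actually sends back with the 403, I think something like the following would work. It is a sketch: `inspect_failure()` is a name I made up, and all it does is stop httr2 from turning the HTTP error into an R error so the response can be examined.

```r
# Hypothetical helper: perform a request without raising an R error on
# HTTP 4xx/5xx, then return the pieces worth looking at.
inspect_failure <- function(req) {
  # Treat every response as "not an error" so req_perform() returns it.
  lenient_req <- httr2::req_error(req, is_error = function(resp) FALSE)
  resp <- httr2::req_perform(lenient_req)

  list(
    status  = httr2::resp_status(resp),
    headers = httr2::resp_headers(resp),
    body    = httr2::resp_body_string(resp)
  )
}

# Usage, with `req` being the login request built above:
# str(inspect_failure(req))
```

The body and headers of a 403 sometimes indicate whether the block comes from the API itself or from something in front of it, which is not visible from the status code alone.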


I also tried logging in using rvest:

library(rvest)

# session() needs a full URL, including the scheme
url <- "https://myaccount.draftkings.com/login"
session <- session(url)

form <- session %>%
  html_form() %>% 
  magrittr::extract2(1)

form$action <- url

filled_form <- form %>%
  html_form_set(!!! list(EmailOrUsername = "user",
                         Password = 'password'))

html_form_submit(filled_form, submit = 3)

session_jump_to(session, "https://www.draftkings.com/lobby")

but that didn't do what I want either. Note that the form object comes back with an empty action field. I'm not sure what the action should be set to, but if I leave it NULL I get an error.


I also tried saving the cookies from a browser session after logging into draftkings.com and passing all the cookies to a GET request with httr:


cook <- jsonlite::read_json("cookies.json")

clean_cook <- unlist(lapply(cook, function(x) {stats::setNames(x$value, x$name)}))

resp <- httr::GET(
  "https://api.draftkings.com/scores/v1/leaderboards/9000?format=json&embed=leaderboard", 
  httr::set_cookies(.cookies = clean_cook)
)

httr::content(resp)

but this returns a 400 status code with the message "Invalid userKey.". This is the same error message I get if I clear my cache in the browser and then visit https://api.draftkings.com/scores/v1/leaderboards/9000?format=json&embed=leaderboard. I don't think the issue is related to URL encoding of the cookie values. Restarting my RStudio session made no difference.


Update

I figured out how to successfully perform the GET request using the cookies saved from my browser and httr2:

cook <- jsonlite::read_json("cookies.json")

clean_cook <- paste0(unlist(lapply(cook, function(x) {paste0(x$name, "=", x$value)})), collapse = ";")

req <- httr2::request(
  "https://api.draftkings.com/scores/v1/leaderboards/9000?format=json&embed=leaderboard"
) 

req <- httr2::req_headers(req, "cookie" = clean_cook) 

resp <- httr2::req_perform(req)

str(httr2::resp_body_json(resp))

For some reason, I was unable to get httr::set_cookies() to work. Passing the cookies in a header with httr::add_headers() did not work reliably either.
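In case it helps anyone, the cookie-string step can be wrapped in a small function. This is a sketch: `cookie_header()` is a hypothetical helper name, the example cookies are made up, and it joins pairs with "; " (the conventional separator) rather than the bare ";" I used above.

```r
# Hypothetical helper: turn a list of browser-exported cookies (each an
# element with $name and $value) into a single Cookie header value.
cookie_header <- function(cookies) {
  pairs <- vapply(
    cookies,
    function(x) paste0(x$name, "=", x$value),
    character(1)
  )
  paste(pairs, collapse = "; ")
}

# Made-up cookies for illustration only; real ones come from cookies.json.
example_cookies <- list(
  list(name = "a", value = "1"),
  list(name = "b", value = "2")
)

cookie_header(example_cookies)
# [1] "a=1; b=2"
```

The result can be passed to httr2::req_headers() as the "cookie" header in the same way as `clean_cook` above.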
