My R code (see below) generates these errors in some cases:
[1] "2023-08-12 16:47:37.463"
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1.3 seconds...
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1 seconds...
Error in curl::curl_fetch_memory(url, handle = handle):
Could not resolve host: api.abc.com
api.abc.com is not the real API I use. I use a commercial API, and its provider told me that their server was not down at the moment shown above. In some cases when the server was down, it returned HTTP status 503.
I have two questions:
- what can be the cause of these errors?
- how can I keep my script below running when these errors occur? Currently it breaks after these error messages. I was not expecting this since I use RETRY in my code with GET.
My code below is called every 10 seconds with the scheduler tclTaskSchedule (see end of code). In this example code I have used a free API (universities.hipolabs.com) as an example.
library(httr) # accessing APIs
library(jsonlite) # JSON parsing
library(dplyr)
library(readr)
library(purrr)
library(tidyr)
library(stringr)
library(tibble)
library(tcltk2)
library(lubridate)
# log of non-200 responses; run_api_once() appends to it via <<-
http_responses <- character()
run_api_once <- function() {
  mydatalist <- list() # create an empty list
  my_next_page_with_number <- "http://universities.hipolabs.com/search?country=United+States"
  mydata1 <- RETRY("GET", my_next_page_with_number)
  if (mydata1$status_code != 200) {
    print(mydata1$status_code)
    http_responses <<- append(http_responses, paste(mydata1$status_code, Sys.time()))
    has_more_pages <- FALSE
  } else {
    rawdata <- rawToChar(mydata1$content)
    mydata2 <- fromJSON(rawdata, flatten = FALSE, simplifyVector = FALSE)
    mydata <- mydata2
    mydatalist <- c(mydatalist, mydata)
  }
  y <- Sys.time()
  y <- format(y, "%Y-%m-%d %H:%M")
  print(y)
  users <- tibble(user = mydatalist)
  myvar <<- users %>% unnest_wider(user)
  return(myvar)
}
# call function every 10 seconds:
tclTaskSchedule(10000, run_api_once(), id = "run_api_once", redo = TRUE)
# end session:
tclTaskDelete(NULL)
I suppose it is irrelevant, but for completeness: I stream the content of myvar to a local server on my PC with Plumber. See the code below:
# stream df myvar to local api at port 8405:
library(plumber)
pr("D:/plumber_universities2test.R") %>%
  # pr("C:/plumber_universities2test.R") %>%
  pr_run(port = 8405)
Which calls this script:
library(plumber)
library(dplyr)
#* @param symbol Ticker symbol (just to input something in the function)
#* @get /return
#* @serializer json list(na="string")
universities_data <- function(symbol) {
  data <- myvar
  data
}
Thanks a lot!
To answer your questions:
- [...] httr; or you are making a request to an invalid URL. I can't be sure without seeing the actual URL you are making the request to, but I would guess the third option is the most likely. You should check whether you are making a mistake while pasting together a particular URL, for example "google.comsearch" instead of "google.com/search".
- The reason RETRY is not acting in the way you expect is that this is not an HTTP error status returned by the server; your request simply can't be executed. To demonstrate the difference, compare the behaviour of a simple function that makes a request to a URL that automatically returns an HTTP error with one that makes a request to a URL that does not exist at all.
Created on 2023-08-13 with reprex v2.0.2
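A minimal sketch of such a comparison. The URLs are assumptions chosen for illustration: httpbin.org/status/503 stands in for a server that answers with an HTTP error, and a hostname under the reserved .invalid domain stands in for a host that cannot be resolved. The helper name status_after_retry is hypothetical.

```r
library(httr)

# Assumed stand-in for a server that answers with an HTTP error status.
url_http_error <- "https://httpbin.org/status/503"
# Hostname under the reserved .invalid domain, which should never resolve.
url_no_host <- "http://no-such-host.invalid"

# Hypothetical helper: request the URL, then carry on with "the rest" of the work.
status_after_retry <- function(url) {
  resp <- RETRY("GET", url, times = 2, pause_base = 0.1, pause_min = 0.1, quiet = TRUE)
  message("request returned, continuing with the rest of the function")
  status_code(resp)
}

# Needs internet access. RETRY gives up after the last attempt and hands back
# the failed response, so the function continues and returns its status code.
first <- try(status_after_retry(url_http_error), silent = TRUE)
print(first)

# No response ever arrives: RETRY re-raises the curl error after the last
# attempt, so message() is never reached and the function stops.
second <- try(status_after_retry(url_no_host), silent = TRUE)
print(inherits(second, "try-error"))
```

The calls are wrapped in try() here only so that the script can show both outcomes in one run; without it, the second call would abort the script in the same way as in the question.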
In that comparison, the first example still executes the remaining code of the function, while the second one stops with an error. I would suggest carefully checking why the requests are not getting to the server; if you are certain that there is no better way, you can wrap try around RETRY. That said, the behaviour of RETRY is correct in my opinion, as it does not simply ignore what is probably a mistake in the code or your internet configuration (not a server-side issue).
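One way to wrap RETRY so that a failed connection does not stop a scheduled task is sketched below. It uses tryCatch rather than try so that the error message can be logged; safe_retry_get is a hypothetical helper name, and the .invalid hostname is only there to trigger a resolution failure.

```r
library(httr)

# Hypothetical wrapper: returns the response, or NULL when the request could
# not be executed at all (for example "Could not resolve host").
safe_retry_get <- function(url, ...) {
  tryCatch(
    RETRY("GET", url, ...),
    error = function(e) {
      message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
              " request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Example call against a host that does not resolve; resp is NULL afterwards.
resp <- safe_retry_get("http://no-such-host.invalid",
                       times = 2, pause_base = 0.1, pause_min = 0.1, quiet = TRUE)
print(is.null(resp))
```

Inside run_api_once, the RETRY call could be replaced by safe_retry_get, followed by an is.null() check that returns early, so that the next scheduled run simply tries again. HTTP error statuses such as 503 still come back as ordinary responses and are handled by the existing status_code check.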