Column types incorrect when using an API in R


I'm using an API to pull data from a chemically aware database into R. I use the httr library: when the query is ready, I GET() the data and extract the contents into a data frame called rawdata:

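library(httr)      # GET(), add_headers(), content()
library(janitor)   # clean_names()
library(magrittr)  # %>%
cdd_data <- GET(data_url, add_headers(.headers = c("X-CDD-Token" = TOKEN)))
rawdata <- content(cdd_data) %>%
  clean_names()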

However, several columns that should be type double are imported as type logical, which wipes out their numerical content:

Rows: 5159 Columns: 35                                                                                                                                                                 
── Column specification 
Delimiter: ","
chr  (12): Molecule Name, Batch Name, Pharmacokinetics: Run Lab, Pharmacokinetics: Study_ID, Pharmacokinetics: Gender, Pharmacokinetics: Species, Pharmacokinetics: Strain, Pharmacok...
dbl  (11): Molecular weight (g/mol), Pharmacokinetics: Time (h), Pharmacokinetics: Rep1 (ng/mL), Pharmacokinetics: Rep2 (ng/mL), Pharmacokinetics: Rep3 (ng/mL), Pharmacokinetics: Do...
lgl  (11): Pharmacokinetics: Tmax (h), Pharmacokinetics: Cmax (ng/mL), Pharmacokinetics: CMax_g (ng/g), ADMET- Cyprotex: Mou PPB fu, ADMET- Cyprotex: Rat PPB fu, ADMET- Cyprotex: Mo...
date  (1): Pharmacokinetics: Run Date

where those 11 logical columns should be type double.

Is there a way to preserve the data, either by changing the column types after the import or by pre-defining the column types before it?
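For the second option, here is a minimal sketch, assuming the API returns CSV text (the "Delimiter" line in the column specification suggests the body is being parsed by readr). Once a value has been parsed as NA it is gone from the data frame, so the column types have to be set at parse time. The inline csv_text below is made-up stand-in data, not output from the database; with the real response it would presumably be replaced by content(cdd_data, as = "text", encoding = "UTF-8").

```r
library(readr)

# Made-up stand-in for the response body; with httr this would be
# content(cdd_data, as = "text", encoding = "UTF-8").
csv_text <- paste(
  "Molecule Name,Pharmacokinetics: Tmax (h),Pharmacokinetics: Cmax (ng/mL)",
  "cmpd-1,,",
  "cmpd-2,0.5,65.6",
  sep = "\n"
)

# Declare the columns that must be numeric; all other columns are still guessed.
rawdata <- read_csv(
  I(csv_text),
  col_types = cols(
    `Pharmacokinetics: Tmax (h)` = col_double(),
    `Pharmacokinetics: Cmax (ng/mL)` = col_double(),
    .default = col_guess()
  )
)

str(rawdata)
```

The column names are matched against the raw CSV header, so clean_names() would be applied after this step, not before it.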

I have tried changing some of the parameters of the content() function (as = "parsed", as = "text", etc.). I have also retrieved the parsing problems, where the function taunts me by showing that my numerical data is being wiped out because it expects logical input:

One or more parsing issues, call `problems()` on your data frame for details, e.g.:
  dat <- vroom(...)
  problems(dat) 
> problems(rawdata)
# A tibble: 186 × 5
     row   col expected           actual file 
   <int> <int> <chr>              <chr>  <chr>
 1  1910    30 1/0/T/F/TRUE/FALSE 0.0485 ""   
 2  1911    28 1/0/T/F/TRUE/FALSE 0.408  ""   
 3  2484    28 1/0/T/F/TRUE/FALSE 0.446  ""   
 4  2484    29 1/0/T/F/TRUE/FALSE 0.497  ""   
 5  2484    30 1/0/T/F/TRUE/FALSE 0.0236 ""   
 6  2484    31 1/0/T/F/TRUE/FALSE 0.0400 ""   
 7  2484    33 1/0/T/F/TRUE/FALSE 65.6   ""   
 8  2484    34 1/0/T/F/TRUE/FALSE 7.80   ""   
 9  2484    35 1/0/T/F/TRUE/FALSE 4.20   ""   
10  2485    32 1/0/T/F/TRUE/FALSE 55.2   ""   
# ℹ 176 more rows
# ℹ Use `print(n = ...)` to see more rows
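The expected value of 1/0/T/F/TRUE/FALSE is consistent with how readr guesses types: it inspects at most guess_max rows (1000 by default), and a column that is empty in every inspected row is typed as logical. The sketch below reproduces that with synthetic data (the id and tmax_h columns are invented for illustration) and shows that raising guess_max lets the late numeric value decide the type. It assumes readr 2.x, where literal data is passed with I().

```r
library(readr)

# Synthetic data: a column that is empty for the first 1500 rows and
# numeric in the last one, mimicking a sparsely populated assay column.
csv_text <- paste(
  c("id,tmax_h", paste0(1:1500, ","), "1501,0.408"),
  collapse = "\n"
)

# Default guessing inspects at most 1000 rows, sees only missing values,
# and falls back to logical.
guessed <- suppressWarnings(read_csv(I(csv_text), show_col_types = FALSE))

# Guessing from every row lets the single numeric value set the type.
full <- read_csv(I(csv_text), guess_max = Inf, show_col_types = FALSE)

class(guessed$tmax_h)
class(full$tmax_h)
```

Guessing from all rows is slower on large files, so naming the affected columns explicitly in col_types may be preferable once they are known.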