I'm using an API to pull data from a chemically aware database into R. I use the httr library: when the query is ready, I GET() the data and extract the contents into a data frame called rawdata:
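library(httr)      # GET(), add_headers(), content()
library(janitor)   # clean_names() (assumed to come from janitor)
library(magrittr)  # %>%
cdd_data <- GET(data_url, add_headers(.headers = c("X-CDD-Token" = TOKEN)))
rawdata <- content(cdd_data) %>%
  clean_names()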
However, a number of columns that should be type double are being imported as type logical, which wipes out their numerical content:
Rows: 5159 Columns: 35
── Column specification
Delimiter: ","
chr (12): Molecule Name, Batch Name, Pharmacokinetics: Run Lab, Pharmacokinetics: Study_ID, Pharmacokinetics: Gender, Pharmacokinetics: Species, Pharmacokinetics: Strain, Pharmacok...
dbl (11): Molecular weight (g/mol), Pharmacokinetics: Time (h), Pharmacokinetics: Rep1 (ng/mL), Pharmacokinetics: Rep2 (ng/mL), Pharmacokinetics: Rep3 (ng/mL), Pharmacokinetics: Do...
lgl (11): Pharmacokinetics: Tmax (h), Pharmacokinetics: Cmax (ng/mL), Pharmacokinetics: CMax_g (ng/g), ADMET- Cyprotex: Mou PPB fu, ADMET- Cyprotex: Rat PPB fu, ADMET- Cyprotex: Mo...
date (1): Pharmacokinetics: Run Date
where those 11 logical columns should be type double.
Is there a way either to recover the data after the import by changing the column types, or to preserve it by pre-defining the column types before the import?
I have tried changing some of the parameters of the content() function (as="parsed", as="text", etc.). I have also retrieved the problems from the import, where the function taunts me by showing that my numerical data is being wiped out because it expects logical input:
One or more parsing issues, call `problems()` on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
> problems(rawdata)
# A tibble: 186 × 5
     row   col expected           actual file
   <int> <int> <chr>              <chr>  <chr>
 1  1910    30 1/0/T/F/TRUE/FALSE 0.0485 ""
 2  1911    28 1/0/T/F/TRUE/FALSE 0.408  ""
 3  2484    28 1/0/T/F/TRUE/FALSE 0.446  ""
 4  2484    29 1/0/T/F/TRUE/FALSE 0.497  ""
 5  2484    30 1/0/T/F/TRUE/FALSE 0.0236 ""
 6  2484    31 1/0/T/F/TRUE/FALSE 0.0400 ""
 7  2484    33 1/0/T/F/TRUE/FALSE 65.6   ""
 8  2484    34 1/0/T/F/TRUE/FALSE 7.80   ""
 9  2484    35 1/0/T/F/TRUE/FALSE 4.20   ""
10  2485    32 1/0/T/F/TRUE/FALSE 55.2   ""
# ℹ 176 more rows
# ℹ Use `print(n = ...)` to see more rows
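
The first parsing problem appears at row 1910, which would be consistent with readr's default of guessing column types from only the first 1000 rows (its guess_max argument): a column that is empty throughout those rows is guessed as logical. Values that fail to parse are stored as NA, so changing the column type after the import cannot bring them back; the text has to be parsed again. Below is a minimal sketch of two ways that might be done, assuming the response body is CSV (the `Delimiter: ","` line suggests it is) and that readr 2.0 or later is installed. The CSV text and the column name tmax_h are made up for illustration; with the real response, csv_text would be something like content(cdd_data, as = "text", encoding = "UTF-8").

```r
library(readr)

# Hypothetical stand-in for the API response body: a CSV whose tmax_h
# column is empty for the first 1,500 rows and numeric only after that.
csv_text <- paste(
  c("molecule_name,tmax_h",
    sprintf("cmpd%d,", 1:1500),
    "cmpd1501,0.408",
    "cmpd1502,65.6"),
  collapse = "\n"
)

# Option 1: raise guess_max so readr inspects more rows before guessing.
# 10000 is an arbitrary value chosen to exceed the 5159 rows in the import.
fixed_guess <- read_csv(I(csv_text), guess_max = 10000, show_col_types = FALSE)

# Option 2: declare the affected columns as double and guess the rest.
fixed_spec <- read_csv(
  I(csv_text),
  col_types = cols(tmax_h = col_double(), .default = col_guess()),
  show_col_types = FALSE
)
```

In the real data, the raw header names (for example `Pharmacokinetics: Tmax (h)`) would go in cols(), wrapped in backticks, since clean_names() only runs after parsing. Option 1 needs no column names at all, at the cost of a slower type-guessing pass.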