I am working with API data from the FoodData Central API. The JSON returned includes a row for each matching food, plus a nested dataframe with specific nutritional information (foodNutrients) for this same food. Note that the foodNutrients dataframe is not always available, and the number of columns it returns varies.
I am trying to pull 2 specific columns (value, unitName) from 1 specific row (where nutrientId==1008) of the foodNutrients df, when available for a given row of the parent df.
I'm looking for the most efficient way to do what this ugly, ugly code is doing:
##############################################################################
# ugly hacky version
library(jsonlite)
library(dplyr)
# search FoodData Central API and get JSON
api_key = "DEMO_KEY"
search_keyword = "banana cream pie"
endpoint_url <- paste("https://api.nal.usda.gov/fdc/v1/foods/search?api_key=",api_key,"&query=",gsub(" ","%20",search_keyword),"&requireAllWords=true",sep="")
json <- fromJSON(endpoint_url)
# here is our dataframe which contains a list containing another dataframe (foodNutrients)
df1 <- json$foods %>%
select(fdcId, description, brandOwner, ingredients, foodNutrients)
# here is some hacky code to show what I am trying to do
# yes, this is probably the stupidest way I could code this
# may God have mercy on my soul
# help me Obi Wan, you're my only hope
# hold output here
output <- data.frame()
# yes, I am using a loop, which is dumb
for(i in 1:nrow(df1)) {
# grab the nested dataframe and add our index
nested_df <- df1[i,"foodNutrients"][[1]]
# hack
chk <- data.frame()
# if nested_df exists and has all needed columns...
if(nrow(nested_df)>0 &
"value" %in% colnames(nested_df) &
"unitName" %in% colnames(nested_df) &
"nutrientId" %in% colnames(nested_df)) {chk <- nested_df %>% filter(nutrientId==1008)}
# pull out energy if available
if(nrow(chk)==1) {
energy_value <- chk[1,"value"]
energy_unit <- chk[1,"unitName"]
} else {
energy_value <- 0
energy_unit <- NA
}
# add a row to a new dataframe with everything I want
row <- data.frame(fdcId=df1[i,"fdcId"],
description=df1[i,"description"],
brandOwner=df1[i,"brandOwner"],
ingredients=df1[i,"ingredients"],
energy_value,
energy_unit)
output <- rbind(output, row)
}
# hello there!
print(head(output))
I have been playing around with various map() functions but have yet to find the right combination that does what I am trying to do here. How do I get this same result without a loop and pile of hacks?
I'm not sure this is exactly what you're looking for, but thought...
...so I tried
map
andunnest
of thefoodNutrients
column:And, if you want to add zeros when no energy available, you can add:
before
select
.Does this accomplish what you needed?
Output