Edited to give reprex in response to comments.
I'm reading in several years of public temperature data for California's 58 counties. I'd like to create a summary, the daily statewide average, and put those means in new rows on top of the data frame with the county data in a single, piped step.
I currently do this in three steps: (1) reading the county data, (2) computing the means separately, and (3) row-binding the newly created means to the data.
Here's a reprex:
#### Reprex ####
library(tidyverse)

df1 <-
  data.frame(
    name = toupper(rep(letters[1:5], each = 5)),
    x = as.character(rnorm(25, 55, 10))
  )

# Per-group means, relabelled as "Z"
df2 <- df1 |>
  group_by(name) |>
  mutate(x = mean(as.numeric(x), na.rm = TRUE)) |>
  ungroup() |>
  select(name, x) |>
  unique() |>
  mutate(name = "Z")

df <- rbind(df1, df2)
Here is what I've tried so far, to no avail. Both throw the error message: Error in UseMethod("summarise") : no applicable method for 'summarise' applied to an object of class "c('double', 'numeric')":
# Test 1
df <-
  data.frame(
    name = toupper(rep(letters[1:5], each = 5)),
    x = as.character(rnorm(25, 55, 10))
  ) |>
  group_by(name) |>
  select(name, x) |>
  do(bind_rows(., data.frame(name = "Z",
                             mutate(x = mean(as.numeric(.$x), na.rm = TRUE))))) |>
  ungroup()
# Test 2
df <-
  data.frame(
    name = toupper(rep(letters[1:5], each = 5)),
    x = as.character(rnorm(25, 55, 10))
  ) |>
  group_by(name) |>
  select(name, x) |>
  do(bind_rows(., data.frame(name = "Z",
                             mutate(x = summarize(mean(as.numeric(.$x), na.rm = TRUE)))))) |>
  ungroup()
Any help is much appreciated.
The base R pipe doesn't let you use the object it's piping more than once--and twice is needed here, once to append to and once to get the means--but you can work around that by piping into an anonymous function, like this. (Note that I decreased your data size to 3 groups of 3 to make it easier to see and set a seed so the random number generation is fully reproducible.)
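A minimal sketch of that single-pipe approach, assuming R 4.1 or later (for `|>` and the `\(x)` lambda syntax) and dplyr. The seed value and the argument name `dd` are arbitrary choices for illustration. The means are appended at the bottom to match the `rbind(df1, df2)` in the question; swapping the two arguments to `bind_rows()` would put them on top instead.

```r
library(dplyr)

set.seed(47)  # arbitrary seed so the random draws are reproducible

df <- data.frame(
  name = toupper(rep(letters[1:3], each = 3)),
  x = as.character(rnorm(9, 55, 10))
) |>
  # x arrives as text (as in the question), so convert it once up front
  mutate(x = as.numeric(x)) |>
  # `dd` is the piped data frame; inside the function it can be used twice
  (\(dd) bind_rows(
    dd,
    dd |>
      group_by(name) |>
      summarize(x = mean(x, na.rm = TRUE)) |>
      mutate(name = "Z")
  ))()

df
```

The outer parentheses around the lambda and the trailing `()` are both needed: the base pipe requires a function call on its right-hand side, not a bare function.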
I don't like this much stylistically. I'd do it in two steps: (1) read and clean the data, (2) calculate and append. The base R pipe's placeholder (_) requires a named argument, which bind_rows() doesn't have, so we still need an anonymous function, but I still prefer this way:

If you don't mind the magrittr pipe, you can simplify Step 2 to this:
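The two-step version, with Step 2 written once for each pipe, might look like the sketch below. It assumes R 4.1 or later and dplyr (which re-exports `%>%`, so magrittr does not need to be attached separately). The names `county` and `group_means()` are hypothetical, introduced here only to avoid repeating the summary code.

```r
library(dplyr)

set.seed(47)  # arbitrary seed for reproducibility

# Step 1: read and clean (simulated here, with x stored as text)
county <- data.frame(
  name = toupper(rep(letters[1:3], each = 3)),
  x = as.character(rnorm(9, 55, 10))
) |>
  mutate(x = as.numeric(x))

# Hypothetical helper: per-group means, relabelled as "Z"
group_means <- function(dd) {
  dd |>
    group_by(name) |>
    summarize(x = mean(x, na.rm = TRUE)) |>
    mutate(name = "Z")
}

# Step 2, base pipe: still needs an anonymous function to use the data twice
df_base <- county |>
  (\(dd) bind_rows(dd, group_means(dd)))()

# Step 2, magrittr pipe: the `.` placeholder can appear more than once
df_magrittr <- county %>%
  bind_rows(., group_means(.))
```

Because `.` is passed as a direct argument to `bind_rows()`, magrittr does not also insert the left-hand side as an extra first argument, so both versions should produce the same result.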