I am a beginner and not too familiar with the advanced features of R. I am unable to understand why Reduce() doesn't work for a grouped_df. I am building upon my discussion at "Rowwise summation for Tibble datatype", where I posted Reduce() as one of the solutions when the class of the data is "tbl_df" "tbl" "data.frame".
Here's the sample data:
library(dplyr)

# as_tibble() gives df the "tbl_df" class described below
df <- as_tibble(data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                           year = rep(c(2014, 2013, 2012), each = 3),
                           rev1 = rep(c(10, 20, 30), 3),
                           rev2 = rep(c(10, 20, 30), 3)))
where class(df) is "tbl_df" "tbl" "data.frame".
I now convert df to class grouped_df with:
df1 <- df %>%
group_by(client, year,rev1) %>%
summarise(rev3 = sum(rev1,rev2)) %>%
select(client, year, rev3, rev1)
where class(df1) is "grouped_df" "tbl_df" "tbl" "data.frame", which is as expected.
Now, when I use Reduce() to do row-wise summation on df1, it throws an error:
df1%>% dplyr::mutate(sum=Reduce("+",.[3:4]))
Error: incompatible size (9), expecting 1 (the group size) or 1
However, when I convert df1 to a data frame, it works well:
df1%>% dplyr::as_data_frame() %>% dplyr::mutate(sum=Reduce("+",.[3:4]))
The head() of the above output is:
# A tibble: 6 × 5
    client  year  rev3  rev1   sum
    <fctr> <dbl> <dbl> <dbl> <dbl>
1 Client A  2012    20    10    30
2 Client A  2013    20    10    30
3 Client A  2014    20    10    30
4 Client B  2012    40    20    60
5 Client B  2013    40    20    60
6 Client B  2014    40    20    60
...
Can someone please explain why Reduce() doesn't work for grouped data but works for non-grouped data? Maybe I am missing something here.
Reduce() and replace() work on vectors. The grouped data frame df1 is much more than a collection of vectors. Below is what it looks like if you expand the object in the environment pane.
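The same thing can be inspected from the console. The following is only a minimal sketch, assuming dplyr 1.0 or later (for the .groups argument) and the sample data from the question rebuilt with tibble(). The names row_sums, err and result are hypothetical, introduced just for illustration. It suggests where the size mismatch comes from: inside the pipe, `.` stands for the whole of df1, so Reduce() returns one value per row of the table, while mutate() on a grouped_df evaluates each expression once per group and expects a result of the group's size.

```r
library(dplyr)

# Sample data from the question, rebuilt as a tibble.
df <- tibble(
  client = rep(c("Client A", "Client B", "Client C"), 3),
  year   = rep(c(2014, 2013, 2012), each = 3),
  rev1   = rep(c(10, 20, 30), 3),
  rev2   = rep(c(10, 20, 30), 3)
)

# .groups = "drop_last" is the default; it is spelled out here so that it is
# visible that the result stays grouped (by client and year).
df1 <- df %>%
  group_by(client, year, rev1) %>%
  summarise(rev3 = sum(rev1, rev2), .groups = "drop_last") %>%
  select(client, year, rev3, rev1)

class(df1)       # includes "grouped_df"
group_vars(df1)  # the grouping columns that remain
n_groups(df1)    # how many groups mutate() will evaluate over

# `.` in the question's call is the whole table, so Reduce() yields one value
# per row of df1, not one value per row of the current group.
row_sums <- Reduce("+", df1[3:4])
length(row_sums)

# The grouped call fails because of that size mismatch; capture the error
# instead of stopping the script.
err <- tryCatch(
  df1 %>% mutate(sum = Reduce("+", .[3:4])),
  error = function(e) e
)

# Dropping the grouping first makes the sizes agree.
result <- df1 %>%
  ungroup() %>%
  mutate(sum = Reduce("+", .[3:4]))
```

Here every group contains a single row, so mutate() expects a result of size 1 per group, while Reduce() on the full table hands back one value for every row.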
If we add an ungroup(), we get a collection of vectors back. In any case, could this dplyr code work instead?