Grouping & summarizing data frame by multiple different columns in R

Question

Grouping & summarizing data frame by multiple different columns in R

1.2k views Asked by vagabond At 15 February 2016 at 16:21

I don't know if I am not searching with the right terms but I can't find a post about this.

I have a df :

df <- data.frame(grouping_letter = c('A', 'A', 'B', 'B', 'C', 'C'), grouping_animal = c('Cat', 'Dog', 'Cat', 'Dog', 'Cat', 'Dog'), value = c(1,2,3,4,5,6))

I want to group by grouping_letter and by grouping_animal. I want to do this using dplyr.

If I did it separately, it would be :

df %>% group_by(grouping_letter) %>% summarise(sum(value))

df %>% group_by(grouping_animal) %>% summarise(sum(value))

Now let's say, I have hundreds of columns I need to group by individually. How can I do this?

I was trying:

results <- NULL for (i in grouping_columns) { results[[i]] <- df %>% group_by(df$i) %>% summarize(sum(value)) }

I got a list called results with the output. I am wondering if there is a better way to do this instead of using a for-loop?

Original Q&A

There are 1 answers

**akrun** · Answer 1 · 2016-02-15T16:25:05+00:00

We can create an index of 'grouping' colums (using grep), loop over the index (with lapply) and separately get the sum of 'value' after grouping by the column in the 'index'.

 library(dplyr)
 i1 <- grep('grouping', names(df))
 lapply(i1, function(i) 
        df[setdiff(seq_along(df), i)] %>% 
                     group_by_(.dots=names(.)[1]) %>% 
                     summarise(Sumvalue= sum(value)))
#[[1]]
#Source: local data frame [2 x 2]

#  grouping_animal Sumvalue
#           (fctr)    (dbl)
#1             Cat        9
#2             Dog       12

#[[2]]
#Source: local data frame [3 x 2]

#  grouping_letter Sumvalue
#           (fctr)    (dbl)
#1               A        3
#2               B        7
#3               C       11

Or we can do this by converting the dataset from 'wide' to 'long' format, then group by the concerned columns and get the sum of 'value'.

library(tidyr)
gather(df, Var, Group, -value) %>% 
         group_by(Var, Group) %>% 
         summarise(Sumvalue = sum(value))
#              Var Group Sumvalue
#            (chr) (chr)    (dbl)
#1 grouping_animal   Cat        9
#2 grouping_animal   Dog       12
#3 grouping_letter     A        3
#4 grouping_letter     B        7
#5 grouping_letter     C       11

TechQA.

Grouping & summarizing data frame by multiple different columns in R

There are 1 answers

Related Questions in R

Related Questions in DPLYR

Related Questions in APPLY

Related Questions in GROUP-SUMMARIES

Popular Questions

Popular Tags

Trending Questions