R data.frame: rowSums of selected columns by grouping vector

Question

R data.frame: rowSums of selected columns by grouping vector

880 views Asked by user28400 At 11 June 2015 at 10:03

I have a data frame with a sequence of numeric columns, surrounded on both sides by (irrelevant) columns of characters. I want to obtain a new data frame that keeps the position of the irrelevant columns, and adds the numeric columns to eachother by a certain grouping vector (or applies some other row-wise function to the data frame, by group). Example:

sample = data.frame(cha1 = c("A","B"),num1=1:2,num2=3:4,num3=11:12,num4=13:14,cha2=c("C","D"))
> sample
  cha1 num1 num2 num3 num4 cha2
1    A    1    3   11   13    C
2    B    2    4   12   14    D

with the goal to obtain

> goal
  cha1 X1 X2 cha2 
1    A  4 24    C
2    B  6 26    D

i.e. I've summed the 4 numeric columns according to the grouping vector gl(2,2,4) = (1,1,2,2) [levels: 1,2]

For a purely numeric data frame I've found the following method:

sample_num = sample[,2:5] #select numeric columns
data.frame(t(apply(sample_num,1,function(row) tapply(row, INDEX=gl(2,2,4),sum))))

I could combine this with re-inserting the character columns to give the intended result, but I'm really looking for a more elegant way. I'm particularly interested in a plyr method if there is one, as I'm trying to migrate to plyr for all my data frame manipulations. I imagine the first step would be to cast the data frame into long format, but I have no idea how to proceed from there.

One 'absolute' requirement is that I cannot do without the gl(n,k,l) method of grouping, as I need this to be applicable to a wide range of data frames and grouping factors.

EDIT: for simplicity assume that I know which columns are the relevant numeric columns. I'm not concerned with how to select them, I'm concerned with how to do my grouped sum without messing up the original data frame structure.

Thanks!

Original Q&A

There are 1 answers

**Nick Fisch** · Answer 1 · 2015-07-16T11:40:40+00:00

Nick Fisch On 16 July 2015 at 11:40

Grpindex<-gl(2,2,4)    
goal<-cbind.data.frame(sample["cha1"],(t(rowsum(t(sample[,2:5]), paste0("X",Grpindex)))),sample["cha2"])

Output:

  cha1 X1 X2 cha2
1    A  4 24    C
2    B  6 26    D

TechQA.

R data.frame: rowSums of selected columns by grouping vector

There are 1 answers

Related Questions in R

Related Questions in DATAFRAME

Related Questions in PLYR

Related Questions in APPLY

Related Questions in R-FACTOR

Popular Questions

Trending Questions