Two-dimensional table of means in summaryBy in R


Let's take this code:

library(doBy)
tab <- summaryBy(x ~ A + B, df)

It computes the mean of x for each combination of A and B.

How can I turn tab into a two-dimensional table with A in rows and B in columns, so that each cell holds the mean of x for the group with that combination of A and B?


There are 2 answers

IRTFM (best answer)

I would skip the intermediate step and use a function that is designed to deliver what you want, namely tapply. Using agstudy's data:

> x.mean <- with(dat, tapply(x, list(A=A,B=B), mean))
> x.mean
   B
A           2         3         4         5
  1 0.3671088 0.4531040 0.5942483 0.8013453
  2 0.4776386 0.6115361 0.7907584 0.6607741
  3 0.3966482 0.3447879 0.4372367 0.4914243
  4 0.2779789 0.6780573 0.4087858 0.4205421
  5 0.6288597 0.6924584 0.6508705 0.5648296

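If you really want to keep the intermediate step, you can still use tapply, passing either c or I as the function. Each A/B combination appears only once in tab, so the function just returns the single value and tapply does the rearrangement: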

> with(tab, tapply(x.mean, list(A=A,B=B), c))
   B
A           2         3         4         5
  1 0.3671088 0.4531040 0.5942483 0.8013453
  2 0.4776386 0.6115361 0.7907584 0.6607741
  3 0.3966482 0.3447879 0.4372367 0.4914243
  4 0.2779789 0.6780573 0.4087858 0.4205421
  5 0.6288597 0.6924584 0.6508705 0.5648296

> with(tab, tapply(x.mean, list(A,B), I))
          2         3         4         5
1 0.3671088 0.4531040 0.5942483 0.8013453
2 0.4776386 0.6115361 0.7907584 0.6607741
3 0.3966482 0.3447879 0.4372367 0.4914243
4 0.2779789 0.6780573 0.4087858 0.4205421
5 0.6288597 0.6924584 0.6508705 0.5648296
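
As a self-contained illustration of the tapply approach, here is a minimal sketch on toy data. The data frame toy and the object means are made up for this example and are not taken from the thread; the values are chosen so the cell means are easy to check by hand.

```r
## Hypothetical toy data: two levels of A, two levels of B, two rows per cell
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  A = c(1, 1, 1, 1, 2, 2, 2, 2),
  B = c(10, 10, 20, 20, 10, 10, 20, 20)
)

## Mean of x for every A/B combination, as a matrix with A in rows and B in columns
means <- with(toy, tapply(x, list(A = A, B = B), mean))

## Cells are addressed by the group labels, which tapply stores as character dimnames
means["2", "20"]   # 7.5, the mean of 7 and 8
```

A combination of A and B that never occurs in the data shows up as NA in the result.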

agstudy

If I understand correctly, you want to reshape your data from long to wide format. One option is dcast from the reshape2 package:

dat.s <- summaryBy(x~B+A, data=dat)
library(reshape2)
dcast(dat.s, A~B, value.var="x.mean")

For example:

## create some data
set.seed(1)
dat <- data.frame(x=runif(100),
                  A=sample(1:5,100,replace=TRUE),
                  B=sample(2:5,100,replace=TRUE))

library(doBy)
dat.s <- summaryBy(x~B+A, data=dat)   ## one row per combination, means in x.mean
dat.s <- round(dat.s,2)               ## round for more compact output
library(reshape2)
dcast(dat.s, A~B, value.var="x.mean") ## A in rows, B in columns

  A    2    3    4    5
1 1 0.37 0.45 0.59 0.80
2 2 0.48 0.61 0.79 0.66
3 3 0.40 0.34 0.44 0.49
4 4 0.28 0.68 0.41 0.42
5 5 0.63 0.69 0.65 0.56
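
The reshaping step can also be tried in isolation. The sketch below assumes reshape2 is installed and uses a hand-built data frame, tab_long, as a hypothetical stand-in for the summaryBy output (it only borrows the x.mean column name). The names tab_long, wide and m are invented for this illustration.

```r
library(reshape2)

## Hypothetical stand-in for the long table: one row per A/B combination
tab_long <- data.frame(
  A      = c(1, 1, 2, 2),
  B      = c(10, 20, 10, 20),
  x.mean = c(1.5, 3.5, 5.5, 7.5)
)

## A in rows, B in columns; value.var names the column that fills the cells
wide <- dcast(tab_long, A ~ B, value.var = "x.mean")

## Optional: turn the data frame into a numeric matrix labelled by A and B
m <- as.matrix(wide[, -1])
rownames(m) <- wide$A
m["2", "20"]   # 7.5
```

Naming value.var explicitly keeps dcast from having to guess which column holds the values, which matters once the long table has more than one summary column.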