cbind for multiple table() functions


I'm trying to count the frequency of multiple columns in a data.frame.

I used the table function on each column and bound them all by cbind, and was going to use the aggregate function after to calculate the means by my identifier. Example:

df1
V1       V2     V3
George   Mary   Mary  
George   Mary   Mary
George   Mary   George
Mary     Mary   George
Mary    George  George
Mary   

Frequency <- as.data.frame(cbind(table(df1$V1), table(df1$V2), table(df1$V3)))

row.names V1
George    3
Mary      3
          1
George    1
Mary      4
          1
George    3
Mary      2

The result I get looks like a 2-column data frame, but when I check the dimensions of Frequency, the result implies that only the 2nd column exists.

This causes trouble when I try to rename the columns and run the aggregate function. The error I get when renaming:

colnames(Frequency) <- c("Name", "Frequency")
Error in names(Frequency) <- c("Name", "Frequency") : 
  'names' attribute [2] must be the same length as the vector [1]

The final goal is to run an aggregate command and get the mean by name:

Name.Mean <- aggregate(Frequency$Frequency, list(Frequency$Name), mean)

Desired output:

Name   Mean
George Value
Mary   Value

There are 2 answers

akrun (best answer)

Using mtabulate (data from @user3169080's post)

library(qdapTools)
d1 <- mtabulate(df1)
is.na(d1) <- d1==0 
colMeans(d1, na.rm=TRUE)
# Alice George   Mary 
#  4.0    3.0    2.5 
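
For readers without qdapTools, the same per-name means can be sketched in base R. This only approximates the steps above and is not the package's implementation. The sample data is copied from the df1 shown in the other answer, and the names lv, counts and name_means are made up for this illustration.

```r
# Sample data copied from the df1 printed in the other answer
df1 <- data.frame(
  V1 = c("George", "Mary", "George", "Mary", NA, NA, NA, NA),
  V2 = c("George", "Mary", "George", "Mary", "George", "Mary", NA, NA),
  V3 = c("George", "Alice", "George", "Alice", "George", "Alice", "George", "Alice"),
  stringsAsFactors = FALSE
)

# All distinct names across the columns (sort() drops NA by default)
lv <- sort(unique(unlist(df1, use.names = FALSE)))

# Count every name in every column: rows are names, columns are V1..V3
counts <- sapply(df1, function(x) tabulate(factor(x, levels = lv), nbins = length(lv)))
rownames(counts) <- lv

# Treat "name does not occur in this column" as missing rather than zero
counts[counts == 0] <- NA

# Mean count per name over the columns where the name occurs
name_means <- rowMeans(counts, na.rm = TRUE)
print(name_means)
```

Setting zero counts to NA mirrors the is.na(d1) <- d1==0 line above: a name that never appears in a column is left out of that name's mean instead of pulling it down.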

Joswin K J

I hope this is what you were looking for:

> df1
  V1     V2     V3
1 George George George
2   Mary   Mary  Alice
3 George George George
4   Mary   Mary  Alice
5   <NA> George George
6   <NA>   Mary  Alice
7   <NA>   <NA> George
8   <NA>   <NA>  Alice
> ll <- unlist(lapply(df1, table))   # per-column counts, named like "V1.George"
> nn <- names(ll)
> nn1 <- substr(nn, 4, nchar(nn))    # drop the 3-character "V1." prefix
> mm <- data.frame(ll)
> mm$names <- nn1
> Mean <- tapply(mm$ll, mm$names, mean)
> data.frame(Mean)
       Mean
Alice   4.0
George  3.0
Mary    2.5
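
The substr step above relies on every column name being exactly two characters long, so that the "V1." prefix is three characters. A possible variant that avoids parsing the combined names at all is sketched below, assuming the same df1 as printed above. The names tabs, counts and name_of are illustrative only.

```r
# Same sample data as the df1 printed above
df1 <- data.frame(
  V1 = c("George", "Mary", "George", "Mary", NA, NA, NA, NA),
  V2 = c("George", "Mary", "George", "Mary", "George", "Mary", NA, NA),
  V3 = c("George", "Alice", "George", "Alice", "George", "Alice", "George", "Alice"),
  stringsAsFactors = FALSE
)

# One frequency table per column (table() skips NA by default)
tabs <- lapply(df1, table)

# Flatten the counts and, separately, the name each count belongs to
counts  <- unlist(tabs, use.names = FALSE)
name_of <- unlist(lapply(tabs, names), use.names = FALSE)

# Mean count per name across the columns in which it appears
Mean <- tapply(counts, name_of, mean)
print(data.frame(Mean))
```

The two-column layout the question was after could then be built with data.frame(Name = name_of, Frequency = counts). In the question's attempt the names appear to have ended up as row names rather than as a column, which would explain why only one column could be renamed.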