Efficiently finding the count of column values for distinct rows in a dataframe in r

Question

Asked by Shiva on 19 June 2015 at 15:11

Suppose I have a data frame as:

id   value
1    "hi"
1    "hi"
1    "hi again"
1    "hi again"
2    "hello"
2    "hi"

Now I want to count how often each value occurs within each distinct id. The output would look like this:

id    value       Freq
1     "hi"        2
1     "hi again"  2
2     "hello"     1
2     "hi"        1

I tried splitting the data frame by distinct id, getting the frequencies with the table() function on the value column, and appending the id column afterwards. That leaves me with a lot of data frames in memory. Is there a way to get the data frame above without using up memory on all those intermediate data frames? I have almost 5 million rows.

Original Q&A

There is 1 answer

**grrgrrbla** · Accepted Answer · 2015-06-19T15:16:21+00:00

Assuming your data.frame is called df, using data.table:

library(data.table)
setDT(df)[ , .(Freq = .N), by = .(id, value)]
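For anyone who wants to try this on the sample data from the question, here is a minimal sketch. The df built below is only a stand-in for the real data, and res is just an illustrative name. Note that setDT() converts the data.frame to a data.table by reference, so the conversion itself does not copy the data, which matters with millions of rows.

```r
library(data.table)

# Stand-in for the data frame from the question
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2),
  value = c("hi", "hi", "hi again", "hi again", "hello", "hi"),
  stringsAsFactors = FALSE
)

# Convert in place (no copy), then count rows per (id, value) group;
# .N is the number of rows in each group
setDT(df)
res <- df[, .(Freq = .N), by = .(id, value)]
print(res)
```

Groups come back in order of first appearance, so no separate data frames per id are created along the way.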

Using dplyr:

library(dplyr)
group_by(df, id, value) %>% summarise(Freq = n())
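As a minimal sketch on the sample data from the question: count() is dplyr's shorthand for group_by() followed by summarise(n()), and its name argument sets the name of the count column. The df built here is only a stand-in for your real data, and the name argument assumes a reasonably recent dplyr version.

```r
library(dplyr)

# Stand-in for the data frame from the question
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2),
  value = c("hi", "hi", "hi again", "hi again", "hello", "hi"),
  stringsAsFactors = FALSE
)

# One row per distinct (id, value) pair, with the row count in Freq
res <- count(df, id, value, name = "Freq")
print(res)
```

The result is a single ungrouped data frame, so there is nothing left to bind back together afterwards.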

You should pick one of those two packages (dplyr or data.table) and learn it thoroughly. In the long run you will likely use both, but starting with one and really understanding it will help you tremendously. I use both pretty much every time I use R.

dplyr tends to be easier for beginners, so I would read a tutorial on it; the time spent will keep paying off. There is also a great video tutorial, which can be found on this site under The grammar and graphics of datascience.

I personally prefer data.table because it is faster and more flexible. Check the new HTML vignettes and the PDF vignettes here.
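If you cannot install either package, a base R route is possible too. This is not one of the two approaches above but a swapped-in alternative using aggregate(), sketched on the sample data from the question. The helper column Freq and the names df and res are assumptions for illustration, and on 5 million rows this will likely be slower than data.table or dplyr.

```r
# Stand-in for the data frame from the question
df <- data.frame(
  id = c(1, 1, 1, 1, 2, 2),
  value = c("hi", "hi", "hi again", "hi again", "hello", "hi"),
  stringsAsFactors = FALSE
)

# Helper column of ones; summing it per group gives the row count
df$Freq <- 1L
res <- aggregate(Freq ~ id + value, data = df, FUN = sum)

# aggregate() does not return rows sorted by id first, so sort explicitly
res <- res[order(res$id, res$value), ]
rownames(res) <- NULL
print(res)
```

Like the package versions, this works on the whole data frame at once instead of splitting it per id.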
