Obtaining descriptive statistics of observations with years of complete data in R


I have the following panel dataset:

id year Value
1  1     50
2  1     55
2  2     40
3  1     48
3  2     54
3  3     24
4  2     24
4  3     57
4  4     30

I would like to obtain descriptive statistics on the number of years for which each individual has information available. For example: the number of individuals with only one year of information is 1, the number of individuals with two years of information is 1, and the number of individuals with three years of information is 2.

There are 3 answers

Answer by lmo (BEST ANSWER)

In base R, using table and its faster cousin tabulate:

table(tabulate(dat$id))

1 2 3 
1 1 2 

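or, if the ids are not consecutive positive integers (tabulate would report zero counts for any gaps):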

table(table(dat$id))

Convert to a data.frame:

data.frame(table(tabulate(dat$id)))
  Var1 Freq
1    1    1
2    2    1
3    3    2
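
As a minimal sketch of the approach above, the question's data can be rebuilt and counted as follows. The name `dat` follows this answer; `years_per_id`, `counts`, `distinct_years`, and `counts_distinct` are illustrative names only. The last step is an extra assumption not in the original: if an id-year pair could appear more than once, counting distinct years rather than rows may be safer.

```r
# Rebuild the example panel from the question.
dat <- data.frame(
  id    = c(1, 2, 2, 3, 3, 3, 4, 4, 4),
  year  = c(1, 1, 2, 1, 2, 3, 2, 3, 4),
  Value = c(50, 55, 40, 48, 54, 24, 24, 57, 30)
)

# Rows per id, then how many ids share each row count.
years_per_id <- table(dat$id)
counts <- table(years_per_id)
print(counts)

# Assumption: guard against repeated id-year rows by counting distinct years.
distinct_years <- tapply(dat$year, dat$id, function(y) length(unique(y)))
counts_distinct <- table(distinct_years)
print(counts_distinct)
```

On the example data both tables agree, because every id-year pair occurs exactly once.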

Answer by d.b

lapply(split(df$id, ave(df$year, df$id, FUN = length)), function(x) length(unique(x)))
#$`1`
#[1] 1

#$`2`
#[1] 1

#$`3`
#[1] 2
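
The one-liner above can be unpacked into its intermediate steps, which may make it easier to follow. This is only a sketch on the question's data; `n_years`, `groups`, and `result` are illustrative names, and `sapply` is swapped in for `lapply` so the result is a named vector rather than a list.

```r
# Example panel from the question (only id and year are needed here).
df <- data.frame(
  id   = c(1, 2, 2, 3, 3, 3, 4, 4, 4),
  year = c(1, 1, 2, 1, 2, 3, 2, 3, 4)
)

# Step 1: for every row, the number of rows that share that row's id.
n_years <- ave(df$year, df$id, FUN = length)

# Step 2: group the ids by how many rows (years) they have.
groups <- split(df$id, n_years)

# Step 3: count the distinct ids inside each group.
result <- sapply(groups, function(x) length(unique(x)))
print(result)
```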

Answer by akrun

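We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1). Grouped by 'id', count the unique 'year' values with uniqueN; this count lands in the default column 'V1'. Then, grouped by 'V1', get the number of rows (.N), which is the number of ids with that many years.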

library(data.table)
setDT(df1)[,  uniqueN(year), .(id)][, .N, V1]
#   V1 N
#1:  1 1
#2:  2 1
#3:  3 2