Rolling rowsums in unbalanced panel data in R


I have unbalanced panel data and want to take the rowsum (MRAR) for each observation over the past 36 months as reported in the columns (time-series "dates"):

dput(ER)

NA, NA, NA, NA, NA, NA, NA, -4.91111111111111, NA, NA, -6, 
NA, NA, NA, -1.31111111111111, NA, NA, NA, -5.95555555555556, 
-5.73333333333333, -5.75555555555556, -5.86666666666667, 
-5.33333333333333, -5.35555555555556, NA, -5.22222222222222, 
-5.17777777777778, -5.28888888888889, -5.26666666666667)), 
.Names = c("ER.08.2007", "ER.09.2007", "ER.10.2007", "ER.11.2007", "ER.12.2007", "ER.01.2008", 
"ER.02.2008", "ER.03.2008", "ER.04.2008", "ER.05.2008", "ER.06.2008", 
"ER.07.2008", "ER.08.2008", "ER.09.2008", "ER.10.2008", "ER.11.2008", row.names = c(NA, 
-3530L), class = "data.frame")

str(ER)
'data.frame':   3530 obs. of  120 variables:
 $ ER.08.2007: num  NA NA NA NA NA NA NA NA NA NA ...
 $ ER.09.2007: num  NA NA NA NA NA NA NA NA NA NA ...
 $ ER.10.2007: num  NA NA NA NA NA NA NA NA NA NA ...

I have tried the following:

MRAR_3y <- as.data.frame(mat.or.vec(nrow(ER), length(dates)))

for (i in seq(1,length(dates)-36))
{
  MRAR_3y[,i] <- rowSums(ER[,c(seq(i,(i+35)))], na.rm=FALSE)
}

The desired MRAR_3y data frame should give the sum of the past 36 months' ER. However, the above code returns the following:

> str(MRAR_3y)
'data.frame':   3530 obs. of  120 variables:
 $ V1  : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V2  : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V59 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V60 : num  NA NA NA NA NA NA NA NA NA NA ...
 $ V61 : num  NA NA NA NA -53.9 ...
 $ V62 : num  NA NA NA NA -55.6 ...
 $ V63 : num  NA NA NA NA -53.9 ...
 $ V64 : num  NA NA NA NA -53.7 ...

So there are some values even before the first 36 date columns. There are also some "Inf" entries in the data frame when I run View(MRAR_3y).

This question relates to several threads on rolling sums, e.g. "R dplyr rolling sum".

Help much appreciated, Wilhelm Fantastisch


There are 2 answers

Andrew Gustar

A simple way to do it is by differencing cumulative sums. Here is an example, but you will need to tailor it to your data.

x <- sample(10,100,replace=TRUE)

L <- length(x)
W <- 36

cumsum(x)[-c(1:W)] - cumsum(x)[-c((L-W+1):L)]   

[1] 181 181 179 187 186 182 176 173 181 173 167 170 175 174 175 181 180 184 186... etc
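
As a cross-check of the cumulative-sum idea, here is a minimal sketch on a plain vector. It differs slightly from the line above: a 0 is prepended to the cumulative sums so that the first complete window (elements 1 to W) is also returned. The seed and the names `cs`, `roll` and `naive` are only illustrative. Note that `cumsum` propagates `NA` to every later position, so this trick is probably only suitable for rows without missing values.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
x <- sample(10, 100, replace = TRUE)
W <- 36

# Prepend 0 so that the first full window (elements 1..W) is included.
cs <- c(0, cumsum(x))

# roll[j] is the sum of x[j], ..., x[j + W - 1].
roll <- cs[(W + 1):length(cs)] - cs[1:(length(cs) - W)]

# Cross-check against a direct sum over each window.
naive <- sapply(W:length(x), function(i) sum(x[(i - W + 1):i]))
stopifnot(all(roll == naive))

length(roll)                       # number of complete windows: length(x) - W + 1
```

The differencing needs only one pass over the data, whereas the direct version re-sums every window; for 120 monthly columns either should be fast enough.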
BENY

You can look at rollsum from the zoo package, using Andrew's sample data:

x <- sample(10,100,replace=TRUE)
zoo::rollsum(x, 36)

181 181 179 180 180 182 184 183 181 182 187 189 192 191 187 196 200 201
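
Both answers work on a single vector, while the question has one row per observation, one column per month, and many NAs. As far as I recall, the zoo help page for rollmean/rollsum notes that the default methods do not handle NA inputs and points to rollapply for that case. Below is a base-R sketch of the row-wise version that keeps the `rowSums(..., na.rm = FALSE)` behaviour from the question. `ER_toy`, its column names and the window of 3 are made up for illustration; for the real data one would presumably use `ER` and `W <- 36`. The sum is written to the column where the window ends, which is one reading of "the past 36 months".

```r
# Toy stand-in for ER: 3 rows, 6 monthly columns (hypothetical sizes and values).
ER_toy <- data.frame(
  m1 = c(1, NA, 2), m2 = c(1, 1, 2), m3 = c(1, 1, NA),
  m4 = c(1, 1, 2),  m5 = c(1, 1, 2), m6 = c(1, 1, 2)
)
W <- 3

# Start from an all-NA result with the same shape and column names,
# so columns without a complete window stay NA rather than 0.
roll <- as.data.frame(matrix(NA_real_, nrow(ER_toy), ncol(ER_toy),
                             dimnames = list(NULL, names(ER_toy))))

# Column j holds the sum of columns (j - W + 1)..j, i.e. the W months ending at j.
# With na.rm = FALSE, any NA inside the window makes that sum NA.
for (j in W:ncol(ER_toy)) {
  roll[, j] <- rowSums(ER_toy[, (j - W + 1):j, drop = FALSE], na.rm = FALSE)
}

roll
```

In this layout the first W - 1 columns are always NA, and a row only gets a value in a column if all W months ending there are observed. Setting `na.rm = TRUE` would instead sum whatever is available in each window.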