I have unbalanced panel data and want to compute the row sum (MRAR) for each observation over the past 36 months, where the months are the columns (time series "dates"):
dput(ER)
NA, NA, NA, NA, NA, NA, NA, -4.91111111111111, NA, NA, -6,
NA, NA, NA, -1.31111111111111, NA, NA, NA, -5.95555555555556,
-5.73333333333333, -5.75555555555556, -5.86666666666667,
-5.33333333333333, -5.35555555555556, NA, -5.22222222222222,
-5.17777777777778, -5.28888888888889, -5.26666666666667)),
.Names = c("ER.08.2007", "ER.09.2007", "ER.10.2007", "ER.11.2007", "ER.12.2007", "ER.01.2008",
"ER.02.2008", "ER.03.2008", "ER.04.2008", "ER.05.2008", "ER.06.2008",
"ER.07.2008", "ER.08.2008", "ER.09.2008", "ER.10.2008", "ER.11.2008", row.names = c(NA,
-3530L), class = "data.frame")
str(ER)
'data.frame': 3530 obs. of 120 variables:
$ ER.08.2007: num NA NA NA NA NA NA NA NA NA NA ...
$ ER.09.2007: num NA NA NA NA NA NA NA NA NA NA ...
$ ER.10.2007: num NA NA NA NA NA NA NA NA NA NA ...
I have tried the following:
MRAR_3y <- as.data.frame(mat.or.vec(nrow(ER), length(dates)))
for (i in seq(1, length(dates) - 36)) {
  MRAR_3y[, i] <- rowSums(ER[, c(seq(i, (i + 35)))], na.rm = FALSE)
}
The desired MRAR_3y data frame should give the sum of the past 36 months' ER. However, the above code returns the following:
> str(MRAR_3y)
'data.frame': 3530 obs. of 120 variables:
$ V1 : num NA NA NA NA NA NA NA NA NA NA ...
$ V2 : num NA NA NA NA NA NA NA NA NA NA ...
$ V59 : num NA NA NA NA NA NA NA NA NA NA ...
$ V60 : num NA NA NA NA NA NA NA NA NA NA ...
$ V61 : num NA NA NA NA -53.9 ...
$ V62 : num NA NA NA NA -55.6 ...
$ V63 : num NA NA NA NA -53.9 ...
$ V64 : num NA NA NA NA -53.7 ...
So there are some values even before the first 36 date columns. There are also some "Inf" entries in the data frame when I run View(MRAR_3y).
This question relates to several threads on rolling sums, e.g. "R dplyr rolling sum".
Help much appreciated, Wilhelm Fantastisch
A simple way to do it is by differencing cumulative sums. Here is an example, but you will need to tailor it to your data.
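A minimal sketch of the cumulative-sum idea, using a small made-up data frame (`ER_toy`) and a hypothetical helper `roll_sum_cumsum`; neither comes from the question. It assumes at least two columns, treats a window as NA whenever any month in it is missing (matching `na.rm = FALSE`), and writes each sum into the column of the window's last month, so the first `k - 1` columns stay NA.

```r
# Made-up toy data: 3 observations (rows) x 6 months (columns).
ER_toy <- data.frame(
  m1 = c(1, NA, 2),
  m2 = c(2, 1, 2),
  m3 = c(3, 1, NA),
  m4 = c(4, 1, 2),
  m5 = c(5, 1, 2),
  m6 = c(6, 1, 2)
)

# Hypothetical helper: trailing k-column row sums via differenced cumulative sums.
roll_sum_cumsum <- function(df, k) {
  x <- as.matrix(df)
  n <- ncol(x)
  stopifnot(n >= 2, k >= 1, k <= n)

  # cumsum() would propagate an NA to every later column, so zero the NAs
  # and track them separately with a running count.
  miss <- is.na(x)
  x[miss] <- 0

  # Row-wise cumulative sums with a leading zero column:
  # cs[, j + 1] holds the sum of columns 1..j.
  cs <- cbind(0, t(apply(x, 1, cumsum)))
  cn <- cbind(0, t(apply(miss, 1, cumsum)))

  # The window ending at column j covers columns (j - k + 1)..j.
  idx_end   <- (k + 1):(n + 1)
  idx_start <- 1:(n - k + 1)
  sums <- cs[, idx_end, drop = FALSE] - cs[, idx_start, drop = FALSE]
  n_na <- cn[, idx_end, drop = FALSE] - cn[, idx_start, drop = FALSE]
  sums[n_na > 0] <- NA  # any missing month invalidates the window

  out <- matrix(NA_real_, nrow(x), n, dimnames = dimnames(x))
  out[, k:n] <- sums
  as.data.frame(out)
}

res <- roll_sum_cumsum(ER_toy, k = 3)
print(res)
```

For the data in the question, the equivalent call would presumably be `MRAR_3y <- roll_sum_cumsum(ER, 36)`. Because the result comes from subtracting two running totals, tiny floating-point differences from a direct `rowSums` over each window are possible.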