Creating a dataframe from an lapply function with different numbers of rows

I have a vector of sample dates (df2) and a separate data frame of weekly dates with a measurement for each date (df1). For each sample date in df2, I need to output the rows of df1 (dates and measurements) that fall within the year prior to that date.

eg1 <- data.frame(Date=seq(as.Date("2008-12-30"), as.Date("2012-01-04"), by="weeks"))
eg2 <- as.data.frame(matrix(sample(0:1000, 79*2, replace=TRUE), ncol=1))
df1 <- cbind(eg1,eg2)
df2 <- as.Date(c("2011-07-04","2010-07-28"))

A similar question I asked previously (Outputting various subsets from one data frame based on dates) was answered effectively for daily data, where every subset has the same number of rows, with this function:

library(lubridate)  # days() is assumed to come from lubridate
output <- as.data.frame(lapply(df2, function(x) {
  df1[difftime(df1[,1], x - days(365)) >= 0 & difftime(df1[,1], x) <= 0, ]
}))

However, with weekly data the subsets can have different numbers of rows, so this is not possible. When the 'as.data.frame' call is removed, the code works but I get a list of data frames. What I would like to do is pad the data frames that contain fewer observations with rows of NAs, so that I can output one data frame and then apply functions that simply ignore the NA values, e.g.:

df2 <- as.Date(c("2011-01-04","2010-07-28"))
output <- as.data.frame(lapply(df2, function(x) {
  df1[difftime(df1[,1], x - days(365)) >= 0 & difftime(df1[,1], x) <= 0, ]
}))
col <- c(2,4)
output_two <- output[,col]
# na.rm must be passed to mean (via apply), not to as.data.frame
Mean <- as.data.frame(apply(output_two, 2, mean, na.rm = TRUE))

There is 1 answer

akrun (best answer)

Try

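library(lubridate)  # days() is assumed to come from lubridate
# One data frame per sample date, covering the preceding 365 days
lst <- lapply(df2, function(x) {
  df1[difftime(df1[,1], x - days(365)) >= 0 & difftime(df1[,1], x) <= 0, ]
})
# Indexing past the last row returns all-NA rows, padding each data frame to n1 rows
n1 <- max(sapply(lst, nrow))
output <- data.frame(lapply(lst, function(x) x[seq_len(n1), ]))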
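
For reference, here is a self-contained sketch of the same padding idea in base R only. It makes a few assumptions that are not in the original posts: a fixed seed, the column name Value, the suffixed column names, and plain Date arithmetic (x - 365) in place of days(365). It uses the first pair of sample dates from the question, whose windows should contain different numbers of weekly rows.

```r
# Rebuild the example data from the question (values are random).
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
dates <- seq(as.Date("2008-12-30"), as.Date("2012-01-04"), by = "week")
df1 <- data.frame(Date = dates,
                  Value = sample(0:1000, length(dates), replace = TRUE))
df2 <- as.Date(c("2011-07-04", "2010-07-28"))

# One data frame per sample date, covering the 365 days up to that date.
# Plain Date arithmetic (x - 365) stands in for lubridate's days(365).
lst <- lapply(df2, function(x) df1[df1$Date >= x - 365 & df1$Date <= x, ])

# Row indices past nrow(x) come back as all-NA rows, which pads the
# shorter data frames to the length of the longest one.
n1 <- max(sapply(lst, nrow))
padded <- lapply(lst, function(x) x[seq_len(n1), ])
output <- do.call(cbind, padded)

# Hypothetical column names: Date_1, Value_1, Date_2, Value_2.
names(output) <- paste0(rep(c("Date", "Value"), times = length(lst)),
                        "_", rep(seq_along(lst), each = 2))

# Column means of the measurement columns, skipping the NA padding.
means <- colMeans(output[, c(2, 4)], na.rm = TRUE)
```

If only summary statistics are needed, the padding step can be skipped: something like sapply(lst, function(d) mean(d$Value)) works directly on the list of data frames, whatever their lengths.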