Build a data frame with overlapping observations


Let's say I have a data frame with the following structure:

> DF <- data.frame(x=1:5, y=6:10)
> DF
  x  y
1 1  6
2 2  7
3 3  8
4 4  9
5 5 10

I need to build a new data frame with overlapping observations from the first data frame, to be used as input for building the A matrix for the Rglpk optimization library. I would use observation windows of length n, so that if n=2 the resulting data frame would join rows 1 and 2, rows 2 and 3, rows 3 and 4, and so on. The number of rows in the resulting data frame would be

(numberOfObservations-windowSize+1)*windowSize

The result for this example with windowSize=2 would be a structure like

  x  y
1 1  6
2 2  7
3 2  7
4 3  8
5 3  8
6 4  9
7 4  9
8 5 10

I could do a loop like

windowSize <- 2
DFResult <- NULL
numBlocks <- nrow(DF)-windowSize+1
for (i in 1:numBlocks) {
    # append rows i..(i+windowSize-1) as the next block
    DFResult <- rbind(DFResult, DF[i:(i+windowSize-1), ])
}

But this seems very inefficient, especially for very large data frames.
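
Much of the cost in a loop like this typically comes from growing DFResult with rbind on every iteration. A minimal sketch of one common workaround, assuming the same DF and windowSize as above: collect the blocks in a list and bind them once at the end. The name blocks is introduced here only for illustration.

```r
DF <- data.frame(x = 1:5, y = 6:10)
windowSize <- 2
numBlocks <- nrow(DF) - windowSize + 1

# Build every block first, then bind them in a single rbind call
blocks <- lapply(seq_len(numBlocks), function(i) DF[i:(i + windowSize - 1), ])
DFResult <- do.call(rbind, blocks)
row.names(DFResult) <- NULL  # reset the row names to 1..nrow
```

This still creates one small data frame per window, so it is only a partial improvement over the loop.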

I also tried

library(zoo)  # rollapply comes from the zoo package
rollapply(data=DF, width=windowSize, FUN=function(x) x, by.column=FALSE, by=1)
     x y
[1,] 1 6
[2,] 2 7
[3,] 2 7
[4,] 3 8

where I was trying to repeat a block of rows without applying any aggregate function. This does not work, since some rows are missing from the result.

I am a bit stumped by this and have looked around for similar problems but could not find any. Does anyone have any better ideas?

There is 1 answer

akrun (best answer)

We could use a vectorized approach that indexes the rows directly. For windowSize=2, interleave each row index with its successor:

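i1 <- seq_len(nrow(DF))
# rbind() stacks 1..(n-1) over 2..n; c() reads that matrix column by column,
# giving the row indices 1, 2, 2, 3, 3, 4, ...
res <- DF[c(rbind(i1[-length(i1)], i1[-1])), ]
row.names(res) <- NULL
res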
#  x  y
#1 1  6
#2 2  7
#3 2  7
#4 3  8
#5 3  8
#6 4  9
#7 4  9
#8 5 10
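
The indexing above pairs each row with the next one, so it covers windowSize=2. A minimal sketch of how the same idea might extend to any window size, assuming a hypothetical helper named overlapping_windows (not part of the original answer): build a matrix of row indices with one column per window, then flatten it column by column.

```r
# Hypothetical helper: stack all overlapping windows of `windowSize`
# consecutive rows of `df` into one data frame.
overlapping_windows <- function(df, windowSize) {
  numBlocks <- nrow(df) - windowSize + 1
  stopifnot(windowSize >= 1, numBlocks >= 1)
  starts  <- seq_len(numBlocks)        # first row of each window
  offsets <- seq_len(windowSize) - 1   # 0, 1, ..., windowSize - 1
  # windowSize x numBlocks matrix; column j holds the rows of window j
  idx <- outer(offsets, starts, "+")
  res <- df[as.vector(idx), , drop = FALSE]
  row.names(res) <- NULL
  res
}

DF <- data.frame(x = 1:5, y = 6:10)
res2 <- overlapping_windows(DF, 2)  # same rows as `res` above
res3 <- overlapping_windows(DF, 3)  # rows 1:3, 2:4, 3:5
```

The result has (numberOfObservations-windowSize+1)*windowSize rows, matching the formula in the question, and it is produced with a single subsetting call instead of repeated rbind calls.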