rolling computation to fill gaps by finding following or previous values in a data.table time series


I have a data.table that looks like this:

tsdata <- data.table(time   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                     signal = c(0, 1, 1, 0, 0, 1, 0, 0, 0, 1))

I am trying to fill the runs of zeros that sit between ones, but only when the run is short. The maximum gap length should be adjustable. In this example, a run of zeros should be filled only if it is at most 2 rows long.

The result should look like this:

tsdata <- data.table(time   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                     signal = c(0, 1, 1, 1, 1, 1, 0, 0, 0, 1))

My real time series data is much bigger than this, so any help is appreciated.


There is 1 answer

G. Grothendieck (best answer)

Group by rleid(signal), then replace each run of zeros with 1 if the run is short (at most 2 rows here) and is not at the beginning or end of the series.

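library(data.table)

# rleid(signal) gives each run of identical values its own group id.
# A run becomes 1 if it is all zeros, has at most 2 rows, and touches
# neither the first nor the last time stamp; otherwise it is kept as is.
tsdata[, signal2 := ifelse(signal[1] == 0 &
                           .N <= 2 &
                           time[1] > min(tsdata$time) &
                           time[.N] < max(tsdata$time), 1, signal),
       by = rleid(signal)]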

tsdata

giving:

    time signal signal2
 1:    1      0       0
 2:    2      1       1
 3:    3      1       1
 4:    4      0       1
 5:    5      0       1
 6:    6      1       1
 7:    7      0       0
 8:    8      0       0
 9:    9      0       0
10:   10      1       1
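
Since the question asks for an adjustable gap length, the same idea can be wrapped in a small function. This is only a sketch, not part of the original answer: the name fill_short_gaps and the argument max_gap are hypothetical, it uses base R's rle() and inverse.rle() instead of grouping by rleid(), and it assumes the rows are already sorted by time and that signal contains only 0 and 1 with no NA values.

```r
library(data.table)

# Hypothetical helper (not from the original answer): set runs of zeros to 1
# when the run has at most `max_gap` elements and is neither the first nor
# the last run of the vector. Assumes x holds only 0/1 and no NAs.
fill_short_gaps <- function(x, max_gap = 2) {
  runs <- rle(x)                      # run lengths and run values
  n    <- length(runs$values)
  idx  <- seq_len(n)
  interior   <- idx > 1 & idx < n     # exclude leading and trailing runs
  short_zero <- runs$values == 0 & runs$lengths <= max_gap & interior
  runs$values[short_zero] <- 1        # fill the qualifying gaps
  inverse.rle(runs)                   # expand back to the original length
}

tsdata <- data.table(time   = 1:10,
                     signal = c(0, 1, 1, 0, 0, 1, 0, 0, 0, 1))

# Apply to the whole column at once; no grouping is needed.
tsdata[, signal2 := fill_short_gaps(signal, max_gap = 2)]
```

With max_gap = 2 this should reproduce the signal2 column shown above; raising max_gap to 3 would also fill rows 7 to 9. Because the work is done once on the run-length encoding rather than once per group, this variant may also scale better on long series, though that has not been benchmarked here.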