Create counter of consecutive runs of a certain value


I have data where consecutive runs of zero are separated by runs of non-zero values. I want to create a counter for the runs of zero in the column 'SOG'.

For the first sequence of 0 in SOG, set the counter in column Stops to 1. For the second run of zeros, set 'Stops' to 2, and so on.

SOG Stops
--- -----
4   0
4   0
0   1
0   1
0   1
3   0
4   0
5   0
0   2
0   2
1   0
2   0
0   3
0   3
0   3

There are 4 answers

Roland (BEST ANSWER)
SOG <- c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0)
# run-length encode the vector
tmp <- rle(SOG)
# mark which runs consist of zeros (TRUE) and which do not (FALSE)
tmp$values <- tmp$values == 0
# number the zero runs 1, 2, 3, ...; the assignment coerces the remaining FALSE values to 0
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values])
# invert the run-length encoding to get back a vector as long as SOG
inverse.rle(tmp)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
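
The same steps can be wrapped in a small reusable function. This is only a sketch: the name zero_run_id and its target argument are hypothetical and not part of the answer, and NA values are not handled.

```r
# Hypothetical helper: numbers consecutive runs of `target` as 1, 2, 3, ...
# and returns 0 everywhere else. Name and signature are assumptions.
zero_run_id <- function(x, target = 0) {
  runs <- rle(x == target)            # runs of TRUE (target) and FALSE (other)
  is_target <- runs$values
  # TRUE runs get their running count, FALSE runs get 0
  runs$values <- ifelse(is_target, cumsum(is_target), 0L)
  inverse.rle(runs)                   # expand back to the length of x
}

SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)
zero_run_id(SOG)
# [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
```

Because rle() works on whole runs, a series that begins with zeros is numbered from 1 as well.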
akrun

Try:

 # example data from the question
 df <- data.frame(SOG = c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0))
 df$stops <- with(df, cumsum(c(0, diff(!SOG)) > 0) * !SOG)
 df$stops
 # [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
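
One case worth checking with a diff-based expression like this is a series that starts with zeros, because the leading run has no FALSE-to-TRUE change in front of it. The sketch below uses made-up data (SOG2), and the variant is an assumption added for illustration, not part of the answer.

```r
SOG2 <- c(0, 0, 3, 0, 0)   # made-up data that begins with a run of zeros
is_zero <- SOG2 == 0

# Expression from the answer: the leading run is not counted
from_answer <- cumsum(c(0, diff(!SOG2)) > 0) * is_zero
from_answer
# [1] 0 0 0 1 1

# Variant (assumption): pretend the value before the first element was
# non-zero, so a leading run of zeros also registers as a start
variant <- cumsum(diff(c(FALSE, is_zero)) > 0) * is_zero
variant
# [1] 1 1 0 2 2
```

For data that starts with a non-zero value, as in the question, both expressions give the same result.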
Pat W.

Using dplyr:

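 library(dplyr)
 # the outputs in this answer use a 14-element SOG with a single leading 4
 df <- data.frame(SOG = c(4,0,0,0,3,4,5,0,0,1,2,0,0,0))
 df <- df %>% mutate(Stops = ifelse(SOG == 0, yes = cumsum(c(0, diff(!SOG) > 0)), no = 0))
 df$Stops
 #[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3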

EDIT: As an aside for those of us who are still beginners, many of the answers to this question rely on logicals (i.e. TRUE and FALSE). Applying ! to a numeric vector such as SOG coerces it to logical and negates it, so the result is TRUE wherever the value is 0 and FALSE otherwise.

SOG
#[1] 4 0 0 0 3 4 5 0 0 1 2 0 0 0
!SOG
#[1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
#[12]  TRUE  TRUE  TRUE

diff() takes the difference between each value and the one before it. Note that the result has one fewer element than SOG, since the first element has no predecessor from which to compute a difference. Applied to logicals, diff(!SOG) gives 1 where a run of zeros starts (TRUE - FALSE), -1 where it ends (FALSE - TRUE), and 0 where nothing changes.

diff(SOG)
#[1] -4  0  0  3  1  1 -5  0  1  1 -2  0  0
diff(!SOG)
#[1]  1  0  0 -1  0  0  1  0 -1  0  1  0  0

So diff(!SOG) > 0 is TRUE only at the FALSE-to-TRUE changes, i.e. where a run of zeros starts, and cumsum() counts those starts:

cumsum(diff(!SOG) > 0)
#[1] 1 1 1 1 1 1 2 2 2 2 3 3 3

But since the vector of differences is one element shorter than SOG, we prepend a 0 so the lengths match:

cumsum(c(0, diff(!SOG) > 0))  #Or cumsum( c(0, diff(!SOG)) > 0 ) 
#[1] 0 1 1 1 1 1 1 2 2 2 2 3 3 3

Then either multiply that vector by !SOG, as in @akrun's answer, or use ifelse(). Where an element of SOG is 0, we take the corresponding element of cumsum(c(0, diff(!SOG) > 0)); where it is not 0, we assign 0.
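
Putting the steps above together, here is a minimal sketch on the same 14-element SOG used in this answer's outputs. The intermediate names (is_zero, run_start, counter, by_product, by_ifelse) are introduced here only for illustration.

```r
SOG <- c(4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)

is_zero   <- !SOG                        # TRUE where SOG is 0
run_start <- c(0, diff(is_zero)) > 0     # TRUE where a run of zeros begins
counter   <- cumsum(run_start)           # running count of runs seen so far

by_product <- counter * is_zero              # zero out the non-zero positions
by_ifelse  <- ifelse(SOG == 0, counter, 0)   # same thing with ifelse()

by_product
# [1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3
all(by_product == by_ifelse)
# [1] TRUE
```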

Ronak Shah

A one-liner with rle would be:

df <- data.frame(SOG = c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0))
df <- transform(df, Stops = with(rle(SOG == 0), rep(cumsum(values) * values, lengths)))
df

#   SOG Stops
#1    4     0
#2    4     0
#3    0     1
#4    0     1
#5    0     1
#6    3     0
#7    4     0
#8    5     0
#9    0     2
#10   0     2
#11   1     0
#12   2     0
#13   0     3
#14   0     3
#15   0     3
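
To see what the one-liner does, it may help to look at the pieces of the rle object one at a time. The variable names runs and ids below are introduced only for this sketch.

```r
SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)

runs <- rle(SOG == 0)    # runs of "is zero" / "is not zero"
runs$lengths
# [1] 2 3 3 2 2 3
runs$values
# [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE

# cumsum() counts the TRUE runs; multiplying by values resets FALSE runs to 0
ids <- cumsum(runs$values) * runs$values
ids
# [1] 0 1 0 2 0 3

# repeat each run's id as many times as the run is long
rep(ids, runs$lengths)
# [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
```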