I have a huge data set but I will explain the problem based on an example.
In the below table I have a Unique ID
, I have a Date
(time series for each Unique ID
) and status of that Unique ID
.
Unique ID Date Status
1 Jan-06 Active
1 Feb-06 Active
1 Mar-06 Not active
1 Apr-06 Stable
1 May-06 Active
1 Jun-06 Stable
1 Jul-06 Active
1 Aug-06 Active
1 Sep-06 Active
2 Oct-06 Active
2 Nov-06 Not active
2 Dec-06 Stable
2 Jan-07 Active
2 Feb-07 Stable
2 Mar-07 Active
2 Apr-07 Active
2 May-07 Active
Result I am trying to get to is to capture the date of next event of happening (Not active or stable )
If you see below unique ID 1 was active as of Jan 2006, we need to capture when it hit not active or stable, it hit some time in March 06.
If you see below unique ID 1 was active as of May 2006, we need to capture when it hit not active or stable, it hit some time in Jun 06.
Note: No Need to add any date for IDs which are already in not active or stable
Unique ID Date Status Result
1 Jan-06 Active Mar-06
1 Feb-06 Active Apr-06
1 Mar-06 Not active NA
1 Apr-06 Stable NA
1 May-06 Active Jun-06
1 Jun-06 Stable NA
1 Jul-06 Active Always active
1 Aug-06 Active Always active
1 Sep-06 Active Always active
2 Oct-06 Active Nov-06
2 Nov-06 Not active NA
2 Dec-06 Stable NA
2 Jan-07 Active Feb-07
2 Feb-07 Stable NA
2 Mar-07 Active Always active
2 Apr-07 Active Always active
2 May-07 Active Always active
Generate example data:
Replace the last "Active" observation for each
id
with "Always Active" in copies ofdate
andstatus
variables:Use
which()
to get positions of non-"Active" observations (regardless ofid
), and usediff()
andrep()
to create a vector of positions of the next non-"Active" event for the group of prior "Active" events. Then get the date of the next non-active event for "Active" events:Print the data by
id
(usingby()
):