I can construct a new data frame from the data below in which each row holds the expectation values for one category of the ID column, taken in ascending order of time. But how can I do this only up to a cut-off point in time? For example, I might want values to be taken in chronological order only until Time = 5.
library('dplyr')
library('purrr')
df <- read.csv("data.csv", header = TRUE)
# df
ID Time Expectation
1 NJL.1 3 0.1
2 NJL.1 1 0.1
3 NJL.1 2 0.1
4 NJL.1 4 0.1
5 NJL.1 6 0.1
6 NJL.1 5 100.0
7 NJL.1 10 0.1
8 NJL.1 8 0.1
9 NJL.1 9 0.1
10 NJL.1 7 0.1
11 NJL.2 10 0.1
12 NJL.2 1 0.1
13 NJL.2 3 0.1
14 NJL.2 6 0.1
15 NJL.2 4 0.1
16 NJL.2 2 6.0
17 NJL.2 5 0.1
18 NJL.2 8 7.0
19 NJL.2 9 8.0
20 NJL.2 7 0.1
21 NJL.3 3 0.1
22 NJL.3 1 0.1
23 NJL.3 2 0.1
24 NJL.3 4 0.1
25 NJL.3 6 0.1
26 NJL.3 5 10.0
27 NJL.3 10 0.1
28 NJL.3 8 0.1
29 NJL.3 9 0.1
30 NJL.3 7 0.1
df <- df %>%
  group_by(ID) %>%
  summarise(var = list(Expectation[order(Time)]),
            var_ts = purrr::map(var, ts))
So, for example, for NJL.1 the values would be (0.1, 0.1, 0.1, 0.1, 100), and all other expectation values would be ignored.
Many thanks!
A data.table approach
Sample data
Code
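A minimal sketch of what this step might look like, assuming the data from the question is rebuilt by hand rather than read from data.csv. The names `DT`, `cutoff` and `DT_cut` are illustrative and not taken from the original answer. The idea is to keep only rows up to the cut-off and then sort by ID and Time:

```r
library(data.table)

# Sample data rebuilt from the table in the question
DT <- data.table(
  ID = rep(c("NJL.1", "NJL.2", "NJL.3"), each = 10),
  Time = c(3, 1, 2, 4, 6, 5, 10, 8, 9, 7,
           10, 1, 3, 6, 4, 2, 5, 8, 9, 7,
           3, 1, 2, 4, 6, 5, 10, 8, 9, 7),
  Expectation = c(0.1, 0.1, 0.1, 0.1, 0.1, 100, 0.1, 0.1, 0.1, 0.1,
                  0.1, 0.1, 0.1, 0.1, 0.1, 6, 0.1, 7, 8, 0.1,
                  0.1, 0.1, 0.1, 0.1, 0.1, 10, 0.1, 0.1, 0.1, 0.1)
)

cutoff <- 5  # hypothetical name for the cut-off time

# Keep rows with Time up to the cut-off, then sort by ID and Time
DT_cut <- DT[Time <= cutoff][order(ID, Time)]
```

Filtering before ordering means the later summarising step only ever sees values at or before the cut-off.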
Now you can easily summarise, paste and collapse, dcast, etc. to get the desired output.
Examples:
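One possible way to summarise, sketched here on a two-ID subset of the question's data. The column names `var` and `var_ts` mirror the question; `res`, `pasted` and `values` are illustrative names. This collects the time-ordered values up to Time = 5 into a list column per ID, and also shows a pasted string version:

```r
library(data.table)

# Subset of the question's data (NJL.1 and NJL.2 only) to keep the sketch short
DT <- data.table(
  ID = rep(c("NJL.1", "NJL.2"), each = 10),
  Time = c(3, 1, 2, 4, 6, 5, 10, 8, 9, 7,
           10, 1, 3, 6, 4, 2, 5, 8, 9, 7),
  Expectation = c(0.1, 0.1, 0.1, 0.1, 0.1, 100, 0.1, 0.1, 0.1, 0.1,
                  0.1, 0.1, 0.1, 0.1, 0.1, 6, 0.1, 7, 8, 0.1)
)

# One row per ID, with the time-ordered values stored in a list column
res <- DT[Time <= 5][order(Time), .(var = list(Expectation)), by = ID]

# Convert each vector to a time series, as in the question
res[, var_ts := lapply(var, ts)]

# Alternative: paste the ordered values into a single string per ID
pasted <- DT[Time <= 5][order(Time),
                        .(values = paste(Expectation, collapse = ", ")),
                        by = ID]
```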
or
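A wide layout with one column per time point is another option. The sketch below assumes `dcast` from data.table and again uses a two-ID subset of the question's data; `wide` is an illustrative name:

```r
library(data.table)

# Subset of the question's data (NJL.1 and NJL.2 only) to keep the sketch short
DT <- data.table(
  ID = rep(c("NJL.1", "NJL.2"), each = 10),
  Time = c(3, 1, 2, 4, 6, 5, 10, 8, 9, 7,
           10, 1, 3, 6, 4, 2, 5, 8, 9, 7),
  Expectation = c(0.1, 0.1, 0.1, 0.1, 0.1, 100, 0.1, 0.1, 0.1, 0.1,
                  0.1, 0.1, 0.1, 0.1, 0.1, 6, 0.1, 7, 8, 0.1)
)

# One row per ID, one column per Time value up to the cut-off
wide <- dcast(DT[Time <= 5], ID ~ Time, value.var = "Expectation")
```

Because `dcast` orders the new columns by Time, no explicit sort is needed for this layout.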