Reshape start-end time intervals to smaller intervals in R


Here is some duration data, given as start-end time intervals.

id <- c("A", "B", "B", "B", "C", "C", "D", "E", "F", "F", "F", "F")
start <- c(368, 200, 230, 788, 230, 521, 272, 306, 0, 162, 337, 479)
end <- c(373.98, 229.98, 233.98, 842.98, 239.98, 639.98, 285.98,
       306.98,  95.98, 162.98, 339.98, 539.98)
value <- c(20, 24, 24, 24, 19, 19, 100, 1, 8, 8, 8, 8)
dt <- data.frame(id, start, end, value)
head(dt)
  id start    end value
1  A   368 373.98    20
2  B   200 229.98    24
3  B   230 233.98    24
4  B   788 842.98    24
5  C   230 239.98    19
6  C   521 639.98    19

I would like to convert this data to a wide table with 1001 columns (the first one is id, followed by columns named 1 to 1000), splitting each interval into its individual time points.

In other words, I want to transform the duration data into a "checkpoint" format: one row per id, where every column whose name falls within one of that id's intervals holds the $value of that $id. In all other cases the cell should be 0.

d <- data.frame(matrix(ncol = 1001, nrow = 1))
colnames(d) <- c("id", 1:1000)
dim(d)
[1]    1 1001

I have created a data frame with 1001 columns. I know how to create the sequence for each row, but I have trouble putting that sequence into the table.

Which function or operator in R would help here? Any ideas on where to start? Thank you very much for any help.

I hope the example is sufficiently clear, otherwise please let me know and I will try to further clarify.

The expected output is a data frame with 1001 columns, where the first column is named id and the remaining columns are named with the numbers 1 to 1000. For each unique id, a column should hold the value from $value whenever the column name lies inside one of that id's time intervals (the numbers from $start to $end).

Answer by akrun (best answer)

One value in 'start' was 0, which is not a valid column index, so I changed it to 1. I then created a matrix ('m1') of NA values with 1000 columns and 6 rows (the number of unique elements in the 'id' column). Using Map, I created a sequence of column indices for each 'start'/'end' pair; the output is a list ('lst') of data frames. After rbinding 'lst' into 'd2', the 'id' and 'Col' columns of 'd2' serve as row/column indices into 'm1', and the NA values at those positions are replaced with the 'value' column, replicated according to the 'nrow' of each 'lst' element.

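# Column indices start at 1, so move the single start value of 0 to 1
dt$start[9] <- 1
# NA matrix: one row per unique id, one column per time point
m1 <- matrix(ncol=1000, nrow=length(unique(dt$id)),
   dimnames=list(unique(dt$id), paste0('id', 1:1000)))
# One data.frame per interval: the id and every column index it covers
lst <- Map(function(x,y,z) data.frame(id=z, Col=seq(x,y)) ,
               dt$start, trunc(dt$end), dt$id)
d2 <- do.call(rbind, lst)
# match() gives the row index whether 'id' is a factor or a character vector
# (as.numeric(d2$id) only works for factors, i.e. not by default in R >= 4.0)
m1[cbind(match(d2$id, rownames(m1)), d2[,2])] <- rep(dt$value,sapply(lst, nrow))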
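
The question asks for 0 outside the intervals, an 'id' column in first position, and columns named 1 to 1000. The following is a minimal, self-contained sketch of one way to get that shape, not a replacement for the code above. It rebuilds the question's data, keeps the same conventions (a start of 0 is moved to 1, 'end' is truncated), and uses a plain loop for readability. The names 'n_cols', 'wide' and 'result' are only illustrative.

```r
# Rebuild the question's data so the sketch runs on its own.
dt <- data.frame(
  id    = c("A", "B", "B", "B", "C", "C", "D", "E", "F", "F", "F", "F"),
  start = c(368, 200, 230, 788, 230, 521, 272, 306, 0, 162, 337, 479),
  end   = c(373.98, 229.98, 233.98, 842.98, 239.98, 639.98, 285.98,
            306.98, 95.98, 162.98, 339.98, 539.98),
  value = c(20, 24, 24, 24, 19, 19, 100, 1, 8, 8, 8, 8),
  stringsAsFactors = FALSE
)

n_cols <- 1000
ids <- unique(dt$id)

# Assumption: column k stands for time point k, so a start of 0 is
# clamped to 1 and anything past n_cols is cut off.
first <- pmax(dt$start, 1)
last  <- pmin(trunc(dt$end), n_cols)

# Start from 0 (not NA), because the question wants 0 outside the intervals.
wide <- matrix(0, nrow = length(ids), ncol = n_cols,
               dimnames = list(ids, 1:n_cols))

# Fill each interval; rows are addressed by id name, columns by position.
for (i in seq_len(nrow(dt))) {
  if (first[i] <= last[i]) {
    wide[dt$id[i], first[i]:last[i]] <- dt$value[i]
  }
}

# Put the id in the first column; check.names = FALSE keeps "1".."1000".
result <- data.frame(id = ids, wide, check.names = FALSE, row.names = NULL)
dim(result)  # 6 1001
```

Individual cells can then be checked by column name, for example result[result$id == "A", "368"] for id A at time point 368.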