density of ongoing events from density of starting time


I have a data frame containing a column of starting times of event A and a column with the length of event A in hours, like so:

df = structure(list(StartTime = c(10.1401724605821, 8.34114734060131, 
10.1930766354781, 9.49644518946297, 9.36002452136017, 10.8311833878979, 
9.44229844841175, 8.48090101312846, 9.31779155065306, 9.57179348240606
), Length = c(3.28013235144317, 3.97817114274949, 4.29317499510944, 
2.63135516550392, 3.49188423063606, 4.08827690966427, 3.63062007538974, 
3.82309223059565, 1.52407871372998, 1.80725628975779)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

In practice, df contains thousands of records. I would like to calculate the density (or a histogram, but a density makes more sense since at each increment of time there are many events) of the number of ongoing events. For example, if an event starts at 8.02 and takes 1 hour, then this record contributes one count of an ongoing event at 8.03, 8.04, ..., 9.02. Each record similarly contributes to many time points.
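For a single time point t, the count I have in mind would be something like this (ongoing_at is just a made-up helper to illustrate the definition):

# number of events ongoing at a single time point t (illustration only)
ongoing_at <- function(t, df) {
  sum(df$StartTime <= t & t <= df$StartTime + df$Length)
}
ongoing_at(10, df)  # e.g. how many events are running at time 10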

What is the best way of approaching this?


1 Answer

Answer by Allan Cameron (accepted):

Here's a tidyverse solution:

library(dplyr)
library(tidyr)
library(ggplot2)

df %>% 
  mutate(end = StartTime + Length) %>%                        # end time of each event
  pivot_longer(c("StartTime", "end")) %>%                     # one row per start/end time point
  arrange(value) %>%                                          # put all time points in chronological order
  mutate(active = cumsum(2 * (name == "StartTime") - 1)) %>%  # +1 at every start, -1 at every end
  ggplot(aes(value, active)) +
  geom_step() +
  labs(x = "time", y = "count")

Created on 2020-10-16 by the reprex package (v0.3.0)
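
If a curve of ongoing events over time is preferred to a step plot, one possible extension (a sketch, assuming the same df as above) is to evaluate the count on a fine grid of time points and plot that:

library(ggplot2)

# evaluate the number of ongoing events on a regular grid of time points
times <- seq(min(df$StartTime), max(df$StartTime + df$Length), length.out = 1000)
active <- sapply(times, function(t) sum(df$StartTime <= t & t <= df$StartTime + df$Length))

ggplot(data.frame(time = times, active = active), aes(time, active)) +
  geom_line() +
  labs(x = "time", y = "count of ongoing events")

This trades the exact event-by-event steps for a regular grid, which is convenient when there are thousands of records and only the overall shape is needed.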