I have a large dataset of vocal notes and am trying to calculate the duration of each note (row) depending on its call type (I am quite an amateur in the world of coding). For example:
note_type call_type begin.time end.time
1 doublet 314.2205 314.2344
2 doublet 314.2856 314.3273
3 squeak 316.2678 316.2799
1 triplet 316.3476 316.3928
1 triplet 318.4582 318.4713
2 triplet 318.5413 318.5853
where begin.time is the start time of the note and end.time is its end time. I want to add a column calculating the duration such that:
doublets: duration = end.time of note_type 2 - begin.time of note_type 1
squeaks: duration = end.time - begin.time
triplets: duration = end.time of the 3rd note in the series - begin.time of the 1st note in the 'triplet'
So far, I have:
all_notes %>%
mutate(call_dur = case_when(call_type == "doublet" & note_type == "1" ~ (lead(end.time) - begin.time),
call_type == "squeak" ~ (end.time - begin.time)))
but am not sure how to calculate the difference across 3 rows because the 'triplet' calls don't always have the same note structure (i.e. triplet can be note types 1,1,2 or note types 1,2,2 or note types 1,2,3 or any combination thereof).
Is there a way to subtract the begin.time of the first note labeled 'triplet' from the end.time of the 3rd row in the series?
Ideally, the new column would look like this:
note_type call_type begin.time end.time call_dur
1 doublet 314.2205 314.2344 0.1068
2 doublet 314.2856 314.3273 na
3 squeak 316.2678 316.2799 0.0121
1 triplet 316.3476 316.3928 2.2377
1 triplet 318.4582 318.4713 na
2 triplet 318.5413 318.5853 na
It would just involve adding the n = x argument to the lead() function to look ahead more than 1 position.
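As a minimal sketch of that idea, the triplet case can use lead(end.time, n = 2) inside the same case_when(). The way the first row of each triplet is picked out here is an assumption, not something stated in the question: the helper columns run_id and pos are hypothetical, rows are assumed to be in time order, and every triplet is assumed to occupy exactly three consecutive rows.

```r
library(dplyr)

# Sample rows copied from the question
all_notes <- data.frame(
  note_type  = c(1, 2, 3, 1, 1, 2),
  call_type  = c("doublet", "doublet", "squeak", "triplet", "triplet", "triplet"),
  begin.time = c(314.2205, 314.2856, 316.2678, 316.3476, 318.4582, 318.5413),
  end.time   = c(314.2344, 314.3273, 316.2799, 316.3928, 318.4713, 318.5853)
)

result <- all_notes %>%
  # run_id (hypothetical helper) increases each time call_type changes
  mutate(run_id = cumsum(call_type != lag(call_type, default = first(call_type)))) %>%
  group_by(run_id) %>%
  # pos (hypothetical helper): position within the call, assuming triplets are exactly 3 rows,
  # so back-to-back triplets still restart at 1
  mutate(pos = (row_number() - 1) %% 3 + 1) %>%
  ungroup() %>%
  mutate(call_dur = case_when(
    # doublet: end of the next row minus begin of this row
    call_type == "doublet" & note_type == 1 ~ lead(end.time) - begin.time,
    # squeak: single-row duration
    call_type == "squeak" ~ end.time - begin.time,
    # triplet: end of the row two positions ahead minus begin of the first row
    call_type == "triplet" & pos == 1 ~ lead(end.time, n = 2) - begin.time
    # rows matching no condition are left as NA
  )) %>%
  select(-run_id, -pos)

print(result)
```

For the sample rows this should give a call_dur of about 0.1068 on the first doublet row (following the doublet rule as stated, end.time of note 2 minus begin.time of note 1), 0.0121 on the squeak row and 2.2377 on the first triplet row, with NA elsewhere. Using row position rather than note_type for triplets sidesteps the varying note structures (1,1,2 or 1,2,2 and so on), but it would need adjusting if some triplets are incomplete.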