Overlapping cut function r


I have a dataset of GPS locations that looks like this.

library(tidyverse)
seg.length.m <- 2500
d <- tibble(
UTC = c('2021-12-02 15:34:00','2021-12-02 15:35:00','2021-12-02 15:36:00','2021-12-02 15:37:00','2021-12-02 15:38:00','2021-12-02 15:39:00','2021-12-02 15:40:00','2021-12-02 15:41:00','2021-12-02 15:42:00','2021-12-02 15:43:00'),
Lat = c('-57.5328713333333','-57.5355708333333','-57.5379695','-57.54041','-57.5429036666667','-57.5452343333333','-57.547898','-57.550008','-57.5523958333333','-57.5547928333333'),
Lon = c('-52.8311235','-52.8234366666667','-52.8165898333333','-52.8099165','-52.8025293333333','-52.7961175','-52.7887356666667','-52.7823455','-52.7759113333333','-52.7689473333333'),
transect = c('1','1','1','1','1','1','1','1','1','1'), 
distance = c('559.716068583762','498.202858146185','491.293157455882','532.341915485295','471.188754640846','541.338131747554','457.918685684549','475.555477163174','504.209591568878','474.017635562939'))

Here transect is my transect number (there are a few hundred of them) and distance is the distance in meters from one location to the next. I want to break these transects into smaller segments of length seg.length.m, which in this case is 2500 meters.

There are a few other group_by variables, but this is the code I have.

segmented <- d |>
  group_by(transect) |>
  mutate(sd_c = cumsum(as.numeric(distance)),  # distance is stored as character
         cum_int = cut(sd_c,
                       breaks = seq(from = 0,
                                    # to is the greater of seg.length.m and max sd_c
                                    to = max(seg.length.m, ceiling(max(sd_c, na.rm = TRUE))),
                                    by = seg.length.m),
                       right = TRUE, include.lowest = TRUE, labels = FALSE)
  ) |>
  ungroup() |>
  arrange(UTC)

current output

segmented

desired output

segmented |> rbind( segmented[4,] |> mutate(cum_int =2)) |> rbind( segmented[9,] |> mutate(cum_int = NA)) |> arrange(UTC)

which I then feed through

segmented <- segmented |>
      group_by(transect, cum_int) |> 
      summarise(seg_length = sum(as.numeric(distance)))|> 
      mutate(cum_int = ifelse(is.na(cum_int), lag(cum_int)+1, cum_int)) 

The issue I have is that I want cut to produce overlapping segments, so that the last row of the first segment is repeated as the first row of the second segment. cut might not be the best function here, but I can't find a better way. Perhaps shingle?
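To make the desired overlap concrete, here is a minimal sketch on toy data (not the real GPS data). It swaps cut for ceiling(sd_c / seg_length_m), which gives the same right-closed bins without an NA tail, and then copies the last row of each segment into the next one. The names seg_length_m, toy, point_id, boundary and overlapped are made up for illustration, and this is only one possible approach.

```r
library(dplyr)

seg_length_m <- 2500  # hypothetical stand-in for seg.length.m

# Toy data (assumption): one transect, ten points, 500 m between points
toy <- tibble(
  transect = "1",
  point_id = 1:10,
  distance = rep(500, 10)
)

segmented <- toy |>
  group_by(transect) |>
  mutate(
    sd_c = cumsum(distance),
    # Right-closed bins (0, L], (L, 2L], ...; pmax() puts a 0 in bin 1
    cum_int = pmax(1, ceiling(sd_c / seg_length_m))
  ) |>
  ungroup()

# Last row of every segment except the final one in each transect,
# relabelled so that it also opens the following segment
boundary <- segmented |>
  group_by(transect, cum_int) |>
  slice_tail(n = 1) |>
  group_by(transect) |>
  filter(cum_int < max(cum_int)) |>
  ungroup() |>
  mutate(cum_int = cum_int + 1)

overlapped <- bind_rows(segmented, boundary) |>
  arrange(transect, cum_int, point_id)
```

Here point 5 ends segment 1 and also starts segment 2. One caveat: if distance is summed per segment after this step, the duplicated row's distance is counted in both segments, so the copied rows may need their distance set to 0 first.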

This data is actually an sf spatial data frame, which means the points (held in geometry) are combined when I summarise(), and I then st_cast them to "LINESTRING".

Currently these linestrings have gaps between them, which I want to remove.

Thanks internet, I hope my question is clear. :)
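A small sketch of why a shared boundary point would close the gaps, assuming the sf package and made-up coordinates. The toy points pts and the columns point_id and n_points are hypothetical. Point 3 appears in both segments, so the two lines meet at it. do_union = FALSE is used so that summarise() combines the points in row order rather than unioning them.

```r
library(dplyr)
library(sf)

# Toy points (assumption): point 3 is duplicated as the start of segment 2
pts <- tibble(
  transect = "1",
  cum_int  = c(1, 1, 1, 2, 2, 2),
  point_id = c(1, 2, 3, 3, 4, 5),
  lon      = c(0, 1, 2, 2, 3, 4),
  lat      = c(0, 0, 0, 0, 0, 0)
) |>
  st_as_sf(coords = c("lon", "lat"), crs = 4326)

# One MULTIPOINT per segment, then cast each to a LINESTRING
lines <- pts |>
  group_by(transect, cum_int) |>
  summarise(n_points = n(), do_union = FALSE) |>
  ungroup() |>
  st_cast("LINESTRING")
```

With the duplicated point in place, the first line ends at the vertex where the second line starts. Without it, segment 1 would stop at point 3 and segment 2 would start at point 4, leaving the gap described above.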
