Organising list() of a calculated function by descending day order within patients in R

33 views Asked by At

I'm trying to calculate the time in bed (TIB) for a dataset containing a lot of data, among which the time they went to bed (inbed) and the time they got out of bed (uitbed). The code I have works in a smaller practice database (with less than 10 days), but when I use data of more than 10 days, the listing order goes wrong and because of this the TIB values are pasted into the wrong rows in my table.

my sample data:

acti_sd <- data.frame(
  patient = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
  diagnosis = c(1, 1, 4, 5, 4, 1, 2, 2, 5, 3, 4),
  SDa1_inbed = c("19:00", "20:00", "18:30", "23:00", "20:00", "20:00", "20:00", "18:30", "20:00", "18:30", "20:00"),
  SDa2_inbed = c("01:00", "00:00", "23:25", "00:00", "22:45", "00:15", "00:00", "23:25", "00:00", "23:25", "00:00"),
  SDa1_uitbed = c("06:15", "10:00", "09:00", "08:00", "99:99", "06:15", "10:00", "09:00", "10:00", "09:00", "10:00"),
  SDa2_uitbed = c("09:30", "04:00", "08:30", "05:00", "06:30", "07:45", "04:00", "08:30", "04:00", "08:30", "04:00"),

The code I'm using is this (which does succesfully calculate the TIB, but it doesnt order the list properly:

#function
calc_TIB <- function(inbed, uitbed) {
  in_bed_time <- as.numeric(as.POSIXct(inbed, format = "%H:%M"))
  out_bed_time <- as.numeric(as.POSIXct(uitbed, format = "%H:%M"))
#if go to bed past or before midnight 
  out_bed_time <- ifelse(out_bed_time < in_bed_time, out_bed_time + 86400, out_bed_time)
#convert back to hours
  tib <- (out_bed_time - in_bed_time) / 3600
  
  return(tib)
}

#perform function for every patient and day
result <- lapply(split(long_wide_actisd[, c("inbed", "uitbed")],
    list(long_wide_actisd$SDa_day, long_wide_actisd$patient)), 
                 function(x)calc_TIB(x$inbed, x$uitbed))

#turn it into a matrix
SDa_TIB <- do.call(rbind, result)

#add tib matrix into table as a new column
long_wide_actisd <- cbind(long_wide_actisd, SDa_TIB)

print(long_wide_actisd)

This organises the TIB data in a way where the order goes as follows "SDa1.1, SDa10.1, SDa11.1", etc". I tried switching the patient and SDa_day in my list but then it gives the order "1 SDa1, 2 SDa1, 3 SDa1, etc" I want the list to be organised in the same way my data is which is "SDa1.1, SDa2.1, SDa3.1" or "1 SDa1, 1 SDa2, 1 SDa3" Where the data is organised per patient with descending SDa days (without it prioritising 10, 11 etc because it does do what i want in a database where each patient has less than 10 days). I need it to be organised that way so it pastes into my table correctly. I added a picture of what both the aforementioned options look like which maybe helps make more sense of it. Can anyone help me out with this? picture of output

(I added some sample data similar to what I'm using and the following code is the code I used to pivot the table in the way that I was using it and deal with NAs):

acti_sd[acti_sd == '99:99'] <- NA

long_actisd <- SDa_data%>%
  pivot_longer(cols = !c(patient, diagnosis), 
               names_to = c("SDa_day", "measurementtype"),
               names_sep = "_",
               values_to = "measurement"
               ) 
long_actisd
long_wide_actisd <- long_actisd|>
  pivot_wider(names_from = "measurementtype", 
              values_from = "measurement")

again any help would be appreciated thank you!

1

There are 1 answers

0
zephryl On

(Note I added a column to your example data to better reproduce the problem.)

I think you've overcomplicated things a bit. In particular, there's no need to split() your dataframe into individual rows or use lapply(). Instead, just pass all of inbed and uitbed at once:

long_wide_actisd$SDa_TIB <- calc_TIB(long_wide_actisd$inbed, long_wide_actisd$uitbed)

long_wide_actisd
  # # A tibble: 33 × 6
  #    patient diagnosis SDa_day inbed uitbed SDa_TIB
  #      <dbl>     <dbl> <chr>   <chr> <chr>    <dbl>
  #  1       1         1 SDa1    19:00 06:15    11.2 
  #  2       1         1 SDa2    01:00 09:30     8.5 
  #  3       1         1 SDa10   01:00 06:15     5.25
  #  4       2         1 SDa1    20:00 10:00    14   
  #  5       2         1 SDa2    00:00 04:00     4   
  #  6       2         1 SDa10   00:00 10:00    10   
  #  7       3         4 SDa1    18:30 09:00    14.5 
  #  8       3         4 SDa2    23:25 08:30     9.08
  #  9       3         4 SDa10   23:25 09:00     9.58
  # 10       4         5 SDa1    23:00 08:00     9   
  # # ℹ 23 more rows

FYI, the split() operation is what was mucking things up. It returned a list of dataframes that's sorted differently than your original dataframe; specifically, in alphabetical order, where eg "10" comes before "1".

orig_order <- paste(long_wide_actisd$SDa_day, long_wide_actisd$patient, sep = ".")
split_order <- names(
  split(long_wide_actisd[, c("inbed", "uitbed")],
        list(long_wide_actisd$SDa_day, long_wide_actisd$patient))
)
head(data.frame(orig_order, split_order))
#   orig_order split_order
# 1     SDa1.1      SDa1.1
# 2     SDa2.1     SDa10.1
# 3    SDa10.1      SDa2.1
# 4     SDa1.2      SDa1.2
# 5     SDa2.2     SDa10.2
# 6    SDa10.2      SDa2.2

Data:

acti_sd <- data.frame(
  patient = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
  diagnosis = c(1, 1, 4, 5, 4, 1, 2, 2, 5, 3, 4),
  SDa1_inbed = c("19:00", "20:00", "18:30", "23:00", "20:00", "20:00", "20:00", "18:30", "20:00", "18:30", "20:00"),
  SDa2_inbed = c("01:00", "00:00", "23:25", "00:00", "22:45", "00:15", "00:00", "23:25", "00:00", "23:25", "00:00"),
  SDa10_inbed = c("01:00", "00:00", "23:25", "00:00", "22:45", "00:15", "00:00", "23:25", "00:00", "23:25", "00:00"),
  SDa1_uitbed = c("06:15", "10:00", "09:00", "08:00", "99:99", "06:15", "10:00", "09:00", "10:00", "09:00", "10:00"),
  SDa2_uitbed = c("09:30", "04:00", "08:30", "05:00", "06:30", "07:45", "04:00", "08:30", "04:00", "08:30", "04:00"),
  SDa10_uitbed = c("06:15", "10:00", "09:00", "08:00", "99:99", "06:15", "10:00", "09:00", "10:00", "09:00", "10:00")
)