How to replace cur_data() with a non-deprecated function?


I wrote a function that checks whether any methane concentration is greater than 2.5 ppm; if so, it fits a linear regression and checks whether the slope is positive and significant (p-value < 0.05).

I then have a large dataset covering several wells. I group the data by well and want to apply the function to each well (i.e. for Well_A, regress methane against time and return TRUE if the slope is significant). Everything works, but I keep getting a warning that cur_data() is deprecated. It still works, but I'd like correct code without it, in case I use it for a long time and it stops working at some point. The problem is that whatever I try, I cannot get the correct code. I tried using the dot (.), which makes the code run but does not seem to loop over each well.

Here is the code. I created 4 wells, and only one of them (Well_D) has a significant slope. You can see that using cur_data() works, but if I use the dot, every well is reported as having a significant slope. I tried other solutions and none worked.

library(tibble)
library(dplyr)

###Data
df <- tibble(
  WELL_NAME = rep(c("Well_A", "Well_B", "Well_C", "Well_D"), each = 5),
  FIELD_TIME = rep(c(0, 2, 4, 6, 8), times = 4),
  CH4_PPM = c(2.0, 2.1, 2.0, 2.2, 2.1,   # Data for Well_A
              2.0, 2.2, 2.4, 2.5, 2.3,   # Data for Well_B
              1.8, 1.9, 1.8, 1.9, 1.8,   # Data for Well_C
              1.5, 1.7, 2.2, 2.8, 3.3)   # Data for Well_D
)

### Function 

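check_methane_slope_significance <- function(x) {
  # Only fit a regression if at least one reading exceeds 2.5 ppm
  if (any(x$CH4_PPM > 2.5)) {
    lm_result <- lm(CH4_PPM ~ FIELD_TIME, data = x)
    # Row 2, column 4 of the coefficient table is the p-value of the slope
    p_value <- summary(lm_result)$coefficients[2, 4]
    slope <- coef(lm_result)[["FIELD_TIME"]]
    # TRUE only for a positive slope that is significant at the 5% level
    if (p_value < 0.05 && slope > 0) {
      return(TRUE)
    } else {
      return(FALSE)
    }
  } else {
    return(FALSE)
  }
}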


##### CODE WITH CUR_DATA WHICH WORKS####
significant_slopes <- df %>%
  group_by(WELL_NAME) %>%
  summarise(has_significant_slope = any(check_methane_slope_significance(cur_data())), .groups = "drop")

significant_slopes <- significant_slopes %>%
  filter(has_significant_slope)
print(significant_slopes)

###### CODE WITHOUT CUR_DATA, RETURNS ONLY TRUE
significant_slopes <- df %>%
  group_by(WELL_NAME) %>%
  summarise(has_significant_slope = any(check_methane_slope_significance(.)), .groups = "drop")

significant_slopes <- significant_slopes %>%
  filter(has_significant_slope)
print(significant_slopes)
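
For reference, here is a minimal sketch of what I understand the replacement to be, assuming dplyr 1.1.0 or later: the deprecation notice for cur_data() points to pick(), and pick(everything()) should return the current group's non-grouping columns, as cur_data() did. As far as I can tell, the dot fails because the magrittr placeholder stands for the whole piped data frame, not the current group, so every well receives the same result. The data and function are repeated (the function in condensed form, same logic) so that the block runs on its own.

```r
library(tibble)
library(dplyr)  # assumes dplyr >= 1.1.0, where pick() was introduced

df <- tibble(
  WELL_NAME = rep(c("Well_A", "Well_B", "Well_C", "Well_D"), each = 5),
  FIELD_TIME = rep(c(0, 2, 4, 6, 8), times = 4),
  CH4_PPM = c(2.0, 2.1, 2.0, 2.2, 2.1,   # Data for Well_A
              2.0, 2.2, 2.4, 2.5, 2.3,   # Data for Well_B
              1.8, 1.9, 1.8, 1.9, 1.8,   # Data for Well_C
              1.5, 1.7, 2.2, 2.8, 3.3)   # Data for Well_D
)

# Same logic as the function above, written more compactly
check_methane_slope_significance <- function(x) {
  # No reading above 2.5 ppm: nothing to test
  if (!any(x$CH4_PPM > 2.5)) {
    return(FALSE)
  }
  lm_result <- lm(CH4_PPM ~ FIELD_TIME, data = x)
  p_value <- summary(lm_result)$coefficients[2, 4]
  slope <- coef(lm_result)[["FIELD_TIME"]]
  p_value < 0.05 && slope > 0
}

significant_slopes <- df %>%
  group_by(WELL_NAME) %>%
  summarise(
    # pick(everything()) is evaluated once per group and holds only that
    # group's rows (grouping column excluded)
    has_significant_slope = check_methane_slope_significance(pick(everything())),
    .groups = "drop"
  ) %>%
  filter(has_significant_slope)

print(significant_slopes)
```

With the example data above, this should keep only Well_D, matching the cur_data() version. The any() wrapper is dropped here because the function already returns a single TRUE or FALSE per group.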