I'm trying to write a function that takes a dataframe, converts a column from chr to dbl, then adds 1 to that column. I also want to optionally replace certain values with NA. If the relevant argument is not supplied, the function should skip the NA replacement step.

Data

library(tibble)
library(dplyr)
library(magrittr)

df <-
  tibble(id = 1:10, col_of_interest = 21:30) %>%
  add_row(id = 11, col_of_interest = 999) %>%
  mutate(across(col_of_interest, as.character))

df

## # A tibble: 11 x 2
##       id col_of_interest
##    <dbl> <chr>          
##  1     1 21             
##  2     2 22             
##  3     3 23             
##  4     4 24             
##  5     5 25             
##  6     6 26             
##  7     7 27             
##  8     8 28             
##  9     9 29             
## 10    10 30             
## 11    11 999  

Writing a function

The function should:

  1. Take in the data.
  2. Convert col_of_interest from chr to dbl.
  3. Replace 999 with NA (but only if I specified that 999 should be replaced with NA)
  4. Add 1 to col_of_interest

My attempt

When writing my function I was guided by two resources:

  1. Passing data variables into function arguments using {{ var }} as covered here.
  2. The use of if is based on this answer.

add_one <- function(data, var, na_if_val = NULL) {

  data %>%

    mutate(across({{ var  }}, as.numeric)) %>%
    
    {if( is.null( {{ na_if_val }} )
    ) .  # <--- the dot means: "return the preexisting dataframe"

      else

        na_if( {{ na_if_val }} )

    } %>%
    
    mutate(across({{ var  }}, add, 1))
}

When I test the function on my df object I get an error.

add_one(data = df,
        var = col_of_interest,
        na_if_val = "999")

Error in check_length(y, x, fmt_args("y"), glue("same as {fmt_args(~x)}")) : argument "y" is missing, with no default

Googling this error yielded this page, stating that:

Note, however, that na_if() can only take arguments of length one.

However, putting only na_if( {{ na_if_val }} ) in the pipe of add_one, without the conditional, does work. It's the conditional evaluation combined with is.null that causes the function to break. I don't understand why.


There are 2 answers

Emman On BEST ANSWER

I solved the problem by explicitly specifying the x and y arguments of na_if.

add_one <- function(data, var, na_if_val = NULL) {

  data %>%

    mutate(across({{ var  }}, as.numeric)) %>%
    
    {if( is.null( {{ na_if_val }} )
    ) .  # <--- the dot means: "return the preexisting dataframe"

      else

        na_if(x = ., y = {{ na_if_val }} ) ## <-- change is here

    } %>%
    
    mutate(across({{ var  }}, add, 1))
}


add_one(data = df,
        var = col_of_interest,
        na_if_val = 999)

## # A tibble: 11 x 2
##       id col_of_interest
##    <dbl>           <dbl>
##  1     1              22
##  2     2              23
##  3     3              24
##  4     4              25
##  5     5              26
##  6     6              27
##  7     7              28
##  8     8              29
##  9     9              30
## 10    10              31
## 11    11              NA

EDIT

I removed {{ }} around na_if_val following @LionelHenry's comment.

add_one <- function(data, var, na_if_val = NULL) {

  data %>%

    mutate(across({{ var  }}, as.numeric)) %>%

    {if( is.null(na_if_val)
    ) .  # <--- the dot means: "return the preexisting dataframe"

      else

        na_if(x = ., y = na_if_val)

    } %>%

    mutate(across({{ var  }}, add, 1))
}
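
For context on why spelling out x = . helps: as far as I understand magrittr, the pipe inserts the left-hand side as the first argument only when the right-hand side is a plain call. When the right-hand side is wrapped in braces, nothing is inserted automatically, so the original na_if( {{ na_if_val }} ) received the value as x and no y at all. A small sketch of that rule, using a hypothetical helper show_args() that is not part of any package:

```r
library(magrittr)

# Hypothetical helper: reports which arguments it actually received.
show_args <- function(x, y) {
  paste0("x = ", x, ", y = ", if (missing(y)) "<missing>" else y)
}

# Plain call: the pipe inserts the left-hand side as the first argument.
plain <- "data" %>% show_args("999")
# "x = data, y = 999"

# Braced expression: nothing is inserted, so "999" is matched to x.
braced <- "data" %>% { show_args("999") }
# "x = 999, y = <missing>"

# Braced expression with an explicit dot: both arguments arrive as intended.
fixed <- "data" %>% { show_args(., "999") }
# "x = data, y = 999"
```

This matches the original error message, which complained that argument "y" was missing.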
Pedro Faria On

You have several problems, but the main one is that you are doing the non-standard evaluation incorrectly.

add_one <- function(data, var, na_if_val = NULL) {
  
  var_b <- enquo(var)
  
  data <- data %>%
    mutate(across(!!var_b, as.numeric)) 
  
   if(!is.null(na_if_val)){
     data <- data %>% 
       mutate(across(!!var_b, na_if, y = na_if_val))
   }
   
  data <- data %>% 
    mutate(across(!!var_b, add, 1))
  
  return(data)
}

Returning this:

add_one(df, col_of_interest, 999)

# A tibble: 11 x 2
      id col_of_interest
   <dbl>           <dbl>
 1     1              22
 2     2              23
 3     3              24
 4     4              25
 5     5              26
 6     6              27
 7     7              28
 8     8              29
 9     9              30
10    10              31
11    11              NA

First, you need to quote the variable of interest with the enquo() function; then you unquote it (with bang-bang, !!) wherever you need it. Another problem with your function is the if statement inserted in the middle of the pipe, which is easy to get wrong: inside braces the pipe does not insert the data as the first argument, so na_if() received only one argument. If you need to apply certain steps only in special cases, it is clearer to evaluate them separately from the main calculation.
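
The same structure can also be written with the {{ var }} shorthand from the question, which, as far as I know, combines the enquo() and !! steps into one. Below is a minimal sketch under a few assumptions: the name add_one_sketch and the small tibble are made up for illustration, and the helper functions are wrapped in anonymous functions instead of passing extra arguments through across(), since newer dplyr versions appear to discourage the latter.

```r
library(dplyr)
library(tibble)

# Sketch only: same steps as above, with the optional step outside the pipe.
add_one_sketch <- function(data, var, na_if_val = NULL) {
  # Step 1: convert the selected column(s) from chr to dbl.
  data <- data %>% mutate(across({{ var }}, as.numeric))

  # Step 2 (optional): replace the given value with NA, column by column.
  if (!is.null(na_if_val)) {
    data <- data %>% mutate(across({{ var }}, ~ na_if(.x, na_if_val)))
  }

  # Step 3: add 1 to the selected column(s).
  data %>% mutate(across({{ var }}, ~ .x + 1))
}

# Illustrative data (made up for this example).
small_df <- tibble(id = 1:3, col_of_interest = c("21", "22", "999"))

with_na    <- add_one_sketch(small_df, col_of_interest, na_if_val = 999)
without_na <- add_one_sketch(small_df, col_of_interest)
```

Here with_na$col_of_interest should be 22, 23, NA, while without_na keeps the 999 and returns 22, 23, 1000.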