How to mutate dataframe inside do with dplyr

In a pipeline with do, I am calling mutate_ with a reference to the original data frame. The problem is that I can't access that data frame inside the mutate_. I suspect this has to do with the lazyeval package, but I haven't been able to figure it out. Thank you for your help.

For example, say this function returns a data frame of the integer lattice points within distance r of (x0, y0).

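library(dplyr)
library(lazyeval)  # for interp(), used further down

lattice_points <- function (x0, y0, r){
    # Candidate integer points in the bounding square of the circle
    df <- expand.grid(
            x1 = ceiling(x0-r):floor(x0+r),
            y1 = ceiling(y0-r):floor(y0+r)) %>% 
        # Keep only the points inside or on the circle
        filter((x1-x0)^2 + (y1-y0)^2 <= r^2)
    return (df)
}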

Then I made another function that applies this one to every row of a data frame (and I want to label each result with its id):

many_lattice_points <- function (df_in, 
        id_ = "id", x_ = "x0", y_ = "y0", r_ = "r") {
    df_out <- rowwise(df_in) %>% 
        do( lattice_points(.[x_], .[y_], .[r_])) %>%
              mutate_(.dots = interp(~ .[var_], var_ = as.name(id_)))
    return (df_out)
}

With this input:

> input_df <- data_frame(
    id = c("a", "b"), x0 = c(0.5, 5.5), 
    y0 = c(0.5, 0.5), r  = c(1  , 1  ) )

    id     x0    y0    r
  <chr> <dbl> <dbl> <dbl>
1     a   0.5   0.5     1
2     b   5.5   0.5     1

I should get the following:

    id     x1    y1 
  <chr> <dbl> <dbl> 
1     a   0.0   0.0 
2     a   0.0   1.0 
3     a   1.0   0.0 
4     a   1.0   1.0 
5     b   5.0   0.0 
6     b   5.0   1.0 
7     b   6.0   0.0 
8     b   6.0   1.0 

However, I'm getting an error because it can't find the column var_.

Clarification: I have found other ways to work around it, but I would like to exploit the power of do with lazyeval.

1 answer

Answered by Diego-MX

The best way to add an id column with do is to group by it beforehand: do keeps the grouping columns in its output, so the id is carried through.

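many_lattice_points <- function (df_in, 
        id_ = "id", x_ = "x0", y_ = "y0", r_ = "r") {
    # Each id should appear exactly once, so that every group is a single row
    df_check <- count_(df_in, id_)
    if (!all(df_check$n == 1)) 
        warning("The function may not work as expected")

    # Grouping by id makes do() prepend the id column to each result
    df_out <- group_by_(df_in, .dots = id_) %>% 
        do( lattice_points(.[[x_]], .[[y_]], .[[r_]]) ) 
    return (df_out)
}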
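
To illustrate why grouping helps, here is a small toy comparison. It is not from the original post: the data frame `toy` and the column `k` are made up for illustration. It suggests that do prepends the grouping column to each group's result, whereas a rowwise data frame has no grouping column to carry over.

```r
library(dplyr)

# Hypothetical toy data: each id asks for n rows of output
toy <- tibble(id = c("a", "b"), n = c(2L, 3L))

# Grouped: do() prepends the grouping column to each group's result
grouped_out <- toy %>%
  group_by(id) %>%
  do(data.frame(k = seq_len(.$n))) %>%
  ungroup()

# Rowwise: there is no grouping column, so only k comes back
rowwise_out <- toy %>%
  rowwise() %>%
  do(data.frame(k = seq_len(.$n))) %>%
  ungroup()

names(grouped_out)
names(rowwise_out)
```

Comparing the two sets of column names should show `id` only in the grouped result, which matches the symptom in the question: after the rowwise do, the id is no longer there for mutate_ to find.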

This leaves open how to do it with lazyeval, but it serves the purpose and is logically sound.
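
For readers on a recent dplyr, where the underscore verbs are deprecated and do is superseded, the same group-by-id idea could probably be written with group_modify. This is a sketch under assumptions, not what the answer used: it assumes dplyr 1.0 or later, builds the input with tibble() instead of data_frame(), and still does not address the lazyeval part of the question.

```r
library(dplyr)

# Same helper as in the question: integer points within distance r of (x0, y0)
lattice_points <- function(x0, y0, r) {
  expand.grid(
    x1 = ceiling(x0 - r):floor(x0 + r),
    y1 = ceiling(y0 - r):floor(y0 + r)
  ) %>%
    filter((x1 - x0)^2 + (y1 - y0)^2 <= r^2)
}

# Assumed modern equivalent of the answer: group by the id column (given as a
# string), then let group_modify() call lattice_points() once per group.
many_lattice_points <- function(df_in, id_ = "id",
                                x_ = "x0", y_ = "y0", r_ = "r") {
  df_in %>%
    group_by(across(all_of(id_))) %>%
    group_modify(~ lattice_points(.x[[x_]], .x[[y_]], .x[[r_]])) %>%
    ungroup()
}

input_df <- tibble(
  id = c("a", "b"), x0 = c(0.5, 5.5),
  y0 = c(0.5, 0.5), r  = c(1, 1)
)

result <- many_lattice_points(input_df)
result  # 8 rows with columns id, x1, y1
```

As in the answer, this relies on each id appearing only once, so that every group is a single row.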