Using apply with an anonymous function that uses specific locations in the row


I have a data frame (data2) with 10000 rows and 14 variables:

   treat rep dist  time0   time10   N2O10    WC Temp
1   AGP   1    0 10:09:00 10:19:00 0.2270316 12 17.1
   time20     N2O20      N2O0 t0    t10       t20
1 10:31:00 0.3479662 0.2395295 0 0.1666667 0.3666667

I want to do a linear regression and get the slope for each row in the data frame where x is t0, t10 and t20 and y is N2O0, N2O10, and N2O20. Like this example for one row from the data frame:

data3<-data2[1,]
with (data3, lm(c(N2O0,N2O10,N2O20)~c(t0,t10,t20)))

When I tried to use the above function as an anonymous function inside "apply" I got an error message.

data4<-apply(data2, 1, function(data2) lm(c(data2$N2O0,data2$N2O10,data2$N2O20)~c(data2$t0,data2$t10,data2$t20))$coefficients[2])

Error in eval(substitute(expr), data, enclos = parent.frame()) :
invalid 'envir' argument of type 'character'

I don't understand what this error means and would welcome any suggestions on how to correct this line.

Answer by A5C1D2H2I1M1N2O1R2T1 (best answer)

I would suggest:

  1. Subset the columns of interest at the start.
  2. Create a list within your apply.
  3. Run lm on that list.

Try:

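# Keep only the numeric columns the model needs, so apply() does not coerce to character
apply(data2[c("N2O0","N2O10","N2O20", "t0","t10","t20")], 1, function(x) {
  temp <- as.list(x)  # named list, so lm() can look up N2O0, t0, etc. by name
  # Fit y ~ x for this row and return the slope (second coefficient)
  lm(c(N2O0, N2O10, N2O20) ~ c(t0, t10, t20), data = temp)$coefficients[2]
})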
#         1 
# 0.3059211 
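
With 10000 rows, calling lm() once per row may be slow. As a possible alternative (a different technique from the apply/lm call above, not something the question asked for), the slope of a simple linear regression can be computed in closed form for all rows at once: sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2). The sketch below assumes the column names from the question. The first row is copied from the sample data and the second row is made up purely for illustration.

```r
# Stand-in for data2: row 1 is from the question, row 2 is hypothetical
data2 <- data.frame(
  N2O0  = c(0.2395295, 0.10),
  N2O10 = c(0.2270316, 0.20),
  N2O20 = c(0.3479662, 0.30),
  t0    = c(0, 0),
  t10   = c(0.1666667, 0.25),
  t20   = c(0.3666667, 0.50)
)

# One row per regression: y holds the responses, x the time points
y <- as.matrix(data2[c("N2O0", "N2O10", "N2O20")])
x <- as.matrix(data2[c("t0", "t10", "t20")])

# Centre each row on its own mean (the vector of row means recycles down columns)
dx <- x - rowMeans(x)
dy <- y - rowMeans(y)

# Closed-form least-squares slope for every row
slopes <- rowSums(dx * dy) / rowSums(dx^2)
slopes
```

The first element should agree with the slope lm() returns for the same row, which makes it easy to spot-check a few rows before relying on the vectorised version.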

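You're running into this problem in part because apply() first converts the data frame to a matrix with as.matrix(). A matrix can hold only one type, so the character columns "treat", "time0", "time10", and "time20" force every value to character, and each row reaches your function as a character vector rather than as a one-row data frame.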
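
The coercion can be seen directly on a tiny example. This is only a sketch: d is a hypothetical one-row data frame that reuses three of the column names from the question.

```r
# Hypothetical one-row data frame: one character column, two numeric columns
d <- data.frame(treat = "AGP", N2O0 = 0.2395295, t0 = 0,
                stringsAsFactors = FALSE)

# Column types inside the data frame are still mixed
sapply(d, class)

# apply() calls as.matrix() on a data frame; the result has a single type
m <- as.matrix(d)
typeof(m)

# So each row reaches the anonymous function as a character vector
row_types <- apply(d, 1, function(r) class(r))
row_types
```

Here typeof(m) and row_types both report "character", even though two of the three columns were numeric in the data frame.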

Compare:

> apply(data2, 1, function(data2) sum(data2[1]))
Error in sum(data2[1]) : invalid 'type' (character) of argument
> apply(data2[-c(1, 4, 5, 9)], 1, function(data2) sum(data2[1]))
1 
1 
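
Dropping columns by position, as in the second call above, breaks if the column order changes. One option is to select the numeric columns by type instead. A minimal sketch, assuming a hypothetical data frame d with a few of the question's column names:

```r
# Hypothetical one-row data frame mixing character and numeric columns
d <- data.frame(treat = "AGP", rep = 1L, time0 = "10:09:00",
                N2O0 = 0.2395295, t0 = 0, stringsAsFactors = FALSE)

# TRUE for every numeric (integer or double) column
num_cols <- vapply(d, is.numeric, logical(1))
d_num <- d[num_cols]
names(d_num)

# With only numeric columns, each row arrives as a numeric vector
first_vals <- apply(d_num, 1, function(r) sum(r[1]))
first_vals
```

Note that this keeps every numeric column, not only the six used in the regression, so naming the columns explicitly is still the safer choice when the model needs specific variables.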

Sample data:

data2 <- structure(list(treat = "AGP", rep = 1L, dist = 0L, time0 = "10:09:00", 
        time10 = "10:19:00", N2O10 = 0.2270316, WC = 12L, Temp = 17.1, 
        time20 = "10:31:00", N2O20 = 0.3479662, N2O0 = 0.2395295, 
        t0 = 0L, t10 = 0.1666667, t20 = 0.3666667), .Names = c("treat", 
    "rep", "dist", "time0", "time10", "N2O10", "WC", "Temp", "time20", 
    "N2O20", "N2O0", "t0", "t10", "t20"), row.names = "1", class = "data.frame")