Evaluating asymptote in R nls for multiple factors

594 views Asked by At

I am trying to evaluate if different populations reach different asymptotes using NLS, in R. Here I have two data.frames df1 has only one population (Represented by Site)

df1<- structure(list(Site = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("ALT01", 
"ALT02", "ALT03", "Cotton", "Deep", "Eckhardt", "Green", "Johnson", 
"Kissinger", "Marsh", "Sand", "Shypoke", "Sora", "Spike", "Tamora", 
"WRP01", "WRP05", "WRP08", "WRP10", "WRP11", "WRP12", "WRP14", 
"WRP15", "WRP18"), class = "factor"), Nets = 1:18, Cumulative.spp = c(12L, 
13L, 15L, 17L, 17L, 17L, 17L, 19L, 19L, 19L, 19L, 20L, 22L, 22L, 
22L, 22L, 22L, 22L)), .Names = c("Site", "Nets", "Cumulative.spp"
), row.names = c(NA, 18L), class = "data.frame")

and df2 has to populations (Again represented by Site)

df2 <- structure(list(Site = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("ALT01", 
"ALT02", "ALT03", "Cotton", "Deep", "Eckhardt", "Green", "Johnson", 
"Kissinger", "Marsh", "Sand", "Shypoke", "Sora", "Spike", "Tamora", 
"WRP01", "WRP05", "WRP08", "WRP10", "WRP11", "WRP12", "WRP14", 
"WRP15", "WRP18"), class = "factor"), Nets = c(1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 
15L, 16L, 17L, 18L), Cumulative.spp = c(12L, 13L, 15L, 17L, 17L, 
17L, 17L, 19L, 19L, 19L, 19L, 20L, 22L, 22L, 22L, 22L, 22L, 22L, 
7L, 10L, 11L, 12L, 13L, 14L, 14L, 14L, 15L, 15L, 16L, 16L, 16L, 
16L, 16L, 17L, 17L, 17L)), .Names = c("Site", "Nets", "Cumulative.spp"
), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
12L, 13L, 14L, 15L, 16L, 17L, 18L, 25L, 26L, 27L, 28L, 29L, 30L, 
31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L), class = "data.frame")

When I model for one population everything looks great:

Model1<-nls(Cumulative.spp ~ SSasympOff(Nets, A, lrc, c0), data = df1)

What I am trying to do is see if I can add several populations to the same model and add a Site Variable, I have tried this:

Model2<-nls(Cumulative.spp ~ SSasympOff(Nets, A, lrc, c0) + Site , data = df2)

and this:

Model2<-nls(Cumulative.spp ~ SSasympOff(Nets + Site , A, lrc, c0), data = df2)

But no luck so far, any help would be appreciated.

1

There are 1 answers

2
G. Grothendieck On BEST ANSWER

We assume that you want to have different Asym parameters for the two populations but common lrc and c0 parameters.

First in (1) we show how to modify the solution in the question to get the answer. Most of the code in (1) is just to get starting values but the actual fit is only one line of code -- two lines if you count the fact that we defined the formula in a separate line.

Then in (2) we show how to simplify (1) by using algorithm "plinear" eliminating the need to get starting values for the linear parameters. In (2a) we show a further simplification which extends more readily to more sites and in (2b) we simplify that further under the condition that all sites are present (which is not the case in the question but may be the case in the real data).

1) default algorithm We can get starting values in nls by fitting each population separately (fm1, fm2) and together (fm3). Finally fit the model with different Asym parameters (fm4).

# get starting values
fo <- Cumulative.spp ~ SSasympOff(Nets, A, lrc, c0)
fm1 <- nls(fo, df2, subset = Site == "ALT01")
fm2 <- nls(fo, df2, subset = Site == "ALT03")
fm3 <- nls(fo, df2)
st <- c(A1 = coef(fm1)[["A"]], A2 = coef(fm2)[["A"]], coef(fm3)[c("lrc", "c0")])

# fit    
fo4 <- Cumulative.spp ~ SSasympOff(Nets, A1*(Site=="ALT01")+A2*(Site=="ALT03"), lrc, c0)
fm4 <- nls(fo4, data = df2,  start = st)

plot(Cumulative.spp ~ Nets, df2, col = Site)
points(fitted(fm4) ~ Nets, df2, col = "red", pch = 20)

screenshot

2) plinear Actually Asym is special since the model is linear in it and we can use this to simplify the above as we don't need starting values for the linear parameters if we switch to algorithm="plinear". This eliminates the need to run fm1 and fm2. We only need fm3 to generate starting values. Note that "plinear" requires that the RHS of the formula be a matrix with each column multiplying the coefficient of one linear parameter. Here we have two linear parameters (the Asym for each Site) so it is a two-column matrix.

# get starting values
fo <- Cumulative.spp ~ SSasympOff(Nets, A, lrc, c0)
fm3 <- nls(fo, df2)    
st5 <- coef(fm3)[c("lrc", "c0")]

# fit
mm <- with(df2, cbind(Site=="ALT01", Site=="ALT03"))
fo5 <- Cumulative.spp ~  mm * SSasympOff(Nets,1,lrc,c0)
fm5 <- nls(fo5, data = df2,  start = st5, algorithm = "plinear")

2a) mm could alternately be written like this which has the advantage that it extends to more sites:

mm <- model.matrix(~ Site - 1, transform(df2, Site = droplevels(Site)))

2b) If all levels of the Site factor are represented in the data then we could simplify even further as droplevels(Site) (which drops the unused levels) could then be simply Site allowing us to write:

mm <- model.matrix(~ Site - 1, df2)

Update: Some fixes and improvements.