Optim function does not give right solution

75 views Asked by At

I'm trying to optimize coefficients using the optim function in R. This is the function I'm trying to minimize :

min.RSS = function(data, par){
  sum(
    ((data$CCI) - 
      10^(par[1]+par[2]*data$Landsat +
         par[3]*data$Landsat^2 +
         par[4]*data$Landsat^3 + 
         par[5]*data$Landsat^4))^2, na.rm=TRUE)
}

I tried all the method in optim and none gives a right result. Example of result This is I think due to my data. My dataset contain more than 1.600.000 values and most of them are close to 0. I tried to by-pass this issue when restraining my dataset in order to have more homogenous value with this function :

df_nouveau <- data[data$CCI >= 2 | runif(nrow(data)) < 0.01, ]

I got better results with this function but still not what I expected. Example 2 The real interest of this optimization is to catch the dynamic (big augmentation or diminution in the values) and not to predict precisely the low concentration. Do anyone know how to do this ?

1

There are 1 answers

2
user2554330 On

The main part of the exponent, i.e.

     par[1]+par[2]*data$Landsat +
     par[3]*data$Landsat^2 +
     par[4]*data$Landsat^3 + 
     par[5]*data$Landsat^4

is a polynomial in data$Landsat. There are other parametrizations of polynomials besides the sum of powers, and they are often more numerically stable.

For example, I'd try using orthogonal polynomials instead, which is the default used by the poly() function in R. To do this, you pre-calculate the matrix

M <- cbind(1, poly(data$Landsat, 4))

then min.RSS could be

min.RSS = function(data, par){
  sum(
    ((data$CCI) - 10^(M %*% par))^2, na.rm=TRUE)
}

The solution par will not give the same coefficients as yours, but the possible predictions of this formulation match yours exactly. There is a transformation of par to obtain yours if you really care about the individual coefficients--but that is usually not a useful calculation to do. The predictions of the model are what matter.