"optim" function fails to converge for maximum likelihood estimation in a random-effect model

78 views Asked by At

I have a longitudinal dataset for plant growth that includes repeated measurements in different seasons. I am trying to estimate the parameters for a growth equation using this dataset. The growth equation is as follows:

Lt = 150*((1 + ((150/Lt_1)^(1/p)-1)exp(-k*td/365))^(-p))

Lt represents the plant height at a measurement, Lt_1 represents the plant height at the previous measurement, and td represents the duration of time between measurements. The parameters I want to estimate are k and p. However, I believe that the parameter k may vary depending on the season, so I want to have specific k values for each season. Additionally, I would like to include individual variation as a random effect in the model.

To estimate these parameters, I wrote a code using the optim function, where I assigned each parameter (p, k, I) and calculated the sum of the likelihood. However, the code takes a long time to run and fails to converge. In my actual data, I have a larger number of individuals (n = 100), making the computation even more challenging.

I am seeking advice on the best approach to estimate these parameters efficiently and effectively. Any suggestions or alternative methods would be greatly appreciated.

Data

# generate dummy data
n = 10
id = as.factor(1:n)
season = c("s1","s2","s3","s4")
L0 = rnorm(n,40,5)
td = rnorm(n, 180, 10)

growth <- function(td, Lt_1){
  return(function(k,p){
    150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
  })
}
L1 = growth(Lt_1 = L0, td = td)(p = 1.2, k = 0.55)
L2 = growth(Lt_1 = L1, td = td)(p = 1.2, k = 1.09)
L3 = growth(Lt_1 = L2, td = td)(p = 1.2, k = 0.62)
L4 = growth(Lt_1 = L3, td = td)(p = 1.2, k = 0.97)

df <- data.frame(id = id, L0 = L0, L1 = L1, L2 = L2, L3 = L3, L4 = L4) %>% 
  pivot_longer(cols = L0:L4, names_to = "Ltype", values_to = "Lobs") %>% 
  mutate(Lobs = round(Lobs)) %>% 
  group_by(id) %>% 
  mutate(Lt_1 = lag(Lobs)) %>% 
  filter(!is.na(Lt_1)) %>% 
  mutate(td = round(rnorm(4, 180, 10))) %>% 
  mutate(season = season) %>% 
  ungroup() %>% 
  select(id, season, td, Lt_1, Lobs)

head(df)
# A tibble: 6 x 5
  id    season    td  Lt_1  Lobs
  <fct> <chr>  <dbl> <dbl> <dbl>
1 1     s1       183    36    44
2 1     s2       190    44    65
3 1     s3       193    65    77
4 1     s4       182    77    96
5 2     s1       191    43    53
6 2     s2       177    53    75

Caluculate the sum of likelihood

# define the function to calculate the sum of negative log likelihood
len_id = length(unique(df$id))

sum_negloglike <- function(par){
  # parameters to estimate
  p = par[1]
  sigma = par[2]
  k <- par[3:6]
  I = par[seq(7, 6+length(unique(df$id)), by = 1)]
  
  # individuals and seasons to loop
  indv <- unique(df$id)
  season <- unique(df$season)
  
  # define the dataframe
  new_df <- data.frame()
  
  # assign k and I
  ## for individual i
  for(i in 1:length(indv)){
    indv_i = indv[i]
    I_i = I[which(indv == indv_i)]
    
    df_i = df %>% 
      filter(id == indv_i) %>% 
      mutate(I = I_i)
    
    # for season j in indvidual i
    for (j in 1:length(season)){
      season_j = season[j]
      k_j = k[which(season == season_j)]
    
      df_i_j = df_i %>% 
        filter(season == season_j) %>% 
        mutate(k = k_j) %>% 
        # estimate the growth
        mutate(Lpred = growth(td, Lt_1)(k+I, p))
    
      # add it to the dataframe
      new_df <- bind_rows(new_df, df_i_j)
    }
  }
  
  # calculate the sum of negative log-likelihood
  Lobs <- new_df$Lobs
  Lpred <- new_df$Lpred
  negloglike_sum = sum_negloglike(Lobs)(Lpred, sigma)
  
  return(negloglike_sum)
}

Maximum likelihood estimation

# set initial parameters
p = 1.2
sigma = 1.88
k <- c(0.55, 1.09, 0.62, 0.97)
I = rnorm(len_id, 0, 0.01)
par = c(p, sigma, k, I)

# calculate the sum of likelihood with initial parameter values
sum_negloglike(par) 

# estimate the parameters
optim(par, sum_negloglike)
0

There are 0 answers