What is ga() in the gamlss package doing?

267 views Asked by At

I have been looking into the gamlss package for fitting semiparametric models and came across something strange in the ga() function. Even if the model is specified as having a gamma distribution, fitted using REML, the output for the model is Gaussian, fitted using GCV.

Example::

library(mgcv)
library(gamlss)
library(gamlss.add)
data(rent)
ga3 <- gam(R~s(Fl)+s(A), method="REML", data=rent, family=Gamma(log))
gn3 <- gamlss(R~ga(~s(Fl)+s(A), method="REML"), data=rent, family=GA)

Model summary for the GAM::

summary(ga3)
Family: Gamma 
Link function: log 

Formula:
R ~ s(Fl) + s(A)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 6.667996   0.008646   771.2   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
        edf Ref.df      F p-value    
s(Fl) 1.263  1.482 442.53  <2e-16 ***
s(A)  4.051  4.814  36.34  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.302   Deviance explained = 28.8%
-REML =  13979  Scale est. = 0.1472    n = 1969

Model summary for the GAMLSS::

summary(getSmo(gn3))
Family: gaussian 
Link function: identity 

Formula:
Y.var ~ s(Fl) + s(A)

Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.306e-13  8.646e-03       0        1

Approximate significance of smooth terms:
        edf Ref.df      F p-value    
s(Fl) 1.269  1.492 440.14  <2e-16 ***
s(A)  3.747  4.469  38.83  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.294   Deviance explained = 29.6%
GCV = 0.97441  Scale est. = 0.97144   n = 1969

Question::

Why is the model output giving the incorrect distribution and fitting method? Is there something that I am missing here and this is correct?

2

There are 2 answers

0
user2772003 On

When using the ga()-function, gamlss calls in the background the gam()-function from mgcv without specifying the family. As a result, the splines are fitted assuming a normal distribution. Therefore you see when showing the fitted smoothers family: gaussian and link function: identity. Also note that the scale estimate returned when using ga() is related to the normal distribution.

0
Robert On

Yes, when using the ga()-function, each gamlss iteration calls in the background the gam()-function from mgcv. It uses the correct local working variable and local weights for a gamma distribution on each iteration.