Different AIC from the drop1() function and the AIC() function


I'm using stepwise removal of variable for a glm using drop1.

From model 1, I am dropping the variable whose removal gives the lowest AIC in the drop1() table, which produces model 2.

However, when I then compare the two models, the AIC of model 1 is lower than that of model 2, and the AIC that drop1() reports for model 2 differs from the value returned by AIC().

I'm confused about why drop1() and AIC() give different AICs for the same model, and about how to go about model selection given this.

Thanks

m1 <- glm(abundance ~ treatment + land_use + patch_size, data = df)
drop1(m1, test = "F")

m2 <- glm(abundance ~ treatment + land_use, data = df)

AIC(m1)
AIC(m2)  ## this AIC differs from the one reported by drop1() and is higher than AIC(m1)

There is 1 answer

Daniel Dvorkin

The drop1 function uses extractAIC to get the AIC, while the AIC function computes -2 log L with a different additive constant. See the Details section of ?extractAIC, specifically this paragraph:

For linear models with unknown scale (i.e., for ‘lm’ and ‘aov’), -2 log L is computed from the deviance and uses a different additive constant to ‘logLik’ and hence ‘AIC’. If RSS denotes the (weighted) residual sum of squares then ‘extractAIC’ uses for -2 log L the formulae RSS/s - n (corresponding to Mallows' Cp) in the case of known scale s and n log (RSS/n) for unknown scale. ‘AIC’ only handles unknown scale and uses the formula n * log(RSS/n) + n + n * log 2pi - sum(log w) where w are the weights. Further ‘AIC’ counts the scale estimation as a parameter in the ‘edf’ and ‘extractAIC’ does not.
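To make the two conventions concrete, here is a small check of the quoted formulae on the built-in cars data (my own illustration, not part of the original post; unit weights assumed):

fit <- lm(dist ~ speed, data = cars)
n   <- nobs(fit)
RSS <- deviance(fit)
edf <- length(coef(fit))   # fitted coefficients, excluding the residual scale

extractAIC(fit)                      # edf and n*log(RSS/n) + 2*edf
c(edf, n * log(RSS / n) + 2 * edf)   # same numbers by hand

AIC(fit)                                                  # counts the scale as an extra parameter
n * log(RSS / n) + n + n * log(2 * pi) + 2 * (edf + 1)    # same number by hand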

However, as the same Details section goes on to say:

Note that the methods for this function may differ in their assumptions from those of methods for ‘AIC’ (usually via a method for ‘logLik’). We have already mentioned the case of ‘"lm"’ models with estimated scale, and there are similar issues in the ‘"glm"’ and ‘"negbin"’ methods where the dispersion parameter may or may not be taken as ‘free’. This is immaterial as ‘extractAIC’ is only used to compare models of the same class (where only differences in AIC values are considered).

So in other words, you could implement your own drop1 function using AIC, and you'd get the same differences between the AICs of the various submodels and the AIC of the full model that you do with the built-in drop1.
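As a rough sketch of that point (simulated data with made-up variable names, and using lm() since that is the case the quoted paragraph spells out explicitly), the AIC column from drop1() and AIC() applied to explicitly refitted submodels sit on different absolute scales but move by the same amounts:

set.seed(1)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
dat$y <- 1 + 0.5 * dat$x1 + 0.3 * dat$x2 + rnorm(100)

full <- lm(y ~ x1 + x2 + x3, data = dat)

## differences from the <none> row of drop1()'s AIC column (extractAIC scale)
d1 <- drop1(full, test = "F")
d1$AIC - d1$AIC[1]

## "home-made drop1": refit each submodel and use AIC() directly
drops <- attr(terms(full), "term.labels")
aic_refit <- sapply(drops, function(tm)
  AIC(update(full, as.formula(paste(". ~ . -", tm)))))
c(AIC(full), aic_refit) - AIC(full)   # same differences, different absolute values

So whichever scale you use, the ranking of the candidate drops is the same; just avoid comparing a number from drop1()'s AIC column directly against a number from AIC().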