Why is the covariance matrix from R's glm different to that from minitab's probit analysis

117 views Asked by At

Using the dataset in the R script below. The output from R binomial probit glm vcov function is different to that from Minitab's probit analysis variance covariance table. R Documentation for vcov says it returns the variance covariance table. The fitted model coefficients are the same (to 3 dp) but the covariance matrices are completely different. How can you get the same matrix output from R as you get from minitab or vice versa?

I had expected the tables from R and Minitab to be equal.

I asked Minitab support, with whom I have a paid for annual subscription, and got this unhelpful response:

"Thanks for your query. We are not in the position to comment on the method used in other applications. However, you will find the formula used in the page below.

https://support.minitab.com/en-us/minitab/21/help-and-how-to/statistical-modeling/reliability/how-to/probit-analysis/methods-and-formulas/equation/

Which points to a page containing the equation πj = c + (1 − c) g(β0 + xjβ) and nothing else - I was expecting something back about singular value decomposition or similar.

I only subscribe to Minitab for the support but it seems less than helpful these days, so I shall probably just use R instead.

From R

df<-data.frame(stimulus = c(0.615,0.634,0.655,0.675),
   success = c(0,3,3,2),
   failure = c(5,4,3,0))
fm <- glm(cbind(success,failure)~stimulus,data=df,
    family=binomial("probit"))
coef(fm)

(Intercept)    stimulus 
  -29.68637    45.89187 

vcov(fm)
            (Intercept)  stimulus
(Intercept)    156.9548 -244.3060
stimulus      -244.3060  380.5229 

From Minitab

same coefficients but different covariance matrix

Regression Table

Variable Coef Standard Error Z P
Constant -29.6864 13.1351 -2.26 0.024
stimulus 45.8920 20.4461 2.24 0.025

Natural
Response 0

Matrix VCCO1

172.531 -268.482
-268.482 418.044

1

There are 1 answers

0
Ben Bolker On

"The man with two watches never knows what time it is."

For what it's worth, glmmTMB — which uses a completely different fitting procedure from glm — gives a covariance matrix that is close to Minitab's.

m_glm <- glm(cbind(success,failure)~stimulus,data=df,
      family=binomial("probit"))
library(glmmTMB)
m_tmb <- glmmTMB(cbind(success,failure)~stimulus,data=df,
      family=binomial("probit"))
vcov(m_tmb)
Conditional model:
            (Intercept)  stimulus
(Intercept)    172.5314 -268.4821
stimulus      -268.4821  418.0434

There is a bit more information on how the covariance matrix is computed for glm here ... It's not obvious to me why the values based on the QR decomposition from an iteratively reweighted least-squares solution (glm) would be different from values based on the estimated curvature of a log-likelihood surface (glmmTMB). I think it may have something to do with the fact that a probit link is non-canonical (the covariance matrices from glm and glmmTMB are nearly identical when we change to a canonical logit link). (I've asked a question on Cross Validated about this; Thomas Lumley confirmed in answer that the difference arises because glmmTMB/Minitab's covariance matrices are estimated based on the observed information while GLM uses the expected (Fisher) information.)

That said, you should also be careful with standard errors (sqrt(diag(vcov(...)))) as measures of uncertainty for GLM parameters, as they're only reliable for large data sets. The profile confidence intervals for glm and glmmTMB are very similar.

> confint.default(m_glm)
                2.5 %    97.5 %
(Intercept) -54.24111 -5.131622
stimulus      7.65886 84.124881
> confint(m_glm)
Waiting for profiling to be done...
                2.5 %   97.5 %
(Intercept) -59.70676 -6.66262
stimulus     10.01528 92.56096

> confint(m_tmb, method = "wald", estimate = FALSE) ## equiv. confint.default()
                 2.5 %    97.5 %
(Intercept) -55.430789 -3.942062
stimulus      5.818323 85.965601

> confint(m_tmb, method = "profile", estimate = FALSE) ## equiv. confint()
                2.5 %    97.5 %
(Intercept) -59.70642 -6.660542
stimulus     10.01219 92.560959