Using the dataset in the R script below. The output from R binomial probit glm vcov function is different to that from Minitab's probit analysis variance covariance table. R Documentation for vcov says it returns the variance covariance table. The fitted model coefficients are the same (to 3 dp) but the covariance matrices are completely different. How can you get the same matrix output from R as you get from minitab or vice versa?
I had expected the tables from R and Minitab to be equal.
I asked Minitab support, with whom I have a paid for annual subscription, and got this unhelpful response:
"Thanks for your query. We are not in the position to comment on the method used in other applications. However, you will find the formula used in the page below.
Which points to a page containing the equation πj = c + (1 − c) g(β0 + xjβ) and nothing else - I was expecting something back about singular value decomposition or similar.
I only subscribe to Minitab for the support but it seems less than helpful these days, so I shall probably just use R instead.
From R
df<-data.frame(stimulus = c(0.615,0.634,0.655,0.675),
success = c(0,3,3,2),
failure = c(5,4,3,0))
fm <- glm(cbind(success,failure)~stimulus,data=df,
family=binomial("probit"))
coef(fm)
(Intercept) stimulus
-29.68637 45.89187
vcov(fm)
(Intercept) stimulus
(Intercept) 156.9548 -244.3060
stimulus -244.3060 380.5229
From Minitab
same coefficients but different covariance matrix
Regression Table
Variable Coef Standard Error Z P
Constant -29.6864 13.1351 -2.26 0.024
stimulus 45.8920 20.4461 2.24 0.025Natural
Response 0Matrix VCCO1
172.531 -268.482
-268.482 418.044
"The man with two watches never knows what time it is."
For what it's worth,
glmmTMB
— which uses a completely different fitting procedure fromglm
— gives a covariance matrix that is close to Minitab's.There is a bit more information on how the covariance matrix is computed for
glm
here ... It's not obvious to me why the values based on the QR decomposition from an iteratively reweighted least-squares solution (glm
) would be different from values based on the estimated curvature of a log-likelihood surface (glmmTMB
). I think it may have something to do with the fact that a probit link is non-canonical (the covariance matrices fromglm
andglmmTMB
are nearly identical when we change to a canonical logit link). (I've asked a question on Cross Validated about this; Thomas Lumley confirmed in answer that the difference arises becauseglmmTMB
/Minitab's covariance matrices are estimated based on the observed information while GLM uses the expected (Fisher) information.)That said, you should also be careful with standard errors (
sqrt(diag(vcov(...)))
) as measures of uncertainty for GLM parameters, as they're only reliable for large data sets. The profile confidence intervals forglm
andglmmTMB
are very similar.