summary.manova output shows different p values from the summary.manova stats table and broom tidy()


I noticed that summary.manova() in R appears to produce two different p-values: one in the table that is printed to the console, and another in the stats table stored in the summary object. Which p-value should be reported? The values are slightly different. I first noticed this when using tidy() from broom, which reports the p-value from the stats table rather than the one shown in the console.

I can recreate the problem using the iris data frame:

head(iris)
fit = manova(as.matrix(iris[,1:4]) ~ Species, data = iris)
fit_summary = summary.manova(fit, test = "Wilks")
fit_summary #output1
fit_summary$stats #output2
broom::tidy(fit, test = "Wilks") #output2 

Ben Bolker On BEST ANSWER

Nice reproducible example! From everything I can see here, the only differences are in output representation, not in the underlying values.

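In the printed summary output, p-values below a threshold are shown only as "< 2.2e-16" (on the theory that you probably shouldn't be worrying about differences among tiny p-values anyway). As far as I can tell, that threshold is the double-precision machine epsilon, .Machine$double.eps, which is about 2.2e-16.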

> fit_summary #output1
           Df    Wilks approx F num Df den Df    Pr(>F)    
Species     2 0.023439   199.15      8    288 < 2.2e-16 ***
Residuals 147                                              
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
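
To see the thresholding on its own, here is a minimal sketch using format.pval() from base R. The p-value is copied by hand from the $stats output; I'm assuming (not guaranteeing) that the summary's print method uses the same machine-epsilon cutoff that format.pval() uses by default.

```r
# p-value copied by hand from fit_summary$stats (an illustration, not recomputed here)
p <- 1.365006e-112

# Default cutoff used by format.pval(): the double-precision machine epsilon
eps <- .Machine$double.eps
print(eps)

# The stored value is far below the cutoff ...
print(p < eps)

# ... so the default formatting only reports "less than the cutoff"
shown_default <- format.pval(p)
print(shown_default)

# With a much smaller cutoff, the underlying value is formatted as a number instead
shown_full <- format.pval(p, eps = 1e-200)
print(shown_full)
```

Nothing about the underlying number changes between the two calls; only the formatting rule does.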

If you explicitly extract the $stats component, then you get a value printed to R's default 7-digit precision:

> fit_summary$stats #output2
           Df      Wilks approx F num Df den Df        Pr(>F)
Species     2 0.02343863 199.1453      8    288 1.365006e-112
Residuals 147         NA       NA     NA     NA            NA
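
If you want the number itself rather than a printed table, you can index into $stats directly. This is a small sketch that assumes the row and column names shown in the output above ("Species" and "Pr(>F)").

```r
# Refit the model from the question
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)
fit_summary <- summary(fit, test = "Wilks")

# Pull the stored p-value for the Species term out of the stats table
p_value <- fit_summary$stats["Species", "Pr(>F)"]

# Print it with more digits than the default
print(p_value, digits = 15)

# It is below the cutoff at which the printed summary switches to "< 2.2e-16"
print(p_value < .Machine$double.eps)
```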

If you use tidy(), it returns a tibble rather than a base-R object. Tibbles have their own defaults for output precision (by default they print only 3 significant digits), so the same value is displayed as 1.37e-112.

> broom::tidy(fit, test = "Wilks") 
# A tibble: 2 x 7
  term         df   wilks statistic num.df den.df    p.value
  <chr>     <dbl>   <dbl>     <dbl>  <dbl>  <dbl>      <dbl>
1 Species       2  0.0234      199.      8    288  1.37e-112
2 Residuals   147 NA            NA      NA     NA NA  
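
A quick way to convince yourself that tidy() and $stats hold the same number is to compare them directly. This sketch assumes broom is installed (the comparison is skipped otherwise) and that the tidy output has the term and p.value columns shown above.

```r
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)

# p-value as stored in the summary object
p_stats <- summary(fit, test = "Wilks")$stats["Species", "Pr(>F)"]

# Only run the comparison if broom is available
if (requireNamespace("broom", quietly = TRUE)) {
  tidied <- broom::tidy(fit, test = "Wilks")
  p_tidy <- tidied$p.value[tidied$term == "Species"]

  # Compare via the ratio, since both numbers are extremely small
  same_value <- isTRUE(all.equal(p_stats / p_tidy, 1))
  print(same_value)
}
```

Comparing the ratio to 1 avoids the comparison being trivially satisfied just because both values are close to zero.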

  

All of these defaults can be reset: for example, ?tibble::formatting tells you that options(pillar.sigfig=7) will set the significant digits for tibble-printing to 7; ?options tells you that you can use options(digits=n) to change the defaults for base-R printing.
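
As a rough sketch of what resetting those defaults looks like in practice: the snippet below changes each option temporarily and restores it afterwards. The tibble part assumes broom is installed and is skipped otherwise; the particular digit settings are arbitrary choices for illustration.

```r
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)
stats <- summary(fit, test = "Wilks")$stats

# Remember the current setting so it can be checked after restoring
orig_digits <- getOption("digits")

# Base-R printing: temporarily lower the number of significant digits
old <- options(digits = 4)
print(stats)
options(old)  # restore the previous setting

# Tibble printing: temporarily raise the number of significant digits
old <- options(pillar.sigfig = 7)
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(fit, test = "Wilks"))
}
options(old)  # restore the previous setting
```

Saving the return value of options() and passing it back is a convenient way to avoid leaving the changed defaults in place for the rest of the session.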