Is it possible to extract slope and intercept from multiple fitted lines into a tibble?

281 views Asked by At

I'm trying to compare the slope and intercept of many separate fitted lines and would like to extract this information from the equations that are shown using stat_poly_eq. I am able to plot all of the data and lines but since in my actual data I has over 50 equations, I'd like a simple way to extract the slope and intercept of each line.

Below is code to generate a similar plot with mtcars.

I'd like to add an output as a tibble with columns for cyl, gear, slope, intercept.

library(tidyverse)
library(ggpmisc)

ggplot(mtcars, aes(x = wt, y = mpg, color = as.character(cyl))) +
  geom_point()+
  facet_wrap(gear ~ .) +
  stat_poly_line(fullrange = TRUE, se = FALSE) +
  stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~~")), 
               parse=TRUE,label.x.npc = "right")

1

There are 1 answers

2
Adam Quek On BEST ANSWER

You can use nlme::lmList to help run all regression.

library(nlme)

#create a new grouping variable for the nested grouping factor
mtcars2 <- mtcars %>% mutate(gear_cyl = interaction(mtcars$gear, mtcars$cyl))

fm1 <- lmList(mpg ~ wt | gear_cyl, mtcars2)

Calling the summary for coefficient and r-squared will return you the same statistics shown on your ggplot.

summary(fm1)$coef

, , (Intercept)

    Estimate Std. Error  t value     Pr(>|t|)
3.4 21.50000        NaN      NaN          NaN
4.4 40.85910   4.254923 9.602782 4.813403e-08
5.4 41.01754        NaN      NaN          NaN
3.6 64.70408        NaN      NaN          NaN
4.6 30.20964  12.042632 2.508558 2.326993e-02
5.6 19.70000        NaN      NaN          NaN
3.8 25.05942   4.527584 5.534834 4.526681e-05
5.8 22.14000        NaN      NaN          NaN

, , wt

      Estimate Std. Error    t value    Pr(>|t|)
3.4         NA        NaN         NA          NA
4.4  -5.859279   1.741259 -3.3649675 0.003940916
5.4  -7.017544        NaN        NaN         NaN
3.6 -13.469388        NaN        NaN         NaN
4.6  -3.380894   3.866794 -0.8743402 0.394867930
5.6         NA        NaN         NA          NA
3.8  -2.438894   1.085886 -2.2459951 0.039178704
5.8  -2.000000        NaN        NaN         NaN
summary(fm1)$r.squared

[1] 0.0000000 0.5358963 1.0000000 1.0000000 0.8095674 0.0000000 0.4561613 1.0000000

To put everything into a tibble:

output <- tibble(gear_cyl = names(fm1),
                 intercept = summary(fm1)$coef[,1,1],
                 slope = summary(fm1)$coef[,1,2],
                 r_sq = summary(fm1)$r.squared)

output %>% separate(gear_cyl, c("gear", "cyl"), sep="\\.")

# A tibble: 8 x 5
  gear  cyl   intercept  slope  r_sq
  <chr> <chr>     <dbl>  <dbl> <dbl>
1 3     4          21.5  NA    0    
2 4     4          40.9  -5.86 0.536
3 5     4          41.0  -7.02 1    
4 3     6          64.7 -13.5  1    
5 4     6          30.2  -3.38 0.810
6 5     6          19.7  NA    0    
7 3     8          25.1  -2.44 0.456
8 5     8          22.1  -2.00 1