Plot combining regression coefficients (partial derivatives) with CIs in R, lincom + coefplot or plotbeta?

793 views Asked by At

Most of the time we run a regression with interactive terms, we are interested in a partial derivative. For example, consider the model below,

Model specification

If I am interested to know the effect of X1 on P(Y), or the partial derivative of X1 on P(Y), I need the following combination of coefficients:

Partial derivative of X1

Instead of calculating it by hand, I can use, for example, the lincom function in R to calculate linear combination of regression parameters. But I would like not only to know the numbers from calculations like this; I would like to plot them. The problem is, if I am using a R package to plot coefficients (e.g., coefplot) it plots the coefficients from my model, but with no option for linear combination of coefficients. Is there any way to combine the lincom function (or other function that calculates combination of parameter) with coefplot (or other coefficient plot packages with this option)?

Of course, in the example above I only consider the derivative of X1, and if I plot it I will have a plot with one dot and its confidence intervals only, but I would like to show in the plot the coefficients for the partial derivatives of X1, X2, and Z, as in the example below.

Coefficients plot (the one I have): Coefficients Plot

Combination of parameters or partial derivatives plot (the one I am trying to get): Combined Estimates

I discovered that Stata has a function that does what I am looking for, called "plotbeta." Does R have something similar?

1

There are 1 answers

12
DaveArmstrong On BEST ANSWER

Here's a start. This defined a function called plotBeta(), the ... are arguments that get passed down to geom_text() for the estimate text.

plotBeta <- function(mod, confidence_level = .95, include_est=TRUE, which.terms=NULL, plot=TRUE, ...){
  require(glue)
  require(ggplot2)
  b <- coef(mod)
  mains <- grep("^[^:]*$", names(b), value=TRUE)
  mains.ind <- grep("^[^:]*$", names(b))
  if(!is.null(which.terms)){
    if(!(all(which.terms %in% mains)))stop("Not all terms in which.terms are in the model\n")
    ins <- match(which.terms, mains)
    mains <- mains[ins]
    mains.ind <- mains.ind[ins]
  }
  icept <- grep("Intercept", mains)
  if(length(icept) > 0){
    mains <- mains[-icept]
    mains.ind <- mains.ind[-icept]
  }
  if(inherits(mod, "lm") & !inherits(mod, "glm")){
    crit <- qt(1-(1-confidence_level)/2, mod$df.residual)
  }else{
    crit <- qnorm(1-(1-confidence_level)/2)
  }
  out.df <- NULL
  for(i in 1:length(mains)){
    others <- grep(glue("^{mains[i]}:"), names(b))
    others <- c(others, grep(glue(":{mains[i]}:"), names(b)))
    others <- c(others, grep(glue(":{mains[i]}$"), names(b)))
    all.inds <- c(mains.ind[i], others)
    ones <- rep(1, length(all.inds))
    est <- c(b[all.inds] %*% ones)
    se.est <- sqrt(c(ones %*% vcov(mod)[all.inds, all.inds] %*% ones))
    lower <- est - crit*se.est
    upper <- est + crit*se.est
    tmp <- data.frame(var = mains[i],  
                      lab = glue("dy/d{mains[i]} = {paste('B', all.inds, sep='', collapse=' + ')}"), 
                      labfac = i, 
                      est = est, 
                      se.est = se.est, 
                      lower = lower, 
                      upper=upper)
    tmp$est_text <- sprintf("%.2f (%.2f, %.2f)", tmp$est, tmp$lower, tmp$upper)
    out.df <- rbind(out.df, tmp)
  }
  out.df$labfac <- factor(out.df$labfac, labels=out.df$lab)
  if(!plot){
    return(out.df)
  }else{
    g <- ggplot(out.df, aes(x=est, y=labfac, xmin=lower, xmax=upper)) + 
      geom_vline(xintercept=0, lty=2, size=.25, col="gray50") + 
      geom_errorbarh(height=0) + 
      geom_point() + 
      ylab("") + xlab("Estimates Combined") + 
      theme_classic() 
    if(include_est){
      g <- g + geom_text(aes(label=est_text), vjust=0, ...)
    }
    g
  }
}

Here's an example with some made-up data:

set.seed(2101)
dat <- data.frame(
  X1 = rnorm(500), 
  X2 = rnorm(500), 
  Z = rnorm(500), 
  W = rnorm(500)
)
dat <- dat %>% 
  mutate(yhat = X1 - X2 + X1*X2 - X1*Z + .5*X2*Z - .75*X1*X2*Z + W, 
         y = yhat + rnorm(500, 0, 1.5))

mod <- lm(y ~ X1*X2*Z + W, data=dat)
plotBeta(mod, position=position_nudge(y=.1), size=3) + xlim(-2.5,2)

enter image description here

EDIT: comparing two models

Using the newly-added plot=FALSE, we can generate the data and then combine and plot.

mod <- lm(y ~ X1*X2*Z + W, data=dat)
p1 <- plotBeta(mod, plot=FALSE)
mod2 <- lm(y ~ X1*X2 + Z + W, data=dat)
p2 <- plotBeta(mod2, plot=FALSE)
p1 <- p1 %>% mutate(model = factor(1, levels=1:2, 
                                   labels=c("Model 1", "Model 2")))
p2 <- p2 %>% mutate(model = factor(2, levels=1:2, 
                                   labels=c("Model 1", "Model 2")))

p_both <- bind_rows(p1, p2)
p_both <- p_both %>% 
  arrange(var, model) %>% 
  mutate(labfac = factor(1:n(), labels=paste("dy/d", var, sep="")))

ggplot(p_both, aes(x=est, y=labfac, xmin=lower, xmax=upper)) + 
  geom_vline(xintercept=0, lty=2, size=.25, col="gray50") + 
  geom_linerange(position=position_nudge(y=c(-.1, .1))) + 
  geom_point(aes(shape=model), 
             position=position_nudge(y=c(-.1, .1))) + 
  geom_text(aes(label=est_text), vjust=0,
            position=position_nudge(y=c(-.2, .15))) + 
  scale_shape_manual(values=c(1,16)) + 
  ylab("") + xlab("Estimates Combined") + 
  theme_classic() 

enter image description here