I am fitting a logistic regression to get the probability of an outcome at given inputs. Here is an example:
I want to predict someone's chance of being admitted to a graduate school based on their GRE and GPA. The visreg package produces this plot very easily and nicely, with confidence bands.
v1 <- visreg::visreg(
  model,
  "GPA",
  scale = "response",
  rug = 2,
  xlab = "GPA",
  ylab = "Pr(Accepted)",
  type = "conditional",
  by = "GRE",
  breaks = c(
    input$gre_in2 - 1,
    input$gre_in2,
    input$gre_in2 + 1
  ),
  gg = TRUE,
  fill.par = list(col = "#008DFF33")
)
In this example, it produces a ggplot2 plot with 3 facets, one for each cross section: your GRE score and your GRE score +/-1. Each facet shows the regression curve for that GRE score, i.e. the probability of admission as GPA varies along the x axis.
My problem is with the way visreg handles cross sections. The breaks argument tells it which cross sections I want. If breaks is a single number, that is the number of cross sections (so 80 will produce 80 cross sections, not a cross section at GRE=80). If it is a vector of numbers, it produces cross sections at those values of the "by" variable. So with a vector I get a minimum of 2 cross sections. If no breaks argument is given, it uses the 10th, 50th and 90th percentiles, and if 1 is given, it takes the cross section at the mean or median, I'm not sure which.
Another problem is that I am making these plots interactive with ggplotly. I was able to convert this ggplot object to a grob, delete the extra columns, and draw the single facet I wanted onto a new grid.newpage(), but a plot drawn this way cannot be returned as an object, so I cannot make it interactive with plotly.
Is there a way I can get the cross section at a specific value with visreg without using a vector of points where the cross section should be? Or can I delete/subset a facet from a ggplot object?
Thanks
Reproducible example. Note: here I build the glm model from only a few GRE scores (let's say I'm inputting my score as 320). The real data that I build the model from has the entire range of possible scores. Also note that I plot 3 cross sections only because I need a vector of scores in the breaks argument. I could have chosen 2, but showing 3 seemed more appropriate so you can see how your chance would change with +/-1 GRE score. I added observations going up to 326 so that the mean/median won't be the 320 that I wanted to use as my example input.
# One block of GPA values (2.00 to 4.33) for each GRE score from 319 to 326
nd2 <- do.call(
  rbind,
  lapply(319:326, function(g) {
    data.frame(GRE = g, GPA = seq(from = 2, to = 4.33, by = 0.01))
  })
)
# Dummy outcome: alternate 0, 1, 0, 1, ... over all rows
nd2$pr <- rep_len(c(0, 1), nrow(nd2))
model <-
  stats::glm(data = nd2, pr ~ GPA + GRE, family = stats::binomial(link = "logit"))
v1 <- visreg::visreg(
  model,
  "GPA",
  scale = "response",
  rug = 2,
  xlab = "GPA",
  ylab = "Pr(Accepted)",
  type = "conditional",
  by = "GRE",
  breaks = c(
    319,
    320,
    321
  ),
  gg = TRUE
)
plotly::ggplotly(v1)
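For comparison, here is a minimal sketch of what a single cross section at GRE = 320 looks like when computed without visreg, by calling predict() on the model with GRE held fixed. It uses base R only and rebuilds the toy data above. The names gpa_grid and cross_section are hypothetical, and the bands are Wald intervals computed on the link scale and then back-transformed, which I assume is close to, but not necessarily identical to, what visreg draws.

```r
# Toy data as in the reproducible example (alternating 0/1 outcome).
gpa_grid <- seq(from = 2, to = 4.33, by = 0.01)
nd2 <- expand.grid(GPA = gpa_grid, GRE = 319:326)
nd2$pr <- rep_len(c(0, 1), nrow(nd2))

model <- stats::glm(pr ~ GPA + GRE, data = nd2,
                    family = stats::binomial(link = "logit"))

# Hypothetical helper: predicted probability along GPA at one fixed GRE value.
cross_section <- function(model, gre, gpa = gpa_grid, level = 0.95) {
  newdata <- data.frame(GPA = gpa, GRE = gre)
  # Predict on the link (logit) scale so the interval stays inside [0, 1]
  # after back-transforming with plogis().
  p <- stats::predict(model, newdata = newdata, type = "link", se.fit = TRUE)
  z <- stats::qnorm(1 - (1 - level) / 2)
  newdata$fit <- stats::plogis(p$fit)
  newdata$lwr <- stats::plogis(p$fit - z * p$se.fit)
  newdata$upr <- stats::plogis(p$fit + z * p$se.fit)
  newdata
}

cs320 <- cross_section(model, gre = 320)
head(cs320)
```

A data frame like cs320 could then be drawn with ggplot2 (geom_ribbon for lwr/upr and geom_line for fit), which gives an ordinary single-panel ggplot object that ggplotly should accept. visreg also documents a cond argument for fixing the other predictors (something like cond = list(GRE = 320) instead of by/breaks), but I have not verified that it gives the same single panel here.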