I have created a GAM containing 3 predictor variables. Because some of the predictor and response data that I built the model with had non-normal distributions, I log transformed them. Additionally, because my response variable has a lot of 0s, I had to add a very small number to every datapoint in order the log transform it. Now I have completed the model, but for the sake of interpretation, I think it would be easier for others to understand the partial effect plots if I could back transform both axes; however, I cannot seem to figure out how to generate back transformed axes on the partial effects.
Here is an example of one of the default plots.
BestModel <- gam(Density ~ s(x) + s(y) + s(z), data = df, sp = c(10000, 0.01,10,10000))
plot(BestModel)
Both axes here are log-transformed. The closest I've gotten to approximating this curve in the original units was by generating a new dataset where one predictor variable increases while other predictors are kept constant. The I plotted the back transformed predicted values from my model against the back transformed predictor variable.
df2 <- NewDataset
plot(((exp(df2$y)*900)/10000), exp(predict(BestModel, newdata = df2))-0.006993007,
xlab = "Back Transformed y", ylab = "Back Transformed Density")
This is the curve that I would expect, but now I don't have a smooth line or a confidence interval, and it doesn't look as nice as the partial plots produced by the default plot(BestModel).
I attempted to add confidence intervals and a nicer line with ggplot2. For the sake of comparison on this post, I left the points and the line that runs through them in the plot.
ggplot(data = df, aes(x = (exp(y)*900)/10000,
y = exp(predict(BestModel, newdata = df))-0.006993007)) +
geom_point()+
geom_line()+
labs(x = "y", y = "Density")+
stat_smooth(method = "gam", se = TRUE)
Unfortunately, that doesn't seem to produce the right curve, nor does switching to method to "loess".
Is there any easier way to back transform each axis individually to match its original units and get a partial plot for each predictor?



There are two ways relatively simple ways to do this
Consider this example from Simon Wood's Generalized Additive Models; an introduction with R
The smooth of ocean depth
b.depthis based on the square root of the covariate, so matches your setting. Also note that the response data are assumed to be Tweedie distributed conditional upon the model.First we create the data we want to evaluate the smooth or the model at:
Here I use a function in gratia to produce data evenly over ocean depth and then transform this to the scale used to fit the model, as well as providing a value for the offset term (which is used to normalize for the different sizes of the nets used in the data).
Option 1
In this option we are evaluating the smooth at the requested values of the covariate, but as this model involves a non-identity link function we need to backtransform to the response scale using the inverse of the link function. We only evaluate the required smooth, then we add the confidence interval, the model constant term, before back transforming everything. I could have pulled the original values for
b.depthfromds, but it is simpler in this instance to just backtransform to theb.depthscale directly usingmutate(). Then we plot.Option 2
In this option we are predicting from the model, but as the model is additive we can just select which terms we want to include in this prediction. Here I choose the intercept (the model constant term) and the smooth of ocean depth. The default in
gratia::fitted_values()is to generate predictions on the response scale — it does this by first predicting on the link scale, computing the credible interval and then transforms everything to the response scale using the inverse of the link function, which it extracts directly from the model.Both options produce
The y axis scale is now more easily interpreted as an expected count and the x axis is on the natural scale. The interpretation is the same though; conditional upon the terms in the model, the plot shows how the expected count varies as a function of ocean depth.