I'm having trouble understanding and testing missing data. My goal is to use a logistic regression to help infer whether a missing value is MAR (missing at random) or not. For this example, I am using the oceanbuoys dataset from the naniar package, along with bind_shadow() and good old ggplot.
From the logistic regression, we can infer that sea_temp_c does help predict the missing values flagged in air_temp_c_NA, as shown below:
library(naniar)
library(dplyr)
test <- oceanbuoys %>% bind_shadow()
model_log <- glm(air_temp_c_NA ~ sea_temp_c, family = "binomial", data = test)
summary(model_log)
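For context, here is a minimal sketch of how the response can be inspected before modelling. It assumes the shadow column created by bind_shadow() is a factor with the levels "!NA" and "NA", which is how naniar documents it. Nothing here changes the model above; it only shows what glm() is being given.

```r
library(naniar)

# Add the shadow columns (one "_NA" column per original variable)
test <- bind_shadow(oceanbuoys)

# The shadow column is a factor, not a 0/1 numeric
is.factor(test$air_temp_c_NA)
levels(test$air_temp_c_NA)

# Counts of observed ("!NA") vs missing ("NA") air temperatures
table(test$air_temp_c_NA)

# With family = "binomial" and a factor response, glm() treats the first
# level as failure and all other levels as success, so the model above
# estimates the probability that air_temp_c is missing.
```

This matters for the later plot because ggplot2 handles a factor on the y axis differently from how glm() handles a factor response.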
So from here I wanted to visualise the model to get a better understanding of it. However, my attempts return a "model did not converge" warning.
What am I missing here? I would have assumed that sea_temp_c being a significant predictor of air_temp_c_NA would allow the model to converge for the visualisation.
library(ggplot2)
ggplot(test, aes(y = air_temp_c_NA, color = air_temp_c_NA, x = sea_temp_c)) +
  geom_point() +
  geom_smooth(method = "glm", method.args = list(family = "binomial"), se = FALSE)
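One thing that may be worth ruling out, sketched here as an assumption rather than a confirmed diagnosis: in the plot above, the y aesthetic is a factor and the colour aesthetic splits the data into two groups, so geom_smooth() may be fitting a separate model inside each group, where the response never varies. The sketch below recodes the shadow column as a 0/1 numeric and maps colour only inside geom_point(). The names plot_data and air_temp_missing are made up for illustration.

```r
library(naniar)
library(dplyr)
library(ggplot2)

plot_data <- oceanbuoys %>%
  bind_shadow() %>%
  # 1 = air_temp_c is missing, 0 = air_temp_c is observed
  mutate(air_temp_missing = as.integer(air_temp_c_NA == "NA")) %>%
  # drop rows where the predictor itself is missing
  filter(!is.na(sea_temp_c))

p <- ggplot(plot_data, aes(x = sea_temp_c, y = air_temp_missing)) +
  # colour is mapped here only, so the smoother sees a single group
  geom_point(aes(colour = air_temp_c_NA), alpha = 0.4) +
  geom_smooth(method = "glm", method.args = list(family = "binomial"), se = FALSE)

p
```

If the warning disappears with this version, that would point to the factor response and the colour grouping, not the predictor, as the source of the problem.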