I am trying to run a linear regression on weighted data.
When using speedlm, I get an error message when there are missing values in the data.
library(speedglm)
sampleData <- data.frame(w = round(runif(12,0,1)),
target = rnorm(12,100,50),
predictor = c(NA, rnorm(10, 40, 10),NA))
summary(sampleData)
       w              target          predictor
 Min.   :0.0000   Min.   : -3.381   Min.   :22.58
 1st Qu.:0.0000   1st Qu.: 48.321   1st Qu.:30.45
 Median :1.0000   Median : 84.156   Median :37.09
 Mean   :0.5833   Mean   : 92.306   Mean   :35.03
 3rd Qu.:1.0000   3rd Qu.:119.891   3rd Qu.:41.96
 Max.   :1.0000   Max.   :223.896   Max.   :43.48
                                    NA's   :2
#run linear regression without weights
linearNoWeights <- lm(formula("target~predictor"), data = sampleData)
speedLinearNoWeights <- speedlm(formula("target~predictor"), data = sampleData)
#run linear regression with weights
linearWithWeights <- lm(formula("target~predictor"), data = sampleData, weights = sampleData[, "w"])
speedLinearWithWheights <- speedlm(formula("target~predictor"), data = sampleData, weights = sampleData[, "w"])
Error in base::crossprod(x, y) : non-conformable arguments
In addition: Warning messages:
1: In sqw * X :
  longer object length is not a multiple of shorter object length
2: In sqw * y :
  longer object length is not a multiple of shorter object length
Called from: base::crossprod(x, y)
Is there any way around this that does not force me to fix the data before running the regression?
You should try changing the na.action option. Below is your code, which I am able to run when I change na.action to na.exclude or na.omit. You can go through the documentation for na.omit and na.exclude to understand when to use which. Hope this helps.
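Separately from the na.action suggestion, here is a minimal sketch of another way to keep the weights and the rows in step. It assumes the error comes from the weights vector keeping all 12 entries while the rows with NA are dropped from the model matrix, which is what the "longer object length" warnings point to. The names keep, cleanData, fitLm, fitLmClean and fitSpeed are made up for illustration, and the weights are drawn strictly positive here (unlike the 0/1 weights in the question) only to keep the example well-conditioned. The speedlm call runs only if speedglm is installed.

```r
# Illustrative data shaped like the question's, but with strictly positive
# weights (an assumption made for this sketch).
set.seed(42)
sampleData <- data.frame(
  w         = runif(12, 0.5, 1.5),
  target    = rnorm(12, 100, 50),
  predictor = c(NA, rnorm(10, 40, 10), NA)
)

# Rows in which the response, the predictor and the weight are all observed.
keep <- complete.cases(sampleData[, c("target", "predictor", "w")])
cleanData <- sampleData[keep, ]

# lm() evaluates the weights inside the model frame, so the NA rows and
# their weights are dropped together.
fitLm <- lm(target ~ predictor, data = sampleData, weights = w)

# The same fit on the pre-filtered rows should give the same coefficients.
fitLmClean <- lm(target ~ predictor, data = cleanData, weights = w)
coefficientsAgree <- isTRUE(all.equal(coef(fitLm), coef(fitLmClean)))

# Pass speedlm() rows and weights of matching length.
if (requireNamespace("speedglm", quietly = TRUE)) {
  fitSpeed <- speedglm::speedlm(target ~ predictor, data = cleanData,
                                weights = cleanData$w)
  print(fitSpeed$coefficients)
}
```

This still filters the rows, but it does so inline at the call site and leaves sampleData itself untouched, so it may be an acceptable compromise if changing na.action does not work in your version of speedglm.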