Although there are other reports of the same error message, none of them helps in my case.
I have prepared my own data and split it as shown below, but I cannot obtain the confusion matrix.
test_index <- createDataPartition(y = workingData$PM10, times = 1, p = 0.5, list = FALSE)
train_set <- workingData[-test_index,]
test_set <- workingData[test_index,]
train_knn <- train(PM10 ~ ., method = "knn", data = train_set)
y_hatknn <- predict(train_knn, train_set, type = "raw")
confusionMatrix(y_hatknn, test_set$PM10)
The last line above gives:
Error: `data` and `reference` should be factors with the same levels.
I would like to upload the data for reproduction, but I can only provide its basic structure:
str(workingData)
'data.frame': 3653 obs. of 3 variables:
 $ Date : num 2e+07 2e+07 2e+07 2e+07 2e+07 ...
 $ Rain_mm: num 0.1 6.7 0 1.4 0.8 1.8 15.3 0 2.6 3.8 ...
 $ PM10 : num -1 -1 -1 -1 -1 ...
The PM10 column holds the PM10 pollution levels.
How can I resolve this?
Adding more info:
After the original error:
confusionMatrix(y_hatknn, test_set$PM10)
Error: `data` and `reference` should be factors with the same levels.
I tried converting the reference to a factor:
confusionMatrix(y_hatknn, as.factor(test_set$PM10))
Error: `data` and `reference` should be factors with the same levels.
With the prediction as a factor:
confusionMatrix(as.factor(y_hatknn), test_set$PM10)
Error: `data` and `reference` should be factors with the same levels.
With both arguments as factors:
confusionMatrix(as.factor(y_hatknn), as.factor(test_set$PM10))
Error in confusionMatrix.default(as.factor(y_hatknn), as.factor(test_set$PM10)) : the data cannot have more levels than the reference
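For reference, a minimal base-R sketch of what may be happening here, using made-up numbers rather than the real data. PM10 is numeric, so the model is presumably fitted as a regression and the predictions are continuous values; converting those to factors yields one level per distinct number, and the two level sets do not match. The names `observed`, `predicted`, `breaks` and `labels`, and the cut points themselves, are assumptions for illustration only.

```r
# Hypothetical values standing in for test_set$PM10 and the knn predictions.
observed  <- c(12, 30, 45, 30, 12, 45)
predicted <- c(14.3, 28.7, 41.0, 33.3, 12.0, 47.5)

# as.factor() creates one level per distinct number, so the level sets differ
# and the predictions end up with more levels than the reference.
obs_levels  <- levels(as.factor(observed))
pred_levels <- levels(as.factor(predicted))
same_levels <- identical(obs_levels, pred_levels)

# Regression-style check: root mean squared error needs no factors at all.
rmse <- sqrt(mean((predicted - observed)^2))

# Classification-style check: bin both vectors with the SAME breaks.
# These cut points are arbitrary, not official PM10 thresholds.
breaks <- c(-Inf, 20, 40, Inf)
labels <- c("low", "medium", "high")
obs_class  <- cut(observed,  breaks = breaks, labels = labels)
pred_class <- cut(predicted, breaks = breaks, labels = labels)

# Both factors now share identical levels, so a cross-tabulation is well defined.
conf_table <- table(Prediction = pred_class, Reference = obs_class)
print(conf_table)
```

Factors built this way share the same levels, so caret's confusionMatrix(pred_class, obs_class) should accept them. If PM10 is meant to stay continuous, an error measure such as RMSE is probably the more natural summary. Note also that the predictions compared against test_set$PM10 would need to be made on test_set rather than train_set.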
I really need to get this sorted out.