How to locate individual samples that have been misclassified using kNN, in R?

Question

561 views · Asked by daisybeats on 22 October 2020 at 06:17

Using the iris dataset in R, I am looking at classification with kNN. I am interested in finding the observations in the test set that have been misclassified. I was able to produce scatter plots that give a visual impression of which observations were misclassified. However, how can I locate and list all of the misclassified observations? The code I used to produce the scatter plots is included below; it is from https://rpubs.com/Tonnia/irisknn

set.seed(12345)
allrows <- 1:nrow(iris)
# Draw 80% of the rows for training; the rest form the test set
trainrows <- sample(allrows, replace = FALSE, size = 0.8 * length(allrows))
train_iris <- iris[trainrows, 1:4]   # features
train_label <- iris[trainrows, 5]    # Species
table(train_label)
test_iris <- iris[-trainrows, 1:4]
test_label <- iris[-trainrows, 5]
table(test_label)

library(class)  # provides knn()

# Training error for k = 1..30
error.train <- numeric(30)
for (k in 1:30) {
  pred_iris <- knn(train = train_iris, test = train_iris, cl = train_label, k = k)
  error.train[k] <- 1 - mean(pred_iris == train_label)
}

# Test error for k = 1..30
error.test <- numeric(30)
for (k in 1:30) {
  pred_iris <- knn(train = train_iris, test = test_iris, cl = train_label, k = k)
  error.test[k] <- 1 - mean(pred_iris == test_label)
}

plot(error.train, type="o", ylim=c(0,0.15), col="blue", xlab = "K values", ylab = "Misclassification errors")
lines(error.test, type = "o", col="red")
legend("topright", legend=c("Training error","Test error"), col = c("blue","red"), lty=1:1)

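library(tidyverse)  # provides %>% and ggplot()

# Predict the test set with k = 6
pred_iris <- knn(train = train_iris, test = test_iris, cl = train_label, k = 6)
result <- cbind(test_iris, pred_iris)         # features + predicted label
combinetest <- cbind(test_iris, test_label)   # features + true label

# Scatter plot coloured by predicted label
result %>%
  ggplot(aes(x = Petal.Width, y = Petal.Length, color = pred_iris)) +
  geom_point(size = 3)

# Scatter plot coloured by true label
combinetest %>%
  ggplot(aes(x = Petal.Width, y = Petal.Length, color = test_label)) +
  geom_point(size = 3)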

There is 1 answer

**Kezrael** · Accepted Answer · 2020-10-22T07:31:52+00:00

In your code, pred_iris holds the predictions from the most recent knn() call, which at the end of your script are the test-set predictions for k = 6.

Once you have built combinetest, near the end of your code, you can subset it like this:

combinetest[test_label != pred_iris,]

This returns the test rows whose predicted label differs from the true label.
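Since combinetest only carries the true label, it can help to keep the prediction next to it. Below is a minimal, self-contained sketch in base R that rebuilds the split from the question and lists the misclassified test rows with both labels. The names compare, actual, predicted, wrong and misclassified are illustrative choices, not part of the original code.

```r
library(class)  # provides knn()

# Same 80/20 split and seed as in the question
set.seed(12345)
allrows <- 1:nrow(iris)
trainrows <- sample(allrows, replace = FALSE, size = 0.8 * length(allrows))
train_iris  <- iris[trainrows, 1:4]
train_label <- iris[trainrows, 5]
test_iris   <- iris[-trainrows, 1:4]
test_label  <- iris[-trainrows, 5]

# Test-set predictions for k = 6
pred_iris <- knn(train = train_iris, test = test_iris, cl = train_label, k = 6)

# True and predicted labels side by side (column names are illustrative)
compare <- cbind(test_iris, actual = test_label, predicted = pred_iris)

# TRUE where the prediction differs from the true label
wrong <- compare$actual != compare$predicted

# The misclassified observations, with both labels
misclassified <- compare[wrong, ]
print(misclassified)

# Row names are inherited from iris, so they give the original row numbers
print(rownames(misclassified))
```

Because test_iris is a subset of iris, the row names of misclassified point back to the rows of the full dataset, which makes it easy to look the samples up again.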

Alternatively, with more readable tidyverse syntax:

library(tidyverse)
combinetest %>%
    filter(test_label != pred_iris)
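To see which classes the misclassified samples fall between, one option is to cross-tabulate the true and predicted labels. This is a rough sketch under the same split as the question; confusion and n_wrong are illustrative names, not part of the original code.

```r
library(class)  # provides knn()

# Same 80/20 split and seed as in the question
set.seed(12345)
allrows <- 1:nrow(iris)
trainrows <- sample(allrows, replace = FALSE, size = 0.8 * length(allrows))
train_iris  <- iris[trainrows, 1:4]
train_label <- iris[trainrows, 5]
test_iris   <- iris[-trainrows, 1:4]
test_label  <- iris[-trainrows, 5]

# Test-set predictions for k = 6
pred_iris <- knn(train = train_iris, test = test_iris, cl = train_label, k = 6)

# Rows: true species, columns: predicted species
confusion <- table(actual = test_label, predicted = pred_iris)
print(confusion)

# Off-diagonal cells count the misclassified test samples
n_wrong <- sum(confusion) - sum(diag(confusion))
print(n_wrong)
```

The off-diagonal total should match the number of rows returned by the filter above, which gives a quick consistency check.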
