I am currently working on a model that has to predict materials such as ladders, nuts, bolts, mouse, bottles, etc. I have written an algorithm for this which is working okay as of now. The images are available on my local computer, and I have enough data for both training and testing. As of now, I have a total of 26 image classes to predict, all of them material types.

Now, this is fine, but I also want to handle the case where an image doesn't belong to any of these classes: the model should report that this is not a material but a different kind of picture altogether.

To do this, I am thinking of also training my model on a second, different set of images (e.g. ImageNet), so that for any non-material image it would return something like "this is not material!"

So basically, the same model would get trained on two different datasets: one is my material dataset, and the other is anything other than materials, like the images in ImageNet.

My question is: how do I approach this? Do I even need to do this? Or should I just write a simple if-else and treat anything the model does not recognize as a material as non-material?

1 Answer

nuric On Best Solutions

You can just merge the two datasets and label the images that do not belong to the 26 classes as a special 27th class. Whenever your model predicts that class, you know the image is not part of your dataset. For example:

pred = [0.1, 0.1, 0.8] # Assume label 2 is not-this-dataset label

Then you can use images from the other dataset with label 2 and train as usual in a training cycle. Make sure to balance the dataset, i.e. that there aren't proportionally too many of the special not-this-dataset labels, so your model doesn't overfit and just predict that everything is not from your original dataset.
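A rough sketch of the merge-and-balance step for the full 26-class case could look like the following. The function names, the file paths, and the balancing rule (cap the extra class at roughly the average number of images per material class) are my own assumptions for illustration, not something prescribed above; tune the cap for your data.

```python
import random

NUM_MATERIAL_CLASSES = 26             # from the question
NOT_MATERIAL = NUM_MATERIAL_CLASSES   # index 26, i.e. the special 27th class


def merge_datasets(material, other_images, max_other=None, seed=0):
    """Merge material samples with non-material images (hypothetical helper).

    material:     list of (image_path, label) with labels 0..25
    other_images: list of image paths that are not materials
    max_other:    cap on non-material samples; defaults to the average
                  number of images per material class (an assumed heuristic)
    """
    if max_other is None:
        max_other = len(material) // NUM_MATERIAL_CLASSES
    rng = random.Random(seed)
    # Subsample so the extra class does not dominate the merged set.
    k = min(max_other, len(other_images))
    sampled = rng.sample(other_images, k)
    merged = list(material) + [(path, NOT_MATERIAL) for path in sampled]
    rng.shuffle(merged)
    return merged


def interpret(pred):
    """Map a list of 27 class scores to a label or a 'not material' message."""
    label = max(range(len(pred)), key=pred.__getitem__)  # argmax
    return "this is not material!" if label == NOT_MATERIAL else label


if __name__ == "__main__":
    # Placeholder file names, 10 images per material class.
    material = [(f"material_{i}.jpg", i % NUM_MATERIAL_CLASSES) for i in range(260)]
    other = [f"other_{i}.jpg" for i in range(1000)]
    merged = merge_datasets(material, other)
    print(len(merged), sum(1 for _, y in merged if y == NOT_MATERIAL))
```

The merged list can then be fed to whatever training loop you already have, with the output layer widened from 26 to 27 units.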