Regarding the number of features extracted from an image for training

121 views Asked by At

I am building a software to classify cells from images taken by a microscope.

I have a dataset of images of cells to use as training dataset - I have extracted Keypoints from each image using ORB - Here is my problem - some image produce a lot of keypoints and some small number of keypoints. Thus the descriptor vectors produced are of different lentgh. So when i try to build a training matrix from them i have to 'Normalize' the number of Keypoints chosen from each Image so that the length of the descriptor vectors will be the same.

How many key points should i pick and which? how to pick the 'Best' Keypoints? (this question also rises when i want to preform a prediction on an object i want to classify) are there known approaches to this problem? Regards.

2

There are 2 answers

1
Ajay On

You could use bag of words approach to classify your images. You first have to collect all keypoint descriptors and cluster them into a certain number of groups. Each group (cluster) is your word. A descriptor corresponds to a word. For each image now, you can build a histogram by counting the occurrence of words. You can then normalize the histogram to remove the effect of varying number of keypoints in different images.

Using spatial pyramid matching could be another solution.

0
Diego On

The simplest approach, as described by Ajay, is to cluster keypoints into N clusters and then define N binary features, such that for a given sample, feature i equals 1 if the sample shows a keypoint in cluster i, and 0 otherwise.

Another approach is to use a kernel classifier, like Support Vector Machines (SVM), and use a kernel that accepts variable-length vectors (e.g. Fisher kernel).