Is it an overfitting problem for SVM classification?


I am new to machine learning, and I want to detect emotions from face images.

Preprocessing: I used equalizeHist to equalize the histogram of the grayscale images (JAFFE database, 213 images) in order to normalize brightness and increase contrast.
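To make the preprocessing step concrete, here is a minimal NumPy sketch of the idea behind histogram equalization. It is not OpenCV's implementation, the function name `equalize_hist` is hypothetical, and the low-contrast test image is synthetic.

```python
import numpy as np

def equalize_hist(gray: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image (illustrative sketch)."""
    hist = np.bincount(gray.ravel(), minlength=256)  # pixel count per gray level
    cdf = hist.cumsum()                              # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                        # CDF at the darkest level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                                   # flat image: nothing to stretch
        return gray.copy()
    # Map each gray level through the normalized CDF onto the full 0..255 range.
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[gray]

# Synthetic low-contrast image: all values squeezed into [100, 130).
rng = np.random.default_rng(0)
img = rng.integers(100, 130, size=(64, 64), dtype=np.uint8)
out = equalize_hist(img)
print(img.min(), img.max(), "->", out.min(), out.max())
```

Whatever preprocessing is applied to the training images has to be applied in exactly the same way to any new image before feature extraction.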

Feature extraction: I extract features from the images with Gabor filters and get a 213x120 matrix. I split the data into 60% training data and 40% test data, and I normalize it.

Train and test: to train the model, I use an SVM classifier with an RBF kernel. With grid search, I select the best (C, gamma) pair using 10-fold cross-validation. Then I test the model on the test data (unseen data) and get 89% accuracy.
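One way to look for overfitting in this setup is to compare training, cross-validation, and test accuracy side by side. The sketch below does that on synthetic data shaped like the feature matrix described above. The data, the 7-class labels, and the `param_grid` values are assumptions for illustration, not the real features or grid. It also wraps the scaler and the SVM in a `Pipeline`, so the scaler is refit inside each cross-validation fold.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the (213, 120) feature matrix; 7 classes is an assumption.
rng = np.random.default_rng(0)
y = np.arange(213) % 7
X = rng.normal(size=(213, 120)) + y[:, None] * 0.5  # class-dependent shift

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)

# Scaling lives inside the pipeline, so each CV fold fits its own scaler.
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
param_grid = {  # hypothetical grid
    "svm__C": [1, 10, 100],
    "svm__gamma": ["scale", 0.01, 0.001],
}
search = GridSearchCV(pipe, param_grid, cv=10)
search.fit(X_train, y_train)

train_acc = search.score(X_train, y_train)  # accuracy on data the model has seen
cv_acc = search.best_score_                 # mean 10-fold accuracy of the best pair
test_acc = search.score(X_test, y_test)     # accuracy on the held-out 40%
print(search.best_params_)
print(f"train={train_acc:.3f}  cv={cv_acc:.3f}  test={test_acc:.3f}")
```

A training accuracy far above the cross-validation and test accuracy would point toward overfitting. If all three are similar and only new images fail, a mismatch between how the training images and the new image are processed is another thing to check.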

The problem is that when I try to predict the emotion for a new input face, I get a wrong result. Is it an overfitting problem?
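A quick check that might help separate overfitting from a data mismatch is to see whether the new feature vector looks like the training data at all once it has been standardized. This is a sketch under assumptions: `out_of_range_fraction` is a hypothetical helper, the threshold of 3 standard deviations is arbitrary, and the arrays are synthetic stand-ins for Gabor features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def out_of_range_fraction(scaler, new_vect, threshold=3.0):
    """Fraction of features whose standardized value exceeds `threshold` in magnitude."""
    z = scaler.transform(np.reshape(new_vect, (1, -1)))
    return float(np.mean(np.abs(z) > threshold))

# Synthetic training features and a scaler fitted on them.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(127, 120))
scaler = StandardScaler().fit(X_train)

# One vector drawn like the training data, one with a strongly shifted mean
# (for example from different lighting, cropping, or missing preprocessing).
similar = rng.normal(loc=5.0, scale=2.0, size=120)
shifted = rng.normal(loc=20.0, scale=2.0, size=120)

frac_similar = out_of_range_fraction(scaler, similar)
frac_shifted = out_of_range_fraction(scaler, shifted)
print(f"similar: {frac_similar:.2f}  shifted: {frac_shifted:.2f}")
```

If a large share of the features of the new face falls far outside the training range, the classifier is being asked about data it never saw anything like, which is a different problem from overfitting.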

UPDATE 1: feature extraction

gf = GaborFilter(ksize=(11, 11), freq_nbr=5, or_nbr=8, lambd=4, sigma=8, gamma=5, psi=5*np.pi/6) #Gabor filter with *cv2.getGaborKernel* 

data = []
for j in range(len(roi1_images)): # len(roi1_images) = 213
  vect1 = feature_extraction_roi_avr(roi1_images[j], gf.kernels) # feature vect from left eye (40 features)
  vect2 = feature_extraction_roi_avr(roi2_images[j], gf.kernels) # feature vect from right eye (40 features)
  vect3 = feature_extraction_roi_avr(roi3_images[j], gf.kernels) # feature vect from mouth (40 features)
  vect = np.concatenate((vect1, vect2, vect3), axis=None) # feature vect of one face (120 features)
  data.append(vect)
data = np.array(data) # Data matrix (213, 120)

UPDATE 2: Learn and test model


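from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

clf = GridSearchCV(estimator=SVC(kernel='rbf'), param_grid=svm_parameters, cv=10, n_jobs=-1) # grid search with 10-fold cross-validation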
scaler = StandardScaler()
 
# Split data
X_train, X_test, y_train, y_test = train_test_split(data, data_labels, random_state=0, test_size=0.4, stratify=data_labels)

X_train = scaler.fit_transform(X_train) # Normalize train data
clf.fit(X_train, y_train) # fit SVM model
X_test = scaler.transform(X_test) # Normalize test data
score = clf.score(X_test, y_test) # calculate mean accuracy
print(score) # score accuracy = 0.8953488372093024

UPDATE 3: predict new input

    landmarks, gs_image = detect_landmarks(path_image)
    roi1_image = extract_roi(gs_image, landmarks, 1) #extract region 1 (eye 1)
    roi2_image = extract_roi(gs_image, landmarks, 2) #extract region 2 (eye 2)
    roi3_image = extract_roi(gs_image, landmarks, 3) #extract region 3 (mouth)
    vect1 = feature_extraction_roi_avr(roi1_image, gf.kernels)
    vect2 = feature_extraction_roi_avr(roi2_image, gf.kernels) 
    vect3 = feature_extraction_roi_avr(roi3_image, gf.kernels)
    vect = np.concatenate((vect1, vect2, vect3), axis=0)
    vect = np.reshape(vect, (1,-1))
    vect = scaler.transform(vect)
    class_ = clf.predict(vect)[0]
    print(class_,end=" ")
0
