I am just a beginner in Deep Learning, so I have tried to capture all the details here.
The perplex class is derived from the MPI dataset; the other emotions are derived from FER2013. Because the perplex data is scarce, I balanced the dataset by capping every emotion at 3171 training and 816 validation images (see the undersampling sketch after the table below).
Dataset size:

        perplex  happy  sad   neutral  angry
train   3171     3171   3171  3171     3171
test    816      816    816   816      815
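The balancing itself is just random undersampling per emotion (a minimal sketch of the idea, not my exact script; `files_by_emotion` is an assumed dict mapping an emotion name to its list of image paths):

```python
import random

TRAIN_PER_CLASS = 3171  # size of the smallest (perplex) training set

balanced_train = {
    emotion: random.sample(files, min(TRAIN_PER_CLASS, len(files)))
    for emotion, files in files_by_emotion.items()
}
```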
FER2013 source: downsized version of
https://www.kaggle.com/msambare/fer2013
MPI source (only camera angles 2, 3 & 4, for all actors, of the emotions Clueless, Confusion and Thinking):
https://www.b-tu.de/en/graphic-systems/databases/the-small-mpi-facial-expression-database
https://www.b-tu.de/fg-graphische-systeme/datenbanken/die-grosse-mpi-gesichtsausdrueckedatenbank
Preprocessing steps:
First, all 3171 x 5 = 15855 training images and 4079 test images were converted to 48x48 grayscale.
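The conversion is roughly this (a minimal sketch assuming OpenCV; the folder paths are placeholders):

```python
import os
import cv2

SRC_DIR = "raw_images"        # placeholder input folder
DST_DIR = "processed_48x48"   # placeholder output folder

os.makedirs(DST_DIR, exist_ok=True)
for name in os.listdir(SRC_DIR):
    img = cv2.imread(os.path.join(SRC_DIR, name))
    if img is None:
        continue                                   # skip unreadable files
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # drop colour channels
    small = cv2.resize(gray, (48, 48))             # match FER2013 resolution
    cv2.imwrite(os.path.join(DST_DIR, name), small)
```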
Samples: sample images for Angry, Happy, Perplex and Sad.
CNN architecture using TFLearn:

import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

# Input layer: 48x48 grayscale images
convnet = input_data(name="input", shape=[None, 48, 48, 1])
# Convolution + max-pooling stacks
convnet = conv_2d(convnet, 32, 5, activation="relu")
convnet = max_pool_2d(convnet, 5)
convnet = conv_2d(convnet, 64, 5, activation="relu")
convnet = max_pool_2d(convnet, 5)
convnet = conv_2d(convnet, 128, 5, activation="relu")
convnet = max_pool_2d(convnet, 5)
convnet = conv_2d(convnet, 64, 5, activation="relu")
convnet = max_pool_2d(convnet, 5)
convnet = conv_2d(convnet, 32, 5, activation="relu")
convnet = max_pool_2d(convnet, 5)
# Fully connected layer with dropout
convnet = fully_connected(convnet, 1024, activation="relu")
convnet = dropout(convnet, 0.5)
# Output layer: 5 emotion classes
convnet = fully_connected(convnet, 5, activation="softmax")
convnet = regression(convnet, optimizer="SGD", learning_rate=0.001,
                     loss="categorical_crossentropy", name="targets")
# best_cp_path and log_path are defined earlier in my script
model = tflearn.DNN(convnet,
                    best_checkpoint_path=best_cp_path + '/',
                    best_val_accuracy=0.62,  # expects a float threshold, not a string
                    tensorboard_dir=log_path,
                    tensorboard_verbose=3)
import numpy as np

image_size = 48
channel = 1
# Split the (already shuffled) train/test lists into image arrays (x) and one-hot labels (y)
train_x = np.array([index[0] for index in train]).reshape(-1, image_size, image_size, channel)
train_y = np.array([index[1] for index in train])
test_x = np.array([index[0] for index in test]).reshape(-1, image_size, image_size, channel)
test_y = np.array([index[1] for index in test])
model.fit(
    train_x,
    train_y,
    validation_set=(test_x, test_y),
    n_epoch=500,
    snapshot_step=500,
    show_metric=True,
    run_id="ED_SGD-0.001",
    snapshot_epoch=True
)
I shuffle all training and validation files before splitting them into the train and test inputs shown above.
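The shuffle itself is nothing special (a sketch; `train` and `test` are the lists of (image, one-hot label) pairs used above):

```python
import random

random.shuffle(train)  # shuffle (image, label) pairs in place before building train_x/train_y
random.shuffle(test)
```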
The training starts overfitting after roughly 150 of the 500 epochs. I am using plain SGD without momentum or learning-rate decay; I also tried Adam and hit the same overfitting issue. val_accuracy is 0.62 while train_accuracy is 0.8+.
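For clarity, this is what adding momentum and learning-rate decay would look like in TFLearn (a minimal sketch only; the 0.9 momentum and the decay schedule are placeholder values I have not tested):

```python
import tflearn
from tflearn.layers.estimator import regression

# SGD with momentum plus exponential learning-rate decay (placeholder values)
momentum = tflearn.Momentum(learning_rate=0.001, momentum=0.9,
                            lr_decay=0.96, decay_step=200)
convnet = regression(convnet, optimizer=momentum,
                     loss="categorical_crossentropy", name="targets")
```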
Early saving method: I save the model files that reach 0.62 validation accuracy to the model folder.
One oddity in my mind:
If you look closely, all the perplex images have a black background because they were taken in a lab environment. The other emotions come from FER2013 and were taken in the wild, with various shades of grey in the background (and some black as well).
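To illustrate the background idea I ask about below (a rough sketch only, assuming OpenCV-style NumPy arrays; the near-black threshold of 20 is an arbitrary guess):

```python
import numpy as np

def randomize_background(img, threshold=20):
    """Replace near-black background pixels of a 48x48 grayscale image
    with a random grey shade (crude: may also hit dark facial pixels)."""
    grey_value = np.random.randint(40, 200)
    out = img.copy()
    out[out < threshold] = grey_value
    return out
```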
How can I overcome this overfitting issue?
Which hyper-parameter values should I tune?
Should I upscale to 7000+ images per class, as in the FER2013 dataset?
Should I randomly apply different shades of grey as backgrounds to the perplex images?
How can I increase the accuracy?
Loss curve:
Last epoch values (taken from the TensorBoard graph; some values I did not save from the terminal):
Training Step: 17200+ | total loss: 1.7+ | time:
| SGD | epoch: 500 | **loss: 0.5236** - acc: 0.8270 | val_loss: 1.30219 - val_acc: 0.6239 -- iter: 15855/15855
GitHub: you are warmly welcome to check out the repository.