I've been trying to build a CNN to classify MFCC data, but the model starts overfitting almost immediately.
Data:
- 18,000 files (80% train, 20% test)
- 5 labels
The 5 classes are balanced (equal number of files each). The model is meant to eventually handle far more than 18k files, so I've been told to shrink the network however I can, which might also help.
I've reduced the filter size from (3,3) to (1,1), tried reducing the number of hidden neurons, and even removed layers. I'm simply stuck; does anyone have any ideas?
No matter what I change, accuracy on the test data never gets above 60-65%.
Model code:
import time

from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.models import Model
from keras.optimizers import Nadam

feature_count = 192  # inputs are (192, 192, 1) MFCC arrays, per the summary below
d = (1, 1)           # kernel size, reduced from (3, 3)
out_dim = 5          # number of classes

time_start_train = time.time()

i = Input(shape=(feature_count, feature_count, 1))
m = Conv2D(16, d, activation='elu', padding='same')(i)
m = MaxPooling2D()(m)
m = Conv2D(32, d, activation='elu', padding='same')(m)
m = MaxPooling2D()(m)
m = Conv2D(64, d, activation='elu', padding='same')(m)
m = MaxPooling2D()(m)
m = Conv2D(128, d, activation='elu', padding='same')(m)
m = MaxPooling2D()(m)
m = Conv2D(256, d, activation='elu', padding='same')(m)
m = MaxPooling2D()(m)
m = Flatten()(m)
m = Dense(512, activation='elu')(m)
m = Dropout(0.2)(m)
o = Dense(out_dim, activation='softmax')(m)

model = Model(inputs=i, outputs=o)
model.compile(loss='categorical_crossentropy', optimizer=Nadam(lr=1e-3), metrics=['accuracy'])

# data_train[0] holds the MFCC arrays, data_train[1] the one-hot labels
history = model.fit(data_train[0], data_train[1], epochs=10, verbose=1, validation_split=0.1, shuffle=True)
Model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 192, 192, 1)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 192, 192, 16)      32
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 96, 96, 16)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 96, 96, 32)        544
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 48, 48, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 48, 48, 64)        2112
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 24, 24, 64)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 24, 24, 128)       8320
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 128)       0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 12, 12, 256)       33024
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 6, 6, 256)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               4719104
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 2565
=================================================================
Total params: 4,765,701
Trainable params: 4,765,701
Non-trainable params: 0
Try applying L1/L2 regularization. Your summary shows the 512-unit dense layer alone holds about 4.7M of the 4.77M parameters, so a weight penalty there is a natural first step.
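As a minimal sketch of what that could look like with Keras's kernel_regularizer argument (the 1e-4 penalty strength is a guess to tune on your validation split, not a tested value), the dense head from your model code would become:

from keras import regularizers

# Continuing from the model code above, where m is the flattened feature
# tensor and out_dim = 5. The L2 strength of 1e-4 is only a starting point.
m = Dense(512, activation='elu',
          kernel_regularizer=regularizers.l2(1e-4))(m)
m = Dropout(0.2)(m)
o = Dense(out_dim, activation='softmax')(m)

The same kernel_regularizer argument works on the Conv2D layers, and regularizers.l1_l2(l1=1e-5, l2=1e-4) applies both penalties at once if you want to combine them.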