Why am I getting an error in transfer learning?


I am training a model for Optical Character Recognition of the Gujarati language. The input is a character image. I have 37 classes, with 22200 training images (600 per class) and 5920 testing images (160 per class). My input images are 32x32.

Below is my code:

model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max')
base_inputs = model.layers[0].input
base_outputs = model.layers[-1].output # NOTICE -1 not -2
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37)(prefinal_outputs)
new_model = keras.Model(inputs=base_inputs, outputs=base_outputs)
    
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=False)

test_datagen = ImageDataGenerator(horizontal_flip=False)

training_set = train_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/train',
                                                 target_size=(32, 32),
                                                 batch_size=64,
                                                 class_mode='categorical')

test_set = test_datagen.flow_from_directory('C:/Users/shweta/Desktop/characters/test',
                                            target_size=(32, 32),
                                            batch_size=64,
                                            class_mode='categorical')

new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

new_model.fit_generator(training_set,
                        epochs=25,
                        validation_data=test_set,
                        shuffle=True)
new_model.save('alphanumeric.mod')

I am getting the following output:

[screenshot: output of code]

Thanks in advance!


There are 2 answers

Answer from Gerry P:

The code should be:

model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', pooling='max', input_shape=(32,32,3))
base_outputs = model.layers[-1].output
prefinal_outputs = layers.Dense(1024)(base_outputs)
final_outputs = layers.Dense(37, activation='softmax')(prefinal_outputs)
new_model = keras.Model(inputs=model.input, outputs=final_outputs)
new_model.compile(tf.keras.optimizers.Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

Also, you should use model.fit going forward. model.fit now works with generators, and model.fit_generator will be deprecated in future versions of TensorFlow. I ran this against your dataset and got accurate results in about 10 epochs. Here is some additional advice: it is best to use an adjustable learning rate, and the Keras callback ReduceLROnPlateau makes this easy to do. Documentation is here. Set it to monitor the validation loss; my use is shown below.

lr_adjust = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1, verbose=1,
                                                 mode="auto", min_delta=0.00001, cooldown=0, min_lr=0)

Also I recommend using the callback ModelCheckpoint. Documentation is here. Set it up to monitor validation loss and it will save the weights that achieved the lowest validation loss. My implementation is shown below.

save_loc = r'c:\Temp'  # set this to the path where you want to save the weights
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1, save_best_only=True,
                                                save_weights_only=True, mode='auto', save_freq='epoch', options=None)
callbacks=[checkpoint, lr_adjust]

In model.fit, include callbacks=callbacks. When training is completed, you want to load these saved weights into the model, then save the model. You can use the saved model to make predictions. Code is below.

new_model.load_weights(save_loc)
new_model.save(save_loc)
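
For completeness, a minimal sketch of the fit call with these callbacks (the generator names, epoch count, and callbacks list are carried over from the snippets above):

# train with the adjustable learning rate and checkpointing callbacks defined above
new_model.fit(training_set,
              epochs=25,
              validation_data=test_set,
              callbacks=callbacks,  # [checkpoint, lr_adjust]
              verbose=1)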
Answer from Seth P:

First of all, very well written code. These are some of the things I noticed while going through the code and the tf.keras docs.

I would like to ask what kind of labels you have, because categorical_crossentropy expects one-hot encoded labels (check the docs). So, if your labels are integers, use sparse_categorical_crossentropy.
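
A minimal sketch of the two compile options (which one applies depends on how the labels are produced; with flow_from_directory, class_mode='categorical' yields one-hot labels and class_mode='sparse' yields integer labels):

# labels are one-hot encoded (e.g. flow_from_directory with class_mode='categorical')
new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# labels are plain integer class indices (e.g. class_mode='sparse')
new_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])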

Similar issue: there was a post where someone was trying to classify into 2 classes and used categorical instead of binary cross-entropy, if you want to look at it.
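
For reference, a rough sketch of that two-class distinction (hypothetical output heads, reusing the prefinal_outputs tensor from the answer above; not part of this 37-class OCR model):

# Option A: one sigmoid unit with binary cross-entropy (labels are 0/1)
binary_out = layers.Dense(1, activation='sigmoid')(prefinal_outputs)
binary_model = keras.Model(inputs=model.input, outputs=binary_out)
binary_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Option B: two softmax units with categorical cross-entropy (labels are one-hot, e.g. [1, 0])
softmax_out = layers.Dense(2, activation='softmax')(prefinal_outputs)
softmax_model = keras.Model(inputs=model.input, outputs=softmax_out)
softmax_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])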

Cheers! Let me know how it goes!

PS: @Gerry made a very good point: if your labels are one-hot encoded, use categorical_crossentropy!