I have a two-layer LSTM network and I am trying to do transfer learning by partially re-training the first layer. I load the weights of a previously trained LSTM model and set the second layer to be untrainable. In the first layer, I want to re-train only a fraction of the weights (let us say 10%, chosen at random) and keep the rest of the first layer frozen.
Part of the code is as follows. I am also not sure whether unit_index = i // 4 needs to be considered.
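# imports (assuming the tf.keras API; adjust if you use standalone keras)
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

model = Sequential()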
model.add(LSTM(12, input_shape=(NUM_IN_SEQUENCE,4), activation = "tanh", recurrent_initializer='orthogonal', return_sequences=True, use_bias=True))
model.add(LSTM(6, input_shape=(NUM_IN_SEQUENCE,4), activation = "tanh", recurrent_initializer='orthogonal', return_sequences=True, use_bias=True))
model.add(Dense(2, activation='linear'))
earlystop=EarlyStopping(monitor='val_loss', min_delta=0, patience=3, verbose=0, mode='auto')
callbacks_list = [earlystop]
# load the previously trained weights
model.load_weights('rnn_model_2.h5')
# freeze the weights of the second layer
model.layers[1].trainable = False
# randomly select 10% of the units in the first layer to be trainable
total_units_layer_1 = model.layers[0].units
num_trainable_units_layer_1 = int(0.1 * total_units_layer_1)
# randomly select indices of units to be trainable
trainable_indices = np.random.choice(range(total_units_layer_1), size=num_trainable_units_layer_1, replace=False)
# assign the selected units trainable
for i, weight in enumerate(model.layers[0].trainable_weights):
    unit_index = i // 4 # each LSTM unit has 4 weights (input, recurrent, bias, and cell)
    if unit_index in trainable_indices:
        weight._trainable = True
    else:
        weight._trainable = False
# compile model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mse','mae'])
callbacks = [
    EarlyStoppingByLossVal(monitor='val_loss', value=2e-6, verbose=1),
]
model.fit(x_train, y_train, epochs=50, batch_size=64, validation_data=(x_test, y_test), callbacks=callbacks, verbose=1)
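A note on the unit_index = i // 4 doubt, offered as a sketch rather than a definitive answer. As far as I know, a Keras LSTM layer exposes three weight tensors (kernel, recurrent_kernel, bias) rather than four weights per unit, so the loop above only sees i = 0, 1, 2 and unit_index is always 0. The trainable flag also applies to a whole tensor, so it cannot freeze individual entries. One common workaround is to multiply the gradients by a 0/1 mask before the update. The NumPy example below only illustrates the mask layout and a masked SGD step. The helper names lstm_unit_masks and masked_sgd_step are hypothetical, and the layout (four gate blocks of width units concatenated along the last axis) is an assumption worth checking against your Keras version.

```python
import numpy as np

def lstm_unit_masks(input_dim, units, trainable_units):
    """Hypothetical helper: 0/1 masks shaped like an LSTM layer's three tensors.

    Assumes the four gates are concatenated along the last axis, so unit j
    owns columns j, units + j, 2 * units + j and 3 * units + j.
    """
    cols = np.concatenate(
        [np.asarray(trainable_units) + g * units for g in range(4)]
    )
    # Assumed shapes of kernel, recurrent_kernel and bias.
    shapes = [(input_dim, 4 * units), (units, 4 * units), (4 * units,)]
    masks = []
    for shape in shapes:
        mask = np.zeros(shape, dtype=np.float32)
        mask[..., cols] = 1.0  # mark the chosen units' columns as trainable
        masks.append(mask)
    return masks

def masked_sgd_step(weights, grads, masks, lr=0.01):
    """Plain SGD update in which masked-out entries receive a zero gradient."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, masks)]

# Same sizes as the first layer in the question: 12 units, 4 input features.
rng = np.random.default_rng(0)
units, input_dim = 12, 4
chosen = rng.choice(units, size=max(1, int(0.1 * units)), replace=False)

masks = lstm_unit_masks(input_dim, units, chosen)
weights = [rng.normal(size=m.shape) for m in masks]  # stand-in for loaded weights
grads = [np.ones_like(w) for w in weights]           # stand-in for real gradients
updated = masked_sgd_step(weights, grads, masks)
```

In an actual Keras model the same masks would be multiplied into the gradients returned by tf.GradientTape before calling optimizer.apply_gradients in a custom training loop. With plain Adam, an entry whose gradient is always zero should stay where it is, but weight decay or similar regularization would still move it. If you want 10% of individual weights rather than 10% of units, an element-wise random mask of the same shapes would replace the column-based one.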