Regression with Keras API not giving consistent result


I am doing a comparative study of a simple regression (one independent variable and one target variable) done in two ways: LinearRegression vs. a neural network (NN, Keras API). My sample data is as follows:

  x1           y
121.9114    121.856
121.856     121.4011
121.4011    121.3222
121.3222    121.9502
121.9502    122.0644

LinearRegression Code:

from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X_train, y_train)

Note: the LR model gives me RMSE 0.22 consistently on every run.
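
For reference, a minimal sketch of how the RMSE above could be computed (the use of sklearn's mean_squared_error and the variable names are my assumptions; the original evaluation code isn't shown):

import numpy as np
from sklearn.metrics import mean_squared_error

# predict on the held-out test set and take the root of the MSE
preds = lr.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, preds))  # reported as ~0.22
print(rmse)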

NN Code:

from tensorflow.keras import models, layers

nn_model = models.Sequential()
nn_model.add(layers.Dense(2, input_dim=1, activation='relu'))  # one input feature
nn_model.add(layers.Dense(1))  # linear output for regression
nn_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
nn_model.fit(X_train, y_train, epochs=40, batch_size=32)

Training Loss:

Epoch 1/40 539/539 [==============================] - 0s 808us/sample - loss: 16835.0895 - mean_absolute_error: 129.5276
Epoch 2/40 539/539 [==============================] - 0s 163us/sample - loss: 16830.6868 - mean_absolute_error: 129.5106
Epoch 3/40 539/539 [==============================] - 0s 204us/sample - loss: 16826.2856 - mean_absolute_error: 129.4935
...
Epoch 39/40 539/539 [==============================] - 0s 187us/sample - loss: 16668.3582 - mean_absolute_error: 128.8823
Epoch 40/40 539/539 [==============================] - 0s 168us/sample - loss: 16663.9828 - mean_absolute_error: 128.8654

The NN-based solution gives me RMSE = 136.7476.

Interestingly, the NN-based solution gives a different RMSE on each run, because the training loss comes out different each time.

For example, in the first run shown above, the loss starts at 16835 and the final loss in the 40th epoch is 16663. In this case the model gives me RMSE = 136.74.

If I run the same code a second time, the loss starts at 16144 and the final loss in the 40th epoch is 5. In this case the RMSE comes to 7.3.

Sometimes I even see RMSE = 0.22, when the training loss starts at 400 and ends (in the 40th epoch) at 0.06.

This Keras behavior is giving me a hard time understanding whether there is a problem with the Keras API, whether I am doing something wrong, or whether this problem is simply not suitable for Keras.

Could you please help me understand the issue, and what would be the best way to stabilize the NN-based solution?

Some Additional Info:

  1. My training and test data are always fixed, so no data is shuffled.
  2. Number of records in the training data: 539
  3. Number of records in the test data: 154
  4. I also tried MinMaxScaling on the train & test data, but it doesn't bring stability to the predictions (see the scaling sketch after this list).
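
For what it's worth, a minimal sketch of how the scaling experiment is typically wired: fit the scaler on the training data only, and inverse-transform the predictions before computing the RMSE on the original scale (the variable names and reshape calls here are my assumptions, since the original scaling code isn't shown):

from sklearn.preprocessing import MinMaxScaler

x_scaler = MinMaxScaler().fit(X_train)                 # fit on train only
y_scaler = MinMaxScaler().fit(y_train.reshape(-1, 1))

X_train_s = x_scaler.transform(X_train)
X_test_s = x_scaler.transform(X_test)
y_train_s = y_scaler.transform(y_train.reshape(-1, 1))

nn_model.fit(X_train_s, y_train_s, epochs=40, batch_size=32)

# undo the target scaling so the RMSE is comparable to the unscaled runs
preds = y_scaler.inverse_transform(nn_model.predict(X_test_s))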

1 Answer

Lucas Azevedo:

There are multiple questions regarding the consistency/reproducibility of Keras. I already answered this here a while ago, and since then I have realized that some further modifications are needed to achieve consistency:

According to the Keras FAQ and this Kaggle experiment, you CANNOT achieve consistency if you are using GPU processing. So they recommend setting CUDA_VISIBLE_DEVICES="" and fixing the Python hash seed with PYTHONHASHSEED=0 (the latter must be done outside the script you're using Keras in).
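
In practice that looks something like the following sketch (note the caveat: PYTHONHASHSEED is read at interpreter startup, so setting it inside the script is not enough):

import os

# must be set before tensorflow is imported
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# PYTHONHASHSEED has to be fixed before the Python process starts,
# e.g. in the shell:  PYTHONHASHSEED=0 python train.py
# ("train.py" is just a placeholder for your script name)
import tensorflow as tf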

You also have to set some seeds:

1) NumPy random seed

import numpy as np
np.random.seed(1)  # note: np.seed() does not exist; the seed lives in np.random

2) TensorFlow random seed

import tensorflow as tf
tf.set_random_seed(2)  # TF 1.x API; in TF 2.x use tf.random.set_seed(2)

3) Python random seed

import random
random.seed(3)
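
Taken together, a sketch of the full setup (the seed values are arbitrary; what matters is that all of this runs before the model is built):

import os
os.environ["PYTHONHASHSEED"] = "0"  # see caveat above: best set in the shell
import random
import numpy as np
import tensorflow as tf

random.seed(3)
np.random.seed(1)
tf.set_random_seed(2)  # tf.random.set_seed(2) on TF 2.x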

Additionally, you have to set two arguments in model.fit (if you have multiprocessing capabilities). These are not often mentioned in the answers I've seen around:

model.fit(..., shuffle=False, use_multiprocessing=False)

Make sure that you are training your model on a CPU. Later versions of tensorflow-gpu might be able to identify and select a GPU even when you set CUDA_VISIBLE_DEVICES="".
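
One way to double-check that at runtime (assuming TensorFlow 2.x; this call is my suggestion, not part of the original answer):

import tensorflow as tf

# hide all GPUs from TensorFlow; must run before any ops are executed
tf.config.set_visible_devices([], "GPU")

# should print an empty list if the GPU is really hidden
print(tf.config.get_visible_devices("GPU"))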