My binary classifier DNN's accuracy has been stuck at the same value since epoch 1, which suggests the model is not learning at all. Any insight into why this is happening?

Problem statement: I would like to classify a given sequence of sensor readings (e.g. [0 1 15 1 0 3]) as either 0 or 1 (0 meaning the "idle" state, 1 meaning the "active" state).

About the dataset: The dataset is available here. The "state" column is the target; the remaining columns are the features.

I've tried using SGD instead of Adam, different kernel initializers, changing the number of hidden layers and the number of neurons per layer, and using sklearn's StandardScaler instead of the MinMaxScaler. None of these changed the outcome.

This is the code:

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
from keras.initializers import he_uniform
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder

seed = 7
np.random.seed(seed)  # np.random.seed returns None, so keep the integer seed for reuse below

data = pd.read_csv('Dataset/Reformed/Model0_Dataset.csv')
X = data.drop(['state'], axis=1).values
y = data['state'].values

#min_max_scaler = MinMaxScaler()
std_scaler = StandardScaler()

# X_scaled = min_max_scaler.fit_transform(X)
X_scaled = std_scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=seed)

# One Hot encode targets
y_train = y_train.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)

enc = OneHotEncoder(categories='auto')
y_train_enc = enc.fit_transform(y_train).toarray()
y_test_enc = enc.transform(y_test).toarray()  # transform only: reuse the encoder fitted on the training labels

epochs = 500    
batch_size = 100
model = Sequential()
model.add(Dense(700, input_shape=(X.shape[1],), kernel_initializer=he_uniform(seed)))  # note: no activation given, so this layer is linear
model.add(Dropout(0.5))
model.add(Dense(1400, activation='relu', kernel_initializer=he_uniform(seed)))
model.add(Dropout(0.5))
model.add(Dense(700, activation='relu', kernel_initializer=he_uniform(seed)))
model.add(Dropout(0.5))
model.add(Dense(800, activation='relu', kernel_initializer=he_uniform(seed)))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
model.summary()

early_stopping_monitor = EarlyStopping(patience=25)
# model.compile(SGD(lr=.01, decay=1e-6, momentum=0.9, nesterov=True), loss='binary_crossentropy', metrics=['accuracy'])
model.compile(Adam(lr=.01, decay=1e-6), loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train_enc, validation_split=0.2, batch_size=batch_size,
                    callbacks=[early_stopping_monitor], epochs=epochs, shuffle=True, verbose=1)
score = model.evaluate(X_test, y_test_enc, batch_size=batch_size, verbose=1)  # 'eval' would shadow the Python built-in

Expected results: Accuracy increasing (and loss decreasing) with each epoch (at least for the early epochs).

Actual results: The following values are fixed throughout the entire training process:

loss: 8.0118 - acc: 0.5001 - val_loss: 8.0366 - val_acc: 0.4987
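The stuck value itself seems telling: if the network's outputs have saturated so that about half the samples are predicted confidently wrong, clipped cross-entropy pins the mean loss near 8. A quick sanity check, assuming Keras's default clipping epsilon of 1e-7:

import numpy as np

eps = 1e-7                     # Keras clips predicted probabilities to [eps, 1 - eps]
loss_per_wrong = -np.log(eps)  # ~16.12: cross-entropy of a confidently wrong prediction
print(0.5 * loss_per_wrong)    # ~8.06 when half the samples are wrong -- close to the observed 8.0118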

1 Answer

Matias Valdenegro

You are using the wrong loss: with a two-output softmax you should use categorical_crossentropy, together with one-hot encoded labels (which you already have). If you want to keep binary_crossentropy, the output layer should instead be a single unit with a sigmoid activation, trained on the plain 0/1 labels.
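For illustration, here is a minimal sketch of both options, assuming the question's preprocessed variables (X_train, y_train, y_train_enc) are already defined; the small hidden layer and the epoch count are placeholders, not tuning advice:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

n_features = X_train.shape[1]

# Option 1: keep the 2-unit softmax head and the one-hot targets,
# but pair them with the matching loss, categorical_crossentropy.
model_a = Sequential()
model_a.add(Dense(64, activation='relu', input_shape=(n_features,)))
model_a.add(Dense(2, activation='softmax'))
model_a.compile(Adam(lr=.01, decay=1e-6), loss='categorical_crossentropy', metrics=['accuracy'])
model_a.fit(X_train, y_train_enc, epochs=10, batch_size=100)  # y_train_enc: one-hot, shape (n, 2)

# Option 2: a single sigmoid unit with binary_crossentropy,
# trained directly on the 0/1 labels (no one-hot encoding needed).
model_b = Sequential()
model_b.add(Dense(64, activation='relu', input_shape=(n_features,)))
model_b.add(Dense(1, activation='sigmoid'))
model_b.compile(Adam(lr=.01, decay=1e-6), loss='binary_crossentropy', metrics=['accuracy'])
model_b.fit(X_train, y_train, epochs=10, batch_size=100)      # y_train: 0/1 column, shape (n, 1)

Either pairing keeps the loss consistent with the output activation, which is the mismatch described above.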