Using the Keras SimpleRNN layer, I am hitting a wall: the model's validation accuracy plateaus almost immediately. I have two other models, one with only fully connected Dense layers and one using an LSTM, both of which work as expected, so I don't think the data processing is the issue.

For context, I am using the tf.keras Reuters dataset, which comes tokenized, and the output data consists of 46 possible tags, which I have converted to categorical (one-hot) vectors.

[Screenshot: what the data looks like and how it's processed]
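In case the screenshot is hard to read, the preprocessing is essentially the standard pad-and-one-hot pipeline for this dataset. This is a minimal sketch; the maxFeatures and maxWords values here are placeholders for the ones I actually use, not anything required by the dataset.

from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

maxFeatures = 10000  # placeholder: vocabulary size, keep only the most frequent tokens
maxWords = 200       # placeholder: pad/truncate every article to this fixed length

# The dataset already comes tokenized as lists of integer word indices.
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=maxFeatures)

# Fixed-length sequences for the Embedding layer.
x_train = pad_sequences(x_train, maxlen=maxWords)
x_test = pad_sequences(x_test, maxlen=maxWords)

# One-hot encode the 46 topic labels to match categorical_crossentropy.
y_train = to_categorical(y_train, num_classes=46)
y_test = to_categorical(y_test, num_classes=46)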

Below is the model code.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dropout, Dense

modelRNN = Sequential()

# Embed the integer word indices, then run a single recurrent layer over the sequence.
modelRNN.add(Embedding(input_dim=maxFeatures, output_dim=256, input_shape=(maxWords,)))
modelRNN.add(SimpleRNN(1024))
#modelRNN.add(Activation("sigmoid"))
modelRNN.add(Dropout(0.8))

modelRNN.add(Dense(128))

# One output per topic tag.
modelRNN.add(Dense(46, activation="softmax"))


modelRNN.compile(
  optimizer='adam',
  loss='categorical_crossentropy',
  metrics=['accuracy'],
)

And I am fitting with the following parameters:

historyRNN = modelRNN.fit(x_train, y_train,
  epochs=100,
  batch_size=512,
  shuffle=True,
  validation_data=(x_test, y_test)
)

Fitting this model consistently tops out at a val_accuracy of 0.3762 and a val_loss of ~3.4. This "ceiling" can be clearly seen in the graph:

[Graphs: SimpleRNN training history]

Things I've tried: changing hyperparameters, changing the input data shape, and trying different optimizers. The optimizer changes were straightforward swaps in compile(), as sketched below.
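A sketch of what those optimizer swaps looked like (the specific optimizer and learning rate here are illustrative, not the exact values I swept):

from tensorflow.keras.optimizers import RMSprop

modelRNN.compile(
  optimizer=RMSprop(learning_rate=1e-4),  # illustrative optimizer/learning-rate choice
  loss='categorical_crossentropy',
  metrics=['accuracy'],
)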

Any tips are appreciated, thank you. And thank you to the people who helped edit my post to be more understandable :)

The graphs for the other two models, trained on the same data:

Dense layers only

[Graph: Dense-only model training history]

LSTM

[Graph: LSTM model training history]
