I am trying to train an LSTM autoencoder that maps the input space to a latent space, which I then want to visualize in the hope of finding interesting patterns. The input is data from 9 sensors, and it should be compressed into a three-dimensional latent space.
The problem is that when I run the code, sometimes the data is mapped into all 3 dimensions, but sometimes only one or two dimensions are used and the remaining dimensions are zero.
This is the autoencoder I'm using:
import pandas as pd
from tensorflow.keras import Model, Sequential, layers

# time (window length), features (= 9 sensors) and latent_dim (= 3) are defined earlier

class Autoencoder(Model):
    def __init__(self, latent_dim):
        super(Autoencoder, self).__init__()
        self.latent_dim = latent_dim
        self.encoder = Sequential([
            layers.LSTM(features, activation='relu', input_shape=(time, features), return_sequences=True),
            layers.Dropout(0.3),
            layers.LSTM(latent_dim, activation='relu', return_sequences=False),
            layers.RepeatVector(time)
        ])
        self.decoder = Sequential([
            layers.LSTM(latent_dim, activation='relu', return_sequences=True),
            layers.Dropout(0.3),
            layers.LSTM(features, activation='relu', return_sequences=True),
            layers.TimeDistributed(layers.Dense(features))
        ])

    def call(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

autoencoder = Autoencoder(latent_dim)
autoencoder.compile(optimizer='adam', loss='mae')
history = autoencoder.fit(X, X, epochs=10, validation_split=0.2, shuffle=False)

encoded = autoencoder.encoder(X)
encoded_reshaped = pd.DataFrame(encoded.numpy().reshape(-1, latent_dim))
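To look for patterns I then plot the three latent dimensions, roughly like this (a minimal sketch, assuming matplotlib and latent_dim = 3; the column names 0, 1, 2 come from the DataFrame above):

import matplotlib.pyplot as plt

# 3-D scatter of the latent codes (sketch; assumes latent_dim == 3)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(encoded_reshaped[0], encoded_reshaped[1], encoded_reshaped[2], s=5)
ax.set_xlabel('latent dim 0')
ax.set_ylabel('latent dim 1')
ax.set_zlabel('latent dim 2')
plt.show()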
This is the latent space. As you can see, the encoder used only one dimension to represent the data:
0 1 2
0 0.0 2.164718 0.0
1 0.0 2.056577 0.0
2 0.0 2.020535 0.0
3 0.0 2.134846 0.0
4 0.0 2.109566 0.0
5 0.0 1.902232 0.0
6 0.0 1.919019 0.0
7 0.0 2.021480 0.0
8 0.0 1.839327 0.0
9 0.0 1.740795 0.0
10 0.0 2.008053 0.0
11 0.0 1.966692 0.0
12 0.0 1.899480 0.0
13 0.0 1.811787 0.0
14 0.0 2.182250 0.0
15 0.0 2.146597 0.0
16 0.0 1.908313 0.0
Does anyone know the reason for this, and how to avoid the encoder using only one dimension?
The problem could come from the activation function you used: relu.

The relu activation sets all negative values to zero (relu(x) = max(0, x)), so if your latent-space representation "needs" negative values, you lose them after the activation step. Try an activation function that allows negative values, or modify your relu by changing the threshold value. If you are using Keras, you can check the activation options here: https://keras.io/api/layers/activations/

I had the same issue and this solved it for me!
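For example, here is a minimal sketch of what that change could look like on the encoder from the question, swapping the bottleneck activation from relu to tanh (the Keras default for LSTM, which allows negative outputs); time = 30, features = 9 and latent_dim = 3 are just example values:

from tensorflow.keras import Sequential, layers

time, features, latent_dim = 30, 9, 3  # example shapes, adjust to your data

encoder = Sequential([
    layers.LSTM(features, activation='relu', input_shape=(time, features), return_sequences=True),
    layers.Dropout(0.3),
    # tanh outputs values in (-1, 1), so no latent dimension gets clipped to zero;
    # a leaky relu, e.g. lambda x: tf.keras.activations.relu(x, alpha=0.1), would also work
    layers.LSTM(latent_dim, activation='tanh', return_sequences=False),
    layers.RepeatVector(time)
])

Any activation whose range includes negative values (tanh, linear, ELU, a leaky relu) keeps all three latent dimensions usable.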