Can I split my long sequences into 3 smaller ones and use a stateful LSTM for 3 samples?


I am doing a time-series sequence classification problem.

I have 80 time-series, all of length 1002. Each sequence corresponds to one of 4 categories (copper, cadmium, lead, mercury). I want to model this with Keras LSTMs. These models require data to be fed in the form [batches, timesteps, features]. Since each sequence is independent, the most basic setup is for X_train to have shape [80, 1002, 1]. This works fine in an LSTM (with stateful=False).

But 1002 is quite a long sequence length, and a smaller size might perform better.

Let's say I split each sequence into 3 parts of length 334. I could continue to use a stateless LSTM, but (I think?) it makes sense to make the model stateful across the 3 samples and then reset its state, since the 3 chunks are related.

How do I implement this in Keras?

First, I transform the data into shape [240, 334, 1] with a simple X_train.reshape(-1, 334, 1), but how do I maintain state across 3 samples and then reset the state in model.fit()?

I know I need to call model.reset_states() somewhere, but I couldn't find any sample code showing how to work it. Do I have to subclass a model? Can I do this using for epoch in range(num_epochs) and GradientTape? What are my options? How can I implement this?
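For concreteness, here's my best guess at a raw training loop (an untested sketch; the names n_seqs, n_chunks, chunk_len and num_epochs are mine, and I'm assuming TF2's tensorflow.keras):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_seqs, n_chunks, chunk_len = 80, 3, 334
batch_size = 10  # any divisor of n_seqs; doesn't have to be 3
num_epochs = 3

model = Sequential([
    LSTM(10, batch_input_shape=(batch_size, chunk_len, 1), stateful=True),
    Dense(4, activation='softmax')
])
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# (80, 1002, 1) -> (80, 3, 334, 1): axis 1 indexes the chunk within a sequence
X_chunked = X_train.reshape(n_seqs, n_chunks, chunk_len, 1)

for epoch in range(num_epochs):
    for start in range(0, n_seqs, batch_size):
        x = X_chunked[start:start + batch_size]  # (10, 3, 334, 1)
        y = y_train[start:start + batch_size]    # (10, 4) one-hot labels
        for chunk in range(n_chunks):
            # state carries over between chunks of the same 10 sequences
            # (gradients do NOT flow across chunks; state is passed as data)
            model.train_on_batch(x[:, chunk], y)
        model.reset_states()  # the next 10 sequences are independent

Is this the right pattern, or is there a way to keep model.fit() and reset states from a callback?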

Also, if I split the sequences up, what do I do with the labels? Do I repeat each label by the number of chunks the sequence is split into (3 in this case)? Is there a way for an LSTM to ingest 3 samples and then spit out one prediction, or does each sample have to correspond to a prediction?
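If repeating labels is the way to go, I assume it's something like this (my guess, plain numpy):

import numpy as np

# (80, 1002, 1) -> (240, 334, 1); rows 3i, 3i+1, 3i+2 are the chunks of sequence i
X_split = X_train.reshape(-1, 334, 1)
# repeat each one-hot label 3 times so every chunk carries its sequence's label
y_split = np.repeat(y_train, 3, axis=0)  # (80, 4) -> (240, 4)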

Finally, if I split my sequences into 3 subsequences, do I have to use a batch size of 3, or can I choose any multiple of 3?

Here is the super basic code I used with X_train.shape == (80, 1002, 1):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(10, batch_input_shape=(10, 1002, 1)),  # 10 samples per batch
    Dense(4, activation='softmax')  # 4 mutually exclusive classes -> softmax, not sigmoid
])
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, batch_size=10, shuffle=False)

I know there are loads of questions here, happy to make separate ones if this is too much for one.

1 Answer

Accepted answer, by codeananda:

The easy solution is to reshape the data from having 1 feature to having 3.

Turn [80, 1002, 1] into [80, 334, 3] rather than [240, 334, 1]. This keeps the number of samples the same, so you don't have to mess around with statefulness, and you can keep using the normal fit() API.
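A minimal sketch of that reshape (untested; X_train is assumed to be the (80, 1002, 1) array from the question). Note that numpy's plain reshape groups consecutive timesteps as features, while a reshape plus transpose puts the three 334-step chunks in parallel channels:

import numpy as np

# Plain reshape: new timestep t holds original steps 3t, 3t+1, 3t+2 as 3 features
X_interleaved = X_train.reshape(80, 334, 3)

# Chunks-as-channels: the three 334-step chunks run in parallel as 3 features
X_chunks = X_train.reshape(80, 3, 334).transpose(0, 2, 1)  # -> (80, 334, 3)

The second version is probably what "split each sequence into 3 parts" intends. Either way, y_train stays (80, 4), and the model only needs input_shape=(334, 3) instead of (1002, 1).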