Confusion about PyTorch's LSTM implementation


As we all know, PyTorch's LSTM implementation is a stacked, optionally bidirectional LSTM.

The first layer's input is expected to have shape (L, N, H_in). If we use a bidirectional LSTM, then the output of the first layer has shape (L, N, 2*H_hiddensize) (official doc).

I can't figure out how this output is fed into the second LSTM layer. Will the outputs of the backward layer and the forward layer be merged or concatenated?

I checked the source code of its implementation (source code), but I fail to understand it:

layers = [_LSTMLayer(self.input_size, self.hidden_size,
                     self.bias, batch_first=False,
                     bidirectional=self.bidirectional, **factory_kwargs)]
for layer in range(1, num_layers):
    layers.append(_LSTMLayer(self.hidden_size, self.hidden_size,
                             self.bias, batch_first=False,
                             bidirectional=self.bidirectional,
                             **factory_kwargs))
for idx, layer in enumerate(self.layers):
    x, hxcx[idx] = layer(x, hxcx[idx])

Why can the output of the first layer (shape (L, N, 2*H_hiddensize)) be fed into the second layer, which is constructed here to expect an input of shape (L, N, H_hiddensize) rather than (L, N, 2*H_hiddensize)?
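For comparison, in the standard (non-quantizable) nn.LSTM the second layer really does expect a 2*H_hiddensize input when bidirectional=True, which can be checked directly from the parameter shapes (sizes below are made up for illustration):

```python
import torch.nn as nn

# With bidirectional=True, every layer after the first expects an input
# of size num_directions * hidden_size.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)

# weight_ih_l0 multiplies the input to the first layer:  (4*20, 10)
# weight_ih_l1 multiplies the input to the second layer: (4*20, 2*20)
print(lstm.weight_ih_l0.shape)  # torch.Size([80, 10])
print(lstm.weight_ih_l1.shape)  # torch.Size([80, 40])
```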


There are 2 answers

dhruvbird

A bi-directional LSTM can be viewed as 2 independent LSTMs that have nothing to do with each other except that they share the input tensor. The forward LSTM consumes the input in the forward direction whereas the reverse LSTM consumes it in the reverse direction (of the time dimension).
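This view can be verified with a small sketch (assuming the standard nn.LSTM parameter names): copy the weights of a bidirectional LSTM into two independent unidirectional LSTMs, run one on the input and the other on the time-reversed input, and concatenate the results.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
L, N, H_in, H = 5, 3, 4, 6  # arbitrary sizes for illustration

bi = nn.LSTM(H_in, H, bidirectional=True)
fwd = nn.LSTM(H_in, H)  # independent forward LSTM
bwd = nn.LSTM(H_in, H)  # independent reverse LSTM

# Copy the bidirectional weights into the two independent LSTMs.
with torch.no_grad():
    for name in ["weight_ih_l0", "weight_hh_l0", "bias_ih_l0", "bias_hh_l0"]:
        getattr(fwd, name).copy_(getattr(bi, name))
        getattr(bwd, name).copy_(getattr(bi, name + "_reverse"))

x = torch.randn(L, N, H_in)
out_bi, _ = bi(x)

out_f, _ = fwd(x)          # consume the input forward in time
out_b, _ = bwd(x.flip(0))  # consume the input reversed in time
out_b = out_b.flip(0)      # re-align the reverse outputs to the original time axis

# The bidirectional output is just the concatenation of the two directions.
out = torch.cat([out_f, out_b], dim=-1)
print(torch.allclose(out, out_bi, atol=1e-5))  # True
```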

Yong

I can't figure out how this output is fed into the second LSTM layer. Will the outputs of the backward layer and the forward layer be merged or concatenated?

Yes: at every time step, the output of a bidirectional LSTM concatenates the forward hidden state and the reverse hidden state for that step. In particular, the last element of output contains the final forward hidden state alongside the first-step reverse hidden state.

Reference: PyTorch LSTM documentation

For bidirectional LSTMs, h_n is not equivalent to the last element of output; the former contains the final forward and reverse hidden states, while the latter contains the final forward hidden state and the initial reverse hidden state.
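This statement from the documentation can be checked numerically (a sketch with made-up sizes): for a single bidirectional layer, h_n[0] is the forward state from the last time step, while h_n[1] is the reverse direction's final state, which appears in the first time step of output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
L, N, H = 5, 3, 6  # arbitrary sizes for illustration
lstm = nn.LSTM(input_size=4, hidden_size=H, bidirectional=True)
x = torch.randn(L, N, 4)

output, (h_n, c_n) = lstm(x)  # output: (L, N, 2*H), h_n: (2, N, H)

# Final forward state: last time step, first H channels of output.
print(torch.allclose(output[-1, :, :H], h_n[0]))  # True
# Final reverse state: FIRST time step, last H channels of output.
print(torch.allclose(output[0, :, H:], h_n[1]))   # True
# So output[-1] is not simply h_n stacked: its reverse half is the
# reverse direction's state after seeing only x[-1].
```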