I am confused about how to reconstruct the following PyTorch code in TensorFlow. It uses both the input size x and the hidden size h to create a GRU layer:
import torch
# torch.nn.GRU has no return_state argument; it always returns (output, h_n).
torch.nn.GRU(64, 64*2, batch_first=True)
Instinctively, I first tried the following:
import tensorflow as tf
tf.keras.layers.GRU(64, return_state=True)
However, I realize that it does not really account for h or the hidden size. What should I do in this case?
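The hidden size is 64 in your TensorFlow example, because the first argument of tf.keras.layers.GRU (units) is the hidden size, not the input size. To get the equivalent, you should use
tf.keras.layers.GRU(64*2, return_state=True)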
This is because the Keras layer does not require you to specify your input size (64 in this example); it is inferred from the last dimension of the input when you build or run your model for the first time.
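To see the input size being picked up at call time, here is a minimal sketch. The batch size and sequence length are made-up values for illustration, and return_sequences=True is an addition on my part (not in the question) so that the layer also returns the per-step outputs, as the PyTorch layer does.

```python
import tensorflow as tf

# Hypothetical sizes for illustration: batch of 8, sequence length 10.
batch_size, seq_len, input_size, hidden_size = 8, 10, 64, 64 * 2

# units (first argument) is the hidden size. return_sequences=True is an
# assumption added here to also get the output at every time step.
gru = tf.keras.layers.GRU(hidden_size, return_sequences=True, return_state=True)

# Keras expects batch-first input: (batch, time, features).
x = tf.random.normal((batch_size, seq_len, input_size))

# The layer creates its weights on this first call, reading the input
# size from the last axis of x.
output, state = gru(x)

output_shape = tuple(output.shape)           # (batch, seq_len, hidden_size)
state_shape = tuple(state.shape)             # (batch, hidden_size)
kernel_shape = tuple(gru.cell.kernel.shape)  # (input_size, 3 * hidden_size)
print(output_shape, state_shape, kernel_shape)
```

One difference to keep in mind when comparing results: PyTorch returns the final hidden state with a leading num_layers axis, so it would be (1, 8, 128) here, while Keras returns it as (8, 128).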