After reading the answers to this question I'm still a bit confused about the whole PackedSequence object thing. As I understand it, this is an object optimized for parallel processing of variable-sized sequences in recurrent models, a problem to which zero padding is one [imperfect] solution. It seems that given a PackedSequence object, a PyTorch RNN will process each sequence in the batch to its end, and not continue to process the padding. So why is padding needed here? Why are there both pack_padded_sequence() and pack_sequence() methods?
Why do we need pack_padded_sequence() when we have pack_sequence()?
Mostly for historical reasons; torch.nn.utils.rnn.pack_padded_sequence() was created before torch.nn.utils.rnn.pack_sequence() (the latter first appeared in 0.4.0, if I see correctly), and I suppose there was no reason to remove this functionality and break backward compatibility.

Furthermore, it's not always clear what the best/fastest way to pad your input is, and it highly depends on the data you are using. When the data was somehow padded beforehand (e.g. it was pre-padded and provided to you like that), it is faster to use pack_padded_sequence() (see the source code of pack_sequence: it calculates the length of each data point for you and calls pad_sequence followed by pack_padded_sequence internally). Arguably, pad_packed_sequence is rarely of use right now though.
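To make that relationship concrete, here is a minimal sketch (the sequence lengths and feature size are made up for illustration): packing sorted variable-length sequences directly with pack_sequence() produces the same PackedSequence as padding them yourself and calling pack_padded_sequence() with explicit lengths.

```python
import torch
from torch.nn.utils.rnn import pack_sequence, pad_sequence, pack_padded_sequence

# Three variable-length sequences (longest first), each with 8 features per step.
seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]

# One-step route: lengths are computed for you.
packed_a = pack_sequence(seqs)

# Two-step route: pad manually, then pack with explicit lengths --
# roughly what pack_sequence does internally.
padded = pad_sequence(seqs)                     # shape: (5, 3, 8)
lengths = torch.tensor([len(s) for s in seqs])  # tensor([5, 3, 2])
packed_b = pack_padded_sequence(padded, lengths)

assert torch.equal(packed_a.data, packed_b.data)
assert torch.equal(packed_a.batch_sizes, packed_b.batch_sizes)
```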
Lastly, please notice the enforce_sorted argument, provided since version 1.2.0 for both of those functions. Not so long ago, users had to sort their data (or batch) with the longest sequence first and the shortest last; now this can be done internally when the parameter is set to False.
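As a quick illustration of enforce_sorted=False (shapes here are arbitrary, just for the example), a batch that is not ordered by length can be packed directly and fed to an RNN; with the default enforce_sorted=True the same call would raise an error:

```python
import torch
from torch.nn.utils.rnn import pack_sequence, pad_packed_sequence

rnn = torch.nn.GRU(input_size=8, hidden_size=16)

# Deliberately NOT sorted by length (shortest first).
seqs = [torch.randn(2, 8), torch.randn(5, 8), torch.randn(3, 8)]
packed = pack_sequence(seqs, enforce_sorted=False)  # sorted internally

output, h_n = rnn(packed)

# Unpack back to a padded tensor; lengths come back in the original order.
unpacked, lengths = pad_packed_sequence(output)
print(unpacked.shape)  # torch.Size([5, 3, 16])
print(lengths)         # tensor([2, 5, 3])
```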