Project 2 FAQ

This page was last updated: Friday, 24-Jul-2020 09:43:32 AEST by blair@cse.unsw.edu.au

    1. Are the provided return values for post in the provided preprocessing class incorrect?

    Yes. An earlier version of the assignment had this function returning two values (batch and vocab); it should return only batch.

    2. Can we use additional libraries in our preprocessing?

    No. An earlier version of the assignment stated that additional libraries could be requested for installation, but this is no longer the case: only torch, torchtext, and the Python standard library may be used.

    3. Why are the lengths of the sequences provided in part 2?

    If you examine the input tensors you will see that they are *padded*. Each sequence has a different length, so to allow PyTorch to operate on batches, padding tokens (in this case 0-vectors) are added to the beginning or end of the individual sequences. The padded sequences can then be stacked into a batch. (Another neat thing the PyTorch dataloader can do is group sentences of similar length together, so padding is minimized and computation is faster.)

    This means, however, that the *output* of the LSTM will contain hidden and output states generated from these padding tokens. The forget gate may learn to ignore them, but this is not guaranteed, so the final output (in both the first and second returned values) would be the hidden state at the end of the padding rather than at the end of the sequence.

    To get around this, we need to tell PyTorch where each sequence really ends. Conveniently, PyTorch has a built-in object that an LSTM will process in exactly this way: torch.nn.utils.rnn.PackedSequence, which contains both the padded input tensor and each sequence's length. If an object of this type is passed into torch.nn.LSTM, the final hidden state returned (which is what you want) is the hidden state calculated at the end of each sequence. Note that torch.nn.LSTM returns output in the form "output, (h_n, c_n)"; h_n is what we are referring to here. If you want to use the output, you will need to get the tensor inside the PackedSequence object. Finally, note that PyTorch has a built-in method for creating a PackedSequence from a tensor and a list of lengths: torch.nn.utils.rnn.pack_padded_sequence.
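    As a rough illustration of the flow described above (the dimensions, batch contents, and lengths here are made up for the example, not taken from the assignment; torch.nn.utils.rnn.pad_packed_sequence is the built-in inverse used to recover a padded output tensor from the PackedSequence):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Toy LSTM with made-up dimensions.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

# Batch of 3 padded sequences with true lengths 5, 3, 2.
batch = torch.randn(3, 5, 8)
lengths = torch.tensor([5, 3, 2])
batch[1, 3:] = 0.0  # zero out the padding positions
batch[2, 2:] = 0.0

# Pack the padded batch together with the true lengths.
packed = pack_padded_sequence(batch, lengths, batch_first=True,
                              enforce_sorted=False)
packed_output, (h_n, c_n) = lstm(packed)

# h_n holds the hidden state at each sequence's *true* last step,
# not at the end of the padding.
print(h_n.shape)  # torch.Size([1, 3, 16])

# To use the per-step outputs, unpack back to a padded tensor.
output, out_lengths = pad_packed_sequence(packed_output, batch_first=True)
print(output.shape)  # torch.Size([3, 5, 16])

# Sanity check: h_n for sequence 1 equals its output at step lengths[1]-1.
print(torch.allclose(h_n[0, 1], output[1, 2]))  # True
```

    Without the packing step, h_n would instead be computed at time step 5 for every sequence, i.e. after the LSTM has consumed the padding vectors.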
