Cyan's Blog



Last updated Apr 22, 2022

# Difference between output and hidden_state in RNN

2022-04-22

Tags: #RNN

RNN in detail

# Bidirectional Case

  1. `output` gives you the hidden-layer outputs of the network for each time step, but only for the final layer. This is useful in many applications, particularly encoder-decoders using attention. (These architectures build up a ‘context’ layer from all the hidden outputs, and it is extremely useful to have them sitting around as a self-contained unit.)

  2. `h_n` gives you the hidden-layer outputs for the last time step only, but for all the layers. Therefore, if and only if you have a single-layer architecture, `h_n` is a strict subset of `output`. Otherwise, `output` and `h_n` intersect, but neither is a strict subset of the other. (In an encoder-decoder model you will often want these from the encoder in order to jumpstart the decoder.)

  3. If you are using a bidirectional RNN and want to actually verify that part of `h_n` is contained in `output` (and vice versa), you need to understand what PyTorch does behind the scenes in the organization of the inputs and outputs. Specifically, it runs a second copy of the network over the time-reversed input alongside the time-forward one, then concatenates the two sets of outputs at each time step. This is literal. This means that the ‘forward’ output at time T is in the final position of the output tensor, sitting right next to the ‘reverse’ output at time 0; if you are looking for the ‘reverse’ output at time T, it is in the first position.1
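Points 1 and 2 can be checked directly. Below is a minimal sketch with a single-layer, unidirectional `nn.GRU` (the sizes and the seed are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

seq_len, batch, input_size, hidden_size = 5, 3, 4, 6
gru = nn.GRU(input_size, hidden_size, num_layers=1)  # single layer, unidirectional
x = torch.randn(seq_len, batch, input_size)

output, h_n = gru(x)
# output: (seq_len, batch, hidden_size) -- last layer's output at every time step
# h_n:    (num_layers, batch, hidden_size) -- every layer's output at the last time step
print(output.shape)  # torch.Size([5, 3, 6])
print(h_n.shape)     # torch.Size([1, 3, 6])

# With a single layer, h_n is exactly the last time step of output:
assert torch.allclose(output[-1], h_n[0])
```

With `num_layers > 1`, `h_n` would also carry the lower layers' final states, which never appear in `output` — that is the "intersect, but not strict subsets" case from point 2.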
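The bidirectional layout from point 3 can be verified the same way: the forward half of the last time step and the reverse half of the first time step are exactly the two rows of `h_n`. Again a minimal sketch with arbitrary sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

seq_len, batch, input_size, hidden_size = 5, 3, 4, 6
gru = nn.GRU(input_size, hidden_size, num_layers=1, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)

output, h_n = gru(x)
# output: (seq_len, batch, 2 * hidden_size) -- forward and reverse halves concatenated
# h_n:    (2, batch, hidden_size) -- [forward final state, reverse final state]

# Forward direction: its final state sits at the LAST time step of output.
assert torch.allclose(output[-1, :, :hidden_size], h_n[0])

# Reverse direction: its final state sits at the FIRST time step of output,
# because the reverse copy read the sequence from the end back to index 0.
assert torch.allclose(output[0, :, hidden_size:], h_n[1])
```

Note that `output[t]` pairs the forward state after reading `x[0..t]` with the reverse state after reading `x[t..T]`, which is why the two final states live at opposite ends of the tensor.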


  1. machine learning - Is hidden and output the same for a GRU unit in Pytorch? - Stack Overflow ↩︎