
D2L-57-LSTM-长短期记忆网络 (Long Short-Term Memory)

Last updated Apr 18, 2022

# Long Short-Term Memory

2022-04-18

Tags: #LSTM #DeepLearning #RNN

# Cell State
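The cell state $\mathbf{C}_{t}$ is the LSTM's long-term memory track: it is carried along from step to step and is only modified through element-wise gating, which is what lets information (and gradients) flow across long ranges.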

# 3 Gates & 1 Candidate State

# 3 Gates
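For reference, in the standard D2L-style formulation all three gates are computed from the current input $\mathbf{X}_{t}$ and the previous hidden state $\mathbf{H}_{t-1}$ (the weight and bias names below follow that convention):

$$\begin{aligned}
\mathbf{I}_{t} &=\sigma\left(\mathbf{X}_{t} \mathbf{W}_{x i}+\mathbf{H}_{t-1} \mathbf{W}_{h i}+\mathbf{b}_{i}\right) \\
\mathbf{F}_{t} &=\sigma\left(\mathbf{X}_{t} \mathbf{W}_{x f}+\mathbf{H}_{t-1} \mathbf{W}_{h f}+\mathbf{b}_{f}\right) \\
\mathbf{O}_{t} &=\sigma\left(\mathbf{X}_{t} \mathbf{W}_{x o}+\mathbf{H}_{t-1} \mathbf{W}_{h o}+\mathbf{b}_{o}\right)
\end{aligned}$$

Because of the sigmoid, every entry lies in $(0,1)$, so each gate acts as a soft element-wise mask.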

# 1 Candidate State
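The candidate cell state has the same form, but uses tanh so that its entries lie in $(-1,1)$:

$$\tilde{\mathbf{C}}_{t}=\tanh \left(\mathbf{X}_{t} \mathbf{W}_{x c}+\mathbf{H}_{t-1} \mathbf{W}_{h c}+\mathbf{b}_{c}\right)$$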

# Update Cell State

Updating the previous cell state $\mathbf{C}_{t-1}$ happens in two steps:

# Forget using Forget Gate

$$\mathbf{C}_{t}=\textcolor{darkorchid}{\mathbf{F}_{t} \odot \mathbf{C}_{t-1}}+\mathbf{I}_{t} \odot \tilde{\mathbf{C}}_{t}$$ Multiplying $\mathbf{C}_{t-1}$ element-wise with the Forget Gate masks out the elements that should be forgotten.

# Merge new Candidate State

$$\mathbf{C}_{t}=\mathbf{F}_{t} \odot \mathbf{C}_{t-1}\textcolor{orangered}{+\mathbf{I}_{t} \odot \tilde{\mathbf{C}}_{t}}$$ First multiply the candidate cell state element-wise with the Input Gate to pick out the positions that should be updated, then add the result to the forgotten state from the previous step.
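A tiny numeric sketch of these two steps (toy values of my own, with the gates pushed to hard 0/1 so the masking is easy to see):

```python
import torch

C_prev  = torch.tensor([0.8, -0.5, 0.3])  # previous cell state C_{t-1}
F_t     = torch.tensor([1.0,  0.0, 1.0])  # forget gate: keep dims 0 and 2, forget dim 1
I_t     = torch.tensor([0.0,  1.0, 0.0])  # input gate: only dim 1 takes new content
C_tilde = torch.tensor([0.9,  0.9, 0.9])  # candidate cell state

C_t = F_t * C_prev + I_t * C_tilde        # forget first, then merge the candidate
print(C_t)                                # tensor([0.8000, 0.9000, 0.3000])
```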

# Output new Hidden State

In fact, $\mathbf{H}_{t}$ is just a gated version of $\mathbf{C}_{t}$: first tanh squashes the values into $(-1,1)$, then the Output Gate masks the result once more: $$\mathbf{H}_{t}=\mathbf{O}_{t} \odot \tanh \left(\mathbf{C}_{t}\right)$$
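Putting all of the above together, a minimal from-scratch sketch of one LSTM time step could look like this (PyTorch; the function and parameter names are my own, not code from the post):

```python
import torch

def lstm_step(X_t, H_prev, C_prev, params):
    """One LSTM step. X_t: (batch, num_inputs); H_prev, C_prev: (batch, num_hiddens)."""
    (W_xi, W_hi, b_i,   # input gate
     W_xf, W_hf, b_f,   # forget gate
     W_xo, W_ho, b_o,   # output gate
     W_xc, W_hc, b_c) = params

    I_t = torch.sigmoid(X_t @ W_xi + H_prev @ W_hi + b_i)
    F_t = torch.sigmoid(X_t @ W_xf + H_prev @ W_hf + b_f)
    O_t = torch.sigmoid(X_t @ W_xo + H_prev @ W_ho + b_o)
    C_tilde = torch.tanh(X_t @ W_xc + H_prev @ W_hc + b_c)

    C_t = F_t * C_prev + I_t * C_tilde   # forget, then merge the new candidate
    H_t = O_t * torch.tanh(C_t)          # hidden state = output-gated, squashed cell state
    return H_t, C_t

# Quick shape check with random parameters
batch, num_inputs, num_hiddens = 2, 4, 3
params = []
for _ in range(4):  # one (W_x, W_h, b) triple per gate / candidate
    params += [torch.randn(num_inputs, num_hiddens) * 0.1,
               torch.randn(num_hiddens, num_hiddens) * 0.1,
               torch.zeros(num_hiddens)]

H, C = lstm_step(torch.randn(batch, num_inputs),
                 torch.zeros(batch, num_hiddens),
                 torch.zeros(batch, num_hiddens), params)
print(H.shape, C.shape)  # torch.Size([2, 3]) torch.Size([2, 3])
```

Up to how the weights are packed, this is the same computation that `torch.nn.LSTMCell` performs.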

# Variants of LSTM

# Peepholes

All gates can have a peep at the cell state $\mathbf{C}_{t-1}$.
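Concretely, in the usual peephole formulation (as drawn in colah's post) each gate receives the cell state as an extra input, e.g. for the forget gate (some implementations restrict the peephole weights to be diagonal, i.e. element-wise):

$$\mathbf{F}_{t}=\sigma\left(\mathbf{X}_{t} \mathbf{W}_{x f}+\mathbf{H}_{t-1} \mathbf{W}_{h f}+\mathbf{C}_{t-1} \mathbf{W}_{c f}+\mathbf{b}_{f}\right)$$

The output gate typically peeks at the updated $\mathbf{C}_{t}$ instead.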

# Convex Combination² (coupled forget and input gates)

Forget to remember, remember to forget. The total amount of information stays the same.
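Written out, the input gate is replaced by $1-\mathbf{F}_{t}$, so the update becomes a convex combination of the old cell state and the new candidate:

$$\mathbf{C}_{t}=\mathbf{F}_{t} \odot \mathbf{C}_{t-1}+\left(1-\mathbf{F}_{t}\right) \odot \tilde{\mathbf{C}}_{t}$$

New information is only written where old information is forgotten, which is exactly the "forget to remember, remember to forget" trade-off above.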

# GRU

See the previous post: D2L-56-门控循环单元GRU (Gated Recurrent Unit).


  1. Understanding LSTM Networks – colah’s blog. Many pictures in this article are from colah’s blog. ↩︎

  2. 凸组合 - Convex Combination ↩︎