Part.19_Regularized_Linear_Regression(ML_Andrew.Ng.)

Last updated Sep 10, 2021

# Regularization & Linear Regression

2021-09-10

Tags: #MachineLearning #Regularization #GradientDescent #LinearRegression #NormalEquation

# Regularization & Gradient Descent

After adding the regularization term, two points need attention:

- $\theta_0$ is not regularized, so its update rule stays the same as before.
- For $j \in \{1,2,\ldots,n\}$, the regularization term $\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_{j}^{2}$ in the cost function adds an extra $\frac{\lambda}{m}\theta_{j}$ to the partial derivative.

Taking both points into account, the gradient descent update rule becomes:

$$
\begin{aligned}
&\text{Repeat } \{ \\
&\quad \theta_{0}:=\theta_{0}-\alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{0}^{(i)} \\
&\quad \theta_{j}:=\theta_{j}-\alpha\left[\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}+\frac{\lambda}{m} \theta_{j}\right] \qquad j \in\{1,2,\ldots,n\} \\
&\}
\end{aligned}
$$

Expanding the brackets, the update rule on the second line can be rewritten as:

$$
\theta_{j}:=\theta_{j}\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)} \qquad j \in\{1,2,\ldots,n\}
$$

Since $\left(1-\alpha\frac{\lambda}{m}\right)$ is always less than $1$ (for $\alpha, \lambda, m > 0$), every update shrinks $\theta_{j}$ a little, while the remaining part of the formula is exactly the same as in unregularized gradient descent.
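As a quick sanity check of the update rule, here is a minimal NumPy sketch. The function name and the parameters `alpha` (learning rate), `lam` ($\lambda$), and `num_iters` are illustrative assumptions, not from the course:

```python
import numpy as np

def gradient_descent_regularized(X, y, alpha=0.01, lam=1.0, num_iters=1000):
    """Regularized gradient descent for linear regression.

    X: (m, n+1) design matrix whose first column is all ones (x_0 = 1).
    y: (m,) target vector.
    """
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        error = X @ theta - y          # h_theta(x^(i)) - y^(i) for all i
        grad = (X.T @ error) / m       # (1/m) * sum of error * x_j
        reg = (lam / m) * theta        # extra lambda/m * theta_j term
        reg[0] = 0.0                   # first point: theta_0 is not regularized
        theta = theta - alpha * (grad + reg)
    return theta
```

The line `reg[0] = 0.0` implements the first point above: the intercept $\theta_0$ is updated exactly as in the unregularized case, while every other $\theta_j$ gets the extra shrinkage term.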

# Regularization & Normal Equation

$$
\begin{aligned}
&\theta=\left(X^{T} X+\lambda \cdot L\right)^{-1} X^{T} \vec{y} \\
&\text{where } L=\left[\begin{array}{ccccc}
0 & & & & \\
& 1 & & & \\
& & 1 & & \\
& & & \ddots & \\
& & & & 1
\end{array}\right]_{(n+1)\times(n+1)}
\end{aligned}
$$

The first $0$ in $L$ can be understood as not regularizing $\theta_0$.
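A minimal sketch of this closed-form solution, assuming the same design-matrix convention as above (the function name and the parameter `lam` for $\lambda$ are illustrative):

```python
import numpy as np

def normal_equation_regularized(X, y, lam=1.0):
    """theta = (X^T X + lambda * L)^(-1) X^T y, with L = diag(0, 1, ..., 1)."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0                      # the first 0: theta_0 stays unregularized
    # Solve the linear system rather than forming the inverse explicitly,
    # which is cheaper and numerically more stable.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```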