Cyan's Blog


Part.19_Regularized_Linear_Regression(ML_Andrew.Ng.)

Last updated Sep 10, 2021

# Regularization & Linear Regression

2021-09-10

Tags: #MachineLearning #Regularization #GradientDescent #LinearRegression #NormalEquation

# Regularization & Gradient Descent

After adding the regularization term, two points need attention:

- we do not regularize $\theta_0$, so the update rule for the bias term stays unchanged;
- for $j \geq 1$, the gradient of the cost function gains an extra $\frac{\lambda}{m}\theta_{j}$ term.

Taking both points into account, the gradient descent update rule becomes:

$$
\begin{aligned}
&\text{Repeat}\ \{ \\
&\quad \theta_{0}:=\theta_{0}-\alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{0}^{(i)} \\
&\quad \theta_{j}:=\theta_{j}-\alpha\left[\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}+\frac{\lambda}{m} \theta_{j}\right] \qquad j \in \{1,2,\ldots,n\} \\
&\}
\end{aligned}
$$

If we expand the brackets, the update rule on the second line can be rewritten as:

$$
\theta_{j}:=\theta_{j}\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)} \qquad j \in \{1,2,\ldots,n\}
$$

Since $\left(1-\alpha\frac{\lambda}{m}\right)$ is always slightly less than $1$ (for example, with $\alpha=0.01$, $\lambda=1$, $m=100$ it equals $0.9999$), every update shrinks $\theta_j$ a little, while the rest of the formula is exactly the same as in unregularized gradient descent.
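The two update rules above vectorize naturally. Below is a minimal NumPy sketch of batch gradient descent with the regularized update; the function name, the hyperparameter defaults, and the assumption that `X` already contains a leading column of ones (the $x_0$ feature) are mine, not from the course:

```python
import numpy as np

def regularized_gradient_descent(X, y, alpha=0.01, lam=1.0, n_iters=1000):
    """Batch gradient descent for linear regression with L2 regularization.

    Assumes X is an (m, n+1) design matrix whose first column is all ones
    (the intercept feature x_0) and y is an (m,) target vector.
    """
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(n_iters):
        error = X @ theta - y              # h_theta(x^(i)) - y^(i), shape (m,)
        grad = (X.T @ error) / m           # unregularized gradient for all j
        grad[1:] += (lam / m) * theta[1:]  # penalty term skips theta_0
        theta -= alpha * grad
    return theta
```

Note how `grad[1:]` encodes the distinction between the two update rules: only $\theta_1,\ldots,\theta_n$ receive the $\frac{\lambda}{m}\theta_j$ penalty.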

# Regularization & Normal Equation

$$
\begin{aligned}
&\theta=\left(X^{T} X+\lambda \cdot L\right)^{-1} X^{T} \vec{y} \\
&\text{where } L=\left[\begin{array}{ccccc}
0 & & & & \\
& 1 & & & \\
& & 1 & & \\
& & & \ddots & \\
& & & & 1
\end{array}\right]_{(n+1)\times(n+1)}
\end{aligned}
$$


The first $0$ in $L$ can be understood as not regularizing $\theta_0$.
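As a counterpart to the gradient descent sketch, here is a small NumPy sketch of the regularized normal equation; again, the function name and the default value of `lam` are illustrative assumptions:

```python
import numpy as np

def regularized_normal_equation(X, y, lam=1.0):
    """Closed-form solution theta = (X^T X + lambda * L)^{-1} X^T y.

    Assumes X is an (m, n+1) design matrix with a leading column of ones.
    L is the (n+1)x(n+1) identity with its (0, 0) entry zeroed, so that
    theta_0 is left unregularized.
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0   # the first 0 in L: no penalty on theta_0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Using `np.linalg.solve` rather than explicitly inverting $X^{T}X+\lambda L$ is a standard numerical choice; it computes the same $\theta$ more stably.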