
# The Regularization Term Does Not Affect the Convexity of the Linear Regression Loss Function

2021-09-10

Tags: #MachineLearning #Regularization #GradientDescent #LinearRegression #ConvexOptimization

# First, how do we prove that a function is convex?

If $f$ is twice differentiable, then $f$ is convex provided its domain is a convex set and $\forall x \in \operatorname{dom}(f),\ \nabla^{2} f(x) \succeq 0$, i.e., the Hessian is positive semidefinite everywhere (in the one-dimensional case this reduces to $f''(x) \geqslant 0$).1
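As a concrete illustration of this criterion (my own sketch, not part of the original post), the snippet below estimates the Hessian of a toy convex function by central finite differences and checks that its eigenvalues are nonnegative; the names `hessian` and `f` are hypothetical choices for this example.

```python
import numpy as np

def hessian(f, x, eps=1e-4):
    """Central finite-difference estimate of the Hessian of f: R^n -> R at x."""
    n = x.size
    I = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + eps * I[i] + eps * I[j]) - f(x + eps * I[i] - eps * I[j])
                       - f(x - eps * I[i] + eps * I[j]) + f(x - eps * I[i] - eps * I[j])) / (4 * eps ** 2)
    return H

f = lambda x: np.sum(x ** 2)               # toy convex function
H = hessian(f, np.array([1.0, -2.0]))
print(np.linalg.eigvalsh(H))               # eigenvalues ~[2, 2], all >= 0: consistent with convexity
```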

# First, we show that $J(\theta)$ without the regularization term is convex

Throughout, $J(\theta)=\frac{1}{2 m} \sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2}$, where $h_{\theta}(x)=\theta^{\top} x$, $m$ is the number of training examples, and $x_{j}^{(i)}$ is the $j$-th feature of the $i$-th example.

$$\begin{aligned} \frac{\partial}{\partial \theta_{j}} J(\theta) &=\frac{1}{2 m} \sum_{i=1}^{m} \frac{\partial}{\partial \theta_{j}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2} \\ &=\frac{1}{2 m} \sum_{i=1}^{m} 2\left(h_{\theta}(x^{(i)})-y^{(i)}\right) \cdot \frac{\partial}{\partial \theta_{j}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right) \\ &=\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right) x_{j}^{(i)} \end{aligned}$$
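Before moving on, this gradient formula can be sanity-checked numerically. The following is a minimal sketch under my own assumptions: a small random design matrix `X` whose rows play the role of the $x^{(i)}$, a random target vector `y`, and $h_{\theta}(x)=\theta^{\top} x$ with no separate intercept term.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
X = rng.normal(size=(m, n))   # rows are the x^(i)  (assumed toy data)
y = rng.normal(size=m)
theta = rng.normal(size=n)

J = lambda t: np.sum((X @ t - y) ** 2) / (2 * m)   # unregularized loss
grad = X.T @ (X @ theta - y) / m                   # the formula derived above

# Central finite differences in each coordinate direction.
eps = 1e-6
num_grad = np.array([(J(theta + eps * e) - J(theta - eps * e)) / (2 * eps)
                     for e in np.eye(n)])
print(np.max(np.abs(grad - num_grad)))   # tiny (~1e-10): analytic formula agrees
```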

$$\begin{aligned} \frac{\partial^{2}}{\partial \theta_{j}^{2}} J(\theta) &=\frac{\partial}{\partial \theta_{j}}\left(\frac{\partial}{\partial \theta_{j}} J(\theta)\right) \\ &=\frac{1}{m} \sum_{i=1}^{m} \frac{\partial}{\partial \theta_{j}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right) x_{j}^{(i)} \\ &=\frac{1}{m} \sum_{i=1}^{m}\left(x_{j}^{(i)}\right)^{2} \end{aligned}$$

This is clearly nonnegative for every $j$. (Strictly speaking, nonnegative diagonal second partials alone do not imply convexity; but the full Hessian here is $\nabla^{2} J(\theta)=\frac{1}{m} X^{\top} X$, which is positive semidefinite because $v^{\top} X^{\top} X v=\|X v\|^{2} \geqslant 0$ for every $v$.) Hence $J(\theta)$ is convex.
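In matrix form, this computation says the full Hessian is $\frac{1}{m} X^{\top} X$, whose diagonal entries are exactly the averages $\frac{1}{m} \sum_{i}\left(x_{j}^{(i)}\right)^{2}$ derived above. A quick numerical confirmation, using the same hypothetical random `X` setup as before:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
X = rng.normal(size=(m, n))

H = X.T @ X / m                                          # Hessian of the unregularized loss
print(np.allclose(np.diag(H), np.mean(X ** 2, axis=0)))  # True: diagonal matches the sum above
print(np.linalg.eigvalsh(H))                             # all eigenvalues >= 0: J is convex
```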

# Now add the regularization term

$$\begin{aligned} \frac{\partial}{\partial \theta_{j}} J(\theta) &=\frac{1}{2 m}\left[\sum_{i=1}^{m} \frac{\partial}{\partial \theta_{j}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2}+\lambda \frac{\partial}{\partial \theta_{j}} \sum_{k=1}^{n} \theta_{k}^{2}\right] \\ &=\frac{1}{2 m}\left[\sum_{i=1}^{m} 2\left(h_{\theta}(x^{(i)})-y^{(i)}\right) x_{j}^{(i)}+2 \lambda \theta_{j}\right] \\ &=\frac{\lambda}{m} \theta_{j}+\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right) x_{j}^{(i)} \end{aligned}$$
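The same finite-difference check works with the ridge term included. Again a sketch under my own assumptions; note that here $\lambda$ penalizes every coordinate of $\theta$, matching the sum $\sum_{k=1}^{n} \theta_{k}^{2}$ in the derivation (in practice the intercept $\theta_{0}$ is often left unpenalized).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
lam = 0.1                                  # the regularization strength lambda
X = rng.normal(size=(m, n))
y = rng.normal(size=m)
theta = rng.normal(size=n)

J = lambda t: (np.sum((X @ t - y) ** 2) + lam * np.sum(t ** 2)) / (2 * m)
grad = (X.T @ (X @ theta - y) + lam * theta) / m   # (lambda/m)*theta_j + (1/m)*sum(...)*x_j

eps = 1e-6
num_grad = np.array([(J(theta + eps * e) - J(theta - eps * e)) / (2 * eps)
                     for e in np.eye(n)])
print(np.max(np.abs(grad - num_grad)))             # tiny: the ridge gradient is correct
```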

$$\begin{aligned} \frac{\partial^{2}}{\partial \theta_{j}^{2}} J(\theta) &=\frac{\partial}{\partial \theta_{j}}\left(\frac{\partial}{\partial \theta_{j}} J(\theta)\right) \\ &=\frac{\partial}{\partial \theta_{j}}\left(\frac{\lambda}{m} \theta_{j}+\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right) x_{j}^{(i)}\right) \\ &=\frac{\lambda}{m}+\frac{1}{m} \sum_{i=1}^{m}\left(x_{j}^{(i)}\right)^{2} \end{aligned}$$

For $\lambda>0$ this is always strictly positive; the full Hessian is now $\frac{1}{m}\left(X^{\top} X+\lambda I\right)$, which is positive definite. By the criterion above, the regularized loss function is therefore still convex, in fact strictly convex. QED.
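Numerically, the ridge term simply shifts every eigenvalue of the Hessian up by $\lambda / m$, which is exactly why the regularized loss becomes strictly convex for $\lambda>0$. A last sketch under the same assumed random `X`:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3
lam = 0.1
X = rng.normal(size=(m, n))

H_plain = X.T @ X / m                              # Hessian without regularization
H_ridge = (X.T @ X + lam * np.eye(n)) / m          # Hessian with the ridge term

# Adding (lam/m) * I shifts every eigenvalue up by lam/m ...
print(np.allclose(np.linalg.eigvalsh(H_ridge),
                  np.linalg.eigvalsh(H_plain) + lam / m))    # True
# ... so the smallest eigenvalue is at least lam/m > 0: positive definite.
print(np.linalg.eigvalsh(H_ridge).min() >= lam / m - 1e-12)  # True
```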


1. For a more detailed derivation, see this Zhihu article: https://zhuanlan.zhihu.com/p/210252556 ↩︎