# 线性回归

2021-01-07 11:40:35 裏表異体


## Univariate Linear Regression

The univariate model is

$y_i = f(x_i) = wx_i + b$

$S = \sum^{n}_{i=1}{(f(x_i)-y_i)^2}$

$L(w,b)=\frac{1}{2}S =\frac{1}{2}\sum^{n}_{i=1}{(f(x_i)-y_i)^2}$

\begin{align} (w^*,b^*) &= \underset{(w,b)}{\operatorname{arg\,min}} \frac{1}{2}\sum^{n}_{i=1}{(f(x_i)-y_i)^2}\\ &= \underset{(w,b)}{\operatorname{arg\,min}} \frac{1}{2}\sum^{n}_{i=1}{(wx_i+b-y_i)^2} \end{align}

Taking the partial derivatives of $L(w,b)$ with respect to $w$ and $b$ gives

$\frac{\partial{L(w,b)}}{\partial w} = \sum^{n}_{i=1}{(wx_i+b-y_i)}x_i, \qquad \frac{\partial{L(w,b)}}{\partial b} = \sum^{n}_{i=1}{(wx_i+b-y_i)}$

### Gradient Descent

Update the parameters iteratively with learning rate $\alpha$:

$w \leftarrow w - \alpha \frac{\partial{L}}{\partial w}, \qquad b \leftarrow b - \alpha \frac{\partial{L}}{\partial b}$

### Direct Solution

Setting both partial derivatives to zero and solving gives

$w = \frac{\sum^{n}_{i=1}y_i(x_i-\overline{x})}{\sum^{n}_{i=1}x^2_i-\frac{1}{n}\left(\sum^{n}_{i=1}x_i\right)^2}, \qquad b = \frac{1}{n}\sum_{i=1}^n(y_i-wx_i)$
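
The closed form follows by first solving $\frac{\partial L}{\partial b} = 0$ for $b$ and then substituting into $\frac{\partial L}{\partial w} = 0$:

$$
\begin{aligned}
\frac{\partial L}{\partial b} = 0 &\Rightarrow b = \frac{1}{n}\sum_{i=1}^{n}(y_i - wx_i) = \overline{y} - w\overline{x}\\
\frac{\partial L}{\partial w} = 0 &\Rightarrow w\sum_{i=1}^{n}x_i^2 + (\overline{y}-w\overline{x})\sum_{i=1}^{n}x_i - \sum_{i=1}^{n}x_iy_i = 0\\
&\Rightarrow w\left(\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\Big(\sum_{i=1}^{n}x_i\Big)^2\right) = \sum_{i=1}^{n}y_i(x_i-\overline{x})
\end{aligned}
$$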

## Multivariate Linear Regression

$f(x) = w_1x_1 + w_2x_2 + w_3x_3 +\dots + w_mx_m + b$

Write $y$, $x_i$, and $w$ in vector form (here $x_i$ denotes the data from the $i$-th observation of $x$):

$y = \begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_n\\ \end{bmatrix} , x_i = \begin{bmatrix} x_{i1}\\ x_{i2}\\ \vdots\\ x_{im}\\ \end{bmatrix} , w = \begin{bmatrix} w_1\\ w_2\\ \vdots\\ w_m\\ \end{bmatrix}$

$f(x_i) = w^Tx_i + b$

Stacking the $n$ observations as rows gives the design matrix

$X = \begin{bmatrix} x_{11} & x_{12} & \dots & x_{1m}\\ x_{21} & x_{22} & \dots & x_{2m}\\ \vdots & & & \vdots\\ x_{n1} & x_{n2} & \dots & x_{nm}\\ \end{bmatrix}$

Absorbing the bias $b$ into $\hat{w}$ by appending a column of ones to $X$:

$\hat{w} = \begin{bmatrix} w_1\\ w_2\\ \vdots\\ w_m\\ b\\ \end{bmatrix} , X = \begin{bmatrix} x_{11} & x_{12} & \dots & x_{1m} & 1\\ x_{21} & x_{22} & \dots & x_{2m} & 1\\ \vdots & & & & \vdots\\ x_{n1} & x_{n2} & \dots & x_{nm} & 1\\ \end{bmatrix}$

$y = f(X) = X\hat{w}$

$\hat{w}^* = \underset{\hat{w}}{\operatorname{arg\,min}}\ (y-X\hat{w})^T(y-X\hat{w})$

$L(\hat{w}) = (y-X\hat{w})^T(y-X\hat{w}) = \|y-X\hat{w}\|^2$

Taking the derivative with respect to $\hat{w}$ gives

$\frac{\partial L(\hat{w})}{\partial \hat{w}} = 2X^T(X\hat{w}-y)$
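
The analytic gradient can be checked numerically; below is a small sketch with made-up data, comparing it against central finite differences of $L(\hat{w})$:

```python
import numpy as np

def loss(X, w, y):
    # L(w) = (y - Xw)^T (y - Xw)
    r = X @ w - y
    return float(r @ r)

def analytic_grad(X, w, y):
    # 2 X^T (Xw - y), as derived above
    return 2 * X.T @ (X @ w - y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # made-up data, for the check only
y = rng.normal(size=20)
w = rng.normal(size=3)

# central finite differences, one coordinate at a time
eps = 1e-6
num_grad = np.array([
    (loss(X, w + eps * e, y) - loss(X, w - eps * e, y)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(num_grad, analytic_grad(X, w, y), atol=1e-4)
```

Because the loss is quadratic, the central difference is essentially exact, so the two gradients agree to floating-point accuracy.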

### Gradient Descent

$\hat{w} \leftarrow \hat{w} - \alpha \frac{\partial{L}}{\partial\hat{w}} \\$

### Normal Equation

Setting the gradient to zero:

$2X^T(X\hat{w}-y) = 0$

$\hat{w} = (X^TX)^{-1}X^Ty$
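
As a quick sanity check (on toy data, not from this post), the normal-equation solution matches NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.c_[rng.normal(size=(30, 2)), np.ones(30)]  # last column plays the role of the bias
y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.normal(size=30)

# normal equation
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y
# NumPy's least-squares solver as a reference
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(w_normal, w_lstsq)
```

In practice `np.linalg.lstsq` (or a QR/SVD-based solver) is preferred over explicitly inverting $X^TX$, which can be ill-conditioned.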

## Implementation

The loss function, and an initializer that reshapes the inputs, appends the bias column, and draws a random starting $w$:

```python
import numpy as np
import matplotlib.pyplot as plt

def lossf(X, w, y):
    """Sum-of-squares loss (y - Xw)^T (y - Xw)."""
    return np.sum((y - np.dot(X, w))**2)

def init(X, y):
    if X.ndim == 1:
        X = X.reshape(X.size, 1)
    if y.ndim == 1:
        y = y.reshape(y.size, 1)
    # append a column of ones to X for the bias term
    X = np.c_[X, np.ones([X.shape[0], 1])]
    n, m = X.shape
    w = np.random.normal(1, 0.1, m)
    w = w.reshape(w.size, 1)
    return X, y, w
```

Solving directly via the normal equation:

```python
def LRWithNormalEquation(x, y):
    """Solve w = (X^T X)^{-1} X^T y directly."""
    X, y, w = init(x, y)
    inv = np.linalg.inv(np.dot(X.T, X))
    R = np.dot(X.T, y)
    w = np.dot(inv, R)
    return w
```

Gradient descent, then fitting both models on the sample data and plotting the two fits:

```python
def LRWithGradientDescent(x, y):
    """Fit w by batch gradient descent on the squared-error loss."""
    X, y, w = init(x, y)

    delta = 0.001     # convergence threshold on the loss
    alpha = 0.001     # learning rate
    max_step = 10000  # maximum number of iterations
    err = 1000
    loss = []
    i = 1
    while err > delta and i < max_step:
        i += 1
        # gradient step; the constant factor 2 is absorbed into alpha
        w = w - alpha * np.dot(X.T, np.dot(X, w) - y)
        err = lossf(X, w, y)
        loss.append(err)
    print(w)

    plt.plot(loss)
    return w

def f(X, w):
    return np.dot(X, w)

x = np.array([0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00,
              2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 4.00, 4.25,
              4.50, 4.75, 5.00, 5.50])

y = np.array([10, 26, 23, 43, 20, 22, 43, 50, 62, 50, 55, 75,
              62, 78, 87, 76, 64, 85, 90, 98])

w1 = LRWithGradientDescent(x, y)
w2 = LRWithNormalEquation(x, y)

X, y, w = init(x, y)
y1 = f(X, w1)
y2 = f(X, w2)
plt.subplot(1, 2, 1)
plt.scatter(x, y)
plt.plot(x, y1)
plt.title('Gradient Descent')
plt.subplot(1, 2, 2)
plt.scatter(x, y)
plt.plot(x, y2)
plt.title('Normal Equation')
```

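As a self-contained cross-check (independent of the listing above), plain batch gradient descent run to convergence on the same dataset agrees with the normal-equation solution:

```python
import numpy as np

x = np.array([0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00,
              2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 4.00, 4.25,
              4.50, 4.75, 5.00, 5.50])
y = np.array([10, 26, 23, 43, 20, 22, 43, 50, 62, 50, 55, 75,
              62, 78, 87, 76, 64, 85, 90, 98], dtype=float)

X = np.c_[x, np.ones_like(x)]            # append the bias column
w_ne = np.linalg.inv(X.T @ X) @ X.T @ y  # normal equation

# batch gradient descent; the factor 2 is absorbed into the step size
w_gd = np.zeros(2)
for _ in range(100_000):
    w_gd -= 1e-3 * (X.T @ (X @ w_gd - y))

assert np.allclose(w_gd, w_ne, atol=1e-3)
```

The step size must satisfy $\alpha < 2/\lambda_{\max}(X^TX)$ for the iteration to converge; for this data $\lambda_{\max} \approx 215$, so $\alpha = 10^{-3}$ is safely inside that range.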


## References

https://www.cnblogs.com/urahyou/p/14227037.html