Understanding Gradient Descent
2020-11-06 01:14:27 [Artificial intelligence meets pioneer]
Author: PHANI8 | Compiled by: VK | Source: Analytics Vidhya
Introduction
In this article, we'll look at what gradient descent really is, why it became popular, and why most algorithms in AI and ML rely on this technique.
Before we start, what does "gradient" actually mean? It may sound strange at first!
Cauchy was the first person to propose gradient descent, in 1847.
The word gradient refers to the rate at which a quantity increases or decreases, and descent means moving downward. So, broadly speaking, gradient descent is the act of stepping downhill to a point, observing the slope there, and continuing to step downhill.
As shown in the figure, the slope is steep near the top of the mountain, and as you keep moving down it becomes smallest at the foot of the mountain, where it is close to or equal to zero. The same applies mathematically.
Let's see how it works.
The math behind gradient descent
The shape here is the same as the mountain above. Let's assume it is a curve of the form y = f(x).
We know that the slope at any point is the derivative of y with respect to x. If you check along the curve, you'll find that the magnitude of the slope shrinks as you move down and equals zero at the bottom, i.e. at the minimum; as you move up again on the other side, the slope increases.
Keep this in mind, because we are going to look at what happens to the values of x and y at the minimum point.
Look at the picture below, where we have five points at different positions!
![](http://qiniu.aihubs.net/61300Screenshot (123).png)
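As a rough numerical illustration of the five points, the sketch below assumes the curve is the parabola y = x**2 (the article does not say which curve is drawn) and prints y together with an approximate slope at five hypothetical x positions. The names `f` and `slope` are made up for this example.

```python
def f(x):
    """Assumed example curve: a parabola with its minimum at x = 0."""
    return x ** 2


def slope(x, h=1e-6):
    """Central-difference approximation of dy/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Five hypothetical points, two on each side of the minimum and one at the bottom
points = [-2.0, -1.0, 0.0, 1.0, 2.0]
for x in points:
    print(f"x = {x:+.1f}   y = {f(x):.2f}   slope = {slope(x):+.2f}")
```

Under this assumption, y is smallest at the middle point and the slope there is zero, while the slope grows in magnitude on either side.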
As we move down, the value of y decreases, so among all the points here the one at the bottom of the graph has the smallest value. Our conclusion is that, for a bowl-shaped curve like this one, the minimum (x, y) sits at the bottom of the graph. Now let's see how ML and DL make use of this, and how we can reach the minimum without traversing the whole graph.
In any algorithm our main goal is to minimize the loss, since a low loss indicates that the model is performing well. To analyze this, we'll use linear regression,
because linear regression predicts a continuous output using a straight line.
Let's take the straight line y = w*x + c.
Here we need to find w and c such that we get the best-fitting line, the one that minimizes the error. So our goal is to find the best values of w and c.
We start with some random values of w and c and update them based on the loss; that is, we keep updating these weights until the slope is equal to or close to zero.
We plot the loss function on the y-axis, with w and c on the x-axis. Look at the picture below.
![](http://qiniu.aihubs.net/47460Screenshot (124).png)
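To get a feel for the loss-versus-w picture, here is a small sketch that evaluates a mean squared error loss at a few values of w while c is held fixed. The data set (points on the line y = 2x + 1), the fixed value of c, and the helper name `mse_loss` are all assumptions chosen for illustration, not taken from the article.

```python
import numpy as np

# Hypothetical data lying exactly on the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0


def mse_loss(w, c):
    """Mean squared error of the line y = w*x + c on the data above."""
    return float(np.mean((y - (w * x + c)) ** 2))


# Hold c fixed and sweep w to trace out the loss curve
c_fixed = 1.0
w_values = [0.0, 1.0, 2.0, 3.0, 4.0]
losses = [mse_loss(w, c_fixed) for w in w_values]
for w, loss in zip(w_values, losses):
    print(f"w = {w:.1f}   loss = {loss:.2f}")
```

With this data the loss is lowest at w = 2 and rises symmetrically on both sides, which is the bowl shape that gradient descent walks down.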
To reach the value of w that gives the minimum in the first graph, follow these steps:

Using the current w and c, calculate the loss for the given set of x values.

Plot the point, then update the weight as
w_new = w_old - learning_rate * slope at (w_old, loss)
Repeat these steps until the minimum is reached!

We subtract the gradient here because we want to move toward the foot of the mountain, i.e. in the direction of steepest descent.

After each subtraction we land on a point with a smaller slope than the previous one, which is what we want: to move toward a point where the slope is equal to or close to zero.

We'll discuss the learning rate below.
The same applies to picture 2, where the loss is a function of c.
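The steps above can be sketched as a short loop. This is only an illustration under stated assumptions: the data set, the fixed value of c, the learning rate of 0.05, and the stopping tolerance are all hypothetical, and only w is updated so that the loop matches the one-dimensional picture.

```python
import numpy as np

# Hypothetical data lying on y = 2x + 1; c is held fixed so only w moves
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
c = 1.0


def slope_at(w):
    """Derivative of the mean squared error with respect to w."""
    return float(np.mean(-2.0 * x * (y - (w * x + c))))


w = 0.0                # starting guess
learning_rate = 0.05   # assumed value
steps = 0
# Repeat the update until the slope is close to zero (with a safety cap)
while abs(slope_at(w)) > 1e-6 and steps < 1000:
    w = w - learning_rate * slope_at(w)   # w_new = w_old - learning_rate * slope
    steps += 1

print(f"w = {w:.6f} after {steps} steps")
```

The loop stops as soon as the slope is nearly flat, which here happens close to w = 2, the value used to generate the data.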
Now the question is: why do we put the learning rate in the equation? Because we cannot visit every point between the starting point and the minimum.
We need to skip some points.

We can take big steps at the beginning.

However, as we get close to the minimum we need to take small steps, because otherwise we may overshoot the minimum and end up on the other side, where the slope increases again. The learning rate is introduced to control the step size as we move along the graph. Very small steps would also reach the minimum eventually, but we care about making the algorithm fast!
![](http://qiniu.aihubs.net/59180Screenshot (125).png)
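The effect of the step size can be seen in a toy experiment. The sketch below assumes the loss is simply f(w) = w**2 (slope 2*w), starts from w = 5, and runs 20 updates with three hypothetical learning rates; the function name `descend` and all the numbers are illustrative choices, not values from the article.

```python
def descend(learning_rate, start=5.0, steps=20):
    """Run gradient descent on the assumed loss f(w) = w**2, whose slope is 2*w."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w


# A small, a moderate, and an overly large learning rate
for lr in (0.01, 0.1, 1.1):
    print(f"learning_rate = {lr}   final w = {descend(lr):.4f}")
```

In this setup the smallest rate creeps toward the minimum at 0, the moderate rate gets much closer in the same number of steps, and the largest rate jumps over the minimum on every step and moves further away.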
Here is an example algorithm for linear regression using gradient descent, with the mean squared error as the loss function.
1. Initialize the model parameters to zero
w=0, c=0
2. Initialize the learning rate with any value in the range (0, 1)
lr=0.01
The error equation:
![](http://qiniu.aihubs.net/43480Screenshot (128).png)
Now substitute (w*x + c) for Ypred and calculate the partial derivative with respect to w
![](http://qiniu.aihubs.net/12675Screenshot (129).png)
3. The partial derivative with respect to c can be calculated in the same way
![](http://qiniu.aihubs.net/38784Screenshot (130).png)
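Since the equations above appear only as screenshots, here is a reconstruction in LaTeX. It assumes the loss is the standard mean squared error over n points with the prediction written as w*x + c, which is what the text describes; the symbols E and the hat notation are choices made for this sketch and may differ from the screenshots.

```latex
E = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad \hat{y}_i = w x_i + c

\frac{\partial E}{\partial w} = -\frac{2}{n}\sum_{i=1}^{n} x_i\left(y_i - (w x_i + c)\right)

\frac{\partial E}{\partial c} = -\frac{2}{n}\sum_{i=1}^{n}\left(y_i - (w x_i + c)\right)
```

The leading minus sign comes from differentiating the inner term y_i - (w x_i + c) with respect to w or c.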
4. Apply this to the data set for every epoch
for i in range(epochs):
    y_pred = w * x + c
    # Partial derivatives of the loss; sum() adds up the gradients of all points at once
    D_W = (-2/n) * sum(x * (y_original - y_pred))
    D_C = (-2/n) * sum(y_original - y_pred)
    # Update the parameters on every iteration
    w = w - lr * D_W
    c = c - lr * D_C
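For readers who want to run the loop end to end, the following is a self-contained sketch using NumPy. The function name `fit_line`, the number of epochs, and the toy data (points on y = 2x + 1) are assumptions added for illustration; the update itself follows the steps listed above, with `D_W` and `D_C` standing for the partial derivatives of the mean squared error with respect to w and c.

```python
import numpy as np


def fit_line(x, y_original, lr=0.01, epochs=5000):
    """Fit y = w*x + c by gradient descent on the mean squared error."""
    w, c = 0.0, 0.0          # step 1: start both parameters at zero
    n = len(x)
    for _ in range(epochs):
        y_pred = w * x + c
        # Partial derivatives of the MSE with respect to w and c
        D_W = (-2 / n) * np.sum(x * (y_original - y_pred))
        D_C = (-2 / n) * np.sum(y_original - y_pred)
        # Move both parameters against their gradients
        w = w - lr * D_W
        c = c - lr * D_C
    return w, c


# Hypothetical data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
w, c = fit_line(x, y)
print(f"w = {w:.4f}, c = {c:.4f}")
```

On this noise-free data the fitted values should come out very close to the generating slope and intercept; with noisy data they would only approximate the least-squares line.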
Gradient descent is also used in deep learning to train neural networks.
Here, we update the weights of each neuron so as to get the best classification with the minimum error. We use gradient descent to update all the weights of every layer:
Wi = Wi - learning_rate * derivative(Loss function w.r.t. Wi)
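As a hedged illustration of this per-layer update, the sketch below trains a tiny two-layer network with NumPy. Everything about it is an assumption made for the example: the toy data, the layer sizes, the tanh activation, the mean squared error loss (a regression setup rather than classification), the learning rate, and the hand-written backward pass. Real networks are normally trained with a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: four 2-D inputs with target x1 + x2
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [2.0]])

# Weights of the two layers (no biases, to keep the sketch short)
W1 = rng.normal(scale=0.5, size=(2, 3))
W2 = rng.normal(scale=0.5, size=(3, 1))
learning_rate = 0.1


def forward(X, W1, W2):
    """Return the hidden activations and the network output."""
    h = np.tanh(X @ W1)
    return h, h @ W2


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


_, out = forward(X, W1, W2)
initial_loss = mse(out, y)

for _ in range(500):
    h, out = forward(X, W1, W2)
    # Backward pass: derivative of the loss with respect to each weight matrix
    d_out = 2.0 * (out - y) / len(X)
    grad_W2 = h.T @ d_out
    d_h = d_out @ W2.T
    grad_W1 = X.T @ (d_h * (1.0 - h ** 2))   # tanh'(z) = 1 - tanh(z)**2
    # Wi = Wi - learning_rate * dLoss/dWi, applied to every layer
    W1 = W1 - learning_rate * grad_W1
    W2 = W2 - learning_rate * grad_W2

_, out = forward(X, W1, W2)
final_loss = mse(out, y)
print(f"loss before: {initial_loss:.4f}   loss after: {final_loss:.4f}")
```

The same update line is applied to `W1` and `W2`; only the way each gradient is computed differs from layer to layer.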
Why is it popular?
Gradient descent is the most commonly used optimization strategy in machine learning and deep learning.
It is used to train models on data, can be combined with a wide range of algorithms, and is easy to understand and implement.
Many statistical techniques and methods use GD to minimize a loss and optimize their procedures.
Link to the original text: https://www.analyticsvidhya.com/blog/2020/10/whatdoesgradientdescentactuallymean/
Copyright notice
This article was created by [Artificial intelligence meets pioneer]. Please include a link to the original when reposting. Thank you.