Classification function showdown: when to use Sigmoid and when to use Softmax?
2020-11-08 16:17:51 【Spiritual】
When designing a model for a classification task (for example, classifying diseases on a chest X-ray or classifying handwritten digits), sometimes several answers must be selected at once (for example, both pneumonia and abscess), and sometimes only one answer is allowed (for example, the digit "8"). This article discusses how to apply the sigmoid function or the softmax function to a classifier's raw output values.
There are many kinds of classification algorithms, but this article is limited to neural network classifiers. Classification problems can be solved with different kinds of neural networks, such as feedforward neural networks and convolutional neural networks.
Before a sigmoid or softmax function is applied, the final result of a neural network classifier is a vector of "raw output values", such as [-0.5, 1.2, -0.1, 2.4]. These four values correspond to pneumonia, cardiomegaly, tumor and abscess found on a chest X-ray. But what do these raw output values mean? They are easier to interpret once converted to probabilities: compared with a seemingly arbitrary "2.4", a statement like "the probability of diabetes is 91%" is easier for a patient to understand. The sigmoid function and the softmax function both map a classifier's raw output values to probabilities. The following figure shows the raw outputs of a feedforward neural network (blue) being mapped to probabilities (red) by the sigmoid function:
The same process is then repeated with the softmax function:
As the figures show, the sigmoid and softmax functions give different results. The reason is that the sigmoid function processes each raw output value separately, so its results are independent of one another and the probabilities do not necessarily sum to 1: in the figure, 0.37 + 0.77 + 0.48 + 0.91 = 2.53. In contrast, the outputs of the softmax function are interrelated and the probabilities always sum to 1: in the figure, 0.04 + 0.21 + 0.05 + 0.70 = 1.00. With softmax, therefore, increasing the probability of one class necessarily decreases the probability of the others.
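The comparison above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function names `sigmoid` and `softmax` are illustrative, and the raw outputs are assumed to be [-0.5, 1.2, -0.1, 2.4], the values that reproduce the probabilities quoted in the text.

```python
import math

def sigmoid(z):
    """Map one raw output value to a probability, independently of the others."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Map a list of raw output values to probabilities that sum to 1."""
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed raw outputs (pneumonia, cardiomegaly, tumor, abscess).
raw_outputs = [-0.5, 1.2, -0.1, 2.4]

sigmoid_probs = [sigmoid(z) for z in raw_outputs]
softmax_probs = softmax(raw_outputs)

# Sigmoid: each value is treated on its own, so the sum is not constrained.
print([round(p, 3) for p in sigmoid_probs], "sum =", round(sum(sigmoid_probs), 3))
# Softmax: the values share one denominator, so the sum is 1.
print([round(p, 3) for p in softmax_probs], "sum =", round(sum(softmax_probs), 3))
```

The figures quoted in the article appear to be cut to two decimals, so the printed values may differ from them in the last digit.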
Sigmoid function applications: chest X-rays and hospital admission
Chest X-rays: a single chest X-ray can show several diseases at the same time, so a chest X-ray classifier also needs to be able to report several findings at once. Below is a chest X-ray showing both pneumonia and an abscess; the label column on the right contains two "1"s:
Hospital admission: the goal is to determine, from a patient's health record, the likelihood that the patient will be admitted to hospital in the future. The classification problem can therefore be framed as: classify the patient's existing health record according to the diagnoses (if any) that may lead to a future admission. Several diseases may lead to admission, so there may be more than one answer.
Diagram: the following two feedforward neural networks correspond to the two problems above. In the final step, the sigmoid function processes the raw output values and produces the corresponding probabilities, allowing several possibilities to coexist, because a chest X-ray may show several abnormalities and an admission may have more than one cause.
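To make the multi-label idea concrete, here is a small sketch of how sigmoid probabilities might be turned into several simultaneous labels. The raw outputs are made up, and the 0.5 decision threshold is an assumption (a common default), not something stated in the article.

```python
import math

def sigmoid(z):
    """Sigmoid applied to a single raw output value."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical raw outputs for one chest X-ray, one value per class.
classes = ["pneumonia", "cardiomegaly", "tumor", "abscess"]
raw_outputs = [2.0, -1.5, -3.0, 1.1]

# Assumed decision threshold; in practice it would be tuned per class.
THRESHOLD = 0.5

probs = [sigmoid(z) for z in raw_outputs]

# Every class is judged on its own, so zero, one or several can be selected.
predicted = [name for name, p in zip(classes, probs) if p >= THRESHOLD]
label_vector = [1 if p >= THRESHOLD else 0 for p in probs]

print(predicted)
print(label_vector)
```

With these made-up values two classes pass the threshold, which matches the situation of a label column containing two "1"s.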
Softmax function applications: handwritten digits and Iris
Handwritten digits: when classifying handwritten digits (the MNIST dataset: https://en.wikipedia.org/wiki/MNIST_database), the classifier should use the softmax function to decide which digit an image shows. After all, an 8 is just an 8; it cannot also be a 7.
Iris: the Iris dataset was introduced in 1936 (https://en.wikipedia.org/wiki/Iris_flower_data_set). It contains 150 samples divided into 3 classes (Iris setosa, Iris versicolor and Iris virginica) with 50 samples each. Each sample has 4 attributes: sepal length, sepal width, petal length and petal width. The following 9 examples are taken from the Iris dataset:
The dataset contains no images, but here is a picture of Iris versicolor (https://en.wikipedia.org/wiki/Iris_flower_data_set#/media/File:Iris_versicolor_3.jpg) for you to enjoy:
A neural network classifier for the Iris dataset should use the softmax function to process its raw output values, because an iris belongs to exactly one species; assigning it to several species makes no sense.
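For a single-answer problem such as Iris, the usual last step is to pick the class with the highest softmax probability. The sketch below assumes three made-up raw outputs for one flower; it does not train or load any model.

```python
import math

def softmax(zs):
    """Map raw output values to probabilities that sum to 1."""
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

species = ["Iris setosa", "Iris versicolor", "Iris virginica"]

# Hypothetical raw outputs of a classifier for one flower.
raw_outputs = [0.3, 2.2, 1.0]

probs = softmax(raw_outputs)

# Exactly one species is chosen: the index of the largest probability.
best = max(range(len(probs)), key=lambda i: probs[i])

print(species[best], round(probs[best], 2))
```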
About "e"
To understand the sigmoid and softmax functions, we first need to introduce "e". For this article, it is enough to know that e is a mathematical constant approximately equal to 2.71828. Some other facts about e:
• The decimal expansion of e goes on forever and its digits appear completely random, similar to pi.
• e is often used in the study of compound interest, gambling and certain probability distributions.
• Here is one formula for e:
$$ e = \sum_{n=0}^{\infty} \frac{1}{n!} $$
But there is more than one formula for e, and there are many ways to calculate it. For example: https://www.intmath.com/exponentiallogarithmicfunctions/calculatinge.php
• In 2004, Google's IPO filing was for 2,718,281,828 dollars, that is, "e billion dollars".
• Wikipedia records how the number of known decimal digits of e has grown over time (https://en.wikipedia.org/wiki/E_%28mathematical_constant%29#Bernoulli_trials), starting with 1 digit in 1690 and reaching 116,000 digits in 1978.
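As an illustration of the point that e can be calculated in more than one way, the sketch below approximates it with two well-known expressions: the series 1/0! + 1/1! + 1/2! + ... and the limit of (1 + 1/n)^n. The function names are hypothetical, and the numbers of terms are arbitrary choices.

```python
import math

def e_from_series(terms):
    """Approximate e as the partial sum 1/0! + 1/1! + ... + 1/(terms-1)!."""
    total = 0.0
    factorial = 1.0
    for n in range(terms):
        if n > 0:
            factorial *= n  # factorial now holds n!
        total += 1.0 / factorial
    return total

def e_from_limit(n):
    """Approximate e as (1 + 1/n)^n, which approaches e as n grows."""
    return (1.0 + 1.0 / n) ** n

print(e_from_series(15))        # converges very quickly
print(e_from_limit(1_000_000))  # converges much more slowly
print(math.e)                   # reference value from the standard library
```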
Sigmoid and softmax functions
Sigmoid = multi-label classification = more than one correct answer = non-exclusive outputs (for example chest X-rays, hospital admission)
• When building a classifier for a problem that has more than one correct answer, use the sigmoid function to process each raw output value separately.
• The sigmoid function is shown below (note the e):
$$ \sigma(z_j) = \frac{e^{z_j}}{1 + e^{z_j}} $$
In this formula, σ denotes the sigmoid function, and σ(z_j) means applying the sigmoid function to the number z_j. "z_j" is a single raw output value, such as -0.5, and j indexes the output value currently being processed. If there are four raw output values, then j = 1, 2, 3 or 4. In the earlier example, the raw output values are [-0.5, 1.2, -0.1, 2.4], so z_1 = -0.5, z_2 = 1.2, z_3 = -0.1 and z_4 = 2.4. Therefore,
$$ \sigma(z_1) = \sigma(-0.5) = \frac{e^{-0.5}}{1 + e^{-0.5}} \approx 0.3775 $$
z_2, z_3 and z_4 are calculated in the same way. Because the sigmoid function is applied to each raw output value separately, the possible outcomes include: all classes have low probabilities (e.g. "this chest X-ray shows nothing abnormal"); one class has a high probability and the others are low (e.g. "the chest X-ray shows only pneumonia"); several or all classes have high probabilities (e.g. "the chest X-ray shows pneumonia and an abscess"). The following figure shows the sigmoid curve:
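The sigmoid is often written with a negative exponent instead. The short derivation below is a sketch that assumes the sigmoid is defined as $e^{z_j}/(1+e^{z_j})$; it shows that the two forms agree and spells out the limits that give the curve its S shape between 0 and 1.

```latex
\begin{aligned}
\sigma(z_j) &= \frac{e^{z_j}}{1 + e^{z_j}}
            = \frac{e^{z_j}\, e^{-z_j}}{\left(1 + e^{z_j}\right) e^{-z_j}}
            = \frac{1}{1 + e^{-z_j}}, \\[4pt]
\lim_{z_j \to -\infty} \sigma(z_j) &= 0, \qquad
\sigma(0) = \tfrac{1}{2}, \qquad
\lim_{z_j \to +\infty} \sigma(z_j) = 1 .
\end{aligned}
```

Because each $\sigma(z_j)$ depends only on its own $z_j$, nothing forces the values for different $j$ to add up to 1.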
Softmax = multi-class classification = only one correct answer = mutually exclusive outputs (for example handwritten digits, Iris)
• When building a classifier for a problem that has only one correct answer, use the softmax function to process the raw output values.
• The denominator of the softmax function combines all of the raw output values, which means that the probabilities it produces are related to one another.
• The softmax function is written as follows:
$$ \mathrm{softmax}(z_j) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} $$
The softmax function is not very different from the sigmoid function, except that its denominator adds up the e^(raw output value) terms of all the raw output values. In other words, when computing softmax for a single raw output value (for example z_1), you cannot use z_1 alone: z_1, z_2, z_3 and z_4 must all appear in the denominator, as shown below:
$$ \mathrm{softmax}(z_1) = \frac{e^{z_1}}{e^{z_1} + e^{z_2} + e^{z_3} + e^{z_4}} $$
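The advantage of the softmax function is that all of the output probabilities sum to 1:
$$ \mathrm{softmax}(z_1) + \mathrm{softmax}(z_2) + \mathrm{softmax}(z_3) + \mathrm{softmax}(z_4) = 1 $$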
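One practical detail when implementing the shared denominator: for large raw output values, e^z can overflow. A common remedy, offered here as a sketch rather than as the article's method, is to subtract the largest raw output before exponentiating. The result is unchanged because the common factor cancels between numerator and denominator. The name `softmax_stable` is illustrative.

```python
import math

def softmax_stable(zs):
    """Softmax computed after subtracting max(zs) from every value.

    e^(z - m) / sum(e^(z_k - m)) equals e^z / sum(e^(z_k)) because the
    factor e^(-m) appears in both numerator and denominator.
    """
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed raw outputs from the running example.
small = softmax_stable([-0.5, 1.2, -0.1, 2.4])

# Calling math.exp(1000.0) directly would raise OverflowError.
large = softmax_stable([1000.0, 1001.0, 1002.0])

print(sum(small), sum(large))
```

In both cases the probabilities still sum to 1, which is the property the text highlights.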
When classifying handwritten digits, the softmax function is used to process the raw output values: increasing the probability that an example is classified as "8" necessarily decreases the probability that it is classified as any other digit (0, 1, 2, 3, 4, 5, 6, 7 and/or 9).
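The trade-off between digits can be checked numerically. In this sketch the ten raw outputs are made up; only the raw output for the digit 8 is raised, and the probabilities before and after are compared.

```python
import math

def softmax(zs):
    """Map raw output values to probabilities that sum to 1."""
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for the digits 0..9 (index = digit).
raw_outputs = [0.1, -1.0, 0.3, 0.8, -0.4, 0.0, 0.5, 1.5, 2.0, 0.9]
before = softmax(raw_outputs)

# Raise only the raw output for the digit 8; all others stay the same.
boosted = list(raw_outputs)
boosted[8] += 1.0
after = softmax(boosted)

for digit in range(10):
    print(digit, round(before[digit], 3), "->", round(after[digit], 3))
```

The probability for 8 rises while every other digit's probability falls, even though their raw outputs were not touched: the shared denominator grew.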
summary :
• If the model output is a non mutex class , And you can select multiple categories at the same time , Then Sigmoid Function to calculate the original output value of the network .
• If the model output is a mutex class , And only one category can be selected , Then Softmax Function to calculate the original output value of the network .
Copyright notice
This article was written by [Spiritual]. Please include a link to the original when reposting. Thank you.