The basic principle of image recognition

In the previous article we finally stepped into the world of machine learning with TensorFlow. Our first classification algorithm recognized handwritten digits with an accuracy of about 91%. That is real progress, but the result is still not satisfying.

The reason it is not satisfying is, of course, that the algorithm is too simple. Even if we all accept the view that “every problem can be described by a mathematical formula”, simply unrolling an image into 784 numbers, feeding them into a linear operation followed by a nonlinear classifier, and calling that “artificial intelligence” feels unconvincing. Seen this way, reaching even 91% is a little hard to believe. The disbelief is not that 91% is too low, but that such a toy-like calculation can reach 91% at all.

This is the charm of mathematics. The formula looks simple, but as the previous article said, do not forget that it works in 784 dimensions; doing it by hand would drive anyone crazy.

Using the small program described in the previous article, we can take the weight matrix W produced by training our image recognition program, turn each of its 10 columns into a 28x28 image (remember that W is 784x10, and 784 comes from 28x28), and shade the result. It looks like this:



Red marks a negative weight, and blue marks a positive weight.

Take the character 0 as an example. The red areas mean that if the image being recognized has handwriting at those positions, it is less likely to be a 0. The blue areas mean that handwriting at those positions makes the image more likely to be a 0. All 28x28=784 values are weighed this way, and the final sum indicates how likely the image is to be the character 0. This is the basic principle of image recognition in this program.
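To make the idea concrete, here is a minimal sketch in plain Python of how one column of W can be turned back into a 28x28 picture and coloured by sign. The names column_to_image and shade, and the made-up values in W, are assumptions for illustration only; the real W would come from the trained model of the previous article.

```python
def column_to_image(weights, digit, side=28):
    """Return column `digit` of a (side*side) x 10 weight matrix as a side x side grid."""
    column = [row[digit] for row in weights]
    return [column[r * side:(r + 1) * side] for r in range(side)]


def shade(value):
    """Map a weight to the colour convention used in the text."""
    if value < 0:
        return "red"    # handwriting here argues against the digit
    if value > 0:
        return "blue"   # handwriting here argues for the digit
    return "white"      # this position has no influence


# Hypothetical stand-in for the trained W: 784 rows, 10 columns.
W = [[(i % 3 - 1) * 0.01 * (d + 1) for d in range(10)] for i in range(784)]

image0 = column_to_image(W, 0)
colours0 = [[shade(v) for v in row] for row in image0]
print(colours0[0][:3])
```

With a real weight matrix, the same grid could be handed to any plotting tool to reproduce the red and blue picture.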

We show this weight image because, although the algorithm is simple, it makes the mathematical meaning of “machine learning” very clear. The neural network algorithm that follows, and the other algorithms after it, are too complex for their weights to be shown in such an intuitive way.

Neural networks

The official MNIST tutorial skips plain neural networks entirely, because technology has moved on and there is a better algorithm for image recognition, the “convolutional neural network”, which we will cover in the next article.

Still, I think the concept of a “neural network” cannot be skipped, or many later concepts could not be explained or understood. Science is always like this: even when there is no headline breakthrough, the ordinary groundwork cannot be left out, otherwise everything built on it is a castle in the air.

The biological “neural network” is a product of natural selection. The human brain is made up of a vast number of neurons, said to be close to 90 billion, an astronomical figure. The signals passed and relayed through these networks support all the intelligence and behavior of modern humans.



In the era when artificial intelligence lacked sufficient modern theory, it was natural to imitate the way the brain's neural network works and build an “artificial neural network” to do machine learning. The results in practice were exciting too, so in the not very long history of AI, artificial neural networks have dominated for quite a long time. For many non-specialists, “neural network” has become the iconic concept of AI.



Mimicking the basic way neurons in the human brain work, the figure below shows how a basic unit of an “artificial neural network” works:



Each computing node like this takes an n-dimensional input, performs a linear calculation like the one in the previous source code example, and combines the result into a single output, which is then connected to the next computing node. Many such nodes that together complete one set of calculations form a “layer”. The output of one layer becomes the input of the next, and several layers stacked together complete the whole machine learning process.
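The node and layer just described can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names neuron, layer and relu and all the numbers are made up, and a real framework would do the same work with matrix operations.

```python
def relu(z):
    """A simple activation: pass positive values through, clip the rest to 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """One computing node: weighted sum of the n inputs plus a bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


def layer(inputs, weight_rows, biases, activation):
    """A layer is a group of nodes that all read the same inputs."""
    return [neuron(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


x = [1.0, 2.0]                                          # a 2-dimensional input
hidden = layer(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.5], relu)
output = layer(hidden, [[2.0, 1.0]], [0.1], relu)       # fed by the layer before it
print(hidden, output)
```

The second call takes hidden as its input, which is all that “the output of one layer becomes the input of the next” means.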

In this multi-layer calculation, the first layer receives all the raw data and is called the “input layer”. The last layer produces the result and is called the “output layer”. The layers in between take the results of the previous layer and compute the input for the next one; they are not visible to the user and are called “hidden layers”. You will meet these terms often in the literature, so it is worth knowing what they mean.



The figure above shows various variations and combinations of neural network structure. This “bionic” way of combining nodes has achieved surprising results. The computational structure is mathematically clear, but the mechanism by which many nodes combine internally is not made very clear in any paper. One way to understand it: adding computing nodes lets the network capture and preserve more of the subtle relationships between each piece of data and the others, including relationships that only appear after several layers of interaction, and so compute the result more accurately.

Backpropagation

Since the neural network algorithm comes built in, ordinary users need not care much about its internal mathematics. Only one point is worth mentioning here.

In the linear regression example we solved the equation with gradient descent: after every calculation, the behavior of the cost function tells us the direction of the next step.

In a multilayer neural network, applying this directly does not work, because many hidden layers sit between the initial input and the final result.

So artificial neural networks are trained mainly by “backpropagation”. The main idea is that once the last layer has produced its result, the cost function is used to adjust that layer's weights W and bias b, and the error signal is passed backward to the layer before it, which adjusts its own W and b in turn. This continues layer by layer, all the way back to the input layer.
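For readers who want to see the mechanism once, here is a toy sketch of backpropagation on the smallest possible network: one hidden node and one output node, each with a single weight and bias. Everything in it (the squared-error cost, the learning rate 0.1, the starting values) is an assumption chosen for illustration, not what TensorFlow does internally.

```python
def forward(params, x):
    w1, b1, w2, b2 = params
    z = w1 * x + b1          # hidden node, linear part
    h = max(0.0, z)          # hidden node, ReLU activation
    y = w2 * h + b2          # output node
    return z, h, y


def loss(params, x, target):
    _, _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backward(params, x, target):
    """Gradients of the loss, computed from the output layer back to the input."""
    w1, b1, w2, b2 = params
    z, h, y = forward(params, x)
    dy = y - target                  # error signal at the output layer
    dw2, db2 = dy * h, dy            # adjust the last layer's W and b
    dh = dy * w2                     # signal handed back to the hidden layer
    dz = dh if z > 0 else 0.0        # ReLU passes it on only where it was active
    dw1, db1 = dz * x, dz            # adjust the hidden layer's W and b
    return [dw1, db1, dw2, db2]


params = [0.5, 0.1, -0.3, 0.2]       # made-up starting values
x, target = 2.0, 1.0
initial_loss = loss(params, x, target)
for _ in range(200):
    grads = backward(params, x, target)
    params = [p - 0.1 * g for p, g in zip(params, grads)]
final_loss = loss(params, x, target)
print(initial_loss, final_loss)
```

The order of the lines in backward is the point: the output layer's error is computed first, and each earlier layer reuses the signal handed back by the layer after it.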

Activation function

In the earlier diagram of a single neural node, you may have noticed that besides the linear formula we met in the previous example, there is a “Threshold unit” at the end. In the real world, our brains do not throw every neuron at every task they handle.

Yet according to the “artificial neural network” diagram above, all the nodes, although divided into layers, are in fact fully connected.

Full connection means that for any input, every unit takes part in the calculation, which is obviously unreasonable.

The threshold unit at the end of each node decides, for a given task, whether this node takes part in the final calculation and in what way. In machine learning this step is called the “activation function”.

There are many commonly used activation functions. One is the sigmoid function mentioned earlier, which came up because it can be used for 0/1 classification. It maps any input to a value between 0 and 1 and outputs exactly 0.5 when the input is 0; for classification, an output below 0.5 is read as 0 and an output above 0.5 as 1.

There is also the tanh activation function, which has a similar S shape but maps its input to a value between -1 and 1: negative inputs give negative outputs and positive inputs give positive outputs.

Finally, the activation function we use this time is ReLU. If its input is less than 0 it outputs 0, and if its input is greater than 0 it outputs the input unchanged.
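A quick way to get a feel for these three functions is to evaluate them at a few points. The sketch below uses only Python's math module; the helper names sigmoid and relu are my own, and TensorFlow offers its own versions such as tf.nn.relu.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Output 0 for negative input, the input itself otherwise."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print("z=%5.1f  sigmoid=%.4f  tanh=%+.4f  relu=%.1f"
          % (z, sigmoid(z), math.tanh(z), relu(z)))
```

Note that sigmoid and tanh are smooth curves rather than hard switches. A hard 0/1 decision only appears when a threshold such as 0.5 is applied to the sigmoid output.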

These mathematical properties determine how each neuron unit takes part in the overall calculation. Which one to choose depends on the problem being solved. And if the problem is complicated and you cannot work out which to use? With tools and frameworks this easy to use and so little code, why not simply try them all?

Neural network image recognition source code

#!/usr/bin/env python
# -*- coding=UTF-8 -*-

import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()

# Initializing W/b with non-zero values helps keep the algorithm from getting
# stuck in a local optimum: it breaks symmetry and avoids zero gradients and
# neuron nodes that always output 0.
# The two initializers are separate functions because the multilayer network
# calls them several times.
def weight_variable(shape):
    # Fill the weight matrix with values drawn from a truncated normal distribution.
    # The optional parameters mean and stddev set the mean and standard deviation.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Fill the bias vector with the constant 0.1.
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# Define placeholders, which act as the run-time parameters of the TensorFlow graph.
# x is the input image matrix and y_ is the given label (this is supervised learning).
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])

# Define the input layer: 784 inputs, 1024 outputs.
# The number of outputs is our own choice; it must match the input size of the second layer.
W1 = weight_variable([784, 1024])
b1 = bias_variable([1024])
# Apply the ReLU activation function; the formula inside is the same as in the previous example.
h1 = tf.nn.relu(tf.matmul(x, W1) + b1)

# Define the second layer (hidden layer): 1024 inputs, 512 outputs.
W2 = weight_variable([1024, 512])
b2 = bias_variable([512])
h2 = tf.nn.relu(tf.matmul(h1, W2) + b2)

# Define the third layer (output layer): 512 inputs, 10 outputs.
# 10 is also the number of classes we want.
W3 = weight_variable([512, 10])
b3 = bias_variable([10])
# The last layer uses softmax for classification (softmax is also an activation function).
y3 = tf.nn.softmax(tf.matmul(h2, W3) + b3)

# Cross-entropy cost function.
cross_entropy = -tf.reduce_sum(y_ * tf.log(y3))
# Here we use the more sophisticated Adam optimizer for the gradient descent step.
# The previous example used GradientDescentOptimizer.
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Compute the accuracy to evaluate the result.
correct_prediction = tf.equal(tf.argmax(y3, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# Initialize all TensorFlow variables.
sess.run(tf.global_variables_initializer())

# Train for 20000 steps.
for i in range(20000):
    # Each batch holds 50 samples.
    batch = mnist.train.next_batch(50)
    # Every 100 steps, compute the accuracy and print an intermediate result.
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1]})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    # Train on the batch.
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})

# Training is done; report the final evaluation on the test set.
print("test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels}))

This program uses a 3-layer neural network. After training on 20000*50 samples, the final accuracy reaches above 96%, a significant improvement over the previous example.

In fact, both the previous example and this one end with the tf.nn.softmax() function. Notice the “nn” in the name: it is short for “Neural Networks”. In other words, it is not only this example that uses a neural network algorithm; the previous example did too.

Before frameworks like TensorFlow, this would have been plain to see in the algorithm's source code. Now it is easy to overlook.

So the previous example actually used a neural network with only one layer. Once the mathematical formula is simplified, it is an ordinary linear algorithm followed by the nonlinear softmax classification.
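Written out by hand, a one-layer network of this kind is just a linear score per class followed by softmax. The sketch below is plain Python with toy sizes (3 “pixels” and 2 classes instead of 784 and 10); the names softmax and single_layer and all the values are assumptions for illustration.

```python
import math


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)                      # subtract the max so exp() cannot overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def single_layer(pixels, W, b):
    """Linear part (pixels * W + b) followed by the nonlinear softmax."""
    scores = [sum(p * row[k] for p, row in zip(pixels, W)) + b[k]
              for k in range(len(b))]
    return softmax(scores)


W = [[1.0, -1.0],      # one row per pixel, one column per class
     [0.0, 0.5],
     [-1.0, 1.0]]
b = [0.0, 0.0]
probs = single_layer([1.0, 0.0, 0.0], W, b)
print(probs)
```

The class with the largest linear score always receives the largest probability, so softmax changes how the answer is expressed, not which class wins.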

This example is undoubtedly a classic neural network algorithm. Its 3 layers are: 784 inputs -> (input layer) 1024 nodes -> (hidden layer) 512 nodes -> (output layer) 10 output nodes.
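It can help to count how many trainable numbers such a structure contains. The small helper below, count_parameters, is my own illustration rather than part of TensorFlow; it assumes fully connected layers with one bias per node, as in the program above.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes lists the number of inputs followed by the node count of each layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out    # weight matrix plus bias vector
    return total


three_layer_params = count_parameters([784, 1024, 512, 10])
single_layer_params = count_parameters([784, 10])
print(three_layer_params, single_layer_params)
```

The three-layer network has 1,333,770 trainable values against 7,850 for the single-layer model of the previous article, which gives a sense of where the extra computing cost comes from.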

How are the layers of a neural network connected? It is simple. As the program shows, the formula line of each layer uses the output variable of the previous layer in its calculation, and that is what links the layers. TensorFlow automatically adds the nodes of this layer after the previous layer in the computation graph.

To get better recognition results, we also used the AdamOptimizer for the gradient descent step. TensorFlow has several optimization algorithms built in; for their mathematical implementation, see the reference links at the bottom.

So when designing a neural network, how many layers should be used, and how many nodes per layer?

There is no unified standard for this at present. Generally speaking, more layers and more nodes give a better recognition rate, but the model also runs more slowly and carries a greater risk of “overfitting”, which we will discuss later.

A slightly higher recognition rate can require many more computing nodes, which is not necessarily cost-effective.

Unlike adding a whole layer, which increases resource consumption sharply, increasing the number of nodes within a layer is usually the more cost-effective approach. The details still depend on experimental testing and careful evaluation.

Because of their growing depth, multilayer neural networks are also called “deep neural networks” (Deep Neural Networks / DNN). This abbreviation often appears together with CNN (convolutional neural network) and RNN (recurrent neural network).

A short note

While writing this series I have been searching the Internet for reference material, and also looking for ready-made pictures to help explain the concepts. In the process I found a great many errors in articles about machine learning, enough to make me break out in a sweat.

On the one hand, this reminds me to proofread and clarify the concepts again and again, so that this series avoids similar basic errors. Of course my own level is limited, and there will inevitably be mistakes I have not found or things I have simply misunderstood. Corrections from experts in any field are welcome and will help me keep improving.

On the other hand, my overall impression is that, perhaps because of a “Great Leap Forward” style of development, and because basic research at home started late and has progressed slowly, many translations and “tutorials” are where conceptual errors are concentrated.

Because I am writing mainly for readers around me and in China, I originally hoped to cite Chinese-language material wherever possible. Today I have given up that idea completely. If Chinese material of the same quality exists, so much the better; if not, I will cite foreign material. After all, the gap is not only one of level but also one of rigor.

I think this is something the domestic technical community should pay attention to. Level is one aspect; attitude is the more important one. I write it down here today in the hope of sharing it with everyone.

One more note on the structure of this series: the articles differ a lot in length. This is mainly to keep related knowledge together. In the fourth article, for example, if the concepts had been introduced piecemeal, reading the source code would have been much harder, so it had to be long. When reading, feel free to pick and choose and set your own pace.

(To be continued...)

Citations and references

TensorFlow Chinese community

TensorFlow: build your own neural network (Morvan Python tutorial video)

Overview of Artificial Neural Networks and its Applications

Activation function based on neural network and corresponding mathematical introduction

An overview of gradient descent optimization algorithms
