# print("Hello, NumPy!")

2020-11-08 08:32:33


Learning is painful: learn something today, and it is gone by tomorrow. In weather like this, sleeping would be the most comfortable option.

But complain and grumble all you like, you still have to study.

While studying I have always had the habit of taking notes, but their quality leaves much to be desired: most of them were never organized. Even so, they are a good thing to have for review.

Since teaching myself Python, I have mostly written crawlers: web novels, virtual game currency, exam question banks, and so on, and I have also helped others scrape public data from many websites. I collated one crawler article before (being lazy, it is just one article; I will organize the rest when I get the chance): Web crawler page style analysis.

After that, I developed a certain interest in social engineering. You probably need to be eloquent, even glib-tongued, reaching "con artist" level, before you can really practice it. I previously wrote a web-penetration article: Taoye infiltrates the headquarters of a shady platform, and the truth behind it is terrifying to think about.

It still bears repeating: when facing people or information you do not trust, treat it as penetration-testing practice only if you know something about network security; otherwise, the best response is to ignore it. Do not let curiosity become the beginning of the abyss, especially in this murky virtual network world. Just a few days ago, the police cracked the country's largest video-chat ("luo liao") extortion ring, with more than 100,000 victims and an amount involved of XXXXXXXXXXXXXXX, so we still need to stay alert.

In any case, there is a great deal to learn and it is very scattered; I am not good at studying, and my notes are rarely reviewed. A workman who wants to do his job well must first sharpen his tools, and since I am about to start learning machine learning systematically, I want to reorganize my earlier notes on the "three musketeers", NumPy, Pandas, and Matplotlib, which also serves as review.

Later on I can learn some machine-learning algorithms, mainly following Machine Learning in Action and Zhou Zhihua's Machine Learning (the "watermelon book"), together with technical articles written by experienced people in the field. If you know an algorithm well, try implementing it by hand; if you cannot, it means you still need to improve.

That is a lot of flags to plant, and I can already feel the face-slapping coming. No matter, take it slowly; I am not afraid of a slap in the face, my skin is thick anyway (￣_,￣ )

This article starts by organizing NumPy. It may not be comprehensive; only the common pieces are recorded, and the rest will be added later as they come up. The content below mainly draws on the Runoob tutorial and the NumPy official documentation:

As for installing NumPy, I already covered it when introducing the deep-learning environment setup. I recommend installing Anaconda, which bundles a large number of third-party modules so that you do not have to `pip install ...` everything by hand, somewhat like Maven in the Java world. For Anaconda, see: Building a deep-learning environment based on Ubuntu + Python + TensorFlow + Jupyter Notebook.

If you have not installed Anaconda, that is fine too; just run the following command in a Python environment to install NumPy:

> pip3 install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple


The NumPy version used below is 1.18.1:

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.18.1'


In NumPy, the object being operated on is mostly of type `ndarray`, also called an array, which we can think of as a matrix or a vector.

There are many ways to create an `np.ndarray` object, and NumPy offers many APIs for it. For example, we can create an `ndarray` with specified contents:

In [7]: temp_array = np.array([[1, 2, 3], [4, 5, 6]], dtype = np.int32)

In [8]: temp_array
Out[8]:
array([[1, 2, 3],
       [4, 5, 6]])

In [9]: type(temp_array)
Out[9]: numpy.ndarray       # the output type is ndarray


Of course, you can also call `arange` and then perform a `reshape` operation to change the shape, converting a vector into a 2x3 matrix; the object type is still `numpy.ndarray`:

In [14]: temp_array = np.arange(1, 7).reshape(2, 3)     # arange generates a vector; reshape changes its shape into a matrix

In [15]: temp_array
Out[15]:
array([[1, 2, 3],
       [4, 5, 6]])

In [16]: type(temp_array)       # the output type is still ndarray
Out[16]: numpy.ndarray


From the above we can see that no matter how the object is created (other methods are introduced later), what NumPy gives us is an `ndarray`, and objects of this type mainly have the following attributes:

• ndarray.ndim: the number of axes of the ndarray, which can also be understood as its dimensionality, or as the depth of bracket nesting. For example, the ndim of [1, 2, 3] is 1, the ndim of [[1], [2], [3]] is 2, and the ndim of [[[1]], [[2]], [[3]]] is 3 (count the layers of outer brackets)
• ndarray.shape: the shape of the ndarray, returned as a tuple. An ndarray with n rows and m columns outputs (n, m). For example, [[1], [2], [3]] gives (3, 1), [[[1]], [[2]], [[3]]] gives (3, 1, 1), and [[[1, 2]], [[3, 4]]] gives (2, 1, 2). As these three examples show, shape is read from the outermost level inward
• ndarray.size: easy to understand, the total number of elements in the ndarray, i.e. the product of the entries of shape
• ndarray.dtype: the data type of the elements, commonly numpy.int32, numpy.int64, numpy.float32, numpy.float64, etc.

These are some of the common ndarray attributes. Note: only some of them, not all; for other attributes, please refer to the official documentation.

We can observe the ndarray attributes, and how to modify them, as follows:
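The screenshot that originally demonstrated this is missing, so here is a minimal sketch of inspecting and modifying these attributes (the array values are only illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# Observe the common attributes
print(a.ndim)     # 2: number of axes
print(a.shape)    # (2, 3)
print(a.size)     # 6: the product of the shape entries
print(a.dtype)    # int32

# astype returns a new array with a different dtype,
# and np.expand_dims adds an axis (both are introduced later)
b = a.astype(np.float64)
c = np.expand_dims(a, axis=0)
print(b.dtype)    # float64
print(c.shape)    # (1, 2, 3)
```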

The np.expand_dims and astype calls in the example above will be introduced later.

np.zeros creates an ndarray whose elements are all 0, and np.ones creates one whose elements are all 1. You can specify the shape of the ndarray, and the dtype argument sets the data type of the elements:

In [70]: np.zeros([2,3,2], dtype=np.float32)
Out[70]:
array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]], dtype=float32)

In [71]: np.ones([3,2,2], dtype=np.float32)
Out[71]:
array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]]], dtype=float32)


In addition, in TensorFlow, tf.fill generates a tensor of the specified shape filled with the given element, such as the following 2x3 tensor whose elements are all 100:

In [76]: import tensorflow as tf

In [77]: tf.fill([2,3], 100)
Out[77]:
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[100, 100, 100],
       [100, 100, 100]])>


NumPy also has a fill interface, except that you can only call fill on an existing ndarray; it cannot be called directly as np.fill:

In [79]: data = np.zeros([2, 3])

In [80]: data.fill(100)

In [81]: data
Out[81]:
array([[100., 100., 100.],
       [100., 100., 100.]])


np.arange works much like the usual range: it produces an ndarray over a contiguous interval, closed on the left and open on the right, and the result is an arithmetic sequence whose step you can set yourself (it may be a decimal), as follows:

In [85]: np.arange(10)
Out[85]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [86]: np.arange(3, 10, 2)
Out[86]: array([3, 5, 7, 9])

In [87]: np.arange(3, 10, 0.7)
Out[87]: array([3. , 3.7, 4.4, 5.1, 5.8, 6.5, 7.2, 7.9, 8.6, 9.3])


The numpy.linspace function creates a one-dimensional array consisting of an arithmetic sequence; you can specify the number of elements and whether the stop value is included. The following creates an arithmetic sequence with 10 elements over the interval 1 to 5:

In [89]: np.linspace(1, 5, 10)      # includes stop by default
Out[89]:
array([1.        , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
       3.22222222, 3.66666667, 4.11111111, 4.55555556, 5.        ])

In [90]: np.linspace(1, 5, 10, endpoint = False)    # endpoint=False excludes stop
Out[90]: array([1. , 1.4, 1.8, 2.2, 2.6, 3. , 3.4, 3.8, 4.2, 4.6])


np.random.random and np.random.rand generate an ndarray of the corresponding shape with values drawn at random from [0, 1):

In [4]: np.random.random([3, 2])
Out[4]:
array([[0.68755531, 0.56727707],
       [0.86027161, 0.01362836],
       [0.56557302, 0.94283249]])

In [5]: np.random.rand(2, 3)
Out[5]:
array([[0.19894754, 0.8568503 , 0.35165264],
       [0.75464769, 0.29596171, 0.88393648]])


np.random.randint randomly generates an ndarray over a specified range, with elements of int type:

In [6]: np.random.randint(0, 10, [2, 3])
Out[6]:
array([[0, 6, 9],
       [5, 9, 1]])


np.random.randn returns an ndarray drawn from the standard normal distribution (mean 0, variance 1):

In [7]: np.random.randn(2,3)
Out[7]:
array([[ 2.46765106, -1.50832149,  0.62060066],
       [-1.04513254, -0.79800882,  1.98508459]])


In addition, when we use random in NumPy we get a fresh random set of data each time; if you want to generate the same data every time, you have to set the seed via np.random.seed:

In [33]: np.random.seed(100)

In [34]: np.random.randn(2, 3)
Out[34]:
array([[-1.74976547,  0.3426804 ,  1.1530358 ],
       [-0.25243604,  0.98132079,  0.51421884]])

In [35]: np.random.seed(100)

In [36]: np.random.randn(2, 3)
Out[36]:
array([[-1.74976547,  0.3426804 ,  1.1530358 ],
       [-0.25243604,  0.98132079,  0.51421884]])


A one-dimensional ndarray in NumPy behaves much like a list: it can be sliced and iterated over:

In [5]: a
Out[5]: array([1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [6]: a[2], a[2:5]
Out[6]: (3.0, array([3., 4., 5.]))

In [7]: a * 3, a ** 3   # element-wise triple and cube
Out[7]:
(array([ 3.,  6.,  9., 12., 15., 18., 21., 24., 27.]),
 array([  1.,   8.,  27.,  64., 125., 216., 343., 512., 729.]))

In [13]: for i in a:
...:     print(i, end=", ")
1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0,


If our ndarray is not a one-dimensional array but a two-dimensional matrix, or of even higher dimension, then we slice it along multiple dimensions. And when we iterate over a high-dimensional ndarray, each item has one dimension fewer than the original: iterating over a 2-D matrix yields 1-D vectors, and iterating over a 3-D array yields 2-D matrices.

One more thing worth mentioning: when the data has quite a few dimensions, NumPy provides ... (Ellipsis) to make indexing convenient. A concrete example:

In [14]: a = np.random.randint(0, 2, 6).reshape(2, 3)

In [15]: a
Out[15]:
array([[1, 1, 1],
       [1, 0, 1]])

In [16]: a[:, :2]
Out[16]:
array([[1, 1],
       [1, 0]])

In [17]: for i in a:        # iterating over the matrix outputs row vectors; each item has one dimension fewer than the original
...:     print(i, end=", ")
[1 1 1], [1 0 1],

In [19]: data = np.random.randint(0, 2, [2, 2, 3])

In [20]: data
Out[20]:
array([[[0, 0, 1],
        [0, 1, 1]],

       [[1, 0, 0],
        [0, 1, 0]]])

In [21]: data[..., :2]      # ... stands in for the leading dimensions; equivalent to data[:, :, :2]
Out[21]:
array([[[0, 0],
        [0, 1]],

       [[1, 0],
        [0, 1]]])


Shape operations:

• a.ravel(), a.flatten(): flatten the ndarray (straighten it into vector form)
• a.reshape(): change the shape of a
• a.T, a.transpose(): return the transpose of a

All of the operations above return new results without changing the original ndarray a. They operate row by row by default; if you need column order, control the order parameter (see the Runoob tutorial for details). Besides reshape there is also resize, except that resize changes a itself instead of producing a new result:
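The screenshots for these operations are missing here, so the following is a minimal sketch (array values are only illustrative):

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)   # a 2x3 matrix

flat = a.ravel()      # flatten into a vector; a itself is unchanged
flat2 = a.flatten()   # same result, but always returns a copy
t = a.T               # transpose: shape (3, 2)
print(flat)           # [1 2 3 4 5 6]
print(t.shape)        # (3, 2)
print(a.shape)        # still (2, 3)

# resize, unlike reshape, changes the array in place
b = np.arange(6)
b.resize(2, 3)
print(b.shape)        # (2, 3)
```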

Another small trick worth mastering: when doing a reshape, passing -1 for a dimension makes the corresponding size be computed automatically. For example, for a 2x3 matrix a, if we call a.reshape(3, -1), the -1 here stands for 2. When there is a lot of data, this is very convenient:

In [44]: a
Out[44]:
array([[1, 1, 1],
       [1, 0, 1]])

In [45]: a.reshape([3, -1])
Out[45]:
array([[1, 1],
       [1, 1],
       [0, 1]])


Modifying array dimensions:

• broadcast_to: broadcast an array to a new shape
• expand_dims: expand the shape of an array
• squeeze: remove single-dimensional entries from the shape of an array

The specific operations are as follows:
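The original demonstration is missing, so here is a minimal sketch of the three functions (array values are only illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3]])            # shape (1, 3)

b = np.broadcast_to(a, (4, 3))       # broadcast to a new shape
e = np.expand_dims(a, axis=0)        # add an axis: shape (1, 1, 3)
s = np.squeeze(e)                    # remove all length-1 axes: shape (3,)
print(b.shape, e.shape, s.shape)     # (4, 3) (1, 1, 3) (3,)
```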

Joining arrays:

• concatenate: join a sequence of arrays along an existing axis
• hstack: stack arrays in sequence horizontally (column-wise)
• vstack: stack arrays in sequence vertically (row-wise)

The following code shows array joining, where concatenate determines the direction of the join via the axis argument, with the same effect as hstack and vstack. One more thing to note: the example simply joins two arrays, but in fact you can join more, e.g. np.concatenate((x, y, z), axis=1).
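A minimal sketch of joining (the arrays are only illustrative):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

v = np.concatenate((x, y), axis=0)   # same effect as np.vstack((x, y)): shape (4, 2)
h = np.concatenate((x, y), axis=1)   # same effect as np.hstack((x, y)): shape (2, 4)
print(v.shape, h.shape)              # (4, 2) (2, 4)
```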

Splitting arrays:

• split: split an array into multiple sub-arrays
• hsplit: split an array horizontally (column-wise) into multiple sub-arrays
• vsplit: split an array vertically (row-wise) into multiple sub-arrays

Just as with joining, split can achieve the same effect as hsplit and vsplit by controlling the axis argument, so only a split example is given here; for hsplit and vsplit, refer to the official documentation:
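A minimal split sketch (array values are only illustrative):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

cols = np.split(a, 2, axis=1)     # same effect as np.hsplit(a, 2): two 3x2 sub-arrays
rows = np.split(a, 3, axis=0)     # same effect as np.vsplit(a, 3): three 1x4 sub-arrays
print(len(cols), cols[0].shape)   # 2 (3, 2)
print(len(rows), rows[0].shape)   # 3 (1, 4)
```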

Adding and deleting array elements:

• append: append values to the end of an array
• insert: insert values along the given axis before the given index
• delete: delete a sub-array along an axis and return the new array after deletion
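A minimal sketch of the three (note that all of them return new arrays rather than modifying a in place):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# append: without axis the result is flattened
print(np.append(a, [7, 8, 9]))             # [1 2 3 4 5 6 7 8 9]
print(np.append(a, [[7, 8, 9]], axis=0))   # a with a third row appended

# insert: insert [0, 0, 0] as a new row before index 1
print(np.insert(a, 1, [0, 0, 0], axis=0))

# delete: remove column 1 and return the new array
print(np.delete(a, 1, axis=1))             # [[1 3], [4 6]]
```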

Broadcasting is NumPy's way of performing numerical computation on arrays of different shapes; arithmetic operations on arrays are usually carried out element-wise.

If two arrays a and b have the same shape, i.e. a.shape == b.shape, then a * b multiplies a and b element-wise. This requires the same number of dimensions and the same length along each dimension.

In [8]: import numpy as np

In [9]: a = np.array([1,2,3,4])
...: b = np.array([10,20,30,40])

In [10]: a * b
Out[10]: array([ 10,  40,  90, 160])


When the two arrays in an operation have different shapes, NumPy automatically triggers the broadcasting mechanism. For example:

In [11]: a = np.array([[ 0, 0, 0],
...:            [10,10,10],
...:            [20,20,20],
...:            [30,30,30]])
...: b = np.array([1,2,3])

In [12]: a + b
Out[12]:
array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])


In the example above, the 1x3 array b is broadcast across each row of a so that the two become compatible. (The original post includes an image illustrating this.)

np.tile can also replicate the target array, for example turning a 1x3 array into 4x6 with the operation below. Note the difference from broadcast_to above: broadcast_to must expand to a broadcast-compatible shape, whereas tile repeats along existing dimensions and does not have to expand the dimensionality; choose according to your actual needs.

Readers with TensorFlow experience will know that it also has tile and broadcast operations; when the data volume is large, tile is said to be less efficient than broadcast. I do not know the reason yet, and will look into it when it becomes relevant.

In [20]: a
Out[20]: array([[1, 1, 0]])

In [21]: np.tile(a, [4, 2])     # the second argument gives the repetition count per dimension: here 4x along rows and 2x along columns
Out[21]:
array([[1, 1, 0, 1, 1, 0],
       [1, 1, 0, 1, 1, 0],
       [1, 1, 0, 1, 1, 0],
       [1, 1, 0, 1, 1, 0]])


On copies and views in NumPy: this part is also something I missed when first learning NumPy, so I will take this opportunity to record it here.

• No copy at all

On this point, it was actually touched on earlier in LeetCode HOT 100 (01, Two Sum); it deserves extra attention.

• View, or shallow copy (view)

Using the same code as above but changing line 57 from y = x to y = x.view(), you will find that x and y now have different id values, that is, different memory addresses; after we modify the shape of x, the shape of y does not change.

However, when what we change is not the shape but the data inside one of the arrays, the other array changes as well.

• Copy, or deep copy (copy)

A view or shallow copy uses view, while a copy or deep copy uses copy. With copy, modifying one array's shape or its internal elements does not change the other array.

(The copy code is not demonstrated here; readers can try it themselves and compare.)

To conclude:

• y = x: x and y have the same memory address; modify one, and the other changes with it (whether the shape or the internal elements)
• y = x.view(): the two have different memory addresses; modifying the shape of one does not change the other, but modifying the internal elements of one changes the other as well
• y = x.copy(): the two have different memory addresses; whether you modify the shape of one or its internal elements, the other never changes; they are independent of each other
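Since the screenshots are missing here, the three cases above can be sketched as follows:

```python
import numpy as np

x = np.arange(6)

# no copy at all: y is the very same object
y = x
assert y is x

# view / shallow copy: a new object that shares the data
v = x.view()
v.shape = (2, 3)
print(x.shape)     # still (6,): changing the view's shape leaves x alone
v[0, 0] = 100
print(x[0])        # 100: the underlying data is shared

# deep copy: independent object and independent data
c = x.copy()
c[0] = -1
print(x[0])        # still 100: modifying the copy leaves x alone
```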

Mathematical functions in NumPy; there is not much to say about this part:

• np.pi: the value of π
• np.sin(): sine
• np.cos(): cosine
• np.tan(): tangent
• np.ceil(): the smallest integer greater than or equal to the given expression, i.e. round up
• np.exp(2): the exponential, i.e. $e^2$

For other related mathematical functions, refer to the official documentation.
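A few quick checks of the functions above:

```python
import numpy as np

print(np.pi)               # 3.141592653589793
print(np.sin(np.pi / 2))   # 1.0
print(np.cos(0))           # 1.0
print(np.ceil(1.2))        # 2.0, rounded up
print(np.exp(2))           # e squared, about 7.389
```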

Arithmetic operations in NumPy; also not much to say here:

• numpy.add(a, b): add two arrays element-wise
• numpy.subtract(a, b): subtract two arrays element-wise
• numpy.multiply(a, b): multiply two arrays element-wise
• numpy.divide(a, b): divide two arrays element-wise
• numpy.reciprocal(a): return the element-wise reciprocal
• numpy.power(a, 4): return a raised to the fourth power
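All of these work element-wise; a minimal sketch:

```python
import numpy as np

a = np.array([2.0, 4.0, 8.0])
b = np.array([1.0, 2.0, 4.0])

print(np.add(a, b))        # [ 3.  6. 12.]
print(np.subtract(a, b))   # [1. 2. 4.]
print(np.multiply(a, b))   # [ 2.  8. 32.]
print(np.divide(a, b))     # [2. 2. 2.]
print(np.reciprocal(b))    # [1.   0.5  0.25]
print(np.power(a, 4))      # [  16.  256. 4096.]
```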

Statistical functions in NumPy; this part is worth a small note:

Functions such as np.amax and np.ptp accept an axis argument: pass it to compute the statistic along the corresponding direction; without axis, the statistic is taken over the whole array. Besides these interfaces, the other common statistical functions work the same way, as follows:

• np.amin(): the minimum
• np.amax(): the maximum
• np.ptp(): the difference between the maximum and the minimum
• np.median(): the median
• np.mean(): the mean
• np.var(): the variance, $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\overline{x})^2$
• np.std(): the standard deviation, $\sigma$
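Since the original example is missing, here is a minimal sketch of the axis behaviour (array values are only illustrative):

```python
import numpy as np

a = np.array([[3, 7, 5],
              [8, 4, 3]])

print(np.amax(a))           # 8: maximum over the whole array
print(np.amax(a, axis=0))   # [8 7 5]: column-wise maxima
print(np.ptp(a, axis=1))    # [4 5]: row-wise max minus min
print(np.mean(a))           # 5.0
print(np.std(a) ** 2)       # equals np.var(a)
```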

Linear algebra in NumPy:

• np.dot(a, b): the matrix product of two matrices
• np.vdot(a, b): the sum of the products of corresponding elements of two matrices
• np.inner(a, b): the inner product, i.e. the sum of the products of each row of a with each row of b

For example, with a = [[1, 0], [1, 1]] and b = [[1, 2], [1, 3]], np.inner(a, b) computes:

[1, 0] · [1, 2] = 1  -> the first entry
[1, 0] · [1, 3] = 1  -> the second entry
[1, 1] · [1, 2] = 3  -> the third entry
[1, 1] · [1, 3] = 4  -> the fourth entry

The matrix product sums the products of the rows of the first matrix with the columns of the second, whereas inner sums the products of the rows of the first matrix with the rows of the second.

• np.matmul(a, b): behaves like np.dot(a, b) for 2-D arrays; both give the matrix product
• np.linalg.det(a): compute the value of the determinant of a matrix
• np.linalg.solve(a, [[1], [1]]): solve a system of linear equations; the first argument is the coefficient matrix, the second the constant terms
• np.linalg.inv(a): compute the inverse of a matrix
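Using the same a and b as in the worked example, a minimal sketch:

```python
import numpy as np

a = np.array([[1.0, 0.0], [1.0, 1.0]])
b = np.array([[1.0, 2.0], [1.0, 3.0]])

print(np.dot(a, b))      # matrix product
print(np.matmul(a, b))   # same result for 2-D arrays
print(np.vdot(a, b))     # 1*1 + 0*2 + 1*1 + 1*3 = 5.0
print(np.inner(a, b))    # [[1. 1.], [3. 4.]]

print(np.linalg.det(a))                     # 1.0
print(np.linalg.inv(a))                     # [[ 1.  0.], [-1.  1.]]
print(np.linalg.solve(a, [[1.0], [1.0]]))   # x with a @ x = [[1], [1]]: [[1.], [0.]]
```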

# Save an array to a file with the .npy extension
numpy.save(file, arr, allow_pickle=True, fix_imports=True)

• file: the file to save to, with extension .npy; if the path does not end in .npy, the extension is appended automatically
• arr: the array to save
• allow_pickle: optional boolean; allow saving object arrays using Python pickles (pickle serializes and deserializes objects when writing to or reading from a disk file)
• fix_imports: optional; makes it easier for Python 2 to read data saved by Python 3
In [81]: a = np.random.randint(1, 10, [3, 4])

In [82]: np.save("a.npy", a)

In [83]: np.load("a.npy")
Out[83]:
array([[2, 7, 3, 1],
       [4, 6, 4, 3],
       [2, 2, 9, 5]])


References:

[1] NumPy Runoob tutorial: https://www.runoob.com/numpy/numpy-tutorial.html
[2] NumPy official documentation: https://numpy.org/doc/stable/user/quickstart.html

Time was tight, so the later parts were written somewhat hastily, but this should not affect normal reading or later review. That is it for now.

Note: only the common pieces are recorded; the rest will be added later as they come up, and other content can be found in the documentation.

I originally meant to work through Machine Learning in Action and hand-implement the code from it, but for practical reasons the next step will probably be hand-implementing SVM. This algorithm is still a headache: it is complicated inside, few materials derive it completely, and it involves many unfamiliar terms, such as optimization under nonlinear constraints, KKT conditions, the Lagrange dual, the maximum margin, the optimal lower bound, kernel functions, and so on. It reads like a book from heaven, more or less, probably, maybe. Fortunately I have studied SVM before, but implementing it will surely still take a lot of energy and require many references, including but not limited to Machine Learning in Action, Machine Learning, and Statistical Learning Methods.

So in the next installment I should start hand-implementing SVM; whether it succeeds in the end is hard to say. It may take a lot of time, and during this period LeetCode HOT 100 still needs its regular attention.

I am Taoye. I love to study and to share, and I am keen on all kinds of technology; in my spare time I like playing chess, listening to music, and chatting about anime. I hope to use this space to record my growth and my life, and to meet more like-minded friends along the way. For more, welcome to my WeChat official account: Cynical Coder.
