Foreword

What this post covers

  • Using MinMaxScaler to normalize feature data
  • Using StandardScaler to standardize feature data

Feature preprocessing

Definition

Feature preprocessing is the process of applying conversion functions to feature data so that it takes a form better suited to the algorithm model.

Feature preprocessing API

sklearn.preprocessing

Why normalize / standardize?

When features differ greatly in unit or magnitude, or when the variance of one feature is several orders of magnitude larger than that of the others, that feature tends to dominate the target result, and some algorithms are then unable to learn from the other features.
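To make the problem concrete, here is a minimal sketch using two rows from the sample data listed further down. It assumes a distance-based method (such as k-nearest neighbours, which is my choice of illustration; the post does not name a specific algorithm) and shows how much of the squared Euclidean distance each raw feature contributes.

```python
import numpy as np

# Two rows from the sample data in this post (columns: milage, Liters, Consumtime).
a = np.array([40920.0, 8.326976, 0.953952])
b = np.array([14488.0, 7.153469, 1.673904])

# Per-feature squared differences; their sum is the squared Euclidean distance.
squared_diff = (a - b) ** 2

# Fraction of the total distance contributed by each feature.
share = squared_diff / squared_diff.sum()

print("squared differences:", squared_diff)
print("share of the distance per feature:", share)
```

On the raw values, the milage column accounts for almost the entire distance, which is the kind of dominance that scaling is meant to remove.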

Normalization

Definition

Transform the original data so that it is mapped into a given interval (by default [0, 1]):

    X' = (x - min) / (max - min)
    X'' = X' * (mx - mi) + mi

The formula is applied to each column separately: max is the column's maximum, min is its minimum, and X'' is the final result. mx and mi are the upper and lower bounds of the target interval; by default mx is 1 and mi is 0.
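The description above can be sketched by hand with NumPy. This is only an illustration of the two-step calculation, not scikit-learn's implementation; the function name min_max_scale is hypothetical, and the sketch assumes no column is constant (otherwise max - min would be zero).

```python
import numpy as np

def min_max_scale(X, mi=0.0, mx=1.0):
    """Scale each column of X into [mi, mx] (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    # Step 1: map each column to [0, 1]. Assumes col_max != col_min.
    x_unit = (X - col_min) / (col_max - col_min)
    # Step 2: stretch and shift into the requested interval.
    return x_unit * (mx - mi) + mi

# The three feature columns of the sample data used in this post.
X = np.array([
    [40920, 8.326976, 0.953952],
    [14488, 7.153469, 1.673904],
    [26052, 1.441871, 0.805124],
    [75136, 13.147394, 0.428964],
    [38344, 1.669788, 0.134296],
])

scaled_default = min_max_scale(X)               # interval [0, 1]
scaled_custom = min_max_scale(X, mi=2, mx=3)    # interval [2, 3]
print(scaled_default)
print(scaled_custom)
```

After scaling, the smallest value in every column sits at mi and the largest at mx, whatever the column's original unit was.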

API

  • sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1), …)

    • MinMaxScaler.fit_transform(X)

      • X: data in NumPy array format, shape [n_samples, n_features]
    • Return value: the transformed array, with the same shape as X

Data

milage,Liters,Consumtime,target
40920,8.326976,0.953952,3
14488,7.153469,1.673904,2
26052,1.441871,0.805124,1
75136,13.147394,0.428964,1
38344,1.669788,0.134296,1

Code

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

def minmax_demo():
    # Load the sample data (columns: milage, Liters, Consumtime, target)
    data = pd.read_csv("dating.txt")
    print(data)
    # 1. Instantiate the transformer, scaling into the interval [2, 3]
    transfer = MinMaxScaler(feature_range=(2, 3))
    # 2. Call fit_transform on the three feature columns only
    data = transfer.fit_transform(data[['milage', 'Liters', 'Consumtime']])
    print("Result of min-max normalization:\n", data)
    return None

Result
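For readers who do not have dating.txt at hand, the following is a self-contained sketch that runs the same transformation on only the five sample rows listed in the data section. Inlining the rows through io.StringIO is my assumption for convenience; the full file presumably contains more rows, so its scaled values will differ.

```python
import io

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# The five sample rows from the data section, inlined so no file is needed.
csv_text = """milage,Liters,Consumtime,target
40920,8.326976,0.953952,3
14488,7.153469,1.673904,2
26052,1.441871,0.805124,1
75136,13.147394,0.428964,1
38344,1.669788,0.134296,1
"""

data = pd.read_csv(io.StringIO(csv_text))
features = data[["milage", "Liters", "Consumtime"]]

# Same settings as the demo: scale every feature column into [2, 3].
transfer = MinMaxScaler(feature_range=(2, 3))
scaled = transfer.fit_transform(features)

print(scaled)
print("column minimums:", scaled.min(axis=0))
print("column maximums:", scaled.max(axis=0))
```

The target column is left out on purpose: it is the label, not a feature, so it should not be rescaled.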

Standardization

Definition

Transform the original data so that each column has a mean of 0 and a standard deviation of 1:

    X' = (x - mean) / σ

The formula is applied to each column separately: mean is the column's average and σ is its standard deviation.
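The description above can also be sketched by hand with NumPy. The function name standardize is hypothetical, and the sketch assumes no column is constant (a standard deviation of zero would cause a division by zero). It uses the population standard deviation (ddof=0), which is the convention StandardScaler follows.

```python
import numpy as np

def standardize(X):
    """Give each column of X mean 0 and standard deviation 1 (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Population standard deviation (NumPy's default, ddof=0).
    sigma = X.std(axis=0)
    return (X - mean) / sigma

# The three feature columns of the sample data used in this post.
X = np.array([
    [40920, 8.326976, 0.953952],
    [14488, 7.153469, 1.673904],
    [26052, 1.441871, 0.805124],
    [75136, 13.147394, 0.428964],
    [38344, 1.669788, 0.134296],
])

Z = standardize(X)
print(Z)
print("column means:", Z.mean(axis=0))
print("column standard deviations:", Z.std(axis=0))
```

Unlike min-max scaling, the result is not confined to a fixed interval; values simply express how many standard deviations each sample lies from its column mean.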

API

  • sklearn.preprocessing.StandardScaler()

    • After processing, the data in each column is centered on a mean of 0 with a standard deviation of 1
    • StandardScaler.fit_transform(X)
      • X: data in NumPy array format, shape [n_samples, n_features]
    • Return value: the transformed array, with the same shape as X

Data

The same data as used in the normalization section above.

Code

import pandas as pd
from sklearn.preprocessing import StandardScaler

def stand_demo():
    # Load the sample data (columns: milage, Liters, Consumtime, target)
    data = pd.read_csv("dating.txt")
    print(data)
    # 1. Instantiate the transformer
    transfer = StandardScaler()
    # 2. Call fit_transform on the three feature columns only
    data = transfer.fit_transform(data[['milage', 'Liters', 'Consumtime']])
    print("Result of standardization:\n", data)
    # Per-column statistics learned during fit
    print("Mean of each feature column:\n", transfer.mean_)
    print("Variance of each feature column:\n", transfer.var_)
    return None

Running results
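As a self-contained sketch, the snippet below runs StandardScaler on only the five sample rows from the data section (inlined through io.StringIO, which is my assumption for convenience; the full dating.txt presumably has more rows and will give different numbers). It also checks the learned mean_ and var_ attributes against pandas.

```python
import io

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# The five sample rows from the data section, inlined so no file is needed.
csv_text = """milage,Liters,Consumtime,target
40920,8.326976,0.953952,3
14488,7.153469,1.673904,2
26052,1.441871,0.805124,1
75136,13.147394,0.428964,1
38344,1.669788,0.134296,1
"""

data = pd.read_csv(io.StringIO(csv_text))
features = data[["milage", "Liters", "Consumtime"]]

transfer = StandardScaler()
scaled = transfer.fit_transform(features)

# mean_ and var_ hold the per-column statistics learned during fit.
# var_ is the population variance (ddof=0); pandas defaults to ddof=1,
# so ddof=0 has to be passed explicitly for the two to agree.
means_match = np.allclose(transfer.mean_, features.mean().to_numpy())
vars_match = np.allclose(transfer.var_, features.var(ddof=0).to_numpy())

print(scaled)
print("column means after scaling:", scaled.mean(axis=0))
print("column standard deviations after scaling:", scaled.std(axis=0))
print("mean_ matches pandas:", means_match, "| var_ matches pandas:", vars_match)
```

The ddof detail matters when comparing against pandas: DataFrame.var() and DataFrame.std() use the sample estimate by default, so on small data they will not match StandardScaler unless ddof=0 is set.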
