
Distributed systems: how to implement user tracking and authentication

2020-11-10 14:21:51 Search Cloud library technology team


In interviews at Internet companies, interviewers often ask a question like this:

If the browser's cookies are disabled, how can you track and authenticate users?

Unfortunately, many candidates still cannot answer it, and are unclear about the difference between a cookie and a session.

There are also some astonishing real cases at work:

Storing the user ID in local storage and using it as a token, on the grounds that cookies are a backward technology the team has abandoned;

A mobile project whose server API requires the client to simulate a cookie, so that the API is consumed the way a browser would consume it via ajax.

The Internet is built on the HTTP protocol, and HTTP is popular because of its simplicity. However, HTTP is stateless (at the communication level, virtual circuits are far more expensive than datagrams), so people have come up with various ways to track users, including the cookie/session mechanism, tokens, Flash cross-browser cookies, and even browser fingerprinting.

User identities are hidden everywhere (browser fingerprinting doesn't even need a storage medium)

There is already plenty of material on specific technologies such as Spring Security, so this article does not cover the concrete implementation of any framework or code.

We will first discuss the difference between authentication and authorization, then introduce some technologies that are widely used in the industry, and finally talk about how to choose an appropriate authentication method for an API.

Authentication, authorization, and credentials

First, authentication and authorization are two different concepts (and two different words in English). To design an API that is more secure and clearer, it is necessary to understand the difference between them.

Authentication

Authentication establishes the identity of the current user. Once a user logs in, the system can track that identity and act according to the corresponding business logic.

Even when the user is not logged in, most systems still track an identity; it is simply treated as a guest or an anonymous user.

Authentication answers the question “Who am I?”

Authorization

Authorization determines which identities are allowed to access which resources: after obtaining the user's identity, the system goes on to check the user's permissions.

In a single system, authorization usually goes hand in hand with authentication, but in a multi-system architecture with open APIs, authorization can be handled by a different system, as with OAuth.

Authorization answers the question “What can I do?”

Credentials

Both authentication and authorization rely on a medium (credentials) that marks the identity or rights of the visitor. In real life, everyone needs an ID card to access a bank account, register a marriage, or handle pension insurance; the ID card is the credential for authentication.

In ancient military campaigns, the emperor issued a tally to the general going to war. Junior officers did not care who held the tally; they simply carried out the orders it stood for. In other words, the tally served as a credential for authorization.

On the Internet, the session ID that a server issues and stores in a cookie is one kind of credential. Digital credentials also appear in many other forms: SSH login keys, JWT tokens, one-time passwords, and so on.

An example

User accounts are not necessarily a table in a database. Some enterprise IT systems place more demands on account management and permissions.

Account technology (accounting) helps manage user accounts in different ways and makes it possible to share accounts between systems.

Examples include Microsoft's Active Directory (AD), the Lightweight Directory Access Protocol (LDAP), and even blockchain technology.

Access control policy (AC)

Another important concept is the access control policy (AC).

If permissions on resources need to be divided at a very fine granularity, we have to consider how user identities map to restricted resources, and choose between access control lists (ACL), role-based access control (RBAC), or another access control policy.
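
As a rough illustration of the difference between the two policies, here is a minimal Python sketch. The users, roles, and resource names are hypothetical and exist only for this example; real systems usually keep this data in a directory or database.

```python
# Hypothetical users, roles, and resources, for illustration only.

# ACL: each resource lists, per user, the actions that user may perform.
ACL = {
    "report-2020": {"alice": {"read", "write"}, "bob": {"read"}},
}

# RBAC: users map to roles, and roles map to permitted actions.
USER_ROLES = {"alice": {"editor"}, "bob": {"viewer"}}
ROLE_PERMISSIONS = {"editor": {"read", "write"}, "viewer": {"read"}}


def acl_allows(user: str, resource: str, action: str) -> bool:
    """ACL check: look the user up directly on the resource."""
    return action in ACL.get(resource, {}).get(user, set())


def rbac_allows(user: str, action: str) -> bool:
    """RBAC check: the user is allowed if any of their roles grants the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(acl_allows("bob", "report-2020", "write"))  # False
print(rbac_allows("alice", "write"))              # True
```

With an ACL, permissions are attached to each resource; with RBAC, changing what a whole group of users can do only requires editing one role.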

In popular technologies and frameworks, none of these concepts is implemented in isolation, so in practice people often argue over whether something like OAuth2 is an authentication or an authorization concept.

For ease of understanding, a glossary of common technologies and concepts is attached at the end of the article.

Next I'll introduce several authentication and authorization technologies that are often used in API development:

  • HTTP Basic Authentication
  • HMAC
  • OAuth2
  • JWT token

HTTP Basic Authentication

You have almost certainly used this method, even if you did not know what it was called. Not long ago, when visiting the management page of a home router, you would often see a browser pop-up form asking for a user name and password.

Behind the scenes, once the user has entered the user name and password, the browser performs a very simple operation:

Combine the user name and password and Base64-encode them

Add the Basic prefix to the encoded string, then set it as the value of the header named Authorization

An API can also provide HTTP Basic Authentication very easily, and the client can then pass the user name and password using Base64:

  • Join the user name and password with a colon, for example username:abc123456
  • Because the user name or password may contain characters outside the ASCII range, UTF-8 encoding is recommended
  • Base64-encode the resulting string, for example dXNlcm5hbWU6YWJjMTIzNDU2
  • Add “Basic” plus the encoded string to the HTTP request header, that is: Authorization: Basic dXNlcm5hbWU6YWJjMTIzNDU2
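
The steps above can be sketched in a few lines of Python using only the standard library. The function names are made up for this example; the credentials are the username:abc123456 pair from the list.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic Authentication."""
    # Join user name and password with a colon, then encode the text as UTF-8 bytes.
    raw = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: anyone can reverse it.
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth_header(value: str) -> tuple[str, str]:
    """Server side: recover (username, password) from the header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, because the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("username", "abc123456")
print("Authorization:", header)  # Authorization: Basic dXNlcm5hbWU6YWJjMTIzNDU2
```

The parsing function shows how little protection the encoding gives: the server, or anyone who sees the header, recovers the password with a single decode call.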

This is a very simple approach and is used in a large number of scenarios.

The disadvantages are just as obvious. Base64 can only be called an encoding, not encryption (in fact, a client with no preconfigured key has no reliable encryption of its own; everything depends on the TLS protocol).

The fatal weakness of this method is that the encoded password is easily leaked in transit if it is sent in plaintext. And since the password does not expire, once it leaks the only remedy is to change it.

HMAC (AK/SK) authentication

When integrating with some PaaS platforms and payment platforms, we are asked to generate an access key (AK) and a secure key (SK) in advance, and then authenticate requests by signing them. This avoids transmitting the secure key, and in most cases a signature may be used only once, which prevents replay attacks.

This AK/SK-based authentication is mainly implemented with a hash-based message authentication code (Hash-based Message Authentication Code), which is why it is often called HMAC authentication, although that name is not very accurate.

HMAC is just a keyed hash algorithm for producing a message digest; different API designs implement the surrounding scheme differently.

In authentication design, HMAC serves as the algorithm that generates the credential sent over the network, so that passwords and other sensitive information are not transmitted.

The basic process is as follows:

  1. The client has an access key (AK, also called app ID) and a secure key (SK) preset on the authentication server
  2. When calling the API, the client sorts the parameters and the access key in natural order and signs them with the secure key, producing an extra parameter, digest
  3. The server performs the same digest calculation with the preset secure key and requires the result to match exactly
  4. Note that the secure key must never be transmitted over the network or stored in an untrusted location (such as a browser)
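
A minimal sketch of this flow in Python follows. It assumes HMAC-SHA256 and a URL-encoded canonical string; the key values, parameter names, and function names are hypothetical, and real platforms each define their own canonical format.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical credentials, preset on both client and server.
ACCESS_KEY = "demo-access-key"
SECURE_KEY = b"demo-secure-key"


def sign(params: dict[str, str], secure_key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest over the sorted request parameters."""
    # Sorting makes the canonical string independent of parameter order.
    canonical = urlencode(sorted(params.items()))
    return hmac.new(secure_key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def client_build_request(params: dict[str, str]) -> dict[str, str]:
    """Client side: attach the access key and the digest; the secure key is never sent."""
    signed = dict(params, access_key=ACCESS_KEY)
    signed["digest"] = sign(signed, SECURE_KEY)
    return signed


def server_verify(request: dict[str, str], key_store: dict[str, bytes]) -> bool:
    """Server side: recompute the digest with the stored secure key and compare."""
    received = request.get("digest", "")
    params = {k: v for k, v in request.items() if k != "digest"}
    secure_key = key_store.get(params.get("access_key", ""))
    if secure_key is None:
        return False
    expected = sign(params, secure_key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


request = client_build_request({"amount": "100", "order_id": "42"})
print(server_verify(request, {ACCESS_KEY: SECURE_KEY}))  # True
```

Changing any parameter after signing makes the server's recomputed digest differ, so the request is rejected. Without a nonce or timestamp, however, an identical request can still be replayed.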

To make the signature of each request unique, and thereby prevent replay attacks, we need to mix some varying information into the signature.

There are two typical practices in industry standards:

  • Challenge/response algorithm (OCRA: OATH Challenge-Response Algorithm)
  • Time-based one-time password algorithm (TOTP: Time-based One-time Password Algorithm)

Challenge/response algorithm

The challenge/response algorithm requires the client to make an initial request to the server, which returns a 401 Unauthorized response together with a random string (nonce).

The client appends the nonce to the HMAC signature input described above, and the server checks the signature using the nonce it issued. Each nonce is used only once on the server, so every digest is unique.
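
The single-use nonce idea can be sketched as follows. This is a simplified in-memory illustration, not an implementation of the OCRA specification; the class name, key, and message format are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

SECURE_KEY = b"demo-secure-key"  # hypothetical key shared by client and server


def sign(secure_key: bytes, message: str, nonce: str) -> str:
    """Sign the message together with the nonce using HMAC-SHA256."""
    data = f"{message}\n{nonce}".encode("utf-8")
    return hmac.new(secure_key, data, hashlib.sha256).hexdigest()


class NonceServer:
    def __init__(self, secure_key: bytes) -> None:
        self._secure_key = secure_key
        self._pending: set[str] = set()  # nonces issued but not yet used

    def challenge(self) -> str:
        """First request: answered with 401 plus a fresh random nonce."""
        nonce = secrets.token_hex(16)
        self._pending.add(nonce)
        return nonce

    def verify(self, message: str, nonce: str, digest: str) -> bool:
        """Second request: accept the signature only if the nonce is still unused."""
        if nonce not in self._pending:
            return False
        self._pending.discard(nonce)  # each nonce is valid exactly once
        expected = sign(self._secure_key, message, nonce)
        return hmac.compare_digest(expected.encode("utf-8"), digest.encode("utf-8"))


server = NonceServer(SECURE_KEY)
nonce = server.challenge()
digest = sign(SECURE_KEY, "order_id=42", nonce)
first_attempt = server.verify("order_id=42", nonce, digest)
replayed = server.verify("order_id=42", nonce, digest)
print(first_attempt, replayed)  # True False
```

The second call fails even though the signature is correct, because the server has already consumed the nonce.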

Time-based one-time password authentication

To avoid the extra request for a nonce, another approach is to use a timestamp: client and server agree on the time through clock synchronization, and a signature is valid only within a certain time window (about 1 minute).

Here the timestamp is used only as a validation time window, so strictly speaking this is not the time-based one-time password algorithm.
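
A sketch of the timestamp variant, under the assumption of a 60-second window and HMAC-SHA256; the key and function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

SECURE_KEY = b"demo-secure-key"  # hypothetical key shared by client and server
WINDOW_SECONDS = 60              # tolerance, following the "about 1 minute" above


def sign(message: str, timestamp: int, secure_key: bytes = SECURE_KEY) -> str:
    """Client side: sign the message together with the send time."""
    data = f"{message}\n{timestamp}".encode("utf-8")
    return hmac.new(secure_key, data, hashlib.sha256).hexdigest()


def verify(message: str, timestamp: int, digest: str, now: Optional[float] = None) -> bool:
    """Server side: reject stale timestamps first, then check the signature."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > WINDOW_SECONDS:
        return False
    expected = sign(message, timestamp)
    return hmac.compare_digest(expected.encode("utf-8"), digest.encode("utf-8"))


sent_at = 1_600_000_000
digest = sign("order_id=42", sent_at)
print(verify("order_id=42", sent_at, digest, now=sent_at + 30))   # True
print(verify("order_id=42", sent_at, digest, now=sent_at + 120))  # False
```

Note the trade-off: no extra round trip is needed, but a captured request can still be replayed until the window closes unless the server also remembers recently seen signatures.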

The standard time-based one-time password algorithm is widely used in two-step verification. For example, Google Authenticator can generate verification codes without network communication (though it relies on an accurate clock).

The principle is that client and server share a secret key, and for a given time window both can compute the same verification code with the HMAC algorithm.
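
For reference, the standard algorithm (TOTP, RFC 6238, built on HOTP from RFC 4226) is short enough to sketch with the Python standard library. The function names are chosen for this example, and the secret below is the test secret published in the RFCs, not something to use in production.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    # The counter is signed as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte slice.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP with the time window index as counter."""
    now = time.time() if now is None else now
    return hotp(secret, int(now // step), digits)


secret = b"12345678901234567890"  # test secret from the RFC appendices
print(hotp(secret, 0))                   # 755224 (RFC 4226 test vector)
print(totp(secret, now=59, digits=8))    # 94287082 (RFC 6238 test vector)
```

Both sides only need the shared secret and a roughly synchronized clock, which is why an authenticator app can produce codes while offline.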

TOTP basic principles and common vendors

OAuth2 and Open ID

OAuth (Open Authorization) is an open standard that allows users to authorize a third-party website to access information they have stored with another service provider, without giving the third-party website their user name and password or sharing all of their data.

OAuth is an authorization standard, not an authentication standard. The server providing the resource does not need to know the exact user identity (session); it only needs to verify the permission granted by the authorization server (token).

The picture above shows only a simplified OAuth flow. The basic idea of OAuth is to obtain an access token and a refresh token from the authorization server (the refresh token is used to refresh the access token), and then use the access token to get data from the resource server.

Depending on the scenario, there are the following modes:

  • Authorization code mode (authorization code)
  • Simplified mode (implicit)
  • Password mode (resource owner password credentials)
  • Client mode (client credentials)

OAuth itself does not define how to obtain user authentication information. If user identity information is needed, another authentication layer is required, such as OpenID Connect.

Validating the access token

OAuth validation

Introductions to OAuth rarely discuss how the resource server validates the access token.

The OAuth core standard does not define this part, but other OAuth standard documents mention two ways to validate an access token.

After the authorization flow is complete, the resource server can use the Introspection endpoint provided by the OAuth server to validate the access token; the OAuth server returns the status and expiration time of the access token.

In the OAuth standards, the term for validating a token is Introspection.

Also note that the access token is a credential between the user and the resource server, not between the resource server and the authorization server.

Additional authentication (for example Basic authentication) should be used between the resource server and the authorization server.
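
As a rough sketch of what the resource server does, the Python below builds an introspection request in the shape described by RFC 7662 (a form-encoded POST carrying a token parameter, answered with JSON containing an active flag and optionally exp). The endpoint URL, client ID, and client secret are hypothetical, and the request is only constructed here, not sent.

```python
import base64
import json
import time
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical introspection endpoint of the authorization server.
INTROSPECTION_URL = "https://auth.example.com/oauth2/introspect"


def build_introspection_request(access_token: str, client_id: str, client_secret: str) -> Request:
    """Build the POST request; the resource server authenticates itself with Basic auth."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return Request(
        INTROSPECTION_URL,
        data=urlencode({"token": access_token}).encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def token_is_usable(response_body: str, now: Optional[float] = None) -> bool:
    """Interpret the JSON answer: the token must be active and not past its exp time."""
    info = json.loads(response_body)
    if not info.get("active", False):
        return False
    now = time.time() if now is None else now
    exp = info.get("exp")
    return exp is None or exp > now


req = build_introspection_request("abc", "resource-server", "s3cret")
print(req.get_method(), req.data)
print(token_is_usable('{"active": true, "exp": 200}', now=100))
```

Sending the request (for example with urllib.request.urlopen) costs one network round trip per validation, which is the cost the JWT approach avoids.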

JWT validation

The authorization server signs an access token in JWT format with its private key. The resource server validates the JWT token with a preconfigured public key, and obtains the token's status along with some information contained in the access token.

With JWT, therefore, the resource server and the authorization server no longer need to communicate, which is a huge advantage in some scenarios.

JWT also has some weaknesses, which I explain in the JWT section.

refresh token and access token

Almost everyone who starts learning OAuth has the same question: why is a refresh token needed when there is already an access token?

The authorization server returns both an access token and a refresh token on the first authorization request; the refresh token is used later, when the access token needs to be refreshed.

The access token and the refresh token are designed for different purposes: the access token is for interaction between the client and the resource server, while the refresh token is for interaction between the client and the authorization server.

In some authorization modes the access token has to be exposed to the browser, where it acts as a temporary session between the resource server and the browser. There is no signature mechanism between the browser and the resource server, so the access token becomes the only credential. Its expiration time (TTL) should therefore be as short as possible, to limit the damage if the user's access token is sniffed.

Because the access token has to be so short-lived, the refresh token helps users stay authorized for a long time and avoids frequent re-authorization.

One might ask: wouldn't it be enough to give the access token a long expiration time?

In fact, the difference between the two is that the system remains secure even if the refresh token is intercepted. To obtain an access token with a refresh token, the client also needs its preconfigured secure key, so there is always secure authentication between the client and the authorization server.

OAuth, Open ID, and OpenID Connect

Authentication has too many terms. When building your own authentication server or integrating with a third-party authentication platform, you may not fully understand these terms until the very end of development.

OAuth

OAuth solves the authorization problem between distributed systems, even when the client and the resource server or authentication server happen to run on the same machine.

OAuth does not solve authentication, but its design makes it easy to integrate with existing authentication systems.

Open ID

Open ID solves authentication between distributed systems. An Open ID token can authenticate a user across multiple systems and return user information. It can be used on its own and has no connection with OAuth.

OpenID Connect

OpenID Connect solves user authentication within the OAuth system. The basic principle is to treat the user's authentication information (ID token) as a resource.

After authorization under the OAuth framework, the client then uses the access token to obtain the user's identity.

The relationship between these three concepts is a little hard to grasp. In practice, if a system only needs standalone authentication and no authorization between multiple systems, Open ID is enough. If OAuth is used as the authorization standard, user authentication can be done through OpenID Connect.

JWT

In distributed authentication and authorization systems such as OAuth, more is asked of the credential technology: for example, it should carry the user ID and expiration information itself, without having to look them up in external storage.

So the industry optimized tokens further and designed a self-contained token. Once issued, the token's validity no longer needs to be checked against server-side storage; its expiration and validity can be obtained by parsing the token itself. This is JWT (JSON Web Token).

JWT is a self-contained token, or value token. The hash values associated with a session that we used before are called reference tokens.

In short, a basic JWT token consists of three segments separated by dots.

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

The process of generating a JWT token is:

The Base64 (URL-safe) encoding of the header JSON forms the first part of the token

The Base64 (URL-safe) encoding of the payload JSON forms the second part of the token

The encoded first and second parts are joined and signed with the secret to form the third part of the token

So only the signing secret key is needed to validate a JWT token. If the user ID and expiration information are put in the message body, they can be used to check whether the token is valid or expired, with no need to read anything from the database or cache.

Because of the signature algorithm, if the first two parts are modified (including the expiration information), the token no longer passes validation.
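
The generation and validation steps can be sketched with the Python standard library. This is a minimal illustration assuming the HS256 (HMAC-SHA256) algorithm and an exp claim holding a Unix timestamp; the function names are made up here, and a production system would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue(payload: dict, secret: bytes) -> str:
    """Encode header and payload, then sign the two joined segments."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, payload)
    ]
    signature = hmac.new(secret, ".".join(parts).encode("ascii"), hashlib.sha256).digest()
    return ".".join(parts + [b64url_encode(signature)])


def verify(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and the token is not expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return None
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token, bad Base64, or bad JSON
    if header.get("alg") != "HS256":
        return None
    now = time.time() if now is None else now
    exp = payload.get("exp")
    if exp is not None and exp <= now:
        return None
    return payload


token = issue({"sub": "1234567890", "exp": 2000}, b"secret")
print(verify(token, b"secret", now=1000))  # {'sub': '1234567890', 'exp': 2000}
print(verify(token, b"secret", now=3000))  # None, because the token has expired
```

No database or cache lookup appears anywhere in verify: the secret and the token itself are enough, which is the property the text describes.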

The advantage of JWT is that it can not only serve as a token but also carry some necessary information, saving repeated queries.

Note:

  • The first and second parts of a JWT token are only Base64-encoded. They are unreadable to the naked eye but trivially decoded, so sensitive information should not be stored in them
  • Because a JWT token is self-contained, it cannot be revoked
  • The JWT signature algorithm can be chosen freely. For ease of debugging, a symmetric algorithm can be used in local environments; an asymmetric algorithm is recommended in production
  • The advantages of JWT tokens are especially prominent in microservice systems. In multi-layer API calls the JWT token can be passed along directly, and its self-contained nature reduces the number of user information lookups; more importantly, with an asymmetric algorithm, JWT tokens can be validated by distributing the public key within the system
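
To see why sensitive data does not belong in a JWT, the short Python snippet below decodes the first two segments of the sample token shown earlier, without knowing any key. The helper name is made up for this example.

```python
import base64
import json

# The sample token from this article.
TOKEN = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9."
    "TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ"
)


def decode_segment(segment: str) -> dict:
    """Decode one URL-safe Base64 segment (JWT omits the '=' padding)."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header_b64, payload_b64, _signature_b64 = TOKEN.split(".")
header = decode_segment(header_b64)
payload = decode_segment(payload_b64)
print(header)   # {'alg': 'HS256', 'typ': 'JWT'}
print(payload)  # {'sub': '1234567890', 'name': 'John Doe', 'admin': True}
```

The signature only protects the content against modification; it does nothing to hide it.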

Of course, OAuth places no restriction on the technology used for access tokens, and it does not mandate JWT. When taking advantage of JWT's self-contained nature, the difficulty of revoking a JWT must be taken into account.

JWT is not suitable for projects with strict token revocation requirements. Even if workarounds (whitelist and blacklist) are adopted, they go against the purpose JWT was designed for.

Cookie, token in cookie, and session token are still in use

When building an API, developers will find that the authentication methods differ somewhat from those of web applications. Apart from typical web techniques such as ajax, if we want the API to be stateless, cookies are not recommended.

The essence of using a cookie is that the server assigns a session ID, and the client sends this ID in subsequent requests to identify the current user. Because HTTP itself is stateless, the cookie is a way of keeping state that is built into the browser.

If our API is consumed by a client, forcing the API calls to manage cookies can also get the job done.

In some legacy projects or projects with non-standard authentication, we can still see these practices, used to get authentication working quickly:

Using cookies, for example the ajax approach in web projects

Using a session ID or hash as the token, but passing the token in a header

Putting the generated token (possibly a JWT) in a cookie, and protecting it with the HttpOnly and Secure flags
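
For the last practice, a server could produce the Set-Cookie header along these lines. This sketch uses Python's standard http.cookies module; the cookie name token and the sample value are arbitrary choices for the example.

```python
from http.cookies import SimpleCookie


def token_cookie_header(token: str) -> str:
    """Build a Set-Cookie header value that carries the token."""
    cookie = SimpleCookie()
    cookie["token"] = token
    cookie["token"]["httponly"] = True  # hidden from JavaScript (document.cookie)
    cookie["token"]["secure"] = True    # only sent over HTTPS
    cookie["token"]["path"] = "/"
    return cookie["token"].OutputString()


set_cookie = token_cookie_header("abc123")
print("Set-Cookie:", set_cookie)
```

HttpOnly keeps scripts on the page from reading the token, and Secure keeps the browser from sending it over plain HTTP.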

Choose the right authentication method

With the development of microservices, APIs are no longer just for the web or mobile apps; there is also authentication for BFF (Backend for Frontend) and domain APIs, as well as integration with third-party services.

Client-to-server authentication differs from server-to-server authentication.

We call communication in which an end user (human) participates Human-to-machine (H2M), and communication between servers Machine-to-machine (M2M).

H2M communication needs higher security, while M2M communication is inherently more secure than H2M and can therefore put more emphasis on performance. Choosing the appropriate authentication technology for each situation is very important.

For example, HTTP Basic Authentication seems a bit dated for H2M authentication, but it is widely used in M2M.

It is also worth mentioning that in H2M communication the client is outside our control. Because keys cannot be distributed to it independently, the security of the authentication exchange depends heavily on HTTPS.

Looking at how these technologies relate from a macro perspective is very helpful for technology selection.

Glossary

Browser fingerprinting

A technique that queries the browser's user agent string, screen color depth, language, and so on, then passes these values through a hash function to produce a fingerprint, so the browser can be recognized without cookies

MAC (Message authentication code)

In cryptography, a message authentication code is a short piece of information generated by a specific algorithm and used to check the integrity of a message

HOTP (HMAC-based One-time Password algorithm)

One-time password algorithm based on a hash message authentication code

Two-step verification

An authentication method that combines two different factors to confirm the user's identity; it is a special case of multi-factor verification

OTP (One-time password)

One-time password, for example the verification codes sent in registration emails and SMS messages


Copyright notice
This article was created by [Search Cloud library technology team]. Please include a link to the original when reposting. Thank you.