This article introduces microservice architecture and its related components: what they are, and why one would adopt the architecture and these components. It aims to give a concise overall picture of microservice architecture, so it does not go into details such as how to use each component.
To understand microservices, we first need to understand what is not a microservice. The usual opposite of a microservice architecture is the monolithic application, in which all functionality is packaged into a single unit. Moving from a monolith to microservices does not happen overnight; it is a gradual evolution. This article uses an online supermarket application as an example to illustrate that process.
The initial requirements
A few years ago, Xiaoming and Xiaopi set up an online supermarket together. Xiaoming handled program development and Xiaopi handled everything else. The Internet was not yet well developed and online supermarkets were still a blue ocean: as long as the features worked, the money rolled in. Their requirements were therefore simple: a website on the public Internet where users could browse and buy products, plus an admin backend for managing products, users, and order data.
Let's sort out the feature list:
- User registration and login
- Product display
- Ordering
- Admin backend
  - User management
  - Product management
  - Order management
Because the requirements were simple, Xiaoming whipped up the website with a flourish of his left and right hands. For security, the admin backend was kept separate from the website, so with a slow-motion replay of the same flourish he built the admin site too. The overall architecture was as follows:
With a wave of his hand, Xiaoming found a cloud service, deployed, and the website went live. It was very well received and loved by stay-at-home shoppers of every kind. Xiaoming lay back and started collecting money.
As the business grows...
The good times did not last. Within days, copycat online supermarkets sprang up and hit Xiaoming and Xiaopi hard.
Under competitive pressure, Xiaoming and Xiaopi decided to roll out some marketing initiatives:
- Run promotions. For example, a site-wide sale on New Year's Day, a buy-two deal for the Spring Festival, "dog food" coupons for Valentine's Day, and so on.
- Expand channels with mobile marketing. Besides the website, a mobile app, a WeChat mini program, and so on are needed.
- Precision marketing. Analyze users with historical data and provide personalized service.
These initiatives all needed development support. Xiaoming brought Xiaohong onto the team. Xiaohong took on data analysis and mobile development; Xiaoming took on the promotion features.
Because the work was urgent, Xiaoming and Xiaohong did not plan the architecture of the whole system properly. On a whim, they put promotion management and data analysis into the admin backend and built the WeChat and mobile apps as separate applications. After several all-nighters, the new features and applications were basically done. The architecture now looked like this:
This stage had many unreasonable aspects:
- The website and the mobile applications contained a lot of duplicated code for the same business logic.
- Data was sometimes shared through the database and sometimes passed through interface calls. The call relationships between interfaces were messy.
- An application that exposed interfaces to other applications gradually grew larger and larger and took on a lot of logic that did not belong to it. Application boundaries blurred and feature ownership became confused.
- The admin backend was originally designed for a low service level. After data analysis and promotion management were added, it hit performance bottlenecks that affected other applications.
- The database table structure was depended on by multiple applications, so it could not be refactored or optimized.
- All applications operated on one database, which became a performance bottleneck. Database performance dropped sharply whenever data analysis was running.
- Development, testing, deployment, and maintenance grew harder. Even a small change to one feature required releasing the whole application. Sometimes a release accidentally included untested code, or a change to one feature broke another in an unexpected place. To limit the impact of release problems and downtime, all applications were released at 3 or 4 a.m. And to verify that everything ran normally after the release, someone had to keep watch through the next day's peak hours...
- The team passed the buck. Which application a shared feature should live in was often argued over at length, and in the end either everyone built their own copy, or it was put somewhere at random and never maintained.
Despite all these problems, this stage did achieve something: the system was built quickly in response to business changes. However, urgent and heavy workloads tend to trap people in local, short-term thinking and lead to compromise decisions. In this architecture everyone looked after only their own patch, and global, long-term design was missing. In the long run the system would become harder and harder to build on, and could even fall into a cycle of being torn down and rebuilt.
It's time to make a change
Fortunately, Xiaoming and Xiaohong were good young people with ideals. Once they recognized the problems, they freed up some of their energy from day-to-day business requirements and began sorting out the overall architecture, ready to start the transformation.
A transformation first requires enough energy and resources. If your stakeholders (business people, project managers, the boss, and so on) focus so hard on feature progress that you cannot spare any extra energy or resources, you may not be able to do anything...
In programming, the most important skill is the ability to abstract. Migrating to microservices is really a process of abstraction. Xiaoming and Xiaohong went through the business logic of the online supermarket, abstracted the common business capabilities, and built several shared services:
- User service
- Product service
- Promotion service
- Order service
- Data analysis service
Each application backend now only needed to fetch the data it required from these services, so a great deal of redundant code was deleted, leaving a thin control layer and the front end. The architecture at this stage was as follows:
This stage only split out the services; the database was still shared, so some drawbacks of a siloed ("chimney") system remained:
- The database becomes a performance bottleneck and a single point of failure.
- Data management tends to become chaotic. Even with a good modular design at the start, over time one service always ends up reading another service's data directly from the database.
- A database table structure may be depended on by multiple services, so touching one part affects the whole and changes are hard to make.
If the shared-database pattern were kept, the whole architecture would grow more and more rigid and the microservice architecture would lose its point. So Xiaoming and Xiaohong went all the way and split the database as well. All persistence layers were isolated from one another, with each service responsible for its own. In addition, a message queue was added to improve the responsiveness of the system. The architecture was as follows:
After a complete split, each service can adopt heterogeneous technology. For example, the data analysis service can use a data warehouse as its persistence layer in order to run statistical computations efficiently; the product service and promotion service are accessed frequently, so a cache was added to them.
Another way to abstract shared logic is to turn it into a shared framework library. This avoids the performance cost of service calls, but it is expensive to manage, and it is hard to keep every application on the same version.
Splitting the database also brings problems and challenges, such as the need for cross-database cascading (joins) and the granularity of data queried through services. These problems can be solved with sensible design. On the whole, splitting the database has more advantages than disadvantages.
Microservice architecture also has a non-technical benefit: it makes the division of labor across the whole system clearer and responsibilities more explicit, with everyone focused on providing a better service to others. In the monolith era, shared business features often had no clear owner. In the end either everyone built their own, so the work was done several times over, or some random person (usually someone capable or enthusiastic) built it into the application they were responsible for. In the latter case, that person was responsible for their own application and additionally for providing the shared feature to everyone else, even though nobody was originally responsible for it. Simply because they were more capable or more enthusiastic, they ended up carrying the blame for no good reason (a situation politely called "the capable do more"). Eventually nobody was willing to provide shared features. In the long run, the people on the team went their own separate ways and stopped caring about the overall architecture.
Seen this way, adopting a microservice architecture also requires adjusting the organizational structure, so a microservice transformation needs the support of management.
After the transformation, Xiaoming and Xiaohong each knew exactly what they were on the hook for. Both were very satisfied; everything was as beautiful and perfect as Maxwell's equations.
There is no silver bullet
Spring came, everything revived, and the annual shopping carnival arrived. Watching the daily order count climb, Xiaopi, Xiaoming, and Xiaohong were all smiles. But the joy did not last: extreme joy begets sorrow, and with a sudden bang the system went down.
With the old monolith, troubleshooting usually meant reading the logs and studying error messages and call stacks. In a microservice architecture the application is split into many services, and locating the point of failure is very difficult. Xiaoming checked the logs machine by machine and called the services by hand one at a time. After about ten minutes of searching he finally located the fault: the promotion service had stopped responding because it received too many requests. All the other services called the promotion service directly or indirectly, so they went down with it. In a microservice architecture, the failure of one service can trigger an avalanche effect and bring down the whole system. Before the festival, Xiaoming and Xiaohong had in fact estimated the request volume. By that estimate the server resources were enough to handle the festival traffic, so something must have gone wrong. But the situation was urgent and every passing second was money lost, so Xiaoming had no time to investigate. He immediately created several new virtual machines in the cloud and deployed new promotion service nodes one by one. After a few minutes of work the system was barely back to normal. The outage was estimated to have cost hundreds of thousands in sales, and the three of them felt their hearts bleed...
Afterwards, Xiaoming wrote a simple log analysis tool (the logs were so large that a text editor could barely open them, and reading them by eye was hopeless) and ran statistics over the promotion service's access log. He found that during the failure the product service, because of a code bug, had issued a huge number of requests to the promotion service in certain scenarios. The problem was not complicated; with a flick of his fingers Xiaoming fixed the bug that had cost hundreds of thousands.
The problem was solved, but there was no guarantee that similar problems would not happen again. Although the logical design of the microservice architecture was perfect, it was like a gorgeous palace built of toy blocks that could not withstand wind and rain. Microservice architecture solves old problems but introduces new ones:
- In a microservice architecture the application is split into many services, and locating the point of failure is very difficult.
- Stability decreases. The more services there are, the more likely it is that one of them fails, and one failed service can bring down the whole system. In fact, in high-traffic production, failures are always going to happen.
- There are many services, so deployment and management involve a lot of work.
- Development: how to keep all services working together under continuous development.
- Testing: after the split, almost every feature involves multiple services. What used to be a test of a single program becomes a test of calls between services, and testing gets more complicated.
Xiaoming and Xiaohong reflected on the pain and resolved to deal with these problems properly. Handling failures generally has two sides: reducing the probability that a failure occurs, and reducing the impact when one does.
Monitoring - Finding signs of trouble
In a high-concurrency distributed setting, failures often arrive as a sudden avalanche. A thorough monitoring system is therefore essential, so that warning signs are found as early as possible.
A microservice architecture has many components, and each needs different metrics monitored. For a Redis cache, memory usage and network traffic are typically monitored; for a database, connection count and disk space; for a business service, concurrency, response latency, error rate, and so on. Building one all-encompassing monitoring system that knows about every component is therefore unrealistic and would scale poorly. The usual approach is to have each component expose an interface that reports its current state (a metrics interface), with a consistent output format. A metrics collector component is then deployed to pull and store component state from these interfaces periodically and to serve queries. Finally, a UI queries the metrics from the collector and draws monitoring dashboards or raises alerts based on thresholds.
Most components do not need to be developed from scratch; open-source ones are available. Xiaoming downloaded RedisExporter and MySQLExporter, which expose metrics interfaces for the Redis cache and the MySQL database respectively. Each microservice implements its own metrics interface according to its business logic. Xiaoming then used Prometheus as the metrics collector and Grafana to configure the monitoring dashboards and email alerts. That gives a microservice monitoring system:
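As a rough illustration of what a business service's metrics interface might return, the sketch below renders a few values in the plain-text style that Prometheus scrapes (a `# TYPE` line followed by `name value`). The metric names and the `render_metrics` helper are made up for this example; a real service would more likely use an official Prometheus client library than hand-roll the output.

```python
def render_metrics(metrics):
    """Render a {name: value} mapping as Prometheus-style text lines.

    Every metric is reported as a gauge here to keep the sketch small.
    """
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metrics for the order service.
metrics_text = render_metrics({
    "order_concurrent_requests": 12,
    "order_error_rate": 0.02,
})
print(metrics_text)
```

The service would return this text from an HTTP endpoint that the collector polls on a fixed interval.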
Locating problems - Link tracing
In a microservice architecture, a single user request often involves several internal service calls. To make problems easier to locate, we need to record, for every user request, how many service calls were made inside the system and how they relate to one another. This is called link tracing (often also called distributed tracing).
We can see the effect using a link-tracing example from the Istio documentation:
Image source: Istio documentation
As the figure shows, this is a user request for the productpage page. During the request, the productpage service calls the interfaces of the details and reviews services in sequence, and the reviews service in turn calls the ratings interface while responding. The whole trace record forms a tree:
To implement link tracing, every service call records at least four items of data in the HTTP headers:
- traceId: identifies the call chain of one user request. Calls with the same traceId belong to the same chain.
- spanId: the ID of one service call, that is, the ID of a node in the trace.
- parentId: the spanId of the parent node.
- requestTime & responseTime: the request time and the response time.
In addition, components that collect and store the call logs are needed, as well as a UI component that displays the call chains.
The above is only a brief explanation; the theoretical basis for link tracing is described in Google's Dapper.
Having understood the theory, Xiaoming chose Zipkin, an open-source implementation of Dapper. With another flick of his fingers he wrote an HTTP request interceptor that generates this data on every HTTP request, injects it into the headers, and asynchronously sends the call log to Zipkin's log collector. One extra note: the HTTP request interceptor can be implemented in the microservice's own code, or by a network proxy component (although then every microservice needs an extra proxy layer).
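The header bookkeeping such an interceptor performs can be sketched as follows. This is only an illustration under simple assumptions: the header keys reuse the field names from the list above rather than the header names any particular tracer actually uses, and `outgoing_headers` is a hypothetical helper, not part of Zipkin.

```python
import time
import uuid


def new_id():
    """Return a random hexadecimal identifier."""
    return uuid.uuid4().hex


def outgoing_headers(incoming):
    """Build trace headers for a downstream call.

    `incoming` holds the headers of the request currently being handled;
    pass an empty dict when the call starts a brand-new trace.
    """
    return {
        # Reuse the caller's trace, or start a new one at the entry point.
        "traceId": incoming.get("traceId") or new_id(),
        # Every call gets its own span.
        "spanId": new_id(),
        # The span that triggered this call becomes the parent.
        "parentId": incoming.get("spanId", ""),
        "requestTime": str(time.time()),
    }


root = outgoing_headers({})     # e.g. the user's request to productpage
child = outgoing_headers(root)  # e.g. productpage calling reviews
print(child["traceId"] == root["traceId"], child["parentId"] == root["spanId"])
```

Because every call carries its parent's spanId, the collector can later rebuild the tree of calls for each traceId.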
Link tracing can only locate which service has the problem; it cannot provide the specific error information. Finding the specific error is the job of the log analysis component.
Analyzing problems - Log analysis
Log analysis components were already in wide use before microservices took off. Even with a monolithic architecture, as traffic or the number of servers grows, log files swell until a text editor can hardly open them, and worse, they are scattered across many servers. To investigate a problem you have to log in to each server, fetch the log files, and search them one by one (and opening and searching them is slow) for the log entries you want.
So once an application reaches a certain scale, we need a "search engine" for logs, so that the desired entries can be found precisely. In addition, a component that collects logs at the data source and a UI component that displays the results are needed:
Xiaoming did some research and adopted the well-known ELK log analysis stack. ELK is an abbreviation of three components: Elasticsearch, Logstash, and Kibana.
- Elasticsearch: the search engine, which also stores the logs.
- Logstash: the log collector. It receives log input, does some preprocessing, and outputs the result to Elasticsearch.
- Kibana: the UI component. It queries data through the Elasticsearch API and shows it to the user.
One small problem remains: how to get the logs to Logstash. One option is to call the Logstash interface directly whenever a log entry is written. But that again (sigh, why "again") means changing code... So Xiaoming chose another option: logs are still written to files, and an agent deployed alongside each service scans the log files and forwards them to Logstash.
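The core loop of such an agent can be sketched in a few lines: remember how far into the file you have read, and forward only complete lines appended since then. The `ship_new_lines` function and its `send` callback are hypothetical; in a real deployment `send` would forward to the log collector, and an existing log shipper would normally be used instead of custom code.

```python
import os
import tempfile


def ship_new_lines(path, offset, send):
    """Send lines appended to `path` after byte `offset`; return the new offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; pick it up next time
            send(raw.decode("utf-8").rstrip("\n"))
            offset += len(raw)
    return offset


# Demo: the "collector" is just a list that receives each shipped line.
shipped = []
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "order.log")
    with open(log_path, "wb") as f:
        f.write(b"order 1 created\n")
    offset = ship_new_lines(log_path, 0, shipped.append)
    with open(log_path, "ab") as f:
        f.write(b"order 2 created\npartial")
    offset = ship_new_lines(log_path, offset, shipped.append)
print(shipped, offset)
```

Keeping the offset outside the function means the agent can persist it and resume after a restart without resending old lines.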
Gateway - Access control and service governance
After the split into microservices, a large number of services and interfaces appeared, and the overall call relationships became messy. During development it was common to be halfway through writing something and suddenly not remember which service a piece of data should come from. Or a mistake would creep in and a service that should not have been called was called, so that a supposedly read-only feature ended up modifying data...
To handle these situations, microservice calls need a gatekeeper: the gateway. A gateway layer is added between the caller and the callee, and permissions are checked on every call. The gateway can also serve as a platform for publishing service interface documentation.
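The permission check at the heart of a gateway can be pictured as a lookup in an allow-list before the call is forwarded. The sketch below is deliberately simplified: the caller names, service names, and the `gateway_call` function are all invented for illustration, and the "forwarding" is an in-process function call rather than a network request.

```python
# Hypothetical allow-list: (caller, service, method) triples that may pass.
ALLOWED = {
    ("website", "order-service", "create_order"),
    ("admin", "order-service", "list_orders"),
}


def gateway_call(caller, service, method, handlers):
    """Check permission, then forward the call to the target handler."""
    if (caller, service, method) not in ALLOWED:
        raise PermissionError(f"{caller} may not call {service}.{method}")
    return handlers[(service, method)]()


# Stand-ins for the real services behind the gateway.
handlers = {
    ("order-service", "create_order"): lambda: "created",
    ("order-service", "list_orders"): lambda: ["order-1"],
}

result = gateway_call("website", "order-service", "create_order", handlers)
try:
    gateway_call("website", "order-service", "list_orders", handlers)
    denied = False
except PermissionError:
    denied = True
print(result, denied)
```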
One question with gateways is how fine-grained to make them. The coarsest option is a single gateway for the whole microservice system: callers outside the system reach the microservices through the gateway, while microservices call one another directly. The finest option routes every call through the gateway, whether it comes from inside or outside the microservice system. The compromise is to divide the microservices into several zones by business area, with direct calls inside a zone and gateway calls between zones.
Because the online supermarket does not have that many services, Xiaoming adopted the coarsest-grained scheme:
Service registration and discovery - Dynamic scaling
The components above are all aimed at reducing the likelihood of failure. But failures will always happen, so the other thing to study is how to reduce their impact.
The crudest (and most commonly used) failure-handling strategy is redundancy. Typically a service is deployed as multiple instances: first, this spreads the load and improves performance; second, if one instance goes down, the others can still respond.
One question with redundancy is how many redundant instances to use. There is no single answer over time: different service functions and different time periods need different numbers of instances. On an ordinary day, for example, 4 instances may be enough, while during a promotion, when traffic surges, 40 may be needed. So the amount of redundancy is not a fixed value but is adjusted in real time as needed.
In general, adding an instance involves these steps:
- Deploy the new instance
- Register the new instance with the load balancer or DNS
There are only two steps, but if registering with the load balancer or DNS is done by hand, things are not so simple. Imagine adding 40 instances and then typing in 40 IP addresses by hand...
The solution is automatic service registration and discovery. First, a service discovery service is deployed, which provides address information for all registered services. DNS is also a kind of service discovery service. Each application service then registers itself with the service discovery service automatically when it starts. After starting, each application service also synchronizes the address list of all application services from the service discovery service to its local cache, in real time (or periodically). The service discovery service also checks the health of the application services regularly and removes the addresses of unhealthy instances. This way, adding an instance only requires deploying it, and taking one offline only requires shutting it down; service discovery detects the addition and removal of service instances automatically.
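The registration and expiry logic can be sketched with a small in-memory registry. Note one substitution: instead of the registry actively probing instances as described above, this sketch assumes each instance re-registers periodically as a heartbeat, and entries that have been silent for longer than a time-to-live are dropped. `ServiceRegistry` and the addresses are invented for illustration, and the clock is injectable so the demo does not need to sleep.

```python
import time


class ServiceRegistry:
    """In-memory registry; instances stay listed while heartbeats keep arriving."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._last_seen = {}  # (service, address) -> time of last heartbeat

    def register(self, service, address):
        """Register an instance; calling it again acts as a heartbeat."""
        self._last_seen[(service, address)] = self.clock()

    def lookup(self, service):
        """Return the addresses of instances whose heartbeat is still fresh."""
        now = self.clock()
        self._last_seen = {
            key: seen for key, seen in self._last_seen.items()
            if now - seen <= self.ttl
        }
        return sorted(addr for (name, addr) in self._last_seen if name == service)


now = [0.0]  # fake clock so the demo is deterministic
registry = ServiceRegistry(ttl=30.0, clock=lambda: now[0])
registry.register("promotion", "10.0.0.1:8080")
registry.register("promotion", "10.0.0.2:8080")
now[0] = 20.0
registry.register("promotion", "10.0.0.1:8080")  # heartbeat from instance 1 only
now[0] = 40.0
alive = registry.lookup("promotion")
print(alive)
```

Instance 2 disappears from the list without anyone editing a load balancer configuration, which is the property the text is after.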
Service discovery is also used together with client-side load balancing. Because each application service has already synchronized the service address list locally, it can decide the load-balancing policy itself when calling a microservice. Metadata (such as the service version) can even be attached at registration, and client-side load balancing can steer traffic by that metadata to implement A/B testing, blue-green releases, and similar features.
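A minimal sketch of client-side load balancing over a locally cached address list might look like this. It uses simple round-robin and an optional version filter; the `ClientLoadBalancer` class, the `version` metadata key, and the addresses are assumptions made for the example.

```python
import itertools


class ClientLoadBalancer:
    """Round-robin over a locally cached instance list, optionally by version."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = itertools.count()

    def choose(self, version=None):
        """Pick the next address; restrict to `version` when one is given."""
        candidates = [
            inst for inst in self.instances
            if version is None or inst["version"] == version
        ]
        if not candidates:
            raise LookupError("no instance matches the requested version")
        return candidates[next(self._counter) % len(candidates)]["address"]


balancer = ClientLoadBalancer([
    {"address": "10.0.0.1:8080", "version": "v1"},
    {"address": "10.0.0.2:8080", "version": "v1"},
    {"address": "10.0.0.3:8080", "version": "v2"},
])
picks = [balancer.choose() for _ in range(4)]
canary = balancer.choose(version="v2")  # e.g. route test users to the new version
print(picks, canary)
```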
There are many service discovery components to choose from, such as ZooKeeper, Eureka, Consul, and etcd. But Xiaoming, confident in his own skills and keen to show off, wrote one himself on top of Redis...
Circuit breaking, service degradation, and rate limiting
When a service stops responding for whatever reason, callers usually wait for a while and then time out or receive an error. If the call chain is long, requests may pile up, and the whole chain ties up a lot of resources while waiting on the downstream response. So when repeated calls to a service fail, the circuit should be broken: mark the service as not working and return an error immediately, and only re-establish the connection once the service has recovered.
Image source: the book 《微服务设计》 (Microservice Design)
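The circuit-breaking behavior described above can be sketched as a small wrapper around a call. This is a simplified illustration, not a production implementation: the thresholds, the `CircuitBreaker` class, and the single trial call after the reset period are all choices made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset period elapsed: let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit again
        return result


def broken():
    raise ConnectionError("promotion service is down")


now = [0.0]  # fake clock so the demo is deterministic
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])
outcomes = []
for _ in range(3):
    try:
        breaker.call(broken)
    except CircuitOpenError:
        outcomes.append("fast-fail")
    except ConnectionError:
        outcomes.append("error")
now[0] = 11.0
recovered = breaker.call(lambda: "ok")
print(outcomes, recovered)
```

The third call never reaches the failing service, which is what stops requests from piling up along the chain.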
When a downstream service stops working and it is not part of the core business, the upstream service should degrade gracefully so that the core business is not interrupted. For example, the online supermarket's order page recommends add-on products to help customers reach an order threshold. When the recommendation module goes down, ordering must not go down with it; the recommendation feature is simply switched off for the time being.
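In code, degradation often amounts to catching the failure of the non-core call and substituting a harmless default. The sketch below assumes a hypothetical `build_order_page` function that takes the recommendation call as a parameter; the names and data are invented for illustration.

```python
def build_order_page(order, recommend):
    """Build the order page; recommendations are optional extras."""
    try:
        recommendations = recommend(order)
    except Exception:
        # Degrade: the non-core feature is switched off, ordering still works.
        recommendations = []
    return {"order": order, "recommendations": recommendations}


def failing_recommend(order):
    raise TimeoutError("recommendation service did not respond")


healthy = build_order_page({"id": 1}, lambda order: ["snacks"])
degraded = build_order_page({"id": 2}, failing_recommend)
print(healthy, degraded)
```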
After a service goes down, upstream services or users usually retry. As a result, the moment the service recovers it is likely to be hit by a burst of traffic and go down again immediately, like doing sit-ups in a coffin. So a service needs to be able to protect itself: rate limiting. There are many rate-limiting strategies; the simplest is to discard the excess when too many requests arrive per unit of time. Limiting can also be applied per caller, rejecting only requests from the service that generates the flood. For example, the product service and the order service both call the promotion service. If the product service issues a flood of requests because of a code bug, the promotion service limits only requests from the product service, while requests from the order service are answered normally.
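Per-caller limiting can be sketched with a fixed-window counter kept separately for each caller. This is the simplest possible strategy and only an illustration: `PerCallerRateLimiter`, the limit, and the window length are assumptions, and production systems often prefer token-bucket or sliding-window variants.

```python
import time


class PerCallerRateLimiter:
    """Allow at most `limit` requests per caller in each time window."""

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._windows = {}  # caller -> [window start time, request count]

    def allow(self, caller):
        now = self.clock()
        state = self._windows.get(caller)
        if state is None or now - state[0] >= self.window:
            state = [now, 0]  # start a fresh window for this caller
            self._windows[caller] = state
        if state[1] >= self.limit:
            return False  # over the limit: discard the request
        state[1] += 1
        return True


now = [0.0]  # fake clock so the demo is deterministic
limiter = PerCallerRateLimiter(limit=2, window_seconds=1.0, clock=lambda: now[0])
product_results = [limiter.allow("product-service") for _ in range(3)]
order_result = limiter.allow("order-service")
now[0] = 1.0
product_after_window = limiter.allow("product-service")
print(product_results, order_result, product_after_window)
```

The misbehaving caller is throttled while the other caller is unaffected, matching the product service and order service example in the text.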
In a microservice architecture, testing is divided into three levels:
- End-to-end testing: covers the whole system, generally tested through the user interface.
- Service testing: tests the service interfaces.
- Unit testing: tests individual units of code.
From top to bottom, the three kinds of test become easier to implement but less convincing. End-to-end tests take the most time and effort, but passing them gives the most confidence in the system. Unit tests are the easiest to implement and the most efficient, but passing them does not guarantee that the system as a whole is free of problems.
Because end-to-end tests are hard to implement, they are generally written only for core features. When an end-to-end test fails, it should be broken down into unit tests: analyze the cause of the failure, then write unit tests that reproduce the problem, so that the same error is caught faster in future.
The difficulty of service testing is that a service often depends on other services. This can be solved with a mock server:
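A mock server for a service test can be as small as an HTTP handler that returns a canned response. The sketch below uses only the Python standard library; the `/discount` path, the JSON shape, and the `order_total` function standing in for the service under test are all invented for the example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockPromotionHandler(BaseHTTPRequestHandler):
    """Stands in for the promotion service and always returns a 20% discount."""

    def do_GET(self):
        body = json.dumps({"discount": 0.2}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def order_total(price, promotion_base_url):
    """Code under test: asks the promotion service for the current discount."""
    # An empty ProxyHandler bypasses any system proxy for the local mock.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(promotion_base_url + "/discount", timeout=5) as resp:
        discount = json.load(resp)["discount"]
    return round(price * (1 - discount), 2)


def run_with_mock(price):
    """Start the mock on a free local port, run the code under test, stop it."""
    server = HTTPServer(("127.0.0.1", 0), MockPromotionHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        return order_total(price, f"http://{host}:{port}")
    finally:
        server.shutdown()
        server.server_close()


total = run_with_mock(100.0)
print(total)
```

Because the service under test only receives a base URL, the same code talks to the mock in tests and to the real promotion service in production.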
Everyone is familiar with unit tests. We usually write a large number of them (including regression tests) to cover as much of the code as possible.
Metrics interfaces, link-tracing injection, log forwarding, service registration and discovery, routing rules, and features such as circuit breaking and rate limiting all require some integration code in each application service. Having every application service implement this itself is time-consuming and laborious. Following the DRY principle, Xiaoming developed a microservice framework, extracting the code that integrates with each component, along with some other common code, into the framework. All application services are developed on top of it.
Using a microservice framework makes many custom features possible. Call stack information can even be injected into the trace to achieve code-level link tracing, or the status of thread pools and connection pools can be exported to monitor the low-level state of a service in real time.
A unified microservice framework has one serious drawback: the cost of updating the framework is high. Every framework upgrade requires all application services to upgrade with it. Of course, a compatible scheme is generally used, leaving a period in which versions run in parallel while all application services upgrade. But with many application services the upgrade can take a very long time. And some very stable application services are hardly ever updated, and their owners may refuse to upgrade... So a unified microservice framework needs sound version management and development management practices.
Another path - Service Mesh
Another way to abstract the common code is to move it directly into a reverse proxy component. Each service additionally deploys this proxy, and all inbound and outbound traffic is processed and forwarded through it. This component is called a sidecar.
A sidecar does not add extra network cost. It is deployed on the same host as the microservice node and shares the same virtual network interface, so communication between the sidecar and the microservice node is in practice just a memory copy.
Image source: Pattern: Service Mesh
The sidecar is responsible only for network communication. Another component is needed to manage the configuration of all sidecars. In a Service Mesh, the part responsible for network communication is called the data plane, and the part responsible for configuration management is called the control plane. Together, the data plane and the control plane form the basic architecture of a Service Mesh.
Image source: Pattern: Service Mesh
Compared with a microservice framework, the advantage of Service Mesh is that it is non-intrusive to the code and easier to upgrade and maintain. What it is most often criticized for is performance. Even though loopback traffic does not generate real network requests, there is still the extra cost of memory copies. In addition, some centralized traffic processing also affects performance.
The end, and also the beginning
Microservices are not the end point of architecture evolution. Going finer-grained, there are directions such as Serverless and FaaS. On the other hand, some people are singing that what has long been divided must unite, and rediscovering the monolith...
In any case, the microservice transformation had come to a close for now. Xiaoming contentedly stroked his increasingly smooth scalp and planned to take the weekend off for a cup of coffee.