The sword of Damocles in software development
2020-12-08 13:23:11 【osc_ ftbxuxl1】
Why does your program always have bugs?
Why does fixing bugs take up most of your time?
Read this article to the end, and I promise you will be able to design more stable programs, escape the entanglement of bugs, and work on projects with more peace of mind!
I remember that when I was at school, the projects we built were either homework or something to show off in competitions, so their quality was very low.
How low?
For most projects, once the basic functions worked, we were done; we did not consider abnormal cases at all. As long as the project ran successfully even once, I could take a few screenshots for the slides or the lab report, and that was enough to hand in an assignment or get through a competition defense.
And what if the project had a bug?
If some function did not work during testing, it was simple: ignore it and just Photoshop a screenshot of it working.
If some feature turned out not to work during the competition, that was easy too: just blame "the poor network at the venue".
However, these "tricks" do not work in a company. Enterprise projects must bring real value to the business, and there is no room for carelessness or deceit.
When I first joined a company as an intern, I kept the aggressive habits from my school projects: as long as the basic functions worked, I finished development as quickly as possible.
One day, as I was getting ready to leave work early, a tester came over and said:
"Hey, your program has a bug. Why is the amount of this user's order negative?"
Writing a bug
For me, a newcomer to the workplace, this was the first time in my life that someone had told me my code had a bug, that I had a problem, that I was wrong.
At that moment, my first thought was how to bluff my way past this bug, not how to fix it! Apparently I had already formed a very bad habit.
Over the next few days I received several more bug reports from testing in a row and fixed them one by one. If such a flawed program had been released to production, the loss would have been immeasurable; thinking about it now still scares me.
1 day of development, 4 days of fixing bugs
After this I realized that when developing projects in a company, you cannot pursue development speed alone; you must also pay attention to the stability of the project. Otherwise the extra rework time will far exceed the time saved during development, and it will affect how your colleagues see you. If a bug you wrote reaches production, the consequences are even harder to imagine!
Later, after working at two large companies, ByteDance and Tencent, I understood even better how important project stability is, and I accumulated experience in improving stability that can only be gained at large companies.
I have summarized 10 risk points that are usually overlooked in development, along with 16 ways to reduce risk and improve project stability, to share with you.
Before sharing them, let me tell a story first.
Sword of Damocles
In ancient Greek legend, Damocles was a courtier of Dionysius II, the tyrant (a title for certain ancient Greek rulers) of Syracuse, Italy, in the 4th century BC. He was very fond of flattering Dionysius.
He flattered him: "As a great man of power and authority, Dionysius is truly fortunate."
So Dionysius offered to swap places with him for a day, so that he could taste a ruler's fortune for himself.
At the banquet that evening, Damocles greatly enjoyed being king. When the meal was almost over, he looked up and noticed a sword hanging above the throne, suspended by a single horsehair. He immediately lost all interest in the food and the beauties, and begged the tyrant to let him go; he never wanted to be so fortunate again.
Sword of Damocles
What does this story tell us?
Behind peace and tranquility, danger and unease are always present.
However much honor and status a person gains, he must pay a corresponding price.
The higher the status and the safer things seem, the more dangerous they actually are.
Be prepared for danger in times of safety, and stay wary of serious consequences that may arise at any moment.
So what does this have to do with software development? Let me unveil the sword of Damocles in software development.
Danger lurks everywhere
"Behind peace and tranquility, danger and unease are always present."
Software development is just like this. On the surface, a machine is "dead": it only executes the instructions people enter or the programs they write, never changing and very obedient. It seems that once we write the code and put it on the machine, we can sleep soundly.
But is that true? Can we really trust machines and programs?
In fact, the world of programs is full of danger; human factors, environmental factors, and more can all affect our programs. Therefore, in software development we must always follow the principle of distrust and stay overly pessimistic: treat every request, service, interface, return value, machine, framework, and piece of middleware related to the program as untrustworthy, advance cautiously step by step, and fortify everywhere.
The principle of distrust in the world of programs
So why do I have to write code so carefully and trust nothing?
The pain of big projects
"However much honor and status a person gains, he must pay a corresponding price."
In software development, the more valuable the project, the more pressure it has to bear. Let's hear from a big project itself.
I am a big project worth over 100 million. I serve tens of millions of users every day, helping them gain knowledge and happiness.
My friends only see my aura and glory, not the pressure and risk I carry. Today I finally have a chance to share my feelings with you.
I remember that many years ago I was just a kid, developed by only a few owners. During that time I grew very fast. Only a few dozen people used me, but I felt relaxed and happy; if I slacked off once in a while, no one would notice.
Later I gained more and more functions and grew stronger. Countless new faces came to greet me every day and enjoy the services I provided. Little by little, more developers left their mark on me; I felt myself becoming complicated and started to feel the pressure. I could no longer find a chance to slack off, because the moment I rested, my owners would lose a lot of money.
Now I am a big project. Tens of millions of users depend on me every day. I finally have more value, but I also have many more troubles and sense greater danger.
First, I serve millions of users at the same time, so hundreds of thousands or even millions of requests may need handling every second. I have to work under high load all the time. Forget resting: if I am even a little slow, users complain and my owners get criticized.
My operation depends on the support of many brother services, so I have to get along with them. If even one of them falls, I am affected.
Behind my strength is a very fragile heart. I have been reinforced and refactored many times; as my functions grew, all kinds of frameworks and plug-ins were implanted in me, and I have grown like a snowball that might explode at any moment. My owners must be very careful every time they change me, so I now grow very slowly too.
But what scares me most is the bad guys!
They are different from normal users. Some keep sending requests, trying to knock me down. Some sneak around behind me, trying to control me directly. Some keep their eyes on me, watching and recording my every move. Others attempt illegal operations, trying to make a huge profit from me.
Being a big project is exhausting. I don't know how much longer I can hold on.
Is it really trustworthy?
"The higher the status and the safer things seem, the more dangerous they actually are."
Today is an era of open source and shared software. When we develop projects, we more or less rely on existing resources from the Internet, such as dependency packages, tools, components, frameworks, interfaces, and ready-made cloud services. These resources can greatly improve our development efficiency.
Take cloud services as an example; they have become an almost indispensable resource for development. In the past, to build a website you might have had to buy your own physical server, connect it to the network, and then deploy the project to it. Today you can simply log in to the cloud website of a large company (such as Tencent Cloud or Alibaba Cloud) and rent a cloud server, which is very economical.
Cloud services
Take mainstream development frameworks as well: in the past, building a simple website interface might only use ...
It all sounds fine, and you see no reason to doubt anything, because we are naturally inclined to trust big companies and big names.
However, did you know that when you decide to use someone else's resources, you hand over part of the control of your project system, possibly even half its life?
So think about it: are the resources you use really trustworthy?
The following 10 questions may change your perception of development.
1. Are development tools trustworthy?
We usually write code in large, full-featured development tools such as JetBrains IDEA or VS Code. Many beginners, and even experienced veterans, trust their development tools absolutely.
For example, when you type a on the keyboard, the editor must show a.
However, because of insufficient memory and other reasons, development tools can misbehave too.
For example, when you want to call a function, the development tool usually suggests the full name after you type its first few letters. If it gives no suggestion, your first suspicion is that the function does not exist, not that the editor failed to prompt as expected. In this case, wait a moment for the editor or confirm whether the function really is missing, rather than immediately creating a new one.
Or the project will not run and you can find nothing wrong after investigating. At this point you might as well restart the development tool or clear its cache; maybe it will work!
There are also many amusing situations: for example, the editor is covered in red and reports all kinds of errors, yet the project still runs.
Why won't it run? Why does it run?
Therefore, do not trust development tools absolutely.
2. Are open source projects trustworthy?
This is an era of open source software. On GitHub, an open source project platform, we can find many excellent open source projects; good ones can even attract more than 100,000 followers. Are these well-known open source projects trustworthy?
Not entirely! You can see this from the Issues of every open source project, and usually the bigger the project, the more problems are found. The Vue project, for example, has accumulated more than 8,000 raised and closed issues.
Issues of the Vue project
I remember once hitting a bug while using Tomcat, a well-known open source server: every time it received a specific request, it reported an error. At first I never suspected Tomcat; I kept trying to work out what was wrong with my own code. After repeated investigation and searching, I finally confirmed that it was a bug in Tomcat itself!
Although open source projects are not entirely trustworthy, compared with private projects, everyone interested in the project can find and fix its problems together, which improves its reliability to some extent.
3. Are dependency libraries trustworthy?
When we develop projects, we usually use a large number of dependency libraries. We can search for a library directly in an official source (such as npm) and then use a package manager to install it automatically with a single command or a configuration file, which is very convenient.
However, are the libraries published to the official repository trustworthy?
Leaving aside the fact that almost any developer can publish a library to the official source, even a dependency library from a big Internet company may not be trustworthy.
The one that impressed me most is Alibaba's JSON serialization library fastjson. It is known to almost everyone and widely praised for its extremely fast parsing speed. However, this library has repeatedly been found to have high-risk vulnerabilities that allow attackers to execute commands remotely! The average developer would never discover this, and it has done great harm to projects.
Therefore, when selecting dependency libraries, we should do our research, make sure the libraries are as secure as possible, and make sure they do not conflict with existing dependencies.
4. Are programming languages trustworthy?
"Java is a strongly typed language and it is robust." Anyone who has learned Java will be all too familiar with this sentence. However, can a strongly typed programming language be trusted?
Some readers may object here: if the most basic, low-level programming languages we use all the time have bugs, how can we trust the frameworks built on them?
But the truth is that all programming languages have bugs! Almost every new release of a programming language fixes some historical bugs. Java even has a dedicated database for recording bugs!
Java Bug Database
However, I believe that most developers, even if their program accidentally triggers a bug in the programming language itself, lack the confidence to question it and simply modify their code to work around it.
Indeed, questioning a programming language requires solid fundamentals and knowledge. But once you find a puzzling problem in your program, I suggest not ignoring it. Spend some time exploring; you may discover a significant bug, and it can also deepen your understanding of the language.
5. Are servers trustworthy?
The server is the host of the project, and its performance and stability directly affect how the project runs.
Both individual developers and companies usually rent cloud servers from large providers to deploy their projects, which saves the trouble of building and maintaining servers themselves.
But are the cloud servers of big companies trustworthy?
Not entirely! Today's cloud providers promise that the SLA (service level agreement) of their services can reach five nines (99.999%, about 5 minutes of downtime per year) or even six nines (99.9999%, about 30 seconds of downtime per year), but risks remain.
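The downtime figures behind those "nines" follow from simple arithmetic. Here is a minimal sketch, assuming a 365-day year; the helper name yearly_downtime_seconds is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def yearly_downtime_seconds(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (5 -> 99.999%)."""
    unavailable_fraction = 10 ** -nines
    return SECONDS_PER_YEAR * unavailable_fraction


for n in (5, 6):
    seconds = yearly_downtime_seconds(n)
    print(f"{n} nines: {seconds:.1f} s/year ({seconds / 60:.2f} min/year)")
```

Under that assumption, five nines allows a little over 5 minutes per year and six nines a little over 30 seconds, which matches the rounded figures usually quoted.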
There is a famous case: in 2013, China's largest social messaging app suffered a massive outage that affected hundreds of millions of users. The cause was that municipal road construction cut a network cable, making the servers hosting the software unreachable.
Besides unreliable availability, there may also be security and privacy issues. Cloud providers usually do not access users' data, of course, but there is no way to trust them absolutely. Data privacy is crucial to a business, which is why large companies build their own server rooms and networks.
6. Are databases trustworthy?
Most business data in a company is stored in databases, and the project's back-end program operates on and queries that data.
As with servers, we can build our own database with software such as MySQL, or rent a cloud database from a large company directly. Is the database trustworthy?
In fact, in enterprise back-end projects the database is usually the performance bottleneck and is relatively fragile. As concurrent access grows, query performance drops, and in severe cases the whole system may go down! Even cloud database services from big companies may be helpless when they encounter slow queries (queries that take a long time).
The data in the database may not be reliable either. Sometimes an administrator's mistake, such as accidentally deleting data or inserting a wrong record, can affect users and cause losses. Worse still, someone may delete the database and run away, with no sense of honor at all!
Deleting the database and running away
Therefore, do not trust the database too much. Use techniques such as caching to share its load, and back it up regularly. Otherwise, once the database goes down or data is lost, the loss is immeasurable!
7. Are cache services trustworthy?
Caching is an essential technique for building high-performance programs. By storing data that is slow to query, such as database data, in memory and reading it directly from memory, it improves query performance. With caching, the project can support more people querying data at the same time, and the database is protected as well.
Mainstream caching technologies currently include Memcached and others. You can run them on your own servers, or rent cloud caching services from big companies directly.
A cache storing key-value pairs
So are cache services trustworthy?
If the project's concurrency is not especially high, ordinary caching is enough to support it. But if the project is large in scale, the cache may not withstand the pressure and, in serious cases, goes down. Once the cache crashes, a flood of queries hits the database directly, the database goes down in an instant, and in severe cases the whole project is paralyzed!
Therefore, when using a cache, you need to estimate the concurrency and ensure high availability through clustering and data synchronization. You also need to guard against cache avalanche, cache penetration, and cache breakdown, briefly explained below.
Cache avalanche: a large number of cache entries expire at the same time, so requests miss the cache and all land on the database, bringing it down.
Cache penetration: requests keep accessing keys that do not exist in the cache, so they go straight to the database and bring it down.
Cache breakdown: a hot key with a high request frequency suddenly expires, so all its requests hit the database at once and bring it down.
If you do not guard against these three problems, even a cache service rented from a big company will blow up all the same.
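The three problems above each have a commonly used countermeasure: random expiry jitter for avalanche, briefly caching "not found" results for penetration, and a per-key lock for breakdown. The sketch below illustrates them with a toy in-memory cache. The class GuardedCache, its parameters, and the load_from_db callback are all hypothetical names for illustration, not the API of any real cache service.

```python
import random
import threading
import time

_MISSING = object()  # sentinel cached for keys the database does not have


class GuardedCache:
    """Toy in-memory cache; a stand-in for a real cache service."""

    def __init__(self, load_from_db, ttl=60.0, jitter=10.0, missing_ttl=5.0):
        self._load = load_from_db        # hypothetical loader: key -> value or None
        self._ttl = ttl
        self._jitter = jitter
        self._missing_ttl = missing_ttl
        self._data = {}                  # key -> (value, expires_at)
        self._locks = {}                 # key -> lock guarding the reload of that key
        self._locks_guard = threading.Lock()

    def _fresh(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        return None

    def get(self, key):
        entry = self._fresh(key)
        if entry is None:
            with self._locks_guard:
                lock = self._locks.setdefault(key, threading.Lock())
            # Breakdown: only one thread reloads an expired hot key; others wait.
            with lock:
                entry = self._fresh(key)  # re-check: another thread may have reloaded
                if entry is None:
                    value = self._load(key)
                    if value is None:
                        # Penetration: remember the miss briefly instead of re-querying.
                        entry = (_MISSING, time.monotonic() + self._missing_ttl)
                    else:
                        # Avalanche: random jitter spreads expirations over time.
                        ttl = self._ttl + random.uniform(0, self._jitter)
                        entry = (value, time.monotonic() + ttl)
                    self._data[key] = entry
        return None if entry[0] is _MISSING else entry[0]
```

This sketch never evicts entries or locks, so it is only meant to show the three ideas side by side; a production setup would apply the same ideas on top of a real cache cluster.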
8. Is object storage trustworthy?
Projects often let users upload pictures or files. Such data is usually large and inconvenient to store in a database. Although we could save the files directly on the server, it is better to use a dedicated object storage service.
You can simply think of object storage as a big folder through which we upload and download files. Large cloud providers offer professional object storage services, so you do not have to build one yourself. Is object storage trustworthy?
In general, files uploaded to object storage are not lost, and the uploaded data can also be synchronized across regions as a backup.
Cross-region synchronization
However, I remember one time when a file uploaded to object storage was inconsistent with the source file: it was 1 MB smaller. At first I thought the file had been compressed automatically during upload, but after downloading it from object storage I found it still did not match the source file! Although the probability of this is extremely small, from that moment on I stopped trusting object storage.
Let me also share a real experience with cross-region synchronization of object storage. The business I was responsible for was fairly important; if a single data center went down, the loss could be hundreds of thousands of dollars per minute! So I configured automatic cross-region synchronization for the object store: files are uploaded to the Guangzhou data center first and then automatically synchronized to the Shanghai data center, and the operations team promised that the synchronization delay would not exceed 15 minutes.
I believe most developers stop worrying once data synchronization is configured and assume it will just happen. But when I later wrote a program to monitor the synchronization and compare the data, I found that data was often not synchronized, in up to 10% of cases!
Therefore, you cannot trust object storage completely. Most of the time the object storage services of large companies are reliable, but there is no guarantee. With synchronized backups especially, how many people check whether the synchronization actually succeeded? Write a program to verify it as a safeguard.
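One simple way to verify an upload or a synchronized copy is to compare checksums of the source file and the copy fetched back from storage. The sketch below assumes both files are available locally; how you download the copy depends on your storage provider and is not shown. The function names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_path, downloaded_path):
    """Return True only if the copy fetched back from storage matches the source."""
    return sha256_of(source_path) == sha256_of(downloaded_path)
```

A monitoring job could run such a comparison for each region after the promised synchronization delay and raise an alert on any mismatch.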
9. Are API interfaces trustworthy?
In development, we often call APIs provided by other systems to implement a function easily. For example, to look up the weather somewhere, you can call a weather query API provided by someone else instead of writing it yourself. We can also provide APIs for others to use; in a microservice architecture especially, services interact and cooperate through interface calls.
Almost every API provider claims that its interfaces are secure and can be used without worry. So are APIs really trustworthy?
In fact, APIs are the least trustworthy resource!
First, an API provider can be any developer, and it is hard to judge the stability and security of the interface from their word alone.
Even if the interface performs well and is secure, you do not know how many others are using it at the same time. Maybe it is just you, or maybe 1 million other developers? Under such contention, can the interface's QPS (queries per second) meet our expectations? Will its responses really never time out?
Worse still, the provider may quietly change the API without notifying callers, so that every caller fails and the project's operation is seriously affected!
Therefore, when calling a third-party API, be careful, careful, and careful again!
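Being careful can be made concrete: give every third-party call a time limit, and check the shape of what comes back before using it. The sketch below shows one possible way to do that. The names call_defensively and UntrustedResponse are hypothetical, and api_call stands for any zero-argument function that wraps the real request.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as CallTimeout


class UntrustedResponse(Exception):
    """Raised when a third-party call is too slow or returns an unexpected shape."""


def call_defensively(api_call, required_fields, timeout_s=1.0):
    """Run api_call() with a time limit and validate the dict it returns."""
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(api_call)
        try:
            result = future.result(timeout=timeout_s)
        except CallTimeout:
            raise UntrustedResponse(f"no response within {timeout_s}s") from None
    finally:
        executor.shutdown(wait=False)  # do not block on a call that is still running

    if not isinstance(result, dict):
        raise UntrustedResponse(f"expected a dict, got {type(result).__name__}")
    for name, expected_type in required_fields.items():
        if not isinstance(result.get(name), expected_type):
            raise UntrustedResponse(
                f"field {name!r} is missing or not {expected_type.__name__}"
            )
    return result
```

Note that the timed-out call keeps running in its worker thread; in practice you would also set the timeout option of your HTTP client so the request itself is abandoned.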
In addition, if we are the API provider, we must also protect our own API and avoid it being called by too many developers at once and going down.
Complex call relationships between APIs
10. Is Serverless trustworthy?
If servers cannot be trusted, why not skip renting servers and rent a Serverless service from a large company as the project's back end?
Serverless is a server-free architecture. It does not mean that no server is needed; rather, deploying the project's interfaces and operating the servers are handled by the service provider, so developers need not care about servers and can concentrate on writing code.
It sounds great. So is Serverless trustworthy?
Although Serverless can greatly improve development and operations efficiency, it is even less trustworthy than servers and other resources!
First, Serverless itself is deployed on servers, so it is inevitably affected by them.
Second, Serverless services do not keep the application's state for long. They start when a request arrives, so there is a cold start period. Although there are many optimizations and solutions, interface performance cannot be guaranteed precisely, and under high concurrency in particular it often falls short of expectations.
Most important of all, when you choose a Serverless service you are tied to a cloud provider, and migrating later is very difficult! Just imagine: every function of your project is maintained by someone else. Is that really a good thing? Once the provider changes its architecture or interfaces, your code must change with it, and that change is out of your control!
Of course, Serverless has many advantages and is an inevitable trend in cloud computing. I only hope that before using it you consider the possible risks and take measures to handle them.
The cloud computing era
Summary: precisely because we place so much trust in big-name, seemingly safe resources, the dangers behind them are harder to detect and the consequences are often more lethal!
"Be prepared for danger in times of safety, and stay wary of serious consequences that may arise at any moment."
In software development, although the project appears to be working properly, risk is everywhere, so we need to learn defensive programming. Think of yourself as a relentless nitpicker: trust no one, look for the risks in the program, and defend actively.
Below are 16 defensive programming approaches. After learning them, you can greatly reduce the risk in your programs.
1. Programming habits
To reduce the risk in a program, first develop good programming habits.
First, keep a good mindset when writing code; do not write it in a hurry or just to get the task done. If your only goal is to fulfill the requirement, you will probably not notice the risks in the code, or notice them and be unwilling to fix them. This does save development time, but when problems appear later you will spend even more time investigating, communicating, and fixing bugs. Haste makes waste, and the result is the opposite of what you wanted.
When writing code, if you use the same complex variable name or string in several places, do not type it by hand; use your favorite "copy and paste" to avoid bugs caused by typos.
Copy and paste all the way
In addition, we should check return values more carefully in code, choose safe syntax and data structures, and avoid obsolete syntax. Different programming languages have different best practices. In Java, for example, you should check every variable that may be null to prevent an NPE (NullPointerException, a null pointer exception), and when developing multithreaded programs choose the thread-safe ConcurrentHashMap instead of HashMap, and so on. You can also use assert (assertions) to ensure that variable values are as expected while the program runs.
I recommend writing code in an editor with checking features, which automatically detects errors as you write and can suggest good coding style, greatly reducing development risk. Also, before committing code, check it several times, especially copied and pasted files, where required changes are often missed. After committing, you can ask experienced colleagues to read and check the code (code review) to further ensure there are no syntax or logic errors.
Editor syntax checking and hints
2. Exception handling
How a program runs is ever-changing: the same piece of code may produce different results in different situations, or even throw an exception. That is why many mainstream programming languages have exception handling mechanisms. In Java, for example, you wrap the code in try to capture exceptions, use catch to handle them, and finally use finally to release resources and clean up.
When programming, we should make sensible use of exception handling to defend against possible problems in the code. In exception handlers we usually log the error, report it and raise alerts, retry, and so on.
For example, since we do not trust the database, we add exception handling when querying and modifying data. Once a database glitch makes an operation fail, we record the failure in the log and notify developers through alerts such as email and SMS, so that the problem is discovered and investigated right away. If necessary, automatic retries can be implemented to save some manual work.
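The log, alert, and retry pattern just described might look roughly like this in Python, whose try / except / finally plays the same role as Java's try / catch / finally. This is a sketch under assumptions: run_with_retry and alert_developers are hypothetical helpers, and the alert here only writes a log line where a real one would send email or SMS.

```python
import logging
import time

logger = logging.getLogger("db")


def alert_developers(message):
    """Hypothetical alert hook; a real one might send email or SMS."""
    logger.critical("ALERT: %s", message)


def run_with_retry(operation, attempts=3, delay_s=0.1):
    """Call operation(); log each failure, retry, and alert if every attempt fails."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:  # in real code, catch only errors worth retrying
            last_error = exc
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                time.sleep(delay_s)
        finally:
            # Runs after every attempt, successful or not; release resources here.
            logger.debug("attempt %d finished", attempt)
    alert_developers(f"operation failed after {attempts} attempts: {last_error}")
    raise last_error
```

Retrying is only safe for operations that can be repeated without side effects, so a write that may have partially succeeded needs more care than a read.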
3. Request validation
No request can be trusted. Even on an intranet, a wrong request may be sent because of some mistake.
So for every interface we write, be sure to validate the request parameters before executing the business logic. Here are a few common checks:
Parameter type check: for example, the request parameter should be of the Integer type rather than the Long type.
Value validity check: for example, an integer must be greater than or equal to 0, a string must be longer than 5 characters, or a value must match a particular format such as a mobile phone number or ID card number.
User permission check: many interfaces may only be called by logged-in users or administrators, so the identity of the current user must be determined from the request parameters (request headers). An ordinary user downloading paid VIP movies is certainly unacceptable!
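The three kinds of checks above can be sketched as a single validation function that runs before any business logic. Everything specific here is an assumption for illustration: the request fields (movie_id, phone, vip_only), the user dictionary, and the phone format. Python has no Integer/Long distinction, so the type check simply requires an int.

```python
import re

PHONE_PATTERN = re.compile(r"^1\d{10}$")  # assumed format: 11 digits starting with 1


class InvalidRequest(Exception):
    """Raised when a request fails validation."""


def validate_download_request(params, user):
    """Check a hypothetical 'download movie' request before business logic runs."""
    movie_id = params.get("movie_id")
    # Type check (bool is a subclass of int in Python, so reject it explicitly).
    if not isinstance(movie_id, int) or isinstance(movie_id, bool):
        raise InvalidRequest("movie_id must be an integer")
    # Value validity check.
    if movie_id < 0:
        raise InvalidRequest("movie_id must be >= 0")
    # Format check.
    phone = params.get("phone")
    if not isinstance(phone, str) or not PHONE_PATTERN.match(phone):
        raise InvalidRequest("phone has an invalid format")
    # Permission check.
    if user is None:
        raise InvalidRequest("login required")
    if params.get("vip_only") and not user.get("is_vip"):
        raise InvalidRequest("VIP membership required")
    return movie_id
```

In a real service the user would come from a verified session or token in the request headers, never from a value the client can set freely.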
4. Flow control
As mentioned above, no request can be trusted: not only the values in a request, but also the volume and frequency of requests. For every interface, limit how often it can be called to prevent it from being overwhelmed by a burst of requests. For paid interfaces, also prevent a user's number of calls from exceeding the quota originally purchased.
There is also a situation that is easily overlooked. If your interface A calls someone else's interface B, your interface A's own logic may withstand 1,000 requests per second, but are you sure interface B can bear it?
Therefore, flow control is needed not only to keep the interface from being overwhelmed, but also to protect the internal services it calls.
What, you say your interface is great, can withstand 1 million requests, and calls no other services? Then I will find 1 million + 1 people to request your interface at the same time and see whether you are scared!
DDoS: distributed denial-of-service attack
Common flow control can be divided by granularity:
User flow control: limit the number of calls each user makes to an interface within a certain period of time.
Interface flow control: limit the total number of calls to an interface within a certain period of time.
Single-machine flow control: limit the total number of calls to all interfaces of the project on a single server within a certain period of time.
Distributed flow control: limit the total number of requests across all servers of the project within a certain period of time.
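As an illustration of the finest granularity, user flow control, here is a minimal token bucket kept per user. It is a hand-written sketch, not the API of any existing rate limiting library; the class names and parameters are made up, it is not thread-safe, and it only works within a single process.

```python
import time


class TokenBucket:
    """Refills at rate_per_s tokens per second, up to capacity; one token per call."""

    def __init__(self, rate_per_s, capacity, clock=time.monotonic):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.updated = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerUserLimiter:
    """User flow control: every user id gets its own bucket."""

    def __init__(self, rate_per_s, capacity, clock=time.monotonic):
        self._rate = rate_per_s
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, user_id):
        if user_id not in self._buckets:
            self._buckets[user_id] = TokenBucket(self._rate, self._capacity, self._clock)
        return self._buckets[user_id].allow()
```

Keying the buckets by interface name instead of user id would give interface flow control; distributed flow control needs shared state across servers, which this sketch does not attempt.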
Of course, beyond the approaches above, flow control can be very flexible, and there are many excellent rate-limiting tools, such as RateLimiter (token-bucket rate limiting on a single machine) and Alibaba's Sentinel (a distributed rate-limiting framework).
Sentinel flow control dashboard
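To make user-level flow control concrete, here is a minimal single-process token-bucket sketch in Python. It is not the API of RateLimiter or Sentinel; the class names, the `rate` and `capacity` parameters, and the injectable `clock` are all assumptions made for illustration, and the code is not thread-safe.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at a fixed rate, up to a maximum capacity."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable so the sketch can be tested
        self.tokens = capacity
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill according to the time elapsed since the previous call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class UserRateLimiter:
    """User-level flow control: one bucket per user ID (hypothetical helper)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, user_id):
        bucket = self.buckets.get(user_id)
        if bucket is None:
            bucket = self.buckets[user_id] = TokenBucket(self.rate, self.capacity, self.clock)
        return bucket.allow()
```

A request handler would call `limiter.allow(user_id)` first and reject the request when it returns False. Interface-level control would use one shared bucket per interface instead of one per user, and distributed control needs the counters in shared storage rather than in process memory.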
5. Rollback
Sometimes an operation on the project goes wrong, whether it was performed by a person or by a machine, and causes an online failure. At that point, you can choose to roll back.
Rolling back means undoing an operation and restoring the project to its previous state. Here are some common rollback operations.
Sometimes we want to insert data in bulk, but halfway through the program throws an exception. We then need to roll back the data that was already inserted, as if nothing had happened; otherwise there is a risk of data inconsistency.
The most common approach is to wrap batch database operations in a transaction and, when an exception occurs, call the database client's rollback method.
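The transactional batch insert described above can be sketched with Python's built-in sqlite3 module. The `orders` table, its columns, and the `insert_orders` helper are made up for this example; the `CHECK` constraint simply gives the batch a reason to fail halfway.

```python
import sqlite3


def insert_orders(conn, orders):
    """Insert all orders or none of them."""
    try:
        # As a context manager, the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", orders)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "amount INTEGER NOT NULL CHECK (amount >= 0))"
)
conn.commit()

ok = insert_orders(conn, [(1, 100), (2, 250)])
# The second row below has a negative amount, so the whole batch is undone,
# including the first row that had already been inserted.
failed = insert_orders(conn, [(3, 80), (4, -5), (5, 60)])
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After both calls the table holds only the two rows from the first batch; nothing from the failed batch is left behind.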
If the project's configuration, such as the database connection address, is hard-coded, then once the configuration is wrong or the address changes you have to modify the code, which is very troublesome.
A better way is to publish the configuration to a configuration center and let the project read it dynamically. If you accidentally publish a wrong configuration, you can roll it back directly in the configuration center and restore the previous one.
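As a rough illustration of why a configuration center makes rollback cheap, here is a toy in-memory stand-in that keeps every published version. `ConfigCenter`, its methods, and the database URLs are hypothetical; a real configuration center is a separate service with its own client library.

```python
class ConfigCenter:
    """Toy in-memory stand-in for a configuration center with version history."""

    def __init__(self, initial):
        self._history = [dict(initial)]

    def publish(self, changes):
        # Each publish creates a new version on top of the current one.
        self._history.append({**self._history[-1], **changes})

    def rollback(self):
        # Drop the latest version, but never the very first one.
        if len(self._history) > 1:
            self._history.pop()

    def get(self, key):
        # Callers read through the center on every access,
        # so a change or a rollback takes effect without a redeploy.
        return self._history[-1][key]


center = ConfigCenter({"db_url": "mysql://db-1:3306/shop"})
center.publish({"db_url": "mysql://db-typo:3306/shop"})  # a bad release
center.rollback()                                        # back to the previous version
```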
No one can guarantee that their code is correct. Often a project passes validation in the test environment without any problems, but once it goes online it is full of holes. This shows that there is something wrong with the newly released code.
At this point, the simplest way is to roll back the version: repackage and release the code that worked before. Large companies generally have their own release platform, where you can roll back through its interface and automatically republish a previous build of the project.
6. Multi-level cache
As mentioned above, caching is very important for a project: it is not just a tool for improving performance, it is also the database's umbrella.
But what if the cache crashes?
There are two options. The first is to run the cache as a cluster, to ensure it is highly available.
But nothing can be trusted; clusters may fail too!
So you can use the second option: since the first-level cache can go down, add a second-level cache as well!
Usually, in high-concurrency projects, we design a multi-level cache: distributed cache + local cache. When a request needs data, it first queries the distributed cache (such as Redis); if the whole distributed cache goes down, it gets the data from the local cache instead. This way, even if the cache crashes, the system can hold out for a while.
This may differ from some other multi-level cache designs. Sometimes the local cache serves as the first-level cache and holds hot data, and the distributed cache is consulted only when the local cache misses. The main goal of that design is to reduce the number of requests to the distributed cache and further improve performance, which is a different purpose from the design above.
Multi-level cache design
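The "distributed cache first, local cache as a fallback" design can be sketched as follows. `FakeDistributedCache` stands in for a Redis-like client and `CacheUnavailable` for whatever error that client raises when it is unreachable; both are assumptions made for this example.

```python
class CacheUnavailable(Exception):
    """Raised by the (hypothetical) distributed cache client when it is unreachable."""


class FakeDistributedCache:
    """Stand-in for a Redis-like client; setting `down` simulates an outage."""

    def __init__(self):
        self.data = {}
        self.down = False

    def get(self, key):
        if self.down:
            raise CacheUnavailable(key)
        return self.data.get(key)


class MultiLevelCache:
    def __init__(self, remote):
        self.remote = remote
        self.local = {}  # in-process fallback copy

    def get(self, key):
        try:
            value = self.remote.get(key)
        except CacheUnavailable:
            # Distributed cache is down: serve the last value seen locally.
            return self.local.get(key)
        if value is not None:
            self.local[key] = value  # keep the fallback copy fresh
        return value


remote = FakeDistributedCache()
remote.data["price:42"] = 1999
cache = MultiLevelCache(remote)
first = cache.get("price:42")          # served by the distributed cache
remote.down = True
during_outage = cache.get("price:42")  # served by the local copy
```

The local dictionary here is unbounded and may serve stale values; a real local cache would need a size limit and an expiry time.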
7. Circuit breaking and service degradation
Every Double Eleven, we watch the flash-sale page right on time, only to be greeted by that one line: “Please try again later!”
Our projects are far more fragile than we think, and many services run into problems for all sorts of reasons. For example, during a promotion, a large number of users visiting at the same time sends far more requests to the project's services; if the project cannot withstand the pressure, it goes down.
To guard against this risk, we can use a service degradation strategy: if the system really cannot serve all users, settle for second best and return a “friendly” message or page directly to the user, instead of letting the project be crushed by the load.
Combined with circuit-breaker technology, degradation can be switched on and off dynamically based on system load and other metrics. For example, when the machine's CPU is maxed out, turn degradation on and return an error directly; when the CPU returns to normal, go back to returning data and performing operations normally.
Hystrix is a well-known circuit-breaking and degradation framework for microservices.
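Here is a small sketch of load-based degradation in Python. It is not Hystrix's API: `DegradationSwitch`, `handle_request`, and the `read_cpu` callable are hypothetical, and using two thresholds (so the switch does not flap around a single value) is a design choice of this example rather than something the text prescribes.

```python
class DegradationSwitch:
    """Turns degradation on when load crosses a threshold, off when it recovers."""

    def __init__(self, read_cpu, high=0.9, low=0.7):
        self.read_cpu = read_cpu  # callable returning CPU usage between 0 and 1
        self.high = high          # switch degradation on at or above this value
        self.low = low            # switch it off again at or below this value
        self.degraded = False

    def update(self):
        cpu = self.read_cpu()
        if cpu >= self.high:
            self.degraded = True
        elif cpu <= self.low:
            self.degraded = False
        # Between the two thresholds the previous state is kept.
        return self.degraded


def handle_request(switch, do_work):
    if switch.update():
        # Friendly fallback instead of doing the real work under pressure.
        return "Please try again later!"
    return do_work()
```

In a real service the CPU reading would come from the host's metrics, and the check would usually run on a timer rather than on every request.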
8. Active detection
As mentioned above, even a synchronization service from a big company may sync late or even lose data. So, to further ensure that synchronization succeeds and the data is accurate, we can detect problems actively.
For example, write a scheduled script or task that checks at regular intervals whether the data at the source and the target are consistent, or verifies the data with some other logic. Of course, you can also run the check immediately after each synchronization, to be extra safe.
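The core of such a scheduled check is a comparison between the two sides. A minimal sketch, assuming both source and target can be loaded as `{id: record}` dictionaries (the function name and report layout are made up for this example):

```python
def find_inconsistencies(source, target):
    """Compare two {id: record} snapshots and report what differs."""
    missing = sorted(k for k in source if k not in target)  # never synced
    extra = sorted(k for k in target if k not in source)    # should not exist
    mismatched = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


source = {1: "alice", 2: "bob", 3: "carol"}
target = {1: "alice", 2: "bobby", 4: "dave"}
report = find_inconsistencies(source, target)
```

For large tables, comparing row counts or per-batch checksums first is cheaper than loading every record into memory.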
9. Data compensation
When a data inconsistency is detected, we need to compensate, for example by re-syncing the data that was missed or updating the inconsistent records.
Besides fixing inconsistencies found by active detection, data compensation is also widely used in business design and architecture design.
For example, when a call to a query interface fails, pause for a while and then retry automatically, or get the data from somewhere else. Another example: when a message queue producer fails to send a message, it should automatically resend it and log the failure, rather than simply dropping the message.
The idea behind data compensation is to guarantee eventual consistency. Data errors are not terrible, as long as they get corrected. This idea is also widely used in distributed transaction scenarios.
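The "pause, retry, then fall back to another source" pattern mentioned above can be sketched like this. `call_with_retry` and its parameters are hypothetical names, and `attempts` is assumed to be at least 1.

```python
import time


def call_with_retry(primary, fallback=None, attempts=3, delay=0.5, sleep=time.sleep):
    """Try `primary` up to `attempts` times, pausing in between; then try `fallback`."""
    last_error = None
    for attempt in range(attempts):
        try:
            return primary()
        except Exception as exc:  # a real system would catch narrower error types
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)  # pause for a while before the next attempt
    if fallback is not None:
        return fallback()     # get the data from somewhere else
    raise last_error
```

Retries are only safe when the operation can be repeated without side effects; for writes such as resending a message, the receiver should be able to recognize and ignore duplicates.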
10. Data backup
Data is the lifeblood of an enterprise, so we must keep it as safe and complete as possible.
Many students keep their important documents in several places, such as their own computer and online storage. Likewise, in software development we should copy important data and keep the copies in different places. That way, even if one server goes down, the data can still be read from other servers, which reduces the risk.
Data backup
11. Heartbeat detection
Interfaces are complicated and changeable things. If our project relies on other interfaces for its functionality, we had better make sure those interfaces stay alive; otherwise the project itself may be affected.
For instance, when paying through a bank, we must call the interface provided by the bank to get the card's balance. If this interface goes down, we can't get the balance, users can't pay, and that is lost income!
Therefore, we need to stay in touch with important interfaces at all times, so that they don't die unnoticed. You can use a heartbeat mechanism: call the interface regularly, or send it a heartbeat packet, to determine whether it is still alive. Once a call times out or fails, it can be investigated and handled immediately, which greatly shortens the impact of the incident.
Heartbeat detection
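A heartbeat check boils down to probing the dependency on a schedule and counting consecutive failures. In the sketch below, `HeartbeatMonitor`, the `probe` callable, and the `max_failures` threshold are all assumptions made for illustration; the scheduling itself (for example a timer that calls `beat()` every few seconds) is left out.

```python
class HeartbeatMonitor:
    """Tracks consecutive failed probes of a dependency and flags it as down."""

    def __init__(self, probe, max_failures=3):
        self.probe = probe                # returns normally if alive, raises otherwise
        self.max_failures = max_failures  # tolerate brief hiccups before declaring it down
        self.failures = 0

    def beat(self):
        """Run one probe; return True while the dependency is considered alive."""
        try:
            self.probe()
            self.failures = 0
        except Exception:
            self.failures += 1
        return self.failures < self.max_failures
```

When `beat()` returns False, the caller can raise an alert or switch to a backup channel. Requiring several consecutive failures avoids reacting to a single slow response.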
12. Redundant design
When evaluating system resources and capacity, we build in some redundancy. For example, if the database currently holds 1 GB of data and you want to synchronize it to another store (such as Elasticsearch), provision at least double the storage, that is 2 GB, to cope with later data growth. The more potential the business has, the larger the redundancy multiple can be, but be careful not to over-provision; after all, resources are expensive!
In fact, redundancy is an important design idea. When designing a business or a system architecture, we can't limit ourselves to current conditions; we have to consider future development and choose an approach that is relatively easy to extend. Otherwise, as the project grows, every change becomes difficult.
13. Elastic scaling
One has to have dreams: maybe our little project, used by only 100 people, suddenly becomes popular, and hundreds of thousands of new users show up.
However, because the project is deployed on a single server, it can't support so many people and simply goes down. The users are very disappointed and never want to use our project again.
The dream is shattered
This is also a common risk. We can use elastic scaling, where the system automatically adds or removes resources according to the project's current usage and resource consumption.
For example, allocate more machines (containers) when system pressure is high, and remove a few when it is low. This not only copes with sudden traffic growth, it also saves costs in quiet periods and spares us the trouble of allocating and adjusting machines manually.
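The scaling decision itself can be very small. The sketch below uses a simple proportional rule, similar in shape to the one documented for Kubernetes' Horizontal Pod Autoscaler: scale the instance count by the ratio of current to target CPU usage. The function name, the 60% target, and the instance limits are assumptions chosen for this example.

```python
import math


def desired_instances(current, cpu_percent, target_percent=60,
                      min_instances=1, max_instances=20):
    """Scale the instance count so average CPU usage moves toward the target."""
    wanted = math.ceil(current * cpu_percent / target_percent)
    # Never scale below the floor or above the budget ceiling.
    return max(min_instances, min(max_instances, wanted))


scale_up = desired_instances(current=4, cpu_percent=90)    # under pressure
scale_down = desired_instances(current=4, cpu_percent=15)  # mostly idle
```

The upper limit matters as much as the rule: it keeps a traffic spike, or a bug that burns CPU, from turning into an unbounded bill.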
14. Multi-site active-active
As mentioned earlier, servers can't be trusted. Never mind a single server going down: natural or man-made disasters can take out an entire data center!
Unlike backup, multi-site active-active means building independent data centers in different cities. Under normal circumstances, users get correct service no matter which site they reach; that is, several “live” sites serve traffic at the same time.
When the business fails at one site, users can reach a healthy site elsewhere and still get correct service.
This way, even if the data center in Guangzhou goes down, we still have Shanghai; if Shanghai goes down, we still have Beijing.
The more sites are live at the same time, the more reliable the system, but also the higher the cost and complexity, so it is almost only big companies that go multi-site. Never let what you invest under normal circumstances exceed the loss a failure would cause!
Ele.me's multi-site active-active implementation (1): general overview
15. Monitoring and alerting
No project runs normally all the time, but we can't stare at a screen 24 hours a day to watch how it is performing, can we? Nor can we ignore the project completely and wait for users to complain once a bug appears.
So the best way is to add monitoring and alerting to the service: when something goes wrong, the information is reported to the monitoring platform and the developers are notified as soon as possible. You can also view the project's running status in real time on the monitoring platform, so that problems can be located more quickly.
Grafana monitoring platform
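As a rough sketch of what an alert rule does, the class below watches the error rate over the most recent requests and calls a notification hook when it crosses a threshold. `ErrorRateAlarm`, the `notify` callback, and the default window and threshold are hypothetical; in practice this logic usually lives in the monitoring platform's alert rules rather than in application code.

```python
from collections import deque


class ErrorRateAlarm:
    """Calls `notify` when the error rate over the last `window` requests exceeds `threshold`."""

    def __init__(self, notify, window=100, threshold=0.2):
        self.notify = notify                 # e.g. sends a chat message or pages someone
        self.threshold = threshold
        self.results = deque(maxlen=window)  # True = error, False = success
        self.firing = False

    def record(self, is_error):
        self.results.append(is_error)
        if len(self.results) < self.results.maxlen:
            return  # not enough samples yet to judge
        rate = sum(self.results) / len(self.results)
        if rate > self.threshold and not self.firing:
            self.firing = True  # alert once per incident, not once per request
            self.notify(f"error rate {rate:.0%} over last {len(self.results)} requests")
        elif rate <= self.threshold:
            self.firing = False
```

Alerting once per incident instead of once per failed request keeps the channel readable, so the notice that matters is not buried.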
16. Online diagnosis and hot fixes
Since nothing in the world of programs can be trusted and danger is everywhere, simply prepare for the worst and assume that the online program will produce bugs.
Since bugs cannot be prevented, stand ready and fix them as quickly as possible when they appear, to reduce the impact.
Usually, fixing a bug means going through modifying the code, committing, merging, packaging and building, and releasing. By the time that process is finished, the system may already be stone cold.
To improve efficiency, we can use online diagnosis and hot-fix techniques. When a bug appears, first use an online diagnostic tool to get the project's running state and code execution information easily, which speeds up the investigation. Once the problem is found, use a hot fix to modify the running code directly, with no need to rebuild and restart the project!
In Java, we can use Arthas, Alibaba's open-source diagnostic tool, which also supports online hot fixes. You can also write your own scripts to do this, but that is a little more complicated.
Having read this far, some of you must be complaining: why does writing a program require thinking about so many problems that have nothing to do with functionality? Code that could have been written in five minutes may now not be finished in an hour!
Actually, not every project needs to be absolutely safe (and of course none can be). The point is to stay mindful of danger in times of peace and make defensive programming a habit.
In practice, weigh the project's scale, audience, architecture, urgency, and other factors to decide how much safety it needs; don't over-design, and don't worry needlessly.
Let's slow down, think calmly before developing, anticipate and avoid risks, and don't let the sword of Damocles fall.