
Deep analysis of cache and database consistency

2020-12-06 07:03:10 Zhu Xiaosi


Source: Back-end Technology

When keeping the database and the cache in sync, should we update the cache or delete it? And should we operate on the database first or the cache first? This article takes an in-depth look at the database/cache double-write problem, for your reference.

The main contents of this article:

  • Data caching

    • Why use caching

    • What kind of data is suitable for caching

    • The advantages and disadvantages of caching

  • How to ensure cache and database consistency

    • Delete the cache instead of updating it

    • Operate on the cache first, or the database first?

    • How to keep database and cache data consistent

  • Cache and database consistency in practice

    • In practice: delete the cache first, then update the database

    • In practice: update the database first, then delete the cache

    • In practice: delayed double delete

    • In practice: cache-delete retry mechanism

    • In practice: read the binlog and delete the cache asynchronously

Data caching

Real business scenarios are full of opportunities to cache data. A product detail page, for example, contains a lot of data that is read concurrently by many users. Such "hot" data shares one characteristic: it is updated rarely but read frequently. Data like this should be cached as much as possible, so that fewer requests hit the database and the load on it drops.

Why use caching

Caching exists in the pursuit of speed. Let's illustrate with code.

I added two stock-query endpoints to my demo code repository, getStockByDB and getStockByCache, which read the stock count of a product from the database and from the cache respectively.

Then we use JMeter to fire concurrent requests. (For how to use JMeter, see my earlier article.)

A disclaimer: my tests are not rigorous. They are only a side-by-side comparison; do not treat them as a reference for real service performance.

Here is the code for the two endpoints:

/**
 * Check stock: query the stock count from the database
 * @param sid product id
 * @return result message
 */
@RequestMapping("/getStockByDB/{sid}")
@ResponseBody
public String getStockByDB(@PathVariable int sid) {
    int count;
    try {
        count = stockService.getStockCountByDB(sid);
    } catch (Exception e) {
        LOGGER.error("Stock query failed: [{}]", e.getMessage());
        return "Stock query failed";
    }
    LOGGER.info("Product id: [{}], remaining stock: [{}]", sid, count);
    return String.format("Product id: %d, remaining stock: %d", sid, count);
}

/**
 * Check stock: query the stock count through the cache
 * Cache hit: return the cached stock count
 * Cache miss: query the database, write the result to the cache, and return it
 * @param sid product id
 * @return result message
 */
@RequestMapping("/getStockByCache/{sid}")
@ResponseBody
public String getStockByCache(@PathVariable int sid) {
    Integer count;
    try {
        count = stockService.getStockCountByCache(sid);
        if (count == null) {
            count = stockService.getStockCountByDB(sid);
            LOGGER.info("Cache miss: querying the database and writing to the cache");
            stockService.setStockCountToCache(sid, count);
        }
    } catch (Exception e) {
        LOGGER.error("Stock query failed: [{}]", e.getMessage());
        return "Stock query failed";
    }
    LOGGER.info("Product id: [{}], remaining stock: [{}]", sid, count);
    return String.format("Product id: %d, remaining stock: %d", sid, count);
}

First I ran JMeter with 10,000 concurrent requests. The result: a flood of errors right away, with 98% of the 10,000 requests failing. Very disturbing.

Opening the log, the error looked like this:

Spring Boot's embedded Tomcat caps the number of concurrent worker threads at 200 by default, and a single instance clearly cannot stretch that to 10,000 concurrent requests. You can of course raise the limits, though a small machine may still not cope.

After changing the configuration as follows, my little machine served all 10,000 concurrent requests through the cache with a 100% success rate:

server.tomcat.max-threads=10000
server.tomcat.max-connections=10000

Without the cache, throughput was 668 requests per second.

With the cache, throughput was 2,177 requests per second.

Even in this very loose comparison, caching improved single-machine performance more than threefold. With more machines and higher concurrency, the database comes under even more pressure, so the performance advantage of caching should be even more pronounced.

After this little experiment I worried about the small Tencent Cloud server hosting my MySQL instance; traffic bursts like this might even be flagged as attack traffic.

The server is a Tencent Cloud 1C4G2M instance bought during a promotion, very cheap. Free ad here; Tencent Cloud, please contact me about payment ;)

What kind of data is suitable for caching

Data that is read heavily but changes rarely is a good fit: product details, comments, and so on. Data that changes constantly is a poor fit; caching it increases system complexity (cache updates, dirty cache data) on the one hand, and adds instability (another system to maintain) on the other.

In some extreme cases, however, you do need to cache data that changes, for example when a page must show a quasi-real-time stock count, or in other special business scenarios. Then you have to make sure the cache does not (permanently) hold dirty data, which needs further discussion.

The advantages and disadvantages of caching

To cache or not to cache is really a trade-off.

The advantages of caching:

  • Shorter service response times, and therefore a better user experience.

  • Higher system throughput, which again improves the user experience.

  • Less pressure on the database, preventing it from collapsing at peak times and taking the whole online service down with it.

Caching also introduces plenty of extra problems:

  • There are many kinds of cache: in-process memory, Memcached, Redis. If you are not familiar with the one you pick, it undoubtedly adds maintenance burden compared with a pure database system.

  • The cache layer may itself need to be distributed; a distributed Redis setup, for example, has plenty of pitfalls, which adds system complexity.

  • In special situations where cache accuracy matters a great deal, you have to think about cache and database consistency.

The focus of this article is that last point: consistency between the cache and the database. Read on.

How to ensure cache and database consistency

So much for why we cache. Is using a cache therefore a simple matter? I used to think so, until I hit scenarios that demand strong consistency between the cache and the database. Keeping database data and cached data consistent turns out to be a deep subject.

From early hardware caches to OS page caches, caching has been a discipline of its own, and the industry has debated this question for a long time without settling it. After reading a lot of material, my conclusion is that it really is a trade-off, and one worth talking through.

The discussion below presents several viewpoints, and I will write code to validate each one as we go.

Delete the cache instead of updating it

Most people hold that on a write you should not update the cache but delete it; the next read then misses the cache, falls through to the database, and writes the fresh value back into the cache.

Viewpoint source: "Distributed database and cache double-write consistency scheme analysis" by the author known as Lonely Smoke.

Reason one: thread safety

Suppose requests A and B both perform an update concurrently. Then the following can happen:

(1) Thread A updates the database

(2) Thread B updates the database

(3) Thread B updates the cache

(4) Thread A updates the cache

Request A's cache update should have landed before request B's, but because of network timing and the like, B updated the cache before A. The cache now holds A's older value while the database holds B's newer one: dirty data. So this approach is out.

Reason two: business scenarios

Two considerations:

(1) If your workload writes often but reads rarely, updating the cache on every write means the cache is refreshed constantly even though hardly anyone reads it, wasting work.

(2) If the cached value is not the raw database value but the result of a complex computation, then recomputing it after every database write is also wasted work. Clearly, deleting the cache is the better fit.

If the business is very simple, just reading a value from the database and writing it to the cache, then updating the cache on write is fine too. But deleting the cache is a simple operation whose only side effect is one extra cache miss, so it is the recommended general approach.

Operate on the cache first, or the database first?

Now the question: do we delete the cache first and then update the database, or update the database first and then delete the cache?

Let's see what the experts say first.

From "Two or three things about cache architecture design details" in 58 Shen Jian's architecture series:

For an operation that cannot be made transactional, there is always the question of which step to do first. The guiding principle: if the steps can end up inconsistent, do first whichever step's failure hurts the business less.

Suppose we evict the cache first and then write the database: if the eviction succeeds and the database write fails, the cost is only one extra cache miss.

Suppose we write the database first and then evict the cache: if the write succeeds and the eviction fails, the database holds the new data while the cache holds the old, i.e. inconsistency.

Shen Jian's reasoning is sound, but it does not fully account for dirty reads under concurrent requests. Let's turn to Lonely Smoke's "Distributed database and cache double-write consistency scheme analysis":

Delete the cache first, then update the database

This ordering can leave requests seeing inconsistent data.

Suppose request A performs an update and request B performs a query. The following can happen:

(1) Request A starts its write and deletes the cache

(2) Request B queries and finds the cache empty

(3) Request B queries the database and gets the old value

(4) Request B writes the old value into the cache

(5) Request A writes the new value into the database

The result is inconsistency, and if the cache has no expiration policy, the data stays dirty forever.

So "delete the cache first, then update the database" is not a once-and-for-all solution. What about "update the database first, then delete the cache"?

Is "update the database first, then delete the cache" free of concurrency problems?

No. Suppose request A performs a query and request B performs an update. The following can happen:

(1) The cached entry has just expired

(2) Request A queries the database and gets the old value

(3) Request B writes the new value into the database

(4) Request B deletes the cache

(5) Request A writes the old value it read into the cache

If this happens, the cache does end up dirty.

But how likely is it?

It requires a built-in precondition: the database write in step (3) must take less time than the database read in step (2), so that step (4) can precede step (5). Yet database reads are generally much faster than writes (that is one reason read/write splitting exists: reads are faster and consume fewer resources), so step (3) rarely finishes before step (2), and this interleaving is very hard to hit.

So "update the database first, then delete the cache" can still go wrong, but for the reason above, the probability is much lower.

(One caveat: with "update the database first, then delete the cache" and no expiration policy, is there a problem? Since the database update and the cache delete are not atomic, if the process crashes after the database update but before the cache delete, and there is no expiration, the cache stays dirty forever.)

Therefore, if you want basic cache/database double-write consistency without over-engineering, in most cases: update the database first, then delete the cache!
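The recommended write path ("update the database first, then delete the cache") can be sketched as follows; in-memory maps stand in for MySQL and Redis, and error handling is elided:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of "update the database first, then delete the cache".
// In-memory maps stand in for MySQL and Redis.
public class WritePath {
    static final Map<Integer, Integer> db = new ConcurrentHashMap<>();
    static final Map<Integer, Integer> cache = new ConcurrentHashMap<>();

    static void updateStock(int sid, int newCount) {
        db.put(sid, newCount);   // 1. write the database first
        cache.remove(sid);       // 2. then invalidate (delete) the cached entry
        // If step 2 fails, the cache holds stale data until it expires,
        // which is why setting an expiration time is recommended as a safety net.
    }

    public static void main(String[] args) {
        db.put(1, 100);
        cache.put(1, 100);                // cache is warm
        updateStock(1, 99);               // sell one unit
        System.out.println(db.get(1));    // 99
        System.out.println(cache.get(1)); // null: the next read repopulates it
    }
}
```

Note the deliberate ordering: the database write comes first because its failure leaves both stores unchanged, whereas a failed delete only costs staleness until expiry.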

What if the database and cache must be strongly consistent?

What if I absolutely must guarantee consistency? The conclusion first:

There is no way to achieve absolute consistency. This follows from the CAP theorem: caching systems are suited to scenarios that do not require strong consistency, so they fall under AP in CAP.

So we have to let that go, and settle for what BASE theory calls eventual consistency.

Eventual consistency means that all replicas of the data, after a period of synchronization, will reach a consistent state. Its essence is to guarantee that the data ends up consistent, not that it is strongly consistent at every instant.

The experts' solutions for reaching eventual consistency target the dirty-data problems of the two double-write orderings above (delete cache then update database / update database then delete cache), handling each accordingly.

Delayed double delete

Q: When we delete the cache first and then update the database, how do we avoid dirty data?

A: Use a delayed double delete.

As noted above, with "delete the cache first, then update the database" and no expiration policy, dirty data can persist forever.

How does delayed double delete solve this?

(1) Evict the cache

(2) Write the database (these two steps are the same as before)

(3) Sleep for one second, then evict the cache again

This way, any dirty data written into the cache during that one second gets deleted again.

So how is that one second determined? How long should we sleep?

You should measure how long your project's read path takes, then set the write path's sleep to that read latency plus a few hundred milliseconds. The goal is to make sure the read request has finished, so the second delete can remove any dirty cache entry the read request left behind.
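The steps above can be sketched like this; in-memory maps again stand in for the database and Redis, and the 500 ms delay is an assumed read-path latency, not a universal value:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of delayed double delete: delete the cache, write the database,
// sleep past the read path's worst-case latency, then delete the cache again.
public class DelayedDoubleDelete {
    static final Map<Integer, Integer> db = new ConcurrentHashMap<>();
    static final Map<Integer, Integer> cache = new ConcurrentHashMap<>();
    static final long READ_LATENCY_MS = 500; // assumed worst-case read-path time

    static void updateStock(int sid, int newCount) throws InterruptedException {
        cache.remove(sid);             // 1. delete the cache
        db.put(sid, newCount);         // 2. write the database
        Thread.sleep(READ_LATENCY_MS); // 3. wait for in-flight reads to finish
        cache.remove(sid);             // 4. delete again, evicting any stale value
    }

    public static void main(String[] args) throws InterruptedException {
        db.put(1, 100);
        cache.put(1, 100);
        updateStock(1, 99);
        System.out.println(db.get(1));    // 99
        System.out.println(cache.get(1)); // null: any stale entry was evicted
    }
}
```

The obvious drawback is visible in the code: the write request blocks for the full delay, which is what the asynchronous variant below the read/write-splitting discussion addresses.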

What about a MySQL read/write-splitting (primary/replica) architecture?

In that case the inconsistency arises as follows. Again, request A performs an update and request B performs a query:

(1) Request A starts its write and deletes the cache

(2) Request A writes the data to the primary

(3) Request B queries the cache and finds nothing

(4) Request B queries a replica; primary-replica replication has not finished yet, so it reads the old value

(5) Request B writes the old value into the cache

(6) Replication completes and the replica picks up the new value

That is how the data diverges. The fix is still the delayed double delete, except the sleep is now the primary-replica replication lag plus a few hundred milliseconds.

With this synchronous eviction strategy, what about the throughput it costs?

Make the second delete asynchronous: spawn a thread (or task) to do it. The write request then returns without sleeping, and throughput recovers.
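A minimal sketch of the asynchronous variant, using a ScheduledExecutorService so the caller returns immediately; the 500 ms delay is again an assumption to tune against your read latency or replication lag:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Asynchronous double delete: schedule the second delete on a background
// thread so the write request does not block for the delay.
public class AsyncDoubleDelete {
    static final Map<Integer, Integer> db = new ConcurrentHashMap<>();
    static final Map<Integer, Integer> cache = new ConcurrentHashMap<>();
    // Daemon thread so the JVM can exit; in a service this would be a shared pool.
    static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });

    static void updateStock(int sid, int newCount) {
        cache.remove(sid);       // first delete
        db.put(sid, newCount);   // write the database
        // Second delete fires 500 ms later without blocking the caller.
        scheduler.schedule(() -> cache.remove(sid), 500, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        db.put(1, 100);
        cache.put(1, 100);
        updateStock(1, 99);               // returns immediately
        Thread.sleep(800);                // give the scheduled delete time to run
        System.out.println(cache.get(1)); // null
    }
}
```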

With "update the database first, then delete the cache", the chance of dirty data is small but nonzero. The delayed double delete strategy helps there too: after the window in which request A could have written a stale value into the cache, delete the cache once more to make sure the dirty entry is gone.

What if deleting the cache fails: a retry mechanism

It looks as if every problem is solved, but one remains: what if the cache-delete operation itself fails? For example, during a delayed double delete the second delete fails; the dirty data is still there.

The solution is a retry mechanism that keeps trying until the delete succeeds.

Referring to the scheme diagrams from Lonely Smoke:

Scheme 1:

The flow is as follows:

(1) Update the database

(2) The cache delete fails for some reason

(3) Push the key that needs deleting onto a message queue

(4) Consume the message and retrieve the key

(5) Keep retrying the delete until it succeeds
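The retry loop of scheme 1 can be sketched as follows; a BlockingQueue stands in for a real message queue such as RocketMQ or Kafka, and an in-memory map for the cache (in production the delete would be a Redis DEL, and a failed retry would be re-enqueued):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of scheme 1: failed cache deletes are pushed onto a queue and a
// background consumer retries them until they succeed.
public class DeleteRetry {
    static final BlockingQueue<Integer> retryQueue = new LinkedBlockingQueue<>();
    static final ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();

    // Attempt the delete; on failure, hand the key to the retry consumer.
    static void deleteCache(int key) {
        try {
            cache.remove(key); // a real Redis call could throw on network error
        } catch (RuntimeException e) {
            retryQueue.offer(key);
        }
    }

    // Background consumer: retries each queued key; real code would
    // re-enqueue on failure until the delete finally succeeds.
    static void startConsumer() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    int key = retryQueue.take();
                    cache.remove(key);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) {
        startConsumer();
        cache.put(1, 100);
        deleteCache(1);
        System.out.println(cache.get(1)); // null
    }
}
```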

This scheme has a drawback, though: it intrudes heavily into business code. Hence scheme 2: run a subscriber program that tails the database binlog to obtain the data that needs handling, and a separate non-business program that receives that information from the subscriber and performs the cache deletes.

Scheme 2:

The flow is shown in the figure below:

(1) Update the database

(2) The database writes the change to its binlog

(3) A subscriber program extracts the affected data and keys from the binlog

(4) A separate piece of non-business code receives that information

(5) It attempts the cache delete, which fails

(6) The key is sent to a message queue

(7) The key is consumed from the queue and the delete is retried
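In practice the subscriber is often a binlog tool such as Alibaba's Canal. The routing logic on the consuming side might look like the sketch below; the BinlogEvent type and its fields are assumptions standing in for whatever your subscriber delivers, not a real Canal API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shape of the binlog-driven cache invalidator. BinlogEvent is
// an assumed stand-in for the events a subscriber (e.g. a Canal client)
// would deliver; only the routing logic is the point here.
public class BinlogCacheInvalidator {
    record BinlogEvent(String table, String type, int primaryKey) {}

    static final Map<Integer, Integer> cache = new ConcurrentHashMap<>();

    // Called for every row-change event the subscriber emits.
    static void onEvent(BinlogEvent e) {
        // Only UPDATE/DELETE on the stock table should invalidate the cache;
        // on a failed delete, the key would go to a retry queue as in scheme 1.
        if ("stock".equals(e.table()) && !"INSERT".equals(e.type())) {
            cache.remove(e.primaryKey());
        }
    }

    public static void main(String[] args) {
        cache.put(1, 100);
        onEvent(new BinlogEvent("stock", "UPDATE", 1));
        System.out.println(cache.get(1)); // null
    }
}
```

The appeal of this design is that no business code changes: invalidation is driven entirely by the database's own change stream.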

Extended reading

There are four design patterns for updating a cache:

  • Cache aside

  • Read through

  • Write through

  • Write behind caching

Chen Hao's article summarizes them well:

https://coolshell.cn/articles/17416.html

Summary

Quoting the conclusion of Chen Hao's "Cache update routines" as a summary:

In a distributed system you either guarantee consistency through 2PC or Paxos, or you fight to reduce the probability of dirty data under concurrency.

Caching systems suit scenarios without strong-consistency requirements, so they fall under AP in CAP, and under BASE theory.

Heterogeneous data stores cannot be made strongly consistent; you can only shrink the inconsistency window and reach eventual consistency.

And don't forget to set an expiration time; it is the catch-all safety net.

Conclusion

This article has summarized and discussed double-write consistency between caches and databases.

The content can be summed up as follows:

  • For read-heavy, write-light data, use a cache.

  • Keeping the database and cache consistent reduces system throughput.

  • Keeping the database and cache consistent complicates business logic.

  • A cache cannot achieve absolute consistency, but it can achieve eventual consistency.

  • When cached data must match the database, think about how much consistency you actually need, choose the appropriate scheme, and avoid over-design.

Reference resources

  • https://cloud.tencent.com/developer/article/1574827

  • https://www.jianshu.com/p/2936a5c65e6b

  • https://www.cnblogs.com/rjzheng/p/9041659.html

  • https://www.cnblogs.com/codeon/p/8287563.html

  • https://www.jianshu.com/p/0275ecca2438

  • https://www.jianshu.com/p/dc1e5091a0d8

  • https://coolshell.cn/articles/17416.html


Copyright notice
This article was written by Zhu Xiaosi. Please keep the original link when reposting. Thank you.
https://chowdera.com/2020/12/20201206070037506k.html