
How to ensure that messages are not consumed repeatedly? (How to guarantee idempotent message consumption)

2020-11-09 10:50:39 Old man

Interview question

How do you ensure that messages are not consumed repeatedly? In other words, how do you guarantee that message consumption is idempotent?

What the interviewer is thinking

This is a very common question, and the two parts can be asked together. Since you are consuming messages, you have to consider: can a message be consumed more than once? Can duplicate consumption be avoided? If not, how do you keep the system from misbehaving when it happens? These are basic questions in the MQ field. In essence, the interviewer is asking how you guarantee idempotence when using a message queue, which is something your architecture has to account for.

Analyzing the question

When answering, don't draw a blank the moment you hear "duplicate messages". Start by outlining how duplicate consumption can come about.

First, RabbitMQ, RocketMQ, and Kafka can all end up delivering a message more than once; that is normal. Avoiding duplicates is usually not something the MQ guarantees for you; it is up to us as developers. Let's take Kafka as an example and look at how duplicate consumption happens.

Kafka has the concept of an offset: every message written to it gets an offset, which is the message's sequence number. After a consumer has consumed data, it periodically commits the offsets of the messages it has consumed, which says: "I have consumed up to here. If I restart, let me continue from the last committed offset."

But accidents happen. One we ran into often in production: sometimes you restart the system, and it matters how you restart it. If you are in a hurry and simply kill the process and start it again, the consumer may have processed some messages without having committed their offsets. Awkward. After the restart, those few messages are consumed a second time.

Here is an example.

Consider this scenario. Data items 1, 2, and 3 enter Kafka in order, and Kafka assigns each one an offset that serves as its sequence number; assume the offsets are 152, 153, and 154. The consumer reads from Kafka in the same order. Suppose the consumer has just consumed the message at offset=153 and is about to commit the offset to Zookeeper when the consumer process is restarted. The offsets for data items 1 and 2 were never committed, so Kafka does not know that you have already consumed up to offset=153. After the restart, the consumer asks Kafka: "Hey, please keep sending me the data that comes after the last position I consumed." Because the earlier offsets were never committed, data items 1 and 2 are delivered again. If the consumer does not deduplicate at this point, they are consumed twice.

Note: newer versions of Kafka have moved offset storage from Zookeeper to the Kafka brokers, where offsets are kept in the internal topic __consumer_offsets.
If the consumer's job is to take each piece of data and write a row to the database, you may end up inserting data items 1 and 2 into the database twice, and the data is then wrong.
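The restart scenario can be replayed with a toy model. This is only a sketch: the dictionary `log`, the `consume` function, and its `crash_after` parameter are illustrative assumptions and not part of any Kafka client API. It simply shows why a side effect that runs before the offset commit is repeated after a crash.

```python
# Toy model of one Kafka partition and a consumer that commits offsets
# only after finishing a batch. All names here are hypothetical.

log = {152: "data-1", 153: "data-2", 154: "data-3"}  # offset -> message

committed_offset = 152  # where the consumer resumes after a restart
database = []           # stands in for the table the consumer inserts into


def consume(start, crash_after=None):
    """Process messages from `start`; optionally 'crash' before committing."""
    global committed_offset
    for offset in sorted(o for o in log if o >= start):
        database.append(log[offset])  # the side effect happens first
        if crash_after is not None and offset == crash_after:
            return  # process killed: the offset is never committed
    committed_offset = max(log) + 1  # commit once the batch is done


consume(committed_offset, crash_after=153)  # handles 152 and 153, then dies
consume(committed_offset)                   # restart: resumes from 152 again

print(database)
```

In this model data-1 and data-2 each end up in `database` twice, which is exactly the duplicate-insert problem described above.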

Duplicate consumption itself is not the scary part. The scary part is not having thought about how to guarantee idempotence when it happens.

Here is an example. Suppose your system inserts a row into the database for every message it consumes. If the same message arrives twice, you insert two rows, and the data is wrong. But if, on the second delivery, you check whether you have already consumed the message and simply discard it if so, only one row is kept and the data stays correct.

The message was delivered twice, but the database holds only one row. That is what makes the system idempotent.

Idempotence, put plainly: if the same piece of data or the same request is delivered to you multiple times, you must make sure the resulting data does not change and no errors occur.
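A small sketch of the difference, using two made-up handlers (`add_points` and `set_points` are hypothetical names for illustration): replaying the first one changes the result each time, while replaying the second leaves the state where the first call put it.

```python
def add_points(account, points):
    """Not idempotent: every replay changes the result again."""
    account["points"] += points


def set_points(account, points):
    """Idempotent: replaying leaves the same final state."""
    account["points"] = points


a = {"points": 0}
b = {"points": 0}
for _ in range(3):  # the same message delivered three times
    add_points(a, 10)
    set_points(b, 10)

print(a["points"], b["points"])
```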

So here comes the second question: how do you guarantee that message queue consumption is idempotent?

This really depends on the business logic. Here are a few ideas:

  • If you are writing data to a database, first look it up by primary key. If the row already exists, don't insert it again; just update it.
  • If you are writing to Redis, there is no problem: every write is a set, which is naturally idempotent.
  • If neither of those fits, things get a little more complicated. Have the producer attach a globally unique id to every message it sends, something like an order id. When the consumer receives a message, it first looks the id up in, say, Redis to see whether it has been consumed before. If not, process the message and then write the id to Redis. If it has, skip it, so the same message is never processed twice.
  • You can also rely on a unique key in the database to make sure duplicate data is not inserted twice. Because of the unique key constraint, a duplicate insert only raises an error, and no dirty data ends up in the database.
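The two database ideas in the list (look up by primary key and update, or let a unique key reject the duplicate) might look roughly like this. It is a sketch using Python's built-in sqlite3 module with an in-memory database; the `orders` table, its columns, and the function names `upsert_order` and `insert_once` are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
)


def upsert_order(order_id, status):
    """Check by primary key first: update if the row exists, else insert."""
    row = conn.execute(
        "SELECT 1 FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO orders (order_id, status) VALUES (?, ?)",
            (order_id, status),
        )
    else:
        conn.execute(
            "UPDATE orders SET status = ? WHERE order_id = ?",
            (status, order_id),
        )
    conn.commit()


def insert_once(order_id, status):
    """Rely on the key constraint: a duplicate insert fails and is skipped."""
    try:
        conn.execute(
            "INSERT INTO orders (order_id, status) VALUES (?, ?)",
            (order_id, status),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False  # duplicate message: the row already exists


upsert_order("A-1", "PAID")
upsert_order("A-1", "PAID")           # redelivered message, still one row
first = insert_once("B-7", "PAID")
second = insert_once("B-7", "PAID")   # redelivered message, rejected
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, row_count)
```

One thing to keep in mind: the check-then-insert variant can race when several consumers handle the same key at once, so the constraint-based variant is probably the safer default where it fits.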
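The globally unique id idea can be sketched as follows. A plain Python set stands in for Redis here, and the message shape (`id`, `amount`), the `handle` function, and the `ledger` list are all assumptions for illustration; a real deployment would need a store shared by all consumer instances.

```python
processed_ids = set()  # stand-in for Redis (assumption, not a Redis client)
ledger = []            # stands in for the real business side effect


def handle(message):
    """Process a message only if its id has not been seen before."""
    msg_id = message["id"]
    if msg_id in processed_ids:
        return False  # already consumed: skip it
    ledger.append(message["amount"])  # do the actual work
    processed_ids.add(msg_id)         # then record the id
    return True


messages = [
    {"id": "order-1001", "amount": 30},
    {"id": "order-1002", "amount": 45},
    {"id": "order-1001", "amount": 30},  # redelivered after a restart
]
results = [handle(m) for m in messages]
print(results, ledger)
```

Note that if the consumer dies between doing the work and recording the id, the message can still be processed again, so this check narrows the window for duplicates rather than closing it completely.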

Of course, exactly how to make MQ consumption idempotent depends on the specific business.


Copyright notice
This article was written by [Old man]. Please include a link to the original when reposting. Thank you.