
To me, multithreaded transactions must be a false proposition!

2020-11-09 13:19:46 Why technology

This is the 74th original article from Why technology.

Late at night

Don't ask; the answer is no

You probably know about distributed transactions. But multithreaded transactions ......

Don't worry, I'll walk you through it.

As shown in the figure, a reader wants to implement multithreaded transactions.

I have seen this requirement many times in different places, so I said: this question has come up again.

So is there a solution?

Until now, my answer was always firm: beyond all doubt, there is none.

Why?

Let's start with the theory.

First, let me ask you: what are the properties of a transaction?

Not hard, right? Anyone who has memorized the standard interview answers can recite ACID on the spot:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

So here is the question: if multithreaded transactions existed, which property would we be breaking?

Forget about multithreading for a moment. Just picture two different users each placing an order, where the backend logic for each request runs in a transaction.

Isn't that already a case of transactions running on multiple threads?

In that scenario, you have never had to think about how to control the two users' transactions separately, have you?

That is because the two operations are completely isolated; each one works on its own connection.

So what is the basic principle between multiple transactions?

Isolation. Two transactions should not interfere with each other.

What a multithreaded transaction wants to achieve is this: when thread A hits an exception, the transactions of threads A and B are rolled back together.

That runs straight into one of the defining properties of a transaction. Therefore, multithreaded transactions don't work in theory.

Theory guides practice: if the theory does not hold, the code for multithreaded transactions can't be written either.

I mentioned isolation. So how does the Spring source code keep transactions on different threads apart?

The answer is ThreadLocal.

When a transaction is opened, the current connection is stored in a ThreadLocal, which keeps it isolated between threads:

You can see that this resource object is a ThreadLocal.

The assignment happens in the following method:

org.springframework.jdbc.datasource.DataSourceTransactionManager#doBegin

Inside it, the bindResource method binds the current connection to the current thread, and the resource involved is the ThreadLocal we just mentioned:

Each thread minds its own business. Surely we can't break the rules of ThreadLocal and make every thread share the same one?

My friend, if you did that, wouldn't you be going astray?
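To make the per-thread binding concrete, here is a minimal sketch in plain Java. It is not Spring's code: the class name, the `CURRENT_CONNECTION` field and the `conn-N` strings are all made up for illustration, and a String stands in for a real JDBC connection. It only shows that a value bound through a ThreadLocal is visible to the thread that bound it and to no other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; a String stands in for a real JDBC connection.
class ThreadLocalIsolationDemo {
    // One slot per thread, similar in spirit to binding a connection to the current thread.
    private static final ThreadLocal<String> CURRENT_CONNECTION = new ThreadLocal<>();

    // Each worker binds its own fake connection, waits until all workers have bound theirs,
    // and then records which connection it can see.
    static List<String> run(int threads) {
        List<String> seen = new CopyOnWriteArrayList<>();
        CountDownLatch allBound = new CountDownLatch(threads);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            String name = "conn-" + i;
            Thread t = new Thread(() -> {
                CURRENT_CONNECTION.set(name);      // bind to this thread only
                allBound.countDown();
                try {
                    allBound.await();              // every thread has bound its value by now
                    seen.add(name + " sees " + CURRENT_CONNECTION.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    CURRENT_CONNECTION.remove();   // unbind when done
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    // The calling thread never bound anything, so it sees null.
    static String valueInCallingThread() {
        return CURRENT_CONNECTION.get();
    }

    public static void main(String[] args) {
        run(3).forEach(System.out::println);
        System.out.println("calling thread sees " + valueInCallingThread());
    }
}
```

Every worker reports its own value even though all of them have written to the same ThreadLocal field, which is why one thread cannot simply reach into another thread's transaction.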

So, whether in theory or in code, I did not think this requirement could be implemented.

At least that's what I thought before.

But things have changed a little.

A scenario and the conventional implementations

Any discussion of a technical implementation without its context is just fooling around.

So let's look at the scenario first.

Suppose we have a big data system. At a set time every day, we need to pull 500,000 records from it, clean them, and then save them to the database of our business system.

For the business system, these 500,000 records must either all be stored, not one missing, or none of them inserted at all.

During the process, no other external interfaces are called, and no other process touches the data in this table.

Since it is all or nothing, two solutions come to mind intuitively:

  1. Insert row by row in a for loop inside one transaction.
  2. Insert everything with a single batch statement.

For this requirement, opening a transaction and inserting in a for loop is a very low-grade solution.

It is very inefficient. Let me show you.

For example, we have a student table with a very simple structure:

CREATE TABLE `student` (
  `id` bigint(63) NOT NULL AUTO_INCREMENT,
  `name` varchar(32) DEFAULT NULL,
  `home` varchar(64) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

In our project, we insert data in a for loop, and the method carries the @Transactional annotation:

The num parameter comes from the front-end request and is the number of rows to insert:

That way, we can use the link below to simulate inserting a given number of rows:

http://127.0.0.1:8081/insertOneByOne?num=xxx

I tried setting num to 500,000 and letting it run, but I was too naive: I waited a long time and never got a result.

So I changed num to 5000, with this result:

insertOneByOne Execution time consuming :133449ms,num=5000

Inserting 5000 rows one by one took about 133.5 s.

At this rate, inserting 500,000 rows would take about 13350 s, which is roughly 3.7 hours.

Who can put up with that?

So there is huge room for optimization.

For example, we can optimize it into a batch insert like this:

The corresponding SQL statement looks like this:

insert into table ([column name],[column name]) VALUES ([column value],[column value]), ([column value],[column value]);
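As a rough illustration of that statement shape, the sketch below builds a multi-row INSERT with JDBC-style placeholders for the student table defined earlier. The original batch-insert code was shown as an image, so this is not the author's code; the class and method names are hypothetical, and it only produces the SQL text without touching a database.

```java
import java.util.Collections;

// Hypothetical helper; it only builds the SQL text and does not execute anything.
class BatchInsertSqlDemo {
    // Builds one INSERT statement that carries rowCount rows of (name, home) placeholders.
    static String buildInsertSql(int rowCount) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        // One "(?, ?)" group per row, joined by commas.
        String rows = String.join(", ", Collections.nCopies(rowCount, "(?, ?)"));
        return "INSERT INTO student (name, home) VALUES " + rows;
    }

    public static void main(String[] args) {
        // prints: INSERT INTO student (name, home) VALUES (?, ?), (?, ?), (?, ?)
        System.out.println(buildInsertSql(3));
    }
}
```

However many rows it carries, the result is still one statement and one round trip to the database, which is where the speed-up over the for loop comes from.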

We again call it through the front-end interface:

With num set to 5000, I refreshed the page 10 times, and each run took roughly 200 ms or less:

From 133.5 s to 200 ms. Friends, what is this?

This is a qualitative leap: roughly a 667-fold performance improvement.

Why does batch insertion make such a big difference?

Think about it. In the earlier for-loop insert, Spring Boot 2.0 uses HikariPool by default, and the pool gives you 10 connections by default.

But you only need one connection and one transaction, and that is not where the time goes.

The time goes into your 5000 rounds of IO.

So it inevitably takes a long time.

A batch insert is just one SQL statement, so you need only one connection, and you don't even need to open a transaction.

Why no transaction?

You have a single SQL statement; what would you open a transaction for?

So what happens if we insert 500,000 rows at once?

Let's give it a try:

http://127.0.0.1:8081/insertBatch?num=500000

You can see an exception thrown . And the error message is very clear :

Packet for query is too large (42777840 > 1048576). You can change this value on the server by setting the max_allowed_packet' variable.; nested exception is com.mysql.jdbc.PacketTooBigException: Packet for query is too large (42777840 > 1048576).You can change this value on the server by setting the max_allowed_packet' variable.

It says the packet is too large. The size limit can be changed through max_allowed_packet.

We can query the current setting with the following statement:

select @@max_allowed_packet;

As you can see, it is 1048576, i.e. 1024*1024, or 1 MB.

The packet we need to send is 42777840 bytes, about 41 MB.

So we need to change the setting.

This is also a reminder: if your SQL statement is very large, for example because it contains big fields, remember to adjust this MySQL parameter.

You can either edit the configuration file or change it directly with a SQL statement.

Here I use a SQL statement to set it to 64 MB:

set global max_allowed_packet = 1024*1024*64;

And then execute it again , You can see that the insertion succeeded :

500,000 rows in about 74 s.

The data is either all committed or not committed at all, so the requirement is met.

The time is a bit long, but at first I couldn't think of a good way to improve it.

So how can we shorten the time?

A cheeky idea appears

The only thing I can think of is multithreading.

500,000 rows. We start five threads, each handling 100,000 rows, saving them if nothing goes wrong and rolling back if there is a problem.

That requirement is easy to implement. It can be written in minutes.

But there is one more requirement: if any of these 5 threads has a problem, the data of all of them must be rolled back.

Following that train of thought, we find that this is the so-called multithreaded transaction.

I used to say it was impossible because I was thinking of transactions implemented with the @Transactional annotation.

With that annotation, we only need to use it correctly and focus on the business logic; we neither need to nor can intervene in opening, committing, or rolling back the transaction.

This style of code is called a declarative transaction.

The counterpart of a declarative transaction is a programmatic transaction.

With programmatic transactions, we have full control over opening, committing, and rolling back a transaction.

Once you think of programmatic transactions, you are almost halfway there.

Think about it: first we have a global Boolean variable, which by default says the transactions can be committed.

In each child thread, we open a transaction programmatically and insert 100,000 rows, but do not commit. The thread then tells the main thread that its side is ready, and starts waiting.

If a child thread hits an exception, it tells the main thread that something went wrong and then rolls back on its own.

Finally, the main thread collects the status of the 5 child threads.

If any thread had a problem, it sets the global variable to "cannot commit".

Then it wakes up all the waiting child threads so that they roll back.

Following the process above, the simulation code looks like this; you can copy it and run it directly:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

public class MainTest {
    // Whether the transactions may be committed
    public static volatile boolean IS_OK = true;

    public static void main(String[] args) {
        // Child threads wait on this latch for the main thread's decision
        CountDownLatch mainMonitor = new CountDownLatch(1);
        int threadCount = 5;
        // The main thread waits on this latch until every child thread has reported
        CountDownLatch childMonitor = new CountDownLatch(threadCount);
        // Results reported by the child threads; thread-safe because several threads add to it
        List<Boolean> childResponse = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newCachedThreadPool();
        for (int i = 0; i < threadCount; i++) {
            int finalI = i;
            executor.execute(() -> {
                try {
                    System.out.println(Thread.currentThread().getName() + ": start execution");
                    // Uncomment to simulate a failure in one child thread:
                    // if (finalI == 4) {
                    //     throw new Exception("Something went wrong");
                    // }
                    TimeUnit.MILLISECONDS.sleep(ThreadLocalRandom.current().nextInt(1000));
                    childResponse.add(Boolean.TRUE);
                    childMonitor.countDown();
                    System.out.println(Thread.currentThread().getName() + ": ready, waiting for the other threads' results to decide whether to commit");
                    mainMonitor.await();
                    if (IS_OK) {
                        System.out.println(Thread.currentThread().getName() + ": transaction committed");
                    } else {
                        System.out.println(Thread.currentThread().getName() + ": transaction rolled back");
                    }
                } catch (Exception e) {
                    childResponse.add(Boolean.FALSE);
                    childMonitor.countDown();
                    System.out.println(Thread.currentThread().getName() + ": exception occurred, rolling back the transaction");
                }
            });
        }
        try {
            // The main thread waits for every child thread to report its result
            childMonitor.await();
            for (Boolean resp : childResponse) {
                if (!resp) {
                    // If any child thread failed, flip IS_OK so that all child threads roll back
                    System.out.println(Thread.currentThread().getName() + ": a thread failed, setting the flag to false");
                    IS_OK = false;
                    break;
                }
            }
            // Release the child threads so they commit or roll back according to IS_OK
            mainMonitor.countDown();
            // Let the child threads finish, then allow the JVM to exit
            // (the original Thread.currentThread().join() would block forever)
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

When all child threads run normally, the output looks like this:

The result matches our expectations.

If one child thread throws an exception, the output looks like this:

One thread hit an exception and all threads rolled back, which also seems to match expectations.

If you wrote this code from the requirements above, congratulations: you have inadvertently implemented something like the two-phase commit (2PC) consistency protocol.

I said earlier that once you think of programmatic transactions, you are almost halfway there.

The other half is two-phase commit (2PC).

Following the template

With the template above, isn't it easy to follow it and write the real thing?

It only takes a little code. The sample code can be downloaded from here, so I will just show a screenshot:

The code above should be easy to understand: start five threads, each inserting 100,000 rows.
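The screenshot is not reproduced here, so as a small stand-in for one step of it, the sketch below shows one way the 500,000 rows could be divided among the worker threads. The class and method names are hypothetical and this is not the author's code; it only computes index ranges and does no database work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for dividing a batch among worker threads.
class PartitionDemo {
    // Splits the index range [0, total) into `parts` contiguous ranges,
    // each returned as {fromInclusive, toExclusive}.
    static List<int[]> split(int total, int parts) {
        if (total < 0 || parts <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and parts must be > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int base = total / parts;
        int extra = total % parts; // the first `extra` ranges take one additional row
        int from = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new int[] {from, from + size});
            from += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] range : split(500_000, 5)) {
            System.out.println(range[0] + " .. " + range[1]);
        }
    }
}
```

Each worker thread would then take one range, for example via `rows.subList(range[0], range[1])`, and insert it inside its own programmatic transaction.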

Needless to say, this is bound to be faster than inserting 500,000 rows in one batch.

As for how much faster, let's skip the talk and look at the result.

Because our controller looks like this:

we call this link:

http://127.0.0.1:8081/batchHandle

The output is as follows :

Remember how long the single batch insert took?

73791ms.

From 73791 ms to 15719 ms: about 58 s faster.

It's already very good .

What if a thread throws an exception? For example:

Let's look at the log output:

Judging from the log, it seems to meet the requirement.

And judging from the test results readers sent back, the improvement is significant too:

But does it really meet the requirement?

It only looks like it does.

Experienced readers spotted the problem long ago and already have their hands up: teacher, I know this one.

As I said before, this implementation is really programmatic transactions combined with two-phase commit (2PC).

The flaw lies in 2PC.

As I told a reader in this exchange:

Beyond this point the discussion can't be contained: next come 3PC, TCC, Seata, the whole family of distributed transaction solutions.

Writing it all down would take tens of thousands of words, so I reposted an article by Poseidon. It is the second item in this push. Have a look if you are interested; it is packed with substance.

In fact, if we treat each child thread as a subsystem in a microservice architecture, this is a distributed transaction scenario.

And the solution we came up with is not a perfect one.

In a sense we have worked around transaction isolation, but there is some probability of data inconsistency (for example, when one thread's commit fails after the others have already committed), even if that probability is fairly small.

So I give this approach a name: luck-based programming, trading luck for time.
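To see where the luck comes in, here is a minimal simulation of the second phase. Everything in it is hypothetical: `Participant` stands in for one worker's open transaction, and the commits run one after another rather than in parallel threads to keep the sketch short. All participants have already reported "ready", yet one of them fails while committing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; no real database or transaction manager is involved.
class CommitPhaseFailureDemo {
    // Stand-in for one worker's transaction that has already voted "ready".
    static final class Participant {
        final String name;
        final boolean failsOnCommit;
        String state = "PREPARED";

        Participant(String name, boolean failsOnCommit) {
            this.name = name;
            this.failsOnCommit = failsOnCommit;
        }

        void commit() {
            if (failsOnCommit) {
                // e.g. the connection drops right at commit time
                state = "ROLLED_BACK";
                throw new IllegalStateException(name + ": commit failed");
            }
            state = "COMMITTED";
        }
    }

    // Phase two: everyone voted yes, so every participant is told to commit.
    static List<String> commitAll(List<Participant> participants) {
        List<String> states = new ArrayList<>();
        for (Participant p : participants) {
            try {
                p.commit();
            } catch (IllegalStateException e) {
                // Too late: participants that already committed cannot be undone.
            }
            states.add(p.name + "=" + p.state);
        }
        return states;
    }

    public static void main(String[] args) {
        List<Participant> workers = Arrays.asList(
                new Participant("w1", false),
                new Participant("w2", true),
                new Participant("w3", false));
        // prints: [w1=COMMITTED, w2=ROLLED_BACK, w3=COMMITTED]
        System.out.println(commitAll(workers));
    }
}
```

The end state mixes committed and rolled-back work, which is exactly the all-or-nothing guarantee the requirement asked for and this scheme cannot fully give.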

Things to note

There are still a few things to watch out for in the code above.

Let me point them out.

First: the number of threads used for the inserts is a tunable parameter.

For example, I changed it to 10 threads, each inserting 50,000 rows, and the run got another 2 s faster:

But remember that more is not always better, and remember to raise the maximum size of the database connection pool to match; otherwise the extra threads are wasted.

Second: precisely because the number of threads can be adjusted, and could even be computed on each run,

we must make sure that no task ever ends up in the queue. Once a task is queued, the program is dead.

Think about it: suppose we need 5 child threads but there are only 4 core threads, so one task goes into the queue.

Those 4 core threads will then stay blocked, waiting for the main thread to wake them.

And what is the main thread doing at that moment?

It is waiting for the results of 5 threads, but it can only ever collect 4.

So it waits forever.
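The stall can be reproduced in a few lines. The sketch below is an illustration under assumptions, not the article's code: it uses a fixed pool (via `Executors.newFixedThreadPool`) so that the fifth task has to queue, and it gives the main thread a timeout so the demo terminates instead of hanging. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of the "one task is stuck in the queue" problem.
class QueuedTaskStallDemo {
    // Returns true if every task reported "ready" within the timeout.
    static boolean allReady(int poolSize, int tasks, long timeoutMillis) {
        CountDownLatch mainMonitor = new CountDownLatch(1);
        CountDownLatch childMonitor = new CountDownLatch(tasks);
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    childMonitor.countDown();      // "I am ready"
                    try {
                        mainMonitor.await();       // hold the pool thread until the decision
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Without the timeout, this call would never return when tasks > poolSize.
            return childMonitor.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            mainMonitor.countDown();               // release the workers so the demo can end
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("4 threads, 5 tasks: " + allReady(4, 5, 300));   // false
        System.out.println("5 threads, 5 tasks: " + allReady(5, 5, 2000));  // true
    }
}
```

With 4 threads and 5 tasks the fifth task never gets a thread, so it never reports, and the main thread is left waiting for a result that cannot arrive.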

Third: since multiple threads open transactions and insert into the same table here, beware of database deadlocks.

Fourth: note that the standard way to write countDown is to put it in a finally block. I left that out here to keep the screenshot tidy:

If you really use this, pay attention to it. And think the finally block through carefully before you write it; don't write it carelessly.
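One possible shape for that finally block is sketched below. It is a simplified illustration with hypothetical names, not the code from the screenshot: each worker records its outcome and counts down in finally, so the main thread is released whether the work succeeded or threw.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: report the result and count down in finally.
class CountDownInFinallyDemo {
    // Runs `threads` workers; the worker at failingIndex simulates an exception (-1 means none).
    static List<Boolean> runWorkers(int threads, int failingIndex) {
        CountDownLatch childMonitor = new CountDownLatch(threads);
        List<Boolean> results = new CopyOnWriteArrayList<>();
        for (int i = 0; i < threads; i++) {
            int index = i;
            new Thread(() -> {
                boolean ok = false;
                try {
                    if (index == failingIndex) {
                        throw new IllegalStateException("simulated failure");
                    }
                    ok = true;                     // the "insert" went through
                } catch (RuntimeException e) {
                    // failure path: ok stays false, a real worker would roll back here
                } finally {
                    results.add(ok);               // record the outcome first
                    childMonitor.countDown();      // then release the main thread, exactly once
                }
            }).start();
        }
        try {
            childMonitor.await();                  // returns only after every worker has reported
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(5, 4));
    }
}
```

The order inside finally matters: the result is added before the count goes down, so the main thread never sees fewer results than threads. In the full scheme the worker would still wait for the main thread's decision after this block, which is the part that needs careful thought.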

Fifth: I am only offering a line of thinking here, and it is not a multithreaded transaction at all.

It proves once again that the multithreaded transaction is a false proposition.

So it is not unreasonable for me to offer a luck-based, pseudo-consistent answer.

Sixth: looked at from another angle, a multithreaded transaction can be understood as a distributed transaction, and you can use this case to understand distributed transactions. But the best way to deal with distributed transactions is: don't have distributed transactions!

Most solutions to distributed transactions settle for eventual consistency.

It is cost-effective, and most businesses can accept it.

Seventh: if you want to take this solution to production, talk to your business colleagues first and find out whether they can accept this situation. It is a trade-off between speed and safety.

Also, keep an interface ready for manual repair:

Finally

My knowledge is limited and mistakes are hard to avoid. If you find something wrong, raise it in the comments and I will fix it. Thank you for reading. I insist on original work, and thank you for following.

I am why, a literary creator held back by code. Not a big shot, but I like to share; a warm, substance-filled guy from Sichuan.

Also, you are welcome to follow me.

Copyright notice
This article was written by [Why technology]. Please include a link to the original when reposting. Thanks.