Apache Cassandra 4.0 has reached beta, and it is the first Cassandra release to support JDK 11 and later JDK versions.

Latency is an obvious concern for Apache Cassandra users, so hopes are high for ZGC (Z Garbage Collector), the new low-latency garbage collector introduced in JDK 11.

We will see that Cassandra 4.0 brings strong performance improvements, and that the new garbage collectors (ZGC and especially Shenandoah) amplify them considerably.

In the author's tests, Cassandra 4.0 improved both throughput and latency by 25% to 30%. With the same garbage collector, Cassandra 4.0 easily beats Cassandra 3.11.6.

 
01 Test method

The following benchmarks ran on Apache Cassandra clusters built and configured on AWS with tlp-cluster, with tlp-stress used for load generation and metrics collection.

All the tools used in this test are open source, so anyone with an AWS account can easily reproduce the process and the results.

Each cluster consisted of three r3.2xlarge instances as nodes, plus one c3.2xlarge instance as the stress node.

Apart from the garbage collection and heap settings, we used the default Apache Cassandra settings.

The clusters were created and configured with the latest version of tlp-cluster. We also added some helper scripts to automate cluster creation and the installation of Reaper and Medusa.

Once tlp-cluster is installed and configured according to its documentation, you can recreate exactly the same Cassandra clusters used for these benchmarks:
 
# 3.11.6 CMS JDK8
build_cluster.sh -n CMS_3-11-6_jdk8 -v 3.11.6 --heap=16 --gc=CMS -s 1 -i r3.2xlarge --jdk=8 --cores=8

# 3.11.6 G1 JDK8
build_cluster.sh -n G1_3-11-6_jdk8 -v 3.11.6 --heap=31 --gc=G1 -s 1 -i r3.2xlarge --jdk=8 --cores=8

# 4.0 CMS JDK11
build_cluster.sh -n CMS_4-0_jdk11 -v 4.0~alpha4 --heap=16 --gc=CMS -s 1 -i r3.2xlarge --jdk=11 --cores=8

# 4.0 G1 JDK14
build_cluster.sh -n G1_4-0_jdk14 -v 4.0~alpha4 --heap=31 --gc=G1 -s 1 -i r3.2xlarge --jdk=14 --cores=8

# 4.0 ZGC JDK11
build_cluster.sh -n ZGC_4-0_jdk11 -v 4.0~alpha4 --heap=31 --gc=ZGC -s 1 -i r3.2xlarge --jdk=11 --cores=8

# 4.0 ZGC JDK14
build_cluster.sh -n ZGC_4-0_jdk14 -v 4.0~alpha4 --heap=31 --gc=ZGC -s 1 -i r3.2xlarge --jdk=14 --cores=8

# 4.0 Shenandoah JDK11
build_cluster.sh -n Shenandoah_4-0_jdk11 -v 4.0~alpha4 --heap=31 --gc=Shenandoah -s 1 -i r3.2xlarge --jdk=11 --cores=8

Note: To control variables in the benchmark, we reused the same set of EC2 instances for the whole test.
 
When moving from Cassandra 3.11.6 to Cassandra 4.0~alpha4, the following script was used to switch between JDK versions:
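#!/usr/bin/env bash
# Arguments: $1 = jabba identifier of the old JDK, $2 = identifier of the new JDK

OLD=$1
NEW=$2
# Install jabba and load it into the current shell
curl -sL https://github.com/shyiko/jabba/raw/master/install.sh | bash
. ~/.jabba/jabba.sh
# Replace the old JDK with the new one and make it the default
jabba uninstall "$OLD"
jabba install "$NEW"
jabba alias default "$NEW"
# Register the new java and javac binaries as system alternatives
sudo update-alternatives --install /usr/bin/java java "${JAVA_HOME%*/}/bin/java" 20000
sudo update-alternatives --install /usr/bin/javac javac "${JAVA_HOME%*/}/bin/javac" 20000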

  

When calling jabba, the JDK version manager, the following JDK values can be used:
  • openjdk@1.11.0-2
  • openjdk@1.14.0
  • openjdk-shenandoah@1.8.0
  • openjdk-shenandoah@1.11.0
OpenJDK 8 was installed with Ubuntu's apt.

The java -version output for each JDK used in the benchmarks is shown below:
  • jdk8
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)
  • jdk11
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
  • jdk11 with Shenandoah
openjdk version "11.0.8-testing" 2020-07-14
OpenJDK Runtime Environment (build 11.0.8-testing+0-builds.shipilev.net-openjdk-shenandoah-jdk11-b277-20200624)
OpenJDK 64-Bit Server VM (build 11.0.8-testing+0-builds.shipilev.net-openjdk-shenandoah-jdk11-b277-20200624, mixed mode)
  • jdk14
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment (build 14.0.1+7)
OpenJDK 64-Bit Server VM (build 14.0.1+7, mixed mode, sharing)

 
02 CMS
 
CMS (Concurrent Mark Sweep) is currently the default garbage collector in Apache Cassandra. Because it was removed in JDK 14, all CMS tests ran on JDK 8 or JDK 11.

The following settings were used for the CMS benchmarks:
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=8
-XX:ConcGCThreads=8
-Xms16G
-Xmx16G
-Xmn8G

  

Note that the -XX:+UseParNewGC flag was removed in JDK 11, where it became implicit. Passing this flag to JDK 11 would prevent the JVM from starting.
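As a minimal sketch of how the flag list could be made JDK-aware, the helper below prints the CMS flags listed above and leaves out -XX:+UseParNewGC for JDK 11 and later. The function name cms_flags and the idea of passing the JDK major version as an argument are assumptions made for illustration; they are not part of tlp-cluster or Cassandra.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cassandra or tlp-cluster): print the CMS
# flags used in this benchmark, one per line, for a given JDK major version.
cms_flags() {
  local jdk_major=$1
  # Per the note above, -XX:+UseParNewGC is only passed before JDK 11.
  if [ "$jdk_major" -lt 11 ]; then
    echo "-XX:+UseParNewGC"
  fi
  cat <<'EOF'
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=8
-XX:ConcGCThreads=8
-Xms16G
-Xmx16G
-Xmn8G
EOF
}

# Example: flags for the JDK 11 run
cms_flags 11
```

Calling cms_flags 8 instead prints the same list with -XX:+UseParNewGC as its first line.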
 
We capped the CMS max heap at 16GB, because larger heaps could lead to very long pauses during major collections.

 
03 G1
 
Compared with CMS, G1GC (Garbage-First Garbage Collector) is easier to configure, because it sizes the young generation dynamically.

However, G1GC works best with large heaps (>= 24GB), which is why it did not become Cassandra's default garbage collector. In addition, although its throughput is better than that of CMS, its latencies are higher than those of a tuned CMS.

The following settings were used for the G1 benchmarks:
-XX:+UseG1GC
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:MaxGCPauseMillis=300
-XX:InitiatingHeapOccupancyPercent=70
-XX:ParallelGCThreads=8
-XX:ConcGCThreads=8
-Xms31G
-Xmx31G

  

For the Cassandra 4.0 benchmarks, we ran G1 on JDK 14.

We used a 31GB heap so that the JVM could benefit from compressed oops (compressed ordinary object pointers) and address the largest number of objects with the smallest heap size.
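The reasoning behind the 31GB choice can be sketched as a small check. It assumes the usual HotSpot behaviour with the default 8-byte object alignment, where compressed oops are only available for heaps below 32GB; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a heap of the given size (in whole GB)
# can still use compressed oops, assuming the common HotSpot limit of 32GB
# (default 8-byte object alignment).
compressed_oops_possible() {
  local heap_gb=$1
  if [ "$heap_gb" -lt 32 ]; then
    echo "yes"
  else
    echo "no"
  fi
}

compressed_oops_possible 31   # heap size used for G1 in this benchmark
compressed_oops_possible 32   # at the threshold, plain 64-bit pointers are used
```

On a real JVM, running java -Xmx31G -XX:+PrintFlagsFinal -version and looking for UseCompressedOops in the output should show the effective setting for a given heap size.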

 
04 ZGC
 
ZGC (Z Garbage Collector) is the latest garbage collector in the JDK. Its main goal is to keep stop-the-world pauses under 10ms. ZGC should also ensure that heap size has no effect on pause times, which allows it to scale to heaps of up to 16TB.

If these expectations are met, ZGC could remove the need for off-heap storage and simplify some aspects of Apache Cassandra development.

The following settings were used for the ZGC benchmarks:
-XX:+UnlockExperimentalVMOptions
-XX:+UseZGC
-XX:ConcGCThreads=8
-XX:ParallelGCThreads=8
-XX:+UseTransparentHugePages
-verbose:gc
-Xms31G
-Xmx31G

  

We used -XX:+UseTransparentHugePages as a workaround to avoid having to enable large pages on Linux.

Although the official ZGC documentation says this could cause latency spikes, the test results did not show any. It could be worth running the throughput tests again with large pages to see how they affect the results.

Note that ZGC cannot use compressed oops and is therefore not subject to the "32GB threshold". We still used a 31GB heap for the ZGC tests, the same as for G1, so that the system had the same amount of free memory in both cases.

 
05 Shenandoah
 
Shenandoah is a low-latency garbage collector developed by Red Hat. It is available as a backport in JDK 8 and 11, and it has been part of the mainline OpenJDK builds since Java 13.

Like ZGC, Shenandoah is a mostly concurrent garbage collector. Its goal is to keep pause times from growing in proportion to the heap size.

The following settings were used for the Shenandoah benchmarks:

-XX:+UnlockExperimentalVMOptions
-XX:+UseShenandoahGC
-XX:ConcGCThreads=8
-XX:ParallelGCThreads=8
-XX:+UseTransparentHugePages
-Xms31G
-Xmx31G

  

Shenandoah should be able to use compressed oops, so it benefits from heaps slightly smaller than 32GB.

 
06 Cassandra 4.0 JVM configuration

Cassandra 4.0 ships with separate jvm.options files for Java 8 and Java 11. They are:
  • conf/jvm-server.options
  • conf/jvm8-server.options
  • conf/jvm11-server.options
When upgrading Cassandra from 3.11 to 4.0, the existing jvm.options file can still be used, as long as it is renamed to jvm-server.options and the jvm8-server.options and jvm11-server.options files are removed. However, this is not the recommended approach.

The recommended approach is to re-apply the settings from the old jvm.options file to the new jvm-server.options and jvm8-server.options files. The Java-version-specific option files mostly contain garbage collection parameters.

Once jvm-server.options and jvm8-server.options are updated and in place, it becomes much easier to configure jvm11-server.options and to switch from JDK 8 to JDK 11.
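The split between the common and the version-specific files can be illustrated with a short sketch. It assumes that JDK 11 and every later JDK read jvm11-server.options while older JDKs read jvm8-server.options; the function name is hypothetical and this is not Cassandra's own startup logic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the option files that apply for a given JDK major
# version. Assumption: JDK 11 and later all use jvm11-server.options.
jvm_option_files() {
  local jdk_major=$1
  # Settings shared by every JDK version
  echo "conf/jvm-server.options"
  # Version-specific settings (mostly garbage collection parameters)
  if [ "$jdk_major" -ge 11 ]; then
    echo "conf/jvm11-server.options"
  else
    echo "conf/jvm8-server.options"
  fi
}

jvm_option_files 8
jvm_option_files 11
```

Keeping the garbage collection flags in the version-specific file means that a JDK switch only changes which of the two small files is read.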

 
07 Workload

The benchmarks ran with 8 threads and a mix of 80% writes and 20% reads. tlp-stress makes heavy use of asynchronous queries, so even a small number of stress threads can easily overload the Cassandra nodes. In this load test, each thread sent 50 concurrent queries at a time.

The keyspace used a replication factor of 3, and all queries were executed at consistency level LOCAL_ONE.

For every garbage collector and Cassandra version, the target rate was stepped through 25k, 40k, 45k and 50k operations per second, so that performance could be evaluated at different levels of load.

The following tlp-stress command was used:
tlp-stress run BasicTimeSeries -d 30m -p 100M -c 50 --pg sequence -t 8 -r 0.2 --rate <desired rate> --populate 200000
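As an illustration of how the four load levels map onto the command above, the following dry-run sketch prints one command per target rate instead of executing it. The variable and function names are made up for this example, and the in-flight figure simply takes the description of 8 threads with 50 concurrent queries each at face value.

```shell
#!/usr/bin/env bash
# Dry run: print one tlp-stress command per target rate instead of running it.
# The rates are the four load levels described above (25k to 50k ops/s).
THREADS=8
CONCURRENCY=50
RATES="25000 40000 45000 50000"

stress_commands() {
  local rate
  for rate in $RATES; do
    echo "tlp-stress run BasicTimeSeries -d 30m -p 100M -c $CONCURRENCY --pg sequence -t $THREADS -r 0.2 --rate $rate --populate 200000"
  done
}

stress_commands
# Upper bound on in-flight queries: threads x concurrent queries per thread.
echo "max in-flight queries: $((THREADS * CONCURRENCY))"
```

Removing the echo inside the loop would run the four 30-minute tests back to back, which is only a sensible thing to do against a dedicated test cluster.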

  

All workloads ran for 30 minutes, loading 5 to 16GB of data per node and allowing for a reasonable compaction load.

Note that the purpose of this test is not to evaluate the best possible performance of Cassandra, which can be tuned in many ways for different workloads.

Nor is the purpose to tune the garbage collectors, even though they expose many parameters and options that can improve performance for a specific workload.

What these benchmarks aim to provide is a fair comparison of the garbage collectors, using mostly default settings and the same load on Cassandra.

 
 
08 Benchmark results
 
3.11.6 25k-40k ops/s
 
 
4.0 25k-40k ops/s
 
 
4.0 45k-50k ops/s
 
In terms of throughput, Cassandra 3.11.6 peaked at 41k ops/s, while Cassandra 4.0 reached 51k ops/s. Both versions used CMS as the garbage collector, so Cassandra 4.0 delivers an improvement of roughly 25% over Cassandra 3.11.6.
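The throughput gain can be checked with a quick calculation. This is only a sketch of the arithmetic on the two peak figures quoted above; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Percentage gain of a new value over an old one, rounded to one decimal.
pct_gain() {
  LC_ALL=C awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# CMS peak throughput: Cassandra 3.11.6 (41k ops/s) vs Cassandra 4.0 (51k ops/s)
pct_gain 41 51
```

This prints 24.4, which is the figure that the text rounds to 25%.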
 
Many of the performance improvements in 4.0 help explain this result, especially those that reduce the heap pressure caused by compaction (see CASSANDRA-14654 for an example).

On the Cassandra 3.11.6 cluster, Shenandoah on JDK 8 failed to reach the maximum throughput in the 40k ops/s load test and also produced failed queries.

With Cassandra 4.0 and JDK 11, Shenandoah did much better: its maximum throughput was 49.6k ops/s, nearly matching the throughput of Cassandra 4.0 with CMS.

With JDK 8 and Cassandra 3.11.6, G1 and Shenandoah both topped out at 36k ops/s.

Comparing JDK 14 with JDK 11, the newer JDK also seems to improve G1 slightly, from 47k ops/s to 50k ops/s.

With either JDK 14 or JDK 11, ZGC could not compete with the other garbage collectors on throughput, peaking at only 41k ops/s.

On the Cassandra 3.11.6 cluster, Shenandoah on JDK 8 showed impressively low latencies under moderate load, but its performance degraded as the load increased.

With CMS, Cassandra 4.0 kept average p99 (99th percentile) latencies between 11ms and 31ms at up to 50k ops/s. Under moderate load, the average read p99 was 17ms on Cassandra 3.11.6 and dropped to 11.5ms on Cassandra 4.0, a latency improvement of about 30%.

Overall, Cassandra 4.0 improves throughput and latency by 25% to 30%, so with the same garbage collector it easily beats Cassandra 3.11.6.

Shenandoah's low latency under moderate load on Cassandra 3.11.6 is notable, but its behavior under pressure makes us worry about its ability to handle sudden load spikes.

Although ZGC delivered impressively low latencies under moderate load, especially with JDK 14, its maximum throughput could not compete with that of Shenandoah.

In almost all load tests, Shenandoah had the lowest average p99 read and write latencies. These low latencies, combined with the throughput it achieves on Cassandra 4.0, make it a garbage collector worth considering when upgrading to Cassandra 4.0.

Under moderate load, an average read p99 of just 2.64ms is impressive in itself, and even more so given that it was recorded on the client side.

In most cases, G1's maximum p99 matched its configured maximum pause time of 300ms. Lowering the target pause time may have undesirable effects under high load and could even trigger longer pauses.
 
Under moderate load, Shenandoah lowered average p99 latencies by 77%, down to a best value of 2.64ms. This is a big boost for latency-sensitive use cases: compared with CMS on Cassandra 3.11.6, read p99 latency dropped by 85%!
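The latency percentages can be reproduced from the read p99 figures quoted in this section (17ms, 11.5ms and 2.64ms). The sketch below assumes that the 77% figure is measured against the 11.5ms CMS read p99 on Cassandra 4.0, which the text does not state explicitly; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Percentage reduction from a baseline latency to a new latency (both in ms),
# rounded to one decimal.
pct_reduction() {
  LC_ALL=C awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

pct_reduction 17 11.5     # CMS on 3.11.6 -> CMS on 4.0
pct_reduction 11.5 2.64   # CMS on 4.0 -> Shenandoah on 4.0 (assumed baseline)
pct_reduction 17 2.64     # CMS on 3.11.6 -> Shenandoah on 4.0
```

The three results, 32.4, 77.0 and 84.5, line up with the roughly 30%, 77% and 85% quoted in the text.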
 
It is worth noting that ZGC on JDK 14 performed well under moderate load but could not sustain that performance at higher rates. We are optimistic that ZGC will improve in the coming months and may eventually be able to compete with Shenandoah.


09 Reflections

G1 improved Cassandra's ease of use by removing the need to tune generation sizes, but at the cost of some performance.

The newly released Apache Cassandra 4.0 delivers impressive performance improvements of its own and allows the use of new-generation garbage collectors such as Shenandoah and ZGC. These collectors are easy to adopt, need little fine tuning, and are more efficient on latency.

With Cassandra 3.11.6 it is hard to recommend Shenandoah, because nodes behave poorly under high load. Starting with JDK 11 and Cassandra 4.0, however, Shenandoah offers a remarkable latency improvement while supporting almost the maximum throughput the database can provide.

Since this benchmark focuses on a specific workload, your results may vary. Still, for latency-sensitive use cases, these results make us quite optimistic about the future of Apache Cassandra, as 4.0 brings significant improvements over Cassandra 3.11.6.

Download the latest Apache Cassandra 4.0 build and try it yourself. If you have any feedback, contact us through the community mailing lists or the ASF Slack.
