async-profiler, a sharp tool for CPU analysis

2021-09-15 04:28:23 roshilikang

This article is included in https://github.com/lkxiaolou/lkxiaolou; stars are welcome.

Brief introduction

async-profiler is a tool for collecting and analyzing Java performance data. Translated from the project introduction on GitHub:

async-profiler is a low-overhead sampling profiler for Java that does not suffer from the Safepoint bias problem. It uses HotSpot-specific APIs to collect stack traces and memory allocation information, and it works with OpenJDK, Oracle JDK and other Java virtual machines based on HotSpot. async-profiler can trace the following kinds of events:

  • CPU cycles
  • Hardware and software performance counters, such as cache misses, branch misses, page faults and context switches
  • Memory allocations in the Java heap
  • Lock attempts, including both Java object monitors and ReentrantLock

Usage

First download async-profiler. The GitHub home page (https://github.com/jvm-profiling-tools/async-profiler) provides prebuilt binaries; download the one for your platform.

Basic usage

After unpacking the downloaded file you will find a profiler.sh script. Run it to profile the CPU usage of a Java process, for example the Java process with id 1232:

./profiler.sh start 1232
./profiler.sh stop 1232

Alternatively, use -d to specify the profiling duration in seconds:

./profiler.sh -d 30 1232 

When the command finishes, it prints the collected information.

A flame graph is usually a more intuitive way to view the output. Pass the parameter -f {$file_name}.svg:

./profiler.sh -d 30 -f ./nacos.svg 1232

When the command finishes, an SVG file is generated. Open it in a browser.

How do you read a flame graph? See Ruan Yifeng's article:

《How to read flame graphs》 http://www.ruanyifeng.com/blog/2017/09/flame-graph.html

In short: the x axis is the number of samples. The wider a frame is on the x axis, the more often that method was sampled and the more CPU time it consumed. The y axis is the stack depth, and the graph usually gets narrower toward the top. If you find a "flat top" near the top and it is very wide, that method may be a problem, because it consumes a lot of CPU time.
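The "flat top" rule can also be checked without looking at the picture. The following is a minimal sketch, assuming the profile was written in the collapsed format (-o collapsed), where each line holds one stack as frames separated by ";" followed by a sample count. The helper name top_frames and the file name in the usage comment are hypothetical, not part of async-profiler.

```shell
#!/bin/sh
# top_frames (hypothetical helper): read collapsed stacks on stdin and print
# "samples frame" for the frame on top of each stack, widest first.
# Assumed input: one stack per line, "root;...;top count".
top_frames() {
    awk '{
        count = $NF                     # last field is the sample count
        stack = $0
        sub(/ +[0-9]+$/, "", stack)     # drop the trailing count
        n = split(stack, frames, ";")   # frames[n] is the top of the stack
        total[frames[n]] += count
    }
    END {
        for (f in total) print total[f], f
    }' | sort -k1,1nr -k2,2
}

# Example (assumes ./nacos.txt was written with: -o collapsed -f ./nacos.txt):
#   top_frames < ./nacos.txt | head
```

Frames that appear at the head of this list with a large sample count correspond to the wide flat tops in the flame graph.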

Other parameters

  • -e event

Use this command to list the values available for event:

./profiler.sh list 1232

event defaults to cpu. You can also use alloc to look at memory allocation:

./profiler.sh -e alloc -d 30 -f ./nacos-alloc.svg 1232

Use lock to inspect locks. The other modes are not tried one by one here; cpu is the mode used most often.

  • -i N sets the sampling interval. The default is 10ms. You can use ms, us or s as the unit; a value without a unit is in nanoseconds. For example:
./profiler.sh -i 500us -d 30 1232
  • -j N sets the maximum stack depth to sample

  • -o fmt sets the output format. The options are summary, traces, flat, jfr, collapsed, svg and tree; svg is the most common.
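To make the unit rule for -i concrete, here is a small sketch that converts an interval written the way the option accepts it into nanoseconds. It only mirrors the rule described above (ms, us, s, or no unit meaning nanoseconds); the function name interval_to_ns is hypothetical and this is not how async-profiler itself parses the value.

```shell
#!/bin/sh
# interval_to_ns (hypothetical helper): convert an interval such as
# 10ms, 500us, 1s or 250000 into nanoseconds.
interval_to_ns() {
    case "$1" in
        *ms) echo $(( ${1%ms} * 1000000 )) ;;      # milliseconds
        *us) echo $(( ${1%us} * 1000 )) ;;         # microseconds
        *s)  echo $(( ${1%s} * 1000000000 )) ;;    # seconds
        *)   echo "$1" ;;                          # no unit: already nanoseconds
    esac
}

# Example:
#   interval_to_ns 500us
```

The ms and us cases are listed before the s case on purpose, because a plain *s pattern would also match values ending in ms or us.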

More options can be found on the async-profiler GitHub home page.

An example: gateway load test

I once load-tested a fully asynchronous gateway. RPS stayed at around 2000 and would not go any higher, while CPU consumption was relatively high, so I used async-profiler for CPU analysis and generated a flame graph.

The flame graph showed a wide and deep stack that consumed a lot of CPU. At the original image size it was hard to read, so I zoomed in to see what it was.

The class names showed that it was log4j. My guess was that the code was logging too often somewhere. I found the logging call, removed it and ran the load test again, and RPS did rise to 5000. A small log statement had that much impact.
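A suspicion like this can be quantified from the same profile. The sketch below assumes the profile was also saved in the collapsed format (-o collapsed) and prints the percentage of samples whose stack contains a given substring. The helper name stack_share and the file name in the usage comment are hypothetical, and the original test did not record such a number.

```shell
#!/bin/sh
# stack_share PATTERN (hypothetical helper): read collapsed stacks on stdin
# and print the percentage of samples whose stack contains PATTERN.
# PATTERN is matched as a plain substring, not as a regular expression.
stack_share() {
    awk -v pat="$1" '{
        count = $NF                          # last field is the sample count
        all += count
        if (index($0, pat) > 0) hit += count
    }
    END {
        if (all > 0) printf "%.1f\n", 100 * hit / all
    }'
}

# Example (assumes ./gateway.txt was written with -o collapsed):
#   stack_share log4j < ./gateway.txt
```

Running it before and after a change gives a rough measure of how much CPU time the suspected code path accounts for.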

How it works

By now you should be able to use async-profiler for CPU profiling. If you are interested in how async-profiler is implemented, this article covers it in detail:

《JVM CPU Profiler: technical principles and in-depth source code analysis》 https://mp.weixin.qq.com/s/RKqmy8dw7B7WtQc6Xy2CLA

To summarize:

  • CPU profilers are generally implemented in one of two ways: (1) sampling and (2) instrumentation. Sampling has low overhead but lower accuracy; instrumentation (similar to AOP) is accurate but has a large performance impact.
  • Ordinary sampling can only take samples at a safepoint, which skews the statistics. This is the Safepoint bias problem mentioned at the beginning. For example, some methods run only briefly but are executed very often and really do consume CPU. If the sampling interval is not short enough, they may never be sampled, but sampling too frequently hurts performance. This is the shortcoming of typical sampling-based CPU profilers.
  • async-profiler is also based on sampling, but it does not have the Safepoint bias problem. It samples through a function called AsyncGetCallTrace, which does not require the thread to be at a safepoint. This function is not easy to call, though; it takes some tricks ("black magic") to get it working.
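For contrast, a naive sampler of the kind described in the second point can be put together from the JDK's jstack tool. This is only a sketch for illustration: the function names are hypothetical, the parsing assumes the usual HotSpot thread dump layout, and because HotSpot takes thread dumps at a safepoint, the counts it produces are subject to exactly the bias discussed above.

```shell
#!/bin/sh
# runnable_top_frames (hypothetical helper): read jstack output on stdin and
# print the top frame of every thread whose state is RUNNABLE.
runnable_top_frames() {
    awk '
        /java.lang.Thread.State: RUNNABLE/ { want = 1; next }
        want && $1 == "at" {
            frame = $2
            sub(/\(.*$/, "", frame)    # strip "(File.java:NN)"
            print frame
            want = 0
            next
        }
        /^$/ { want = 0 }              # thread block ended without frames
    '
}

# naive_sample PID ROUNDS PAUSE (hypothetical helper): take ROUNDS thread
# dumps, PAUSE seconds apart, and count how often each top frame was seen.
naive_sample() {
    pid=$1; rounds=$2; pause=$3
    i=0
    while [ "$i" -lt "$rounds" ]; do
        jstack "$pid" | runnable_top_frames
        sleep "$pause"
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}

# Example:
#   naive_sample 1232 100 0.1
```

Each jstack call is also far more expensive than one async-profiler sample, so this approach cannot sample at short intervals without disturbing the application.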

It is the third point, the absence of Safepoint bias, that made me write this article to recommend this CPU profiling tool. For example, I also used a different profiler (uber-common/jvm-profiler) to analyze the CPU usage of the gateway above and generated a flame graph from it.

With that flame graph there was no way to find the problem. Arthas, open-sourced by Alibaba, also uses async-profiler for its CPU analysis. So why not give it a try?



Copyright notice
This article was written by [roshilikang]. Please include a link to the original when reposting. Thanks.
https://chowdera.com/2021/09/20210909112309750i.html
