
Exhibition cloud technology interpretation | In an emergency, how can an app do crash analysis and performance monitoring well?

2020-11-09 22:32:23 Jingdong Zhilian cloud developer

Yun Mei's reading guide:

The "Exhibition cloud technology interpretation" series has already published articles on security design and intelligent recommendation, covering the eight strictest security measures, online exhibitions built with a service design methodology, and the road to building intelligent recommendation for exhibition scenarios. This article continues to explore the technology behind the exhibition cloud and looks at how its most important online platform handles performance monitoring and crash analysis.

The exhibition cloud solution covers four levels: **business, technology, platform, and application**. At the business level it provides technology-rich cloud exhibition halls and forums and conferences in various formats. At the application level it offers many solutions, including live streaming, video conferencing, and mobile R&D. At the technology level it relies on massive elastic cloud computing capacity and ample, scalable cloud storage and bandwidth. It also integrates the capabilities of several Jingdong platforms (the technology, data, intelligence, and business middle platforms), so it can respond quickly to the needs of front-end applications.

For a cloud exhibition, the window finally presented to participants is its online systems: **websites, apps, H5 pages, and mini programs**. Within the Jingdong exhibition cloud, the EMOP platform, a one-stop app solution, is an important component. It can provide exhibition cloud organizers with a range of mobile clients, including apps, H5 pages, and mini programs, along with one-stop services that span the whole business process: planning, design, R&D, operations, and maintenance.

The China International Fair for Trade in Services (CIFTIS for short), which was recently held successfully, adopted a new model for the first time: integrated online and offline exhibitions. The operations and maintenance capabilities of the EMOP platform safeguarded the stable running of the CIFTIS app.

More than 17,000 enterprises took part. As an important outward-facing window of the cloud exhibition, the CIFTIS app gave **exhibitors and purchasers of different sizes and industries from around the world a digital platform for exhibitions, forums and conferences, and negotiation and contract signing.**

During the exhibition, how did the CIFTIS app carry the visitor traffic? How was the quality of the live app guaranteed? How were online crashes fixed effectively? How were user retention and activity improved?

Before answering these questions, consider a Jingdong example. In 2020, Jingdong's 618 turnover exceeded 269.2 billion yuan, a record high. During the 618 promotion, huge numbers of users browsed products, joined flash sales, and placed orders in the Jingdong app. The app had to withstand traffic at the hundred-million scale while still providing a good user experience, and that depended on the technology behind it.

When a large exhibition brings a surge of visitors or an emergency occurs, how can developers locate problems quickly and fix them in time? The answer lies in two systems working behind the scenes: the **crash analysis system and the mobile performance monitoring system**.

**I. Jingdong mobile crash analysis system**

The crash system is a service platform that provides crash monitoring and crash module location for mobile apps. By monitoring and analyzing an app's crash data, it helps reduce crash frequency and improve user satisfaction. Its features include **minimal-effort integration on Android and iOS, real-time monitoring, comprehensive information, security and stability, and detailed statistics**.

The crash analysis system safeguards the stable running of an app and has already been validated in apps such as Jingdong Wanshang, Jingdong Home, and Seven Fresh. The CIFTIS app integrated the crash system for two purposes: to monitor the app at all times and protect the stable running of functions such as exhibitions, forums and conferences, and negotiation and contract signing; and to locate the crashing module quickly so that problems can be fixed in time.

How does the crash analysis system monitor crashes?

  1. **Crash capture:** supports native crashes, OOM crashes, and reporting of custom errors or exceptions;

  2. **Crash analysis:** clusters crashes by crash module, so the main characteristics of a problem can be found quickly from the clustered data;

  3. **Cross-platform exceptions:** supports querying cross-platform exception modules and parsing cross-platform exception data (RN, Flutter), so cross-platform problems can be located quickly;

  4. **Advanced features:** supports detailed search and query over many kinds of exception information, raw data export, and side-by-side comparison of results from different query conditions.
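Clustering by crash module can be pictured with a minimal Python sketch. It assumes each crash report is a dict with a hypothetical `stack` list of `module.Class.method` frames, and that the "crash module" is simply the module of the top frame. The real system's report format and clustering rules are not described in this article.

```python
from collections import Counter


def crash_module(report):
    """Return the module of the top stack frame (hypothetical report format)."""
    top_frame = report["stack"][0]        # e.g. "cart.CartAdapter.bind"
    return top_frame.split(".", 1)[0]     # keep only the leading module name


def cluster_by_module(reports):
    """Count crashes per module, most frequent module first."""
    counts = Counter(crash_module(r) for r in reports)
    return counts.most_common()


# Made-up sample reports, for illustration only.
reports = [
    {"type": "native", "stack": ["cart.CartAdapter.bind", "ui.List.draw"]},
    {"type": "oom", "stack": ["image.Decoder.decode", "ui.ImageView.load"]},
    {"type": "native", "stack": ["cart.CartPresenter.refresh", "net.Client.call"]},
]
clusters = cluster_by_module(reports)
print(clusters)
```

Sorting the clusters by size puts the module responsible for the most crashes at the top, which is the kind of "main characteristic" a developer would look at first.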

**II. Jingdong mobile performance monitoring system**

The mobile performance monitoring system aims to establish a unified framework for integrating application performance monitoring. Through a range of performance monitoring schemes, it collects client data, displays abnormal data, helps locate abnormal problems, and outputs performance reports, so that developers can find product performance problems quickly and optimize app performance.

Performance analysis is like an experienced doctor of traditional Chinese medicine taking the app's pulse: all sorts of difficult and complicated ailments can be diagnosed quickly.

Once the CIFTIS app was running online during the fair, performance analysis gave real-time visibility into application startup performance, page loading performance, network requests, and more, located the causes of anomalies, and protected the user experience of exhibitors, buyers, media reporters, and others.

How does the performance analysis system monitor every aspect?

Performance monitoring is realized through the following functions: **startup monitoring, stutter monitoring, network monitoring, webview monitoring, native page monitoring, log reporting, installation package analysis tools, and memory analysis tools**.

(1) Startup monitoring

Startup monitoring measures how long online users take to start the application and locates where that time is spent.

It works non-invasively and divides startup into three stages, recording the time spent in each:

Stage 1: the initialization of the Application;

Stage 2: the time from the end of Application initialization to the start of the user's home page;

Stage 3: the time from the start of home page creation to its full display.

Startup time = stage 1 time + stage 2 time + stage 3 time.

In addition, the execution time of the application's main lifecycle methods during startup is also reported, so users can break down the startup process with intuitive data.
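The three-stage breakdown can be sketched as follows. This is an illustrative Python model, not the platform's SDK: the `StartupTrace` class and its four timestamp fields are hypothetical names, and the stage boundaries are one reading of the description above.

```python
from dataclasses import dataclass


@dataclass
class StartupTrace:
    """Four timestamps (ms) marking the stage boundaries; names are hypothetical."""
    app_init_start_ms: int      # Application initialization begins
    app_init_end_ms: int        # Application initialization ends
    home_create_start_ms: int   # home page creation begins
    home_displayed_ms: int      # home page is fully displayed

    def stages(self):
        """Time spent in each of the three startup stages."""
        return {
            "stage1_app_init": self.app_init_end_ms - self.app_init_start_ms,
            "stage2_init_to_home": self.home_create_start_ms - self.app_init_end_ms,
            "stage3_home_render": self.home_displayed_ms - self.home_create_start_ms,
        }

    def total_ms(self):
        """Startup time = stage 1 + stage 2 + stage 3."""
        return sum(self.stages().values())


# Made-up timestamps, for illustration only.
trace = StartupTrace(0, 320, 450, 1180)
print(trace.stages(), trace.total_ms())
```

Reporting the per-stage values rather than only the total is what lets a developer see which stage dominates a slow start.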

(2) Stutter monitoring

Stutter (jank) has many causes. Among them, a blocked main thread has the most serious impact and may leave the user unable to use any function of the mobile app.

How stutter monitoring works:

① The app's UI thread message mechanism

The app's UI thread is a Looper thread that maintains a message queue. The Looper consumes the tasks in the queue in a loop. If the queue contains a time-consuming operation, UI drawing tasks are delayed and the interface stutters.

② Collecting the run time of UI thread messages

On both Android and iOS, an AOP (aspect-oriented programming) approach is used to monitor the UI thread's whole message-processing flow and collect the execution time of every message.

③ The stutter condition (message execution time > stutter threshold)

After the execution time of each message is collected, it is automatically compared with the stutter threshold. A message that exceeds the threshold is considered to have run too long and to have stalled the main thread.

④ Sampling

While the main thread executes, a sampling thread samples the main thread's stack, CPU usage, and other information. The sampling thread sleeps for a period before sampling, mainly so that it does not disturb the majority of short messages on the main thread or compete with them for CPU resources and degrade performance.
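The threshold check and the sleeping sampler can be imitated in a few lines of Python. This is a simplified sketch rather than the Android or iOS implementation: the `StutterMonitor` class, its `dispatch` method, and the 0.1-second threshold are assumptions, and a real monitor would reuse one sampling thread instead of starting one per message.

```python
import sys
import threading
import time
import traceback


class StutterMonitor:
    """Times each message and samples the dispatching thread's stack if it runs long."""

    def __init__(self, threshold_s=0.1):
        self.threshold_s = threshold_s
        self.stutters = []  # (message name, elapsed seconds, sampled stack text)

    def dispatch(self, name, handler):
        """Run one message handler with timing hooks before and after it."""
        ui_thread_id = threading.get_ident()
        done = threading.Event()
        sample = []

        def sampler():
            # Sleep first: a message shorter than the threshold finishes before
            # the timeout, so it is never sampled and costs almost nothing.
            if not done.wait(self.threshold_s):
                frame = sys._current_frames().get(ui_thread_id)
                if frame is not None:
                    sample.extend(traceback.format_stack(frame))

        watcher = threading.Thread(target=sampler, daemon=True)
        watcher.start()
        start = time.perf_counter()
        try:
            handler()
        finally:
            elapsed = time.perf_counter() - start
            done.set()
            watcher.join()
        # Stutter condition: message execution time > stutter threshold.
        if elapsed > self.threshold_s:
            self.stutters.append((name, elapsed, "".join(sample)))


def slow_handler():
    time.sleep(0.3)  # simulates a time-consuming task on the UI thread


monitor = StutterMonitor(threshold_s=0.1)
monitor.dispatch("fast", lambda: None)
monitor.dispatch("slow", slow_handler)
for name, elapsed, stack in monitor.stutters:
    print(name, round(elapsed, 2))
```

Only the slow message is recorded, and its sampled stack points at `slow_handler`, which is the information a developer needs to find the blocking call.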

(3) Network monitoring

Network monitoring data falls into two categories: **native network monitoring and image monitoring**.

Native network monitoring mainly monitors interface (API) performance and abnormal data. Image monitoring adds CDN node data reporting on top of network monitoring.

The network monitoring technical scheme:

① ASM bytecode instrumentation is used to hook the network behavior of the app's underlying network components;

② Network monitoring as a whole covers both performance data and abnormal data. It meets basic needs for anomaly alerting and can also aggregate and analyze online performance data.

(4) Native page performance monitoring

Native page monitoring uses a more lightweight data collection strategy. While a page is running, it collects the device's frame rate, CPU usage, memory usage, thread count, and traffic data to reflect the page's runtime performance.

The technical scheme:

① Frame rate collection

Frame rate collection listens for the page's refresh events through a system callback. When the page starts refreshing, the drawing time of each frame is calculated, and the real frame rate is computed as number of frames / total time.
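The frame rate calculation can be illustrated with a short Python sketch. It assumes the system callback delivers one timestamp (in milliseconds) per refreshed frame; the function names and the sample timestamps are made up for illustration.

```python
def frame_times(frame_timestamps_ms):
    """Drawing time of each frame: the gap between consecutive refresh callbacks."""
    return [b - a for a, b in zip(frame_timestamps_ms, frame_timestamps_ms[1:])]


def frame_rate(frame_timestamps_ms):
    """Real frame rate = number of frames / total time, in frames per second."""
    if len(frame_timestamps_ms) < 2:
        return 0.0
    frames = len(frame_timestamps_ms) - 1
    total_s = (frame_timestamps_ms[-1] - frame_timestamps_ms[0]) / 1000.0
    return frames / total_s if total_s > 0 else 0.0


# Six refresh callbacks; the fourth frame took 50 ms instead of about 16 ms.
timestamps = [0, 16, 33, 50, 100, 116]
print(frame_times(timestamps))
print(round(frame_rate(timestamps), 1))
```

A single long frame pulls the average well below the display's nominal refresh rate, which is why the per-frame drawing times are worth keeping alongside the aggregate figure.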

② CPU, memory, and thread data collection

CPU usage, memory usage, and thread count reflect the resource consumption of a native page. The performance monitoring system collects samples at equal time intervals.

  • CPU monitoring reports the current page's CPU usage, which can also indicate power consumption problems;

  • Memory usage shows how much memory the current page consumes and how much memory is available in the JVM, from which the rate of hitting the memory ceiling and the page's memory jitter can be calculated;

  • Thread count monitoring shows whether the number of threads in a native page exceeds the limit, which helps catch OOM problems caused by too many threads.
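The article does not define how the memory ceiling rate or memory jitter are computed, so the following Python sketch is only one plausible reading. It assumes the ceiling rate is the share of equal-interval samples at or above a chosen fraction of the maximum available memory, and that jitter is the spread between the largest and smallest sample. The function names and the 0.8 ratio are hypothetical.

```python
def ceiling_hit_rate(used_mb, max_mb, ratio=0.8):
    """Share of samples where used memory reaches `ratio` of the maximum (assumed definition)."""
    if not used_mb:
        return 0.0
    hits = sum(1 for used in used_mb if used / max_mb >= ratio)
    return hits / len(used_mb)


def memory_jitter(used_mb):
    """Spread between the highest and lowest sample (assumed definition)."""
    return max(used_mb) - min(used_mb) if used_mb else 0


# Made-up samples taken at equal time intervals while one page is on screen.
samples_mb = [120, 180, 210, 150, 205]
print(ceiling_hit_rate(samples_mb, max_mb=256))
print(memory_jitter(samples_mb))
```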

(5) Webview monitoring technical scheme

The Jingdong performance monitoring system supports performance monitoring for Webview and the Tencent X5 kernel. The core monitoring indicators are the window.performance.timing parameters, which record the time spent in each part of the webview loading process, as shown in the figure below:

From the data returned by the browser kernel, performance indicators such as total page load time, network request time, DOM loading time, and white screen time (the time the user spends waiting) can be calculated.
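As a rough illustration, the indicators can be derived from the timing fields like this. The field names come from the Navigation Timing `PerformanceTiming` interface, but the exact formulas used by the Jingdong system are not given in the article, so the ones below (in particular white screen time as `responseStart - navigationStart`) are common conventions assumed for the sketch. The `webview_metrics` function and the sample values are hypothetical.

```python
def webview_metrics(timing):
    """Derive durations in ms from a dict of PerformanceTiming fields (assumed formulas)."""
    return {
        "total_load": timing["loadEventEnd"] - timing["navigationStart"],
        "network_request": timing["responseEnd"] - timing["requestStart"],
        "dom_loading": timing["domComplete"] - timing["domLoading"],
        "white_screen": timing["responseStart"] - timing["navigationStart"],
    }


# Made-up timestamps (ms) as they might be reported from a webview.
timing = {
    "navigationStart": 1000,
    "requestStart": 1040,
    "responseStart": 1120,
    "responseEnd": 1200,
    "domLoading": 1210,
    "domComplete": 1700,
    "loadEventEnd": 1750,
}
metrics = webview_metrics(timing)
print(metrics)
```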

The above covers the principles and implementation of how the two systems support the CIFTIS app at the monitoring level. Both come from the Jingdong EMOP platform, an enterprise mobile R&D platform that combines the R&D experience and best practices of Jingdong's own apps into a one-stop solution for mobile development. It helps enterprises build a strong mobile platform, quickly create high-quality apps, mini programs, and other mobile products, support new business development, and carry out their mobile transformation smoothly.

EMOP supports private cloud and public cloud deployment and offers four stable development frameworks and seven major mobile development technology and component capabilities. Drawing on the R&D best practices of the Mobile Jingdong app, it closes the loop across multiple services while addressing cost, quality, efficiency, and standardization, and has helped multiple teams improve app R&D quality and reduce costs. For details, see https://emop.jd.com/home/.


Copyright notice
This article was written by [Jingdong Zhilian cloud developer]. Please include a link to the original when reposting. Thank you.