Recently, Lao Huang has been busy with Double 11-related work, so the blog and GitHub haven't been updated much; during this period he also built quite a few things at the company.
Here's a brief share about business monitoring.
Let's start with the background.
During the first wave of Double 11, one business line had no real-time business dashboard. At the review meeting, the technical colleagues were criticized by the leaders and business staff: there was no way to clearly understand the situation at the time, so strategies couldn't be adjusted promptly and effectively.
Lao Huang later learned that this is an old business with tight resources. The team didn't dare query the database in real time (it's a single point); if the database went down, the business would go down with it.
To avoid another embarrassing situation, and to avoid being criticized again, something had to be done.
Analyze the status quo
3 applications, all .NET Framework projects running on Windows servers, with no containerization.
Only a few days to go before Double 11, so nothing drastic could be changed, while the business unit's new requirements still had to be handled.
The plans that came to mind at the time
- Plan 1: the business connects to Prometheus, combined with Grafana
- Plan 2: the business writes to MQ, the data is consumed into ES, and a front-end panel is built on top
- Plan 3: the business writes logs, connects to Log Service, and uses its dashboards
A general analysis
- Plan 1: the business team has almost zero knowledge of Prometheus, and learning the relevant concepts would take a lot of time. Pass.
- Plan 2: the MQ currently in use is Tencent Cloud CMQ, which has already burned us twice, and the team can't really operate ES either. Pass.
- Plan 3: log according to the internal standard; the business side only needs to add one line of logging at the key places, then leave collection to Logtail, which uploads to Log Service.
So among these three plans, Lao Huang chose the third.
First, Log Service is already connected to various internal systems and the team is familiar with it; second, it won't affect the main business, since all that's needed is to bury points at the key places and add logging.
Although it's intrusive to the business code, it is without doubt the best solution at this stage.
The overall implementation logic is as follows.
Burying the points in the business code is actually a very simple, yet the most important, step.
We already have a corresponding log helper class, so all that's left to deal with is the log content itself.
SerilogHelper.Info($"[field1] [field2] [field3]", "metrics_name");
Lao Huang's convention here: each field's content goes inside the square brackets, and fields are separated by a single space.
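The post doesn't show the helper class internals, so purely as a sketch, a line produced under this convention (with made-up field values) would look like this:

```python
# Illustrates the agreed log-line layout only: each field wrapped in []
# and fields separated by one space. Field values here are invented.
def format_metric_line(*fields):
    return " ".join(f"[{f}]" for f in fields)

line = format_metric_line("order_paid", "199.00", "wechat")
assert line == "[order_paid] [199.00] [wechat]"
```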
The logs are then written to a specific directory, waiting for Logtail to collect them into Log Service.
Of course, there is one pitfall here: the log file encoding. It was specified as UTF-8, but the files actually produced were UTF-8 with BOM.
This causes the first log line to fail to parse correctly, so pay special attention to it.
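To make the pitfall concrete, here is a minimal sketch (pattern and field names are assumptions, not the actual config): a file saved as "UTF-8 with BOM" starts with the invisible character U+FEFF, so a regex anchored at the start of the line fails on line 1 only.

```python
import re

# A parse regex anchored at the beginning of the line; field names are made up.
pattern = re.compile(r'^\[(?P<metric>[^\]]*)\] \[(?P<value>[^\]]*)\]$')

# Simulate file content written as UTF-8 with BOM: U+FEFF precedes line 1.
content = "\ufeff[order_paid] [100]\n[order_paid] [200]\n"
lines = content.splitlines()

assert pattern.match(lines[0]) is None       # first line: hidden BOM breaks the match
assert pattern.match(lines[1]) is not None   # later lines parse fine

# Stripping the BOM (or writing plain UTF-8 to begin with) fixes it.
clean = lines[0].lstrip("\ufeff")
assert pattern.match(clean) is not None
```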
Data access and display
With the code-level step above done, the next thing is log ingestion.
Given the business scenario of one metric per application, we need to create three Logstores, choosing the retention period as needed; the default is permanent.
Once the Logstores are created, the data source has to be connected; here Lao Huang chose "Regex - Text Log".
After that comes a pile of regex-related configuration.
The most important part is the Logtail configuration step.
The log path is the path where the program writes its log output, so Logtail knows where to collect from.
The regex is the crucial piece: Logtail parses each log line according to this rule and extracts it into fields, which makes later queries much more convenient.
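As an illustration of what this extraction does (the actual pattern and field names aren't given in the post, so these are assumptions), Logtail essentially applies one regex over the whole line, with a capture group per field:

```python
import re

# One regex over the whole line, one named capture group per field.
# Pattern and field names are assumptions for illustration only.
LOG_PATTERN = re.compile(
    r'^\[(?P<metric>[^\]]*)\] \[(?P<value>[^\]]*)\] \[(?P<channel>[^\]]*)\]$'
)

def parse_line(line):
    """Return a dict of extracted fields, or None if the line doesn't match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

assert parse_line("[order_paid] [199.00] [wechat]") == {
    "metric": "order_paid", "value": "199.00", "channel": "wechat"
}
assert parse_line("not a metric line") is None
```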
Next comes the query and analysis configuration.
Here you specify the fields to index for statistics, and also turn off full-text indexing, because in this scenario a full-text index is meaningless and just wastes money.
At this point, the data can be collected.
The last thing to do is query the results. As long as you know some simple SQL, doing statistics with Log Service is certainly not a problem; the difficulty is quite low.
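The post doesn't show the actual queries, but as a hypothetical example (the field name `channel` is an assumption), a Log Service analytic query for a dashboard panel might look like:

```sql
-- Hypothetical SLS analytic query: count log entries per channel
-- over the selected time range. Field name is an assumption.
* | SELECT channel, COUNT(*) AS pv GROUP BY channel ORDER BY pv DESC
```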
Here is the actual effect.
There's a lot of redaction in the screenshot, so please bear with it.
Since the console provides auto-refresh and full-screen features, once the dashboard is up on the big screen no manual intervention is needed.
The complaint came on Monday night, work happened Tuesday morning and Tuesday afternoon, and the results were out on Wednesday. That was a genuinely fierce pace.
It has to be said that Alibaba Cloud's Log Service really does simplify a lot of tedious work.
That said, the way logs are extracted still has room for improvement; it's a bit awkward.