
Building a request log analysis system

2020-11-07 20:55:54 rife

What data a request log records

  • time_local : Time of the request
  • remote_addr : Client IP address
  • request_method : Request method
  • request_schema : Request protocol, commonly http or https
  • request_host : Requested domain name
  • request_path : Requested path (route)
  • request_query : Query parameters of the request
  • request_size : Size of the request
  • referer : Source address of the request. Suppose a.com contains a link to b.com; when a user clicks that link on a.com to visit b.com, the referer recorded is a.com. This value is set by the browser
  • user_agent : Information about the client browser
  • status : Response status of the request
  • request_time : Time taken to serve the request
  • bytes_sent : Size of the response
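As an illustration of how such a record might be consumed, the sketch below assumes the gateway writes one JSON object per line with every value stored as a string. That format, the helper name parse_line and all sample values are assumptions made for this example, not something the field list prescribes.

```python
import json

# Hypothetical sample: one JSON object per request, keyed by the field names
# listed above. All values are made up for illustration.
SAMPLE_LINE = (
    '{"time_local": "2020-11-07T20:55:54+08:00", "remote_addr": "192.0.2.10", '
    '"request_method": "GET", "request_schema": "https", "request_host": "b.com", '
    '"request_path": "/search", "request_query": "q=log", "request_size": "312", '
    '"referer": "https://a.com/", "user_agent": "Mozilla/5.0", "status": "200", '
    '"request_time": "0.042", "bytes_sent": "5120"}'
)

INT_FIELDS = ("request_size", "status", "bytes_sent")
FLOAT_FIELDS = ("request_time",)


def parse_line(line: str) -> dict:
    """Decode one JSON log line and convert numeric fields from strings."""
    record = json.loads(line)
    for name in INT_FIELDS:
        record[name] = int(record[name])
    for name in FLOAT_FIELDS:
        record[name] = float(record[name])
    return record


record = parse_line(SAMPLE_LINE)
```

Converting the numeric fields at parse time means later aggregation (sums, percentiles, status ranges) can work on numbers directly.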

In most cases a load-balancing gateway proxies requests to the actual back-end services. The request log then also includes the following data:

  • upstream_host : Host the proxy forwards to
  • upstream_addr : IP address the proxy forwards to
  • upstream_url : URL of the service the proxy forwards to
  • upstream_status : Status returned by the upstream service
  • proxy_time : Time spent in proxy forwarding

Data derivation

The client IP address can be used to derive the following data:

  • ASN information:

    • asn_asn : Autonomous system number. IP addresses are managed by autonomous systems; for example, China Unicom's Shanghai network manages all of Shanghai Unicom's IP addresses
    • as_org : Autonomous system organization, for example China Mobile or China Unicom
  • Geolocation information:

    • geo_location : Longitude and latitude
    • geo_country : Country
    • geo_country_code : Country code
    • geo_region : Region (province)
    • geo_city : City
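The derivation above amounts to finding the network prefix that contains the client address and attaching that prefix's attributes to the record. The sketch below shows the idea with a tiny in-memory table; a real system would load the mapping from an IP database instead. The table IP_TABLE, the function derive_ip_fields and every value in it are hypothetical placeholders (documentation address ranges, private-use AS numbers, the user-assigned country code "ZZ").

```python
import ipaddress

# Hypothetical lookup table standing in for a real IP database.
# 192.0.2.0/24 and 198.51.100.0/24 are documentation-only ranges and
# AS 64512/64513 are private-use numbers; every value is a placeholder.
IP_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), {
        "asn_asn": 64512, "as_org": "Example ISP A",
        "geo_location": (0.0, 0.0), "geo_country": "Exampleland",
        "geo_country_code": "ZZ", "geo_region": "Region A", "geo_city": "City A",
    }),
    (ipaddress.ip_network("198.51.100.0/24"), {
        "asn_asn": 64513, "as_org": "Example ISP B",
        "geo_location": (1.0, 1.0), "geo_country": "Exampleland",
        "geo_country_code": "ZZ", "geo_region": "Region B", "geo_city": "City B",
    }),
]


def derive_ip_fields(remote_addr: str) -> dict:
    """Return the derived fields for an address, or {} when no prefix matches."""
    address = ipaddress.ip_address(remote_addr)
    for network, fields in IP_TABLE:
        if address in network:
            return dict(fields)  # copy so callers cannot mutate the table
    return {}


derived = derive_ip_fields("192.0.2.10")
unknown = derive_ip_fields("203.0.113.5")
```

A linear scan is enough for a sketch; production IP databases use a prefix tree or sorted ranges so that lookups stay fast across millions of prefixes.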

The user_agent can be parsed into the following information:

  • ua_device : Device used
  • ua_os : Operating system
  • ua_name : Browser
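To make the three fields concrete, here is a deliberately crude sketch that classifies a user_agent string with substring checks. The function parse_user_agent and its rules are assumptions for illustration only; real user-agent strings vary widely, and a maintained parsing library would normally be used instead.

```python
def parse_user_agent(ua: str) -> dict:
    """Very rough user_agent classification using substring checks."""
    # Order matters: Edge UAs also contain "Chrome/", and Chrome UAs also
    # contain "Safari/", so the more specific token is tested first.
    if "Edg/" in ua:
        name = "Edge"
    elif "Chrome/" in ua:
        name = "Chrome"
    elif "Firefox/" in ua:
        name = "Firefox"
    elif "Safari/" in ua:
        name = "Safari"
    else:
        name = "Other"

    # Android UAs contain "Linux" and iOS UAs contain "like Mac OS X",
    # so those platforms are tested before Linux and macOS.
    if "Windows NT" in ua:
        os_name = "Windows"
    elif "Android" in ua:
        os_name = "Android"
    elif "iPhone" in ua or "iPad" in ua:
        os_name = "iOS"
    elif "Mac OS X" in ua:
        os_name = "macOS"
    elif "Linux" in ua:
        os_name = "Linux"
    else:
        os_name = "Other"

    if "iPad" in ua:
        device = "tablet"
    elif "Mobile" in ua:
        device = "mobile"
    else:
        device = "desktop"

    return {"ua_device": device, "ua_os": os_name, "ua_name": name}


desktop_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/86.0.4240.183 Safari/537.36")
phone_ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 14_0 like Mac OS X) "
            "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 "
            "Mobile/15E148 Safari/604.1")

desktop_info = parse_user_agent(desktop_ua)
phone_info = parse_user_agent(phone_ua)
```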

Data analysis

  • PV / QPS : Page views / requests per second
  • UV : Number of unique visitors. Many websites can be visited without logging in; in that case a user can be identified by the uniqueness of IP + user_agent
  • IP count : Number of distinct source IP addresses
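These counters can be sketched over a list of parsed records as follows. The sample records are made up, each request is counted as one page view, and time_local is assumed to have one-second resolution so that grouping by it yields requests per second.

```python
from collections import Counter

# Hypothetical parsed records; only the fields needed here are shown.
records = [
    {"time_local": "2020-11-07T20:55:54", "remote_addr": "192.0.2.10", "user_agent": "UA-1"},
    {"time_local": "2020-11-07T20:55:54", "remote_addr": "192.0.2.10", "user_agent": "UA-2"},
    {"time_local": "2020-11-07T20:55:55", "remote_addr": "192.0.2.10", "user_agent": "UA-1"},
    {"time_local": "2020-11-07T20:55:55", "remote_addr": "192.0.2.11", "user_agent": "UA-1"},
    {"time_local": "2020-11-07T20:55:55", "remote_addr": "192.0.2.11", "user_agent": "UA-1"},
]

# PV: every request counts once.
pv = len(records)

# UV: distinct (IP, user_agent) pairs stand in for distinct users.
uv = len({(r["remote_addr"], r["user_agent"]) for r in records})

# IP count: distinct source addresses.
ip_count = len({r["remote_addr"] for r in records})

# QPS: requests grouped by second; the busiest second gives the peak.
requests_per_second = Counter(r["time_local"] for r in records)
peak_qps = max(requests_per_second.values())
```

Identifying users by IP + user_agent is only an approximation: several users behind one NAT with the same browser collapse into one visitor, and one user who changes networks is counted more than once.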


  • Network traffic : request_size (the request size) is summed to measure inbound traffic, and bytes_sent (the response size) to measure outbound traffic


  • Referer source analysis


  • Geographic analysis of client requests : based on the geo data derived from the IP address



  • Client device analysis : based on the data extracted from user_agent


  • Request latency statistics : based on request_time

    • p99, p95, p90 latency (percentiles of the request time; for example, p99 is the time within which the fastest 99% of requests complete)
    • Monitoring for abnormally slow requests
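One simple way to compute these figures is the nearest-rank percentile, sketched below. The sample request_time values are synthetic, and using the p99 value as the threshold for "abnormally slow" is an assumption of this example, not a rule from the text.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value that at least p% of values do not exceed."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic request_time values in seconds: 0.001, 0.002, ..., 0.100
request_times = [i / 1000 for i in range(1, 101)]

p90 = percentile(request_times, 90)
p95 = percentile(request_times, 95)
p99 = percentile(request_times, 99)

# Assumed rule for this sketch: anything slower than p99 is flagged.
slow_requests = [t for t in request_times if t > p99]
```

Percentiles are usually preferred over the mean for latency because a handful of very slow requests can hide behind a healthy-looking average.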

  • Response status monitoring : based on status

    • Proportion of responses for each status code
    • Number of 5xx server errors
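Both figures fall out of a frequency count of the status field. A minimal sketch with made-up status values:

```python
from collections import Counter

# Hypothetical status values taken from parsed records.
statuses = [200, 200, 200, 304, 404, 500, 502, 200]

status_counts = Counter(statuses)
total = len(statuses)

# Share of all responses held by each status code.
status_proportions = {code: count / total for code, count in status_counts.items()}

# Number of responses in the 5xx (server error) range.
server_error_count = sum(
    count for code, count in status_counts.items() if 500 <= code <= 599
)
```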

  • Business analysis : the request_path and request_query of a request always correspond to a specific business function, for example

    • If the address of an album is /album/:id, then a matching request_path in the log is a visit to that album
    • If the address for site search is /search?q=<keyword>, then counting the requests whose request_path is /search gives the number of searches, and collecting the q parameter from request_query gives the search keywords
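The two examples above can be sketched as follows. The regular expression for the /album/:id route and the sample records are assumptions made for illustration; request_query is assumed to hold the raw query string without the leading "?".

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed pattern for the /album/:id route: one path segment after /album/.
ALBUM_ROUTE = re.compile(r"^/album/([^/]+)$")

# Hypothetical parsed records; only the fields needed here are shown.
records = [
    {"request_path": "/album/42", "request_query": ""},
    {"request_path": "/album/42", "request_query": ""},
    {"request_path": "/album/7", "request_query": ""},
    {"request_path": "/search", "request_query": "q=log+analysis"},
    {"request_path": "/search", "request_query": "q=kafka"},
    {"request_path": "/search", "request_query": "q=kafka"},
]

album_views = Counter()      # album id -> number of visits
search_keywords = Counter()  # keyword -> number of searches
search_count = 0

for record in records:
    match = ALBUM_ROUTE.match(record["request_path"])
    if match:
        album_views[match.group(1)] += 1
    elif record["request_path"] == "/search":
        search_count += 1
        # parse_qs decodes "+" and percent-escapes and returns a list per key.
        for keyword in parse_qs(record["request_query"]).get("q", []):
            search_keywords[keyword] += 1
```

Here album_views.most_common() would rank albums by visits and search_keywords.most_common() would rank search terms.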

Common architecture

Building the log system with ELK + Kafka is the mainstream solution in the industry: Beats and Logstash collect and ship the logs, Kafka stores the logs for consumption, Elasticsearch performs data aggregation and analysis, and Grafana and Kibana provide graphical display.


Copyright notice
This article was written by [rife]. Please include a link to the original when reposting. Thanks.