
The difference between ELK and EFK

2021-01-03 09:36:48 osc_ bvzab11e


Preface

The mainstream ELK stack (Elasticsearch, Logstash, Kibana) has largely given way to EFK (Elasticsearch, Filebeat or Fluentd, Kibana). For container-cloud logging in particular, the industry generally recommends Fluentd. This article looks at what changes when moving from ELK to EFK; Grafana Loki is also worth learning about.

ELK and EFK summary

As software systems grow more complex, and especially once they are deployed to the cloud, logging in to each node to inspect each module's logs is no longer feasible. It is not just inefficient; for security reasons, engineers often cannot access the physical nodes directly at all. Moreover, large-scale systems are now deployed as clusters, which means each service runs as many identical Pods. Every container produces its own logs, and from the log text alone you cannot tell which Pod produced it, which makes inspecting distributed logs even harder.

So in the cloud era we need a solution for collecting and analyzing logs. First, scattered logs must be gathered into one central place for convenient viewing. Once collected, they can also feed all kinds of statistical analysis, even popular big-data or machine-learning pipelines. Traditional deployments need such a solution too, but this article approaches the topic from the cloud perspective.

ELK is that solution, and it is essentially the de facto standard. ELK is an acronym for three open-source projects:

  • E:Elasticsearch

  • L:Logstash

  • K:Kibana

Logstash's main job is to collect logs and process them. Elasticsearch is the central place where logs are stored; more importantly, it is a full-text search and analytics engine that lets users view and analyze massive amounts of data in near real time. Kibana is the GUI front end built for Elasticsearch, so that users can query the data stored in Elasticsearch graphically; it also provides various analysis modules, such as the ability to build dashboards.

Personally, I think the L in ELK is better understood as "Logging Agent". Elasticsearch and Kibana are essentially the standard solution for storing, searching, and analyzing logs, whereas Logstash is not the only option for collecting them: Fluentd and Filebeat can do that as well. Hence abbreviations like ELK and EFK.

The general architecture is shown in the figure below. A small cluster typically has three nodes, which may run dozens or even hundreds of containers. We only need to start one logging-agent instance on each node (in Kubernetes, this is the DaemonSet concept).
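As a sketch of the per-node logging-agent / DaemonSet idea, a minimal manifest might look like the following. The names, namespace, image choice, and mounted paths here are illustrative assumptions, not part of the original article:

```yaml
# Hypothetical example: run exactly one logging agent per node via a DaemonSet.
# Image, labels, and paths are assumptions for illustration only.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: logging-agent
  template:
    metadata:
      labels:
        app: logging-agent
    spec:
      containers:
      - name: agent
        image: fluent/fluentd:latest
        volumeMounts:
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
```

Because a DaemonSet schedules one Pod per node, the agent can read every container's log files from the node's local disk.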

Differences and connections among Filebeat, Logstash, and Fluentd

Filebeat, Logstash, and Fluentd deserve a brief explanation of how they relate and differ. Filebeat is a lightweight solution for collecting local log data; the official description of Filebeat is quoted below. As it shows, Filebeat's role is deliberately simple: it only collects local logs and does little processing with them, so Filebeat usually ships the collected logs to Logstash for further handling.

Filebeat is a log data shipper for local files. Installed as an agent on your servers, Filebeat monitors the log directories or specific log files, tails the files, and forwards them either to Elasticsearch or Logstash for indexing

Logstash and Fluentd can both collect and process logs, and there are many comparisons of the two online; a well-written one is linked below. Functionally they are roughly equivalent, but Logstash consumes more memory. Elastic's answer to that is to use Filebeat to collect logs on each leaf node; Fluentd has a corresponding lightweight sibling, Fluent Bit.

https://logz.io/blog/fluentd-Logstash/

Another important difference is that Fluentd offers a better abstraction, shielding users from tedious low-level details. In the author's own words:

Fluentd’s approach is more declarative whereas Logstash’s method is procedural. For programmers trained in procedural programming, Logstash’s configuration can be easier to get started. On the other hand, Fluentd’s tag-based routing allows complex routing to be expressed cleanly.

Although the author claims to compare the two (Logstash and Fluentd) neutrally, the bias is fairly obvious :). This article also uses Fluentd in its examples, but the general idea applies either way.
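To make the tag-based routing from the quote concrete, here is a minimal Fluentd configuration sketch. The port, tags, and paths are illustrative assumptions; the point is that records are routed purely by matching their tag against `<match>` patterns:

```
# Minimal fluent.conf sketch (illustrative only): tag-based routing.
<source>
  @type forward        # accept records over the forward protocol
  port 24224
</source>

# Records tagged backend.** get their own file output.
<match backend.**>
  @type file
  path /fluentd/log/backend
</match>

# Everything else falls through to stdout.
<match **>
  @type stdout
</match>
```

Compared with procedural if/else conditionals, each route is a self-contained block keyed by tag, which is what the quote means by routing being "expressed cleanly".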

As an aside, Filebeat, Logstash, Elasticsearch, and Kibana are open-source projects from the same company, Elastic; the official documentation is here:

https://www.elastic.co/guide/index.html

Fluentd is an open-source project from a different company; its official documentation is here:

https://docs.fluentd.org

About ELK

A brief introduction to ELK

ELK is a complete log collection and visualization solution from Elastic, named after its three products: Elasticsearch, Logstash, and Kibana.

  • Elasticsearch is a near-real-time full-text search and analytics engine providing the three functions of collecting, analyzing, and storing data

  • Logstash is a tool for collecting, analyzing, and filtering logs

  • Kibana is a web-based graphical interface used to search, analyze, and visualize the log data stored in Elasticsearch

The ELK log processing flow

The figure above shows a typical ELK log collection flow in a Docker environment:

  • Logstash extracts log information from each Docker container

  • Logstash forwards the logs to Elasticsearch for indexing and storage

  • Kibana analyzes and visualizes the log information

However, Logstash is not well suited to pure data collection: as an agent, its performance falls short. For this reason, Elastic released the Beats series of lightweight collectors.

The Beats component we will use here is Filebeat. Filebeat is built on top of Beats and targets log collection scenarios; it is the next-generation collector that replaced Logstash Forwarder, designed to collect logs faster and more stably with a lighter, lower-consumption footprint. It can conveniently feed Logstash, or connect directly to Elasticsearch.

In this experiment we use Filebeat directly as the agent. It will pick up changes to the json-file log records introduced in the first article, 《Docker logs & logging driver》, and send the logs straight to Elasticsearch for indexing and storage. The flow is shown in the figure below; you could also regard this as EFK.
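As context for what Filebeat will pick up: the json-file driver stores each container log line as a one-line JSON object under /var/lib/docker/containers. A small sketch (the record contents here are invented for illustration) showing how to extract the log field:

```shell
# A sample json-file record (contents invented for illustration).
line='{"log":"This is a log message from container A\n","stream":"stdout","time":"2021-01-03T09:36:48.0Z"}'

# Extract the "log" field; python3 is used here since jq may not be installed.
echo "$line" | python3 -c 'import sys, json; print(json.load(sys.stdin)["log"], end="")'
```

Filebeat tails these files as plain text; parsing the JSON wrapper, as above, is what lets downstream components recover the original message, stream, and timestamp.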

Installing the ELK suite

In this experiment we use Docker to deploy a minimal ELK environment; in a real environment, of course, we would also need to consider high availability and load balancing.

First pull the sebp/elk all-in-one image, choosing the latest tag:

docker pull sebp/elk:latest

Note: the image bundles the entire ELK stack, so the pull takes a while; be patient.

Start ELK from the sebp/elk image with the following command:

docker run -it -d --name elk \
    -p 5601:5601 \
    -p 9200:9200 \
    -p 5044:5044 \
    sebp/elk:latest

Once it is running, visit http://192.168.4.31:5601 to check the Kibana UI:

Of course, there are no ES indices or data to show yet. Also visit http://192.168.4.31:9200 to check whether the Elasticsearch API is reachable:

Note:

If errors during startup prevent the ELK container from running, refer to 《Common ElasticSearch startup errors [1]》. If your host has less than 4 GB of memory, it is advisable to cap ES's memory usage, or the container may fail to start. For example, the settings added below limit ES to at most 1 GB of memory:

docker run -it -d --name elk \
    -p 5601:5601 \
    -p 9200:9200 \
    -p 5044:5044 \
    -e ES_MIN_MEM=512m \
    -e ES_MAX_MEM=1024m \
    sebp/elk:latest

If the container fails to start with the message max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144], fix it as follows:

# Edit sysctl.conf
vi /etc/sysctl.conf
# Add the following line
vm.max_map_count=655360
# Then apply the change
sysctl -p

Filebeat configuration

Installing Filebeat

Here we download Filebeat as an rpm. Note that the version must match our ELK deployment (ELK is 7.6.1, so we also download 7.6.1, to avoid errors):

wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.1-x86_64.rpm
rpm -ivh filebeat-7.6.1-x86_64.rpm

Configuring Filebeat

We need to tell Filebeat which log files to monitor and where to send the logs, so we modify Filebeat's configuration:

nano /etc/filebeat/filebeat.yml

Two things need changing:

1. Which logs to monitor?

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/lib/docker/containers/*/*.log

Here paths is set to /var/lib/docker/containers/*/*.log; also note that enabled must be set to true.

2. Where to send the logs?

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["192.168.4.31:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

This sends the logs directly to Elasticsearch; fill in the address of your ES endpoint.

Note: to send to Logstash instead, uncomment and fill in the following section:

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

Starting Filebeat

Since the rpm registers Filebeat as a systemd service at install time, we can simply start it:

systemctl start filebeat

Enable it at boot:

systemctl enable filebeat

Check Filebeat's status:

systemctl status filebeat

The steps above, collected into a script:

wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.6.1-x86_64.rpm
rpm -ivh filebeat-7.6.1-x86_64.rpm
echo "please input elk host_ip"
read host_ip

sed -i "s/  enabled: false/  enabled: true/g" /etc/filebeat/filebeat.yml
sed -i "s/\/var\/log\/\*.log/\/var\/lib\/docker\/containers\/\*\/\*.log/g" /etc/filebeat/filebeat.yml
sed -i "s/localhost:9200/${host_ip}:9200/g" /etc/filebeat/filebeat.yml

systemctl start filebeat
systemctl enable filebeat
systemctl status filebeat

Configuring Kibana

Next we tell Kibana which Elasticsearch indices to query and analyze, by configuring an Index Pattern. We know Filebeat writes indices named in the filebeat-<timestamp> format, so we define the Index Pattern as filebeat-*.

Click Next Step, and choose @timestamp as the Time Filter field name:

Click the Create index pattern button to complete the configuration.

Now click the Discover menu on Kibana's left side to see the containers' log entries:

Looking at a log entry's details, the field we care most about is message:

Since message is what matters, we can also filter on this field:

This is only a simple demonstration of importing log information into ELK. In practice ELK offers much more: analysis and aggregation, polished dashboards, and so on. The author has only scratched the surface here.

Introducing Fluentd

About Fluentd

So far we have used Filebeat to collect Docker log information, relying on Docker's default json-file logging driver. Here we instead use the open-source Fluentd project in place of json-file to collect container logs.

Fluentd is an open-source data collector designed for processing data streams, using JSON as its data format. It has a plugin architecture, is highly scalable and highly available, and provides highly reliable message forwarding. Fluentd is a Cloud Native Computing Foundation (CNCF) project under the Apache 2 license; its GitHub address is https://github.com/fluent/fluentd/. Compared with Logstash, Fluentd uses less memory and has a more active community; see 《Fluentd vs Logstash [2]》 for a comparison of the two.

The figure below shows the full log collection and processing flow: we use Filebeat to forward the logs collected by Fluentd on to Elasticsearch.

Of course, we can also use Fluentd's Elasticsearch plugin (fluent-plugin-elasticsearch) to send logs straight to Elasticsearch, replacing Filebeat as needed and forming the Fluentd => Elasticsearch => Kibana architecture, also known as EFK.
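A hedged sketch of what that direct Fluentd-to-Elasticsearch output could look like with fluent-plugin-elasticsearch (the gem must be installed in the image; the host and port mirror the ELK container used in this article and are otherwise assumptions):

```
# Sketch of a match section using fluent-plugin-elasticsearch.
<match **>
  @type elasticsearch
  host 192.168.4.31
  port 9200
  logstash_format true  # write time-based indices that Kibana can pattern-match
</match>
```

With logstash_format enabled, records land in date-suffixed indices, so a wildcard Index Pattern in Kibana picks them up the same way filebeat-* does.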

Running Fluentd

Here we run a Fluentd collector in a container:

docker run -it -d --name fluentd \
    -p 24224:24224 \
    -p 24224:24224/udp \
    -v /etc/fluentd/log:/fluentd/log \
    fluent/fluentd:latest

By default Fluentd listens on port 24224, and its logs are written under the path we mapped.
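One caveat worth hedging: the stock fluent/fluentd image's default configuration prints incoming records to stdout rather than writing files. For the flow in this article, where Filebeat tails *.log files under the mapped directory, the container would need a configuration along these lines (a sketch; the path prefix is an assumption):

```
# Sketch: accept forward input and write records as files under /fluentd/log.
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match **>
  @type file
  path /fluentd/log/docker   # produces log files Filebeat can tail from the host
  append true
</match>
```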

In addition, we need to modify Filebeat's configuration again, adding /etc/fluentd/log to the monitored directories:

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /etc/fluentd/log/*.log

After adding the monitoring configuration, restart Filebeat once:

systemctl restart filebeat

Running test containers

To verify the effect, we run two containers, setting each one's log-driver to fluentd:

docker run -d \
    --log-driver=fluentd \
    --log-opt fluentd-address=localhost:24224 \
    --log-opt tag="test-docker-A" \
    busybox sh -c 'while true; do echo "This is a log message from container A"; sleep 10; done;'

docker run -d \
    --log-driver=fluentd \
    --log-opt fluentd-address=localhost:24224 \
    --log-opt tag="test-docker-B" \
    busybox sh -c 'while true; do echo "This is a log message from container B"; sleep 10; done;'

By specifying the log-driver and setting a tag for each container, we make it easy to look up their logs later.

Verifying the EFK result

Back in Kibana, we can view the log information and use the tag we just set to filter for a single container's logs:

Log-generation and stress-testing tools:

  1. https://github.com/elastic/rally

  2. https://pypi.org/project/log-generator/

  3. https://github.com/mingrammer/flog

Summary

This article started from ELK's basic components, introduced the basic ELK processing flow, built an ELK environment from scratch, and demonstrated collecting container log information with Filebeat. Then, by introducing the open-source data collector Fluentd, it demonstrated an EFK-based log collection setup. Of course, there is far more to ELK/EFK than this; the author has only made a preliminary pass and hopes to share more practical notes in the future.

Related links :

  1. https://www.cnblogs.com/zhi-leaf/p/8484337.html

  2. https://logz.io/blog/fluentd-Logstash/

Link to the original text :https://wsgzao.github.io/post/efk/

