Introduction to Logstash and Filebeat
Filebeat
We can define Filebeat as a data shipper that belongs to the Beats family. It can be installed on our servers according to our needs and requirements. Filebeat ships with modules to gather logs from a variety of inputs such as Nginx, the system, the Apache web server, or MySQL; these can be access logs, error logs, system logs, and so on. Depending on how we configure the output, Filebeat then sends the log entries to either Elasticsearch or Logstash.
Logstash
Simply put, we can define Logstash as a data parser. Logstash can take input from various sources such as Beats, files, Syslog, etc., and those logs can be of any kind: chat messages, log file entries, or anything else. After collecting the logs we can parse them and store them for later use.
After processing the data, Logstash ships it off to one or more destinations as per our needs, for example to a Kafka queue and Elasticsearch at the same time. If the data is stored in Elasticsearch, Kibana can be used to view and analyze it.
Configuring Logstash with Filebeat
In this article I have installed Filebeat (version 7.5.0) and Logstash (version 7.5.0) using the Debian packages. The version of the rest of the stack (Elasticsearch and Kibana) that I am using is also 7.5.0.
Filebeat setup
Step 1: Download the Filebeat Debian package from:
https://www.elastic.co/downloads/past-releases
Step 2: Install the package via command:
sudo dpkg -i filebeat-7.5.0-amd64.deb
Step 3: Provide ownership
After the installation you need to take ownership of the following files and directories; if you don't, you will get "permission denied" errors later while running Filebeat.
Run the commands below to provide ownership:
sudo chown -R $USER:$USER /etc/filebeat/
sudo chown -R $USER:$USER /usr/share/filebeat
sudo chmod +x /usr/share/filebeat/bin/
sudo chown -R $USER:$USER /var/log/filebeat/
sudo chown -R $USER:$USER /etc/default/filebeat
sudo chown -R $USER:$USER /var/lib/filebeat/
Step 4: Edit the filebeat.yml file at /etc/filebeat/filebeat.yml and make the following changes:
sudo nano /etc/filebeat/filebeat.yml
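A minimal sketch of the relevant filebeat.yml sections, assuming the single log file /home/sajita/log2 used here and a Logstash instance on localhost (replace with your Logstash IP):

filebeat.inputs:
- type: log
  # Without enabled: true this input is ignored
  enabled: true
  paths:
    - /home/sajita/log2

# Comment out every other output (e.g. output.elasticsearch) and keep only:
output.logstash:
  hosts: ["localhost:5044"]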
Here, in the above configuration, we list which log files we want to read. I have listed only one file, i.e. log2.
enabled: true
If this is not set to true, the input won't do any work.
paths:
/home/sajita/log2
As we are sending the logs to Logstash, we also need to do the following:
i. Comment out all other outputs and their hosts fields, leaving only the Logstash output.
ii. Provide the IP address of the Logstash host.
Step 5: Start Filebeat:
bin/filebeat -c /etc/filebeat/filebeat.yml -e -d "*"
Logstash setup
Before the setup, let's have a brief overview of the Logstash pipeline. A Logstash pipeline consists of three stages:
i. Input stage: this stage defines how Logstash receives the data. The input plugin can read from files, from the Beats family, or even from a Kafka queue.
ii. Filter stage: this stage defines how Logstash processes the events it receives from the input plugins. Here we can parse formats such as CSV, XML, or JSON.
iii. Output stage: this stage defines where the processed events are sent, for example to Elasticsearch, a Kafka queue, a file, etc.
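Put together, a minimal pipeline configuration mirrors these three stages (the port and hosts below are assumptions):

input {
  beats {
    port => 5044        # receive events from Filebeat
  }
}
filter {
  # parse and enrich events here
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}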
Setup Process:
Step 1: Download the Logstash Debian package from:
https://www.elastic.co/downloads/past-releases
Step 2: Install the package via command:
sudo dpkg -i logstash-7.5.0-amd64.deb
Step 3: Provide ownership
After the installation you need to take ownership of the following files and directories; if you don't, you will get "permission denied" errors later while running Logstash.
Run the commands below to provide ownership:
sudo chown -R $USER:$USER /etc/logstash/
sudo chown -R $USER:$USER /usr/share/logstash/
sudo chmod +x /usr/share/logstash/bin/
sudo chown -R $USER:$USER /var/log/logstash/
sudo chown -R $USER:$USER /etc/default/logstash
sudo chown -R $USER:$USER /var/lib/logstash/
Step 4: Check the pipelines.yml file inside /etc/logstash and make sure the configuration looks like this:
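On a Debian package install, the default pipelines.yml already points the main pipeline at the conf.d directory, which is what we need here:

- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"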
Step 5: Initialize a custom template inside /etc/logstash with the correct mappings.
Here, the log that I am processing is in JSON format, so I have configured the pipeline and the template accordingly.
My log format:
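As an illustration, assume each entry is a single JSON object per line; the field names timestamp, level, and message are assumptions carried through the sketches below:

{"timestamp": "2019-12-20 10:15:30", "level": "INFO", "message": "user logged in"}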
I have created a template named log_temp.json for the above log format. According to my log file, the content of the template looks like this:
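A sketch of such a legacy index template, assuming the field names above and a log-* index pattern (adjust the types and the date format to your own log):

{
  "index_patterns": ["log-*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "timestamp": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
      "level":     { "type": "keyword" },
      "message":   { "type": "text" }
    }
  }
}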
Logstash uses this template and creates an index in Elasticsearch according to the mapping provided.
Step 6: Configure a Logstash pipeline inside /etc/logstash/conf.d to ingest data from Filebeat.
sudo nano /etc/logstash/conf.d/test1.conf
The test1.conf pipeline consists of the following content:
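A sketch of what test1.conf could look like, consistent with the explanation below; the port, hosts, index name, and field names are assumptions, and the json filter stands in to produce the parsed_json field (a grok pattern could be used here instead):

input {
  beats {
    port => 5044
  }
}

filter {
  # Parse each JSON log line into the parsed_json field
  json {
    source => "message"
    target => "parsed_json"
  }
  # Map the log data into new top-level fields
  mutate {
    add_field => {
      "timestamp" => "%{[parsed_json][timestamp]}"
      "level"     => "%{[parsed_json][level]}"
      "msg"       => "%{[parsed_json][message]}"
    }
  }
  # The date format must match the one in the template (see the note below)
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
  # Output only the whitelisted fields
  prune {
    whitelist_names => ["@timestamp", "timestamp", "level", "msg"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "log-%{+YYYY.MM.dd}"
    template => "/etc/logstash/log_temp.json"
    template_name => "log_temp"
    template_overwrite => true
  }
  stdout { codec => rubydebug }
}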
In the above configuration:
input → tells Logstash to listen for Beats on port 5044.
grok → grok is a log-parsing tool; we parse our logs using grok patterns, and it understands many different formats.
elasticsearch {} → the destination for the filtered data (here we output to Elasticsearch).
stdout { codec => rubydebug } → writes each event to stdout so we can see that the pipeline is working.
document_type: hostname → tells us which host the data is coming from.
parsed_json field → this field contains only the log data.
mutate filter → this filter maps our log data into new fields.
prune with whitelist → with this filter, only the fields specified in the whitelist are sent to the output.
Note: we need to specify the date format in both the filter and the template; otherwise Logstash will throw an error saying that it could not recognize the date format, and it won't be able to insert data into the Elasticsearch index.
Step 7: Start Logstash:
bin/logstash -f /etc/logstash/conf.d/test1.conf --path.settings /etc/logstash/
Now our "stash" here is Elasticsearch. If we run the above configuration, Logstash will index the sample documents into Elasticsearch according to the mapping we provided. We can then retrieve the data through the Elasticsearch GET API.
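For example, assuming the log-* index name from the sketch above:

curl -X GET "localhost:9200/log-*/_search?pretty"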