How to Set Up Centralized Logging with the ELK Stack: Collect and Visualize Logs from Multiple Servers


Why Centralized Logging Matters in Real DevOps Work

Imagine this: it’s 2 AM, your application is throwing errors, and you have 15 servers in production. You SSH into one server, grep through logs, find nothing useful, move to the next server, and repeat. By the time you find the root cause, your users have been suffering for an hour.

This is exactly the problem centralized logging solves. Instead of scattered log files across dozens of servers, every log line flows into a single searchable system. You can correlate events across services, set up alerts, build dashboards, and debug issues in minutes instead of hours.

The ELK Stack — Elasticsearch, Logstash, and Kibana — has been the industry-standard open-source solution for centralized logging for years. In this guide, we’ll set up a complete ELK pipeline that collects application logs from multiple servers using Filebeat (a lightweight log shipper), processes them through Logstash, stores them in Elasticsearch, and visualizes them in Kibana.

Architecture Overview

Before we touch any configuration, let’s understand how the pieces fit together:

  • Filebeat — A lightweight agent installed on each application server. It watches log files and ships new lines to Logstash. It uses very little memory and CPU, which is why we prefer it over installing Logstash directly on every server.
  • Logstash — The processing engine. It receives logs from Filebeat, parses and transforms them (e.g., extracting timestamps, log levels, structured fields), and sends the results to Elasticsearch.
  • Elasticsearch — The search and storage engine. It indexes your logs and makes them searchable in near real-time.
  • Kibana — The visualization layer. It provides a web UI where you can search logs, build dashboards, and explore your data.

The data flow looks like this:

[App Server 1: Filebeat] ──┐
[App Server 2: Filebeat] ──┼──▶ [Logstash] ──▶ [Elasticsearch] ──▶ [Kibana]
[App Server 3: Filebeat] ──┘

For this guide, we’ll use ELK version 8.x (the current major release as of 2024-2025). We’ll set up everything on Ubuntu 22.04 LTS, but the concepts apply to any Linux distribution with minor path adjustments.

Prerequisites

  • A dedicated server (or VM) for Elasticsearch + Logstash + Kibana with at least 4 GB RAM (8 GB recommended). Elasticsearch is memory-hungry.
  • One or more application servers that generate log files.
  • Java is not required as a separate install — Elasticsearch 8.x bundles its own JDK.
  • All servers must be able to reach each other over the network: Filebeat → Logstash on port 5044, and your browser → Kibana on port 5601.

Step 1: Install Elasticsearch

We’ll install from the official Elastic APT repository. Run these commands on your central ELK server:

# Import the Elastic GPG key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

# Add the Elastic 8.x repository
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

# Install Elasticsearch
sudo apt-get update && sudo apt-get install -y elasticsearch

Important: During installation, Elasticsearch 8.x will auto-generate a password for the built-in elastic superuser and enable security features by default. You’ll see this in the installation output. Save that password. It looks something like this:

--------------------------- Security autoconfiguration information ------------------------------
...
The generated password for the elastic built-in superuser is : aBcDeFgHiJkLmNoP
...

For our learning setup, we’ll simplify by disabling security so we can focus on the logging pipeline. In production, you should absolutely keep security enabled and configure TLS properly. Open the Elasticsearch configuration:

sudo nano /etc/elasticsearch/elasticsearch.yml

Find and modify these settings:

# Cluster name - pick something descriptive
cluster.name: logging-cluster

# Node name
node.name: elk-node-1

# Network binding - for a single-server setup
network.host: 0.0.0.0

# Discovery for single-node setup
discovery.type: single-node

# Disable security for learning (DO NOT do this in production)
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false

Now start Elasticsearch:

sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch

Verify it’s running:

curl -s http://localhost:9200 | python3 -m json.tool

You should see output like:

{
    "name": "elk-node-1",
    "cluster_name": "logging-cluster",
    "cluster_uuid": "...",
    "version": {
        "number": "8.x.x",
        "build_flavor": "default",
        ...
    },
    "tagline": "You Know, for Search"
}

Step 2: Install Kibana

Still on the same central server:

sudo apt-get install -y kibana

Edit the Kibana configuration:

sudo nano /etc/kibana/kibana.yml

Set the following:

# Make Kibana accessible from your browser (not just localhost)
server.host: "0.0.0.0"

# Port (default is 5601)
server.port: 5601

# Elasticsearch connection
elasticsearch.hosts: ["http://localhost:9200"]

Start Kibana:

sudo systemctl enable kibana
sudo systemctl start kibana

Wait about 30 seconds, then open http://your-server-ip:5601 in your browser. You should see the Kibana welcome page. If it doesn’t load immediately, give it a minute — Kibana can take a little while to initialize on first start.
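
If you're scripting your setup, you can poll Kibana until it answers instead of guessing at sleep times. Here's a minimal sketch in Python (the URL is a placeholder — substitute your server's address):

```python
import time
import urllib.error
import urllib.request

def wait_for_http(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until the server answers (any HTTP status) or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=5)
            return True
        except urllib.error.HTTPError:
            return True  # the server responded, even if with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False

# Example (substitute your server's address):
# wait_for_http("http://your-server-ip:5601")
```

Treating any HTTP response as success is deliberate: during startup Kibana may return 503 before it's fully ready, but a response at all means the process is up and listening.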

Step 3: Install and Configure Logstash

Still on the central server:

sudo apt-get install -y logstash

Logstash configuration uses a pipeline model with three sections: input (where data comes from), filter (how to parse/transform it), and output (where to send it). Let’s create a pipeline configuration file:

sudo nano /etc/logstash/conf.d/logstash.conf

Here’s a working pipeline that receives logs from Filebeat, parses common log formats, and sends everything to Elasticsearch:

input {
  beats {
    port => 5044
  }
}

filter {
  # If the log line looks like a syslog or general app log,
  # try to extract the timestamp and log level
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:log_timestamp}\s+%{LOGLEVEL:log_level}\s+%{GREEDYDATA:log_message}"
    }
    tag_on_failure => ["_grokparsefailure"]
  }

  # Parse the extracted timestamp into a proper date field
  if [log_timestamp] {
    date {
      match => [ "log_timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }

  # Add a field identifying which app/server the log came from
  # Filebeat already sends agent.hostname, but let's ensure we have it
  mutate {
    remove_field => [ "log_timestamp" ]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
  }

  # Uncomment this for debugging — prints processed events to the Logstash log
  # stdout { codec => rubydebug }
}

Let me break down what’s happening in the filter section:

  • grok — This is Logstash’s pattern-matching plugin. It uses predefined patterns (like TIMESTAMP_ISO8601, LOGLEVEL) to extract structured fields from unstructured log lines. For example, it turns 2024-06-15 10:23:45 ERROR Database connection failed into separate fields for the timestamp, level, and message.
  • date — Takes the extracted timestamp string and makes it the event’s official @timestamp. This ensures your logs are indexed by their actual time, not the time Logstash received them.
  • mutate — Cleans up by removing the intermediate log_timestamp field since we’ve already used it.
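
To make the grok behavior concrete, here's a rough Python equivalent of what the filter does to each line — not what Logstash actually runs internally, just a sketch of the same matching and timestamp promotion:

```python
import re
from datetime import datetime

# Approximation of %{TIMESTAMP_ISO8601} %{LOGLEVEL} %{GREEDYDATA} in Python re
LOG_PATTERN = re.compile(
    r"(?P<log_timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?)\s+"
    r"(?P<log_level>TRACE|DEBUG|INFO|WARN(?:ING)?|ERROR|FATAL)\s+"
    r"(?P<log_message>.*)"
)

def parse_line(line: str) -> dict:
    m = LOG_PATTERN.match(line)
    if not m:
        # Mirror tag_on_failure: keep the raw line and tag it unparsed
        return {"message": line, "tags": ["_grokparsefailure"]}
    fields = m.groupdict()
    # Mirror the date filter: promote the parsed timestamp to @timestamp,
    # then drop the intermediate field (the mutate step)
    ts = fields.pop("log_timestamp").replace(",", ".")
    fields["@timestamp"] = datetime.fromisoformat(ts).isoformat()
    return fields

print(parse_line("2024-06-15 10:23:45 ERROR Database connection failed"))
# {'log_level': 'ERROR', 'log_message': 'Database connection failed', '@timestamp': '2024-06-15T10:23:45'}
```

Lines that don't match keep their raw content under the _grokparsefailure tag, which is exactly why that tag is worth searching for in Kibana when fields come through empty.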

Start Logstash:

sudo systemctl enable logstash
sudo systemctl start logstash

You can check if Logstash started successfully:

sudo systemctl status logstash

Logstash takes 30-60 seconds to start — it’s the slowest component. Be patient and check the logs if something goes wrong:

sudo tail -f /var/log/logstash/logstash-plain.log
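
One detail from the output section worth understanding: index => "app-logs-%{+YYYY.MM.dd}" creates one index per day, with the name derived from each event's @timestamp in UTC. A rough Python equivalent of that naming rule:

```python
from datetime import datetime, timezone

def daily_index(ts: datetime, prefix: str = "app-logs") -> str:
    """Mirror Logstash's app-logs-%{+YYYY.MM.dd} naming (based on the UTC @timestamp)."""
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"

print(daily_index(datetime(2024, 6, 15, 10, 23, 45, tzinfo=timezone.utc)))
# app-logs-2024.06.15
```

Daily indices make retention simple: deleting logs older than 30 days means deleting 30-day-old indices, which is far cheaper for Elasticsearch than deleting individual documents.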

Step 4: Install and Configure Filebeat on Application Servers

Now switch to your application servers — the machines whose logs you want to collect. Run these commands on each app server:

# Same Elastic repository setup
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt-get update && sudo apt-get install -y filebeat

Edit the Filebeat configuration:

sudo nano /etc/filebeat/filebeat.yml

Here’s the key configuration. Replace the entire file contents with this, or carefully edit the relevant sections:

filebeat.inputs:
  - type: filestream
    id: app-logs
    enabled: true
    paths:
      - /var/log/myapp/*.log
      - /var/log/syslog
    # Add fields to identify this server
    fields:
      environment: production
      app_name: my-web-app
    fields_under_root: true

# Disable sending directly to Elasticsearch
output.elasticsearch:
  enabled: false

# Send to Logstash instead
output.logstash:
  hosts: ["your-elk-server-ip:5044"]

# Logging for Filebeat itself
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7

Key things to customize:

  • Change the paths to match your actual log file locations. You can use wildcards like /var/log/myapp/*.log.
  • Replace your-elk-server-ip with the actual IP address or hostname of your central ELK server.
  • The fields section adds custom metadata to every log event — this is extremely useful for filtering logs by application or environment in Kibana later.
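
The fields_under_root: true setting controls where that custom metadata lands in each event. A quick illustrative sketch of the difference (the event shape here is simplified, not Filebeat's full schema):

```python
# Simplified event as it might leave Filebeat (illustrative only)
base_event = {"message": "Health check passed", "host": {"name": "app-01"}}
custom = {"environment": "production", "app_name": "my-web-app"}

# fields_under_root: false (the default) — metadata nested under "fields"
nested = {**base_event, "fields": custom}

# fields_under_root: true — metadata merged into the top level
flat = {**base_event, **custom}

print(nested["fields"]["app_name"])  # my-web-app
print(flat["app_name"])              # my-web-app
```

With fields_under_root: true you can filter in Kibana on app_name directly instead of fields.app_name, at the small risk of a custom field colliding with a built-in one — so pick distinctive names.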

A note about filestream vs log input type: In Filebeat 8.x, the filestream input type is the recommended replacement for the older log input type. You’ll see many tutorials still using type: log, which still works but is deprecated. Use filestream for new setups.

Start Filebeat:

sudo systemctl enable filebeat
sudo systemctl start filebeat

Check that Filebeat is running and connecting to Logstash:

sudo systemctl status filebeat
sudo tail -f /var/log/filebeat/filebeat*

You should see log lines indicating a successful connection. If you see connection refused errors, verify that port 5044 is open on your ELK server and that Logstash is running.
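
A quick way to rule out network problems is a plain TCP connection test from the app server. This small sketch checks the three ports this guide uses (the hostname is a placeholder for your ELK server):

```python
import socket

# Ports used in this guide's setup
STACK_PORTS = {5044: "Logstash (Beats input)", 9200: "Elasticsearch", 5601: "Kibana"}

def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def report(host: str) -> None:
    for port, service in STACK_PORTS.items():
        state = "open" if check_port(host, port) else "closed/unreachable"
        print(f"{service} on {host}:{port} -> {state}")

# Example (substitute your ELK server's address):
# report("your-elk-server-ip")
```

Only port 5044 needs to be reachable from the app servers; 9200 and 5601 matter from wherever you administer the stack. If 5044 shows closed, check the firewall on the ELK server and confirm Logstash is actually listening (sudo ss -tlnp | grep 5044).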

Step 5: Generate Some Test Logs

Let’s create a simple Python script on one of your app servers to generate realistic log data. This helps us verify the entire pipeline works:

#!/usr/bin/env python3
"""generate_logs.py - Generate sample application logs for ELK testing"""

import logging
import random
import time
import os

# Create log directory if it doesn't exist
os.makedirs("/var/log/myapp", exist_ok=True)

# Configure logging
logging.basicConfig(
    filename="/var/log/myapp/application.log",
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)

logger = logging.getLogger("my-web-app")

# Sample message templates per level; {} is filled with a random number
messages = {
    "INFO": [
        "User login successful for user_id={}",
        "Request processed in {}ms",
        "Health check passed",
        "Cache hit for key=session_{}",
        "Order {} created successfully",
    ],
    "WARNING": [
        "Slow response: request took {}ms",
        "Retrying request, attempt {}",
        "Connection pool nearly exhausted: {} connections in use",
    ],
    "ERROR": [
        "Database connection failed after {} retries",
        "Request timed out for order {}",
        "Unhandled exception in worker {}",
    ],
}

# Weight the levels so most lines are INFO, with occasional warnings and errors
levels = ["INFO"] * 7 + ["WARNING"] * 2 + ["ERROR"]

while True:
    level = random.choice(levels)
    template = random.choice(messages[level])
    line = template.format(random.randint(1, 9999))
    getattr(logger, level.lower())(line)
    time.sleep(random.uniform(0.1, 1.0))

Run it with sudo python3 generate_logs.py (it needs write access to /var/log/myapp). Leave it running for a minute or two, then open Kibana, create a data view matching app-logs-*, and you should see the generated logs appear in Discover with log_level and app_name as filterable fields.
