Fluentd
Getting Started with Fluentd: Unified Logging for Modern Applications
When you run applications across multiple servers, containers, and cloud services, logs quickly become… chaos. Different formats, different locations, and no easy way to search or analyze them.
Fluentd is a powerful open-source data collector that helps you solve exactly this problem by unifying log collection and routing from many sources to many destinations.
In this post, we’ll cover:
- What Fluentd is and why you’d use it
- Key concepts (inputs, filters, outputs, buffers)
- How to install Fluentd
- A simple architecture diagram and explanation
What is Fluentd?
Fluentd is an open-source log and data collector. It sits between your applications and your log storage/analytics systems and acts as a unified logging layer.
Think of Fluentd as:
“A smart pipe that collects data from many places, transforms it, and sends it where you want.”
Why use Fluentd?
Some reasons Fluentd is popular:
- Unified logging: Collect logs from apps, containers (like Docker/Kubernetes), system logs, Nginx, Apache, etc.
- Flexible routing: Send data to Elasticsearch, OpenSearch, Loki, Kafka, S3, CloudWatch, BigQuery, Datadog, Splunk, and many more.
- Plugin-based: A small core plus hundreds of plugins for inputs, filters, and outputs.
- JSON by default: Events are structured as JSON, which makes logs easy to parse, enrich, and query.
- Reliable: Supports buffering, retries, and backpressure so you don’t lose logs when a destination is down.
Fluentd Core Concepts
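Fluentd’s configuration is typically done in a file like /etc/fluent/fluent.conf. It’s built around a simple pipeline idea: Input → Filter → Output, with buffering handled inside the output stage. Every event carries a tag, and filters and outputs select the events they handle by matching on that tag.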
1. Inputs
Inputs define where Fluentd reads data from. Examples:
- Tail a log file
- Listen on a TCP/UDP port
- Read from syslog
- Receive logs from another Fluentd or Fluent Bit instance
Example (tail input):
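A minimal sketch of a tail input, assuming the application writes one JSON object per line. The log path, position-file path, and tag are hypothetical placeholders, not values from the original post.

```conf
# Hypothetical example: follow an application log file line by line.
<source>
  @type tail
  # Assumed location of the application log
  path /var/log/myapp/app.log
  # Fluentd records how far it has read here, so a restart resumes
  # from the last position instead of re-reading the file
  pos_file /var/log/td-agent/myapp.log.pos
  # Tag attached to every event; filters and outputs match on it
  tag myapp.access
  <parse>
    # Assumes each line is a JSON object
    @type json
  </parse>
</source>
```

If the log is plain text rather than JSON, the parse section would need a different parser type, such as regexp or none.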
2. Filters
Filters modify or enrich logs as they pass through. You can:
- Add Kubernetes metadata
- Mask sensitive fields
- Rename or remove fields
- Change log format
Example (adding a field):
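One way this could look, using the built-in record_transformer filter. The tag pattern myapp.** and the field values are assumptions chosen for illustration.

```conf
# Hypothetical example: add fields to every event tagged myapp.*
<filter myapp.**>
  @type record_transformer
  <record>
    # Static field added to each record (assumed value)
    environment production
    # Embedded Ruby, evaluated once when the configuration is loaded
    hostname "#{Socket.gethostname}"
  </record>
</filter>
```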
3. Outputs
Outputs define where logs go: Elasticsearch, Loki, S3, Kafka, stdout, etc.
Example (send to Elasticsearch):
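A sketch of an Elasticsearch output, assuming the fluent-plugin-elasticsearch plugin is installed and a node is reachable on localhost:9200. The host, port, tag pattern, and index prefix are placeholders.

```conf
# Hypothetical example: ship events tagged myapp.* to Elasticsearch.
# Requires the fluent-plugin-elasticsearch plugin.
<match myapp.**>
  @type elasticsearch
  # Assumed address of the Elasticsearch node
  host localhost
  port 9200
  # Write to date-based indices (prefix-YYYY.MM.DD)
  logstash_format true
  # Assumed index prefix
  logstash_prefix myapp
</match>
```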
4. Buffering & Reliability
Fluentd uses buffers to handle spikes and destination downtime.
- Memory buffer: Fast but limited; good for small/low-risk workloads
- File buffer: Stores data on disk; better for reliability
Example (file buffer inside an output):
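A sketch of a file buffer nested inside an output. The buffer path, size limits, and retry settings are illustrative assumptions and should be tuned for the actual workload.

```conf
# Hypothetical example: Elasticsearch output with an on-disk buffer.
<match myapp.**>
  @type elasticsearch
  host localhost
  port 9200
  <buffer>
    # Persist chunks on disk so they survive a restart
    @type file
    # Assumed buffer directory; must be writable by Fluentd
    path /var/log/td-agent/buffer/myapp
    # Flush buffered chunks to the destination every 5 seconds
    flush_interval 5s
    # Maximum size of a single chunk
    chunk_limit_size 8m
    # Cap on total buffered data for this output
    total_limit_size 1g
    # Retry failed flushes with growing delays, up to 30s apart
    retry_type exponential_backoff
    retry_max_interval 30s
    # When the buffer is full, block the input instead of raising an error
    overflow_action block
  </buffer>
</match>
```

Setting overflow_action to block is one way to get backpressure; the default behaviour is to raise an exception, and drop_oldest_chunk is the other available choice.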
Installing Fluentd
There are several ways to install Fluentd depending on your environment.
Option 1: Install Fluentd (td-agent) on Linux (Ubuntu/Debian)
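The easiest way is usually via the Fluentd/td-agent package. (Newer releases are distributed as fluent-package, the successor to td-agent; the steps below follow the td-agent 4 naming.)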
Step 1: Install via package (example: Ubuntu)
For modern Ubuntu/Debian, you typically:
- Add the Fluentd repository
- Install td-agent (the production-ready Fluentd package)
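Example (generic pattern; adjust for your OS version as needed): on Ubuntu 20.04 (focal), the td-agent 4 install script can be fetched and run with curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh.
Replace focal and td-agent4 in the script name with the values for your distribution if needed, based on the Fluentd docs for your OS version.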
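Step 2: Manage the Fluentd (td-agent) service
On systemd-based systems, use sudo systemctl start td-agent to start the service, sudo systemctl status td-agent to check that it is running, and sudo systemctl enable td-agent to start it at boot.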
Step 3: Configuration file
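The main configuration for td-agent is usually at /etc/td-agent/td-agent.conf.
You edit this file to add <source>, <filter>, and <match> sections as needed, then restart the service (for example, sudo systemctl restart td-agent) so the changes take effect.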
Option 2: Run Fluentd with Docker
If you prefer containers:
Step 1: Pull the Fluentd image
Use docker pull fluent/fluentd:<tag>, choosing a recent stable tag from Docker Hub.
Step 2: Create a Fluentd configuration file
Create a local fluent.conf:
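A minimal sketch of such a file, using only the built-in forward input and stdout output. The catch-all ** pattern is chosen here for testing convenience.

```conf
# Accept events over the Fluentd forward protocol
<source>
  @type forward
  port 24224
  # Listen on all interfaces so clients outside the container can connect
  bind 0.0.0.0
</source>

# Match every tag and print each event to standard output
<match **>
  @type stdout
</match>
```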
This configuration:
- Listens on port 24224 for incoming logs (Fluentd forward protocol)
- Prints everything to stdout (useful for testing)
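Step 3: Run the container
Start the container with port 24224 published and your fluent.conf mounted over the image’s default configuration at /fluentd/etc/fluent.conf, for example: docker run -p 24224:24224 -v $(pwd)/fluent.conf:/fluentd/etc/fluent.conf fluent/fluentd:<tag>.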
Now any client that speaks Fluentd’s forward protocol (like Fluent Bit) can send logs to this instance.
Option 3: Kubernetes (High Level)
In Kubernetes, Fluentd is usually deployed as a DaemonSet so that each node runs a Fluentd pod that:
- Mounts /var/log/containers or /var/log/pods
- Parses container logs
- Sends them to a central backend (e.g., Elasticsearch, Loki, or OpenSearch)
Most logging stacks (EFK, Elastic, Loki, etc.) provide ready-made Helm charts or YAML manifests you can install and customize.
Fluentd Architecture Diagram (Text Explanation)
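Here’s a simple conceptual architecture you can visualize or draw in a diagram tool (Draw.io, Mermaid, Lucidchart, etc.):
Log sources (application log files, containers, syslog, other Fluentd or Fluent Bit instances) → Fluentd (input → filter → buffered output) → destinations (Elasticsearch, Loki, S3, Kafka, and so on).
Each event enters through an input, passes through any filters whose tag pattern matches, and is buffered by the matching output until it has been delivered. The same flow translates directly into a Mermaid diagram for Markdown-based blogs, which renders in tools such as GitLab, GitHub, and Obsidian.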
Minimal End-to-End Example
Here’s a tiny example of a full td-agent.conf snippet that:
- Tails an app log
- Adds an environment field
- Sends data to Elasticsearch
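The three steps above can be sketched as a single configuration. All paths, the myapp.log tag, the environment value, and the Elasticsearch address are hypothetical, and the output assumes the fluent-plugin-elasticsearch plugin is installed.

```conf
# 1. Input: tail the application log (assumed path, JSON lines)
<source>
  @type tail
  path /var/log/myapp/app.log
  pos_file /var/log/td-agent/myapp.log.pos
  tag myapp.log
  <parse>
    @type json
  </parse>
</source>

# 2. Filter: add an environment field to each record
<filter myapp.log>
  @type record_transformer
  <record>
    environment production
  </record>
</filter>

# 3. Output: send to Elasticsearch through an on-disk buffer
<match myapp.log>
  @type elasticsearch
  host localhost
  port 9200
  logstash_format true
  <buffer>
    @type file
    path /var/log/td-agent/buffer/myapp
    flush_interval 5s
  </buffer>
</match>
```

Order matters here: Fluentd evaluates filter and match sections top to bottom, so the filter has to appear before the match that consumes the same tag.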
Wrap-Up
Fluentd gives you:
- A unified way to collect and route logs
- Flexibility via a huge plugin ecosystem
- Reliability with buffering and retries
From a single node setup to a large-scale Kubernetes cluster, Fluentd can grow with your infrastructure.