April 14, 2020

FluentD vs. Logstash: How to Decide for Your Organization

Open Source

What's the difference between FluentD and Logstash? That's what we break down in this blog.

Good apps tell you a lot about what they are doing behind the scenes. Reading that data is one of the most important aspects of system monitoring. Inevitably, as the number of apps in our infrastructure increases, that chattiness can become too noisy. If we don’t have a good system for collecting, organizing, visualizing, and indexing our log data, we quickly suffer from information overload and find ourselves faced with a constant “needle-in-a-haystack” situation.

A full-stack log collection and analysis solution will have several layers, but in general, we will need something to:

  • Collect and prepare our logging data.
  • Store our logging data.
  • Present our logging data.

FluentD vs. Logstash Comparison

FluentD and Logstash are both open source data collectors used for Kubernetes logging. Logstash is centralized while FluentD is decentralized. FluentD offers better performance than Logstash. 

In fact, FluentD offers many benefits over Logstash. Keep reading to learn more. Or watch the on-demand webinar for an introduction.

ElasticSearch, Logstash, and Kibana

The most popular open source solution available for this purpose is the ELK (ElasticSearch, Logstash, and Kibana) Stack. Comprised of three separate technologies, ELK meets all three requirements:

  • Logstash collects and prepares logs.
  • ElasticSearch stores logs.
  • Kibana lets users search and visualize log data. 

Log Collection is Complex

Out of all three of these layers, log collection and analysis requires the most flexibility and is often the most complicated part of a logging implementation. Systems often generate logs from many different sources and in varying formats. Modern microservice and container-based applications encourage the use of 12-Factor principles, which specify that logs should be treated as event streams. Many systems are better suited to pushing these details as message-like events into downstream collectors.
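As a minimal sketch of what "logs as event streams" can look like in practice, the Python below writes one JSON object per line to stdout and leaves collection to whatever sits downstream. The function name `emit_event` and the field names are our own assumptions for illustration, not part of any particular library.

```python
import json
import sys
import time


def emit_event(level, message, stream=sys.stdout, **fields):
    """Write one log event as a single JSON line and return that line.

    The app does not open log files or manage rotation; it only emits
    events, and a collector (Logstash, FluentD, ...) picks them up.
    """
    event = {"time": time.time(), "level": level, "message": message}
    event.update(fields)  # extra key/value context supplied by the caller
    line = json.dumps(event)
    stream.write(line + "\n")
    stream.flush()  # keep the stream unbuffered so events arrive promptly
    return line


if __name__ == "__main__":
    emit_event("info", "user logged in", user_id=42)
```

Because each event is a self-describing line, the collector does not need a custom parser to extract fields such as `user_id`.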

Storage and presentation are important, but databases are for the most part abstracted from these systems, and many options exist for data presentation. For this reason, we will focus on the collection and analysis layer by comparing Logstash with a more recently emerging solution called FluentD. 

Downfalls of Logstash

Logstash continues to serve us well, and for systems that really just need to aggregate a bunch of logs from one place and multicast them to various endpoints, it does the trick.

However, as we add more types of apps and generate more and more logging data, additional concerns emerge.

Availability at Scale with Logstash

Logstash provides the Lumberjack protocol for active/passive failover of Logstash instances. Though reliable, this can cause performance bottlenecks during failover. The “Beats” protocol was introduced to deal with some of these issues, but the licensing may limit your ability to ship and deploy the solution.

Conditional Routing with Logstash

Logstash does provide the ability to do some light routing of content but doing so requires using a conditional syntax which can quickly become complicated. 

Interoperability Issues with Logstash

Though many good plugins exist for Logstash, there are many times where users find themselves writing complicated grok patterns to parse log data for which an input, filter, or parser plugin cannot be found.
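To give a sense of the parsing work involved, here is a rough Python sketch that extracts fields from a web server line in the Common Log Format. Grok patterns are built on regular expressions with named captures, so this is the plain-regex equivalent rather than Logstash syntax. The names `LINE_PATTERN` and `parse_line` are hypothetical, and the sample line is made up for illustration.

```python
import re

# One named group per field we want to pull out of the line.
LINE_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    # Servers write "-" when no body was sent; treat that as zero bytes.
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    return event


if __name__ == "__main__":
    sample = ('127.0.0.1 - - [14/Apr/2020:10:00:00 +0000] '
              '"GET /index.html HTTP/1.1" 200 512')
    print(parse_line(sample))
```

Every new log format needs another pattern like this one, which is why a ready-made parser plugin is usually preferable when one exists.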

FluentD, now widely used for log collection in Kubernetes environments, takes a more advanced approach to the problem of log aggregation.

Benefits of FluentD

Advanced Deployment with FluentD

FluentD provides both active-active and active-passive deployment patterns for both availability and scale. FluentD can forward log and event data to any number of additional processing nodes. Those nodes will automatically fail over, and semantics exist to ensure idempotency where necessary. This same forwarding approach also allows for both even and weighted load balancing, ideal for horizontal scale.
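The idea of weighted load balancing with failover can be sketched in a few lines of Python. This is an illustration only: it uses a smooth weighted round-robin scheme that we chose for clarity, it is not FluentD's actual selection algorithm, and the `Node` class and `pick_node` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    weight: int           # relative share of traffic this node should get
    healthy: bool = True  # flipped to False when the node stops responding
    current: int = 0      # running score used by the round-robin scheme


def pick_node(nodes):
    """Return the next healthy node, or None if every node is down."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        return None
    total = sum(n.weight for n in healthy)
    for n in healthy:
        n.current += n.weight
    # The node with the highest running score goes next, then pays back
    # the total so that heavier nodes are picked proportionally more often.
    chosen = max(healthy, key=lambda n: n.current)
    chosen.current -= total
    return chosen


if __name__ == "__main__":
    nodes = [Node("aggregator-a", 3), Node("aggregator-b", 1)]
    print([pick_node(nodes).name for _ in range(4)])
    nodes[0].healthy = False  # simulate a failed node
    print(pick_node(nodes).name)
```

Unhealthy nodes are simply skipped, so traffic shifts to the remaining nodes without any change on the sending side.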

Tagging and Dynamic Routing with FluentD

Input events in FluentD are tagged, and output plugins have data routed to them based on their associated tag. The routing is inherent to the way that FluentD is configured, and conditional logic isn’t necessary.
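To make tag-based routing concrete, here is a simplified Python sketch. It assumes dot-separated tags, a `*` wildcard that matches exactly one tag part, a `**` wildcard that matches zero or more parts, and first-match-wins ordering of rules. It is an illustration of the idea, not FluentD's implementation, and the output names in `RULES` are made up.

```python
def tag_matches(pattern, tag):
    """Check a dot-separated tag against a simplified match pattern."""
    return _match(pattern.split("."), tag.split("."))


def _match(pattern_parts, tag_parts):
    if not pattern_parts:
        return not tag_parts  # both exhausted means a full match
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # "**" may absorb zero or more tag parts.
        return any(_match(rest, tag_parts[i:])
                   for i in range(len(tag_parts) + 1))
    if not tag_parts:
        return False
    if head == "*" or head == tag_parts[0]:
        return _match(rest, tag_parts[1:])
    return False


# Hypothetical routing table: (pattern, output name), checked in order.
RULES = [
    ("app.web.*", "elasticsearch"),
    ("app.**", "s3_archive"),
    ("**", "stdout"),
]


def route(tag, rules=RULES):
    """Return the output for the first rule whose pattern matches the tag."""
    for pattern, output in rules:
        if tag_matches(pattern, tag):
            return output
    return None


if __name__ == "__main__":
    for tag in ("app.web.access", "app.db.slow.query", "system.kernel"):
        print(tag, "->", route(tag))
```

Adding a new destination means adding one more pattern to the table rather than another branch in a growing chain of conditionals.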

Larger Plugin Library with FluentD

FluentD hosts a very large, well-organized repository of plugins. They are effectively Ruby gems and are installed using typical “gem install” commands. These plugins cover a range of legacy and modern use cases and are often a bit more relevant than their Logstash counterparts.

Why Choose FluentD Over Logstash?

One of the great things about free software is that we often have our choice of solutions and can curate the very best one for our use case. With FluentD, we get everything we love about Logstash and more.

Where Logstash sends all input data to all output endpoints unless we add conditionals, FluentD gives us the ability to route by tag. Where Logstash provides acceptable, truly open high availability through Lumberjack, FluentD gives us a more sophisticated solution. When it comes to plugins, FluentD simply has more of them.

If you'd like, you can keep ElasticSearch and Kibana, and replace your log collection functionality with FluentD.

Looking for Advice on How to Get Started?

Connect with an enterprise architect from OpenLogic or try our support for free. Get real, consultative support from one of our Enterprise Architects through a free support ticket.