FluentD vs. Logstash — which is better?
Oftentimes, comparing open source overlapping solutions can come down to preference. Do you want vanilla, or do you want chocolate? But is that the case for FluentD and Logstash?
In this blog, we compare FluentD vs. Logstash, including analysis on the comparative strengths and weaknesses of these two popular open source data collection solutions.
Good apps tell you a lot about what they are doing behind the scenes. Reading that data is one of the most important aspects of system monitoring. As the number of apps in our infrastructure increases, that chattiness can become too noisy.
You need to have a good system for collecting, organizing, visualizing, and indexing log data. Without it, you'll quickly suffer from information overload. And you'll be faced with a constant “needle-in-a-haystack” situation.
A full-stack log collection and analysis solution will have several layers, but in general, you need something to collect and analyze log data, something to store it, and something to present it.
FluentD vs. Logstash Comparison
FluentD and Logstash are both open source data collectors used for Kubernetes logging. Logstash is centralized while FluentD is decentralized. FluentD offers better performance than Logstash.
In fact, FluentD offers many benefits over Logstash. Keep reading to learn more.
The most popular open source solution available for this purpose is the ELK (ElasticSearch, Logstash, and Kibana) Stack. Composed of three separate technologies, ELK meets all three requirements: Logstash collects and processes log data, ElasticSearch stores and indexes it, and Kibana presents it.
Log collection and analysis require the most flexibility. This is often the most complicated part of a log collection implementation.
Systems often generate logs from many different sources and in varying formats. Modern microservices and container-based applications encourage the use of 12-Factor principles. These principles specify that logs should be treated as event streams. Many systems are better suited to pushing these details as message-like events into downstream collectors.
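To make the "logs as event streams" idea concrete, here is a minimal Python sketch of an app that writes each log entry to stdout as one structured JSON event and leaves collection to whatever sits downstream. The field names (`time`, `level`, `message`) and the `checkout` service are assumptions chosen for illustration, not a required schema.

```python
import json
import sys
from datetime import datetime, timezone


def format_event(level, message, **fields):
    """Build one log event as a single JSON line."""
    event = {
        "time": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **fields,  # any extra context the app wants to attach
    }
    return json.dumps(event)


def emit(level, message, **fields):
    """Write the event to stdout; the app never manages log files itself."""
    sys.stdout.write(format_event(level, message, **fields) + "\n")
    sys.stdout.flush()


# Hypothetical service and order id, for illustration only.
emit("info", "order created", service="checkout", order_id=1234)
```

Because every event is a self-describing line, a collector can pick it up from the container's output without a custom parser.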
Storage and presentation are important. But databases are for the most part abstracted from these systems. And many options exist for data presentation. For this reason, we will focus on the collection and analysis layer by comparing Logstash with a more recently emerging solution called FluentD.
Logstash continues to be a good option. And for systems that really just need to aggregate a bunch of logs from one place and multicast them to various endpoints, it works well.
This isn't the case when you start to add on more and more types of apps. As you start to generate more and more logging data, there are additional concerns that come up.
Logstash provides the Lumberjack protocol for active/passive failover of Logstash instances. Though reliable, this can cause performance bottlenecks during failover. The “Beats” protocol was introduced to deal with some of these issues, but its licensing may limit your ability to ship and deploy the solution.
Logstash does provide the ability to do some light routing of content. But doing so requires using a conditional syntax which can quickly become complicated.
Many good plugins exist for Logstash. But there are many times where you may need to write complicated grok patterns to parse log data for which an input, filter, or parser plugin cannot be found.
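A grok pattern is, at heart, a regular expression with named captures. The Python sketch below shows the kind of parsing such a pattern has to do when no ready-made plugin exists. The log layout (timestamp, level, service, message) and the sample line are assumptions made up for this example.

```python
import re

# Assumed layout: "<timestamp> <LEVEL> <service> <free-text message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) "
    r"(?P<level>[A-Z]+) "
    r"(?P<service>\S+) "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical log line, for illustration only.
sample = "2024-01-15T10:30:00Z ERROR payment-service Timeout contacting gateway"
print(parse_line(sample))
```

Every new log format needs another pattern like this one, which is where the maintenance burden comes from.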
FluentD takes a more advanced approach to the problem of log aggregation. It was originally created at Treasure Data and has since become a common choice for log collection in Kubernetes environments.
FluentD provides both active-active and active-passive deployment patterns for both availability and scale. FluentD can forward log and event data to any number of additional processing nodes. Those nodes will automatically failover, and semantics exist to ensure idempotency, where necessary. This same forwarding approach also allows for both even and weighted load-balancing, ideal for horizontal scale.
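The combination of weighted load-balancing and failover can be sketched in a few lines of Python. This is an illustration of the idea only, not FluentD's actual implementation: the `Node` type, the `build_schedule` helper, and the node names are all hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle, islice


@dataclass(frozen=True)
class Node:
    name: str
    weight: int = 1        # relative share of traffic
    standby: bool = False  # only used when no active node is healthy


def build_schedule(nodes, healthy):
    """Return node names in send order, repeated according to weight."""
    alive = [n for n in nodes if n.name in healthy]
    active = [n for n in alive if not n.standby]
    # Fail over to standby nodes only when every active node is down.
    pool = active or [n for n in alive if n.standby]
    return [n.name for n in pool for _ in range(n.weight)]


# Hypothetical aggregator nodes, for illustration only.
nodes = [
    Node("agg-1", weight=2),
    Node("agg-2", weight=1),
    Node("agg-backup", standby=True),
]

schedule = build_schedule(nodes, healthy={"agg-1", "agg-2", "agg-backup"})
print(list(islice(cycle(schedule), 6)))
```

With all nodes healthy, `agg-1` receives twice the traffic of `agg-2`; the standby node only enters the schedule once both active nodes drop out of the healthy set.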
Input events in FluentD are tagged, and output plugins have data routed to them based on their associated tag. The routing is inherent to the way that FluentD is configured, and conditional logic isn’t necessary.
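Tag-based routing can be modeled as an ordered list of patterns where the first match wins. The Python sketch below is a simplified model, not FluentD's code: it supports only `*` (exactly one tag part) and `**` (zero or more tag parts), and the tags and output names are hypothetical.

```python
def tag_matches(pattern, tag):
    """Simplified tag matching: '*' is one tag part, '**' is zero or more."""
    def match(p, t):
        if not p:
            return not t
        head, rest = p[0], p[1:]
        if head == "**":
            # Try letting '**' consume 0, 1, 2, ... tag parts.
            return any(match(rest, t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if head == "*" or head == t[0]:
            return match(rest, t[1:])
        return False

    return match(pattern.split("."), tag.split("."))


def route(tag, routes):
    """Return the output for the first pattern that matches, else None."""
    for pattern, output in routes:
        if tag_matches(pattern, tag):
            return output
    return None


# Hypothetical routing table: order matters, most specific first.
ROUTES = [
    ("app.web.*", "elasticsearch"),
    ("app.**", "s3"),
    ("**", "stdout"),
]

for tag in ("app.web.access", "app.db.slow.query", "system.kernel"):
    print(tag, "->", route(tag, ROUTES))
```

The routing table is pure data, so adding a destination means adding a row instead of nesting another conditional.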
FluentD hosts a very large and well-organized repository of plugins. They are effectively Ruby gems and are installed using typical “gem install” commands. These plugins cover a range of legacy and modern use cases and are often a bit more relevant than their Logstash counterparts.
One of the great things about free software is that we often have our choice of solutions and can curate the very best one for our use case. With FluentD, you get everything you love about Logstash and more.
Where Logstash sends all input data to all output endpoints unless you add conditionals, FluentD gives us the ability to route by tag. Where Logstash provides acceptable high availability through Lumberjack, FluentD gives us a more sophisticated solution. When it comes to plugins, FluentD simply has more of them.
If you'd like, you can keep ElasticSearch and Kibana, and replace your log collection functionality with FluentD.
FluentD and Logstash are both useful for collecting, preparing, storing, and presenting logging data. But using them — and getting the most out of them — isn't so easy. That's why it's important to enlist the help of the experts.
OpenLogic experts can help you maximize FluentD or Logstash. They can even help you migrate from Logstash to FluentD.
Talk to an expert today to learn more.
Ex-Chief Evangelist - OSS & API Management, Perforce Software
Justin has over 20 years of experience working in various software roles. He is an outspoken free software evangelist, delivering enterprise solutions, technical leadership, and community education on databases, architectures, and integration projects.