Observing modern applications is challenging. Microservices allow for applications that are not only more distributed, but also composed of many different languages, frameworks, and backend services. DevOps teams have far greater flexibility in where and how they deploy applications, but when it comes time to collect logs, this flexibility can quickly become a hurdle. Fluentd is an open source tool that streams log data from multiple sources to multiple destinations. It is designed to be a unified logging layer, helping you centralize and route logs in a consistent way. With over 600 plugins, it supports a wide range of log types, log sources, log ingesters, and big data platforms. In this article, we'll look at the features Fluentd offers and how it compares to LogDNA.
Fluentd acts as a pipeline for log data to flow from log generators (such as a host or application) to log destinations (such as Elasticsearch). Sources send log messages to Fluentd, which forwards those messages to a destination based on a set of configurable rules. As part of this process, it automatically converts each message from its original format into standard JSON, and finally into a format that the destination can read.

Fluentd architecture. © 2018 Fluentd Project. All rights reserved.

Fluentd's routing engine redirects messages to one or more destinations based on their source, format, or metadata. It also supports filtering messages, adding custom fields, and basic data stream manipulation. As a Cloud Native Computing Foundation (CNCF) project, Fluentd integrates with Docker and Kubernetes as a deployable container or Kubernetes DaemonSet. It is often used in logging stacks as a replacement for Logstash, so much so that it has given rise to the EFK (Elasticsearch, Fluentd, Kibana) stack. While this isn't the only way to use Fluentd, it is quickly becoming one of the most popular.
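As a rough sketch of how this source-to-destination routing works, the following configuration tails an Nginx access log and forwards matching events to Elasticsearch. The file paths and hostname are placeholders, and the Elasticsearch output assumes the fluent-plugin-elasticsearch plugin is installed:

```
# Read an Nginx access log and tag each event "nginx.access".
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

# Route everything tagged nginx.** to Elasticsearch.
<match nginx.**>
  @type elasticsearch
  host elasticsearch.example.com
  port 9200
</match>
```

Adding a second `<match>` block with a different tag pattern is all it takes to route other sources to other destinations.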
The main strength of Fluentd is that it's both source and destination agnostic. As long as your service has a relevant Fluentd plugin (of which there are hundreds), you can immediately begin transferring logs to or from it. Since Fluentd converts all incoming logs into standard JSON, it can connect any supported log source to any supported log destination.

Fluentd also supports some stream manipulation capabilities, including log parsing, conversion, and data processing. You can define custom log formats, apply custom labels to individual messages for advanced filtering, and inject or remove fields. While it doesn't support complex processing, you can integrate it with stream processing software such as Norikra and Amazon Kinesis.

Fluentd is also lightweight, requiring only around 40 MB of RAM. There is an even lighter version called Fluent Bit that removes much of its functionality but requires only around 450 KB of RAM. Fluent Bit has only around 30 plugins compared to Fluentd's 600+, but it supports many common log types and destinations, including Elasticsearch, Splunk, InfluxDB, HTTP, and local files.
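Field injection, for example, is handled by built-in filter plugins. This illustrative snippet uses the record_transformer filter to add an environment field and the host's name to every record tagged app.** (the tag pattern and field values are placeholders):

```
# Enrich all app.** records with static and host-derived fields.
<filter app.**>
  @type record_transformer
  <record>
    environment production
    hostname "#{Socket.gethostname}"
  </record>
</filter>
```

The `"#{...}"` syntax embeds Ruby that Fluentd evaluates when it loads the configuration, which is how the hostname gets resolved.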
One of Fluentd's main challenges is performance. While much of Fluentd is written in C, its plugin framework is written in Ruby. This adds flexibility, but at the cost of speed; on standard hardware, each Fluentd instance can only process around 18,000 events per second. You can enable multi-process workers to increase throughput, but this may cause problems with plugins that don't support the feature.

As an open source product, Fluentd requires installation and setup before it can be used. There are community-created Deployments for quickly deploying a generic Fluentd instance to Docker or Kubernetes, but these must be configured, tested, and maintained for your specific requirements and infrastructure. Enterprise support is only available to customers of Treasure Data, the maintainers of Fluentd; all other support requests must go through public channels.

Lastly, Fluentd adds an intermediate layer between your log sources and log destinations. This can slow down your logging pipeline, causing backups if your sources generate events faster than Fluentd can parse, process, and forward them. While it provides some additional buffering, it will drop new events once the buffer is full.
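For reference, multi-process workers are enabled through the `<system>` directive in Fluentd v1, and plugins that aren't multi-worker-safe can be pinned to a single worker with a `<worker>` directive. The worker count and source definition below are illustrative:

```
# Run four worker processes instead of one.
<system>
  workers 4
</system>

# Pin a source that doesn't support multiple workers to worker 0.
<worker 0>
  <source>
    @type tail
    path /var/log/app.log
    pos_file /var/log/td-agent/app.log.pos
    tag app.log
    <parse>
      @type none
    </parse>
  </source>
</worker>
```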
Fluentd and LogDNA both handle log ingestion, aggregation, and routing between services. However, LogDNA provides a number of benefits.
There's more to deploying Fluentd than simply running a Kubernetes Deployment. You also need to reconfigure your applications, services, and platforms to target Fluentd as a log destination. Fluent Bit makes this process easier for Docker and Kubernetes in particular, but it too requires reconfiguration before it can collect host and cluster logs. You end up having to become an expert in scaling Fluentd and Elasticsearch as your log volumes grow.

With LogDNA, collecting host and cluster logs is as easy as running two commands. Deploying the LogDNA agent over Kubernetes lets you immediately begin streaming logs from all of your hosts and applications. LogDNA supports cloud platforms including AWS CloudWatch, Elastic Beanstalk, and Heroku. LogDNA can even ingest logs from Fluentd through the LogDNA Fluentd plugin.
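Those two commands follow this general shape: create a secret holding your ingestion key, then apply the agent's DaemonSet manifest. The key value and manifest filename below are placeholders; use the exact secret name and manifest from LogDNA's Kubernetes documentation:

```
kubectl create secret generic logdna-agent-key --from-literal=logdna-agent-key=<YOUR_INGESTION_KEY>
kubectl apply -f logdna-agent-ds.yaml
```

Because the agent runs as a DaemonSet, every node in the cluster automatically gets a copy, with no per-application reconfiguration.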
Fluentd is strictly a log shipper. While it supports some log management capabilities, most of this functionality is only available through community-developed plugins or by integrating it into a larger log management stack. LogDNA, by contrast, is a complete log management solution capable of indexing, filtering, searching, graphing, and alerting on log data.

Although Fluentd supports hundreds of log formats and sources through its plugins, LogDNA covers most common log formats out of the box, including JSON, Syslog, Nginx, Apache, and Logfmt. You can also define custom log formats using the Custom Log Parser, a fast and simple step-by-step tool for defining custom log templates.
A single Fluentd instance running on commodity hardware handles around 18,000 events per second. According to its performance tuning guide, CPU is the main bottleneck for high-traffic instances. To handle larger workloads, Fluentd can launch multiple worker processes to take advantage of multi-core CPUs. However, not all plugins support multiple workers, which can lead to plugin-dependent bottlenecks and more complex configurations.

LogDNA handles over 20 terabytes of data per day, which breaks down to hundreds of thousands of events per second per customer. Events are processed by LogDNA's ingestion servers rather than the host itself, limiting the amount of CPU used by the host agent. And since LogDNA scales to your needs, spikes and surges in your log volume won't result in lost data.
Fluentd is useful as a lightweight logging pipeline between a wide range of tools, services, and platforms. However, it is hampered by its performance, limited log analysis capabilities, and scaling challenges.

If you are currently using Fluentd, you can easily ship your logs to LogDNA using the LogDNA Fluentd plugin. This lets you combine Fluentd's extensive plugin library with LogDNA's aggregation, analysis, and monitoring tools. To get started, download the LogDNA Fluentd plugin and start shipping your Fluentd logs to LogDNA in minutes.
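Once the plugin gem is installed, forwarding everything to LogDNA is a single match block. This is a sketch only; the ingestion key is a placeholder, and parameter names should be checked against the plugin's README:

```
# Send all events to LogDNA via the fluent-plugin-logdna output.
<match **>
  @type logdna
  api_key <YOUR_INGESTION_KEY>
  app my_app
  hostname "#{Socket.gethostname}"
</match>
```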