This blog post will explain the basics of Grafana Loki, a log aggregation system designed to store and query logs from applications and infrastructure. We will describe Loki, explain its architecture and uses, and discuss why MinIO object storage is a great place to save Loki data. We will provide a tutorial on how to use MinIO to store Loki data in a follow-up blog post.
Logs provide basic information about devices and systems. They are put to myriad uses within the enterprise. Monitoring is a critical use case that involves watching logs for errors and sending alerts when significant errors occur or when thresholds are crossed for error rates. Logs are a valuable source of debugging and troubleshooting information, and are essential when tracking application health and crashes. Logs can be used to identify malicious activity and conduct forensic investigations. Logs can also be used for business intelligence to provide insight and help create strategies for business growth.
Logs are always growing, and indexing, searching and storing them using legacy enterprise storage can become resource-intensive. To alleviate this burden, log search tools store data on S3-compatible object storage, such as MinIO. This is a simpler architecture where applications built on log data benefit from MinIO’s high throughput at scale, as well as versioning, immutability and durability. Our recent benchmark achieved PUT throughput of 1.32 Tbps and GET throughput of 2.6 Tbps on 32 nodes using NVMe drives.
Let’s dig into Loki.
Introduction to Grafana Loki
Grafana Loki is a distributed multi-tenant log aggregation system modeled after Prometheus. Announced at KubeCon Seattle 2018 and released under the AGPLv3 license, Loki features prominently in the cloud-native observability stack and is frequently combined with Grafana and Prometheus to view and alert on metrics, logs and traces within a single UI.
It’s helpful to think of the Grafana-Loki-Promtail stack as roughly equivalent to the ELK stack: the Promtail agent tails logs and ships them to the Loki datastore, and Grafana visualizes them. Promtail is a log collector that uses the same service discovery as Prometheus and includes analogous features for labeling, transforming and filtering logs before sending them to Loki. Loki does not index the full text of logs; instead it indexes metadata: entries are grouped into streams and indexed with labels. Typically, admins use Grafana and Loki’s query language, LogQL, to explore and visualize logs. They also configure alerting rules and send alerts to Alertmanager for routing.
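The metadata-only indexing model described above can be sketched in a few lines of Python. This is purely a conceptual illustration of how streams and label selectors relate, not Loki’s actual implementation; the class and method names are invented for this example.

```python
# Conceptual sketch of Loki's indexing model: log lines are grouped into
# streams identified by a set of labels, and only the labels are indexed.
# The log text itself is stored as opaque entries, never indexed.
# This is an illustration, not Loki's actual implementation.

from collections import defaultdict

class TinyStreamStore:
    def __init__(self):
        # Maps a frozen label set (the stream identity) to its entries.
        self.streams = defaultdict(list)

    def push(self, labels: dict, timestamp: int, line: str):
        # Labels such as {"app": "api", "env": "prod"} identify the stream.
        self.streams[frozenset(labels.items())].append((timestamp, line))

    def query(self, selector: dict):
        # Analogous to a LogQL selector like {app="api", env="prod"}:
        # match streams whose labels are a superset of the selector,
        # then scan only those streams' entries.
        wanted = frozenset(selector.items())
        for labels, entries in self.streams.items():
            if wanted <= labels:
                yield from entries

store = TinyStreamStore()
store.push({"app": "api", "env": "prod"}, 1, "GET /health 200")
store.push({"app": "api", "env": "dev"}, 2, "GET /health 500")
store.push({"app": "web", "env": "prod"}, 3, "served index.html")

prod_api = list(store.query({"app": "api", "env": "prod"}))
print(prod_api)  # [(1, 'GET /health 200')]
```

Because only the small label index is searched, the bulk of the log text can live in cheap, compressed chunks on object storage; queries narrow to a few streams first, then scan.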
Loki is a lightweight and cost-effective log analysis tool because it only indexes metadata. Hardware requirements are much lower than for full-text processing tools that build and store indices in memory. You can run Loki locally, in your own environment or on a cloud platform, and scale horizontally as operations grow. As with Prometheus and Grafana, Loki easily deploys on Kubernetes.
The Promtail agent is designed for Loki. It acquires logs, turns them into streams and pushes the streams to Loki via its HTTP API. Loki isn’t limited to Promtail; you may also use other clients to send logs, such as the Docker logging driver, Fluentd, Logstash and Lambda Promtail. Being able to use Logstash as a client is convenient if you already have it running and want to try Loki quickly. The ability to scrape Kubernetes and Docker container logs is very helpful for troubleshooting.
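To make the tailing-and-pushing flow concrete, here is a minimal Promtail configuration sketch. The field names follow Promtail’s documented schema, but the Loki URL, port and file paths are placeholders you would adapt to your environment; verify the details against the Promtail reference for your version.

```yaml
# Minimal Promtail configuration sketch (placeholders, not a production config).
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml    # where Promtail records how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push    # Loki's HTTP push endpoint

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*.log    # files to tail on this host
```

The `labels` attached here become the stream labels Loki indexes, which is why keeping label cardinality low matters for performance.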
Since version 2.0, Loki stores data in a single object storage back end. Grafana Labs relies on object storage because it is “fast, cost-effective, and simple, not to mention where all current and future development lies.” This mode relies on an adapter, boltdb-shipper (BoltDB Shipper), to store Loki indices in object storage. BoltDB Shipper lets you run Loki without a dependency on an external database for storing indices: it stores the index locally in BoltDB files and continuously ships those files to an object store such as MinIO. This simplifies Loki deployment, removes a potential point of failure and eliminates the cost of running an external database such as Apache Cassandra, Google Bigtable or Amazon DynamoDB.
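The relevant slice of a Loki configuration for this mode looks roughly like the sketch below. It assumes a Loki 2.x schema; the endpoint, credentials and bucket name are placeholders, and field names should be checked against the Loki configuration reference for the version you deploy.

```yaml
# Sketch: boltdb-shipper with an S3-compatible object store such as MinIO.
schema_config:
  configs:
    - from: 2021-01-01
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index     # BoltDB files are built here
    cache_location: /loki/index_cache
    shared_store: s3                        # shipped to the object store
  aws:
    # Placeholder credentials, host and bucket; path-style addressing is
    # typically required for MinIO endpoints.
    s3: http://MINIO_ACCESS_KEY:MINIO_SECRET_KEY@minio:9000/loki
    s3forcepathstyle: true
```

With this configuration both index files and log chunks end up in the same bucket, which is what makes the single-back-end deployment possible.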
Modernizing Log Storage Paradigms
Working with logs can be a resource-intensive proposition. Even with logs supporting multiple use cases, the effort of ingesting, storing and querying log data threatens to exceed its value. Enterprises need a way to reduce the complexity of storing and exploring log data, and that is object storage such as MinIO.
Enterprises go to great lengths to store logs and make them available for analysis. They deploy convoluted architectures and processes to increase efficiency, such as archiving logs to cold storage, shortening retention periods, rolling logs up into less detailed summaries or even dropping logs entirely.
These tactics defeat the purpose of collecting logs in the first place. The value isn’t in the log itself; the value is in how the log is applied. Logs contain all of the details needed to solve performance issues, analyze usage data, understand customer preferences and more. Many industries require that logs be saved for regulatory compliance and auditing purposes. However, when storage is so cumbersome that every log’s value is questioned, opportunities to apply the log to solve a problem become limited.
There is no limit to the volume of logs that MinIO can store and protect. MinIO places no constraints on the number and organization of objects, and distributes and protects them with erasure coding across multiple nodes and drives. MinIO’s tremendous performance has led many enterprises to implement object storage as primary storage in order to consolidate storage footprint, migrate to cloud-native application architectures and gain flexibility, portability and elastic scalability.
Infinitely Scalable Object Storage for Logs
For enterprises struggling with log management, object storage such as MinIO provides an opportunity to evolve their strategy. It’s hard to beat object storage for the performance and scalability required for high-volume and highly complex log data. Designing application stacks to run against object storage allows enterprises to decouple storage from compute resources, increasing efficiency over traditional monolithic log management solutions. Moving to a cloud-native architecture simplifies operations while expanding the ecosystem of available log management and analysis tools, such as Humio, Cribl, Splunk, Elastic and more.
For example, saving and working with log data in MinIO simplifies management while dramatically increasing the number of use cases to which this data can be applied. Admins don’t have to make detailed plans that take data format and size, or proprietary methods for data protection like snapshots and replication, into account. Having to plan and manage storage in this way leads to a painful, overly complicated result that frequently creates increased operational overhead. MinIO abstracts away the challenges of storing log data, handling data protection behind the scenes while exposing data via the ubiquitous S3 API. MinIO provides durable object storage for log data, reducing back-end complexity while simplifying and streamlining use cases and the applications that drive them. For the average developer or application end user, MinIO requires no management or oversight, leaving them free to focus on the log analysis that leads to true business insight instead of on storage.
MinIO and Loki for Enterprise Logging
Saving logs and Loki log data to MinIO helps overcome many of the challenges that enterprises face when managing large-scale log analytics programs. Enterprises must break free from the constraints presented by traditional enterprise storage to achieve the improved performance, portability and durability offered by MinIO. Removing the burden of outdated storage modalities frees developers to build the analytics and AI/ML applications that will enable enterprises to unlock the full value of their logs.
Stay tuned: we’ll be following up on the “why” of this blog post with the “how” of a detailed tutorial shortly.