August 11, 2021

Guide to Open Source Key-Value Databases

For organizations that need horizontal scaling for their data, key-value databases are a popular option. And, with open source databases more popular than their commercial counterparts, understanding available open source options is a central component of any successful search.

In this blog, we give an overview of key-value databases, discuss four popular open source key-value database options, and give considerations for their use within the enterprise.

What Are Key-Value Databases?

Key-value databases are a popular type of non-relational database used where horizontal scaling is a necessity. They store data as key-value pairs: each value is associated with a unique key, and that key is then used to identify and retrieve the object.

Key-Value Database Use Cases

Common use cases for key-value databases include web applications, ecommerce, and advertising (think session data, shopping carts, or real-time recommendations).

Key-Value Database Example

In the example below, you see the schema for Redis, with keys that indicate a person, a time, and a value for the indicated person by time.
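As a rough illustration of that kind of layout, here is a minimal sketch that uses a plain Python dict as a stand-in for a key-value store. The `person:timestamp` key format, the helper names, and the sample values are assumptions made for this example, not a schema taken from Redis itself.

```python
# In-memory stand-in for a key-value store (a real deployment would use Redis).
store = {}

def make_key(person, timestamp):
    """Build a composite key from a person and a time (hypothetical naming scheme)."""
    return f"{person}:{timestamp}"

def put(person, timestamp, value):
    """Associate a value with the key for this person at this time."""
    store[make_key(person, timestamp)] = value

def get(person, timestamp):
    """Look up the value for this person at this time, or None if absent."""
    return store.get(make_key(person, timestamp))

# Sample data: one value per person per time.
put("alice", "2021-08-11T09:00", 72)
put("alice", "2021-08-11T10:00", 75)
put("bob", "2021-08-11T09:00", 64)
```

Every read and write goes through the key, which is what makes this model easy to partition across nodes.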

When it comes to popularity, open source key-value databases are king. Among them, Redis and Elasticsearch are the most popular. According to db-engines.com, which provides a trending rank for databases based on a number of factors, Redis is currently the 6th most popular database in use today, with Elasticsearch the 8th most popular.

Key-Value Database | Related | Structured | Read | Write
Redis | | | Heavy | Heavy
etcd | | | Heavy | Heavy
Elasticsearch | | | Heavy | Heavy
Prometheus | | | Heavy | Heavy

Prometheus and etcd, while less popular, still register within the top 60 databases, with Prometheus the 60th most popular, and etcd at 46th.

While there are other databases that we could have selected here, our experts chose these key-value databases for inclusion based on their popularity and enterprise viability.

Editor's Note: Latest Release refers to the latest release as of August 11, 2021.

Redis

Redis Overview
Website: www.redis.io
Latest Release: 6.2.5 | July 2021
License: BSD 3-Clause
Governance Model: Meritocracy

Redis was one of the first key-value caching solutions available as open source and has seen widespread adoption across a range of use cases. One of its most popular early use cases was as an enterprise-class session cache, but it has since found applications in other areas, such as fraud analysis and inventory systems.
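To make the session cache use case concrete, here is a minimal sketch. It assumes `client` is a redis-py `redis.Redis` instance (only its `setex` and `get` methods are used); the `session:` key prefix, the helper names, and the 30-minute lifetime are illustrative choices, not recommendations from Redis.

```python
import json

SESSION_TTL_SECONDS = 1800  # assumed 30-minute session lifetime

def save_session(client, session_id, data, ttl=SESSION_TTL_SECONDS):
    """Store session data as JSON under a namespaced key that expires after ttl seconds."""
    client.setex(f"session:{session_id}", ttl, json.dumps(data))

def load_session(client, session_id):
    """Return the session dict, or None if the key is missing or has expired."""
    raw = client.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None
```

Letting the store expire keys on its own means the application never has to sweep stale sessions itself.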

Redis’s maturity has also led to broad compatibility, and it currently boasts a large set of client libraries across a number of programming languages, from C# to MATLAB. It also offers time series capabilities, making it suitable for analytics.

Considerations

Redis was designed with general-purpose key storage, indexing, and caching in mind. That means that for some purposes, such as search, another solution (such as Elasticsearch) may be a better choice.

Redis scales very well across multiple clouds and deployment architectures, and it provides sharding, multi-source replication, and source-replica replication, so it is highly flexible and reliable.
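To give a feel for how sharding can work, here is a minimal sketch of the key-to-slot mapping described in the Redis Cluster specification: a CRC16 checksum of the key, modulo 16384 hash slots, with an optional `{...}` hash tag that forces related keys into the same slot. The function name `key_slot` is our own, and using Python's `binascii.crc_hqx` for the CRC16 step is an assumption of this sketch rather than anything Redis ships.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in a Redis Cluster

def key_slot(key: str) -> int:
    """Map a key to a hash slot: CRC16 of the key (or its hash tag) modulo 16384."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16 variant (polynomial 0x1021)
    # that this sketch assumes matches the cluster specification.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS
```

Because each slot is owned by exactly one node at a time, a client can work out where a key lives from the key alone, without consulting a central index.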

etcd

etcd Overview
Website: www.etcd.io
Latest Release: 3.5.0 | June 2021
License: Apache 2.0

etcd is the default service registry and backing store included with Kubernetes, and is designed to be a highly scalable database for holding service endpoints inside a Kubernetes deployment. etcd’s data model is solidly in the realm of a key-value structure, but its primary access methods are meant to be universal and ubiquitous, so it allows for cloud-compatible integrations such as JSON/HTTP and gRPC.
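As a sketch of what the JSON/HTTP access path looks like, the helpers below build request bodies for etcd's v3 JSON gateway, which expects keys and values to be base64-encoded. The `/v3/kv/put` and `/v3/kv/range` paths are assumed to apply to recent 3.x releases (older releases used a different prefix), and the helper names are hypothetical. Nothing is sent over the network here.

```python
import base64
import json

def b64(text: str) -> str:
    """Base64-encode a string, as the etcd JSON gateway expects for keys and values."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def put_request(key: str, value: str):
    """Return the (path, JSON body) pair for storing a key-value pair."""
    return "/v3/kv/put", json.dumps({"key": b64(key), "value": b64(value)})

def range_request(key: str):
    """Return the (path, JSON body) pair for reading a single key."""
    return "/v3/kv/range", json.dumps({"key": b64(key)})
```

Each pair could then be sent as an HTTP POST to an etcd endpoint with any HTTP client.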

Considerations

etcd originated at CoreOS and rose to prominence as the backing store for Kubernetes, but it is now a Cloud Native Computing Foundation project of its own, finding applications beyond just service registry. That said, most adoption of etcd continues to be for the purpose of providing configuration storage for Kubernetes.

Elasticsearch

Elasticsearch Overview
Website: www.elastic.co/elasticsearch
Latest Release: 7.14 | August 2021
License: SSPL, Elastic License
Governance Model: Corporate Board

Elasticsearch is a “Search Engine” style of key-value database. It takes the capabilities and simplicity that come with key-value stores and extends the indexing and searching features a little further.

Where a traditional key-value database can only look up records by key, Elasticsearch allows for all manner of searching and indexing in the values as well.
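The difference can be sketched with a toy inverted index, the general data structure that search engines use to find records by the words in their values. This is a deliberately simplified illustration in plain Python, not Elasticsearch's actual implementation or API; the sample documents and function names are invented for the example.

```python
from collections import defaultdict

# Sample records: key -> free-form text value.
documents = {
    "log:1": "disk failure on node alpha",
    "log:2": "user login from node beta",
    "log:3": "disk usage warning on node beta",
}

def build_inverted_index(docs):
    """Map each lowercase token to the set of keys whose value contains it."""
    index = defaultdict(set)
    for key, text in docs.items():
        for token in text.lower().split():
            index[token].add(key)
    return index

def search(index, query):
    """Return the keys of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_inverted_index(documents)
```

A plain key lookup such as `documents["log:2"]` needs the exact key, while `search(index, "disk beta")` finds records from the words inside their values.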

This makes it ideal for searching lots of freeform data, which is why Elasticsearch forms the critical E in the ELK Stack. Combined with Logstash and Kibana, Elasticsearch forms part of a log search and aggregation function that competes with technologies like Splunk.

Considerations

Elasticsearch gives you extensive capabilities for searching through text, beyond what a traditional key-value database normally offers, but that flexibility comes at the cost of how easily it scales.

Other key-value databases, such as Redis, which only allow for a single index, are more tolerant of being partitioned across multiple cluster nodes and can achieve better scale in most cases.

It’s also important to note that Elasticsearch has moved to a source-available model and no longer uses a traditional open source license.

Prometheus

Prometheus Overview
Website: www.prometheus.io
Latest Release: 2.29.1 | August 2021
License: Apache 2.0
Governance Model: Meritocracy

Prometheus, the second project to be sponsored by the Cloud Native Computing Foundation (after Kubernetes), has become the de facto standard for gathering metric data from Kubernetes implementations. It’s a high-performance time series key-value database with a focus on accessibility.

A crowdsourced ecosystem of data exporters ensures compatibility with an extensive number of languages and platforms.
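To show roughly what an exporter produces, here is a minimal sketch that renders one sample line in the Prometheus text exposition format: a metric name, an optional set of key-value labels, and a numeric value. The function names and sample metrics are hypothetical, and the sketch leaves out parts of the format such as `# HELP` and `# TYPE` lines and timestamps.

```python
def escape_label_value(value: str) -> str:
    """Escape backslashes, double quotes, and newlines inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def render_sample(name, labels, value):
    """Render one sample as: name{label="value",...} value"""
    if labels:
        pairs = ",".join(
            f'{key}="{escape_label_value(str(val))}"'
            for key, val in sorted(labels.items())
        )
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"
```

For instance, `render_sample("http_requests_total", {"method": "post", "code": "200"}, 1027)` produces `http_requests_total{code="200",method="post"} 1027`. The metric name plus its labels acts as the key, and the numeric sample is the value.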

Originally introduced by SoundCloud in 2012, and open source from the very beginning, Prometheus has benefitted from almost a decade of open development. When paired with Grafana and Alertmanager, it forms a full, advanced, and beautiful metrics visualization and platform monitoring solution.

Considerations

Prometheus was envisioned with metrics gathering in mind, and though it can be used as a general-purpose time series database, there are other solutions that are intended to be more generic.

Prometheus does best serving its intended purpose: providing a large number of ingest points for distributed applications and retaining metrics. The Alertmanager framework, also provided by the Prometheus community, allows for thresholding and alerting through popular mechanisms like PagerDuty.

Final Thoughts

Ultimately, this guide is just the tip of the iceberg when it comes to open source key-value databases. Organizations that are researching their options should always perform their due diligence before picking a database. That requires honest effort and consideration, and often calls for external expertise.

