Guide to Open Source Key-Value Databases
For organizations that need horizontal scaling for their data, key-value databases are a popular option. And, with open source databases more popular than their commercial counterparts, understanding available open source options is a central component of any successful search.
In this blog, we give an overview of key-value databases, discuss four popular open source key-value database options, and give considerations for their use within the enterprise.
- What Are Key-Value Databases?
- Popular Open Source Key-Value Databases
- Final Thoughts
What Are Key-Value Databases?
Key-value databases are a popular type of non-relational database used where horizontal scaling is a necessity. They associate each value with a unique key, and that key is then used to look up the corresponding object.
Key-Value Database Use Cases
Common use cases for key-value databases include web applications, ecommerce, and advertising (think session data, shopping carts, or real-time recommendations).
Key-Value Database Example
In a typical Redis schema, for example, each key indicates a person and a time, and maps to the value recorded for that person at that time.
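As a minimal sketch of that idea (plain Python, not Redis itself), a dictionary can model keys that combine a person and a time. The `<person>:<timestamp>` key format, the helper names, and the sample readings are all hypothetical and chosen only for illustration.

```python
# Hypothetical key layout: "<person>:<timestamp>" -> value.
# A dict stands in for the key-value store; nothing here talks to Redis.
store = {}


def make_key(person, timestamp):
    """Build the composite key used to address a single value."""
    return f"{person}:{timestamp}"


def put(person, timestamp, value):
    """Store the value for this person at this time."""
    store[make_key(person, timestamp)] = value


def get(person, timestamp):
    """Return the stored value, or None if the key does not exist."""
    return store.get(make_key(person, timestamp))


put("alice", "2021-08-11T09:00", 72)
put("alice", "2021-08-11T10:00", 75)
put("bob", "2021-08-11T09:00", 64)
```

Because the whole key is needed for a lookup, `get("alice", "2021-08-11T10:00")` finds a value directly, while a question such as "all readings above 70" would require scanning every entry.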
Popular Open Source Key-Value Databases
When it comes to popularity, open source key-value databases are king. Among them, Redis and Elasticsearch are the most popular. According to db-engines.com, which provides a trending rank for databases based on a number of factors, Redis is currently the 6th most popular database in use today, with Elasticsearch the 8th most popular.
Prometheus and etcd, while less popular, still register within the top 60 databases, with etcd ranked 46th and Prometheus 60th.
While there are other databases that we could have selected here, our experts chose these key-value databases for inclusion based on their popularity and enterprise viability.
Editor's Note: Latest Release refers to the latest release as of August 11, 2021.
Redis
|Latest Release|6.2.5 (July 2021)|
Redis was one of the first key caching solutions available as open source and has seen widespread adoption across a range of use cases. One of its most popular use cases was as an enterprise-class session cache, but it has since found applications in other data use cases, such as fraud analysis and inventory systems.
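The session-cache use case described above can be sketched in a few lines. This is a minimal in-memory illustration in Python, not how Redis is implemented; the class name `SessionCache`, its methods, and the injectable clock are hypothetical choices made for the example.

```python
import time


class SessionCache:
    """Tiny in-memory session cache where every entry expires after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # session_id -> (expires_at, data)

    def set(self, session_id, data):
        """Store session data and (re)start its expiry timer."""
        self._entries[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id):
        """Return session data, or None if it is missing or has expired."""
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._entries[session_id]  # lazily evict the stale entry
            return None
        return data
```

A caller would create the cache with something like `SessionCache(ttl_seconds=1800)` and fall back to the primary database whenever `get` returns `None`.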
Redis’s maturity has also led to broad compatibility: it currently offers client libraries for a wide range of programming languages, from C# to MATLAB. It also offers time series capabilities, making it suitable for analytics.
Redis was designed with general-purpose key storage, indexing, and caching in mind, which means that for some purposes, such as search, another solution (such as Elasticsearch) may be a better choice.
Redis scales very well across multiple clouds and deployment architectures, and it provides sharding, multi-source replication, and source-replica replication, so it is highly flexible and reliable.
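To illustrate what sharding means for a key-value store, the sketch below maps each key to one of a fixed number of hash slots and then to a node. Redis Cluster documents its mapping as a CRC16 of the key modulo 16384; here Python's `binascii.crc_hqx` (a CRC-CCITT checksum) is used as a stand-in. The node names and the even split of slots across nodes are assumptions for the example, and features such as hash tags are ignored.

```python
import binascii

NUM_SLOTS = 16384  # slot count documented for Redis Cluster


def hash_slot(key: str) -> int:
    """Map a key to a slot using a 16-bit CRC modulo the slot count."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def node_for(key: str, nodes: list) -> str:
    """Pick the node that owns the key's slot.

    Assumes slots are split into equal contiguous ranges, one per node
    (a simplification; real clusters assign slot ranges explicitly).
    """
    slot = hash_slot(key)
    return nodes[slot * len(nodes) // NUM_SLOTS]


# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]
owner = node_for("user:1000", nodes)
```

Because the owner depends only on the key, any client can route a request to the right node without consulting a central index, which is part of why single-key lookups partition so cleanly.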
etcd
|Latest Release|3.5.0 (June 2021)|
etcd is the default service registry and backing store application included with Kubernetes, and was designed to be a highly scalable database to hold service endpoints inside a Kubernetes deployment. etcd’s data model is solidly in the realm of a key-value structure, but its primary access methods are meant to be universal and ubiquitous, so it allows for cloud-compatible integrations such as JSON/HTTP and gRPC.
etcd originated at CoreOS and grew up alongside Kubernetes, but it is now a project of its own under the Cloud Native Computing Foundation, finding applications beyond service registry. That said, most adoption of etcd continues to be for the purpose of providing configuration storage for Kubernetes.
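As a rough sketch of the JSON/HTTP access path, the snippet below builds (but does not send) a request that stores one key. It assumes the etcd v3 JSON gateway, which expects keys and values to be base64-encoded; the `/v3/kv/put` path, the `localhost:2379` address, and the sample key are assumptions that depend on the etcd version and deployment.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a string, as the JSON gateway expects for keys and values."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_put_request(key: str, value: str, base_url: str = "http://localhost:2379"):
    """Return the (url, json_body) pair for a key-value put.

    The endpoint path is an assumption based on the v3 JSON gateway;
    older releases exposed it under a different prefix.
    """
    url = f"{base_url}/v3/kv/put"
    body = json.dumps({"key": b64(key), "value": b64(value)})
    return url, body


# Hypothetical service-registry entry.
url, body = build_put_request("/services/web/endpoint", "10.0.0.12:8080")
```

The resulting `url` and `body` could be sent with any HTTP client as a POST request; the same operation is also available over gRPC.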
Elasticsearch
|Latest Release|7.14 (August 2021)|
|Governance Model|Corporate Board|
Elasticsearch is a “search engine” style of key-value database. It takes the capabilities and simplicity that come with key-value stores and extends the indexing and searching features further.
Where a traditional key-value database can only look up records by key, Elasticsearch allows for all manner of searching and indexing in the values as well.
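The difference between key lookup and value search can be sketched with an inverted index, the general technique search engines build on. This is a simplified Python illustration, not Elasticsearch's implementation; the sample log lines, the whitespace tokenizer, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents stored by key.
documents = {
    "log:1": "disk full on node a",
    "log:2": "node b restarted",
    "log:3": "disk replaced on node b",
}


def build_inverted_index(docs):
    """Map every lowercase token to the set of keys whose value contains it."""
    index = defaultdict(set)
    for key, text in docs.items():
        for token in text.lower().split():
            index[token].add(key)
    return index


def search(index, *terms):
    """Return the keys whose values contain all of the given terms."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()


index = build_inverted_index(documents)
disk_logs = search(index, "disk")           # keys mentioning "disk"
disk_on_b = search(index, "disk", "b")      # keys mentioning both terms
```

A plain key-value store answers `documents["log:1"]` directly, but finding every record that mentions "disk" requires this kind of secondary index over the values.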
This makes it ideal for searching lots of freeform data, which is why Elasticsearch forms the critical E in the ELK Stack. Combined with Logstash and Kibana, Elasticsearch forms part of a log search and aggregation function that competes with technologies like Splunk.
Elasticsearch gives you extensive capabilities for searching through text, beyond what is normally offered by a traditional key-value database, but that comes at the cost of scalability.
Other key-value databases, such as Redis, which only allow for a single index, are more tolerant of being partitioned across multiple cluster nodes and can achieve better scale in most cases.
It’s also important to note that Elasticsearch has moved to a source-available licensing model and no longer uses a traditional open source license.
Prometheus
|Latest Release|2.29.1 (August 2021)|
Prometheus, the second project to be sponsored by the Cloud Native Computing Foundation (after Kubernetes), has become the de facto standard for gathering metric data from Kubernetes implementations. It’s a high-performance time series key-value database with a focus on accessibility.
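To make the idea of a time series key-value store concrete, the sketch below keys each series by a metric name plus a set of labels and stores timestamped samples under that key. It is a simplified Python model, not Prometheus's storage engine; the class, method names, and sample metric are hypothetical.

```python
from collections import defaultdict


def series_key(metric, labels):
    """Identify a series by metric name plus its sorted label pairs."""
    return (metric, tuple(sorted(labels.items())))


class TimeSeriesStore:
    """In-memory store: series key -> list of (timestamp, value) samples."""

    def __init__(self):
        self._series = defaultdict(list)

    def append(self, metric, labels, timestamp, value):
        """Record one sample for the series identified by metric and labels."""
        self._series[series_key(metric, labels)].append((timestamp, value))

    def query_range(self, metric, labels, start, end):
        """Return samples with start <= timestamp <= end for one series."""
        samples = self._series.get(series_key(metric, labels), [])
        return [(t, v) for t, v in samples if start <= t <= end]


tsdb = TimeSeriesStore()
labels = {"method": "GET", "status": "200"}
tsdb.append("http_requests_total", labels, 1000, 5)
tsdb.append("http_requests_total", labels, 1015, 9)
tsdb.append("http_requests_total", labels, 1030, 14)
```

Sorting the label pairs means the same labels always produce the same key regardless of the order in which they were supplied.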
A crowdsourced ecosystem of data exporters ensures compatibility with an extensive number of languages and platforms.
Originally introduced by SoundCloud in 2012, and open source from the very beginning, Prometheus has benefitted from almost a decade of open development. When paired with Grafana and Alertmanager, it forms a full-featured metrics visualization and platform monitoring solution.
Prometheus was envisioned with metrics gathering in mind, and though it can be used as a general-purpose time series database, there are other solutions that are intended to be more generic.
Prometheus does best serving its intended purpose: providing a large number of ingest points for distributed applications and retaining metrics. Alertmanager, also provided by the Prometheus community, allows for thresholding and alerting through popular mechanisms like PagerDuty.
Final Thoughts
Ultimately, this guide is just the tip of the iceberg when it comes to open source key-value databases. Organizations that are researching their options should always perform their due diligence before picking a database. That requires honest effort and consideration, and often calls for external expertise.