November 29, 2016

ActiveMQ Failover: What to Do Without LevelDB

Web Infrastructure

ActiveMQ failover is important for maintaining high availability, but with the deprecation of LevelDB, it just got harder for some ActiveMQ users.

In this blog, we discuss the change, and how users can implement ActiveMQ failover without using LevelDB.


ActiveMQ Deprecated Use of LevelDB

In a surprising move, the ActiveMQ community has deprecated the use of LevelDB as a persistence store for their popular messaging broker. Christopher Shannon issued the following statement on November 15, 2016:

The main reason is that KahaDB continues to be the main focus where bugs are fixed and not much attention is paid to LevelDB. There seems to be several issues with corruption (especially with replication) so I don't think it should be a recommended store unless the stability is sorted out. Unfortunately nearly every JIRA reported against LevelDB goes ignored.

The ActiveMQ community reacted quickly and agreed to sunset further development efforts towards LevelDB.


Why LevelDB Was Used For ActiveMQ Failover

The LevelDB persistence store, an embedded key-value library developed at Google and modeled on its BigTable design, is actively used in many modern applications, including the Google Chrome and Chromium browsers. It was added in ActiveMQ 5.8 in response to issues surrounding the default persistence store, KahaDB.

Most of these issues were related to inefficiencies in indexing and cleanup in KahaDB, which uses a B-Tree style index. LevelDB's key-caching index performs extent cleanup during checkpoint processing more reliably. This addition was augmented in 5.9 with ZooKeeper-powered persistence store replication, providing faster ActiveMQ failover and a high availability model without a single point of failure.

ActiveMQ's user base warmly received this addition, and many people switched their persistence configuration over to LevelDB. Adoption of the LevelDB/ZooKeeper high availability (HA) model was hindered by its additional infrastructure requirements (a minimum of six machines), but a significant number of businesses did take advantage of the improved HA model.

Why the Change?

Despite its benefits, LevelDB is a project maintained outside of the ActiveMQ community, and therefore relies on third-party development efforts to keep it up to date. Although LevelDB itself is a solid solution, ActiveMQ had to maintain its own client library to wrap LevelDB, and the core committers who worked on that adapter are no longer actively developing the ActiveMQ 5.x branch. Without people to improve the client adapter, the community cannot adequately respond to bug reports and enhancement requests. Rather than let these issues continue to pile up, the community has decided to deprecate the functionality and focus on KahaDB improvement and ActiveMQ 6.0 (Artemis).


How to Implement ActiveMQ Failover Now

When a feature is deprecated by an OSS community, the likelihood of further improvements to that feature drops drastically. ActiveMQ is written against an older version of LevelDB, and it is now very unlikely that this will change. We recommend that you migrate off of LevelDB (including LevelDB/ZooKeeper) as soon as possible. You have several options:

1. Move Back to KahaDB 

If you are coming off of LevelDB/ZooKeeper and need faster failover for your HA model, don't forget that you are by no means limited to a single passive instance. You can have several passive instances competing for the lock, which statistically reduces the time it takes for a passive broker to become active.

Bear in mind as well that a lot of work and improvement has gone into KahaDB since the 5.11 release. Problems experienced with it in the past may no longer be a concern for you.
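Under shared-storage KahaDB, failover is driven entirely by a file lock on the shared directory, so adding more passive brokers is just a matter of pointing more identically configured instances at it. A minimal sketch of such a broker configuration (all paths, broker names, and ports below are placeholders, not values from this article):

```xml
<!-- activemq.xml: identical on every broker in the failover group -->
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker1">
  <persistenceAdapter>
    <!-- All brokers point at the same shared directory; the first to
         acquire the file lock becomes active, the rest wait as passives. -->
    <kahaDB directory="/mnt/shared/activemq/kahadb"/>
  </persistenceAdapter>
  <transportConnectors>
    <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"/>
  </transportConnectors>
</broker>
```

Clients then connect with the failover transport, listing every broker, e.g. `failover:(tcp://host1:61616,tcp://host2:61616,tcp://host3:61616)?randomize=false`, so they reconnect automatically to whichever instance wins the lock.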

2. Add Redundancy to Your Shared Storage for KahaDB HA

Technically, this won't remove the shared persistence layer as a single point of failure. But by investing in that shared storage and the surrounding infrastructure (redundant controllers, RAID, redundant network paths), you can significantly reduce the chance of losing it.

3. Keep KahaDB Happy 

Don't run your shared storage over CIFS/SMB, and don't keep it on any NTFS-based filesystem. The best throughput comes from iSCSI with a multi-user filesystem such as GFS2 layered on top of it.

If you are having trouble maintaining accurate lock state, be sure to check out the JDBC Pluggable Storage Locker, which can provide a more reliable locking mechanism.
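As a rough sketch of what a pluggable-locker configuration can look like, the fragment below swaps KahaDB's default file locker for a lease-based JDBC locker. The `mysql-ds` data source name, the directory, and the sleep interval are illustrative assumptions, not values from this article:

```xml
<persistenceAdapter>
  <kahaDB directory="/mnt/shared/activemq/kahadb">
    <locker>
      <!-- Lease-based locker backed by a database table; useful when
           the filesystem's own advisory locks are unreliable.
           #mysql-ds refers to a JDBC DataSource bean defined elsewhere
           in activemq.xml (placeholder name). -->
      <lease-database-locker dataSource="#mysql-ds"
                             lockAcquireSleepInterval="10000"/>
    </locker>
  </kahaDB>
</persistenceAdapter>
```

Because the lock lives in the database rather than on the shared filesystem, a broker that loses its lease stops cleanly instead of lingering as a stale active instance.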

4. Don’t Get Fancy

Many people have tried replicating persistence with a replicated filesystem like GlusterFS, CephFS, or DRBD while keeping the active/passive locking mechanism.

Though good on paper, these solutions do not perform well under load. They often lose lock state, leading to either no active broker or, worse, multiple active brokers, which can corrupt the persistence store.

5. Use JDBC Persistence

JDBC persistence for ActiveMQ can be clustered using familiar RDBMS replication concepts, and can be sped up considerably by using a connection pooling library like c3p0.
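A minimal sketch of that combination (the JDBC store fed by a c3p0 pool); the driver class, URL, credentials, and pool size are placeholders under the assumption of a PostgreSQL backend:

```xml
<!-- Inside activemq.xml: point the broker at a pooled DataSource -->
<persistenceAdapter>
  <jdbcPersistenceAdapter dataSource="#pooled-ds"/>
</persistenceAdapter>

<!-- c3p0 connection pool; driver, URL, and credentials are placeholders -->
<bean id="pooled-ds" class="com.mchange.v2.c3p0.ComboPooledDataSource">
  <property name="driverClass" value="org.postgresql.Driver"/>
  <property name="jdbcUrl" value="jdbc:postgresql://dbhost:5432/activemq"/>
  <property name="user" value="activemq"/>
  <property name="password" value="changeme"/>
  <property name="maxPoolSize" value="20"/>
</bean>
```

With JDBC persistence, failover follows a database-side lock rather than a filesystem lock, and durability comes from whatever replication your RDBMS already provides.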

6. Keep an Eye on Artemis

Be ready to switch to ActiveMQ Artemis when it is production-ready.


Need Help Implementing ActiveMQ Failover?

Open source solutions can move quickly, and although we will miss LevelDB, the community has spoken and we must be prepared to move our critical infrastructure in the same direction.

You can still achieve reliable ActiveMQ failover with KahaDB, and the community has worked to improve many of the issues KahaDB users experienced in the past. Artemis comes with its own clustering solution, and when it's ready for prime time you can expect the same level of high availability previously achievable through LevelDB/ZooKeeper.

If you need help or have more questions, contact the OpenLogic team. We provide around-the-clock access to Tier 3/4 open source architects ready to support, consult, and educate your team to solve issues across your entire software stack and development lifecycle. Find out how we can help you with ActiveMQ support and beyond.

