Kafka Raft Mode: Running Kafka Without ZooKeeper
It's official: Kafka Raft (KRaft) is replacing ZooKeeper soon in Apache Kafka. Many developers are excited about this change, but it will impact teams currently running Kafka with ZooKeeper who need to determine an upgrade path once ZooKeeper is no longer supported.
In this blog, our expert explains what Kafka Raft (KRaft) is, how Raft implementations differ from ZooKeeper-based deployments, and what to consider when planning the transition to a KRaft environment.
- What Is Kafka Raft (KRaft)?
- Timeline for ZooKeeper Deprecation
- Transitioning From ZooKeeper to Kafka Raft
- Final Thoughts
What Is Kafka Raft (KRaft)?
Kafka Raft, or KRaft, is Kafka’s implementation of the Raft consensus algorithm. (The name Raft loosely stands for Reliable, Replicated, Redundant, And Fault Tolerant.)
Created as an alternative to the Paxos family of algorithms, the Raft consensus protocol is designed to be simpler and easier to understand than Paxos. Paxos and Raft operate in a similar manner under normal, stable operating conditions, and both protocols accomplish the following:
- The leader writes an operation to its log and asks the follower servers to do the same
- The operation is marked as “committed” once a majority of servers acknowledge it
This results in a consensus-based change to the state machine, or in this specific case, the Kafka cluster.
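The two steps above can be sketched in a few lines of Python. This is only an illustrative model, not Kafka's code: the Server class, the reachable flag, and the replicate function are hypothetical names made up for this example, and real implementations add terms, retries, and persistence.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical cluster member that keeps an in-memory log."""
    name: str
    reachable: bool = True
    log: list = field(default_factory=list)

    def append(self, entry) -> bool:
        """Store the entry and acknowledge, unless the server is unreachable."""
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


def replicate(leader: Server, followers: list, entry) -> bool:
    """Return True once a majority of the cluster has stored the entry."""
    leader.log.append(entry)   # step 1: the leader writes to its own log
    acks = 1                   # the leader counts toward the majority
    for follower in followers:
        if follower.append(entry):
            acks += 1
    cluster_size = len(followers) + 1
    return acks > cluster_size // 2   # step 2: committed on a strict majority


leader = Server("a")
followers = [Server("b"), Server("c", reachable=False)]
committed = replicate(leader, followers, "create-topic orders")
```

With one of three servers unreachable, two acknowledgements still form a majority, so the operation commits. With two unreachable, it would not.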
The main difference between Raft and Paxos, however, shows up when operations are not normal and a new leader must be elected. Both algorithms guarantee that the new leader’s log will contain the most up-to-date commits, but how they accomplish this differs.
In Paxos, a leader election carries not only the proposal and the subsequent vote but also any log entries the candidate is missing. Followers in Paxos implementations can vote for any candidate, and once a candidate is elected, the new leader uses those entries to bring its own log up to date.
In Raft, on the other hand, followers will only vote for a candidate if the candidate’s log is at least as up to date as the follower’s log. This means only a candidate whose log is already current can be elected as leader. Ultimately, both protocols are remarkably similar in their approach to solving the consensus problem. However, because Raft makes some base assumptions about the data, namely the order of commits in the log, elections in Raft can be more efficient.
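As a rough illustration of that voting rule, the check below follows the comparison described in the Raft paper: the log whose last entry has the higher term wins, and if the terms are equal, the longer log wins. The function and parameter names are made up for this sketch, and it deliberately leaves out the rest of an election (such as granting only one vote per term).

```python
def candidate_log_is_current(candidate_last_term: int,
                             candidate_last_index: int,
                             follower_last_term: int,
                             follower_last_index: int) -> bool:
    """Return True if the candidate's log is at least as up to date as the follower's."""
    if candidate_last_term != follower_last_term:
        # The log ending in the later term is the more up-to-date one.
        return candidate_last_term > follower_last_term
    # Same last term: the log with at least as many entries wins.
    return candidate_last_index >= follower_last_index


# A follower whose log ends at term 3, index 7 evaluates two candidates.
grants_vote_a = candidate_log_is_current(3, 9, 3, 7)   # same term, longer log
grants_vote_b = candidate_log_is_current(2, 20, 3, 7)  # longer log, but older term
```

In this example the follower would vote for the first candidate and refuse the second, even though the second has more entries, because its last entry comes from an older term.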
What does this mean with regard to Kafka? From the protocol side of things, not much. ZooKeeper uses its own consensus protocol called ZAB (ZooKeeper Atomic Broadcast), which is much more focused on total ordering of commits to the change log. This focus on commit order makes Raft consensus fit quite well into the Kafka ecosystem.
That said, changes from an infrastructure perspective will be quite noticeable. With the KRaft logic incorporated into the base code of each broker, ZooKeeper nodes will no longer be part of the Kafka infrastructure. Note that this doesn’t necessarily mean fewer servers in the production environment (more on that later).
Why Is Kafka Raft Replacing ZooKeeper?
To understand why the community leadership decided to move away from ZooKeeper, we can look directly at KIP-500 for their reasoning. In short, this move was meant to reduce complexity and handle cluster metadata in a more robust fashion. Removing the requirement for ZooKeeper means there is no longer a need to deploy two distinctly different distributed systems. ZooKeeper has different deployment patterns, management tools, and configuration syntax when compared to Kafka. Unifying the functionality into a single system will reduce configuration errors and overall operational complexity.
In addition to simpler operations, treating the metadata as its own event stream means that a single number, an offset, can be used to describe a cluster member's position and be quickly brought up to date. This in effect applies the same principles used for producers and consumers to the Kafka cluster members themselves.
How Is Kafka Raft Different From ZooKeeper?
In a ZooKeeper-based Kafka deployment, the cluster consists of several broker nodes and a quorum of ZooKeeper nodes. In this environment, each change to the cluster metadata is treated as an isolated event, with no relationship to previous or future events. When state changes are pushed out to the cluster from the cluster controller, a.k.a. the broker in charge of tracking/electing partition leadership, there is potential for some brokers to not receive all updates, or for stale updates to create race conditions as we’ve seen in some larger Kafka installations. Ultimately, these failure points have the potential to leave brokers in divergent states.
While not entirely accurate, as all broker nodes can (and do) talk to ZooKeeper, the image below is a basic example of what that looks like:
In contrast, the metadata in KRaft is stored within Kafka itself, and ZooKeeper is effectively replaced by a quorum of Kafka controllers. The controller nodes form a Raft quorum that elects the active controller, which manages the metadata partition. This log contains everything that used to be found in ZooKeeper: topics, partitions, ISRs, configuration data, etc. will all be located in this metadata partition.
Using the Raft algorithm, controller nodes will elect the leader without the use of an external system like ZooKeeper. The leader, or active controller, will be the partition leader for the metadata partition and will handle all RPCs from the brokers.
The diagram below is a logical representation of the new cluster environment implementation using KRaft:
Note that in the diagram above there is no longer a double-sided arrow. This denotes another major difference between the two environments: instead of the controller pushing updates to the brokers, the brokers fetch the metadata via a MetadataFetch API. In similar fashion to a regular fetch request, a broker tracks the offset of the latest update it fetched and requests only newer updates from the active controller, persisting that metadata to disk for faster startup times.
In most cases, the broker will only need to request the deltas of the metadata log. However, in cases where no data exists on a broker or a broker is too far out of sync, a full metadata set can be shipped. A broker will periodically request metadata and this request will act as a heartbeat as well.
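The delta-versus-full-set behavior can be modeled with a small sketch. Everything here is a simplified assumption for illustration: the ActiveController and Broker classes, the trim method (standing in for log cleanup), and the tuple returned by fetch are invented names, not Kafka's actual API or wire format.

```python
class ActiveController:
    """Hypothetical controller holding the metadata log and the full metadata image."""

    def __init__(self):
        self.state = {}       # full metadata image (key -> value)
        self.log = []         # retained metadata records
        self.log_start = 0    # offset of the oldest retained record

    @property
    def end_offset(self):
        return self.log_start + len(self.log)

    def append(self, key, value):
        self.state[key] = value
        self.log.append((key, value))

    def trim(self, keep_last):
        """Discard all but the newest `keep_last` records."""
        drop = max(0, len(self.log) - keep_last)
        self.log = self.log[drop:]
        self.log_start += drop

    def fetch(self, offset):
        """Ship deltas when possible, otherwise the full metadata set."""
        if offset < self.log_start:   # requester is too far behind
            return "snapshot", dict(self.state), self.end_offset
        return "delta", self.log[offset - self.log_start:], self.end_offset


class Broker:
    """Hypothetical broker that tracks the offset of the last update it fetched."""

    def __init__(self):
        self.metadata = {}
        self.next_offset = 0

    def poll(self, controller):
        kind, payload, end_offset = controller.fetch(self.next_offset)
        if kind == "snapshot":
            self.metadata = payload
        else:
            for key, value in payload:
                self.metadata[key] = value
        self.next_offset = end_offset
        return kind


controller = ActiveController()
controller.append("topic.orders", "3 partitions")
broker = Broker()
first_poll = broker.poll(controller)        # a delta containing one record
controller.append("topic.orders", "6 partitions")
controller.trim(keep_last=0)                # older records are no longer retained
late_broker = Broker()
late_poll = late_broker.poll(controller)    # too far behind, so a full set is shipped
```

A single integer, next_offset, is all a broker needs to tell the controller where it stands, which is the same idea consumers use when reading a topic.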
Previously, when a broker entered or exited a cluster, this was tracked in ZooKeeper, but now the broker status will be registered directly with the active controller. In a post-ZooKeeper world, cluster membership and metadata updates are tightly coupled. Failure to receive metadata updates will result in eviction from the cluster.
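To make the coupling between metadata fetches and membership concrete, here is a minimal sketch of a controller-side tracker that treats each fetch as a heartbeat. The MembershipTracker class, its method names, and the timeout value are assumptions made for this example; Kafka's real controller logic and timeout settings are more involved.

```python
class MembershipTracker:
    """Hypothetical tracker: a broker stays a member only while it keeps fetching."""

    def __init__(self, session_timeout_s: float):
        self.session_timeout_s = session_timeout_s
        self.last_fetch = {}   # broker id -> time of its latest metadata fetch

    def record_fetch(self, broker_id: int, now: float) -> None:
        """Registering and heartbeating are the same action: a metadata fetch."""
        self.last_fetch[broker_id] = now

    def evict_stale(self, now: float) -> list:
        """Remove brokers whose last fetch is older than the timeout."""
        stale = [b for b, t in self.last_fetch.items()
                 if now - t > self.session_timeout_s]
        for broker_id in stale:
            del self.last_fetch[broker_id]
        return sorted(stale)

    def members(self) -> list:
        return sorted(self.last_fetch)


tracker = MembershipTracker(session_timeout_s=10)
tracker.record_fetch(1, now=0)
tracker.record_fetch(2, now=0)
tracker.record_fetch(1, now=8)        # broker 1 keeps fetching, broker 2 goes quiet
evicted = tracker.evict_stale(now=12)
```

Timestamps are passed in explicitly rather than read from a clock so the behavior is easy to follow: broker 2 has been silent for longer than the timeout and is dropped, while broker 1 remains a member.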
Timeline for ZooKeeper Deprecation
Starting with Kafka release 3.3, KRaft is now considered “production ready” (with some known caveats that we will discuss in the next section). The current plan, according to KIP-833, is that starting with 3.4, all subsequent 3.X releases will be considered “bridge releases.”
This means that while ZooKeeper will still exist in 3.X releases, the dependency on ZooKeeper will be increasingly isolated, with an emphasis on moving to a KRaft implementation. It also means that clusters running ZooKeeper releases (Kafka 2.8 and previous) will need to upgrade to a bridge release before they can upgrade to a post-ZooKeeper release (4.X and on).
Transitioning From ZooKeeper to Kafka Raft
For those organizations planning the move to a post-ZooKeeper environment, there are quite a few things to consider. In a KRaft-based cluster, Kafka nodes can run in one of three modes, selected with the process.roles setting: broker, controller, or combined (written as broker,controller). In a production cluster, it is recommended to avoid combined mode and instead give brokers and controllers their own dedicated JVM resources. So, as mentioned previously, doing away with ZooKeeper doesn’t necessarily mean doing away with the compute resources in production. Using combined mode in development or staging environments is perfectly acceptable.
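As a small illustration of the three modes, the helper below renders the role-related lines of a node's properties file. In Kafka's configuration the setting is spelled process.roles, and combined mode is expressed as broker,controller. The helper name, host names, and ports are hypothetical, and a real server.properties needs further settings (listeners, log directories, and so on) that are omitted here.

```python
def render_role_properties(node_id, roles, quorum_voters):
    """Render the role-related properties for one node.

    roles: ["broker"], ["controller"], or ["broker", "controller"] (combined).
    quorum_voters: (node id, host, port) tuples for the controller quorum.
    """
    allowed = {"broker", "controller"}
    if not roles or not set(roles) <= allowed:
        raise ValueError(f"roles must be drawn from {sorted(allowed)}")
    voters = ",".join(f"{vid}@{host}:{port}" for vid, host, port in quorum_voters)
    lines = [
        f"process.roles={','.join(roles)}",
        f"node.id={node_id}",
        f"controller.quorum.voters={voters}",
    ]
    return "\n".join(lines)


# Hypothetical three-controller quorum; host names are placeholders.
quorum = [
    (1, "controller-1.example", 9093),
    (2, "controller-2.example", 9093),
    (3, "controller-3.example", 9093),
]
controller_props = render_role_properties(1, ["controller"], quorum)
broker_props = render_role_properties(4, ["broker"], quorum)
dev_props = render_role_properties(1, ["broker", "controller"],
                                   [(1, "localhost", 9093)])
```

The first two results model the dedicated production layout, while dev_props models the combined mode that suits development or staging environments.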
Also, as of version 3.3, exactly three controllers are recommended regardless of cluster size in production environments. There are currently known issues where partial network failure can result in metadata becoming unavailable, and having more than three controllers can increase the likelihood of this becoming an issue. The community is currently working on fixes for these issues that should be available in subsequent releases.
Lastly, there are some known missing features in Kafka 3.3:
- Configuring SCRAM users via the administrative API
- Supporting JBOD configurations with multiple storage directories
- Modifying certain dynamic configurations on the standalone KRaft controller
- Delegation tokens
- Upgrade from ZooKeeper mode
In some instances, these missing features could make switching over to KRaft a non-starter. For example, if you are using delegation tokens in your broker/client authentication, it might be a considerable lift to move to mutual TLS/SSL authentication. Also, teams with large and complex existing Kafka clusters might find the lack of a direct upgrade path from ZooKeeper mode to KRaft quite daunting as well.
For greenfield implementations, using KRaft is definitely a no-brainer, but for mature Kafka environments, migrating to a post-ZooKeeper environment is a complete rip and replace for your cluster, with all the complications that could follow. Creating a detailed migration plan, with blue/green deployment strategies, is crucial in such cases. And if your team is lacking in Kafka expertise, seeking out external support to guide your migration would also be a good idea.
Need Support for Your Kafka Deployments?
Whether you need assistance with your partitioning strategy or preparing for the switch to Raft mode, OpenLogic can help. Our Kafka experts can provide technical support, consultations, or even train your team.