Kafka vs. RabbitMQ: Comparing Features and Use Cases
Apache Kafka and RabbitMQ are frequently compared, even though RabbitMQ is a message broker and Kafka is an event streaming platform. While these two open source technologies share some capabilities, their enterprise use cases vary considerably.
In this blog, our expert unpacks the key similarities and differences between Kafka and RabbitMQ and what to take into account when deciding which one to implement.
- Comparing Kafka vs. RabbitMQ
- Kafka vs. RabbitMQ: Key Differences
- Kafka vs. RabbitMQ: Use Cases
- Final Thoughts
Comparing Kafka vs. RabbitMQ
While RabbitMQ and Apache Kafka are both open source applications with similar message brokering capabilities, they are not interchangeable. Before choosing whether to deploy RabbitMQ or Kafka, it’s important to understand how they differ from one another.
What Is Apache Kafka?
Apache Kafka is an open source, distributed event streaming platform written in Java and Scala and designed for speed. Scala is a JVM language that combines functional and object-oriented programming.
Kafka is a pub/sub message bus that is log-based rather than queue-based. Messages stay in the log until they reach the retention limit. Kafka is also pull-based, meaning clients request messages when they need them.
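To make the log-based, pull-based idea concrete, here is a minimal in-memory sketch. It is not Kafka's API: the `RetainedLog` class, its method names, and the explicit `now` timestamps are all assumptions made for illustration. It shows only that reading never removes a record, and that records disappear when they age past the retention limit.

```python
from dataclasses import dataclass, field


@dataclass
class RetainedLog:
    """Toy append-only log (hypothetical; not Kafka's real interface)."""
    retention_seconds: float
    base_offset: int = 0                           # offset of the oldest record still kept
    records: list = field(default_factory=list)    # (timestamp, value) pairs

    def append(self, value, now):
        """Add a record and return the offset it was assigned."""
        self.records.append((now, value))
        return self.base_offset + len(self.records) - 1

    def expire(self, now):
        """Drop records older than the retention limit, whether read or not."""
        while self.records and now - self.records[0][0] > self.retention_seconds:
            self.records.pop(0)
            self.base_offset += 1

    def poll(self, offset, max_records=10):
        """Pull model: the client asks for records starting at an offset."""
        start = max(offset, self.base_offset) - self.base_offset
        batch = self.records[start:start + max_records]
        return [(self.base_offset + start + i, value)
                for i, (_, value) in enumerate(batch)]


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=30)
log.append("c", now=50)

first_read = log.poll(offset=0)     # reading does not remove anything
log.expire(now=70)                  # "a" is now older than 60 seconds
after_expiry = log.poll(offset=0)   # "a" is gone; "b" and "c" keep their offsets
```

In this sketch, a client that asks for an offset that has already expired is simply served from the oldest record still available.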
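The Apache Kafka project itself ships only a Java client, but clients for many other languages (for example C/C++, Python, and Go) are maintained by the community and vendors, and programmers can build their own integrations.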
What Is RabbitMQ?
RabbitMQ is an open source, distributed message broker focused on high availability and fault tolerance. It is written in Erlang, a functional programming language used to build scalable, real-time systems.
RabbitMQ has several types of exchanges, such as direct, fanout, topic, and headers, that determine which queues, and therefore which clients, receive a message.
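As a rough illustration of how an exchange type changes routing, the sketch below models only the direct and fanout cases in plain Python. The `Exchange` class and the queue names are hypothetical; a real RabbitMQ client library exposes this differently.

```python
class Exchange:
    """Toy exchange: holds bindings and decides which queues get a message."""

    def __init__(self, kind):
        if kind not in ("direct", "fanout"):
            raise ValueError(f"unsupported exchange type: {kind}")
        self.kind = kind
        self.bindings = []  # (queue_name, binding_key) pairs

    def bind(self, queue_name, binding_key=""):
        self.bindings.append((queue_name, binding_key))

    def route(self, routing_key):
        """Return the names of the queues that should receive the message."""
        if self.kind == "fanout":
            # Fanout ignores the routing key: every bound queue gets a copy.
            return [queue for queue, _ in self.bindings]
        # Direct delivers only where the binding key equals the routing key.
        return [queue for queue, key in self.bindings if key == routing_key]


direct = Exchange("direct")
direct.bind("billing", "invoice")
direct.bind("audit", "invoice")
direct.bind("shipping", "parcel")
direct_targets = direct.route("invoice")

fanout = Exchange("fanout")
fanout.bind("cache_a")
fanout.bind("cache_b")
fanout_targets = fanout.route("anything")
```

Here `direct_targets` contains only the two queues bound with the `invoice` key, while `fanout_targets` contains every bound queue.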
RabbitMQ supports AMQP 0.9.1, plus several other protocols using plugins, such as AMQP 1.0, HTTP, STOMP, and MQTT. It supports several languages including Java, Go, PHP, Python, Ruby, and many others.
Kafka vs. RabbitMQ: Key Differences
While both Kafka and RabbitMQ can handle a large volume of messages, there are a number of differences in terms of how they work.
RabbitMQ has a smart broker/dumb consumer model. All the routing and decisions are made in the broker, then it pushes the messages to the clients. The messages are then removed from the queue after all acknowledgments are received.
Kafka, on the other hand, uses a dumb broker/smart consumer model. The broker just appends messages to topics to be read. Where a message goes is decided by the producer (sending to the correct topic) and the consumer (reading from the correct topic). All the decisions about which topic to read from and which messages to act on are made by the client. Kafka clients must keep track of their position in the log to make sure they only get new messages. This also means that new clients can get all the messages currently in the log, and not just new messages.
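The position tracking described above can be sketched with a plain list standing in for the log. The `Consumer` class is an assumption for illustration, not a Kafka client; the point is only that each client owns its own offset, so a late joiner can still read the full history.

```python
class Consumer:
    """Toy consumer that remembers its own position in a shared log."""

    def __init__(self, name, start_offset=0):
        self.name = name
        self.offset = start_offset  # next offset this consumer wants to read

    def poll(self, log, max_records=100):
        batch = log[self.offset:self.offset + max_records]
        self.offset += len(batch)   # the client, not the broker, advances the position
        return batch


shared_log = ["order-1", "order-2", "order-3"]   # stands in for the broker's log

early = Consumer("early")
first = early.poll(shared_log)       # reads all three records

shared_log.append("order-4")
second = early.poll(shared_log)      # only the record it has not seen yet

late = Consumer("late")              # joins afterwards, starting from offset 0
replayed = late.poll(shared_log)     # still receives the full history
```

Resetting `offset` to an earlier value is all it takes to reprocess older messages in this model.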
Kafka generally has better raw throughput. Kafka can reach around 1,000,000 messages per second, whereas RabbitMQ typically handles around 4K-10K messages per second; the exact figures depend heavily on hardware, message size, and configuration. The gap comes from the architecture, as Kafka was designed around throughput.
Scalability & Redundancy
Both Kafka and RabbitMQ are scalable and redundant. Both allow multiple nodes and replication of messages. RabbitMQ nodes are all equal peers; there are no leader or follower nodes. Kafka historically relied on ZooKeeper for metadata storage and clustering, which made ZooKeeper a required companion application. Newer releases replace it with KRaft, a Raft-based protocol for cluster operations built into Kafka itself, and Kafka 4.0 removes the ZooKeeper dependency entirely.
RabbitMQ and Kafka implement their message queueing and distribution differently.
RabbitMQ nodes are all equal peers. Metadata is stored on all the nodes. A queue lives on one node, although its data is reachable from any node through cluster communication. The exceptions are special queue types, such as quorum or mirrored queues, which replicate data on multiple nodes. Messages are sent to exchanges, which have rules that send messages to queues, and the consumers connect to queues in order to receive messages.
Kafka divides each topic into partitions and replicates those partitions across nodes. Because of the smart consumer model, messages are sent to a topic and received by consumers that subscribe to that topic. The broker applies no filters or routing rules; that work falls to the producers (sending to the correct topics) and the consumers (subscribing to the correct topics).
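One common way a producer decides where a record lands is to hash the record's key and take the result modulo the number of partitions. The sketch below illustrates that idea under loud assumptions: `choose_partition` and `produce` are hypothetical helpers, and CRC32 is used only as a stand-in hash (Kafka's Java client uses a different hash function). Replication of partitions across nodes is not modeled.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition; CRC32 is a stand-in, not Kafka's real hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(partitions, key, value):
    """Append the record to the partition its key maps to."""
    index = choose_partition(key, len(partitions))
    partitions[index].append((key, value))
    return index


partitions = [[] for _ in range(3)]   # three in-memory partitions of one topic
for key, value in [("user-1", "login"), ("user-2", "login"),
                   ("user-1", "purchase"), ("user-1", "logout")]:
    produce(partitions, key, value)

# Every record with the same key lands in the same partition, in send order.
user1_partition = partitions[choose_partition("user-1", 3)]
user1_events = [value for key, value in user1_partition if key == "user-1"]
```

Because the decision is made in the producer, the broker side of this sketch needs no routing logic at all.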
RabbitMQ tends to have a lower message rate, which is a consequence of the design considerations around high availability and fault tolerance. Messages in RabbitMQ are acknowledgment-based, meaning they are deleted as soon as they are acknowledged by a client, whereas messages in Kafka are policy-based. This means they stay in the log for a configured retention period, regardless of whether they have been pulled by all clients or not.
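The acknowledgment-based side of that contrast can be sketched as a queue that keeps a delivered message until the consumer confirms it. `AckQueue`, its delivery tags, and the requeue-on-nack behavior are simplified assumptions for illustration, not RabbitMQ's actual implementation.

```python
from collections import deque


class AckQueue:
    """Toy queue: a message is removed only once the consumer acknowledges it."""

    def __init__(self):
        self.ready = deque()    # messages waiting for delivery
        self.unacked = {}       # delivery tag -> message awaiting an ack
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        """Hand the next message to a consumer and track it as unacknowledged."""
        if not self.ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = self.ready.popleft()
        return tag, self.unacked[tag]

    def ack(self, tag):
        """Consumer confirms processing: the message is deleted for good."""
        del self.unacked[tag]

    def nack(self, tag):
        """Consumer failed: put the message back at the front of the queue."""
        self.ready.appendleft(self.unacked.pop(tag))

    def __len__(self):
        return len(self.ready) + len(self.unacked)


work = AckQueue()
work.publish("a")
work.publish("b")

tag, message = work.deliver()
work.nack(tag)                          # processing failed, "a" is requeued
retry_tag, retry_message = work.deliver()
work.ack(retry_tag)                     # now "a" is really gone
remaining = len(work)
```

Unlike a retained log, nothing here can be replayed after the acknowledgment: once `ack` runs, the message no longer exists anywhere in the queue.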
In RabbitMQ, messages are sent using a push model. Clients connect and listen, usually with a background task. They bind to a queue and set parameters. Then, when a message is received, a callback is called and the message is processed. The broker can push messages in batches, from one to hundreds at a time, up to a configured prefetch limit. This limit is usually set based on the number of consumers and their processing speed, and testing will give the best numbers for a given setup.
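The push model with a callback and a cap on in-flight messages can be sketched as follows. `PushChannel` and its `prefetch_count` parameter are hypothetical names chosen for this illustration; the sketch only shows that the broker side drives delivery and stops pushing once the cap of unacknowledged messages is reached.

```python
from collections import deque


class PushChannel:
    """Toy channel that pushes messages to a callback, capped by a prefetch count."""

    def __init__(self, prefetch_count):
        self.prefetch_count = prefetch_count
        self.queue = deque()
        self.in_flight = 0      # delivered but not yet acknowledged
        self.callback = None

    def consume(self, callback):
        """Register the function that is called for each delivered message."""
        self.callback = callback
        self._push()

    def publish(self, message):
        self.queue.append(message)
        self._push()

    def ack(self):
        """Acknowledge one message, which frees a slot for the next delivery."""
        self.in_flight -= 1
        self._push()

    def _push(self):
        # Deliver while a consumer is registered and the cap is not reached.
        while self.callback and self.queue and self.in_flight < self.prefetch_count:
            self.in_flight += 1
            self.callback(self.queue.popleft())


received = []
channel = PushChannel(prefetch_count=2)
channel.consume(received.append)
for i in range(5):
    channel.publish(f"job-{i}")

before_ack = list(received)   # only two messages were pushed so far
channel.ack()
after_ack = list(received)    # the ack released one more delivery
```

A small cap keeps slow consumers from hoarding work, while a larger one reduces idle time for fast consumers.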
In Kafka, consumers read from the broker when ready. They keep track of where they are in order to get new messages.
In RabbitMQ, messages are deleted once successfully acknowledged by a consumer. In order for multiple consumers to get the same message, multiple queues have to be created, and rules in an exchange will send the message to multiple queues. This can be done with a fan-out exchange (all queues will get the message), or a topic exchange (queues will get copies of the message based on some keys), or other kinds of exchanges.
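For the topic exchange case, AMQP binding patterns are dot-separated words in which `*` stands for exactly one word and `#` for zero or more words. The matcher below is a minimal sketch of that rule; `topic_matches`, `route`, and the queue names are illustrative assumptions rather than RabbitMQ code.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a dot-separated routing key matches a binding pattern."""

    def match(p, k):
        if not p:
            return not k                      # pattern used up: key must be too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:       # '*' matches exactly one word
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(bindings, routing_key):
    """Return every queue whose binding pattern matches the routing key."""
    return [queue for queue, pattern in bindings
            if topic_matches(pattern, routing_key)]


bindings = [
    ("eu_orders", "orders.eu.*"),
    ("all_orders", "orders.#"),
    ("everything", "#"),
]
targets = route(bindings, "orders.eu.created")
```

Each queue in `targets` would receive its own copy of the message, which is how several consumers end up seeing the same message.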
With Kafka, messages are kept until the retention time passes. This is why they are called topics, and not queues. Multiple consumers can connect to that topic and receive the same messages. This also means that the consumers must keep track of where they are in the topic so they can get new messages, or reprocess earlier messages.
In RabbitMQ, messages can be given priorities, so some messages can arrive before others if they have higher priority.
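Priority delivery can be sketched with a heap. The `PriorityMessageQueue` class below is a hypothetical stand-in, not RabbitMQ's implementation; it assumes that a larger number means a higher priority and that messages with equal priority keep their arrival order.

```python
import heapq
import itertools


class PriorityMessageQueue:
    """Toy queue that delivers higher-priority messages first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order per priority

    def publish(self, message, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), message))

    def get(self):
        """Remove and return the most urgent message."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


inbox = PriorityMessageQueue()
inbox.publish("report", priority=0)
inbox.publish("alert", priority=9)
inbox.publish("email", priority=0)
inbox.publish("page", priority=9)

delivery_order = [inbox.get() for _ in range(len(inbox))]
```

In this run the two priority-9 messages come out first, in the order they were published, followed by the two priority-0 messages.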
With Kafka, all messages have the same priority, so messages are received in the order that they were written. Kafka guarantees this ordering within a partition, not across an entire multi-partition topic.
Data Flow & Usage
Messages are sent to exchanges and routed to queues in RabbitMQ. They then sit in the queue until read (and acknowledged), at which point they are deleted.
In Kafka, messages are sent in a continuous stream to the topic, where they are read by the consumers.
Both Kafka and RabbitMQ have RBAC, SASL authentication, and CLI control. RabbitMQ comes with a browser interface to manage users and queues, while Kafka uses JAAS to configure its SASL logins.
Kafka vs. RabbitMQ: Use Cases
Kafka is best for big data cases that require extremely fast throughput. With its retention policies, it is also good for clients that want to connect and get a history of messages to replay.
RabbitMQ would be the better option in situations where complex routing and low-latency delivery are needed.
Ultimately, your particular use case will determine whether Kafka or RabbitMQ is a better fit for your organization. RabbitMQ excels in single-broker implementations and is typically used for simple scenarios. Kafka can function effectively as a message broker, but really shines as a stream processor for teams that need to manage and route their streaming data in real time.