Introduction#
RabbitMQ and Kafka are both widely used messaging systems, but they serve fundamentally different purposes. Choosing the wrong one leads to operational complexity without benefit. Understanding their architectural differences helps you select the right tool for your use case.
Core Architecture Differences#
```
RabbitMQ: Traditional Message Broker

  Producers → Exchange → Queue → Consumers

- Messages are routed by exchanges (direct, topic, fanout, headers)
- Messages are consumed and then deleted
- Consumer acknowledgment drives message removal
- Focus: routing, delivery guarantees, consumer workloads

Kafka: Distributed Event Log

  Producers → Topic (Partitioned Log) ← Consumer Groups

- Messages are appended to a persistent log
- Consumers track their own offset; messages are not deleted
- Multiple consumer groups can independently replay the same events
- Focus: throughput, replay, stream processing, event sourcing
```
When to Use RabbitMQ#
Use RabbitMQ when you need:

- Task queues (worker pools processing jobs)
- Complex routing (route by message type, priority, or headers)
- Per-message TTL and dead-letter queues
- Request-reply patterns
- Sub-second latency at moderate throughput
- Consumer-driven backpressure

Examples:

- Email/notification sending queue
- Background job processing (image resizing, report generation)
- Microservice RPC over messaging
- Priority-based work scheduling
When to Use Kafka#
Use Kafka when you need:

- Very high throughput (millions of messages/sec)
- Event replay and time-travel (reprocess historical events)
- Multiple independent consumers of the same events
- Stream processing (Kafka Streams, Flink, Spark)
- Event sourcing and audit logs
- Long message retention (days, weeks, or forever)

Examples:

- Activity tracking (clicks, page views)
- Audit log for compliance
- CDC (change data capture) from databases
- Real-time analytics pipeline
- Event-sourced system integration
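One practical note on the activity-tracking case: Kafka guarantees ordering only within a partition, and the producer picks the partition by hashing the message key. A simplified stand-in for that idea (Kafka's real default partitioner uses murmur2, not MD5; this sketch only illustrates the determinism):

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    # Deterministic hash: the same key always maps to the same partition,
    # which is what preserves per-key ordering across publishes.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All events keyed by the same user id land on one partition and stay ordered.
assert partition_for("user-42", 6) == partition_for("user-42", 6)
```

This is why keying click events by user or session id is the usual choice: each user's stream stays ordered while load still spreads across partitions.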
RabbitMQ: Producer and Consumer#
```python
import json

import pika

# Producer: publish a task to a work queue
def publish_task(task: dict) -> None:
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(
        queue="email_tasks",
        durable=True,  # queue survives broker restart
    )
    channel.basic_publish(
        exchange="",
        routing_key="email_tasks",
        body=json.dumps(task),
        properties=pika.BasicProperties(
            delivery_mode=pika.DeliveryMode.Persistent,  # persist message to disk
            content_type="application/json",
        ),
    )
    connection.close()

# Consumer: process tasks with acknowledgment
def start_worker() -> None:
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="email_tasks", durable=True)
    channel.basic_qos(prefetch_count=1)  # one unacked task at a time per worker

    def callback(ch, method, properties, body):
        task = json.loads(body)
        try:
            send_email(task["to"], task["subject"], task["body"])
            ch.basic_ack(delivery_tag=method.delivery_tag)  # mark done
        except Exception as e:
            print(f"Failed: {e}")
            ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)  # retry

    channel.basic_consume(queue="email_tasks", on_message_callback=callback)
    channel.start_consuming()
```
RabbitMQ: Topic Exchange Routing#
```python
import json

# Route messages by routing-key pattern using a topic exchange
def setup_topic_exchange(channel):
    channel.exchange_declare(exchange="logs", exchange_type="topic", durable=True)

    # Bind queues to routing-key patterns
    channel.queue_declare(queue="error_logs", durable=True)
    channel.queue_bind(queue="error_logs", exchange="logs", routing_key="*.error")

    channel.queue_declare(queue="all_logs", durable=True)
    channel.queue_bind(queue="all_logs", exchange="logs", routing_key="#")  # everything

    channel.queue_declare(queue="auth_logs", durable=True)
    channel.queue_bind(queue="auth_logs", exchange="logs", routing_key="auth.*")

def publish_log(channel, service: str, level: str, message: str):
    routing_key = f"{service}.{level}"  # e.g., "auth.error", "api.info"
    channel.basic_publish(
        exchange="logs",
        routing_key=routing_key,
        body=json.dumps({"message": message, "service": service, "level": level}),
    )

# "auth.error" → error_logs + all_logs + auth_logs
# "api.info"  → all_logs only
```
Kafka: Producer and Consumer#
```python
import json

from confluent_kafka import Consumer, KafkaError, Producer

# Producer
def publish_event(topic: str, key: str, event: dict) -> None:
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    producer.produce(
        topic=topic,
        key=key.encode(),
        value=json.dumps(event).encode(),
        on_delivery=lambda err, msg: print(
            f"Error: {err}" if err else f"Delivered: {msg.topic()}@{msg.offset()}"
        ),
    )
    producer.flush()  # block until delivery reports arrive

# Consumer
def start_consumer(topics: list[str], group_id: str) -> None:
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": group_id,
        "auto.offset.reset": "earliest",  # start from the beginning for new groups
        "enable.auto.commit": False,      # manual commit for at-least-once delivery
    })
    consumer.subscribe(topics)
    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None:
                continue
            if msg.error():
                if msg.error().code() == KafkaError._PARTITION_EOF:
                    continue
                raise Exception(msg.error())
            event = json.loads(msg.value().decode())
            try:
                process_event(event)
                consumer.commit(asynchronous=False)  # commit only after success
            except Exception as e:
                print(f"Processing failed: {e}")
                # Do NOT commit: the offset stays uncommitted, so the message
                # is redelivered after a restart or group rebalance
    finally:
        consumer.close()
```
Kafka: Multiple Independent Consumers#
```python
# Same topic, different consumer groups = independent processing

# Analytics consumer group
def analytics_consumer():
    start_consumer(["user_events"], group_id="analytics-service")

# Notification consumer group
def notification_consumer():
    start_consumer(["user_events"], group_id="notification-service")

# Audit log consumer group
def audit_consumer():
    start_consumer(["user_events"], group_id="audit-service")

# All three read the same topic independently; each maintains its own offset.
# analytics-service can lag behind while notification-service stays current,
# and none of them affects the others.
```
Dead Letter Queues (RabbitMQ)#
```python
# Configure a dead-letter exchange (DLX) for failed messages
def setup_with_dlq(channel):
    # Dead-letter exchange and the queue that collects failures
    channel.exchange_declare(exchange="dlx", exchange_type="direct", durable=True)
    channel.queue_declare(queue="failed_tasks", durable=True)
    channel.queue_bind(queue="failed_tasks", exchange="dlx", routing_key="tasks")

    # Main queue: rejected or expired messages are routed to the DLX.
    # Note: RabbitMQ has no built-in retry limit; to cap retries, inspect
    # the x-death header the broker adds on each dead-lettering.
    channel.queue_declare(
        queue="tasks",
        durable=True,
        arguments={
            "x-dead-letter-exchange": "dlx",
            "x-dead-letter-routing-key": "tasks",
            "x-message-ttl": 30000,  # 30-second TTL before dead-lettering
        },
    )
```
Comparison Table#
| Feature | RabbitMQ | Kafka |
|---------|----------|-------|
| Message model | Queue (delete after consume) | Log (retain and replay) |
| Routing | Exchanges/bindings | Partitions by key |
| Throughput | ~50k msg/s | 1M+ msg/s |
| Latency | Sub-millisecond | Low millisecond |
| Message ordering | Per queue | Per partition |
| Consumer model | Push (broker-driven) | Pull (consumer-driven) |
| Replay | No (consumed = gone) | Yes (seek to offset) |
| Multiple consumers | Competing (one gets each message) | All groups get everything independently |
| Retention | Until acknowledged | Configurable (days/forever) |
| Protocol | AMQP | Custom binary over TCP |
| Use case | Task queues, RPC | Event streaming, analytics |
Conclusion#
RabbitMQ excels at task distribution, complex routing, and request-reply patterns where messages are consumed and completed. Kafka excels at high-throughput event streams where multiple consumers need independent access to the same data and replay is valuable. Many systems use both: Kafka for event streaming and RabbitMQ for job queues. The key question is: do consumers need to replay events independently, or do they compete to process work items?