Introduction#
Apache Kafka is a distributed log. Its architecture — partitions, replication, and consumer groups — determines throughput, latency, ordering guarantees, and fault tolerance. This post explains these concepts and their practical implications.
Log-Based Architecture#
Kafka stores messages in append-only, ordered logs called partitions. Consumers read by tracking their position (offset) in the log.
```
Topic: orders (3 partitions, replication factor 3)

Partition 0: [msg0][msg1][msg2][msg3]...
Partition 1: [msg0][msg1][msg2]...
Partition 2: [msg0][msg1][msg2][msg3][msg4]...
              offset: 0, 1, 2, 3, ...
```
Unlike traditional message queues (RabbitMQ, SQS), messages in Kafka are not deleted after consumption. They are retained according to configurable time or size limits (retention.ms, retention.bytes). Multiple consumer groups can independently read the same topic from different offsets.
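The log model can be sketched without a broker at all. In this minimal, hypothetical simulation, two consumer groups (names made up here) read the same retained log while each tracks its own offset:

```python
# Broker-free sketch: two consumer groups read the same retained log
# independently, each tracking its own offset.
log = ["msg0", "msg1", "msg2", "msg3"]     # retained messages
offsets = {"analytics": 0, "billing": 2}   # one offset per group

def poll(group: str):
    """Return the next message for this group, or None at the end of the log."""
    off = offsets[group]
    if off >= len(log):
        return None
    offsets[group] = off + 1
    return log[off]

print(poll("analytics"), poll("billing"))  # msg0 msg2: independent positions
```

Because consumption never deletes a message, adding a new group later simply means starting it at whatever offset the retention window still covers.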
Partitions: Parallelism and Ordering#
```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "kafka:9092"})

# Messages with the same key go to the same partition (ordered)
producer.produce(
    topic="orders",
    key="user:42",  # key determines the partition
    value='{"id": 1001, "total": 99.99}',
)

# No key: messages are spread across partitions (max throughput, no ordering guarantee)
producer.produce(
    topic="events",
    value='{"type": "page_view"}',
)

producer.flush()
```
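The key-to-partition mapping is just a deterministic hash modulo the partition count. Kafka's default partitioner actually uses murmur2 over the key bytes; the sketch below substitutes an MD5-based stand-in purely to show the property that matters, namely that a given key always lands on the same partition:

```python
import hashlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    """Stand-in for Kafka's default partitioner (which uses murmur2 over
    the key bytes): hash the key deterministically, then take it mod
    the partition count."""
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return h % num_partitions

# The same key always maps to the same partition, so per-key ordering holds.
assert pick_partition(b"user:42", 12) == pick_partition(b"user:42", 12)
```

This is also why increasing the partition count later reshuffles keys: the modulus changes, so existing keys may map to different partitions.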
Partition count determines max parallelism for consumers. A topic with 3 partitions can be consumed by at most 3 consumers in a group simultaneously. Each consumer in the group is assigned one or more partitions.
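As a sketch of why partition count caps parallelism, here is a simplified round-robin-style assignment (real Kafka uses range, round-robin, or sticky assignors chosen by config): any consumer beyond the partition count receives nothing and sits idle.

```python
# Simplified group assignment: partitions are spread across members;
# extra consumers beyond the partition count are left idle.
def assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

print(assign([0, 1, 2], ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}  -- c4 gets no partition
```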
```shell
# Create topic with appropriate partition count
# (12 partitions: plan for up to 12 consumer threads)
kafka-topics.sh --create \
  --topic orders \
  --partitions 12 \
  --replication-factor 3 \
  --bootstrap-server kafka:9092

# Check topic metadata
kafka-topics.sh --describe --topic orders --bootstrap-server kafka:9092
```
Replication: Fault Tolerance#
Each partition has one leader and N-1 followers. The replicas that are fully caught up with the leader, including the leader itself, form the ISR (In-Sync Replicas) set.
```
Partition 0:
  Leader:   broker-1 (receives all writes and reads)
  Follower: broker-2 (synced copy)
  Follower: broker-3 (synced copy)
  ISR: [broker-1, broker-2, broker-3]
```
```python
from confluent_kafka import Producer

# Producer durability settings
producer = Producer({
    "bootstrap.servers": "kafka:9092",
    # acks=all: wait for all ISR replicas to acknowledge
    # acks=1:   wait only for the leader (faster, less durable)
    # acks=0:   fire and forget (fastest, no guarantee)
    "acks": "all",
    # Retry on transient failures
    "retries": 5,
    "retry.backoff.ms": 100,
    # Idempotent producer: retries cannot introduce duplicates
    # (exactly-once per partition, per producer session)
    "enable.idempotence": True,
    # Message compression
    "compression.type": "snappy",
})
```
min.insync.replicas (a topic or broker config) defines the minimum number of ISR replicas that must acknowledge a write (when the producer uses acks=all) for it to succeed. With replication.factor=3 and min.insync.replicas=2, the cluster tolerates one broker failure without losing acknowledged writes.
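The durability arithmetic is simple enough to state as code. With acks=all, writes are rejected once fewer than min.insync.replicas replicas remain in sync, so the number of broker failures tolerated while still accepting writes is the difference between the two settings:

```python
# With acks=all, a write succeeds only while at least min.insync.replicas
# replicas (leader included) are in sync, so the number of broker failures
# tolerated without rejecting writes is RF - min.insync.replicas.
def tolerated_failures(replication_factor: int, min_insync_replicas: int) -> int:
    return replication_factor - min_insync_replicas

print(tolerated_failures(3, 2))  # 1
```

A second simultaneous failure does not lose committed data; it just makes the partition reject new writes until a replica rejoins the ISR.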
Consumer Groups#
```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",
    "group.id": "order-processor",  # group ID defines the consumer group
    # Offset reset: what to do when no committed offset exists
    "auto.offset.reset": "earliest",   # read from the beginning
    # "auto.offset.reset": "latest",   # read only new messages
    # Disable auto-commit: commit manually after processing
    "enable.auto.commit": False,
})

consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        try:
            process_order(msg.value())
            # Commit only after successful processing
            consumer.commit(msg)
        except Exception as e:
            # Do NOT commit; the message will be redelivered
            handle_error(e, msg)
finally:
    consumer.close()
```
Consumer group rebalancing occurs when a consumer joins or leaves the group. During rebalancing, consumption pauses. Minimize rebalance impact with:
```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",
    "group.id": "order-processor",
    # Cooperative rebalancing: only reassigned partitions pause
    "partition.assignment.strategy": "cooperative-sticky",
    # Prevent a rebalance when a consumer is merely slow (not failed)
    "max.poll.interval.ms": 300000,   # 5 minutes max between polls
    "session.timeout.ms": 30000,      # declared dead after 30 s without heartbeats
    "heartbeat.interval.ms": 3000,
})
```
Offsets and Delivery Semantics#
```python
# At-least-once (recommended default):
# commit after processing; may reprocess on crash.
try:
    process(msg)
    consumer.commit(msg)
except Exception:
    # No commit: the message will be reprocessed
    pass

# At-most-once (rarely correct):
# commit before processing; may lose messages on crash.
consumer.commit(msg)
process(msg)

# Exactly-once requires:
# 1. Idempotent producer (enable.idempotence=True)
# 2. Transactional producer (transactional.id set)
# 3. read_committed isolation level on the consumer
# Only use it when the cost of the extra complexity is justified.
```
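To make the at-least-once behavior concrete, here is a broker-free simulation of the commit-after-processing pattern. A crash between processing and committing means the offset is never advanced, so the same message is delivered again on restart; this is the duplicate that downstream consumers must be able to absorb:

```python
# Broker-free simulation of at-least-once delivery: the offset is
# committed only after processing, so a crash between processing and
# commit causes the same message to be delivered again on restart.
log = ["msg0", "msg1", "msg2"]
committed = 0     # last committed offset (next message to read)
processed = []

def run_consumer(crash_before_commit_at=None):
    global committed
    offset = committed
    while offset < len(log):
        processed.append(log[offset])        # process the message
        if offset == crash_before_commit_at:
            return                           # crash: no commit happened
        committed = offset + 1               # commit after processing
        offset += 1

run_consumer(crash_before_commit_at=1)  # crash right after processing msg1
run_consumer()                          # restart from the last committed offset
print(processed)  # ['msg0', 'msg1', 'msg1', 'msg2']: msg1 was reprocessed
```

Swapping the order of the two steps inside the loop turns this into at-most-once: the crash would then lose msg1 instead of duplicating it.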
Lag Monitoring#
Consumer lag is the difference between a partition's log-end offset and the group's committed offset: the number of messages produced but not yet consumed.
```shell
# Check consumer group lag
kafka-consumer-groups.sh \
  --bootstrap-server kafka:9092 \
  --describe \
  --group order-processor

# Output:
# GROUP            TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
# order-processor  orders  0          1000            1050            50
# order-processor  orders  1          980             980             0
# order-processor  orders  2          1100            1120            20

# Prometheus: kafka_consumer_group_lag metric via kafka_exporter
```
Persistent lag growth means consumers are slower than producers. Options: add consumers (up to partition count), optimize processing, or increase partitions.
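The LAG column is simple arithmetic over the two offset columns, which is worth knowing when wiring up your own dashboards. Using the sample numbers from the output above:

```python
# Lag per partition is the log-end offset minus the group's committed offset.
end_offsets = {0: 1050, 1: 980, 2: 1120}   # LOG-END-OFFSET per partition
committed   = {0: 1000, 1: 980, 2: 1100}   # CURRENT-OFFSET per partition

lag = {p: end_offsets[p] - committed[p] for p in end_offsets}
print(lag, sum(lag.values()))  # {0: 50, 1: 0, 2: 20} 70
```

Alerting on total lag alone can hide a single stuck partition, so per-partition lag is usually the better signal.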
Conclusion#
Choose partition count based on desired consumer parallelism. Set replication.factor=3 and min.insync.replicas=2 for production. Use manual offset commits with at-least-once delivery. Monitor consumer lag — it is the primary health metric for Kafka consumers. Use message keys to guarantee ordering for related events while distributing unrelated events across partitions.