CAP Theorem Explained


Introduction#

The CAP Theorem, introduced by Eric Brewer in 2000 and formally proven by Seth Gilbert and Nancy Lynch in 2002, is a fundamental theorem in distributed systems. It states that a distributed data store can only guarantee two of the following three properties simultaneously:

  • C - Consistency
  • A - Availability
  • P - Partition Tolerance

Understanding CAP helps engineers make informed trade-offs when choosing databases and designing distributed architectures.

The Three Properties#

Consistency#

Every read receives the most recent write or an error. All nodes in the distributed system see the same data at the same time.

When you write a value to a consistent system, any subsequent read from any node returns that updated value — no stale data is served.
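One common way systems provide this guarantee is quorum replication: with N replicas, a write waits for W acknowledgments and a read consults R replicas, and choosing R + W > N ensures every read quorum overlaps the write quorum. The sketch below is illustrative only (the class and method names are invented, not from any particular database):

```python
# Minimal quorum read/write sketch: with N replicas, requiring
# R + W > N guarantees every read quorum overlaps the write quorum,
# so a read always sees the most recent acknowledged write.

class QuorumStore:
    def __init__(self, n=3, w=2, r=2):
        assert r + w > n, "R + W must exceed N for strong consistency"
        self.n, self.w, self.r = n, w, r
        # Each replica maps key -> (version, value)
        self.replicas = [{} for _ in range(n)]

    def write(self, key, value, reachable=None):
        reachable = reachable if reachable is not None else range(self.n)
        version = max((self.replicas[i].get(key, (0, None))[0]
                       for i in range(self.n)), default=0) + 1
        acked = 0
        for i in reachable:
            self.replicas[i][key] = (version, value)
            acked += 1
        if acked < self.w:
            raise RuntimeError("write failed: quorum not reached")
        return version

    def read(self, key, reachable=None):
        reachable = list(reachable if reachable is not None else range(self.n))
        if len(reachable) < self.r:
            raise RuntimeError("read failed: quorum not reached")
        # Return the value with the highest version among R replicas.
        versions = [self.replicas[i].get(key, (0, None)) for i in reachable[:self.r]]
        return max(versions)[1]

store = QuorumStore()
store.write("balance", 100)
store.write("balance", 150, reachable=[0, 1])   # replica 2 missed this write
print(store.read("balance", reachable=[1, 2]))  # 150: quorum overlaps at replica 1
```

Because the read quorum {1, 2} overlaps the write quorum {0, 1} at replica 1, the stale copy on replica 2 is outvoted by the higher version number.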

Availability#

Every request receives a response (not an error), though the response may not contain the most recent write. The system remains operational even if some nodes are down.

An available system always responds, but cannot guarantee the data is the latest.

Partition Tolerance#

The system continues to operate even when network messages between nodes are dropped or delayed — i.e., a network partition occurs.

In any real-world distributed system, network partitions can and do happen. Therefore, partition tolerance is effectively a requirement, not an option.

Why You Can Only Have Two of Three#

In a distributed system, a network partition can split nodes so they cannot communicate. When this happens:

  • If you prioritize Consistency + Partition Tolerance (CP): The system returns errors or delays requests it cannot answer consistently until the nodes can agree again. This sacrifices availability.
  • If you prioritize Availability + Partition Tolerance (AP): The system continues serving requests but may return stale data. This sacrifices consistency.
  • Consistency + Availability (CA): Achievable only when partitions never occur, which in practice means a single node or a perfectly reliable network. Neither exists in real distributed deployments.
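The first two bullets can be made concrete with a toy simulation. Both replicas below hold a possibly stale copy of the data; they differ only in what they do when cut off from the rest of the cluster (the `Replica` class is a sketch for illustration, not a real database API):

```python
# Toy illustration of the CP vs. AP choice during a network partition.

class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value this replica saw
        self.connected = True     # can it reach the rest of the cluster?

    def read(self):
        if self.connected:
            return self.value     # can confirm this is the latest value
        if self.mode == "CP":
            # Consistency over availability: refuse rather than risk stale data.
            raise TimeoutError("partition: cannot confirm latest value")
        # Availability over consistency: answer with what we have.
        return self.value

cp, ap = Replica("CP"), Replica("AP")
cp.connected = ap.connected = False   # a network partition occurs

print(ap.read())                      # "v1": possibly stale, but a response
try:
    cp.read()
except TimeoutError as e:
    print("CP replica:", e)
```

The AP replica stays responsive at the cost of possibly serving an outdated value; the CP replica fails the request rather than break the consistency guarantee.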

CAP Trade-offs in Practice#

Since partition tolerance is mandatory for distributed systems, the real choice is between CP and AP.

CP Systems — Consistency over Availability#

CP systems reject or delay requests if they cannot guarantee consistent data across all nodes after a partition.

Examples:

  • HBase — Makes affected regions unavailable during failover rather than serving inconsistent data
  • Zookeeper — Returns errors rather than stale data during partitions
  • MongoDB (with majority write concern) — Waits for majority acknowledgment before confirming writes
  • etcd — Used in Kubernetes; prioritizes consistency for leader election

When to choose CP:

  • Financial transactions where stale balances are unacceptable
  • Inventory management where overselling must be prevented
  • Leader election and distributed locking

AP Systems — Availability over Consistency#

AP systems continue responding during partitions but may return data that is not the most recent. Nodes eventually synchronize when the partition heals (eventual consistency).

Examples:

  • Cassandra — Highly available with tunable consistency
  • DynamoDB — Eventual consistency by default, strong consistency optional
  • CouchDB — Designed for availability with offline-first sync
  • Riak — AP by design with eventual consistency

When to choose AP:

  • Social media feeds where slight staleness is acceptable
  • Product catalog browsing where a slightly outdated price is tolerable
  • DNS resolution — propagates changes eventually, not instantly
  • Shopping cart where conflicting updates can be merged
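The shopping-cart case in the last bullet works because concurrent updates can be merged deterministically, for instance by taking the union of items added on each side of the partition. That is the idea behind the grow-only set, the simplest conflict-free replicated data type (CRDT). A rough sketch:

```python
# Merging divergent shopping carts after a partition heals.
# Taking the union of per-replica "added" sets (a grow-only set, the
# simplest CRDT) produces the same result regardless of merge order.

def merge_carts(*replica_carts):
    merged = set()
    for cart in replica_carts:
        merged |= cart
    return merged

# During a partition, two replicas accept different additions:
replica_a = {"book", "coffee"}
replica_b = {"book", "headphones"}

print(sorted(merge_carts(replica_a, replica_b)))
# ['book', 'coffee', 'headphones'] in either merge order
```

Note that a pure union cannot express removals; handling "item deleted on one side, re-added on the other" needs a richer structure such as an observed-remove set.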

Consistency Models Beyond Binary#

CAP presents consistency as binary, but real systems offer a spectrum:

Strong Consistency#

All reads reflect the most recent write. Used in CP systems. Every node agrees before responding.

Eventual Consistency#

Given no new updates, all replicas will eventually converge to the same value. Used in AP systems.
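Convergence is typically driven by anti-entropy: replicas periodically exchange versioned values and keep the newest. The sketch below compresses that into a single round where every replica adopts the highest-versioned write (real gossip protocols exchange state pairwise over many rounds, but the convergence argument is the same):

```python
# Simplified anti-entropy round: replicas compare (version, value) pairs
# and adopt the highest-versioned write. Given no new updates, repeated
# rounds drive every replica to the same state (eventual consistency).

def gossip_round(replicas):
    latest = max(replicas)                 # highest (version, value) pair
    for i in range(len(replicas)):
        replicas[i] = max(replicas[i], latest)

replicas = [(3, "v3"), (1, "v1"), (2, "v2")]   # divergent after a partition
gossip_round(replicas)
print(replicas)   # every replica now holds (3, 'v3')
```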

Read-Your-Writes Consistency#

A user always sees their own writes, even if other users may not see them yet. Common in social platforms.
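One way to implement this guarantee is session tracking: the client remembers the version of its own last write and refuses reads from replicas that have not caught up. A minimal sketch (the dict-based replicas stand in for real nodes):

```python
# Read-your-writes sketch: the session remembers the version of its own
# last write and only accepts reads from replicas that have caught up.

class Session:
    def __init__(self):
        self.last_written_version = 0

    def write(self, replica, value):
        replica["version"] += 1
        replica["value"] = value
        self.last_written_version = replica["version"]

    def read(self, replica):
        if replica["version"] < self.last_written_version:
            raise RuntimeError("replica lagging: retry on another node")
        return replica["value"]

primary = {"version": 5, "value": "old bio"}
lagging = {"version": 5, "value": "old bio"}

session = Session()
session.write(primary, "new bio")   # primary now at version 6
print(session.read(primary))        # "new bio"
try:
    session.read(lagging)           # still at version 5: rejected
except RuntimeError as e:
    print(e)
```

Other users reading from the lagging replica still see the old value, which is exactly what the guarantee permits.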

Monotonic Read Consistency#

Once a client has read a value, it will never read an older value in subsequent reads.
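This can be enforced client-side by tracking the highest version seen so far and rejecting any replica that would serve something older. A brief sketch:

```python
# Monotonic-read sketch: the client tracks the highest version it has
# seen; a replica that would serve an older version is rejected.

class Client:
    def __init__(self):
        self.highest_seen = 0

    def read(self, replica_version, replica_value):
        if replica_version < self.highest_seen:
            raise RuntimeError("stale replica: would violate monotonic reads")
        self.highest_seen = replica_version
        return replica_value

c = Client()
print(c.read(7, "v7"))   # fresh replica: accepted, highest_seen is now 7
try:
    c.read(5, "v5")      # older than what this client has seen: rejected
except RuntimeError as e:
    print(e)
```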

Causal Consistency#

Operations that are causally related are seen in the same order by all nodes. If event A causes event B, no node will see B before A.
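Causal order is commonly tracked with vector clocks: event A causally precedes event B exactly when A's clock is less than or equal to B's in every component and the clocks differ. A causally consistent store delivers B on a node only after A. A small sketch of the comparison:

```python
# Vector-clock sketch for causality: A happens-before B iff A's clock
# is componentwise <= B's clock and the two clocks are not equal.

def happens_before(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

# Node 0 writes (event A); node 1 reads it and then writes (event B):
clock_a = (1, 0)   # state after A on node 0
clock_b = (1, 1)   # B on node 1, issued after seeing A

print(happens_before(clock_a, clock_b))   # True: deliver A before B everywhere
print(happens_before((2, 0), (0, 1)))     # False: concurrent, no required order
```

Events whose clocks are incomparable (neither happens-before the other) are concurrent, and causal consistency places no ordering requirement on them.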

PACELC Extension#

The CAP theorem only covers behavior during partitions. The PACELC model extends CAP to address latency trade-offs under normal operation:

  • If Partition (P): choose between Availability (A) and Consistency (C) — same as CAP
  • Else (E): choose between Latency (L) and Consistency (C)

Even when no partition exists, there is a trade-off: stronger consistency requires more coordination between nodes, which introduces latency.

  Database     During Partition   Normal Operation
  DynamoDB     AP                 EL (low latency, eventual)
  Cassandra    AP                 EL
  HBase        CP                 EC (higher latency, consistent)
  Spanner      CP                 EC

Practical Example: E-Commerce Order System#

Consider an order management system:

Scenario A: Inventory service (CP)

When placing an order, the system checks if stock is available. Serving stale inventory data could cause overselling. A CP approach holds the request until all inventory nodes agree on the current stock level. Availability is temporarily sacrificed during a partition.
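Scenario A's oversell prevention can be sketched as a conditional decrement that only runs while the node can coordinate with the rest of the cluster. Everything here is illustrative (the `has_quorum` flag stands in for reaching a majority of inventory nodes):

```python
# CP-style inventory reservation sketch: the stock decrement is applied
# only when the node can coordinate with a quorum; during a partition
# the order is rejected instead of risking an oversell.

class InventoryService:
    def __init__(self, stock):
        self.stock = stock
        self.has_quorum = True   # stands in for reaching a majority of nodes

    def reserve(self, quantity):
        if not self.has_quorum:
            # CP choice: sacrifice availability during the partition.
            raise TimeoutError("cannot confirm stock level, try again later")
        if self.stock < quantity:
            return False         # consistent answer: genuinely out of stock
        self.stock -= quantity
        return True

inv = InventoryService(stock=2)
print(inv.reserve(1))    # True: stock consistently decremented to 1
inv.has_quorum = False   # a partition strikes
try:
    inv.reserve(1)
except TimeoutError as e:
    print(e)
```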

Scenario B: Product search (AP)

A customer searching for products can tolerate seeing a product that was just discontinued moments ago, or missing a product added seconds ago. An AP approach serves the request immediately from the nearest node. Consistency is relaxed in favor of responsiveness.

Choosing the Right Database#

  Use Case                       CAP Choice   Example Databases
  Banking / Payments             CP           PostgreSQL, CockroachDB
  User sessions                  AP           Redis (with replication), Cassandra
  Configuration / coordination   CP           etcd, Zookeeper
  Analytics / logs               AP           Cassandra, DynamoDB
  E-commerce catalog             AP           DynamoDB, MongoDB
  Order management               CP           PostgreSQL, CockroachDB

Best Practices#

  • Understand the failure modes of each system before choosing CP or AP.
  • Use CP systems at transactional boundaries and AP systems at query/read boundaries.
  • Design for eventual consistency explicitly — define how conflicts are resolved.
  • Test partition behavior in staging; chaos engineering tools like Chaos Monkey can simulate network splits.
  • Do not conflate availability with fault tolerance — AP systems can still fail; they just respond during partitions.

Conclusion#

The CAP Theorem is not a rigid rule but a framework for reasoning about trade-offs in distributed systems. In practice, partition tolerance is non-negotiable, leaving the choice between consistency and availability. The right choice depends on the domain: financial systems lean toward CP, while content delivery and user-facing reads lean toward AP. Understanding where each applies lets you compose systems that are both robust and appropriate for their use case.
