CAP Theorem Explained Simply: Consistency, Availability, Partition Tolerance
Understand CAP theorem for system design interviews. Learn consistency, availability, partition tolerance with real examples from DynamoDB, MongoDB, PostgreSQL, and practical trade-offs

The Impossible Triangle: Pick Two, You Can't Have Three
You want your distributed database to be:
- Consistent (all users see the same data)
- Available (system always responds to requests)
- Partition Tolerant (works even when network fails)
CAP Theorem says: You can only pick 2 out of 3.
Picture a triangle where you can't hold all three corners at once:
        Consistency
            /\
           /  \
          /    \
         /  CA  \
        /        \
       /----------\
Availability    Partition
                Tolerance
This isn't just a rule of thumb. Eric Brewer proposed it as a conjecture in 2000, and Seth Gilbert and Nancy Lynch proved it formally in 2002.
Instagram chooses CP (consistency over availability during network partitions). Amazon chooses AP (availability over consistency: you might see stale product info, but checkout always works).
There's no right answer. Only trade-offs.
In this guide, I'll explain CAP theorem simply, show real-world examples, and help you understand which databases choose what, and why. Let's dive in.
What Is CAP Theorem?
CAP Theorem = In a distributed system, you can guarantee at most 2 of 3 properties:
- Consistency (C)
- Availability (A)
- Partition Tolerance (P)
Origin: Conjectured by Eric Brewer (2000), formally proven by Seth Gilbert and Nancy Lynch (2002)
The Three Properties
1. Consistency (C)
Definition: All nodes see the same data at the same time
In practice: After a write, all subsequent reads return the updated value
Example:
Consistent system:
User updates profile picture
↓
Write to Database Node 1
↓
Synchronously replicate to Node 2, Node 3
↓
Once all nodes updated: Return success
↓
Next read from any node: Returns new picture ✅
Non-consistent (eventual consistency):
User updates profile picture
↓
Write to Database Node 1
↓
Asynchronously replicate to Node 2, Node 3 (takes 100ms)
↓
Return success immediately
↓
Read from Node 2 (before replication): Returns old picture ❌
Wait 100ms, read again: Returns new picture ✅
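The two flows above can be sketched with a toy in-memory store. This is only an illustration, not how any real database replicates: the `Replica` and `ReplicatedStore` classes and the `replicate_pending` method are hypothetical names, and the asynchronous delay is modeled as a queue that is drained by hand.

```python
class Replica:
    """A single in-memory node holding one copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Toy store: node 0 takes writes, the other nodes are replicas."""
    def __init__(self, nodes, synchronous):
        self.nodes = nodes
        self.synchronous = synchronous
        self.pending = []  # replication work not yet applied (async mode only)

    def write(self, key, value):
        self.nodes[0].data[key] = value
        for replica in self.nodes[1:]:
            if self.synchronous:
                replica.data[key] = value                   # copy before acknowledging
            else:
                self.pending.append((replica, key, value))  # copy later
        return "success"

    def replicate_pending(self):
        """Stand-in for background replication that runs after the ack."""
        for replica, key, value in self.pending:
            replica.data[key] = value
        self.pending.clear()


def make_store(synchronous):
    nodes = [Replica("node1"), Replica("node2"), Replica("node3")]
    for node in nodes:
        node.data["picture"] = "old.png"
    return ReplicatedStore(nodes, synchronous)


sync_store = make_store(synchronous=True)
sync_store.write("picture", "new.png")
sync_read = sync_store.nodes[1].data["picture"]      # every node already has the new value

async_store = make_store(synchronous=False)
async_store.write("picture", "new.png")
stale_read = async_store.nodes[1].data["picture"]    # replica not updated yet
async_store.replicate_pending()
fresh_read = async_store.nodes[1].data["picture"]    # replica has caught up
```

The only difference between the two modes is when the acknowledgment is returned relative to the copy, which is exactly the consistency versus latency decision.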
Real Example: Banking
You transfer $100 from Account A to Account B:
Consistent:
Account A: $500 → $400 (immediate)
Account B: $200 → $300 (immediate)
Any ATM, any location: Shows $400 and $300 ✅
Eventual consistency:
Account A: $500 → $400 (immediate)
Account B: $200 → $300 (after 1 second)
ATM on East Coast: Shows $400 and $200 (inconsistent!) ❌
1 second later: Shows $400 and $300 ✅
For banking, consistency is mandatory.
2. Availability (A)
Definition: Every request sent to a non-failing node receives a response (not a timeout or a refusal), without a guarantee that it reflects the most recent write
In practice: System is always up, always responding (even if data is stale)
Example:
Available system:
User requests profile
↓
Server 1 is down ❌
↓
Request routed to Server 2
↓
Server 2 responds (maybe with cached/stale data)
↓
User gets a response ✅
(even if not perfectly up-to-date)
Unavailable system:
User requests profile
↓
Server 1 is down ❌
↓
System requires Server 1 for consistency
↓
Return error: "Service unavailable" ❌
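As a rough sketch of the difference, the snippet below contrasts a read path that fails over to any reachable server with one that insists on a specific server. The `Server` class, `ServerDown` exception, and both read functions are hypothetical names made up for this illustration.

```python
class ServerDown(Exception):
    """Raised when a server cannot be reached."""


class Server:
    def __init__(self, name, profile, up=True):
        self.name = name
        self.profile = profile
        self.up = up

    def get_profile(self):
        if not self.up:
            raise ServerDown(self.name)
        return self.profile


def read_with_failover(servers):
    """Available: return the first answer from any reachable server (may be stale)."""
    for server in servers:
        try:
            return server.get_profile()
        except ServerDown:
            continue
    raise ServerDown("no server reachable")


def read_requiring_primary(servers):
    """Unavailable when the first server is down: only it may answer."""
    return servers[0].get_profile()


servers = [
    Server("server1", {"picture": "new.png"}, up=False),
    Server("server2", {"picture": "old.png"}),  # lagging copy
]

available_answer = read_with_failover(servers)
try:
    strict_answer = read_requiring_primary(servers)
except ServerDown:
    strict_answer = "Service unavailable"
```

The failover path returns the older picture rather than an error, which is the trade the available system makes.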
Real Example: Amazon Shopping
Black Friday, millions of users:
Available system:
- Show product info (even if stock count slightly stale)
- Allow adding to cart
- Process checkout
- Better to show "possibly 5 left" than "Site down"
Unavailable system:
- Wait for perfect stock count from all warehouses
- If one warehouse offline: Show "Service unavailable"
- Lose sales ❌
Amazon chooses availability (you might see stale stock, but site works).
3. Partition Tolerance (P)
Definition: System continues to operate despite network failures (network partition)
Network partition = Servers can't communicate (due to network failure, router issue, cable cut)
Example:
Partition-tolerant system:
[Node 1 - US East]
⚡ Network partition! ⚡
[Node 2 - US West]
Partition occurs (network failure)
↓
Node 1 and Node 2 can't communicate
↓
Each node follows a defined rule: keep serving requests independently (AP) or refuse them to protect consistency (CP) ✅
↓
When network heals: Reconcile differences
Not partition-tolerant:
[Node 1 - US East]
⚡ Network partition! ⚡
[Node 2 - US West]
Partition occurs
↓
System assumes every node can always reach every other node, so it has no safe way to continue
↓
Users see errors or incorrect results ❌
Reality: In distributed systems, network failures are inevitable.
- Cables get cut
- Routers fail
- Data centers lose connectivity
- Cloud providers have outages
Therefore: Partition tolerance is NOT optional. You must handle it.
The Real Choice: CP vs AP
Since partition tolerance is mandatory (networks fail), the real choice is:
CP (Consistency + Partition Tolerance) vs AP (Availability + Partition Tolerance)
CP Systems (Consistency + Partition Tolerance)
Trade-off: Sacrifice availability during network partitions
During partition:
Network partition occurs
↓
System cannot guarantee consistency across nodes
↓
Reject writes (or reads) until partition heals
↓
Users see errors ❌ (temporarily unavailable)
Once partition heals:
Nodes reconnect
↓
Data consistent across all nodes ✅
↓
System resumes normal operation
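One common way to implement "reject writes during a partition" is a majority check: a node accepts a write only if it can reach more than half of the cluster. The sketch below is a simplified illustration under that assumption. `CPNode` and its fields are hypothetical, and a real system would detect reachability through heartbeats rather than a manually set counter.

```python
class CPNode:
    """Toy node that refuses writes unless it can see a majority of the cluster."""
    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.reachable_peers = cluster_size - 1  # healthy network: all peers visible
        self.data = {}

    def has_majority(self):
        # Count this node plus the peers it can still reach.
        return (1 + self.reachable_peers) > self.cluster_size // 2

    def write(self, key, value):
        if not self.has_majority():
            raise RuntimeError(f"{self.name}: no majority, write rejected")
        self.data[key] = value
        return "ok"


isolated = CPNode("old-primary", cluster_size=3)
isolated.reachable_peers = 0          # cut off from both peers
majority_side = CPNode("secondary", cluster_size=3)
majority_side.reachable_peers = 1     # still sees one peer: 2 of 3 nodes

try:
    isolated.write("balance", 400)
    isolated_result = "ok"
except RuntimeError:
    isolated_result = "rejected"
majority_result = majority_side.write("balance", 400)
```

Because two disjoint groups can never both hold a majority, at most one side of a partition keeps accepting writes.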
CP Databases
1. MongoDB (CP-leaning)
How it works:
MongoDB Replica Set:
- 1 Primary (handles writes)
- 2 Secondaries (replicas)
Network partition:
- Primary becomes isolated (can't reach majority)
- Primary steps down (no longer accepts writes) ❌
- Secondaries on the majority side elect a new primary
- Writes go to new primary ✅
Result:
- Consistency: Reads from the primary return the latest acknowledged data
- Availability: While the election runs (roughly 10-30 seconds), writes are unavailable ❌
When to use: When consistency is critical (financial data, inventory)
2. HBase (CP)
Used by: Facebook Messages
During partition:
- Region server loses connection to master
- Stops serving requests until reconnected
- Ensures consistency
3. Redis (with sentinel, CP-leaning)
How it works:
Network partition:
- Master isolated
- Sentinels detect failure (quorum vote)
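- Promote a replica to master
- During failover (10-30 seconds): Writes unavailable ❌
Caveat: Redis replication is asynchronous, so writes acknowledged by the old master just before failover can be lost. Treat the CP label as approximate.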
Real-World Example: Bank Transfer (CP System)
Scenario: Transfer $100 from Account A to Account B
CP System (PostgreSQL with synchronous replication):
Step 1: Deduct $100 from Account A
Step 2: Add $100 to Account B
Step 3: Commit transaction (only if both nodes acknowledge)
Network partition occurs during Step 2:
- Node 2 unreachable
- Transaction cannot complete
- Rollback (Account A unchanged)
- User sees error: "Transaction failed, try again" ❌
Result:
- Consistent: Money not lost ✅
- Not available: Transaction couldn't complete ❌
Better than: Money disappearing!
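The all-or-nothing behavior in this scenario can be sketched in a few lines. This is not how PostgreSQL implements synchronous replication. It is a toy model in which the hypothetical `Ledger` class stands for one node's copy of the balances and `transfer` applies a change only if the replica can acknowledge it.

```python
class ReplicaUnreachable(Exception):
    """Raised when a node cannot acknowledge a change."""


class Ledger:
    """One node's copy of the account balances."""
    def __init__(self, balances, reachable=True):
        self.balances = dict(balances)
        self.reachable = reachable

    def apply(self, new_balances):
        if not self.reachable:
            raise ReplicaUnreachable()
        self.balances = dict(new_balances)


def transfer(primary, replica, src, dst, amount):
    """Commit on both nodes or on neither."""
    if primary.balances[src] < amount:
        raise ValueError("insufficient funds")
    proposed = dict(primary.balances)
    proposed[src] -= amount
    proposed[dst] += amount
    # Replicate first: if the replica cannot acknowledge, nothing has changed yet.
    replica.apply(proposed)
    primary.apply(proposed)


start = {"A": 500, "B": 200}
primary = Ledger(start)
replica = Ledger(start, reachable=False)  # partition: replica is cut off

try:
    transfer(primary, replica, "A", "B", 100)
    outcome = "committed"
except ReplicaUnreachable:
    outcome = "failed, try again"
balances_after_failure = dict(primary.balances)

replica.reachable = True                  # partition heals
transfer(primary, replica, "A", "B", 100)
```

During the partition the transfer fails and both accounts keep their original balances. After the partition heals, the retry succeeds on both nodes.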
AP Systems (Availability + Partition Tolerance)
Trade-off: Sacrifice consistency during network partitions
During partition:
Network partition occurs
↓
Nodes accept writes independently (can't sync)
↓
Data diverges (different values on different nodes) ⚠️
↓
Users get responses ✅
(but possibly stale data)
Once partition heals:
Nodes reconnect
↓
Reconcile differences (conflict resolution)
↓
Eventually consistent ✅
AP Databases
1. Cassandra (AP)
How it works:
Cassandra Cluster: 10 nodes
Network partition:
- Nodes 1-5 in US East (can't reach US West)
- Nodes 6-10 in US West (can't reach US East)
- Both sides continue accepting writes ✅
User A writes to US East: "status = online"
User B writes to US West: "status = offline"
Result: Conflicting data ⚠️
When partition heals:
- Cassandra uses "last write wins" (timestamp-based)
- Conflict resolved: the write with the newer timestamp is kept, the other is discarded
Result:
- Available: Always accepting requests ✅
- Consistent: During partition, data diverges ❌ (eventually consistent)
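A minimal sketch of timestamp-based "last write wins" reconciliation follows. The `Versioned` type and `last_write_wins` function are hypothetical names for illustration, and the tie-break on value is an assumption chosen so that every node picks the same winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # write time, for example microseconds since the epoch


def last_write_wins(a, b):
    """Keep the version with the newer timestamp.

    Ties break on the value itself so the result does not depend on
    argument order (an assumption made for this sketch).
    """
    return max(a, b, key=lambda v: (v.timestamp, v.value))


east = Versioned("online", timestamp=1_000)    # written in US East
west = Versioned("offline", timestamp=1_005)   # written in US West, slightly later

merged = last_write_wins(east, west)
merged_reversed = last_write_wins(west, east)
```

Both regions converge on the later write, and the earlier one is dropped without any error being reported to the user who made it. That silent loss is the price of this strategy, and it also means clock skew between nodes can decide which write survives.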
When to use: When availability is critical (social media, analytics, logging)
2. DynamoDB (AP)
Used by: Amazon shopping cart (the use case described in Amazon's Dynamo paper, the design that inspired DynamoDB)
Why AP?
Black Friday: 100M users shopping
Network partition occurs (between US East and US West)
AP system:
- Both regions continue working ✅
- User in US East adds item to cart → Written to US East
- User in US West removes item → Written to US West
- Conflicting cart states ⚠️
When partition heals:
- Merge carts (union of items)
- Better to show extra item than lose sale ✅
Amazon's philosophy: Availability = revenue. Slight inconsistency acceptable.
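The "union of items" merge can be sketched as below. This is an illustrative model, not Amazon's implementation: `merge_carts` is a hypothetical function, carts are plain dictionaries of item to quantity, and keeping the larger quantity for shared items is an assumption.

```python
def merge_carts(cart_a, cart_b):
    """Union of items; for items present in both carts keep the larger quantity."""
    merged = dict(cart_a)
    for item, quantity in cart_b.items():
        merged[item] = max(merged.get(item, 0), quantity)
    return merged


east_cart = {"book": 1, "headphones": 1}  # US East still has the headphones
west_cart = {"book": 1}                   # US West recorded their removal

merged = merge_carts(east_cart, west_cart)
```

The merged cart contains the headphones again: a union cannot tell "removed on one side" apart from "never added on that side", so deletions can resurface. That matches the trade-off above, where an extra item is preferred over a lost one.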
3. CouchDB (AP)
How it works:
- Multi-master replication
- All nodes accept writes
- Conflicts resolved via versioning
Real-World Example: Social Media (AP System)
Scenario: User posts a status update
AP System (Facebook):
User posts: "Just got engaged!"
↓
Write to Database Node 1 (US East)
↓
Asynchronously replicate to Node 2 (US West)
↓
Immediately show post to user ✅
Network partition occurs:
↓
Replication delayed (500ms)
↓
Friend in US West views profile: Doesn't see post yet ⚠️
↓
500ms later: Post appears ✅
Result:
- Available: User could post ✅
- Consistent: Brief delay for global sync ⚠️ (eventual consistency)
Acceptable trade-off for social media. No one dies if they see a post 500ms late.
CA Systems (Consistency + Availability)
Wait, what about CA (Consistency + Availability without Partition Tolerance)?
Theoretical: Single-node databases (no network = no partitions)
Examples:
- PostgreSQL (single instance)
- MySQL (single instance)
Reality: Not realistic for large-scale systems
- Single node = single point of failure
- Can't scale horizontally
- No geographic distribution
Used for: Small applications, development environments
CAP Theorem in Practice: Database Choices
| Database | CAP Type | Real Use Case |
|---|---|---|
| PostgreSQL (single) | CA | Small apps, single data center |
| PostgreSQL (replicated) | CP-leaning | Banking, e-commerce (consistency critical) |
| MongoDB | CP-leaning | Financial data, inventory |
| Cassandra | AP | Analytics, social media, logging |
| DynamoDB | AP | Shopping cart, session storage |
| Redis | CP-leaning (with Sentinel) | Cache, real-time data |
| CouchDB | AP | Mobile apps, offline-first apps |
How Systems Handle Partitions
CP System: Refuse Requests
MongoDB:
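    from pymongo import MongoClient
    from pymongo.errors import NotPrimaryError  # replaces NotMasterError (PyMongo 3.12+)

    db = MongoClient("mongodb://localhost:27017")["app"]  # example URI and database name
    try:
        db.users.insert_one({"name": "John"})
    except NotPrimaryError:
        # This node is no longer the primary, so the write is refused
        print("Cannot write: primary unavailable")
        # Show error to user ❌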
User experience: "Service temporarily unavailable"
AP System: Accept Requests, Resolve Later
Cassandra:
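    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("app")  # example contact point and keyspace
    # With a low consistency level, the write succeeds even during a partition
    session.execute("INSERT INTO users (id, name) VALUES (1, 'John')")
    # Read might return stale data ⚠️
    result = session.execute("SELECT * FROM users WHERE id = 1")
    for row in result:
        print(row)  # Might not include the latest write yet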
User experience: Seems to work, but data might be stale temporarily
Eventual Consistency
Most AP systems use eventual consistency:
Definition: Given enough time (and no new writes), all nodes will converge to the same value
Example:
Write at t=0: "status = online" (Node 1)
Read at t=50ms: "status = offline" (Node 2, old value) ⚠️
Read at t=200ms: "status = online" (Node 2, updated) ✅
Time to consistency (rough figures; actual lag depends on load, topology, and configuration):
- Cassandra: 100-500ms
- DynamoDB: 1-2 seconds (default)
- Couchbase: Configurable
Trade-Off: Tunable Consistency
Some databases let you choose per-query:
Cassandra Consistency Levels
Write consistency:
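    from cassandra import ConsistencyLevel
    from cassandra.query import SimpleStatement

    # Assumes `session` is a connected Session and `query` is a CQL string.
    # Require acknowledgment from ALL replicas (strong consistency)
    session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ALL))  # CP-like
    # Require acknowledgment from 1 replica (high availability)
    session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ONE))  # AP-like
    # Require a quorum (majority of replicas)
    session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.QUORUM))  # Balanced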
Trade-off:
- ALL: Slow, consistent ✅, unavailable during partition ❌
- ONE: Fast ✅, available ✅, inconsistent ⚠️
- QUORUM: Balanced
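Why QUORUM is "balanced" comes down to simple arithmetic: with N replicas, a read is guaranteed to overlap the latest acknowledged write when the read and write acknowledgment counts add up to more than N. The helper names below are hypothetical and the snippet only does the arithmetic; it does not talk to a cluster.

```python
def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_acks + write_acks > replication_factor


def tolerated_failures(replication_factor, acks_required):
    """How many replicas can be unreachable while the operation still succeeds."""
    return replication_factor - acks_required


N = 3
acks = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

summary = {
    level: {
        "overlaps_when_used_for_reads_and_writes": reads_see_latest_write(N, count, count),
        "replicas_that_may_be_down": tolerated_failures(N, count),
    }
    for level, count in acks.items()
}
```

For three replicas, QUORUM reads and writes overlap while still surviving one unreachable replica, ALL overlaps but survives none, and ONE survives two but gives no overlap guarantee.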
System Design Interview Tips
Common Question: "Explain CAP theorem"
Good answer:
"CAP theorem states that in a distributed system, you can only guarantee 2 of 3:
- Consistency: All nodes see same data
- Availability: System always responds
- Partition tolerance: Works despite network failures
Since network failures are inevitable, the real choice is CP vs AP.
CP systems (MongoDB, HBase): Prioritize consistency, sacrifice availability during partitions
AP systems (Cassandra, DynamoDB): Prioritize availability, accept eventual consistency
Choice depends on use case:
- Banking: CP (consistency critical)
- Social media: AP (availability critical)"
Follow-Up: "Design a shopping cart"
Answer:
"I'd choose AP (like Amazon's DynamoDB):
Reason:
- Availability is critical (users must add to cart, even during partition)
- Slight inconsistency acceptable (if user adds item on phone and tablet simultaneously, merge both)
- Losing a sale due to "service unavailable" is worse than showing extra item
Trade-off:
- Possible duplicate items in cart (rare, user can remove)
- Revenue > perfect consistency"
What to Mention
✅ Network partitions are inevitable (not optional)
✅ CP vs AP depends on use case
✅ Give specific database examples (MongoDB = CP, Cassandra = AP)
✅ Explain trade-offs (consistency vs availability)
✅ Mention eventual consistency for AP systems
Avoid These Mistakes
❌ Saying "We'll use CA" (not realistic for distributed systems)
❌ Not explaining the trade-off (why CP or AP for your use case?)
❌ Claiming "NoSQL is faster" (it's about consistency model, not speed)
❌ Ignoring partition scenarios (always discuss what happens during network failure)
Beyond CAP: PACELC Theorem
CAP is simplified. Reality is more nuanced:
PACELC Theorem:
- If Partition (P): Choose Availability (A) or Consistency (C)
- Else (E): Choose Latency (L) or Consistency (C)
Meaning: Even without partitions, there's a trade-off between latency and consistency.
Example:
Strong consistency:
- Write to Node 1
- Wait for Node 2, Node 3 to acknowledge
- High latency (100ms) ⚠️
Eventual consistency:
- Write to Node 1
- Return immediately
- Low latency (10ms) ✅
- But temporary inconsistency
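The latency gap in this example can be reproduced with a back-of-the-envelope model. The function name and the millisecond figures are illustrative assumptions chosen to match the numbers above; replicas are assumed to be contacted in parallel, so the slowest acknowledgment dominates.

```python
def write_latency_ms(local_ms, replica_round_trips_ms, wait_for_replicas):
    """Estimated time until the client gets its acknowledgment."""
    if wait_for_replicas and replica_round_trips_ms:
        # Replicas are written in parallel: wait for the slowest one.
        return local_ms + max(replica_round_trips_ms)
    return local_ms


local_write = 10                 # Node 1 applies the write
round_trips = [40, 90]           # Node 2 and Node 3 acknowledgments

strong_ms = write_latency_ms(local_write, round_trips, wait_for_replicas=True)
eventual_ms = write_latency_ms(local_write, round_trips, wait_for_replicas=False)
```

With these assumed figures the strongly consistent write takes 100ms and the eventually consistent write takes 10ms, even though no partition is involved.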
Conclusion
CAP Theorem is fundamental to distributed systems:
- Consistency: All nodes see same data
- Availability: System always responds
- Partition Tolerance: Works despite network failures
The real choice: CP vs AP
CP (Consistency + Partition Tolerance):
- Sacrifice availability during partitions
- Use for: Banking, payments, inventory
- Examples: MongoDB, PostgreSQL (replicated), Redis
AP (Availability + Partition Tolerance):
- Sacrifice consistency (eventual consistency)
- Use for: Social media, shopping cart, analytics
- Examples: Cassandra, DynamoDB, CouchDB
The secret? There's no "best" choice. Only trade-offs based on your requirements.
Banking system? Choose CP (money can't disappear). Social media? Choose AP (users must always be able to post).
Master CAP theorem, and you'll make better database choices and ace system design interviews.