Database Design in System Design: SQL vs NoSQL, Sharding, Replication
Master database design for system design interviews. Learn SQL vs NoSQL, database sharding, replication, indexing with real examples from Facebook, Twitter, Uber

"NoSQL Is Always Faster Than SQL" β The Expensive Myth
I've heard this in dozens of system design interviews: "Let's use MongoDB because it's faster than PostgreSQL."
Wrong.
Instagram serves billions of photos on PostgreSQL. Uber tracks millions of rides on PostgreSQL. Discord handles billions of messages on... Cassandra? Not anymore: they migrated to ScyllaDB, and still keep critical data in PostgreSQL.
The truth? Database choice isn't about "faster" or "slower." It's about matching your data model, access patterns, and consistency requirements to the right storage system.
In this guide, I'll break down SQL vs NoSQL, when to use each, sharding, replication, and how companies like Facebook, Twitter, and Uber design their databases. Let's dive in.
Types of Databases
1. SQL (Relational) Databases
Examples: PostgreSQL, MySQL, Oracle, SQL Server
Data Model: Tables with rows and columns
Example:
-- Users table
users
+----+----------+------------------+
| id | username | email |
+----+----------+------------------+
| 1 | john_doe | john@example.com |
| 2 | jane_doe | jane@example.com |
+----+----------+------------------+
-- Posts table
posts
+----+---------+------------------+------------+
| id | user_id | content | created_at |
+----+---------+------------------+------------+
| 1 | 1 | Hello world! | 2025-01-01 |
| 2 | 1 | Learning SQL | 2025-01-02 |
| 3 | 2 | NoSQL vs SQL | 2025-01-03 |
+----+---------+------------------+------------+
-- Relationships
Foreign key: posts.user_id → users.id
Characteristics:
- ✅ ACID transactions (Atomicity, Consistency, Isolation, Durability)
- ✅ Structured schema (enforced data integrity)
- ✅ Complex queries with JOINs
- ✅ Mature ecosystem (40+ years old)
When to use:
- Financial systems (banking, payments)
- E-commerce (orders, inventory)
- Applications with complex relationships
- Need for strong consistency
Real Example: Instagram
Instagram uses PostgreSQL for:
- User profiles
- Relationships (followers/following)
- Posts metadata
- Comments
Why PostgreSQL?
- Complex queries (fetch all posts from followed users; see the sketch below)
- Transactional integrity (ensure consistency)
- Proven scalability (with sharding)
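To make that feed query concrete, here's a minimal runnable sketch in Python. It uses the standard library's sqlite3 as a stand-in for PostgreSQL; the schema and data are illustrative, not Instagram's actual tables.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER,
                        content TEXT, created_at TEXT);
    INSERT INTO users VALUES (1, 'john_doe'), (2, 'jane_doe');
    INSERT INTO follows VALUES (1, 2);  -- john follows jane
    INSERT INTO posts VALUES (1, 2, 'NoSQL vs SQL', '2025-01-03');
""")

# The feed query: every post authored by someone user 1 follows.
rows = conn.execute("""
    SELECT u.username, p.content, p.created_at
    FROM follows f
    JOIN posts p ON p.user_id = f.followee_id
    JOIN users u ON u.id = p.user_id
    WHERE f.follower_id = ?
    ORDER BY p.created_at DESC
""", (1,)).fetchall()

print(rows)  # [('jane_doe', 'NoSQL vs SQL', '2025-01-03')]
```

This is exactly the kind of multi-table JOIN that relational databases make trivial and that document stores force you to denormalize around.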
2. NoSQL Databases
"Not only SQL" β Multiple types:
a) Document Stores (MongoDB, Couchbase)
Data Model: JSON-like documents
Example:
{
  "_id": "user_123",
  "username": "john_doe",
  "email": "john@example.com",
  "profile": {
    "bio": "Software developer",
    "location": "San Francisco"
  },
  "posts": [
    {
      "content": "Hello world!",
      "created_at": "2025-01-01",
      "likes": 42
    }
  ]
}
Characteristics:
- ✅ Flexible schema (no predefined structure)
- ✅ Nested data (no JOINs needed)
- ✅ Horizontal scaling (sharding built-in)
When to use:
- Rapidly changing schema
- Nested/hierarchical data
- Content management systems
- Catalogs with varying attributes
Real Example: Facebook Messages
Facebook uses HBase for messaging (HBase is actually a column-family store, covered below, but it was chosen for the same schema-flexibility reasons):
- Billions of messages
- Schema flexibility (attachments, reactions, etc.)
- Write-heavy workload
b) Key-Value Stores (Redis, DynamoDB, Riak)
Data Model: Simple key → value
Example:
# Redis
"session:abc123" → {"user_id": 1, "logged_in": true}
"user:1:cart" → ["product_1", "product_2"]
"rate_limit:user_1" → 45
Characteristics:
- ✅ Extremely fast (in-memory)
- ✅ Simple operations (GET, SET, DELETE)
- ✅ TTL support (auto-expiration)
When to use:
- Caching
- Session storage
- Real-time analytics (counters, leaderboards)
- Rate limiting
Real Example: Twitter
Twitter uses Redis for:
- Timeline caching
- User session storage
- Rate limiting (API calls per user)
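As an illustration of that rate-limiting use case, here's a sketch of the classic fixed-window limiter built on Redis INCR + EXPIRE. This is the generic pattern, not Twitter's actual implementation; it assumes the redis-py package and a Redis server on localhost.

```python
import redis  # assumes the redis-py package and a local Redis server

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def allow_request(user_id: int, limit: int = 100, window_s: int = 60) -> bool:
    """Fixed-window rate limiter: at most `limit` calls per window."""
    key = f"rate_limit:user_{user_id}"
    count = r.incr(key)            # atomic increment; creates the key at 1
    if count == 1:
        r.expire(key, window_s)    # start the window on the first request
    return count <= limit

if allow_request(1):
    print("handle the API call")
else:
    print("429 Too Many Requests")
```

Because INCR is atomic, concurrent API servers can share one counter without coordination.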
c) Column-Family Stores (Cassandra, HBase, ScyllaDB)
Data Model: Rows with dynamic columns
Example:
Row key: user_123
| Column family: profile
|   name: "John Doe"
|   email: "john@example.com"
| Column family: posts
|   post_1: "Hello world!"
|   post_2: "Learning Cassandra"
Characteristics:
- ✅ Write-optimized (append-only)
- ✅ Massive scalability (petabytes)
- ✅ Tunable consistency
When to use:
- Time-series data
- High write throughput
- Analytics / logging
Real Example: Netflix
Netflix uses Cassandra for:
- Viewing history (billions of events)
- User preferences
- A/B testing data
Why Cassandra?
- Handles 1+ million writes per second
- Globally distributed (multi-region)
- Always available (no single point of failure)
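To show what time-series modeling looks like in a column-family store, here's a sketch using the DataStax cassandra-driver. The keyspace and table are illustrative, not Netflix's real schema, and it assumes a Cassandra node on localhost.

```python
# Sketch using the DataStax cassandra-driver package.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])   # assumes a local Cassandra node
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
# Partition by (user, day) so one user's day of events lives together;
# cluster by event time DESC so the most recent events read first.
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.viewing_history (
        user_id text, day date, event_time timestamp, show_name text,
        PRIMARY KEY ((user_id, day), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
""")
session.execute(
    "INSERT INTO demo.viewing_history (user_id, day, event_time, show_name) "
    "VALUES ('user_123', '2025-01-01', toTimestamp(now()), 'Stranger Things')"
)
```

Writes append to a partition, which is exactly why this model sustains very high write throughput.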
d) Graph Databases (Neo4j, Amazon Neptune)
Data Model: Nodes and edges (relationships)
Example:
(User: john_doe) -[:FOLLOWS]-> (User: jane_doe)
(User: john_doe) -[:LIKES]-> (Post: 123)
(Post: 123) -[:TAGGED]-> (Tag: #systemdesign)
When to use:
- Social networks (friends, followers)
- Recommendation engines
- Fraud detection
Real Example: LinkedIn
LinkedIn uses graph database concepts for:
- Connection recommendations ("People you may know")
- 2nd/3rd degree connections
- Shortest path between users
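Graph traversals like these are easy to prototype. Here's a tiny in-memory sketch (a plain adjacency list, not a real graph database) showing second-degree connections and BFS shortest path, with made-up usernames.

```python
from collections import deque

# Tiny in-memory adjacency list standing in for a graph database.
follows = {
    "john": {"jane", "amit"},
    "jane": {"amit", "lea"},
    "amit": {"lea"},
    "lea":  set(),
}

def second_degree(user: str) -> set[str]:
    """People followed by the people `user` follows, minus existing links."""
    direct = follows[user]
    fof = set().union(*(follows[f] for f in direct)) if direct else set()
    return fof - direct - {user}

def shortest_path(src: str, dst: str) -> int | None:
    """BFS hop count between two users; None if unreachable."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in follows.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(second_degree("john"))         # {'lea'} — "people you may know"
print(shortest_path("john", "lea"))  # 2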
SQL vs NoSQL: Decision Matrix
| Factor | SQL | NoSQL |
|---|---|---|
| Schema | Fixed, predefined | Flexible, dynamic |
| Transactions | Full ACID ✅ | Limited (eventual consistency) |
| Scalability | Vertical (harder to scale) | Horizontal (built-in) ✅ |
| Complex Queries | JOINs, aggregations ✅ | Limited support |
| Consistency | Strong ✅ | Eventual (tunable) |
| Maturity | 40+ years ✅ | ~15 years |
| Use Case | Banking, e-commerce | Social media, analytics |
Truth: Most large systems use both.
Database Scaling Techniques
1. Indexing
Problem: Query slow on large table
-- 10 million users
SELECT * FROM users WHERE email = 'john@example.com';
-- Scans entire table: 5 seconds ❌
Solution: Create index
CREATE INDEX idx_email ON users(email);
-- Now uses index: 5ms ✅
How indexes work:
Think of a book:
- Without index: Read every page to find "Chapter 5" (slow)
- With index: Look at table of contents, jump to page 87 (fast)
Trade-off: ✅ Faster reads, ❌ Slower writes (must update index)
Best practice: Index columns used in WHERE, ORDER BY, JOIN
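You can watch an index pay off with nothing but the standard library. Here's a sketch using sqlite3; absolute timings will vary by machine, but the before/after gap is the point.

```python
import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    ((f"user{i}@example.com",) for i in range(200_000)),
)

def timed_lookup() -> float:
    start = time.perf_counter()
    conn.execute("SELECT * FROM users WHERE email = ?",
                 ("user199999@example.com",)).fetchone()
    return time.perf_counter() - start

scan = timed_lookup()                    # full table scan
conn.execute("CREATE INDEX idx_email ON users(email)")
indexed = timed_lookup()                 # B-tree index lookup
print(f"scan: {scan*1000:.2f} ms, indexed: {indexed*1000:.4f} ms")
```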
Real Example: Facebook
Facebook heavily indexes:
- User IDs
- Email addresses
- Phone numbers
- Timestamps (for timeline queries)
2. Replication
Strategy: Copy data to multiple servers
Master-Slave Replication
          [Master DB]
       (Handles writes)
              |
      +-------+-------+
      |               |
 [Slave 1]       [Slave 2]
(Handles reads) (Handles reads)
How it works:
- All writes → Master
- Master replicates changes to Slaves
- Reads → Distributed across Slaves
Benefits:
- ✅ Scale reads (add more slaves)
- ✅ High availability (if master fails, promote a slave)
- ✅ Backup (slaves hold a full copy)
Drawbacks:
- ❌ Replication lag (slaves slightly behind master)
- ❌ Single write point (master is a bottleneck)
Real Example: Reddit
Reddit uses master-slave replication:
- Master: Handles posts, comments, votes (writes)
- Slaves (10+): Serve home feed, subreddit pages (reads)
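Here's a minimal sketch of how a client might route queries under master-slave replication, with hypothetical host names; real systems push this into drivers or proxies, which do it far more robustly.

```python
import itertools

# Client-side read/write routing sketch for primary-replica replication.
# Hosts are placeholders; a real client holds a connection pool per host.
class ReplicatedRouter:
    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin reads

    def route(self, sql: str) -> str:
        is_write = sql.lstrip().split()[0].upper() in {
            "INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"
        }
        return self.primary if is_write else next(self._replicas)

router = ReplicatedRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("INSERT INTO posts ..."))  # db-master
print(router.route("SELECT * FROM posts"))    # db-replica-1
print(router.route("SELECT * FROM users"))    # db-replica-2
```

Note the replication-lag caveat: right after a write, a replica may not have caught up, so systems often pin a user's reads to the master briefly to guarantee read-your-own-writes.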
Master-Master Replication
[Master 1] <--> [Master 2]
 (Writes)        (Writes)
Both databases accept writes, replicate to each other.
Benefits:
- ✅ No single write bottleneck
- ✅ High availability
Drawbacks:
- ❌ Write conflicts (complex to resolve)
- ❌ Consistency challenges
When to use: Multi-region applications needing low write latency everywhere
3. Sharding (Horizontal Partitioning)
Problem: Database too large for single server
Database: 1 TB
Single server: 256 GB RAM
   ↓
Cannot fit in memory
   ↓
Slow queries
Solution: Split data across multiple databases
Sharding Strategies
a) Range-Based Sharding
Shard 1: Users with ID 1 - 1,000,000
Shard 2: Users with ID 1,000,001 - 2,000,000
Shard 3: Users with ID 2,000,001 - 3,000,000
Pros:
- ✅ Simple to implement
- ✅ Easy to add a new shard (extend the range)
Cons:
- ❌ Uneven distribution (hotspots): Shard 1 might get 80% of the traffic
b) Hash-Based Sharding
import hashlib

def get_shard(user_id: str, num_shards: int) -> int:
    # Python's built-in hash() is randomized per process (PYTHONHASHSEED),
    # so use a deterministic hash for stable routing.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Examples (illustrative shard assignments)
# user_123 → Shard 1
# user_456 → Shard 3
# user_789 → Shard 2
Pros:
- ✅ Even distribution
- ✅ No hotspots
Cons:
- ❌ Adding shards requires re-sharding all data
- ❌ Range queries are difficult (data scattered)
c) Geographic Sharding
US users → US database
EU users → EU database
Asia users → Asia database
Pros:
- ✅ Low latency (data close to users)
- ✅ Regulatory compliance (GDPR)
Cons:
- ❌ Uneven load (if user distribution varies)
- ❌ Cross-region queries are expensive
Real Example: Uber
Uber shards by city:
- New York rides → New York database
- San Francisco rides → SF database
- London rides → London database
Why?
- Rides are geographically bounded
- Most queries within single city
- Rare cross-city queries
Sharding Challenges
Problem 1: Joins across shards
-- User in Shard 1, Posts in Shard 2
SELECT users.name, posts.content
FROM users JOIN posts ON users.id = posts.user_id;
-- Requires fetching data from multiple shards ❌
Solution: Denormalize (duplicate data to avoid cross-shard queries)
Problem 2: Auto-increment IDs
-- Shard 1 generates: 1, 2, 3, ...
-- Shard 2 generates: 1, 2, 3, ... (collision!)
Solutions:
- Use UUID (globally unique, but large)
- Twitter Snowflake (timestamp + machine ID + sequence; sketched below)
- Instagram ID generation (timestamp + shard ID)
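Here's a compact Snowflake-style generator sketch in Python. The bit layout (41-bit timestamp, 10-bit machine ID, 12-bit sequence) follows Twitter's published design; the custom epoch and class name are illustrative.

```python
import threading, time

EPOCH_MS = 1735689600000  # 2025-01-01 UTC, an arbitrary custom epoch

class Snowflake:
    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024          # must fit in 10 bits
        self.machine_id = machine_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12 bits
                if self.sequence == 0:          # sequence exhausted:
                    while now <= self.last_ms:  # spin to the next ms
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                self.sequence = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.sequence

gen = Snowflake(machine_id=7)
print(gen.next_id(), gen.next_id())  # unique and roughly time-ordered
```

Because the timestamp occupies the high bits, IDs sort by creation time, which range-based systems exploit.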
Data Storage Types
1. OLTP (Online Transaction Processing)
Purpose: Handle day-to-day transactions
Characteristics:
- Fast writes/reads
- Small queries (fetch single user, update order)
- High concurrency (thousands of queries/sec)
Examples: User profiles, orders, inventory
Databases: MySQL, PostgreSQL, MongoDB
Real Example: Amazon Orders
When you place an order:
BEGIN TRANSACTION;
INSERT INTO orders (user_id, total) VALUES (123, 49.99);
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 789;
INSERT INTO payments (order_id, status) VALUES (456, 'pending');
COMMIT;
Fast, transactional, consistent.
2. OLAP (Online Analytical Processing)
Purpose: Analyze historical data
Characteristics:
- Complex queries (aggregations, JOINs)
- Large datasets (billions of rows)
- Read-heavy
Examples: Business intelligence, reports, dashboards
Databases: Snowflake, Google BigQuery, Amazon Redshift
Real Example: Netflix Analytics
Netflix analyzes billions of viewing events:
-- "What shows are most popular in India?"
SELECT show_name, COUNT(*) as views
FROM viewing_history
WHERE country = 'India' AND date >= '2025-01-01'
GROUP BY show_name
ORDER BY views DESC
LIMIT 10;
The query might take 30 seconds, but that's okay: it's not real-time.
3. Data Warehouse
Purpose: Central repository for analytics
Architecture:
              [Data Warehouse]
                     ↑
      +--------------+-------------+
      |              |             |
[OLTP DB 1]    [OLTP DB 2]      [Logs]
(Products)      (Users)        (Events)
Extract data from multiple sources, transform, load into warehouse (ETL).
Real Example: Airbnb
Airbnb consolidates:
- Booking data (PostgreSQL)
- Search logs (Elasticsearch)
- User behavior (event tracking)
Into a central data warehouse for analytics, queried with Presto (a distributed SQL engine).
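A toy ETL step makes "extract, transform, load" concrete: pull rows from an OLTP source, aggregate, and load the result into a warehouse table. Both sides are SQLite stand-ins here, and all names are made up.

```python
import sqlite3

# Toy ETL sketch: source and warehouse are both in-memory SQLite stand-ins.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

source.execute("CREATE TABLE bookings (id INTEGER, city TEXT, amount REAL)")
source.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [(1, "Paris", 120.0), (2, "Paris", 80.0), (3, "Tokyo", 200.0)],
)

warehouse.execute(
    "CREATE TABLE revenue_by_city (city TEXT, revenue REAL)")

# Extract + transform: aggregate in the source; load into the warehouse.
rows = source.execute(
    "SELECT city, SUM(amount) FROM bookings GROUP BY city").fetchall()
warehouse.executemany("INSERT INTO revenue_by_city VALUES (?, ?)", rows)

print(warehouse.execute("SELECT * FROM revenue_by_city").fetchall())
# [('Paris', 200.0), ('Tokyo', 200.0)]
```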
4. Data Lake
Purpose: Store raw, unstructured data
Difference from Data Warehouse:
- Data Lake: Store everything (unprocessed)
- Data Warehouse: Store structured, processed data
Real Example: Facebook
Facebook stores:
- Photos (billions)
- Videos
- Logs
- Everything
In data lake (Hadoop), process later as needed.
CAP Theorem (Briefly)
CAP Theorem: You can only choose 2 out of 3:
- Consistency: All nodes see same data
- Availability: Every request gets response
- Partition Tolerance: System works despite network failures
Choices:
- CP (Consistency + Partition Tolerance): MongoDB, HBase
- AP (Availability + Partition Tolerance): Cassandra, DynamoDB
- CA (Consistency + Availability): Not realistic (networks fail)
Most systems: Since real networks do partition, the practical choice is CP vs AP, made per use case.
System Design Interview Tips
Common Question: "Design Twitter"
Database Strategy:
1. User Profile: PostgreSQL
- Structured data
- Relationships (followers)
- ACID transactions
2. Tweets: Cassandra
- Massive write volume (6,000 tweets/sec)
- Time-series data
- No complex queries
3. Timeline Cache: Redis
- Pre-computed timelines
- Fast access
- TTL = 5 minutes
4. Analytics: Data Warehouse (BigQuery)
- Aggregate metrics
- Historical trends
Sharding:
Users: Hash-based sharding by user_id
Tweets: Range-based sharding by timestamp (keeps recent tweets together, though the newest shard runs hot)
Replication:
Master-slave replication:
- 1 master (writes)
- 5 slaves (reads)
- Geographic distribution
What to Mention
- ✅ SQL for structured data, NoSQL for unstructured/high-scale
- ✅ Replication for read scaling
- ✅ Sharding for write scaling
- ✅ Indexes for query optimization
- ✅ Cache layer (Redis) for hot data
Avoid These Mistakes
β "Let's use NoSQL because it's faster" (no justification) β Not discussing replication/sharding β Ignoring consistency requirements β Not mentioning indexes
Practical Tips
1. Start with SQL
Unless you have a specific reason (massive scale, flexible schema), start with PostgreSQL or MySQL.
Why?
- Mature, well-understood
- Rich querying capabilities
- Most applications fit relational model
Uber, Instagram, and Reddit all started with SQL.
2. Denormalize for Performance
Normalized (relational):
-- Fetch user's tweet with retweet count: 2 queries
SELECT * FROM tweets WHERE id = 123;
SELECT COUNT(*) FROM retweets WHERE tweet_id = 123;
Denormalized (fast):
tweets
+----+---------+----------------+
| id | content | retweet_count |
+----+---------+----------------+
| 123| Hello | 42 |
+----+---------+----------------+
-- Fetch in 1 query
SELECT * FROM tweets WHERE id = 123;
Trade-off: Update retweet_count on every retweet (extra write), but reads are faster.
3. Cache Aggressively
Pattern:
1. Check Redis cache
2. If miss: Query database
3. Store result in cache (TTL = 5 min)
4. Return data
Result: 90%+ requests served from cache (< 1ms), 10% from database (100ms)
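Here is the cache-aside pattern above in sketch form. A plain dict plays the role of Redis so the snippet is self-contained; with redis-py you'd call r.get and r.set(key, value, ex=300) instead. The TTL and key names are illustrative.

```python
import json, time

# In-process dict standing in for Redis: key -> (expiry_timestamp, payload)
_cache: dict[str, tuple[float, str]] = {}
TTL_S = 300  # 5 minutes

def get_user(user_id: int, db_fetch) -> dict:
    key = f"user:{user_id}"
    hit = _cache.get(key)
    if hit and hit[0] > time.time():          # 1. check cache (and TTL)
        return json.loads(hit[1])
    user = db_fetch(user_id)                  # 2. miss: query the database
    _cache[key] = (time.time() + TTL_S,       # 3. store result with TTL
                   json.dumps(user))
    return user                               # 4. return data

print(get_user(1, lambda uid: {"id": uid, "name": "john_doe"}))  # miss
print(get_user(1, lambda uid: {"id": uid, "name": "john_doe"}))  # hit
```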
4. Monitor Query Performance
Use:
- PostgreSQL: EXPLAIN ANALYZE
- MySQL: EXPLAIN
- MongoDB: .explain()
Find slow queries, add indexes.
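For a self-contained demo of plan inspection (SQLite via the standard library; Postgres and MySQL output looks different), EXPLAIN QUERY PLAN shows a full scan turning into an index search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# EXPLAIN QUERY PLAN is SQLite's analogue of EXPLAIN / EXPLAIN ANALYZE.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)   # detail column typically reads 'SCAN users': full table scan

conn.execute("CREATE INDEX idx_email ON users(email)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)   # now 'SEARCH users USING INDEX idx_email (email=?)'
```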
Conclusion
Database design is the foundation of scalable systems:
- SQL: Structured data, transactions, complex queries
- NoSQL: Flexibility, horizontal scaling, high throughput
- Replication: Scale reads, high availability
- Sharding: Scale writes, massive data
- Indexing: Fast queries
Key takeaway: There's no "best" database. Match your data model and access patterns to the right storage system.
Instagram uses PostgreSQL. Facebook uses MySQL, Cassandra, HBase, and more. Twitter uses MySQL, Cassandra, and Redis.
Use the right tool for the job. Master the fundamentals, and you'll ace database design questions in system design interviews, and build systems that scale to billions of users.