System Design Interview Prep: Complete Guide with 20+ Questions
Ace system design interviews at FAANG. Learn a systematic approach, common questions (URL shortener, Twitter, Instagram), and the real interview frameworks used at Google, Amazon, and Facebook.

"How Would You Design Twitter?"
Interviewer: "You have 45 minutes. Go."
Candidate A: "Um... we need a database for tweets... and... users... maybe MongoDB? And... AWS?"
(Interview ends poorly)
Candidate B: "Before I start, let me clarify requirements. Are we designing the core Twitter functionalityβposting tweets, following users, and viewing timeline? Or including features like DMs, trending topics, and notifications?"
Interviewer: "Let's focus on core functionality."
Candidate B: "Great. Let me estimate scale. Twitter has 500M users, 200M daily active. That's roughly 6,000 tweets per second on average, with peaks at 20,000 during events. Storage: 500M users Γ 1KB profile = 500GB. 500B tweets historical at 1KB each = 500TB..."
(Interview going well)
The difference? Candidate B has a framework.
System design interviews aren't about memorizing solutions. They're about demonstrating structured thinking, understanding trade-offs, and communicating your thought process clearly.
In this guide, I'll give you the exact framework used by successful candidates at Google, Amazon, Facebook, and Netflix, plus 20+ practice questions with approaches. Let's begin.
What Are System Design Interviews?
System Design Interview = Design a large-scale system (Twitter, Uber, YouTube) in 45-60 minutes
Not coding (though you may sketch APIs, data models)
What interviewers evaluate:
- Structured thinking (do you have a systematic approach?)
- Trade-offs (can you justify decisions?)
- Scalability (can system handle millions of users?)
- Communication (can you explain clearly?)
- Depth (do you understand technologies you mention?)
Common at:
- Google (L4+)
- Amazon (SDE2+)
- Facebook/Meta (E4+)
- Netflix
- Uber
- Airbnb
The Framework (SNAKE Method)
S - Scope and requirements
N - Numbers (scale estimation)
A - API design
K - Key components
E - Evolution (scaling, trade-offs)
Let's break down each step.
Step 1: Scope and Requirements (5 minutes)
DON'T: Jump into designing immediately
DO: Clarify what you're building
Questions to Ask
Functional Requirements:
"What features should we support?"
"Are we building a mobile app, web app, or both?"
"What's the core functionality vs nice-to-have?"
Non-Functional Requirements:
"How many users?"
"Read-heavy or write-heavy?"
"Consistency or availability more important?"
"Latency requirements?"
"Geographic distribution?"
Out of Scope (Explicitly state):
"I'll focus on core functionality and skip [feature X] for time. We can discuss later if needed."
Example: "Design Twitter"
Good clarification:
Candidate: "Let me clarify scope. Should I focus on:
- Posting tweets
- Following users
- Viewing home timeline
And skip for now:
- Direct messages
- Trending topics
- Notifications
Is that correct?"
Interviewer: "Yes, perfect."
This shows: ✅ You don't make assumptions ✅ You manage scope ✅ You prioritize
Step 2: Numbers / Scale Estimation (5 minutes)
Calculate:
- Daily Active Users (DAU)
- Requests per second (read & write)
- Storage requirements
- Bandwidth
Example: "Design Twitter"
Given:
- 500M total users
- 200M daily active users (DAU)
- Each user posts 2 tweets/day on average
- Each user visits 10 times/day, fetches 50 tweets per visit
Calculations:
Tweets per second:
200M DAU × 2 tweets/day = 400M tweets/day
400M / 86,400 seconds = 4,630 tweets/sec (average)
Peak (3x average) = ~14,000 tweets/sec
Timeline reads per second:
200M DAU × 10 visits/day = 2B timeline fetches/day
2B / 86,400 = 23,148 reads/sec
Peak = ~70,000 reads/sec
Read/Write Ratio:
23,148 reads : 4,630 writes ≈ 5:1
(Read-heavy system!)
Storage (10 years):
400M tweets/day × 365 days × 10 years = 1.46 trillion tweets
Each tweet: 140 chars + metadata ≈ 1KB
1.46T × 1KB = 1.46 PB (petabytes)
Bandwidth:
Writes: 4,630 tweets/sec × 1KB = 4.6 MB/sec
Reads (peak): 70,000 reads/sec × 50 tweets × 1KB = 3.5 GB/sec
Summary (write on whiteboard):
- Write QPS: ~5K (peak: ~14K)
- Read QPS: ~23K (peak: ~70K)
- Storage: ~1.5 PB (10 years)
- Read-heavy system (~5:1 ratio)
Why this matters:
- Read-heavy → need caching and replication
- ~5K writes/sec → a single database can handle it (no immediate sharding needed)
- ~70K reads/sec at peak → need a cache layer plus multiple read replicas
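If you want to sanity-check the arithmetic, the whole estimation fits in a few lines. This is a rough sketch; the constants are simply the assumptions stated above.

```python
# Back-of-envelope numbers from the assumptions above; tweak the inputs and
# the QPS/storage figures follow.
DAU = 200_000_000
TWEETS_PER_USER_PER_DAY = 2
VISITS_PER_USER_PER_DAY = 10
TWEET_SIZE_BYTES = 1_000          # ~1 KB per tweet including metadata
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3

write_qps = DAU * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY      # ~4,630
read_qps = DAU * VISITS_PER_USER_PER_DAY / SECONDS_PER_DAY       # ~23,150
storage_bytes_10y = DAU * TWEETS_PER_USER_PER_DAY * 365 * 10 * TWEET_SIZE_BYTES

print(f"Write QPS: {write_qps:,.0f} (peak ~{write_qps * PEAK_FACTOR:,.0f})")
print(f"Read QPS:  {read_qps:,.0f} (peak ~{read_qps * PEAK_FACTOR:,.0f})")
print(f"Storage over 10 years: {storage_bytes_10y / 1e15:.2f} PB")
```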
Step 3: API Design (5 minutes)
Define key APIs (REST-like, simplified)
Example: "Design Twitter"
API 1: Post Tweet
POST /tweets
Request:
{
"user_id": "123",
"content": "Hello world!",
"timestamp": "2025-10-28T10:00:00Z"
}
Response:
{
"tweet_id": "456",
"status": "success"
}
API 2: Get User Timeline
GET /timeline?user_id=123&limit=50
Response:
{
"tweets": [
{
"tweet_id": "456",
"user_id": "789",
"content": "Tweet content",
"timestamp": "2025-10-28T09:55:00Z",
"likes": 42,
"retweets": 10
},
...
]
}
API 3: Follow User
POST /follow
{
"follower_id": "123",
"followee_id": "789"
}
Keep it simple! Don't over-design APIs.
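If you want to make these contracts concrete for yourself while practicing, a toy in-memory sketch is enough. Flask is just an assumed example framework here; a dict and a set stand in for the real data stores.

```python
# Toy in-memory implementation of the three endpoints (illustrative only).
import time
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
tweets = {}      # tweet_id -> tweet record
follows = set()  # (follower_id, followee_id) pairs

@app.post("/tweets")
def post_tweet():
    body = request.get_json()
    tweet_id = str(uuid.uuid4())
    tweets[tweet_id] = {
        "tweet_id": tweet_id,
        "user_id": body["user_id"],
        "content": body["content"],
        "timestamp": time.time(),  # server-assigned timestamp
    }
    return jsonify({"tweet_id": tweet_id, "status": "success"})

@app.get("/timeline")
def get_timeline():
    user_id = request.args.get("user_id")
    limit = int(request.args.get("limit", 50))
    followees = {f for (u, f) in follows if u == user_id}
    timeline = [t for t in tweets.values() if t["user_id"] in followees]
    timeline.sort(key=lambda t: t["timestamp"], reverse=True)
    return jsonify({"tweets": timeline[:limit]})

@app.post("/follow")
def follow():
    body = request.get_json()
    follows.add((body["follower_id"], body["followee_id"]))
    return jsonify({"status": "success"})
```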
Step 4: Key Components (20 minutes)
This is the core of the interview.
High-Level Architecture
Start with simple diagram:
[Client]
↓
[Load Balancer]
↓
[API Servers]
↓
[Cache] ↔ [Database]
Example: "Design Twitter"
Component 1: API Servers
- Handle incoming requests
- Stateless (can scale horizontally)
- Node.js or Go (async I/O)
Component 2: Database
- User data: PostgreSQL (relational)
- users table (id, username, email)
- followers table (follower_id, followee_id)
- Tweets: Cassandra (write-heavy, time-series)
- tweets table (tweet_id, user_id, content, timestamp)
Why Cassandra for tweets?
✅ High write throughput (5K writes/sec)
✅ Time-series data (query by timestamp)
✅ Easy to scale horizontally
Component 3: Cache (Redis)
- Cache user timelines
- Key: "timeline:{user_id}"
- Value: List of 50 recent tweet IDs
- TTL: 5 minutes
Why cache?
✅ Read-heavy (70K reads/sec at peak)
✅ Timeline generation is expensive
✅ Cache hit rate of 90%+ → 10x fewer DB queries
Component 4: Timeline Generation Service
When user requests timeline:
1. Check Redis cache
2. If hit: Return cached tweets (< 10ms)
3. If miss:
- Fetch follower list from DB
- Fetch recent tweets from those users
- Merge and sort by timestamp
- Cache result
- Return (200-500ms)
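Here's a sketch of that read path, assuming redis-py for the cache. get_followee_ids() and get_recent_tweets() are hypothetical helpers standing in for the follower-graph and tweet-store queries.

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)
TIMELINE_TTL_SECONDS = 300  # 5-minute TTL, as above

def get_timeline(user_id: str, limit: int = 50) -> list:
    key = f"timeline:{user_id}"

    # 1-2. Cache hit: return the pre-computed timeline (< 10 ms).
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 3. Cache miss: rebuild from the databases (200-500 ms).
    followees = get_followee_ids(user_id)            # hypothetical: follower graph (PostgreSQL)
    tweets = []
    for followee in followees:
        tweets.extend(get_recent_tweets(followee))   # hypothetical: tweet store (Cassandra)

    # Merge, sort newest-first, and keep the most recent `limit` tweets.
    tweets.sort(key=lambda t: t["timestamp"], reverse=True)
    timeline = tweets[:limit]

    # 4. Cache the result for subsequent reads. For simplicity this caches the
    # serialized tweets; the design above caches a list of tweet IDs instead.
    cache.set(key, json.dumps(timeline), ex=TIMELINE_TTL_SECONDS)
    return timeline
```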
Component 5: Load Balancer
- Distribute traffic across API servers
- Algorithm: Least connections
- Health checks every 10 seconds
Component 6: CDN
- Profile images
- Tweet images/videos
- Serves from edge locations
Data Flow
Post Tweet:
1. User posts tweet → Load Balancer → API Server
2. API Server writes to Cassandra (tweet data)
3. API Server publishes event to message queue (fan-out)
4. Background workers update followers' cached timelines
5. Return success to user
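A sketch of steps 3-4 follows, using a Redis list as a stand-in for the message queue (a real system would likely use Kafka or similar). get_follower_ids() is a hypothetical helper.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def publish_tweet_event(tweet: dict) -> None:
    # 3. API server enqueues a fan-out event after persisting the tweet.
    r.rpush("fanout_queue", json.dumps(tweet))

def fanout_worker() -> None:
    # 4. Background worker pushes the new tweet ID onto each follower's cached timeline.
    while True:
        _, raw = r.blpop("fanout_queue")
        tweet = json.loads(raw)
        for follower_id in get_follower_ids(tweet["user_id"]):  # hypothetical helper
            key = f"timeline:{follower_id}"
            r.lpush(key, tweet["tweet_id"])
            r.ltrim(key, 0, 49)  # keep only the 50 most recent tweet IDs
```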
View Timeline:
1. User requests timeline → Load Balancer → API Server
2. API Server checks Redis cache
3. If hit: Return (< 10ms)
4. If miss: Generate from DB, cache, return (200ms)
Step 5: Evolution / Deep Dive (15 minutes)
Interviewer: "How would you scale this to 1 billion users?"
This is where you show depth.
Scaling Database
Problem: Single PostgreSQL instance can't handle 1B users
Solution 1: Read Replicas
[Master DB] (writes)
↓
[Slave 1] [Slave 2] [Slave 3] (reads)
Benefit: 3x read capacity
Solution 2: Sharding
Shard 1: Users with ID 1-100M
Shard 2: Users with ID 100M-200M
...
Benefit: Distribute writes
Solution 3: Separate Databases
User Service → PostgreSQL (user profiles)
Tweet Service → Cassandra (tweets)
Timeline Service → Redis (cached timelines)
Benefit: Independent scaling
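The range-based sharding from Solution 2, for example, reduces to a small routing function. The shard size here is a hypothetical constant; production systems often prefer hash-based sharding or a lookup service so shards stay balanced as users grow.

```python
SHARD_SIZE = 100_000_000  # 100M users per shard (assumed, matching the ranges above)

def shard_for_user(user_id: int) -> int:
    """Return the index of the shard that owns this user's rows."""
    return user_id // SHARD_SIZE

# shard_for_user(42)           -> 0  (Shard 1: IDs 1-100M)
# shard_for_user(150_000_000)  -> 1  (Shard 2: IDs 100M-200M)
```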
Handling Celebrity Users
Problem: Celebrity tweets (Elon Musk, Taylor Swift) must fan out to millions of followers
Naive approach:
When celebrity posts tweet:
- Fan out to 10M followers' cached timelines
- Update 10M Redis keys
- Takes 10+ seconds ❌
Better approach (Hybrid):
Regular users: Fan-out on write (pre-compute timelines)
Celebrities: Fan-out on read (compute on-demand)
When user requests timeline:
1. Fetch from cache (regular users they follow)
2. Merge with latest tweets from celebrities (query DB)
3. Sort and return
Used by Twitter!
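Here's a sketch of that hybrid read path. The helper functions are hypothetical stand-ins for the cache lookup, the follow-graph query, and the tweet-store query.

```python
def get_hybrid_timeline(user_id: str, limit: int = 50) -> list:
    # 1. Pre-computed part: tweets from regular users, fanned out on write.
    timeline = get_cached_timeline(user_id)            # hypothetical: e.g. Redis list

    # 2. On-demand part: latest tweets from celebrities this user follows.
    for celeb_id in get_followed_celebrities(user_id): # hypothetical helper
        timeline.extend(get_recent_tweets(celeb_id))   # hypothetical: tweet store query

    # 3. Merge, sort newest-first, and trim to the requested size.
    timeline.sort(key=lambda t: t["timestamp"], reverse=True)
    return timeline[:limit]
```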
Dealing with Hot Spots
Problem: During the World Cup final, everyone tweets simultaneously → spike to 100K tweets/sec
Solutions:
1. Auto-scaling (add servers based on metrics)
2. Rate limiting (per user, per IP)
3. Write buffering (queue tweets, process asynchronously)
4. Graceful degradation (disable non-essential features)
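As one example, per-user rate limiting (solution 2) can be as simple as a token bucket. This is an in-process sketch; in practice the buckets would live in Redis so every API server shares them.

```python
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Example policy (assumed numbers): 5 tweets/sec per user, burst of 10.
buckets = {}

def allow_tweet(user_id: str) -> bool:
    bucket = buckets.setdefault(user_id, TokenBucket(rate_per_sec=5, burst=10))
    return bucket.allow()
```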
Common System Design Questions
Easy
- URL Shortener (bit.ly)
- Focus: Hashing, base62 encoding, database design
- Pastebin
- Focus: Storage, expiration, cache
- Key-Value Store
- Focus: Hash table, consistent hashing
Medium
- Design Twitter
- Focus: Timeline generation, fan-out, caching
- Design Instagram
- Focus: Photo storage, CDN, feed ranking
- Design Uber
- Focus: Geographic sharding, real-time matching, maps
- Design WhatsApp
- Focus: WebSocket, message queues, online/offline status
- Design YouTube
- Focus: Video encoding, CDN, recommendation
- Design Netflix
- Focus: Video streaming, CDN, personalization
- Design Amazon (E-Commerce)
- Focus: Inventory, transactions, recommendations
Hard
- Design Google Search
- Focus: Crawling, indexing, ranking, distributed systems
- Design Google Maps
- Focus: Graph algorithms, routing, location services
- Design Dropbox
- Focus: File sync, chunking, conflict resolution
- Design Ticketmaster
- Focus: Concurrency, race conditions, inventory management
- Design a Distributed Cache
- Focus: Consistent hashing, replication, eviction
Question-Specific Tips
URL Shortener
Key Topics:
- Base62 encoding (0-9, a-z, A-Z = 62 characters)
- Hash collisions
- Expiration
- Analytics (click tracking)
Approach:
1. Generate unique ID (auto-increment or hash)
2. Encode to base62 (e.g., 12345 → "dnh")
3. Store mapping: "dnh" → "https://example.com/long-url"
4. Redirect on access
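A sketch of steps 1-3 is below. Note that the alphabet order is an implementation choice: with lowercase letters first, as here, 12345 encodes to "dnh", matching the example above. The dict stands in for the database mapping table.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

url_by_code = {}  # stands in for the database mapping table

def shorten(long_url: str, next_id: int) -> str:
    code = encode_base62(next_id)   # e.g. 12345 -> "dnh"
    url_by_code[code] = long_url    # store the mapping for later redirects
    return code
```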
Design Instagram
Key Topics:
- Photo upload flow
- CDN for images
- Feed generation (like Twitter)
- Image metadata (likes, comments)
Approach:
1. User uploads photo → API Server → S3 (storage)
2. Metadata stored in DB (user_id, photo_id, timestamp)
3. Fan-out to followers' feeds (cached timelines)
4. Serve images via CDN (low latency globally)
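A sketch of the upload path (steps 1-2), assuming boto3 for S3; the bucket name and the save_photo_metadata() helper are hypothetical, and fan-out plus CDN serving happen downstream.

```python
import time
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "photos-bucket"   # assumed bucket name

def upload_photo(user_id: str, image_bytes: bytes) -> str:
    photo_id = str(uuid.uuid4())
    # 1. Store the image bytes in object storage (S3).
    s3.put_object(Bucket=BUCKET, Key=f"{user_id}/{photo_id}.jpg", Body=image_bytes)
    # 2. Store metadata in the database (hypothetical helper).
    save_photo_metadata(user_id=user_id, photo_id=photo_id, timestamp=time.time())
    return photo_id
```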
Design Uber
Key Topics:
- Real-time matching (rider ↔ driver)
- Geographic sharding (by city)
- Surge pricing
- ETA calculation
Approach:
1. Rider requests ride → Load Balancer → API Server
2. Query drivers in same city (geo-sharded DB)
3. Matching algorithm (nearest available driver)
4. Notify driver (push notification)
5. Track ride in real-time (WebSocket)
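A sketch of step 3, assuming the candidate drivers have already been narrowed to the rider's city shard: it simply picks the nearest available driver by haversine distance (a real matcher would likely also weigh ETA and supply/demand).

```python
from math import asin, cos, radians, sin, sqrt
from typing import Optional

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Great-circle distance between two (lat, lon) points, in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def match_driver(rider_lat: float, rider_lon: float, drivers: list) -> Optional[dict]:
    # drivers: [{"driver_id": ..., "lat": ..., "lon": ..., "available": True}, ...]
    available = [d for d in drivers if d["available"]]
    if not available:
        return None
    return min(available, key=lambda d: haversine_km(rider_lat, rider_lon, d["lat"], d["lon"]))
```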
Interview Do's and Don'ts
✅ Do's
✅ Clarify requirements first (never assume)
✅ Think out loud (the interviewer wants to hear your thought process)
✅ Draw diagrams (visual communication)
✅ Discuss trade-offs ("We could use NoSQL for flexibility, but SQL gives us ACID transactions...")
✅ Justify decisions ("I chose Redis for caching because...")
✅ Ask for feedback ("Does this approach make sense?")
✅ Manage time (don't spend 30 minutes on API design)
❌ Don'ts
❌ Don't jump to solutions (clarify first)
❌ Don't ignore scale ("This works for 100 users, but what about 100M?")
❌ Don't over-engineer (start simple, then scale)
❌ Don't memorize solutions (interviewers can tell)
❌ Don't use buzzwords without understanding ("Let's use Kafka!" "Why?" "Uh...")
❌ Don't ignore interviewer hints (they're guiding you)
How to Prepare
1. Study Fundamentals
Must-know topics:
- Load balancing
- Caching (Redis)
- Database (SQL vs NoSQL, replication, sharding)
- Message queues (Kafka, RabbitMQ)
- CDN
- Microservices vs monolith
- CAP theorem
2. Practice Questions
Resources:
- Grokking the System Design Interview (educative.io)
- System Design Primer (GitHub repo)
- YouTube: Gaurav Sen, Tech Dummies
- Books: "Designing Data-Intensive Applications" by Martin Kleppmann
3. Mock Interviews
Practice with peers:
- Take turns being interviewer/candidate
- Time yourself (45 minutes)
- Get feedback
Platforms:
- Pramp (free mock interviews)
- Interviewing.io
4. Learn from Real Systems
Read engineering blogs:
- Netflix Tech Blog
- Uber Engineering
- Airbnb Engineering
- AWS Architecture Blog
Understand:
- Why did they choose technology X?
- What problems did they face?
- How did they scale?
Sample Timeline (4 Weeks)
Week 1: Fundamentals
- Load balancing
- Caching
- Databases
- CAP theorem
Week 2: Practice Easy Questions
- URL shortener
- Pastebin
- Key-value store
Week 3: Practice Medium Questions
- Twitter
- Instagram
- Uber
Week 4: Mock Interviews + Hard Questions
- Mock interviews (3-5)
- Google Search
- Dropbox
Conclusion
System design interviews test your ability to:
- Think systematically (use a framework)
- Scale systems (handle millions of users)
- Make trade-offs (justify decisions)
- Communicate clearly (explain your thinking)
The secret? There's no "perfect" answer. Interviewers want to see your thought process, not memorized solutions.
Framework: SNAKE Method
- Scope: Clarify requirements
- Numbers: Estimate scale
- API: Design key APIs
- Key components: High-level architecture
- Evolution: Scale and deep dive
Practice 10-15 questions using this framework, and you'll be ready for any system design interview, whether at Google, Amazon, Facebook, or beyond.
Preparing for system design interviews? Let's connect on Twitter or LinkedIn to share resources and tips!