System Design Interview Prep: Complete Guide with 20+ Questions
Ace system design interviews at FAANG. Learn a systematic approach, common questions (URL shortener, Twitter, Instagram), and the real interview frameworks used at Google, Amazon, and Facebook.

"How Would You Design Twitter?"
Interviewer: "You have 45 minutes. Go."
Candidate A: "Um... we need a database for tweets... and... users... maybe MongoDB? And... AWS?"
(Interview ends poorly)
Candidate B: "Before I start, let me clarify requirements. Are we designing the core Twitter functionalityβposting tweets, following users, and viewing timeline? Or including features like DMs, trending topics, and notifications?"
Interviewer: "Let's focus on core functionality."
Candidate B: "Great. Let me estimate scale. Twitter has 500M users, 200M daily active. That's roughly 6,000 tweets per second on average, with peaks at 20,000 during events. Storage: 500M users Γ 1KB profile = 500GB. 500B tweets historical at 1KB each = 500TB..."
(Interview going well)
The difference? Candidate B has a framework.
System design interviews aren't about memorizing solutions. They're about demonstrating structured thinking, understanding trade-offs, and communicating your thought process clearly.
In this guide, I'll give you the exact framework used by successful candidates at Google, Amazon, Facebook, and Netflix, plus 20+ practice questions with approaches. Let's begin.
What Are System Design Interviews?
System Design Interview = Design a large-scale system (Twitter, Uber, YouTube) in 45-60 minutes
Not coding (though you may sketch APIs, data models)
What interviewers evaluate:
- Structured thinking (do you have a systematic approach?)
- Trade-offs (can you justify decisions?)
- Scalability (can system handle millions of users?)
- Communication (can you explain clearly?)
- Depth (do you understand technologies you mention?)
Common at:
- Google (L4+)
- Amazon (SDE2+)
- Facebook/Meta (E4+)
- Netflix
- Uber
- Airbnb
The Framework (SNAKE Method)
S - Scope and requirements
N - Numbers (scale estimation)
A - API design
K - Key components
E - Evolution (scaling, trade-offs)
Let's break down each step.
Step 1: Scope and Requirements (5 minutes)
DON'T: Jump into designing immediately
DO: Clarify what you're building
Questions to Ask
Functional Requirements:
"What features should we support?"
"Are we building a mobile app, web app, or both?"
"What's the core functionality vs nice-to-have?"
Non-Functional Requirements:
"How many users?"
"Read-heavy or write-heavy?"
"Consistency or availability more important?"
"Latency requirements?"
"Geographic distribution?"
Out of Scope (Explicitly state):
"I'll focus on core functionality and skip [feature X] for time. We can discuss later if needed."
Example: "Design Twitter"
Good clarification:
Candidate: "Let me clarify scope. Should I focus on:
- Posting tweets
- Following users
- Viewing home timeline
And skip for now:
- Direct messages
- Trending topics
- Notifications
Is that correct?"
Interviewer: "Yes, perfect."
This shows: ✅ You don't make assumptions ✅ You manage scope ✅ You prioritize
Step 2: Numbers / Scale Estimation (5 minutes)
Calculate:
- Daily Active Users (DAU)
- Requests per second (read & write)
- Storage requirements
- Bandwidth
Example: "Design Twitter"
Given:
- 500M total users
- 200M daily active users (DAU)
- Each user posts 2 tweets/day on average
- Each user visits 10 times/day, fetches 50 tweets per visit
Calculations:
Tweets per second:
200M DAU × 2 tweets/day = 400M tweets/day
400M / 86,400 seconds = 4,630 tweets/sec (average)
Peak (3x average) = ~14,000 tweets/sec
Timeline reads per second:
200M DAU × 10 visits/day = 2B timeline fetches/day
2B / 86,400 = 23,148 reads/sec
Peak = ~70,000 reads/sec
Read/Write Ratio:
23,148 reads : 4,630 writes ≈ 5:1
(Read-heavy system!)
Storage (10 years):
400M tweets/day × 365 days × 10 years = 1.46 trillion tweets
Each tweet: 140 chars + metadata ≈ 1KB
1.46T × 1KB = 1.46 PB (petabytes)
Bandwidth:
Writes: 4,630 tweets/sec × 1KB = 4.6 MB/sec
Reads (peak): 70,000 reads/sec × 50 tweets × 1KB = 3.5 GB/sec
Summary (write on whiteboard):
- Write QPS: ~5K (peak: ~14K)
- Read QPS: ~23K (peak: ~70K)
- Storage: ~1.5 PB (10 years)
- Read-heavy system (~5:1 ratio)
Why this matters:
- Read-heavy → need caching and replication
- ~5K writes/sec → a single database can handle it (no immediate sharding needed)
- ~70K reads/sec at peak → need a cache layer plus multiple read replicas
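If you want to sanity-check the arithmetic, the whole estimation fits in a few lines. This is a rough sketch; the constants are simply the assumptions stated above.

```python
# Back-of-envelope numbers from the assumptions above; tweak the inputs and
# the QPS/storage figures follow.
DAU = 200_000_000
TWEETS_PER_USER_PER_DAY = 2
VISITS_PER_USER_PER_DAY = 10
TWEET_SIZE_BYTES = 1_000          # ~1 KB per tweet including metadata
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3

write_qps = DAU * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY      # ~4,630
read_qps = DAU * VISITS_PER_USER_PER_DAY / SECONDS_PER_DAY       # ~23,150
storage_bytes_10y = DAU * TWEETS_PER_USER_PER_DAY * 365 * 10 * TWEET_SIZE_BYTES

print(f"Write QPS: {write_qps:,.0f} (peak ~{write_qps * PEAK_FACTOR:,.0f})")
print(f"Read QPS:  {read_qps:,.0f} (peak ~{read_qps * PEAK_FACTOR:,.0f})")
print(f"Storage over 10 years: {storage_bytes_10y / 1e15:.2f} PB")
```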
Step 3: API Design (5 minutes)
Define key APIs (REST-like, simplified)
Example: "Design Twitter"
API 1: Post Tweet
POST /tweets
Request:
{
"user_id": "123",
"content": "Hello world!",
"timestamp": "2025-10-28T10:00:00Z"
}
Response:
{
"tweet_id": "456",
"status": "success"
}
API 2: Get User Timeline
GET /timeline?user_id=123&limit=50
Response:
{
"tweets": [
{
"tweet_id": "456",
"user_id": "789",
"content": "Tweet content",
"timestamp": "2025-10-28T09:55:00Z",
"likes": 42,
"retweets": 10
},
...
]
}
API 3: Follow User
POST /follow
{
"follower_id": "123",
"followee_id": "789"
}
Keep it simple! Don't over-design APIs.
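If you want to make these contracts concrete for yourself while practicing, a toy in-memory sketch is enough. Flask is just an assumed example framework here; a dict and a set stand in for the real data stores.

```python
# Toy in-memory implementation of the three endpoints (illustrative only).
import time
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
tweets = {}      # tweet_id -> tweet record
follows = set()  # (follower_id, followee_id) pairs

@app.post("/tweets")
def post_tweet():
    body = request.get_json()
    tweet_id = str(uuid.uuid4())
    tweets[tweet_id] = {
        "tweet_id": tweet_id,
        "user_id": body["user_id"],
        "content": body["content"],
        "timestamp": time.time(),  # server-assigned timestamp
    }
    return jsonify({"tweet_id": tweet_id, "status": "success"})

@app.get("/timeline")
def get_timeline():
    user_id = request.args.get("user_id")
    limit = int(request.args.get("limit", 50))
    followees = {f for (u, f) in follows if u == user_id}
    timeline = [t for t in tweets.values() if t["user_id"] in followees]
    timeline.sort(key=lambda t: t["timestamp"], reverse=True)
    return jsonify({"tweets": timeline[:limit]})

@app.post("/follow")
def follow():
    body = request.get_json()
    follows.add((body["follower_id"], body["followee_id"]))
    return jsonify({"status": "success"})
```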
Step 4: Key Components (20 minutes)
This is the core of the interview.
High-Level Architecture
Start with simple diagram:
[Client]
↓
[Load Balancer]
↓
[API Servers]
↓
[Cache] ↔ [Database]
Example: "Design Twitter"
Component 1: API Servers
- Handle incoming requests
- Stateless (can scale horizontally)
- Node.js or Go (async I/O)
Component 2: Database
- User data: PostgreSQL (relational)
- users table (id, username, email)
- followers table (follower_id, followee_id)
- Tweets: Cassandra (write-heavy, time-series)
- tweets table (tweet_id, user_id, content, timestamp)
Why Cassandra for tweets?
✅ High write throughput (5K writes/sec)
✅ Time-series data (query by timestamp)
✅ Easy to scale horizontally
Component 3: Cache (Redis)
- Cache user timelines
- Key: "timeline:{user_id}"
- Value: List of 50 recent tweet IDs
- TTL: 5 minutes
Why cache?
✅ Read-heavy (70K reads/sec at peak)
✅ Timeline generation is expensive
✅ Cache hit rate of 90%+ → 10x fewer DB queries
Component 4: Timeline Generation Service
When user requests timeline:
1. Check Redis cache
2. If hit: Return cached tweets (< 10ms)
3. If miss:
- Fetch follower list from DB
- Fetch recent tweets from those users
- Merge and sort by timestamp
- Cache result
- Return (200-500ms)
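Here's a sketch of that read path, assuming redis-py for the cache. get_followee_ids() and get_recent_tweets() are hypothetical helpers standing in for the follower-graph and tweet-store queries.

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)
TIMELINE_TTL_SECONDS = 300  # 5-minute TTL, as above

def get_timeline(user_id: str, limit: int = 50) -> list:
    key = f"timeline:{user_id}"

    # 1-2. Cache hit: return the pre-computed timeline (< 10 ms).
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 3. Cache miss: rebuild from the databases (200-500 ms).
    followees = get_followee_ids(user_id)            # hypothetical: follower graph (PostgreSQL)
    tweets = []
    for followee in followees:
        tweets.extend(get_recent_tweets(followee))   # hypothetical: tweet store (Cassandra)

    # Merge, sort newest-first, and keep the most recent `limit` tweets.
    tweets.sort(key=lambda t: t["timestamp"], reverse=True)
    timeline = tweets[:limit]

    # 4. Cache the result for subsequent reads. For simplicity this caches the
    # serialized tweets; the design above caches a list of tweet IDs instead.
    cache.set(key, json.dumps(timeline), ex=TIMELINE_TTL_SECONDS)
    return timeline
```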
Component 5: Load Balancer
- Distribute traffic across API servers
- Algorithm: Least connections
- Health checks every 10 seconds
Component 6: CDN
- Profile images
- Tweet images/videos
- Serves from edge locations
Data Flow
Post Tweet:
1. User posts tweet → Load Balancer → API Server
2. API Server writes to Cassandra (tweet data)
3. API Server publishes event to message queue (fan-out)
4. Background workers update followers' cached timelines
5. Return success to user
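A sketch of steps 3-4 follows, using a Redis list as a stand-in for the message queue (a real system would likely use Kafka or similar). get_follower_ids() is a hypothetical helper.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def publish_tweet_event(tweet: dict) -> None:
    # 3. API server enqueues a fan-out event after persisting the tweet.
    r.rpush("fanout_queue", json.dumps(tweet))

def fanout_worker() -> None:
    # 4. Background worker pushes the new tweet ID onto each follower's cached timeline.
    while True:
        _, raw = r.blpop("fanout_queue")
        tweet = json.loads(raw)
        for follower_id in get_follower_ids(tweet["user_id"]):  # hypothetical helper
            key = f"timeline:{follower_id}"
            r.lpush(key, tweet["tweet_id"])
            r.ltrim(key, 0, 49)  # keep only the 50 most recent tweet IDs
```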
View Timeline:
1. User requests timeline → Load Balancer → API Server
2. API Server checks Redis cache
3. If hit: Return (< 10ms)
4. If miss: Generate from DB, cache, return (200ms)
Step 5: Evolution / Deep Dive (15 minutes)
Interviewer: "How would you scale this to 1 billion users?"
This is where you show depth.
Scaling Database
Problem: Single PostgreSQL instance can't handle 1B users
Solution 1: Read Replicas
[Master DB] (writes)
↓
[Slave 1] [Slave 2] [Slave 3] (reads)
Benefit: 3x read capacity
Solution 2: Sharding
Shard 1: Users with ID 1-100M
Shard 2: Users with ID 100M-200M
...
Benefit: Distribute writes
Solution 3: Separate Databases
User Service → PostgreSQL (user profiles)
Tweet Service → Cassandra (tweets)
Timeline Service → Redis (cached timelines)
Benefit: Independent scaling
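The range-based sharding from Solution 2, for example, reduces to a small routing function. The shard size here is a hypothetical constant; production systems often prefer hash-based sharding or a lookup service so shards stay balanced as users grow.

```python
SHARD_SIZE = 100_000_000  # 100M users per shard (assumed, matching the ranges above)

def shard_for_user(user_id: int) -> int:
    """Return the index of the shard that owns this user's rows."""
    return user_id // SHARD_SIZE

# shard_for_user(42)           -> 0  (Shard 1: IDs 1-100M)
# shard_for_user(150_000_000)  -> 1  (Shard 2: IDs 100M-200M)
```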
Handling Celebrity Users
Problem: Celebrity tweets (Elon Musk, Taylor Swift) must fan out to millions of followers
Naive approach:
When celebrity posts tweet:
- Fan out to 10M followers' cached timelines
- Update 10M Redis keys
- Takes 10+ seconds ❌
Better approach (Hybrid):
Regular users: Fan-out on write (pre-compute timelines)
Celebrities: Fan-out on read (compute on-demand)
When user requests timeline:
1. Fetch from cache (regular users they follow)
2. Merge with latest tweets from celebrities (query DB)
3. Sort and return
Used by Twitter!
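Here's a sketch of that hybrid read path. The helper functions are hypothetical stand-ins for the cache lookup, the follow-graph query, and the tweet-store query.

```python
def get_hybrid_timeline(user_id: str, limit: int = 50) -> list:
    # 1. Pre-computed part: tweets from regular users, fanned out on write.
    timeline = get_cached_timeline(user_id)            # hypothetical: e.g. Redis list

    # 2. On-demand part: latest tweets from celebrities this user follows.
    for celeb_id in get_followed_celebrities(user_id): # hypothetical helper
        timeline.extend(get_recent_tweets(celeb_id))   # hypothetical: tweet store query

    # 3. Merge, sort newest-first, and trim to the requested size.
    timeline.sort(key=lambda t: t["timestamp"], reverse=True)
    return timeline[:limit]
```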
Dealing with Hot Spots
Problem: During the World Cup final, everyone tweets simultaneously → spike to 100K tweets/sec
Solutions:
1. Auto-scaling (add servers based on metrics)
2. Rate limiting (per user, per IP)
3. Write buffering (queue tweets, process asynchronously)
4. Graceful degradation (disable non-essential features)
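As one example, per-user rate limiting (solution 2) can be as simple as a token bucket. This is an in-process sketch; in practice the buckets would live in Redis so every API server shares them.

```python
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Example policy (assumed numbers): 5 tweets/sec per user, burst of 10.
buckets = {}

def allow_tweet(user_id: str) -> bool:
    bucket = buckets.setdefault(user_id, TokenBucket(rate_per_sec=5, burst=10))
    return bucket.allow()
```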
Common System Design Questions
Easy
- URL Shortener (bit.ly)
- Focus: Hashing, base62 encoding, database design
- Pastebin
- Focus: Storage, expiration, cache
- Key-Value Store
- Focus: Hash table, consistent hashing
Medium
- Design Twitter
- Focus: Timeline generation, fan-out, caching
- Design Instagram
- Focus: Photo storage, CDN, feed ranking
- Design Uber
- Focus: Geographic sharding, real-time matching, maps
- Design WhatsApp
- Focus: WebSocket, message queues, online/offline status
- Design YouTube
- Focus: Video encoding, CDN, recommendation
- Design Netflix
- Focus: Video streaming, CDN, personalization
- Design Amazon (E-Commerce)
- Focus: Inventory, transactions, recommendations
Hard
- Design Google Search
- Focus: Crawling, indexing, ranking, distributed systems
- Design Google Maps
- Focus: Graph algorithms, routing, location services
- Design Dropbox
- Focus: File sync, chunking, conflict resolution
- Design Ticketmaster
- Focus: Concurrency, race conditions, inventory management
- Design a Distributed Cache
- Focus: Consistent hashing, replication, eviction
Question-Specific Tips
URL Shortener
Key Topics:
- Base62 encoding (0-9, a-z, A-Z = 62 characters)
- Hash collisions
- Expiration
- Analytics (click tracking)
Approach:
1. Generate unique ID (auto-increment or hash)
2. Encode to base62 (e.g., 12345 → "dnh")
3. Store mapping: "dnh" → "https://example.com/long-url"
4. Redirect on access
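A sketch of steps 1-3 is below. Note that the alphabet order is an implementation choice: with lowercase letters first, as here, 12345 encodes to "dnh", matching the example above. The dict stands in for the database mapping table.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

url_by_code = {}  # stands in for the database mapping table

def shorten(long_url: str, next_id: int) -> str:
    code = encode_base62(next_id)   # e.g. 12345 -> "dnh"
    url_by_code[code] = long_url    # store the mapping for later redirects
    return code
```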
Design Instagram
Key Topics:
- Photo upload flow
- CDN for images
- Feed generation (like Twitter)
- Image metadata (likes, comments)
Approach:
1. User uploads photo → API Server → S3 (storage)
2. Metadata stored in DB (user_id, photo_id, timestamp)
3. Fan-out to followers' feeds (cached timelines)
4. Serve images via CDN (low latency globally)
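A sketch of the upload path (steps 1-2), assuming boto3 for S3; the bucket name and the save_photo_metadata() helper are hypothetical, and fan-out plus CDN serving happen downstream.

```python
import time
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "photos-bucket"   # assumed bucket name

def upload_photo(user_id: str, image_bytes: bytes) -> str:
    photo_id = str(uuid.uuid4())
    # 1. Store the image bytes in object storage (S3).
    s3.put_object(Bucket=BUCKET, Key=f"{user_id}/{photo_id}.jpg", Body=image_bytes)
    # 2. Store metadata in the database (hypothetical helper).
    save_photo_metadata(user_id=user_id, photo_id=photo_id, timestamp=time.time())
    return photo_id
```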
Design Uber
Key Topics:
- Real-time matching (rider ↔ driver)
- Geographic sharding (by city)
- Surge pricing
- ETA calculation
Approach:
1. Rider requests ride → Load Balancer → API Server
2. Query drivers in same city (geo-sharded DB)
3. Matching algorithm (nearest available driver)
4. Notify driver (push notification)
5. Track ride in real-time (WebSocket)
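A sketch of step 3, assuming the candidate drivers have already been narrowed to the rider's city shard: it simply picks the nearest available driver by haversine distance (a real matcher would likely also weigh ETA and supply/demand).

```python
from math import asin, cos, radians, sin, sqrt
from typing import Optional

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Great-circle distance between two (lat, lon) points, in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def match_driver(rider_lat: float, rider_lon: float, drivers: list) -> Optional[dict]:
    # drivers: [{"driver_id": ..., "lat": ..., "lon": ..., "available": True}, ...]
    available = [d for d in drivers if d["available"]]
    if not available:
        return None
    return min(available, key=lambda d: haversine_km(rider_lat, rider_lon, d["lat"], d["lon"]))
```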
Interview Do's and Don'ts
✅ Do's
✅ Clarify requirements first (never assume)
✅ Think out loud (the interviewer wants to hear your thought process)
✅ Draw diagrams (visual communication)
✅ Discuss trade-offs ("We could use NoSQL for flexibility, but SQL gives us ACID transactions...")
✅ Justify decisions ("I chose Redis for caching because...")
✅ Ask for feedback ("Does this approach make sense?")
✅ Manage time (don't spend 30 minutes on API design)
❌ Don'ts
❌ Don't jump to solutions (clarify first)
❌ Don't ignore scale ("This works for 100 users, but what about 100M?")
❌ Don't over-engineer (start simple, then scale)
❌ Don't memorize solutions (interviewers can tell)
❌ Don't use buzzwords without understanding ("Let's use Kafka!" "Why?" "Uh...")
❌ Don't ignore interviewer hints (they're guiding you)
How to Prepare
1. Study Fundamentals
Must-know topics:
- Load balancing
- Caching (Redis)
- Database (SQL vs NoSQL, replication, sharding)
- Message queues (Kafka, RabbitMQ)
- CDN
- Microservices vs monolith
- CAP theorem
2. Practice Questions
Resources:
- Grokking the System Design Interview (educative.io)
- System Design Primer (GitHub repo)
- YouTube: Gaurav Sen, Tech Dummies
- Books: "Designing Data-Intensive Applications" by Martin Kleppmann
3. Mock Interviews
Practice with peers:
- Take turns being interviewer/candidate
- Time yourself (45 minutes)
- Get feedback
Platforms:
- Pramp (free mock interviews)
- Interviewing.io
4. Learn from Real Systems
Read engineering blogs:
- Netflix Tech Blog
- Uber Engineering
- Airbnb Engineering
- AWS Architecture Blog
Understand:
- Why did they choose technology X?
- What problems did they face?
- How did they scale?
Sample Timeline (4 Weeks)
Week 1: Fundamentals
- Load balancing
- Caching
- Databases
- CAP theorem
Week 2: Practice Easy Questions
- URL shortener
- Pastebin
- Key-value store
Week 3: Practice Medium Questions
- Twitter
- Instagram
- Uber
Week 4: Mock Interviews + Hard Questions
- Mock interviews (3-5)
- Google Search
- Dropbox
Conclusion
System design interviews test your ability to:
- Think systematically (use a framework)
- Scale systems (handle millions of users)
- Make trade-offs (justify decisions)
- Communicate clearly (explain your thinking)
The secret? There's no "perfect" answer. Interviewers want to see your thought process, not memorized solutions.
Framework: SNAKE Method
- Scope: Clarify requirements
- Numbers: Estimate scale
- API: Design key APIs
- Key components: High-level architecture
- Evolution: Scale and deep dive
Practice 10-15 questions using this framework, and you'll be ready for any system design interview, whether at Google, Amazon, Facebook, or beyond.
Preparing for system design interviews? Let's connect on Twitter or LinkedIn to share resources and tips!