System Design Interview Guide
A comprehensive, language-agnostic guide to system design interviews — covering scalability, databases, caching, messaging, distributed systems, and fintech patterns.
Week 1 — Core Foundations (Days 1–7)
- Day 1–2: Scalability — vertical vs horizontal, load balancers, stateless apps
- Day 3: Databases — SQL vs NoSQL, ACID vs BASE, when to use what
- Day 4: Caching — Cache-Aside, Write-Through, Redis internals
- Day 5: Messaging — Kafka vs RabbitMQ, async processing, event-driven patterns
- Day 6–7: Distributed Systems — CAP theorem, replication, sharding, consistency models
Week 2 — Beginner Problems (Days 8–14)
- Day 8–9: Design URL Shortener (TinyURL) — hashing, redirection, analytics
- Day 10: Design Pastebin — blob storage, TTL, deduplication
- Day 11–12: Design Rate Limiter — token bucket, sliding window, Redis Lua scripts
- Day 13–14: Design Notification Service — fan-out, push vs pull, delivery guarantees
Week 3 — Intermediate Problems (Days 15–21)
- Day 15–16: Design Chat System (WhatsApp) — WebSockets, message storage, delivery states
- Day 17–18: Design News Feed — fan-out on write vs read, ranking, Redis sorted sets
- Day 19–20: Design BookMyShow — seat inventory, concurrency, optimistic locking
- Day 21: Design API Gateway — auth, rate limiting, circuit breaker, routing
Week 4 — Advanced & Fintech (Days 22–30)
- Day 22–23: Design Payment Gateway — idempotency, 2-phase commit, saga pattern
- Day 24–25: Design Banking Ledger — event sourcing, CQRS, double-entry accounting
- Day 26–27: Design Fraud Detection Platform — stream processing, ML model serving, alerting
- Day 28: Design Distributed Cache — consistent hashing, eviction policies
- Day 29–30: Mock Interviews + Review weak spots + cheat sheet drill
Vertical vs Horizontal Scaling
(Diagram: server sizes 2 CPU · 4GB RAM, 8 CPU · 32GB RAM, 32 CPU · 256GB RAM)
Standard Load Balancer Architecture
(Diagram: load balancer (Nginx / ALB), stateless app servers, shared state)
| Algorithm | How it works | Best for |
|---|---|---|
| Round Robin | Sends requests in circular order | Homogeneous servers, stateless apps |
| Least Connections | Routes to server with fewest active connections | Long-lived connections, heterogeneous load |
| IP Hash | Same client always hits same server | Session stickiness (avoid if possible) |
| Weighted | Higher weight = more requests | Servers with different capacities |
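The first two rows of the table can be sketched in a few lines of Python. This is a minimal illustration rather than a production balancer; the class names `RoundRobinBalancer` and `LeastConnectionsBalancer` and the server labels are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed circular order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the lowest count, so ties
        # are broken by insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection closes so the count stays accurate.
        self.active[server] -= 1
```

Round robin needs no bookkeeping, which is why it suits homogeneous, stateless servers; least connections pays for a counter per server to handle uneven request durations.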
Horizontal scaling requires stateless app servers. If your app stores user sessions in local memory, a different server can't serve that user's next request. Instead:
- Store sessions in Redis (shared, fast)
- Use JWT tokens (state lives in the token itself)
- Externalize all user state to the database layer
A CDN caches static assets (images, CSS, JS, videos) at edge nodes globally, so users receive content from the nearest server rather than your origin.
(Diagram labels: Cache Hit ✅, Miss ❌, US East)
Tools: Cloudflare, AWS CloudFront, Akamai. Mention CDN whenever the problem involves serving media at scale (YouTube, Netflix, Instagram).
SQL vs NoSQL — Decision Tree
- Payments, banking, inventory → PostgreSQL / MySQL
- User profiles, catalogs → Document
- Other NoSQL branches: Wide-column, Key-Value
Weakness (SQL): Schema migrations are painful; harder to scale writes horizontally.
Weakness (Document, e.g. MongoDB): Multi-document ACID transactions exist only since v4.0 and remain limited; not ideal for complex joins.
Weakness (Wide-column / Key-Value): Eventual consistency by default; no joins; queries must be designed around access patterns.
Sharding splits data across multiple database nodes (shards) so each node holds only a subset of the data.
Sharding by User ID (Hash-based)
(Diagram: a router computes user_id % 4 and sends uid%4=0, uid%4=1, uid%4=2, and uid%4=3 to four separate shards; replicas serve reads)
Replication lag means replicas may serve slightly stale data — acceptable for most reads, not acceptable for financial reads (always read from primary after a write in banking).
| Property | ACID (SQL) | BASE (NoSQL) |
|---|---|---|
| Consistency | Strong (immediate) | Eventual (may lag) |
| Availability | Can sacrifice for consistency | Highly available |
| Best for | Payments, banking, orders | Social feeds, analytics, metrics |
| Example DB | PostgreSQL, MySQL | Cassandra, DynamoDB, MongoDB |
Cache Aside (Lazy Loading) — Most Common Pattern
(Diagram: check cache first; a hit returns instantly)
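Cache-aside reads check the cache first, fall back to the database on a miss, and then populate the cache. A minimal sketch, assuming plain dictionaries stand in for Redis and the database; `get_user`, `update_user`, `cache`, `db`, and the sample record are hypothetical.

```python
cache = {}                                   # stand-in for Redis
db = {"user:1001": {"name": "Asha"}}         # stand-in for the database


def get_user(key):
    """Read path: check the cache first, fall back to the DB on a miss."""
    if key in cache:
        return cache[key], "hit"
    value = db.get(key)
    if value is not None:
        cache[key] = value                   # populate for the next reader
    return value, "miss"


def update_user(key, value):
    """Write path: update the DB, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)
```

Invalidating on write (rather than updating the cache in place) keeps the write path simple; the next read repopulates the entry.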
| Structure | Use Case | Example |
|---|---|---|
| String | Simple cache, counters, sessions | user:1001:session → JWT |
| Hash | Object with fields | user:1001 → {name, email, age} |
| List | Queues, recent activity | notifications:user:1001 (latest 20) |
| Sorted Set | Leaderboards, ranking, news feed | feed:user:1001 → posts sorted by score |
| Set | Unique membership, tags | online_users → {uid1, uid2, ...} |
| Policy | What it evicts | Best for |
|---|---|---|
| LRU | Least recently used items | General purpose — most common |
| LFU | Least frequently used items | Skewed access patterns (Zipf distribution) |
| TTL | Items past expiry time | Session data, OTPs, rate limits |
| Random | Random item | When all items equally likely to be accessed |
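LRU, the most common policy in the table, fits in a few lines on top of `collections.OrderedDict`. This is an in-process sketch for intuition only; Redis itself uses an approximated LRU, and the class name `LRUCache` is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()          # oldest entry first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
```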
Synchronous vs Asynchronous Processing
(Diagram: example event user.registered)
- Topic: Named stream of events
- Partition: Ordered, immutable log (parallelism unit)
- Consumer Group: Multiple consumers sharing a topic's partitions; each partition is read by only one consumer in the group
- Offset: Position of a message in a partition
- Retention: Messages kept for N hours/days
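How a consumer group spreads partitions across its members can be illustrated with a round-robin assignment. This is a simplified sketch, not Kafka's actual assignor logic; `assign_partitions` is a hypothetical helper.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

With more consumers than partitions, the extra consumers receive nothing, which is why the partition count caps a group's parallelism.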
- High-throughput event streaming
- Multiple consumers per event
- Replay capability needed
- Audit log / event sourcing
- Fraud detection pipelines
- Real-time analytics
| Factor | Kafka | RabbitMQ |
|---|---|---|
| Model | Log-based (pull) | Queue-based (push) |
| Throughput | Millions/sec | Thousands/sec |
| Message replay | ✅ Yes (retention) | ❌ No (consumed = gone) |
| Complex routing | Basic (topic-based) | ✅ Rich (exchanges, bindings) |
| Best for | Event streaming, audit logs, payment pipelines | Task queues, RPC, job workers |
| Ecosystem | spring-kafka / confluent-kafka-python / sarama | spring-amqp / pika / amqp-client |
| Guarantee | Meaning | Risk | Use |
|---|---|---|---|
| At-Most-Once | Message may be lost, never duplicated | Data loss | Metrics, logs (loss OK) |
| At-Least-Once | Delivered ≥1 times, may duplicate | Duplicate processing | Most systems (idempotent consumers) |
| Exactly-Once | Effect applied exactly once (usually at-least-once delivery plus idempotent or transactional processing) | Performance cost | Payments, ledger (critical) |
CAP Theorem — You can only choose 2 of 3
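Network partitions cannot be ruled out in a distributed system, so partition tolerance is not optional. The real choice during a partition is CP (consistent but may be unavailable) vs AP (available but may be stale).
Pick the consistency model per use case: payments need linearizability (the strongest model), user profile updates can be eventually consistent, and chat message ordering can be causal.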
Regular hashing (key % N) breaks when you add or remove servers because almost every key remaps. Consistent hashing places both servers and keys on a ring, so only about K/N keys migrate when a server changes (K = keys, N = servers).
Used by: Cassandra, DynamoDB, and Memcached client libraries. Redis Cluster uses a related but different scheme: 16,384 fixed hash slots assigned to nodes. Virtual nodes improve distribution uniformity.
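A hash ring with virtual nodes can be sketched as follows. This is an illustrative implementation under simple assumptions (MD5 as the ring hash, 100 virtual nodes per server); the names `HashRing` and `_hash` are hypothetical and do not mirror any particular database's internals.

```python
import bisect
import hashlib


def _hash(value):
    # Any uniform hash works; MD5 is used here for spread, not security.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each server appears at many ring positions (virtual nodes).
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        # Assumes the ring holds at least one node.
        idx = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

Adding a fourth server to a three-server ring moves only the keys that now land on the new server's virtual nodes; every other key keeps its owner.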
| Failure | Description | Mitigation |
|---|---|---|
| Single Point of Failure | One component takes down the whole system | Redundancy, active-active setup |
| Cascading Failure | One service failure overloads dependents | Circuit breaker, bulkhead pattern |
| Hot Partition | One shard gets disproportionate traffic | Better shard key, random suffix, celebrity handling |
| Split Brain | Two nodes both think they're the primary | Consensus protocol (Raft/Paxos), odd number of nodes |
| Network Partition | Nodes can't communicate | CAP trade-off choice, timeout + retry |
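The circuit breaker named in the table can be sketched as a small wrapper around outbound calls. This is a simplified illustration (closed, open, and a single-trial half-open state); the class name, thresholds, and injectable `clock` parameter are assumptions, not the API of Resilience4j or any other library.

```python
import time


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None                # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None            # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()   # trip (or re-trip) the breaker
            raise
        self.failures = 0                    # success closes the circuit
        return result
```

Failing fast while the circuit is open keeps a struggling dependency from tying up the caller's threads, which is how the pattern stops a cascade.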
Clarify Requirements (2–3 min)
Don't start by drawing. Ask: Who uses this? What scale? Which features are MVP vs nice-to-have? What are the SLA requirements (latency, availability)? Interviewers love this because it shows senior thinking. Example: "Before I start, let me clarify: are we designing for global users or India-only? Do we need real-time delivery receipts or is eventual OK?"
Estimate Scale (3–5 min)
Order-of-magnitude math. 100M users × 10 req/day = 1B req/day ÷ 86,400 sec ≈ 11,500 RPS. Storage: 100M users × 1KB profile = 100GB. This tells you: do you need caching? Sharding? CDN? Always walk the interviewer through your estimates — the process matters more than precision.
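The arithmetic above can be checked in a few lines. The inputs are the ones from the paragraph; the variable names are just for illustration.

```python
SECONDS_PER_DAY = 86_400

users = 100_000_000
requests_per_user_per_day = 10
profile_bytes = 1_000                        # 1 KB per profile

requests_per_day = users * requests_per_user_per_day
average_rps = requests_per_day / SECONDS_PER_DAY
storage_gb = users * profile_bytes / 1e9

print(f"{requests_per_day:,} req/day")
print(f"~{average_rps:,.0f} RPS on average")
print(f"{storage_gb:,.0f} GB of profile data")
```

This is an average rate; the exact figure (about 11,574 RPS) matters less than knowing it is roughly ten thousand rather than one hundred or one million.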
High-Level Design (5–8 min)
Draw the skeleton: Client → API Gateway → Services → Cache → Database. Start simple. Don't jump to microservices immediately — start monolith, then extract services if scale demands. Label every arrow (HTTP, WebSocket, gRPC, Kafka).
Standard High-Level Template
(Diagram: Mobile/Web clients, a gateway handling Auth · Rate Limit, stateless services)
Deep Dive a Component (10–15 min)
The interviewer will direct you — follow their lead. Common deep-dives: database schema design, API contract, caching strategy, failure handling, consistency guarantees. This is where you show seniority. Discuss trade-offs explicitly: "I chose X over Y because of Z, and the trade-off is W."
Identify Bottlenecks & Improvements (5 min)
Proactively identify where the system will break: "The DB write path is the likely bottleneck at 10,000 RPS. I'd shard by user_id. The read path can be cached in Redis with a 5-min TTL. For global users, add CDN and regional read replicas." This is the separator between mid-level and senior candidates.
"I'd start simple with X, then scale to Y when..."
"Failure mode here would be... mitigated by..."
"Consistency requirement for this is [strong/eventual] because..."
"I'd instrument this with [Prometheus/Grafana] to detect..."
"I'll use microservices" (without justifying)
Jumping to Cassandra when PostgreSQL suffices
"I'll use Kubernetes" as a solution to everything
Not asking clarifying questions at the start
1. URL Shortener (TinyURL)
BEGINNER
Key Questions to Clarify
- Custom short codes or random?
- Analytics needed (click counts)?
- Expiry / TTL on URLs?
- Read:Write ratio? (Typically 100:1 — very read-heavy)
Architecture
(Diagram: cache of shortcode→URL entries, url_mappings table)
Short Code Generation
- Option A: MD5/SHA256 hash of the long URL → take the first 7 chars. Collisions are rare but must be handled with a retry.
- Option B: Auto-increment ID → Base62 encode (a-z, A-Z, 0-9). 7 chars of Base62 = 62^7 ≈ 3.5 trillion URLs.
Option B is preferred: no collisions and a predictable key space.
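Base62 encoding of an auto-increment ID can be sketched as below. The alphabet order follows the one listed above (a-z, A-Z, 0-9); the function names are hypothetical.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 symbols
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def encode_base62(n):
    """Turn a non-negative integer ID into a short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code):
    """Recover the integer ID from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Because the mapping is reversible, the redirect path can decode the short code back to the primary key instead of storing the code as a separate indexed column.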
Key Trade-offs
- Redis caches hot URLs — 80% traffic served from cache
- 302 redirect (temporary) vs 301 (permanent): browsers cache a 301, which offloads the server but loses click analytics
- Shard by short code hash if scale demands it
2. Rate Limiter
BEGINNER
Algorithms
| Algorithm | Pros | Cons |
|---|---|---|
| Token Bucket | Burst allowed, smooth | Race conditions across distributed nodes unless updates are atomic |
| Sliding Window Counter | Accurate, fair | Memory per user |
| Fixed Window Counter | Simple | Boundary burst problem |
| Leaky Bucket | Consistent output rate | Drops bursts |
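The token bucket row can be sketched as a single-process class. This is an in-memory illustration only (a distributed version would keep the same state in Redis and update it atomically); `TokenBucket` and its injectable `clock` parameter are hypothetical.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)        # start full, so bursts are allowed
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

`capacity` sets the burst size and `refill_per_sec` sets the sustained rate, which is the trade-off the table summarizes as "burst allowed, smooth".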
Redis Implementation
Distributed Rate Limiter
A single Redis instance is a SPOF, so use Redis Cluster or replication with failover for availability, and Lua scripts to make the check-and-update step atomic. For global rate limiting, use a distributed counter with sticky sessions or a global Redis.
3. Notification Service
BEGINNER
Architecture
(Diagram: delivery channels through SendGrid, Twilio, FCM/APNs, and WebSocket)
Key Design Points
- Fan-out: one event → multiple channels
- Idempotency: deduplicate with notification_id to avoid double-send
- User preferences: respect opt-out per channel per category
- Retry with exponential backoff for failed deliveries
- Dead letter queue (DLQ) for permanently failed messages
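The last two points, retry with exponential backoff and a dead letter queue, can be sketched together. This is a minimal illustration; `backoff_delay`, `deliver`, the `send` callable, and the list used as a DLQ are all hypothetical stand-ins.

```python
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Seconds to wait after the given failed attempt: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def deliver(send, notification, dead_letters, max_retries=3, sleep=time.sleep):
    """Try to send; back off between retries; park the message in the DLQ on exhaustion."""
    for attempt in range(max_retries + 1):
        try:
            return send(notification)
        except Exception:
            if attempt < max_retries:
                sleep(backoff_delay(attempt))
    dead_letters.append(notification)        # retries exhausted
    return None
```

Passing `sleep` as a parameter keeps the function testable; a real worker would usually reschedule the message on a delay queue instead of blocking.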
4. Chat System (WhatsApp)
INTERMEDIATE
Connection Layer
Use WebSockets (persistent, bidirectional) for real-time messaging, with HTTP long-polling as a fallback. Each user connects to one chat server, so you need a routing layer that tracks which server each user is on.
(Diagram label: Kafka)
Message Storage
- Cassandra: ideal for chat — time-series, high write throughput, partition by conversation_id
- Schema: (conv_id, message_id, sender_id, content, timestamp, status)
- Message IDs: use Snowflake IDs (time-sortable, unique across nodes)
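A Snowflake-style ID packs a millisecond timestamp, a worker ID, and a per-millisecond sequence into 64 bits. The sketch below assumes the commonly described layout (41-bit timestamp, 10-bit worker, 12-bit sequence) and uses Twitter's original epoch as an example; the class name and `clock_ms` parameter are hypothetical.

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657      # example custom epoch; any fixed past instant works


class SnowflakeGenerator:
    def __init__(self, worker_id, clock_ms=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.clock_ms = clock_ms
        self.last_ms = -1
        self.sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self.clock_ms()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:            # 4096 IDs used this millisecond
                    while now <= self.last_ms:    # wait for the next millisecond
                        now = self.clock_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            # timestamp (high bits) | worker (10 bits) | sequence (12 bits)
            return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence
```

Because the timestamp occupies the high bits, sorting by ID sorts by creation time, so the ID can double as the clustering key within a conversation.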
Online Presence
Heartbeat every 30s → update Redis key "online:user_id" with TTL 60s. If key expires, user is offline. For "last seen": store timestamp in Redis, batch-persist to DB every 5 min.
5. BookMyShow (Seat Booking)
INTERMEDIATE
The Core Challenge
Concurrency: 10,000 users trying to book the last 5 seats simultaneously. You need to prevent double-booking without sacrificing performance.
Approaches
| Approach | Mechanism | Trade-off |
|---|---|---|
| Pessimistic Locking | SELECT ... FOR UPDATE on seat row | Serialized: safe but slow under load |
| Optimistic Locking | version column — retry on conflict | Fast but retry storms under high contention |
| Redis Distributed Lock | SET seat_id value NX EX ttl | Fast, handles contention, but Redis SPOF risk |
| Queue + Reserve | Virtual queue, 10-min hold TTL | Best UX — user gets time to pay |
Recommended: Temporary Seat Reservation
- User selects seat → Redis: SET seat:show_1:seat_A5 user_1001 NX EX 600 (10-min hold; NX makes the SET fail if another user already holds the seat)
- Payment completes → mark seat as BOOKED in PostgreSQL, remove Redis key
- Timeout → seat auto-released back to available pool
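The reservation flow above can be sketched with an in-memory stand-in for the Redis hold. The key property is set-if-absent with an expiry, so a second user cannot overwrite a live hold. `SeatHolds`, its `booked` dictionary (standing in for the PostgreSQL table), and the injectable `clock` are hypothetical.

```python
import time


class SeatHolds:
    """In-memory stand-in for a set-if-absent hold with a TTL."""

    def __init__(self, ttl=600, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._holds = {}             # seat -> (user_id, expires_at)
        self.booked = {}             # seat -> user_id (stand-in for the DB)

    def hold(self, seat, user_id):
        if seat in self.booked:
            return False             # already sold
        now = self.clock()
        current = self._holds.get(seat)
        if current and current[1] > now:
            return False             # a live hold already exists
        self._holds[seat] = (user_id, now + self.ttl)
        return True

    def confirm(self, seat, user_id):
        current = self._holds.get(seat)
        if not current or current[0] != user_id or current[1] <= self.clock():
            return False             # hold missing, expired, or owned by someone else
        self.booked[seat] = user_id  # payment succeeded: persist the booking
        del self._holds[seat]
        return True
```

Checking ownership and expiry again at confirmation time matters: a payment that completes after the hold lapsed must not claim a seat that someone else may now hold.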
6. News Feed (Facebook/Instagram)
INTERMEDIATE
Fan-out Strategies
| Strategy | How | Best for |
|---|---|---|
| Fan-out on Write | On post, push to all followers' feeds immediately | Users with small follower counts |
| Fan-out on Read | On feed load, fetch and merge followed users' posts | Celebrity users (millions of followers) |
| Hybrid | Fan-out on write for normal users, on read for celebrities | Production (Facebook, Instagram) |
Feed Storage
Redis Sorted Set per user: key = "feed:user:1001", score = timestamp. Store post_ids, not full content. Fetch top 20 post_ids → batch fetch post content from cache/DB.
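The hybrid strategy can be sketched in memory: posts from normal authors are pushed into follower feeds at write time, while posts from celebrities are merged in at read time. Plain lists stand in for the Redis sorted sets, and `FeedService` and `celebrity_threshold` are hypothetical names with an arbitrary default cutoff.

```python
import heapq
from collections import defaultdict


class FeedService:
    def __init__(self, celebrity_threshold=10_000):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)     # author -> followers
        self.following = defaultdict(set)     # user -> authors
        self.posts = defaultdict(list)        # author -> [(ts, post_id)]
        self.feeds = defaultdict(list)        # user -> [(ts, post_id)] pushed on write

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def publish(self, author, post_id, ts):
        self.posts[author].append((ts, post_id))
        if not self.is_celebrity(author):
            for user in self.followers[author]:      # fan-out on write
                self.feeds[user].append((ts, post_id))

    def feed(self, user, limit=20):
        entries = set(self.feeds[user])              # set removes duplicates
        for author in self.following[user]:
            if self.is_celebrity(author):            # fan-out on read
                entries.update(self.posts[author])
        # Newest first; only post_ids are returned, content is fetched separately.
        return [post_id for _, post_id in heapq.nlargest(limit, entries)]
```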
7. YouTube / Netflix
ADVANCED
Video Upload Pipeline
(Diagram: uploads are transcoded into 360p, 720p, 1080p, and 4K renditions)
Adaptive Bitrate Streaming (ABR)
Store video in HLS/DASH chunks. Player requests manifest file → selects quality based on current bandwidth → streams chunks. Redis caches popular video metadata. CDN serves video chunks from edge nodes nearest to user.
💳 Design a Payment Gateway
EXPERT
The Core Challenge: Exactly-Once Payments
If a payment request times out, did it go through? The user retries — you must never charge them twice. Solution: Idempotency Keys.
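An idempotency key lets the server recognize a retry and return the original outcome instead of charging again. The sketch below is a single-process illustration; `PaymentService`, `charge_fn`, and the in-memory result store are hypothetical, and a real system would persist keys in the database under a unique constraint.

```python
import threading


class PaymentService:
    def __init__(self, charge_fn):
        self._charge = charge_fn             # hypothetical call to the card processor
        self._results = {}                   # idempotency key -> stored response
        self._lock = threading.Lock()

    def pay(self, idempotency_key, account, amount):
        # The lock makes check-then-charge atomic within this one process.
        with self._lock:
            if idempotency_key in self._results:
                return self._results[idempotency_key]   # replay the first outcome
            result = self._charge(account, amount)
            self._results[idempotency_key] = result
            return result
```

The client generates the key once per payment intent and reuses it on every retry, so a timeout followed by a retry maps to the same stored result.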
Saga Pattern — Distributed Transactions
Across microservices, you can't use a single DB transaction. Use the Saga pattern:
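A saga runs a sequence of local steps, each paired with a compensating action that undoes it if a later step fails. A minimal orchestration sketch, assuming each step is a plain callable; `run_saga` is a hypothetical helper, not a library API.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order. Compensations should be idempotent,
            # since they may themselves be retried.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

For a payment, the pairs might be (debit payer, refund payer) followed by (credit merchant, reverse credit): if the credit fails, the refund runs and the money is not lost.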
Database Schema (Simplified)
What Fintech Interviewers Probe
- "What if the network fails after debit but before credit?" → Saga compensating transactions
- "How do you prevent duplicate charges?" → Idempotency keys
- "How do you ensure your ledger always balances?" → Double-entry accounting, event sourcing
- "How do you handle PCI compliance?" → Tokenization, never store raw card data
📒 Design a Banking Ledger (Event Sourcing + CQRS)
EXPERT
Event Sourcing
Instead of storing current balance (mutable state), store every event that changed it. Current balance = replay all events.
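Deriving a balance by replaying events can be sketched as below. The `Event` shape and the `balance` function are hypothetical; amounts are integers in minor units to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    account: str
    kind: str          # "credit" or "debit"
    amount: int        # minor units, e.g. paise or cents


def balance(events, account):
    """Current balance = fold over every event for the account."""
    total = 0
    for event in events:
        if event.account != account:
            continue
        total += event.amount if event.kind == "credit" else -event.amount
    return total
```

Since events are never modified, the same log also answers "what was the balance at time T?" by replaying only a prefix of it.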
CQRS (Command Query Responsibility Segregation)
Separate write model (commands: debit, credit) from read model (queries: current balance, transaction history). Write to event store → async project to read-optimized views in Redis/PostgreSQL.
(Diagram: command side validates + writes event → append-only event store → projections update read views)
Double-Entry Accounting
Every financial transaction debits one account and credits another. The sum of all debits must always equal sum of all credits — this is how you verify ledger integrity. Never violate this rule in a banking system design.
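The invariant can be checked mechanically. In this sketch every transfer writes a debit leg and a credit leg together; `Ledger` is a hypothetical class, and the `balances` method assumes every account is a customer balance that credits increase (asset accounts use the opposite sign).

```python
from collections import defaultdict


class Ledger:
    def __init__(self):
        self.entries = []            # append-only rows: (account, debit, credit)

    def transfer(self, from_account, to_account, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        # Both legs are written together; one without the other is a bug.
        self.entries.append((from_account, amount, 0))   # debit leg
        self.entries.append((to_account, 0, amount))     # credit leg

    def is_balanced(self):
        total_debits = sum(debit for _, debit, _ in self.entries)
        total_credits = sum(credit for _, _, credit in self.entries)
        return total_debits == total_credits

    def balances(self):
        totals = defaultdict(int)
        for account, debit, credit in self.entries:
            totals[account] += credit - debit
        return dict(totals)
```

A periodic job that asserts `is_balanced()` catches any write path that recorded only one leg.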
🚨 Design a Fraud Detection Platform
EXPERT
Architecture
(Diagram: rule engine (e.g. "velocity > 5/min"), stream processing with Flink / Spark, risk scoring)
Features to store in feature store (Redis)
- Transaction velocity: number of transactions in last 1min / 5min / 1hr
- Geolocation anomaly: user usually pays in Pune, now paying in London
- Merchant category risk score
- Device fingerprint change
- Time-of-day anomaly
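The first feature, transaction velocity, is a sliding-window count per user. A minimal in-memory sketch, assuming timestamps in seconds; `VelocityTracker` and `velocity_rule` are hypothetical, and a production feature store would keep the same window in Redis.

```python
from collections import defaultdict, deque


class VelocityTracker:
    def __init__(self, window_sec=60):
        self.window_sec = window_sec
        self._events = defaultdict(deque)    # user_id -> timestamps inside the window

    def record(self, user_id, ts):
        """Record a transaction at time ts and return the count within the window."""
        timestamps = self._events[user_id]
        timestamps.append(ts)
        while timestamps and timestamps[0] <= ts - self.window_sec:
            timestamps.popleft()             # drop events older than the window
        return len(timestamps)


def velocity_rule(count, limit=5):
    """Flag when the window holds more than `limit` transactions."""
    return count > limit
```

Running one tracker per window size (1 min, 5 min, 1 hr) yields the three velocity features with the same code.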
Latency requirement
Fraud decision must happen in <100ms for real-time payment approval. Pre-compute features in Redis. Serve ML model via low-latency inference endpoint (not batch). Rule engine executes in memory.
Database Cheatsheet
- Payments/Banking → PostgreSQL (ACID)
- User profiles/Catalog → MongoDB
- Time-series/Chat → Cassandra
- Session/Cache → Redis
- Search → Elasticsearch
- Graph relationships → Neo4j
- Analytics/OLAP → Redshift/BigQuery
Caching Cheatsheet
- Read-heavy → Cache-Aside (Redis)
- Fresh data on every read → Write-Through
- High-frequency, non-critical writes → Write-Back
- Feed ranking → Sorted Set (ZADD)
- Session storage → String + TTL
- Rate limiting → INCR + EXPIRE
- Distributed lock → SET key value NX EX ttl
Messaging Cheatsheet
- Event streaming / audit → Kafka
- Task queue / RPC → RabbitMQ
- Payments → Exactly-once (Kafka tx)
- Notifications → Kafka fan-out
- Dead letter queue for failures
- Idempotency key for dedup
- Retry + exponential backoff
Scale Numbers
- 1M users → single DB fine
- 10M users → add read replicas
- 100M users → sharding needed
- 1B users → multi-region
- 10K RPS → Redis mandatory
- 100K RPS → Kafka + sharding
- 1M RPS → CDN + edge compute
Fintech Must-Knows
- Idempotency key on every payment
- Saga pattern for distributed tx
- Event sourcing for audit trail
- CQRS for read/write separation
- Double-entry accounting always
- Optimistic lock for balance updates
- Never store raw card data (PCI)
Reliability Patterns
- Circuit Breaker (Resilience4j)
- Bulkhead — isolate failures
- Retry with backoff + jitter
- Timeout on every external call
- Health checks + readiness probes
- Graceful degradation
- Blue-green / canary deploys
Interview Steps
- 1. Clarify requirements (2-3 min)
- 2. Estimate scale (3-5 min)
- 3. High-level design (5-8 min)
- 4. Deep dive (10-15 min)
- 5. Bottlenecks (5 min)
- Always state trade-offs
- Start simple, evolve design
Back-of-Envelope
- 1 day = 86,400 sec (~100K sec)
- 1M req/day = ~12 RPS
- 10M req/day = ~120 RPS
- 1B req/day = ~11,500 RPS
- 1 char = 1 byte (ASCII; UTF-8 uses 1 to 4 bytes)
- 1 tweet = ~300 bytes
- 1 photo = ~300 KB avg