The Question
DesignDesign a High-Throughput Social Media Feed System
Design a social media platform similar to Twitter (X) that supports real-time tweet posting and home timeline generation for 300 million daily active users. The system must handle extreme write amplification caused by 'celebrity' accounts with millions of followers while maintaining sub-200ms read latency for the feed. Detail your strategy for the 'fan-out' process, storage sharding, and how you would balance consistency and availability during peak traffic events (e.g., major sporting events).
PostgreSQL
Redis
Kafka
S3
Snowflake ID
CDN
gRPC
Microservices
Questions & Insights
Clarifying Questions
Scale: What is the scale in terms of Daily Active Users (DAU) and Tweet volume? (Assumption: 300M DAU, 500M tweets/day).
Social Graph: What is the average and max follower count? (Assumption: Avg 200, Max 100M+ for celebrities).
Timeline Type: Are we building a chronological or algorithmic timeline? (Assumption: Chronological for MVP).
Content: Does the MVP include media (images/video)? (Assumption: Yes, stored in Object Storage with metadata in DB).
Read/Write Ratio: Is the system read-heavy? (Assumption: Yes, 100:1 read-to-write ratio).
Thinking Process
Core Bottleneck: The "Fan-out" problem. How do we deliver a single tweet to millions of followers' timelines without crashing the system?
Step 1: The Write Path: How to ingest tweets efficiently and store them durably?
Step 2: The Fan-out Mechanism: How to distribute tweets to followers? (Push vs. Pull vs. Hybrid).
Step 3: The Read Path: How to serve the Home Timeline with sub-100ms latency?
Step 4: Celebrity Handling: How to manage "hot keys" when a user with 100M followers tweets?
Bonus Points
Hybrid Fan-out: Using a "Push" model for regular users and a "Pull" model for celebrities to prevent Write Amplification and "Thundering Herd" on Redis.
Geo-Sharding: Sharding the Social Graph and Timeline cache by user proximity to minimize cross-region tail latency.
Storage Optimization: Using Snowflake-like IDs for tweets to ensure K-sortable ordering and efficient range scans in the database.
Feed Ranker (Future-proofing): Separating the Fan-out logic from the Ranking logic to allow for a transition from chronological to ML-driven timelines.
Design Breakdown
Functional Requirements
Core Use Cases:
Users can post tweets (text + media).
Users can follow/unfollow other users.
Users can view a "Home Timeline" (tweets from people they follow).
Users can view a "User Timeline" (their own tweets).
Scope Control:
In-Scope: Posting, Following, Home/User Timelines, Basic Media.
Out-of-Scope: Search, Real-time notifications (WebSockets), Retweets/Likes (for MVP simplicity), Direct Messages.
Non-Functional Requirements
Scale: Handle 500M Tweets/day and 30B Read requests/day.
Latency: Home Timeline generation < 200ms; Tweet posting < 500ms.
Availability & Reliability: 99.9% uptime (High availability is critical for a social platform).
Consistency: Eventual consistency for timelines (it is okay if a tweet shows up 2 seconds late).
Fault Tolerance: No single point of failure in the fan-out pipeline.
Estimation
Traffic Estimation:
Write QPS: 500M / 86400s \approx 6,000 QPS.
Peak Write QPS: 12,000 QPS.
Read QPS (Home Timeline): 300M DAU * 10 visits \approx 35,000 QPS.
Storage Estimation:
Tweet size: 280 chars + metadata \approx 1 KB.
Daily Tweet Storage: 500M * 1 KB = 500 GB/day.
5-year storage: 500 GB 365 5 \approx 900 TB.
Bandwidth Estimation:
Ingress: 6,000 * 1 KB = 6 MB/s (excluding media).
Egress: 35,000 20 tweets 1 KB = 700 MB/s.
Blueprint
Concise Summary: A microservices architecture leveraging an asynchronous fan-out pipeline. It uses a Hybrid Push/Pull model to balance write amplification and read latency.
Major Components:
Tweet Service: Handles tweet ingestion, validation, and metadata persistence.
Social Graph Service: Manages follow/unfollow relationships and fetches follower lists.
Fan-out Workers: Asynchronously propagates tweets to followers' pre-computed timeline caches.
Timeline Service: Aggregates and serves the Home/User timelines from cache or DB.
Redis Cache: Stores pre-computed Home Timelines for fast retrieval.
Simplicity Audit: This design avoids complex graph databases for the MVP, opting for relational sharding and Redis-based pre-computation which covers 90% of the core value.
Architecture Decision Rationale:
Why this architecture?: Pre-computing timelines (Push) is essential for a read-heavy system like Twitter, while the Pull fallback for celebrities prevents system collapse.
Functional Satisfaction: Covers the full "post-follow-view" loop.
Non-functional Satisfaction: Redis ensures low latency; Kafka ensures high-throughput fan-out; Horizontal scaling of services ensures availability.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery: Use CDN (e.g., Cloudflare/Akamai) for static assets and media files to reduce latency for global users.
API Gateway: Handles AuthN/AuthZ (JWT), Rate Limiting (e.g., 100 tweets/hour per user), and Request Routing to specific microservices.
Load Balancing: L7 Load Balancer (NGINX/Envoy) to distribute traffic to service instances.
Service
Tweet Service:
Logic: Validates tweet length and media links. Writes to
Tweet DB. Pushes tweet_id and author_id to Kafka.Protocol: gRPC for internal service communication; REST for external.
Social Graph Service:
Logic: Manages the
Followers and Following tables. Scaling: Sharded by
follower_id to quickly find who a user is following.Timeline Service:
Logic: For regular users, fetches from
Timeline Cache (Redis). For users following celebrities, fetches the celebrity tweets from Tweet DB and merges them on-the-fly (Hybrid Pull).Fan-out Workers:
Logic: Consumes from Kafka. Fetches followers from
Social Graph Service. Updates Timeline Cache for each follower.Celebrity Filter: If
author_id is a celebrity (Followers > 1M), skip the fan-out to prevent Kafka congestion.Storage
Tweet DB (PostgreSQL/Postgres-XL):
Access Pattern: Write-heavy (Ingestion) and Read-heavy (User Timelines).
Schema:
tweet_id (BigInt, Primary Key, Snowflake ID)user_id (BigInt, Index)content (Text)created_at (Timestamp, Index)Sharding: Sharded by
user_id to keep a user's tweets on the same physical node.Graph DB (PostgreSQL/MySQL):
Schema:
follower_id (BigInt)followee_id (BigInt)Unique Index:
(follower_id, followee_id)Sharding: Sharded by
follower_id.Cache
Purpose: Home Timelines are extremely read-heavy. Pre-computing them in Redis avoids massive JOINs across tables.
Key-Value Schema:
Key:
TL:{user_id}Value:
List<tweet_id> (Redis List or Sorted Set).TTL: 72 hours (Inactive users don't need cached timelines).
Failure Handling: If Redis is cold or down, the Timeline Service falls back to a DB query (Query people user follows -> Get tweets -> Sort by time).
Messaging
Purpose: Decouples the Tweet Service from the expensive fan-out process.
Topic Schema:
tweet_events { tweet_id, author_id, timestamp }.Throughput: 12k Peak QPS. Kafka handles this easily.
Partitioning: Partition by
author_id to ensure ordering of tweets from the same user.Wrap Up
Advanced Topics
Trade-offs (Push vs. Pull):
Push: Low read latency, but high write amplification for celebrities (1 tweet = 100M writes).
Pull: No write amplification, but high read latency (JOIN on read).
Solution: Hybrid Approach. Most users use Push. Celebrities are "pulled" at read-time and merged into the user's cached timeline.
Reliability: If a fan-out worker fails, Kafka keeps the offset. We can retry the fan-out for that batch.
Bottleneck - Hot Keys: Celebrities are hot keys in the Social Graph. We use a local cache (LRU) in the Tweet Service to identify celebrities and skip fan-out.
Security: Data is encrypted in transit via TLS. PII (emails/IPs) in the User table is encrypted at rest.
Optimization: For the Home Timeline, we only store
tweet_ids in Redis. The actual tweet content is fetched from a separate Tweet Cache (Redis) to save memory (De-duplication).