The Question
DesignDesign a Scalable Short-Form Video Platform
Design a system similar to TikTok that supports high-volume video uploads, asynchronous video processing (transcoding), and a personalized recommendation feed for millions of users. The design should address low-latency video playback globally, efficient handling of 'viral' content, and the trade-offs between real-time feed updates and system performance. Discuss how you would handle massive storage requirements and ensure high availability of the feed service.
S3
Kafka
Redis
Cassandra
CDN
HLS
DASH
FFMPEG
Spark
NoSQL
JWT
Questions & Insights
Clarifying Questions
What is the primary content format and scale? (Assumption: Short-form videos up to 60 seconds. Target 500M DAU with a 1:100 creator-to-viewer ratio).
How critical is the recommendation engine for the MVP? (Assumption: A "For You" feed is mandatory. We will implement a basic asynchronous scoring mechanism rather than a complex real-time deep learning model for the MVP).
What are the latency requirements for video playback? (Assumption: Start-up latency < 500ms. Global delivery via CDN is required).
Are social features like "Live Streaming" or "Real-time Messaging" in scope? (Assumption: No. Out-of-scope for MVP to satisfy YAGNI).
What is the consistency requirement for view counts and likes? (Assumption: Eventual consistency is acceptable for engagement metrics to prioritize availability).
Thinking Process
Core Bottleneck: The system is read-heavy (feed consumption) and write-intensive (video uploads/transcoding). The "Cold Start" problem for new videos is the primary challenge.
Strategy Steps:
Asynchronous Ingestion: Use a message-driven architecture to decouple video uploads from the heavy transcoding and metadata indexing process.
Edge-Heavy Delivery: Leverage CDNs for both caching video segments and terminating SSL/TLS closer to the user to reduce "Time to First Byte".
Pre-computed Feeds: Shift the heavy lifting of recommendations from "Read-time" to "Write-time" or "Background-time" using a ranking worker that populates a Redis-based Feed Cache.
Tiered Storage: Store raw videos in high-durability object storage and serve transcoded segments via optimized edge-friendly formats (HLS/DASH).
Bonus Points
Video Bitrate Adaptation: Implementing HLS/DASH with multiple resolution ladders to handle users on varying network conditions (3G vs. Fiber).
Vector Embeddings for Recs: Using a Vector Database (like Milvus or Pinecone) to perform K-Nearest Neighbor (KNN) searches for content similarity in the recommendation pipeline.
Quic/HTTP3 Protocol: Utilizing UDP-based protocols at the Edge to minimize head-of-line blocking for video segment streaming.
Write-back Cache for Metrics: Using a "Buffered Counter" pattern to batch "Like" and "View" increments in memory before flushing to the persistent DB to prevent hot-partition issues.
Design Breakdown
Functional Requirements
Core Use Cases:
Users can record and upload short videos.
Users can scroll through a "For You" feed (infinite scroll).
Users can follow other creators and see a "Following" feed.
Users can "Like" videos.
Scope Control:
In-Scope: Video upload, transcoding, feed generation, and global delivery.
Out-of-Scope: Live streaming, video editing tools, ads platform, and direct messaging.
Non-Functional Requirements
Scale: Support 500M DAU and 5M video uploads per day.
Latency: Feed generation < 200ms; Video start-up < 500ms.
Availability & Reliability: 99.9% (High availability for the feed is more critical than upload availability).
Consistency: Eventual consistency for feed updates and engagement counts.
Fault Tolerance: Automatic retry on transcoding failure; multi-AZ storage for video blobs.
Estimation
Traffic Estimation:
Read QPS: 500M DAU * 40 videos/day = ~230k QPS (Read).
Write QPS: 5M uploads/day = ~60 QPS (Write - low QPS but high bandwidth).
Storage Estimation:
5M videos/day * 20MB (avg compressed) = 100 TB/day.
36.5 PB per year (before replication/multi-resolution overhead).
Bandwidth Estimation:
Ingress: 60 QPS * 20MB = 1.2 GB/s.
Egress: 230k QPS * 2MB (initial segment) = 460 GB/s (handled by CDN).
Blueprint
Concise Summary: A globally distributed video platform using an asynchronous "upload-and-notify" pipeline and a pre-computed feed cache for low-latency discovery.
Major Components:
CDN: Caches video segments (TS files) globally to reduce origin load.
Upload Service: Handles multipart uploads and persists raw files to S3.
Transcoding Pipeline: Decoupled via Kafka, converts raw video into multiple resolutions.
Feed Service: Fetches pre-ranked video IDs from Redis for immediate user consumption.
Metadata Store: A NoSQL database (Cassandra) storing video attributes and user relations.
Simplicity Audit: This design avoids complex real-time ranking by using a background worker to populate feeds, fulfilling the MVP need for speed without a massive ML infrastructure.
Architecture Decision Rationale:
Why this?: Short-form video success depends on "instant-on" playback and relevant content. Pre-computation and CDNs are the standard for achieving this at scale.
Functional Satisfaction: Covers the full lifecycle from creation to consumption.
Non-functional Satisfaction: Scalable via horizontal service scaling and decoupled via messaging; low latency via Redis and CDN.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery & Traffic Routing:
CDN: Use CloudFront or Akamai. Static assets (JS/CSS) and dynamic video segments (.ts) are cached. Use signed URLs for security.
DNS: Latency-based routing to the nearest regional Load Balancer.
Security & Perimeter:
API Gateway: Handles JWT-based Authentication.
Rate Limiting: Applied at the user level (e.g., 5 uploads/hour) to prevent spam.
Service
Topology & Scaling:
Stateless microservices deployed across multiple Availability Zones (AZs).
Scaling based on CPU/Request count for Feed Service; Scaling based on Queue Depth for Workers.
API Schema Design:
POST /v1/videos/upload: Multipart upload, returns video_id.GET /v1/feed?type=foryou: Returns a list of video objects (URLs, metadata).POST /v1/videos/{id}/like: Idempotent like action.Resilience:
Circuit Breakers: If the Ranking Engine is down, Feed Service falls back to a "Global Popularity" static list.
Retries: Exponential backoff for video segment uploads.
Storage
Access Pattern: Heavy writes for metadata during upload; massive random reads for metadata during feed generation.
Database Table Design (Cassandra):
videos: video_id (PK), creator_id, description, s3_url, created_at.user_videos: user_id (PK), video_id, timestamp. (For "Following" feed).Technical Selection: Cassandra.
Rationale: High write throughput and easy linear scaling. Partitioning by
video_id ensures no single hot node for metadata.Distribution Logic: Sharded by
user_id for user-specific data and video_id for global video metadata.Cache
Purpose: Reducing latency for "For You" feeds.
Key-Value Schema:
Key:
feed:foryou:{user_id}Value:
List<video_id> (Max 200 items).TTL: 30 minutes.
Technical Selection: Redis. Use Redis Clusters for high availability.
Failure Handling: If Redis is empty, fetch a "warm start" list from the Metadata DB based on popularity.
Messaging
Purpose: Decouples the user-facing upload from the long-running transcoding and ML tasks.
Topic Schema:
video-uploaded-event containing video_id and raw_s3_path.Throughput & Partitioning: Partitioned by
video_id to ensure ordering for specific video processing stages.Technical Selection: Kafka. High durability and replayability for failed transcoding jobs.
Data Processing
Processing Model:
Transcoding: Worker pulls from Kafka, downloads from S3, uses FFMPEG to create 360p, 720p, 1080p versions, and pushes back to S3.
ML Ranking: Spark Streaming job aggregates user interactions (likes, views) to update user preference vectors.
Technical Selection: Spark/Flink for real-time ranking updates; FFMPEG-based custom workers for transcoding.
Wrap Up
Advanced Topics
Trade-offs: We chose Eventual Consistency for engagement counts (likes/views). A user might see 100k likes while another sees 100,005. This allows us to use a write-back cache and avoid locking the DB on every single "like" click.
Reliability: Transcoding is the most brittle part. We use a Dead Letter Queue (DLQ) in Kafka. If a video fails to transcode 3 times, it's flagged for manual review or lower-quality fallback.
Bottleneck Analysis: The "Following" feed is a fan-out problem. For the MVP, we use the "Pull" model (fetching from followed users at read-time) rather than "Push" (delivering to all followers' feeds) to save storage and simplify the architecture.
Optimization: Video Chunking. Instead of waiting for the full video to upload, the client can stream chunks to the Upload Service, which pipes them directly to S3. This improves the perceived speed for users on slow connections.