The Question
Design Video Streaming Platform
Design a large-scale video streaming platform similar to YouTube. The system should handle video upload and transcoding pipelines, CDN-based adaptive bitrate delivery, search and discovery, and personalized recommendations for a global user base.
CDN
S3
FFmpeg Transcoding
ABR Streaming
NoSQL
Questions & Insights
Thinking Process
To design a high-scale video platform like YouTube, focus on the asymmetry between write (upload) and read (view) operations.
Asynchronous Processing: Video processing is slow; never do it in the request-response cycle. Use a message queue to decouple uploads from transcoding.
Storage Hierarchy: Use Blob Storage for heavy assets and a high-throughput NoSQL or optimized SQL database for metadata.
Content Delivery (CDN): Latency is the enemy. Moving content to the edge is mandatory for a global user base.
Progressive Discovery Questions:
How do we handle multi-gigabyte uploads without blocking the user? (Chunked uploads + Async Workers).
How do we support diverse devices (4K TV vs. 3G Mobile)? (Transcoding into multiple resolutions/bitrates).
How do we scale the viewing experience to millions of concurrent users? (CDN + Adaptive Bitrate Streaming).
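The adaptive bitrate idea in the last answer can be illustrated with a minimal client-side sketch. The bitrate ladder values and the 0.8 safety factor are assumptions chosen for illustration, not figures from this design; real HLS/DASH players use more elaborate heuristics (buffer level, throughput smoothing).

```python
# Hypothetical bitrate ladder (kbps per rendition); values are assumptions.
LADDER_KBPS = {"360p": 800, "720p": 2800, "1080p": 5000}


def pick_rendition(throughput_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Pick the highest rendition whose bitrate fits within a safety margin
    of the measured throughput; fall back to the lowest rendition."""
    budget = throughput_kbps * safety
    affordable = [(kbps, name) for name, kbps in ladder.items() if kbps <= budget]
    if affordable:
        return max(affordable)[1]
    # Nothing fits: serve the cheapest rendition rather than stalling.
    return min(ladder, key=ladder.get)


if __name__ == "__main__":
    for measured in (10_000, 4_000, 500):
        print(measured, "kbps ->", pick_rendition(measured))
```

The player would re-run a selection like this after each segment download, which is what lets quality follow network conditions.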
Bonus Points
Adaptive Bitrate Streaming (ABR): Implement protocols like DASH or HLS that dynamically switch video quality based on the user's real-time network conditions.
Cost-Optimized Storage: Use object-storage lifecycle policies (or an automatic tiering class such as S3 Intelligent-Tiering) to move older, less popular videos to cold storage (e.g., S3 Glacier).
Geo-Sharding & Read Replicas: Use globally distributed read replicas for metadata to ensure low latency for "Video Title/Description" lookups in different regions.
Video Deduplication: Use a content hash to detect byte-identical uploads, and perceptual hashing to catch re-encoded near-duplicates; at this scale the storage savings can be substantial.
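For the deduplication point, here is a minimal sketch of the simpler case: exact-duplicate detection with a streamed SHA-256 digest. It does not implement perceptual hashing, which needs frame-level analysis; `DedupIndex` and its in-memory dictionary are hypothetical stand-ins for a real index.

```python
import hashlib
import io


def content_digest(stream, block_size=1 << 20):
    """Hash a file-like object in 1 MiB blocks so large videos never sit fully in memory."""
    h = hashlib.sha256()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        h.update(block)
    return h.hexdigest()


class DedupIndex:
    """Maps content digest -> first VideoID seen with that content (in-memory stand-in)."""

    def __init__(self):
        self._by_digest = {}

    def register(self, video_id, stream):
        """Return the canonical VideoID: the existing one if the bytes were seen before."""
        digest = content_digest(stream)
        return self._by_digest.setdefault(digest, video_id)


if __name__ == "__main__":
    index = DedupIndex()
    print(index.register("v1", io.BytesIO(b"same bytes")))
    print(index.register("v2", io.BytesIO(b"same bytes")))  # maps back to v1
```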
Design Breakdown
Functional Requirements
Video Uploading: Users can upload videos up to 1GB (MVP limit).
Video Streaming: Users can view videos in various resolutions (e.g., 360p, 720p, 1080p).
Search/Discovery: Basic search by video title.
Metadata Management: Users can add titles and descriptions.
Non-Functional Requirements
High Availability: Viewing must be 99.99% available (users hate playback errors).
High Scalability: System must handle spikes in uploads and massive concurrent views.
Low Latency: Playback should start in < 2 seconds.
Eventual Consistency: It is acceptable if a view count or new upload takes a few seconds to appear globally.
Estimation
Daily Active Users (DAU): 100 Million.
Upload Rate: 0.1% of users upload 1 video/day = 100,000 videos/day.
Storage per Video: Average 100MB (after transcoding) = 10TB/day.
Retention: 5 years = ~18PB total storage.
Read/Write Ratio: 100:1 (Heavy Read).
Egress Bandwidth: 10M concurrent viewers * 2Mbps = 20 Tbps (Requires heavy CDN reliance).
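The estimates above can be checked with a few lines of arithmetic. This sketch only restates the inputs already given (decimal units, 365-day years); the uploads-per-second figure is a derived extra.

```python
# Back-of-envelope capacity figures, using the inputs stated in the estimation.
DAU = 100_000_000
UPLOADER_FRACTION = 0.001        # 0.1% of users upload one video per day
AVG_VIDEO_MB = 100               # average stored size after transcoding
RETENTION_YEARS = 5
CONCURRENT_VIEWERS = 10_000_000
BITRATE_MBPS = 2

uploads_per_day = round(DAU * UPLOADER_FRACTION)
storage_tb_per_day = uploads_per_day * AVG_VIDEO_MB / 1_000_000    # MB -> TB
storage_pb_total = storage_tb_per_day * 365 * RETENTION_YEARS / 1_000  # TB -> PB
egress_tbps = CONCURRENT_VIEWERS * BITRATE_MBPS / 1_000_000        # Mbps -> Tbps
uploads_per_second = uploads_per_day / 86_400

if __name__ == "__main__":
    print(f"uploads/day:  {uploads_per_day:,}")
    print(f"storage/day:  {storage_tb_per_day:.1f} TB")
    print(f"5-year total: {storage_pb_total:.2f} PB")
    print(f"peak egress:  {egress_tbps:.1f} Tbps")
    print(f"uploads/sec:  {uploads_per_second:.2f}")
```

Average upload traffic is only about one video per second, while egress is measured in terabits per second, which is the read/write asymmetry the design is built around.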
Blueprint
Concise Summary: An asynchronous, event-driven architecture that separates the heavy video processing pipeline from the lightweight metadata and discovery services.
Major Components:
API Gateway: Entry point for authentication, rate limiting, and request routing.
Upload Service: Handles chunked file uploads and persists raw files to temporary storage.
Transcoding Workers: Process raw videos into multiple formats and resolutions.
Metadata DB: Stores video info (Title, URL, Uploader ID).
CDN: Distributes processed video segments to edge locations for low-latency streaming.
Simplicity Audit: This architecture omits complex recommendation engines and real-time commenting to focus on the core "Upload-Process-Watch" loop.
High Level Architecture
Sub-system Deep Dive
Service
Topology & Scaling: Services are deployed as Dockerized microservices on Kubernetes (K8s) and autoscaled with load.
Upload Service: Uses Chunked Uploads. The client sends 5MB parts with a SequenceID. This prevents full restart on network failure.
Metadata Service: RESTful API for fetching video details.
Search Service: For the MVP, simple prefix matching on titles, or a lookup against an Elasticsearch index.
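The chunked-upload behavior of the Upload Service can be sketched as follows. `UploadSession` is a hypothetical in-memory stand-in for server-side state; the point it illustrates is that after a network failure the client asks which SequenceIDs are missing and resends only those.

```python
from __future__ import annotations

from dataclasses import dataclass, field

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB parts, as in the Upload Service description


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (sequence_id, part) pairs covering the whole payload."""
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        yield seq, data[offset:offset + chunk_size]


@dataclass
class UploadSession:
    """Hypothetical server-side record of which parts have arrived."""
    total_chunks: int
    received: dict[int, bytes] = field(default_factory=dict)

    def put_chunk(self, seq: int, part: bytes) -> None:
        # Re-sending a part simply overwrites it, so client retries are safe.
        self.received[seq] = part

    def missing(self) -> list[int]:
        return [s for s in range(self.total_chunks) if s not in self.received]

    def assemble(self) -> bytes:
        if self.missing():
            raise ValueError(f"upload incomplete, missing parts: {self.missing()}")
        return b"".join(self.received[s] for s in range(self.total_chunks))


if __name__ == "__main__":
    payload = bytes(range(256)) * 10
    parts = dict(split_into_chunks(payload, chunk_size=1000))
    session = UploadSession(total_chunks=len(parts))
    session.put_chunk(0, parts[0])
    session.put_chunk(2, parts[2])      # part 1 lost to a simulated network failure
    for seq in session.missing():       # resume: resend only what is missing
        session.put_chunk(seq, parts[seq])
    print(session.assemble() == payload)
```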
Storage
Data Model (NoSQL/Key-Value): VideoID (PK), UploaderID, Title, Description, S3_URL, Status (Processing/Ready), CreatedAt.
Database Logic: Metadata DB uses wide-column storage (e.g., Cassandra) to handle high-frequency writes (view counts) and massive record counts.
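As a rough illustration, the metadata record could be modeled like this. Field names mirror the data model; the Python types, the helper functions, and the example bucket URL are assumptions, not a schema prescribed by the design.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum


class VideoStatus(str, Enum):
    PROCESSING = "Processing"
    READY = "Ready"


@dataclass(frozen=True)
class VideoMetadata:
    video_id: str          # VideoID (partition key)
    uploader_id: str
    title: str
    description: str
    s3_url: str
    status: VideoStatus
    created_at: datetime


def new_upload(video_id, uploader_id, title, description, s3_url):
    """A freshly uploaded video starts in Processing until transcoding finishes."""
    return VideoMetadata(video_id, uploader_id, title, description, s3_url,
                         VideoStatus.PROCESSING, datetime.now(timezone.utc))


def mark_ready(record):
    """Return a copy flagged Ready; the original record is left untouched."""
    return replace(record, status=VideoStatus.READY)


if __name__ == "__main__":
    rec = new_upload("vid-1", "user-9", "Demo", "A test clip",
                     "s3://example-bucket/raw/vid-1")  # hypothetical URL
    print(rec.status.value, "->", mark_ready(rec).status.value)
```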
Cache
Implementation: Redis cluster using LRU (Least Recently Used) eviction.
Data Structures: Stores serialized VideoMetadata objects keyed by VideoID.
TTL: 24 hours for popular videos; hot videos are kept in memory to reduce DB load.
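The caching behavior can be sketched with a small in-memory stand-in for Redis that combines LRU eviction with a TTL, plus a cache-aside read path. `TTLLRUCache`, `get_video_metadata`, and `load_from_db` are hypothetical names; a real deployment would rely on Redis's own eviction policy and key expiry rather than this class.

```python
import time
from collections import OrderedDict

DAY_SECONDS = 24 * 60 * 60  # 24-hour TTL from the text


class TTLLRUCache:
    """In-memory stand-in for a Redis cache with LRU eviction and per-entry TTL."""

    def __init__(self, capacity, ttl_seconds=DAY_SECONDS, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]       # expired: treat as a miss
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def get_video_metadata(video_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB, then populate the cache."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    record = load_from_db(video_id)
    if record is not None:
        cache.put(video_id, record)
    return record
```

Frequently requested videos keep getting moved to the recent end of the structure, so they stay cached while rarely viewed entries are evicted first.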
Messaging
Implementation: RabbitMQ or AWS SQS.
Topic Structure: video_upload_task topic.
Guarantees: At-least-once delivery. Transcoding workers must be idempotent (if they crash and restart, they simply overwrite the partial output).
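One way to get the idempotency that at-least-once delivery requires is to write output under a key derived only from the VideoID, so a redelivered message overwrites rather than duplicates. The message fields, key layout, and dictionary-backed stores below are assumptions for illustration.

```python
def handle_upload_task(message, output_store, metadata, transcode):
    """Process one video_upload_task message; safe to run more than once.

    The output key depends only on the video ID, so a duplicate delivery
    (or a restart after a crash) overwrites the same object.
    """
    video_id = message["video_id"]
    output_key = f"processed/{video_id}/720p"        # hypothetical key layout
    output_store[output_key] = transcode(message["raw_url"])
    metadata[video_id] = "Ready"                     # setting the same status twice is harmless
    return output_key


if __name__ == "__main__":
    store, meta = {}, {}
    fake_transcode = lambda url: f"transcoded({url})"
    msg = {"video_id": "vid-1", "raw_url": "s3://example-bucket/raw/vid-1"}
    handle_upload_task(msg, store, meta, fake_transcode)
    handle_upload_task(msg, store, meta, fake_transcode)  # duplicate delivery
    print(len(store), meta)
```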
Data Processing
Implementation: A fleet of workers using FFmpeg.
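A worker of this kind might assemble its FFmpeg invocation roughly as follows. The flags used (`-vf scale`, `-c:v libx264`, `-c:a aac`, `-f hls`, `-hls_time`, `-hls_playlist_type`, `-hls_segment_filename`) are standard FFmpeg options, but the bitrates, the 16:9 widths, and the output file naming are assumptions. The sketch only builds the command and the HLS master playlist; it does not run FFmpeg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int       # assumes 16:9 source material
    height: int
    video_kbps: int  # illustrative bitrates, not tuned values
    audio_kbps: int


RENDITIONS = [
    Rendition("360p", 640, 360, 800, 96),
    Rendition("720p", 1280, 720, 2800, 128),
    Rendition("1080p", 1920, 1080, 5000, 192),
]


def ffmpeg_hls_command(src, out_dir, r, segment_seconds=10):
    """Build an argument list that transcodes to H.264/AAC and segments for HLS."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale=-2:{r.height}",          # keep aspect ratio, even width
        "-c:v", "libx264", "-b:v", f"{r.video_kbps}k",
        "-c:a", "aac", "-b:a", f"{r.audio_kbps}k",
        "-f", "hls",
        "-hls_time", str(segment_seconds),       # target segment length in seconds
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{r.name}_%03d.ts",
        f"{out_dir}/{r.name}.m3u8",
    ]


def master_playlist(renditions):
    """Return an HLS master playlist that points at each rendition's playlist."""
    lines = ["#EXTM3U"]
    for r in renditions:
        bandwidth = (r.video_kbps + r.audio_kbps) * 1000  # bits per second
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.name}.m3u8")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for rendition in RENDITIONS:
        print(" ".join(ffmpeg_hls_command("raw.mp4", "out", rendition)))
    print(master_playlist(RENDITIONS))
```

Where FFmpeg is installed, each argument list could be passed to `subprocess.run(cmd, check=True)`; running one process per rendition keeps failures isolated at the cost of decoding the source several times.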
DAG/Transformations:
Extraction: Pull raw file from S3.
Transcoding: Convert to H.264/AAC.
Resolution Scaling: Create 360p, 720p, 1080p versions.
Segmentation: Split videos into 10-second .ts chunks for HLS streaming.
Manifest Generation: Create an .m3u8 playlist file.
Wrap Up
Advanced Topics
Trade-offs: Availability over Consistency. If a video description is updated, it's okay if a user sees the old one for a few seconds. We prioritize the video stream never buffering over immediate metadata consistency.
Bottlenecks: The Transcoding Queue can become a massive bottleneck if a viral event causes 1M uploads.
Fix: Priority Queuing (Premium users or short videos get processed first).
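A minimal sketch of such a priority queue, using a binary heap. The three priority tiers and the 60-second "short video" threshold are assumptions; a production queue would also need some form of aging so that low-priority jobs are not starved during a long spike.

```python
import heapq
import itertools

SHORT_VIDEO_SECONDS = 60  # hypothetical cutoff for "short" videos


def priority_for(is_premium, duration_s):
    """Lower number = processed sooner."""
    if is_premium:
        return 0
    if duration_s <= SHORT_VIDEO_SECONDS:
        return 1
    return 2


class TranscodeQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a tier

    def push(self, video_id, is_premium=False, duration_s=0):
        entry = (priority_for(is_premium, duration_s), next(self._counter), video_id)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = TranscodeQueue()
    q.push("long-free-1", duration_s=3600)
    q.push("short-free", duration_s=30)
    q.push("premium", is_premium=True, duration_s=3600)
    q.push("long-free-2", duration_s=1800)
    while q:
        print(q.pop())
```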
Failure Handling:
S3 Durability: 99.999999999% (11 nines).
Dead Letter Queues (DLQ): If a video fails transcoding 3 times, move it to a DLQ for manual inspection.
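The retry-then-DLQ rule can be sketched with an in-memory queue. In SQS or RabbitMQ this is normally configured on the broker (a redrive or dead-letter policy) rather than written by hand, so the loop below is only an illustration of the behavior; `process_with_dlq` is a hypothetical helper.

```python
from collections import deque

MAX_ATTEMPTS = 3  # after the third failure the task goes to the DLQ


def process_with_dlq(tasks, transcode, dlq):
    """Run each task, retrying failures; return the tasks that succeeded.

    Tasks that fail MAX_ATTEMPTS times are appended to `dlq` for manual inspection.
    """
    queue = deque((task, 0) for task in tasks)
    done = []
    while queue:
        task, attempts = queue.popleft()
        try:
            transcode(task)
            done.append(task)
        except Exception:
            attempts += 1
            if attempts >= MAX_ATTEMPTS:
                dlq.append(task)
            else:
                queue.append((task, attempts))  # requeue for another attempt
    return done


if __name__ == "__main__":
    seen = {}

    def fake_transcode(video_id):
        seen[video_id] = seen.get(video_id, 0) + 1
        if video_id == "corrupt" or (video_id == "flaky" and seen[video_id] == 1):
            raise RuntimeError("transcode failed")

    dead = []
    print(process_with_dlq(["ok", "corrupt", "flaky"], fake_transcode, dead), dead)
```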
Alternatives:
Architecture: Could use a monolithic Metadata DB (PostgreSQL) for the MVP to simplify development, but NoSQL scales better for the expected YouTube load.
Storage: Using a multi-cloud storage strategy (GCP + AWS) to avoid vendor lock-in and provide a fallback if one provider's region goes down.