The Question
DesignVideo Streaming Platform (Netflix-scale)
Design a globally distributed video-on-demand (VoD) streaming service. The system must support high-efficiency content ingestion and transcoding, sub-second playback start times via CDN integration, and a resilient metadata management system for millions of titles. Address the challenges of handling high-write viewing history 'heartbeats' and implementing adaptive bitrate streaming to ensure high availability and low latency across varying network conditions.
S3
Cassandra
PostgreSQL
Kafka
Redis
CDN
FFmpeg
HLS
DASH
JWT
Kubernetes
Questions & Insights
Clarifying Questions
What is the target scale for the MVP and long-term?
Assumption: Support 10M DAU for the MVP, with architectural paths to 100M+.
What are the primary device constraints?
Assumption: Support web, mobile (iOS/Android), and Smart TVs, requiring multiple bitrates and codecs (H.264, VP9).
Is this a live-streaming or Video-on-Demand (VoD) service?
Assumption: Primarily VoD, focusing on high-quality playback and library browsing.
What is the geographic distribution of the user base?
Assumption: Global distribution, necessitating a Content Delivery Network (CDN) strategy.
How critical is the recommendation engine for the MVP?
Assumption: Basic category-based discovery is sufficient for MVP; sophisticated ML personalization is a Phase 2 goal.
Thinking Process
Core Bottleneck: High-bandwidth delivery of large binary files (video) to millions of concurrent users.
Critical Focus Areas:
Content Ingestion & Transcoding: How to efficiently process a raw video into 100+ versions (resolutions/bitrates/containers).
Metadata & Discovery: High-availability storage for movie details, user profiles, and watch history.
Global Delivery (CDN): Minimizing latency and buffering by pushing content to the "edge."
Progressive Questions:
How do we store and transform high-definition master files reliably?
How do we ensure the playback starts instantly regardless of the user's network speed?
How do we handle the massive write-load of "watch history" (viewing heartbeats)?
Bonus Points
Content-Aware Encoding: Dynamically adjusting bitrates based on the complexity of the video scene (e.g., an action movie needs higher bitrates than an animation).
OpenConnect Architecture: Discussing the benefits of building a private CDN (IXP-based appliances) vs. using commercial vendors (CloudFront/Akamai).
Predictive Prefetching: Using user history to cache the first few minutes of the "Next Episode" on the client-side to achieve zero-latency starts.
Chaos Engineering: Implementation of "Chaos Monkey" to ensure the control plane remains resilient even when microservices fail.
Design Breakdown
Functional Requirements
Core Use Cases:
Users can search and browse movie/TV show metadata.
Users can stream video in various qualities based on bandwidth (Adaptive Bitrate Streaming).
Users can maintain a "Continue Watching" list (Viewing History).
Admins can upload and process new content.
Scope Control:
In-Scope: Video ingest, metadata management, playback API, and CDN distribution.
Out-of-Scope: Social features, live-streaming, payment gateway integration (assume external), and complex ML-driven recommendations.
Non-Functional Requirements
Scale: Support 10M DAU and petabytes of video storage.
Latency: Playback start time < 2 seconds; API responses < 200ms.
Availability: 99.99% for the playback service; 99.9% for the ingest pipeline.
Consistency: Eventual consistency for viewing history; strong consistency for user account/billing.
Reliability: Fault-tolerant transcoding; if a worker fails, the job must resume.
Estimation
Traffic Estimation:
10M DAU.
Average user watches 2 hours/day.
Total viewing hours: 20M hours/day.
Average bitrate: 3 Mbps (720p).
Egress Bandwidth: 20M 3600s 3Mbps = ~216 Petabits/day (~2.5 Tbps average).
Storage Estimation:
10,000 titles.
Each title (with all versions): 50 GB.
Total Storage: 500 TB (manageable in S3).
Write QPS:
Viewing history "heartbeats" every 10 seconds: (10M users * 2 hours / 10s) = ~7.2M events per hour -> ~2,000 QPS average, Peak 10,000 QPS.
Blueprint
Concise Summary: A microservices-based architecture leveraging an asynchronous transcoding pipeline and a geo-distributed CDN for low-latency video delivery.
Major Components:
API Gateway: Entry point for auth, routing, and rate-limiting.
Metadata Service: Manages movie details, categories, and search.
Transcoding Pipeline: Worker-based system to convert raw files into DASH/HLS segments.
CDN: Geographically distributed nodes for edge-caching video segments.
Playback Service: Negotiates the best video URL/manifest for the client.
Simplicity Audit: This design uses S3 for storage and managed databases (PostgreSQL/Cassandra) to minimize operational overhead for the MVP.
Architecture Decision Rationale:
Why this architecture?: Decoupling the "Control Plane" (Metadata/Auth) from the "Data Plane" (Video Delivery) allows the streaming to scale independently of the UI.
Functional Satisfaction: All core flows (Browse, Search, Play) are handled by dedicated services.
Non-functional Satisfaction: CDN ensures low latency, while S3/Cassandra handles the storage and write-heavy history scale.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery & Traffic Routing:
Use a global CDN (e.g., CloudFront or Akamai) for the MVP.
OpenConnect Strategy: For long-term, deploy "OpenConnect Appliances" (OCA) inside ISP data centers to serve 95%+ of traffic from within the ISP network.
Security & Perimeter:
API Gateway (Zuul/Spring Cloud): Handles SSL termination, JWT validation, and dynamic routing.
Rate Limiting: Applied per-user ID to prevent scraping of metadata and brute-force playback requests.
Service
Topology & Scaling: Stateless microservices deployed on Kubernetes (EKS/GKE) across multiple Availability Zones. Scaling is based on CPU and Request Count.
API Schema Design:
GET /v1/metadata/{titleId}: Fetch movie details.GET /v1/play/{titleId}: Returns a signed CDN URL and a manifest file (M3U8/MPD).POST /v1/history/heartbeat: Receives current timestamp of a user's viewing session.Resilience & Reliability:
Circuit Breakers (Resilience4j): Prevent the Playback service from hanging if the History service is down.
Graceful Degradation: If the Personalization/History service fails, show "Trending Now" instead of "Continue Watching."
Storage
Access Pattern:
Metadata: Read-heavy.
History: Extremely Write-heavy.
Video Assets: Large binary read-only once written.
Database Table Design:
PostgreSQL (Metadata): Tables for
Titles, Actors, Genres, Mapping. (Relational is best for complex queries).Cassandra (History): Partition Key:
user_id, Clustering Key: title_id + timestamp. (Optimized for high-volume time-series writes).Technical Selection:
S3: Durable object storage for raw and transcoded video segments.
Cassandra: Chosen for its linear scalability and ability to handle the "Viewing History" write volume.
Distribution Logic: Sharding PostgreSQL by
category_id if needed, though for 10k titles, a single master with read replicas is sufficient.Cache
Purpose: Reduce load on the Metadata DB and speed up "Top 10" queries.
Key-Value Schema:
Key:
title:{id}, Value: JSON-serialized metadata. TTL: 24h.Key:
trending:global, Value: List of title IDs.Technical Selection: Redis. Use Redis Cluster for high availability.
Failure Handling: If Redis is down, fall back to PostgreSQL with a strict circuit breaker to prevent DB meltdown.
Messaging
Purpose: Decouples the upload process from the time-consuming transcoding process.
Event Schema:
VideoUploadedEvent {videoId, s3_path, format}.Technical Selection: Kafka. High throughput and message persistence allow re-running transcoding jobs if they fail.
Failure Handling: Use a Dead Letter Queue (DLQ) for corrupted video files that fail transcoding multiple times.
Data Processing
Processing Model: Asynchronous Batch Processing for transcoding.
Processing DAG:
Validation: Check file integrity.
Chunking: Split video into 2-10 second chunks.
Parallel Transcoding: Convert chunks into multiple resolutions (4K, 1080p, 720p).
Packaging: Create HLS/DASH manifest files.
Technical Selection: Custom worker pool using FFmpeg or AWS Elemental MediaConvert for simplicity in MVP.
Infrastructure (Optional)
Observability: Prometheus for metrics, ELK for logs, and Jaeger for tracing playback requests.
Platform Security: mTLS between microservices to ensure secure internal communication.
Wrap Up
Advanced Topics
Trade-offs (Consistency vs. Availability): We choose Eventual Consistency for "Continue Watching" history (Cassandra). It is better for a user to see a slightly stale progress bar than for the playback to fail because the history DB is locked.
Reliability: Use "Adaptive Bitrate Streaming" (ABR). The client measures network speed and requests different chunks (e.g., 1080p segment vs 480p segment) dynamically from the CDN.
Bottleneck Analysis: The primary bottleneck is Transcoding. We use a "Chunking" strategy to parallelize a single movie's processing across 100+ workers, reducing total processing time from hours to minutes.
Security: Video chunks are stored in private S3 buckets. The Playback service generates Signed URLs with short TTLs, ensuring only authorized users can download video segments.