The Question

Scalable Video Sharing and Streaming Platform

Design a global video platform similar to YouTube. The system must support high-volume video uploads (500 hrs/min), asynchronous transcoding into multiple resolutions, and low-latency global streaming for billions of users. Focus on the end-to-end lifecycle of a video from upload to playback, detailing the storage strategy for petabyte-scale data, the transcoding pipeline orchestration, and the content delivery architecture. Address challenges related to massive egress bandwidth, metadata scalability, and adaptive bitrate streaming.

Cassandra

Redis

Kafka

Elasticsearch

CDN

HLS

DASH

FFmpeg

QUIC

gRPC

Questions & Insights

Clarifying Questions

Scale: What is the target DAU and the volume of video uploads per minute? (Assumption: 1B DAU, 500 hours of video uploaded every minute).

Video Quality: Do we need to support 4K/8K and HDR, or is 1080p the MVP limit? (Assumption: Support up to 4K with multiple resolutions for Adaptive Bitrate Streaming).

Latency: What is the acceptable delay between finishing an upload and the video being "live"? (Assumption: Near real-time availability is not required; a few minutes for transcoding is acceptable).

Features: Are social features like comments, likes, and real-time chat in scope? (Assumption: Focus on core MVP: Upload, Metadata management, Search, and Video Streaming).

Geographic Distribution: Is this a global service or regional? (Assumption: Global audience requiring multi-region delivery).

Thinking Process

Core Bottleneck: The primary challenge is the sheer volume of data (storage) and the massive egress bandwidth (streaming).

Progressive Questions:

How do we handle multi-GB file uploads reliably over unstable connections? (Answer: Chunked multipart uploads + S3 Pre-signed URLs).

How do we ensure videos play smoothly on all devices and network conditions? (Answer: Asynchronous transcoding pipeline + Adaptive Bitrate Streaming via HLS/DASH).

How do we scale metadata queries for billions of videos? (Answer: Wide-column NoSQL like Cassandra for scalability and high write throughput).

How do we minimize global playback latency? (Answer: Multi-layered CDN strategy and Edge caching).

Bonus Points

QUIC Protocol: Implement HTTP/3 (QUIC) for the "Last Mile" to reduce rebuffering in lossy mobile networks.

Cost-Aware Storage Tiering: Move older, unpopular videos from Standard S3 to S3 Glacier or Cold Storage to optimize costs.

VMAF Optimization: Use Netflix’s Video Multi-Method Assessment Fusion (VMAF) during transcoding to maintain high perceptual quality while minimizing bitrate.

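Bloom Filters: Use Bloom filters at the Edge/API Gateway to quickly reject requests for non-existent video IDs, protecting the backend from cache-penetration ("Cache-Miss") attacks. A Bloom filter never returns a false negative, so a rejected ID is guaranteed not to exist; an occasional false positive just falls through to the normal lookup.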
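To make the Bloom filter idea concrete, here is a minimal sketch in Python. The class name, the bit-array size, the number of hash functions, and the double-hashing scheme built on SHA-256 are all illustrative assumptions, not part of the original design; a production gateway would more likely use an existing library or a Redis module.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so it is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


# Hypothetical usage at the gateway: reject unknown IDs before touching the backend.
known_ids = BloomFilter()
known_ids.add("vid_123")
```

The gateway would call might_contain(video_id) first and only forward the request when it returns True.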

Design Breakdown

Functional Requirements

Core Use Cases:

Users can upload videos.

Users can view video metadata (title, views, description).

Users can stream videos in various resolutions (360p, 720p, 1080p, 4K).

Users can search for videos by title.

Scope Control:

In-scope: Upload, Transcoding, Metadata, Streaming, Search.

Out-of-scope: Recommendations engine, Comments, Subscriptions, Monetization/Ads.

Non-Functional Requirements

Scale: Must absorb PB-scale storage growth per day and serve millions of concurrent viewers.

Latency: Video start time (Time to First Frame) should be < 200ms globally.

Availability & Reliability: 99.99% availability; uploaded videos must never be lost (High Durability).

Consistency: Eventual consistency is acceptable for view counts and metadata updates.

Fault Tolerance: Transcoding failures should be retried automatically; on a CDN failure, traffic should fail over to the origin or a secondary CDN.

Security: Content protection via Signed URLs/Cookies and AES-128 encryption.

Estimation

Traffic Estimation:

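1B DAU × 5 videos/day = 5B views/day.

Avg QPS (Read): 5B / 86,400 s ≈ 58,000 QPS (rounded to ~60,000).

Peak QPS: ~120,000 QPS (assuming peak ≈ 2× average).

Uploads: 500 hrs/min = 30,000 video-minutes per minute = 500 video-minutes per second. Assuming an average length of 5 minutes (as in the bandwidth estimate), that is ≈ 100 uploads/sec; 500 uploads/sec only holds if the average video is about 1 minute long.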

Storage Estimation:

500 hrs/min × 60 min/hr × 24 hr/day = 720,000 hrs/day.

At 1080p (compressed), 1 hr ≈ 2GB.

Raw Storage: 720,000 hrs × 2 GB/hr = 1,440,000 GB = 1.44 PB/day.

With replicas and multi-resolution: ~4-5 PB/day.

Bandwidth Estimation:

Egress: 5B views × 100 MB (avg 5-min video) = 500 PB/day.

Bandwidth: 500 PB / 86,400 s ≈ 5.8 TB/s ≈ 46 Tbps (after multiplying by 8 bits per byte).
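The arithmetic in this estimation can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch using the assumptions stated above (1B DAU, 5 views per user, 2 GB per hour at 1080p, 100 MB per view) and decimal units (1 PB = 10^15 bytes); the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# Traffic
dau = 1_000_000_000
views_per_user = 5
views_per_day = dau * views_per_user
avg_read_qps = views_per_day / SECONDS_PER_DAY

# Storage (single 1080p rendition, before replicas and extra resolutions)
upload_hours_per_min = 500
hours_per_day = upload_hours_per_min * 60 * 24
gb_per_hour_1080p = 2
storage_pb_per_day = hours_per_day * gb_per_hour_1080p / 1_000_000  # GB -> PB

# Egress bandwidth
mb_per_view = 100
egress_pb_per_day = views_per_day * mb_per_view / 1_000_000_000  # MB -> PB
egress_bytes_per_sec = egress_pb_per_day * 1e15 / SECONDS_PER_DAY
egress_tbps = egress_bytes_per_sec * 8 / 1e12  # bytes/s -> terabits/s

print(f"avg read QPS   ~ {avg_read_qps:,.0f}")
print(f"storage/day    ~ {storage_pb_per_day:.2f} PB")
print(f"egress/day     ~ {egress_pb_per_day:.0f} PB")
print(f"egress rate    ~ {egress_tbps:.1f} Tbps")
```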

Blueprint

Concise Summary: A decoupled, event-driven architecture that separates the high-bandwidth upload/streaming path from the high-frequency metadata management path.

Major Components:

API Gateway: Entry point for authentication, rate limiting, and request routing.

Upload Service: Orchestrates video uploads using S3 Pre-signed URLs to offload heavy traffic.

Transcoding Pipeline: Asynchronous workers that convert raw videos into multiple formats/bitrates.

Metadata Service: Manages video info, user data, and view counts using Cassandra.

CDN (Content Delivery Network): Distributes transcoded video segments to edge locations for low-latency streaming.

Simplicity Audit: This design avoids complex microservices for non-core features and uses managed object storage to handle the "heavy lifting" of binary data.

Architecture Decision Rationale:

Why this?: Separating the upload from transcoding via a message queue allows the system to handle spikes in uploads without crashing the processing layer.

Functional Satisfaction: HLS/DASH support ensures device compatibility; S3 ensures durability.

Non-functional Satisfaction: CDN handles the massive egress bandwidth; Cassandra scales horizontally for metadata.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing:

CDN Strategy: Use a mix of commercial CDNs (e.g., Akamai, CloudFront) and an internal "Open Connect" style appliance deployed inside ISP networks for high-traffic regions.

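Streaming Protocol: Use HLS (HTTP Live Streaming) or DASH. Both split each rendition into short segments (.ts or fragmented MP4 for HLS, .m4s for DASH) listed in a manifest, so the player can switch bitrate at segment boundaries as network conditions change.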
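As an illustration of how adaptive bitrate works with HLS, the sketch below builds a master playlist that advertises several renditions and shows a simple client-side selection rule. The bitrate ladder, the relative playlist paths, and the 0.8 safety factor are assumptions chosen for the example, not values from the design; real players use more elaborate heuristics that also consider buffer level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bandwidth_bps: int  # peak bitrate advertised to the player


# Illustrative bitrate ladder (assumed values).
LADDER = [
    Rendition("360p", 640, 360, 800_000),
    Rendition("720p", 1280, 720, 2_800_000),
    Rendition("1080p", 1920, 1080, 5_000_000),
    Rendition("2160p", 3840, 2160, 16_000_000),
]


def master_playlist(renditions) -> str:
    """Build an HLS master playlist pointing at one media playlist per rendition."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for r in sorted(renditions, key=lambda r: r.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.name}/playlist.m3u8")  # hypothetical relative path
    return "\n".join(lines) + "\n"


def pick_rendition(renditions, measured_bps: float, safety: float = 0.8) -> Rendition:
    """Pick the highest rendition that fits within a fraction of measured throughput."""
    affordable = [r for r in renditions if r.bandwidth_bps <= measured_bps * safety]
    if not affordable:
        return min(renditions, key=lambda r: r.bandwidth_bps)
    return max(affordable, key=lambda r: r.bandwidth_bps)


print(master_playlist(LADDER))
print(pick_rendition(LADDER, measured_bps=4_000_000).name)
```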

Security:

API Gateway: Handles JWT validation and SSL termination.

Rate Limiting: Applied per User-ID to prevent scraping or DoS attacks on the search/metadata endpoints.

Service

Upload Service:

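Uses S3 Multipart Upload. The client requests a pre-signed URL for each part and uploads directly to object storage. Failed parts can be retried individually, interrupted uploads can resume from the last completed part, and parts can be transferred in parallel.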
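The client-side planning for a multipart upload might look like the sketch below. It only computes part boundaries and works out which parts still need uploading after an interruption; requesting the pre-signed URLs and sending the bytes are left out. The 64 MiB default part size and the function names are assumptions, while the 5 MiB minimum part size and the 10,000-part limit are S3's documented constraints.

```python
from dataclasses import dataclass

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 limit on parts per multipart upload


@dataclass(frozen=True)
class Part:
    number: int  # 1-based part number, as S3 expects
    offset: int  # byte offset into the source file
    size: int    # number of bytes in this part


def plan_parts(file_size: int, part_size: int = 64 * 1024 * 1024) -> list[Part]:
    """Split a file of `file_size` bytes into parts that satisfy S3's limits."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the file would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, -(-file_size // MAX_PARTS))  # ceiling division
    parts = []
    offset = 0
    number = 1
    while offset < file_size:
        size = min(part_size, file_size - offset)
        parts.append(Part(number, offset, size))
        offset += size
        number += 1
    return parts


def parts_to_resume(parts: list[Part], uploaded_numbers: set[int]) -> list[Part]:
    """Return the parts that still need uploading after an interrupted transfer."""
    return [p for p in parts if p.number not in uploaded_numbers]


plan = plan_parts(200 * 1024 * 1024)
remaining = parts_to_resume(plan, uploaded_numbers={1, 2})
print([p.number for p in remaining])
```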

API Schema Design:

POST /v1/video/upload-init: Returns a VideoID and S3 Upload ID.

GET /v1/video/:id: Returns metadata + Manifest URL (m3u8).

GET /v1/search?q=...: Search videos.

Resilience:

Retries: Exponential backoff for Transcoding Workers.

Circuit Breaker: If Cassandra latency or error rate crosses a threshold, the breaker opens and the Metadata Service serves (possibly stale) cached data from Redis until Cassandra recovers.
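The retry policy for transcoding workers can be sketched as capped exponential backoff with full jitter. The function names, the base delay, the cap, and the attempt limit below are illustrative assumptions; the jitter is there so that many workers failing at once do not all retry at the same instant.

```python
import random
import time


def backoff_delays(max_attempts: int, base: float = 1.0, cap: float = 60.0,
                   rng=random.random):
    """Yield the sleep before each retry: random in [0, min(cap, base * 2^n))."""
    for attempt in range(max_attempts - 1):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def run_with_retries(task, max_attempts: int = 5, sleep=time.sleep, **backoff):
    """Run `task`, retrying on any exception until attempts are exhausted."""
    delays = backoff_delays(max_attempts, **backoff)
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(next(delays))
```

A worker would wrap its transcode call, for example run_with_retries(lambda: transcode(job)), where transcode and job are placeholders for the worker's own function and payload.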

Storage

Access Pattern: Write-heavy for uploads, Read-heavy for metadata and search.

Database Table Design (Cassandra):

videos_by_id: video_id (PK), user_id, title, description, manifest_url, status, created_at.

video_stats: video_id (PK), view_count, likes. (Use Cassandra counters or separate stream processing).

Technical Selection:

Cassandra: Chosen for linear scalability and high availability (No Single Point of Failure).

Elasticsearch: Chosen for full-text search capabilities on titles/descriptions.

Distribution: Sharding by video_id using consistent hashing to avoid hot partitions.
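To show why consistent hashing limits data movement, here is a small hash ring with virtual nodes. It is a teaching sketch only: Cassandra's default partitioner uses Murmur3 tokens, whereas this example uses MD5 from the standard library as a stand-in, and the node names and virtual-node count are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes: int = 100):
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._tokens = [token for token, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        # MD5 is a stand-in hash for illustration, not Cassandra's partitioner.
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # The owner is the first token clockwise from the key's hash.
        idx = bisect.bisect_right(self._tokens, self._hash(key)) % len(self._tokens)
        return self._ring[idx][1]


keys = [f"vid_{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding one node")
```

Every key that changes owner moves to the newly added node; keys never shuffle between the existing nodes.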

Cache

Purpose: Reduce latency for popular video metadata and "trending" lists.

Key-Value Schema:

Key: video:meta:{video_id}, Value: JSON string of metadata, TTL: 1 hour (longer for viral videos).

Technical Selection: Redis. Used for its sub-millisecond latency and support for data structures (Sorted Sets for trending videos).

Failure Handling: If Redis fails, fall back to Cassandra. Use "Cache Aside" pattern.
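The cache-aside read path with fallback can be sketched as follows. TTLCache is an in-memory stand-in for Redis so the example runs without a server, and load_from_db stands in for the Cassandra query; both names are hypothetical. The key format and the 1-hour TTL follow the schema above.

```python
import json
import time


class TTLCache:
    """In-memory stand-in for a Redis GET / SET-with-expiry pair."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_video_metadata(video_id, cache, load_from_db, ttl_seconds=3600):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"video:meta:{video_id}"
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # cache outage: fall through to the database
    if cached is not None:
        return json.loads(cached)
    record = load_from_db(video_id)
    if record is not None:
        try:
            cache.set(key, json.dumps(record), ttl_seconds)
        except Exception:
            pass  # failing to populate the cache must not fail the read
    return record
```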

Messaging

Purpose: Decouples video upload from the resource-intensive transcoding process.

Event Schema: video_id, s3_raw_path, priority, timestamp.

Throughput: Kafka can handle millions of events per second across a cluster, far more than the upload rate requires. Partitioning by video_id keeps all events for a given video on the same partition, so they are consumed in order; in practice transcoding is usually a single event per video.

Technical Selection: Kafka. Chosen for persistence and replayability if the transcoding cluster fails.
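A sketch of the event payload and key-based partitioning follows. The class name, the JSON encoding, and the example S3 path are assumptions. The partition function uses CRC32 as a stand-in to show the principle; Kafka's default partitioner hashes the key bytes with murmur2, so actual partition numbers would differ.

```python
import json
import zlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TranscodeRequested:
    """Event published once the raw upload is complete."""

    video_id: str
    s3_raw_path: str
    priority: int
    timestamp: float

    def to_bytes(self) -> bytes:
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")


def partition_for(video_id: str, num_partitions: int) -> int:
    # Same key -> same partition, so events for one video stay ordered.
    # CRC32 is illustrative only; it is not Kafka's partitioner.
    return zlib.crc32(video_id.encode("utf-8")) % num_partitions


event = TranscodeRequested(
    video_id="vid_1",
    s3_raw_path="s3://raw-uploads/vid_1/source.mp4",  # hypothetical bucket and key
    priority=1,
    timestamp=1_700_000_000.0,
)
print(partition_for(event.video_id, num_partitions=12), event.to_bytes())
```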

Data Processing

Processing Model: Directed Acyclic Graph (DAG) for video processing.

Step 1: Inspection (get metadata).

Step 2: Split video into chunks.

Step 3: Transcode chunks in parallel (360p, 720p, etc.).

Step 4: Merge chunks and generate Manifest files.

Technical Selection: Custom worker pool using FFmpeg. Orchestrated by a workflow engine (e.g., Temporal or Airflow).
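The four processing steps can be sketched as a small DAG. The example uses graphlib from the Python standard library (3.9+) to order the steps and a thread pool to fan out the per-chunk work. The inspect_video, split, transcode, and merge functions are placeholders that only compute names; a real worker would call ffprobe and FFmpeg at those points, and a workflow engine would persist state between steps. The chunk length and the output naming are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

RESOLUTIONS = ["360p", "720p", "1080p"]


def inspect_video(video_id):
    # Placeholder: a real worker would probe the uploaded file here.
    return {"video_id": video_id, "duration_s": 30}


def split(meta, chunk_s=10):
    # Produce (start, end) second ranges covering the whole video.
    total = meta["duration_s"]
    return [(s, min(s + chunk_s, total)) for s in range(0, total, chunk_s)]


def transcode(chunk, resolution):
    # Placeholder for one FFmpeg run on one chunk at one resolution.
    start, end = chunk
    return f"{resolution}/seg_{start:05d}_{end:05d}.ts"


def merge(segments):
    # Group segments by resolution, in playback order, as manifest input.
    manifests = {}
    for seg in sorted(segments):
        manifests.setdefault(seg.split("/")[0], []).append(seg)
    return manifests


def run_pipeline(video_id):
    # Each step maps to the set of steps it depends on.
    dag = TopologicalSorter(
        {"split": {"inspect"}, "transcode": {"split"}, "merge": {"transcode"}}
    )
    state = {}
    for step in dag.static_order():
        if step == "inspect":
            state["meta"] = inspect_video(video_id)
        elif step == "split":
            state["chunks"] = split(state["meta"])
        elif step == "transcode":
            jobs = [(c, r) for c in state["chunks"] for r in RESOLUTIONS]
            with ThreadPoolExecutor(max_workers=4) as pool:
                state["segments"] = list(pool.map(lambda j: transcode(*j), jobs))
        elif step == "merge":
            state["manifests"] = merge(state["segments"])
    return state["manifests"]


print(run_pipeline("vid_1")["720p"])
```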

Scalability: Auto-scale worker nodes based on Kafka consumer lag, i.e., the backlog of transcoding events that have been published but not yet processed.
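A simple scaling rule derived from the backlog might look like this. The formula, the parameter names, and the bounds are assumptions for illustration: it sizes the pool so that the current backlog would drain within a target window, given the average time one worker spends on one job.

```python
import math


def desired_workers(consumer_lag: int, avg_job_seconds: float,
                    target_drain_seconds: float,
                    min_workers: int = 1, max_workers: int = 1000) -> int:
    """Workers needed to drain the current backlog within the target window."""
    if consumer_lag <= 0:
        return min_workers
    needed = math.ceil(consumer_lag * avg_job_seconds / target_drain_seconds)
    return max(min_workers, min(max_workers, needed))


# 500 queued videos, 2 minutes each, drain within 10 minutes.
print(desired_workers(consumer_lag=500, avg_job_seconds=120, target_drain_seconds=600))
```

In practice the result would be smoothed over time to avoid scaling up and down on every short spike.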

Infrastructure (Optional)

Observability: Prometheus for metrics (Transcoding lag, 5xx errors). ELK stack for logs. Distributed tracing (Jaeger) to track a video's journey from upload to "Live".

Wrap Up

Advanced Topics

Trade-offs: We chose Eventual Consistency for view counts to prioritize availability. A user might see slightly different view counts on different refreshes.

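Reliability: Transcoding tasks are designed to be idempotent, for example by writing outputs to deterministic paths derived from the video_id. If a worker fails halfway, a new worker can restart the task using the same video_id and simply overwrite any partial output.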

Bottleneck Analysis:

Hot Videos: Popular videos can overwhelm a CDN node. Solution: Multi-CDN and internal caching layers.

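Storage Cost: ~1.44 PB/day for a single 1080p rendition (several times more with replicas and extra resolutions) is expensive. Optimization: Use H.265/VP9 codecs for better compression ratios on popular videos, where the extra encoding cost is repaid by egress savings.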

Security: Prevent "Deep-linking" by signing the HLS manifest URLs with a short TTL.

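Distinguishing Insight: For an MVP, we use S3 Event Notifications to trigger the transcoding event. S3 does not publish to Kafka directly (its notification targets are SNS, SQS, Lambda, and EventBridge), so a small relay such as a Lambda function or an SQS consumer forwards the object-created event to Kafka. This eliminates the need for the Upload Service to manually track file completion, making the system more robust against service crashes during the final bytes of an upload.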