The Question
Design

Video Streaming Platform

Design a large-scale video streaming platform similar to YouTube. The system should handle video upload and transcoding pipelines, CDN-based adaptive bitrate delivery, search and discovery, and personalized recommendations for a global user base.
Cassandra, DynamoDB, S3, Redis, CDN

Questions & Insights

Thinking Process

To design a video-sharing platform like YouTube, focus on the fundamental decoupling of the Write Path (Upload/Transcode) and the Read Path (Streaming/Delivery).
The Bottleneck: How do we handle massive binary files without blocking the application servers?
Solution: Use Pre-signed URLs for direct-to-S3 uploads and asynchronous transcoding via Message Queues.
The Playback Latency: How do we ensure users globally see video without buffering?
Solution: Leverage a Content Delivery Network (CDN) and adaptive bitrate streaming (HLS/DASH).
The Metadata Consistency: How do we manage video titles, view counts, and user data at scale?
Solution: Use a distributed NoSQL database for metadata and a caching layer for high-volume read requests.
The Scalability Ladder: How do we move from one video to billions?
Solution: Transition from monolithic processing to a microservices architecture where the transcoding worker fleet scales independently of the web API.
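
The adaptive bitrate idea in the playback answer can be sketched as a client-side choice among renditions. This is a minimal sketch, assuming a hypothetical bitrate ladder and safety factor (neither comes from this design); real players use more elaborate heuristics that also consider buffer level.

```python
# Hypothetical bitrate ladder: (label, required bandwidth in kbit/s).
# The numbers are illustrative assumptions, not values from this design.
LADDER = [("480p", 1_500), ("720p", 3_000), ("1080p", 6_000)]


def pick_rendition(measured_kbps: float, safety: float = 0.8) -> str:
    """Return the highest rendition whose bitrate fits within a
    safety fraction of the measured throughput."""
    budget = measured_kbps * safety
    best = LADDER[0][0]  # always fall back to the lowest rendition
    for label, required_kbps in LADDER:
        if required_kbps <= budget:
            best = label
    return best
```

The player would re-run this choice as its throughput estimate changes and request the next segment from the chosen rendition's playlist.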

Bonus Points

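Cost-Aware Storage Tiering: Implement a lifecycle policy that moves "cold" videos (low views) to cheaper storage classes (e.g., S3 Standard-IA or S3 Glacier Instant Retrieval) to cut storage cost. Archive tiers with retrieval delays of minutes to hours suit only raw source files, not renditions that must stay playable.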
QUIC/HTTP/3: Use QUIC for video ingestion to improve upload reliability on mobile networks with high packet loss.
Vector Search for Discovery: As a stretch goal beyond the MVP, use a vector database (like Milvus or Pinecone) to power semantic search and basic recommendations.
Bloom Filters: Implement Bloom filters in the caching layer to prevent "Cache Penetration" for non-existent video IDs, saving database CPU cycles.
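
The Bloom filter idea can be sketched in a few lines. This is a minimal sketch using only the standard library; the class, the sizing defaults, and the `get_metadata` helper are hypothetical names chosen for illustration, and a production system would more likely use an existing Bloom filter implementation.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, rare false positives."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions by double hashing one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def get_metadata(video_id, bloom, cache, load_from_db):
    """Look up metadata, skipping cache and DB for IDs that cannot exist."""
    if not bloom.might_contain(video_id):
        return None  # definitely absent: no database read
    if video_id in cache:
        return cache[video_id]
    row = load_from_db(video_id)
    if row is not None:
        cache[video_id] = row
    return row
```

Every video ID is added to the filter when its metadata row is created. A false positive only costs one extra database read, so correctness is unaffected.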

Design Breakdown

Functional Requirements

Users can upload videos.
Users can view/stream videos in different resolutions.
Users can search for videos by title.
The system tracks view counts for videos.

Non-Functional Requirements

High Availability: 99.99% for video playback.
Low Latency: Playback start-up latency should be < 200 ms when segments are served from a CDN edge cache.
Reliability: Uploaded videos must never be lost (99.999999999% durability).
Scalability: Support 100M+ Daily Active Users (DAU).

Estimation

DAU: 100 Million.
Uploads: 1% of users upload 1 video/day = 1 Million videos/day.
Storage: Avg video size 100MB. 1M * 100MB = 100 TB/day.
Bandwidth (Read): If each of the 100M users watches 5 minutes of 480p video (5 MB/min), that is 100M * 25 MB = 2.5 PB/day.
Egress: 2.5 PB / 86,400 s ≈ 29 GB/s average throughput; peak traffic will be higher.
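
The arithmetic above can be checked with a short script. It uses only the inputs stated in this section and assumes decimal units (1 TB = 10^6 MB, 1 PB = 10^9 MB); the variable names are illustrative.

```python
DAU = 100_000_000
UPLOADER_PERCENT = 1        # 1% of users upload one video per day
AVG_VIDEO_MB = 100
WATCH_MINUTES = 5
MB_PER_MINUTE_480P = 5
SECONDS_PER_DAY = 86_400

uploads_per_day = DAU * UPLOADER_PERCENT // 100
storage_tb_per_day = uploads_per_day * AVG_VIDEO_MB / 1_000_000

read_mb_per_day = DAU * WATCH_MINUTES * MB_PER_MINUTE_480P
egress_pb_per_day = read_mb_per_day / 1_000_000_000
egress_gb_per_s = read_mb_per_day / 1_000 / SECONDS_PER_DAY

print(f"uploads/day: {uploads_per_day:,}")
print(f"storage:     {storage_tb_per_day:.0f} TB/day")
print(f"egress:      {egress_pb_per_day:.1f} PB/day")
print(f"throughput:  {egress_gb_per_s:.1f} GB/s average")
```

The storage figure covers raw uploads only; transcoded renditions and replication would add to it.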

Blueprint

Concise Summary: A microservices-based architecture that utilizes an asynchronous pipeline for video processing and a global CDN for low-latency delivery.
Major Components:
API Gateway/LB: Routes traffic and handles authentication.
Upload Service: Coordinates pre-signed URLs and metadata entry.
Object Storage: High-durability storage for raw and processed video files.
Transcoding Worker: A background fleet that converts videos into multiple resolutions (480p, 720p, 1080p).
Metadata DB: Stores video info, user profiles, and view counts.
CDN: Caches video chunks close to the end-user.
Simplicity Audit: This design leaves out complex recommendation engines and live streaming (RTMP ingest) in favor of simple VOD (Video on Demand) using the standard HLS/DASH protocols.

High Level Architecture

Sub-system Deep Dive

Service

Topology & Scaling: Deploy the Upload and View services as independently auto-scaled deployments (e.g., Kubernetes pods). Upload is write-heavy and bursty; View is read-heavy.
API Spec:
POST /v1/video/upload: Returns a Pre-signed URL for direct S3 upload to bypass server bandwidth limits.
GET /v1/video/:id: Returns video metadata and the CDN URL for the HLS manifest (.m3u8).
POST /v1/video/:id/view: Increments the view count. Requests are debounced so rapid repeats from the same client are not double-counted.
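
The pre-signed URL returned by the upload endpoint can be illustrated with a generic HMAC-signed URL. This is a conceptual sketch only: it is not the AWS Signature Version 4 scheme that S3 uses, and the secret, host name, and function names are hypothetical. In practice the Upload Service would ask the storage SDK to generate the URL.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"                # hypothetical signing key
BASE = "https://uploads.example.com"   # hypothetical storage endpoint


def _signature(path: str, expires: int) -> str:
    message = f"PUT\n{path}\n{expires}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_upload_url(video_id: str, expires_in: int = 900, now=None) -> str:
    """Return a time-limited URL the client can PUT the raw file to."""
    now = int(time.time()) if now is None else now
    expires = now + expires_in
    path = f"/raw/{video_id}"
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{BASE}{path}?{query}"


def verify_upload_url(url: str, now=None) -> bool:
    """Storage-side check: signature matches and the URL has not expired."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires)
    return now <= expires and hmac.compare_digest(signature, expected)
```

Because the signature covers the method, path, and expiry, the client can upload only to the object key it was issued, and only for a limited time. The application servers never touch the video bytes.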

Storage

Data Model:
Videos Table (NoSQL - DynamoDB/Cassandra): video_id (PK), user_id, title, description, hls_manifest_url, status (Processing/Ready).
Blob Storage (S3): Structured as raw/[video_id] and processed/[video_id]/[resolution]/.
Database Logic: Sharding by video_id for even distribution. Use DynamoDB Streams or CDC to trigger search index updates.
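
The key layout and the sharding rule above can be sketched as small helper functions. The shard count and function names are assumptions for illustration. DynamoDB and Cassandra hash the partition key internally, so `shard_for` only shows the idea rather than code the application would need to run.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count


def shard_for(video_id: str) -> int:
    """Map a video_id to a shard by hashing, for an even spread."""
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def raw_key(video_id: str) -> str:
    """Object key for the original upload."""
    return f"raw/{video_id}"


def processed_prefix(video_id: str, resolution: str) -> str:
    """Key prefix holding the segments and playlist of one rendition."""
    return f"processed/{video_id}/{resolution}/"
```

Hashing the ID rather than using it directly avoids hot shards when IDs are sequential or time-ordered.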

Cache

Implementation: Redis Cluster.
Data Structures:
String keys for Video Metadata: video:metadata:{id} (TTL: 24h).
Sorted Sets for Trending: trending_videos (Score: view count).
Eviction: Least Recently Used (LRU) to keep hot video metadata in memory.
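
The read path through the cache follows the cache-aside pattern. The sketch below models the TTL and LRU behavior with an in-memory class so it runs without a Redis server; `MetadataCache` and `get_video_metadata` are hypothetical names. Against a real Redis the same flow would use GET and SET with an expiry, with eviction handled by the server's configured maxmemory policy.

```python
import time
from collections import OrderedDict


class MetadataCache:
    """In-memory stand-in for the metadata cache: TTL plus LRU eviction."""

    def __init__(self, capacity=10_000, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]        # expired: treat as a miss
            return None
        self._items.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict least recently used


def get_video_metadata(video_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB, then fill."""
    key = f"video:metadata:{video_id}"
    value = cache.get(key)
    if value is None:
        value = load_from_db(video_id)
        if value is not None:
            cache.put(key, value)
    return value
```

Only successful lookups are cached here, so repeated requests for missing IDs still reach the database unless something like a Bloom filter screens them out first.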

Messaging

Implementation: AWS SQS or RabbitMQ.
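Queue: video-transcode-jobs.
Delivery Guarantees: At-least-once delivery. If a worker fails before acknowledging a job, the message becomes available again (in SQS when the visibility timeout expires; in RabbitMQ when the unacknowledged message is requeued after the consumer's channel closes) and another worker retries it. Because a job can be delivered more than once, workers must be idempotent.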
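
The retry behavior can be modeled with a toy in-memory queue. This is a sketch of the visibility-timeout mechanism only, not the SQS or RabbitMQ API; the class and method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
import itertools


class VisibilityQueue:
    """Toy queue: a received message is hidden until its timeout expires,
    then redelivered unless the worker deleted it first."""

    def __init__(self, visibility_timeout: float = 30.0):
        self._timeout = visibility_timeout
        self._messages = {}              # msg_id -> [body, invisible_until]
        self._ids = itertools.count(1)

    def send(self, body) -> int:
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0.0]
        return msg_id

    def receive(self, now: float):
        """Return (msg_id, body) for the first visible message, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout   # hide while in flight
                return msg_id, entry[0]
        return None

    def delete(self, msg_id: int) -> None:
        """Acknowledge successful processing."""
        self._messages.pop(msg_id, None)
```

A worker that crashes never calls `delete`, so the job reappears after the timeout. A worker that is merely slow can also cause a duplicate delivery, which is why writing output to deterministic keys such as processed/[video_id]/[resolution]/ keeps retries harmless.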

Data Processing

Implementation: A fleet of workers running FFmpeg.
DAG/Transformations:
Pull raw video from S3.
Transcode into 1080p, 720p, 480p.
Segment videos into 4-second .ts chunks.
Generate .m3u8 manifest file.
Upload chunks/manifest to S3 and update Metadata DB status to "Ready".
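
The transcode, segment, and manifest steps above can be sketched as follows. The code only builds an FFmpeg argument list and a master playlist string; it does not run FFmpeg. The bitrates, file names, and function names are illustrative assumptions, and a real pipeline would also pin the keyframe interval so that segments are cut at exactly 4 seconds.

```python
# Hypothetical rendition ladder: name -> (width, height, video kbit/s).
RENDITIONS = {
    "1080p": (1920, 1080, 5000),
    "720p": (1280, 720, 2800),
    "480p": (854, 480, 1400),
}


def ffmpeg_hls_args(src: str, out_dir: str, name: str) -> list:
    """Build the FFmpeg command for one rendition as an argument list."""
    _width, height, kbps = RENDITIONS[name]
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",           # keep aspect ratio, even width
        "-c:v", "libx264", "-b:v", f"{kbps}k",
        "-c:a", "aac",
        "-hls_time", "4",                      # target segment length (seconds)
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{name}/seg_%05d.ts",
        "-f", "hls", f"{out_dir}/{name}/index.m3u8",
    ]


def master_playlist() -> str:
    """Build the master .m3u8 that points at each rendition's playlist."""
    lines = ["#EXTM3U"]
    for name, (width, height, kbps) in RENDITIONS.items():
        # BANDWIDTH is in bits per second; video bitrate is used as an estimate.
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={kbps * 1000},RESOLUTION={width}x{height}"
        )
        lines.append(f"{name}/index.m3u8")
    return "\n".join(lines) + "\n"
```

A worker would pass each argument list to `subprocess.run(args, check=True)`, upload the output directory under processed/[video_id]/, and store the master playlist's URL as hls_manifest_url.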

Wrap Up

Advanced Topics

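Trade-offs: Eventual Consistency vs. Strong Consistency. A newly uploaded video is not visible immediately: it appears only after asynchronous transcoding finishes and the metadata and search index updates propagate. The design favors availability over consistency, which is acceptable for a social media workload.
Bottlenecks: Bandwidth cost. At roughly 2.5 PB/day of reads, serving video directly from S3 to the internet would incur very large egress fees.
Mitigation: Serve all video through the CDN, so that S3 is read only on cache misses and delivery is billed at the lower CDN egress rates.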
Failure Handling:
Transcoding Failure: Use a Dead Letter Queue (DLQ) to investigate videos that crash FFmpeg.
Region Failure: Multi-region S3 replication for the most popular 5% of videos.
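
The Dead Letter Queue handling for transcoding failures can be sketched as a bounded retry loop. This is a simplified, single-process model under assumed names (`run_with_dlq`, `max_receives`). In a managed queue the same effect usually comes from a redrive setting that moves a message to the DLQ after a maximum number of receives.

```python
def run_with_dlq(jobs, handler, max_receives: int = 3):
    """Try each job up to max_receives times; park persistent failures."""
    done, dead_letter = [], []
    for job in jobs:
        last_error = None
        for _attempt in range(max_receives):
            try:
                handler(job)
                done.append(job)
                break
            except Exception as exc:      # keep the error for later triage
                last_error = repr(exc)
        else:
            # No attempt succeeded: record the job with its last error.
            dead_letter.append((job, last_error))
    return done, dead_letter
```

Capping the retries keeps a file that always crashes FFmpeg from blocking the fleet, while transient failures still get another attempt.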
Alternatives & Optimization:
Alternative: Use a managed service like AWS Elemental MediaConvert instead of custom FFmpeg workers if engineering headcount is low.
Optimization: Implement Content-Adaptive Encoding to reduce bitrate (and costs) for simple videos (e.g., talk shows) vs complex videos (e.g., gaming).