The Question

Scalable Short-Video Platform (TikTok)

Design a high-scale, global short-video social platform capable of supporting 100M+ Daily Active Users. The system must handle high-throughput video uploads, automated asynchronous video processing (transcoding), and provide a low-latency, personalized 'For You' feed. Address how you would manage petabyte-scale storage, optimize video delivery for global users, and ensure the recommendation engine remains responsive under heavy load. Discuss trade-offs between consistency and availability in a massive social graph.

Cassandra

Redis

Kafka

Flink

CDN

HLS

DASH

gRPC

Snowflake ID

Questions & Insights

Clarifying Questions

Scale: What is the target scale for Daily Active Users (DAU) and the expected upload-to-view ratio?

Content: What is the maximum video duration and file size? Do we support high-definition (4K) or just mobile-optimized resolutions?

Personalization: Does the MVP require a real-time recommendation engine ("For You" page) or a simple chronological feed?

Geographic Distribution: Is this a global launch or regional? This impacts CDN and data residency strategies.

Interactions: Are social features (likes, comments, shares, follows) required for the MVP?

Assumptions:

DAU: 100 Million.

Traffic: 1:100 write-to-read ratio (1 upload per 100 views).

Video: Max 60 seconds, ~10-20MB per file.

Personalization: An ML-driven "For You" feed is the core value proposition and must be included.

Storage: Petabyte-scale object storage for raw and transcoded videos.

Thinking Process

To design a short-video platform at scale, we must solve for high-bandwidth ingestion, massive storage, and ultra-low latency playback.

Core Bottleneck 1: Video Ingestion & Processing. How do we handle high-concurrency uploads and ensure videos are playable across all devices? (Solution: Async Transcoding Pipeline + S3 + Kafka).

Core Bottleneck 2: Feed Latency. How do we serve a personalized feed to 100M users in <200ms? (Solution: Pre-computed Feed Cache in Redis).

Core Bottleneck 3: Global Delivery. How do we prevent buffering? (Solution: Multi-tier CDN and Edge Caching).

Core Bottleneck 4: Data Scalability. How do we store trillions of metadata rows? (Solution: Wide-column NoSQL like Cassandra).

Bonus Points

QUIC Protocol: Use QUIC (HTTP/3) for video uploads to improve reliability over lossy mobile networks.

Adaptive Bitrate Streaming (ABS): Implement HLS/DASH with multiple resolution ladders to dynamically switch quality based on user bandwidth.

Vector Embeddings: Use a Vector Database (e.g., Milvus/Pinecone) for the recommendation engine to perform similarity searches on user/video embeddings.

Write-Back Cache: Use a write-back strategy for video view counts to protect the database from hot-key contention.

Design Breakdown

Functional Requirements

Core Use Cases:

Users can upload short videos (up to 60s).

Users can view a personalized "For You" feed.

Users can follow other creators.

Users can like/interact with videos.

Scope Control:

In-scope: Video upload, transcoding, feed generation, basic interactions.

Out-of-scope: Live streaming, video editing tools, monetization/ads, real-time messaging.

Non-Functional Requirements

Scale: Handle 100M+ DAU and 1M+ concurrent video streams.

Latency: Feed loading and video start time (TTFB) should be <200ms.

Availability: 99.99% availability (High availability for the Feed service is critical).

Consistency: Eventual consistency for likes, follower counts, and comment counts.

Fault Tolerance: System must survive the loss of an entire Availability Zone (AZ).

Security: Content moderation hooks and secure URL signatures for CDN access.

Estimation

Traffic Estimation:

100M DAU.

Uploads: 1% of users upload 1 video/day = 1M uploads/day (~12/sec avg, ~100/sec peak).

Feed Reads: 100M users * 20 videos/day = 2B views/day (~23k/sec avg, ~100k/sec peak).

Storage Estimation:

1M videos * 15MB (avg) = 15TB/day.

1 Year = ~5.4PB (before replication/transcoding).

Transcoding (3 resolutions) = ~3x storage = ~16PB/year.

Bandwidth Estimation:

Ingress: 100 uploads/sec * 15MB = 1.5 GB/s (12 Gbps).

Egress: 100k views/sec 15MB (cached/streamed) = 1.5 TB/s (12 Tbps) - Heavy reliance on CDN*.

Blueprint

The architecture utilizes an event-driven approach for video processing and a pre-computed cache for feeding.

API Gateway: Central entry point for auth, rate-limiting, and routing.

Upload Service: Handles multipart uploads and persists raw files to S3.

Transcoder Service: Async workers that convert videos into multiple formats (HLS/DASH).

Feed Service: Aggregates content from the ML-driven pre-computed cache.

Metadata DB (Cassandra): Stores video/user metadata with high write throughput.

ML Engine: Processes user interactions to update feed recommendations.

Simplicity Audit: We use S3 for storage and Cassandra for metadata to avoid the complexity of manual sharding in RDBMS. Redis serves as the "speed layer" for the feed.

Architecture Decision Rationale:

Cassandra is chosen for metadata because its LSM-tree structure handles massive write volumes (likes/views) better than traditional SQL.

Kafka decouples the ingestion from the heavy compute of transcoding, ensuring the user isn't blocked by processing.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing:

CDN: Multi-vendor CDN strategy (Cloudflare/Akamai) to deliver video segments (.ts files) via HLS.

Geo-DNS: Routes users to the nearest Point of Presence (PoP).

Security:

WAF: Protects against DDoS and scrapers.

Signed URLs: Upload/Download services generate short-lived tokens to prevent unauthorized S3 access.

Service

Upload Service:

Supports Multipart Upload for large files and resume-ability.

Generates a VideoID using a distributed ID generator (Snowflake).

Feed Service:

Protocol: gRPC for internal service communication; REST/JSON for the client.

Strategy: If the user's Redis cache is empty, it falls back to a "Popularity-based" cold-start feed from Cassandra.

Resilience:

Circuit Breakers: Applied to the ML Engine to prevent the Feed Service from hanging if recommendations are slow.

Retries: Exponential backoff for video metadata writes to Cassandra.

Storage

Access Pattern: High write for interaction logs; High read for video metadata.

Database Table Design (Cassandra):

videos: video_id (K), creator_id, s3_url, thumbnail_url, timestamp, description.

user_feed: user_id (K), video_id (S), score.

Technical Selection:Cassandra for linear scalability and high availability.

Distribution Logic: Shard videos by video_id. Shard user_feed by user_id.

Cache

Purpose: Store the pre-computed "For You" list for every active user.

Key-Value Schema:

Key: feed:{user_id}

Value: List of video_id (e.g., [v123, v456, v789]).

TTL: 24 hours for active users; 1 hour for inactive.

Failure Handling: If Redis is down, the system fetches a "Global Hot" list from Cassandra.

Messaging

Purpose: Decouple video upload from transcoding and ML indexing.

Event Schema:UploadEvent { video_id, user_id, raw_s3_path, timestamp }.

Throughput: Kafka partitions scaled to handle peak upload bursts (~1k events/sec).

Failure Handling: Dead-letter queues (DLQ) for failed transcoding jobs.

Data Processing

Processing Model: Flink for real-time interaction processing; Spark for batch ML model training.

Processing DAG: Kafka Source -> Feature Extraction -> ML Model Scoring -> Redis Sink (Feed update).

Technical Selection:Flink for sub-second latency in updating the "For You" feed based on the video just watched.

Wrap Up

Advanced Topics

Trade-offs (PACELC): We prioritize Availability (A) and Partition Tolerance (P) over Consistency (C). A user seeing a "Like" count off by a few digits is acceptable; the feed being down is not.

Reliability:

Transcoding Retries: If a worker fails, another picks up the message from Kafka.

S3 Cross-Region Replication: Critical for disaster recovery.

Bottleneck Analysis:

Hot Creators: Viral videos create "hot partitions" in Cassandra. Mitigation: Use a write-back cache in Redis for incrementing view counts before flushing to DB.

Security:

Moderation: Use AWS Rekognition or a custom CNN model at the end of the transcoding pipeline to flag NSFW content before it hits the feed.

Distinguishing Insights:

Video Tail-Latency: TikTok's "smoothness" comes from the app pre-fetching the next 2-3 videos in the feed background while the user is watching the current one. The API must support pagination and pre-fetch hints.

Edge Compute: Move the "Video URL Signing" logic to the CDN Edge (Cloudflare Workers) to reduce API Gateway load.