The Question
Design

Scalable URL Shortening System

Design a high-performance URL shortening service similar to Bitly. The system must handle 100 million new URLs per month and provide sub-100ms redirection for 10 billion monthly requests. Detail your strategy for unique ID generation, storage of billions of records, and ensuring high availability during traffic surges.
DynamoDB
Redis
ZooKeeper
Base62 Encoding
Anycast DNS
LRU
TTL
Questions & Insights

Clarifying Questions

Scale: What is the expected volume of new URLs created per month and the number of redirections (reads) per month?
URL Length: Is there a specific constraint on the short URL length (e.g., 6-8 characters)?
Custom Aliases: Should the system support user-defined custom short links (e.g., bit.ly/my-custom-link)?
Analytics: Does the MVP require real-time analytics (click counts, geo-location) or just basic redirection?
Availability vs. Consistency: In the event of a network partition, do we prioritize serving the redirect (Availability) or ensuring the link is immediately globally visible (Consistency)?
Assumptions for this design:
Scale: 100M new URLs/month (~40 writes/sec), 10B redirections/month (~4K reads/sec).
Read/Write Ratio: 100:1 (Read-heavy).
Persistence: Links should last for 5 years by default.
ID Generation: 6-character Base62 encoding (62^6 \approx 56.8 Billion combinations).

Thinking Process

Unique ID Generation: How do we generate 100M unique, non-colliding IDs per month without a single point of failure?
Read Path Optimization: With a 100:1 read ratio, how do we ensure sub-10ms redirection latency?
Storage Scalability: How do we handle 3-5 TB of mapping data while maintaining high performance?
End-to-End Flow: How does a request travel from a user's browser to the destination long URL?

Bonus Points

Bloom Filters: Implement Bloom Filters in the application layer to quickly reject requests for non-existent short URLs, preventing unnecessary cache/DB lookups.
Geographic Routing: Use Latency-based DNS routing to direct users to the nearest regional stack to minimize the speed-of-light delay for redirects.
Cold Storage Strategy: Move expired or inactive links (no clicks in > 1 year) to S3/Glacier to keep the primary database (DynamoDB/Cassandra) lean and performant.
301 vs 302 Logic: Use 302 Found (Temporary Redirect) to ensure all clicks hit our backend for analytics, whereas 301 Moved Permanently would be cached by browsers, losing tracking data.
Design Breakdown

Functional Requirements

Shorten a long URL into a unique 6-character alias.
Redirect a short URL to the original long URL with minimal latency.
Support custom aliases (if not already taken).
Set an optional expiration time for links.

Non-Functional Requirements

High Availability: Redirection must be available 99.99% of the time.
Low Latency: Redirection should happen in under 100ms.
Scalability: Handle sudden spikes in traffic (e.g., a link goes viral).
Unpredictability: Short IDs should not be easily guessable (avoid simple auto-increment integers).

Estimation

Write Throughput: 100M / (30 days * 86,400s) \approx 40 writes/sec.
Read Throughput: 10B / (30 days * 86,400s) \approx 4,000 reads/sec.
Storage (5 years): 100M 12 months 5 years = 6B records.
Data Size: 6B * 500 bytes (URL + Metadata) \approx 3 TB.
Cache Size: Following the 80/20 rule, caching 20% of daily redirects (333M/day 0.2 500 bytes) \approx 33 GB RAM.

Blueprint

Concise Summary: A globally distributed key-value store optimized for reads, utilizing a range-based Key Generation Service (KGS) to ensure unique, high-performance ID assignment.
Major Components:
API Gateway: Handles rate limiting and routes requests to the Shortener Service.
Shortener Service: Main logic for URL creation and redirection.
Key Generation Service (KGS): Pre-allocates ranges of unique IDs to prevent collisions without DB locking.
Redis Cache: Stores the most frequently accessed short-to-long URL mappings.
DynamoDB: Scalable NoSQL storage for the source of truth mappings.
Simplicity Audit: This design avoids complex distributed locking by using a range-based KGS and leverages managed NoSQL for zero-ops scaling of the 3TB dataset.
Architecture Decision Rationale:
Why this architecture?: The read-heavy nature dictates a cache-aside pattern. A NoSQL database (DynamoDB) is chosen over SQL because we don't need complex joins, and we need horizontal scaling for the 6B+ records.
Functional Satisfaction: Covers shortening, redirection, and custom aliases via the KGS and DB lookups.
Non-functional Satisfaction: Redis provides sub-10ms latency for 80% of traffic; DynamoDB provides high availability and partition-based scaling.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: Use Anycast DNS to route users to the nearest Point of Presence (PoP).
Security & Perimeter:
API Gateway: Implement JWT-based Auth for creation; public access for redirection.
Rate Limiting: Limit to 5 URL creations per second per IP to prevent service abuse.

Service

Topology & Scaling: Stateless microservices deployed in Docker containers on EKS/ECS across multiple Availability Zones.
API Schema Design:
Create: POST /v1/shorten | JSON {long_url, custom_alias?, expire_at?} | Returns 201 Created.
Redirect: GET /{short_id} | Returns 302 Found with Location header.
Resilience & Reliability:
Circuit Breaker: If DynamoDB latency spikes, fail fast and serve from cache or return 503.
KGS Buffer: The Shortener service fetches a batch of 1,000 IDs from KGS and stores them in memory to survive short KGS outages.
Observability: Prometheus for QPS/Latency metrics; ELK for audit logs of URL creations.

Storage

Access Pattern: 99% Read by short_id. 1% Write by long_url or short_id.
Database Table Design:
short_id (Partition Key): String (6-8 chars)
long_url: String (up to 2048 chars)
created_at: Timestamp
expire_at: Timestamp (TTL index for auto-deletion)
user_id: UUID (Optional)
Technical Selection: DynamoDB.
Rationale: Auto-sharding handles 3TB+ growth seamlessly. Built-in TTL automatically cleans up expired links. Consistent performance at scale.
Distribution Logic: Partitioning on short_id ensures even distribution across the cluster.

Cache

Purpose & Justification: Reduces read latency and DB load for "viral" links.
Key-Value Schema:
Key: url:{short_id}
Value: long_url (String)
TTL: 24 hours (sliding window for active links).
Technical Selection: Redis. Use an LRU (Least Recently Used) eviction policy.
Failure Handling: On cache miss, fetch from DynamoDB and populate Redis.

Infrastructure (Optional)

Distributed Coordination: ZooKeeper.
Used by the Key Generation Service to manage ID ranges (e.g., Service A takes 1-1,000,000, Service B takes 1,000,001-2,000,000) to ensure no two instances ever issue the same ID.
Wrap Up

Advanced Topics

ID Generation Trade-off: We use a KGS with ZooKeeper rather than UUIDs. UUIDs are too long (36 chars), defeating the purpose of a "short" URL. Base62 encoding of a central counter is the most space-efficient.
Consistency: The system is Eventually Consistent across regions. If a user creates a link in US-East, it might take seconds to appear in EU-West. This is acceptable for a URL shortener.
Security: To prevent "link crawling," the KGS can skip random intervals of IDs so they are not perfectly sequential (e.g., 1, 5, 12, 18...).
Custom Aliases: If a user provides a custom alias, we check DynamoDB directly. Since custom aliases are rare, this doesn't hit the KGS logic.