The Question
DesignGlobal URL Shortener Design
Design a globally distributed URL shortening service similar to Bitly. The system must support 10 billion persistent mappings, handle 1 million read requests per second with sub-20ms latency, and 10,000 write requests per second. Focus specifically on a collision-free ID generation strategy that works across multiple data centers and how to maintain high availability and performance during regional outages.
Cassandra
Redis
Zookeeper
Kafka
Geo-DNS
NoSQL
Base62 Encoding
Bloom Filter
Questions & Insights
Clarifying Questions
Scale & Traffic: What is the expected scale for total stored URLs and the peak Read/Write QPS?
Latency Targets: What are the P99 latency requirements for both shortening a URL and performing a redirect?
Customization: Do we need to support custom "vanilla" aliases (e.g.,
bit.ly/my-link), or only auto-generated IDs?Retention: Should URLs expire after a certain period, and if so, what is the default TTL?
Geographic Distribution: Is the traffic globally distributed, and do we need to ensure local low-latency redirects (e.g., a user in Asia redirected by a server in Asia)?
Assumptions for MVP:
Total Storage: 10 Billion URLs.
Read/Write Ratio: 100:1 (Read-heavy).
Read QPS: 1 Million redirects per second.
Write QPS: 10,000 shortening requests per second.
ID Generation: Base62 (a-z, A-Z, 0-9). 7 characters provide ~3.5 trillion combinations (62^7), sufficient for 10B records.
Availability: 99.99% (High availability is critical).
Thinking Process
Core Bottleneck: High read throughput (1M RPS) and ensuring globally unique, collision-free ID generation without a central bottleneck.
Progressive Strategy:
Unique ID Generation: How do we generate 7-character IDs at scale without collisions or high-latency locking? (Solution: Distributed Range Allocator).
Low Latency Reads: How do we handle 1M RPS globally? (Solution: Multi-region caching and Geo-routing).
Data Persistence: How do we store 10B records while maintaining performance? (Solution: Partitioned NoSQL/Wide-column store).
Availability: How do we handle regional failures? (Solution: Global Server Load Balancing (GSLB) and multi-region replication).
Bonus Points
Bloom Filters: Implement at the caching layer to prevent "Cache Penetration" for non-existent short URLs, saving DB lookups.
Range-Based ID Allocation: Using a distributed coordinator (e.g., Zookeeper) to hand out batches of 100,000 IDs to local application servers to minimize cross-region latency for ID generation.
Read-Repair & Anti-Entropy: For multi-region NoSQL, using read-repair to ensure eventual consistency across regions for the redirect mappings.
Edge-Redirects: Utilizing Cloudflare Workers or Lambda@Edge to perform redirects at the CDN POP, reducing latency to near-zero.
Design Breakdown
Functional Requirements
Core Use Cases:
Shorten: User provides a long URL and receives a short, unique URL.
Redirect: User clicks a short URL and is redirected (302 Found) to the original long URL.
Custom Alias: (Optional for MVP) User provides a custom string for the short URL.
Scope Control:
In-Scope: Shortening, Redirecting, Analytics (basic hit counts), ID generation.
Out-of-Scope: User accounts/auth, sophisticated marketing dashboards, link editing/deletion.
Non-Functional Requirements
Scale: Support 1M+ Read QPS and 10k Write QPS.
Latency: < 20ms for redirects (from cache), < 200ms for shortening.
Availability & Reliability: 99.99% uptime; no single point of failure.
Consistency: Eventual consistency is acceptable for redirects; Strong consistency for ID range allocation.
Fault Tolerance: Regional failover via Geo-DNS.
Estimation
Traffic Estimation:
Write: 10k/sec.
Read: 1M/sec.
Storage Estimation:
10B URLs (500 bytes per record: Short URL + Long URL + Metadata) = 5 TB**.
Bandwidth Estimation:
Write: 10k * 500B = 5 MB/s.
Read: 1M * 500B = 500 MB/s (High egress, requires CDN/Global Load Balancer).
Blueprint
Concise Summary: A geo-distributed service using a distributed range-based ID generator to ensure unique short IDs, backed by a NoSQL store for persistence and a global Redis layer for high-speed redirects.
Major Components:
Range Manager (Zookeeper): Coordinates blocks of unique IDs to prevent collisions across distributed workers.
URL Service: Handles shortening and redirect logic; fetches and caches ranges of IDs locally.
Distributed NoSQL (Cassandra/DynamoDB): Stores the
short_id -> long_url mapping across multiple regions.Cache Layer (Redis): Stores frequently accessed mappings to handle the 1M RPS load.
Simplicity Audit: This design avoids complex hashing/collision-retry logic by using pre-allocated ranges, ensuring writes are always O(1).
Architecture Decision Rationale:
ID Generation: Range-based allocation is chosen over Snowflake IDs to keep the URL length strictly to 7 characters without complex time-based bits.
Database: NoSQL (Cassandra) is used for its linear scalability and multi-region master-less replication.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery: Use a Global Server Load Balancer (GSLB) to route users to the nearest regional data center based on IP.
Security: Implement Rate Limiting at the API Gateway level (e.g., 100 requests/minute per IP for shortening) to prevent scrapers from exhausting the ID space.
SSL/TLS termination at the Load Balancer to reduce the cryptographic load on the URL Service.
Service
Topology & Scaling: Stateless URL Service deployed in multiple regions (US-East, EU-West, AP-South). Horizontal scaling triggered by CPU/Request Count.
ID Generation Logic:
Each instance requests a range (e.g., 1 to 100,000) from Zookeeper.
Once the local range is 80% exhausted, the instance requests the next available range.
Base62 encoding converts the numeric ID to a 7-character string.
API Schema Design:
POST /v1/shorten: { "long_url": "...", "ttl": 3600 } -> Returns { "short_url": "..." }.GET /{short_id}: Performs 302 redirect.Resilience: Circuit breakers on DB/Cache calls; if Cache is down, fallback to DB.
Storage
Access Pattern: Heavy read-by-key (short_id). Write-heavy during peak shortening hours.
Database Table Design:
short_id (Primary Key, String)long_url (String)created_at (Timestamp)expiry_at (Timestamp, TTL Index)Technical Selection: Cassandra.
Rationale: Excellent for high-volume writes and scales horizontally by adding nodes. Multi-region replication is native.
Distribution Logic: Partitioned by
short_id hash. This ensures even distribution across the cluster.Cache
Purpose: Absorb 1M+ RPS. Redirects are static, making them perfect for caching.
Key-Value Schema:
Key: short_id, Value: long_url. TTL: 24 hours (LRU eviction).Technical Selection: Redis Cluster.
Failure Handling: Use a "Write-through" or "Cache-aside" strategy. If a key is missing, fetch from DB and populate cache.
Messaging
Purpose: Asynchronous processing of analytics. We don't want to block the redirect while logging click data.
Event Schema:
short_id, timestamp, geo_location, referrer.Technical Selection: Kafka. High throughput for high-volume clickstream data.
Wrap Up
Advanced Topics
Consistency vs. Availability (CAP): We prioritize Availability and Partition Tolerance (AP). If a link is shortened in the US and someone in Europe clicks it immediately, they might get a 404 for a few milliseconds until the record replicates. This is acceptable.
ID Generation Alternative: We could use MD5/SHA hashes of the URL, but collisions require a "check-and-retry" logic which is slower than range-based allocation.
Optimization (Custom Aliases): For custom aliases, we store them in a separate table or a specific partition. Since custom aliases are user-provided, we must check for collisions in the DB before committing.
Storage Optimization: We can use a compact binary format for storing long URLs to save space, but with 5TB, a standard NoSQL approach is manageable and more flexible for the MVP.