DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
The Question
Design

Scale-Out URL Shortener

Design a globally distributed URL shortening service capable of handling 1 million redirection requests per second and 10,000 shortening requests per second. The system must ensure unique short-code generation without collisions, provide sub-50ms latency for redirects, and scale to store over 1 trillion mappings over 5 years. Address how you would handle hot keys, distributed ID generation, and data consistency across multiple geographic regions.
Redis
DynamoDB
Zookeeper
Base62
Bloom Filter
Anycast
Cassandra
NoSQL
Questions & Insights

Clarifying Questions

What is the read/write ratio?
Assumption: Heavily read-skewed. 100:1 (e.g., 990k reads/sec, 10k writes/sec).
What is the character limit for the shortened URL?
Assumption: 7-8 characters using Base62 (a-z, A-Z, 0-9) to support enough combinations for 100+ billion URLs.
Do we need custom aliases or expiration dates?
Assumption: Yes for both. Custom aliases are a core functional requirement; expiration handles storage cleanup.
Is the "Millions of requests per second" global or regional?
Assumption: Global traffic. We need a multi-region deployment strategy to ensure low latency.
What is the data retention policy?
Assumption: 5-year retention by default unless specified otherwise.

Thinking Process

Core Bottleneck: 1M+ QPS means the database cannot be hit for every read. The ID generation must also be distributed to avoid being a single point of failure (SPOF) or a performance bottleneck.
Strategy Questions:
How do we generate unique, short IDs at scale without collisions or central locking?
How do we ensure sub-20ms redirection latency under massive load?
How do we store 1.5 trillion records (5-year projection) while maintaining query performance?
How do we handle "hot" URLs that might receive millions of hits individually?

Bonus Points

ID Pre-allocation: Using a distributed coordinator (like Zookeeper) to manage "ranges" of IDs for each app server to minimize network round trips.
Bloom Filters: Utilizing Bloom filters at the caching layer to quickly reject requests for non-existent short URLs, preventing "cache-miss" attacks on the DB.
Anycast Routing: Using BGP Anycast to route users to the nearest edge location, minimizing the Speed of Light (SoL) latency.
Tiered Cache: Implementing a L1 (In-memory/Guava) and L2 (Redis) cache strategy to handle extremely hot keys.
Design Breakdown

Functional Requirements

Core Use Cases:
Generate a unique short URL from a long URL.
Redirect users from a short URL to the original long URL (301 Moved Permanently).
Support custom short aliases.
Set optional expiration times for links.
Scope Control:
In-scope: Shortening, Redirection, High-scale ID generation, Basic Expiration.
Out-of-scope: Advanced analytics/click-tracking (MVP focuses on throughput), User accounts/Management UI, Editing existing short URLs.

Non-Functional Requirements

Scale: Support 1 Million QPS (Read) and 10k QPS (Write).
Latency: Redirection (Read) < 50ms (P99). Creation (Write) < 200ms (P99).
Availability & Reliability: 99.99% availability (Tier 0 service).
Consistency: Eventual consistency for redirection is acceptable; strong consistency for ID generation is required.
Security: Prevent URL discovery through sequential guessing (use non-sequential ID mapping).

Estimation

Traffic Estimation:
Total QPS: 1,000,000.
Write QPS: 10,000.
Read QPS: 990,000.
Storage Estimation:
10k writes/sec 86,400 sec/day 365 days * 5 years ≈ 1.5 Trillion URLs.
500 bytes per record (URL + Metadata) ≈ 750 TB total storage.
Bandwidth Estimation:
Outgoing (Reads): 1M QPS * 500 bytes ≈ 500 MB/s (4 Gbps).
Incoming (Writes): 10k QPS * 1 KB ≈ 10 MB/s.

Blueprint

Concise Summary: A globally distributed URL shortening service utilizing a range-based ID generator for uniqueness and a heavy Redis-based caching layer for ultra-low latency redirection.
Major Components:
API Gateway: Handles rate limiting, authentication, and routing.
Shortening Service: Stateless service that fetches ID ranges and persists mappings.
Redirection Service: High-throughput service focused on cache-heavy URL lookups.
ID Generator (Zookeeper-backed): Manages numeric ID ranges to ensure distributed uniqueness.
NoSQL Database (DynamoDB/Cassandra): Stores URL mappings; chosen for horizontal scalability.
Distributed Cache (Redis): Stores hot URL mappings to satisfy the 1M QPS requirement.
Simplicity Audit: This design avoids complex message queues or stream processing for the MVP, focusing on the direct path from request to redirection.
Architecture Decision Rationale:
Why this architecture?: Range-based ID generation is faster than checking for DB collisions. NoSQL allows us to scale storage linearly with traffic.
Functional Satisfaction: Covers shortening, redirection, and custom aliases via a unified DB schema.
Non-functional Satisfaction: High availability is achieved through stateless services and replicated NoSQL; low latency is achieved via Redis.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing:
Global Load Balancing: Use Latency-based routing to direct users to the nearest data center.
SSL/TLS Termination: Handled at the Edge to reduce processing load on application servers.
Security & Perimeter:
Rate Limiting: Implement strict limits (e.g., 5 requests/sec per IP for writes) to prevent scraping or DOS.
WAF: Block common injection attacks and bot traffic.

Service

Topology & Scaling:
Stateless Design: All services are stateless, allowing for horizontal auto-scaling based on CPU/Request count.
Isolation: Separate the Shortening (Write) and Redirection (Read) clusters. Redirection requires significantly more replicas.
API Schema Design:
POST /v1/shorten:
Protocol: REST/JSON
Request: { long_url: string, custom_alias: string?, expire_at: timestamp? }
Response: { short_url: string }
GET /{short_code}:
Protocol: HTTP/REST
Response: 301 Redirect with Location header.
Resilience & Reliability:
Retries: Shortening service retries range acquisition from ID Generator with exponential backoff.
Circuit Breaker: If the DB is slow, the Redirection service fails fast or serves a "Try again later" page for non-cached items.

Storage

Access Pattern:
Write: 10k/s. Read: 990k/s. Read-heavy.
Database Table Design:
Table: `URL_Mapping
short_code (String, PK): The Base62 encoded ID.
long_url (String): The original URL.
created_at (Timestamp): For cleanup.
expire_at (Timestamp): Index for TTL-based deletion.
user_id (String): Owner ID for management (Optional).
Technical Selection:
DynamoDB (or Cassandra): Chosen for its ability to handle 10k+ writes/sec and PB-scale storage with consistent single-digit millisecond performance.
Distribution Logic:
Sharding: Partition by short_code hash to ensure uniform distribution of traffic.
Reliability & Recovery:
Multi-Region Replication (Global Tables) to survive regional outages and provide local reads.

Cache

Purpose & Justification: Crucial for handling 1M QPS; prevents DB exhaustion.
Key-Value Schema:
Key: url:{short_code}
Value: long_url (string)
TTL: 24 hours (Dynamic based on popularity).
Technical Selection: Redis Cluster. High availability through replication and sharding.
Failure Handling:
Cache Aside: Application checks Redis first, then DB.
Bloom Filter: Stored in Redis to avoid DB hits for keys that definitely don't exist.
Wrap Up

Advanced Topics

Trade-offs (PACELC): We prioritize Availability and Partition Tolerance (AP) for redirects. If a record is slightly stale (e.g., a just-updated URL), it’s less critical than the service being down.
ID Generation (Alternative): Could use MD5/SHA-256 and take the first 7 chars, but collisions are guaranteed at 1.5T scale. Range-based counter + Base62 encoding is collision-free.
Hot Spot Handling: Extremely popular URLs (e.g., a viral Tweet) can be handled by "Cache-Local" memory in the redirection service (L1 cache) to avoid even the Redis network hop.
Base62 Encoding:
62^7 \approx 3.5 \text{ Trillion} combinations.
62^8 \approx 218 \text{ Trillion} combinations.
7 characters are sufficient for our 1.5T requirement.
Security: To prevent sequential guessing of short codes (e.g., .../1, .../2), the numeric ID from the range generator is passed through a Feistel Cipher or a simple bit-shuffling function before Base62 encoding to make the output look random while remaining reversible and unique.