The Question

Scalable URL Shortener Design

Design a high-concurrency URL shortening service like Bitly. The system must support generating shortened URLs from long ones, custom aliases, and high-performance redirection. Address challenges regarding unique ID generation without collisions, handling massive read-heavy traffic (100:1 ratio), and efficient storage management for billions of records over multiple years. Consider trade-offs between different shortening algorithms and data consistency models.

NoSQL

Redis

DynamoDB

Base62

Zookeeper

API Gateway

Snowflake ID

Questions & Insights

Clarifying Questions

Traffic Scale: What is the expected volume of new URL creations and redirections?

Assumption: 100 million new URLs per month and a 100:1 read/write ratio (10 billion redirects/month).

Short Link Length: Are there specific requirements for the length or character set of the shortened URL?

Assumption: Approximately 7 characters using Base62 ([a-z, A-Z, 0-3]) to support ~

3.5 \times 10^{12}

combinations.

Custom Aliases & Expiration: Do we need to support user-defined short links or expiration dates?

Assumption: Yes, both are required for a standard MVP.

Analytics: Is real-time click tracking and geo-analytics required for the MVP?

Assumption: No, to follow YAGNI. We will focus on the core redirection engine and basic click counts.

Thinking Process

Core Bottleneck: The system is read-heavy. The primary challenge is minimizing redirection latency while ensuring unique ID generation without collisions at scale.

Key Progressive Questions:

How do we generate a unique, non-colliding short key efficiently? (Base62 encoding of a distributed counter vs. Hashing).

How do we ensure sub-10ms redirection latency for "hot" links? (Layered caching strategy).

How do we handle storage growth over 5-10 years? (NoSQL horizontal scaling vs. RDBMS sharding).

How do we handle custom aliases without introducing race conditions? (Atomic "put-if-absent" operations).

Bonus Points

Base62 vs. Hashing: Using a distributed ID generator (like Snowflake) + Base62 encoding is superior to MD5 hashing because it avoids the overhead of collision resolution (check-then-insert).

Zookeeper-based Range Allocation: For the ID generator, services can fetch ranges of IDs (e.g., 1-1000) from Zookeeper to generate IDs locally, reducing the pressure on a central database.

Cache Warming: Pre-populating the cache with frequently accessed URLs or using a "Lazy TTL" approach where high-traffic links never expire from the cache.

Security against Crawlers: Implementing "Honey-token" short URLs to detect and rate-limit scrapers trying to enumerate the entire URL space.

Design Breakdown

Functional Requirements

Core Use Cases:

Shorten a long URL to a unique short link.

Redirect a short link to the original long URL (302 Found).

Support custom short link aliases.

Set optional expiration times for links.

Scope Control:

In-scope: Shortening, Redirection, Custom Aliases, Expiration.

Out-of-scope: Detailed user analytics, UI/Dashboard, User accounts/Auth (handled by external IAM), Bulk URL shortening.

Non-Functional Requirements

Scale: 40 Writes/sec, 4,000 Reads/sec (Average); Peak is 5x.

Latency: Redirection should be < 100ms; Shortening should be < 200ms.

Availability & Reliability: 99.99% availability. Links must never be lost (high durability).

Consistency: Eventual consistency for analytics; Strong consistency for custom alias creation (prevent duplicates).

Fault Tolerance: No single point of failure; multi-AZ deployment.

Security & Privacy: Protection against URL brute-forcing and rapid-fire shortening (Rate limiting).

Estimation

Traffic Estimation:

Writes: 100M/month

\approx

40 URLs/sec.

Reads: 10B/month

\approx

4,000 URLs/sec.

Storage Estimation:

Average URL size: 500 bytes.

100M URLs/month

\times

12 months

\times

5 years = 6 Billion URLs.

Total Storage: 6B

\times

500 bytes

\approx

3 TB.

Bandwidth Estimation:

Write: 40

\times

500B

\approx

20 KB/s.

\times

500B

\approx

2 MB/s.

Blueprint

Concise Summary: A high-performance redirection engine using a distributed NoSQL store for durability and Redis for low-latency lookups.

Major Components:

API Gateway: Handles rate limiting and request routing.

Shortening Service: Manages ID generation, Base62 encoding, and storage logic.

NoSQL Database (DynamoDB/Cassandra): Provides a highly scalable Key-Value store for URL mappings.

Redis Cache: Stores hot URL mappings to ensure ultra-low latency for frequent redirects.

Simplicity Audit: This design avoids complex batch processing or message queues, focusing purely on the synchronous request-response flow required for the MVP.

Architecture Decision Rationale:

Why this architecture?: NoSQL is chosen over SQL because the data is naturally a Key-Value pair (ShortURL -> LongURL) and needs to scale horizontally without complex joins.

Functional Requirement Satisfaction: Handles core shortening and redirection with 0(1) lookup time.

Non-functional Requirement Satisfaction: Redis ensures latency targets, while NoSQL ensures multi-region availability and scalability.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing:

Use Global Server Load Balancing (GSLB) to route users to the nearest regional deployment.

Security & Perimeter:

API Gateway: Implements JWT validation (if auth is added) and Rate Limiting (e.g., 100 requests/min per IP) to prevent system abuse.

WAF: Standard protection against common web attacks.

Service

Topology & Scaling:

Stateless Services: The Shortening Service is stateless, allowing auto-scaling based on CPU/Request count.

Multi-AZ: Deployed across 3 Availability Zones to handle data center failures.

API Schema Design:

POST /v1/shorten:

Request: { "long_url": "string", "custom_alias": "string?", "expiry_delta": "int?" }

Response: { "short_url": "string" }

GET /{short_url}:

Returns 302 Redirect to the long URL.

Resilience & Reliability:

Retries: Shortening service retries DB writes with exponential backoff.

Circuit Breaker: If the DB is slow, the service fails fast to prevent resource exhaustion.

Observability:

Metrics: Monitor p99 latency for redirects and DB write errors.

Storage

Access Pattern: 100:1 Read/Write. Reads are by Primary Key (ShortCode).

Database Table Design:

URL_Mapping Table:

short_id (String, Partition Key): The Base62 encoded string.

long_url (String): The destination.

created_at (Timestamp).

expiry_at (Timestamp, TTL Index).

Technical Selection: Amazon DynamoDB or Cassandra.

Rationale: DynamoDB provides predictable performance at any scale and has built-in TTL support to automatically delete expired URLs without manual cleanup scripts.

Distribution Logic:

The short_id serves as a natural partition key, ensuring even distribution across the cluster.

Cache

Purpose & Justification: To reduce read latency and decrease the load on the NoSQL database for viral/popular links.

Key-Value Schema:

Key: url:{short_id}

Value: long_url string.

TTL: 24 hours (with LRU eviction).

Technical Selection: Redis.

Rationale: In-memory performance with support for cluster mode.

Failure Handling: If Redis is down, the system falls back to the NoSQL DB (Cache-aside pattern).

Wrap Up

Advanced Topics

Trade-offs (Consistency vs. Availability): We prioritize Availability for redirects. Using a 302 (Found) instead of a 301 (Permanent) allows us to capture click data on every visit and change the mapping if needed.

ID Generation (Alternative): We could use a Counter in a DB, but that creates a single point of failure. A Distributed ID Generator (Snowflake-like) or a Range-based ID service (using Zookeeper/Etcd) is more robust for high-scale environments.

Collision Handling: By using a distributed counter and Base62 encoding it, we mathematically guarantee no collisions, which is more efficient than hashing and checking for existence in the DB.

Hot Spot Mitigation: Viral links (e.g., a link shared on Twitter) could hit a single DB partition. Redis caches these links, effectively absorbing the "thundering herd."

Security: To prevent "enumeration attacks," we can use a small random offset in our ID generation so short codes aren't perfectly sequential.