The Question
Design

High-Performance Ad Serving & Tracking System

Design a scalable ad serving platform capable of handling 10,000 requests per second with sub-100ms latency. The system must support advertiser campaign management, targeted ad retrieval based on keywords and location, real-time budget enforcement to prevent significant overspending, and a robust tracking mechanism for billions of impressions and clicks. Discuss how you would handle the trade-offs between low-latency ad delivery and eventual consistency in billing/budgeting, especially under peak load conditions.
Redis
Kafka
PostgreSQL
S3
Flink
gRPC
HMAC
Bloom Filter
Questions & Insights

Clarifying Questions

Traffic Scale: What is the expected scale for the MVP? (Assumed: 10,000 QPS for ad requests, 100M active ads).
Targeting Complexity: Are we supporting complex ML-based behavioral targeting or simple attribute-based (Geo, Keyword, Category) targeting? (Assumed: Simple attribute-based for MVP).
Ad Types: Which formats are supported? (Assumed: Search and Display ads).
Budgeting & Billing: Is "near real-time" budget exhaustion required to prevent overspending? (Assumed: Yes, within a 1-5 minute window).
Latency SLA: What is the p99 latency requirement for the ad-selection response? (Assumed: < 100ms).

Thinking Process

The core challenge of an ad serving system is the "Selection Problem"—quickly filtering millions of ads down to 1-5 relevant ones—and the "Accounting Problem"—ensuring advertisers aren't charged after their budget is gone.
How do we retrieve ads in < 50ms? Use an In-Memory Ad Index (inverted index) to filter by targeting criteria, followed by a lightweight ranker.
How do we handle the massive write-volume of clicks/impressions? Decouple the Ad Server from the data store using an asynchronous messaging queue (Kafka) for event logging.
How do we prevent budget overspending? Maintain a distributed counter in a fast cache (Redis) and have the Ad Server check this before including an ad in the auction.
How do we scale? Stateless Ad Servers that can scale horizontally based on QPS.

Bonus Points

Probabilistic Budgeting: Using tokens or "pacing" algorithms to smooth out ad delivery throughout the day rather than exhausting the budget in the first hour.
Click Fraud Detection: Implementing a stream-processing layer that identifies bot patterns (high-frequency clicks from same IP/User-Agent) and filters them before billing.
Normalization (eCPMs): Converting different bid types (CPC, CPM, CPA) into a common "effective CPM" metric to run a fair unified auction.
Two-Phase Ranking: A coarse-grained filter (retrieval) followed by a fine-grained Ranker (ML model) to balance latency and relevance.
Design Breakdown

Functional Requirements

Core Use Cases:
Advertisers can create and manage campaigns (CRUD).
Users receive relevant ads based on targeting criteria (Keywords, Geo).
System tracks impressions and clicks accurately.
System enforces advertiser budgets.
Scope Control:
In-Scope: Ad retrieval, simple ranking, event collection, budget tracking.
Out-of-Scope: Complex ML model training pipelines, rich media transcoding, real-time bidding (RTB) with external DSPs.

Non-Functional Requirements

Scale: Support 10k QPS (Read) and 50k QPS (Write - Impressions/Clicks).
Latency: Ad selection response in < 100ms.
Availability & Reliability: High availability (99.99%) for ad serving; missing an ad view is okay, but missing a click (revenue) is not.
Consistency: Eventual consistency for budget (minutes) and reporting (hours); strong consistency for campaign metadata.
Security & Privacy: Protection against click-fraud; GDPR/CCPA compliance for user data.

Estimation

Traffic Estimation:
Ad Requests: 10k QPS.
Impressions (10x requests): 100k QPS.
Clicks (1% CTR): 1k QPS.
Storage Estimation:
100M Ads * 2KB per ad = 200GB (Metadata).
100k events/sec * 500 bytes = 50MB/sec raw logs = ~4.3TB/day.
Bandwidth Estimation:
Incoming: 10k 1KB (request) + 101k 0.5KB (events) ≈ 60MB/sec.
Outgoing: 10k * 5KB (ad payload) ≈ 50MB/sec.

Blueprint

The design focuses on a "Search & Rank" architecture. It separates the "Hot Path" (Ad Serving) from the "Cold Path" (Campaign Management and Analytics).
Ad Server: The stateless worker that performs filtering, ranking, and budget checks.
Ad Index (Redis/In-Memory): Stores active ads indexed by targeting attributes (e.g., keyword: "shoes" -> [Ad1, Ad5, Ad9]).
Campaign Service: A standard CRUD service for advertisers to manage budgets and creatives.
Event Collector: A high-throughput gateway to ingest clicks/impressions into Kafka.
Budget Aggregator: A stream processor that updates the "Remaining Budget" in Redis.
Simplicity Audit: We use an inverted index in Redis instead of a complex ElasticSearch cluster to keep the MVP lean. We use simple "eCPM = Bid * CTR" ranking.
Architecture Decision Rationale:
Why this?: Decoupling event tracking via Kafka ensures the Ad Server isn't blocked by database writes, maintaining the sub-100ms SLA.
Functional Satisfaction: Meets all CRUD, serving, and tracking needs.
Non-functional Satisfaction: Redis provides the low-latency reads needed for the Ad Server; Kafka provides the scalability for the write-heavy event stream.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Load Balancing: L7 Load Balancer for SSL termination and path-based routing (/v1/ad to Ad Server, /v1/event to Event Collector).
Rate Limiting: IP-based rate limiting at the Gateway to prevent DDoS and basic click-spamming.

Service

Ad Server:
Protocol: gRPC or REST.
Logic: 1) Fetch user context (Geo/Keyword). 2) Query Ad Index for candidate ads. 3) Filter ads based on budget (Check Redis). 4) Rank ads using Bid * Estimated CTR. 5) Return top N.
Stateless: Can scale horizontally via Auto-scaling Groups on CPU/QPS.
Campaign Service:
Manages the lifecycle of ads (Draft -> Active -> Paused -> Completed).
Syncs "Active" ads to the Ad Index whenever a campaign starts or updates.

Storage

Campaign DB (PostgreSQL):
Schema:
campaigns: id (PK), advertiser_id, total_budget, daily_budget, start_date, end_date.
ads: id (PK), campaign_id, creative_url, bid_amount, targeting_json.
Consistency: Strong consistency for financial and configuration data.
Data Warehouse (S3 + Snowflake/BigQuery):
Stores all raw events for offline ML training and advertiser reporting.

Cache

Ad Index (Redis):
Key-Value: targeting:key:val -> Set<Ad_ID>.
Example: geo:US -> {101, 202}, kw:nike -> {101, 305}.
Update Logic: Campaign Service pushes updates on change.
Budget Cache (Redis):
Key-Value: budget:campaign_id -> remaining_amount.
Failure Handling: If Redis is down, Ad Server defaults to "Allow" (prioritizing revenue over overspend risk).

Messaging

Event Stream (Kafka):
Topics: impressions, clicks.
Partitioning: Partition by campaign_id to ensure ordered processing by the Budget Processor.
Retention: 7 days for replayability.

Data Processing

Budget Processor (Flink/Spark Streaming):
Consumes from Kafka.
Aggregates costs per campaign in 1-minute windows.
Deducts from remaining_amount in Budget Redis.
If budget <= 0, triggers a "Pause" event to the Ad Index.
Wrap Up

Advanced Topics

Trade-offs (PACELC): We choose Availability over Consistency (AP) for the Ad Server. If the Budget Redis is slightly stale, we might overspend by 1-2 minutes of traffic, which is an acceptable business risk compared to the system stopping ads altogether.
Reliability:
Circuit Breaker: If the Ad Index is slow, return a default "House Ad" to maintain UX.
Dead Letter Queues (DLQ): For click events that fail to process, ensuring we don't lose billing data.
Bottleneck Analysis:
Ad Index Size: If 100M ads exceed one Redis node, shard by ad_id or targeting_category.
Hot Partitions: Popular keywords (e.g., "iPhone") might cause hot spots. Use local caching on Ad Servers for top 1000 keywords.
Security:
All click URLs should be signed with a HMAC token containing a timestamp to prevent "Click Replay" attacks.
Distinguishing Insight:
Negative Targeting: For MVP, we often forget "Exclusion List" (e.g., don't show this ad to users who already bought the product). Using a Bloom Filter in the Ad Server can efficiently handle user-level exclusions without massive DB lookups.