The Question
Design

Ad Frequency Capping System

Design an ad frequency capping system for a large-scale advertising platform. The system should limit how often a specific advertisement is shown to a user across multiple channels and devices within a defined time window, enforcing caps in real time with minimal latency impact on ad serving.
Topics: Redis Cluster, Kafka, Atomic Counter, API Gateway, Fixed Window
Questions & Insights

Thinking Process

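To design a frequency capping system, we must balance single-digit-millisecond read latency with high write throughput. The core challenge is the "read-modify-write" cycle at scale: every impression reads a counter, compares it to the cap, and writes an incremented value back.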
How do we store the counters for millions of users without exploding memory? We use a distributed key-value store with TTLs to automatically expire old frequency data.
How do we stay within the latency budget of an ad auction (the cap check gets less than 5 ms of a roughly 100 ms auction window)? We perform the "check" synchronously and the "increment" either as a single atomic command or asynchronously, so the ad-serving path is never blocked on a write.
Fixed Window vs. Sliding Window? For an MVP, we use Fixed Windows (e.g., hourly/daily buckets) because they are significantly cheaper to compute and store than Sliding Windows (which require Redis Sorted Sets or event logs).
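As a minimal sketch of the fixed-window idea, the snippet below rounds a timestamp down to the start of its bucket and builds the counter key from it. The function names and the sample IDs are illustrative assumptions, not part of the original design.

```python
def window_start(now_epoch: int, window_seconds: int) -> int:
    """Round a Unix timestamp down to the start of its fixed window."""
    return now_epoch - (now_epoch % window_seconds)


def cap_key(user_id: str, ad_id: str, window_seconds: int, now_epoch: int) -> str:
    """Build a counter key shared by every impression in the same window."""
    return f"cap:{user_id}:{ad_id}:{window_start(now_epoch, window_seconds)}"


HOUR = 3600

# Two impressions inside the same hourly bucket map to the same key...
k1 = cap_key("u42", "ad7", HOUR, 1_700_000_100)
k2 = cap_key("u42", "ad7", HOUR, 1_700_001_000)

# ...while an impression in the next bucket starts a fresh counter.
k3 = cap_key("u42", "ad7", HOUR, 1_700_003_000)
```

The price of this simplicity is the boundary effect: a user can exhaust the cap at the end of one bucket and again at the start of the next, so the effective limit over a short span can approach twice the configured cap.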

Bonus Points

Write-Back Pattern with Local Aggregation: To reduce Redis pressure, aggregate counts in the Ad Server memory for 100ms before flushing a batch increment to the global store.
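Probabilistic Counting (Bloom Filters/Count-Min Sketch): If memory becomes a bottleneck, use a Bloom Filter to answer "has this user ever seen this ad?" without querying the store. A Bloom Filter can return false positives but never false negatives, so a "no" safely skips the lookup. A Count-Min Sketch can hold approximate counts in fixed memory, at the cost of occasional over-counting.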
Consistency Model: Acknowledge the use of Eventual Consistency. In AdTech, over-delivery (showing 6 ads instead of 5) is slightly annoying, but high latency results in lost revenue.
Edge Deployment: Deploy the Frequency Cap Service at the edge (CDN/PoPs) to minimize the round-trip time (RTT) between the user and the cap check.
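The write-back pattern with local aggregation could look roughly like the sketch below. It is single-threaded and flushes only when a new impression arrives after the interval has elapsed; a production version would likely flush on a timer and guard the buffer with a lock. The class name, the `flush_fn` callback, and the explicit `now` parameter are assumptions made for illustration.

```python
from collections import Counter


class LocalAggregator:
    """Buffers increments in process memory and hands them off in batches."""

    def __init__(self, flush_fn, flush_interval_s: float = 0.1):
        self._pending = Counter()
        self._flush_fn = flush_fn        # hypothetical: sends one batch to the global store
        self._interval = flush_interval_s
        self._batch_started = None       # time of the first increment in the current batch

    def record(self, key: str, now: float) -> None:
        if self._batch_started is None:
            self._batch_started = now
        self._pending[key] += 1
        if now - self._batch_started >= self._interval:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._flush_fn(dict(self._pending))
            self._pending.clear()
        self._batch_started = None


# Four impressions within ~120 ms collapse into one batched write.
batches = []
agg = LocalAggregator(batches.append)
for key, t in [("k", 0.00), ("k", 0.05), ("j", 0.06), ("k", 0.12)]:
    agg.record(key, t)
```

The trade-off is durability: if the Ad Server process dies, up to one interval's worth of increments is lost, which fits the eventual-consistency stance described above.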
Design Breakdown

Functional Requirements

Limit Checks: Check if a specific user has exceeded the cap for a specific Ad/Campaign (e.g., 3 views per 24 hours).
Counter Increment: Update the view count every time an ad is successfully served.
Configurability: Support different time windows (Hour, Day, Week).

Non-Functional Requirements

Ultra-Low Latency: Cap checks must occur in < 5ms to fit within the total 100ms auction window.
High Scalability: Support 1M+ requests per second.
Availability: If the cap service is down, the system should "fail open" (show the ad) rather than "fail closed" (block revenue).
Accuracy: High precision is desired, but 99% accuracy is acceptable to prioritize performance.
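The fail-open requirement can be illustrated with a small wrapper. This is a sketch under the assumption that the cap lookup is exposed as a function (`check_fn`, a hypothetical name) that raises on connection problems or timeouts; a real service would also enforce a deadline on the call itself.

```python
def check_with_fail_open(check_fn, user_id, ad_ids):
    """Return per-ad decisions, allowing every ad if the cap store is unavailable."""
    try:
        return check_fn(user_id, ad_ids)
    except (ConnectionError, TimeoutError):
        # Fail open: serving an extra impression is cheaper than blocking revenue.
        return {ad_id: True for ad_id in ad_ids}


def healthy(user_id, ad_ids):
    # Stand-in for a working cap lookup: pretend "ad1" has reached its cap.
    return {ad_id: ad_id != "ad1" for ad_id in ad_ids}


def broken(user_id, ad_ids):
    # Stand-in for an outage of the backing store.
    raise ConnectionError("cap store unreachable")


normal = check_with_fail_open(healthy, "u42", ["ad1", "ad2"])
degraded = check_with_fail_open(broken, "u42", ["ad1", "ad2"])
```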

Estimation

Users: 100M Daily Active Users.
Ads: 10,000 active campaigns.
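Storage per User: Assume a user sees 20 distinct ads per day, and each record maps (UserID:AdID:Window) -> (Counter).
Keys: 100M users * 20 ads = 2B keys.
Memory: 2B keys * 32 bytes (optimized) ≈ 64 GB of RAM. Real per-key overhead in Redis is usually higher than 32 bytes, so treat this as a lower bound; even at several times this figure, the data fits in a modest Redis cluster.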
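The estimate can be reproduced with a few lines of arithmetic. The 32-byte figure comes from the text above; the 100-byte figure is an assumed, more conservative per-key cost added here only to show how sensitive the total is to that number.

```python
DAILY_ACTIVE_USERS = 100_000_000
ADS_PER_USER_PER_DAY = 20

# One counter key per (user, ad) pair in the daily window.
total_keys = DAILY_ACTIVE_USERS * ADS_PER_USER_PER_DAY


def memory_gb(bytes_per_key: int) -> float:
    """Total counter memory in gigabytes (1 GB = 1e9 bytes)."""
    return total_keys * bytes_per_key / 1e9


optimistic_gb = memory_gb(32)     # figure used in the estimate above
conservative_gb = memory_gb(100)  # assumed value, not from the original estimate
```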

Blueprint

Concise Summary: A low-latency lookup service backed by an in-memory Redis cluster using atomic increments and fixed-window keys.
Major Components:
Ad Server: The entry point that orchestrates the auction and calls the Cap Service.
Cap Service: A Go/Java microservice that implements the logic for windowing and limit checking.
Redis Cluster: The source of truth for real-time counters, utilizing TTLs for self-cleanup.
Event Bus (Kafka): Carries impression events asynchronously so that the "Increment" phase never blocks the ad-serving path.
Simplicity Audit: This design avoids complex stream processing (Flink/Spark) or heavy relational joins by using Redis' native INCR and EXPIRE commands.

High Level Architecture

Sub-system Deep Dive

Service

Topology & Scaling: The Cap Service is a stateless Go-based microservice deployed either as a sidecar next to each Ad Server or as a standalone cluster. It scales horizontally based on CPU usage and request count.
API Spec:
POST /v1/check-limits: Payload {userId, adIds: []}, returns a map {adId: allowed_boolean} with one entry per requested ad.
Communication: Internal gRPC between the Ad Server and the Cap Service, for lower overhead than JSON over HTTP.
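A rough sketch of the check logic behind that endpoint is shown below. It uses a plain dictionary in place of the counter store, and the per-ad `caps` mapping is a hypothetical input, since the source does not say where cap configuration lives.

```python
from typing import Dict, List, Mapping


def check_limits(
    counters: Mapping[str, int],
    caps: Mapping[str, int],
    user_id: str,
    ad_ids: List[str],
    window_start: int,
) -> Dict[str, bool]:
    """Return {adId: allowed} for every ad in the request."""
    decisions = {}
    for ad_id in ad_ids:
        key = f"cap:{user_id}:{ad_id}:{window_start}"
        seen = counters.get(key, 0)           # a missing key means zero impressions so far
        decisions[ad_id] = seen < caps[ad_id]
    return decisions


# Example request: u42 has hit the cap on ad1, seen ad2 once, never seen ad3.
counters = {
    "cap:u42:ad1:1699999200": 3,
    "cap:u42:ad2:1699999200": 1,
}
caps = {"ad1": 3, "ad2": 3, "ad3": 5}
response = check_limits(counters, caps, "u42", ["ad1", "ad2", "ad3"], 1699999200)
```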

Storage

Data Model: Key-value pairs.
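Key Format: cap:{user_id}:{ad_id}:{window_timestamp}, where the braces mark placeholders. On Redis Cluster, wrapping only the user ID in a literal hash tag (e.g., cap:{u42}:ad7:1699999200) keeps all of a user's counters in the same hash slot.
Value: Integer (Count).
Database Logic:
Read: MGET for all AdIDs in the auction. On Redis Cluster, a multi-key command such as MGET requires every key to map to the same hash slot, which the per-user hash tag guarantees.
Write (Async): The Counter Updater Worker consumes events from Kafka and executes INCRBY <key> 1. It sets EXPIRE <key> <window_duration> only when the key is first created (the increment returns 1), so later increments do not keep pushing the TTL forward. The two commands are not atomic together, so a crash between them can leave a key without a TTL; wrapping them in a MULTI/EXEC transaction or a Lua script closes that gap.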
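The read and write paths can be sketched against a tiny in-memory stand-in for the store. `FakeStore` is not a Redis client; it only mimics the INCRBY, EXPIRE, and MGET behaviour described above, and takes `now` explicitly so expiry is easy to follow. The event field names are assumptions.

```python
class FakeStore:
    """In-memory stand-in for the Redis commands named above."""

    def __init__(self):
        self._values = {}
        self._expires_at = {}

    def _purge(self, key, now):
        # Drop the key if its TTL has elapsed.
        if key in self._expires_at and now >= self._expires_at[key]:
            self._values.pop(key, None)
            self._expires_at.pop(key, None)

    def incrby(self, key, amount, now):
        self._purge(key, now)
        self._values[key] = self._values.get(key, 0) + amount
        return self._values[key]

    def expire(self, key, seconds, now):
        if key in self._values:
            self._expires_at[key] = now + seconds

    def mget(self, keys, now):
        for key in keys:
            self._purge(key, now)
        return [self._values.get(key) for key in keys]


def apply_impression(store, event, window_seconds, now):
    """Counter Updater step: increment, and set the TTL only on first creation."""
    start = event["ts"] - event["ts"] % window_seconds
    key = f"cap:{event['user_id']}:{event['ad_id']}:{start}"
    count = store.incrby(key, 1, now)
    if count == 1:
        store.expire(key, window_seconds, now)
    return count


store = FakeStore()
first = apply_impression(store, {"user_id": "u42", "ad_id": "ad7", "ts": 1_700_000_100}, 3600, now=1_700_000_100)
second = apply_impression(store, {"user_id": "u42", "ad_id": "ad7", "ts": 1_700_001_000}, 3600, now=1_700_001_000)
key = "cap:u42:ad7:1699999200"
during_window = store.mget([key], now=1_700_001_000)
after_ttl = store.mget([key], now=1_700_003_700)
```

Because the TTL starts at the first increment rather than at the window boundary, a key can outlive its window by up to one window length. That costs some memory but not correctness, since the window timestamp in the key means stale counters are never read.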

Cache

Implementation: Redis is the primary store for this MVP (acting as a "Speed Layer").
Data Structure: Simple Strings (Integers).
TTL/Eviction: TTL is set to the duration of the frequency window (e.g., 24 hours). This ensures memory is automatically reclaimed without a cleanup job.

Messaging

Implementation: Kafka.
Topic Structure: impression-events partitioned by UserID to ensure ordered updates for the same user.
Guarantees: At-least-once delivery. Occasional double-counting is acceptable for an MVP.
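Keying events by UserID works because the partition is a deterministic function of the key. The sketch below uses CRC32 purely as a stand-in hash; Kafka clients apply their own hash function, and the partition count of 12 is an arbitrary assumption.

```python
import zlib


def partition_for(user_id: str, num_partitions: int) -> int:
    """Map a user ID to a partition; the same ID always lands on the same one."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 12  # assumed value for illustration

# Every impression for u42 goes to one partition, so one consumer sees them in order.
p_first = partition_for("u42", NUM_PARTITIONS)
p_again = partition_for("u42", NUM_PARTITIONS)
```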
Wrap Up

Advanced Topics

Monitoring:
P99 Latency: Critical for the check-limits call.
Redis Memory Usage: To prevent OOM.
Kafka Consumer Lag: To ensure caps are updated in "near" real-time.
Trade-offs:
Accuracy vs. Latency: Because increments flow through Kafka, there is a sub-second lag between an impression and the counter update. A user who is one impression below the cap might be served the ad twice in rapid succession before the counter catches up. This is an acceptable trade-off for serving performance.
Bottlenecks:
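Redis Hot Keys: If a single "Mega-Ad" is targeted at everyone, its traffic could concentrate on one shard if counters were grouped by ad.
Solution: Every key includes the UserID and the cluster shards on it, so the Mega-Ad's counters are spread across all shards instead of piling onto one.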
Failure Handling:
Redis Down: The Cap Service returns "Allowed" for all requests. We prioritize revenue over strict cap enforcement.
Alternatives & Optimization:
Local Cache: The Ad Server could keep a very short-lived (1s) local cache of "Blocked Users" to avoid even the gRPC call to the Cap Service for extremely heavy users.
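A short-lived local cache of blocked entries might be sketched as follows. Keying by (user, ad) pair rather than by user alone is an assumption made here so that a cap on one ad does not suppress others; the class and method names are hypothetical.

```python
class BlockedCache:
    """Remembers recent "blocked" decisions for a short time inside the Ad Server."""

    def __init__(self, ttl_s: float = 1.0):
        self._ttl = ttl_s
        self._expires_at = {}

    def mark_blocked(self, user_id: str, ad_id: str, now: float) -> None:
        self._expires_at[(user_id, ad_id)] = now + self._ttl

    def is_blocked(self, user_id: str, ad_id: str, now: float) -> bool:
        expiry = self._expires_at.get((user_id, ad_id))
        if expiry is None:
            return False
        if now >= expiry:
            # Entry is stale: forget it so the next request asks the Cap Service again.
            del self._expires_at[(user_id, ad_id)]
            return False
        return True


cache = BlockedCache(ttl_s=1.0)
cache.mark_blocked("u42", "ad7", now=10.0)
hit = cache.is_blocked("u42", "ad7", now=10.5)      # within the 1 s TTL
expired = cache.is_blocked("u42", "ad7", now=11.0)  # TTL elapsed
other_ad = cache.is_blocked("u42", "ad8", now=10.5)
```

Only "blocked" results are cached. Caching "allowed" results would let a user exceed the cap for the full lifetime of the cache entry.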