The Question
Design

Real-time Stock Price Alert System

Design a high-scale system capable of monitoring real-time stock price movements and triggering user-defined alerts (e.g., 'Notify me if AAPL goes above $180'). The system must handle 10 million daily active users, 100 million configured alerts, and a stream of 100,000 price updates per second. Focus on minimizing the latency between a price crossing a threshold and the notification delivery, while ensuring the system remains resilient to massive bursts in market activity.
Kafka
Redis
PostgreSQL
Push Notifications
Consistent Hashing
ZSET
Questions & Insights

Clarifying Questions

Scale & Performance: What is the expected scale in terms of users (DAU) and total active alerts? (Assumption: 10M DAU, 100M total alerts).
Latency: What is the end-to-end latency requirement from a price hitting a target to the user receiving a notification? (Assumption: Near real-time, < 1 second).
Data Source: How many tickers are we tracking and what is the update frequency? (Assumption: 10,000 tickers, each updating every 100 ms, i.e., 100,000 updates/sec in total).
Alert Types: Are we supporting complex logic (e.g., RSI, moving averages) or just simple price thresholds? (Assumption: Simple "Above X" and "Below X" thresholds for MVP).
Delivery Channels: Which channels are required? (Assumption: Push Notifications only for MVP).

Thinking Process

The core challenge is the High-Frequency Matching Problem: efficiently checking 100M alerts against 100,000 price updates per second without overwhelming the database.
How do we ingest high-velocity price data without dropping events? Use a message bus (Kafka) to buffer incoming price ticks from external providers.
How do we avoid scanning the entire database for every price update? Shard alerts by Ticker ID and maintain an in-memory sorted data structure (like a Treap or Sorted Set) per ticker to find "triggered" alerts in O(log N + K) time, where N is the number of alerts on that ticker and K is the number triggered by the update.
How do we ensure a user isn't spammed if the price oscillates around the threshold? Implement alert state management (e.g., mark as "triggered" and require manual or cool-down reset).
How do we handle notification delivery at scale? Decouple matching from delivery using a separate notification queue to ensure the matching engine isn't blocked by network I/O.

Bonus Points

Precision vs. Recall Trade-off: Discuss handling "gaps" in price data (e.g., price jumps from 99 to 101 without hitting $100 exactly) using range-based triggering.
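Exactly-once Processing: Leveraging Kafka's idempotent producers and transactions so that a price move publishes each triggered alert to the notification queue exactly once. Delivery to the external push provider sits outside Kafka, so the notification consumer still needs to deduplicate by alert_id.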
Backpressure Handling: Implementing a "latest-price-only" drop policy in the Matcher Service if the processing lag exceeds a threshold, prioritizing current market state over historical ticks.
Data Locality: Using a consistent hashing strategy to ensure all alerts for AAPL and the price stream for AAPL land on the same Matcher Service instance to minimize cross-service chatter.
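The range-based triggering mentioned under Precision vs. Recall can be sketched as a crossing test between the previous and current price of a ticker. This is a minimal illustration, not the system's actual logic: the `Condition` enum and the `crossed` function are hypothetical names, and prices are plain floats for brevity (a real system would more likely use fixed-point or decimal values).

```python
from enum import Enum


class Condition(Enum):
    ABOVE = "ABOVE"
    BELOW = "BELOW"


def crossed(prev_price: float, curr_price: float,
            threshold: float, condition: Condition) -> bool:
    """Return True if the move from prev_price to curr_price reaches or passes the threshold."""
    if condition is Condition.ABOVE:
        # Fires on an upward move that ends at or beyond the threshold,
        # even if no tick ever printed the threshold exactly.
        return prev_price < threshold <= curr_price
    # Mirror image for BELOW: a downward move that ends at or under the threshold.
    return prev_price > threshold >= curr_price


# A jump from 99 to 101 triggers an "above 100" alert although 100 never printed.
print(crossed(99.0, 101.0, 100.0, Condition.ABOVE))
```

Comparing against the previous price also means an alert does not fire again while the price simply stays beyond the threshold.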
Design Breakdown

Functional Requirements

Core Use Cases:
Users can create, update, and delete price alerts (Above/Below price).
System monitors real-time stock price streams.
System sends a Push Notification when a threshold is met.
Scope Control:
In-Scope: Real-time price tracking, simple threshold alerts, push notifications.
Out-of-Scope: Complex technical indicators (MACD, RSI), SMS/Email delivery, Portfolio tracking, Historical price charts.

Non-Functional Requirements

Scale: Support 100M active alerts and 100k price updates/sec.
Latency: Sub-second alert delivery (p99 < 1s).
Availability & Reliability: 99.9% availability; alerts must not be lost (Reliability > Availability).
Consistency: Eventual consistency for alert creation (takes a few seconds to go live), but high accuracy for matching.
Fault Tolerance: Automatic recovery of in-memory matching state from the database if a node fails.

Estimation

Traffic Estimation:
Write QPS (Alert Creation): 100M alerts / 30 days ≈ 40 req/sec. (Negligible).
Ingestion QPS: 10,000 tickers * 10 updates/sec = 100,000 price events/sec.
Storage Estimation:
Alerts Table: 100M rows * 100 bytes ≈ 10 GB (Fits in modern DB).
Bandwidth Estimation:
Ingestion: 100k events/sec * 100 bytes ≈ 10 MB/s.
Outgoing Notifications: Peak event (e.g., market crash) could trigger 1M+ notifications in seconds.
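The arithmetic behind these estimates can be reproduced in a few lines. The inputs are the assumptions stated in this section (30-day creation window, 100-byte rows and events); the variable names are illustrative only.

```python
SECONDS_PER_DAY = 86_400

total_alerts = 100_000_000
creation_window_days = 30          # alerts assumed to be created evenly over 30 days
tickers = 10_000
updates_per_ticker_per_sec = 10    # one update every 100 ms
bytes_per_row = 100
bytes_per_event = 100

alert_write_qps = total_alerts / (creation_window_days * SECONDS_PER_DAY)
ingest_events_per_sec = tickers * updates_per_ticker_per_sec
alerts_table_gb = total_alerts * bytes_per_row / 1e9
ingest_mb_per_sec = ingest_events_per_sec * bytes_per_event / 1e6
avg_alerts_per_ticker = total_alerts // tickers

print(f"alert writes/sec:      {alert_write_qps:.1f}")
print(f"price events/sec:      {ingest_events_per_sec}")
print(f"alerts table (GB):     {alerts_table_gb:.1f}")
print(f"ingest bandwidth MB/s: {ingest_mb_per_sec:.1f}")
print(f"avg alerts per ticker: {avg_alerts_per_ticker}")
```

The average of 10,000 alerts per ticker is only a mean; popular tickers will hold far more, which matters when sizing individual Matcher nodes.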

Blueprint

Concise Summary: A streaming architecture where price updates are ingested into Kafka and processed by a distributed Matcher Service that maintains in-memory sorted indexes of alerts for low-latency triggering.
Major Components:
API Gateway: Entry point for users to manage alert configurations.
Alert Service: Manages the lifecycle of alert rules in the persistent DB.
Price Ingester: Connects to WebSocket/SSE stock providers and publishes to Kafka.
Kafka (Messaging): Buffers high-volume price updates and decouples notification tasks from matching.
Matcher Service (Data Processing): Performs high-speed matching of prices against rules using local memory and Redis.
Redis (Cache): Stores active alert rules to allow Matcher nodes to hydrate their local memory quickly.
PostgreSQL (Storage): Source of truth for all user alert configurations.
Simplicity Audit: This design avoids complex stream-processing frameworks (like Flink) for the MVP, using a custom Matcher Service with local memory for maximum performance and predictable scaling.
Architecture Decision Rationale:
Why this architecture?: Price alerts are a classic "Needle in a Haystack" problem. In-memory sorted sets are the most efficient way to find which of the 10,000 alerts for "TSLA" are triggered by a price change to $200.
Functional Satisfaction: Covers alert CRUD and real-time notification.
Non-functional Satisfaction: Kafka ensures no price updates are lost during spikes; sharding allows horizontal scaling to millions of alerts.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: Not critical for the stock price stream (internal ingestion), but user requests to manage alerts reach the API Gateway through latency-based DNS routing.
Security & Perimeter:
API Gateway: Handles JWT validation.
Rate Limiting: Limits alert creation (e.g., 50 alerts per user) to prevent resource exhaustion.

Service

Topology & Scaling:
Matcher Service: Partitioned by TickerID. Each node handles a subset of tickers. Uses consistent hashing to ensure price ticks and alerts for the same ticker meet at the same node.
Statelessness: The Alert Service is stateless. The Matcher Service is stateful (in-memory rules) but can be rehydrated from Redis/PostgreSQL.
API Schema Design:
POST /v1/alerts: {ticker, price, condition: ABOVE|BELOW}. Returns alert_id.
DELETE /v1/alerts/{id}: Idempotent deletion.
Resilience:
Circuit Breakers: Applied to the Notification Service to prevent slow external push providers from backing up the system.
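The consistent hashing used to route a ticker to its Matcher node could look roughly like the sketch below. `HashRing`, the node names, and the choice of 100 virtual nodes per instance are assumptions for illustration; MD5 is used only because it is stable across processes, not for security.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable across processes and restarts, unlike the built-in hash().
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Maps ticker symbols to Matcher nodes using a ring of virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, ticker: str) -> str:
        # Walk clockwise to the first ring point after the ticker's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, _hash(ticker)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["matcher-0", "matcher-1", "matcher-2"])
print(ring.node_for("AAPL"))
```

Adding a node only moves the tickers that now fall on the new node's ring points; all other tickers stay where they were, which limits how much in-memory state has to be rehydrated.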

Storage

Access Pattern:
Alert Service: Write-heavy (creation/deletion).
Matcher Service: Read-heavy (initial load), but mostly relies on memory.
Database Table Design:
Alerts Table: id (UUID), user_id, ticker_id (Indexed), target_price, condition (ENUM), status (ACTIVE/TRIGGERED).
Technical Selection: PostgreSQL. Standard relational DB handles 100M rows easily with indexing on ticker_id.
Distribution Logic: Sharded by user_id for management, but indexed by ticker_id for the matching engine.

Cache

Purpose & Justification: Redis acts as a distributed lookup layer. When a Matcher Service node starts up or a new alert is created, Redis provides a faster hydration source than the main DB.
Key-Value Schema:
Key: ticker:{id}:alerts
Value: Sorted Set (ZSET) where score is the target_price and member is the alert_id.
Failure Handling: If Redis fails, Matcher nodes fall back to the PostgreSQL replica.
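A minimal sketch of how the key-value schema above might be written and read follows. It assumes a redis-py style client is passed in (the `zadd` and `zrange` calls follow that library's signatures); the helper names `alerts_key`, `index_alert`, and `load_alerts` are hypothetical.

```python
def alerts_key(ticker_id: str) -> str:
    # Key layout taken from the schema above: ticker:{id}:alerts
    return f"ticker:{ticker_id}:alerts"


def index_alert(client, ticker_id: str, alert_id: str, target_price: float) -> None:
    # ZADD key score member: the score is the target price, the member is the alert id.
    client.zadd(alerts_key(ticker_id), {alert_id: target_price})


def load_alerts(client, ticker_id: str):
    # Read the whole ZSET in ascending score order to hydrate a Matcher node.
    # With redis-py this returns a list of (member, score) tuples.
    return client.zrange(alerts_key(ticker_id), 0, -1, withscores=True)
```

With a real client, members come back as bytes unless the connection is created with `decode_responses=True`. The schema as written does not record whether an alert is ABOVE or BELOW, so a real implementation would need to carry that elsewhere, for example in separate keys per condition.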

Messaging

Purpose & Decoupling:
Kafka (Price Stream): Buffers ingestion. High throughput (100k+ events/sec).
Notification Queue (Kafka or SQS): Decouples the matching logic from the high-latency task of sending push notifications.
Throughput & Partitioning: Kafka topics partitioned by TickerID to maintain strict ordering of price updates per stock and facilitate sharded processing.
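Because updates for a ticker arrive in order on a single partition, a consumer that has fallen behind can collapse a polled batch to the newest price per ticker, in the spirit of the "latest-price-only" policy listed under Bonus Points. The sketch below assumes a batch is a list of `(ticker, price)` tuples in arrival order; `conflate` is a hypothetical helper.

```python
def conflate(batch):
    """Keep only the most recent price per ticker from an ordered batch."""
    latest = {}
    for ticker, price in batch:
        # Remove and re-insert so the ticker moves to the end of the dict,
        # keeping the output ordered by each ticker's newest tick.
        latest.pop(ticker, None)
        latest[ticker] = price
    return list(latest.items())


batch = [("AAPL", 150.0), ("TSLA", 200.0), ("AAPL", 151.0)]
print(conflate(batch))
```

The trade-off is that a threshold touched only by a dropped intermediate tick can be missed, so this would typically be enabled only when lag exceeds a limit.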

Data Processing

Processing Model: Custom Matcher Service consuming from Kafka.
Processing Logic:
Receive PriceUpdate(Ticker: AAPL, Price: 150.50).
Fetch local SortedSet<Alert> for AAPL.
For "Price Above" alerts: Range query for alerts with threshold <= 150.50.
For "Price Below" alerts: Range query for alerts with threshold >= 150.50.
Filter out alerts already marked as "Triggered" in a local cache. (A Bloom filter alone is risky here: a false positive would suppress an alert that has not actually fired.)
Emit TriggeredAlert to Notification Queue.
Correctness: Checkpointing Kafka offsets only after a batch of price updates is processed.
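The processing logic above can be sketched with two sorted lists per ticker and binary search. This is an illustration under assumptions, not the actual Matcher: `SortedAlerts` and `TickerMatcher` are hypothetical names, and triggered alerts are removed from the index instead of being filtered afterwards.

```python
import bisect


class SortedAlerts:
    """Alert ids ordered by threshold, stored as two parallel lists."""

    def __init__(self):
        self.prices = []
        self.ids = []

    def add(self, alert_id, threshold):
        i = bisect.bisect_right(self.prices, threshold)
        self.prices.insert(i, threshold)
        self.ids.insert(i, alert_id)

    def pop_at_or_below(self, price):
        # Alerts with threshold <= price form a prefix of the sorted list.
        cut = bisect.bisect_right(self.prices, price)
        fired = self.ids[:cut]
        del self.prices[:cut]
        del self.ids[:cut]
        return fired

    def pop_at_or_above(self, price):
        # Alerts with threshold >= price form a suffix of the sorted list.
        cut = bisect.bisect_left(self.prices, price)
        fired = self.ids[cut:]
        del self.prices[cut:]
        del self.ids[cut:]
        return fired


class TickerMatcher:
    """Holds the ABOVE and BELOW alerts for a single ticker."""

    def __init__(self):
        self.above = SortedAlerts()
        self.below = SortedAlerts()

    def add(self, alert_id, threshold, condition):
        (self.above if condition == "ABOVE" else self.below).add(alert_id, threshold)

    def on_price(self, price):
        # "Price Above" fires when threshold <= price; "Price Below" when threshold >= price.
        return self.above.pop_at_or_below(price) + self.below.pop_at_or_above(price)


m = TickerMatcher()
m.add("a1", 150.0, "ABOVE")
m.add("a2", 180.0, "ABOVE")
m.add("b1", 151.0, "BELOW")
print(m.on_price(150.50))
```

The lookup is a binary search plus a slice, but `list.insert` is linear in the list size; a balanced tree or skip list would be the choice if alert creation on a hot ticker became a bottleneck.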
Wrap Up

Advanced Topics

Trade-offs: We chose Eventual Consistency for alert creation. When a user creates an alert, it may take ~1 second to propagate through the Alert Service -> Redis -> Matcher Service. This is acceptable for stock alerts.
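Reliability: If the Matcher Service crashes, it restarts and reloads rules for its assigned Ticker partitions from Redis. Because offsets are committed only after a batch is processed, the consumer resumes from the last committed offset, so no price ticks are lost during the downtime (some may be reprocessed).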
Optimization (The "Flip-Flop" Problem): To avoid sending 100 alerts if a price bounces between 99.99 and 100.01, we implement hysteresis or a cool-down period. Once an alert triggers, its state is moved to TRIGGERED in the DB and it is removed from the active memory of the Matcher Service.
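Scalability: As the number of tickers or alerts grows, we add more partitions to the Kafka topic and more Matcher Service instances. Adding partitions changes which partition a ticker key maps to, so ticker ownership shifts and the affected Matcher nodes must rehydrate their in-memory state.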
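For alerts that are allowed to re-arm, the cool-down option from the "Flip-Flop" discussion could be sketched as a small gate in front of the notification queue. `CooldownGate` is a hypothetical name, the 60-second window is an arbitrary example value, and timestamps are passed in explicitly so the logic is easy to test.

```python
class CooldownGate:
    """Suppresses repeat notifications for the same alert within a time window."""

    def __init__(self, cooldown_seconds: float):
        self.cooldown_seconds = cooldown_seconds
        self._last_fired = {}  # alert_id -> timestamp of the last notification

    def should_notify(self, alert_id: str, now: float) -> bool:
        last = self._last_fired.get(alert_id)
        if last is not None and now - last < self.cooldown_seconds:
            # Still inside the cool-down window: drop this trigger.
            return False
        self._last_fired[alert_id] = now
        return True


gate = CooldownGate(cooldown_seconds=60.0)
print(gate.should_notify("a1", now=0.0))
print(gate.should_notify("a1", now=30.0))
```

One-shot alerts that move to TRIGGERED and leave memory do not need this gate; it only applies if the product lets an alert fire more than once.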