DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.DowngradedOur downstream service providers are currently experiencing outages, and our engineering team is actively working on a resolution. Some services—including the Solver, Partner, and Tools—are temporarily degraded with higher latency and lower bandwidth. Rest assured, Intervipedia, Solutions, and the Question Bank features are not impacted and remain fully operational.
The Question
Design

Real-Time Online Chess Platform

Design a high-scale online chess platform capable of supporting 10 million daily active users. The system must handle real-time multiplayer gameplay with low latency (<200ms), manage distributed matchmaking based on player skill (Elo), and maintain a server-authoritative game state to prevent cheating. Consider how to handle clock synchronization, game persistence for millions of matches daily, and a global leaderboard that updates accurately as games conclude.
WebSockets
Redis
PostgreSQL
Kafka
Consistent Hashing
JWT
Protobuf
Questions & Insights

Clarifying Questions

Scale and Concurrency: What is the target for Daily Active Users (DAU) and peak concurrent games? (Assumption: 10M DAU, 500k concurrent games).
Latency Sensitivity: What is the acceptable "move-to-reflection" latency? (Assumption: < 200ms globally).
Matchmaking Criteria: Is it purely ELO-based, or do we consider geo-proximity and connection quality? (Assumption: Rank-based with secondary latency-based buckets).
Persistence: Do we need to store every move for every game indefinitely, or just the final result? (Assumption: Full move history for anti-cheat and "Game Review" features).
Game Logic: Should the game server run a full chess engine (e.g., Stockfish) for move validation and cheat detection? (Assumption: Lightweight validation on the game server; heavy analysis async).

Thinking Process

Real-Time Synchronicity: How do we maintain a persistent, low-latency duplex connection for 500k games simultaneously?
Server Authority: How do we prevent illegal moves and "time-hacking" in a distributed environment?
Distributed Matchmaking: How do we efficiently pair players of similar skill levels without creating bottlenecks or long wait times?
Reliable Rating Updates: How do we ensure Elo/Glicko-2 ratings are updated atomically and reflected on leaderboards at scale?

Bonus Points

Clock Drift Compensation: Implementing a "Network Time Protocol" (NTP) style handshake to calibrate game clocks, ensuring fair "flagging" (losing on time) despite network jitter.
Z-Order Curve for Matchmaking: Using spatial indexing or bucketing for multi-dimensional matchmaking (Skill + Latency + Behavior Score).
Sticky WebSocket Sessions: Leveraging consistent hashing at the Load Balancer level to route players of the same game to the same Game Server instance to minimize cross-server communication.
Delta-Compressed Move History: Using compact binary formats (Protobuf) for move notation to reduce bandwidth for mobile users.
Design Breakdown

Functional Requirements

Core Use Cases:
Players can join a matchmaking queue and be paired with opponents of similar rank.
Real-time move execution with server-authoritative validation.
Game clock management (Bullet, Blitz, Rapid).
Persistence of game history (PGN format).
Global leaderboard based on Elo ratings.
Scope Control:
In-scope: Matchmaking, Real-time moves, Leaderboards.
Out-of-scope: In-game chat, AI-bot play, tournament bracket management, social/friends lists.

Non-Functional Requirements

Scale: Support 10M DAU and 500k concurrent games.
Latency: Move propagation latency < 100ms (P99).
Availability: 99.99% availability; active games must survive server restarts (reconnection logic).
Consistency: Strong consistency for game state and rating updates.
Security: TLS encryption; server-side move validation to prevent cheating.

Estimation

Traffic:
500k concurrent games = 1M active WebSocket connections.
Avg moves/game = 40. Avg move frequency = 1 move per 5s.
Peak Move QPS = 1M users / 5s = 200k move requests/sec.
Storage:
10M games/day. 1 Game History (PGN) \approx 2KB.
Daily storage: 10M * 2KB = 20GB.
Yearly storage: ~7.3TB.
Bandwidth:
Incoming: 200k QPS * 500 bytes (move packet) \approx 100MB/s.
Outgoing: 200k QPS * 500 bytes (broadcast to opponent) \approx 100MB/s.

Blueprint

Concise Summary: A WebSocket-based architecture using a distributed Matchmaker and authoritative Game Servers. Game state is cached in Redis for low-latency access and persisted in PostgreSQL upon game completion.
Major Components:
API Gateway: Handles authentication and routing for REST and WebSocket upgrades.
Matchmaker Service: Uses Redis-sorted sets to group players by rank and pair them.
Game Server Cluster: Manages real-time game state, clocks, and move validation.
State Store (Redis): Stores ephemeral game data (current board, active clocks).
Rating Worker: Async processing of game results to update Elo and leaderboards.
Simplicity Audit: This design avoids complex distributed lock managers by assigning a specific Game Server instance as the "leader" for a specific Game ID via consistent hashing.
Architecture Decision Rationale:
WebSockets over HTTP: Essential for bidirectional real-time updates and minimizing header overhead.
Redis for Matchmaking: Extremely high throughput for atomic "pop" operations when pairing players.
PostgreSQL: Chosen for game history due to ACID requirements for rating integrity.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: Use a Global Accelerator (Anycast IP) to route traffic to the nearest regional PoP, minimizing TCP handshake latency.
Security & Perimeter:
API Gateway: Performs JWT validation and terminates SSL.
Rate Limiting: Applied per UserID to prevent "move-spamming" DoS.

Service

Matchmaker Service:
Players are placed into "Elo Buckets" (e.g., 1200-1300).
If no match is found in 5s, the bucket range expands.
Once two players are matched, a GameID is generated and they are redirected to a specific Game Server node.
Game Server Cluster:
Stateful Management: Each server handles a subset of active games.
Authority: Validates move legality (FIDE rules). If a move is illegal, the server rejects it without broadcasting.
Heartbeats: Detects player disconnects. Starts a "grace period" timer for reconnection.
API Schema:
POST /v1/match/join: Join queue.
WS /v1/game/{gameId}: WebSocket endpoint for moves.
Message: { move: "e2e4", timestamp: 1625... }
GET /v1/leaderboard: Fetch top players.

Storage

Access Pattern: Heavy writes for game history (end of game), heavy reads/writes for active state (Redis).
Database Table Design:
Games Table: game_id (PK), white_user_id, black_user_id, pgn_history (TEXT), result, end_time.
User_Stats Table: user_id (PK), elo_rating, games_played, wins, losses.
Technical Selection:
PostgreSQL: Handles the relational nature of user stats and game history with strong consistency.
Redis: Sorted sets (ZSET) for the leaderboard (Score = Elo, Member = UserID).

Cache

Purpose: Low-latency storage for active board states and session data.
Key-Value Schema:
game:state:{gameId}: Hash map containing board_fen, white_clock_ms, black_clock_ms, last_move_ts.
Failure Handling: If a Game Server crashes, the new assigned server fetches the state from Redis.

Messaging

Purpose: Decouples game completion from rating calculations and persistence.
Event Schema: GameCompletedEvent { gameId, winnerId, pgn, duration }.
Technical Selection: Kafka or RabbitMQ. Used to ensure that even if the Rating Worker is down, no game results are lost (durability).
Wrap Up

Advanced Topics

Trade-offs: We choose Consistency over Availability (CP) for the game state. A player would rather have the game pause (wait for state) than have the board desynchronize between opponents.
Reliability:
Redis Replication: Use Redis Sentinel or Cluster to ensure the active game state is not lost.
Reconnection Logic: Clients must be able to resume a WebSocket session using the gameId and an auth token.
Bottleneck Analysis:
Matchmaker Hotspots: At 10M DAU, a single Redis instance for matchmaking might bottleneck. Strategy: Shard the matchmaking queue by Elo ranges.
WebSocket Count: A single server can handle ~50k-100k connections. We need 10-20 servers for 1M connections.
Security:
Move Validation: Every move must be checked against the current FEN (Forsyth-Edwards Notation) stored in the server's memory/Redis.
Anti-Cheat: Periodically stream moves to an async Analysis Service running Stockfish to flag high accuracy/engine-like play.