The Question
Design

Real-Time Community Chat Platform

Design a large-scale real-time community chat platform similar to Discord. The system should support servers, channels, direct messaging, and live presence indicators for tens of millions of concurrent users, with sub-200ms message delivery and high availability across global regions.
WebSocket
Cassandra
Redis
Pub/Sub
Snowflake ID
Questions & Insights

Thinking Process

Designing Discord requires balancing extreme write throughput with low-latency real-time delivery to large groups of users.
How do we maintain millions of persistent connections? Use a distributed Gateway layer (WebSockets) that manages stateful connections and maps SessionID to UserID.
How do we handle message fan-out for large servers? Instead of pushing to every user immediately, we use a Pub/Sub mechanism where the Gateway subscribes only to active channels the user is viewing.
What is the optimal storage for message history? Messages are time-series in nature and high-volume. A Wide-Column store (Cassandra/ScyllaDB) is ideal for range queries by ChannelID.
How do we handle "Presence" (Online/Offline) at scale? Use a distributed in-memory cache (Redis) with heartbeats and key expiry, so transient state changes never hit the primary database.

Bonus Points

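Message Ordering: Use Snowflake IDs (64-bit, time-sortable IDs generated independently on each node) to order messages across shards without a central sequencer. IDs from different generators are ordered only to within clock skew, so this gives a consistent sort key for all clients rather than strict causal ordering.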
Presence Optimization: Implement "Lazy Loading" for presence in large servers (> 100k members) to prevent "thundering herd" updates when a popular user goes online.
Gateway Sharding: Utilize consistent hashing on the Load Balancer to ensure client reconnections land on the same gateway shard when possible, reducing session re-authentication overhead.
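Message Bucketing: Follow the approach Discord has described publicly: partition messages by (channel_id, time bucket), where each bucket covers a fixed time window (about 10 days in Discord's write-up). This keeps any single Cassandra partition bounded in size and lets recent history be read from one or two partitions.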
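The gateway sharding idea above relies on consistent hashing. A minimal sketch of a hash ring with virtual nodes follows; the HashRing class, the gateway names, and the choice of 100 virtual nodes are illustrative assumptions, not part of the original design.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is salted per process.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a gateway is removed, only the keys that mapped to it move; every other user keeps landing on the same gateway, which is the property that makes session resumption cheap.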
Design Breakdown

Functional Requirements

Users can send and receive real-time messages in channels.
Users can see the online/offline status (Presence) of friends/server members.
Message history is persisted and searchable within a channel.
Users can join "Servers" containing multiple "Channels."

Non-Functional Requirements

Low Latency: Real-time delivery should feel instantaneous (< 200ms).
High Availability: The system must remain functional even if regional shards fail.
Scalability: Support millions of concurrent users and billions of messages.
Order Consistency: Messages must appear in the same order for all users in a channel.

Estimation

DAU: 20 Million.
Messages per user/day: 50.
Total Messages/Day: 1 Billion.
Average QPS: 1B / 86,400 s ≈ 11,600, rounded to ~12,000 writes/sec. Peak QPS: ~30,000 writes/sec (assuming a peak-to-average ratio of about 2.5x).
Storage (1 Year): 1B msgs * 200 bytes/msg ≈ 200 GB/day → ~73 TB/year (before replication).
Presence: 20M users sending heartbeats every 30s ≈ 667k heartbeats/sec (an upper bound that assumes every daily active user is connected at the same time).
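The arithmetic behind these estimates can be checked with a few lines of Python. The variable names are illustrative, and the 2.5x peak factor is an assumption chosen to match the stated peak figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

dau = 20_000_000
messages_per_user_per_day = 50
bytes_per_message = 200
heartbeat_interval_s = 30
peak_factor = 2.5  # assumed peak-to-average ratio

messages_per_day = dau * messages_per_user_per_day
avg_write_qps = messages_per_day / SECONDS_PER_DAY
peak_write_qps = avg_write_qps * peak_factor

storage_per_day_gb = messages_per_day * bytes_per_message / 1e9
storage_per_year_tb = storage_per_day_gb * 365 / 1000

# Upper bound: assumes every daily active user is connected simultaneously.
heartbeats_per_s = dau / heartbeat_interval_s

print(f"messages/day:    {messages_per_day:,}")
print(f"avg write QPS:   {avg_write_qps:,.0f}")
print(f"peak write QPS:  {peak_write_qps:,.0f}")
print(f"storage/day:     {storage_per_day_gb:,.0f} GB")
print(f"storage/year:    {storage_per_year_tb:,.0f} TB (before replication)")
print(f"heartbeats/sec:  {heartbeats_per_s:,.0f}")
```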

Blueprint

Concise Summary: A WebSocket-based architecture using a stateful Gateway to push messages, backed by a Wide-Column store for high-volume message persistence and Redis for transient presence data.
Major Components:
Gateway Service (WebSocket): Maintains persistent connections and handles real-time message routing to/from clients.
Message Service: Orchestrates message validation, persistence, and triggers the fan-out via Pub/Sub.
Presence Service: Tracks user heartbeats and broadcasts status changes to interested parties.
Message Store (Cassandra): Highly scalable storage for trillions of messages indexed by ChannelID.
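Simplicity Audit: This design avoids complex stream processing (Flink) or heavy search clusters (Elasticsearch) in favor of direct DB writes and a simple Pub/Sub for the MVP chat experience. The trade-off is that history is retrievable by channel and time range only; full-text search would require adding a search index later.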

High Level Architecture

Sub-system Deep Dive

Service

Gateway Service:
Stateless in logic but stateful in connection. Each instance maintains a mapping of UserID -> WebSocket Descriptor.
Uses a Session Service to resume connections without full re-authentication.
Message Service (API):
Handles POST /messages requests.
Assigns a Snowflake ID (64-bit time-sortable) to every message to ensure ordering.
Updates the "Last Message ID" in the channel metadata.
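The Snowflake ID assignment in the Message Service could look roughly like the sketch below. The bit layout (timestamp, 10 worker bits, 12 sequence bits), the epoch constant, and the SnowflakeGenerator name are assumptions for illustration; a real deployment would also need a way to hand out unique worker IDs.

```python
import threading
import time

EPOCH_MS = 1_420_070_400_000  # 2015-01-01T00:00:00Z, assumed custom epoch
WORKER_BITS = 10              # assumed layout
SEQUENCE_BITS = 12            # assumed layout
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Generates 64-bit, time-sortable IDs on a single worker."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now <= self._last_ms:
                # Same millisecond (or clock went backwards): stay on the
                # last timestamp and bump the sequence to remain monotonic.
                now = self._last_ms
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: borrow the next millisecond.
                    now += 1
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )


def timestamp_ms(snowflake_id: int) -> int:
    """Recover the Unix timestamp (ms) embedded in an ID."""
    return (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

IDs from one generator are strictly increasing. IDs from different workers within the same millisecond are ordered by worker ID rather than by arrival, so cross-worker ordering is only as good as clock synchronization.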

Storage

Message Store (Cassandra/ScyllaDB):
Partition Key: channel_id (distributes chat rooms across nodes).
Clustering Key: message_id (ordered descending).
This allows high-performance range queries: SELECT * FROM messages WHERE channel_id = ? LIMIT 50 returns the 50 newest messages, because rows within a partition are already stored newest-first.
Metadata DB (Postgres):
Stores relational data: Servers, Channels, User Roles, and Friendships.
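Server-scoped tables (Servers, Channels, User Roles) are sharded by ServerID for horizontal scale; user-scoped data such as Friendships does not belong to a server and would need its own shard key (e.g., UserID).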
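To make the message-store layout concrete, here is an in-memory sketch of the bucketed variant mentioned under Bonus Points, where the partition key becomes (channel_id, bucket) and rows are read newest-first. The 10-day window, the 22-bit timestamp shift, and the MessageStore, bucket_of, and make_id names are assumptions for illustration; this models the access pattern only and is not a Cassandra client.

```python
from collections import defaultdict

TIMESTAMP_SHIFT = 22                      # assumed Snowflake layout
BUCKET_MS = 10 * 24 * 60 * 60 * 1000      # assumed 10-day bucket window


def make_id(ms_since_epoch: int, sequence: int = 0) -> int:
    """Build a Snowflake-like ID for the examples (worker bits left at 0)."""
    return (ms_since_epoch << TIMESTAMP_SHIFT) | sequence


def bucket_of(message_id: int) -> int:
    """Time bucket derived from the timestamp embedded in the ID."""
    return (message_id >> TIMESTAMP_SHIFT) // BUCKET_MS


class MessageStore:
    """Stand-in for a table keyed by ((channel_id, bucket), message_id DESC)."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # (channel, bucket) -> {id: body}
        self._oldest_bucket = {}              # channel -> smallest bucket seen

    def insert(self, channel_id, message_id, body):
        bucket = bucket_of(message_id)
        self._partitions[(channel_id, bucket)][message_id] = body
        oldest = self._oldest_bucket.get(channel_id, bucket)
        self._oldest_bucket[channel_id] = min(oldest, bucket)

    def latest(self, channel_id, before_id, limit=50):
        """Newest-first page of messages with id < before_id."""
        results = []
        oldest = self._oldest_bucket.get(channel_id)
        if oldest is None:
            return results
        bucket = bucket_of(before_id)
        # Walk buckets backwards in time until the page is full.
        while bucket >= oldest and len(results) < limit:
            rows = self._partitions.get((channel_id, bucket), {})
            for message_id in sorted(rows, reverse=True):
                if message_id < before_id:
                    results.append((message_id, rows[message_id]))
                    if len(results) == limit:
                        break
            bucket -= 1
        return results
```

For an active channel a page of 50 messages is usually served from the newest bucket alone; quiet channels may need to step back through several empty buckets, which is the main cost of this layout.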

Cache

Presence Cache (Redis):
Stores UserID -> {Status, LastActiveTimestamp}.
Each heartbeat issues a Redis SET with a 60-second expiration (TTL), which tolerates one missed 30-second heartbeat. If no heartbeat arrives before the TTL elapses, the key expires, and the user is considered "Offline."
Logic: On status change, query the "Friends" list from Metadata DB and push updates to the Gateway.
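The heartbeat-and-expiry logic can be sketched without a Redis dependency as a small TTL map. The PresenceCache class and its method names are hypothetical; in Redis each heartbeat would correspond to something like SET presence:{user_id} online EX 60, where the key name is also an assumption.

```python
import time

HEARTBEAT_TTL_S = 60  # from the text: key expires 60 s after the last heartbeat


class PresenceCache:
    """In-memory stand-in for UserID -> {Status, expiry} stored with a TTL."""

    def __init__(self, ttl_s=HEARTBEAT_TTL_S, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # user_id -> (status, expires_at)

    def heartbeat(self, user_id, status="online") -> bool:
        """Refresh the TTL. Returns True if the visible status changed,
        which is the signal to fan out a presence update."""
        previous = self.status(user_id)
        self._entries[user_id] = (status, self._clock() + self._ttl_s)
        return previous != status

    def status(self, user_id) -> str:
        entry = self._entries.get(user_id)
        if entry is None or entry[1] <= self._clock():
            # Missing or expired entry: treat the user as offline.
            self._entries.pop(user_id, None)
            return "offline"
        return entry[0]
```

Only heartbeats that change the visible status need to trigger the friends-list lookup, so the steady-state heartbeat traffic stays inside the cache. Detecting the transition to offline is a separate concern: Redis does not publish key-expiry events unless keyspace notifications are enabled, so a design may instead check staleness when presence is read.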

Messaging

Internal Pub/Sub (Redis Pub/Sub or NATS):
When a message is saved to Cassandra, the Message Service publishes to a topic: channel:{channel_id}.
Gateway nodes subscribe to the topics of channels currently being viewed by their connected users.
This ensures a message is only "fanned out" to active observers, not all 1,000,000 members of a server.
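The subscription pattern described above can be illustrated with an in-process broker. Broker and Gateway here are hypothetical stand-ins for Redis Pub/Sub (or NATS) and a WebSocket gateway node, and the outbox list stands in for actual socket writes.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for the internal Pub/Sub system."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # topic -> set of gateways

    def subscribe(self, topic, gateway):
        self._subscribers[topic].add(gateway)

    def unsubscribe(self, topic, gateway):
        self._subscribers[topic].discard(gateway)

    def publish(self, topic, message) -> int:
        """Deliver to every subscribed gateway; returns how many received it."""
        gateways = list(self._subscribers.get(topic, ()))
        for gateway in gateways:
            gateway.deliver(topic, message)
        return len(gateways)


class Gateway:
    """Subscribes to a channel topic only while a local user is viewing it."""

    def __init__(self, name, broker):
        self.name = name
        self._broker = broker
        self._viewers = defaultdict(set)  # topic -> local user_ids viewing it
        self.outbox = []                  # (user_id, message) pairs "sent"

    def open_channel(self, user_id, channel_id):
        topic = f"channel:{channel_id}"
        if not self._viewers[topic]:
            self._broker.subscribe(topic, self)  # first local viewer
        self._viewers[topic].add(user_id)

    def close_channel(self, user_id, channel_id):
        topic = f"channel:{channel_id}"
        viewers = self._viewers.get(topic)
        if not viewers:
            return
        viewers.discard(user_id)
        if not viewers:
            del self._viewers[topic]
            self._broker.unsubscribe(topic, self)  # last local viewer left

    def deliver(self, topic, message):
        for user_id in sorted(self._viewers.get(topic, ())):
            self.outbox.append((user_id, message))
```

The broker's fan-out cost is proportional to the number of gateway nodes with at least one active viewer, not to the server's member count; each gateway then does the final per-connection push locally.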
Wrap Up

Advanced Topics

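Trade-offs (Consistency vs Availability): This design prioritizes Availability and Latency (AP). In extreme partitions, two users might briefly see messages in slightly different orders; the views converge once all messages arrive and are sorted by message ID (Eventual Consistency), and the system remains usable throughout. This relaxes the strict ordering requirement to "same order eventually."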
Bottlenecks: The Presence Service is the highest QPS component. As the user base grows, we must use Presence Sharding (hashing users to specific Redis clusters) to avoid a single Redis instance becoming a bottleneck.
Failure Handling:
Gateway Failure: Clients detect a disconnect and perform an exponential backoff reconnect.
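Cassandra Replication: Uses a replication factor of 3 across availability zones so that data survives the loss of a node or a whole zone. Durability still depends on the write consistency level; quorum writes (e.g., LOCAL_QUORUM) ensure a majority of replicas acknowledge each message.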
Alternatives & Optimization:
Optimization: For very large servers (e.g., 500k members), we stop sending "User joined/left" or "Presence" updates to everyone. We only update the UI for the specific channel/user-list the client is actively looking at.
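The exponential backoff reconnect mentioned under Gateway Failure is usually combined with random jitter so that clients dropped by the same gateway do not all reconnect at once. A minimal sketch, where the function name and the base, cap, and attempt values are illustrative assumptions:

```python
import random


def backoff_delays(base_s=1.0, cap_s=60.0, attempts=8, rng=random.random):
    """Yield reconnect delays using exponential backoff with full jitter.

    Each delay is drawn uniformly from [0, min(cap_s, base_s * 2**attempt)).
    """
    for attempt in range(attempts):
        ceiling = min(cap_s, base_s * (2 ** attempt))
        yield rng() * ceiling


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays()):
        print(f"attempt {attempt}: wait {delay:.2f}s before reconnecting")
```

A client would sleep for each delay in turn and stop iterating as soon as a reconnect succeeds.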