The Question
Design

Real-Time End-to-End Encrypted Messaging System

Design the backend for a global-scale messaging application similar to WhatsApp. The system must support real-time 1:1 and group text messaging with strict end-to-end encryption (E2EE). Key constraints include supporting 100M daily active users, providing reliable delivery status (sent/delivered/read), and managing user presence (online/offline). Discuss how you would handle asynchronous key exchanges for E2EE, connection management for millions of mobile clients, and the storage strategy for messages that cannot be delivered immediately. Ensure the design addresses network volatility and minimizes latency while maintaining zero visibility into message content for the service provider.
WebSocket
Signal Protocol
Cassandra
Redis
Kafka
Protobuf
X3DH
Double Ratchet
S3
PostgreSQL
Questions & Insights

Clarifying Questions

Scale and Usage: What is the target Daily Active Users (DAU) and the average number of messages sent per user per day?
Encryption Model: Is the End-to-End Encryption (E2EE) required for both 1:1 and group chats, and should the server store messages permanently or just until delivery?
Media Support: Does the MVP include media (images/video) or is it text-only?
Presence: Is real-time "Online/Last Seen" status required for the MVP?
Assumptions:
DAU: 100 Million.
Volume: 5 Billion messages/day (~50 messages/user).
Storage: Messages are stored on the server only until they are delivered (ephemeral storage policy, similar to WhatsApp).
E2EE: Implementation using the Signal Protocol (X3DH and Double Ratchet).
Media: Supported via encrypted binary uploads to Object Storage.

Thinking Process

Core Bottleneck: Maintaining millions of persistent WebSocket/gRPC connections while ensuring message delivery guarantees across different devices.
Key Strategy:
How do we establish a secure, encrypted session between two clients who aren't online at the same time? (Key Transparency & Pre-keys).
How do we route a message from User A to User B with sub-100ms latency? (Presence-aware routing via WebSockets).
How do we handle "offline" users without losing data? (Message Queuing/Buffering).
How do we scale the connection layer to millions of concurrent sockets? (Distributed Gateway with a Consistent Hashing/Registry).

Bonus Points

Signal Protocol Integration: Implementing X3DH (Extended Triple Diffie-Hellman) for asynchronous key agreement and Double Ratchet for perfect forward secrecy.
Binary Protocol Optimization: Using Protocol Buffers (Protobuf) or Thrift instead of JSON to reduce the payload size over cellular networks.
Connection Handover: Implementing session resumption and heartbeat optimization to handle mobile users switching between Wi-Fi and 5G.
Message Sequencing: Using logical clocks or vector clocks to handle message ordering in group chats where distributed clocks might drift.
Design Breakdown

Functional Requirements

Core Use Cases:
1:1 Instant Messaging with E2EE.
User registration and Key Management (Pre-keys).
Delivery receipts (Sent, Delivered, Read).
Presence tracking (Online/Offline status).
Media sharing (Encrypted).
Scope Control:
In-scope: 1:1 chat, small groups (up to 100 users), presence, delivery receipts.
Out-of-scope: Voice/Video calls, status stories, message search (server-side search is impossible with E2EE anyway), archival of all history (history stays on device).

Non-Functional Requirements

Scale: Support 100M DAU and 10M+ concurrent connections.
Latency: End-to-end message delivery < 200ms for online users.
Availability & Reliability: 99.99% availability; zero message loss for sent messages.
Consistency: High consistency for message delivery (messages must arrive in order).
Security & Privacy: Strict E2EE where the server cannot decrypt message payloads.

Estimation

Traffic Estimation:
5 Billion msgs/day / 86,400s \approx 58,000 Avg QPS.
Peak QPS (3x) \approx 175,000 QPS.
Storage Estimation:
Message metadata (500 bytes) + encrypted payload (variable).
If messages are deleted after delivery: 100M users 10 undelivered msgs/user 1KB \approx 1TB of "hot" buffer storage.
Bandwidth Estimation:
175k QPS * 2KB/msg \approx 350 MB/s (Inbound/Outbound).

Blueprint

Concise Summary: A microservices architecture leveraging persistent WebSocket connections for real-time delivery and a distributed NoSQL store for buffering undelivered E2EE messages.
Major Components:
Chat Gateway (WebSocket): Manages persistent connections and bidirectional communication with clients.
Key Management Service: Stores public pre-keys for E2EE session establishment.
Message Service: Routes messages to the correct Gateway or pushes them to the offline buffer.
Presence Service: Tracks user online status using a heartbeat mechanism.
Simplicity Audit: This design avoids complex distributed transactions by treating the Message Service as a stateless router and using an ephemeral message buffer.
Architecture Decision Rationale:
Why this architecture?: WebSockets are essential for low-latency full-duplex communication. A dedicated Key Service is required to enable E2EE when the recipient is offline.
Functional Satisfaction: Meets all core messaging and security needs.
Non-functional Satisfaction: Scalable via horizontal Gateway scaling; reliable via message acknowledgement (ACK) loops.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: Global Load Balancers (L7) perform SSL termination and route to the nearest regional Chat Gateway.
Security & Perimeter: API Gateway handles JWT-based authentication for REST endpoints (Key Service, Media Service) and Rate Limiting to prevent spam.

Service

Topology & Scaling: Chat Gateways are stateless (with respect to message content) but maintain state (socket connections). They scale based on the number of concurrent connections (e.g., 50k connections per node).
API Schema Design:
POST /v1/keys/upload: Upload pre-keys (X3DH).
GET /v1/keys/{userId}: Fetch pre-keys for a contact.
WebSocket Message: {"to": "B", "payload": "encrypted_blob", "msgId": "123"}.
Resilience & Reliability:
Client-Side ACKs: User A sends a message \rightarrow Gateway \rightarrow User B. User B sends an ACK to Gateway \rightarrow Gateway sends ACK to User A.
Retries: Client-side exponential backoff for connection loss.
Observability: Metrics on "Time to Delivery" and "Socket Count."

Storage

Access Pattern: Extremely high write (buffering) and high delete (after delivery).
Database Table Design (Message Buffer):
receiver_id (Partition Key)
message_id (Clustering Key)
encrypted_payload (Blob)
timestamp
Technical Selection:
Cassandra: Ideal for high-write message buffering. Tuned for fast writes and range queries by receiver_id.
PostgreSQL: Used for User Profiles and Key Management where strong consistency is needed for Identity Keys.
Distribution Logic: Partition message buffer by receiver_id to ensure all messages for a user are co-located.

Cache

Purpose: Presence tracking.
Key-Value Schema: user_id: {status: "online", last_seen: timestamp, gateway_id: "gw-101"}.
Technical Selection: Redis. High-speed lookups for routing messages to the correct Gateway node.
Failure Handling: Heartbeat intervals of 30 seconds; if no heartbeat, mark as offline.

Messaging

Purpose: To decouple the message routing from the push notification system.
Event Schema: {"user_id": "B", "type": "NEW_MESSAGE", "preview": "Encrypted"}.
Technical Selection: Kafka. Handles high throughput of events for downstream workers (Push notifications, analytics).

Infrastructure (Optional)

Distributed Coordination: ZooKeeper or HashiCorp Consul to maintain a registry of Chat Gateway nodes, allowing the Message Service to locate which node User B is connected to.
Wrap Up

Advanced Topics

Trade-offs: We choose Availability over Consistency for presence (last-seen can be slightly delayed), but Consistency for key exchange (Identity must be verified).
Security & Privacy:
E2EE: The server only sees metadata (sender, receiver, timestamp). The message body is encrypted with a session key known only to clients.
Forward Secrecy: The Double Ratchet algorithm ensures that if a key is compromised today, past messages remain secure.
Reliability: If the Message Buffer DB is down, the Gateway should hold messages in memory for a short period before rejecting them (Backpressure).
Distinguishing Insights: To handle Group Chats efficiently in E2EE, we use Sender Keys. Instead of User A encrypting a message N times for N users, User A generates a "Chain Key," sends it to every member once, and then sends messages encrypted once with that chain key.