The Question
Design: Team Collaboration & Messaging Platform
Design a team collaboration platform similar to Slack. The system should support workspaces, public and private channels, direct messaging, file sharing, threaded conversations, and third-party app integrations for enterprise organizations at scale.
PostgreSQL
S3
Redis
WebSocket
Consistent Hashing
Questions & Insights
Thinking Process
Designing a real-time messaging system like Slack requires balancing high-throughput message ingestion with low-latency delivery. The core challenge is the "Fan-out" problem (one message sent to thousands of members).
How do we maintain real-time bidirectional communication? We use WebSockets for persistent connections to minimize header overhead and enable server-side pushes.
How do we route messages to the correct user who might be on a different server? We implement a Messaging Hub (Pub/Sub) that acts as the connective tissue between disparate WebSocket Gateway instances.
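How do we ensure message order and consistency? We give every message an ID that orders it within its channel: either an auto-incrementing ID from a centralized database, which is strictly monotonic, or a Snowflake ID, which is time-ordered and needs no central coordinator but is only guaranteed monotonic per generator.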
How do we handle "Large Channel" performance? We differentiate between DMs (1:1) and Channels (1:N) and optimize the fan-out logic to prevent head-of-line blocking during massive broadcasts.
Bonus Points
Causality & Vector Clocks: Discussing logical clocks vs. physical clocks to handle message ordering across distributed clients with intermittent connectivity.
Last-Write-Wins (LWW) vs. CRDTs: Using Conflict-free Replicated Data Types for collaborative features like shared "is typing" indicators or emoji reactions in high-concurrency environments.
Edge Presence: Offloading "Presence" (online/offline status) to edge nodes to reduce the heartbeat traffic hitting the core data centers.
Tiered Storage: Moving message history older than 30 days to a compressed columnar store (e.g., Parquet on S3) to keep the primary operational DB (PostgreSQL) performant.
Design Breakdown
Functional Requirements
Users can join/leave channels (Public/Private) and send Direct Messages (DMs).
Real-time message delivery to online users.
Persistent message history for offline users to sync upon return.
Presence indicators (Online/Away/Offline).
Support for basic file/image attachments.
Non-Functional Requirements
Low Latency: End-to-end delivery in < 200ms.
High Availability: 99.99% uptime; the system must not go down if the message history service lags.
Consistency: All users in a channel must see the same order of messages.
Scalability: Support 10M+ concurrent users and peak loads of 100k messages per second.
Estimation
DAU: 20M.
Messages per user/day: 50.
Total Messages/day: 1 Billion.
Average Message Size: 200 bytes.
Daily Storage: 1B * 200 bytes ≈ 200 GB.
Throughput (Avg): 1B / 86,400s ≈ 12,000 Messages/sec.
Peak Throughput: 3x-5x avg ≈ 36,000-60,000 Messages/sec.
Presence Traffic: Heartbeats every 30s. 20M / 30s ≈ 667k requests/sec if every daily user were online at once (an upper bound, and the highest load in the system).
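The arithmetic above can be reproduced in a few lines. This is only a back-of-envelope sketch: the constant names are illustrative, and the heartbeat figure assumes every daily active user is online at the same time, which makes it an upper bound.

```python
# Back-of-envelope capacity numbers taken from the estimation above.
DAU = 20_000_000
MESSAGES_PER_USER_PER_DAY = 50
AVG_MESSAGE_BYTES = 200
SECONDS_PER_DAY = 86_400
HEARTBEAT_INTERVAL_S = 30

messages_per_day = DAU * MESSAGES_PER_USER_PER_DAY              # 1 billion
daily_storage_gb = messages_per_day * AVG_MESSAGE_BYTES / 1e9   # decimal GB, payload only
avg_msgs_per_sec = messages_per_day / SECONDS_PER_DAY           # about 11.6k
peak_low = 3 * avg_msgs_per_sec                                 # about 35k
peak_high = 5 * avg_msgs_per_sec                                # about 58k
# Upper bound: assumes all 20M daily users hold a connection simultaneously.
heartbeats_per_sec = DAU / HEARTBEAT_INTERVAL_S

print(f"{messages_per_day:,} msgs/day, {daily_storage_gb:.0f} GB/day")
print(f"avg {avg_msgs_per_sec:,.0f}/s, peak {peak_low:,.0f}-{peak_high:,.0f}/s")
print(f"heartbeats {heartbeats_per_sec:,.0f}/s")
```

The figures in the text round the average up to 12,000 before multiplying, which is why the peak there comes out slightly higher than the unrounded values here.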
Blueprint
Concise Summary: A WebSocket-based architecture where a distributed Gateway layer manages persistent connections, backed by a Pub/Sub bus for message routing and an RDBMS for durable message ordering.
Major Components:
WebSocket Gateway: Manages stateful long-lived connections for real-time push/pull.
Messaging Hub (Pub/Sub): Routes messages between Gateway nodes to find the target recipient.
Metadata/API Service: Handles CRUD for workspaces, channels, and user profiles via REST.
Presence Service: High-speed key-value store to track user heartbeat and status.
Simplicity Audit: This architecture avoids complex stream processing (Spark) or specialized graph databases for the MVP, relying on the proven reliability of PostgreSQL for consistency and Redis for speed.
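To make the interplay between the Gateway layer and the Messaging Hub concrete, here is a minimal in-memory sketch. `InMemoryHub` and `Gateway` are hypothetical stand-ins (no real Redis or sockets are involved), and a list of delivered tuples replaces actual WebSocket writes.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple


class InMemoryHub:
    """Stand-in for the Pub/Sub bus: delivers only to current subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str, dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str, dict], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: dict) -> int:
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)  # number of gateway nodes reached


class Gateway:
    """One gateway node: tracks locally connected users per channel."""

    def __init__(self, name: str, hub: InMemoryHub) -> None:
        self.name = name
        self.hub = hub
        self.local_members: Dict[str, Set[str]] = defaultdict(set)
        self.delivered: List[Tuple[str, str]] = []  # (user_id, content) pushes

    def connect(self, user_id: str, channels: List[str]) -> None:
        for channel in channels:
            if not self.local_members[channel]:
                # First local member of this channel: subscribe the node once.
                self.hub.subscribe(channel, self._on_message)
            self.local_members[channel].add(user_id)

    def send(self, channel: str, sender_id: str, content: str) -> None:
        self.hub.publish(channel, {"sender": sender_id, "content": content})

    def _on_message(self, topic: str, message: dict) -> None:
        # Push to every locally connected member (the sender gets an echo too).
        for user_id in sorted(self.local_members[topic]):
            self.delivered.append((user_id, message["content"]))


hub = InMemoryHub()
gw_a, gw_b = Gateway("gw-a", hub), Gateway("gw-b", hub)
gw_a.connect("alice", ["channel_123"])
gw_b.connect("bob", ["channel_123"])
gw_b.connect("carol", ["channel_456"])
gw_a.send("channel_123", "alice", "hello")
print(gw_a.delivered, gw_b.delivered)
```

Each node subscribes to a channel only while it has at least one local member, so a publish reaches only the nodes that actually need the message. Unsubscribing on disconnect is omitted for brevity.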
High Level Architecture
Sub-system Deep Dive
Service
WebSocket Gateway:
Stateless at the logic level but stateful at the connection level.
Each node maintains a local mapping of user_id -> socket_connection.
Horizontal scaling is achieved by consistent hashing or a simple round-robin LB.
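If the consistent hashing option is chosen, assigning users to Gateway nodes might look like the sketch below. The `HashRing` class, the virtual-node count, and the node names are illustrative assumptions, not part of the original design.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


class HashRing:
    """Consistent hash ring with virtual nodes for smoother balance."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[int] = []       # sorted virtual-node hashes
        self._owner: Dict[int, str] = {}   # virtual-node hash -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as a 64-bit position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        if not self._points:
            raise LookupError("hash ring is empty")
        # First virtual node clockwise from the key, wrapping around the ring.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["gw-1", "gw-2", "gw-3"])
print(ring.node_for("user_42"))
```

The property that matters for a connection layer is that removing a node only remaps the users that were on that node; everyone else keeps their assignment, which limits the reconnect storm after a failure.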
API Service:
Standard RESTful service for non-real-time actions (creating a channel, updating a profile).
Uses JWT for authentication to remain stateless.
Storage
Data Model:
Messages table: id (BigInt, PK), channel_id (Indexed), sender_id, content, timestamp.
Channel_Members table: channel_id, user_id.
Database Logic:
Sharding: Sharded by workspace_id. This ensures that all data for a single company/organization resides on the same shard, making "Load History" queries highly efficient.
Ordering: Use Snowflake IDs (64-bit) for message IDs to ensure time-based ordering across distributed systems without a central lock.
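A Snowflake-style generator can be sketched as follows. It assumes the commonly described layout of a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit per-millisecond sequence; the custom epoch, the class name, and the injectable clock are illustrative choices.

```python
import threading
import time
from typing import Callable, Optional


class SnowflakeGenerator:
    EPOCH_MS = 1_577_836_800_000  # 2020-01-01T00:00:00Z, an arbitrary custom epoch
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_WORKER = (1 << WORKER_BITS) - 1
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id: int, clock_ms: Optional[Callable[[], int]] = None) -> None:
        if not 0 <= worker_id <= self.MAX_WORKER:
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards; refusing to issue IDs")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self._worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeGenerator(worker_id=1)
print(generator.next_id())
```

IDs from a single generator are strictly increasing. Across generators, IDs issued in the same millisecond sort by worker ID rather than by arrival order, so the ordering is time-based but not strict unless each channel's messages are stamped by one generator.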
Cache
Presence Engine:
Uses Redis SETEX with a TTL of 60 seconds. Clients send heartbeats every 30 seconds. If the key exists, the user is "Online".
Session Cache:
Stores frequently accessed channel metadata and user permissions to reduce RDBMS hits.
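The Presence Engine's TTL behavior can be illustrated without a Redis server. The `PresenceStore` below is a hypothetical in-memory stand-in that mimics SETEX-style expiry; the key prefix and the injectable clock are assumptions made so the sketch is self-contained and testable.

```python
import time
from typing import Callable, Dict

HEARTBEAT_INTERVAL_S = 30  # how often clients are expected to ping
PRESENCE_TTL_S = 60        # two heartbeat intervals, tolerating one missed ping


class PresenceStore:
    """In-memory imitation of 'SETEX presence:<user_id> 60 1' plus an existence check."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        # Each heartbeat rewrites the key and resets its TTL.
        self._expires_at[f"presence:{user_id}"] = self._clock() + PRESENCE_TTL_S

    def is_online(self, user_id: str) -> bool:
        key = f"presence:{user_id}"
        expires = self._expires_at.get(key)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expires_at[key]  # lazy expiry on read
            return False
        return True


store = PresenceStore()
store.heartbeat("alice")
print(store.is_online("alice"), store.is_online("bob"))
```

Setting the TTL to twice the heartbeat interval means a single dropped heartbeat does not flip a user to offline, at the cost of up to 60 seconds of staleness after a real disconnect.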
Messaging
Messaging Hub:
Implemented using Redis Pub/Sub for the MVP.
Topic Structure: Each channel is a topic (channel_123).
Flow: When a user sends a message to channel_123, the Gateway publishes to that Redis topic. All Gateway nodes subscribed to channel_123 receive the message and push it to their locally connected users who are members of that channel.
Wrap Up
Advanced Topics
Trade-offs:
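Redis Pub/Sub: Chosen for speed and low complexity, but it is "fire-and-forget." If a Gateway node crashes or briefly loses its Redis connection, messages in transit are lost. We mitigate this by having the client acknowledge receipt and re-fetch from the database when it detects a gap in message IDs. Gap detection assumes IDs are contiguous within a channel; with non-contiguous IDs such as Snowflake IDs, the client instead requests everything after the last ID it received.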
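The client-side recovery described above might be sketched like this. It assumes each message carries a contiguous per-channel sequence number `seq` and that messages are persisted before they are published; `fetch_after` is a hypothetical history API, not something defined in the original design.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    seq: int      # assumed contiguous per-channel sequence number
    content: str


class ChannelClient:
    """Tracks one channel's timeline and repairs gaps from durable history."""

    def __init__(self, fetch_after: Callable[[int], List[Message]]) -> None:
        # fetch_after(seq) returns persisted messages with seq greater than the
        # argument, in ascending order (hypothetical history endpoint).
        self._fetch_after = fetch_after
        self.last_seq = 0
        self.timeline: List[Message] = []

    def on_push(self, message: Message) -> None:
        if message.seq > self.last_seq + 1:
            self.catch_up()  # at least one push was lost in transit
        self._append(message)

    def catch_up(self) -> None:
        # Also usable after reconnecting to a different Gateway node.
        for missed in self._fetch_after(self.last_seq):
            self._append(missed)

    def _append(self, message: Message) -> None:
        if message.seq <= self.last_seq:
            return  # duplicate or already recovered
        self.timeline.append(message)
        self.last_seq = message.seq


history = [Message(1, "a"), Message(2, "b"), Message(3, "c"), Message(4, "d")]
client = ChannelClient(lambda seq: [m for m in history if m.seq > seq])
client.on_push(history[0])
client.on_push(history[3])  # pushes for seq 2 and 3 never arrived
print([m.seq for m in client.timeline])
```

Because duplicates are dropped in `_append`, the same `catch_up` call is safe to run on every reconnect, whether or not anything was actually missed.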
Bottlenecks:
The "General" Channel: In a 100k-person workspace, one message triggers 100k pushes.
Optimization: Implement "Lazy Loading" for presence in large channels (don't show "typing" or "online" indicators in channels with more than 1,000 members).
Failure Handling:
Gateway Failover: If a node dies, clients automatically reconnect to another node via the LB and "catch up" on missed messages using the last_message_id they received.
Alternatives:
NATS or Pulsar: Could replace Redis Pub/Sub for better persistence guarantees in the messaging layer, but at the cost of higher operational overhead.
Cassandra/ScyllaDB: Could replace Postgres for message storage if the write volume exceeds RDBMS capacity, but joining tables (e.g., getting channel members) becomes significantly more complex.