The Question

Delayed Virtual Currency Escrow System

Design a high-scale virtual currency escrow service (e.g., for Robux) that allows users to schedule payments to be released after a designated future delay (ranging from seconds to days). The system must deduct funds immediately from the sender, support cancellations prior to maturity, and ensure at-least-once execution within a tight SLA. Handle 5,000 QPS and discuss how you ensure ledger consistency, handle short-term vs. long-term delays, and manage the race conditions between execution and cancellation.

PostgreSQL

Redis

ACID

Distributed Transactions

TCC

Idempotency

ZSet

Rate Limiting

Questions & Insights

Clarifying Questions

Precision of Execution: Does "within a few minutes" apply strictly to all delays, including the 10-second one?

Transaction Atomicitiy: Is the sender's balance stored in our system or an external service? (Assumption: Our system manages the Robux ledger).

Scale of History: How long should we retain "Completed" or "Cancelled" hold records for auditing? (Assumption: 7 years for regulatory compliance, but only 48-72 hours of "active" data).

Failure Modes: If the Payment API is down for 4 hours, should we backlog the releases or fail them? (Assumption: Backlog and retry with exponential backoff).

Assumptions:

QPS: 5,000 writes/sec (scheduling/cancelling).

Storage: Robux ledger is highly consistent (ACID).

Delay Range: 1 second to 90 days.

Concurrency: Multiple payments for the same user must be processed in order if they have the same maturity time, or we rely on idempotency keys to prevent race conditions.

Thinking Process

Core Bottleneck: Efficiently triggering millions of events at arbitrary future timestamps without constant expensive DB scanning.

Progressive Approach:

How do we ensure "Atomic Hold"? Use a Relational DB transaction to deduct balance and create a "Hold" record in one go.

How do we trigger "Just-in-Time"? Use a tiered scheduling approach: Long-term storage in DB + Short-term "Hot Path" in a Redis Sorted Set.

How do we handle "Cancellations"? Treat the message queue as a "trigger" only; the consumer must re-verify the "Hold Status" in the DB before executing the payment.

How do we handle "Rollbacks"? Implement a "State Machine" where non-retryable recipient errors trigger a ledger credit back to the sender.

Bonus Points

TCC Pattern (Try-Confirm-Cancel): Implementing the escrow as a formal TCC distributed transaction to ensure ledger integrity across sharded databases.

Determinism via Idempotency: Using a composite key of hold_id + status to ensure that the payment API is never called twice for the same phase of a hold.

Hybrid Sharding: Sharding the Ledger DB by user_id for horizontal scaling while using a Time-series partition for the "Maturity Index" to optimize the sweeper service.

Double-Entry Bookkeeping: Ensuring every Robux movement (Debit Sender -> Credit Escrow -> Credit Recipient) is recorded as two immutable ledger entries for auditability.

Design Breakdown

Functional Requirements

Core Use Cases:

Create Hold: Deduct Robux from Sender, create a record with a unlock_at timestamp.

Cancel Hold: Mark record as CANCELLED, refund Sender immediately.

Execute Payment: Transfer Robux from Escrow to Recipient at unlock_at.

Scope Control:

In-Scope: Hold lifecycle management, scheduling, ledger consistency, and basic idempotency.

Out-of-Scope: User authentication, UI/UX, and actual "Stripe" integration logic (treated as a black-box API).

Non-Functional Requirements

Scale: Support 5,000 QPS (write-heavy).

Latency: Hold creation < 100ms; Execution within 60s of target.

Availability & Reliability: 99.99% availability; No "lost" money (At-least-once execution).

Consistency: Strong consistency for balance deductions (no double-spending).

Fault Tolerance: Automatic retries for external API failures; Dead Letter Queues (DLQ) for manual intervention on "Banned Recipient" cases.

Estimation

Traffic: 5k QPS Write. Total daily writes = 5,000 * 86,400 ≈ 432 Million transactions/day.

Storage: 1 KB per Hold record. 432M * 1KB ≈ 432 GB/day. With 2-day retention for "Active" holds, we need ~1 TB of high-performance storage.

Bandwidth: 5,000 * 1 KB = 5 MB/s. Minimal network pressure.

Blueprint

Concise Summary: A ledger-backed escrow service that uses a "Sweeper" to move maturing holds from a sharded PostgreSQL database into a Redis-based "Hot Path" for sub-second triggering.

Major Components:

API Gateway: Entry point for scheduling and cancelling holds.

Hold Service: Orchestrates the ACID transaction (Ledger update + Hold record creation).

Ledger DB (PostgreSQL): Sharded source of truth for balances and hold states.

Time-Wheel Sweeper: A background process that polls for holds maturing in the next N minutes.

Redis (Hot Path): A Sorted Set storing hold_id scored by unlock_at for high-precision triggering.

Payment Worker: Consumes ready tasks, calls external APIs, and updates final status.

Simplicity Audit: This design avoids complex distributed timer architectures (like Netflix Conductor) in favor of a standard SQL + Redis pattern, which is sufficient for 5k QPS.

Architecture Decision Rationale:

RDBMS: Required for ACID guarantees on currency.

Sweeper + Redis: Solves the "arbitrary delay" problem. SQL is bad at "find items where time = NOW", but great at "find items where time < NOW + 5mins". Redis ZSet is perfect for "give me everything due now".

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: Not majorly impactful for a backend payment system.

Security & Perimeter:

API Gateway: Performs JWT validation and Rate Limiting (5k QPS).

Idempotency Header: Clients must pass X-Idempotency-Key to prevent duplicate holds during retry storms.

Service

Topology & Scaling: Stateless microservices deployed across multiple Availability Zones. Scaling is based on CPU and Request Count.

API Schema Design:

POST /v1/holds:

Request: { sender_id, recipient_id, amount, unlock_at, idempotency_key }

Returns: { hold_id, status: PENDING }

DELETE /v1/holds/{hold_id}:

Logic: Check DB -> If status=PENDING -> Set status=CANCELLED -> Credit Sender balance.

Resilience:

Exponential Backoff: Used when calling the External Payment API.

Circuit Breaker: Trips if the External API error rate exceeds 10% to prevent ledger exhaustion.

Storage

Access Pattern:

High-volume inserts (Hold creation).

Point lookups by hold_id (Cancellations).

Range scans by unlock_at (Sweeper).

Database Table Design:

Table: Holds

hold_id (UUID, Primary Key)

sender_id (UUID, Shard Key)

recipient_id (UUID)

amount (Decimal)

status (Enum: PENDING, EXECUTED, CANCELLED, FAILED)

unlock_at (Timestamp, Indexed)

idempotency_key (String, Unique Index)

Technical Selection: PostgreSQL with logical sharding by sender_id.

Distribution Logic: Sharding by sender_id ensures that a "Hold" and the "Sender's Balance" live on the same physical shard, allowing for local ACID transactions.

Cache

Purpose & Justification: Redis Sorted Set acts as the high-precision trigger.

Key-Value Schema:

Key: pending_triggers

Member: hold_id

Score: unix_timestamp(unlock_at)

Failure Handling: If Redis fails, the Sweeper can rebuild the ZSet from the PostgreSQL DB by scanning for status=PENDING AND unlock_at < NOW + Buffer.

Data Processing

Processing Model: The Time-Wheel Sweeper runs every 1 minute. It queries: SELECT hold_id FROM holds WHERE status='PENDING' AND unlock_at < NOW + 5 minutes.

Deduplication: The Sweeper uses a SET NX lock in Redis for the specific time-window to ensure only one instance processes a batch.

Infrastructure (Optional)

Observability:

Metrics: Monitor "Hold-to-Execution Latency" (Difference between unlock_at and actual execution).

Alerting: Alert if the number of PENDING holds with unlock_at < NOW - 10 minutes exceeds a threshold.

Wrap Up

Advanced Topics

Trade-offs: We trade some "Real-time" precision for "Guaranteed Consistency." The Sweeper/Redis combo ensures we don't miss payments, even if the delay is 10 seconds.

Reliability: If the recipient is banned (non-retryable), the system enters a FAILED state. A background "Refund Worker" periodically finds FAILED holds and returns funds to the sender, ensuring Robux are never lost in limbo.

Bottleneck Analysis: The primary bottleneck is the "Hot Shard" if a single celebrity creator receives 1,000s of payments. Since we shard by sender_id, we distribute the write load effectively across all users.

Ordering: If a user schedules Payment A and Payment B for the same second, the Redis ZSet doesn't guarantee order. However, since the sender's balance is already deducted, these payments are independent. If dependency is required, we use the "Option B" strategy: fail the dependent and require manual retry.