The Question
Design

Scalable RESTful Resource Management Service

Design a highly available and scalable RESTful API service capable of managing 100 million resources (e.g., a product catalog). The system must handle 50,000 read QPS with sub-100ms latency and 500 write QPS. Discuss your choice of database, caching strategy, API versioning, and how you would handle distributed concurrency and idempotency to ensure a robust developer experience.
PostgreSQL
Redis
REST
OAuth2
JWT
API Gateway
Load Balancer
ULID
Questions & Insights

Clarifying Questions

What is the scale of the system? (DAU, total number of resources/items, and expected QPS).
What is the Read/Write ratio? (e.g., Is it a read-heavy system like a catalog or write-heavy like a logging service?)
What are the consistency requirements? (Does the user need strong consistency immediately after a write, or is eventual consistency acceptable?)
What is the geographic distribution of the users? (Global audience vs. single region).
Assumptions for this design:
Scale: 100 million total resources (e.g., Products).
Traffic: Read-heavy (100:1 ratio). Peak Read QPS: 50,000. Peak Write QPS: 500.
Latency: < 100ms for p99 reads.
Consistency: Strong consistency for writes, Eventual consistency for reads is acceptable via cache.

Thinking Process

To design a production-grade REST API, we move from a simple server to a distributed architecture:
Core Interface: Define the REST resources and standard status codes to ensure predictable client-server interaction.
Scaling the Read Path: Introduce an in-memory cache to handle the 50k QPS and prevent database saturation.
Ensuring Data Integrity: Use a relational database for ACID compliance on updates while implementing horizontal sharding to handle data growth.
Availability & Security: Wrap the service with a Load Balancer and API Gateway for rate limiting and authentication.

Bonus Points

Idempotency Keys: Implementing X-Idempotent-ID headers for POST/PUT requests to prevent duplicate resource creation in distributed retry scenarios.
Optimistic Concurrency Control (OCC): Using ETag/If-Match headers to handle "Lost Update" problems without heavy database locking.
Pagination Strategy: Using Cursor-based pagination (keyset pagination) instead of Offset/Limit to ensure stable performance and consistent results at scale.
API Versioning: Implementing header-based versioning (e.g., Accept: application/vnd.api.v2+json) to maintain backward compatibility without URL clutter.
Design Breakdown

Functional Requirements

Core Use Cases:
Create, Read, Update, and Delete (CRUD) resources.
List resources with filtering and pagination.
Search for resources by specific attributes.
Scope Control:
In-scope: End-to-end API lifecycle, storage strategy, and caching.
Out-of-scope: Front-end implementation, complex full-text search engine (Elasticsearch), and analytics processing.

Non-Functional Requirements

Scale: Must support 100M+ records and 50k+ QPS.
Latency: Sub-100ms response time for read operations.
Availability & Reliability: 99.99% (four nines) availability; no single point of failure.
Consistency: Strong consistency for writes; read-after-write consistency for the owning user where possible.
Security & Privacy: TLS encryption, OAuth2/JWT authentication, and strict Rate Limiting.

Estimation

Traffic Estimation:
Read: 50,000 QPS.
Write: 500 QPS.
Storage Estimation:
100M items * 2KB per item ≈ 200 GB. (Easily fits in modern DBs, but requires sharding for IOPS).
Bandwidth Estimation:
Incoming: 500 writes * 2KB = 1 MB/s.
Outgoing: 50,000 reads * 2KB = 100 MB/s (800 Mbps).

Blueprint

Concise Summary: A stateless microservice architecture backed by a partitioned Relational Database and a distributed Cache Layer, exposed via a managed API Gateway.
Major Components:
API Gateway: Handles authentication, rate limiting, and request routing.
Application Service: Stateless compute nodes executing business logic and RESTful controllers.
Redis Cache: In-memory store to absorb the high read volume (99% of traffic).
PostgreSQL Database: Persistent, partitioned storage ensuring data durability and ACID transactions.
Simplicity Audit: This design avoids complex message brokers and asynchronous workers for the MVP, relying on the DB's performance for the relatively low 500 QPS write volume.
Architecture Decision Rationale:
Why this architecture?: Stateless services allow effortless horizontal scaling. Separation of Read (Cache) and Write (DB) paths prevents contention.
Functional Satisfaction: Standard REST patterns are followed for CRUD; Indexing in SQL handles basic listing.
Non-functional Satisfaction: High availability is achieved through redundancy at every layer (LB, App Nodes, DB Replicas).

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: CDN used for static assets (if any), but dynamic REST responses are routed via DNS to the nearest API Gateway.
Security & Perimeter:
API Gateway: Performs JWT validation and extracts user_id.
Rate Limiting: Applied at the Gateway level (e.g., 100 requests/minute per API key) using a Leaky Bucket algorithm.
SSL/TLS: Terminated at the Load Balancer to reduce overhead on App Services.

Service

Topology & Scaling: Stateless Docker containers deployed across multiple Availability Zones (AZs). Scaling is triggered by CPU (>60%) or Request Count.
API Schema Design:
GET /v1/items/{id}: Returns JSON object. p99 < 50ms.
POST /v1/items: Accepts JSON body. Requires X-Idempotency-Key.
PUT /v1/items/{id}: Full update. Uses If-Match (ETag) for concurrency.
GET /v1/items?limit=20&cursor=abc: Keyset pagination.
Resilience & Reliability:
Retries: Exponential backoff with jitter for DB connections.
Circuit Breaker: Trips if Redis is down, falling back directly to DB Read Replicas with reduced rate limits.
Security: Service-to-service communication via mTLS or internal VPC security groups.

Storage

Access Pattern: 99% reads by ID or indexed attributes. Low frequency updates.
Database Table Design:
id (UUID/ULID), owner_id, data (JSONB for flexibility), version (INT for OCC), created_at, updated_at.
Primary Key: id. Index on owner_id and created_at.
Technical Selection: PostgreSQL.
Rationale: Robust indexing, JSONB support for semi-structured data, and strong community support for sharding tools (e.g., Citus) if growth exceeds 1TB.
Distribution Logic: Horizontal sharding by id (hash-based) to distribute IOPS evenly across multiple primary nodes.
Reliability & Recovery: Daily snapshots to S3; Write-Ahead Logs (WAL) streamed to replicas for <1 min RPO.

Cache

Purpose & Justification: Mitigates read pressure (50k QPS) which would otherwise require a massive DB cluster.
Key-Value Schema:
Key: item:{id}
Value: Serialized JSON object.
TTL: 1 hour with "sliding window" or explicit invalidation on update (Cache-Aside).
Technical Selection: Redis.
Rationale: Support for eviction policies (LRU) and sub-millisecond response times.
Failure Handling: If a cache miss occurs (or Redis is down), the app fetches from DB Read Replicas and asynchronously attempts to re-populate the cache.
Wrap Up

Advanced Topics

Trade-offs: We chose Availability over Consistency (AP) for the read path by using a cache. A user might see data that is a few seconds old if a cache invalidation fails.
Reliability: We use Read Replicas for the DB. If the Primary fails, a replica is promoted. If the Cache fails, the system stays up but runs slower (graceful degradation).
Bottleneck Analysis: The primary DB is the ultimate bottleneck. We mitigate this by sharding and ensuring 99% of reads never hit the DB.
Distinguishing Insights:
ULIDs: Use Universally Unique Lexicographically Sortable Identifiers instead of standard UUIDs for better DB index performance (sequential insertion) while maintaining uniqueness.
Compression: Using Brotli/Gzip compression on the API responses to reduce bandwidth costs given the high read volume.