The Question

Scalable RESTful Resource Management Service

Design a highly available and scalable RESTful API service capable of managing 100 million resources (e.g., a product catalog). The system must handle 50,000 read QPS with sub-100ms latency and 500 write QPS. Discuss your choice of database, caching strategy, API versioning, and how you would handle distributed concurrency and idempotency to ensure a robust developer experience.

PostgreSQL

Redis

REST

OAuth2

JWT

API Gateway

Load Balancer

ULID

Questions & Insights

Clarifying Questions

What is the scale of the system? (DAU, total number of resources/items, and expected QPS).

What is the Read/Write ratio? (e.g., Is it a read-heavy system like a catalog or write-heavy like a logging service?)

What are the consistency requirements? (Does the user need strong consistency immediately after a write, or is eventual consistency acceptable?)

What is the geographic distribution of the users? (Global audience vs. single region).

Assumptions for this design:

Scale: 100 million total resources (e.g., Products).

Traffic: Read-heavy (100:1 ratio). Peak Read QPS: 50,000. Peak Write QPS: 500.

Latency: < 100ms for p99 reads.

Consistency: Strong consistency for writes, Eventual consistency for reads is acceptable via cache.

Thinking Process

To design a production-grade REST API, we move from a simple server to a distributed architecture:

Core Interface: Define the REST resources and standard status codes to ensure predictable client-server interaction.

Scaling the Read Path: Introduce an in-memory cache to handle the 50k QPS and prevent database saturation.

Ensuring Data Integrity: Use a relational database for ACID compliance on updates while implementing horizontal sharding to handle data growth.

Availability & Security: Wrap the service with a Load Balancer and API Gateway for rate limiting and authentication.

Bonus Points

Idempotency Keys: Implementing X-Idempotent-ID headers for POST/PUT requests to prevent duplicate resource creation in distributed retry scenarios.

Optimistic Concurrency Control (OCC): Using ETag/If-Match headers to handle "Lost Update" problems without heavy database locking.

Pagination Strategy: Using Cursor-based pagination (keyset pagination) instead of Offset/Limit to ensure stable performance and consistent results at scale.

API Versioning: Implementing header-based versioning (e.g., Accept: application/vnd.api.v2+json) to maintain backward compatibility without URL clutter.

Design Breakdown

Functional Requirements

Core Use Cases:

Create, Read, Update, and Delete (CRUD) resources.

List resources with filtering and pagination.

Search for resources by specific attributes.

Scope Control:

In-scope: End-to-end API lifecycle, storage strategy, and caching.

Out-of-scope: Front-end implementation, complex full-text search engine (Elasticsearch), and analytics processing.

Non-Functional Requirements

Scale: Must support 100M+ records and 50k+ QPS.

Latency: Sub-100ms response time for read operations.

Availability & Reliability: 99.99% (four nines) availability; no single point of failure.

Consistency: Strong consistency for writes; read-after-write consistency for the owning user where possible.

Security & Privacy: TLS encryption, OAuth2/JWT authentication, and strict Rate Limiting.

Estimation

Traffic Estimation:

Read: 50,000 QPS.

Write: 500 QPS.

Storage Estimation:

100M items * 2KB per item ≈ 200 GB. (Easily fits in modern DBs, but requires sharding for IOPS).

Bandwidth Estimation:

Incoming: 500 writes * 2KB = 1 MB/s.

Outgoing: 50,000 reads * 2KB = 100 MB/s (800 Mbps).

Blueprint

Concise Summary: A stateless microservice architecture backed by a partitioned Relational Database and a distributed Cache Layer, exposed via a managed API Gateway.

Major Components:

API Gateway: Handles authentication, rate limiting, and request routing.

Application Service: Stateless compute nodes executing business logic and RESTful controllers.

Redis Cache: In-memory store to absorb the high read volume (99% of traffic).

PostgreSQL Database: Persistent, partitioned storage ensuring data durability and ACID transactions.

Simplicity Audit: This design avoids complex message brokers and asynchronous workers for the MVP, relying on the DB's performance for the relatively low 500 QPS write volume.

Architecture Decision Rationale:

Why this architecture?: Stateless services allow effortless horizontal scaling. Separation of Read (Cache) and Write (DB) paths prevents contention.

Functional Satisfaction: Standard REST patterns are followed for CRUD; Indexing in SQL handles basic listing.

Non-functional Satisfaction: High availability is achieved through redundancy at every layer (LB, App Nodes, DB Replicas).

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: CDN used for static assets (if any), but dynamic REST responses are routed via DNS to the nearest API Gateway.

Security & Perimeter:

API Gateway: Performs JWT validation and extracts user_id.

Rate Limiting: Applied at the Gateway level (e.g., 100 requests/minute per API key) using a Leaky Bucket algorithm.

SSL/TLS: Terminated at the Load Balancer to reduce overhead on App Services.

Service

Topology & Scaling: Stateless Docker containers deployed across multiple Availability Zones (AZs). Scaling is triggered by CPU (>60%) or Request Count.

API Schema Design:

GET /v1/items/{id}: Returns JSON object. p99 < 50ms.

POST /v1/items: Accepts JSON body. Requires X-Idempotency-Key.

PUT /v1/items/{id}: Full update. Uses If-Match (ETag) for concurrency.

GET /v1/items?limit=20&cursor=abc: Keyset pagination.

Resilience & Reliability:

Retries: Exponential backoff with jitter for DB connections.

Circuit Breaker: Trips if Redis is down, falling back directly to DB Read Replicas with reduced rate limits.

Security: Service-to-service communication via mTLS or internal VPC security groups.

Storage

Access Pattern: 99% reads by ID or indexed attributes. Low frequency updates.

Database Table Design:

id (UUID/ULID), owner_id, data (JSONB for flexibility), version (INT for OCC), created_at, updated_at.

Primary Key: id. Index on owner_id and created_at.

Technical Selection: PostgreSQL.

Rationale: Robust indexing, JSONB support for semi-structured data, and strong community support for sharding tools (e.g., Citus) if growth exceeds 1TB.

Distribution Logic: Horizontal sharding by id (hash-based) to distribute IOPS evenly across multiple primary nodes.

Reliability & Recovery: Daily snapshots to S3; Write-Ahead Logs (WAL) streamed to replicas for <1 min RPO.

Cache

Purpose & Justification: Mitigates read pressure (50k QPS) which would otherwise require a massive DB cluster.

Key-Value Schema:

Key: item:{id}

Value: Serialized JSON object.

TTL: 1 hour with "sliding window" or explicit invalidation on update (Cache-Aside).

Technical Selection: Redis.

Rationale: Support for eviction policies (LRU) and sub-millisecond response times.

Failure Handling: If a cache miss occurs (or Redis is down), the app fetches from DB Read Replicas and asynchronously attempts to re-populate the cache.

Wrap Up

Advanced Topics

Trade-offs: We chose Availability over Consistency (AP) for the read path by using a cache. A user might see data that is a few seconds old if a cache invalidation fails.

Reliability: We use Read Replicas for the DB. If the Primary fails, a replica is promoted. If the Cache fails, the system stays up but runs slower (graceful degradation).

Bottleneck Analysis: The primary DB is the ultimate bottleneck. We mitigate this by sharding and ensuring 99% of reads never hit the DB.

Distinguishing Insights:

ULIDs: Use Universally Unique Lexicographically Sortable Identifiers instead of standard UUIDs for better DB index performance (sequential insertion) while maintaining uniqueness.

Compression: Using Brotli/Gzip compression on the API responses to reduce bandwidth costs given the high read volume.