The Question
DesignDesign a Proximity-Based Discovery System (Yelp)
Design a highly available and scalable system like Yelp that allows users to search for local businesses based on geographical location, category, and keyword. The system must support millions of businesses and high-frequency read traffic. Key requirements include efficient spatial indexing for low-latency search (<200ms), a mechanism for posting and aggregating reviews, and handling high-volume photo uploads. Address the trade-offs between search accuracy and system performance, and explain how you would handle varying business densities across different geographic regions.
PostgreSQL
PostGIS
Redis
Kafka
S3
CDN
Geohash
Cassandra
Kubernetes
Elasticsearch
Questions & Insights
Clarifying Questions
Q1: What is the scale of the system in terms of businesses and active users?
Assumption: 100M businesses, 50M Daily Active Users (DAU), and 200M monthly searches.
Q2: What are the primary search criteria?
Assumption: Search by location (lat/lng) and radius, category (e.g., "Sushi"), and business name.
Q3: How fresh must the search results be when a new review is posted?
Assumption: Eventual consistency is acceptable; a 1-2 minute delay for review count and average rating updates in search results is fine.
Q4: Are photos and videos included in the MVP?
Assumption: Photos are in-scope (high impact), videos are out-of-scope for MVP.
Thinking Process
Spatial Indexing Strategy: The core bottleneck is performing efficient 2D range queries. I will use Geohash (or PostGIS) to convert 2D coordinates into 1D strings to leverage standard B-Tree or inverted indices.
Read-Heavy Optimization: Yelp is roughly 1000:1 read-to-write. I will prioritize a read-aside cache for business metadata and pre-computed search results for popular areas (e.g., "Best Pizza in NYC").
Async Aggregations: Review scores shouldn't be calculated on-the-fly. I will use an asynchronous pipeline to update a business's aggregate rating to keep the write path for reviews highly responsive.
Bonus Points
Quadrant-based Sharding: Discussing how to shard spatial data using Quadtrees to handle varying density (e.g., Manhattan vs. Montana) so no single shard becomes a hotspot.
Geo-fencing & Cache Warming: Using user movement patterns to pre-fetch business data for the user’s current and neighboring Geohashes into a local or edge cache.
Advanced Ranking: Moving beyond simple distance/rating to a "Learning to Rank" (LTR) model incorporating user preferences and historical click-through rates.
Design Breakdown
Functional Requirements
Core Use Cases:
Users can search for businesses by location and category/keyword.
Users can view detailed business profiles, including reviews and photos.
Users can post text reviews and upload photos for a business.
Scope Control:
In-Scope: Search, Business Metadata, Review System, Photo Storage.
Out-of-Scope: User social graph (friends), reservations, real-time messaging, and video content.
Non-Functional Requirements
Scale: Support 100M businesses and 10k+ search QPS.
Latency: Search results should return in < 200ms (P99).
Availability & Reliability: 99.99% availability; search must work even if the review system is lagging.
Consistency: Eventual consistency for reviews; strong consistency for business metadata updates (e.g., hours of operation).
Security: Photos must be served via signed URLs to prevent hotlinking/unauthorized access.
Estimation
Traffic Estimation:
50M DAU / 86400s \approx 600 average users/sec.
Peak Search QPS: 10,000.
Peak Review QPS: 100.
Storage Estimation:
100M businesses \times 2KB/metadata \approx 200GB.
1B reviews \times 1KB/review \approx 1TB.
Photos: 1B photos \times 500KB \approx 500TB (requires Object Storage).
Bandwidth Estimation:
Incoming (Reviews + Photos): 100 \text{ reviews/sec} \times 500\text{KB} \approx 50\text{MB/s}.
Outgoing (Search + Details): 10,000 \text{ QPS} \times 10\text{KB} \approx 100\text{MB/s} (excluding photo delivery via CDN).
Blueprint
Concise Summary: A microservices-based architecture leveraging a spatial database for location-based discovery and an asynchronous pipeline for updating business reputations.
Major Components:
API Gateway: Handles authentication, rate limiting, and request routing.
Search Service: Uses a spatial index (Elasticsearch/PostGIS) to perform proximity searches.
Business Service: Manages CRUD operations for business profiles and metadata.
Review Service: Manages user reviews and triggers background updates for business ratings.
Storage Layer: Relational DB for structured data and Object Storage for photos.
Simplicity Audit: This design uses PostGIS for spatial indexing instead of a custom Quadtree implementation to reduce operational complexity for the MVP.
Architecture Decision Rationale:
Why this architecture?: Decoupling Search from Review allows the search cluster to scale independently based on high read volume.
Functional Satisfaction: Meets search, view, and write requirements using dedicated, optimized paths.
Non-functional Satisfaction: High availability through service redundancy and low latency via spatial indexing and caching.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery & Traffic Routing:
CDN: Cloudflare or CloudFront to cache business photos and static UI assets.
DNS: Latency-based routing to direct users to the nearest regional deployment.
Security & Perimeter:
API Gateway: Implements JWT-based AuthN, Rate Limiting (100 searches/min per user), and SSL termination.
WAF: Protection against common SQLi or DDoS attacks on the search endpoint.
Service
Topology & Scaling:
Stateless Services: All services (Search, Business, Review) are deployed on Kubernetes, scaling based on CPU and Request Count.
Isolation: Search Service is isolated from the Review Service to ensure a spike in review writes doesn't impact search latency.
API Schema Design:
GET /v1/search?term=sushi&lat=40.7&lng=-74.0&radius=5km: Returns list of business summaries.GET /v1/business/{id}: Returns full metadata, hours, and top reviews.POST /v1/business/{id}/review: Submit text and photo IDs. (Returns 202 Accepted).Resilience & Reliability:
Circuit Breakers: Implemented on the Search Service when calling the Business DB for hydration.
Retries: Exponential backoff for Review Service writes to the Message Queue.
Storage
Access Pattern:
Search: High-frequency spatial range queries.
Business: Key-value lookups by BusinessID.
Reviews: High-volume writes, time-ordered reads by BusinessID.
Database Table Design:
Business Table:
id (UUID), name, location (GEOGRAPHY), geohash (Varchar), avg_rating, review_count.Review Table:
id, business_id, user_id, rating, comment, created_at.Technical Selection:
PostgreSQL + PostGIS: Chosen for the Business DB to handle spatial indexing with
GIST indices.Cassandra: Chosen for the Review DB to handle high write throughput and scalability by
business_id.S3: For storing high-resolution business and review photos.
Distribution Logic:
Sharding Business DB by
Geohash prefix (e.g., 4-character prefix) to keep nearby businesses on the same shard.Cache
Purpose & Justification: Reduces database load for the most frequently viewed businesses and common search terms.
Key-Value Schema:
bus_md:{id}: JSON blob of business details. TTL: 1 hour.geo_search:{geohash}:{category}: List of Business IDs. TTL: 5 minutes.Technical Selection: Redis Cluster for its support of geospatial data structures (GEOADD/GEORADIUS) if we want to offload simple proximity logic from the DB.
Failure Handling: Cache-aside pattern; if Redis is down, services fall back to the primary DB.
Messaging
Purpose & Decoupling: Decouples the review submission from the expensive aggregation logic (updating the business's average score).
Event / Topic Schema:
Topic: review_events. Payload: {business_id: 123, new_rating: 5, action: "ADD"}.Throughput & Partitioning: Partitioned by
business_id to ensure sequential processing of reviews for the same business.Technical Selection: Kafka for high-throughput event persistence and replayability.
Infrastructure (Optional)
Observability:
Prometheus/Grafana for monitoring Search QPS and P99 latency.
Jaeger for tracing requests from Gateway to Search to DB.
Platform Security:
mTLS for inter-service communication within the VPC.
Wrap Up
Advanced Topics
Trade-offs (Latency vs. Accuracy): We use eventual consistency for review counts. A user might see "4.5 stars (102 reviews)" in search, but when they click the profile, they see 103 reviews. This is acceptable for user experience.
Reliability: If the Review Stream (Kafka) fails, reviews are still persisted in the Review DB. A recovery script can backfill the aggregates later.
Bottleneck Analysis: The "Hot Shard" problem occurs in dense areas like San Francisco.
Optimization: Use sub-geohashes or dynamic Quadtree splitting to further divide dense areas across more nodes.
Distinguishing Insights: For a Staff-level design, I would suggest Tiered Storage for reviews. Recent reviews (last 1 year) are in Cassandra for fast access, while 10-year-old reviews are archived in S3/Snowflake to keep the operational DB lean and performant.