The Question
DesignScalable Cinema Booking System Design
Design a high-concurrency cinema ticket booking system that supports movie discovery, real-time seat selection with temporary holds, and secure payment processing. The system must guarantee zero double-bookings during peak traffic for blockbuster releases and handle seat-release logic if payments are not completed within a specific timeout period.
PostgreSQL
Redis
Kafka
CDN
JWT
Microservices
Questions & Insights
Clarifying Questions
Scale and Traffic: What is the expected scale in terms of Daily Active Users (DAU) and Peak QPS (Queries Per Second)? Specifically, how should the system handle "blockbuster" releases where traffic might spike 10x-100x?
Booking Window & Hold Policy: How long is a seat "held" for a user once selected before it is released back to the pool? Is partial booking (e.g., 2 out of 3 requested seats) allowed?
Geographic Scope: Is this a global system or restricted to a specific region/country? Does it need to handle multiple time zones and currencies?
Inventory Complexity: Are there different seat types (VIP, Recliner, IMAX) or dynamic pricing models based on demand?
Assumptions for MVP:
Scale: 10M DAU, Peak QPS of 50k (mostly reads/searches), Booking QPS of 5k.
Seat Hold: Seats are held for 10 minutes. No partial bookings.
Scope: Single country, single currency for MVP.
Consistency: Strong consistency is required for seat selection to prevent double-booking.
Thinking Process
Core Bottleneck: The primary challenge is high-concurrency seat selection. Multiple users clicking the same seat at the exact same millisecond must not result in double-booking.
Progressive Questions:
How do we allow millions of users to browse movie schedules without hitting the main database? (Caching & Search Indexing).
How do we ensure that when two users try to book Seat A1, only one succeeds? (Distributed Locking/Atomic Transactions).
How do we handle the "temporary hold" state without leaving the database in an inconsistent state if a user abandons their cart? (TTL-based locks or status timers).
How do we ensure the system remains responsive even if the Payment Gateway is slow or down? (Asynchronous Payment Processing & Webhooks).
Bonus Points
Transactional Outbox Pattern: Use this to ensure that booking records and notification events (emails/SMS) are updated atomically, preventing "phantom" bookings where a user pays but never gets a ticket.
Hot-Partition Mitigation: For blockbuster movies, sharding by
MovieID might cause hot spots. We can use a composite sharding key CinemaID_ShowtimeID to distribute load across the cluster.Optimistic Locking with Versioning: Instead of heavy pessimistic locks in the DB, use
version columns to handle seat updates, significantly improving throughput for high-contention shows.Read-Heavy Optimization: Use a search-optimized store (e.g., Elasticsearch) for movie discovery to offload complex queries from the transactional RDBMS.
Design Breakdown
Functional Requirements
Users can search for movies by title, genre, city, and date.
Users can view cinema layouts and real-time seat availability.
Users can select seats and "hold" them for 10 minutes.
Users can complete payments and receive a digital ticket (QR Code).
Cinema managers can add/update showtimes and seat configurations.
Non-Functional Requirements
High Consistency: Zero tolerance for double-booking.
High Availability: Browsing and searching must be available 99.99% of the time.
Low Latency: Seat selection and availability checks must be < 200ms.
Scalability: Must handle massive spikes during major movie releases.
Estimation
Storage: 1,000 cinemas 10 screens 5 shows/day * 200 seats = 10M seat-records per day. For 5 years of data: ~18B records.
Bandwidth: 50k QPS * 2KB avg response = 100MB/s (Search traffic).
Concurrency: 5k booking QPS. If a "hold" lasts 10 mins, we might have 3M active "held" seats at peak. Redis can easily manage this volume.
Blueprint
Concise Summary: A microservices-based architecture using an RDBMS for transactional integrity and a distributed cache for high-speed seat locking.
Major Components:
API Gateway: Entry point for authentication, rate limiting, and request routing.
Search Service: Provides low-latency movie and showtime discovery using read-replicas.
Booking Service: Orchestrates the seat selection, temporary holding, and final ticket issuance.
Payment Service: Integrates with 3rd party providers and manages payment states.
Notification Service: Asynchronously sends tickets via Email/SMS.
Simplicity Audit: This design avoids complex "Saga" orchestrators by using a simple "Pending" state in the database, which is the most reliable way to maintain consistency for an MVP.
Architecture Decision Rationale:
Why this architecture?: RDBMS provides the ACID guarantees necessary for financial and inventory transactions.
Functional Satisfaction: Covers end-to-end flow from discovery to ticket delivery.
Non-functional Satisfaction: Redis ensures low-latency seat locking, while asynchronous messaging ensures the system doesn't block on slow external payment APIs.
High Level Architecture
Sub-system Deep Dive
Edge (Optional)
Content Delivery & Traffic Routing:
Use Cloudfront/Akamai to cache static movie posters and UI assets.
Security & Perimeter:
API Gateway handles JWT validation.
Rate limiting is applied per UserID to prevent bot-scraping of seat maps.
Service
Topology & Scaling: Stateless microservices deployed across multiple Availability Zones (AZs). Scaling is triggered by CPU and Request Count.
API Schema Design:
POST /v1/bookings/reserve: Reserve seats. (Request: showtime_id, seat_ids[]; Response: booking_id, expiry_time).POST /v1/payments/confirm: Confirm payment. (Request: booking_id, payment_token).Resilience: Implement exponential backoff for Payment Gateway calls.
Observability: Prometheus metrics for "Booking Success Rate" and "Seat Conflict Rate".
Storage
Access Pattern: 90% Read (Browsing), 10% Write (Booking).
Database Table Design:
Shows:
id, movie_id, cinema_id, start_time, price.Seats:
id, show_id, row_num, col_num, status (Available, Reserved, Booked).Bookings:
id, user_id, show_id, total_price, status (Pending, Confirmed, Expired).Technical Selection: PostgreSQL. Rationale: Strong ACID support and Mature Row-Level Locking.
Distribution Logic: Sharded by
CinemaID to keep all seats for a specific venue on the same physical shard, simplifying transaction logic.Cache
Purpose & Justification: Redis is used for "Seat Holds". DB writes are too slow for the high-frequency "clicking" phase.
Key-Value Schema:
Key:
lock:show:{id}:seat:{id}Value:
user_idTTL: 600 seconds (10 mins).
Failure Handling: If Redis is down, fall back to DB-level pessimistic locking (slower but safe).
Messaging
Purpose & Decoupling: Kafka/SQS decouples the Payment Service from the Notification Service.
Event Schema:
BookingConfirmedEvent { booking_id, user_email, qr_code_data }.Technical Selection: Kafka for high throughput and replayability in case the Notification service fails.
Wrap Up
Advanced Topics
Trade-offs: We prioritize Consistency over Availability (CP in CAP) during the seat selection process. If the database is partitioned, we block booking to prevent double-selling.
Reliability: A "Cleanup Worker" runs every minute to find
Pending bookings in the DB that have exceeded 10 minutes and marks them as Expired, releasing seats.Bottleneck Analysis: The
Seats table could become a bottleneck. Optimization: Move highly dynamic "Seat Status" to a BitMap in Redis to show real-time availability without querying the DB.Security: All payment data is handled via PCI-compliant redirection (Stripe/Paypal) so the system never stores credit card numbers (PII protection).