The Question
Design

Cinema Ticket Booking System

Design a high-scale cinema ticket booking platform similar to Fandango or BookMyShow. The system must support movie discovery, real-time seat availability visualization, and a secure booking flow. Key challenges include handling massive traffic spikes for blockbuster releases, preventing double-bookings of the same seat, and managing temporary seat locks during payment. Discuss your approach to data consistency, concurrency control, and how you would ensure high availability while maintaining strict transactional integrity.
PostgreSQL
Redis
Kafka
Kubernetes
CDN
API Gateway
mTLS
OCC
Questions & Insights

Clarifying Questions

What is the expected scale of the system? (Assumption: 10 million Daily Active Users (DAU), with peak traffic during blockbuster releases reaching 50,000 QPS for searching and 5,000 QPS for bookings).
How long should a seat be locked during the checkout process? (Assumption: 5-10 minutes. If payment isn't completed, the seat must be released).
Is the system global or regional? (Assumption: Primarily regional initially, but designed to scale to multiple regions. Data for a specific cinema is local to a region).
Do we handle seat selection or just "best available"? (Assumption: Users must be able to select specific seats from a theater map).
What are the payment requirements? (Assumption: Integration with external providers like Stripe/PayPal; we must handle "at-least-once" payment confirmation).

Thinking Process

Core Bottleneck: Concurrent Seat Booking. How do we prevent two users from booking the same seat for the same showtime during a high-traffic "blockbuster" event?
Step 1: The Locking Mechanism. Use a distributed lock or database-level row locking with a TTL (Time-To-Live) to transition seats from Available -> Locked -> Booked.
Step 2: Database Integrity. Utilize RDBMS transactions to ensure that the payment status and the seat status are updated atomically.
Step 3: Scalability via Read-Scaling. Offload movie metadata and theater maps to a cache, as 99% of traffic is browsing/searching rather than booking.
Step 4: Reliability under Pressure. Implement virtual queues or rate limiting to protect the Booking Service during peak bursts.

Bonus Points

Transactional Outbox Pattern: Ensure that when a booking is finalized, downstream events (email confirmation, loyalty points) are triggered reliably without 2PC (Two-Phase Commit) overhead.
Optimistic Concurrency Control (OCC): Use versioning on seat records to handle high-concurrency seat selection without long-lived database locks.
Distributed Locks with Redlock/Redis: Using Redis to manage the 10-minute "soft lock" on seats to keep the heavy RDBMS free of transient state.
Geo-sharding: Partitioning the database by Cinema_ID or City_ID to ensure local performance and limit the blast radius of database failures.
Design Breakdown

Functional Requirements

Core Use Cases:
Users can search for movies by city/date/genre.
Users can view real-time seat availability for a specific showtime.
Users can temporarily lock seats (5-10 mins) to proceed to payment.
Users can complete payment and receive a digital ticket/QR code.
Scope Control:
In-scope: Browsing, Seat Selection, Locking, Booking, Payment Integration.
Out-of-scope: Popcorn/Food ordering, Cinema employee management portal, Movie recommendation engine.

Non-Functional Requirements

Scale: Must handle 50k+ QPS for read-heavy operations (movie listings).
Latency: Seat selection and map loading must be < 200ms.
Availability: 99.99% (Users should always be able to browse, even if booking is briefly degraded).
Consistency: Strong Consistency for seat booking (Strictly NO double-bookings).
Fault Tolerance: Automatic failover for database primaries; isolation of payment failures.
Security: PCI-DSS compliance for payment handling; secure auth for user tickets.

Estimation

Traffic Estimation:
10M DAU -> ~100M searches/day (~1,200 QPS avg, 25,000 peak).
Booking rate (~5%): 500k bookings/day (~6 QPS avg, 500 peak).
Storage Estimation:
1,000 Cinemas 20 Screens 5 Showtimes * 200 Seats = 20M seat records per day.
1 year of data: 7.3 Billion seat status rows.
Total Storage: ~1 TB/year for booking metadata.
Bandwidth Estimation:
Read: 25k QPS * 10KB (Movie/Seat map) = 250 MB/s.
Write: 500 QPS * 2KB (Booking) = 1 MB/s.

Blueprint

Concise Summary: A microservices architecture centered around a strongly consistent Booking Service using PostgreSQL for transactions and Redis for high-speed seat locking.
Major Components:
API Gateway: Handles rate limiting, authentication, and request routing.
Movie Service: Manages movie metadata and showtimes (Read-optimized).
Booking Service: Orchestrates the seat locking and booking state machine.
Payment Service: Interfaces with external gateways and manages transaction states.
Search/Theater Service: Provides cinema locations and seat map layouts.
Simplicity Audit: We use Redis for transient locks to keep the main DB clean, and standard RDBMS for permanent records. No complex event-sourcing or NoSQL is required for the MVP.
Architecture Decision Rationale:
Why this architecture?: RDBMS is non-negotiable for financial/booking integrity. Redis handles the high-frequency "is this seat taken?" checks.
Functional Satisfaction: Covers search-to-ticket flow seamlessly.
Non-functional Satisfaction: Scalable via read-replicas for movies; consistent via row-level locks for bookings.

High Level Architecture

Sub-system Deep Dive

Edge (Optional)

Content Delivery & Traffic Routing: CDN (Cloudfront/Akamai) used for movie posters, trailers, and static seat map templates to reduce origin load.
Security & Perimeter: API Gateway implements Rate Limiting based on UserID to prevent bot-scraping during blockbuster releases. SSL termination happens at the Load Balancer.

Service

Topology & Scaling: Stateless microservices deployed on Kubernetes (EKS/GKE). Auto-scaling triggered by CPU and Request Count.
API Schema Design:
GET /v1/movies: List movies (REST).
POST /v1/bookings/reserve: Reserve seats (REST).
Request: {showtimeId, seatIds[]}
Response: {reservationId, expiryTimestamp}.
POST /v1/bookings/confirm: Complete booking (REST).
Resilience & Reliability: Circuit breakers (Hystrix/Resilience4j) on the Payment Service to prevent the Booking Service from hanging if Stripe is down.
Observability: Prometheus for QPS/Latency metrics; ELK stack for structured logs (Correlation IDs across services).

Storage

Access Pattern: Heavy reads for Movie/Showtime (100:1 read/write). High-concurrency writes for Seat Status during peak.
Database Table Design (Booking DB):
Showtimes: id, movie_id, theater_id, start_time, base_price.
Seats: id, theater_id, row, number.
Showtime_Seats: id, showtime_id, seat_id, status (Available, Locked, Booked), version (for OCC), user_id.
Technical Selection: PostgreSQL.
Rationale: ACID compliance is critical. PostGIS support is useful for "cinemas near me" queries.
Distribution Logic: Sharded by Cinema_ID. Bookings for different cinemas don't need to be in the same transaction.

Cache

Purpose & Justification:
Movie Cache: Cache movie details and showtimes (TTL 1 hour).
Seat Lock Cache (Redis): Track "Locked" seats. Key: lock:showtime:{id}:seat:{id}, Value: userId, TTL: 10 mins.
Failure Handling: If Redis fails, the system falls back to checking the PostgreSQL Showtime_Seats table (degraded performance but consistent).

Messaging

Purpose & Decoupling: Decouples the booking completion from downstream side effects (Email, SMS, Loyalty updates).
Event / Topic Schema: Booking.Confirmed -> {bookingId, userEmail, movieName, seats[]}.
Technical Selection: Kafka.
Rationale: High throughput and replayability if the Notification Service fails.

Infrastructure (Optional)

Distributed Coordination: Used for leader election in scheduled tasks (e.g., a cron job to clean up expired Showtime_Seats rows that Redis missed).
Wrap Up

Advanced Topics

Trade-offs: We choose Consistency over Availability (CP) for the booking flow. If the Booking DB is down, users cannot buy tickets. This is preferred over overbooking.
Reliability & Failure Handling: Idempotency keys are required for the confirm endpoint to prevent double-charging users on retry.
Bottleneck Analysis: The Showtime_Seats table is a potential hot spot. Optimization: Use Redis to store seat maps as bitmaps for ultra-fast "read" availability checks, only hitting the DB during the actual "Lock" request.
Security: Data at rest (PII) is encrypted using AES-256. All service-to-service communication uses mTLS.