CDN
Cheat Sheet
Prime Use Case
Use when you have a global user base and need to minimize latency for static assets, large media files, or even dynamic content that can be cached at the edge.
Critical Tradeoffs
- Reduced latency vs. Cache consistency challenges
- Lower origin server load vs. Increased operational cost
- Improved availability vs. Debugging complexity (black-box behavior)
Killer Senior Insight
Modern CDNs have evolved from simple static file caches into 'Edge Computing' platforms; they are the first line of defense (WAF/DDoS) and the first layer of logic (Edge Workers), effectively moving the 'Front Door' of your architecture thousands of miles closer to the user.
Recognition
Common Interview Phrases
Common Scenarios
- Static asset hosting (JS, CSS, Images).
- Video on Demand (VoD) and Live Streaming (HLS/DASH).
- API Acceleration (caching dynamic responses with short TTLs).
- Security at the edge (DDoS mitigation and Web Application Firewalls).
Anti-patterns to Avoid
- Using a CDN for a purely internal application with users in a single office.
- Caching highly sensitive, frequently changing PII (Personally Identifiable Information) without strict 'Cache-Control: private' or 'no-store' headers.
- Relying on a CDN for real-time, low-latency bidirectional communication such as WebSockets (some CDNs can proxy these connections, but it is rarely their primary use case and nothing is cached).
The Problem
The Fundamental Issue
The 'Speed of Light' problem: Physical distance between a server and a user creates unavoidable propagation delay, leading to high RTT (Round Trip Time) and poor user experience.
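A quick back-of-the-envelope calculation makes this floor concrete. The sketch below assumes signals travel through fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and uses illustrative distances; the function names and the three-round-trip figure are assumptions for the example, and real network paths are longer than straight-line distance.

```python
# Rough lower bound on latency imposed by distance alone.
# Assumption: signal speed in optical fiber is ~200,000 km/s (about 2/3 of c).
FIBER_SPEED_KM_PER_S = 200_000


def min_rtt_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds for a one-way path of distance_km."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


def first_byte_floor_ms(distance_km: float, round_trips: int = 3) -> float:
    """Propagation-only floor before the first response byte arrives.

    round_trips=3 is an assumption: one RTT for the TCP handshake,
    one for a TLS 1.3 handshake, and one for the HTTP request/response.
    """
    return round_trips * min_rtt_ms(distance_km)


if __name__ == "__main__":
    # Distances are illustrative, not measurements.
    for label, km in [("distant origin, 15,000 km", 15_000),
                      ("nearby PoP, 100 km", 100)]:
        print(f"{label}: RTT >= {min_rtt_ms(km):.0f} ms, "
              f"first byte >= {first_byte_floor_ms(km):.0f} ms")
```

No amount of server hardware changes these numbers; only shortening the distance does.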
What breaks without it
Origin servers crash under 'Flash Crowd' events (e.g., a product launch).
Global users experience multi-second load times due to TCP/TLS handshakes across oceans.
Bandwidth costs at the origin become prohibitively expensive.
Why alternatives fail
Vertical scaling of origin servers doesn't solve physical distance/latency.
Multi-region database deployments are complex and expensive for simple asset delivery.
Local caching (browser-side) only helps on repeat visits, not the critical first-load experience.
Mental Model
The Intuition
Think of a CDN like a global chain of convenience stores. Instead of every customer driving to the central factory (the Origin) to buy milk, the factory sends truckloads to local neighborhood stores (PoPs). Customers get their milk faster, and the factory doesn't have a traffic jam at its front gate.
Key Mechanics
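DNS Resolution: A CNAME points your asset hostname at a CDN-provided hostname, so the CDN's DNS decides which edge address the user receives.
Anycast Routing: The same IP address is announced from many PoPs, and BGP routes each user to the topologically nearest one.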
Cache-Control Headers: Directing the edge on how long to store content (TTL).
Purging/Invalidation: The mechanism to remove stale content from the edge globally.
Origin Shielding: An intermediate cache layer to protect the origin from 'thundering herd' cache misses.
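The TTL and purge mechanics can be sketched as a toy in-memory edge cache. This is a simplified illustration, not how any particular CDN is implemented: `EdgeCache`, `parse_ttl`, and the `fetch_from_origin` callback are hypothetical names, and the sketch ignores Vary, Expires, revalidation, and heuristic freshness.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def parse_ttl(cache_control: str) -> Optional[int]:
    """Return the TTL in seconds a shared cache may use, or None if not cacheable."""
    directives: Dict[str, str] = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = value.strip().strip('"')
    # Conservative simplification: treat no-cache like no-store,
    # since this sketch has no revalidation path.
    if {"no-store", "private", "no-cache"} & directives.keys():
        return None
    for key in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        if key in directives:
            try:
                return max(0, int(directives[key]))
            except ValueError:
                return None
    return None  # no explicit freshness information


class EdgeCache:
    """Toy pull-through cache for one edge node (hypothetical, single-threaded)."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], Tuple[bytes, str]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin  # returns (body, Cache-Control value)
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            self.hits += 1  # fresh copy: serve from the edge
            return entry[1]
        self.misses += 1  # missing or expired: go back to the origin
        body, cache_control = self._fetch(path)
        ttl = parse_ttl(cache_control)
        if ttl:
            self._store[path] = (now + ttl, body)
        else:
            self._store.pop(path, None)
        return body

    def purge(self, path: str) -> None:
        """Invalidate one object on this node; a real purge fans out to every PoP."""
        self._store.pop(path, None)
```

An origin shield is conceptually the same cache placed between the edge nodes and the origin, so that many edge misses turn into a single origin fetch.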
Framework
When it's the best choice
- When read-to-write ratio is high.
- When content is static or semi-static.
- When global availability and low latency are non-negotiable requirements.
When to avoid
- When data is strictly private and cannot be stored on third-party infrastructure.
- When content changes every second and has zero cacheability (though Dynamic Site Acceleration might still help with TCP optimization).
Fast Heuristics
Tradeoffs
Strengths
- Large reduction in Time to First Byte (TTFB) on cache hits.
- Offloads most traffic from origin servers (often 90%+ when content is highly cacheable).
- Built-in DDoS protection and global traffic management.
- Reduced bandwidth costs via peering and compression (Brotli/Gzip).
Weaknesses
- Cache invalidation is 'one of the two hard things in computer science'.
- Serving stale content during revalidation ('stale-while-revalidate') can briefly show users mismatched versions, leading to UI inconsistencies.
- Increased complexity in the request-response flow and debugging.
- Vendor lock-in and potential high costs for premium features like 'Edge Compute'.
Alternatives
Multi-Region Deployment
When it wins
When the application is highly dynamic and requires low-latency database access rather than just asset delivery.
Key Difference
Involves deploying the full application stack in multiple geographic locations.
Peer-to-Peer (P2P) Delivery
When it wins
For extremely large file updates (like game patches) where users can share fragments with each other.
Key Difference
Decentralizes delivery to the client devices themselves rather than managed edge servers.
Execution
Must-hit talking points
- Mention 'Anycast' for routing users to the nearest PoP.
- Discuss 'Cache Hit Ratio' (CHR) as the primary success metric.
- Explain 'Pull' vs 'Push' CDN models.
- Address the 'Thundering Herd' problem and how 'Origin Shielding' or 'Request Collapsing' mitigates it.
- Talk about 'Tiered Caching' to improve hit rates.
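Request collapsing is easier to explain with a small sketch. The version below is a minimal, thread-based illustration under stated assumptions: `RequestCollapser` and `fetch_from_origin` are hypothetical names, and a real edge server would combine this with the cache itself rather than run it standalone. The first caller for a key becomes the leader and fetches from the origin; concurrent callers for the same key wait on the leader's result instead of issuing their own origin requests.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class RequestCollapser:
    """Coalesce concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._lock = threading.Lock()
        self._in_flight: Dict[str, Future] = {}

    def get(self, key: str) -> bytes:
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                # First request for this key: register a placeholder result.
                future = Future()
                self._in_flight[key] = future
        if leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                # Followers see the same failure instead of hanging.
                future.set_exception(exc)
            finally:
                with self._lock:
                    self._in_flight.pop(key, None)
        # Leader returns immediately; followers block until the leader finishes.
        return future.result()
```

With this in place, a burst of simultaneous misses for one hot object costs the origin one request rather than one per client.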
Anticipate follow-ups
- Q: How do you handle cache invalidation at scale?
- Q: How does a CDN handle HTTPS/TLS termination at the edge?
- Q: What happens if the CDN provider goes down? (Multi-CDN strategy).
- Q: How do you secure private content (e.g., signed URLs/cookies)?
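For the signed-URL follow-up, it helps to be able to sketch the idea. The example below is a generic HMAC-based scheme written for illustration; the `expires` and `sig` parameter names and the `sign_url` / `verify_url` helpers are assumptions, not any vendor's actual format. The application signs the path plus an expiry time with a secret shared with the edge, and the edge recomputes the signature before serving.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(path: str, secret: bytes, expires_at: int) -> str:
    """Return path with an expiry timestamp and an HMAC-SHA256 signature appended."""
    message = f"{path}?expires={expires_at}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': signature})}"


def verify_url(url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Edge-side check: signature must match and the link must not be expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        provided = query["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = int(time.time()) if now is None else now
    if current > expires_at:
        return False
    message = f"{parts.path}?expires={expires_at}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

Because verification needs only the shared secret, the edge can authorize requests without calling back to the origin.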
Red Flags
Setting TTLs too long without a robust invalidation strategy.
Why it fails: Users see outdated content (e.g., old CSS/JS), leading to broken UI or incorrect data that is hard to clear globally.
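One common way to make long TTLs safe (offered here as an illustration, not the only invalidation strategy) is to embed a content hash in each asset's filename, so every change produces a new URL and the old cached copy simply stops being requested. The `fingerprinted_name` helper below is a hypothetical sketch of that idea.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension.

    Example shape: /static/app.css -> /static/app.<hash>.css
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


if __name__ == "__main__":
    old = fingerprinted_name("/static/app.css", b"body { color: red; }")
    new = fingerprinted_name("/static/app.css", b"body { color: blue; }")
    print(old)
    print(new)  # different content, therefore a different URL
```

The HTML that references these assets still needs a short TTL or an explicit purge, since it is the entry point that names the current versions.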
Forgetting to vary cache keys by request headers that change the response (like 'Accept-Encoding', usually signaled by the origin with 'Vary: Accept-Encoding').
Why it fails: A client that doesn't support Gzip might receive a compressed file, or vice versa, causing errors.
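A sketch of an encoding-aware cache key follows. It is illustrative only: `normalize_encoding` and `cache_key` are hypothetical helpers, the preference order (Brotli, then Gzip) is an assumption, and the `*` wildcard is ignored for brevity. Normalizing the header to a few variants matters because keying on the raw header value would fragment the cache across every distinct header string browsers send.

```python
def normalize_encoding(accept_encoding: str) -> str:
    """Collapse an Accept-Encoding header into one of: 'br', 'gzip', 'identity'."""
    offered = set()
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        q = params.replace(" ", "").lower()
        # Skip encodings the client explicitly refuses with a zero q-value.
        if token and q not in ("q=0", "q=0.0", "q=0.00", "q=0.000"):
            offered.add(token)
    for preferred in ("br", "gzip"):  # assumed preference order
        if preferred in offered:
            return preferred
    return "identity"


def cache_key(host: str, path: str, accept_encoding: str = "") -> str:
    """Build a cache key that separates compressed and uncompressed variants."""
    return f"{host.lower()}{path}#enc={normalize_encoding(accept_encoding)}"


if __name__ == "__main__":
    print(cache_key("example.com", "/app.js", "gzip, deflate, br"))
    print(cache_key("example.com", "/app.js", ""))
```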
Ignoring the 'Long Tail' of content.
Why it fails: If you have millions of unique assets that are rarely accessed, your Cache Hit Ratio will be low, and the CDN will provide little value while increasing cost.
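The effect of the long tail on Cache Hit Ratio can be demonstrated with a small simulation. This is a toy model under stated assumptions: the edge is modeled as a single LRU cache counted in objects rather than bytes, `lru_hit_ratio` is a hypothetical helper, and both workloads are synthetic.

```python
from collections import OrderedDict
from typing import Iterable


def lru_hit_ratio(requests: Iterable[str], capacity: int) -> float:
    """Replay requests through an LRU cache and return hits / total."""
    cache: "OrderedDict[str, bool]" = OrderedDict()
    hits = total = 0
    for key in requests:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Head-heavy workload: 10,000 requests spread over only 50 hot objects.
    hot = (f"asset-{i % 50}" for i in range(10_000))
    # Long-tail workload: 5,000 distinct objects, each requested twice, far apart.
    tail = (f"asset-{i % 5_000}" for i in range(10_000))
    print(f"hot set CHR:   {lru_hit_ratio(hot, capacity=100):.3f}")
    print(f"long tail CHR: {lru_hit_ratio(tail, capacity=100):.3f}")
```

In the long-tail case every object is evicted before its second request arrives, so the cache adds cost without absorbing any origin traffic.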