Large-Scale Video Recommendation System Design

Scalable Personalized Recommendation System

Design a large-scale recommendation system for a music streaming platform to generate weekly personalized discovery playlists for 500 million users from a corpus of 100 million tracks. Your design should cover the end-to-end ML lifecycle: from multi-stage candidate retrieval and ranking to batch inference pipelines. Address specific challenges including cold-start for new tracks, popularity bias, and the use of audio-based vs. collaborative embeddings. Detail your approach to high-throughput offline serving, data consistency across weekly updates, and evaluation metrics that balance user satisfaction with discovery novelty.
Two-Tower DNN, LightGBM, Word2Vec, FAISS, CNN, Approximate Nearest Neighbor, Spark, Cassandra, Negative Sampling, MMR
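One way to approach the trade-off between user satisfaction and discovery novelty named in this prompt is Maximal Marginal Relevance (MMR), which appears among the listed topics. The sketch below is a minimal greedy version in plain Python. The function names (`cosine`, `mmr_rerank`), the `lam` weight, and the toy relevance scores and embeddings are all illustrative assumptions, not part of the question.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_rerank(candidates, relevance, embeddings, k, lam=0.7):
    """Greedy MMR: repeatedly pick the item that maximizes
    lam * relevance - (1 - lam) * (max similarity to anything already picked)."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(item):
            max_sim = max(
                (cosine(embeddings[item], embeddings[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[item] - (1 - lam) * max_sim

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: tracks "a" and "b" are near-duplicates, "c" is different.
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}

playlist = mmr_rerank(["a", "b", "c"], relevance, embeddings, k=2, lam=0.5)
relevance_only = mmr_rerank(["a", "b", "c"], relevance, embeddings, k=2, lam=1.0)
```

With `lam=0.5` the second slot goes to the dissimilar track "c" rather than the near-duplicate "b"; with `lam=1.0` the procedure reduces to sorting by relevance.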

Scalable Personalized Content Feed Design

Design a large-scale personalized article recommendation system for a platform with 100M+ users. The system must provide real-time updates for news freshness, handle high-throughput candidate retrieval from a corpus of 10M+ items, and optimize for multi-objective engagement (clicks vs. read time). Detail the data lifecycle from ingestion to model serving, and discuss how you mitigate common production issues like position bias, cold-start items, and training-serving skew.
Two-Tower DNN, MMoE, HNSW, FAISS, LightGBM, Flink, Kafka, DistilBERT, Thompson Sampling

Large-Scale Discovery and Recommendation System

Design a high-scale content discovery feed for a visual-first social platform (similar to Pinterest) serving billions of items to hundreds of millions of users. The system must optimize for both immediate engagement (clicks) and long-term utility (saves/re-pins). Your design should address the full ML lifecycle: from multi-modal data ingestion (images, text, graphs) and feature engineering for low-latency retrieval, to multi-task model training that balances conflicting objectives. Detail your strategy for handling massive-scale candidate generation, real-time ranking with under 200ms latency, and the mechanisms you would implement for model observability, drift detection, and continuous improvement through feedback loops.
Two-Tower Model, MMoE, PinSage, GCN, HNSW, DCN-v2, Kafka, Flink, Spark, Feature Store, ANN, Multi-Task Learning

Large-Scale Personalized Video Recommendation System

Design a recommendation system for a video platform with 1B+ users and 100M+ videos. The system must provide personalized content in real-time with a P99 latency of <200ms. Detail the two-stage pipeline (retrieval and ranking), explain how you handle real-time user feedback, discuss strategies for negative sampling in training, and describe how the system addresses position bias and data freshness. Ensure the design includes a robust infrastructure for model evaluation and monitoring for training-serving skew.
Two-Tower Model, DeepFM, DCN-v2, ANN, HNSW, FAISS, Spark, Flink, Kafka, TensorRT, ONNX, Feast
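The two-stage pipeline this prompt asks for can be illustrated with a toy funnel: a cheap retrieval pass narrows the corpus, then a more expensive scorer orders the survivors. The sketch below is only a shape of the idea under stated assumptions. Brute-force dot products stand in for an ANN index such as HNSW or FAISS, and the `freshness` signal, the 0.5/0.5 blend, and all vectors are hypothetical.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def retrieve(user_vec, item_vecs, k):
    """Stage 1: top-k items by embedding dot product.
    Brute force here; a production system would query an ANN index instead."""
    return heapq.nlargest(k, item_vecs, key=lambda i: dot(user_vec, item_vecs[i]))


def rank(candidates, score_fn):
    """Stage 2: order the retrieved candidates with a richer scoring function."""
    return sorted(candidates, key=score_fn, reverse=True)


# Hypothetical embeddings and a hypothetical per-video freshness feature.
item_vecs = {"v1": [0.9, 0.1], "v2": [0.8, 0.3], "v3": [0.1, 0.9], "v4": [0.2, 0.2]}
user_vec = [1.0, 0.0]
freshness = {"v1": 0.1, "v2": 0.9, "v3": 0.5, "v4": 0.5}

candidates = retrieve(user_vec, item_vecs, k=2)
ranked = rank(
    candidates,
    lambda i: 0.5 * dot(user_vec, item_vecs[i]) + 0.5 * freshness[i],
)
```

Retrieval keeps "v1" and "v2" on embedding similarity alone; the ranker then promotes "v2" because its extra feature outweighs the small similarity gap.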

Short-Form Video Recommendation System

Design the end-to-end recommendation engine for a short-form video platform similar to TikTok. The system must scale to 1B+ users and 100M+ videos, delivering personalized content with sub-150ms latency. Your design should specifically address the multi-objective nature of engagement (watch time, likes, shares), the need for ultra-fresh real-time user feedback, and strategies for video cold-start/exploration. Elaborate on the data pipelines for real-time feature engineering, the two-stage retrieval and ranking architecture, and the infrastructure required for continuous model evaluation and deployment.
Two-Tower, MMoE, FAISS, HNSW, Kafka, Flink, Spark, Redis, PyTorch, Triton Inference Server, Protobuf, Kubernetes

Scalable Short-Video Recommendation System Design

Design the core recommendation engine for a high-growth short-form video platform similar to TikTok. The system must serve personalized content to 100M+ daily active users from a corpus of 100M+ videos. Your design should specifically address the multi-stage funnel (retrieval and ranking), the optimization for multiple competing objectives (e.g., watch time vs. engagement), and the engineering of real-time data pipelines to handle sub-second feedback loops. Focus on how you would minimize serving latency while maximizing content freshness and handling the cold-start problem for new creators.
Two-Tower Model, MMoE, HNSW, FAISS, Kafka, Flink, Redis, Milvus, DeepFM, Transformer, TRT, ONNX

Music Recommendation System for High-Scale Streaming

Design a personalized music recommendation system for a global streaming platform with 500M+ users and 100M+ tracks. The system must handle high-concurrency requests (50k+ QPS) with sub-200ms latency. Detail the end-to-end architecture including multi-stage retrieval, ranking strategies to balance engagement vs. discovery, real-time feature engineering for session-based personalization, and robust evaluation frameworks to measure long-term user retention. Address specific challenges like audio-based cold start, handling negative signals (skips), and maintaining model freshness at scale.
Two-Tower DNN, MMoE, FAISS, Kafka, Flink, Spark, Tecton, PyTorch, HNSW

Large-Scale E-commerce Recommendation System

Design an end-to-end recommendation system for a global e-commerce platform with hundreds of millions of users and products. The system must surface personalized product suggestions in real-time on the homepage. Address the full ML lifecycle: building scalable data pipelines for high-throughput clickstream data, a two-stage retrieval and ranking architecture to meet a 150ms P99 latency budget, strategies for handling cold-start items, and techniques for optimizing multiple objectives like CTR and conversion rate. Detail your approach to ensuring offline/online feature consistency and monitoring for model performance degradation in a production environment.
Two-Tower Model, DeepFM, FAISS, MMoE, Kafka, Flink, Spark, PyTorch, Feature Store

Ad Click-Through Rate (CTR) Prediction System

Design a high-scale Ad Click-Through Rate (CTR) prediction system capable of processing 100k+ QPS with sub-100ms latency. The system must handle a corpus of 10 million ads and 100 million users. Detail the multi-stage ranking funnel (retrieval and ranking), feature engineering for high-cardinality ID features, strategies for addressing training-serving skew, and how to maintain model calibration for downstream bidding. Explicitly discuss data ingestion, distributed training for massive datasets, and real-time monitoring of model performance and data drift.
DeepFM, Factorization Machines, Kafka, Spark, Flink, Redis, FAISS, TensorFlow, Horovod, Isotonic Regression, ONNX
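Calibration for downstream bidding, which this prompt calls out, is often handled by fitting a monotone map from raw model scores to observed click rates. The sketch below shows one such approach, isotonic regression via the Pool Adjacent Violators algorithm, in plain Python. The function names (`fit_isotonic`, `calibrate`), the step-function lookup, and the four toy samples are assumptions made for illustration; a real system would more likely use a library implementation on held-out traffic.

```python
def fit_isotonic(scores, labels):
    """Pool Adjacent Violators: fit a non-decreasing step function.
    Returns a list of (upper_score_bound, calibrated_value) pairs."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [label_sum, count, max_score_in_block]
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the latest one.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(b[2], b[0] / b[1]) for b in blocks]


def calibrate(model, score):
    """Look up the calibrated probability for a raw score."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]  # scores above the last bound reuse the top value


# Hypothetical raw scores and click labels (1 = clicked).
raw_scores = [0.1, 0.2, 0.3, 0.4]
clicks = [0, 1, 0, 1]

model = fit_isotonic(raw_scores, clicks)
p = calibrate(model, 0.25)
```

On this toy data the out-of-order middle pair (a click at 0.2, no click at 0.3) is pooled into a single block with value 0.5, so the fitted map stays non-decreasing in the raw score.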

Scalable Video Recommendation System Design

Design an end-to-end recommendation system for a global video streaming platform with 100M+ DAU. The system must maximize user engagement (watch time) while maintaining low latency (<200ms). Detail the multi-stage architecture (retrieval and ranking), explain how you handle high-cardinality categorical features, and describe the data pipelines required to minimize training-serving skew. Address how the system balances different product objectives (e.g., clicks vs. completion rate) and how you ensure the model remains fresh in the face of rapidly changing trends.
Two-Tower Model, MMoE, ANN, FAISS, HNSW, Kafka, Flink, Spark, Feature Store, Triton Inference Server

Large-Scale Video Recommendation System Design

Design an end-to-end recommendation system for a global video streaming platform with 500M users and 100M videos. The system must optimize for multiple objectives (CTR and Watch Time) while meeting a 200ms P99 latency SLA. Detail the multi-stage architecture, including candidate retrieval via embedding-based search and high-precision ranking. Address specific challenges like position bias, delayed feedback in watch-time labels, and the infrastructure required for real-time feature freshness and model monitoring in production.
Two-Tower, MMoE, FAISS, Kafka, Flink, Spark, Tecton, XGBoost, DeepFM, Protobuf, Isotonic Regression