The Question
FE Design

Scalable Resumable File Upload System

Design a robust frontend system for uploading large files (up to several gigabytes) with real-time progress tracking. The system must handle network interruptions gracefully via resumable uploads, support multiple concurrent file transfers without blocking the main UI thread, and provide a performant queue management interface. Explain your strategies for memory management, file integrity verification, and how you would handle state synchronization for a large number of concurrent tasks.
React
Zustand
Web Workers
IndexedDB
XMLHttpRequest
Blob API
SparkMD5
Questions & Insights

Clarifying Questions

What is the maximum file size and type supported? Assumption: Up to 5 GB per file, supporting any binary format. This necessitates chunked uploads.
Should the system support resumable uploads after a network failure or browser refresh? Assumption: Yes. We will use local persistence to store upload metadata for resumption.
Are we handling multiple concurrent uploads? Assumption: Yes, with a configurable concurrency limit (e.g., 3-6 simultaneous uploads) to prevent browser socket exhaustion.
Is file integrity a priority? Assumption: Yes. We will implement client-side hashing (MD5/SHA) so the server can verify each chunk it receives.

Crash Strategy

Progress Granularity: For large files, native XHR progress events are insufficient if the connection drops. We will use Chunked Uploads (slicing the file) to provide granular progress and reliability.
Queue Management: How do we manage 50+ files without crashing the UI? We use a Centralized Upload Queue with a worker-like orchestration.
Memory Management: How do we handle multi-GB files without OOM (Out of Memory) errors? We use Blob.slice(), which returns a new Blob that references a byte range of the original file rather than copying it, so the whole file is never loaded into RAM.
The Core Flow:
Initialize: Request an upload ID and check for existing progress.
Process: Slice file and hash chunks (in Web Workers).
Transmit: Upload chunks concurrently with retry logic.
Finalize: Notify the server to merge chunks and verify total hash.
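
The four steps above can be sketched as a small orchestration function. This is a minimal sketch under stated assumptions: the UploadTransport interface, its method names, and the retry defaults are hypothetical and stand in for whatever API the backend actually exposes. Chunks are sent sequentially here for brevity.

```typescript
// Hypothetical transport interface; method names are illustrative only.
interface UploadTransport {
  init(fileHash: string, totalChunks: number): Promise<{ uploadId: string; uploadedChunks: number[] }>;
  sendChunk(uploadId: string, index: number): Promise<void>;
  finish(uploadId: string, fileHash: string): Promise<void>;
}

// Retry a task with exponential backoff: delay, 2x delay, 4x delay, ...
async function withRetry<T>(task: () => Promise<T>, maxAttempts: number, baseDelayMs: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        const delay = baseDelayMs * Math.pow(2, attempt);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

// Returns the indices that were actually transmitted in this session.
async function runUpload(
  transport: UploadTransport,
  fileHash: string,
  totalChunks: number,
  retryDelayMs: number = 500 // assumed default
): Promise<number[]> {
  // Initialize: ask the server which chunks it already has.
  const { uploadId, uploadedChunks } = await transport.init(fileHash, totalChunks);
  const alreadyDone = new Set<number>(uploadedChunks);
  const sent: number[] = [];

  // Transmit: skip chunks the server already holds, retry the rest.
  for (let index = 0; index < totalChunks; index++) {
    if (alreadyDone.has(index)) continue;
    await withRetry(() => transport.sendChunk(uploadId, index), 3, retryDelayMs);
    sent.push(index);
  }

  // Finalize: ask the server to merge and verify the total hash.
  await transport.finish(uploadId, fileHash);
  return sent;
}
```

Because the transport is injected, the same flow can be exercised in tests with a fake that drops a chunk once, without touching the network.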

Elite Bonus Points

Web Workers for Hashing: Use SparkMD5 inside a Worker to prevent UI jank during large file processing.
Tus Protocol Alignment: Design the API interaction to follow the Tus.io open protocol for resumable file uploads.
Network Awareness: Use navigator.connection (the Network Information API, which is not available in every browser) to dynamically adjust chunk size or pause uploads when its saveData flag is set.
Headless Logic: Decouple the upload state machine into a framework-agnostic core, making it testable without a DOM.
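
As an illustration of the headless idea, the upload lifecycle can be written as a pure transition table with no React or DOM dependency. The status names are the ones used later in this design; the event names (START, HASHED, and so on) are assumptions made for this sketch.

```typescript
type UploadStatus = "IDLE" | "HASHING" | "UPLOADING" | "PAUSED" | "COMPLETED" | "ERROR";
// Event names are hypothetical; pick whatever vocabulary suits your store.
type UploadEvent = "START" | "HASHED" | "PAUSE" | "RESUME" | "FINISH" | "FAIL" | "RETRY";

// Allowed transitions: any (status, event) pair not listed here is ignored.
const TRANSITIONS: Record<UploadStatus, Partial<Record<UploadEvent, UploadStatus>>> = {
  IDLE: { START: "HASHING" },
  HASHING: { HASHED: "UPLOADING", FAIL: "ERROR" },
  UPLOADING: { PAUSE: "PAUSED", FINISH: "COMPLETED", FAIL: "ERROR" },
  PAUSED: { RESUME: "UPLOADING" },
  COMPLETED: {},
  ERROR: { RETRY: "UPLOADING" },
};

// Pure function: returns the next status, or the current one if the event is invalid.
function nextStatus(current: UploadStatus, event: UploadEvent): UploadStatus {
  const target = TRANSITIONS[current][event];
  return target === undefined ? current : target;
}
```

A store such as Zustand can call nextStatus inside its actions, while unit tests exercise the table directly.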
Design Breakdown

Requirements

Functional Requirements:
Select/Drag-and-drop multiple files.
Real-time progress bar (per file and aggregate).
Pause, Resume, and Cancel actions.
Error handling with "Retry" capability.
Non-Functional Requirements:
Performance: Zero UI blocking during hashing/slicing; low memory footprint.
Reliability: Resumable from the last successful chunk after a crash.
Scalability: Support a queue of 100+ files efficiently.
Responsiveness: Mobile-friendly progress tracking.
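
For the per-file and aggregate progress requirement, one reasonable approach is to weight the aggregate by bytes rather than averaging per-file percentages, so a small finished file does not overstate overall progress. The FileProgress shape below is an assumption for illustration.

```typescript
// Hypothetical shape of the progress data kept per file.
interface FileProgress {
  uploadedBytes: number;
  totalBytes: number;
}

// Whole-number percentage for a single file (an empty file counts as done).
function filePercent(p: FileProgress): number {
  if (p.totalBytes === 0) return 100;
  return Math.min(100, Math.floor((p.uploadedBytes / p.totalBytes) * 100));
}

// Byte-weighted percentage across the whole queue (an empty queue reports 0).
function aggregatePercent(files: FileProgress[]): number {
  if (files.length === 0) return 0;
  const totalBytes = files.reduce((sum, f) => sum + f.totalBytes, 0);
  const uploadedBytes = files.reduce((sum, f) => sum + f.uploadedBytes, 0);
  return filePercent({ uploadedBytes, totalBytes });
}
```

With a 100-byte file half uploaded and a 300-byte file not yet started, the aggregate is 12%, whereas averaging the two percentages would report 25%.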

Design Summary

Concise Summary: A chunk-based upload system managed by a centralized queue store, utilizing Web Workers for non-blocking file hashing and XHR for granular progress tracking.
Major Components:
Upload Manager: Orchestrates the queue, concurrency, and global progress state.
Chunking Service: Handles File.slice() logic and creates payload units.
Integrity Hasher: A Web Worker-based service that generates unique identifiers for resumability.
Persistence Store: IndexedDB/LocalStorage to track chunk manifests for offline-resumption.
CUJ Walkthrough: User drops 3 files -> Upload Manager adds them to queue -> Hasher generates IDs -> Manager starts first N files -> Chunking Service sends slices -> UI reflects progress via Queue Store.
Simplicity Audit: This is the simplest robust architecture. While single-stream fetch is easier, it lacks reliable "Resume" and "Progress" for large files on unstable networks, which are core requirements for a "system."
Architecture Decision Rationale:
Why this?: Chunking is the industry standard (S3, Dropbox) for reliability. Using a centralized store (Zustand/Redux) ensures the UI stays synced with background upload tasks.
Requirement Satisfaction: Meets all functional needs; Web Workers ensure the "Performance" requirement is met by offloading CPU-heavy hashing.

System Diagram

Architecture Deep Dive

Presentation Layer

Component Hierarchy: The App Shell provides the context. The Upload Manager Feature is the smart container that connects the Upload Queue Store to the UI. It renders a list of File Item components which are dumb components receiving status and percentage.
Interaction Layer: Supports DragEvent for file drops and standard <input type="file">. Buttons trigger actions (pause/resume) which dispatch commands to the Concurrency Coordinator.
Rendering Layer: For long lists of uploads, we use List Virtualization (e.g., react-window) to ensure the DOM remains performant. Progress bars are optimized using transform: scaleX() to avoid layout reflows during frequent updates.
UI Frameworks: React for componentization, Tailwind CSS for styling, and Headless UI for accessible modals/dialogs.
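
The scaleX() technique mentioned above can be sketched as a helper that maps byte counts to an inline style object. The helper name and the BarStyle type are assumptions; the returned object has the shape React accepts as a style prop.

```typescript
interface BarStyle {
  transform: string;
  transformOrigin: string;
}

// Scale the bar horizontally instead of changing its width, so frequent
// updates do not trigger layout.
function progressBarStyle(uploadedBytes: number, totalBytes: number): BarStyle {
  const fraction =
    totalBytes > 0 ? Math.min(1, Math.max(0, uploadedBytes / totalBytes)) : 0;
  return {
    transform: `scaleX(${fraction})`,
    transformOrigin: "left", // grow from the left edge
  };
}
```

A File Item component would render a full-width inner element and pass it this style object.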

Application Layer

Data Fetching Layer: While fetch is the more modern API, we use XMLHttpRequest (XHR) for chunks because its upload.onprogress event reports bytes sent, which fetch does not expose in a broadly supported way, and xhr.abort() gives a simple cancellation path for an MVP.
State Management Layer: A centralized Upload Queue Store (using Zustand or Redux) tracks every file's status (IDLE, HASHING, UPLOADING, PAUSED, COMPLETED, ERROR).
Concurrency Coordinator: A simple semaphore-based logic that monitors the queue and ensures only MAX_CONCURRENT_UPLOADS (e.g., 3) are in the active state.
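
A minimal sketch of the semaphore behind the Concurrency Coordinator, assuming each upload is wrapped as a function returning a Promise. The class name UploadSemaphore is hypothetical; MAX_CONCURRENT_UPLOADS comes from the description above.

```typescript
const MAX_CONCURRENT_UPLOADS = 3;

class UploadSemaphore {
  private active = 0;
  private waiting: Array<() => void> = [];
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Take a slot immediately if one is free, otherwise wait in FIFO order.
  private acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  // Hand the slot directly to the next waiter so the limit is never exceeded.
  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next(); // slot ownership transfers; active count is unchanged
    } else {
      this.active--;
    }
  }

  // Run a task once a slot is available; always release, even on failure.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

const uploadSlots = new UploadSemaphore(MAX_CONCURRENT_UPLOADS);
```

The Upload Manager would wrap every file upload in uploadSlots.run(...), so queued files start automatically as earlier ones finish or fail.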

Domain Layer

Business Rules: Validates file size and types before processing. Implements the logic that a file is only "Complete" when the server returns a 201/200 on the final merge request.
Integrity Hasher: A dedicated Web Worker reads the file in chunks using FileReaderSync to generate an MD5 hash. This hash serves as the upload_id for resumability.
Chunking Logic: Slices the File (Blob) into fixed sizes (e.g., 5MB). Each chunk is treated as an independent transactional unit.
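
The chunking logic reduces to computing byte ranges. The sketch below plans the ranges only; in the browser each range would be turned into a payload with file.slice(start, end). The function and type names are assumptions for illustration.

```typescript
interface ChunkRange {
  index: number;
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

const DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024; // 5 MB, as in the example above

// Split a file of `fileSize` bytes into fixed-size ranges; the last may be shorter.
function planChunks(fileSize: number, chunkSize: number = DEFAULT_CHUNK_SIZE): ChunkRange[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const ranges: ChunkRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    ranges.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}
```

At the assumed 5 GB maximum and 5 MB chunk size, this yields 1024 chunks per file, which is a useful number to keep in mind when sizing the persisted manifest.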

Infrastructure Layer

API / Network: Standard RESTful endpoints: POST /uploads (init), PATCH /uploads/:id (upload chunk), and POST /uploads/:id/finish (merge).
Storage: Uses IndexedDB to store the mapping of File Hash -> Last Successful Chunk Index. This allows the user to refresh the page, re-select the same file, and resume exactly where they left off.
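
A sketch of the resume calculation, independent of the storage engine. One assumption differs from the description above: if chunks of a single file are sent concurrently they can complete out of order, so this sketch records the list of completed indices rather than a single last index. The ChunkManifest shape is hypothetical; in practice it would be the value stored in IndexedDB under the file hash.

```typescript
// Hypothetical record persisted per file, keyed by fileHash.
interface ChunkManifest {
  fileHash: string;
  totalChunks: number;
  completed: number[]; // indices of chunks the server has acknowledged
}

// Chunks that still need to be uploaded, in ascending order.
function pendingChunks(manifest: ChunkManifest): number[] {
  const done = new Set<number>(manifest.completed);
  const pending: number[] = [];
  for (let i = 0; i < manifest.totalChunks; i++) {
    if (!done.has(i)) pending.push(i);
  }
  return pending;
}

// Return an updated manifest without mutating the stored one.
function markCompleted(manifest: ChunkManifest, index: number): ChunkManifest {
  if (manifest.completed.indexOf(index) !== -1) return manifest;
  return {
    fileHash: manifest.fileHash,
    totalChunks: manifest.totalChunks,
    completed: manifest.completed.concat(index),
  };
}
```

After a refresh, the app re-hashes the re-selected file, looks up its manifest, and uploads only what pendingChunks returns.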
Wrap Up

Wrap-up

Trade-offs: Chunking adds complexity to the backend (merging files) and frontend (state management), but it is necessary for large file reliability. If we only supported <10MB files, a simple multipart/form-data upload would be more YAGNI-compliant.
Optimization: For very fast networks, we can implement Dynamic Chunk Sizing—increasing chunk size if the speed is high to reduce HTTP overhead.
Security: All pre-signed URLs or upload tokens must have short TTLs. Client-side hashing lets the server detect chunks corrupted in transit before merging them; note that MD5 guards against accidental corruption, not deliberate tampering.
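
Dynamic Chunk Sizing can be sketched as a function that picks the next chunk size from the measured throughput of the previous chunk. The bounds and the target of roughly four seconds per chunk are assumptions chosen for illustration, not tuned values.

```typescript
// All three constants are illustrative assumptions.
const MIN_CHUNK_BYTES = 1 * 1024 * 1024;   // 1 MB floor for slow links
const MAX_CHUNK_BYTES = 32 * 1024 * 1024;  // 32 MB ceiling to bound retry cost
const TARGET_SECONDS_PER_CHUNK = 4;

// Choose a size so the next chunk takes about TARGET_SECONDS_PER_CHUNK
// at the throughput observed for the previous chunk.
function nextChunkSize(lastChunkBytes: number, lastDurationMs: number): number {
  if (lastChunkBytes <= 0 || lastDurationMs <= 0) return MIN_CHUNK_BYTES;
  const bytesPerSecond = lastChunkBytes / (lastDurationMs / 1000);
  const ideal = Math.round(bytesPerSecond * TARGET_SECONDS_PER_CHUNK);
  return Math.min(MAX_CHUNK_BYTES, Math.max(MIN_CHUNK_BYTES, ideal));
}
```

One caveat with this design: once chunk sizes vary, a chunk index no longer maps to a fixed byte offset, so the resume record would need to track byte offsets rather than indices.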