The Question
FE Design
Scalable Resumable File Upload System
Design a robust frontend system for uploading large files (up to several gigabytes) with real-time progress tracking. The system must handle network interruptions gracefully via resumable uploads, support multiple concurrent file transfers without blocking the main UI thread, and provide a performant queue management interface. Explain your strategies for memory management, file integrity verification, and how you would handle state synchronization for a large number of concurrent tasks.
React
Zustand
Web Workers
IndexedDB
XMLHttpRequest
Blob API
SparkMD5
Questions & Insights
Clarifying Questions
What is the maximum file size and type supported? Assumption: Up to 5GB per file, supporting any binary format. This necessitates chunked uploads.
Should the system support resumable uploads after a network failure or browser refresh? Assumption: Yes. We will use local persistence to store upload metadata for resumption.
Are we handling multiple concurrent uploads? Assumption: Yes, with a configurable concurrency limit (e.g., 3-6 simultaneous uploads) to prevent browser socket exhaustion.
Is file integrity a priority? Assumption: Yes, we will implement client-side hashing (MD5/SHA) to verify chunks on the server.
Crash Strategy
Progress Granularity: For large files, a single XHR request does report progress, but the transfer has to restart from zero if the connection drops. We will use Chunked Uploads (slicing the file) to provide granular progress and reliability.
Queue Management: How do we manage 50+ files without crashing the UI? We use a Centralized Upload Queue with a worker-like orchestration.
Memory Management: How do we handle multi-GB files without OOM (Out of Memory) errors? We use Blob.slice(), which returns a reference to a byte range rather than copying data, so the whole file is never loaded into RAM.
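As a minimal sketch of the slicing strategy above, the helper below plans byte ranges for a file and takes a lazy slice per range. The names `planChunks`, `sliceChunk`, `ChunkRange`, `SliceableBlob`, and the 5 MiB default are illustrative assumptions rather than part of the original design; `SliceableBlob` only mirrors the `size` and `slice` members that the browser's `Blob` and `File` already expose.

```typescript
// Hypothetical names throughout; a sketch, not a reference implementation.

/** Minimal shape of the browser's Blob/File that slicing relies on. */
interface SliceableBlob {
  readonly size: number;
  slice(start?: number, end?: number): SliceableBlob;
}

interface ChunkRange {
  index: number;
  start: number; // inclusive byte offset
  end: number; // exclusive byte offset
}

const DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024; // 5 MiB, an assumed default

/** Computes byte ranges only; no file data is read here. */
function planChunks(fileSize: number, chunkSize: number = DEFAULT_CHUNK_SIZE): ChunkRange[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges: ChunkRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    ranges.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}

/** Returns a lazy view over one chunk; bytes are read only when it is hashed or sent. */
function sliceChunk(file: SliceableBlob, range: ChunkRange): SliceableBlob {
  return file.slice(range.start, range.end);
}
```

Under these assumptions a 5 GiB file yields 1024 ranges, and memory use is bounded by the chunks currently in flight rather than by the file size.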
The Core Flow:
Initialize: Request an upload ID and check for existing progress.
Process: Slice file and hash chunks (in Web Workers).
Transmit: Upload chunks concurrently with retry logic.
Finalize: Notify the server to merge chunks and verify total hash.
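The four steps can be sketched as a small orchestration function. This is one possible shape under stated assumptions: `UploadApi`, `withRetry`, `uploadFile`, and the retry defaults are hypothetical names and values, hashing is assumed to have already produced `fileHash`, and chunks are sent sequentially here to keep the control flow readable (a real implementation would send several at once).

```typescript
// Sketch only: UploadApi and its method names are hypothetical stand-ins for real endpoints.
interface UploadApi {
  /** Initialize: returns an upload ID and the chunk indices the server already holds. */
  init(fileHash: string, totalChunks: number): Promise<{ uploadId: string; received: number[] }>;
  /** Transmit: sends one chunk. */
  sendChunk(uploadId: string, index: number): Promise<void>;
  /** Finalize: asks the server to merge chunks and verify the total hash. */
  finish(uploadId: string, fileHash: string): Promise<void>;
}

/** Retries an async operation with exponential backoff between attempts. */
async function withRetry<T>(op: () => Promise<T>, maxAttempts = 3, baseDelayMs = 500): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt; // 500, 1000, 2000, ...
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

async function uploadFile(
  api: UploadApi,
  fileHash: string,
  totalChunks: number,
  baseDelayMs = 500,
): Promise<string> {
  // 1. Initialize: learn which chunks survived an earlier attempt.
  const { uploadId, received } = await api.init(fileHash, totalChunks);
  const alreadyReceived = new Set(received);

  // 2-3. Process and transmit: skip chunks the server has, retry the rest on failure.
  for (let index = 0; index < totalChunks; index++) {
    if (alreadyReceived.has(index)) continue;
    await withRetry(() => api.sendChunk(uploadId, index), 3, baseDelayMs);
  }

  // 4. Finalize: the server merges and verifies.
  await api.finish(uploadId, fileHash);
  return uploadId;
}
```

Because the transport is injected, the flow can be exercised against a fake `UploadApi` without a network or a browser.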
Elite Bonus Points
Web Workers for Hashing: Use SparkMD5 inside a Worker to prevent UI jank during large file processing.
Tus Protocol Alignment: Design the API interaction to follow the Tus.io open protocol for resumable file uploads.
Network Awareness: Use navigator.connection, where the browser supports it, to dynamically adjust chunk size or pause uploads in "save-data" mode.
Headless Logic: Decouple the upload state machine into a framework-agnostic core, making it testable without a DOM.
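To make the headless idea concrete, here is a minimal sketch of a per-file state machine as a pure function. The status names match the ones this design uses for its queue store; the event names, the `TRANSITIONS` table, and the choice to route a retry back through hashing are assumptions made for illustration.

```typescript
type UploadStatus = "IDLE" | "HASHING" | "UPLOADING" | "PAUSED" | "COMPLETED" | "ERROR";
// Event names are hypothetical; the design does not prescribe them.
type UploadEvent = "START" | "HASHED" | "PAUSE" | "RESUME" | "FINISHED" | "FAIL" | "RETRY";

const TRANSITIONS: Record<UploadStatus, Partial<Record<UploadEvent, UploadStatus>>> = {
  IDLE: { START: "HASHING" },
  HASHING: { HASHED: "UPLOADING", FAIL: "ERROR" },
  UPLOADING: { PAUSE: "PAUSED", FINISHED: "COMPLETED", FAIL: "ERROR" },
  PAUSED: { RESUME: "UPLOADING" },
  COMPLETED: {},
  // Assumption: a retry re-enters HASHING; with a cached hash that step is immediate.
  ERROR: { RETRY: "HASHING" },
};

/** Pure transition function: events that are invalid in the current state are ignored. */
function transition(status: UploadStatus, event: UploadEvent): UploadStatus {
  return TRANSITIONS[status][event] ?? status;
}
```

Since `transition` has no DOM or framework dependency, it can be unit tested directly and then wrapped by whichever store the UI uses.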
Design Breakdown
Requirements
Functional Requirements:
Select/Drag-and-drop multiple files.
Real-time progress bar (per file and aggregate).
Pause, Resume, and Cancel actions.
Error handling with "Retry" capability.
Non-Functional Requirements:
Performance: Zero UI blocking during hashing/slicing; low memory footprint.
Reliability: Resumable from the last successful chunk after a crash.
Scalability: Support a queue of 100+ files efficiently.
Responsiveness: Mobile-friendly progress tracking.
Design Summary
Concise Summary: A chunk-based upload system managed by a centralized queue store, utilizing Web Workers for non-blocking file hashing and XHR for granular progress tracking.
Major Components:
Upload Manager: Orchestrates the queue, concurrency, and global progress state.
Chunking Service: Handles File.slice() logic and creates payload units.
Integrity Hasher: A Web Worker-based service that generates unique identifiers for resumability.
Persistence Store: IndexedDB/LocalStorage to track chunk manifests for offline-resumption.
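The Integrity Hasher's core loop might look like the sketch below. It assumes an injected hasher with the same `append`/`end` shape as `SparkMD5.ArrayBuffer` and an injected `readSlice` function, so the loop itself has no dependencies; `hashInChunks`, `IncrementalHasher`, and the parameter names are hypothetical.

```typescript
/** Same append/end shape as SparkMD5.ArrayBuffer, injected to keep the sketch dependency-free. */
interface IncrementalHasher {
  append(data: ArrayBuffer): unknown;
  end(): string;
}

/**
 * Feeds the file to the hasher one chunk at a time so only a single chunk
 * is held in memory. Intended to run inside a Web Worker.
 */
function hashInChunks(
  fileSize: number,
  chunkSize: number,
  readSlice: (start: number, end: number) => ArrayBuffer,
  hasher: IncrementalHasher,
  onProgress?: (fraction: number) => void,
): string {
  for (let start = 0; start < fileSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, fileSize);
    hasher.append(readSlice(start, end));
    onProgress?.(end / fileSize); // e.g. posted back to the main thread
  }
  return hasher.end();
}
```

Inside a worker, `readSlice` could plausibly be `(s, e) => new FileReaderSync().readAsArrayBuffer(file.slice(s, e))` and `hasher` could be `new SparkMD5.ArrayBuffer()`; both are left out of the block so it runs without a browser.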
CUJ Walkthrough: User drops 3 files -> Upload Manager adds them to queue -> Hasher generates IDs -> Manager starts first N files -> Chunking Service sends slices -> UI reflects progress via Queue Store.
Simplicity Audit: This is the simplest robust architecture. While single-stream fetch is easier, it lacks reliable "Resume" and "Progress" for large files on unstable networks, which are core requirements for a "system."
Architecture Decision Rationale:
Why this?: Chunking is the industry standard (S3, Dropbox) for reliability. Using a centralized store (Zustand/Redux) ensures the UI stays synced with background upload tasks.
Requirement Satisfaction: Meets all functional needs; Web Workers ensure the "Performance" requirement is met by offloading CPU-heavy hashing.
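The Upload Manager's concurrency control can be sketched as a small slot-based coordinator. The class and method names here are hypothetical, and the default of 3 simply echoes the example limit used in this design; the coordinator only decides which file IDs may start and leaves the actual transfer to other components.

```typescript
/** Hypothetical coordinator: tracks a FIFO waiting list and a bounded active set. */
class ConcurrencyCoordinator {
  private readonly maxConcurrent: number;
  private readonly waiting: string[] = [];
  private readonly active = new Set<string>();

  constructor(maxConcurrent: number = 3) {
    this.maxConcurrent = maxConcurrent;
  }

  /** Queues a file and returns the IDs that should start uploading now. */
  enqueue(fileId: string): string[] {
    this.waiting.push(fileId);
    return this.fillSlots();
  }

  /** Frees a slot (finished, failed, paused, or cancelled) and returns newly started IDs. */
  release(fileId: string): string[] {
    this.active.delete(fileId);
    return this.fillSlots();
  }

  activeIds(): string[] {
    return Array.from(this.active);
  }

  private fillSlots(): string[] {
    const started: string[] = [];
    while (this.active.size < this.maxConcurrent && this.waiting.length > 0) {
      const next = this.waiting.shift() as string;
      this.active.add(next);
      started.push(next);
    }
    return started;
  }
}
```

Returning the list of started IDs keeps the coordinator synchronous and easy to test; the caller dispatches the uploads and calls `release` when each one settles.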
System Diagram
Architecture Deep Dive
Presentation Layer
Component Hierarchy: The App Shell provides the context. The Upload Manager Feature is the smart container that connects the Upload Queue Store to the UI. It renders a list of File Item components which are dumb components receiving status and percentage.
Interaction Layer: Supports DragEvent for file drops and standard <input type="file">. Buttons trigger actions (pause/resume) which dispatch commands to the Concurrency Coordinator.
Rendering Layer: For long lists of uploads, we use List Virtualization (e.g., react-window) to ensure the DOM remains performant. Progress bars are optimized using transform: scaleX() to avoid layout reflows during frequent updates.
UI Frameworks: React for componentization, Tailwind CSS for styling, and Headless UI for accessible modals/dialogs.
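As a small illustration of the scaleX() technique, the helper below turns byte counts into an inline style object. `progressBarStyle` and `ProgressBarStyle` are hypothetical names; the idea is that the bar element keeps a fixed full width and only its transform changes, so updates avoid layout work.

```typescript
interface ProgressBarStyle {
  transform: string;
  transformOrigin: string;
}

/** Maps uploaded/total bytes to a transform-only style; the fraction is clamped to [0, 1]. */
function progressBarStyle(loadedBytes: number, totalBytes: number): ProgressBarStyle {
  const fraction = totalBytes > 0 ? Math.min(Math.max(loadedBytes / totalBytes, 0), 1) : 0;
  return {
    transform: `scaleX(${fraction.toFixed(3)})`,
    transformOrigin: "left", // grow from the left edge instead of the center
  };
}
```

In a React File Item component, the returned object could be passed directly as the `style` prop of the bar's inner element.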
Application Layer
Data Fetching Layer: While fetch is modern, we use XMLHttpRequest (XHR) for chunks because it exposes upload progress through the upload.onprogress event, which fetch does not offer in a broadly supported way, and a simple abort() call for cancellation, which keeps the MVP simple.
State Management Layer: A centralized Upload Queue Store (using Zustand or Redux) tracks every file's status (IDLE, HASHING, UPLOADING, PAUSED, COMPLETED, ERROR).
Concurrency Coordinator: A simple semaphore-based logic that monitors the queue and ensures only MAX_CONCURRENT_UPLOADS (e.g., 3) are in the active state.
Domain Layer
Business Rules: Validates file size and types before processing. Implements the logic that a file is only "Complete" when the server returns a 201/200 on the final merge request.
Integrity Hasher: A dedicated Web Worker reads the file in chunks using FileReaderSync to generate an MD5 hash. This hash serves as the upload_id for resumability.
Chunking Logic: Slices the File (Blob) into fixed sizes (e.g., 5MB). Each chunk is treated as an independent transactional unit.
Infrastructure Layer
API / Network: Standard RESTful endpoints: POST /uploads (init), PATCH /uploads/:id (upload chunk), and POST /uploads/:id/finish (merge).
Storage: Uses IndexedDB to store the mapping of File Hash -> Last Successful Chunk Index. This allows the user to refresh the page, re-select the same file, and resume exactly where they left off.
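The resume lookup described under Storage can be sketched as follows. `ResumeManifest`, `ManifestStore`, `InMemoryManifestStore`, and `firstChunkToSend` are hypothetical names; the in-memory store is only a stand-in with the same asynchronous shape an IndexedDB-backed store would have, so the resume rule can be shown without a browser.

```typescript
interface ResumeManifest {
  fileHash: string;
  totalChunks: number;
  lastSuccessfulChunk: number; // -1 means nothing has been uploaded yet
}

/** Async on purpose: IndexedDB reads and writes are asynchronous. */
interface ManifestStore {
  get(fileHash: string): Promise<ResumeManifest | undefined>;
  put(manifest: ResumeManifest): Promise<void>;
  delete(fileHash: string): Promise<void>;
}

/** Stand-in for an IndexedDB-backed store, for illustration and tests only. */
class InMemoryManifestStore implements ManifestStore {
  private readonly items = new Map<string, ResumeManifest>();

  async get(fileHash: string): Promise<ResumeManifest | undefined> {
    return this.items.get(fileHash);
  }
  async put(manifest: ResumeManifest): Promise<void> {
    this.items.set(manifest.fileHash, { ...manifest });
  }
  async delete(fileHash: string): Promise<void> {
    this.items.delete(fileHash);
  }
}

/**
 * Decides where to resume. Starts over when no manifest exists or when the
 * chunk count differs (for example, the chunk size changed between sessions).
 */
function firstChunkToSend(manifest: ResumeManifest | undefined, totalChunks: number): number {
  if (!manifest || manifest.totalChunks !== totalChunks) return 0;
  return Math.min(manifest.lastSuccessfulChunk + 1, totalChunks);
}
```

A return value equal to `totalChunks` means every chunk was already sent and only the merge request remains. Tracking a single index assumes chunks complete in order; with parallel chunk uploads a set of completed indices would be the safer record.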
Wrap-up
Trade-offs: Chunking adds complexity to the backend (merging files) and frontend (state management), but it is necessary for large file reliability. If we only supported <10MB files, a simple multipart/form-data upload would be more YAGNI-compliant.
Optimization: For very fast networks, we can implement Dynamic Chunk Sizing—increasing chunk size if the speed is high to reduce HTTP overhead.
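Security: All pre-signed URLs or upload tokens must have short TTLs. Client-side hashing lets the server detect corrupted chunks before merging; it guards against accidental corruption in transit, not deliberate tampering, so server-side validation is still required.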
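Dynamic Chunk Sizing, mentioned under Optimization, could be approximated with a throughput-based heuristic like the one below. The function name, the 1 MiB and 64 MiB bounds, the 5-second target, and the rule of changing size by at most 2x per step are all assumptions chosen for illustration.

```typescript
const MIN_CHUNK_BYTES = 1 * 1024 * 1024; // assumed lower bound
const MAX_CHUNK_BYTES = 64 * 1024 * 1024; // assumed upper bound

/**
 * Picks the next chunk size so that a chunk takes roughly targetMs to upload,
 * based on how long the previous chunk took.
 */
function nextChunkSize(lastChunkBytes: number, lastDurationMs: number, targetMs = 5000): number {
  if (lastDurationMs <= 0) return lastChunkBytes; // no usable measurement
  const bytesPerMs = lastChunkBytes / lastDurationMs;
  const ideal = bytesPerMs * targetMs;
  // Change by at most 2x per step so one noisy sample cannot swing the size wildly.
  const damped = Math.min(Math.max(ideal, lastChunkBytes / 2), lastChunkBytes * 2);
  return Math.round(Math.min(Math.max(damped, MIN_CHUNK_BYTES), MAX_CHUNK_BYTES));
}
```

If chunk sizes vary within one upload, resumption has to be tracked by byte offset rather than by chunk index, which is a trade-off to weigh before adopting this optimization.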