The Question
FE Design
Design a Client-Side PDF Viewer and Annotation Tool
Design a highly responsive, memory-efficient client-side PDF viewer and annotation workspace. The application must render massive PDF files (up to 500 pages) smoothly, support real-time user-drawn annotations (freehand, highlight, shape, and text), and scale without performance issues on mobile devices. Detail your approach to multi-layer viewport rendering, high-DPI scaling, virtualized rendering lists, memory management, coordinate scaling across zoom actions, and offline-first storage architecture.
React
Tailwind CSS
pdf.js
Web Workers
IndexedDB
Dexie.js
OffscreenCanvas
SVG
Questions & Insights
Clarifying Questions
Question 1: Do we need to support editing the base PDF content (e.g., reflowing text, altering embedded images), or are we strictly overlaying non-destructive annotations on top of the rendered PDF pages?
Assumption: We are only overlaying non-destructive annotations (highlights, text comments, freehand drawings, shapes). This is the standard pattern for web-based PDF annotation tools and avoids writing a full PDF generation/modification engine for the MVP.
Question 2: How large are the expected documents (page count and file size), and what are the target hardware capabilities?
Assumption: The system must support documents up to 500 pages (approx. 100MB) while maintaining 60fps scrolling on mid-tier mobile tablets and modern laptops. This necessitates virtualized rendering and dynamic memory unloading of offscreen pages.
Question 3: How will annotations be saved, synced, and exported?
Assumption: Annotations will be serialized to standard JSON structures containing relative coordinates, page indexes, and creator metadata. They will be saved to a backend via debounced auto-saves and can be burned directly into a downloadable PDF binary client-side during export.
Question 4: What specific annotation tools are required for the Minimum Viable Product (MVP)?
Assumption: Highlight (attaches to native PDF text segments), Freehand Pen (vector path sketching), Rectangle/Shape (vector geometry), and Textbox (rich-text inputs placed at absolute page locations).
---
Crash Strategy
This design resolves the high-performance constraints of rendering heavy vector/raster files in a single-threaded browser environment without crashing the viewport.
Progressive Architecture Flow
How do we render high-resolution PDF pages on-demand without locking up the browser thread?
Resolution: We use pdf.js inside a dedicated Web Worker to fetch, parse, and rasterize PDF pages onto dynamic, high-DPI HTML5 <canvas> elements.
How do we manage memory and ensure smooth scrolling across hundreds of pages?
Resolution: We implement a custom bi-directional virtual list (VirtualizedPageViewport) that mounts only the visible pages (plus a small buffer) into the DOM, unmounting offscreen pages and releasing their <canvas> backing memory.
How do we guarantee annotations scale seamlessly and stay correctly positioned across screen dimensions, devices, and zoom levels?
Resolution: We implement an isolated SVG coordinate system on top of each page. The system maps all canvas interaction coordinates into a standardized unit space (PDF points, default 72 points per inch) and applies scaling dynamically using coordinate matrices.
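The virtual-list idea above can be sketched as a pure function that decides which page indexes to mount. This is a minimal sketch under stated assumptions: `computeVisibleRange`, `VisibleRange`, and the `overscan` parameter are hypothetical names, pages are assumed to be stacked vertically with no gaps, and the real VirtualizedPageViewport may work differently.

```typescript
// Hypothetical helper for a virtual list; names are illustrative only.
export interface VisibleRange {
  first: number; // first page index to mount (inclusive)
  last: number;  // last page index to mount (inclusive); -1 when there are no pages
}

export function computeVisibleRange(
  pageHeights: number[],  // CSS-pixel height of every page at the current zoom
  scrollTop: number,      // scroll offset of the viewport in CSS pixels
  viewportHeight: number, // visible height in CSS pixels (assumed positive)
  overscan = 1            // extra pages kept mounted above and below
): VisibleRange {
  const count = pageHeights.length;
  if (count === 0) return { first: 0, last: -1 };

  const viewBottom = scrollTop + viewportHeight;
  let top = 0;
  let first = -1;
  let last = -1;

  // A linear scan is fine for a few hundred pages; prefix sums plus
  // binary search would scale further.
  for (let i = 0; i < count; i++) {
    const bottom = top + pageHeights[i];
    if (bottom > scrollTop && top < viewBottom) {
      if (first === -1) first = i;
      last = i;
    }
    top = bottom;
  }

  // Nothing intersects when scrolled beyond either end: use the nearest page.
  if (first === -1) {
    first = last = scrollTop <= 0 ? 0 : count - 1;
  }

  return {
    first: Math.max(0, first - overscan),
    last: Math.min(count - 1, last + overscan),
  };
}
```

Pages outside the returned range are the ones a viewport component would unmount, which is where their canvases can be released.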
---
Elite Bonus Points
Low-Latency Freehand Interpolation: Implementing Chaikin's Algorithm or Catmull-Rom spline interpolation on raw mouse/pointer inputs to render smooth freehand strokes at 60fps, bypassing the jitter of raw browser mouse events.
OffscreenCanvas Rendering: Utilizing standard OffscreenCanvas transfer rules inside the PDF Worker to offload the expensive rasterization calculation from the main UI thread entirely, preventing frame drops during background page loading.
Incremental Byte-Range PDF Loading: Configuring the network client to fetch the PDF using partial HTTP Range requests (206 Partial Content). This allows the application to render page 100 without first downloading pages 1 through 99.
Sub-pixel Text Overlay Alignment: Parsing and injecting invisible, standards-compliant HTML text characters dynamically matched to the text matrices of the PDF. This enables native browser search, highlight selection, and screen reader access (ARIA) that overlays precisely on the rendered canvas image.
---
Design Breakdown
Requirements
Functional Requirements
Document Loading & Rendering: Users can load local PDF files or open remote URLs with support for dynamic zooming (50% to 400%) and fit-to-width/fit-to-height scaling.
Annotation Toolbar: Selection of tools including Hand (pan), Text Highlight, Freehand Pen, Rectangle Shape, and Textbox.
Interactive Canvas Overlay: Users can draw, resize, move, edit, and delete annotations directly on top of individual PDF pages.
Undo/Redo History: Granular transactional undo/redo capability covering all drawing and editing actions.
Local Cache & Autosave: Auto-saves active annotations to client-side storage to prevent data loss on sudden connection drops.
Non-Functional Requirements
Performance: Frame rate must stay above 55fps during active freehand drawing or panning. Page rasterization must complete within 200ms of the page entering the virtualized viewport.
Memory Footprint: Strict memory limits preventing the DOM and GPU from exceeding 300MB of canvas cache, discarding offscreen buffers actively.
Responsiveness: Mobile-friendly pointer event normalization for touch screen styluses (e.g., Apple Pencil, Samsung S-Pen) with pressure sensitivity and palm-rejection.
Accessibility (a11y): Focus states for keyboard-based annotation navigation, ARIA attributes for annotation elements, and screen-readable structural text matching.
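To make the 300MB canvas budget above concrete, the sketch below estimates the backing-store size of one page canvas and evicts least-recently-used pages once the budget is exceeded. It is a rough model, not a measurement: the 4 bytes per pixel (RGBA) figure is a common approximation, real browser and GPU usage varies, and `canvasBytes` and `CanvasBudget` are hypothetical names.

```typescript
// Approximate backing-store size of a 2D canvas, assuming 4 bytes (RGBA) per pixel.
export function canvasBytes(cssWidth: number, cssHeight: number, dpr: number): number {
  const w = Math.floor(cssWidth * dpr);
  const h = Math.floor(cssHeight * dpr);
  return w * h * 4;
}

// Hypothetical LRU tracker: tells the caller which pages to discard.
export class CanvasBudget {
  private maxBytes: number;
  private used = 0;
  // Map preserves insertion order, so the first entry is the least recently used.
  private pages = new Map<number, number>();

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  // Record that a page was rendered or viewed; returns page indexes to evict.
  touch(pageIndex: number, bytes: number): number[] {
    const previous = this.pages.get(pageIndex);
    if (previous !== undefined) {
      this.used -= previous;
      this.pages.delete(pageIndex);
    }
    this.pages.set(pageIndex, bytes);
    this.used += bytes;

    const evicted: number[] = [];
    for (const [index, size] of Array.from(this.pages.entries())) {
      if (this.used <= this.maxBytes) break;
      if (index === pageIndex) continue; // never evict the page just touched
      this.pages.delete(index);
      this.used -= size;
      evicted.push(index);
    }
    return evicted;
  }

  get usedBytes(): number {
    return this.used;
  }
}
```

As a worked figure under the same assumption, a US Letter page (612 x 792 points) at 150% zoom is 918 x 1188 CSS pixels; at a device pixel ratio of 2 that is 1836 x 2376 physical pixels, or 17,449,344 bytes (about 16.6 MiB). Taking 300MB as 300 x 1024 x 1024 bytes, roughly 18 such canvases fit in the budget.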
---
Design Summary
Concise Summary
The Client-Side PDF Viewer leverages a decoupled, multi-layered architecture where a virtualized list renders individual pages. Each page consists of three layered views: a background Canvas Layer for PDF rendering (handled via Web Workers), a Middle Text Layer for native browser selection, and a top SVG Annotation Layer for capturing and drawing annotations, all synchronized through a central normalized coordinate store.
Major Components
PDFViewerContainer: The high-level Orchestrator component that initiates file parsing, handles the Toolbar system state, and coordinates dynamic scale and rotation configurations.
Toolbar: The presentation component managing the active tool state, layout zoom actions, and global undo/redo command triggers.
VirtualizedPageViewport: A bi-directional virtualization viewport wrapper that tracks scroll offsets and calculates active layouts to mount only visible pages.
PDFPage: The localized container representing a single PDF page viewport, managing coordinates and lifecycle events of its sub-layers.
CanvasLayer: A canvas component that invokes the offscreen Web Worker to rasterize page buffers based on the active scale and device pixel ratio.
TextSelectionLayer: A DOM-aligned layer that overlays invisible HTML text nodes returned by the PDF engine, allowing native-like text search and highlighting.
SVGAnnotationLayer: An SVG overlay container that hosts vector elements (paths, rectangles) and interactive text areas representing individual annotations, listening to coordinate-mapped user interactions.
CUJ Walkthrough
Loading a Document: The user selects a 100-page PDF. PDFViewerContainer calls the PDFJSWorker to load the file structure. VirtualizedPageViewport calculates that only pages 1 and 2 are visible, rendering those instances of PDFPage.
Adding a Freehand Annotation: The user selects the "Pen" tool from the Toolbar. They drag their cursor across Page 1. The SVGAnnotationLayer captures these mouse events, translates screen coordinates (x, y) into normalized PDF space coordinates (X_{pdf}, Y_{pdf}) at 72 DPI, and displays the progressive path dynamically inside an SVG <path> node. On mouse up, the path is committed to the Global Annotation Store.
Simplicity Audit
This architecture utilizes an HTML5 SVG element layered over standard 2D canvas pages instead of a complex WebGL engine. It implements coordinate scaling natively through CSS transform and SVG viewBox structures, which is simple, standard-compliant, and avoids custom vector engines.
Architecture Decision Rationale
Why this architecture is the best for this problem: The decoupled layering of visual assets (Canvas), physical text (TextLayer), and interactive vector drawings (SVG Layer) means each component has a single responsibility.
Requirement Satisfaction: It keeps performance high (canvas unmounting avoids memory bloat), provides accurate selection (using real browser text nodes in the middle layer), and handles drawings crisply without pixelation (via resolution-independent SVGs on top).
---
System Diagram
---
Architecture Deep Dive
Presentation Layer
```text
+-------------------------------------------------------------+
|                     PDFViewerContainer                      |
|  +-------------------------------------------------------+  |
|  |                        Toolbar                        |  |
|  +-------------------------------------------------------+  |
|  |                VirtualizedPageViewport                |  |
|  |  +-------------------------------------------------+  |  |
|  |  |                     PDFPage                     |  |  |
|  |  |  +-------------------------------------------+  |  |  |
|  |  |  |            SVGAnnotationLayer             |  |  |  |
|  |  |  +-------------------------------------------+  |  |  |
|  |  |  |            TextSelectionLayer             |  |  |  |
|  |  |  +-------------------------------------------+  |  |  |
|  |  |  |                CanvasLayer                |  |  |  |
|  |  |  +-------------------------------------------+  |  |  |
|  |  +-------------------------------------------------+  |  |
|  +-------------------------------------------------------+  |
+-------------------------------------------------------------+
```
Component Hierarchy
The presentation layer strictly enforces a nested structure:
PDFViewerContainer (Outer App Shell): Manages global styles, file selection interfaces, error boundary states, and injects state providers down the tree.
MainLayout: Houses layout boundaries, sidebar navigation (thumbnails, bookmark list, annotation lists), and the toolbar.
Toolbar Component: Renders controls for scale adjustment, tools, undo/redo, and action menus.
VirtualizedPageViewport (Feature Container): Evaluates page size metadata, dynamically renders calculated container blocks, and keeps only visible pages mounted.
PDFPage (Leaf Container): Controls the mounting of individual page elements, listening to viewport intersection updates.
CanvasLayer, TextSelectionLayer, SVGAnnotationLayer (Leaf Components): Three stacked absolute siblings that map onto the calculated dimension bounds.
Interaction Layer
Coordinate Mapping: Every touch or click event must be translated from screen coordinate space (x_{screen}, y_{screen}) into standardized PDF space coordinates (X_{pdf}, Y_{pdf}) at 72 DPI.
\begin{bmatrix} X_{pdf} \\ Y_{pdf} \end{bmatrix} = \begin{bmatrix} \text{scale} \cdot \cos(\theta) & -\text{scale} \cdot \sin(\theta) \\ \text{scale} \cdot \sin(\theta) & \text{scale} \cdot \cos(\theta) \end{bmatrix}^{-1} \left( \begin{bmatrix} x_{screen} \\ y_{screen} \end{bmatrix} - \begin{bmatrix} \text{offsetLeft} \\ \text{offsetTop} \end{bmatrix} \right)
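The matrix inverse above has a closed form, because the inverse of a uniform scale times a rotation is the transposed rotation divided by the scale. The following is a minimal sketch of that mapping and its forward counterpart; `PageViewportState`, `screenToPdf`, and `pdfToScreen` are hypothetical names, and like the equation it assumes a top-left origin (it does not model the bottom-left origin of native PDF user space).

```typescript
// Hypothetical viewport description matching the symbols in the equation.
export interface PageViewportState {
  scale: number;       // zoom factor
  rotationDeg: number; // page rotation (theta) in degrees
  offsetLeft: number;  // page origin on screen, CSS pixels
  offsetTop: number;
}

export interface Point {
  x: number;
  y: number;
}

// Screen -> PDF space: subtract the offset, then apply (scale * R(theta))^-1.
export function screenToPdf(xScreen: number, yScreen: number, vp: PageViewportState): Point {
  const theta = (vp.rotationDeg * Math.PI) / 180;
  const cos = Math.cos(theta);
  const sin = Math.sin(theta);
  const dx = xScreen - vp.offsetLeft;
  const dy = yScreen - vp.offsetTop;
  return {
    x: (cos * dx + sin * dy) / vp.scale,
    y: (-sin * dx + cos * dy) / vp.scale,
  };
}

// PDF space -> screen: apply scale * R(theta), then add the offset.
export function pdfToScreen(xPdf: number, yPdf: number, vp: PageViewportState): Point {
  const theta = (vp.rotationDeg * Math.PI) / 180;
  const cos = Math.cos(theta);
  const sin = Math.sin(theta);
  return {
    x: vp.scale * (cos * xPdf - sin * yPdf) + vp.offsetLeft,
    y: vp.scale * (sin * xPdf + cos * yPdf) + vp.offsetTop,
  };
}
```

Keeping both directions as pure functions makes a round-trip test trivial: converting a point to screen space and back should return the original point within floating-point tolerance.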
Pointer Event Normalization: PointerEvents (pointerdown, pointermove, pointerup) unified with touch event targets enable high-fidelity stylus integration with pressure-sensitive brush widths and automated palm-rejection logic.
Accessibility & Focus: Visual annotation overlays are paired with keyboard accessibility. When a user navigates via the Tab key, focus frames land on interactive SVG elements or editable Textboxes. Dynamic screen reader messages describe annotations (e.g., "Highlight on Page 4 containing text: Systems Architecture").
Rendering Layer
CSR vs SSR vs Hybrid: Client-side rendering is used exclusively for the interactive workspace to avoid rendering massive interactive documents server-side.
Virtualization Logic: The viewport tracks scroll position and checks intersections. Unmounted pages release their <canvas> backing memory, preventing memory-exhaustion crashes.
Retina Display Scaling: To prevent blurriness, the page canvas is scaled by the device pixel ratio (DPR):
\text{CanvasWidth}_{physical} = \text{PageWidth}_{CSS} \times \text{Scale} \times DPR
This matches physical device pixels to CSS coordinates, keeping text sharp on Retina displays.
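As a sketch of the formula above, the helper below computes both the physical backing-store size and the CSS display size for a page canvas. `computeCanvasSize` and `CanvasSize` are hypothetical names; in a browser the `dpr` argument would typically come from `window.devicePixelRatio`.

```typescript
export interface CanvasSize {
  width: number;       // physical pixels, assigned to canvas.width
  height: number;      // physical pixels, assigned to canvas.height
  styleWidth: number;  // CSS pixels, assigned to canvas.style.width
  styleHeight: number; // CSS pixels, assigned to canvas.style.height
}

// Physical size = CSS page size x zoom scale x device pixel ratio.
export function computeCanvasSize(
  pageWidthCss: number,
  pageHeightCss: number,
  scale: number,
  dpr: number
): CanvasSize {
  const styleWidth = pageWidthCss * scale;
  const styleHeight = pageHeightCss * scale;
  return {
    // Floor to whole pixels because canvas dimensions must be integers.
    width: Math.floor(styleWidth * dpr),
    height: Math.floor(styleHeight * dpr),
    styleWidth,
    styleHeight,
  };
}

// Example: a 612 x 792 page at 150% zoom on a DPR 2 display.
const size = computeCanvasSize(612, 792, 1.5, 2);
console.log(size.width, size.height, size.styleWidth, size.styleHeight);
// prints: 1836 2376 918 1188
```

The canvas is then drawn at the physical size while CSS keeps it at the display size, so each CSS pixel is backed by DPR x DPR device pixels.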
UI Frameworks / Tools
Tailwind CSS: Utility-first CSS classes keep styling co-located with components and the shipped stylesheet small.
Radix UI Primitives / Headless UI: Unstyled components provide standards-compliant accessible controls (modals, dropdowns, keyboard focus traps).
---
Application Layer
Data Fetching Layer
Byte-Range Requests: Utilizes fetch requests with standard Range: bytes=X-Y headers to pull index information and render specific target pages without downloading the entire file.
Parallel Fetching & Parsing: The network manager downloads standard chunks in the background while the background worker processes layout blocks in parallel.
Offline Sync & Resilience: If the network goes offline, requests queue up locally. An exponential backoff background sync engine pushes local changes back to the cloud database when connections resume.
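The byte-range fetching described in this layer can be illustrated with a small planner that splits a file into chunks and formats the corresponding Range header values. This is a sketch under assumptions: `planChunks`, `rangeHeader`, and the chunk size are hypothetical, and pdf.js ships its own range-loading logic that a real implementation would more likely configure than reimplement.

```typescript
export interface ByteRange {
  start: number; // first byte, inclusive
  end: number;   // last byte, inclusive (HTTP Range ends are inclusive)
}

// Format a value for the HTTP Range request header.
export function rangeHeader(range: ByteRange): string {
  return `bytes=${range.start}-${range.end}`;
}

// Split a file of known size into fixed-size chunks.
export function planChunks(fileSize: number, chunkSize: number): ByteRange[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const ranges: ByteRange[] = [];
  for (let start = 0; start < fileSize; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize, fileSize) - 1 });
  }
  return ranges;
}

// Example: a 250-byte file in 100-byte chunks.
console.log(planChunks(250, 100).map(rangeHeader));
// prints: [ 'bytes=0-99', 'bytes=100-199', 'bytes=200-249' ]
```

Each value would be sent as the Range header of a fetch request; a server that honors it answers with 206 Partial Content, while a server that ignores it returns the whole file with 200.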
State Management Layer
Decentralized State Partitioning: Global state is divided into independent channels:
```typescript
interface AppState {
  document: DocumentMetadata;
  viewport: { zoom: number; rotation: number; activePage: number };
  annotations: Record<string, PageAnnotations>; // Fast access by page key
  transientStroke: Stroke | null; // Drawing path before mouse-up commit
  history: UndoRedoHistory;
}
```
Undo/Redo Command Pattern: Implements a strict Command history interface. Every addition, deletion, or modification records a reversible execution state:
```typescript
interface Command {
  execute(): void;
  undo(): void;
}
```
Routing & Navigation
URL-Based State Syncing: Syncs the document ID, current page number, and zoom level directly to the URL parameters (e.g., /viewer/doc-123#page=12&zoom=1.5). This allows deep-linking to specific pages within documents.
---
Domain Layer
Business Rules & Use Cases
Annotation Validation Rules: Dictates absolute bounds constraints (annotations cannot be drawn or moved outside of the physical boundary coordinate system of the target PDF page).
Coordinate Transformation Core: Encapsulates coordinate conversion algorithms outside of UI frameworks, keeping the coordinate engine testable and decoupled.
Text Range Anchoring: Calculates highlights using permanent, layout-independent text offsets (e.g., character start/end positions within page streams) rather than absolute coordinates. This ensures highlighting is robust against page-size modifications and layout changes.
```typescript
export interface HighlightAnchor {
  pageIndex: number;
  startOffset: number;
  endOffset: number;
  text: string;
}
```
Entities & Models
Annotation Entity Schema: Standardized structures representing annotation variants:
```typescript
export type AnnotationType = 'highlight' | 'freehand' | 'shape' | 'textbox';

export interface BaseAnnotation {
  id: string;
  type: AnnotationType;
  pageIndex: number;
  color: string;
  authorId: string;
  createdAt: number;
}

export interface HighlightAnnotation extends BaseAnnotation {
  type: 'highlight';
  rects: Array<{ x: number; y: number; width: number; height: number }>;
  anchor: HighlightAnchor;
}

export interface FreehandAnnotation extends BaseAnnotation {
  type: 'freehand';
  points: Array<{ x: number; y: number; pressure: number }>;
}
```
---
Infrastructure Layer
API / Network
REST & WebSocket API Gateway: Employs normal HTTPS REST APIs for batch saving and loading of annotation files, with real-time WebSocket syncing for collaborative annotation environments.
Request Deduplication: Guarantees identical document chunk requests are combined, protecting the backend service against duplicate, concurrent client requests.
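One common way to combine identical concurrent requests is to share a single in-flight promise per key. The sketch below shows that idea; `RequestDeduplicator` and the key format are hypothetical, and it deduplicates only while a request is pending (it is not a cache).

```typescript
// Hypothetical in-flight request deduplicator keyed by an arbitrary string,
// for example `${documentId}:${start}-${end}` for a document chunk.
export class RequestDeduplicator<T> {
  private inFlight = new Map<string, Promise<T>>();

  // Runs `task` unless a request with the same key is already pending,
  // in which case the pending promise is returned instead.
  run(key: string, task: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing;

    const pending = task().finally(() => {
      // Forget the request once it settles so later calls fetch fresh data.
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, pending);
    return pending;
  }

  get pendingCount(): number {
    return this.inFlight.size;
  }
}
```

Two components that ask for the same chunk at the same time receive the same promise, so only one network request reaches the backend.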
Storage
IndexedDB via Dexie.js: Stores the parsed binary documents and locally changed annotations. This provides quick startup times and supports full offline functionality.
Canvas Context Recycler: Maintains a pool of allocated, unused canvas components to avoid memory fragmentation and high garbage collection overhead.
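The recycler described above can be sketched as a small generic pool. The pool is written against a type parameter so the sketch runs without a DOM; `ResourcePool` and its callbacks are hypothetical names, and the browser wiring in the note after the code is an assumption.

```typescript
// Hypothetical generic pool; in the browser T would be HTMLCanvasElement.
export class ResourcePool<T> {
  private free: T[] = [];
  private create: () => T;
  private reset: (item: T) => void;
  private maxFree: number;

  constructor(create: () => T, reset: (item: T) => void, maxFree = 8) {
    this.create = create;
    this.reset = reset;
    this.maxFree = maxFree;
  }

  // Reuse an idle item when one exists, otherwise allocate a new one.
  acquire(): T {
    const item = this.free.pop();
    return item !== undefined ? item : this.create();
  }

  // Reset the item and keep it for reuse unless the pool is already full.
  release(item: T): void {
    this.reset(item);
    if (this.free.length < this.maxFree) {
      this.free.push(item);
    }
  }

  get idleCount(): number {
    return this.free.length;
  }
}
```

In a browser this might be constructed with `() => document.createElement('canvas')` as the factory and a reset callback that sets the canvas width and height to 0, which lets the browser drop the backing store while the element itself is reused.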
---
Wrap Up
Evaluation & Trade-offs
Canvas vs SVG for Annotation Drawing: We use SVG layers for the editing surface instead of rendering directly to a canvas.
Trade-off: SVGs add DOM elements, which can reduce performance if there are thousands of shapes. However, SVGs provide cleaner interaction handling (resizing handles, drag-and-drop, focus control) and sharp vector scaling when zooming. This approach meets the performance target when combined with virtualization that clears offscreen elements.
Web Worker Parsers: Passing binary arrays across the worker boundary carries copying overhead.
Mitigation: We pass structured PDF data using transferable objects (ArrayBuffers), transferring memory ownership to the main UI thread without cloning costs.
Optimization Matrix
| Bottleneck | Architectural Solution | Implementation Detail |
|---|---|---|
| Garbage Collection Jank | Canvas Pool Recycling | Reuses canvas elements instead of destroying and recreating them on scroll. |
| Stylus Inaccuracy & Jitter | Catmull-Rom Spline Interpolation | Smooths freehand vectors dynamically as points are collected. |
| Main Thread Freezing | OffscreenCanvas Transfer | Runs rasterization operations inside a separate Web Worker thread. |
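The spline row in the matrix above can be illustrated with a uniform Catmull-Rom interpolator, which passes through every captured point and fills in intermediate samples between them. This is a minimal sketch: `catmullRom` and `samplesPerSegment` are hypothetical names, the endpoints are duplicated to supply the missing neighbors, and pressure values are left out for brevity.

```typescript
export interface StrokePoint {
  x: number;
  y: number;
}

// Uniform Catmull-Rom basis for one coordinate on the segment p1 -> p2.
function interpolate(p0: number, p1: number, p2: number, p3: number, t: number): number {
  const t2 = t * t;
  const t3 = t2 * t;
  return 0.5 * (
    2 * p1 +
    (-p0 + p2) * t +
    (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2 +
    (-p0 + 3 * p1 - 3 * p2 + p3) * t3
  );
}

// Returns a denser polyline that still passes through every input point.
export function catmullRom(points: StrokePoint[], samplesPerSegment = 4): StrokePoint[] {
  if (points.length < 2) return points.slice();

  const out: StrokePoint[] = [];
  const lastIndex = points.length - 1;
  for (let i = 0; i < lastIndex; i++) {
    // Clamp neighbor indexes so the first and last points act as their own neighbors.
    const p0 = points[Math.max(0, i - 1)];
    const p1 = points[i];
    const p2 = points[i + 1];
    const p3 = points[Math.min(lastIndex, i + 2)];
    for (let s = 0; s < samplesPerSegment; s++) {
      const t = s / samplesPerSegment;
      out.push({
        x: interpolate(p0.x, p1.x, p2.x, p3.x, t),
        y: interpolate(p0.y, p1.y, p2.y, p3.y, t),
      });
    }
  }
  out.push({ x: points[lastIndex].x, y: points[lastIndex].y });
  return out;
}
```

For n input points the result has (n - 1) x samplesPerSegment + 1 points, which can be joined into the d attribute of an SVG <path>. Because each segment depends only on four neighboring points, new samples can be appended as pointer events arrive without recomputing the whole stroke.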
---