
Virtual scrolling explained

Harbor Analytics’ settlement ledger view loaded 48,000 transaction rows into a React table. Chrome’s main thread blocked for 4.2 seconds on first paint; scroll jank hit 280 ms frames and INP failed field thresholds. Replacing naive rows.map() rendering with virtual scrolling (painting only the ~25 rows visible in the viewport plus a small buffer) cut initial render to 90 ms and kept scroll frames under 16 ms even as the dataset grew.

Virtual scrolling (also called windowing or list virtualization) is the technique of mounting DOM nodes for visible items only, positioning them with transforms or top offsets inside a tall scroll container, and recycling nodes as the user scrolls.

This guide explains how windowing works, fixed versus variable row heights, overscan and scroll anchoring, library patterns in TanStack Virtual and react-window, pairing virtualization with API pagination, a Harbor Analytics ledger worked example, a decision table, common pitfalls, and a production checklist.

Why naive lists break at scale

Browsers are fast, but each DOM node has a cost: layout, style recalculation, paint, and memory for event listeners and React fiber trees. A data grid with 50 columns and 50,000 rows is not 50,000 × “one div” — it is millions of layout calculations on every resize and a garbage-collection pressure spike when filters change.

Common symptoms of an unvirtualized list:

  • Long task on mount — the main thread is busy for seconds before the user can interact.
  • Scroll jank — frames exceed 16 ms because the browser reconciles thousands of off-screen nodes.
  • Memory growth — mobile Safari kills tabs holding huge tables.
  • Filter/sort stalls — re-rendering the full list on every keystroke blocks input.

Pagination solves part of the problem by limiting rows per request, but product teams often still want infinite scroll through a large in-memory or prefetched dataset. Virtual scrolling keeps the UX of a long list without paying the full DOM price.

How virtual scrolling works

The core idea is simple: treat the scroll container as a viewport sliding over a logical list whose total height is computed, not rendered.

  1. Measure the viewport — height (or width for horizontal lists) of the scrollable element.
  2. Track scroll offset: scrollTop from scroll events or a scroll observer.
  3. Compute visible range — which item indices intersect the viewport, e.g. items 412–437 of 48,000.
  4. Render only that slice — mount row components for visible indices; position each with transform: translateY(...) or absolute top inside a spacer element whose height equals totalItems × rowHeight (fixed height) or summed measured heights (variable).
  5. Recycle on scroll — as indices leave the viewport, unmount or reuse row components for newly visible indices.

The user sees a scrollbar that behaves like a full list; the browser paints a constant small number of rows regardless of dataset size.
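The fixed-height case of the steps above can be sketched as a single pure function. This is a minimal illustration, not any library's implementation: the names computeWindowRange and WindowRange are hypothetical, and clamping scrollTop to the scrollable range is an added assumption to keep indices valid during overscroll.

```typescript
// Fixed-height windowing math: which rows intersect the viewport?
// All names here are illustrative, not taken from a library.
interface WindowRange {
  startIndex: number;  // first visible row (inclusive)
  endIndex: number;    // last visible row (inclusive); -1 when the list is empty
  offsetTop: number;   // translateY for the first rendered row, in px
  totalHeight: number; // height of the spacer element, in px
}

function computeWindowRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalItems: number,
): WindowRange {
  const totalHeight = totalItems * rowHeight;
  if (totalItems === 0) {
    return { startIndex: 0, endIndex: -1, offsetTop: 0, totalHeight };
  }
  // Assumption: clamp so overscroll cannot produce out-of-range indices.
  const maxScroll = Math.max(0, totalHeight - viewportHeight);
  const clamped = Math.min(Math.max(scrollTop, 0), maxScroll);
  // O(1) index math: no per-row measurement is needed.
  const startIndex = Math.floor(clamped / rowHeight);
  // Last row whose top edge sits above the viewport's bottom edge.
  const endIndex = Math.min(
    totalItems - 1,
    Math.ceil((clamped + viewportHeight) / rowHeight) - 1,
  );
  return { startIndex, endIndex, offsetTop: startIndex * rowHeight, totalHeight };
}
```

Assuming 52 px rows in a 1,300 px viewport, a scrollTop of 21,430 yields indices 412 to 437, while the spacer stays 2,496,000 px tall for 48,000 rows.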

Overscan

Rendering exactly the visible rows causes blank flashes during fast flick scrolling. Overscan renders a few extra rows above and below the viewport (often 3–10) so the next frame is already painted. Overscan trades a small amount of extra DOM for smoother motion — tune it per device; low-end phones may need less overscan to protect memory.
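As a rough sketch of the idea, overscan amounts to widening the visible index range and clamping it to the list bounds. The function name expandWithOverscan and the OverscanRange shape are hypothetical, chosen only for this illustration.

```typescript
// Widen a visible range by `overscan` rows on each side, clamped to the list.
// Illustrative helper; real libraries fold this into their range calculation.
interface OverscanRange {
  first: number; // first row to render (inclusive)
  last: number;  // last row to render (inclusive); -1 when nothing is rendered
}

function expandWithOverscan(
  startIndex: number,
  endIndex: number,
  overscan: number,
  totalItems: number,
): OverscanRange {
  if (totalItems === 0 || endIndex < startIndex) {
    return { first: 0, last: -1 };
  }
  return {
    first: Math.max(0, startIndex - overscan),
    last: Math.min(totalItems - 1, endIndex + overscan),
  };
}
```

With 25 visible rows and an overscan of 5, at most 35 rows are mounted at any scroll position, and fewer at the very top or bottom of the list where the range is clamped.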

Fixed vs variable row heights

  • Fixed height — every row is 48 px (or uniform grid cells). Index math is O(1): startIndex = floor(scrollTop / rowHeight). Simplest and fastest.
  • Variable height — chat messages, wrapped text, expandable rows. Requires a height cache per index, estimated heights until measured, and re-measurement when content changes. Libraries like TanStack Virtual expose measureElement refs for this path.
  • Sticky headers / grouped sections — section headers that pin while their group scrolls add index offset math; most libraries support group keys or custom range extractors.
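For the variable-height path, one common approach (assumed here, not necessarily what any given library does internally) is to keep cumulative offsets built from measured heights, fall back to an estimate for unmeasured rows, and binary-search the offsets for the first visible row. The names buildOffsets and findStartIndex are hypothetical.

```typescript
// Cumulative offsets: offsets[i] is the top edge of row i,
// and offsets[count] is the total list height.
function buildOffsets(
  count: number,
  measured: Map<number, number>, // height cache: row index -> measured px
  estimate: number,              // fallback height until a row is measured
): number[] {
  const offsets: number[] = [0];
  let running = 0;
  for (let i = 0; i < count; i++) {
    running += measured.get(i) ?? estimate;
    offsets.push(running);
  }
  return offsets;
}

// Binary search for the last row whose top edge is at or above scrollTop.
// O(log n) instead of the O(1) division available with fixed heights.
function findStartIndex(offsets: number[], scrollTop: number): number {
  let lo = 0;
  let hi = offsets.length - 2; // index of the last row
  if (hi < 0) return 0;        // empty list
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2); // upper midpoint so the loop terminates
    if (offsets[mid] <= scrollTop) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

Rebuilding the whole offsets array after each measurement is O(n); this sketch accepts that cost for clarity, whereas a production cache would typically update only the offsets after the changed index.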

Virtual scrolling vs pagination vs infinite load

These patterns complement each other; they solve different bottlenecks.

  • Server pagination limits bytes over the wire and database work. See API pagination explained for offset vs cursor patterns.
  • Infinite scroll / fetch-on-scroll appends pages as the user approaches the end — still needs virtualization once the client holds thousands of rows.
  • Virtual scrolling limits DOM and layout cost for whatever rows are already in memory.

Production pattern: cursor-paginated API + TanStack Query useInfiniteQuery for data + TanStack Virtual for rendering. Debounce scroll-end detection when firing the next page fetch (see debouncing and throttling).
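The scroll-end check in that pattern can be sketched as a pure predicate. Everything here is illustrative: shouldFetchNextPage, ScrollSnapshot, and PagingState are hypothetical names, the threshold is a caller-supplied assumption, and the hasNextPage and isFetchingNextPage fields are modeled on the flags TanStack Query's useInfiniteQuery returns. The in-flight guard plays a role similar to debouncing by suppressing repeat triggers while a page is loading.

```typescript
// Snapshot of the scroll element, read inside a scroll handler.
interface ScrollSnapshot {
  scrollTop: number;
  clientHeight: number;
  scrollHeight: number;
}

// Paging flags, assumed to come from the data-fetching layer.
interface PagingState {
  hasNextPage: boolean;
  isFetchingNextPage: boolean;
}

function shouldFetchNextPage(
  scroll: ScrollSnapshot,
  paging: PagingState,
  thresholdPx: number,
): boolean {
  // Never fire when there is nothing left or a request is already in flight.
  if (!paging.hasNextPage || paging.isFetchingNextPage) return false;
  const distanceToBottom =
    scroll.scrollHeight - (scroll.scrollTop + scroll.clientHeight);
  return distanceToBottom <= thresholdPx;
}
```

Keeping the decision in a pure function makes it straightforward to unit-test without a DOM; the scroll handler only reads the three numbers and calls the fetch when the predicate returns true.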

Implementation patterns

TanStack Virtual (framework-agnostic)

@tanstack/react-virtual (and solid/vue/svelte adapters) is the modern default. You pass count, getScrollElement, and estimateSize to useVirtualizer; the virtualizer it returns exposes getVirtualItems(), whose items carry index, start, and size for positioning, and getTotalSize() for the spacer height. Pair with TanStack Query when rows come from paginated APIs.

react-window / react-virtualized

Older but battle-tested. In react-window 1.x, FixedSizeList takes a numeric itemSize and VariableSizeList takes an itemSize(index) callback; both render a row component for each visible index. Less flexible than TanStack Virtual for custom scroll parents but fine for standard vertical lists.

Native CSS content-visibility

For moderately large static lists, content-visibility: auto lets the browser skip layout for off-screen subtrees without full windowing logic. It does not reduce React render work — only browser layout/paint. Use virtualization when React reconciliation itself is the bottleneck.

Canvas / WebGL tables

Spreadsheet-grade grids (millions of cells) sometimes render to canvas for uniform cells. Accessibility and text selection suffer; prefer DOM virtualization unless profiling proves canvas is necessary.

Worked example: Harbor Analytics settlement ledger

Harbor’s ops team filters a 48,000-row ledger by merchant, date range, and status. Requirements: sub-100 ms filter response, smooth scroll, keyboard row focus, and export of filtered results (not just visible rows).

Architecture:

  1. Server returns cursor-paginated JSON; client accumulates pages in a normalized store keyed by transaction ID (dedupe on merge).
  2. Filter runs in a Web Worker over the in-memory index so typing in the search box does not block the main thread.
  3. Filtered IDs feed useVirtualizer with count = filteredIds.length and fixed estimateSize: 52.
  4. Each visible row looks up transactions[id] from client state — rows are presentational; virtualization does not own data.
  5. Scroll within 800 px of the bottom triggers the next cursor page unless hasNextPage is false.
  6. Export serializes the filtered ID list server-side via POST — not the 25 visible DOM nodes.

Results: first contentful paint with data dropped from 4.2 s to 0.09 s; 95th-percentile scroll frame 11 ms on a 2020 MacBook Air; INP on row click improved because fewer layout-invalidating descendants existed.
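Steps 1 to 4 of the architecture can be sketched as a small normalized store. This is an illustration under assumptions rather than Harbor's actual code: the LedgerTxn fields, the LedgerStore shape, and the mergePage and filterIds helpers are all hypothetical. In the described design the filter would run inside a Web Worker; here it is a plain function so the logic can be read and tested in isolation.

```typescript
// Hypothetical row shape; the real ledger fields are not given in the source.
interface LedgerTxn {
  id: string;
  merchant: string;
  status: string;
}

// Normalized store: rows keyed by ID, plus arrival order for stable listing.
interface LedgerStore {
  byId: Map<string, LedgerTxn>;
  order: string[];
}

// Merge one cursor page into the store, deduplicating on transaction ID.
function mergePage(store: LedgerStore, page: LedgerTxn[]): LedgerStore {
  const byId = new Map(store.byId);
  const order = [...store.order];
  for (const txn of page) {
    if (!byId.has(txn.id)) order.push(txn.id); // first sighting fixes position
    byId.set(txn.id, txn);                     // a newer copy replaces the old one
  }
  return { byId, order };
}

// Produce the filtered ID list that feeds the virtualizer's `count`.
function filterIds(
  store: LedgerStore,
  predicate: (txn: LedgerTxn) => boolean,
): string[] {
  return store.order.filter((id) => {
    const txn = store.byId.get(id);
    return txn !== undefined && predicate(txn);
  });
}
```

The virtualizer then only needs filteredIds.length and an index-to-ID lookup, which keeps rows presentational and lets export work from the same ID list rather than from the DOM.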

Decision table: when to virtualize

Scenario | Recommended approach | Why
< 100 simple rows | Render all | Virtualization overhead exceeds benefit
500–5,000 uniform rows | Virtual scroll (fixed height) | DOM cost dominates; math is trivial
Chat / feed with variable heights | Virtual scroll + measureElement | Need height cache; consider scroll anchoring when prepending
Millions of rows, never all needed client-side | Server pagination only | Do not download what you will not show; virtualize each page slice
Data grid with sorting on all columns | Virtual rows + server-side sort | Client sort of 100k rows blocks the worker/main thread
SEO-critical HTML list | Paginated static pages | Virtualized client lists are invisible to crawlers
Mobile chat infinite history | Virtual + prepend anchor | Loading older messages must preserve scroll position

Common pitfalls

  • Virtualizing without measuring. Assuming variable rows are 48 px when they wrap to 120 px causes scroll jump and wrong scrollbar thumb size.
  • Re-creating row components each scroll. Use a stable key tied to data identity (key={item.id}), not the row index; otherwise focus and selection state break.
  • Ignoring accessibility. Arrow-key navigation and screen-reader row counts need explicit aria-rowcount and roving tabindex — virtual lists are not inaccessible by default but require intentional patterns.
  • Nested scroll containers. Two scrollable parents fight over wheel events; pick one scroll element and pass it to the virtualizer.
  • Exporting visible rows only. Users expect CSV of the filtered set, not the viewport slice.
  • Virtualizing while filtering on main thread. Move heavy filter/sort to a worker or server; virtualization does not fix O(n) data processing.
  • Prepend without scroll anchoring. Loading chat history above the viewport jumps the user unless you adjust scrollTop by the height of inserted items.
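The prepend adjustment in the last pitfall is simple arithmetic, sketched here as a pure function. The names ViewAnchor and anchorAfterPrepend are hypothetical, and the sketch assumes the heights of the inserted items are already known (measured or estimated) when the adjustment is applied.

```typescript
// Where the user is looking, before or after a prepend.
interface ViewAnchor {
  scrollTop: number;         // px offset of the scroll container
  firstVisibleIndex: number; // index of the topmost visible item
}

// After inserting items above the viewport, shift scrollTop by their combined
// height and shift the index by their count, so the same item stays in place.
function anchorAfterPrepend(
  before: ViewAnchor,
  insertedHeights: number[],
): ViewAnchor {
  const addedHeight = insertedHeights.reduce((sum, h) => sum + h, 0);
  return {
    scrollTop: before.scrollTop + addedHeight,
    firstVisibleIndex: before.firstVisibleIndex + insertedHeights.length,
  };
}
```

If the inserted heights were only estimates, the same correction would need to be reapplied once real measurements arrive, by the difference between measured and estimated totals.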

Production checklist

  • Profile with DevTools Performance: confirm DOM count and long tasks, not just FPS.
  • Choose fixed vs variable height strategy before picking a library API.
  • Set overscan (3–10 rows) and test on a low-end Android device.
  • Pair client virtualization with cursor pagination for datasets larger than memory allows.
  • Keep row components pure; pass IDs and select data from a store to limit re-renders.
  • Implement keyboard navigation and announce total row count to assistive tech.
  • Preserve scroll position when prepending or when filter results shrink.
  • Move heavy filter/sort work off the main thread when n > 10,000.
  • Verify INP on row click after virtualization — fewer nodes should help, not hurt.
  • Document export and bulk-action behavior independent of visible window.

Key takeaways

  • Virtual scrolling renders only viewport rows — holding DOM count constant as logical list size grows.
  • Fixed-height rows are simplest — variable heights need measurement caches and scroll anchoring discipline.
  • Virtualization complements pagination — it solves layout cost, not network or database scale.
  • Overscan and stable keys prevent flicker and preserve focus during fast scroll.
  • Profile before canvas hacks — TanStack Virtual handles most product tables if data fetching and filtering are also optimized.

Related reading