Explainer · 7 June 2026

How tries and prefix trees work: autocomplete, routing, and radix trees

A trie (pronounced "try," from reTRIEval) — also called a prefix tree — stores strings by sharing common prefixes along a path of nodes instead of treating each key as an opaque blob. If your dataset contains cat, car, and card, a trie walks c → a → t for the first word and reuses the c → a segment for the others. That layout makes prefix queries natural: "all keys starting with ca" is a subtree walk, which is exactly what type-ahead autocomplete, spell-check dictionaries, and IP longest-prefix routing need. Tries are not always the fastest structure for arbitrary key lookup — a hash map wins on average-case exact match — but when the question is about shared beginnings of strings, tries are the canonical answer.

Node structure: edges labeled by characters

The simplest trie represents each string as a path from root to a terminal node. Each node holds a map from the next character (or byte) to a child pointer — often an array of size 26 for lowercase English, a sparse hash map for Unicode, or a fixed 256-slot array for raw bytes. A boolean flag (or a stored value) marks whether the path spells a complete key, not just a prefix of one.

Inserting dog creates nodes d → o → g and marks the leaf as terminal. Inserting dot reuses d → o and adds a sibling branch under o for t. Lookup walks one character at a time; if any edge is missing, the key is absent. Time is O(m) where m is key length — independent of how many keys are stored, unlike a sorted array's O(log n) comparisons that still touch many characters per step.

Memory is the catch: a sparse trie with many short shared prefixes can use more space than storing the strings themselves, because each node carries pointer overhead. Dense alphabets (256-byte children) explode quickly unless you compress or use map-backed children only where edges exist.

Prefix search, autocomplete, and ranked suggestions

Autocomplete is a trie traversal: locate the node at the end of the typed prefix, then depth-first or breadth-first collect all terminal descendants. A search box showing "prog" → "program," "progress," "protobuf" is walking the subtree under p → r → o → g. Production systems add ranking (frequency, recency, personalization) on top of the candidate set — the trie supplies candidates in O(number of matches), not O(total dictionary size).

Spell-checkers and keyboard dictionaries use the same idea: generate edits within distance one or two, probe the trie for valid words, and prefer high-frequency entries. Contrast this with an inverted index, which excels at "documents containing token X" across a corpus but is awkward for "all words with this prefix" unless you maintain a separate trie or finite automaton alongside the index.

For purely regular-language pattern matching (not full dictionary semantics), a finite automaton built from a regex can serve a similar role; tries are the simpler structure when keys are literal strings without alternation or quantifiers.

Longest-prefix match: IP routing and ACL tables

Routers do not store every host address individually. They store CIDR prefixes like 203.0.113.0/24 and, for each incoming packet, find the longest matching prefix in the forwarding table. That is a trie walk on bit labels: each level branches on the next bit of the destination IP. A /32 rule is a deep leaf; a /8 rule is a shallow branch that covers everything below unless a longer prefix overrides it.

Access-control lists and policy engines use the same longest-prefix semantics: more specific rules beat broader ones. Hardware switches often implement compressed trie variants in TCAM or specialized ASIC lookup tables because line-rate forwarding cannot afford a hash probe per hop. Software routers (Linux FIB, BGP daemons) use radix-tree structures closely related to Patricia tries.

Radix trees and Patricia tries: compressing sparse paths

A plain character trie wastes nodes when keys share long runs with no branches — think of a single key internationalization creating 20 single-child nodes. A radix tree (or Patricia trie) collapses unbranched chains into one edge labeled with a substring (or bit string). Internal nodes store only branching points; leaf edges carry the compressed segment.

Linux kernel's radix_tree (and the newer xarray) and many URI routers use this compression. Insert and lookup compare whole edge labels at once and may split an edge when a new key diverges mid-label — similar in spirit to B-tree node splits, though trie splits are along string boundaries. The result is fewer pointer chases and better cache behavior on long shared prefixes.

Succinct tries and LOUDS-encoded trees go further for read-heavy static dictionaries: rank/select structures on a bit vector emulate child navigation with less RAM than explicit objects — common in genomics and static search indexes where the trie never mutates after build time.

Tries vs hash maps, B-trees, and skip lists

Exact lookup: hash maps average O(1) with a good hash; tries are O(m). For fixed short keys (UUIDs, 32-byte pubkeys), hashing usually wins.

Ordered iteration and prefixes: hash maps cannot list "all keys starting with X" without scanning everything. Tries expose ordered walks if children are sorted. Skip lists and B-trees support range scans on sorted keys but do not share prefix paths — storing foobar and foobaz duplicates the entire fooba string in each key slot.

Worst-case guarantees: unlike hash tables, tries have no collision chains and no rehash pauses — predictable latency matters for real-time autocomplete and kernel routing.

Database indexes on text columns often combine a B-tree on the full value with application-side trie caches for prefix UI; see database indexing for how B-trees differ from trie-shaped auxiliary structures.

Production examples and implementation notes

  • RedisKEYS prefix* is a full scan; production code uses SCAN or maintains a trie-like index externally. Sorted sets help range queries but are not tries.
  • DNS resolvers — domain names are hierarchical labels read right-to-left; resolver caches and negative caching layers often use tree structures keyed by label segments (related to DNS resolution).
  • Ethereum/Solana address books — human-readable name services (ENS, SNS) resolve names to addresses; reverse lookups and substring search may use trie backends for prefix browse UIs.
  • TypeScript/JavaScript — no trie in the standard library; npm packages (mnemonist, trie-search) or rolling your own with Map children is typical for autocomplete widgets.

When implementing, prefer Map (or sorted arrays for tiny alphabets) over fixed 256-length arrays unless profiling shows array indexing wins. Pool nodes if you insert millions of keys — allocator churn on tiny node objects adds up; see memory allocators for why object-heavy structures stress the heap.

Practical checklist

  • Reach for a trie when queries are prefix-shaped (autocomplete, longest-prefix match, dictionary neighbor search).
  • Compress with radix/Patricia form if keys are long and branches are sparse.
  • Measure memory: trie pointer overhead can exceed raw string storage for small alphabets with few shared prefixes.
  • For corpus search ("find documents containing word"), pair tries with inverted indexes — each structure solves a different query shape.
  • On hot paths, cache child maps locally and avoid Unicode normalization surprises (compose equivalent prefixes differently unless you normalize on insert).

Related on Solana Garden: hash tables and hash maps explained, inverted indexes and full-text search, skip lists explained, graph algorithms explained, Explainers hub.