Explainer · 7 June 2026
How tries and prefix trees work: autocomplete, routing, and radix trees
A trie (pronounced "try," from reTRIEval) — also called a
prefix tree — stores strings by sharing common prefixes along a path of
nodes instead of treating each key as an opaque blob. If your dataset contains
cat, car, and card, a trie walks
c → a → t for the first word and reuses the c → a segment for
the others. That layout makes prefix queries natural: "all keys starting with
ca" is a subtree walk, which is exactly what type-ahead autocomplete,
spell-check dictionaries, and IP longest-prefix routing need. Tries are not always the
fastest structure for arbitrary key lookup — a
hash map wins on average-case
exact match — but when the question is about shared beginnings of strings,
tries are the canonical answer.
Node structure: edges labeled by characters
The simplest trie represents each string as a path from root to a terminal node. Each node holds a map from the next character (or byte) to a child pointer — often an array of size 26 for lowercase English, a sparse hash map for Unicode, or a fixed 256-slot array for raw bytes. A boolean flag (or a stored value) marks whether the path spells a complete key, not just a prefix of one.
Inserting dog creates nodes d → o → g and marks the leaf as
terminal. Inserting dot reuses d → o and adds a sibling branch
under o for t. Lookup walks one character at a time; if any
edge is missing, the key is absent. Time is O(m) where m is
key length — independent of how many keys are stored, unlike a sorted array's
O(log n) comparisons that still touch many characters per step.
Memory is the catch: a sparse trie with many short shared prefixes can use more space than storing the strings themselves, because each node carries pointer overhead. Dense alphabets (256-byte children) explode quickly unless you compress or use map-backed children only where edges exist.
Prefix search, autocomplete, and ranked suggestions
Autocomplete is a trie traversal: locate the node at the end of the typed prefix, then
depth-first or breadth-first collect all terminal descendants. A search box showing
"prog" → "program," "progress," "protobuf" is walking the subtree under
p → r → o → g. Production systems add ranking (frequency, recency,
personalization) on top of the candidate set — the trie supplies candidates in
O(number of matches), not O(total dictionary size).
Spell-checkers and keyboard dictionaries use the same idea: generate edits within distance one or two, probe the trie for valid words, and prefer high-frequency entries. Contrast this with an inverted index, which excels at "documents containing token X" across a corpus but is awkward for "all words with this prefix" unless you maintain a separate trie or finite automaton alongside the index.
For purely regular-language pattern matching (not full dictionary semantics), a finite automaton built from a regex can serve a similar role; tries are the simpler structure when keys are literal strings without alternation or quantifiers.
Longest-prefix match: IP routing and ACL tables
Routers do not store every host address individually. They store CIDR
prefixes like 203.0.113.0/24 and, for each incoming packet,
find the longest matching prefix in the forwarding table. That is a
trie walk on bit labels: each level branches on the next bit of the destination IP.
A /32 rule is a deep leaf; a /8 rule is a shallow branch that covers everything below
unless a longer prefix overrides it.
Access-control lists and policy engines use the same longest-prefix semantics: more specific rules beat broader ones. Hardware switches often implement compressed trie variants in TCAM or specialized ASIC lookup tables because line-rate forwarding cannot afford a hash probe per hop. Software routers (Linux FIB, BGP daemons) use radix-tree structures closely related to Patricia tries.
Radix trees and Patricia tries: compressing sparse paths
A plain character trie wastes nodes when keys share long runs with no branches — think
of a single key internationalization creating 20 single-child nodes. A
radix tree (or Patricia trie) collapses unbranched
chains into one edge labeled with a substring (or bit string). Internal nodes store
only branching points; leaf edges carry the compressed segment.
Linux kernel's radix_tree (and the newer xarray) and many
URI routers use this compression. Insert and lookup compare whole edge labels at once
and may split an edge when a new key diverges mid-label — similar in spirit to B-tree
node splits, though trie splits are along string boundaries. The result is fewer pointer
chases and better cache behavior on long shared prefixes.
Succinct tries and LOUDS-encoded trees go further for read-heavy static dictionaries: rank/select structures on a bit vector emulate child navigation with less RAM than explicit objects — common in genomics and static search indexes where the trie never mutates after build time.
Tries vs hash maps, B-trees, and skip lists
Exact lookup: hash maps average O(1) with a good hash; tries are O(m). For fixed short keys (UUIDs, 32-byte pubkeys), hashing usually wins.
Ordered iteration and prefixes: hash maps cannot list "all keys starting
with X" without scanning everything. Tries expose ordered walks if children are sorted.
Skip lists
and B-trees support range scans on sorted keys but do not share prefix paths — storing
foobar and foobaz duplicates the entire fooba
string in each key slot.
Worst-case guarantees: unlike hash tables, tries have no collision chains and no rehash pauses — predictable latency matters for real-time autocomplete and kernel routing.
Database indexes on text columns often combine a B-tree on the full value with application-side trie caches for prefix UI; see database indexing for how B-trees differ from trie-shaped auxiliary structures.
Production examples and implementation notes
- Redis —
KEYS prefix*is a full scan; production code usesSCANor maintains a trie-like index externally. Sorted sets help range queries but are not tries. - DNS resolvers — domain names are hierarchical labels read right-to-left; resolver caches and negative caching layers often use tree structures keyed by label segments (related to DNS resolution).
- Ethereum/Solana address books — human-readable name services (ENS, SNS) resolve names to addresses; reverse lookups and substring search may use trie backends for prefix browse UIs.
- TypeScript/JavaScript — no trie in the standard library; npm
packages (
mnemonist,trie-search) or rolling your own withMapchildren is typical for autocomplete widgets.
When implementing, prefer Map (or sorted arrays for tiny alphabets) over
fixed 256-length arrays unless profiling shows array indexing wins. Pool nodes if you
insert millions of keys — allocator churn on tiny node objects adds up; see
memory allocators
for why object-heavy structures stress the heap.
Practical checklist
- Reach for a trie when queries are prefix-shaped (autocomplete, longest-prefix match, dictionary neighbor search).
- Compress with radix/Patricia form if keys are long and branches are sparse.
- Measure memory: trie pointer overhead can exceed raw string storage for small alphabets with few shared prefixes.
- For corpus search ("find documents containing word"), pair tries with inverted indexes — each structure solves a different query shape.
- On hot paths, cache child maps locally and avoid Unicode normalization surprises (compose equivalent prefixes differently unless you normalize on insert).
Related on Solana Garden: hash tables and hash maps explained, inverted indexes and full-text search, skip lists explained, graph algorithms explained, Explainers hub.