Explainer · 7 June 2026

How graph algorithms work: BFS, DFS, and shortest paths

A graph is a set of vertices (nodes) connected by edges (links). Social networks, package dependencies, road maps, liquidity pools on a DEX, and the links between web pages are all graphs wearing different costumes. Once you see the pattern, a handful of classic algorithms — breadth-first search (BFS), depth-first search (DFS), and Dijkstra's shortest-path method — explain a surprising fraction of backend routing, build systems, and recommendation pipelines.

Vertices, edges, and representations

Graphs can be directed (edges have a from and to, like "A follows B") or undirected (mutual friendship). Edges may carry weights — distance, latency, fee, or trust score. The same abstract object powers a subway map (undirected, weighted by time) and a compiler's module graph (directed, often unweighted).

How you store the graph dominates performance. An adjacency matrix is a V × V table: M[i][j] = 1 (or the weight) if an edge exists. Edge lookup is O(1), but memory is O(V²) and iterating neighbors of one node scans an entire row. Matrices suit dense graphs with frequent "is there an edge?" checks.

An adjacency list stores, for each vertex, a list of outgoing neighbors (often as pairs (neighbor, weight)). Space is O(V + E) and neighbor iteration is proportional to actual degree. Sparse real-world graphs — the internet, social graphs, most blockchains — almost always use adjacency lists backed by hash maps or sorted arrays for the per-vertex neighbor bags.

Breadth-first search (BFS)

BFS explores a graph layer by layer. Start at a source vertex, enqueue it, then repeatedly dequeue a vertex and enqueue every unvisited neighbor. A FIFO queue guarantees you finish all nodes at distance k before any node at distance k + 1.

On an unweighted graph, the first time BFS reaches a vertex is along a shortest path measured in edge count. That is why BFS answers "degrees of separation," finds the minimum number of relay hops in a network, and powers grid puzzles where each move costs one step.

BFS also underpins level-order tree traversal and bipartite testing (two-color the graph as you walk: neighbors must alternate colors). Time complexity is O(V + E) with a proper visited set; space is O(V) for the queue and parent pointers if you need to reconstruct paths.

In production, BFS shows up in CDN cache warming (expand from origin one hop at a time), permission inheritance in shallow org trees, and catch-all "find anything within N hops" graph APIs. It is a poor fit when edges have heterogeneous costs — walking three cheap links may beat one expensive bridge edge, and BFS ignores that distinction.

Depth-first search (DFS)

DFS goes as deep as possible before backtracking. Implement it with a stack (explicit or the call stack via recursion): pop a vertex, mark visited, push unvisited neighbors. DFS does not naturally find shortest paths, but it excels at structure discovery.

Cycle detection on directed graphs uses DFS with a recursion stack (or three-color marking: white/gray/black). If DFS reaches a gray node, a back edge exists and the graph has a cycle — fatal for dependency resolution or smart-contract call graphs that must terminate.

Topological sort orders vertices in a DAG so every edge goes from earlier to later. DFS finishes nodes in reverse finish time; that order is a valid build sequence for packages, task schedulers, and spreadsheet recalculation. CI pipelines that parallelize independent stages first compute a topological levelization (often Kahn's algorithm, a BFS variant on in-degrees).

Connected components in undirected graphs are the disjoint sets DFS discovers in one pass per unvisited seed. Union-find (disjoint set union) is an alternative with near-constant amortized merges, popular when graphs grow incrementally online.

Dijkstra's algorithm and weighted shortest paths

When edges have non-negative weights, Dijkstra's algorithm finds shortest paths from a single source. Maintain a priority queue (min-heap) of tentative distances. Extract the smallest distance node, relax each outgoing edge: if dist[u] + w(u,v) < dist[v], update dist[v] and push or decrease-key in the heap.

With a binary heap, complexity is O((V + E) log V). Fibonacci heaps improve the theoretical bound but rarely win in practice; a simple binary heap plus lazy deletion is the standard production choice. The priority queue is the same abstraction behind ordered in-memory indexes — Dijkstra cares about "smallest key" extraction, not the specific backing structure.

Dijkstra powers road routing (with preprocessing tricks like contraction hierarchies on top), network routing with latency-weighted edges, and DEX swap routing where each pool is a vertex and trade fees plus slippage are edge weights. Aggregators search for the path that minimizes total cost, not hop count.

Important limitation: Dijkstra requires non-negative weights. Negative edges break the "extract-min is final" invariant; use Bellman-Ford for general weights (O(VE)) or SPFA in competitive programming. Negative cycles mean "infinite profit" loops — relevant when modeling arbitrage, where you detect the cycle instead of routing through it blindly.

A* and goal-directed search

When you have a known target and a heuristic estimate of remaining cost, A* is Dijkstra with guidance. Priority is f(n) = g(n) + h(n) where g is cost from the start and h estimates distance to the goal. With an admissible heuristic (never overestimates), A* remains optimal while expanding far fewer nodes — critical for game pathfinding on grids and maps.

Poor heuristics degenerate to Dijkstra; inadmissible heuristics trade optimality for speed. In web-scale graphs, bidirectional Dijkstra (search from both ends) or landmark-based preprocessing often beats plain A* when no geometric embedding exists.

Choosing the right traversal

ProblemTypical algorithmWhy
Fewest hops, unweightedBFSFirst arrival is shortest
Cycle in dependenciesDFSBack-edge detection
Build order of tasksTopological sort (DFS or Kahn BFS)Respects edge direction
Minimum total cost pathDijkstraNon-negative weights
Grid pathfinding with mapA*Heuristic prunes search
All-pairs shortest paths (small V)Floyd-WarshallO(V³) but simple

At scale, the graph rarely fits one machine. Sharded social graphs partition by user id; consistent hashing picks which shard owns a vertex. Cross-shard BFS becomes a multi-round message protocol with strict hop budgets. Recommendation systems precompute random-walk scores offline rather than running live BFS per request.

Implementation pitfalls

  1. Mark vertices visited when enqueued (BFS) or when pushed (DFS) — double visits explode runtime on cyclic graphs.
  2. Reconstruct paths with a parent[] or prev map; distance alone is not enough for routing UX.
  3. On large sparse graphs, prefer adjacency lists; dense cliques alone do not justify O(V²) matrices.
  4. Watch stack overflow on deep DFS — iterative DFS with an explicit stack is safer for million-node crawls.
  5. For Dijkstra, stale heap entries from lazy decrease-key are fine; skip nodes already finalized at a better distance.
  6. Document whether your graph allows multi-edges and self-loops; algorithms assume simple graphs unless you dedupe.

Solana's runtime is not a general graph database, but off-chain indexers model token flows, program invocations, and wallet interactions as graphs constantly. Wallet clustering, mixer detection, and "shortest path to a known exchange" queries are graph problems solved in application code — usually BFS or Dijkstra on adjacency lists built from streamed ledger events.

Related on Solana Garden: Hash tables and hash maps explained, Skip lists explained, Consistent hashing explained, Recommendation algorithms explained, Explainers hub.