Explainer · 7 June 2026

How consistent hashing works for distributed caches

You have ten memcached servers and a million hot keys. The naive fix — server = hash(key) % 10 — works until server #7 dies. Suddenly every key might land on a different machine because the modulus changed. That is a cache stampede, a database overload, and angry on-call pages. Consistent hashing, introduced by Karger et al. in 1997 and popularized by Amazon's Dynamo paper, maps both servers and keys onto a hash ring so that adding or removing one node only moves keys that belonged to that slice of the ring — roughly 1/n of the total when you have n nodes.

Why modulo hashing breaks at scale

Simple sharding picks a bucket with remainder arithmetic. With four servers, key user:48291 might hash to 17, and 17 % 4 = 1 sends it to server 1. Clean and fast.

Scale to five servers and the same key goes to 17 % 5 = 2. Almost every key changes buckets when n changes — not because the data changed, but because the formula changed. Clients and proxies must invalidate local caches, backends cold-start, and downstream databases see a wave of misses. In a load-balanced fleet, rolling deploys that add or drain instances trigger the same problem if routing is modulo-based.

What you want is minimal remapping: when one node leaves, only keys that were on that node should move — typically to the next node clockwise on the ring. Everyone else's mapping stays stable.

The hash ring in one picture

Imagine the output space of your hash function — 0 to 232−1 — bent into a circle. Place each server at a point on that circle by hashing its identifier: hash("cache-east-3"). Place each key the same way: hash("user:48291").

To route a key, walk clockwise from the key's position until you hit the first server point. That server owns the key. Geometrically, each server owns the arc between itself and the previous server counter-clockwise — a contiguous partition of the ring.

Add a new server by inserting one more point. Only keys in the arc that the new node "steals" from its predecessor move. Remove a server and its arc merges into the next clockwise neighbor. The expected fraction of keys that move per membership change is about 1/n for n nodes — not (n-1)/n like naive modulo.

Virtual nodes (vnodes) and balance

A physical machine is one point on the ring. With few servers, arcs can be wildly uneven — one unlucky server might own 40% of the ring while another owns 8%. Virtual nodes fix that: each physical host is hashed multiple times under different labels (cache-3#0, cache-3#1, …). A machine with 100 vnodes spreads its responsibility across the ring in smaller slices, smoothing load.

The vnode count is a tuning knob. More vnodes mean better balance and finer migration when a host fails (keys move in smaller chunks), but more memory for the routing table and more work on ring updates. Production systems like Cassandra often use hundreds of vnodes per physical node; small memcached pools might use dozens.

Weighted vnodes let you steer capacity: a machine with twice the RAM can claim twice as many virtual points and receive proportionally more keys without a separate routing layer.

Replication on the ring

Caches are ephemeral; databases built on consistent hashing need replicas. The usual pattern: after finding the primary owner clockwise, continue walking for the next R-1 distinct servers to hold replica copies. A key's replicas are predictable from the ring topology, which helps repair: when a node dies, each successor inherits the dead node's arcs and peers backfill missing replicas.

Hinted handoff (from Dynamo) temporarily redirects writes for unreachable replicas to a healthy node that "hints" it is standing in — when the original returns, data streams back. Read repair fixes drift when clients see mismatched versions across replicas. These patterns pair with consistent hashing but are not automatic; they are explicit consistency machinery on top of the partition map.

This is different from Raft-style consensus, which replicates an ordered log to a fixed voter set. Consistent hashing answers "which nodes store this key?"; consensus answers "do all nodes agree on the value?". Wide-column stores combine both.

Where you meet it in production

  • Distributed caches — memcached client libraries (ketama), Redis Cluster slot migration uses related ideas; any "add a node without flushing the cache" story touches the ring.
  • CDNs and object stores — origin selection and partition maps for blob storage; pairs with HTTP caching at the edge.
  • Databases — Cassandra, Riak, DynamoDB's partition key routing (conceptually similar partitioning, though AWS hides the ring).
  • Service meshes and RPC — sticky routing by request id to the same backend instance without a central directory.
  • Rate limiters — sharding counter state across Redis nodes; see also API rate limiting for token-bucket semantics above the shard layer.

Negative lookups in large caches sometimes add a Bloom filter per shard to avoid hitting disk for keys that definitely do not exist — another layer orthogonal to the partition function.

Alternatives worth knowing

Jump consistent hash (Google, 2014) maps keys to buckets with minimal movement and no ring data structure — excellent when bucket count changes rarely and you want a compact, fast formula. It does not handle weighted nodes as flexibly as vnodes.

Rendezvous hashing (highest random weight) scores every (key, server) pair and picks the max. No ring to maintain; lookup is O(servers), fine for tens of backends, painful for thousands.

Maglev hashing (Google load balancers) builds a lookup table for near-perfect balance at L4/L7 — common behind DNS when connection churn must be tiny during backend swaps.

Pick consistent hashing when nodes join and leave frequently, you need weighted capacity, and clients or proxies can share an updated ring view (or fetch it from a coordinator).

Operational pitfalls

  • Hot keys — hashing spreads keys uniformly, not traffic. A viral object still hammers one partition; use local LRU caches, key splitting, or read replicas above the ring.
  • Ring gossip lag — if clients route with a stale ring during a failover, two nodes may briefly disagree on ownership. Version the ring and reject writes to wrong primaries.
  • Rebalance storms — adding many vnodes at once moves many small slices; throttle migration bandwidth so production traffic survives.
  • Hash quality — use a stable, well-distributed hash (Murmur, xxHash, SHA-256 truncated). Poor hashes cluster points and defeat the ring.
  • Clocks and TTL — cache expiry is independent of placement; a moved key is a miss until repopulated — plan for warm-up.

Practical checklist

  • Measure key distribution and request distribution per node before blaming the hash function for hotspots.
  • Start with enough vnodes per host (often 50–200) and adjust from skew metrics.
  • Publish ring versions; clients should atomically swap maps, not blend old and new.
  • Test add-node, remove-node, and crash scenarios in staging with production-like key samples.
  • Document whether reads tolerate stale replicas and how repair is triggered.

Related on Solana Garden: Load balancing explained, Bloom filters explained, HTTP caching explained, API rate limiting explained, Explainers hub.