Distributed locking explained

A single-process mutex prevents two threads in one binary from corrupting shared state. Scale that app to four containers behind a load balancer and the mutex disappears — each replica has its own memory. Two cron schedulers both fire at midnight; two checkout workers both decrement the last warehouse unit; two payout jobs both send the same refund. A distributed lock is the coordination primitive that says "only one node across the cluster may execute this critical section right now." This guide covers when you actually need one, lease-based mutex design, Redis and database implementations, fencing tokens that defeat stale-lock races, and the idempotency and optimistic patterns that often replace locks entirely.

Why local locks fail at scale

In-process locks (Java synchronized, Go sync.Mutex, Python threading.Lock) serialize threads inside one OS process. They are fast and correct for that process — and useless once you have horizontal replicas, serverless concurrency, or a background worker fleet separate from your API tier.

Classic scenarios that push you toward distributed coordination:

Singleton cron — nightly billing, report generation, or cache warming must run exactly once cluster-wide.
Inventory and ledger updates — decrement stock or move money where a race could oversell or double-pay.
Leader election — one node owns partition consumption or schema migration leadership.
Expensive cold-start work — rebuild a search index or warm a model; duplicate work wastes money.

The goal is mutual exclusion: at any moment, at most one holder for a named lock key. The hard part is doing that when networks delay packets, processes pause for GC, and clocks disagree — topics covered in distributed consistency models.

Lease locks vs true mutexes

Most production distributed locks are leases: the holder acquires the lock with a time-to-live (TTL). If the worker crashes without releasing, the lease expires and another contender can take over. That prevents a vanished process from holding the lock forever, but it introduces a new failure mode: a slow holder can still be running when its lease expires, so two workers each believe they own the lock.

Acquire, renew, release

A minimal lease API looks like:

Acquire — atomically set key lock:invoice-reconcile to a unique token (UUID) only if absent, with TTL 30s.
Renew — extend TTL while work continues (heartbeat every 10s on a 30s lease).
Release — delete the key only if the stored token matches yours (compare-and-delete).

Skipping compare-and-delete lets a slow worker delete a key a new owner just acquired. Skipping renewal lets TTL expire mid-transaction. Both bugs show up in postmortems under load.
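
The three operations above can be sketched with an in-memory stand-in for the lock service. This is a minimal illustration, not a distributed implementation: the LeaseStore class, its method names, and the injectable clock are all hypothetical, chosen only to make the acquire, renew, and compare-and-delete semantics concrete.

```python
import time
import uuid


class LeaseStore:
    """In-memory stand-in for a lock service; illustrative only, not distributed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}  # key -> (holder token, expiry timestamp)

    def _live(self, key):
        """Return the current lease entry, dropping it first if it has expired."""
        entry = self._leases.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._leases[key]  # lease expired: treat the key as absent
            return None
        return entry

    def acquire(self, key, ttl):
        """Set the key to a fresh token only if absent; return the token or None."""
        if self._live(key) is not None:
            return None
        token = str(uuid.uuid4())
        self._leases[key] = (token, self._clock() + ttl)
        return token

    def renew(self, key, token, ttl):
        """Extend the TTL only if the caller still holds the lease."""
        entry = self._live(key)
        if entry is None or entry[0] != token:
            return False
        self._leases[key] = (token, self._clock() + ttl)
        return True

    def release(self, key, token):
        """Compare-and-delete: remove the key only if the stored token matches."""
        entry = self._live(key)
        if entry is None or entry[0] != token:
            return False
        del self._leases[key]
        return True
```

Because release and renew both check the token, a worker whose lease has already expired gets False back instead of silently deleting or extending the new owner's lease.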

Fencing tokens

Martin Kleppmann's critique of Redis-style locks highlights the fencing token fix: the lock service returns a monotonically increasing integer with every successful acquire. The protected resource (database, payment API, storage layer) rejects any write whose token is less than the highest token it has seen. Even if a stale lock holder wakes up and tries to commit, the resource ignores it.

Fencing is not optional for financial or inventory paths. A lock alone does not make downstream systems safe — the storage layer must enforce the token.
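
A rough sketch of the enforcement side, assuming the protected resource can remember the highest token it has accepted. TokenIssuer, FencedStore, and StaleTokenError are hypothetical names for illustration; in a real system the counter lives in the lock service and the check lives in the storage layer.

```python
class StaleTokenError(Exception):
    """Raised when a write carries a fencing token older than one already seen."""


class TokenIssuer:
    """Stand-in for the lock service's monotonically increasing counter."""

    def __init__(self):
        self._last = 0

    def next_token(self):
        self._last += 1
        return self._last


class FencedStore:
    """Hypothetical storage layer that enforces fencing tokens on every write."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, key, value, token):
        # Reject any write whose token is lower than the highest one seen so far.
        if token < self.highest_token:
            raise StaleTokenError(
                f"token {token} is older than highest seen {self.highest_token}"
            )
        self.highest_token = token
        self.data[key] = value
```

If worker A acquires with token 1, pauses past its lease, and worker B then acquires with token 2 and writes, A's late write with token 1 raises StaleTokenError and the stored value stays as B left it.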

Implementation options

Redis (and Redlock)

Redis is the most common lock backend because it is already in the stack for caching. Single-instance pattern: SET lock_key token NX PX 30000 sets the key only if not exists, with 30s expiry. Release uses a Lua script that deletes only when the value equals your token.
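
The single-instance pattern might look like the following, assuming a redis-py style client whose set() accepts nx and px keyword arguments and whose eval() takes the script, the key count, then keys and arguments. The helper names acquire and release are hypothetical; the client is passed in so the sketch itself has no third-party import.

```python
import uuid

# Lua runs atomically inside Redis: delete the key only if it still holds our token.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""


def acquire(client, key, ttl_ms=30000):
    """Issue SET key token NX PX ttl_ms; return the token on success, else None."""
    token = str(uuid.uuid4())
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, key, token):
    """Compare-and-delete via Lua; True only if we still owned the key."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1
```

With redis-py installed, the client would typically be something like redis.Redis(host=..., port=...). A False from release means the lease had already expired or been taken over, which is worth logging as a signal that the critical section ran too long.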

Redlock (Salvatore Sanfilippo) runs the acquire against N independent Redis masters and considers the lock held if a quorum agrees within a short window. It targets higher fault tolerance than one Redis node — but remains controversial under clock skew and long GC pauses. For many teams, a single highly available Redis (or Redis Cluster with careful key placement) plus fencing tokens is simpler than operating five masters for theoretical purity.
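
The quorum decision reduces to a small piece of arithmetic: a majority of the N masters must have accepted the key, and the time left on the lease after subtracting the acquisition time and a clock-drift allowance must still be positive. The sketch below assumes that rule; the function name and the drift_ms parameter are illustrative, and the drift allowance is something each deployment has to choose.

```python
def redlock_outcome(acks, n, ttl_ms, elapsed_ms, drift_ms):
    """Return the remaining validity in ms if the lock counts as held, else None."""
    quorum = n // 2 + 1  # strict majority of the N masters
    validity = ttl_ms - elapsed_ms - drift_ms
    if acks >= quorum and validity > 0:
        return validity
    return None
```

For example, with 5 masters, 3 acknowledgements, a 30000 ms TTL, 120 ms spent acquiring, and a 302 ms drift allowance, the lock is held with 29578 ms of validity. With only 2 acknowledgements, or with an acquisition slow enough to consume the TTL, the client should treat the attempt as failed and release whatever keys it did set.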

PostgreSQL advisory locks

PostgreSQL offers session-level advisory locks via pg_advisory_lock(key) and transaction-level advisory locks via pg_advisory_xact_lock(key). They live inside the database you already trust for ACID transactions, which eliminates a separate coordination service. Transaction-level locks release automatically on commit or rollback, which makes them ideal for "only one migration row at a time" inside a short transaction.

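Downsides: locks tie up a connection and do not span heterogeneous services unless everyone talks to Postgres. Session locks are also easy to leak, because they stay held until explicitly unlocked or the session ends; a connection returned to a pool, or a dead client whose connection the server has not yet noticed, keeps the lock. Use them when the critical section is already a DB transaction, not as a global cluster mutex for arbitrary jobs.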
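
Advisory lock functions take a 64-bit integer key, so a named lock needs a stable mapping from name to integer. The sketch below is one possible approach: advisory_key and try_xact_lock are hypothetical helpers, the cursor is assumed to be a DB-API cursor using the %s parameter style (as psycopg does), and pg_try_advisory_xact_lock is PostgreSQL's non-blocking transaction-level variant.

```python
import hashlib
import struct


def advisory_key(name):
    """Map a lock name to a signed 64-bit integer, the key type advisory locks take."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return struct.unpack(">q", digest[:8])[0]  # first 8 bytes as big-endian int64


def try_xact_lock(cursor, name):
    """Try a transaction-level advisory lock without blocking; True if acquired.

    Must run inside an open transaction; PostgreSQL releases the lock
    automatically at commit or rollback.
    """
    cursor.execute("SELECT pg_try_advisory_xact_lock(%s)", (advisory_key(name),))
    return bool(cursor.fetchone()[0])
```

Hashing means two different names could in principle collide on the same key. For a handful of well-known lock names that risk is negligible, but a small table of explicitly assigned integers is an equally reasonable design.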

ZooKeeper, etcd, and Consul

Coordination services built on consensus (ZooKeeper on ZAB, etcd and Consul on Raft) provide the primitives for this. In ZooKeeper, contenders create ephemeral sequential nodes: the lowest sequence number is the leader, and watches notify contenders when the leader dies. etcd offers the equivalent through leases attached to keys, and Consul through sessions. Locks here are naturally lease-like, since session or lease expiry removes the holder's entry. Latency is higher than Redis, but correctness arguments are stronger for leader election and small-metadata coordination. Kubernetes itself stores its state in etcd; some teams reuse that layer through the Kubernetes API with libraries like client-go leader election.
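
The lowest-sequence-wins rule can be sketched as a pure function over the list of contender nodes. This assumes ZooKeeper-style names ending in a hyphen and a zero-padded counter; sequence_of and election_state are hypothetical helpers. Watching only the immediate predecessor rather than the leader is a common refinement that avoids waking every contender at once.

```python
def sequence_of(node):
    """Parse the zero-padded counter appended to a sequential node name."""
    return int(node.rsplit("-", 1)[1])


def election_state(children, me):
    """Return (is_leader, node_to_watch) for contender `me` among `children`."""
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(me)
    if index == 0:
        return True, None  # lowest sequence number: this contender leads
    # Otherwise wait on the node just ahead of us in the queue.
    return False, ordered[index - 1]
```

When the watched node disappears (its session expired or it released), the contender lists the children again and re-runs the same check, since the node ahead of it may have vanished without being the leader.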

Database row locking

Sometimes the lock is the row: SELECT … FOR UPDATE on an inventory row serializes concurrent checkouts in one database. That is not a distributed lock across services, but it is often the right tool when all writers share one OLTP database. Combine with short transactions and clear isolation level choices — holding a row lock across HTTP calls to a payment gateway is a reliability trap.

Failure modes to design for

Process pause — JVM GC or container freeze exceeds TTL; two workers enter the critical section. Mitigate with fencing tokens and short, resumable work units.
Clock skew — wall-clock TTL assumes roughly synchronized time. Prefer relative TTL from the lock service's clock, not the client's.
Split brain on the lock store — if Redis fails over to a replica that never received the lock key (replication is asynchronous), two clients can each believe they hold the lock. Run the lock backend with the same seriousness as your database.
Lock renewal storms — thousands of workers renewing the same hot key creates Redis hot spots. Shard lock namespaces or redesign so fewer contenders fight for one key.
Forgotten release — always use TTL even if you plan to release cleanly; crashes are guaranteed eventually.
Slow work inside the lock — acquiring a lock and then performing slow network I/O while holding it serializes the entire fleet. Hold locks only around the minimal atomic commit.
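
One way to act on the process-pause and clock-skew items above is for the holder to keep a conservative local view of its lease, measured as a relative duration on a monotonic clock, and to stop before committing when too little time remains. LeaseGuard and its parameters are hypothetical. This narrows the window for a stale commit but does not close it, since a pause can still occur between the check and the write, so it complements fencing tokens rather than replacing them.

```python
import time


class LeaseGuard:
    """Local, conservative view of a lease, measured on the monotonic clock."""

    def __init__(self, requested_at, ttl, safety_margin, clock=time.monotonic):
        self._clock = clock
        self._margin = safety_margin
        # Count the TTL from when the request was sent, not when the reply
        # arrived, so network latency never makes the lease look longer.
        self._deadline = requested_at + ttl

    def renewed(self, requested_at, ttl):
        """Record a successful renewal that was sent at `requested_at`."""
        self._deadline = requested_at + ttl

    def remaining(self):
        """Seconds left on the lease according to the local clock."""
        return self._deadline - self._clock()

    def safe_to_commit(self):
        """True while more than the safety margin is left on the lease."""
        return self.remaining() > self._margin
```

A worker would record time.monotonic() just before sending the acquire request, build the guard from that timestamp once the acquire succeeds, and check safe_to_commit() before each write. The same remaining() value is a natural source for the "hold duration approaches TTL" alert.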

Alternatives that often beat a lock

Distributed locks are powerful and easy to misuse. Before adding one, ask whether the data path can be made safe without mutual exclusion:

Idempotent consumers — duplicate cron runs that upsert the same derived state are harmless if keys are deduplicated. See idempotency keys.
Optimistic concurrency — add a version column; updates succeed only if WHERE id = ? AND version = ?. Contention surfaces as a retryable conflict, not a global lock.
Partitioned ownership — shard work by user ID or warehouse ID so each partition has a single consumer; locks become local.
Exactly-once-ish queues — move the singleton job to a queue with a single active consumer per partition (a Kafka consumer group, or SQS FIFO message groups with deduplication).
Compare-and-swap in the store — object storage and many KV APIs offer conditional writes without a separate lock service.
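
The optimistic concurrency item above can be sketched with SQLite from the Python standard library. The inventory table, its columns, and the update_stock helper are assumptions made for illustration; the same WHERE id = ? AND version = ? pattern applies to any SQL database.

```python
import sqlite3


def update_stock(conn, item_id, expected_version, new_qty):
    """Apply the write only if the row still has the version the caller read."""
    cur = conn.execute(
        "UPDATE inventory SET qty = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_qty, item_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when another writer bumped the version first.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (id INTEGER PRIMARY KEY, qty INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO inventory (id, qty, version) VALUES (1, 1, 1)")
conn.commit()
```

If two checkout workers both read version 1 and both try to write, the first call returns True and the second returns False. The loser re-reads the row, sees the stock is gone, and reports the conflict instead of overselling.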

Locks coordinate who may proceed; idempotency and versioning define what happens when two writers collide anyway. Production systems use both layers.
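
As a sketch of the data-layer half, a unique idempotency key lets the store itself absorb a duplicate even when two workers both get past a failed lock. The refunds table and the record_refund helper are hypothetical, and INSERT OR IGNORE is SQLite syntax; other databases express the same idea with their own conflict clauses.

```python
import sqlite3


def record_refund(conn, idempotency_key, amount_cents):
    """Insert the refund once; a repeated key is ignored.

    Returns True only for the call that actually created the row.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO refunds (idempotency_key, amount_cents) VALUES (?, ?)",
        (idempotency_key, amount_cents),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refunds ("
    "idempotency_key TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)
conn.commit()
```

Only the caller that receives True should go on to trigger the external side effect, such as calling the payment provider; the duplicate sees False and stops.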

Production checklist

Document every lock key, TTL, and owning team — mystery locks become undeletable superstition.
Use unique holder tokens (UUID) and compare-and-delete on release.
Renew leases on an interval well under TTL; bound critical section runtime.
Require fencing tokens on downstream writes for money, inventory, and irreversible side effects.
Emit metrics: acquire latency, hold duration, renewal failures, contention rate.
Alert when hold duration approaches TTL — that is a pause/GC incident in progress.
Decide fail-open vs fail-closed explicitly: a missing lock service should block payouts (closed) but may allow a read-only cache refresh (open).
Load-test lock expiry during a long GC pause in staging.
Prefer DB row locks or optimistic versioning when all writers already share one database.
Re-evaluate quarterly: can this path be idempotent or partitioned instead?

Key takeaways

Local mutexes do not cross process boundaries — horizontal scale needs external coordination.
Most distributed locks are leases — TTL prevents dead holders, but stale workers need fencing tokens.
Redis is convenient; consensus stores are stricter — match the backend to your correctness bar.
Hold locks briefly — slow I/O inside a critical section becomes a fleet-wide bottleneck.
Locks are not a substitute for idempotency — design the data layer to reject stale writers even when locks fail.