Explainer · 7 June 2026
How CRDTs and conflict-free replication work
Two users edit the same shopping-cart counter while offline. Three regions append to a shared favorites list without talking to a leader. A mobile app queues edits on a plane and syncs when Wi-Fi returns — without overwriting your colleague's changes. Traditional databases solve this with locks, transactions, or a single leader that orders every write. A conflict-free replicated data type (CRDT) takes a different path: replicas are allowed to diverge, and a merge function is mathematically guaranteed to produce the same final state no matter what order updates arrive in. That property — strong eventual consistency with commutative, associative, idempotent merges — is what powers offline-first note apps, collaborative whiteboards, geo-replicated counters, and peer-to-peer document libraries without a central coordinator holding the truth.
Why replicas diverge in the first place
In a distributed system, latency and partitions are not edge cases; they are the default. Client A increments a like counter on a phone in Tokyo; client B increments the same counter on a laptop in Berlin before either update reaches the server. If both replicas simply apply "latest timestamp wins," one increment disappears. If they lock the row until both sides acknowledge, offline users are blocked. CRDTs accept that both increments happened and define merge rules so the combined state reflects both operations.
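The lost increment can be reproduced in a few lines. This is a minimal sketch, not any particular database's behavior: the replica names, the timestamps, and the `lww_merge` helper are all made up for illustration.

```python
# Hypothetical setup: each replica holds (value, timestamp) for one like counter.
def lww_merge(a, b):
    """Keep whichever (value, timestamp) pair carries the later timestamp."""
    return a if a[1] >= b[1] else b

start = (10, 100)                # value 10, written at t=100
tokyo = (start[0] + 1, 105)      # client A increments while disconnected
berlin = (start[0] + 1, 107)     # client B increments while disconnected

merged = lww_merge(tokyo, berlin)
print(merged[0])                 # 11: one of the two increments is gone

# Recording each replica's own contribution keeps both increments.
tallies = {"tokyo": 1, "berlin": 1}
combined = start[0] + sum(tallies.values())
print(combined)                  # 12
```

Both replicas wrote "11", so no choice of winner can recover the second increment; the information was lost at write time, not at merge time.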
The key insight from the CRDT research program (Shapiro, Preguiça, Baquero, and others, building on earlier work on semilattices) is to restrict data types so merges are always defined and converge. You trade some expressiveness — not every arbitrary JSON document is a CRDT out of the box — for predictable convergence without coordination.
State-based vs operation-based CRDTs
Implementations fall into two families, often equivalent in expressive power:
- State-based (CvRDT): replicas gossip their full state (or deltas). Merge is a function merge(S1, S2) → S' that is commutative, associative, and idempotent. Think of merging two grow-only counters by taking the element-wise max of per-replica counts.
- Operation-based (CmRDT): replicas broadcast operations. Concurrent operations must commute, and the transport (often a middleware layer) must deliver each operation exactly once, typically in causal order. The state is the fold of all operations. A counter increment op "+1 from replica A" commutes with "+1 from replica B" because order does not change the sum of per-replica contributions.
State-based types are simpler to reason about offline — ship the whole state blob when you reconnect. Operation-based types can be more bandwidth-efficient when the log of ops is small relative to state, but they need causal delivery or careful op design.
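To make the operation-based side concrete, here is a small sketch of a counter whose state is the fold of increment operations. The `(replica_id, amount)` op format and the function names are assumptions chosen for this example.

```python
from itertools import permutations

# Assumed op format for illustration: (replica_id, amount).
ops = [("A", 1), ("B", 1), ("A", 1)]

def apply_op(state, op):
    """Return a new state with the op's amount added to its replica's tally."""
    replica, amount = op
    new_state = dict(state)
    new_state[replica] = new_state.get(replica, 0) + amount
    return new_state

def fold(ops_in_order):
    """Replay ops from an empty state in the given delivery order."""
    state = {}
    for op in ops_in_order:
        state = apply_op(state, op)
    return state

# Every delivery order folds to the same state.
results = {tuple(sorted(fold(order).items())) for order in permutations(ops)}
print(len(results))                           # 1

# Redelivering an op changes the result, so the transport must deduplicate.
duplicated = fold(ops + [ops[0]])
print(sum(fold(ops).values()), sum(duplicated.values()))   # 3 4
```

The last line shows why op-based designs lean on the delivery layer: increments commute, but they are not idempotent, so a retransmitted op is counted twice unless something filters it out.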
Building blocks: grow-only and counters
The simplest CRDTs never delete — only grow:
- G-Counter (grow-only counter) — each replica keeps its own count; global value is the sum of per-replica counts. Concurrent increments on different replicas add cleanly.
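- PN-Counter: two G-Counters, one for increments and one for decrements; the value is the first sum minus the second, so concurrent decrements merge as cleanly as increments. The value itself can go negative; keeping it above zero is an invariant the type does not enforce.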
- G-Set — add-only set; merge is set union. You cannot remove elements — fine for "tags ever applied" but not for toggling membership.
These types illustrate the pattern: encode intent so concurrent operations do not clobber each other. A naive shared integer fails; a vector of per-replica tallies succeeds.
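The counters above fit in a short state-based sketch. Class and method names here are illustrative, not taken from any library.

```python
class GCounter:
    """Grow-only counter: one tally per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                      # replica id -> increments from it

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


class PNCounter:
    """Two G-Counters: value = total increments - total decrements."""

    def __init__(self, replica_id):
        self.p = GCounter(replica_id)
        self.n = GCounter(replica_id)

    def increment(self, amount=1):
        self.p.increment(amount)

    def decrement(self, amount=1):
        self.n.increment(amount)

    def value(self):
        return self.p.value() - self.n.value()

    def merge(self, other):
        self.p.merge(other.p)
        self.n.merge(other.n)


a, b = PNCounter("A"), PNCounter("B")
a.increment(); a.increment()          # replica A: +2
b.increment(); b.decrement()          # replica B: +1, -1
a.merge(b); b.merge(a); a.merge(b)    # repeating a merge changes nothing
print(a.value(), b.value())           # 2 2
```

Taking the max per replica, rather than adding, is what makes the merge idempotent: receiving the same state twice cannot double-count.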
Registers, sets, and the delete problem
Real apps need updates and deletes:
LWW-Register (last-writer-wins) — each write carries a timestamp (or hybrid logical clock). Merge picks the write with the highest timestamp. Simple for profile fields; dangerous when clocks skew — two users can still "lose" edits if their device clocks disagree.
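A last-writer-wins register is only a few lines, which is part of its appeal. The sketch below assumes integer timestamps and adds a replica id as a tie-breaker so that equal timestamps still resolve the same way everywhere; the field names are made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LWWRegister:
    value: str
    timestamp: int      # assumed: wall-clock or logical clock reading
    replica_id: str     # tie-breaker when timestamps are equal

    def merge(self, other):
        """Return the write with the greater (timestamp, replica_id) pair."""
        mine = (self.timestamp, self.replica_id)
        theirs = (other.timestamp, other.replica_id)
        return self if mine >= theirs else other

# The laptop edit happened later in real time, but its clock runs behind.
phone = LWWRegister("Alice B.", timestamp=1000, replica_id="phone")
laptop = LWWRegister("Alice Smith", timestamp=900, replica_id="laptop")

winner = phone.merge(laptop)
print(winner.value)     # Alice B.
```

The merge converges regardless of argument order, yet the newer edit is discarded because of the skewed clock. Convergence and "the right answer" are separate properties.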
OR-Set (observed-remove set): each add tags the element with a unique identifier. A remove tombstones only the tags the remover has observed, not the element globally. Re-adding after a remove works because the new add carries a fresh tag. Merge unions the adds and the tombstones. This fixes the classic set CRDT bug where "remove A" races with "add A" and the element either reappears unexpectedly or vanishes incorrectly.
Tombstones accumulate — removed items leave metadata until garbage-collected. Production systems run compaction when all replicas acknowledge a horizon, similar in spirit to log retention in event-sourced systems.
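An observed-remove set with tombstones can be sketched as two grow-only sets of (element, tag) pairs. This is a simplified illustration with hypothetical names; it uses random UUIDs as tags and does no garbage collection.

```python
import uuid

class ORSet:
    def __init__(self):
        self.adds = set()       # (element, tag) pairs ever added
        self.removed = set()    # tombstoned (element, tag) pairs

    def add(self, element):
        # Every add gets a fresh tag, so a later re-add is distinguishable.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Tombstone only the tags this replica has observed so far.
        self.removed |= {(e, t) for (e, t) in self.adds if e == element}

    def contains(self, element):
        return any(e == element for (e, _tag) in self.adds - self.removed)

    def merge(self, other):
        self.adds |= other.adds
        self.removed |= other.removed


a, b = ORSet(), ORSet()
a.add("milk")
b.merge(a)            # b observes the first add
b.remove("milk")      # b tombstones that tag
a.add("milk")         # concurrently, a adds again with a new tag
a.merge(b); b.merge(a)

print(a.contains("milk"), b.contains("milk"))   # True True
print(len(a.removed))                           # 1 tombstone kept as metadata
```

The concurrent add survives because the remove never saw its tag. The tombstone stays in `removed` indefinitely, which is the metadata growth that compaction has to deal with.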
Text and JSON: sequence CRDTs
Ordered sequences (document text, arrays) are harder because position matters. RGA (replicated growable array), WOOT, and Logoot-style schemes assign each character a unique ID, ordered either by linked-list predecessors or by dense (fractional) positions, so inserts commute. Libraries like Yjs, Automerge, and Diamond Types implement efficient sequence CRDTs for collaborative editors, which is why two cursors can type in the same paragraph without a central server serializing every keystroke through one machine.
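The predecessor-based idea can be shown with a deliberately small, insert-only sketch. It is not the algorithm used by Yjs, Automerge, or Diamond Types: it assumes each character has an id of the form (counter, replica), stores the id of the character it was typed after, ignores deletes, and uses recursion that would not suit long documents.

```python
ROOT = (0, "")    # sentinel id; first characters are inserted after it

def merge(a, b):
    """State merge: union of nodes keyed by their unique id."""
    return {**a, **b}

def render(nodes):
    """Depth-first walk; siblings with the same predecessor go highest id first."""
    children = {}
    for node_id, (parent_id, _char) in nodes.items():
        children.setdefault(parent_id, []).append(node_id)
    out = []

    def visit(parent_id):
        for node_id in sorted(children.get(parent_id, []), reverse=True):
            out.append(nodes[node_id][1])
            visit(node_id)

    visit(ROOT)
    return "".join(out)

# node id -> (id of the character it follows, character)
shared = {(1, "A"): (ROOT, "H"), (2, "A"): ((1, "A"), "i")}
replica_a = {**shared, (3, "A"): ((2, "A"), "?")}   # A types "?" after "i"
replica_b = {**shared, (3, "B"): ((2, "A"), "!")}   # B types "!" after "i"

print(render(merge(replica_a, replica_b)))   # Hi!?
print(render(merge(replica_b, replica_a)))   # Hi!?
```

Because the order is derived from ids rather than from arrival order, both replicas render the same text. Which concurrent insert comes first is an arbitrary but deterministic choice.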
Nested JSON CRDTs compose registers, maps, and lists — Automerge's document model is essentially a tree of CRDTs with a binary encoding for sync. The merge propagates up the tree the same way: children merge independently, parents combine child states.
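The "children merge independently, parents combine child states" idea can be sketched with plain Python values. This is an illustration of composition only, not Automerge's document model: it assumes dicts act as maps, sets as grow-only sets, and integers as per-replica tallies merged by max.

```python
def merge(a, b):
    """Recursively merge two trees built from dicts, sets, and ints."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = {}
        for key in a.keys() | b.keys():
            if key in a and key in b:
                merged[key] = merge(a[key], b[key])   # both sides: merge child
            else:
                merged[key] = a[key] if key in a else b[key]
        return merged
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        return a | b                                  # G-Set: union
    if isinstance(a, int) and isinstance(b, int):
        return max(a, b)                              # G-Counter slot: max
    raise TypeError("mismatched or unsupported node types")

phone = {"tags": {"urgent"}, "views": {"phone": 3}}
laptop = {"tags": {"home"}, "views": {"phone": 1, "laptop": 2}}

doc = merge(phone, laptop)
print(sorted(doc["tags"]), sum(doc["views"].values()))   # ['home', 'urgent'] 5
```

Each leaf type brings its own merge, and the map simply applies them key by key, so the whole tree inherits commutativity and idempotence from its parts.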
CRDTs vs consensus vs operational transformation
Raft and Paxos-style protocols use a leader to impose a total order on operations. Everyone agrees on log entry 42 before entry 43. Consistency is strong, but writes stall when the leader, or a majority of nodes, is unreachable; this is the CP side of the CAP trade-off. Use consensus when correctness requires one global serial history (bank ledger, inventory that cannot oversell).
CRDTs skip the leader for data that can be merged semantically. Availability stays high under partition; consistency becomes eventual with a defined merge. Use CRDTs when offline edits, multi-master geo replication, or peer-to-peer sync matter more than byte-identical instantaneous agreement.
Operational transformation (OT), used in Google Docs and before it Google Wave, transforms concurrent ops against each other, usually relative to a central server that orders them. OT can be correct but is notoriously hard to prove for all cases; CRDTs shifted the industry toward algebraically specified merges, though some products still blend both.
Where you already meet CRDTs
- Collaborative editing — Yjs in Notion-like tools, Figma's multiplayer layer (with product-specific constraints), VS Code Live Share patterns.
- Local-first software — apps that treat the device copy as primary and sync opportunistically (Roam Research plugins, Ink & Switch style note systems).
- Distributed databases — Riak DT-Map, Redis CRDT modules, AntidoteDB — expose CRDT types at the API layer.
- Edge counters and presence — view counts, online user sets, feature flags replicated to PoPs without round-tripping every click to one region.
- Blockchain-adjacent state — less common on-chain due to cost, but off-chain indexers and wallet sync layers use CRDT-like merge for user preferences and read models.
Limits and sharp edges
CRDTs do not make arbitrary unstructured JSON mergeable by magic:
- Semantic conflicts — merging two concurrent edits to "price" converges, but the business may have wanted a human to pick the winner. CRDTs resolve state, not meaning.
- Tombstone growth — OR-Sets and sequence CRDTs retain metadata; long-lived docs need compaction strategies.
- Clock dependence — LWW registers trust timestamps; use logical clocks or hybrid logical clocks (HLC) in production.
- Not every invariant — "balance must not go negative" is a constraint, not a CRDT; you may still need validation or consensus for hard invariants.
- Bandwidth — shipping full state for large documents hurts; delta-state CRDTs and op-based encodings help but add complexity.
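For the clock-dependence point, a hybrid logical clock can be sketched as follows. It follows the commonly described update rules for HLC (a physical component `l` plus a logical counter `c`); the class name, the millisecond resolution, and the injectable `physical_clock` parameter are assumptions made for the example.

```python
import time

class HybridLogicalClock:
    def __init__(self, physical_clock=None):
        # physical_clock is injectable so tests can simulate skewed devices.
        self._now = physical_clock or (lambda: int(time.time() * 1000))
        self.l = 0      # highest physical time seen so far
        self.c = 0      # logical counter to order events sharing the same l

    def tick(self):
        """Timestamp a local write or an outgoing message."""
        pt = self._now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def receive(self, remote):
        """Advance past a timestamp received from another replica."""
        rl, rc = remote
        new_l = max(self.l, rl, self._now())
        if new_l == self.l and new_l == rl:
            self.c = max(self.c, rc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == rl:
            self.c = rc + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)


fast = HybridLogicalClock(physical_clock=lambda: 1000)
slow = HybridLogicalClock(physical_clock=lambda: 900)   # device clock is behind

first = fast.tick()             # (1000, 0)
seen = slow.receive(first)      # (1000, 1)
later = slow.tick()             # (1000, 2)
print(first < seen < later)     # True
```

Once the slow device has seen the fast device's write, its own later writes sort after it even though its wall clock still reads 900. Writes that are truly concurrent still fall back to timestamp order, so an LWW register can still drop one of them.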
Practical checklist
- Identify data that can merge by rule vs data that needs a single serial history — split them in your architecture.
- Prefer composed CRDT libraries (Automerge, Yjs) over hand-rolling merges for text and JSON trees.
- Plan tombstone GC and snapshot compaction before production traffic.
- Use hybrid logical clocks when LWW registers appear anywhere in the model.
- Test merge under reorder — property-based tests that shuffle op delivery catch non-commutative bugs early.
- Expose sync status in the UI — eventual consistency is correct but users still want to know when peers have caught up.
- Keep an append-only event log alongside CRDT state when you need an audit trail; merged state alone does not record who changed what, or in which order.
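The "test merge under reorder" item can be sketched with nothing but the standard library. The helper names are hypothetical, and a property-testing library would generate inputs more thoroughly; here a seeded shuffle with injected duplicates stands in for it.

```python
import random

def max_merge(a, b):
    """G-Counter style merge: element-wise max per replica."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def sum_merge(a, b):
    """Deliberately wrong merge: adding is commutative but not idempotent."""
    return {k: a.get(k, 0) + b.get(k, 0) for k in a.keys() | b.keys()}

def merge_all(merge, states):
    result = {}
    for state in states:
        result = merge(result, state)
    return result

def converges(merge, states, trials=200, seed=0):
    """Shuffle delivery order and inject duplicates; the result must not change."""
    rng = random.Random(seed)
    expected = merge_all(merge, states)
    for _ in range(trials):
        delivered = states + rng.choices(states, k=len(states))
        rng.shuffle(delivered)
        if merge_all(merge, delivered) != expected:
            return False
    return True

states = [{"A": 1}, {"A": 2, "B": 1}, {"B": 3}]
print(converges(max_merge, states))   # True
print(converges(sum_merge, states))   # False
```

Injecting duplicates matters as much as shuffling: `sum_merge` would pass a pure reordering test and only fails once a state is delivered twice.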