Guide
Event sourcing explained: event stores, projections, and CQRS
Most applications store current state directly: a row in PostgreSQL says
balance = 42, and an UPDATE overwrites it. Event
sourcing flips the model: the system of record is an append-only log
of domain events — AccountOpened, MoneyDeposited,
WithdrawalRequested — and current state is derived by replaying
those events. You gain a perfect audit trail, time travel, and flexible read models at
the cost of more moving parts. This guide explains how event stores work, how
CQRS (Command Query Responsibility Segregation) pairs with projections,
and when event sourcing is worth the complexity in a microservices or monolith backend.
CRUD state vs event-sourced state
In a classic CRUD service, the database row is truth. If you overwrite
status = 'shipped', the previous status = 'paid' is gone
unless you added a separate audit table. Debugging “how did we get here?”
means reconstructing from logs, backups, or guesswork.
Event sourcing treats each state change as an immutable fact about
something that already happened in the business domain. Events are named in past tense
(OrderPlaced, not PlaceOrder). The aggregate’s current
balance is computed by folding the event sequence. A stored balance is never the
authoritative record on the write path, though you may cache one in a projection for fast reads.
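To make the fold concrete, here is a minimal sketch in Python. The event classes and field names are illustrative assumptions, not a prescribed schema, and MoneyWithdrawn is a hypothetical event added only so the balance can go down.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class AccountOpened:
    account_id: str


@dataclass(frozen=True)
class MoneyDeposited:
    account_id: str
    amount: int  # minor units, e.g. cents


@dataclass(frozen=True)
class MoneyWithdrawn:  # hypothetical event, not named in this guide
    account_id: str
    amount: int


def apply(balance: int, event) -> int:
    """Return the new balance after one event; never mutates the event."""
    if isinstance(event, MoneyDeposited):
        return balance + event.amount
    if isinstance(event, MoneyWithdrawn):
        return balance - event.amount
    return balance  # AccountOpened and unknown events leave the balance unchanged


def current_balance(events) -> int:
    """Fold the ordered event sequence into the current balance."""
    return reduce(apply, events, 0)


history = [
    AccountOpened("acct-1"),
    MoneyDeposited("acct-1", 50),
    MoneyWithdrawn("acct-1", 8),
]
balance = current_balance(history)
```

The balance exists only as the result of the fold; nothing in the history was overwritten to produce it.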
What you gain
- Auditability — regulators and support teams see exactly what happened and when.
- Temporal queries — replay events as-of any timestamp (“what was inventory on March 1?”).
- Multiple read models — one event stream can feed a SQL reporting table, a search index, and a cache without triple-writing on every command.
- Natural integration — publishing events to a message queue for downstream consumers is a first-class side effect.
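A temporal query can be sketched as a replay that stops at a cutoff. The event names, tuple layout, and dates below are made up for illustration.

```python
from datetime import datetime, timezone

# Each entry: (recorded_at, event_type, quantity_delta). All values are invented.
inventory_events = [
    (datetime(2024, 2, 20, tzinfo=timezone.utc), "StockReceived", 100),
    (datetime(2024, 2, 27, tzinfo=timezone.utc), "StockShipped", -30),
    (datetime(2024, 3, 5, tzinfo=timezone.utc), "StockShipped", -20),
]


def inventory_as_of(events, as_of):
    """Replay only the events recorded at or before the cutoff."""
    return sum(delta for recorded_at, _kind, delta in events if recorded_at <= as_of)


on_march_1 = inventory_as_of(inventory_events, datetime(2024, 3, 1, tzinfo=timezone.utc))
```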
What you pay
- Complexity — versioning, snapshots, eventual consistency on read models, and operational tooling.
- Learning curve — teams accustomed to ORM entities must think in commands, events, and aggregates.
- Not a default — simple CRUD with good audit columns is enough for most products; event sourcing targets the minority with hard requirements for history, audit, or replay.
Core building blocks
The event store
An event store persists events in append-only streams, usually keyed
by aggregate ID (e.g. order-7f3a). Each append is atomic per stream. If two
writers read the same version and both try to append, the first wins and the second gets an
optimistic concurrency conflict; it reloads the stream, re-runs the command against the new
state, and retries. Dedicated stores such as EventStoreDB and Axon Server optimize for stream
reads and subscriptions; Marten layers an event store on PostgreSQL, and teams also implement
streams on PostgreSQL directly with an events table and a monotonically increasing
stream_version column per stream.
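The version check can be sketched with a toy in-memory store. The class and method names are assumptions for illustration and not the API of any real event store. The sketch is also not thread-safe; a real store enforces the check atomically.

```python
from collections import defaultdict


class ConcurrencyError(Exception):
    """Raised when the stream has moved past the version the writer read."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = defaultdict(list)

    def read(self, stream_id):
        """Return (events, version); version is the number of events in the stream."""
        events = list(self._streams[stream_id])
        return events, len(events)

    def append(self, stream_id, new_events, expected_version):
        """Append only if nobody else has appended since the caller read the stream."""
        stream = self._streams[stream_id]
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, found {len(stream)}"
            )
        stream.extend(new_events)
        return len(stream)


store = InMemoryEventStore()
store.append("order-7f3a", [{"type": "OrderPlaced"}], expected_version=0)

# Two writers read the same version, then both try to append.
_, version_seen = store.read("order-7f3a")
store.append("order-7f3a", [{"type": "OrderPaid"}], expected_version=version_seen)
try:
    store.append("order-7f3a", [{"type": "OrderCancelled"}], expected_version=version_seen)
    conflict = False
except ConcurrencyError:
    conflict = True  # the second writer must reload, re-validate, and retry
```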
Aggregates and commands
An aggregate is the consistency boundary: one order, one bank account,
one shopping cart. A command (PlaceOrder) arrives, the
aggregate validates business rules against its in-memory state (rebuilt from prior
events), and either rejects or emits new domain events. Commands can
fail; events cannot be “undone” — corrections are new compensating events
(OrderCancelled), mirroring how accountants use reversing entries rather
than deleting ledger lines.
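A rough sketch of that command flow follows. The Order class, its rules, and the dictionary event shape are illustrative assumptions. Commands return new events or raise; state changes only by applying events.

```python
class CommandRejected(Exception):
    """A command failed validation; no event is emitted."""


class Order:
    def __init__(self):
        self.status = None  # None means the order does not exist yet

    @classmethod
    def from_history(cls, events):
        """Rebuild in-memory state by applying prior events in order."""
        order = cls()
        for event in events:
            order.apply(event)
        return order

    def apply(self, event):
        if event["type"] == "OrderPlaced":
            self.status = "placed"
        elif event["type"] == "OrderCancelled":
            self.status = "cancelled"

    def place(self, order_id, items):
        if self.status is not None:
            raise CommandRejected("order already exists")
        if not items:
            raise CommandRejected("order needs at least one item")
        return [{"type": "OrderPlaced", "order_id": order_id, "items": items}]

    def cancel(self, order_id):
        if self.status != "placed":
            raise CommandRejected("only a placed order can be cancelled")
        # A compensating event: OrderPlaced stays in the history untouched.
        return [{"type": "OrderCancelled", "order_id": order_id}]


history = []
history += Order.from_history(history).place("order-7f3a", ["sku-1"])
history += Order.from_history(history).cancel("order-7f3a")

try:
    Order.from_history(history).cancel("order-7f3a")
    second_cancel_rejected = False
except CommandRejected:
    second_cancel_rejected = True
```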
Projections and CQRS
CQRS separates the write model (event-sourced aggregate) from read models (projections) optimized for queries. A projection consumer listens to the event stream and updates a denormalized table: line items, customer name, and totals flattened for a dashboard. Read models are eventually consistent — milliseconds behind the write path in healthy systems, but not instantaneous. UI copy should not promise “your order appears instantly” if the list view reads a projection lagging the command handler.
This split pairs naturally with event-driven architecture: the event bus that feeds projections is the same infrastructure that notifies warehouses, sends email, or updates analytics.
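As a sketch of a projection, the function below flattens order events into a dashboard-style read model held in a plain dictionary. The event fields (customer_name, line_items, qty, unit_price) are assumptions for illustration. A real projector would write to a table or index instead.

```python
def project_order_summary(read_model, event):
    """Apply one event to a denormalized read model keyed by order_id."""
    kind = event["type"]
    if kind == "OrderPlaced":
        items = event["line_items"]
        read_model[event["order_id"]] = {
            "customer_name": event["customer_name"],
            "line_items": list(items),
            "total": sum(item["qty"] * item["unit_price"] for item in items),
            "status": "placed",
        }
    elif kind == "OrderCancelled" and event["order_id"] in read_model:
        read_model[event["order_id"]]["status"] = "cancelled"
    return read_model


stream = [
    {
        "type": "OrderPlaced",
        "order_id": "order-7f3a",
        "customer_name": "Ada",
        "line_items": [
            {"sku": "sku-1", "qty": 2, "unit_price": 500},
            {"sku": "sku-2", "qty": 1, "unit_price": 250},
        ],
    },
    {"type": "OrderCancelled", "order_id": "order-7f3a"},
]

dashboard = {}
for event in stream:
    project_order_summary(dashboard, event)
```

Queries read the dashboard structure and never touch the stream, which is the separation CQRS describes.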
Replay, snapshots, and performance
Rebuilding an aggregate from event 1 through event 50,000 on every command is too slow. Production systems use snapshots: periodically persist the folded state at version N, then replay only events after N on load. Snapshot frequency trades storage for CPU — hourly for hot aggregates, every 100 events for long-lived accounts.
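A minimal sketch of snapshot-based loading, assuming a snapshot is a small record holding the folded state and the version it was taken at. The list slice stands in for "read only events after N"; a real store would run that as a range query.

```python
def apply(balance, event):
    """Fold step: deposits carry positive amounts, withdrawals negative."""
    return balance + event["amount"]


def load_balance(events, snapshot=None):
    """Start from the snapshot (if any) and replay only the events after it.

    events:   the full ordered stream as a list
    snapshot: {"version": N, "state": balance} or None
    Returns (state, version, replayed_count).
    """
    version = snapshot["version"] if snapshot else 0
    state = snapshot["state"] if snapshot else 0
    tail = events[version:]  # stands in for a "version > N" range read
    for event in tail:
        state = apply(state, event)
    return state, len(events), len(tail)


def should_snapshot(version, every=100):
    """Count-based policy: snapshot at every `every`-th event."""
    return version > 0 and version % every == 0


events = [{"amount": 1} for _ in range(250)]
snapshot = {"version": 200, "state": 200}  # taken earlier, at event 200

with_snapshot = load_balance(events, snapshot)
full_replay = load_balance(events)
```

Both loads reach the same state and version; the snapshot path applies only the last 50 events.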
Global replay and new projections
When you add a new read model six months after launch, you replay history from the event store (or a compacted archive) through a new projector. This is powerful — you backfill search indexes without migrating legacy tables — but replay jobs must be throttled so they do not starve live traffic. Many teams keep cold storage (S3 parquet exports) alongside the hot event store for cheap bulk replays.
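Throttling can be as simple as pausing between batches. The sketch below is one possible approach; the batch size and pause length are arbitrary assumptions to tune against your own load. The sleep function is injectable so the example runs without waiting.

```python
import time


def replay_in_batches(events, projector, batch_size=500, pause_seconds=0.05, sleep=time.sleep):
    """Feed historical events to a new projector, pausing between batches."""
    processed = 0
    for start in range(0, len(events), batch_size):
        for event in events[start:start + batch_size]:
            projector(event)
            processed += 1
        if start + batch_size < len(events):
            sleep(pause_seconds)  # yield capacity to live traffic before the next batch
    return processed


seen = []
pauses = []  # records requested pauses instead of actually sleeping
processed = replay_in_batches(list(range(1200)), seen.append, batch_size=500, sleep=pauses.append)
```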
Idempotency in projectors
Projectors will see duplicate events under at-least-once delivery from the bus.
Design projections to be idempotent: store last_processed_event_id per projector,
or use upserts keyed by event ID, so that reprocessing is harmless.
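A sketch of the last_processed_event_id approach, assuming event IDs are monotonically increasing positions in the stream the projector consumes. That is an assumption, not something every bus guarantees. With a database-backed read model, the balance update and the checkpoint should be written in the same transaction.

```python
class BalanceProjector:
    def __init__(self):
        self.balances = {}
        self.last_processed_event_id = 0

    def handle(self, event):
        """Apply the event once; return False if it was already processed."""
        if event["event_id"] <= self.last_processed_event_id:
            return False  # duplicate delivery: skip without side effects
        account = event["account_id"]
        self.balances[account] = self.balances.get(account, 0) + event["amount"]
        self.last_processed_event_id = event["event_id"]
        return True


projector = BalanceProjector()
deposit = {"event_id": 1, "account_id": "acct-1", "amount": 50}

first = projector.handle(deposit)
redelivered = projector.handle(deposit)  # the bus delivers the same event again
```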
Event versioning and schema evolution
Business facts change shape. AddressAdded v1 might store a single
line1 string; v2 adds country_code. You cannot mutate old
events in the store — that destroys audit integrity. Strategies:
- Upcasting on read — when replaying, map v1 payloads to v2 in code before applying to the aggregate.
- Parallel event types — introduce AddressAddedV2 for new writes; projectors handle both.
- Weak schema events — JSON blobs with optional fields; works early, painful at scale without discipline.
Version every event type explicitly in metadata (event_type,
schema_version). Integration tests should replay golden fixture streams
through projectors after each deploy — a breaking rename in one event poisons every
downstream read model silently until someone notices stale dashboards.
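Upcasting on read might look like the sketch below. The envelope keys (event_type, schema_version, payload) follow the metadata named above. Defaulting country_code to None for old events is an assumption; your domain may call for a different default.

```python
def upcast_address_added(event):
    """Return a v2-shaped AddressAdded event; the stored event is never modified."""
    if event["event_type"] != "AddressAdded":
        return event
    if event["schema_version"] == 1:
        # v1 had only line1; v2 adds country_code, unknown for old events.
        payload = {**event["payload"], "country_code": None}
        return {**event, "schema_version": 2, "payload": payload}
    return event


stored_v1 = {
    "event_type": "AddressAdded",
    "schema_version": 1,
    "payload": {"line1": "1 Main St"},
}
upcasted = upcast_address_added(stored_v1)
```

The function builds a new dictionary, so the stored v1 event is left as written.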
Consistency, ordering, and distributed boundaries
Within one aggregate stream, events are totally ordered. Across
aggregates, ordering is undefined unless you design correlation IDs and process managers
(sagas) that orchestrate multi-step workflows. A payment service emitting
PaymentCaptured and an inventory service emitting
StockReserved on separate streams may arrive at a projector in either
order. Projectors must tolerate gaps and either arrival order, or the workflow must run
through a saga state machine that your consistency model documents.
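One way to tolerate either arrival order is a small process manager that records which events it has seen per correlation ID and acts only once both are present. The class name and event fields below are illustrative assumptions.

```python
class OrderFulfillmentSaga:
    """Waits for both events per correlation ID, in whichever order they arrive."""

    REQUIRED = frozenset({"PaymentCaptured", "StockReserved"})

    def __init__(self):
        self.seen = {}   # correlation_id -> set of event types received so far
        self.ready = []  # correlation IDs that may proceed to shipping

    def handle(self, event):
        if event["type"] not in self.REQUIRED:
            return
        correlation_id = event["correlation_id"]
        seen = self.seen.setdefault(correlation_id, set())
        seen.add(event["type"])
        if seen == self.REQUIRED and correlation_id not in self.ready:
            self.ready.append(correlation_id)


saga = OrderFulfillmentSaga()
# Order A: payment arrives first. Order B: stock arrives first.
saga.handle({"type": "PaymentCaptured", "correlation_id": "order-a"})
saga.handle({"type": "StockReserved", "correlation_id": "order-b"})
waiting = list(saga.ready)  # nothing is ready yet
saga.handle({"type": "PaymentCaptured", "correlation_id": "order-b"})
saga.handle({"type": "StockReserved", "correlation_id": "order-a"})
```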
Do not event-source everything in a bounded context that still needs cross-table ACID in one commit. Either keep related invariants inside one aggregate (larger streams, simpler consistency) or accept sagas with compensating events and visible intermediate states. Splitting an order and its line items into independent streams without an orchestration story is a common source of “ghost orders” in production.
When event sourcing fits — and when it does not
Good fits
- Financial ledgers — balances, holds, settlements; auditors expect immutable journals.
- Workflow-heavy domains — insurance claims, loan origination, multiplayer game state with replay for dispute resolution.
- Collaborative editing with history — undo stacks and blame views are natural projections.
- Systems that already publish domain events — event sourcing formalizes what you were approximating with outbox tables.
Poor fits
- Simple CRUD admin panels — extra ceremony with no audit requirement.
- High-churn telemetry — append-only at billions of events/day needs specialized time-series stores, not domain event sourcing.
- Teams without ops maturity — if you lack observability and replay runbooks, debugging projection lag will hurt more than mutable rows would.
Common mistakes
- Storing commands as events — UpdateEmail is not a fact; CustomerEmailChanged is.
- God aggregates — one stream per entire company creates a serialization bottleneck; split by true invariants.
- Querying the event store directly — ad-hoc SQL on raw events bypasses projections and breaks when you rename fields.
- No monitoring on projection lag — alert when read models fall more than N seconds behind the write head.
- Deleting or editing events — GDPR erasure needs explicit tombstone events and projection redaction policies, not silent deletes.
- Skipping idempotency on consumers — duplicate PaymentCaptured events double-count payments in naive projectors.
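For the projection lag item above, a lag check can be sketched as the gap between the newest event in the store and the newest event a projector has applied. The function names and the five-second threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def projection_lag(write_head_at, last_projected_at):
    """Time between the newest stored event and the newest projected event."""
    return max(write_head_at - last_projected_at, timedelta(0))


def should_alert(lag, threshold=timedelta(seconds=5)):
    return lag > threshold


head = datetime(2024, 3, 1, 12, 0, 30, tzinfo=timezone.utc)
projected = datetime(2024, 3, 1, 12, 0, 18, tzinfo=timezone.utc)

lag = projection_lag(head, projected)
alert = should_alert(lag)
```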
Production checklist
- Define aggregate boundaries around true invariants, not database tables.
- Name events in past tense with explicit schema versions in metadata.
- Implement optimistic concurrency on stream appends (expected_version).
- Build at least one projection per user-facing query path; never query raw streams in the UI.
- Snapshot long-lived aggregates; benchmark replay time after 10k and 100k events.
- Make every projector idempotent; track processed event IDs.
- Export events to cold storage for bulk replays and disaster recovery.
- Monitor projection lag, append latency, conflict retry rate, and replay job duration.
- Document saga/compensation flows for cross-aggregate workflows.
- Load-test command handlers under concurrent writes to the same aggregate.
Event sourcing is not a microservices requirement — many successful event-sourced systems live inside a single deployable service with one PostgreSQL database. Adopt it when immutable history and flexible projections solve a problem you already feel in production, not because the architecture diagram looks sophisticated.