
Event sourcing explained: event stores, projections, and CQRS

Most applications store current state directly: a row in PostgreSQL says balance = 42, and an UPDATE overwrites it. Event sourcing flips the model: the system of record is an append-only log of domain events — AccountOpened, MoneyDeposited, WithdrawalRequested — and current state is derived by replaying those events. You gain a perfect audit trail, time travel, and flexible read models at the cost of more moving parts. This guide explains how event stores work, how CQRS (Command Query Responsibility Segregation) pairs with projections, and when event sourcing is worth the complexity in a microservices or monolith backend.

CRUD state vs event-sourced state

In a classic CRUD service, the database row is truth. If you overwrite status = 'shipped', the previous status = 'paid' is gone unless you added a separate audit table. Debugging “how did we get here?” means reconstructing from logs, backups, or guesswork.

Event sourcing treats each state change as an immutable fact about something that has already happened in the business domain. Events are named in the past tense (OrderPlaced, not PlaceOrder). The aggregate’s current state, such as an account balance, is computed by folding the event sequence. It is never stored as the authoritative record on the write path, though you may cache it in a projection for fast reads.
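
As a minimal sketch of what "folding the event sequence" means, the snippet below derives a balance from a short history. AccountOpened and MoneyDeposited come from the introduction; MoneyWithdrawn, the field names, and the amounts are assumptions made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class AccountOpened:
    account_id: str


@dataclass(frozen=True)
class MoneyDeposited:
    amount: int


@dataclass(frozen=True)
class MoneyWithdrawn:  # hypothetical event name, not from the guide
    amount: int


def apply(balance: int, event: object) -> int:
    """Return the balance after one event; this is the fold step."""
    if isinstance(event, MoneyDeposited):
        return balance + event.amount
    if isinstance(event, MoneyWithdrawn):
        return balance - event.amount
    return balance  # AccountOpened (and anything unrecognised) leaves it unchanged


history = [AccountOpened("acct-1"), MoneyDeposited(50), MoneyWithdrawn(8)]

# Current state is a left fold over the history, starting from zero.
balance = reduce(apply, history, 0)
print(balance)  # 42
```

Nothing in this sketch ever overwrites a stored value: a new balance only appears because a new event was appended to the history.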

What you gain

Auditability — regulators and support teams see exactly what happened and when.
Temporal queries — replay events as-of any timestamp (“what was inventory on March 1?”).
Multiple read models — one event stream can feed a SQL reporting table, a search index, and a cache without triple-writing on every command.
Natural integration — publishing events to a message queue for downstream consumers is a first-class side effect.
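
The temporal-query benefit can be sketched as a fold that stops at a cutoff. The event names (StockReceived, StockShipped), the recorded_at field, and the quantities below are hypothetical; the sketch also assumes every event carries a trustworthy timestamp.

```python
from datetime import datetime, timezone


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical inventory stream for one SKU.
events = [
    {"type": "StockReceived", "quantity": 100, "recorded_at": utc(2024, 2, 10)},
    {"type": "StockShipped", "quantity": 30, "recorded_at": utc(2024, 2, 25)},
    {"type": "StockShipped", "quantity": 20, "recorded_at": utc(2024, 3, 5)},
]


def inventory_as_of(events: list[dict], cutoff: datetime) -> int:
    """Fold only the events recorded at or before the cutoff."""
    level = 0
    for event in events:
        if event["recorded_at"] > cutoff:
            continue  # this fact had not happened yet at the cutoff
        if event["type"] == "StockReceived":
            level += event["quantity"]
        elif event["type"] == "StockShipped":
            level -= event["quantity"]
    return level


print(inventory_as_of(events, utc(2024, 3, 1)))   # 70
print(inventory_as_of(events, utc(2024, 3, 31)))  # 50
```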

What you pay

Complexity — versioning, snapshots, eventual consistency on read models, and operational tooling.
Learning curve — teams accustomed to ORM entities must think in commands, events, and aggregates.
Not a default — simple CRUD with good audit columns is enough for most products; event sourcing targets the minority with hard audit, replay, or integration requirements.

Core building blocks

The event store

An event store persists events in append-only streams, usually keyed by aggregate ID (e.g. order-7f3a). Each append is atomic per stream and guarded by optimistic concurrency: if two writers read the same stream version and both try to append, the second append fails with a conflict, and that writer must reload the stream, re-run its command against the new state, and retry. Dedicated stores (EventStoreDB, Axon Server) optimize for stream reads and subscriptions; libraries such as Marten layer an event store on top of PostgreSQL, and teams also implement streams by hand on PostgreSQL with an events table and a monotonically increasing stream_version column.
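
The events-table approach can be sketched with sqlite3 from the Python standard library standing in for PostgreSQL. The column names other than stream_version, the ConcurrencyError class, and the append and current_version helpers are assumptions for illustration, not the API of any particular store.

```python
import json
import sqlite3


class ConcurrencyError(Exception):
    """Raised when the stream moved on since the writer last read it."""


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        stream_id      TEXT    NOT NULL,
        stream_version INTEGER NOT NULL,
        event_type     TEXT    NOT NULL,
        payload        TEXT    NOT NULL,
        PRIMARY KEY (stream_id, stream_version)
    )
    """
)


def current_version(conn: sqlite3.Connection, stream_id: str) -> int:
    row = conn.execute(
        "SELECT COALESCE(MAX(stream_version), 0) FROM events WHERE stream_id = ?",
        (stream_id,),
    ).fetchone()
    return row[0]


def append(conn: sqlite3.Connection, stream_id: str, expected_version: int,
           events: list[tuple[str, dict]]) -> int:
    """Append (event_type, payload) pairs atomically; return the new version."""
    try:
        with conn:  # commits on success, rolls back on any exception
            if current_version(conn, stream_id) != expected_version:
                raise ConcurrencyError(f"{stream_id}: expected v{expected_version}")
            for offset, (event_type, payload) in enumerate(events, start=1):
                conn.execute(
                    "INSERT INTO events VALUES (?, ?, ?, ?)",
                    (stream_id, expected_version + offset, event_type,
                     json.dumps(payload)),
                )
    except sqlite3.IntegrityError as exc:
        # The primary key is the backstop if two writers race past the check.
        raise ConcurrencyError(f"{stream_id}: version conflict") from exc
    return expected_version + len(events)


append(conn, "order-7f3a", 0, [("OrderPlaced", {"total_cents": 4200})])
seen_by_a = seen_by_b = current_version(conn, "order-7f3a")  # both writers read v1
append(conn, "order-7f3a", seen_by_a, [("OrderPaid", {})])   # writer A wins
try:
    append(conn, "order-7f3a", seen_by_b, [("OrderCancelled", {})])
except ConcurrencyError:
    # Writer B must reload the stream and re-run its command before retrying.
    print("conflict, reload and retry")
```

The explicit version check rejects stale writers and gaps; the primary key on (stream_id, stream_version) is what would still hold if two transactions passed the check at the same moment on a real multi-connection database.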

Aggregates and commands

An aggregate is the consistency boundary: one order, one bank account, one shopping cart. A command (PlaceOrder) arrives, the aggregate validates business rules against its in-memory state (rebuilt from prior events), and either rejects or emits new domain events. Commands can fail; events cannot be “undone” — corrections are new compensating events (OrderCancelled), mirroring how accountants use reversing entries rather than deleting ledger lines.
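
One way to picture the command flow is an aggregate whose command handlers return events instead of mutating state. This is a sketch under assumptions: the Order class, its status values, the CommandRejected exception, and the dictionary event shape are all hypothetical.

```python
from dataclasses import dataclass


class CommandRejected(Exception):
    """A business rule refused the command; no event is emitted."""


@dataclass
class Order:
    status: str = "new"

    def apply(self, event: dict) -> None:
        """Update in-memory state from a fact that already happened."""
        if event["type"] == "OrderPlaced":
            self.status = "placed"
        elif event["type"] == "OrderCancelled":
            self.status = "cancelled"

    @classmethod
    def from_history(cls, events: list[dict]) -> "Order":
        order = cls()
        for event in events:
            order.apply(event)
        return order

    # Command handlers validate, then return new events. They never mutate state.
    def place(self, items: list[str]) -> list[dict]:
        if self.status != "new":
            raise CommandRejected("order was already placed")
        if not items:
            raise CommandRejected("an order needs at least one item")
        return [{"type": "OrderPlaced", "items": items}]

    def cancel(self, reason: str) -> list[dict]:
        if self.status != "placed":
            raise CommandRejected(f"cannot cancel an order that is {self.status}")
        return [{"type": "OrderCancelled", "reason": reason}]


stream: list[dict] = []
stream += Order.from_history(stream).place(["sku-1", "sku-2"])
stream += Order.from_history(stream).cancel("customer changed their mind")
print([event["type"] for event in stream])  # ['OrderPlaced', 'OrderCancelled']
```

The cancellation does not remove OrderPlaced from the stream; it is a second fact appended after the first, which is the compensating-event idea in miniature.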

Projections and CQRS

CQRS separates the write model (event-sourced aggregate) from read models (projections) optimized for queries. A projection consumer listens to the event stream and updates a denormalized table: line items, customer name, and totals flattened for a dashboard. Read models are eventually consistent — milliseconds behind the write path in healthy systems, but not instantaneous. UI copy should not promise “your order appears instantly” if the list view reads a projection lagging the command handler.
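
A projection of this kind might look like the sketch below, where a plain dictionary stands in for the denormalized dashboard table. The event fields (order_id, customer_name, lines, unit_price_cents) are assumptions for illustration.

```python
# Read model: one flattened row per order, keyed by order ID.
order_summaries: dict[str, dict] = {}


def project(event: dict) -> None:
    """Update the denormalized dashboard row for one event."""
    order_id = event["order_id"]
    if event["type"] == "OrderPlaced":
        order_summaries[order_id] = {
            "customer_name": event["customer_name"],
            "item_count": sum(line["quantity"] for line in event["lines"]),
            "total_cents": sum(
                line["quantity"] * line["unit_price_cents"] for line in event["lines"]
            ),
            "status": "placed",
        }
    elif event["type"] == "OrderCancelled":
        order_summaries[order_id]["status"] = "cancelled"


for event in [
    {"type": "OrderPlaced", "order_id": "order-7f3a", "customer_name": "Ada",
     "lines": [{"sku": "sku-1", "quantity": 2, "unit_price_cents": 1500},
               {"sku": "sku-2", "quantity": 1, "unit_price_cents": 1200}]},
    {"type": "OrderCancelled", "order_id": "order-7f3a"},
]:
    project(event)

print(order_summaries["order-7f3a"])
```

The dashboard query becomes a single key lookup; the cost is that the row is only as fresh as the last event the projector has processed.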

This split pairs naturally with event-driven architecture: the event bus that feeds projections is the same infrastructure that notifies warehouses, sends email, or updates analytics.

Replay, snapshots, and performance

Rebuilding an aggregate from event 1 through event 50,000 on every command is too slow. Production systems use snapshots: periodically persist the folded state at version N, then replay only events after N on load. Snapshot frequency trades storage for CPU — hourly for hot aggregates, every 100 events for long-lived accounts.
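
The snapshot idea can be sketched in a few lines. To keep it short, each event here is just a signed amount, and the AccountStream class and the every-100-events policy are assumptions rather than a recommended configuration.

```python
SNAPSHOT_EVERY = 100  # assumed policy: snapshot every 100 events


def fold(balance: int, events: list[int]) -> int:
    """Each event is a signed amount, so folding is just a sum."""
    return balance + sum(events)


class AccountStream:
    def __init__(self) -> None:
        self.events: list[int] = []
        self.snapshot = (0, 0)  # (version the snapshot covers, folded balance)

    def load(self) -> tuple[int, int]:
        """Return (balance, events replayed), starting from the snapshot."""
        version, balance = self.snapshot
        tail = self.events[version:]
        return fold(balance, tail), len(tail)

    def append(self, amount: int) -> None:
        self.events.append(amount)
        if len(self.events) % SNAPSHOT_EVERY == 0:
            balance, _ = self.load()
            self.snapshot = (len(self.events), balance)


account = AccountStream()
for _ in range(250):
    account.append(1)

print(account.snapshot)  # (200, 200)
print(account.load())    # (250, 50)
```

Loading replays 50 events instead of 250. The snapshot is a cache: it can be deleted and rebuilt from the events at any time, so the event stream stays the system of record.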

Global replay and new projections

When you add a new read model six months after launch, you replay history from the event store (or a compacted archive) through a new projector. This is powerful — you backfill search indexes without migrating legacy tables — but replay jobs must be throttled so they do not starve live traffic. Many teams keep cold storage (S3 parquet exports) alongside the hot event store for cheap bulk replays.
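
A throttled backfill could be sketched as batch processing with a pause between batches. The batch size, the pause length, and the replay and in_batches helpers are hypothetical; a production job would more likely throttle on measured database load than on a fixed sleep.

```python
import time
from typing import Callable, Iterable, Iterator


def in_batches(events: Iterable[dict], size: int) -> Iterator[list[dict]]:
    batch: list[dict] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def replay(
    events: Iterable[dict],
    projector: Callable[[dict], None],
    batch_size: int = 500,
    pause_seconds: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Feed history through a new projector, pausing between batches."""
    processed = 0
    for batch in in_batches(events, batch_size):
        for event in batch:
            projector(event)
        processed += len(batch)
        sleep(pause_seconds)  # crude throttle so live traffic keeps its share
    return processed


search_index: dict[str, str] = {}
history = (
    {"order_id": f"order-{n}", "customer_name": f"customer {n}"} for n in range(1200)
)
count = replay(
    history,
    lambda e: search_index.update({e["order_id"]: e["customer_name"]}),
    pause_seconds=0.001,
)
print(count, len(search_index))  # 1200 1200
```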

Idempotency in projectors

Projectors will see duplicate events after at-least-once delivery from the bus. Design projections to be idempotent: store last_processed_event_id per projector, or use upserts keyed by event ID so reprocessing is harmless.
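
A minimal sketch of an idempotent projector follows, tracking processed event IDs in memory. The RevenueProjector class and the event_id and amount_cents fields are assumptions for illustration.

```python
class RevenueProjector:
    """Sums captured payments; safe to feed the same event more than once."""

    def __init__(self) -> None:
        self.total_cents = 0
        self.processed_event_ids: set[str] = set()

    def handle(self, event: dict) -> bool:
        """Apply the event once; return False if it was a duplicate."""
        if event["event_id"] in self.processed_event_ids:
            return False
        self.total_cents += event["amount_cents"]
        self.processed_event_ids.add(event["event_id"])
        return True


projector = RevenueProjector()
deliveries = [
    {"event_id": "evt-1", "type": "PaymentCaptured", "amount_cents": 1000},
    {"event_id": "evt-2", "type": "PaymentCaptured", "amount_cents": 2500},
    {"event_id": "evt-2", "type": "PaymentCaptured", "amount_cents": 2500},  # redelivered
    {"event_id": "evt-1", "type": "PaymentCaptured", "amount_cents": 1000},  # redelivered
]
applied = [projector.handle(event) for event in deliveries]
print(applied, projector.total_cents)  # [True, True, False, False] 3500
```

In a database-backed projector, the read-model update and the record of the processed event ID would need to commit in the same transaction; otherwise a crash between the two writes reintroduces the duplicate.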

Event versioning and schema evolution

Business facts change shape. AddressAdded v1 might store a single line1 string; v2 adds country_code. You cannot mutate old events in the store — that destroys audit integrity. Strategies:

Upcasting on read — when replaying, map v1 payloads to v2 in code before applying to the aggregate.
Parallel event types — introduce AddressAddedV2 for new writes; projectors handle both.
Weak schema events — JSON blobs with optional fields; works early, painful at scale without discipline.

Version every event type explicitly in metadata (event_type, schema_version). Integration tests should replay golden fixture streams through projectors on every build, before each deploy: a breaking rename in one event silently corrupts every downstream read model, and nobody notices until a dashboard goes stale.
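
Upcasting on read, the first strategy listed above, might be sketched as a pure function applied to each stored event before the aggregate or projector sees it. The dictionary layout is an assumption, and filling country_code with None for historical events is one possible choice, not the only one.

```python
def upcast_address_added(event: dict) -> dict:
    """Return a v2-shaped AddressAdded event without touching the stored one."""
    if event["event_type"] != "AddressAdded" or event["schema_version"] >= 2:
        return event
    payload = dict(event["payload"])  # copy, so the stored event stays immutable
    payload["country_code"] = None    # assumption: unknown for v1 history
    return {**event, "schema_version": 2, "payload": payload}


stored_v1 = {
    "event_type": "AddressAdded",
    "schema_version": 1,
    "payload": {"line1": "1 Main St"},
}
upcasted = upcast_address_added(stored_v1)
print(upcasted["payload"])   # {'line1': '1 Main St', 'country_code': None}
print(stored_v1["payload"])  # {'line1': '1 Main St'}
```

Downstream code then only has to understand the v2 shape, while the bytes in the store remain exactly what was written at the time.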

Consistency, ordering, and distributed boundaries

Within one aggregate stream, events are totally ordered. Across aggregates, ordering is undefined unless you design for it with correlation IDs and process managers (sagas) that orchestrate multi-step workflows. A payment service emitting PaymentCaptured and an inventory service emitting StockReserved on separate streams may reach a projector in either order, so projectors must tolerate either arrival order (including one event showing up well before the other), or the workflow must be driven by a saga state machine that is documented in your consistency model.

Do not event-source everything in a bounded context that still needs cross-table ACID in one commit. Either keep related invariants inside one aggregate (larger streams, simpler consistency) or accept sagas with compensating events and visible intermediate states. Splitting an order and its line items into independent streams without an orchestration story is a common source of “ghost orders” in production.
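
A saga that tolerates either arrival order can be sketched as a small state machine. PaymentCaptured and StockReserved come from the text above; PaymentFailed, the ShipOrder and ReleaseStock commands, and the OrderFulfilmentSaga class are hypothetical names.

```python
class OrderFulfilmentSaga:
    """Process manager for one order; reacts to events from two services."""

    def __init__(self, order_id: str) -> None:
        self.order_id = order_id  # doubles as the correlation ID
        self.payment_captured = False
        self.payment_failed = False
        self.stock_reserved = False
        self.state = "waiting"
        self.commands: list[str] = []  # commands the saga wants dispatched

    def handle(self, event_type: str) -> None:
        if self.state != "waiting":
            return  # late or duplicate events after a decision are ignored
        if event_type == "PaymentCaptured":
            self.payment_captured = True
        elif event_type == "PaymentFailed":
            self.payment_failed = True
        elif event_type == "StockReserved":
            self.stock_reserved = True

        # Decisions depend on what is known so far, not on arrival order.
        if self.payment_failed and self.stock_reserved:
            self.commands.append("ReleaseStock")  # compensating action
            self.state = "compensated"
        elif self.payment_captured and self.stock_reserved:
            self.commands.append("ShipOrder")
            self.state = "shipping"


def run(*event_types: str) -> list[str]:
    saga = OrderFulfilmentSaga("order-7f3a")
    for event_type in event_types:
        saga.handle(event_type)
    return saga.commands


print(run("PaymentCaptured", "StockReserved"))  # ['ShipOrder']
print(run("StockReserved", "PaymentCaptured"))  # ['ShipOrder']
print(run("PaymentFailed", "StockReserved"))    # ['ReleaseStock']
```

While the saga is still in the waiting state, the order is in one of the visible intermediate states mentioned above, and the UI and support tooling need to be able to show it.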

When event sourcing fits — and when it does not

Good fits

Financial ledgers — balances, holds, settlements; auditors expect immutable journals.
Workflow-heavy domains — insurance claims, loan origination, multiplayer game state with replay for dispute resolution.
Collaborative editing with history — undo stacks and blame views are natural projections.
Systems that already publish domain events — event sourcing formalizes what you were approximating with outbox tables.

Poor fits

Simple CRUD admin panels — extra ceremony with no audit requirement.
High-churn telemetry — append-only at billions of events/day needs specialized time-series stores, not domain event sourcing.
Teams without ops maturity — if you lack observability and replay runbooks, debugging projection lag will hurt more than mutable rows would.

Common mistakes

Storing commands as events — UpdateEmail is not a fact; CustomerEmailChanged is.
God aggregates — a single stream for an entire company creates a serialization bottleneck; split by true invariants.
Querying the event store directly — ad-hoc SQL on raw events bypasses projections and breaks when you rename fields.
No monitoring on projection lag — alert when read models fall more than N seconds behind the write head.
Deleting or editing events — GDPR erasure needs explicit tombstone events and projection redaction policies, not silent deletes.
Skipping idempotency on consumers — a redelivered PaymentCaptured event double-counts revenue in a naive projector, and can double-charge or double-ship when a naive consumer triggers side effects.
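
The projection-lag alert from the list above could be computed roughly as follows, assuming the system can report the timestamp of the newest stored event and of the newest event a projector has applied. The function names and the 5-second threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def projection_lag_seconds(write_head_at: datetime, last_projected_at: datetime) -> float:
    """Seconds between the newest stored event and the newest projected one."""
    return max(0.0, (write_head_at - last_projected_at).total_seconds())


def should_alert(lag_seconds: float, threshold_seconds: float) -> bool:
    return lag_seconds > threshold_seconds


write_head_at = datetime(2024, 3, 1, 12, 0, 30, tzinfo=timezone.utc)
last_projected_at = write_head_at - timedelta(seconds=12)

lag = projection_lag_seconds(write_head_at, last_projected_at)
print(lag, should_alert(lag, threshold_seconds=5))  # 12.0 True
```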

Production checklist

Define aggregate boundaries around true invariants, not database tables.
Name events in past tense with explicit schema versions in metadata.
Implement optimistic concurrency on stream appends (expected_version).
Build at least one projection per user-facing query path; never query raw streams in the UI.
Snapshot long-lived aggregates; benchmark replay time after 10k and 100k events.
Make every projector idempotent; track processed event IDs.
Export events to cold storage for bulk replays and disaster recovery.
Monitor projection lag, append latency, conflict retry rate, and replay job duration.
Document saga/compensation flows for cross-aggregate workflows.
Load-test command handlers under concurrent writes to the same aggregate.

Event sourcing is not a microservices requirement — many successful event-sourced systems live inside a single deployable service with one PostgreSQL database. Adopt it when immutable history and flexible projections solve a problem you already feel in production, not because the architecture diagram looks sophisticated.