Event-driven architecture explained: events, queues, and async systems
In a request-response API, service A calls service B and waits. In event-driven architecture (EDA), services publish facts — OrderPlaced, PaymentSettled, UserSignedUp — and other components react without tight coupling. Producers do not need to know who listens. Consumers can scale independently, retry on failure, and join the system later. This guide explains how EDA works, where it shines, and the delivery semantics you must design for so async systems stay correct under retries and crashes.
Events vs commands
An event describes something that has already happened and is named in the past tense:
InvoicePaid, InventoryReserved, BlockFinalized.
Events are facts — immutable records of state change. Multiple subscribers can
react to the same event without the publisher caring.
A command expresses intent: ChargeCustomer,
SendEmail, RefundOrder. Commands target one handler
and expect a specific outcome. Mixing commands into an event stream creates
ambiguity — did the handler succeed? Should others retry?
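Good EDA keeps events as notifications of completed work. The service that owns the data performs the command and commits the resulting state change to its database before anything reaches the broker. Publishing right after the commit still leaves a gap (a crash between commit and publish loses the event), so the outbox pattern writes the event to an outbox table in the same transaction as the state change, and a separate relay publishes it afterwards. This avoids the classic bug where you emit an event but the database transaction rolls back.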
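As a rough illustration of the outbox idea, the sketch below writes the business row and the event row in one SQLite transaction, then lets a relay function publish whatever is still unpublished. The table names (`orders`, `outbox`), the `place_order` and `relay_outbox` helpers, and the `publish` callback are all hypothetical; a real system would substitute its own schema and broker client.

```python
import json
import sqlite3


def place_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    """Write the order row and its event row in ONE transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "total_cents": total_cents})),
        )


def relay_outbox(conn: sqlite3.Connection, publish) -> int:
    """Publish unpublished outbox rows in insertion order, then mark them published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        # A crash after publish() but before the UPDATE republishes this row
        # on the next run, so delivery is at-least-once.
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


# Hypothetical schema and an in-memory list standing in for the broker.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT NOT NULL, "
    "payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
)

published = []
place_order(conn, "order-1", 1999)
relay_outbox(conn, lambda event_type, payload: published.append((event_type, payload)))
```

Because the relay can publish the same row twice after a crash, this approach assumes consumers tolerate duplicates.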
Core building blocks
Message queues
A queue delivers each message to exactly one consumer (work-queue semantics). Use queues when a task must be processed once — sending a receipt email, resizing an uploaded image, settling a game payout. Popular implementations include Amazon SQS, Google Cloud Pub/Sub pull subscriptions, RabbitMQ, and Redis Streams.
Pub/sub (topics)
Publish-subscribe broadcasts an event to every subscriber on a topic.
One OrderCreated event might trigger inventory reservation, fraud scoring,
analytics, and a Slack notification — all in parallel. Kafka topics, SNS fan-out,
and NATS subjects follow this model.
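The difference between the two delivery models can be shown with a small in-memory sketch. `WorkQueue` and `Topic` are hypothetical classes written for this illustration, not the API of any broker; the round-robin choice of worker is also just one possible dispatch policy.

```python
from collections import deque


class WorkQueue:
    """Work-queue semantics: each message is handled by exactly one worker."""

    def __init__(self):
        self._workers = []
        self._pending = deque()
        self._next = 0

    def add_worker(self, worker):
        self._workers.append(worker)

    def send(self, message):
        self._pending.append(message)

    def drain(self):
        # Hand each pending message to a single worker, round-robin.
        while self._pending:
            message = self._pending.popleft()
            worker = self._workers[self._next % len(self._workers)]
            self._next += 1
            worker(message)


class Topic:
    """Pub/sub semantics: every subscriber receives every event."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


# Four tasks are split between two workers.
handled_by = []
work_queue = WorkQueue()
work_queue.add_worker(lambda m: handled_by.append(("worker-a", m["id"])))
work_queue.add_worker(lambda m: handled_by.append(("worker-b", m["id"])))
for i in range(4):
    work_queue.send({"id": i})
work_queue.drain()

# One event reaches both subscribers.
reactions = []
topic = Topic()
topic.subscribe(lambda e: reactions.append(("inventory", e["order_id"])))
topic.subscribe(lambda e: reactions.append(("fraud", e["order_id"])))
topic.publish({"type": "OrderCreated", "order_id": "o-1"})
```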
Event bus vs broker
An in-process event bus (common in monoliths) dispatches events inside one application. A message broker sits between services, persisting messages to disk so they survive process crashes. Microservices almost always need a broker; monoliths often graduate to one when extracting services.
Webhooks as lightweight events
HTTP webhooks are a simple form of EDA: the producer POSTs a JSON payload when something happens. They lack broker durability unless you add a queue in front of the receiver, but they are ubiquitous for payment processors, GitHub integrations, and blockchain indexers.
Delivery guarantees
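No broker can guarantee exactly-once delivery over an unreliable network: when an acknowledgment goes missing, the sender cannot tell whether the message was lost or already processed, so it must either risk losing the message or risk sending it twice. Real brokers pick trade-offs you must code against.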
| Guarantee | Meaning | Your responsibility |
|---|---|---|
| At-most-once | Message may be lost, never duplicated | Acceptable only for metrics or logs |
| At-least-once | Message delivered one or more times | Consumers must be idempotent |
| Exactly-once | Processed once end-to-end | Broker + transactional consumer (hard); often approximated |
Most production queues default to at-least-once. A consumer crashes after processing but before acknowledging — the broker redelivers. Design every handler so running twice produces the same result: store processed event IDs in a database, use natural idempotency keys (payment intent IDs), or rely on upserts instead of blind inserts.
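One way to make a handler idempotent is to record the event ID and apply the side effect in the same database transaction. The sketch below assumes a hypothetical `PaymentSettled` event with `event_id`, `account_id`, and `amount_cents` fields, and uses SQLite only because it ships with Python.

```python
import sqlite3


def handle_payment_settled(conn: sqlite3.Connection, event: dict) -> bool:
    """Credit the account once per event_id. Returns False for a duplicate."""
    with conn:  # the dedup row and the credit commit (or roll back) together
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cursor.rowcount == 0:
            return False  # event_id already recorded: skip the side effect
        conn.execute(
            "INSERT OR IGNORE INTO balances (account_id, cents) VALUES (?, 0)",
            (event["account_id"],),
        )
        conn.execute(
            "UPDATE balances SET cents = cents + ? WHERE account_id = ?",
            (event["amount_cents"], event["account_id"]),
        )
    return True


# Hypothetical schema for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE balances (account_id TEXT PRIMARY KEY, cents INTEGER NOT NULL)")

event = {"event_id": "evt-123", "account_id": "acct-1", "amount_cents": 500}
first = handle_payment_settled(conn, event)
second = handle_payment_settled(conn, event)  # simulated redelivery
balance = conn.execute(
    "SELECT cents FROM balances WHERE account_id = ?", ("acct-1",)
).fetchone()[0]
```

Keeping the deduplication row and the credit in one transaction matters: if they were committed separately, a crash between the two could either lose the credit or apply it twice.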
Dead-letter queues (DLQs) catch messages that fail after N retries. Monitor DLQ depth — it is your early warning for schema mismatches, downstream outages, or poison payloads.
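The retry-then-park behavior can be sketched without a broker. The `consume` function, the `MAX_ATTEMPTS` value, and the envelope shape below are assumptions made for illustration; managed brokers typically expose this as configuration rather than code you write yourself.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical retry budget (the "N" above)


def consume(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Run handler over messages; return those that failed max_attempts times."""
    queue = deque({"body": body, "attempts": 0} for body in messages)
    dead_letters = []
    while queue:
        envelope = queue.popleft()
        envelope["attempts"] += 1
        try:
            handler(envelope["body"])
        except Exception as exc:
            if envelope["attempts"] >= max_attempts:
                # Park the message with its error instead of retrying forever.
                dead_letters.append({**envelope, "error": repr(exc)})
            else:
                queue.append(envelope)  # requeue at the back for another try
    return dead_letters


handled = []


def send_receipt(body):
    # Raises KeyError on a malformed payload, which counts as a failed attempt.
    handled.append(body["order_id"])


dead = consume([{"order_id": 1}, {"oops": True}, {"order_id": 2}], send_receipt)
```

Here the malformed message ends up in `dead` after three attempts while the two valid messages are processed normally. In production the length of that list is the DLQ depth you would alert on.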
Event sourcing and CQRS (when you need them)
Event sourcing stores state as an append-only log of events instead
of overwriting rows. Replaying AccountOpened, DepositMade,
WithdrawalMade reconstructs the current balance. Audit trails become
free; time-travel debugging is possible. The cost is complexity — snapshots, schema
evolution, and replay performance all need planning.
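Replay is essentially a fold over the log. The sketch below uses the three event names from the paragraph above; the `Event` dataclass, the `amount_cents` field, and the `snapshot_cents` parameter are hypothetical choices for this example.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    type: str
    amount_cents: int = 0


def apply(balance_cents: int, event: Event) -> int:
    """Pure transition function: current state + event -> next state."""
    if event.type == "AccountOpened":
        return 0
    if event.type == "DepositMade":
        return balance_cents + event.amount_cents
    if event.type == "WithdrawalMade":
        return balance_cents - event.amount_cents
    raise ValueError(f"unknown event type: {event.type}")


def replay(events, snapshot_cents: int = 0) -> int:
    """Fold events over a starting state (zero, or a stored snapshot)."""
    return reduce(apply, events, snapshot_cents)


log = [
    Event("AccountOpened"),
    Event("DepositMade", 10_000),
    Event("WithdrawalMade", 2_500),
    Event("DepositMade", 500),
]

current = replay(log)                 # 8000
as_of_second_event = replay(log[:2])  # 10000: state as of an earlier point
snapshot = replay(log[:3])            # 7500: could be stored to shorten later replays
from_snapshot = replay(log[3:], snapshot_cents=snapshot)  # 8000 again
```

Replaying a prefix of the log gives the "time-travel" view, and starting from a snapshot avoids reprocessing the whole history.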
CQRS (Command Query Responsibility Segregation) splits writes (commands that emit events) from reads (materialized views optimized for queries). A write model appends events; projection workers build read models — search indexes, dashboards, denormalized tables. CQRS pairs naturally with event sourcing but can stand alone when read and write shapes diverge sharply.
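A minimal sketch of that split, assuming an in-memory list as the event log: the command handler only appends events, and a projection class builds a query-friendly view by reading the log from its last position. `place_order`, `SpendByCustomer`, and the event fields are hypothetical names for this example.

```python
from collections import defaultdict

event_log = []  # write side: append-only list standing in for an event store


def place_order(customer_id: str, order_id: str, total_cents: int) -> None:
    """Command handler: validate the request, then append an event."""
    if total_cents <= 0:
        raise ValueError("total_cents must be positive")
    event_log.append({
        "type": "OrderPlaced",
        "customer_id": customer_id,
        "order_id": order_id,
        "total_cents": total_cents,
    })


class SpendByCustomer:
    """Read side: a denormalized view kept up to date by a projection."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.position = 0  # index of the next unread event in the log

    def catch_up(self, log) -> None:
        # Only process events appended since the last call.
        for event in log[self.position:]:
            if event["type"] == "OrderPlaced":
                self.totals[event["customer_id"]] += event["total_cents"]
        self.position = len(log)

    def query(self, customer_id: str) -> int:
        return self.totals.get(customer_id, 0)


place_order("alice", "o-1", 1000)
place_order("bob", "o-2", 500)
place_order("alice", "o-3", 250)

view = SpendByCustomer()
view.catch_up(event_log)
view.catch_up(event_log)  # no new events, so nothing is double counted
```

Tracking `position` is what lets the projection lag behind the write side and catch up later, which also means reads are eventually consistent.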
You do not need either pattern for every microservice. Start with simple pub/sub and add event sourcing or CQRS only when audit, replay, or read/write scaling demands it.
EDA vs synchronous REST
Synchronous REST APIs are simpler to reason about: call, wait, get a response. Use them when the caller needs an immediate answer, the operation is fast, and failure should surface directly to the user.
Choose events when:
- Work can happen in the background (send email, generate PDF, index blockchain tx).
- Multiple downstream systems must react to the same fact.
- Peak load would overwhelm a synchronous chain of HTTP calls.
- You need temporal decoupling — the consumer can be offline and catch up later.
- Retries and backpressure are easier through a queue than through cascading timeouts.
Hybrid architectures are normal: an API accepts a request synchronously, writes to
the database, enqueues an event, and returns 202 Accepted with a
status URL. The client polls or receives a webhook when processing finishes.
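The hybrid flow can be sketched without a web framework. In this illustration a dict stands in for the database, `queue.Queue` stands in for the broker, and the `/jobs/<id>` status URL, the `ReportRequested` event, and the function names are all hypothetical.

```python
import queue
import uuid

jobs = {}                   # stand-in for a database table of job records
work_queue = queue.Queue()  # stand-in for a broker queue


def submit_report(params: dict):
    """Synchronous path: record the job, enqueue work, answer immediately."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}
    work_queue.put({"type": "ReportRequested", "job_id": job_id, "params": params})
    return 202, {"status_url": f"/jobs/{job_id}"}


def get_status(job_id: str):
    """Polling endpoint: report whatever the worker has recorded so far."""
    if job_id not in jobs:
        return 404, {"error": "unknown job"}
    return 200, jobs[job_id]


def run_worker_once() -> None:
    """Asynchronous path: take one event off the queue and finish the job."""
    event = work_queue.get_nowait()
    result = f"report for {event['params']['month']}"  # placeholder for real work
    jobs[event["job_id"]] = {"status": "done", "result": result}


status, body = submit_report({"month": "2024-01"})
job_id = body["status_url"].rsplit("/", 1)[1]
before = get_status(job_id)
run_worker_once()
after = get_status(job_id)
```

The caller gets its 202 before any work runs; `before` shows a pending job and `after` shows the completed one.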
Blockchain and payment flows
On-chain systems are inherently event-driven. A Solana transaction confirmation is an event your backend learns about asynchronously — via RPC polling, WebSocket subscriptions, or indexer webhooks. Merchant flows should never block the user's browser until finality; accept the payment intent, enqueue verification, and fulfill when the chain event arrives.
Our guide "Verify a Solana payment" covers confirmation levels and idempotent crediting. The same principles apply to any webhook-driven settlement: deduplicate by transaction signature, handle reorganizations at the appropriate commitment level, and never double-ship goods on duplicate delivery.
High-throughput indexers batch chain events into Kafka or similar before fan-out to wallets, analytics, and tax tools — classic EDA at network scale.
Operational concerns
Schema evolution
Events are contracts. Add fields compatibly (consumers ignore unknown keys).
Version event types (OrderCreatedV2) or wrap payloads in a versioned
envelope. Breaking changes require dual-write or migration windows.
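A versioned envelope might be read as in the sketch below. The envelope fields (`type`, `version`, `data`) and the two payload shapes are invented for illustration: version 1 is assumed to carry `amount` in cents, and version 2 is assumed to rename it to `total_cents` and add `currency`.

```python
import json


def parse_order_created(raw: str) -> dict:
    """Normalize any supported OrderCreated version into one internal shape."""
    envelope = json.loads(raw)
    version = envelope.get("version", 1)  # treat a missing version as v1
    data = envelope["data"]
    if version == 1:
        # Assumed v1 shape: amount in cents, implicitly USD.
        return {"order_id": data["order_id"], "total_cents": data["amount"], "currency": "USD"}
    if version == 2:
        # Assumed v2 shape. Only known keys are read, so extra keys are ignored.
        return {
            "order_id": data["order_id"],
            "total_cents": data["total_cents"],
            "currency": data.get("currency", "USD"),
        }
    raise ValueError(f"unsupported OrderCreated version: {version}")


v1 = parse_order_created(json.dumps(
    {"type": "OrderCreated", "version": 1, "data": {"order_id": "o-1", "amount": 1999}}
))
v2 = parse_order_created(json.dumps(
    {"type": "OrderCreated", "version": 2,
     "data": {"order_id": "o-2", "total_cents": 500, "currency": "EUR", "coupon": "X"}}
))
```

Rejecting unknown versions explicitly sends those messages to the retry and dead-letter path instead of silently misreading them.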
Ordering
Global ordering is expensive. Kafka guarantees order only within a partition, and with
the default partitioner messages that share a key go to the same partition, so key by
user_id to keep all events for one user sequential. Cross-user ordering rarely matters.
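The idea behind key-based partitioning is a stable hash of the key modulo the partition count. The sketch below uses SHA-256 purely as a stand-in; it is not Kafka's actual partitioner (which uses a murmur2 hash of the key bytes), and `partition_for` is a hypothetical helper.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (illustrative, not Kafka's)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("user-1", "CartCreated"),
    ("user-2", "CartCreated"),
    ("user-1", "ItemAdded"),
    ("user-1", "OrderPlaced"),
    ("user-2", "ItemAdded"),
]
for user_id, event_type in events:
    # The same user_id always hashes to the same partition, so that user's
    # events are appended in the order they were produced.
    partitions[partition_for(user_id, NUM_PARTITIONS)].append((user_id, event_type))

user_1_events = [
    event_type
    for user_id, event_type in partitions[partition_for("user-1", NUM_PARTITIONS)]
    if user_id == "user-1"
]
```

Note that changing `NUM_PARTITIONS` remaps keys, so per-key ordering only holds while the partition count stays fixed.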
Backpressure and rate limits
Slow consumers lag behind producers. Monitor consumer lag, scale workers horizontally, and apply rate limits on outbound side effects (email APIs, RPC calls) so retries do not amplify outages.
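A token bucket is one common way to cap outbound side effects. The class below is a minimal single-threaded sketch with an injectable clock so it can be exercised deterministically; the class name, parameters, and the fake clock are assumptions for this example.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self._rate = rate_per_sec
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self._clock()
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False  # caller should delay or requeue instead of calling out


# A fake clock makes the behavior easy to follow.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=1.0, capacity=2.0, clock=lambda: fake_now[0])

burst = [bucket.try_acquire() for _ in range(3)]        # [True, True, False]
fake_now[0] = 1.0                                       # one second later: one token refilled
after_wait = [bucket.try_acquire() for _ in range(2)]   # [True, False]
```

When `try_acquire` returns False, a consumer would typically leave the message unacknowledged or requeue it with a delay, so the downstream API sees a steady rate even during a retry storm.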
Observability
Propagate a correlation_id from the originating HTTP request through
every event and log line. Distributed traces (OpenTelemetry) across publish and
consume boundaries make debugging async flows tractable.
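Propagation mostly comes down to copying one field. In the sketch below, `new_event`, `handle_order_placed`, and the event shape are hypothetical; the point is only that a handler reuses the incoming `correlation_id` for every log line and every event it emits.

```python
import uuid


def new_event(event_type: str, data: dict, correlation_id=None) -> dict:
    """Build an event, reusing the caller's correlation_id when one exists."""
    return {
        "event_id": str(uuid.uuid4()),
        "type": event_type,
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "data": data,
    }


def handle_order_placed(event: dict, publish, log) -> None:
    """Consumer: log with the correlation_id and pass it to follow-up events."""
    cid = event["correlation_id"]
    log(f"[{cid}] reserving inventory for {event['data']['order_id']}")
    publish(new_event("InventoryReserved", event["data"], correlation_id=cid))


log_lines = []
outgoing = []

# The ID would normally come from the originating HTTP request.
request_correlation_id = "req-42"
order_placed = new_event("OrderPlaced", {"order_id": "o-1"}, request_correlation_id)
handle_order_placed(order_placed, outgoing.append, log_lines.append)
```

Searching logs for `req-42` then returns the HTTP request, the `OrderPlaced` handler, and everything downstream of `InventoryReserved`.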
Common mistakes
- Publishing events before the database commit succeeds.
- Assuming exactly-once without idempotent consumers.
- Using events as RPC — requesting data via event round-trips instead of a query API.
- Giant payloads in messages (store blobs in object storage; events carry references).
- No DLQ — failed messages retry forever and hide systemic bugs.
- Synchronous chains disguised as events (A waits for B's event before emitting C).
Architecture checklist
- Events name past-tense facts; commands stay with the owning service.
- Outbox or transactional publish so DB and broker stay consistent.
- Idempotent consumers with deduplication keys.
- Dead-letter queue with alerting on depth.
- Schema versioning strategy before the second producer ships.
- Correlation IDs across HTTP, events, and logs.
- Clear boundary: sync API for interactive paths, async events for side effects.
Related reading
- Webhooks explained — HTTP push events, HMAC verification, retries
- REST API design explained — when synchronous APIs fit better
- Verify a Solana payment — async on-chain event handling