Explainer · 7 June 2026

How event loops and async I/O models work

A naive HTTP server spawns one thread per connection. At ten thousand concurrent clients you have ten thousand stacks, ten thousand threads for the kernel scheduler to juggle, and a memory and context-switch bill that can swamp the machine long before the network is the bottleneck. The fix is not "buy bigger hardware"; it is to stop waiting. Most of the time a server thread sits idle while a packet crosses the network or a disk head seeks. An event loop turns that idle time into work for other connections: register interest in many file descriptors, ask the OS which ones are ready, and run short handler callbacks only when data can actually move.

Blocking I/O vs non-blocking I/O

In blocking mode, a call like read() on a socket does not return until at least one byte arrives, the peer closes the connection, or an error occurs. The thread is parked and cannot serve anyone else. In non-blocking mode the same call returns immediately: either it copies available bytes into your buffer, or it fails with EAGAIN / EWOULDBLOCK, meaning "nothing ready yet, try later."
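
A minimal sketch of the non-blocking case, assuming Python's standard socket module. Python surfaces EAGAIN / EWOULDBLOCK as BlockingIOError; the helper name try_recv is made up for this illustration.

```python
import socket


def try_recv(sock, max_bytes=4096):
    """Return received bytes, or None when nothing is ready yet."""
    try:
        return sock.recv(max_bytes)
    except BlockingIOError:
        # Python raises BlockingIOError for EAGAIN / EWOULDBLOCK.
        return None


# socketpair() returns two connected sockets in one process, handy for a demo.
a, b = socket.socketpair()
a.setblocking(False)

first = try_recv(a)    # nothing has been sent, so the call returns at once
b.sendall(b"ping")
second = try_recv(a)   # bytes are now buffered on the receiving side

a.close()
b.close()
```

With a blocking socket, the first call would have parked the thread until the peer wrote something.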

Non-blocking alone is not enough. Busy-polling every socket in a tight loop burns CPU. You need a multiplexing primitive that sleeps until some registered descriptor becomes readable or writable. POSIX specifies select() and poll(); Linux adds epoll; BSD and macOS have kqueue; Windows has IOCP, which reports completed operations rather than readiness. All of them answer essentially the same question: "Which of these connections can I progress right now?"

This is different from CPU process scheduling, which decides which thread runs on a core. The event loop decides which connection gets attention inside a thread that is already running.

The reactor pattern: one loop, many callbacks

The classic reactor pattern looks like this:

  1. Put sockets in non-blocking mode.
  2. Register each socket with the multiplexer for read and/or write interest.
  3. Call epoll_wait() (or equivalent) — the thread blocks here, but only once for the whole set.
  4. For each ready descriptor, dispatch a small callback: read available bytes, parse what you can, maybe queue a response write.
  5. Return to step 3.
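
The five steps above can be sketched with Python's selectors module, which picks epoll, kqueue, or a fallback for the platform. This is an illustrative echo handler under simplifying assumptions, not a production server; register_connection, on_readable, and run_once are names invented here.

```python
import selectors
import socket


def on_readable(sel, conn):
    """Step 4: a short handler that reads what is available and echoes it."""
    try:
        data = conn.recv(4096)
    except BlockingIOError:
        return  # spurious wake-up; wait for the next event
    if data:
        # Simplification: a real server would queue any unsent remainder
        # and register write interest instead of assuming send() takes it all.
        conn.send(data)
    else:
        sel.unregister(conn)  # peer closed the connection
        conn.close()


def register_connection(sel, conn):
    conn.setblocking(False)                                # step 1
    sel.register(conn, selectors.EVENT_READ, on_readable)  # step 2


def run_once(sel, timeout=None):
    """Steps 3 and 4: block once for the whole set, then dispatch."""
    events = sel.select(timeout)
    for key, _mask in events:
        key.data(sel, key.fileobj)  # key.data holds the registered callback
    return len(events)


def run_forever(sel):
    while True:  # step 5: go back to waiting
        run_once(sel)
```

Storing the callback in the registration's data field is one common way to wire dispatch; frameworks differ in how they map descriptors to handlers.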

Handlers must be short. If you run a 200 ms JSON parse inside the callback, every other connection on that loop stalls — including health checks and keep-alives. That is why mature stacks split work: the loop handles I/O; a worker pool handles CPU-heavy jobs, with results posted back via another event or channel.

Node.js popularized "JavaScript on the event loop" via libuv, which wraps epoll/kqueue/IOCP behind a uniform API. Python's asyncio, Rust's tokio, and Go's netpoller follow the same shape even when the surface syntax differs. Go looks synchronous thanks to goroutines, but the runtime still multiplexes network I/O onto a small set of threads under the hood.

From select to epoll: scaling registration

select() copies the entire fd set into the kernel on every call, which is O(n) work per wake-up, and it is capped at FD_SETSIZE descriptors (commonly 1024). poll() removes that cap but still scans the full list on each call. epoll (Linux) inverts the model: you register fds once with epoll_ctl, and epoll_wait returns only the ready subset. Edge-triggered mode (EPOLLET) fires once per transition to ready, which reduces wake-ups but demands that you read until EAGAIN; otherwise leftover bytes sit in the kernel buffer with no further notification until new data arrives.
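
The drain requirement of edge-triggered mode comes down to a loop like the one below. It is a sketch that assumes a non-blocking socket and leaves out the epoll registration itself, so it runs on any platform; drain is a hypothetical helper name.

```python
import socket


def drain(sock, chunk_size=4096):
    """Read until the kernel buffer is empty.

    Returns (data, peer_closed). Stopping before BlockingIOError (EAGAIN)
    in edge-triggered mode would leave bytes behind with no new event.
    """
    chunks = []
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            return b"".join(chunks), False  # buffer fully drained
        if not chunk:
            return b"".join(chunks), True   # EOF: peer closed
        chunks.append(chunk)
```

In level-triggered mode a single recv() per wake-up is enough for correctness, because the multiplexer keeps reporting the socket as readable while bytes remain.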

Linux io_uring goes further: submission and completion queues let the kernel batch reads and writes with fewer syscalls — useful for high-throughput storage and proxies. It complements rather than replaces epoll for many web servers today, but the direction is clear: push waiting and batching into the kernel.

At the protocol layer, readiness events still sit on top of TCP's byte-stream semantics. Your loop may be told a socket is writable, but the kernel send buffer can fill when the peer reads slowly — that is where application-level flow control begins.

Async/await: syntax sugar over the loop

Callback pyramids ("callback hell") made async code hard to read. async/await lowers to state machines or resumable coroutines, depending on the language: each await that cannot complete immediately registers a continuation and yields control back to the loop. When the awaited I/O completes, the task resumes. Under the hood it is still non-blocking fds and a multiplexer; the syntax hides registration and callback wiring.
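
To make the suspend-and-resume mechanics visible, here is a toy driver that steps a Python coroutine by hand. It is only an illustration: Ready is a made-up awaitable that suspends once, and drive stands in for a real loop that would wait on a multiplexer before resuming.

```python
class Ready:
    """Hypothetical awaitable: suspends once, then delivers a value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield self         # suspension point: control returns to the driver
        return self.value  # result of the await expression on resume


async def handler(log):
    log.append("start")
    first = await Ready("header")
    log.append("got " + first)
    second = await Ready("body")
    log.append("got " + second)
    return first + "+" + second


def drive(coro, log):
    """Toy loop: resume the coroutine every time it suspends."""
    while True:
        try:
            pending = coro.send(None)  # run until the next await that yields
        except StopIteration as done:
            return done.value          # the coroutine's return value
        # A real loop would register interest and resume on readiness.
        log.append("suspended on " + pending.value)
```

Each call to send() runs the handler up to its next suspension point, which is the same hand-off that a callback registration performs, written as straight-line code.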

Important distinction: async I/O (waiting without blocking threads) is not the same as parallelism (using multiple cores). A single-threaded event loop uses one core well for I/O-bound workloads. CPU-bound work on that thread still blocks everything. Production Node services run cluster mode or worker threads; Rust async runtimes provide blocking pools for disk and compute; Python's asyncio offers asyncio.to_thread() for blocking calls and, because of the GIL, points CPU-bound work at a process pool via loop.run_in_executor().
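
A small sketch of the offloading idea in asyncio, assuming Python 3.9 or newer for asyncio.to_thread(). The function blocking_lookup is a stand-in for something like a legacy driver call; the heartbeat coroutine shows that the loop keeps making progress while the blocking call runs on a worker thread.

```python
import asyncio
import time


def blocking_lookup(seconds):
    """Stand-in for a blocking call, e.g. a legacy database driver."""
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks, interval):
    """Counts ticks; it can only advance while the loop is free."""
    count = 0
    for _ in range(ticks):
        await asyncio.sleep(interval)
        count += 1
    return count


async def main():
    # to_thread() moves the blocking call off the loop thread, so the
    # heartbeat coroutine keeps ticking in the meantime.
    result, beats = await asyncio.gather(
        asyncio.to_thread(blocking_lookup, 0.2),
        heartbeat(5, 0.01),
    )
    return result, beats
```

Calling blocking_lookup directly inside a coroutine would freeze the heartbeat for the full duration. For CPU-bound work a thread does not help much in CPython because of the GIL; passing a concurrent.futures.ProcessPoolExecutor to loop.run_in_executor() is the usual alternative.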

Backpressure: when producers outrun consumers

If clients send faster than your service processes, unbounded buffers grow until the process OOMs. Backpressure is the contract that slow consumers signal slow producers:

  • Stop reading: remove read interest from the multiplexer until downstream catches up. The kernel receive buffer fills, the advertised TCP window shrinks toward zero, and the sender is forced to pause.
  • Pause writes: when the outbound queue exceeds a watermark, stop accepting new work or return 503 with a Retry-After header.
  • Bounded queues: between loop and workers, use fixed capacity; a failed enqueue triggers a drop or retry policy instead of silent memory growth.
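
A bounded hand-off between loop and workers might look like the sketch below. BoundedInbox and its dropped counter are hypothetical names; the point is only that a full queue produces an explicit signal the caller must handle.

```python
import queue


class BoundedInbox:
    """Fixed-capacity hand-off queue that refuses work when full."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0  # how many offers were refused

    def offer(self, item):
        """Try to enqueue without blocking; False means 'apply backpressure'."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Remove and return the oldest item; raises queue.Empty if none."""
        return self._q.get_nowait()

    def __len__(self):
        return self._q.qsize()
```

When offer() returns False, the loop can stop reading from that connection, reply with an overload status, or drop the item, depending on the policy chosen for that path.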

Without backpressure, "async" just delays the crash. Load tests should verify memory stays flat under sustained overload, not only that p50 latency looks fine at moderate RPS.

Timers, signals, and the "next tick"

Real servers need more than socket events. Event loops integrate timer wheels or priority queues for deadlines: HTTP keep-alive timeouts, retry backoff, session expiry. Each loop iteration computes the earliest timer and passes that as the timeout to epoll_wait so the thread wakes for either I/O or clock.
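
The priority-queue variant can be sketched with a binary heap. The Timers class and its method names are invented for this example; deadlines are plain numbers on whatever monotonic clock the loop uses.

```python
import heapq
import itertools


class Timers:
    """Min-heap of (deadline, sequence, callback) entries."""

    def __init__(self):
        self._heap = []
        # The sequence number breaks ties so callbacks are never compared.
        self._seq = itertools.count()

    def call_at(self, deadline, callback):
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))

    def next_timeout(self, now):
        """Timeout for the multiplexer; None means wait for I/O only."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def run_due(self, now):
        """Run every callback whose deadline has passed; return the count."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _deadline, _seq, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired
```

One loop iteration would then wait with the value of next_timeout(time.monotonic()) as its timeout, dispatch any ready descriptors, and finish with run_due(time.monotonic()).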

Cross-thread communication uses eventfd, pipe writes, or "wake the loop" primitives: one thread pushes work to a queue and pings the multiplexer so the loop thread picks it up safely on the next iteration. Getting this wrong causes data races — only the loop thread should mutate connection state unless you use explicit locking (which reintroduces contention you wanted to avoid).
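
One way to build such a wakeup in portable Python is a socket pair registered with the selector, in the spirit of the classic self-pipe trick. LoopWaker and poll_once are hypothetical names, and the sketch assumes a single loop thread that is the only consumer of the queue.

```python
import collections
import selectors
import socket
import threading


class LoopWaker:
    """Other threads enqueue work, then write one byte to wake the loop."""

    def __init__(self, sel):
        self._recv, self._send = socket.socketpair()
        self._recv.setblocking(False)
        self._send.setblocking(False)
        # deque append() and popleft() are thread-safe in CPython.
        self._pending = collections.deque()
        sel.register(self._recv, selectors.EVENT_READ, self._on_wake)

    def post(self, fn):
        """Safe to call from any thread."""
        self._pending.append(fn)
        try:
            self._send.send(b"\0")
        except BlockingIOError:
            pass  # buffer already full of wake bytes; a wakeup is pending

    def _on_wake(self):
        """Runs on the loop thread only."""
        try:
            self._recv.recv(4096)  # discard the wake bytes
        except BlockingIOError:
            pass
        ran = 0
        while self._pending:
            self._pending.popleft()()  # execute the posted function
            ran += 1
        return ran


def poll_once(sel, timeout):
    """One loop iteration: wait, then run whatever was posted."""
    ran = 0
    for key, _mask in sel.select(timeout):
        ran += key.data()
    return ran
```

Because posted functions execute on the loop thread, they can touch connection state without locks; the only shared structure is the queue itself.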

When not to use a single event loop

  • CPU-heavy handlers: video transcoding, ZK proof verification, large JSON on hot paths. Offload to workers or separate services.
  • True parallel compute: if the problem is matrix math across cores, threads or processes win; async will not magic away the GIL or single-thread limits.
  • Blocking libraries: legacy DB drivers or DNS lookups that block the thread poison the whole loop. Wrap in thread pools or replace with async-native clients.
  • Blocking TLS handshakes on the loop: TLS can be async, but misconfigured stacks stall the loop during the handshake; terminate TLS at an edge proxy when possible.

Hybrid architectures are normal: nginx (evented) in front of a small pool of threaded application servers; or many Node workers that call backends through a circuit breaker, which trips when those backends saturate.

Crypto RPC clients: the same lessons

Wallets and indexers hammer JSON-RPC endpoints with concurrent requests. A browser tab using async fetch is a micro event loop; the server side multiplexes thousands of TLS connections the same way. When an endpoint returns 429, naive "fire more parallel requests" code makes things worse — apply jittered backoff, cap in-flight calls, and treat saturation as backpressure, not as a prompt to spawn more threads.
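
A hedged sketch of that client-side discipline in asyncio: a semaphore caps in-flight calls and retries use exponential backoff with full jitter. RateLimited is a hypothetical exception standing in for however your client reports a 429, and request is any zero-argument coroutine function you supply.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical error raised by a client when it sees HTTP 429."""


def backoff_delay(attempt, base=0.05, cap=2.0):
    """Full jitter: uniform between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


async def call_with_backoff(request, limiter, max_attempts=5, base=0.05):
    """Run request() under the in-flight cap, retrying on RateLimited."""
    for attempt in range(max_attempts):
        try:
            async with limiter:  # at most N calls in flight at once
                return await request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the saturation
        # Sleep outside the semaphore so a waiting retry holds no slot.
        await asyncio.sleep(backoff_delay(attempt, base))
```

A caller would share one asyncio.Semaphore across all requests to the same endpoint, for example asyncio.Semaphore(8), so that retries compete for the same budget as fresh calls rather than adding to the load.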

For fan-out to many on-chain accounts, batching and event-driven pipelines (websocket subscriptions feeding a queue) often beat polling each account in a loop from a single thread.

Practical checklist

  • Keep loop callbacks short; measure event-loop lag (Node: perf_hooks.monitorEventLoopDelay()) and watch the libuv thread pool queue depth.
  • Use non-blocking sockets with epoll/kqueue/IOCP — not one thread per connection.
  • Bound every queue between network, loop, and workers; test under overload.
  • Offload CPU and blocking I/O to worker pools; never call sleep() or do synchronous disk I/O on the loop thread.
  • Integrate timers and cross-thread wakeups without racing connection state.
  • Pair with timeouts and breakers at the client — async does not remove failure modes, it concentrates them in one thread if you mis-handle them.
