Explainer · 7 June 2026
How event loops and async I/O models work
A naive HTTP server spawns one thread per connection. At ten thousand concurrent clients you have ten thousand stacks, ten thousand threads for the kernel to schedule, and a memory bill that can sink the machine before it serves much traffic. The fix is not "buy bigger hardware"; it is to stop waiting. Most of the time a server thread sits idle while a packet crosses the network or a disk read completes. An event loop turns that idle time into work for other connections: register interest in many file descriptors, ask the OS which ones are ready, and run short handler callbacks only when data can actually move.
Blocking I/O vs non-blocking I/O
In blocking mode, a call like read() on a
socket does not return until at least one byte arrives, the peer closes
the connection, or an error occurs. The thread is parked and cannot
serve anyone else. In non-blocking mode the same call returns
immediately: either it copies available bytes into your buffer, or it
fails with EAGAIN / EWOULDBLOCK, meaning "nothing ready yet, try
later."
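A minimal sketch of that difference, assuming Python's socket module and a local socketpair() standing in for a real client connection. The helper names try_read and demo_nonblocking_read are hypothetical, chosen only for this illustration.

```python
import socket
from typing import Optional


def try_read(sock: socket.socket, max_bytes: int = 4096) -> Optional[bytes]:
    """Return whatever bytes are available, or None if the read would block."""
    try:
        return sock.recv(max_bytes)  # b"" here would mean the peer closed
    except BlockingIOError:          # Python's mapping of EAGAIN / EWOULDBLOCK
        return None


def demo_nonblocking_read():
    # socketpair() stands in for a client/server connection (an assumption
    # made so the sketch runs without a network).
    writer, reader = socket.socketpair()
    try:
        reader.setblocking(False)
        before = try_read(reader)   # nothing has been sent yet
        writer.sendall(b"ping")
        after = try_read(reader)    # bytes are now waiting in the kernel buffer
        return before, after
    finally:
        writer.close()
        reader.close()
```

Calling demo_nonblocking_read() should show the first read coming back empty-handed instead of parking the thread, and the second read returning the bytes once they exist.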
Non-blocking alone is not enough. Busy-polling every socket in a tight
loop burns CPU. You need a multiplexing primitive that
sleeps until some registered descriptor becomes readable or
writable. POSIX gives you select() and poll(); Linux adds epoll;
BSD and macOS have kqueue; Windows has IOCP, which reports completed
operations rather than readiness. All answer the same question: "Which
of these connections can I progress right now?"
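As a rough illustration of multiplexing, the sketch below uses Python's selectors module, which wraps whichever of these primitives the platform offers. The three labelled socket pairs and the function name wait_for_ready are assumptions made up for the example.

```python
import selectors
import socket


def wait_for_ready(timeout: float = 1.0) -> list:
    """Register three idle connections, make one readable, and ask which is ready."""
    sel = selectors.DefaultSelector()  # most efficient multiplexer on this platform
    pairs = {name: socket.socketpair() for name in ("alpha", "beta", "gamma")}
    try:
        for name, (_writer, reader) in pairs.items():
            reader.setblocking(False)
            sel.register(reader, selectors.EVENT_READ, data=name)

        pairs["beta"][0].sendall(b"hello")   # only "beta" has data to read

        # One sleeping call covers every registered descriptor.
        events = sel.select(timeout)
        return [key.data for key, _mask in events]
    finally:
        sel.close()
        for writer, reader in pairs.values():
            writer.close()
            reader.close()
```

The thread sleeps once for the whole set and wakes with the list of connections that can make progress, which is the question every multiplexer answers.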
This is different from CPU process scheduling, which decides which thread runs on a core. The event loop decides which connection gets attention inside a thread that is already running.
The reactor pattern: one loop, many callbacks
The classic reactor pattern looks like this:
1. Put sockets in non-blocking mode.
2. Register each socket with the multiplexer for read and/or write interest.
3. Call epoll_wait() (or equivalent). The thread blocks here, but only once for the whole set.
4. For each ready descriptor, dispatch a small callback: read available bytes, parse what you can, maybe queue a response write.
5. Return to step 3.
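The steps above can be sketched as a tiny echo reactor. This is an illustration under stated assumptions, not a production server: it uses Python's selectors module in place of raw epoll_wait(), runs a bounded number of iterations so it terminates, and assumes replies are small enough to fit the send buffer. The names run_reactor and demo_echo are hypothetical.

```python
import selectors
import socket


def run_reactor(listener: socket.socket, iterations: int) -> None:
    """Run a bounded number of loop iterations (a real server loops forever)."""
    sel = selectors.DefaultSelector()

    def on_accept(sock: socket.socket) -> None:
        conn, _addr = sock.accept()
        conn.setblocking(False)                             # step 1
        sel.register(conn, selectors.EVENT_READ, on_read)   # step 2

    def on_read(conn: socket.socket) -> None:
        data = conn.recv(4096)
        if data:
            conn.send(data)   # sketch only: assumes the reply fits the send buffer
        else:                 # an empty read means the peer closed
            sel.unregister(conn)
            conn.close()

    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, on_accept)
    try:
        for _ in range(iterations):
            for key, _mask in sel.select(timeout=1.0):      # step 3
                key.data(key.fileobj)                       # step 4: run the callback
    finally:
        for key in list(sel.get_map().values()):
            if key.fileobj is not listener:
                key.fileobj.close()
        sel.close()


def demo_echo() -> bytes:
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks a free port
    client = socket.create_connection(listener.getsockname())
    try:
        client.sendall(b"hi")
        run_reactor(listener, iterations=2)  # one accept, then one read
        client.settimeout(2.0)
        return client.recv(4096)
    finally:
        client.close()
        listener.close()
```

The callback is stored as the registration's data, so dispatch is a single call per ready descriptor and the loop itself never needs to know what a connection is doing.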
Handlers must be short. If you run a 200 ms JSON parse inside the callback, every other connection on that loop stalls — including health checks and keep-alives. That is why mature stacks split work: the loop handles I/O; a worker pool handles CPU-heavy jobs, with results posted back via another event or channel.
Node.js popularized "JavaScript on the event loop" via
libuv, which wraps epoll/kqueue/IOCP behind a uniform
API. Python's asyncio, Rust's tokio, and Go's
netpoller follow the same shape even when the surface syntax differs.
Go looks synchronous thanks to goroutines, but the runtime still
multiplexes network I/O onto a small set of threads under the hood.
From select to epoll: scaling registration
select() passes the entire fd set into the kernel on every
call: O(n) work per wake-up, and the set is capped at FD_SETSIZE
descriptors (commonly 1024). poll() removes that cap but still scans
the full list on every call. epoll (Linux) inverts the model: you
register fds once with epoll_ctl, and epoll_wait returns only the
ready subset. Edge-triggered mode (EPOLLET) fires once per
transition to ready, which reduces wake-ups but demands that you read
or write until EAGAIN. Stop early and the leftover data sits in the
buffer with no further notification until new data arrives.
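The drain requirement can be sketched without epoll itself, since the rule is simply "keep reading until the kernel says it would block." The helper below is a hypothetical illustration that works on any non-blocking socket; a socketpair() stands in for a real connection.

```python
import socket


def drain(sock: socket.socket, chunk_size: int = 4096):
    """Read until the socket would block. Returns (data, peer_closed)."""
    chunks = []
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            # Fully drained: only now is it safe to wait for the next edge.
            return b"".join(chunks), False
        if not chunk:
            return b"".join(chunks), True   # empty read: peer closed
        chunks.append(chunk)


def demo_drain():
    writer, reader = socket.socketpair()
    try:
        reader.setblocking(False)
        writer.sendall(b"x" * 4000)
        # A small chunk size forces several recv() calls for one readiness event.
        return drain(reader, chunk_size=512)
    finally:
        writer.close()
        reader.close()
```

A handler that called recv() once with a 512-byte buffer would leave most of the payload unread, and in edge-triggered mode nothing would prompt it to come back for the rest.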
Linux io_uring goes further: submission and completion queues let the kernel batch reads and writes with fewer syscalls — useful for high-throughput storage and proxies. It complements rather than replaces epoll for many web servers today, but the direction is clear: push waiting and batching into the kernel.
At the protocol layer, readiness events still sit on top of TCP's byte-stream semantics. Your loop may be told a socket is writable, but the kernel send buffer can fill when the peer reads slowly — that is where application-level flow control begins.
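A rough way to see the send buffer fill is to write to a non-blocking socket whose peer is not reading. The sketch below assumes a local socketpair() as the slow peer; buffer sizes vary by platform, so it reports how many bytes were accepted rather than assuming a number. The function names are hypothetical.

```python
import socket


def send_until_full(sock: socket.socket, chunk: bytes = b"x" * 4096,
                    limit: int = 64 * 1024 * 1024) -> int:
    """Send on a non-blocking socket until the kernel refuses more bytes."""
    sent = 0
    while sent < limit:              # safety cap for the sketch
        try:
            sent += sock.send(chunk)
        except BlockingIOError:
            return sent              # buffer full: wait for write readiness
    return sent


def demo_slow_peer():
    writer, reader = socket.socketpair()
    try:
        writer.setblocking(False)
        reader.setblocking(False)
        queued = send_until_full(writer)     # the peer has read nothing so far

        received = 0
        while True:                          # now the peer catches up
            try:
                received += len(reader.recv(65536))
            except BlockingIOError:
                break

        resumed = writer.send(b"y")          # space is available again
        return queued, received, resumed
    finally:
        writer.close()
        reader.close()
```

The point where send() starts failing is where the application has to hold its own outbound data, and how much it is willing to hold is a flow-control decision rather than a kernel one.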
Async/await: syntax sugar over the loop
Callback pyramids ("callback hell") made async code hard to read.
async/await is typically implemented as state machines or resumable
coroutines: each await registers a continuation and yields control back
to the loop. When the awaited I/O completes, the task resumes. Under the
hood it is still non-blocking fds and a multiplexer; the syntax hides
registration and callback wiring.
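A small asyncio sketch of that yielding behaviour, with asyncio.sleep() standing in for a network wait (an assumption made so the example needs no sockets). The handler names and delays are made up for illustration.

```python
import asyncio


async def handle(name: str, delay: float, log: list) -> None:
    log.append(f"{name} start")
    await asyncio.sleep(delay)   # stand-in for awaiting I/O; yields to the loop
    log.append(f"{name} done")


async def main() -> list:
    log = []
    # Both coroutines run on one thread; each await lets the other proceed.
    await asyncio.gather(handle("slow", 0.05, log), handle("fast", 0.01, log))
    return log


def demo_interleaving() -> list:
    return asyncio.run(main())
```

The code reads top to bottom like blocking code, yet the "fast" handler finishes while the "slow" one is still waiting, because each await hands the thread back to the loop.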
Important distinction: async I/O (waiting without
blocking threads) is not the same as parallelism
(using multiple cores). A single-threaded event loop uses one core well
for I/O-bound workloads. CPU-bound work on that thread still blocks
everything. Production Node services run cluster mode or worker threads;
Rust async runtimes spawn blocking pools for disk and compute; Python
offers asyncio.to_thread() for blocking calls and
loop.run_in_executor() with a process pool for CPU-bound work, since
threads in standard CPython builds share the GIL.
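A sketch of offloading a blocking call so the loop stays responsive, assuming Python 3.9+ for asyncio.to_thread(). The function blocking_lookup is a hypothetical stand-in for a legacy synchronous client; CPU-bound work would more likely go to a process pool, which is omitted here to keep the example short.

```python
import asyncio
import time


def blocking_lookup(key: str) -> str:
    """Hypothetical legacy call that blocks its thread for 100 ms."""
    time.sleep(0.1)
    return key.upper()


async def ticker(ticks: list) -> None:
    """Stands in for other connections that need the loop's attention."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    ticks = []
    background = asyncio.create_task(ticker(ticks))
    # The blocking call runs on a worker thread; the loop keeps ticking.
    result = await asyncio.to_thread(blocking_lookup, "abc")
    background.cancel()
    try:
        await background
    except asyncio.CancelledError:
        pass
    return result, len(ticks)


def demo_offload():
    return asyncio.run(main())
```

Had blocking_lookup been called directly inside main(), the ticker would have recorded a single tick before the loop froze for the full 100 ms.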
Backpressure: when producers outrun consumers
If clients send faster than your service processes, unbounded buffers grow until the process OOMs. Backpressure is the contract by which slow consumers tell fast producers to slow down:
- Stop reading: remove read interest from the multiplexer until downstream catches up. As the kernel receive buffer fills, the advertised TCP window shrinks and the sender pauses.
- Pause writes: when the outbound queue exceeds a watermark, stop accepting new work or return 503 with a Retry-After header.
- Bounded queues: between loop and workers, use fixed capacity; offer failures trigger drop or retry policies instead of silent memory growth.
Without backpressure, "async" just delays the crash. Load tests should verify memory stays flat under sustained overload, not only that p50 latency looks fine at moderate RPS.
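The bounded-queue option might look like the sketch below, assuming an asyncio.Queue between the loop and its workers. The offer helper is hypothetical (asyncio has no method by that name); it wraps put_nowait() so a full queue becomes an explicit decision point.

```python
import asyncio


def offer(work_queue: asyncio.Queue, item) -> bool:
    """Try to enqueue without waiting; False means the caller must shed load."""
    try:
        work_queue.put_nowait(item)
        return True
    except asyncio.QueueFull:
        return False


async def main():
    work_queue = asyncio.Queue(maxsize=2)        # fixed capacity
    accepted = [offer(work_queue, i) for i in range(4)]
    first = await work_queue.get()               # a worker takes one item
    accepted_after = offer(work_queue, 99)       # room again
    return accepted, first, accepted_after


def demo_bounded_queue():
    return asyncio.run(main())
```

A producer that prefers to wait instead of dropping can use await work_queue.put(item), which suspends that one task until a slot frees up and so pushes the delay back toward the sender.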
Timers, signals, and the "next tick"
Real servers need more than socket events. Event loops integrate
timer wheels or priority queues for deadlines: HTTP
keep-alive timeouts, retry backoff, session expiry. Each loop iteration
computes the earliest timer and passes that as the timeout to
epoll_wait so the thread wakes for either I/O or clock.
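One way to sketch the timer side is a priority queue keyed by deadline. The Timers class below is a hypothetical illustration using heapq; the caller passes in the current time so the behaviour is easy to test, and next_timeout() produces the value a loop would hand to its multiplexer (None meaning "block until I/O").

```python
import heapq
import itertools
from typing import Callable, Optional


class Timers:
    """Min-heap of (deadline, sequence, callback)."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()   # tie-breaker so callbacks are never compared

    def call_at(self, deadline: float, callback: Callable[[], None]) -> None:
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))

    def next_timeout(self, now: float) -> Optional[float]:
        """Seconds until the earliest timer, or None if no timers are pending."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def run_due(self, now: float) -> int:
        """Fire every timer whose deadline has passed; return how many fired."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _deadline, _seq, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired
```

In a loop iteration this would sit around the wait call: compute next_timeout(now), wait on the multiplexer for at most that long, dispatch any I/O events, then call run_due(now) with a fresh clock reading.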
Cross-thread communication uses eventfd, pipe writes, or
"wake the loop" primitives: one thread pushes work to a queue and pings
the multiplexer so the loop thread picks it up safely on the next
iteration. Getting this wrong causes data races — only the loop thread
should mutate connection state unless you use explicit locking (which
reintroduces contention you wanted to avoid).
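A portable sketch of the "wake the loop" idea, using a socketpair() in place of eventfd and a thread-safe queue for the payload. The Waker class and its method names are hypothetical; the assumption is that only the loop thread ever calls collect().

```python
import queue
import selectors
import socket
import threading


class Waker:
    """Lets other threads hand work to the loop thread and wake it."""

    def __init__(self, sel: selectors.BaseSelector) -> None:
        self._recv_side, self._send_side = socket.socketpair()
        self._recv_side.setblocking(False)
        self._send_side.setblocking(False)
        self._inbox = queue.SimpleQueue()
        sel.register(self._recv_side, selectors.EVENT_READ, data=self)

    def post(self, item) -> None:
        """Safe from any thread: enqueue first, then ping the multiplexer."""
        self._inbox.put(item)
        try:
            self._send_side.send(b"\0")
        except BlockingIOError:
            pass   # wake bytes are already pending; the loop will wake anyway

    def collect(self) -> list:
        """Loop thread only: clear wake bytes, then take everything queued."""
        try:
            while self._recv_side.recv(4096):
                pass
        except BlockingIOError:
            pass
        items = []
        while True:
            try:
                items.append(self._inbox.get_nowait())
            except queue.Empty:
                return items

    def close(self) -> None:
        self._recv_side.close()
        self._send_side.close()


def demo_wakeup() -> list:
    sel = selectors.DefaultSelector()
    waker = Waker(sel)
    worker = threading.Thread(target=waker.post, args=("job-1",))
    worker.start()
    received = []
    try:
        for key, _mask in sel.select(timeout=2.0):   # sleeps until pinged
            received.extend(key.data.collect())
    finally:
        worker.join()
        sel.close()
        waker.close()
    return received
```

The worker thread never touches connection state; it only enqueues and pings, so all mutation still happens on the loop thread without locks around the connections themselves.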
When not to use a single event loop
- CPU-heavy handlers: video transcoding, ZK proof verification, large JSON on hot paths. Offload to workers or separate services.
- True parallel compute: if the problem is matrix math across cores, threads or processes win; async will not magic away the GIL or single-thread limits.
- Blocking libraries: legacy DB drivers or DNS lookups that block the thread poison the whole loop. Wrap in thread pools or replace with async-native clients.
- Blocking TLS handshakes on the loop: TLS can be async, but a stack that handshakes synchronously stalls every other connection; terminate TLS at an edge proxy when possible.
Hybrid architectures are normal: nginx (evented) in front of a small pool of threaded application servers; or many Node workers behind a circuit breaker when backends saturate.
Crypto RPC clients: the same lessons
Wallets and indexers hammer JSON-RPC endpoints with concurrent requests.
A browser tab using async fetch runs on its own event loop;
the server side multiplexes thousands of TLS connections the same way.
When an endpoint returns 429, naive "fire more parallel requests" code
makes things worse — apply jittered backoff, cap in-flight calls, and
treat saturation as backpressure, not as a prompt to spawn more threads.
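A sketch of that client-side discipline, assuming asyncio and a semaphore as the in-flight cap. FakeEndpoint is an invented stand-in for an RPC server that rejects its first few calls with 429; call_with_backoff, the delay constants, and the retry count are illustrative choices, not values from any particular provider.

```python
import asyncio
import random


async def call_with_backoff(send, limiter: asyncio.Semaphore,
                            max_attempts: int = 5,
                            base: float = 0.01, cap: float = 1.0):
    """Retry on 429 with jittered exponential backoff and a cap on in-flight calls."""
    status, body = None, None
    for attempt in range(max_attempts):
        if attempt:
            # Jitter: sleep a random amount up to the exponential ceiling.
            ceiling = min(cap, base * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, ceiling))
        async with limiter:              # the slot is released while backing off
            status, body = await send()
        if status != 429:
            break
    return status, body


class FakeEndpoint:
    """Hypothetical server: rejects its first few calls, tracks concurrency."""

    def __init__(self, reject_first: int) -> None:
        self.calls = 0
        self.in_flight = 0
        self.peak = 0
        self.reject_first = reject_first

    async def send(self):
        self.calls += 1
        call_number = self.calls
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.001)       # simulated network round trip
        self.in_flight -= 1
        return (429, None) if call_number <= self.reject_first else (200, "ok")


async def main():
    endpoint = FakeEndpoint(reject_first=3)
    limiter = asyncio.Semaphore(2)       # at most two requests in flight
    results = await asyncio.gather(
        *(call_with_backoff(endpoint.send, limiter) for _ in range(6))
    )
    return results, endpoint.peak


def demo_backoff():
    return asyncio.run(main())
```

Six logical requests are issued at once, yet the endpoint never sees more than two concurrently, and rejected calls wait a randomized interval instead of retrying in lockstep.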
For fan-out to many on-chain accounts, batching and event-driven pipelines (websocket subscriptions feeding a queue) often beat polling each account in a loop from a single thread.
Practical checklist
- Keep loop callbacks short; measure event-loop lag (Node: perf_hooks delay; libuv thread pool queue depth).
- Use non-blocking sockets with epoll/kqueue/IOCP, not one thread per connection.
- Bound every queue between network, loop, and workers; test under overload.
- Offload CPU and blocking I/O to worker pools; never call sleep() or synchronous disk I/O on the loop thread.
- Integrate timers and cross-thread wakeups without racing connection state.
- Pair with timeouts and breakers at the client — async does not remove failure modes, it concentrates them in one thread if you mis-handle them.
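For the first checklist item, event-loop lag can be approximated by scheduling a short sleep and measuring how late it wakes. The sketch below assumes asyncio rather than Node; measure_lag and hog are hypothetical names, and hog deliberately commits the mistake the checklist warns about so the lag becomes visible.

```python
import asyncio
import time


async def measure_lag(interval: float, samples: int) -> list:
    """Sleep repeatedly and record how far past the deadline each wake-up lands."""
    loop = asyncio.get_running_loop()
    lags = []
    for _ in range(samples):
        start = loop.time()
        await asyncio.sleep(interval)
        lags.append(max(0.0, loop.time() - start - interval))
    return lags


async def hog(seconds: float) -> None:
    await asyncio.sleep(0)     # let the probe schedule its first timer
    time.sleep(seconds)        # deliberately wrong: blocks the loop thread


async def main() -> float:
    lags, _ = await asyncio.gather(measure_lag(0.01, 5), hog(0.05))
    return max(lags)


def demo_loop_lag() -> float:
    return asyncio.run(main())
```

A healthy loop shows lag close to zero; a single 50 ms blocking call shows up as a spike of tens of milliseconds, which is the signal worth alerting on in production.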