Explainer · 7 June 2026
How NAT traversal and game relay networks work
When you queue into an online match, your game client does not simply dial your opponent's IP address. Home routers sit behind Network Address Translation (NAT), firewalls block unsolicited UDP, and carrier-grade NAT stacks multiple households behind one public address. Game engines therefore run a discovery dance — STUN to learn your public endpoint, ICE to rank connection paths, hole punching to open a direct path, and TURN or proprietary relays when punching fails. In early 2026, a Steam networking regression showed what happens when that dance breaks: same-city players routed through distant relays, latency jumping from single digits to triple digits. This explainer covers the mechanisms so outages like that make sense instead of feeling like magic smoke.
Why your PC does not have a real public IP
IPv4's free address pool ran dry in the early 2010s (IANA handed out its last blocks in 2011). Your ISP typically assigns one public IP to your router; every phone, laptop, and console behind it gets a private address (10.x, 172.16–31.x, or 192.168.x). Outbound packets get rewritten: the router replaces your private source IP and ephemeral port with its public IP and a mapped port, then remembers the mapping in a translation table so return traffic finds you.
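The translation table described above can be sketched as a toy model. This is an illustration only, not how any particular router is implemented; the class name `SimpleNat`, the starting port 40000, and the documentation-range addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class SimpleNat:
    """Toy NAT that gives each private endpoint one public port (hypothetical model)."""
    public_ip: str
    next_port: int = 40000  # assumed start of the mapped-port range
    out_map: Dict[Endpoint, int] = field(default_factory=dict)
    in_map: Dict[int, Endpoint] = field(default_factory=dict)

    def outbound(self, private_src: Endpoint) -> Endpoint:
        """Rewrite the source of an outgoing packet, creating a mapping if needed."""
        if private_src not in self.out_map:
            self.out_map[private_src] = self.next_port
            self.in_map[self.next_port] = private_src
            self.next_port += 1
        return (self.public_ip, self.out_map[private_src])

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the private host for a return packet, or None if nothing is mapped."""
        return self.in_map.get(public_port)


nat = SimpleNat(public_ip="203.0.113.7")
laptop = ("192.168.1.20", 51000)
mapped = nat.outbound(laptop)        # the laptop sends a packet out
reply_target = nat.inbound(mapped[1])  # return traffic finds the laptop
unsolicited = nat.inbound(40999)       # no mapping exists, so the packet is dropped
```

The last call is the bedroom-server problem in miniature: a packet aimed at a port that no outbound traffic ever opened has nowhere to go.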
Inbound connections from the internet cannot reach you unless the router already created a mapping — which is why hosting a game server from your bedroom without port forwarding fails. Peer-to-peer games need both sides to cooperate: each client must convince its own NAT to accept packets from the other player's public endpoint. That cooperation is NAT traversal.
NAT behavior varies from device to device. RFC 3489, the original STUN specification, popularized four labels: full cone (easiest: any remote host can send to a mapped port once you have sent outbound), restricted cone (only an IP you contacted), port-restricted cone (IP and port must match), and symmetric (hardest: the router allocates a different public port for every distinct destination, breaking naive hole punching). RFC 4787 later replaced those labels with separate mapping and filtering behaviors, but the cone names stuck in game networking. Symmetric NAT behind mobile hotspots is a frequent reason console players cannot host.
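The four NAT types described above differ in two ways: which senders they let back in, and whether the public port stays the same across destinations. A minimal sketch of both rules follows; the function names are hypothetical, and the sequential port allocation in `mapped_port` is an assumed simplification (real symmetric NATs may pick ports unpredictably).

```python
from enum import Enum
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class NatType(Enum):
    FULL_CONE = "full cone"
    RESTRICTED_CONE = "restricted cone"
    PORT_RESTRICTED = "port-restricted cone"
    SYMMETRIC = "symmetric"


def accepts_inbound(nat_type: NatType, contacted: Set[Endpoint], remote: Endpoint) -> bool:
    """Would the NAT forward a packet from `remote` arriving on an existing mapping?

    `contacted` holds the remote endpoints the inside host has already sent to
    through that mapping.
    """
    if not contacted:
        return False  # no outbound traffic yet, so no mapping exists
    if nat_type is NatType.FULL_CONE:
        return True
    if nat_type is NatType.RESTRICTED_CONE:
        return remote[0] in {ip for ip, _ in contacted}
    # Port-restricted and symmetric NATs both require an exact IP and port match.
    return remote in contacted


def mapped_port(nat_type: NatType, base_port: int, destination_index: int) -> int:
    """Public port used toward the n-th distinct destination (assumed sequential)."""
    if nat_type is NatType.SYMMETRIC:
        return base_port + destination_index
    return base_port


stun_server = ("198.51.100.5", 3478)
contacted = {stun_server}
stranger = ("192.0.2.9", 9999)
```

Under this model, the port a symmetric NAT shows to a STUN server is not the port it will use toward the peer, which is the reason the reflexive address learned via STUN is of little use there.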
STUN: learning what the internet thinks your address is
STUN (Session Traversal Utilities for NAT) is a lightweight protocol. Your client sends a binding request to a STUN server on the public internet; the server echoes the source IP and port it saw. That tuple — public IP plus mapped port — is your server-reflexive candidate. You share it with your peer through a signaling channel (Steam lobby, Xbox Live, a WebSocket matchmaker) so they know where to send UDP packets.
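To make the binding exchange concrete, here is a minimal sketch of the wire format using only the Python standard library. It builds a binding request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute as laid out in RFC 8489. It omits retransmission, IPv6, and message integrity, and the helper names are hypothetical; production clients would normally rely on an ICE library instead.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes) -> bytes:
    """20-byte STUN header (type, length, cookie, transaction ID), no attributes."""
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(message: bytes):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(20 + length, len(message))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == ATTR_XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)  # port is XORed with the cookie's top 16 bits
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attribute values are padded to 4 bytes
    return None


def query_stun(server, timeout: float = 2.0):
    """Send one binding request over UDP; `server` is a (host, port) tuple from the caller."""
    txid = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_binding_request(txid), server)
        data, _ = sock.recvfrom(2048)
    if data[8:20] != txid:
        return None  # response does not belong to our request
    return parse_xor_mapped_address(data)
```

A call such as `query_stun(("stun.example.org", 3478))` would return your server-reflexive candidate; the hostname is a placeholder, and 3478 is the standard STUN port.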
STUN alone does not punch holes; it only discovers reflexive addresses. Many games also gather host candidates (your LAN IP, useful on the same network) and sometimes peer-reflexive candidates learned when an unexpected packet arrives during probing. The full ranking and connectivity-check machinery lives in ICE (Interactive Connectivity Establishment), which WebRTC and GameNetworkingSockets both implement in some form.
ICE runs STUN binding requests directly between the peers as connectivity checks (authenticated with short-term credentials exchanged over signaling, so stray or spoofed packets are ignored), tries candidate pairs in priority order, and nominates a pair that passes checks in both directions, normally the highest-priority one that works. For fighting games and rhythm titles where every millisecond counts, developers prefer UDP with unreliable delivery: retransmitting old input frames is worse than dropping them.
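The priority order mentioned above comes from two formulas in RFC 8445. The sketch below uses the RFC's recommended type preferences (host 126, peer-reflexive 110, server-reflexive 100, relay 0); the local preference of 65535 and the single component are assumptions that fit a client with one network interface and one data stream. Pairing every local type with every remote type is also a simplification of how real agents form their check lists.

```python
from typing import List, Tuple

# Recommended type preferences from RFC 8445.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_pref: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling: int, controlled: int) -> int:
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def rank_pairs(local: List[str], remote: List[str]) -> List[Tuple[str, str]]:
    """Order candidate-type pairs from most to least preferred."""
    return sorted(
        ((l, r) for l in local for r in remote),
        key=lambda p: pair_priority(candidate_priority(p[0]), candidate_priority(p[1])),
        reverse=True,
    )


types = ["host", "srflx", "relay"]
pairs = rank_pairs(types, types)
```

Because the lower of the two candidate priorities dominates the pair formula, any pair that includes a relay candidate sinks toward the bottom of the list. That is the mechanism behind "relay only after direct candidates are exhausted".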
Hole punching: opening a path without port forwarding
Simultaneous open is the classic trick. Both clients send UDP packets to each other's server-reflexive addresses at roughly the same time. Each outbound packet creates or refreshes a NAT mapping; when the inbound packet from the peer arrives, the mapping is already warm, so the router forwards it to the correct private host. Voila — direct P2P without user-configured port forwarding.
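The timing matters more than the description suggests, so here is a small simulation of the exchange. It models two port-restricted NATs as sets of allowed remote endpoints; the class and function names are hypothetical, and real NATs also expire mappings after a timeout, which this sketch ignores.

```python
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class PortRestrictedNat:
    """Toy NAT that only admits packets from endpoints the inside host has sent to."""

    def __init__(self, public: Endpoint) -> None:
        self.public = public
        self.allowed: Set[Endpoint] = set()

    def send_to(self, remote: Endpoint) -> None:
        """An outbound packet opens (or refreshes) a pinhole for `remote`."""
        self.allowed.add(remote)

    def receive_from(self, remote: Endpoint) -> bool:
        """True if an inbound packet from `remote` would be forwarded inside."""
        return remote in self.allowed


def punch(a: PortRestrictedNat, b: PortRestrictedNat) -> Tuple[bool, bool, bool]:
    # Step 1: A sends first. B has not sent anything yet, so B's NAT drops the packet,
    # but A's own NAT now has a pinhole open for B.
    a.send_to(b.public)
    first_packet_delivered = b.receive_from(a.public)
    # Step 2: B sends. A's pinhole is already warm, so this packet gets through.
    b.send_to(a.public)
    b_reaches_a = a.receive_from(b.public)
    # Step 3: A retries. B's NAT now has a pinhole for A as well.
    a_reaches_b = b.receive_from(a.public)
    return first_packet_delivered, b_reaches_a, a_reaches_b


alice = PortRestrictedNat(("203.0.113.7", 40000))
bob = PortRestrictedNat(("198.51.100.23", 52000))
result = punch(alice, bob)
```

The first packet is lost by design. Clients therefore send several probes in quick succession rather than a single one.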
Hole punching fails when both sides sit behind symmetric NAT, or when one side is symmetric and the other port-restricted, because the port learned via STUN is not the port the NAT will use toward the peer. It also fails under aggressive firewall rules or ISP policies that drop unsolicited UDP, and when signaling is wrong: if you advertise stale ports or the matchmaker never exchanges candidates, peers never attempt the simultaneous open. Corporate networks that only allow TCP 443 are another dead end; those clients need a relay from the first frame.
Latency on a successful punch is dominated by physics: speed of light through fiber, ISP routing, Wi-Fi contention. Two apartments in one city on decent fiber often see 5–15 ms round-trip on a direct path. That is why players notice instantly when traffic detours through another continent.
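A rough worked calculation shows the physical floor. Light in optical fiber travels at roughly two thirds of its vacuum speed, about 200 km per millisecond. The sketch below computes propagation delay only; the path lengths are hypothetical, and real round trips add routing, queuing, and last-mile delay on top.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in fiber, about two thirds of c


def min_rtt_ms(path_km: float) -> float:
    """Lower bound on round-trip time for a fiber path of the given one-way length."""
    return 2 * path_km / FIBER_KM_PER_MS


same_city = min_rtt_ms(30)           # hypothetical 30 km path across one metro area
other_continent = min_rtt_ms(6000)   # hypothetical 6,000 km one-way detour
```

The metro path costs a fraction of a millisecond in propagation, so most of a same-city 5–15 ms figure comes from equipment and access networks. The 6,000 km detour costs 60 ms before a single router has queued a packet.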
TURN relays: the expensive fallback
When punching fails, traffic must flow through a TURN (Traversal Using Relays around NAT) server. Each player sends encrypted UDP (or TCP/TLS if UDP is blocked) to the relay; the relay forwards packets to the other player. You pay twice the bandwidth and add at least one extra hop — often more if the relay is poorly placed.
Operating TURN at scale is not cheap. Cloud egress fees, DDoS protection, and geographic distribution of relay nodes all land on the platform bill. Game publishers therefore treat relays as a last resort: acceptable for co-op RPGs targeting 100 ms, unacceptable for rollback fighters targeting sub-frame input delay. Good matchmaking tries relay only after ICE exhausts direct candidates.
Misconfigured clients that skip punch attempts and default to relay produce the symptom players describe as "we live in the same city but ping is 120 ms": packets hairpin through a distant point of presence (POP) because the relay selection algorithm picked the wrong region, or because hole punching was never attempted or never succeeded.
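One plausible way to pick a relay region, sketched under the assumption that each client has measured its round-trip time to every candidate region: choose the region that minimizes the sum of both legs. The region names and timings below are made up for illustration, and this is not a description of how any specific platform selects relays.

```python
from typing import Dict, Tuple


def best_relay(rtt_a: Dict[str, float], rtt_b: Dict[str, float]) -> Tuple[str, float]:
    """Return (region, total_rtt_ms) minimizing the A -> relay -> B round trip."""
    shared = rtt_a.keys() & rtt_b.keys()
    if not shared:
        raise ValueError("no relay region reachable by both players")
    region = min(shared, key=lambda r: rtt_a[r] + rtt_b[r])
    return region, rtt_a[region] + rtt_b[region]


# Hypothetical measurements (ms) for two players in the same European city.
player_a = {"fra": 8.0, "iad": 95.0, "sgp": 170.0}
player_b = {"fra": 10.0, "iad": 98.0, "sgp": 165.0}

choice = best_relay(player_a, player_b)
wrong_region_rtt = player_a["iad"] + player_b["iad"]
```

With these numbers the nearby region yields an 18 ms relayed path, while a selector that wrongly picks the transatlantic region yields 193 ms for the same two players.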
Steam SDR and proprietary relay meshes
Valve operates a global Steam Datagram Relay (SDR) network — thousands of edge nodes that carry game traffic over encrypted UDP, with routing optimized for Steam's own titles and third-party games using GameNetworkingSockets. SDR is not identical to generic TURN, but it fills the same architectural slot: when direct P2P fails, packets flow through Valve's mesh instead of the public internet's shortest BGP path.
In healthy operation, SDR can improve routes — steering around congested ISP peering points or providing DDoS scrubbing near the player. In failure mode — buggy ICE logic, broken WebRTC data-channel handshakes, or a bad client DLL — every match relays even when punch would succeed. Community workarounds during the 2026 regression (rolling back specific networking libraries) strongly suggested the bug lived in candidate selection or punch timing, not in the relay hardware itself.
Other platforms mirror the pattern. Xbox and PlayStation run their own session and relay infrastructure; Epic's EOS provides relays for cross-platform titles; Discord routes voice through selective relays when UDP is blocked. The names differ; the trade-off is universal: reliability versus latency versus operating cost.
What developers and players can actually control
Players: Wired Ethernet beats Wi-Fi for jitter. Eliminate double NAT (a modem-router combo with another router behind it), for example by putting one of the two devices in bridge mode. On PC, firewall rules that block outbound UDP to game ports cause silent relay fallback; permissive rules for the game binary help. If latency spikes overnight where direct punch used to work, check platform status and community threads before blaming your ISP, because infrastructure regressions happen.
Developers: Expose connection type in debug UI (direct vs relay vs SDR POP). Log the ICE candidate pair that won each session. Run automated punch tests from symmetric-NAT lab VMs in CI. Track relay usage in production dashboards and alert on it, so a bad release shows up as "relay rate 95%" before players flood forums. For rollback netcode, prefer dedicated servers in region when punch fails: a 40 ms server is better than a 120 ms relay.
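The relay-rate metric for developers can be derived from the logged winning candidate pairs. A minimal sketch, assuming each session logs the candidate types of the selected pair as a `(local_type, remote_type)` tuple; the 30% alert threshold and the function names are hypothetical and would need tuning against a title's normal baseline.

```python
from collections import Counter
from typing import Iterable, Tuple

ALERT_THRESHOLD = 0.30  # hypothetical: alert when more than 30% of sessions relay


def path_type(local_type: str, remote_type: str) -> str:
    """A session is relayed if either side of the winning pair is a relay candidate."""
    return "relay" if "relay" in (local_type, remote_type) else "direct"


def relay_rate(winning_pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of sessions whose selected pair went through a relay."""
    counts = Counter(path_type(local, remote) for local, remote in winning_pairs)
    total = sum(counts.values())
    return counts["relay"] / total if total else 0.0


def should_alert(winning_pairs: Iterable[Tuple[str, str]]) -> bool:
    return relay_rate(winning_pairs) > ALERT_THRESHOLD


sessions = [
    ("host", "host"),    # same LAN
    ("srflx", "srflx"),  # successful hole punch
    ("relay", "srflx"),  # one side fell back to a relay
    ("relay", "relay"),  # both sides relayed
]
rate = relay_rate(sessions)
```

Comparing this rate per client build or library version is what would make a traversal regression visible within hours of a release.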
Everyone else: Understanding NAT traversal explains why "decentralized P2P" products (blockchain games, voice chat, file sync) still need signaling servers and often relays. There is no free lunch behind a home router. The same lesson applies to web apps optimizing latency: our Core Web Vitals guide covers user-perceived delay on HTTP; this explainer covers the transport layer underneath real-time multiplayer.
Reading the 2026 Steam outage through this lens
Reports from fighting-game communities described matches that used to punch at 5–10 ms suddenly showing 80–120 ms with relay indicators lit — sometimes between players in the same metro. That pattern screams "ICE never selected a host or srflx pair" rather than "fiber got slower." When a platform update changes default library versions or reorders connectivity checks, millions of players can inherit broken traversal without any single game shipping a patch.
The incident is a reminder that multiplayer feel is infrastructure. Players experience it as "this game is laggy"; engineers should ask which candidate pair carried the frames. NAT traversal is invisible when it works and brutally visible when it does not — much like how a faithful analog video emulator only earns praise after you have seen what cheap filters get wrong.
Further reading: RFC 8445 (ICE), RFC 8656 (TURN), Valve GameNetworkingSockets documentation.