Game reconnect and disconnect recovery systems explained
Harbor Ridge’s ranked 5v5 tactical mode treated every dropped packet like a rage quit: the server removed the player on the first missed heartbeat, awarded the round to the enemy team, and logged a competitive abandon. Telemetry across three regions showed 52% of ranked sessions ended with at least one disconnect-forfeit — and post-match surveys blamed “unstable internet” more often than balance or cheating. Rematch rates cratered because a thirty-second router reboot meant a lost rank and a twenty-minute cooldown.
A reconnect and disconnect recovery stack separates brief network blips from intentional leaves. It detects loss of presence, holds the player’s slot with a grace window, snapshots authoritative state for restoration, and only escalates to forfeit or bot substitution when grace expires. This guide covers disconnect FSM design, heartbeat and timeout tuning, rejoin tokens and snapshot formats, integration with pregame lobbies and rollback netcode, the Harbor Ridge refactor, a technique decision table, pitfalls, and a production checklist.
Disconnect detection: heartbeats and the presence FSM
Clients and servers exchange lightweight heartbeats: small UDP or WebSocket pings on a fixed interval (typically 1–3 seconds for action games, 5–10 for slower modes). When the server misses N consecutive heartbeats, it transitions the player from CONNECTED to DISCONNECTED_PENDING. The client may still be alive but behind a NAT rebinding or mobile tower handoff; jumping straight to REMOVED is what burned Harbor Ridge.
A practical FSM for competitive shooters:
- CONNECTED — heartbeats arriving; full input accepted.
- DISCONNECTED_PENDING — grace timer running; slot reserved; character may enter passive or AI-hold state.
- RECONNECTING — client presented rejoin token; server streaming snapshot delta.
- RESTORED — player caught up to current tick; brief spawn protection optional.
- ABANDONED — grace expired; forfeit rules apply; bot backfill or team shrinkage.
Tune N against false positives. Three missed 2-second heartbeats (6 seconds silent) is a common starting point for PC fiber; mobile titles often stretch to 15–30 seconds of silence before flagging a disconnect. Log disconnect reason codes separately: timeout, client quit, server kick, version mismatch, and anti-cheat ban each need a different recovery policy.
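The states and transitions above can be sketched as a small server-side state machine. This is a minimal illustration, not Harbor Ridge's actual code: the class name `PresenceFSM`, its method names, and the default values (2-second interval, N = 3, 45-second grace) are assumptions chosen to match the numbers in this guide. It also assumes that a stray heartbeat alone does not end the pending state, and that grace stops counting once a rejoin token has been accepted.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Presence(Enum):
    CONNECTED = auto()
    DISCONNECTED_PENDING = auto()
    RECONNECTING = auto()
    RESTORED = auto()
    ABANDONED = auto()


@dataclass
class PresenceFSM:
    heartbeat_interval: float = 2.0  # seconds between expected heartbeats
    missed_threshold: int = 3        # N consecutive misses before pending
    grace_seconds: float = 45.0      # how long the slot is held once pending
    state: Presence = Presence.CONNECTED
    last_heartbeat: float = 0.0
    pending_since: Optional[float] = None

    def on_heartbeat(self, now: float) -> None:
        # Assumption: heartbeats only count while the player is live;
        # a pending player must present a rejoin token first.
        if self.state in (Presence.CONNECTED, Presence.RESTORED):
            self.last_heartbeat = now
            self.state = Presence.CONNECTED

    def on_rejoin_token_accepted(self) -> None:
        if self.state is Presence.DISCONNECTED_PENDING:
            self.state = Presence.RECONNECTING

    def on_caught_up(self, now: float) -> None:
        if self.state is Presence.RECONNECTING:
            self.state = Presence.RESTORED
            self.last_heartbeat = now
            self.pending_since = None

    def tick(self, now: float) -> Presence:
        """Call once per server tick to apply timeouts."""
        live = (Presence.CONNECTED, Presence.RESTORED)
        silence_limit = self.heartbeat_interval * self.missed_threshold
        if self.state in live and now - self.last_heartbeat >= silence_limit:
            self.state = Presence.DISCONNECTED_PENDING
            self.pending_since = now
        elif (self.state is Presence.DISCONNECTED_PENDING
              and now - self.pending_since >= self.grace_seconds):
            self.state = Presence.ABANDONED
        return self.state
```

Passing `now` in explicitly, rather than reading a clock inside the class, keeps the timeouts deterministic and easy to unit-test.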
Grace windows: how long to hold the slot
The grace window is how long teammates wait before the match proceeds without the disconnected player. Too short and Wi-Fi blips become forfeits; too long and four-versus-five rounds feel stalled. Harbor Ridge originally used zero grace; the refactor tiered by context:
- Pregame / lobby — 60–90 seconds (align with lobby accept timers).
- Between rounds — full buy-phase duration plus 15 seconds (player can reconnect during economy).
- Mid-round combat — 45 seconds ranked, 25 seconds casual; character enters “spectator ghost” (invulnerable, no input) instead of standing AFK.
- Match point / overtime — extended 60-second grace to avoid anticlimactic wins on router reboots.
Display grace remaining to both teams. Hidden timers breed accusations of pause abuse; a visible countdown with muted audio stings less than a silent freeze. Pair grace length with MMR policy: a reconnect within grace should never count as an abandon; expiry may apply a reduced penalty on a first offense.
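The tiers above reduce to a small lookup plus a countdown helper. The sketch below is illustrative only: `Phase`, `grace_seconds`, and `grace_remaining` are hypothetical names, the lobby value takes the low end of the 60–90 second range, and the buy-phase duration is a parameter because this guide does not state it.

```python
from enum import Enum, auto


class Phase(Enum):
    LOBBY = auto()
    BETWEEN_ROUNDS = auto()
    MID_ROUND = auto()
    MATCH_POINT = auto()


def grace_seconds(phase: Phase, ranked: bool, buy_phase_seconds: float) -> float:
    """Return how long to hold a disconnected player's slot."""
    if phase is Phase.LOBBY:
        return 60.0                       # low end of the 60-90 s range
    if phase is Phase.BETWEEN_ROUNDS:
        return buy_phase_seconds + 15.0   # full buy phase plus 15 s
    if phase is Phase.MATCH_POINT:
        return 60.0                       # extended grace for match point / overtime
    return 45.0 if ranked else 25.0       # mid-round combat


def grace_remaining(pending_since: float, now: float, grace: float) -> float:
    """Seconds left on the countdown shown to both teams (never negative)."""
    return max(0.0, grace - (now - pending_since))
```

For example, `grace_remaining(pending_since=10.0, now=25.0, grace=45.0)` gives 30.0, which is the value a visible countdown would display.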
State snapshots and rejoin tokens
Reconnecting is not reloading the main menu. The server must prove the returning client owns the slot and ship enough authoritative state to resume without giving an informational advantage.
Rejoin tokens are short-lived signed blobs issued at match start: {match_id, player_id, session_secret, expiry}. The client stores them locally; on reconnect it presents the token before any gameplay snapshot. Rotate secrets per match; reject tokens after abandon or match end.
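One way to sign and check such a token is an HMAC over the payload, using only the Python standard library. This is a sketch under stated assumptions, not the scheme Harbor Ridge shipped: the function names and the `body.signature` wire format are hypothetical, and the per-match secret is used here as a server-side signing key rather than being embedded in the token.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def issue_token(signing_key: bytes, match_id: str, player_id: str,
                ttl_seconds: int, now: float) -> str:
    """Build a signed token of the form '<base64 payload>.<hex HMAC>'."""
    payload = {"match_id": match_id, "player_id": player_id,
               "expiry": int(now) + ttl_seconds}
    body = base64.urlsafe_b64encode(
        json.dumps(payload, sort_keys=True).encode()).decode()
    sig = hmac.new(signing_key, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def verify_token(signing_key: bytes, token: str, match_id: str,
                 now: float) -> Optional[dict]:
    """Return the payload if the token is authentic and current, else None."""
    body, sep, sig = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(signing_key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["match_id"] != match_id or now >= payload["expiry"]:
        return None
    return payload
```

Because the key is generated per match (for instance with `secrets.token_bytes(32)`), discarding it at match end or on abandon invalidates every outstanding token for that match without a revocation list.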
Snapshot contents depend on genre but typically include:
- Player transform, velocity, stance, health, armor, ammo, economy credits.
- Inventory and equipped loadout IDs (locked from pregame customization).
- Team score, round index, objective state (bomb timer, zone ownership).
- Input ack cursor — last processed client sequence number.
- Compressed event log tail for catch-up (last 5–15 seconds of relevant deltas).
For tick-based shooters using lag compensation, the reconnecting client replays the event tail locally until its simulation time matches the server clock, then resumes live input. Cap replay length; if a player was gone four minutes, a full state snapshot plus “you were eliminated during downtime” is fairer than simulating four minutes of combat in one frame.
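The replay cap can be expressed as a bounded event tail plus a decision about which catch-up path to take. The sketch below is illustrative: `Event`, `EventTail`, `plan_catch_up`, and the `last_seen_seq` cursor over server events are hypothetical names, and the 8-second window is simply one value inside the 5–15 second range mentioned earlier.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass
class Event:
    seq: int      # server-side sequence number of this delta
    t: float      # server time in seconds
    delta: dict   # opaque state change


@dataclass
class EventTail:
    window_seconds: float = 8.0
    events: Deque[Event] = field(default_factory=deque)

    def append(self, ev: Event) -> None:
        self.events.append(ev)
        # Drop deltas older than the window so memory stays bounded.
        while ev.t - self.events[0].t > self.window_seconds:
            self.events.popleft()


def plan_catch_up(tail: EventTail, last_seen_seq: int,
                  disconnected_at: float, now: float) -> Tuple[str, List[Event]]:
    """Choose between replaying the tail and sending a full snapshot."""
    if now - disconnected_at > tail.window_seconds:
        # Gone longer than the tail covers: send current state, no replay.
        return "full_snapshot", []
    return "replay_tail", [e for e in tail.events if e.seq > last_seen_seq]
```

A player who was away for three seconds replays only the deltas after their cursor; a player who was away for minutes gets the current state and an explicit notice instead of a compressed replay.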
Rollback titles need special care: stored savestates per frame explode memory if every disconnect spawns a fork. Prefer rejoin into the current confirmed frame with explicit “you missed these inputs” UI rather than rewinding opponents.
Bot backfill, pause, and team shrinkage
When grace expires, the match must continue. Three common policies:
- AI backfill — substitute a tuned bot at the disconnected player’s last known state. Works in casual modes; ranked players hate fighting bots that play differently from the human they were leading against.
- Team shrinkage — continue 4v5 with economy compensation (extra credits, longer respawn timers on the advantaged side). Harbor Ridge ranked uses shrinkage after grace; casual uses easy bots.
- Pause vote — one team-wide pause per half (30 seconds max). Abuse-prone in esports; acceptable in co-op PvE.
If the disconnected player returns after backfill, decide explicitly: replace the bot silently, bench the returnee until next round, or deny reentry. Silent replacement mid-gunfight is confusing; benching until round end is the cleanest competitive compromise.
Client UX: reconnect flow that players trust
Technical recovery fails if the UI panics. Minimum viable reconnect UX:
- Automatic retry with exponential backoff (1s, 2s, 4s, 8s) while showing “Reconnecting… attempt 3 of 8.”
- Preserve rejoin token across app backgrounding on mobile.
- Offer “Leave match” only after two failed full retry cycles — mis-taps during lag spikes cause real abandons.
- On restore, flash a non-blocking banner: “Reconnected — you were dead for 12s” so death recap gaps make sense.
- Route diagnostics: log RTT, packet loss, and disconnect phase for support tickets without exposing other players’ IPs.
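The automatic retry item in the list above might look like the following on the client. This is a hedged sketch: `backoff_delays`, `reconnect`, and the `try_connect` callback are hypothetical names, and capping the delay at 8 seconds for attempts five through eight is an assumption, since the list gives only the first four delays.

```python
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 8.0,
                   max_attempts: int = 8) -> Iterator[Tuple[int, float]]:
    """Yield (attempt number, delay to wait after that attempt fails)."""
    delay = base
    for attempt in range(1, max_attempts + 1):
        yield attempt, min(delay, cap)
        delay *= factor


def reconnect(try_connect: Callable[[], bool],
              on_progress: Callable[[str], None] = print,
              sleep: Callable[[float], None] = time.sleep,
              max_attempts: int = 8) -> bool:
    """Run one full retry cycle; return True as soon as a connect succeeds."""
    for attempt, delay in backoff_delays(max_attempts=max_attempts):
        on_progress(f"Reconnecting... attempt {attempt} of {max_attempts}")
        if try_connect():
            return True
        if attempt < max_attempts:   # no point waiting after the final try
            sleep(delay)
    return False
```

A caller could run this cycle twice and only then enable the "Leave match" button. Injecting `sleep` and `on_progress` keeps the loop testable without real waits or a real UI.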
Cross-play and NAT traversal issues often masquerade as disconnects. Surface NAT type warnings in the lobby, not mid-round, so players fix forwarding before ranked starts.
Harbor Ridge refactor: 52% to 11% session abandon
Harbor Ridge shipped four changes over one season:
- Tiered grace FSM replacing instant kick (see above).
- Signed rejoin tokens stored in client secure prefs; eliminated “reconnect put me in spectator with no team” bugs.
- Snapshot v2 with 8-second event tail instead of full world dump — median reconnect time fell from 9.4s to 2.1s.
- Abandon forgiveness: the first disconnect-forfeit per week has its MMR loss cut by 50% if a reconnect was attempted (the client logs the retry count).
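The forgiveness rule in the last item is simple enough to state as a function. The sketch below is an illustration only: `forfeit_mmr_loss` and its parameter names are hypothetical, and it reads "first per week" as "no earlier disconnect-forfeit recorded in the current week".

```python
def forfeit_mmr_loss(base_loss: float, prior_forfeits_this_week: int,
                     retry_count: int) -> float:
    """Return the MMR loss to apply for a disconnect-forfeit."""
    first_this_week = prior_forfeits_this_week == 0
    tried_to_return = retry_count > 0   # taken from client retry logs
    if first_this_week and tried_to_return:
        return base_loss * 0.5          # forgiven: half the usual loss
    return base_loss
```

With a base loss of 20, a first forfeit after three retries costs 10, while a second forfeit in the same week, or one with no retries logged, costs the full 20.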
Ranked session abandon (sessions with at least one disconnect-forfeit) dropped from 52% to 11%. Average match completion rose 19%. Support tickets tagged “unfair disconnect loss” fell 74%. Remaining abandons clustered on genuine quits and anti-cheat kicks, which is the signal the systems team actually wanted to punish.
Technique decision table
| Approach | Best for | Trade-off |
|---|---|---|
| Instant kick on missed heartbeat | Fighting-game locals, tiny lobbies | Catastrophic ranked UX on real-world networks |
| Grace + snapshot rejoin | Competitive shooters, MOBAs, sports | Engineering cost; snapshot size tuning |
| Rollback resync | Frame-perfect fighters, platform fighters | Memory and CPU; poor for 10-player tactical |
| Bot backfill | Casual/quickplay, co-op PvE | Ranked integrity complaints |
| Pause vote | Co-op, small esports LAN | Abuse, pacing breaks in online ranked |
| Late-join replacement (new human) | Large battle royale, MMO zones | Not true reconnect; different fairness model |
Common pitfalls
- Zero grace in ranked — punishes ISP blips more than intentional leaves.
- Rejoin without token auth — ghost spectators or slot stealing exploits.
- Full-world snapshot on reconnect — multi-second stalls and leak of fog-of-war data.
- Silent bot swap — players shoot a bot thinking it is the human they were tracking.
- Same abandon penalty for timeout and quit — support volume explodes; no forgiveness path.
- Ignoring lobby-phase disconnects — should requeue the lobby, not start 4v5 round one.
- Rollback fork per disconnect — memory blowups in long sessions.
Production checklist
- Define presence FSM states and transitions with logged reason codes.
- Tune heartbeat interval and missed-beat threshold per platform.
- Set tiered grace durations for lobby, between-round, and mid-round.
- Issue signed per-match rejoin tokens at session start.
- Design minimal authoritative snapshot + event tail for catch-up.
- Cap replay length; handle death-during-disconnect explicitly.
- Choose backfill vs shrinkage policy per queue type.
- Build automatic client retry with visible progress UI.
- Separate competitive abandon from reconnect-forgiven timeout in MMR.
- Load-test 20% concurrent disconnect rate without server meltdown.
Key takeaways
- Disconnect is not abandon until grace expires — hold the slot with a visible timer.
- Rejoin tokens + snapshots restore authority without leaking extra information.
- Tier grace by match phase — lobby, buy phase, and mid-combat need different budgets.
- Ranked prefers shrinkage over sneaky bots — casual can backfill AI.
- Harbor Ridge cut session abandon from 52% to 11% with grace, tokens, and snapshot v2.
Related reading
- Match lobby and pregame systems explained — accept gates and roster lock before round one
- Rollback netcode systems explained — savestates and resimulation for fighters
- Hit registration and lag compensation explained — catch-up after reconnect in shooters
- Game matchmaking explained — queue design and abandon penalties