Guide

Weight balance scale puzzle systems explained

Harbor Vault's Scale Sanctum was a compact teaching room: three numbered crates (2, 3, and 5 weight units), one floor scale, and a vault door that should open when the pan read exactly 8. On paper it is a clean subset-sum puzzle. Yet 31% of playtesters abandoned before solving it. Telemetry showed three failure modes: the scale silently included the player's inventory weight, stacked crates counted as one lump without showing per-crate contribution, and the HUD never displayed the running total — only a vague “balanced” glow when within tolerance. Players brute-forced combinations by dragging every object twice instead of reasoning. After a refactor — explicit weight readout, optional player-weight exclusion on puzzle scales, per-crate stack labels, and a one-time tutorial glyph showing unit weights — first-attempt solve rate rose from 19% to 64% and median time-to-solve dropped from 4m 12s to 1m 38s.

Weight-balance scale puzzles gate progression when the cumulative mass on a pan (or the difference between two pans) satisfies a predicate: reach a target total, match a reference weight, or stay within a band. Unlike boolean pressure plates that only care whether something is present, scale puzzles teach combinatorial reasoning — which objects to place, in what combination, and in what order if weights are consumable. This guide covers scale archetypes, weight property models, receiver chains, player interaction UX, solvability validation, the Harbor Vault refactor, a technique decision table against adjacent puzzle families, pitfalls, and a designer checklist.

Scale puzzles vs adjacent puzzle families

Scale rooms overlap with other gating mechanics but optimize for different player skills:

Pressure plates — binary or weighted presence; usually no combinatorial target. Good when standing on a spot is the whole puzzle.
Sokoban block pushing — spatial occupancy and deadlock avoidance; blocks rarely carry numeric weight labels. Sokoban teaches topology; scales teach arithmetic.
Inventory keys and locks — discrete tokens, not sums. Combine when a key is heavy enough that carrying it changes a scale elsewhere.
Circuit routing — boolean signal graphs. Scales output a numeric or threshold signal into the same logic layer as plates.

Hybrid rooms work well: push a crate onto a scale (Sokoban + weight) to power a relay that opens a door (circuit). Keep one primary teaching goal per chamber so players know whether they are optimizing placement or arithmetic.

Scale archetypes

Most shipped scale puzzles fall into four archetypes. Document which one a room uses in the level brief so QA tests the right edge cases:

Single-pan threshold

One platform sums all entities in a trigger volume. Receivers fire when sum(weights) == target, >= min, or in [low, high]. Classic “place crates until the door opens.” Tolerance bands (±0.5) reduce frustration when physics jitter exists; show the band on the UI if you use them.

Dual-pan equality

Left and right pans each sum independently. Success when |left - right| <= epsilon or exact equality. Teaches partitioning: split a set into two equal subsets (NP-hard in general, but level designers choose small instances). Visual tilt animation should map monotonically to imbalance so players can binary-search mentally.

Comparative reference

One pan holds a fixed reference mass (statue, water level, golden ingot); the player must match or exceed it on the other side. Useful when you want a clear visual anchor without revealing numeric labels — though hiding numbers increases abandon rate unless you provide a unit legend early.

Sequential weigh-in

Multiple scales must each hit distinct targets in order, or a central display accumulates weigh-ins (combination-lock metaphor). Locks progression until prior scale latches. Prevents placing all crates on one pan at once; good for pacing in escape-room style games covered in escape room design.

Weight property model

Implement weights as data on interactables, not as physics mass alone. Physics mass drifts with scale, drag, and joint stacking; puzzle weight should be authoritative:

Integer units — clearest for players (1, 2, 3 glyphs on crate faces). Avoid floats unless you teach decimals explicitly.
Hidden vs visible — hidden weights support discovery; visible weights support deduction. Switching mid-campaign without signposting feels unfair.
Player body weight — decide per scale: include (realistic), exclude (puzzle pan only counts objects), or toggle via “step off” prompt. Inclusion without feedback caused most Harbor abandons.
Stack semantics — define whether nested stacks sum all children or only root colliders. One rule project-wide.
Consumable weights — sand bags, water, or fuel that deplete change the sum over time; require different validation (monotone decrease).

Expose a debug overlay in editor builds listing entity_id → weight for every object in the volume. Designers catch duplicate colliders double-counting before players do.

Receivers, hysteresis and output signals

Scales rarely open doors directly; they emit a signal into your puzzle logic layer:

Threshold met — boolean sustained while predicate true.
Latch on match — one-shot when equality first achieved; prevents door slamming if player removes a crate.
Analog output — current sum drives a dial, bridge height, or water fill (continuous feedback).
Hysteresis — open at sum ≥ 8, close at sum < 7 so physics bounce does not flicker receivers.

When chaining to circuit routing, treat the scale comparator as a node: output high when predicate true, else low. Document evaluation tick relative to physics settle (see below).

Player interaction and readability

Scale puzzles live or die on feedback. Minimum viable UX:

Numeric readout — show current sum / target on the scale pedestal or floating HUD when in range. “6 / 8” beats ambiguous glow.
Per-object contribution — highlight crates on the pan with their unit value; dim objects not yet placed.
Tilt or compression animation — proportional to imbalance on dual-pan setups; audio pitch optional.
Pickup and placement affordances — snap zones on pan; reject oversize objects with clear VFX, not silent failure.
Reset control — return all movable weights to spawn poses without reloading the room.

For action-adventure pacing, allow leaving the room and returning; persist scale state in save data so experimentation is low cost.

Physics settle window

Do not evaluate predicates every physics substep while crates wobble. Common pattern: wait until linear velocity of all pan occupants < ε for 200–300 ms, then sample weights once. Prevents the door opening mid-bounce and closing when objects settle.

Solvability validation and softlocks

Treat each scale room as a small constraint problem. Before shipping:

Subset-sum check — for single-pan target T and weights {w₁…wₙ}, confirm some subset sums to T (or lies in the allowed band). NP-complete in general but trivial at n ≤ 12 via bitmask enumeration in CI.
Partition check — for dual-pan equality, confirm the multiset splits into two equal sums.
Consumable audit — if the player can remove weights from the level (throw in a pit), verify no required piece is destroyable.
Inventory cross-check — if body weight counts, verify the player cannot stand on the pan in a state that falsely satisfies the predicate before intended.
Unique vs multiple solutions — multiple valid subsets are usually fine; document if you require a canonical placement for narrative reasons.

Log scale_state_hash on abandon to find clusters stuck at sum 7 when target is 8 — often a double-counted collider, not puzzle difficulty.

Harbor Vault Scale Sanctum refactor

Post-mortem on the 19% first-attempt solve rate identified four fixes:

Player weight policy (38% of abandons) — puzzle scales now use count_mask = MOVABLE_OBJECTS by default; player mass excluded unless a room flag include_avatar is set for deliberate body-weight puzzles.
Hidden running total (31%) — pedestal UI shows current / target with color shift at 75% and 100%.
Stack double-count (19%) — child crate colliders inside a pallet prop were summing twice; parent aggregate weight only.
No unit legend (12%) — first scale in the zone now has wall glyphs 1–5; later rooms may hide numbers but never the first.

Secondary metric: players who opened the in-game journal hint (subset-sum tip) rose from 8% to 22% after the UI made them aware they were close (6/8) rather than lost.

Technique decision table

Player skill you want	Prefer	Avoid
Arithmetic combination	Single-pan threshold with labeled weights	Binary plate with no numbers
Partition / trade-off	Dual-pan equality	Sokoban-only deadlock mazes
Spatial + numeric	Push crates onto distant scales	Instant teleport deposit buttons
Discovery	Hidden weights on identical-looking props	Same room mixing hidden and visible without cue
Timed pressure	Decay weights (melting ice) on a latch	Physics jitter without settle window
Multi-room progression	Sequential weigh-in latches	One scale that must satisfy three unrelated targets simultaneously

Common pitfalls

Counting player mass silently — changes the math without teaching it; always explicit.
Physics mass ≠ puzzle weight — large light crate confuses when mass drives bounce but not the sum.
Evaluating before settle — flickering doors and latch races.
Duplicate colliders — one visual crate, two trigger volumes, double weight.
Unsolvable instances — subset with no valid combination; caught only by offline search.
Overloading one scale — six objects and a hidden 7th weight class; cognitive load explodes.
No reset — player carries wrong crate onto pan, softlocks without return-to-spawn.
Mixing units mid-game — kilograms in one room, arbitrary gems in the next without conversion fiction.

Designer and engineer checklist

Pick archetype: threshold, dual-pan, reference, or sequential.
Assign integer puzzle weights; separate from Rigidbody mass if needed.
Define player inclusion policy per scale; document on the scale actor.
Show current sum and target (or clear tilt metaphor) before expecting solve.
Wait for physics settle before predicate evaluation; add hysteresis on outputs.
Run subset-sum or partition validation in level CI for movable weight sets.
Provide reset pedestal or respawn movable props.
Chain to receivers with explicit latch vs sustained semantics.
Telemetry: log sum at abandon and time-on-pan for tuning.
First scale in a zone teaches units; later rooms may increase difficulty.

Key takeaways

Scale puzzles gate on numeric predicates — sums, equality, or bands — not just presence.
Readable totals matter more than realistic physics — Harbor Vault solve rate tripled after showing 6/8 style feedback.
Player weight must be a deliberate rule — never silent.
Validate solvability offline — subset-sum at design time prevents impossible rooms.
Hybridize with plates, Sokoban, and circuits — but one primary skill per chamber.