Risk of ruin explained

Harbor Capital's momentum sleeve showed +0.18R expectancy out-of-sample, a solid edge by any trade-level metric. A junior allocator proposed scaling from 0.5% to 3% equity risk per signal because “the model is proven.” The risk committee asked a different question: what is the probability that this sizing pushes the sleeve past its 15% drawdown kill switch before the edge compounds?

A Monte Carlo run on the actual trade distribution at 3% risk showed a 22% chance of hitting ruin (defined as −50% from peak) within 250 trades, which is unacceptable for a regulated sleeve. At 1% risk the same simulation showed a 1.4% ruin probability. The allocator kept the edge and changed the risk-of-ruin profile instead.

This guide defines ruin probability for traders and allocators, covers closed-form approximations and simulation, relates ruin to Kelly sizing and maximum drawdown, and walks through the Harbor Capital refactor. It closes with a method decision table, common pitfalls, and a production checklist.

What “ruin” means

Risk of ruin (probability of ruin) is the chance that a sequence of bets or trades drives account equity below a defined ruin threshold before recovering. The threshold is not always zero:

  • Total loss: equity reaches $0 (or margin call).
  • Fractional ruin: equity falls 50% or 75% from starting capital — common in prop-firm and fund mandates.
  • Drawdown kill switch: peak-to-trough loss exceeds a policy limit (e.g. −15%), triggering flat positioning.

Ruin is a path question, not an average-return question. Two strategies with identical positive expectancy can have wildly different ruin probabilities if bet size, payoff skew, or loss clustering differ. Sequence risk matters: the same ten wins and ten losses in different orders produce different drawdowns.
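The ordering effect can be checked directly. The sketch below is a minimal illustration, assuming even-money trades of +1R and −1R and an illustrative 5% risk per trade; the helper name `max_drawdown` is hypothetical. Both orderings end at the same equity because multiplication commutes, yet their drawdowns differ sharply.

```python
def max_drawdown(r_multiples, f):
    """Worst peak-to-trough loss (as a fraction of peak) for one trade order."""
    equity = peak = 1.0
    worst = 0.0
    for r in r_multiples:
        equity *= 1.0 + f * r            # risk fraction f of current equity
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


f = 0.05                                  # illustrative 5% risk per trade
alternating = [1, -1] * 10                # win, loss, win, loss, ...
losses_first = [-1] * 10 + [1] * 10       # ten losses, then ten wins

print(f"alternating : {max_drawdown(alternating, f):.1%}")
print(f"losses first: {max_drawdown(losses_first, f):.1%}")
```

A path that front-loads its losses can breach a drawdown limit that the alternating path never approaches, even though the trade set is identical.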

Risk of ruin complements per-trade metrics. Expectancy tells you the drift; ruin probability tells you whether you survive long enough for the drift to matter.

Classic ruin formula

For a simplified model (a fixed bet equal to a fraction f of starting equity, so the account holds 1/f betting units; binary outcomes of +1 or −1 unit per bet; win probability p; loss probability q = 1 − p), the probability of eventually reaching ruin (equity → 0) when p > q (positive edge) is:

Ruin Probability ≈ ((1 − p) / p) ^ (1 / f)

Intuition: the account holds 1/f betting units, so a larger f leaves fewer units to absorb a losing streak. The ratio (1 − p)/p captures edge; the exponent 1/f captures bet size relative to capital. When bets are instead resized to a fraction of current equity, equity never reaches exactly zero, and the formula serves only as a rough guide to how sensitive ruin is to f.

Example: p = 0.55, even-money payoff (win and loss both 1R), f = 0.10 (10% of equity per trade):

(0.45 / 0.55) ^ (1 / 0.10) = (0.818) ^ 10 ≈ 0.134 → 13.4% ruin risk

Drop f to 0.02 (2% risk) and ruin probability falls to roughly 0.004% ((0.818) ^ 50 ≈ 0.00004). The formula is approximate: real trading has variable payoffs, costs, and correlation. The sensitivity to f is the lesson professionals internalize.
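The formula is easy to evaluate across several bet sizes. The following is a minimal sketch of the even-money approximation above, reading 1/f as the number of betting units the account holds; the function name `classic_ruin_probability` is an illustrative choice, not an established API.

```python
def classic_ruin_probability(p, f):
    """Even-money ruin approximation: ((1 - p) / p) ** (1 / f).

    p: win probability; f: bet size as a fraction of equity,
    so the account is treated as holding 1/f betting units.
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in (0, 1]")
    if p <= 0.5:
        return 1.0                       # no edge: ruin is eventually certain
    return ((1.0 - p) / p) ** (1.0 / f)


for f in (0.10, 0.05, 0.02):
    print(f"f = {f:.0%}: ruin ~ {classic_ruin_probability(0.55, f):.4%}")
```

Because f sits in the exponent, halving the bet size squares the base term rather than halving the result, which is why small reductions in f cut ruin probability so sharply.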

When p ≤ q (no edge or negative edge), ruin probability approaches 1.0 regardless of sizing. There is no safe bet size for a losing game over infinite horizons.

Variable payoffs and the edge ratio

Real strategies rarely have symmetric +1/−1 outcomes. Ralph Vince and others extend ruin analysis using the edge ratio (average win divided by average loss, also called the payoff ratio) alongside win rate. For fixed fractional betting where each win multiplies equity by 1 + f·W and each loss multiplies by 1 − f·L (with W and L in R-multiples), simulation is more reliable than a single closed form.
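One quantity that follows directly from the multipliers 1 + f·W and 1 − f·L is the expected log growth per trade. The sketch below assumes a two-outcome model with hypothetical inputs (p = 0.45, W = 2, L = 1, not taken from any real trade log) and shows that growth can turn negative at large f even when expectancy is positive.

```python
import math


def log_growth_per_trade(p, W, L, f):
    """Expected log growth per trade when a win multiplies equity by 1 + f*W
    and a loss multiplies it by 1 - f*L (W and L in R-multiples, L > 0)."""
    if f * L >= 1.0:
        return float("-inf")             # a single loss wipes out the account
    return p * math.log(1.0 + f * W) + (1.0 - p) * math.log(1.0 - f * L)


# Hypothetical inputs: expectancy is 0.45*2 - 0.55*1 = +0.35R per trade.
p, W, L = 0.45, 2.0, 1.0
for f in (0.02, 0.10, 0.20, 0.40):
    print(f"f = {f:.0%}: log growth per trade = {log_growth_per_trade(p, W, L, f):+.4f}")
```

A negative value means the typical path decays over time, so ruin at any fixed threshold becomes a matter of when rather than if.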

Key relationships practitioners use:

  • Higher win rate lowers ruin at fixed f, but payoff ratio still dominates at low win rates.
  • Higher payoff ratio (bigger winners vs losers) lowers ruin because recovery from losses requires fewer wins.
  • Larger f increases ruin exponentially, not linearly.
  • Positive skew (occasional large wins) reduces ruin vs negative skew at the same expectancy.

Use MAE and MFE distributions to ensure your assumed W and L match realized trade paths, not idealized backtests.

Monte Carlo simulation (the practical method)

For production sizing decisions, Monte Carlo simulation is the standard tool:

  1. Collect an out-of-sample trade return series in R-multiples (or % equity).
  2. Choose a risk fraction f per trade.
  3. Define ruin threshold (e.g. −50% from start, or −15% from peak).
  4. Run 10,000+ simulated equity curves by resampling trades with replacement (bootstrap) or shuffling order.
  5. Record the fraction of paths that hit ruin before horizon N trades.
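The steps above can be sketched in a few lines using only the standard library. This is a minimal illustration, not a production engine: the function name `ruin_probability`, the synthetic `sample_trades` pool, and the choice to measure ruin as drawdown from peak are all assumptions made for the example.

```python
import random


def ruin_probability(trades_r, f, ruin_drawdown, horizon, n_paths=10_000, seed=0):
    """Fraction of bootstrap paths whose peak-to-trough drawdown reaches
    ruin_drawdown (e.g. 0.50) within `horizon` trades."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        equity = peak = 1.0
        # Step 4: resample the trade pool with replacement (bootstrap).
        for r in rng.choices(trades_r, k=horizon):
            equity *= 1.0 + f * r        # risk fraction f of current equity
            peak = max(peak, equity)
            if 1.0 - equity / peak >= ruin_drawdown:
                ruined += 1              # step 5: count the path once, then stop
                break
    return ruined / n_paths


# Hypothetical pool in R-multiples; replace with an out-of-sample trade log.
sample_trades = [1.6] * 46 + [-1.0] * 54

for f in (0.005, 0.01, 0.03):
    # n_paths is reduced here for speed; use 10,000+ for real decisions.
    prob = ruin_probability(sample_trades, f, 0.50, 250, n_paths=1_000)
    print(f"f = {f:.1%}: ruin probability = {prob:.1%}")
```

Fixing the seed makes runs reproducible, which helps when comparing f values: differences then come from sizing rather than from sampling noise.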

Bootstrap resampling preserves the empirical payoff distribution (fat tails, outliers) better than assuming normal returns. Shuffling order isolates sequence risk: if ruin occurs only in certain orderings of the same trades, your edge may be real but your sizing is too large to survive an unlucky sequence.

Report ruin probability at multiple horizons (100, 250, 500 trades) and multiple f values. A sleeve with 200 trades per year cares about 250-trade ruin; a day trader cares about 50-trade windows.
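One way to produce such a report is to simulate each path once to the longest horizon, record when ruin first occurs, and read off every shorter horizon from the same paths. The sketch below assumes drawdown-from-peak ruin and a hypothetical trade pool; `first_ruin_times` and `ruin_table` are illustrative names.

```python
import random


def first_ruin_times(trades_r, f, ruin_drawdown, max_horizon, n_paths=2_000, seed=0):
    """For each bootstrap path, the trade count at which drawdown first reaches
    ruin_drawdown, or None if it never does within max_horizon."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_paths):
        equity = peak = 1.0
        hit = None
        for i, r in enumerate(rng.choices(trades_r, k=max_horizon), start=1):
            equity *= 1.0 + f * r
            peak = max(peak, equity)
            if 1.0 - equity / peak >= ruin_drawdown:
                hit = i
                break
        times.append(hit)
    return times


def ruin_table(trades_r, fractions, horizons, ruin_drawdown, **kwargs):
    """Map (f, horizon) -> ruin probability, reusing one set of paths per f."""
    table = {}
    for f in fractions:
        times = first_ruin_times(trades_r, f, ruin_drawdown, max(horizons), **kwargs)
        for h in horizons:
            hits = sum(1 for t in times if t is not None and t <= h)
            table[(f, h)] = hits / len(times)
    return table


# Hypothetical pool in R-multiples; n_paths kept small for a quick demo.
pool = [1.6] * 46 + [-1.0] * 54
report = ruin_table(pool, (0.01, 0.03), (100, 250, 500), 0.50, n_paths=500)
for (f, h), prob in sorted(report.items()):
    print(f"f = {f:.0%}, horizon = {h:>3} trades: ruin = {prob:.1%}")
```

Reusing paths guarantees that ruin probability never decreases as the horizon lengthens, which keeps the report internally consistent.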

Pair simulation with walk-forward backtesting so the trade pool feeding the bootstrap is out-of-sample and includes realistic costs.

Ruin vs Kelly vs maximum drawdown

These three concepts answer related but distinct questions:

Concept | Question | Output
Kelly criterion | What f maximizes long-run geometric growth? | Optimal fraction (often too aggressive for real mandates)
Risk of ruin | What is P(hit ruin threshold) at a chosen f? | Ruin probability (the complement of survival probability)
Maximum drawdown | What was the worst peak-to-trough loss on a realized path? | Historical depth (backward-looking)

Full Kelly maximizes growth but often implies double-digit ruin probability over realistic horizons. Practitioners use fractional Kelly (half-Kelly, quarter-Kelly) specifically to cut ruin risk while retaining most of the growth benefit. A common workflow: compute Kelly fraction from win rate and payoff, take 25–50% of it, then verify ruin probability via Monte Carlo is below a policy cap (e.g. 2% over 250 trades).
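The first two steps of that workflow can be sketched as follows. This assumes a simplified two-outcome model (every win is the average win, every loss the average loss), for which the growth-optimal fraction is p / L − q / W; real trade logs with varied payoffs need a numerical search instead. The inputs are hypothetical and `kelly_fraction` is an illustrative name.

```python
def kelly_fraction(p, avg_win_r, avg_loss_r):
    """Growth-optimal risk fraction for a two-outcome model: win +avg_win_r R
    with probability p, otherwise lose avg_loss_r R (avg_loss_r > 0)."""
    q = 1.0 - p
    return max(0.0, p / avg_loss_r - q / avg_win_r)   # 0 when there is no edge


# Hypothetical inputs, not taken from any real trade log.
full = kelly_fraction(p=0.45, avg_win_r=2.0, avg_loss_r=1.0)
print(f"full Kelly   : {full:.2%}")
print(f"half Kelly   : {0.50 * full:.2%}")
print(f"quarter Kelly: {0.25 * full:.2%}")
```

The fractional values are candidates only; each still needs a simulated ruin probability checked against the policy cap before it is approved.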

Maximum drawdown on a backtest is one realized path. Ruin simulation explores thousands of paths. A strategy with −12% historical max drawdown might still show 8% ruin probability at 2% risk if loss clustering is severe.

Harbor Capital momentum sleeve refactor

Harbor's cross-sectional momentum sleeve traded 180 US equities monthly with these out-of-sample stats (312 trades, costs included):

  • Win rate: 51.3%
  • Average win: +1.62R; average loss: −1.05R
  • Expectancy: +0.18R; profit factor: 1.34

Proposal: raise risk from 0.5% to 3% equity per signal. Monte Carlo (10,000 bootstrap paths, ruin = −50% from peak, 250-trade horizon):

  • At f = 3%: ruin probability 22.1%; median max drawdown −28%
  • At f = 1%: ruin probability 1.4%; median max drawdown −11%
  • At f = 0.5%: ruin probability 0.2%; median max drawdown −6%

Committee approved 1% with a hard −15% peak drawdown kill switch. Additional guardrails from portfolio heat limits:

  1. Cap simultaneous open risk at 6% equity (six uncorrelated 1R positions max).
  2. Halve size after any rolling 20-trade window with expectancy below +0.05R.
  3. Monthly re-run ruin simulation; if 250-trade ruin exceeds 3%, cut f by 25%.
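The three guardrails are simple enough to encode as checks. The sketch below is one possible reading, not Harbor's implementation: it tests only the most recent 20-trade window, applies the two size cuts multiplicatively, and uses hypothetical names (`adjusted_risk_fraction`, `can_open`). How long a cut persists is left to policy.

```python
def adjusted_risk_fraction(base_f, recent_r, ruin_prob_250,
                           window=20, min_expectancy_r=0.05, ruin_cap=0.03):
    """Apply guardrails 2 and 3 to a base risk fraction."""
    f = base_f
    last = recent_r[-window:]
    # Guardrail 2: halve size if the latest full window has weak expectancy.
    if len(last) == window and sum(last) / window < min_expectancy_r:
        f *= 0.5
    # Guardrail 3: cut f by 25% if simulated 250-trade ruin exceeds the cap.
    if ruin_prob_250 > ruin_cap:
        f *= 0.75
    return f


def can_open(open_risks, new_risk, heat_cap=0.06):
    """Guardrail 1: allow a new position only if total open risk stays
    within the heat cap (small tolerance for floating-point sums)."""
    return sum(open_risks) + new_risk <= heat_cap + 1e-12


print(adjusted_risk_fraction(0.01, [0.0] * 20, ruin_prob_250=0.014))
print(can_open([0.01] * 5, 0.01), can_open([0.01] * 6, 0.01))
```

Keeping the heat check separate from the per-trade fraction mirrors the policy: one limit governs each signal, the other governs the book as a whole.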

Eighteen months later: realized max drawdown −9.4%, no kill-switch breach, annualized return +11.2%. The edge was real; survival required sizing discipline, not bigger bets.

Method decision table

Question | Best approach | Why not alternatives alone
Quick sanity check on even-money bets | Classic ruin formula | Breaks with variable payoffs
Production sizing for a real trade log | Monte Carlo bootstrap | Closed forms miss tail events
Does this strategy have edge? | Out-of-sample expectancy / PF | Ruin math is meaningless without edge
What f maximizes growth? | Kelly criterion | Kelly ignores ruin tolerance
How bad was the worst historical path? | Maximum drawdown | One path, not probability
Fund-level tail risk | CVaR / stress tests | Trade-level ruin is per-sleeve, not portfolio
Are stops calibrated? | MAE / MFE analysis | Feeds realistic L into simulation

Common pitfalls

  • Assuming positive expectancy guarantees survival: oversized bets on a +0.1R edge still ruin frequently.
  • Using in-sample stats for ruin simulation: overfitted win rates inflate perceived safety.
  • Ignoring costs in the trade pool: a +0.05R edge before costs may be negative after; ruin becomes certain.
  • Defining ruin as only $0: prop and fund mandates hit ruin at −10% to −20%, long before zero.
  • Underestimating loss clustering: autocorrelated losses (regime shifts) raise ruin vs independent-trade models.
  • Confusing max drawdown with ruin probability: a single lucky path does not prove low ruin risk.
  • Full Kelly without simulation: growth-optimal sizing often implies unacceptable ruin odds.
  • Changing f after wins (anti-martingale without rules): increases effective f at peaks where drawdowns hurt most.

Practitioner checklist

  • Confirm positive out-of-sample expectancy before any ruin analysis.
  • Define ruin threshold explicitly (dollar zero, % loss, or drawdown kill switch).
  • Express trade outcomes in consistent R-multiples with documented stop rules.
  • Run Monte Carlo bootstrap with at least 10,000 paths and realistic costs.
  • Report ruin probability at policy-relevant horizons (e.g. 250 trades).
  • Compare multiple f values; plot ruin vs bet size curve.
  • Cross-check Kelly fraction; use fractional Kelly if full Kelly ruin exceeds policy.
  • Cap portfolio heat (sum of open risk) independently of per-trade f.
  • Re-simulate monthly or after any regime change in volatility or correlation.
  • Document the approved f and ruin cap in the sleeve risk policy.

Key takeaways

  • Risk of ruin measures the chance of failing to survive, not average profitability.
  • Bet size dominates ruin risk more than small changes in win rate.
  • Monte Carlo bootstrap on out-of-sample trades is the production standard.
  • Fractional Kelly exists primarily to reduce ruin while preserving growth.
  • Define ruin relative to your mandate — funds hit kill switches well before zero.
