Risk of ruin explained

Harbor Capital's momentum sleeve showed +0.18R expectancy out-of-sample, a solid edge by any trade-level metric. A junior allocator proposed scaling from 0.5% to 3% equity risk per signal because “the model is proven.” The risk committee asked a different question: what is the probability that this sizing pushes the sleeve through its 15% drawdown kill switch before the edge compounds? A Monte Carlo run on the actual trade distribution at 3% risk showed a 22% chance of hitting ruin (defined as −50% from peak) within 250 trades, which is unacceptable for a regulated sleeve. At 1% risk the same simulation showed a 1.4% ruin probability. The allocator kept the edge and changed the risk-of-ruin profile instead.

This guide defines ruin probability for traders and allocators, covers closed-form approximations and simulation, relates ruin to Kelly sizing and maximum drawdown, walks through the Harbor Capital refactor, and provides a method decision table, common pitfalls, and a production checklist.

What “ruin” means

Risk of ruin (probability of ruin) is the chance that a sequence of bets or trades drives account equity below a defined ruin threshold before recovering. The threshold is not always zero:

Total loss: equity reaches $0 (or margin call).
Fractional ruin: equity falls 50% or 75% from starting capital — common in prop-firm and fund mandates.
Drawdown kill switch: peak-to-trough loss exceeds a policy limit (e.g. −15%), triggering flat positioning.

Ruin is a path question, not an average-return question. Two strategies with identical positive expectancy can have wildly different ruin probabilities if bet size, payoff skew, or loss clustering differ. Sequence risk matters: the same ten wins and ten losses in different orders produce different drawdowns.
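The ordering effect can be checked with a few lines of Python. This is a minimal sketch assuming fixed-fractional sizing at a hypothetical f of 5% and even-money trades of ±1R; the function name max_drawdown and both trade sequences are made up for illustration.

```python
def max_drawdown(returns_r, f):
    """Return (final equity, worst peak-to-trough loss as a positive fraction)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns_r:
        equity *= 1.0 + f * r              # risk fraction f of current equity per 1R
        peak = max(peak, equity)           # running high-water mark
        worst = max(worst, 1.0 - equity / peak)
    return equity, worst

f = 0.05                                   # hypothetical risk per trade
alternating = [1, -1] * 10                 # win, loss, win, loss, ...
losses_first = [-1] * 10 + [1] * 10        # same ten losses and ten wins, clustered

final_a, dd_a = max_drawdown(alternating, f)
final_b, dd_b = max_drawdown(losses_first, f)
print(f"alternating:  final equity {final_a:.4f}, max drawdown {dd_a:.1%}")
print(f"losses first: final equity {final_b:.4f}, max drawdown {dd_b:.1%}")
```

Both orderings end at the same equity because multiplication is order-independent, but the alternating path draws down about 7% while the clustered path draws down about 40%.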

Risk of ruin complements per-trade metrics. Expectancy tells you the drift; ruin probability tells you whether you survive long enough for the drift to matter.

Classic fixed-fractional ruin formula

For a simplified model (a constant stake equal to a fraction f of starting capital, binary outcomes of winning or losing one stake, win probability p, loss probability q = 1 − p), the probability of eventually reaching ruin (equity → 0) when p > q (positive edge) is:

Ruin Probability ≈ ((1 − p) / p) ^ (1 / f)

Intuition: 1/f is the number of stakes the account can lose before it is empty, so a larger f means fewer net losses until ruin. The ratio (1 − p)/p captures edge; the exponent 1/f captures leverage. Under true fixed-fractional sizing, where each stake is a fraction of current equity, equity shrinks geometrically and never reaches exactly zero, so the formula serves as an approximation and ruin is better defined by a fractional threshold.

Example: p = 0.55, even-money payoff (win and loss both 1R), f = 0.10 (10% of equity per trade):

(0.45 / 0.55) ^ (1 / 0.10) = (0.818) ^ 10 ≈ 0.134 → 13.4% ruin risk

Drop f to 0.02 (2% risk) and ruin probability falls to roughly 0.004% ((0.818) ^ 50 ≈ 0.00004). The formula is approximate, since real trading has variable payoffs, costs, and correlation, but the sensitivity to f is the lesson professionals internalize.
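To see the sensitivity to f directly, the formula can be evaluated over a range of bet sizes. The sketch below assumes the simplified even-money model only; the function name classic_ruin_probability and the chosen f values are illustrative.

```python
def classic_ruin_probability(p, f):
    """Closed-form approximation (q/p)^(1/f) for even-money bets.

    Assumes independent trades, a stake of fraction f of starting capital,
    and ruin at zero equity. Not valid for variable payoffs.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in (0, 1]")
    q = 1.0 - p
    if p <= q:
        return 1.0                         # no edge: ruin is certain eventually
    return (q / p) ** (1.0 / f)

for f in (0.10, 0.05, 0.02, 0.01):
    ruin = classic_ruin_probability(0.55, f)
    print(f"f = {f:.2f}: ruin probability ≈ {ruin:.6f}")
```

Halving f squares the ruin probability in this model, which is why small reductions in bet size buy large reductions in ruin risk.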

When p ≤ q (no edge or negative edge), ruin probability approaches 1.0 regardless of sizing. There is no safe bet size for a losing game over infinite horizons.

Variable payoffs and the edge ratio

Real strategies rarely have symmetric +1/−1 outcomes. Ralph Vince and others extend ruin analysis using the edge ratio (average win divided by average loss, also called the payoff ratio) alongside win rate. For fixed fractional betting where each win multiplies equity by 1 + f·W and each loss multiplies by 1 − f·L (with W and L in R-multiples), simulation is more reliable than a single closed form.
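The multiplicative model can be written out explicitly. The following is a sketch assuming independent trades with exactly two outcomes; the symbols E_0 (starting equity), n_w and n_l (win and loss counts), and g(f) (expected log growth per trade) are notation introduced here for illustration.

```latex
E_n = E_0 \,(1 + fW)^{n_w}\,(1 - fL)^{n_l}, \qquad n_w + n_l = n

g(f) = p \,\ln(1 + fW) + (1 - p)\,\ln(1 - fL)
```

Final equity depends only on the win and loss counts, not their order, while the drawdown along the way does depend on order. A loss of f·L needs a subsequent gain of f·L / (1 − f·L) to recover, so recovery gets disproportionately harder as f grows.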

Key relationships practitioners use:

Higher win rate lowers ruin at fixed f, but payoff ratio still dominates at low win rates.
Higher payoff ratio (bigger winners vs losers) lowers ruin because recovery from losses requires fewer wins.
Larger f increases ruin far faster than linearly: in the classic formula f enters through the exponent 1/f.
Positive skew (occasional large wins) reduces ruin vs negative skew at the same expectancy.

Use MAE and MFE distributions to ensure your assumed W and L match realized trade paths, not idealized backtests.

Monte Carlo simulation (the practical method)

For production sizing decisions, Monte Carlo simulation is the standard tool:

Collect an out-of-sample trade return series in R-multiples (or % equity).
Choose a risk fraction f per trade.
Define ruin threshold (e.g. −50% from start, or −15% from peak).
Run 10,000+ simulated equity curves by resampling trades with replacement (bootstrap) or shuffling order.
Record the fraction of paths that hit ruin before horizon N trades.
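The five steps above can be sketched in plain Python. This is a minimal illustration, not Harbor Capital's implementation: the function name bootstrap_ruin_probability, its parameters, and the sample_trades list are hypothetical, trades are assumed independent (which an i.i.d. bootstrap implies), and ruin is measured as drawdown from the running peak.

```python
import random

def bootstrap_ruin_probability(trades_r, f, ruin_drawdown, horizon,
                               n_paths=10_000, seed=0):
    """Estimate P(peak-to-trough drawdown >= ruin_drawdown within `horizon` trades).

    trades_r: observed trade results in R-multiples.
    f: fraction of current equity risked per 1R.
    """
    rng = random.Random(seed)              # seeded for reproducible estimates
    ruined = 0
    for _ in range(n_paths):
        equity = peak = 1.0
        # Resample the trade pool with replacement (i.i.d. bootstrap).
        for r in rng.choices(trades_r, k=horizon):
            equity *= 1.0 + f * r
            peak = max(peak, equity)
            if 1.0 - equity / peak >= ruin_drawdown:
                ruined += 1                # threshold breached: stop this path
                break
    return ruined / n_paths

# Made-up trade log for illustration only.
sample_trades = [1.6, -1.0, 2.1, -1.1, -0.9, 1.4, -1.0, 0.8, -1.2, 1.9]

for f in (0.005, 0.01, 0.03):
    p_ruin = bootstrap_ruin_probability(sample_trades, f, ruin_drawdown=0.15,
                                        horizon=250, n_paths=1_000)
    print(f"f = {f:.1%}: estimated P(15% drawdown within 250 trades) = {p_ruin:.3f}")
```

To measure ruin from starting capital instead of from the peak, replace the drawdown test with a check on equity against a fixed floor.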

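Bootstrap resampling preserves the empirical payoff distribution (fat tails, outliers) better than assuming normal returns. Shuffling order isolates sequence risk: under fixed-fractional sizing, reordering the same trades leaves final equity unchanged but changes the drawdowns along the way. If ruin probability is high only in certain orderings, your edge may be real but your sizing is too aggressive for the path risk.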

Report ruin probability at multiple horizons (100, 250, 500 trades) and multiple f values. A sleeve with 200 trades per year cares about 250-trade ruin; a day trader cares about 50-trade windows.
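One way to produce a multi-horizon report is to simulate each path once to the longest horizon and record when the threshold is first breached. The sketch below assumes an i.i.d. bootstrap and drawdown-from-peak ruin; ruin_by_horizon and sample_trades are hypothetical names with made-up data.

```python
import random

def ruin_by_horizon(trades_r, f, ruin_drawdown, horizons, n_paths=1_000, seed=0):
    """Return {horizon: estimated P(ruin within that many trades)}."""
    rng = random.Random(seed)
    longest = max(horizons)
    first_breach = []                      # trade number of first breach, or None
    for _ in range(n_paths):
        equity = peak = 1.0
        hit = None
        for i, r in enumerate(rng.choices(trades_r, k=longest), start=1):
            equity *= 1.0 + f * r
            peak = max(peak, equity)
            if 1.0 - equity / peak >= ruin_drawdown:
                hit = i
                break
        first_breach.append(hit)
    # A path counts as ruined at horizon h if it breached on or before trade h.
    return {h: sum(1 for t in first_breach if t is not None and t <= h) / n_paths
            for h in horizons}

# Made-up trade log for illustration only.
sample_trades = [1.6, -1.0, 2.1, -1.1, -0.9, 1.4, -1.0, 0.8, -1.2, 1.9]

for f in (0.005, 0.01, 0.03):
    table = ruin_by_horizon(sample_trades, f, ruin_drawdown=0.15,
                            horizons=(100, 250, 500))
    row = ", ".join(f"{h} trades: {p:.3f}" for h, p in table.items())
    print(f"f = {f:.1%} -> {row}")
```

Because every horizon is read from the same set of paths, the reported probabilities are guaranteed to be non-decreasing in horizon length.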

Pair simulation with walk-forward backtesting so the trade pool feeding the bootstrap is out-of-sample and includes realistic costs.

Ruin vs Kelly vs maximum drawdown

These three concepts answer related but distinct questions:

Concept	Question	Output
Kelly criterion	What `f` maximizes long-run geometric growth?	Optimal fraction (often too aggressive for real mandates)
Risk of ruin	What is P(hit ruin threshold) at a chosen `f`?	Ruin probability (1 − survival probability)
Maximum drawdown	What was the worst peak-to-trough loss on a realized path?	Historical depth (backward-looking)

Full Kelly maximizes growth but often implies double-digit ruin probability over realistic horizons. Practitioners use fractional Kelly (half-Kelly, quarter-Kelly) specifically to cut ruin risk while retaining most of the growth benefit. A common workflow: compute Kelly fraction from win rate and payoff, take 25–50% of it, then verify ruin probability via Monte Carlo is below a policy cap (e.g. 2% over 250 trades).
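The first two steps of that workflow can be sketched as follows. The code assumes a two-outcome model (every win is +W and every loss is −L in R-multiples), for which the growth-optimal fraction is f* = p/L − q/W; the function name kelly_fraction is hypothetical, and real trade distributions usually need a numerical optimization instead.

```python
def kelly_fraction(p, avg_win_r, avg_loss_r):
    """Growth-optimal f for a two-outcome bet: f* = p/L - q/W.

    avg_win_r (W) and avg_loss_r (L) are positive R-multiples; f is the
    fraction of equity risked per 1R. Returns 0.0 when there is no edge.
    """
    if avg_win_r <= 0 or avg_loss_r <= 0:
        raise ValueError("avg_win_r and avg_loss_r must be positive")
    q = 1.0 - p
    return max(0.0, p / avg_loss_r - q / avg_win_r)

full = kelly_fraction(0.55, 1.0, 1.0)      # even-money bet with p = 0.55
for label, scale in [("full", 1.0), ("half", 0.5), ("quarter", 0.25)]:
    print(f"{label} Kelly: f = {scale * full:.4f}")
```

For an even-money bet with p = 0.55, full Kelly comes out at f = 0.10. The scaled-down fractions are candidates only: each still has to pass the Monte Carlo ruin check against the policy cap.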

Maximum drawdown on a backtest is one realized path. Ruin simulation explores thousands of paths. A strategy with −12% historical max drawdown might still show 8% ruin probability at 2% risk if loss clustering is severe.

Harbor Capital momentum sleeve refactor

Harbor's cross-sectional momentum sleeve traded 180 US equities monthly with these out-of-sample stats (312 trades, costs included):

Win rate: 51.3%
Average win: +1.62R; average loss: −1.05R
Expectancy: +0.18R; profit factor: 1.34

Proposal: raise risk from 0.5% to 3% equity per signal. Monte Carlo (10,000 bootstrap paths, ruin = −50% from peak, 250-trade horizon):

At f = 3%: ruin probability 22.1%; median max drawdown −28%
At f = 1%: ruin probability 1.4%; median max drawdown −11%
At f = 0.5%: ruin probability 0.2%; median max drawdown −6%

Committee approved 1% with a hard −15% peak drawdown kill switch. Additional guardrails from portfolio heat limits:

Cap simultaneous open risk at 6% equity (six uncorrelated 1R positions max).
Halve size after any rolling 20-trade window with expectancy below +0.05R.
Re-run the ruin simulation monthly; if 250-trade ruin exceeds 3%, cut f by 25%.
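The three guardrails can be expressed as a small sizing function. This is a sketch under several assumptions that the committee's rules do not spell out: the adjustments stack multiplicatively, the rolling check looks only at the most recent 20 trades, and a signal is sized down to whatever heat budget remains. The names SleeveRiskPolicy and next_trade_risk are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SleeveRiskPolicy:
    base_f: float = 0.01              # approved risk per signal (1% of equity)
    max_open_risk: float = 0.06       # portfolio heat cap
    window: int = 20                  # rolling window for the expectancy check
    min_expectancy_r: float = 0.05    # halve size below this rolling expectancy
    ruin_cap: float = 0.03            # limit on simulated 250-trade ruin
    ruin_cut: float = 0.25            # share of f removed when the cap is breached

def next_trade_risk(policy, recent_trades_r, open_risk, simulated_ruin):
    """Return the equity fraction to risk on the next signal (0.0 if no room)."""
    f = policy.base_f
    if simulated_ruin > policy.ruin_cap:
        f *= 1.0 - policy.ruin_cut                     # cut f by 25%
    recent = recent_trades_r[-policy.window:]
    if len(recent) == policy.window and sum(recent) / policy.window < policy.min_expectancy_r:
        f *= 0.5                                       # halve size after a weak window
    room = max(0.0, policy.max_open_risk - open_risk)  # remaining heat budget
    return min(f, room)

policy = SleeveRiskPolicy()
print(next_trade_risk(policy, [0.2] * 20, open_risk=0.02, simulated_ruin=0.014))
print(next_trade_risk(policy, [0.0] * 20, open_risk=0.02, simulated_ruin=0.014))
print(next_trade_risk(policy, [0.2] * 20, open_risk=0.06, simulated_ruin=0.014))
```

A production version would also need a rule for restoring size after a halving, which the guardrails as stated leave open.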

Eighteen months later: realized max drawdown −9.4%, no kill-switch breach, annualized return +11.2%. The edge was real; survival required sizing discipline, not bigger bets.

Method decision table

Question	Best approach	Why not alternatives alone
Quick sanity check on even-money bets	Classic ruin formula	Breaks with variable payoffs
Production sizing for a real trade log	Monte Carlo bootstrap	Closed forms miss tail events
Does this strategy have edge?	Out-of-sample expectancy / profit factor	Ruin math is meaningless without edge
What `f` maximizes growth?	Kelly criterion	Kelly ignores ruin tolerance
How bad was the worst historical path?	Maximum drawdown	One path, not probability
Fund-level tail risk	CVaR / stress tests	Trade-level ruin is per-sleeve, not portfolio
Are stops calibrated?	MAE / MFE analysis	Feeds realistic `L` into simulation

Common pitfalls

Assuming positive expectancy guarantees survival: oversized bets on a +0.1R edge still ruin frequently.
Using in-sample stats for ruin simulation: overfitted win rates inflate perceived safety.
Ignoring costs in the trade pool: a +0.05R edge before costs may be negative after; ruin becomes certain.
Defining ruin as only $0: prop and fund mandates hit ruin at −10% to −20%, long before zero.
Underestimating loss clustering: autocorrelated losses (regime shifts) raise ruin vs independent-trade models.
Confusing max drawdown with ruin probability: a single lucky path does not prove low ruin risk.
Full Kelly without simulation: growth-optimal sizing often implies unacceptable ruin odds.
Changing f after wins (anti-martingale without rules): increases effective f at peaks where drawdowns hurt most.

Practitioner checklist

Confirm positive out-of-sample expectancy before any ruin analysis.
Define ruin threshold explicitly (dollar zero, % loss, or drawdown kill switch).
Express trade outcomes in consistent R-multiples with documented stop rules.
Run Monte Carlo bootstrap with at least 10,000 paths and realistic costs.
Report ruin probability at policy-relevant horizons (e.g. 250 trades).
Compare multiple f values; plot ruin vs bet size curve.
Cross-check Kelly fraction; use fractional Kelly if full Kelly ruin exceeds policy.
Cap portfolio heat (sum of open risk) independently of per-trade f.
Re-simulate monthly or after any regime change in volatility or correlation.
Document the approved f and ruin cap in the sleeve risk policy.

Key takeaways

Risk of ruin measures survival probability, not average profitability.
Bet size moves ruin risk more than small changes in win rate do.
Monte Carlo bootstrap on out-of-sample trades is the production standard.
Fractional Kelly exists primarily to reduce ruin while preserving growth.
Define ruin relative to your mandate — funds hit kill switches well before zero.