
Piotroski F-score explained

Harbor Capital's deep-value sleeve bought the cheapest quintile of U.S. stocks by price-to-book each quarter. On paper the screen was classic Graham: buy what the market hates. In practice, 38% of new positions saw operating losses within eighteen months — classic value traps where low multiples reflected deteriorating fundamentals, not temporary pessimism. After layering Joseph Piotroski's F-score filter (requiring a score of 7 or higher before entry), the sleeve's annualized return rose 240 basis points while maximum drawdown fell from 34% to 27%. Cheap alone was not enough; cheap and improving mattered.

The F-score is a nine-point checklist published by Piotroski in 2000. Each signal is binary: pass (1) or fail (0). The framework was designed specifically for high book-to-market (“value”) firms, where dispersion in subsequent returns is enormous: some distressed names recover, many do not. This guide walks through all nine criteria grouped by category, explains the scoring bands and the research behind them, works through the Harbor Capital screen refactor, and compares the F-score with other quality screens, including practitioner quality metrics and quality factor ETFs. It closes with implementation details, common pitfalls, and a production checklist, and is meant to be read alongside our value investing guide and earnings quality guide.

Why value screens need a quality filter

Value investing exploits the tendency of cheap stocks (low price relative to book value, earnings, or cash flow) to outperform expensive ones over long horizons. But “cheap” is not a synonym for “undervalued.” A stock can trade at 0.4× book because the market correctly prices bankruptcy risk, secular decline, or accounting write-downs still to come.

Piotroski's insight was empirical: among the cheapest 20% of stocks by book-to-market, those with strong fundamental momentum — improving profitability, balance-sheet strength, and operating efficiency — dramatically outperformed those with weak or deteriorating fundamentals. The F-score operationalizes that split with nine simple, publicly auditable signals derived from financial statements.

The nine F-score signals

Each criterion compares the current fiscal year to the prior year (or tests a level condition). A pass scores 1; a fail scores 0. Total F-score ranges from 0 to 9.

Profitability (4 signals)

Positive return on assets (ROA): Net income / average total assets > 0. The firm earned a profit relative to its asset base. Piotroski's original definition uses net income before extraordinary items, so a one-off gain alone does not produce a pass.
Positive operating cash flow (OCF): Cash from operations > 0. Earnings are backed by cash, not only accruals.
Increasing ROA: Current-year ROA > prior-year ROA. Profitability is improving, not just positive.
Quality of earnings (accruals): OCF > net income. Cash generation exceeds reported profit — the inverse of the accruals red flag where net income outruns cash.

Leverage, liquidity and source of funds (3 signals)

Decreasing long-term debt ratio: Long-term debt / average total assets is lower this year than last. The firm is deleveraging or not piling on structural debt.
Increasing current ratio: Current assets / current liabilities rose year over year. Short-term liquidity improved.
No new equity issuance: Shares outstanding did not increase materially (no dilutive secondary offering or heavy stock-based issuance that expands the share count). Piotroski treated equity raises as a negative signal for distressed value names.

Operating efficiency (2 signals)

Increasing gross margin: Gross profit / revenue is higher than the prior year. Pricing power or cost control improved at the unit level.
Increasing asset turnover: Revenue / average total assets rose year over year. The firm is doing more with its asset base.

Together, these nine signals span the income statement, cash flow statement, and balance sheet — a compact version of fundamental analysis reduced to machine-readable booleans.
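The nine tests above can be sketched in a few lines of Python. This is a minimal illustration that follows the definitions as stated in this guide, not Piotroski's exact variable construction. The `FiscalYear` class and its field names are hypothetical, and average total assets is assumed to be computed upstream. The share test here is a strict comparison with no materiality threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiscalYear:
    """Annual statement items for one fiscal year (hypothetical field names)."""
    net_income: float
    operating_cash_flow: float
    avg_total_assets: float
    long_term_debt: float
    current_assets: float
    current_liabilities: float
    shares_outstanding: float
    revenue: float
    gross_profit: float


def f_score_signals(cur: FiscalYear, prev: FiscalYear) -> dict[str, bool]:
    """Return the nine pass/fail signals for the current year versus the prior year."""
    roa_cur = cur.net_income / cur.avg_total_assets
    roa_prev = prev.net_income / prev.avg_total_assets
    return {
        # Profitability (4 signals)
        "roa_positive": roa_cur > 0,
        "ocf_positive": cur.operating_cash_flow > 0,
        "roa_increasing": roa_cur > roa_prev,
        "ocf_exceeds_net_income": cur.operating_cash_flow > cur.net_income,
        # Leverage, liquidity and source of funds (3 signals)
        "leverage_decreasing": (cur.long_term_debt / cur.avg_total_assets
                                < prev.long_term_debt / prev.avg_total_assets),
        "current_ratio_increasing": (cur.current_assets / cur.current_liabilities
                                     > prev.current_assets / prev.current_liabilities),
        "no_new_equity": cur.shares_outstanding <= prev.shares_outstanding,
        # Operating efficiency (2 signals)
        "gross_margin_increasing": (cur.gross_profit / cur.revenue
                                    > prev.gross_profit / prev.revenue),
        "asset_turnover_increasing": (cur.revenue / cur.avg_total_assets
                                      > prev.revenue / prev.avg_total_assets),
    }


def f_score(cur: FiscalYear, prev: FiscalYear) -> int:
    """Sum the nine binary signals into a score from 0 to 9."""
    return sum(f_score_signals(cur, prev).values())


# Illustrative numbers only (not taken from any real filing).
prev = FiscalYear(40, 55, 1000, 300, 400, 250, 100, 900, 270)
cur = FiscalYear(60, 80, 1000, 250, 450, 250, 100, 1000, 320)
print(f_score(cur, prev))  # 9
```

Returning the signals as a named dictionary, rather than only the total, keeps each pass or fail available for later review.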

Scoring bands and what the research showed

In Piotroski's original sample (U.S. stocks, 1976–1996), value firms (the highest book-to-market quintile) with F-scores of 8–9 earned substantially higher annual returns than those scoring 0–1. The middle of the range was mixed. Subsequent researchers replicated the pattern internationally with varying strength; the effect is strongest where value dispersion is high and accounting data is reliable.

Practical cutoffs used by allocators:

F ≥ 7: “High” quality within a value universe — Harbor Capital's post-refactor threshold.
F = 5–6: Neutral; requires additional qualitative work.
F ≤ 4: Likely value trap territory unless a specific turnaround thesis is documented.

The F-score is a relative rank within cheap stocks, not a standalone buy signal. A score of 8 on a stock at 0.3× book means something different than an 8 on a stock at 3.0× book — the screen was designed for the former cohort.
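The practitioner cutoffs listed above can be written as a small lookup. This is a sketch of this guide's bands only; the function name and the band labels are illustrative, and the result is meaningful only for stocks that have already passed a value screen.

```python
def f_score_band(score: int) -> str:
    """Map an F-score to the practitioner bands used in this guide."""
    if not 0 <= score <= 9:
        raise ValueError("F-score must be between 0 and 9")
    if score >= 7:
        return "high"               # eligible for a conservative value sleeve
    if score >= 5:
        return "neutral"            # needs additional qualitative work
    return "likely value trap"      # needs a documented turnaround thesis


print(f_score_band(8))  # high
```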

Harbor Capital refactor: worked example

Harbor's Q1 rebalance surfaced two industrial names in the cheapest price-to-book quintile:

Stock A (P/B 0.52): ROA positive and rising; OCF > net income; long-term debt ratio fell; current ratio up; no share issuance; gross margin and asset turnover both improved. F-score = 9.
Stock B (P/B 0.48): Superficially cheaper on P/B, but ROA negative, OCF negative, rising leverage, falling current ratio, equity raise six months prior, shrinking gross margin. F-score = 2.

The old rule bought both. The F-score filter kept A and rejected B. Over the next twelve months, A re-rated as margins recovered; B launched a distressed exchange offer. The F-score did not predict that outcome with certainty (no screen does), but it aligned position selection with improving rather than collapsing fundamentals.

Harbor's implementation notes:

Compute F-score on fiscal-year data, not quarterly snapshots, to match Piotroski's original methodology.
Apply the filter after the value rank, not before — the score is conditional on being cheap.
Re-score at each rebalance; a stock that drops from 8 to 4 triggers review, not automatic sale (turnover costs matter).
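The ordering in these notes (value rank first, F-score second, review rather than automatic sale) might look like the sketch below. The `Candidate` type, the function names, and the eight filler stocks are hypothetical. The code assumes every candidate has a positive price-to-book and an F-score already computed from fiscal-year data.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    ticker: str
    price_to_book: float
    f_score: int


def cheap_then_improving(universe: list[Candidate],
                         min_f_score: int = 7,
                         value_buckets: int = 5) -> list[Candidate]:
    """Keep the cheapest bucket by P/B (a quintile by default), then apply the F-score floor."""
    ranked = sorted(universe, key=lambda c: c.price_to_book)
    n_keep = max(1, len(ranked) // value_buckets)
    cheapest = ranked[:n_keep]
    return [c for c in cheapest if c.f_score >= min_f_score]


def needs_review(previous_score: int, current_score: int, min_f_score: int = 7) -> bool:
    """Flag a holding whose score fell below the entry threshold; this does not force a sale."""
    return previous_score >= min_f_score and current_score < min_f_score


# Stocks A and B from the worked example, plus eight pricier placeholder names.
universe = [Candidate("A", 0.52, 9), Candidate("B", 0.48, 2)]
universe += [Candidate(f"X{i}", 1.0 + i, 5) for i in range(8)]

print([c.ticker for c in cheap_then_improving(universe)])  # ['A']
print(needs_review(8, 4))  # True
```

Filtering after the rank means a high-scoring expensive stock never enters the sleeve, which matches the conditional nature of the score.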

F-score vs other quality screens

Method	What it measures	Best for	Weakness
Piotroski F-score	9 binary year-over-year fundamental changes	Filtering deep value (high B/M) for improving fundamentals	Not designed for growth or glamour stocks; ignores valuation level
Quality factor (ROE, leverage, earnings variability)	Cross-sectional profitability and stability ranks	Broad large-cap quality tilts, ETF implementation	Less turnaround-specific; may overlap expensive quality names
Accruals ratio screen	(Net income − OCF) / total assets	Detecting earnings manipulation or weak cash conversion	Single dimension; misses leverage and margin trends
Altman Z-score	Bankruptcy-risk score from five weighted ratios	Distress risk for manufacturing firms	Calibrated on manufacturers; not a full quality ranking
ROIC vs WACC spread	Economic value creation	Franchise quality and capital allocation	Requires clean invested-capital definitions; less binary

For a diversified allocator, F-score is a value sleeve hygiene tool — pair it with ROIC analysis and sector context, not as a replacement for portfolio-level risk management.
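The accruals ratio in the table is closely related to the F-score's quality-of-earnings signal. A minimal sketch of the formula as written in the table, with a hypothetical function name, shows the sign convention: a positive ratio means net income exceeds operating cash flow, which is exactly the case where the F-score accruals signal fails.

```python
def accruals_ratio(net_income: float, operating_cash_flow: float,
                   total_assets: float) -> float:
    """(Net income - OCF) / total assets; positive values mean profit outruns cash."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return (net_income - operating_cash_flow) / total_assets


# Illustrative numbers: cash flow above net income gives a negative ratio.
ratio = accruals_ratio(net_income=60, operating_cash_flow=80, total_assets=1000)
print(ratio < 0)  # True
```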

Implementation details that matter

Data timing

Use the most recent annual 10-K filed before the rebalance date. Point-in-time databases (Compustat with lag, or vendor snapshots) prevent look-ahead bias in backtests. In live trading, allow 60–90 days after fiscal year-end for filing completeness.
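One way to enforce the timing rule in a backtest is to select filings by filing date rather than by fiscal year-end. The sketch below assumes each filing record carries both dates; the `AnnualFiling` type and the function name are hypothetical.

```python
from datetime import date
from typing import NamedTuple, Optional


class AnnualFiling(NamedTuple):
    fiscal_year_end: date
    filed_on: date


def latest_filing_as_of(filings: list[AnnualFiling],
                        rebalance: date) -> Optional[AnnualFiling]:
    """Return the most recent annual filing that was public before the rebalance date."""
    available = [f for f in filings if f.filed_on < rebalance]
    return max(available, key=lambda f: f.fiscal_year_end, default=None)


filings = [
    AnnualFiling(date(2022, 12, 31), date(2023, 2, 20)),
    AnnualFiling(date(2023, 12, 31), date(2024, 3, 5)),
]
# On 28 Feb 2024 the FY2023 report is not yet public, so FY2022 is used.
print(latest_filing_as_of(filings, date(2024, 2, 28)).fiscal_year_end)  # 2022-12-31
```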

Share count test

Signal 7 (no new equity issuance) compares weighted-average diluted shares outstanding year over year. Taken literally, even small increases from employee stock plans fail the test, so define a materiality threshold (e.g. 2%), apply it to the same share-count measure in both years, and document it.
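A materiality threshold for the share count test could be expressed as below. The 2% default mirrors the example figure in the text; the function name is hypothetical, and the right threshold is a policy choice rather than part of the original methodology.

```python
def passes_share_test(shares_current: float, shares_prior: float,
                      materiality: float = 0.02) -> bool:
    """Pass when the share count grew by no more than the materiality threshold."""
    if shares_prior <= 0:
        raise ValueError("shares_prior must be positive")
    return shares_current <= shares_prior * (1.0 + materiality)


print(passes_share_test(101.5, 100.0))  # True: 1.5% growth is under the threshold
print(passes_share_test(105.0, 100.0))  # False: 5% growth is treated as issuance
```

Setting `materiality=0.0` recovers the strict version of the test, in which any increase fails.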

Financials vs industrials

Banks, insurers, and REITs have different balance-sheet structures. Many practitioners restrict F-score to non-financial industrials where the original accounting definitions apply cleanly.

International stocks

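IFRS and U.S. GAAP differ on items that feed the signals, such as OCF classification (for example, where interest paid is reported) and whether debt counts as current or non-current after a covenant breach. Replication studies in Europe and Asia show weaker but positive effects, so adjust expectations and validate locally.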

Common pitfalls

Applying F-score to expensive growth stocks — the research is conditional on high book-to-market; scores on low B/M names are not calibrated.
Quarterly F-scores — seasonality and inventory cycles create false fails; stick to fiscal-year comparisons unless you have a validated quarterly adaptation.
Ignoring industry headwinds — a score of 8 in a structurally declining sector can still be a value trap.
Survivorship bias in backtests — delisted bankruptcies must be in the universe or results overstate returns.
Double-counting quality — combining F-score with profitability and investment factors (RMW/CMA) may overweight the same signals; check regression overlap.
One-year improvement after a cliff — year-over-year comparisons bounce after a crisis year; inspect three-year trends for turnarounds.
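The last pitfall lends itself to a simple flag. The heuristic below is an assumption of this sketch, not part of Piotroski's method: it marks cases where ROA improved year over year but still sits below its level three years earlier. The function name is hypothetical.

```python
def rebound_after_cliff(roa_by_year: list[float]) -> bool:
    """Flag a year-over-year ROA gain that has not recovered the level of three years ago.

    roa_by_year holds four annual ROA values, oldest first.
    """
    if len(roa_by_year) != 4:
        raise ValueError("expected four annual ROA values, oldest first")
    oldest, _, prior, latest = roa_by_year
    improved_this_year = latest > prior
    still_below_old_level = latest < oldest
    return improved_this_year and still_below_old_level


# ROA collapsed two years ago and has only partly recovered.
print(rebound_after_cliff([0.06, 0.05, -0.08, -0.02]))  # True
# Steady improvement over the whole window is not flagged.
print(rebound_after_cliff([0.02, 0.03, 0.04, 0.05]))    # False
```

A flagged stock can still pass the increasing-ROA signal; the flag only tells the reviewer to look at the longer trend before relying on the score.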

Production checklist

Define value universe (e.g. bottom quintile price-to-book) with point-in-time data.
Compute all nine signals from annual filings with consistent fiscal-year alignment.
Set minimum F-score threshold (7+ for conservative value sleeves).
Exclude financials or run a separate sector-specific screen.
Log each signal pass/fail for audit and position review.
Cross-check accruals signal against full earnings quality review for top holdings.
Cap position size; F-score does not replace position sizing discipline.
Re-score at rebalance; flag deteriorating scores for exit review.
Include delisted stocks in backtest universe to avoid survivorship bias.
Document threshold changes; do not optimize F-cutoff on the same sample you trade.
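For the audit item in the checklist, each rebalance could write one record per stock with the individual signal outcomes. The record layout and the function name below are illustrative assumptions, not a prescribed format.

```python
import json
from datetime import date


def audit_record(ticker: str, rebalance: date, signals: dict[str, bool]) -> str:
    """Serialize one stock's signal-level results as a single JSON line."""
    return json.dumps({
        "ticker": ticker,
        "rebalance_date": rebalance.isoformat(),
        "signals": signals,
        "f_score": sum(signals.values()),
    }, sort_keys=True)


# Two signals shown for brevity; a real record would carry all nine.
line = audit_record("A", date(2024, 3, 31),
                    {"roa_positive": True, "ocf_positive": False})
print(json.loads(line)["f_score"])  # 1
```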

Key takeaways

The Piotroski F-score is a nine-point binary checklist that separates improving cheap stocks from value traps within high book-to-market universes.
Signals cover profitability (4), leverage/liquidity/funding (3), and operating efficiency (2) using year-over-year changes and simple level tests.
Harbor Capital's F ≥ 7 filter reduced drawdowns and improved returns by rejecting cheap names with collapsing fundamentals.
F-score complements but does not replace valuation, sector analysis, or portfolio risk controls.
Apply it conditionally on value stocks with annual, point-in-time data — not as a universal quality rank across the market.