Guide

SQL window functions explained

Harbor Analytics’ finance team needed monthly recurring revenue per customer and each customer’s rank within their signup cohort — without losing the underlying invoice rows. A GROUP BY customer_id query collapsed detail; correlated subqueries for “previous month’s MRR” scanned the table once per row. Window functions solved both: they compute aggregates, ranks, and offsets across a defined slice of rows while returning every input row intact. This guide explains the OVER clause, PARTITION BY and ORDER BY, ranking and offset functions, running and moving aggregates, frame boundaries, a Harbor subscription cohort worked example, a window-vs-alternative decision table, common pitfalls, and a production checklist. It assumes baseline SQL fundamentals and complements query optimization when analytic queries hit production scale.

Window functions vs GROUP BY

GROUP BY collapses rows sharing a key into one output row per group. Aggregates like SUM(amount) apply to the whole group; non-aggregated columns must appear in the GROUP BY list or inside an aggregate.

Window functions add a computed column per row using a “window” of related rows. Syntax always includes OVER (...) after the function name:

SELECT
  invoice_id,
  customer_id,
  invoice_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY invoice_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total
FROM invoices;

Each invoice row remains visible; running_total is the cumulative sum for that customer up to and including the current invoice. That row-preserving property is what makes windows indispensable for rankings, period-over-period deltas, and deduplication.

The OVER clause

PARTITION BY — optional; divides rows into independent windows (like GROUP BY buckets, but without collapsing). Omit it and the window spans the entire result set.
ORDER BY — defines row sequence inside each partition. Required for ranking, LAG/LEAD, and most frame-aware aggregates.
Frame clause — optional ROWS or RANGE/GROUPS bounds limiting which rows in the ordered partition contribute to the current calculation.

Ranking functions: ROW_NUMBER, RANK, DENSE_RANK

Ranking functions assign an integer position within a partition. They are the standard tool for “top N per group” queries without fragile GROUP BY ... HAVING MAX(...) patterns.

ROW_NUMBER() — unique sequential integers 1, 2, 3 even when values tie. Use for deduplication: keep WHERE rn = 1 after numbering duplicates.
RANK() — ties share the same rank; next rank skips (1, 2, 2, 4). Mirrors Olympic medal standings.
DENSE_RANK() — ties share rank; next rank does not skip (1, 2, 2, 3). Better when gaps confuse downstream charts.
NTILE(n) — splits rows into n roughly equal buckets (quartiles, deciles). Useful for stratified sampling or percentile bands.

SELECT *
FROM (
  SELECT
    product_id,
    region,
    revenue,
    ROW_NUMBER() OVER (
      PARTITION BY region ORDER BY revenue DESC
    ) AS revenue_rank
  FROM regional_sales
) ranked
WHERE revenue_rank <= 5;

This returns the top five products per region in one pass. Wrap in a subquery or CTE because standard SQL disallows window functions directly in WHERE (dialects like BigQuery are exceptions).

Offset functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE

Offset functions peek at other rows relative to the current row within the ordered partition — the SQL equivalent of a spreadsheet’s “cell above” without a self-join.

LAG(expr, offset, default) — value from offset rows earlier (default 1). Computes month-over-month change, session gaps, or previous status.
LEAD(expr, offset, default) — value from rows ahead. Useful for time-to-next-event or forward-filled gaps.
FIRST_VALUE / LAST_VALUE — first or last value in the frame. Often paired with explicit frame bounds; default frames can surprise you (see pitfalls).

SELECT
  month,
  mrr,
  LAG(mrr, 1) OVER (ORDER BY month) AS prev_mrr,
  mrr - LAG(mrr, 1) OVER (ORDER BY month) AS mrr_delta
FROM monthly_mrr;

LAG returns NULL for the first row unless you supply a third-argument default (e.g. 0).

Running and moving aggregates

Standard aggregates (SUM, AVG, COUNT, MIN, MAX) become window functions when followed by OVER. Common patterns:

Cumulative sum — ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for running totals.
Moving average — ROWS BETWEEN 6 PRECEDING AND CURRENT ROW for a seven-day rolling mean.
Share of partition — amount / SUM(amount) OVER (PARTITION BY category) for percentage of category total on each line.

Frame clauses: ROWS vs RANGE

ROWS counts physical row offsets — predictable for unique timestamps. RANGE (and PostgreSQL 11+ GROUPS) groups rows tied on the ORDER BY expression, which matters when multiple events share the same timestamp. If you omit a frame for aggregates with ORDER BY, Postgres defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — peers tied on the sort key are included, which can inflate running sums.

Worked example: Harbor Analytics subscription cohort report

Harbor Analytics tracks SaaS subscriptions in an events table: (customer_id, event_date, event_type, mrr_cents) where event_type is 'activated', 'upgraded', 'churned', or 'reactivated'. Product wants, per customer:

Signup month (first activation date).
Current MRR after each event (running sum of deltas).
Months since signup at each event.
Whether this event is the customer’s latest state.

WITH enriched AS (
  SELECT
    customer_id,
    event_date,
    event_type,
    mrr_cents,
    DATE_TRUNC('month', MIN(event_date) OVER (
      PARTITION BY customer_id
    )) AS cohort_month,
    SUM(
      CASE event_type
        WHEN 'churned' THEN -mrr_cents
        ELSE mrr_cents
      END
    ) OVER (
      PARTITION BY customer_id
      ORDER BY event_date, event_id
      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_mrr_cents,
    ROW_NUMBER() OVER (
      PARTITION BY customer_id
      ORDER BY event_date DESC, event_id DESC
    ) AS recency_rank
  FROM subscription_events
)
SELECT
  cohort_month,
  COUNT(DISTINCT customer_id) FILTER (WHERE recency_rank = 1 AND running_mrr_cents > 0) AS active_customers,
  SUM(running_mrr_cents) FILTER (WHERE recency_rank = 1) / 100.0 AS total_mrr_usd
FROM enriched
GROUP BY cohort_month
ORDER BY cohort_month;

Window functions compute per-row running MRR and latest-event flags; an outer GROUP BY cohort_month aggregates to the dashboard grain. Mixing windows and group aggregates in one SELECT is valid when the grouping keys are consistent — but nesting order matters for readability and planner cost.

Decision table: window function vs alternatives

Need	Prefer	Why
One summary row per group	`GROUP BY`	Simpler plan; fewer rows returned.
Rank or top-N per group with detail rows	Window + filter	Avoids duplicate self-joins; single table scan.
Previous/next row value	`LAG` / `LEAD`	Clearer than correlated subquery; often faster.
Running total or moving average	Window aggregate + frame	Self-join on row numbers scales poorly.
Complex multi-table hierarchy rollups	CTEs + windows, or OLAP cube	Windows alone do not replace dimensional modeling.
Pre-aggregated dashboard at billions of rows	Materialized view or stream pipeline	Repeated full-window scans on raw events are expensive.

Performance and optimization notes

Window evaluation typically requires sorting or hashing partitions. Large PARTITION BY cardinalities with wide ORDER BY keys can spill to disk. Mitigations:

Filter early in a CTE before applying windows — fewer rows to sort.
Ensure supporting indexes on (partition_key, order_key) when the query is selective on those columns (see index design).
Replace repeated window expressions with a CTE that computes them once.
For recurring reports, refresh a materialized view instead of scanning raw fact tables hourly.
Run EXPLAIN (ANALYZE) — window nodes show as WindowAgg in PostgreSQL; compare to alternative plans per optimization workflow.

Common pitfalls

Filtering on window results in WHERE — wrap in a subquery; use the outer query’s WHERE rank = 1.
Unstable ORDER BY — ties without a unique tiebreaker (event_id) produce nondeterministic ranks between runs.
Default frame surprises — LAST_VALUE with default frame often returns the current row, not the partition end; specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING explicitly.
NULL sort order — NULLS FIRST vs NULLS LAST changes rank and LAG behavior; set explicitly.
Mixing windows and DISTINCT carelessly — SELECT DISTINCT ... window may dedupe before or after window evaluation depending on dialect; test the actual plan.
Portability — frame syntax and function names differ slightly across MySQL, SQL Server, and BigQuery; verify edge cases when migrating.

Production checklist

State the grain: one output row per what entity?
Choose partition keys that match business cohorts (customer, account, region).
Add a unique tiebreaker column to every ORDER BY inside windows.
Specify frame bounds explicitly for running and moving aggregates.
Wrap rank filters in a CTE or subquery, not bare WHERE.
Compare plan cost with a self-join alternative on realistic row counts.
Document dialect-specific syntax if the query runs on multiple engines.
Schedule heavy window reports off peak or pre-aggregate into summary tables.

Key takeaways

Window functions preserve row detail while adding analytic context — the opposite of GROUP BY collapse.
PARTITION BY and ORDER BY define the window; frame clauses control which rows participate in each calculation.
Ranking and LAG/LEAD replace most self-join patterns for top-N and period-over-period logic.
Explicit frames and tiebreakers prevent subtle bugs in running totals and LAST_VALUE.
At scale, pre-aggregation beats repeated window scans on raw event streams.