Guide

Particle filter explained

Harbor Logistics’ warehouse robots navigated a 40,000 m² grid of identical aisles. At every T-junction, wheel odometry and a single-beam lidar were ambiguous: the robot could be in corridor A or corridor B with equal plausibility. An extended Kalman filter forced a unimodal Gaussian belief and “picked” the wrong branch 14% of the time — each mistake required a manual reposition. Switching to a particle filter that maintained hundreds of weighted pose hypotheses cut localization failures to 5.4% without new sensors. A particle filter (also called sequential Monte Carlo) represents belief about a hidden state with a cloud of weighted samples rather than a single mean and covariance. It handles nonlinear dynamics, non-Gaussian noise, and multimodal uncertainty — cases where Kalman-family filters break. This guide covers the predict-weight-resample loop, importance sampling, particle degeneracy, resampling strategies, comparison with Kalman filters and hidden Markov models, a Harbor Logistics robot localization refactor, a filter decision table, pitfalls, and a production checklist.

Why Gaussian filters are not enough

The Kalman filter and its extensions (EKF, UKF) maintain a Gaussian belief over the hidden state. That assumption is elegant and fast when it holds: one bump of probability, one mean, one covariance ellipsoid. Real estimation problems often violate it:

Multimodal ambiguity — symmetric corridors, kidnapped robot problems, or two aircraft with crossing trajectories create multiple plausible states simultaneously. A Gaussian cannot represent “50% here, 50% there.”
Heavy-tailed or skewed noise — occasional gross outliers (slipping wheels, multipath GPS) are poorly modeled as Gaussian.
Strong nonlinearities — EKF linearization can collapse multimodal structure into a misleading single mode.
Discrete hybrid states — “which room am I in?” combined with continuous pose is naturally represented by particles clustered per room.

Particle filters trade compute for expressiveness: instead of tracking (\mu, \Sigma), they approximate the full posterior p(x_t | z_{1:t}) with weighted samples \{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^N. Any shape — multimodal, skewed, bounded — is representable if you use enough particles.

Sequential Monte Carlo intuition

Bayesian filtering recursively updates belief as measurements arrive. At time t, you want the posterior over state x_t given all observations z_{1:t}. Particle filters approximate this with Monte Carlo integration: draw samples from a proposal distribution, weight each sample by how well it explains the latest measurement, and occasionally duplicate high-weight particles while discarding low-weight ones.

Think of each particle as a hypothesis about reality: “the robot is at (12.4, 7.1) facing east with 0.3% weight.” After motion, every hypothesis moves according to the dynamics model plus process noise. After sensing, hypotheses inconsistent with lidar or GPS get down-weighted; consistent ones gain weight. When one hypothesis dominates, you have localized; when weight spreads across two clusters, you are correctly uncertain until disambiguating data arrives.

Foundations sit in Bayesian inference: the filter is importance sampling applied sequentially, with corrections to prevent sample impoverishment.

The SIR loop: predict, weight, resample

The most common variant is the Sampling Importance Resampling (SIR) filter, also called bootstrap particle filter:

Predict (propagate) — for each particle x_{t-1}^{(i)}, sample x_t^{(i)} \sim p(x_t | x_{t-1}^{(i)}) using your motion model (odometry, IMU integration, kinematics). Process noise injects diversity.
Weight (update) — assign importance weight w_t^{(i)} \propto p(z_t | x_t^{(i)}) from the measurement likelihood (lidar scan match, GPS, vision feature match). Normalize so weights sum to 1.
Resample (optional but critical) — when effective sample size N_{\text{eff}} = 1 / \sum_i (w^{(i)})^2 drops below a threshold (often N/2), draw N new particles by sampling indices proportional to weights, reset weights to 1/N.

The predict step spreads particles; the update step concentrates them near states that explain observations; resampling kills hypotheses that accumulated negligible weight. Skipping resample leads to particle degeneracy: after many steps, a single particle carries all weight while others are dead weight — Monte Carlo variance explodes.

Resampling strategies

Multinomial resampling — simple random sampling with replacement; higher variance.
Systematic resampling — lower variance, O(N); the default in most libraries.
Stratified resampling — partitions the cumulative weight axis; slightly better diversity preservation.

Low-variance resampling and residual resampling are refinements when particle count is tight. Always resample after weighting, not before prediction, in the standard SIR formulation.

Importance sampling and proposal design

The bootstrap filter uses the transition model as the proposal: q(x_t | x_{t-1}, z_t) = p(x_t | x_{t-1}). That is easy but inefficient when the measurement is much more informative than motion — most propagated particles land where the sensor says you cannot be, receive near-zero weight, and die on resample. Optimal proposals incorporate the latest measurement: q(x_t | x_{t-1}, z_t) \propto p(z_t | x_t)\, p(x_t | x_{t-1}).

In practice, auxiliary particle filters (APF) and unscented particle filters blend UKF-style proposals with particle diversity. For lidar localization, sampling around high-likelihood poses from scan matching (rather than blind odometry) can reduce required N by an order of magnitude. The engineering trade-off: smarter proposals need more code and must remain correct importance weights if you depart from the bootstrap choice.

State representation and dimensions

Particle count scales poorly with state dimension — the curse of dimensionality that keeps particle filters in roughly 4–10 continuous dimensions for real-time use, sometimes higher with careful proposals. Common state vectors:

Robot pose — (x, y, \theta) or SE(2); 500–2000 particles often suffice for warehouse grids with good lidar.
Target tracking — position and velocity in 2D/3D; gating measurements with a Kalman pre-filter reduces particles needed.
Financial latent regimes — discrete regime index plus continuous volatility; particles mix GARCH-like parameters across hypotheses.

High-dimensional states (full SLAM maps) use Rao-Blackwellized particle filters: particles over robot trajectory, closed-form Kalman updates per landmark per particle — FastSLAM is the canonical example.

Worked example — Harbor Logistics warehouse localization

Harbor Logistics deployed 18 AMRs in a symmetrical racking layout. Sensors: wheel encoders (10 Hz), 2D lidar (5 Hz), ceiling QR codes visible only at aisle ends. The EKF tracked (x, y, \theta) with Gaussian noise. At junctions where odometry error exceeded 0.5 m, the filter committed to one aisle; wrong commits triggered recovery behaviors averaging 47 seconds.

The refactor used 800 particles with a differential-drive motion model. Propagation added Gaussian noise on translation and rotation tuned from encoder slip logs. Lidar updates scored each particle with a likelihood field map (precomputed distance transform) — fast beam-endpoint matching. QR detections applied a sharp Gaussian bump when a particle entered a code footprint. Resampling used systematic resampling when N_{\text{eff}} < 400.

Critical fix: at ambiguous junctions, weight remained split across branches until lidar geometry or a QR code collapsed the distribution. Mean time lost per junction dropped from 47 s to 11 s; overall pick-route throughput rose 8%. CPU cost rose 3.2× versus EKF on the same ARM SoC but stayed within the 50 ms budget at 5 Hz lidar. The team kept EKF on highway segments with unique geometry and switched to particles only in tagged ambiguity zones — a hybrid many production systems use.

Filter choice decision table

Situation	Recommended approach	Why
Linear dynamics, Gaussian noise, unimodal belief	Kalman filter	Optimal, O(n³) per step, minimal tuning.
Mild nonlinearity, unimodal, real-time constraints	EKF or UKF	Cheaper than particles; UKF often better on bearing-only sensors.
Multimodal pose or data association ambiguity	Particle filter	Represents multiple hypotheses until disambiguated.
Discrete hidden states, finite alphabet	HMM or grid filter	Exact inference when state space is small and discrete.
SLAM with many landmarks	Rao-Blackwellized PF (FastSLAM)	Particles over path; Kalman per landmark per particle.
High-dimensional continuous state (>10D) real-time	EKF/UKF or learned end-to-end model	Raw particle cost explodes; consider reduction or learning.
Offline batch smoothing	Particle smoother or variational methods	Forward-backward particle passes improve past estimates.

Common pitfalls

Too few particles: multimodal beliefs look unimodal; increase N or improve proposals before tuning noise matrices.
Never resampling: degeneracy silently kills diversity; monitor N_{\text{eff}} every step.
Resampling every step: removes diversity prematurely; sample impoverishment and particle deprivation — resample only when needed.
Wrong likelihood units: mixing log-likelihoods with linear weights without consistent normalization collapses weights to NaN.
Ignoring sample impoverishment after resample: add jitter (small process noise) to duplicated particles to maintain spread.
Using particles where Kalman suffices: 10× CPU for no accuracy gain on unimodal linear problems.
Global localization without spread: uniform initial cloud over the whole map needs thousands of particles; use coarse grid or Wi-Fi fingerprint to seed.

Practitioner checklist

Confirm belief is genuinely multimodal or strongly non-Gaussian; otherwise benchmark EKF/UKF first.
Define state vector, motion model p(x_t|x_{t-1}), and likelihood p(z_t|x_t) with units tested on recorded bags.
Choose particle count N via simulation; target N_{\text{eff}} > N/4 at steady state.
Implement systematic resampling with effective-sample threshold; log N_{\text{eff}} in production.
Add post-resample jitter if identical duplicates cause numerical issues.
Precompute expensive likelihood structures (occupancy grids, distance fields) offline.
Validate on logged sequences with ground truth; measure pose error and failure recovery time.
Profile latency per predict-update-resample cycle against sensor rate.
Hybridize: run Kalman on easy segments, particles in tagged ambiguity regions.
Document when filter mode switches and how operators should interpret multimodal debug visualizations.

Key takeaways

Particle filters approximate the full Bayesian posterior with weighted samples — not just a mean and covariance.
The SIR loop propagates hypotheses, weights them by measurement likelihood, and resamples to fight degeneracy.
They excel at multimodal localization and nonlinear/non-Gaussian models at the cost of CPU and careful tuning.
Proposal design and resampling strategy matter as much as particle count.
Use Kalman or HMM when their assumptions hold; reach for particles when Gaussian unimodality is visibly wrong.