Guide
Particle filter explained
Harbor Logistics’ warehouse robots navigated a 40,000 m² grid of identical aisles. At every T-junction, wheel odometry and a single-beam lidar were ambiguous: the robot could be in corridor A or corridor B with equal plausibility. An extended Kalman filter forced a unimodal Gaussian belief and “picked” the wrong branch 14% of the time — each mistake required a manual reposition. Switching to a particle filter that maintained hundreds of weighted pose hypotheses cut localization failures to 5.4% without new sensors. A particle filter (also called sequential Monte Carlo) represents belief about a hidden state with a cloud of weighted samples rather than a single mean and covariance. It handles nonlinear dynamics, non-Gaussian noise, and multimodal uncertainty — cases where Kalman-family filters break. This guide covers the predict-weight-resample loop, importance sampling, particle degeneracy, resampling strategies, comparison with Kalman filters and hidden Markov models, a Harbor Logistics robot localization refactor, a filter decision table, pitfalls, and a production checklist.
Why Gaussian filters are not enough
The Kalman filter and its extensions (EKF, UKF) maintain a Gaussian belief over the hidden state. That assumption is elegant and fast when it holds: one bump of probability, one mean, one covariance ellipsoid. Real estimation problems often violate it:
- Multimodal ambiguity — symmetric corridors, kidnapped robot problems, or two aircraft with crossing trajectories create multiple plausible states simultaneously. A Gaussian cannot represent “50% here, 50% there.”
- Heavy-tailed or skewed noise — occasional gross outliers (slipping wheels, multipath GPS) are poorly modeled as Gaussian.
- Strong nonlinearities — EKF linearization can collapse multimodal structure into a misleading single mode.
- Discrete hybrid states — “which room am I in?” combined with continuous pose is naturally represented by particles clustered per room.
Particle filters trade compute for expressiveness: instead of tracking
(\mu, \Sigma), they approximate the full posterior
p(x_t | z_{1:t}) with weighted samples
\{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^N. Any shape — multimodal,
skewed, bounded — is representable if you use enough particles.
Sequential Monte Carlo intuition
Bayesian filtering recursively updates belief as measurements arrive. At time
t, you want the posterior over state x_t given all
observations z_{1:t}. Particle filters approximate this with
Monte Carlo integration: draw samples from a proposal distribution, weight
each sample by how well it explains the latest measurement, and occasionally
duplicate high-weight particles while discarding low-weight ones.
Think of each particle as a hypothesis about reality: “the robot is at (12.4, 7.1) facing east with 0.3% weight.” After motion, every hypothesis moves according to the dynamics model plus process noise. After sensing, hypotheses inconsistent with lidar or GPS get down-weighted; consistent ones gain weight. When one hypothesis dominates, you have localized; when weight spreads across two clusters, you are correctly uncertain until disambiguating data arrives.
Foundations sit in Bayesian inference: the filter is importance sampling applied sequentially, with corrections to prevent sample impoverishment.
The SIR loop: predict, weight, resample
The most common variant is the Sampling Importance Resampling (SIR) filter, also called bootstrap particle filter:
- Predict (propagate) — for each particle
x_{t-1}^{(i)}, samplex_t^{(i)} \sim p(x_t | x_{t-1}^{(i)})using your motion model (odometry, IMU integration, kinematics). Process noise injects diversity. - Weight (update) — assign importance weight
w_t^{(i)} \propto p(z_t | x_t^{(i)})from the measurement likelihood (lidar scan match, GPS, vision feature match). Normalize so weights sum to 1. - Resample (optional but critical) — when effective
sample size
N_{\text{eff}} = 1 / \sum_i (w^{(i)})^2drops below a threshold (oftenN/2), drawNnew particles by sampling indices proportional to weights, reset weights to1/N.
The predict step spreads particles; the update step concentrates them near states that explain observations; resampling kills hypotheses that accumulated negligible weight. Skipping resample leads to particle degeneracy: after many steps, a single particle carries all weight while others are dead weight — Monte Carlo variance explodes.
Resampling strategies
- Multinomial resampling — simple random sampling with replacement; higher variance.
- Systematic resampling — lower variance, O(N); the default in most libraries.
- Stratified resampling — partitions the cumulative weight axis; slightly better diversity preservation.
Low-variance resampling and residual resampling are refinements when particle count is tight. Always resample after weighting, not before prediction, in the standard SIR formulation.
Importance sampling and proposal design
The bootstrap filter uses the transition model as the proposal:
q(x_t | x_{t-1}, z_t) = p(x_t | x_{t-1}). That is easy but
inefficient when the measurement is much more informative than motion —
most propagated particles land where the sensor says you cannot be, receive
near-zero weight, and die on resample. Optimal proposals
incorporate the latest measurement:
q(x_t | x_{t-1}, z_t) \propto p(z_t | x_t)\, p(x_t | x_{t-1}).
In practice, auxiliary particle filters (APF) and unscented particle filters
blend UKF-style proposals with particle diversity. For lidar localization,
sampling around high-likelihood poses from scan matching (rather than blind
odometry) can reduce required N by an order of magnitude. The
engineering trade-off: smarter proposals need more code and must remain
correct importance weights if you depart from the bootstrap choice.
State representation and dimensions
Particle count scales poorly with state dimension — the curse of dimensionality that keeps particle filters in roughly 4–10 continuous dimensions for real-time use, sometimes higher with careful proposals. Common state vectors:
- Robot pose —
(x, y, \theta)or SE(2); 500–2000 particles often suffice for warehouse grids with good lidar. - Target tracking — position and velocity in 2D/3D; gating measurements with a Kalman pre-filter reduces particles needed.
- Financial latent regimes — discrete regime index plus continuous volatility; particles mix GARCH-like parameters across hypotheses.
High-dimensional states (full SLAM maps) use Rao-Blackwellized particle filters: particles over robot trajectory, closed-form Kalman updates per landmark per particle — FastSLAM is the canonical example.
Worked example — Harbor Logistics warehouse localization
Harbor Logistics deployed 18 AMRs in a symmetrical racking layout. Sensors:
wheel encoders (10 Hz), 2D lidar (5 Hz), ceiling QR codes visible only at
aisle ends. The EKF tracked (x, y, \theta) with Gaussian noise.
At junctions where odometry error exceeded 0.5 m, the filter committed to one
aisle; wrong commits triggered recovery behaviors averaging 47 seconds.
The refactor used 800 particles with a differential-drive motion model.
Propagation added Gaussian noise on translation and rotation tuned from
encoder slip logs. Lidar updates scored each particle with a likelihood field
map (precomputed distance transform) — fast beam-endpoint matching.
QR detections applied a sharp Gaussian bump when a particle entered a code
footprint. Resampling used systematic resampling when
N_{\text{eff}} < 400.
Critical fix: at ambiguous junctions, weight remained split across branches until lidar geometry or a QR code collapsed the distribution. Mean time lost per junction dropped from 47 s to 11 s; overall pick-route throughput rose 8%. CPU cost rose 3.2× versus EKF on the same ARM SoC but stayed within the 50 ms budget at 5 Hz lidar. The team kept EKF on highway segments with unique geometry and switched to particles only in tagged ambiguity zones — a hybrid many production systems use.
Filter choice decision table
| Situation | Recommended approach | Why |
|---|---|---|
| Linear dynamics, Gaussian noise, unimodal belief | Kalman filter | Optimal, O(n³) per step, minimal tuning. |
| Mild nonlinearity, unimodal, real-time constraints | EKF or UKF | Cheaper than particles; UKF often better on bearing-only sensors. |
| Multimodal pose or data association ambiguity | Particle filter | Represents multiple hypotheses until disambiguated. |
| Discrete hidden states, finite alphabet | HMM or grid filter | Exact inference when state space is small and discrete. |
| SLAM with many landmarks | Rao-Blackwellized PF (FastSLAM) | Particles over path; Kalman per landmark per particle. |
| High-dimensional continuous state (>10D) real-time | EKF/UKF or learned end-to-end model | Raw particle cost explodes; consider reduction or learning. |
| Offline batch smoothing | Particle smoother or variational methods | Forward-backward particle passes improve past estimates. |
Common pitfalls
- Too few particles: multimodal beliefs look unimodal; increase
Nor improve proposals before tuning noise matrices. - Never resampling: degeneracy silently kills diversity; monitor
N_{\text{eff}}every step. - Resampling every step: removes diversity prematurely; sample impoverishment and particle deprivation — resample only when needed.
- Wrong likelihood units: mixing log-likelihoods with linear weights without consistent normalization collapses weights to NaN.
- Ignoring sample impoverishment after resample: add jitter (small process noise) to duplicated particles to maintain spread.
- Using particles where Kalman suffices: 10× CPU for no accuracy gain on unimodal linear problems.
- Global localization without spread: uniform initial cloud over the whole map needs thousands of particles; use coarse grid or Wi-Fi fingerprint to seed.
Practitioner checklist
- Confirm belief is genuinely multimodal or strongly non-Gaussian; otherwise benchmark EKF/UKF first.
- Define state vector, motion model
p(x_t|x_{t-1}), and likelihoodp(z_t|x_t)with units tested on recorded bags. - Choose particle count
Nvia simulation; targetN_{\text{eff}} > N/4at steady state. - Implement systematic resampling with effective-sample threshold; log
N_{\text{eff}}in production. - Add post-resample jitter if identical duplicates cause numerical issues.
- Precompute expensive likelihood structures (occupancy grids, distance fields) offline.
- Validate on logged sequences with ground truth; measure pose error and failure recovery time.
- Profile latency per predict-update-resample cycle against sensor rate.
- Hybridize: run Kalman on easy segments, particles in tagged ambiguity regions.
- Document when filter mode switches and how operators should interpret multimodal debug visualizations.
Key takeaways
- Particle filters approximate the full Bayesian posterior with weighted samples — not just a mean and covariance.
- The SIR loop propagates hypotheses, weights them by measurement likelihood, and resamples to fight degeneracy.
- They excel at multimodal localization and nonlinear/non-Gaussian models at the cost of CPU and careful tuning.
- Proposal design and resampling strategy matter as much as particle count.
- Use Kalman or HMM when their assumptions hold; reach for particles when Gaussian unimodality is visibly wrong.
Related reading
- Kalman filter explained — Gaussian predict-update baseline and EKF/UKF extensions
- Bayesian inference explained — priors, posteriors, and sequential updating foundations
- Hidden Markov models explained — discrete latent states when the alphabet is finite and small
- Time series forecasting explained — when to model states versus direct future prediction