Guide

Kalman filter explained

A Kalman filter is a recursive algorithm that estimates a hidden state — position, velocity, inventory level, sensor bias — from noisy measurements over time. It alternates between predicting how the state should evolve and updating that prediction when new data arrives, weighting each source by its uncertainty. GPS navigation, drone autopilots, robotics, econometric models, and trading systems all use variants of the same idea. This guide covers the predict-update cycle, process and measurement noise matrices, linear vs nonlinear extensions, multi-sensor fusion, a Harbor Courier fleet-tracking worked example, a filter decision table, common pitfalls, and a practitioner checklist. For discrete state transitions without continuous dynamics, see Markov chains explained; for forecasting without a mechanistic state model, see time series forecasting explained; for probabilistic foundations, see Bayesian inference explained.

The problem Kalman filters solve

Real sensors lie — GPS drifts in urban canyons, accelerometers pick up vibration, wheel odometry slips on wet pavement. You rarely observe the quantity you care about directly; you observe a corrupted version of it, often at irregular intervals from multiple sources. A naive approach averages recent readings, which lags behind reality and smears sharp turns. A Kalman filter maintains a belief distribution over the true state (mean plus covariance) and fuses prediction with measurement in a statistically optimal way — for linear Gaussian systems.

The filter assumes you can write two equations. A state transition model describes how the hidden state evolves between time steps (physics, kinematics, or learned dynamics). An observation model maps the hidden state to what sensors should read. Both equations carry noise terms whose variances you specify or estimate. The algorithm is recursive: each step needs only the previous estimate and the latest measurement, not the full history — which makes it fast enough for real-time control loops at hundreds of hertz.

State vector and covariance

The state vector x holds everything you want to track — for a delivery van that might be [latitude, longitude, speed, heading]. The covariance matrix P encodes how uncertain you are about each component and how they correlate. Large diagonal entries mean "we are not sure"; small entries mean "we are confident." Off-diagonal entries capture coupling — uncertainty in speed affects uncertainty in future position.

Predict and update — the two-step cycle

Every time step runs the same loop. During prediction (also called the time update), the filter projects the state forward using the transition model and inflates covariance to reflect process noise — model imperfection and unmodeled disturbances. No new sensor data is used here; uncertainty grows because you are extrapolating.

During update (the measurement update), a sensor reading arrives. The filter computes the Kalman gain K, which answers: "how much should I trust this measurement versus my prediction?" High measurement noise yields low gain (ignore the sensor); high prediction uncertainty yields high gain (correct aggressively). The state mean shifts toward the measurement; covariance shrinks because you incorporated new information.

Intuition for the Kalman gain

Think of tracking a car with GPS that jumps around and a speedometer that drifts slowly. When GPS variance is low and you just predicted confidently, gain is moderate — you nudge position but do not throw away velocity estimates built from integration. When GPS drops out (tunnel), you predict open-loop and covariance balloons until signal returns. When GPS suddenly disagrees sharply with prediction, a well-tuned filter asks whether the jump is plausible given both noise models before snapping the estimate.

Process noise (Q) and measurement noise (R)

Process noise matrix Q models how wrong your transition equation can be — driver acceleration not in the model, wind gusts, inventory shocks. Measurement noise matrix R models sensor error — GPS horizontal accuracy, scale calibration, rounding. Tuning Q and R is where most practical effort goes: too little process noise makes the filter sluggish and over-trust the model; too much makes it chase sensor noise and jitter.

Linear Kalman filter — when it applies

The classic Kalman filter requires linear transition and observation functions plus Gaussian noise. The state update is x_k = F x_{k-1} + w and the measurement is z_k = H x_k + v, where F and H are matrices. Constant-velocity motion models, linear-Gaussian econometric state-space models, and simple inventory trackers fit this mold exactly — closed-form predict/update with no approximations.

A constant-velocity model in one dimension might track position and velocity with F encoding "position += velocity * dt." Radar tracking aircraft range and range-rate uses similar structure. When dynamics are nearly linear over your sampling interval, the linear filter often suffices even if the true world is mildly nonlinear — a useful baseline before reaching for heavier machinery.

Nonlinear extensions — EKF, UKF, and particle filters

Extended Kalman filter (EKF)

Most real systems are nonlinear — bearing-only sensors, quaternion attitude, logistic growth rates. The extended Kalman filter linearizes the transition and observation functions around the current estimate via Jacobians, then runs the standard Kalman equations on the local linear approximation. EKF is the workhorse for robotics and navigation when nonlinearity is moderate. It can diverge if linearization is poor or uncertainties are large relative to curvature.

Unscented Kalman filter (UKF)

The unscented Kalman filter propagates a set of sigma points through the nonlinear functions instead of linearizing. It often beats EKF on strongly nonlinear observation models (e.g., converting GPS lat/lon to local tangent plane with high attitude uncertainty) at similar computational cost for moderate state dimensions.

Particle filters

When distributions are far from Gaussian — multimodal hypotheses ("the target is in one of two rooms") — particle filters represent belief with weighted samples. They are more flexible and more expensive. Use them when Kalman-family filters demonstrably fail, not as a default.

Multi-sensor fusion

Kalman filters shine when fusing heterogeneous sensors at different rates. IMU accelerometers and gyroscopes update at 200 Hz; GPS arrives at 1 Hz; wheel encoders tick every few milliseconds. The predict step runs at the highest rate using IMU; GPS and encoder readings each trigger an update with their own H and R matrices observing subsets of the state. Mis-timed measurements (latency, clock skew) are a common production bug — timestamp alignment matters as much as matrix tuning.

Sensor fusion is not magic averaging: the filter encodes which sensors observe which state components and how trustworthy each is under current conditions. A magnetometer update might be down-weighted near power lines; visual odometry might be trusted only when feature count is high. Those rules often appear as adaptive R inflation rather than hard switches, preserving filter continuity.

Worked example — Harbor Courier fleet tracking

Harbor Courier operates 40 vans across a metro area. Dispatch needs smooth position, speed, and heading estimates to assign jobs and quote ETAs. Each van streams GPS (1 Hz, ~8 m horizontal error in good conditions), CAN-bus speed (10 Hz, low noise), and intermittent cell-tower fixes when GPS is lost downtown.

The team defines a six-dimensional state: local east/north position, velocity components, and heading rate. A constant-turn-rate model drives prediction at 10 Hz between speed updates. GPS updates observe position with diagonal R scaled by reported accuracy; speed updates observe velocity magnitude. When GPS accuracy blows past 50 m (urban canyon), they inflate R rather than dropping fixes — weak updates beat dead reckoning alone.

Tuning started with Q derived from maximum plausible acceleration and turn rate; field logs compared filter innovation (measurement minus prediction) against theoretical chi-square bounds. Innovations that were consistently too large signaled underestimated process noise; jittery tracks signaled over-trusted GPS. After two weeks of shadow mode, ETA error at 15-minute horizon dropped 22% versus raw GPS interpolation, with no extra hardware — the same pattern many logistics platforms use before adding map-matching layers on top.

Filter choice decision table

Situation	Recommended approach	Why
Linear dynamics, Gaussian noise, real-time budget	Linear Kalman filter	Optimal, fast, easy to analyze
Mild nonlinearity, smooth Jacobians	Extended Kalman filter (EKF)	Industry default for navigation and robotics
Strong nonlinear observations, moderate state size	Unscented Kalman filter (UKF)	Better moment capture than EKF linearization
Multimodal beliefs, discrete hypotheses	Particle filter	Handles non-Gaussian, multi-peaked posteriors
No credible dynamics model, only history	ARIMA / ML forecasting	Kalman needs a state model; see time series guide
Discrete states only (idle, moving, fault)	Hidden Markov model	Markov chains when continuous state is unnecessary
Offline batch smoothing, all data available	Rauch-Tung-Striebel smoother	Uses future measurements to refine past estimates

Common pitfalls

Overconfident initialization — starting with near-zero covariance makes early bad measurements dominate permanently; use loose priors until sensors agree.
Mismatched units and frames — mixing degrees with radians or GPS WGS84 with local ENU without consistent transforms produces "mystery drift."
Ignoring latency — applying a measurement as if it arrived at filter time when it was captured 500 ms ago smears turns; buffer and back-propagate or shift timestamps.
Q and R copied from papers — noise matrices are vehicle-specific and environment-specific; tune on your logs or use adaptive estimation.
EKF on highly nonlinear manifolds without care — attitude on SO(3) needs quaternion or error-state formulations; naive Euler-angle EKF gimbal-locks.
Updating with correlated measurements twice — two sensors derived from the same GPS chip counted separately over-trust one reading.
No innovation monitoring — chi-square tests on innovations catch sensor failures and model breakdown early; filters without monitoring fail silently.

Practitioner checklist

Define state vector, transition model, and which sensors observe which components.
Document units, coordinate frames, and sampling rates for every input.
Initialize mean and covariance with conservative (large) uncertainty.
Derive or estimate Q from physical limits; set R from sensor datasheets then refine on data.
Run predict at the highest sensor rate; apply measurement updates as data arrives.
Log innovations and compare to expected distributions; alert on sustained mismatch.
Handle missing sensors by predict-only steps, not by repeating stale measurements.
Validate on held-out trajectories with ground truth or high-grade reference sensors.
Escalate to EKF/UKF only when linear filter innovation tests fail systematically.
Document tuning parameters and revalidate after hardware or firmware changes.

Key takeaways

Kalman filters recursively fuse a dynamics model with noisy measurements via predict-update cycles.
The Kalman gain balances prediction uncertainty against measurement noise — tuning Q and R is the core engineering task.
Linear Kalman filters are optimal for linear Gaussian systems; EKF and UKF extend to most navigation and robotics cases.
Multi-rate, multi-sensor fusion is the killer application — IMU plus GPS plus odometry is the canonical stack.
Innovation monitoring and timestamp hygiene separate production trackers from demo filters that look smooth until they diverge.