Kalman filter explained
A Kalman filter is a recursive algorithm that estimates a hidden state — position, velocity, inventory level, sensor bias — from noisy measurements over time. It alternates between predicting how the state should evolve and updating that prediction when new data arrives, weighting each source by its uncertainty. GPS navigation, drone autopilots, robotics, econometric models, and trading systems all use variants of the same idea. This guide covers the predict-update cycle, process and measurement noise matrices, linear vs nonlinear extensions, multi-sensor fusion, a Harbor Courier fleet-tracking worked example, a filter decision table, common pitfalls, and a practitioner checklist. For discrete state transitions without continuous dynamics, see Markov chains explained; for forecasting without a mechanistic state model, see time series forecasting explained; for probabilistic foundations, see Bayesian inference explained.
The problem Kalman filters solve
Real sensors lie — GPS drifts in urban canyons, accelerometers pick up vibration, wheel odometry slips on wet pavement. You rarely observe the quantity you care about directly; you observe a corrupted version of it, often at irregular intervals from multiple sources. A naive approach averages recent readings, which lags behind reality and smears sharp turns. A Kalman filter maintains a belief distribution over the true state (mean plus covariance) and fuses prediction with measurement in a statistically optimal way — for linear Gaussian systems.
The filter assumes you can write two equations. A state transition model describes how the hidden state evolves between time steps (physics, kinematics, or learned dynamics). An observation model maps the hidden state to what sensors should read. Both equations carry noise terms whose variances you specify or estimate. The algorithm is recursive: each step needs only the previous estimate and the latest measurement, not the full history — which makes it fast enough for real-time control loops at hundreds of hertz.
State vector and covariance
The state vector x holds everything you want
to track — for a delivery van that might be [latitude, longitude,
speed, heading]. The covariance matrix P
encodes how uncertain you are about each component and how they correlate.
Large diagonal entries mean "we are not sure"; small entries mean "we are
confident." Off-diagonal entries capture coupling — uncertainty in speed
affects uncertainty in future position.
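As a rough illustration of the state vector and covariance, the sketch below builds both with NumPy. The numbers, the local metre-based frame (instead of raw latitude/longitude), and the 0.3 correlation are all assumptions chosen for readability, not values from a real vehicle.

```python
import numpy as np

# Hypothetical van state in a local frame: [east_m, north_m, speed_mps, heading_rad].
x = np.array([120.0, -45.0, 13.4, 0.79])

# Assumed 1-sigma uncertainty for each component.
sigma = np.array([8.0, 8.0, 0.5, 0.1])
P = np.diag(sigma**2)  # diagonal entries are variances

# Illustrative coupling between east position and speed (correlation 0.3).
rho = 0.3
P[0, 2] = P[2, 0] = rho * sigma[0] * sigma[2]

# A usable covariance is symmetric and positive semi-definite.
is_symmetric = np.allclose(P, P.T)
is_psd = bool(np.all(np.linalg.eigvalsh(P) >= 0.0))

# Normalizing by the standard deviations recovers the correlation matrix.
std = np.sqrt(np.diag(P))
corr = P / np.outer(std, std)
print(np.round(corr, 2))
```

Checking symmetry and positive semi-definiteness after every hand edit of P is a cheap guard, since an invalid covariance tends to surface later as a confusing numerical failure.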
Predict and update — the two-step cycle
Every time step runs the same loop. During prediction (also called the time update), the filter projects the state forward using the transition model and inflates covariance to reflect process noise — model imperfection and unmodeled disturbances. No new sensor data is used here; uncertainty grows because you are extrapolating.
During update (the measurement update), a sensor reading
arrives. The filter computes the Kalman gain K,
which answers: "how much should I trust this measurement versus my
prediction?" High measurement noise relative to prediction uncertainty
yields low gain (the estimate barely moves toward the sensor); high
prediction uncertainty yields high gain (correct aggressively). The
state mean shifts toward the measurement by the gain times the gap between
measurement and prediction, and covariance shrinks because you
incorporated new information.
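The two-step cycle can be sketched for the simplest possible case: a single scalar state that is assumed to stay roughly constant (a random-walk model). The function names and the sample readings are illustrative assumptions.

```python
def predict(x, P, Q):
    """Random-walk prediction: the mean is unchanged, variance grows by Q."""
    return x, P + Q


def update(x, P, z, R):
    """Blend the prediction (x, P) with a measurement z of variance R."""
    K = P / (P + R)          # Kalman gain, between 0 and 1
    x_new = x + K * (z - x)  # shift the mean toward the measurement
    P_new = (1.0 - K) * P    # variance shrinks after the update
    return x_new, P_new, K


x, P = 0.0, 100.0   # vague initial belief
Q, R = 0.01, 4.0    # assumed process and measurement variances

for z in [10.3, 9.6, 10.1, 9.9]:
    x, P = predict(x, P, Q)
    x, P, K = update(x, P, z, R)
    print(f"z={z:5.1f}  gain={K:.3f}  estimate={x:.3f}  variance={P:.3f}")
```

The gain starts close to 1 because the initial variance dwarfs R, then falls as the estimate firms up.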
Intuition for the Kalman gain
Think of tracking a car with GPS that jumps around and a speedometer that drifts slowly. When GPS variance is low and you just predicted confidently, gain is moderate — you nudge position but do not throw away velocity estimates built from integration. When GPS drops out (tunnel), you predict open-loop and covariance balloons until signal returns. When GPS suddenly disagrees sharply with prediction, a well-tuned filter asks whether the jump is plausible given both noise models before snapping the estimate.
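A scalar sketch of the same intuition follows. The variances are made-up values, and the plausibility gate is one common way to implement the "is this jump plausible" question, not necessarily the one any particular tracker uses.

```python
def kalman_gain(P, R):
    """Scalar gain: the fraction of the measurement-prediction gap applied."""
    return P / (P + R)


def is_plausible(innovation, P, R, threshold=3.84):
    """Gate a scalar measurement using the squared gap normalized by its variance.

    3.84 is approximately the 95% point of a chi-square with 1 degree of freedom.
    """
    return innovation**2 / (P + R) <= threshold


R_gps = 25.0  # assumed GPS variance (m^2)
Q = 1.0       # assumed process noise added per predict step
P = 4.0       # confident prediction just before the tunnel

gain_confident = kalman_gain(P, R_gps)

# Thirty predict-only steps in the tunnel: covariance balloons.
for _ in range(30):
    P += Q
gain_after_tunnel = kalman_gain(P, R_gps)

small_jump_ok = is_plausible(6.0, 4.0, R_gps)    # modest disagreement
large_jump_ok = is_plausible(40.0, 4.0, R_gps)   # sharp disagreement

print(gain_confident, gain_after_tunnel, small_jump_ok, large_jump_ok)
```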
Process noise (Q) and measurement noise (R)
Process noise matrix Q models how wrong your
transition equation can be: driver acceleration not in the model, wind
gusts, inventory shocks. Measurement noise matrix R
models sensor error: GPS horizontal accuracy, scale calibration, rounding.
Tuning Q and R is where most practical effort
goes: too little process noise makes the filter sluggish because it
over-trusts the model; too much makes it chase sensor noise and produces
jittery estimates.
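One way to see the tuning trade-off is to iterate a scalar filter until its gain settles, then compare the settled gain for several values of Q with R held fixed. This is a sketch for a random-walk state; the specific Q values are arbitrary.

```python
def steady_state_gain(Q, R, steps=500):
    """Iterate scalar predict/update variance equations until the gain settles."""
    P = 1.0
    K = 0.0
    for _ in range(steps):
        P_pred = P + Q                # predict: variance grows
        K = P_pred / (P_pred + R)     # gain for this step
        P = (1.0 - K) * P_pred        # update: variance shrinks
    return K


R = 1.0
gains = {Q: steady_state_gain(Q, R) for Q in (0.001, 0.1, 10.0)}
for Q, K in gains.items():
    print(f"Q={Q:<6} steady-state gain={K:.3f}")
```

A small Q gives a gain near zero (smooth but slow to react), while a large Q pushes the gain toward one (responsive but noisy). Only the ratio of Q to R matters in this scalar case.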
Linear Kalman filter — when it applies
The classic Kalman filter requires linear transition and
observation functions plus Gaussian noise. The state update
is x_k = F x_{k-1} + w_{k-1} and the measurement is z_k = H x_k + v_k,
where F and H are matrices, w is zero-mean Gaussian process noise with
covariance Q, and v is zero-mean Gaussian measurement noise with
covariance R. Constant-velocity
motion models, linear-Gaussian econometric state-space models, and simple
inventory trackers fit this mold exactly, with closed-form predict/update and
no approximations.
A constant-velocity model in one dimension might track position and
velocity with F encoding "position += velocity * dt." Radar
tracking aircraft range and range-rate uses similar structure. When
dynamics are nearly linear over your sampling interval, the linear filter
often suffices even if the true world is mildly nonlinear — a useful
baseline before reaching for heavier machinery.
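A minimal sketch of the one-dimensional constant-velocity filter, assuming a 1 s sample interval, position-only measurements, and a white-noise-acceleration form for Q. The noise values and the noise-free test track (a target moving at 2 m/s) are illustrative assumptions.

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])          # position += velocity * dt
H = np.array([[1.0, 0.0]])          # only position is measured
q = 0.1                             # assumed acceleration noise intensity
Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                  [dt**2 / 2, dt]])
R = np.array([[4.0]])               # assumed measurement variance (m^2)


def kf_predict(x, P):
    """Project mean and covariance forward one step."""
    return F @ x, F @ P @ F.T + Q


def kf_update(x, P, z):
    """Standard linear measurement update."""
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P


x = np.array([0.0, 0.0])                # [position, velocity]
P = np.diag([100.0, 100.0])             # loose prior

for k in range(1, 21):
    x, P = kf_predict(x, P)
    x, P = kf_update(x, P, np.array([2.0 * k]))

print("position, velocity:", np.round(x, 3))
```

Velocity is never measured directly here; the filter infers it through the position-velocity coupling that F builds into P.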
Nonlinear extensions — EKF, UKF, and particle filters
Extended Kalman filter (EKF)
Most real systems are nonlinear — bearing-only sensors, quaternion attitude, logistic growth rates. The extended Kalman filter linearizes the transition and observation functions around the current estimate via Jacobians, then runs the standard Kalman equations on the local linear approximation. EKF is the workhorse for robotics and navigation when nonlinearity is moderate. It can diverge if linearization is poor or uncertainties are large relative to curvature.
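As a sketch of the linearization step, the example below applies one EKF measurement update for a hypothetical sensor that reports range from the origin to a 2-D position. The sensor, state layout, and noise values are assumptions for illustration.

```python
import numpy as np


def h(x):
    """Nonlinear observation: range from the origin to position (px, py)."""
    return np.array([np.hypot(x[0], x[1])])


def jacobian_h(x):
    """Jacobian of h evaluated at the current estimate."""
    r = np.hypot(x[0], x[1])
    return np.array([[x[0] / r, x[1] / r]])


def ekf_update(x, P, z, R):
    """Kalman update using the Jacobian in place of a fixed H matrix."""
    Hj = jacobian_h(x)
    y = z - h(x)                         # innovation uses the true nonlinear h
    S = Hj @ P @ Hj.T + R
    K = P @ Hj.T @ np.linalg.inv(S)
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ Hj) @ P
    return x_new, P_new


x = np.array([30.0, 40.0])               # predicted range is 50 m
P = np.diag([25.0, 25.0])
R = np.array([[1.0]])

x_new, P_new = ekf_update(x, P, np.array([52.0]), R)
print(np.round(x_new, 3), round(float(h(x_new)[0]), 3))
```

The correction moves the estimate along the line of sight only, because a range reading carries no information about the perpendicular direction.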
Unscented Kalman filter (UKF)
The unscented Kalman filter propagates a set of sigma points through the nonlinear functions instead of linearizing. It often beats EKF on strongly nonlinear observation models (e.g., converting GPS lat/lon to local tangent plane with high attitude uncertainty) at similar computational cost for moderate state dimensions.
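The core of the UKF is the unscented transform. The sketch below uses the basic sigma-point scheme with a single spread parameter kappa (an assumption; production filters often use the scaled variant with extra parameters) and pushes a range-bearing belief through a polar-to-Cartesian conversion.

```python
import numpy as np


def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate (mean, cov) through f using 2n+1 sigma points."""
    n = len(mean)
    L = np.linalg.cholesky((n + kappa) * cov)   # matrix square root
    points = [mean]
    points += [mean + L[:, i] for i in range(n)]
    points += [mean - L[:, i] for i in range(n)]

    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)

    Y = np.array([f(p) for p in points])        # transformed sigma points
    y_mean = weights @ Y
    d = Y - y_mean
    y_cov = (weights[:, None] * d).T @ d        # weighted outer products
    return y_mean, y_cov


def polar_to_xy(p):
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])


polar_mean = np.array([10.0, np.pi / 4])
polar_cov = np.diag([0.1**2, 0.3**2])           # large bearing uncertainty

xy_mean, xy_cov = unscented_transform(polar_mean, polar_cov, polar_to_xy)
linearized_mean = polar_to_xy(polar_mean)       # what a first-order EKF would use
print(np.round(xy_mean, 3), np.round(linearized_mean, 3))
```

With this much bearing uncertainty the transformed mean sits closer to the origin than the linearized one, a curvature effect that first-order linearization misses.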
Particle filters
When distributions are far from Gaussian — multimodal hypotheses ("the target is in one of two rooms") — particle filters represent belief with weighted samples. They are more flexible and more expensive. Use them when Kalman-family filters demonstrably fail, not as a default.
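A minimal sketch of one bootstrap particle filter step for a two-hypothesis case. The corridor geometry, the sign-ambiguous distance sensor, and the particle count are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5000

# Prior: the target is somewhere along a corridor between -10 m and +10 m.
particles = rng.uniform(-10.0, 10.0, size=N)

# Hypothetical sensor reports distance from the origin, so the sign is ambiguous.
z, sigma = 5.0, 1.0
likelihood = np.exp(-0.5 * ((z - np.abs(particles)) / sigma) ** 2)
weights = likelihood / likelihood.sum()

# Multinomial resampling: draw particles in proportion to their weights.
idx = rng.choice(N, size=N, p=weights)
posterior = particles[idx]

frac_positive = float(np.mean(posterior > 0))
mean_abs = float(np.mean(np.abs(posterior)))
single_mean = float(posterior.mean())

print(f"share on the positive side: {frac_positive:.2f}")
print(f"mean distance from origin:  {mean_abs:.2f}")
print(f"single-Gaussian mean:       {single_mean:.2f}")
```

The resampled particles cluster near both +5 and -5. Summarizing them with a single mean lands near zero, where almost no particles are, which is why a Gaussian belief is a poor fit for this kind of problem.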
Multi-sensor fusion
Kalman filters shine when fusing heterogeneous sensors at different rates.
IMU accelerometers and gyroscopes update at 200 Hz; GPS arrives at 1 Hz;
wheel encoders tick every few milliseconds. The predict step runs at the
IMU rate, driven by IMU readings; GPS and encoder readings each trigger an update
with their own H and R matrices observing subsets
of the state. Mis-timed measurements (latency, clock skew) are a common
production bug — timestamp alignment matters as much as matrix tuning.
Sensor fusion is not magic averaging: the filter encodes
which sensors observe which state components and how trustworthy each is
under current conditions. A magnetometer update might be down-weighted near
power lines; visual odometry might be trusted only when feature count is
high. Those rules often appear as adaptive R inflation rather
than hard switches, preserving filter continuity.
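The multi-rate pattern can be sketched with a two-element state and two hypothetical sensors, each with its own H and R. The sensor names, rates, and noise values are assumptions; a real stack would also handle latency and out-of-order arrivals.

```python
import numpy as np


def predict(x, P, dt, q=0.5):
    """Constant-velocity prediction over an arbitrary time gap dt."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    return F @ x, F @ P @ F.T + Q


def update(x, P, z, H, R):
    """Generic linear update; H selects which state components are observed."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P


# Each sensor observes a different subset of the state [position, velocity].
SENSORS = {
    "gps": (np.array([[1.0, 0.0]]), np.array([[64.0]])),      # position, noisy
    "encoder": (np.array([[0.0, 1.0]]), np.array([[0.04]])),  # velocity, precise
}

# (timestamp_s, sensor, value): five fast encoder readings, then one GPS fix.
events = [(0.1 * k, "encoder", 5.0) for k in range(1, 6)] + [(1.0, "gps", 5.0)]

x, P, t = np.zeros(2), np.diag([100.0, 25.0]), 0.0
for stamp, name, value in sorted(events):
    x, P = predict(x, P, stamp - t)      # advance to the measurement time
    t = stamp
    H, R = SENSORS[name]
    x, P = update(x, P, np.array([value]), H, R)

print(np.round(x, 3), np.round(np.diag(P), 3))
```

Predicting to each measurement's own timestamp, rather than assuming a fixed step, is what lets sensors with different rates share one filter.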
Worked example — Harbor Courier fleet tracking
Harbor Courier operates 40 vans across a metro area. Dispatch needs smooth position, speed, and heading estimates to assign jobs and quote ETAs. Each van streams GPS (1 Hz, ~8 m horizontal error in good conditions), CAN-bus speed (10 Hz, low noise), and intermittent cell-tower fixes when GPS is lost downtown.
The team defines a six-dimensional state: local east/north position,
east/north velocity components, heading, and heading rate. A constant-turn-rate model
drives prediction at 10 Hz between speed updates. GPS updates observe
position with diagonal R scaled by reported accuracy; speed
updates observe velocity magnitude. When GPS accuracy blows past 50 m
(urban canyon), they inflate R rather than dropping fixes —
weak updates beat dead reckoning alone.
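A sketch of accuracy-scaled measurement noise follows. The 50 m threshold comes from the description above; treating the reported accuracy as a 1-sigma value, the inflation factor of 4, and the function names are assumptions made for illustration.

```python
import numpy as np


def gps_measurement_noise(accuracy_m, degrade_above_m=50.0, inflation=4.0):
    """Diagonal R for an east/north fix, inflated when accuracy is poor.

    accuracy_m is treated as a 1-sigma value (assumption); inflation is a
    made-up tuning factor, not a value from the fleet described here.
    """
    var = accuracy_m**2
    if accuracy_m > degrade_above_m:
        var *= inflation
    return np.diag([var, var])


def position_gain(P_pos, R):
    """Gain for a position-only update where H is the identity."""
    return P_pos @ np.linalg.inv(P_pos + R)


P_pos = np.diag([100.0, 100.0])   # assumed predicted position covariance (m^2)

K_good = position_gain(P_pos, gps_measurement_noise(8.0))
K_canyon = position_gain(P_pos, gps_measurement_noise(80.0))

print("gain with 8 m fix: ", round(float(K_good[0, 0]), 4))
print("gain with 80 m fix:", round(float(K_canyon[0, 0]), 4))
```

The degraded fix still contributes a small correction instead of being discarded, which keeps position error bounded during long urban-canyon stretches.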
Tuning started with Q derived from maximum plausible
acceleration and turn rate; the team then compared filter innovations
(measurement minus prediction) from field logs against theoretical chi-square bounds.
Innovations that were consistently too large signaled underestimated
process noise; jittery tracks signaled over-trusted GPS. After two weeks
of shadow mode, ETA error at the 15-minute horizon dropped 22% versus raw
GPS interpolation, with no extra hardware. Many
logistics platforms follow the same pattern before adding map-matching layers on top.
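The innovation check can be sketched as a normalized innovation squared (NIS) test. The example uses synthetic two-dimensional innovations, not Harbor Courier data, and a simplified criterion: the share of samples above the 95% chi-square point should stay near 5% when the filter's noise model is consistent.

```python
import numpy as np

CHI2_95_2DOF = 5.991  # 95% point of a chi-square with 2 degrees of freedom


def nis(innovations, S):
    """Normalized innovation squared, y^T S^-1 y, for each row of innovations."""
    S_inv = np.linalg.inv(S)
    return np.einsum("ni,ij,nj->n", innovations, S_inv, innovations)


def exceed_fraction(innovations, S):
    """Share of innovations whose NIS exceeds the 95% bound."""
    return float(np.mean(nis(innovations, S) > CHI2_95_2DOF))


rng = np.random.default_rng(1)

# Innovation covariance the filter believes in (assumed values, m^2).
S = np.array([[80.0, 5.0],
              [5.0, 80.0]])

# Synthetic logs: one set matches S, the other has four times the variance.
consistent = rng.multivariate_normal(np.zeros(2), S, size=2000)
too_large = rng.multivariate_normal(np.zeros(2), 4.0 * S, size=2000)

frac_ok = exceed_fraction(consistent, S)
frac_bad = exceed_fraction(too_large, S)

print(f"consistent filter: {frac_ok:.3f} of samples above bound")
print(f"underestimated noise: {frac_bad:.3f} of samples above bound")
```

A share far above 5% suggests Q or R is too small; a share far below suggests the filter is more pessimistic than it needs to be.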
Filter choice decision table
| Situation | Recommended approach | Why |
|---|---|---|
| Linear dynamics, Gaussian noise, real-time budget | Linear Kalman filter | Optimal, fast, easy to analyze |
| Mild nonlinearity, smooth Jacobians | Extended Kalman filter (EKF) | Industry default for navigation and robotics |
| Strong nonlinear observations, moderate state size | Unscented Kalman filter (UKF) | Better moment capture than EKF linearization |
| Multimodal beliefs, discrete hypotheses | Particle filter | Handles non-Gaussian, multi-peaked posteriors |
| No credible dynamics model, only history | ARIMA / ML forecasting | Kalman needs a state model; see time series guide |
| Discrete states only (idle, moving, fault) | Hidden Markov model | No continuous state to track; see Markov chains guide |
| Offline batch smoothing, all data available | Rauch-Tung-Striebel smoother | Uses future measurements to refine past estimates |
Common pitfalls
- Overconfident initialization: starting with near-zero covariance drives the gain toward zero, so the filter discounts early measurements and a wrong initial state persists; use loose priors until sensors agree.
- Mismatched units and frames — mixing degrees with radians or GPS WGS84 with local ENU without consistent transforms produces "mystery drift."
- Ignoring latency — applying a measurement as if it arrived at filter time when it was captured 500 ms ago smears turns; buffer and back-propagate or shift timestamps.
- Q and R copied from papers — noise matrices are vehicle-specific and environment-specific; tune on your logs or use adaptive estimation.
- EKF on highly nonlinear manifolds without care — attitude on SO(3) needs quaternion or error-state formulations; naive Euler-angle EKF gimbal-locks.
- Updating with correlated measurements twice: treating two sensors derived from the same GPS chip as independent counts one reading twice and makes the filter overconfident.
- No innovation monitoring — chi-square tests on innovations catch sensor failures and model breakdown early; filters without monitoring fail silently.
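The double-counting pitfall is easy to reproduce with a scalar filter. In this sketch the same reading is applied twice as if it came from two independent sensors; the numbers are arbitrary.

```python
def update(x, P, z, R):
    """Scalar measurement update returning the new mean and variance."""
    K = P / (P + R)
    return x + K * (z - x), (1.0 - K) * P


x0, P0 = 0.0, 4.0   # prior
z, R = 2.0, 4.0     # one reading and its variance

# Correct: the reading is used once.
x1, P1 = update(x0, P0, z, R)

# Incorrect: the same reading arrives via a second "sensor" and is applied again.
x2, P2 = update(x1, P1, z, R)

print(f"used once:  estimate={x1:.3f} variance={P1:.3f}")
print(f"used twice: estimate={x2:.3f} variance={P2:.3f}")
```

The second update shrinks the variance further even though no new information arrived, so the filter ends up more confident than the data justifies and discounts later, genuinely independent measurements.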
Practitioner checklist
- Define state vector, transition model, and which sensors observe which components.
- Document units, coordinate frames, and sampling rates for every input.
- Initialize mean and covariance with conservative (large) uncertainty.
- Derive or estimate Q from physical limits; set R from sensor datasheets, then refine on data.
- Run predict at the highest sensor rate; apply measurement updates as data arrives.
- Log innovations and compare to expected distributions; alert on sustained mismatch.
- Handle missing sensors by predict-only steps, not by repeating stale measurements.
- Validate on held-out trajectories with ground truth or high-grade reference sensors.
- Escalate to EKF/UKF only when linear filter innovation tests fail systematically.
- Document tuning parameters and revalidate after hardware or firmware changes.
Key takeaways
- Kalman filters recursively fuse a dynamics model with noisy measurements via predict-update cycles.
- The Kalman gain balances prediction uncertainty against measurement noise; tuning Q and R is the core engineering task.
- Linear Kalman filters are optimal for linear Gaussian systems; EKF and UKF extend to most navigation and robotics cases.
- Multi-rate, multi-sensor fusion is the killer application — IMU plus GPS plus odometry is the canonical stack.
- Innovation monitoring and timestamp hygiene separate production trackers from demo filters that look smooth until they diverge.
Related reading
- Markov chains explained — discrete state transitions and hidden Markov models
- Time series forecasting explained — ARIMA, seasonality, and forecast horizons without state models
- Bayesian inference explained — priors, posteriors, and uncertainty quantification
- Anomaly detection explained — spotting when sensor or process behavior diverges from norms