State Estimation using Moving Horizon Estimation and Particle Filtering

State Estimation using Moving Horizon Estimation and Particle Filtering
James B. Rawlings
Department of Chemical and Biological Engineering
UW Math Probability Seminar, Spring 2009

Outline
1 State Estimation of Linear Systems: Kalman Filter (KF), Least Squares (LS) Estimation, Connections between KF and LS, Limitations of These Approaches
2 Moving Horizon Estimation (MHE)
3 Particle Filtering
4 Combining Particle Filtering and MHE
5 Conclusions

The conditional density function
For the linear, time-invariant model with Gaussian noise,
$x(k+1) = A x(k) + G w(k)$
$y(k) = C x(k) + v(k)$
$w \sim N(0, Q), \quad v \sim N(0, R), \quad x(0) \sim N(\bar{x}_0, Q_0)$
we can compute the conditional density function exactly:
$p_{x|Y}(x \mid Y(k-1)) = N(\hat{x}^-, P^-)$  (before $y(k)$)
$p_{x|Y}(x \mid Y(k)) = N(\hat{x}, P)$  (after $y(k)$)

Mean and covariance before and after measurement
Forecast:
$\hat{x}^-(k+1) = A \hat{x}(k)$  (estimate)
$P^-(k+1) = A P(k) A^T + G Q G^T$  (covariance)
$\hat{x}^-(0) = \bar{x}_0, \quad P^-(0) = Q_0$  (initial condition)
Correction:
$\hat{x} = \hat{x}^- + L (y - C \hat{x}^-)$  (estimate)
$L = P^- C^T (R + C P^- C^T)^{-1}$  (gain)
$P = P^- - L C P^-$  (covariance)
This result is the celebrated Kalman filter.
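To make the forecast/correction recursion concrete, here is a minimal numpy sketch of one Kalman filter step. The system matrices, prior, and measurement value are illustrative choices (not from the talk); the loop over R previews the large/medium/small-R cases pictured next.

```python
import numpy as np

# One forecast/correction step of the Kalman filter for an illustrative
# 2-state system (all numbers are made up for illustration).
A = np.array([[1.0, 0.5], [0.0, 1.0]])
G = np.eye(2)
C = np.array([[1.0, 0.0]])          # y measures x_1 only
Q = 0.1 * np.eye(2)

xhat, P = np.array([0.0, 2.0]), np.eye(2)   # x-hat(0), P(0)

# Forecast to k = 1
xhat_m = A @ xhat                   # x-hat^-(1) = A x-hat(0)
P_m = A @ P @ A.T + G @ Q @ G.T     # P^-(1)

y = np.array([4.0])                 # measurement at k = 1

for R in (100.0, 1.0, 0.01):        # large, medium, small measurement noise
    L = P_m @ C.T @ np.linalg.inv(C @ P_m @ C.T + R * np.eye(1))  # gain
    xhat1 = xhat_m + L @ (y - C @ xhat_m)                         # correction
    print(f"R = {R:6.2f}  xhat(1) = {np.round(xhat1, 3)}")
# Large R: the estimate stays near the forecast; small R: the measured
# component snaps to y while the unmeasured component is adjusted through
# the correlation carried in P^-.
```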

Effect of the measurement noise covariance R on the correction (figures: the prior estimate $x(0)$, the forecast $x(1)$, and the measurement $y(1)$ plotted in the $(x_1, x_2)$ plane):
Large R: ignore the measurement, trust the model forecast
Medium R: blend the measurement and the forecast
Small R: trust the measurement, override the forecast
Large R, y measures $x_1$ only
Medium R, y measures $x_1$ only
Small R, y measures $x_1$ only

Least Squares (LS) Estimation
"One of the most important problems in the application of mathematics to the natural sciences is to choose the best of these many combinations, i.e., the combination that yields values of the unknowns that are least subject to errors."
Theory of the Combination of Observations Least Subject to Errors, C. F. Gauss, 1821. G. W. Stewart translation, 1995, p. 31.

LS Formulation for Unconstrained Linear Systems
Recall the unconstrained linear state space model
$x(k+1) = A x(k) + w(k)$
$y(k) = C x(k) + v(k)$
The state estimation problem is formulated as a deterministic LS optimization problem
$\min_{x(0), \ldots, x(T)} \Phi(X(T))$

LS Formulation: Objective Function
A reasonably flexible choice of objective function is
$\Phi(X(T)) = \|x(0) - \bar{x}(0)\|^2_{Q(0)^{-1}} + \sum_{k=0}^{T} \|y(k) - C x(k)\|^2_{R^{-1}} + \sum_{k=0}^{T-1} \|x(k+1) - A x(k)\|^2_{Q^{-1}}$
Heuristic selection of Q and R:
$Q \gg R$: trust the measurement, override the model forecast
$R \gg Q$: ignore the measurement, trust the model forecast

Solution of the LS Problem by Forward Dynamic Programming
Step 1: Adding the measurement at time k
$P(k) = P^-(k) - P^-(k) C^T (C P^-(k) C^T + R)^{-1} C P^-(k)$  (covariance)
$L(k) = P^-(k) C^T (C P^-(k) C^T + R)^{-1}$  (gain)
$\hat{x}(k) = \hat{x}^-(k) + L(k) (y(k) - C \hat{x}^-(k))$  (estimate)
Step 2: Propagating the model to time k + 1
$\hat{x}^-(k+1) = A \hat{x}(k)$  (estimate)
$P^-(k+1) = Q + A P(k) A^T$  (covariance)
$(\hat{x}^-(0), P^-(0)) = (\bar{x}(0), Q(0))$  (initial condition)
Same result as the KF!
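As a check on the "same result as the KF" claim, the following sketch (with made-up system matrices and simulated data) solves the batch least-squares problem over the whole trajectory and compares its estimate of x(T) with the Kalman filter's filtered estimate; the two should agree to within numerical round-off.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear system x(k+1) = A x(k) + w(k),  y(k) = C x(k) + v(k)  (illustrative)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.05 * np.eye(2)           # process-noise covariance
R = np.array([[0.1]])          # measurement-noise covariance
Q0 = np.eye(2)                 # prior covariance of x(0)
x0bar = np.array([1.0, -1.0])  # prior mean of x(0)

# Simulate a data record y(0), ..., y(T)
T = 20
x = rng.multivariate_normal(x0bar, Q0)
Y = []
for k in range(T + 1):
    Y.append(C @ x + rng.multivariate_normal(np.zeros(1), R))
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)

# Kalman filter: measurement update (Step 1), then time update (Step 2)
xhat_m, P_m = x0bar.copy(), Q0.copy()          # x^-(0), P^-(0)
for k in range(T + 1):
    L = P_m @ C.T @ np.linalg.inv(C @ P_m @ C.T + R)
    xhat = xhat_m + L @ (Y[k] - C @ xhat_m)
    P = P_m - L @ C @ P_m
    xhat_m, P_m = A @ xhat, A @ P @ A.T + Q    # forecast to k + 1

# Batch least squares over the whole trajectory x(0), ..., x(T)
def isqrt(M):
    """W with W.T @ W = inv(M), so that ||W r||^2 = r' inv(M) r."""
    return np.linalg.cholesky(np.linalg.inv(M)).T

n, p = 2, 1
H = np.zeros((n + p * (T + 1) + n * T, n * (T + 1)))
b = np.zeros(H.shape[0])
Wq0, Wr, Wq = isqrt(Q0), isqrt(R), isqrt(Q)
r = 0
H[r:r + n, 0:n] = Wq0; b[r:r + n] = Wq0 @ x0bar; r += n        # prior term
for k in range(T + 1):                                          # measurement terms
    H[r:r + p, k * n:(k + 1) * n] = Wr @ C
    b[r:r + p] = Wr @ Y[k]
    r += p
for k in range(T):                                              # model terms
    H[r:r + n, (k + 1) * n:(k + 2) * n] = Wq
    H[r:r + n, k * n:(k + 1) * n] = -Wq @ A
    r += n
X = np.linalg.lstsq(H, b, rcond=None)[0]

print("KF filtered x(T):", xhat)       # filtered estimate after y(T)
print("batch LS   x(T):", X[-n:])      # last state in the LS trajectory
```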

Probabilistic Estimation versus Least Squares
"The recursive least squares approach was actually inspired by probabilistic results that automatically produce an equation of evolution for the estimate (the conditional mean). In fact, much of the recent least squares work did nothing more than rederive the probabilistic results (perhaps in an attempt to understand them). As a result, much of the least squares work contributes very little to estimation theory."
A. H. Jazwinski, Stochastic Processes and Filtering Theory (1970)

Probability and Estimation
Probabilistic justification of least squares estimation:
Either an arbitrary assumption on errors: normal distribution (Gauss).
Or an asymptotic argument: the central limit theorem, which justifies least squares for infinitely large data sets regardless of the error distribution (Laplace).
"The connection of probability theory with a special problem in the combination of observations was made by Laplace in 1774... Laplace's work described above was the beginning of a game of intellectual leapfrog between Gauss and Laplace that spanned several decades, and it is not easy to untangle their relative contributions. The problem is complicated by the fact that the two men are at extremes stylistically. Laplace is slapdash and lacks rigor, even by the standards of the time, while Gauss is reserved, often to the point of obscurity. Neither is easy to read."
G. W. Stewart, 1995, p. 214

Kalman Filter and Least Squares: Comparison
Kalman Filter (probabilistic):
Offers more insight for comparing different state estimators, based on the variance of their estimate error
Choice of Q and R is part of the model
Superior for unconstrained linear systems
Least Squares:
Objective function, although reasonable, is ad hoc
Choice of Q and R is arbitrary
Advantageous for significantly more complex models

Deterministic Stability of the State Estimator
Asymptotic stability of the state estimator: the estimator is asymptotically stable in the sense of an observer if it is able to recover from an incorrect initial state estimate as data with no measurement noise are collected.
(Figure: estimator trajectory converging to the system trajectory over time k, starting from an incorrect initial estimate of x.)
For example: assume an incorrect initial estimate; the estimator converges (asymptotically) to the correct value.
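A small numerical illustration of this observer-sense stability, under assumed (illustrative) system matrices: the filter is started from a deliberately wrong initial estimate, fed noise-free data, and its estimation error decays toward zero.

```python
import numpy as np

# Noise-free data, deliberately wrong initial estimate: the Kalman filter
# (used as an observer) recovers the true state asymptotically.
A = np.array([[0.95, 0.1], [0.0, 0.9]])
C = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.01]])   # tuning only; no noise is injected

x_true = np.array([5.0, -3.0])
xhat_m, P_m = np.zeros(2), 10.0 * np.eye(2)   # poor initial estimate

for k in range(40):
    y = C @ x_true                             # noise-free measurement
    L = P_m @ C.T @ np.linalg.inv(C @ P_m @ C.T + R)
    xhat = xhat_m + L @ (y - C @ xhat_m)       # correction
    P = P_m - L @ C @ P_m
    x_true = A @ x_true                        # noise-free system evolution
    xhat_m, P_m = A @ xhat, A @ P @ A.T + Q    # forecast
    if k % 10 == 0:
        print(k, np.linalg.norm(x_true - xhat_m))
# The estimation error decays toward zero: asymptotic stability in the
# observer sense described above.
```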

State estimation: probabilistic optimality versus stability
"As Kalman has often stressed, the major contribution of his work is not perhaps the actual filter algorithm, elegant and useful as it no doubt is, but the proof that under certain technical conditions called controllability and observability, the optimum filter is stable in the sense that the effects of initial errors and round-off and other computational errors will die out asymptotically."
T. Kailath, 1974

Limitations of These Approaches
What about constraints?
Concentrations, particle size distributions, pressures, and temperatures are positive. Using this extra information provides more accurate estimates. Projecting the unconstrained KF estimates onto the feasible region is an ad hoc solution that does not satisfy the model.
What about nonlinear models?
Almost all physical models in chemical and biological applications are nonlinear differential equations or nonlinear Markov processes. Linearizing the nonlinear model and using the standard update formulas (the extended Kalman filter) is the standard industrial approach.

Extended Kalman filter assessment
"The extended Kalman filter is probably the most widely used estimation algorithm for nonlinear systems. However, more than 35 years of experience in the estimation community has shown that it is difficult to implement, difficult to tune, and only reliable for systems that are almost linear on the time scale of the updates. Many of these difficulties arise from its use of linearization."
Julier and Uhlmann (2004)

Alternative approaches
Options for handling constraints and nonlinearity in state estimation:
1 Optimization (moving horizon estimation, MHE)
2 Sampling (particle filtering)

Full information estimation
Nonlinear model, Gaussian noise:
$x(k+1) = F(x, u) + G(x, u) w$
$y = h(x) + v$
The trajectory of states: $X(T) := \{x(0), \ldots, x(T)\}$
Maximizing the conditional density function:
$\max_{X(T)} p_{X|Y}(X(T) \mid Y(T))$

Equivalent optimization problem
Using the model and taking logarithms,
$\min_{X(T)} \; V_0(x_0) + \sum_{j=0}^{T-1} L_w(w_j) + \sum_{j=0}^{T} L_v(y_j - h(x_j))$
subject to $x(j+1) = F(x, u) + w$ (taking $G(x, u) = I$), with
$V_0(x) := -\log p_{x_0}(x), \quad L_w(w) := -\log p_w(w), \quad L_v(v) := -\log p_v(v)$
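For the Gaussian-noise case these stage costs are simply weighted quadratics; the short sketch below spells that out with illustrative covariances (the additive normalizing constants are dropped since they do not affect the minimizer).

```python
import numpy as np

# Stage costs for Gaussian noise (normalizing constants dropped).
# Covariances and prior are illustrative values, not from the talk.
Q = 0.01 * np.eye(2)           # process-noise covariance
R = np.array([[1e-4]])         # measurement-noise covariance
P0 = np.eye(2)                 # prior covariance of x(0)
x0bar = np.array([3.0, 1.0])   # prior mean of x(0)
Qinv, Rinv, P0inv = np.linalg.inv(Q), np.linalg.inv(R), np.linalg.inv(P0)

def V0(x):                     # -log p_x0(x) + const
    e = x - x0bar
    return 0.5 * e @ P0inv @ e

def Lw(w):                     # -log p_w(w) + const
    return 0.5 * w @ Qinv @ w

def Lv(v):                     # -log p_v(v) + const
    v = np.atleast_1d(v)
    return 0.5 * v @ Rinv @ v
```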

Arrival cost and moving horizon estimation
Most recent N states: $X(T-N : T) := \{x_{T-N}, \ldots, x_T\}$
Optimization problem:
$\min_{X(T-N:T)} \; \underbrace{V_{T-N}(x_{T-N})}_{\text{arrival cost}} + \sum_{j=T-N}^{T-1} L_w(w_j) + \sum_{j=T-N}^{T} L_v(y_j - h(x_j))$
subject to $x(j+1) = F(x, u) + w$.
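A minimal moving-horizon estimation sketch, assuming a scalar nonlinear model, illustrative noise levels, and the simplest possible arrival-cost treatment (a fixed quadratic prior weight; the arrival-cost approximation slides that follow list better choices). The decision variables are the states in the window, so a nonnegativity constraint can be imposed directly as bounds.

```python
import numpy as np
from scipy.optimize import minimize

# MHE for a scalar nonlinear model x(k+1) = f(x(k)) + w(k), y(k) = x(k) + v(k),
# with x >= 0 enforced as a bound. All numbers are illustrative.
rng = np.random.default_rng(1)
f = lambda x: x + 0.1 * (2.0 * x - 0.5 * x**2)   # discretized logistic growth
Qw, Rv = 0.02, 0.05                              # noise variances

# Simulate a data record
T = 40
x, Y = 0.3, []
for k in range(T + 1):
    Y.append(x + np.sqrt(Rv) * rng.standard_normal())
    x = max(f(x) + np.sqrt(Qw) * rng.standard_normal(), 0.0)

N = 10                                           # horizon length
xprior, Pprior = 1.0, 1.0                        # fixed prior weight for x(T-N)

def cost(z):
    # z = [x(T-N), ..., x(T)]; w(j) = x(j+1) - f(x(j)) is implied by the model
    J = 0.5 * (z[0] - xprior) ** 2 / Pprior      # arrival-cost approximation
    w = z[1:] - f(z[:-1])
    J += 0.5 * np.sum(w**2) / Qw                 # sum of L_w terms
    v = np.array(Y[T - N:T + 1]) - z
    J += 0.5 * np.sum(v**2) / Rv                 # sum of L_v terms
    return J

z0 = np.array(Y[T - N:T + 1]).clip(min=0.0)      # initialize at the measurements
res = minimize(cost, z0, method="L-BFGS-B", bounds=[(0.0, None)] * (N + 1))
print("MHE estimate of x(T):", res.x[-1])
```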

Arrival cost and moving horizon
(Figure: measurements $y_k$ versus time; the last N samples, from $T-N$ to $T$, form the moving horizon, the earlier data are summarized by the arrival cost $V_{T-N}(x)$, and full information spans the entire record.)

Arrival cost approximation
The statistically correct choice for the arrival cost is the conditional density of $x_{T-N}$ given $Y(T-N-1)$:
$V_{T-N}(x) = -\log p_{x_{T-N}|Y}(x \mid Y(T-N-1))$
Arrival cost approximations (Rao et al., 2003):
uniform prior (and large N)
EKF covariance formula
MHE smoothing

The challenge of nonlinear estimation
Linear estimation possibilities:
1 one state is the optimal estimate
2 infinitely many states are optimal estimates (unobservable)
Nonlinear estimation possibilities:
1 one state is the optimal estimate
2 infinitely many states are optimal estimates (unobservable)
3 finitely many states are locally optimal estimates

Particle filtering: sampled densities
$p_s(x) = \sum_{i=1}^{s} w_i \, \delta(x - x_i)$, with $x_i$ the samples (particles) and $w_i$ the weights.
(Figure: exact density $p(x)$ and a sampled density $p_s(x)$ with five samples for $\xi \sim N(0, 1)$.)

Convergence of cumulative distributions
(Figure: the corresponding exact $P(x)$ and sampled $P_s(x)$ cumulative distributions.)
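A quick sketch of this convergence for the standard normal used in the figures: draw s equally weighted samples and compare the sampled cumulative distribution with the exact one at a few test points (sample sizes and test points are arbitrary).

```python
import numpy as np
from math import erf

rng = np.random.default_rng(2)

def P_exact(x):                      # N(0, 1) cumulative distribution
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

for s in (5, 50, 5000):              # sample sizes
    xi = rng.standard_normal(s)      # samples (particles), equal weights 1/s
    # P_s(x) = total weight of samples <= x; compare at a few test points
    err = max(abs(np.mean(xi <= x) - P_exact(x)) for x in (-1.0, 0.0, 1.0))
    print(f"s = {s:5d}   max CDF error at test points: {err:.3f}")
# The sampled cumulative distribution converges to the exact one as s grows.
```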

Importance sampling
In state estimation, the density p of interest is easy to evaluate but difficult to sample. We choose an importance function, q, instead.
When we can sample p, the sampled density is
$p_s = \left\{ x_i, \; w_i = \tfrac{1}{s} \right\}, \quad x_i \sim p(x)$
When we cannot sample p, the importance-sampled density $\tilde{p}_s(x)$ is
$\tilde{p}_s = \left\{ x_i, \; w_i = \tfrac{1}{s} \tfrac{p(x_i)}{q(x_i)} \right\}, \quad x_i \sim q(x)$
Both $p_s(x)$ and $\tilde{p}_s(x)$ are unbiased and converge to $p(x)$ as the sample size increases (Smith and Gelfand, 1992).
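A minimal importance-sampling sketch: the target density p (here an arbitrary bimodal mixture standing in for a hard-to-sample posterior) is only evaluated, samples are drawn from a wide normal importance function q, and the p/q weights are normalized before use (the usual practical normalization of the 1/s · p/q weights above).

```python
import numpy as np

rng = np.random.default_rng(3)
q_sigma = 4.0

def p(x):   # target density: equal mixture of N(2, 1) and N(-2, 1) (illustrative)
    return 0.5 * (np.exp(-0.5 * (x - 2)**2) + np.exp(-0.5 * (x + 2)**2)) / np.sqrt(2 * np.pi)

def q(x):   # importance function: N(0, q_sigma^2), easy to sample
    return np.exp(-0.5 * (x / q_sigma)**2) / (q_sigma * np.sqrt(2 * np.pi))

s = 10_000
xi = q_sigma * rng.standard_normal(s)     # samples drawn from q
wi = p(xi) / q(xi)                        # unnormalized importance weights
wi /= wi.sum()                            # normalize so the weights sum to one

# Use the weighted samples {x_i, w_i} exactly like an ordinary sampled density:
print("importance-sampled E[x]  :", np.sum(wi * xi))      # true mean is 0
print("importance-sampled E[x^2]:", np.sum(wi * xi**2))   # true value is 5
```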

Importance sampled particle filter (Arulampalam et al., 2002)
$p(x(k+1) \mid Y(k+1)) \approx \{ x_i(k+1), \; w_i(k+1) \}$
$x_i(k+1)$ is a sample of $q(x(k+1) \mid x_i(k), y(k+1))$
$w_i(k+1) = w_i(k) \, \dfrac{p(y(k+1) \mid x_i(k+1)) \, p(x_i(k+1) \mid x_i(k))}{q(x_i(k+1) \mid x_i(k), y(k+1))}$
The importance sampled particle filter converges to the conditional density with increasing sample size. It is biased for finite sample size.
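One step of this weight update in code, for a scalar linear-Gaussian toy problem, with the transition density p(x(k+1) | x_i(k)) chosen as the importance function q; in that case the prior and proposal factors cancel and only the measurement likelihood remains. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
a, c = 0.9, 1.0        # scalar model x(k+1) = a x(k) + w, y = c x + v
Qw, Rv = 0.1, 0.05     # noise variances

s = 500
x = rng.standard_normal(s)            # particles x_i(k)
w = np.full(s, 1.0 / s)               # weights w_i(k)

y_next = 1.3                          # measurement y(k+1)

# Sample x_i(k+1) from q = p(x(k+1) | x_i(k)), the transition density
x_next = a * x + np.sqrt(Qw) * rng.standard_normal(s)

# Weight update: w_i(k+1) proportional to
#   w_i(k) * p(y | x_i(k+1)) * p(x_i(k+1) | x_i(k)) / q(x_i(k+1) | x_i(k), y)
# With q equal to the transition density, the last two factors cancel.
lik = np.exp(-0.5 * (y_next - c * x_next) ** 2 / Rv) / np.sqrt(2 * np.pi * Rv)
w_next = w * lik
w_next /= w_next.sum()

print("posterior mean estimate:", np.sum(w_next * x_next))
```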

Research challenge: placing the particles
Optimal importance function (Doucet et al., 2000); restricted to linear measurements $y = Cx + v$
Resampling
Curse of dimensionality

Optimal importance function
(Figure: particle locations versus time, k = 0 to 5, in the $(x_1, x_2)$ plane using the optimal importance function; 250 particles. Ellipses show the 95% contour of the true conditional densities before and after measurement.)

Resampling
(Figure: particle locations versus time using the optimal importance function with resampling; 250 particles.)
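The talk does not say which resampling scheme was used; the sketch below implements one common choice, systematic resampling, which replaces the weighted particle set by an equally weighted one in which high-weight particles are duplicated and negligible-weight particles are dropped.

```python
import numpy as np

def systematic_resample(x, w, rng):
    """Return an equally weighted particle set drawn from {x_i, w_i}."""
    s = len(w)
    positions = (rng.random() + np.arange(s)) / s   # stratified positions in [0, 1)
    cumw = np.cumsum(w)
    cumw[-1] = 1.0                                   # guard against round-off
    idx = np.searchsorted(cumw, positions)           # particle hit by each position
    return x[idx], np.full(s, 1.0 / s)

rng = np.random.default_rng(5)
x = np.array([0.0, 1.0, 2.0, 3.0])
w = np.array([0.05, 0.05, 0.1, 0.8])                 # one dominant particle
x_new, w_new = systematic_resample(x, w, rng)
print(x_new)    # mostly copies of the high-weight particle at x = 3.0
```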

The MHE and particle filtering hybrid approach
Hybrid implementation:
Use the MHE optimization to locate/relocate the samples
Use the PF to obtain fast state estimates between MHE optimizations

Application: Semi-Batch Reactor
Reaction: 2A → B, with rate constant k = 0.16
Measurement: $y = C_A + C_B$; true initial state $x_0 = [3 \;\; 1]^T$; sample time $\Delta t = 0.1$
(Figure: reactor containing A and B in volume V, fed at rate $F_i$ with inlet concentrations $C_{A0}$, $C_{B0}$.)
$\dfrac{dC_A}{dt} = -2 k C_A^2 + \dfrac{F_i}{V} C_{A0}$
$\dfrac{dC_B}{dt} = k C_A^2 + \dfrac{F_i}{V} C_{B0}$
Noise covariances: $Q_w = \mathrm{diag}(0.01^2, 0.01^2)$ and $R_v = 0.01^2$
Poor prior: $\bar{x}_0 = [0.1 \;\; 4.5]^T$ with a large $P_0$
Unmodelled disturbance: $C_{A0}$, $C_{B0}$ pulsed at $t_k = 5$
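A simulation sketch of this reactor model. The rate constant, sample time, initial state, and noise covariances are the slide's values; the feed term $(F_i/V)\,[C_{A0}, C_{B0}]$ is not given numerically here, so the value below is an assumption, as is the simple Euler integration between samples.

```python
import numpy as np

rng = np.random.default_rng(6)
k_rate = 0.16
dt = 0.1                                 # sample time between measurements
feed = np.array([0.1, 0.0])              # assumed F_i/V * [C_A0, C_B0]
Qw = np.diag([0.01**2, 0.01**2])         # process-noise covariance (slide value)
Rv = 0.01**2                             # measurement-noise variance (slide value)

def f(x):                                # dx/dt for x = [C_A, C_B]
    ca, cb = x
    return np.array([-2 * k_rate * ca**2, k_rate * ca**2]) + feed

def step(x, dt, nsub=10):                # explicit Euler over one sample interval
    for _ in range(nsub):
        x = x + (dt / nsub) * f(x)
    return np.maximum(x, 0.0)            # concentrations stay nonnegative

x = np.array([3.0, 1.0])                 # true initial state
Y = []
for kk in range(120):                    # 12 time units of data
    x = step(x, dt) + rng.multivariate_normal(np.zeros(2), Qw)
    Y.append(x.sum() + np.sqrt(Rv) * rng.standard_normal())   # y = C_A + C_B + v
```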

Using only MHE
MHE implemented with N = 15 ($t = 1.5$) and a smoothed prior.
MHE recovers robustly from poor priors and unmodelled disturbances.
(Figure: concentrations $C_A$, $C_B$, MHE estimates, and measurement y versus time.)

Using only the particle filter
Particle filter implemented with the optimal importance function $p(x_k \mid x_{k-1}, y_k)$, 50 samples, and resampling.
The PF samples never recover from a poor $\bar{x}_0$: not robust.
(Figure: concentrations, PF estimates, and measurement versus time.)

MHE/PF hybrid with a simple importance function
Importance function for the PF: $p(x_k \mid x_{k-1})$, 50 samples.
The PF samples recover from a poor $\bar{x}_0$ and the unmodelled disturbance only after MHE relocates the samples.
(Figure: concentrations, estimates, and measurement versus time.)

MHE/PF hybrid with the optimal importance function
The optimal importance function: $p(x_k \mid x_{k-1}, y_k)$, 50 samples.
MHE relocates the samples after a poor $\bar{x}_0$, but the samples recover from the unmodelled disturbance without needing MHE.
(Figure: concentrations, estimates, and measurement versus time.)

Conclusions
Optimal state estimation of the linear dynamic system is the gold standard of state estimation.
MHE is a good option for linear, constrained systems.
The classic solution for nonlinear systems, the EKF, has been superseded.
MHE and particle filtering are higher-quality solutions for nonlinear models.
MHE is robust to modeling errors but requires an online optimization.
PF is simple to program and fast to execute but may be sensitive to model errors.
Hybrid MHE/PF methods can combine these complementary strengths.

Future challenges
Process systems are typically unobservable or ill-conditioned, i.e., nearby measurements do not imply nearby states.
We must decide on the subset of states to reconstruct from the data: an additional part of the modeling question.
Nonlinear systems produce multi-modal densities. We need better solutions for handling these multi-modal densities in real time.

Acknowledgments
Professor Bhavik Bakshi of OSU for helpful discussions.
NSF grant #CNS 0540147
PRF grant #43321 AC9

Further Reading
M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Process., 50(2):174-188, February 2002.
A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Stat. and Comput., 10:197-208, 2000.
S. J. Julier and J. K. Uhlmann. Unscented filtering and nonlinear estimation. Proc. IEEE, 92(3):401-422, March 2004.
C. V. Rao, J. B. Rawlings, and D. Q. Mayne. Constrained state estimation for nonlinear discrete-time systems: stability and moving horizon approximations. IEEE Trans. Auto. Cont., 48(2):246-258, February 2003.
A. F. M. Smith and A. E. Gelfand. Bayesian statistics without tears: A sampling-resampling perspective. Amer. Statist., 46(2):84-88, 1992.