Nonlinear and/or Non-normal Filtering

Jesús Fernández-Villaverde
University of Pennsylvania

Motivation

Nonlinear and/or non-Gaussian filtering, smoothing, and forecasting (NLGF) problems are pervasive in economics.

- Macroeconomics: evaluating the likelihood of DSGE models.
- Finance: time-varying variance of time series.

However, NLGF is a complicated endeavor with no simple and exact algorithm.

Environment

Discrete time $t \in \{1, 2, \ldots\}$.

States $S_t$.

The initial state $S_0$ is either known or it comes from $p(S_0; \gamma)$.

Properties of $p(S_0; \gamma)$? Stationarity?

Nonlinear and/or Non-Gaussian State Space Representations

Transition equation:
$$S_t = f(S_{t-1}, W_t; \gamma)$$

Measurement equation:
$$Y_t = g(S_t, V_t; \gamma)$$

Interpretation.
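
To fix ideas, here is a minimal Python sketch that simulates one such state space model. The particular choices of $f$ and $g$ (an AR(1) log-volatility state with a measurement equation that is nonlinear in the state) and the Gaussian shocks are purely illustrative, not taken from the slides.

```python
import numpy as np

def simulate_state_space(T, gamma, rng=None):
    """Simulate S_t = f(S_{t-1}, W_t; gamma), Y_t = g(S_t, V_t; gamma).

    Illustrative choice: an AR(1) log-volatility state with Gaussian
    transition shocks and a measurement equation that is nonlinear in S_t.
    """
    rng = np.random.default_rng(rng)
    rho, sigma_w, sigma_v = gamma          # parameters gathered in gamma
    S = np.zeros(T + 1)
    Y = np.zeros(T)
    S[0] = rng.normal(0.0, sigma_w / np.sqrt(1 - rho**2))  # draw S_0 from p(S_0; gamma)
    for t in range(1, T + 1):
        W_t = rng.normal(0.0, sigma_w)
        V_t = rng.normal(0.0, sigma_v)
        S[t] = rho * S[t - 1] + W_t                  # transition: f(S_{t-1}, W_t; gamma)
        Y[t - 1] = np.exp(S[t] / 2.0) * V_t          # measurement: g(S_t, V_t; gamma)
    return S, Y

S, Y = simulate_state_space(T=200, gamma=(0.95, 0.2, 1.0), rng=0)
```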

The shocks $\{W_t\}$ and $\{V_t\}$ are independent of each other.

$W_t$ and $V_t$ have zero mean.

The variance of $W_t$ is given by $Q_t$ and the variance of $V_t$ by $R_t$.

Generalizations?

Conditional Densities

From $S_t = f(S_{t-1}, W_t; \gamma)$, we can compute $p(S_t \mid S_{t-1}; \gamma)$.

From $Y_t = g(S_t, V_t; \gamma)$, we can compute $p(Y_t \mid S_t; \gamma)$.

From $S_t = f(S_{t-1}, W_t; \gamma)$ and $Y_t = g(S_t, V_t; \gamma)$, we have:
$$Y_t = g(f(S_{t-1}, W_t; \gamma), V_t; \gamma)$$
and hence we can compute $p(Y_t \mid S_{t-1}; \gamma)$.

Filtering, Smoothing, and Forecasting

Filtering: we are concerned with what we have learned up to the current observation.

Smoothing: we are concerned with what we learn from the full sample.

Forecasting: we are concerned with future realizations.

Goals of Filtering I

Compute the conditional densities $p(s_t \mid y^{t-1}; \gamma)$ and $p(s_t \mid y^t; \gamma)$.

Why?

1. They allow probability statements regarding the situation of the system.
2. Compute conditional moments: means $s_{t|t}$ and $s_{t|t-1}$ and variances $P_{t|t}$ and $P_{t|t-1}$.
3. Other functions of the states.

Goals of Filtering II

Compute the likelihood function of a sequence of realizations of the observable $y^T$ at a particular parameter value $\gamma$:
$$p(y^T; \gamma)$$

Given the Markov structure of our state space representation, we can factorize the likelihood function as
$$p(y^T; \gamma) = \prod_{t=1}^{T} p(y_t \mid y^{t-1}; \gamma)$$

Goals of Filtering III

Then,
$$p(y^T; \gamma) = \prod_{t=1}^{T} p(y_t \mid y^{t-1}; \gamma) = \int p(y_1 \mid s_0; \gamma)\, p(s_0; \gamma)\, ds_0 \prod_{t=2}^{T} \int p(y_t \mid s_t; \gamma)\, p(s_t \mid y^{t-1}; \gamma)\, ds_t$$

Hence, knowledge of $\left\{ p(s_t \mid y^{t-1}; \gamma) \right\}_{t=1}^{T}$ and $p(s_0; \gamma)$ allows the evaluation of the likelihood of the model.
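
As a hedged illustration of how this factorization is used in practice, the sketch below accumulates the log likelihood from Monte Carlo approximations of the period-by-period integrals. The names predictive_draws and measurement_density are hypothetical placeholders for whatever filter supplies the predictive densities.

```python
import numpy as np

def log_likelihood(y, predictive_draws, measurement_density, gamma):
    """Accumulate log p(y^T; gamma) = sum_t log p(y_t | y^{t-1}; gamma).

    predictive_draws[t] : array of draws approximating p(s_t | y^{t-1}; gamma)
    measurement_density(y_t, s, gamma) : evaluates p(y_t | s_t; gamma) at each draw
    """
    loglik = 0.0
    for t, y_t in enumerate(y):
        # Monte Carlo approximation of the integral
        # p(y_t | y^{t-1}; gamma) = int p(y_t | s_t; gamma) p(s_t | y^{t-1}; gamma) ds_t
        p_yt = np.mean(measurement_density(y_t, predictive_draws[t], gamma))
        loglik += np.log(p_yt)
    return loglik
```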

Two Fundamental Tools

1. Chapman-Kolmogorov equation:
$$p(s_t \mid y^{t-1}; \gamma) = \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1}$$

2. Bayes' theorem:
$$p(s_t \mid y^t; \gamma) = \frac{p(y_t \mid s_t; \gamma)\, p(s_t \mid y^{t-1}; \gamma)}{p(y_t \mid y^{t-1}; \gamma)}$$
where
$$p(y_t \mid y^{t-1}; \gamma) = \int p(y_t \mid s_t; \gamma)\, p(s_t \mid y^{t-1}; \gamma)\, ds_t$$
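
Both tools become exact finite sums when the state takes only $N$ values. The sketch below, with an illustrative transition matrix and emission vector supplied by the user, shows one prediction-update cycle in that finite-state case.

```python
import numpy as np

def predict_update(prior, P_transition, likelihood_yt):
    """One filtering step for a finite-state Markov chain.

    prior          : p(s_{t-1} = i | y^{t-1}), shape (N,)
    P_transition   : P[i, j] = p(s_t = j | s_{t-1} = i), shape (N, N)
    likelihood_yt  : p(y_t | s_t = j), shape (N,)
    """
    # Chapman-Kolmogorov: p(s_t | y^{t-1}) = sum_i p(s_t | s_{t-1}=i) p(s_{t-1}=i | y^{t-1})
    predicted = prior @ P_transition
    # Bayes' theorem: p(s_t | y^t) is proportional to p(y_t | s_t) p(s_t | y^{t-1})
    unnormalized = likelihood_yt * predicted
    p_yt = unnormalized.sum()            # p(y_t | y^{t-1}), the likelihood contribution
    return unnormalized / p_yt, p_yt
```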

Interpretation

1. The Chapman-Kolmogorov equation is a one-step-ahead predictor.
2. Bayes' theorem updates the conditional density of the states given the new observation.

Recursion for $p(s_t \mid y^t; \gamma)$

Combining the Chapman-Kolmogorov equation and Bayes' theorem:
$$p(s_t \mid y^t; \gamma) = \frac{p(y_t \mid s_t; \gamma) \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1}}{\int \left\{ \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1} \right\} p(y_t \mid s_t; \gamma)\, ds_t}$$

To initiate the recursion, we only need a value for $s_0$ or a density $p(s_0; \gamma)$.

Applying the Chapman-Kolmogorov equation once more to the outcome of the recursion, we get the sequence $\left\{ p(s_t \mid y^{t-1}; \gamma) \right\}_{t=1}^{T}$ needed to evaluate the likelihood function.

Smoothing

We are interested in the distribution of the state conditional on all the observations, that is, in $p(s_t \mid y^T; \gamma)$ and $p(y_t \mid y^T; \gamma)$.

We compute:
$$p(s_t \mid y^T; \gamma) = p(s_t \mid y^t; \gamma) \int \frac{p(s_{t+1} \mid y^T; \gamma)\, p(s_{t+1} \mid s_t; \gamma)}{p(s_{t+1} \mid y^t; \gamma)}\, ds_{t+1}$$

a backward recursion that we initialize with $p(s_T \mid y^T; \gamma)$ and the sequences $\left\{ p(s_t \mid y^t; \gamma) \right\}_{t=1}^{T}$ and $\left\{ p(s_t \mid y^{t-1}; \gamma) \right\}_{t=1}^{T}$ obtained from filtering.

Forecasting

Applying the Chapman-Kolmogorov equation recursively, we can get $p(s_{t+j} \mid y^t; \gamma)$ for $j \geq 1$.

Integrating recursively:
$$p(y_{l+1} \mid y^l; \gamma) = \int p(y_{l+1} \mid s_{l+1}; \gamma)\, p(s_{l+1} \mid y^l; \gamma)\, ds_{l+1}$$
from $t+1$ to $t+j$, we get $p(y_{t+j} \mid y^t; \gamma)$.

Clearly, smoothing and forecasting require solving the filtering problem first!

Problem of Filtering

We have the recursion
$$p(s_t \mid y^t; \gamma) = \frac{p(y_t \mid s_t; \gamma) \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1}}{\int \left\{ \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1} \right\} p(y_t \mid s_t; \gamma)\, ds_t}$$

It involves a lot of complicated, high-dimensional integrals. In general, we do not have closed-form solutions for them.

Nonlinearities and non-Gaussianity translate, spread, and deform (TSD) the conditional densities in ways that make it impossible to accommodate them with any known parametric family.

Exception

There is one exception: the linear and Gaussian case.

Why? Because if the system is linear and Gaussian, all the conditional distributions are also Gaussian.

Linear and Gaussian state space models translate and spread the conditional distributions, but they do not deform them.

For Gaussian distributions, we only need to track the mean and the variance (sufficient statistics).

The Kalman filter accomplishes this goal efficiently.
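
A minimal sketch of one step of that filter, assuming the linear Gaussian special case $s_t = F s_{t-1} + w_t$, $y_t = G s_t + v_t$ with $w_t \sim N(0, Q)$ and $v_t \sim N(0, R)$ (notation chosen to match the EKF slides below):

```python
import numpy as np

def kalman_step(s_prev, P_prev, y_t, F, G, Q, R):
    """One prediction-update step of the linear Gaussian Kalman filter."""
    # Prediction: p(s_t | y^{t-1}) = N(s_pred, P_pred)
    s_pred = F @ s_prev
    P_pred = F @ P_prev @ F.T + Q
    # Update: p(s_t | y^t) = N(s_filt, P_filt)
    S = G @ P_pred @ G.T + R                     # innovation covariance
    K = P_pred @ G.T @ np.linalg.inv(S)          # Kalman gain
    s_filt = s_pred + K @ (y_t - G @ s_pred)
    P_filt = P_pred - K @ G @ P_pred
    return s_filt, P_filt
```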

Different Approaches

Deterministic filtering:
1. Kalman family.
2. Grid-based filtering.

Simulation filtering:
1. MCMC.
2. Sequential Monte Carlo.

Kalman Family of Filters

Apply ideas of Kalman filtering to NLGF problems.

Non-optimal filters.

Different implementations:
1. Extended Kalman filter.
2. Iterated extended Kalman filter.
3. Second-order extended Kalman filter.
4. Unscented Kalman filter.

The Extended Kalman Filter

The EKF is historically the first descendant of the Kalman filter.

The EKF deals with nonlinearities by taking a first-order approximation to the system and applying the Kalman filter to this approximation.

Non-Gaussianities are ignored.

Algorithm I

Given $s_{t-1|t-1}$:
$$s_{t|t-1} = f\left(s_{t-1|t-1}, 0; \gamma\right)$$

Then:
$$P_{t|t-1} = Q_{t-1} + F_t P_{t-1|t-1} F_t'$$
where
$$F_t = \left. \frac{\partial f(S_{t-1}, W_t; \gamma)}{\partial S_{t-1}} \right|_{S_{t-1} = s_{t-1|t-1},\, W_t = 0}$$

Algorithm II

The Kalman gain $K_t$ is:
$$K_t = P_{t|t-1} G_t' \left( G_t P_{t|t-1} G_t' + R_t \right)^{-1}$$
where
$$G_t = \left. \frac{\partial g(S_t, V_t; \gamma)}{\partial S_t} \right|_{S_t = s_{t|t-1},\, V_t = 0}$$

Then:
$$s_{t|t} = s_{t|t-1} + K_t \left( y_t - g\left(s_{t|t-1}, 0; \gamma\right) \right)$$
$$P_{t|t} = P_{t|t-1} - K_t G_t P_{t|t-1}$$
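
A minimal sketch of one EKF step combining the two slides above; F_jac and G_jac are user-supplied functions returning the Jacobians (illustrative names, and in practice they could come from automatic differentiation):

```python
import numpy as np

def ekf_step(s_prev, P_prev, y_t, f, g, F_jac, G_jac, Q, R, gamma):
    """One extended Kalman filter step for s_t = f(s_{t-1}, w_t), y_t = g(s_t, v_t).

    f, g        : transition and measurement functions, called with zero shocks
    F_jac, G_jac: Jacobians of f and g with respect to the state
    """
    # Prediction with the shocks set to their mean (zero)
    s_pred = f(s_prev, 0.0, gamma)
    F_t = F_jac(s_prev, 0.0, gamma)
    P_pred = Q + F_t @ P_prev @ F_t.T
    # Update via linearization of g around the predicted state
    G_t = G_jac(s_pred, 0.0, gamma)
    K_t = P_pred @ G_t.T @ np.linalg.inv(G_t @ P_pred @ G_t.T + R)
    s_filt = s_pred + K_t @ (y_t - g(s_pred, 0.0, gamma))
    P_filt = P_pred - K_t @ G_t @ P_pred
    return s_filt, P_filt
```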

Problems of the EKF

1. It ignores the non-Gaussianities of $W_t$ and $V_t$.
2. It ignores the non-Gaussianities of the distribution of the states.
3. Approximation error incurred by the linearization.
4. Biased estimates of the mean and variance.
5. We need to compute Jacobians and Hessians.

As the sample size grows, those errors accumulate and the filter diverges.

Iterated Extended Kalman Filter I

Compute $s_{t|t-1}$ and $P_{t|t-1}$ as in the EKF.

Iterate $N$ times on:
$$K^i_t = P_{t|t-1} G^{i\prime}_t \left( G^i_t P_{t|t-1} G^{i\prime}_t + R_t \right)^{-1}$$
where
$$G^i_t = \left. \frac{\partial g(S_t, V_t; \gamma)}{\partial S_t} \right|_{S_t = s^i_{t|t-1},\, V_t = 0}$$
and
$$s^i_{t|t} = s_{t|t-1} + K^i_t \left( y_t - g\left(s_{t|t-1}, 0; \gamma\right) \right)$$

Iterated Extended Kalman Filter II

Why are we iterating? How many times?

Then:
$$s_{t|t} = s_{t|t-1} + K^N_t \left( y_t - g\left(s^N_{t|t-1}, 0; \gamma\right) \right)$$
$$P_{t|t} = P_{t|t-1} - K^N_t G^N_t P_{t|t-1}$$
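
One common reading of this iteration, relinearizing $g$ around the latest state estimate, is sketched below; it is an illustrative interpretation rather than a definitive implementation of the variant on these slides.

```python
import numpy as np

def iekf_update(s_pred, P_pred, y_t, g, G_jac, R, gamma, N=5):
    """Iterated EKF measurement update: relinearize g around the latest estimate.

    One common reading of the iteration; the exact variant may differ.
    """
    s_i = s_pred
    for _ in range(N):
        G_i = G_jac(s_i, 0.0, gamma)                     # Jacobian at current iterate
        K_i = P_pred @ G_i.T @ np.linalg.inv(G_i @ P_pred @ G_i.T + R)
        s_i = s_pred + K_i @ (y_t - g(s_pred, 0.0, gamma))
    P_filt = P_pred - K_i @ G_i @ P_pred
    return s_i, P_filt
```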

Second-order Extended Kalman Filter

We keep the second-order terms of the Taylor expansion of the transition and measurement equations.

Theoretically, less biased than the EKF.

Messy algebra.

In practice, not much improvement.

Unscented Kalman Filter I

Recent proposal by Julier and Uhlmann (1996).

Based on the unscented transform.

A set of sigma points is selected to preserve some properties of the conditional distribution (for example, the first two moments).

Then, those points are transformed and the properties of the new conditional distribution are computed.

Unscented Kalman Filter II

The UKF computes the conditional mean and variance accurately up to a third-order approximation if the shocks $W_t$ and $V_t$ are Gaussian and up to a second-order approximation if they are not.

The sigma points are chosen deterministically, not by simulation as in a Monte Carlo method.

The UKF has the advantage over the EKF that no Jacobians or Hessians are required, objects that may be difficult to compute.

New State Variable

We modify the state space by creating an augmented state variable
$$\widetilde{S}_t = [S_t, W_t, V_t]$$
that includes the pure states and the two random shocks $W_t$ and $V_t$.

We initialize the filter with
$$\widetilde{s}_{0|0} = E\left(\widetilde{S}_0\right) = \left(E(S_0), 0, 0\right)$$
$$\widetilde{P}_{0|0} = \begin{pmatrix} P_0 & 0 & 0 \\ 0 & Q_0 & 0 \\ 0 & 0 & R_0 \end{pmatrix}$$

Sigma Points

Let $L$ be the dimension of the augmented state variable $\widetilde{S}_t$.

For each $t \geq 1$, we calculate the $2L+1$ sigma points:
$$\mathcal{S}_{0,t-1|t-1} = \widetilde{s}_{t-1|t-1}$$
$$\mathcal{S}_{i,t-1|t-1} = \widetilde{s}_{t-1|t-1} - \left( \left( (L+\lambda)\, \widetilde{P}_{t-1|t-1} \right)^{0.5} \right)_i \quad \text{for } i = 1, \ldots, L$$
$$\mathcal{S}_{i,t-1|t-1} = \widetilde{s}_{t-1|t-1} + \left( \left( (L+\lambda)\, \widetilde{P}_{t-1|t-1} \right)^{0.5} \right)_{i-L} \quad \text{for } i = L+1, \ldots, 2L$$
where $(\cdot)_i$ denotes the $i$-th column of the matrix square root.

Parameters

$$\lambda = \alpha^2 (L + \kappa) - L$$
is a scaling parameter.

$\alpha$ determines the spread of the sigma points and it must belong to the unit interval.

$\kappa$ is a secondary parameter, usually set equal to zero.

Notation for each of the elements of $\mathcal{S}$:
$$\mathcal{S}_i = \left[ \mathcal{S}^s_i, \mathcal{S}^w_i, \mathcal{S}^v_i \right] \quad \text{for } i = 0, \ldots, 2L$$

Weights

Weights for each point:
$$W_0^m = \frac{\lambda}{L + \lambda}$$
$$W_0^c = \frac{\lambda}{L + \lambda} + \left( 1 - \alpha^2 + \beta \right)$$
$$W_i^m = W_i^c = \frac{1}{2(L + \lambda)} \quad \text{for } i = 1, \ldots, 2L$$

$\beta$ incorporates knowledge regarding the conditional distributions. For Gaussian distributions, $\beta = 2$ is optimal.
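
A minimal sketch of the sigma-point and weight construction above, using a Cholesky factor as the matrix square root (one valid choice among several):

```python
import numpy as np

def sigma_points_and_weights(s_bar, P_bar, alpha=1.0, beta=2.0, kappa=0.0):
    """Generate the 2L+1 sigma points and weights of the unscented transform.

    s_bar, P_bar : mean and covariance of the (augmented) state
    alpha        : spread parameter in the unit interval
    """
    L = s_bar.size
    lam = alpha**2 * (L + kappa) - L
    sqrt_P = np.linalg.cholesky((L + lam) * P_bar)        # columns of a matrix square root
    points = np.empty((2 * L + 1, L))
    points[0] = s_bar
    for i in range(L):
        points[1 + i] = s_bar - sqrt_P[:, i]
        points[1 + L + i] = s_bar + sqrt_P[:, i]
    W_m = np.full(2 * L + 1, 1.0 / (2 * (L + lam)))
    W_c = W_m.copy()
    W_m[0] = lam / (L + lam)
    W_c[0] = lam / (L + lam) + (1 - alpha**2 + beta)
    return points, W_m, W_c
```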

Algorithm I: Prediction of States

We compute the transition of the pure states:
$$\mathcal{S}^s_{i,t|t-1} = f\left( \mathcal{S}^s_{i,t-1|t-1}, \mathcal{S}^w_{i,t-1|t-1}; \gamma \right)$$

Weighted state:
$$s_{t|t-1} = \sum_{i=0}^{2L} W_i^m\, \mathcal{S}^s_{i,t|t-1}$$

Weighted variance:
$$P_{t|t-1} = \sum_{i=0}^{2L} W_i^c \left( \mathcal{S}^s_{i,t|t-1} - s_{t|t-1} \right) \left( \mathcal{S}^s_{i,t|t-1} - s_{t|t-1} \right)'$$

Algorithm II: Prediction of Observables

Predicted sigma observables:
$$\mathcal{Y}_{i,t|t-1} = g\left( \mathcal{S}^s_{i,t|t-1}, \mathcal{S}^v_{i,t|t-1}; \gamma \right)$$

Predicted observable:
$$y_{t|t-1} = \sum_{i=0}^{2L} W_i^m\, \mathcal{Y}_{i,t|t-1}$$

Algorithm III: Update

Variance-covariance matrices:
$$P_{yy,t} = \sum_{i=0}^{2L} W_i^c \left( \mathcal{Y}_{i,t|t-1} - y_{t|t-1} \right) \left( \mathcal{Y}_{i,t|t-1} - y_{t|t-1} \right)'$$
$$P_{xy,t} = \sum_{i=0}^{2L} W_i^c \left( \mathcal{S}^s_{i,t|t-1} - s_{t|t-1} \right) \left( \mathcal{Y}_{i,t|t-1} - y_{t|t-1} \right)'$$

Kalman gain:
$$K_t = P_{xy,t}\, P_{yy,t}^{-1}$$

Algorithm IV: Update

Update of the state:
$$s_{t|t} = s_{t|t-1} + K_t \left( y_t - y_{t|t-1} \right)$$

Update of the variance:
$$P_{t|t} = P_{t|t-1} - K_t P_{yy,t} K_t'$$

Finally, the augmented covariance is rebuilt as:
$$\widetilde{P}_{t|t} = \begin{pmatrix} P_{t|t} & 0 & 0 \\ 0 & Q_t & 0 \\ 0 & 0 & R_t \end{pmatrix}$$
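
Putting the four algorithm slides together, here is a minimal sketch of one UKF prediction-update step on the augmented state; it reuses sigma_points_and_weights from the previous sketch and assumes user-supplied f and g. It is illustrative rather than a tuned implementation.

```python
import numpy as np

def ukf_step(s_filt, P_filt, y_t, f, g, Q, R, gamma, alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF step on the augmented state [S_t, W_t, V_t]."""
    n_s, n_w, n_v = s_filt.size, Q.shape[0], R.shape[0]
    # Augmented mean and block-diagonal covariance
    s_aug = np.concatenate([s_filt, np.zeros(n_w), np.zeros(n_v)])
    P_aug = np.block([
        [P_filt, np.zeros((n_s, n_w)), np.zeros((n_s, n_v))],
        [np.zeros((n_w, n_s)), Q, np.zeros((n_w, n_v))],
        [np.zeros((n_v, n_s)), np.zeros((n_v, n_w)), R],
    ])
    points, W_m, W_c = sigma_points_and_weights(s_aug, P_aug, alpha, beta, kappa)
    S_s, S_w, S_v = points[:, :n_s], points[:, n_s:n_s + n_w], points[:, n_s + n_w:]
    n_pts = len(points)
    # Prediction of states and observables through f and g
    S_pred = np.array([f(S_s[i], S_w[i], gamma) for i in range(n_pts)])
    s_pred = W_m @ S_pred
    P_pred = sum(W_c[i] * np.outer(S_pred[i] - s_pred, S_pred[i] - s_pred)
                 for i in range(n_pts))
    Y_pred = np.array([g(S_pred[i], S_v[i], gamma) for i in range(n_pts)])
    y_pred = W_m @ Y_pred
    # Update
    P_yy = sum(W_c[i] * np.outer(Y_pred[i] - y_pred, Y_pred[i] - y_pred)
               for i in range(n_pts))
    P_xy = sum(W_c[i] * np.outer(S_pred[i] - s_pred, Y_pred[i] - y_pred)
               for i in range(n_pts))
    K_t = P_xy @ np.linalg.inv(P_yy)
    s_new = s_pred + K_t @ (y_t - y_pred)
    P_new = P_pred - K_t @ P_yy @ K_t.T
    return s_new, P_new
```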

Grid-Based Filtering

Remember that we have the recursion
$$p(s_t \mid y^t; \gamma) = \frac{p(y_t \mid s_t; \gamma) \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1}}{\int \left\{ \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1} \right\} p(y_t \mid s_t; \gamma)\, ds_t}$$

This recursion requires the evaluation of three integrals.

This suggests the possibility of addressing the problem by computing those integrals with a deterministic procedure such as a grid method.

Kitagawa (1987) and Kramer and Sorenson (1988).

Grid-Based Filtering

We divide the state space into $N$ cells with center points $s^i_t$, $\left\{ s^i_t : i = 1, \ldots, N \right\}$.

We substitute the exact conditional densities with discrete densities that put all the mass at the points $\left\{ s^i_t \right\}_{i=1}^{N}$.

We denote by $\delta(x)$ a Dirac function with mass at 0.

Approximated Densities and Weights

$$p(s_t \mid y^{t-1}; \gamma) \simeq \sum_{i=1}^{N} \omega^i_{t|t-1}\, \delta\!\left( s_t - s^i_t \right)$$
$$p(s_t \mid y^t; \gamma) \simeq \sum_{i=1}^{N} \omega^i_{t|t}\, \delta\!\left( s_t - s^i_t \right)$$

where
$$\omega^i_{t|t-1} = \sum_{j=1}^{N} \omega^j_{t-1|t-1}\, p\!\left( s^i_t \mid s^j_{t-1}; \gamma \right)$$
$$\omega^i_{t|t} = \frac{\omega^i_{t|t-1}\, p\!\left( y_t \mid s^i_t; \gamma \right)}{\sum_{j=1}^{N} \omega^j_{t|t-1}\, p\!\left( y_t \mid s^j_t; \gamma \right)}$$

Approximated Recursion

$$p(s_t \mid y^t; \gamma) = \sum_{i=1}^{N} \frac{\sum_{j=1}^{N} \omega^j_{t-1|t-1}\, p\!\left( s^i_t \mid s^j_{t-1}; \gamma \right) p\!\left( y_t \mid s^i_t; \gamma \right)}{\sum_{k=1}^{N} \sum_{j=1}^{N} \omega^j_{t-1|t-1}\, p\!\left( s^k_t \mid s^j_{t-1}; \gamma \right) p\!\left( y_t \mid s^k_t; \gamma \right)}\, \delta\!\left( s_t - s^i_t \right)$$

Compare with
$$p(s_t \mid y^t; \gamma) = \frac{p(y_t \mid s_t; \gamma) \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1}}{\int \left\{ \int p(s_t \mid s_{t-1}; \gamma)\, p(s_{t-1} \mid y^{t-1}; \gamma)\, ds_{t-1} \right\} p(y_t \mid s_t; \gamma)\, ds_t}$$

given that
$$p(s_{t-1} \mid y^{t-1}; \gamma) \simeq \sum_{i=1}^{N} \omega^i_{t-1|t-1}\, \delta\!\left( s_{t-1} - s^i_{t-1} \right)$$
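
A minimal sketch of one grid-filter step on a fixed grid, assuming user-supplied transition and measurement densities (illustrative names):

```python
import numpy as np

def grid_filter_step(weights_prev, grid, transition_density, measurement_density, y_t, gamma):
    """One grid-based filtering step.

    weights_prev : omega_{t-1|t-1}, shape (N,), weights on the grid points
    grid         : the N grid points s^i (fixed ex ante)
    transition_density(s_i, s_j, gamma) : p(s_t = s_i | s_{t-1} = s_j; gamma)
    measurement_density(y_t, s_i, gamma): p(y_t | s_t = s_i; gamma)
    """
    N = len(grid)
    # Prediction weights: omega_{t|t-1}^i = sum_j omega_{t-1|t-1}^j p(s^i | s^j)
    trans = np.array([[transition_density(grid[i], grid[j], gamma) for j in range(N)]
                      for i in range(N)])
    w_pred = trans @ weights_prev
    # Update weights: omega_{t|t}^i proportional to omega_{t|t-1}^i p(y_t | s^i)
    lik = np.array([measurement_density(y_t, grid[i], gamma) for i in range(N)])
    w_filt = w_pred * lik
    return w_filt / w_filt.sum()
```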

Problems

Grid filters require constant readjustment to small changes in the model or its parameter values.

They are too computationally expensive to be of any practical benefit beyond very low dimensions.

Grid points are fixed ex ante and the results depend heavily on that choice.

Can we overcome those difficulties and preserve the idea of integration? Yes, through Monte Carlo integration.