Lecture 2: From Linear Regression to Kalman Filter and Beyond
January 18, 2017

Contents
1 Batch and Recursive Estimation
2 Towards Bayesian Filtering
3 Kalman Filter and Bayesian Filtering and Smoothing
4 Summary

Batch Linear Regression [1/2]

[Figure: noisy measurements and the true signal as functions of $t$]

Consider the linear regression model

  $y_k = \theta_1 + \theta_2 t_k + \varepsilon_k, \quad k = 1, \ldots, T,$

with $\varepsilon_k \sim N(0, \sigma^2)$ and $\theta = (\theta_1, \theta_2) \sim N(m_0, P_0)$. In probabilistic notation this is

  $p(y_k \mid \theta) = N(y_k \mid H_k \theta, \sigma^2)$
  $p(\theta) = N(\theta \mid m_0, P_0),$

where $H_k = (1 \ \ t_k)$.

Batch Linear Regression [2/2]

The Bayesian batch solution by the Bayes rule:

  $p(\theta \mid y_{1:T}) \propto p(\theta) \prod_{k=1}^{T} p(y_k \mid \theta) = N(\theta \mid m_0, P_0) \prod_{k=1}^{T} N(y_k \mid H_k \theta, \sigma^2).$

The posterior is Gaussian:

  $p(\theta \mid y_{1:T}) = N(\theta \mid m_T, P_T).$

The mean and covariance are given as

  $m_T = \left[ P_0^{-1} + \frac{1}{\sigma^2} H^\top H \right]^{-1} \left[ \frac{1}{\sigma^2} H^\top y + P_0^{-1} m_0 \right]$
  $P_T = \left[ P_0^{-1} + \frac{1}{\sigma^2} H^\top H \right]^{-1},$

where $H_k = (1 \ \ t_k)$, $H = (H_1; H_2; \ldots; H_T)$, $y = (y_1; \ldots; y_T)$.
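A minimal numerical sketch of these batch formulas (assuming NumPy and synthetic data; the variable names and the particular prior are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: true theta = (1.0, -0.8), measurement noise variance sigma^2.
T = 50
t = np.linspace(0.0, 1.0, T)
sigma2 = 0.1**2
theta_true = np.array([1.0, -0.8])
H = np.column_stack([np.ones(T), t])          # rows H_k = (1, t_k)
y = H @ theta_true + np.sqrt(sigma2) * rng.standard_normal(T)

# Gaussian prior theta ~ N(m0, P0).
m0 = np.zeros(2)
P0 = np.eye(2)

# Batch posterior: P_T = [P0^-1 + H^T H / sigma^2]^-1,
#                  m_T = P_T [H^T y / sigma^2 + P0^-1 m0].
P0_inv = np.linalg.inv(P0)
P_T = np.linalg.inv(P0_inv + H.T @ H / sigma2)
m_T = P_T @ (H.T @ y / sigma2 + P0_inv @ m0)

print("posterior mean:", m_T)
print("posterior covariance:\n", P_T)
```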

Recursive Linear Regression [1/4]

Assume that we have already computed the posterior distribution conditioned on the measurements up to $k-1$:

  $p(\theta \mid y_{1:k-1}) = N(\theta \mid m_{k-1}, P_{k-1}).$

Assume that we get the $k$th measurement $y_k$. Using the equations from the previous slide we get

  $p(\theta \mid y_{1:k}) \propto p(y_k \mid \theta)\, p(\theta \mid y_{1:k-1}) \propto N(\theta \mid m_k, P_k).$

The mean and covariance are given as

  $m_k = \left[ P_{k-1}^{-1} + \frac{1}{\sigma^2} H_k^\top H_k \right]^{-1} \left[ \frac{1}{\sigma^2} H_k^\top y_k + P_{k-1}^{-1} m_{k-1} \right]$
  $P_k = \left[ P_{k-1}^{-1} + \frac{1}{\sigma^2} H_k^\top H_k \right]^{-1}.$

Recursive Linear Regression [2/4]

By the matrix inversion lemma (or Woodbury identity):

  $P_k = P_{k-1} - P_{k-1} H_k^\top \left[ H_k P_{k-1} H_k^\top + \sigma^2 \right]^{-1} H_k P_{k-1}.$

Now the equations for the mean and covariance reduce to

  $S_k = H_k P_{k-1} H_k^\top + \sigma^2$
  $K_k = P_{k-1} H_k^\top S_k^{-1}$
  $m_k = m_{k-1} + K_k \left[ y_k - H_k m_{k-1} \right]$
  $P_k = P_{k-1} - K_k S_k K_k^\top.$

Computing these for $k = 1, \ldots, T$ gives exactly the batch linear regression solution. This is a special case of the Kalman filter.
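A minimal sketch of this recursion in the same setting (assuming NumPy; the data generation mirrors the batch sketch above, and the final check confirms that the recursive and batch posteriors agree):

```python
import numpy as np

rng = np.random.default_rng(0)

# Same setup as the batch sketch: y_k = theta_1 + theta_2 t_k + noise.
T = 50
t = np.linspace(0.0, 1.0, T)
sigma2 = 0.1**2
H = np.column_stack([np.ones(T), t])
y = H @ np.array([1.0, -0.8]) + np.sqrt(sigma2) * rng.standard_normal(T)

# Prior theta ~ N(0, I), then one gain-form update per measurement.
m, P = np.zeros(2), np.eye(2)
for k in range(T):
    Hk = H[k:k + 1, :]                          # 1x2 row (1, t_k)
    S = (Hk @ P @ Hk.T).item() + sigma2         # innovation variance S_k
    K = P @ Hk.T / S                            # gain K_k (2x1)
    m = m + K[:, 0] * (y[k] - (Hk @ m).item())  # m_k
    P = P - S * (K @ K.T)                       # P_k = P_{k-1} - K_k S_k K_k^T

# The last step reproduces the batch posterior (with the same prior).
P_batch = np.linalg.inv(np.eye(2) + H.T @ H / sigma2)
m_batch = P_batch @ (H.T @ y / sigma2)
print(np.allclose(m, m_batch), np.allclose(P, P_batch))
```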

Recursive Linear Regression [3/4]

[Figure: recursive linear regression fit after an increasing number of processed measurements]

Recursive Linear Regression [4/4]

Convergence of the recursive solution to the batch solution; on the last step the solutions are exactly equal.

[Figure: recursive and batch estimates $E[\theta_1]$ and $E[\theta_2]$ as functions of $t$]

Batch vs. Recursive Estimation [1/2]

General batch solution:
- Specify the measurement model: $p(y_{1:T} \mid \theta) = \prod_k p(y_k \mid \theta)$.
- Specify the prior distribution $p(\theta)$.
- Compute the posterior distribution by the Bayes rule:
  $p(\theta \mid y_{1:T}) = \frac{1}{Z}\, p(\theta) \prod_k p(y_k \mid \theta).$
- Compute point estimates, moments, predictive quantities etc. from the posterior distribution.

Batch vs. Recursive Estimation [2/2]

General recursive solution (see the sketch after this list):
- Specify the measurement likelihood $p(y_k \mid \theta)$.
- Specify the prior distribution $p(\theta)$.
- Process the measurements $y_1, \ldots, y_T$ one at a time, starting from the prior:
  $p(\theta \mid y_1) = \frac{1}{Z_1}\, p(y_1 \mid \theta)\, p(\theta)$
  $p(\theta \mid y_{1:2}) = \frac{1}{Z_2}\, p(y_2 \mid \theta)\, p(\theta \mid y_1)$
  $p(\theta \mid y_{1:3}) = \frac{1}{Z_3}\, p(y_3 \mid \theta)\, p(\theta \mid y_{1:2})$
  $\vdots$
  $p(\theta \mid y_{1:T}) = \frac{1}{Z_T}\, p(y_T \mid \theta)\, p(\theta \mid y_{1:T-1}).$
- The result at the last step is the batch solution.
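The recursion is not tied to Gaussian models. A minimal sketch (assuming NumPy) that applies it to a scalar parameter on a discrete grid with a Poisson likelihood, so each normalization constant $Z_k$ is just a grid sum; the model choice is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar parameter theta on a grid, with a Poisson measurement model
# y_k ~ Poisson(theta) as an example of a non-Gaussian likelihood.
grid = np.linspace(0.1, 10.0, 1000)
posterior = np.ones_like(grid) / grid.size      # flat prior p(theta)

theta_true = 3.0
y = rng.poisson(theta_true, size=20)

for yk in y:
    # p(theta | y_1:k) = (1/Z_k) p(y_k | theta) p(theta | y_1:k-1);
    # the y_k! factor is constant in theta and cancels in the normalization.
    likelihood = grid**yk * np.exp(-grid)
    posterior = likelihood * posterior
    posterior /= posterior.sum()                # Z_k as a grid sum

print("posterior mean of theta:", np.sum(grid * posterior))
```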

Advantages of Recursive Solution

- The recursive solution can be considered as the online learning solution to the Bayesian learning problem.
- Batch Bayesian inference is a special case of recursive Bayesian inference.
- The parameter can be modeled to change between the measurement steps, which is the basis of filtering theory.

Drift Model for Linear Regression [1/3]

Assume a Gaussian random walk between the measurements in the linear regression model:

  $p(y_k \mid \theta_k) = N(y_k \mid H_k \theta_k, \sigma^2)$
  $p(\theta_k \mid \theta_{k-1}) = N(\theta_k \mid \theta_{k-1}, Q)$
  $p(\theta_0) = N(\theta_0 \mid m_0, P_0).$

Again, assume that we already know

  $p(\theta_{k-1} \mid y_{1:k-1}) = N(\theta_{k-1} \mid m_{k-1}, P_{k-1}).$

The joint distribution of $\theta_k$ and $\theta_{k-1}$ is (due to Markovianity of the dynamics!)

  $p(\theta_k, \theta_{k-1} \mid y_{1:k-1}) = p(\theta_k \mid \theta_{k-1})\, p(\theta_{k-1} \mid y_{1:k-1}).$

Drift Model for Linear Regression [2/3]

Integrating over $\theta_{k-1}$ gives

  $p(\theta_k \mid y_{1:k-1}) = \int p(\theta_k \mid \theta_{k-1})\, p(\theta_{k-1} \mid y_{1:k-1})\, d\theta_{k-1}.$

This equation for Markov processes is called the Chapman-Kolmogorov equation. Because the distributions are Gaussian, the result is Gaussian,

  $p(\theta_k \mid y_{1:k-1}) = N(\theta_k \mid m_k^-, P_k^-),$

where

  $m_k^- = m_{k-1}$
  $P_k^- = P_{k-1} + Q.$

Drift Model for Linear Regression [3/3]

As in the pure recursive estimation, we get

  $p(\theta_k \mid y_{1:k}) \propto p(y_k \mid \theta_k)\, p(\theta_k \mid y_{1:k-1}) \propto N(\theta_k \mid m_k, P_k).$

After applying the matrix inversion lemma, the mean and covariance can be written as

  $S_k = H_k P_k^- H_k^\top + \sigma^2$
  $K_k = P_k^- H_k^\top S_k^{-1}$
  $m_k = m_k^- + K_k \left[ y_k - H_k m_k^- \right]$
  $P_k = P_k^- - K_k S_k K_k^\top.$

Again, we have derived a special case of the Kalman filter. The batch version of this solution would be much more complicated.

State Space Notation

On the previous slide we formulated the model as

  $p(\theta_k \mid \theta_{k-1}) = N(\theta_k \mid \theta_{k-1}, Q)$
  $p(y_k \mid \theta_k) = N(y_k \mid H_k \theta_k, \sigma^2).$

But in Kalman filtering and control theory the vector of parameters $\theta_k$ is usually called the state and denoted as $x_k$. More standard state space notation:

  $p(x_k \mid x_{k-1}) = N(x_k \mid x_{k-1}, Q)$
  $p(y_k \mid x_k) = N(y_k \mid H_k x_k, \sigma^2)$

or equivalently

  $x_k = x_{k-1} + q_{k-1}$
  $y_k = H_k x_k + r_k,$

where $q_{k-1} \sim N(0, Q)$ and $r_k \sim N(0, \sigma^2)$.

Kalman Filter [1/2]

The canonical Kalman filtering model is

  $p(x_k \mid x_{k-1}) = N(x_k \mid A_{k-1} x_{k-1}, Q_{k-1})$
  $p(y_k \mid x_k) = N(y_k \mid H_k x_k, R_k).$

More often, this model can be seen in the form

  $x_k = A_{k-1} x_{k-1} + q_{k-1}$
  $y_k = H_k x_k + r_k.$

The Kalman filter actually calculates the following distributions:

  $p(x_k \mid y_{1:k-1}) = N(x_k \mid m_k^-, P_k^-)$
  $p(x_k \mid y_{1:k}) = N(x_k \mid m_k, P_k).$

Kalman Filter [2/2]

Prediction step of the Kalman filter:

  $m_k^- = A_{k-1} m_{k-1}$
  $P_k^- = A_{k-1} P_{k-1} A_{k-1}^\top + Q_{k-1}.$

Update step of the Kalman filter:

  $S_k = H_k P_k^- H_k^\top + R_k$
  $K_k = P_k^- H_k^\top S_k^{-1}$
  $m_k = m_k^- + K_k \left[ y_k - H_k m_k^- \right]$
  $P_k = P_k^- - K_k S_k K_k^\top.$

These equations will be derived from the general Bayesian filtering equations in the next lecture.
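A minimal sketch of these two steps as NumPy functions (the function and variable names are illustrative, not part of the lecture):

```python
import numpy as np

def kf_predict(m, P, A, Q):
    """Prediction step: p(x_k | y_1:k-1) = N(x_k | m_pred, P_pred)."""
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    return m_pred, P_pred

def kf_update(m_pred, P_pred, y, H, R):
    """Update step: p(x_k | y_1:k) = N(x_k | m, P)."""
    S = H @ P_pred @ H.T + R                  # innovation covariance S_k
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain K_k
    m = m_pred + K @ (y - H @ m_pred)
    P = P_pred - K @ S @ K.T
    return m, P
```

For the random-walk regression model of the previous slides these would be called with $A_{k-1} = I$, $Q_{k-1} = Q$, $H_k = (1 \ \ t_k)$ and $R_k = \sigma^2$.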

Probabilistic State Space Models [1/2]

Generic non-linear state space models:

  $x_k = f(x_{k-1}, q_{k-1})$
  $y_k = h(x_k, r_k).$

Generic Markov models:

  $x_k \sim p(x_k \mid x_{k-1})$
  $y_k \sim p(y_k \mid x_k).$

Continuous-discrete state space models involving stochastic differential equations:

  $\frac{dx}{dt} = f(x, t) + w(t)$
  $y_k \sim p(y_k \mid x(t_k)).$
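A minimal sketch (assuming NumPy) of simulating data from a non-linear state space model of this additive-noise form; the particular $f$ and $h$ are illustrative choices only:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative non-linear model:
#   x_k = f(x_{k-1}) + q_{k-1},  q_{k-1} ~ N(0, Q)
#   y_k = h(x_k) + r_k,          r_k     ~ N(0, R)
def f(x):
    return 0.9 * np.sin(x)

def h(x):
    return 0.5 * x**2

Q, R = 0.01, 0.1

T = 100
x = np.zeros(T)
y = np.zeros(T)
x_prev = 0.1
for k in range(T):
    x[k] = f(x_prev) + np.sqrt(Q) * rng.standard_normal()
    y[k] = h(x[k]) + np.sqrt(R) * rng.standard_normal()
    x_prev = x[k]
```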

Probabilistic State Space Models [2/2]

Non-linear state space model with unknown parameters:

  $x_k = f(x_{k-1}, q_{k-1}, \theta)$
  $y_k = h(x_k, r_k, \theta).$

General Markovian state space model with unknown parameters:

  $x_k \sim p(x_k \mid x_{k-1}, \theta)$
  $y_k \sim p(y_k \mid x_k, \theta).$

Parameter estimation will be considered later; for now, we will attempt to estimate the state. Why Bayesian filtering and smoothing, then?

Bayesian Filtering, Prediction and Smoothing

In principle, we could just use the (batch) Bayes rule:

  $p(x_1, \ldots, x_T \mid y_1, \ldots, y_T) = \frac{p(y_1, \ldots, y_T \mid x_1, \ldots, x_T)\, p(x_1, \ldots, x_T)}{p(y_1, \ldots, y_T)}.$

Curse of computational complexity: the complexity grows more than linearly with the number of measurements (typically we have $O(T^3)$). Hence, we concentrate on the following:

- Filtering distributions: $p(x_k \mid y_1, \ldots, y_k)$, $k = 1, \ldots, T$.
- Prediction distributions: $p(x_{k+n} \mid y_1, \ldots, y_k)$, $k = 1, \ldots, T$, $n = 1, 2, \ldots$
- Smoothing distributions: $p(x_k \mid y_1, \ldots, y_T)$, $k = 1, \ldots, T$.

Bayesian Filtering, Prediction and Smoothing (cont.)

[Figure: timelines from 0 to T showing which measurements are used to form the estimate at step k for prediction, filtering, and smoothing]

Filtering Algorithms

- Kalman filter is the classical optimal filter for linear Gaussian models.
- Extended Kalman filter (EKF) is a linearization-based extension of the Kalman filter to non-linear models.
- Unscented Kalman filter (UKF) is a sigma-point transformation based extension of the Kalman filter.
- Gauss-Hermite and cubature Kalman filters (GHKF/CKF) are numerical integration based extensions of the Kalman filter.
- Particle filter forms a Monte Carlo representation (particle set) of the distribution of the state estimate.
- Grid based filters approximate the probability distributions on a finite grid.
- Mixture Gaussian approximations are used, for example, in multiple model Kalman filters and Rao-Blackwellized particle filters.

Smoothing Algorithms

- Rauch-Tung-Striebel (RTS) smoother is the closed form smoother for linear Gaussian models (see the sketch after this list).
- Extended, statistically linearized and unscented RTS smoothers are the approximate non-linear smoothers corresponding to the EKF, SLF and UKF.
- Gaussian RTS smoothers: cubature RTS smoother, Gauss-Hermite RTS smoother and various others.
- Particle smoothing is based on approximating the smoothing solutions via Monte Carlo.
- Rao-Blackwellized particle smoother is a combination of particle smoothing and RTS smoothing.
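A minimal sketch of the linear Gaussian RTS smoother backward pass (assuming NumPy, constant $A$ and $Q$, and filtering means and covariances ms, Ps produced by a Kalman filter such as the one sketched earlier; the names are illustrative):

```python
import numpy as np

def rts_smoother(ms, Ps, A, Q):
    """Backward pass over Kalman filter means ms[k] and covariances Ps[k]."""
    T = len(ms)
    ms_s = [None] * T
    Ps_s = [None] * T
    ms_s[-1], Ps_s[-1] = ms[-1], Ps[-1]          # smoothing = filtering at k = T
    for k in range(T - 2, -1, -1):
        m_pred = A @ ms[k]                       # m_{k+1}^- from the prediction step
        P_pred = A @ Ps[k] @ A.T + Q             # P_{k+1}^-
        G = Ps[k] @ A.T @ np.linalg.inv(P_pred)  # smoother gain G_k
        ms_s[k] = ms[k] + G @ (ms_s[k + 1] - m_pred)
        Ps_s[k] = Ps[k] + G @ (Ps_s[k + 1] - P_pred) @ G.T
    return ms_s, Ps_s
```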

Summary

- The linear regression problem can be solved as a batch problem or recursively; the latter solution is a special case of the Kalman filter.
- A generic Bayesian estimation problem can also be solved as a batch problem or recursively.
- If we let the linear regression parameter change between the measurements, we get a simple linear state space model, again solvable with the Kalman filter.
- By generalizing this idea and its solution we get the Kalman filter algorithm.
- Further generalizing to non-Gaussian models results in generic probabilistic state space models.
- Bayesian filtering and smoothing methods solve Bayesian inference problems on state space models recursively.

Demonstration [Linear regression with Kalman filter]
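A possible stand-alone version of such a demonstration (assuming NumPy; it simulates a slowly drifting regression parameter and tracks it with the random-walk Kalman filter derived above; the noise levels and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Drifting linear regression: y_k = theta_{1,k} + theta_{2,k} t_k + noise,
# where theta_k follows a Gaussian random walk with covariance Q.
T = 200
t = np.linspace(0.0, 1.0, T)
sigma2 = 0.05**2
Q = 1e-4 * np.eye(2)

theta = np.zeros((T, 2))
theta[0] = [1.0, -0.5]
y = np.zeros(T)
for k in range(T):
    if k > 0:
        theta[k] = theta[k - 1] + rng.multivariate_normal(np.zeros(2), Q)
    y[k] = theta[k, 0] + theta[k, 1] * t[k] + np.sqrt(sigma2) * rng.standard_normal()

# Kalman filter with A = I (random walk dynamics) and R = sigma^2.
m, P = np.zeros(2), np.eye(2)
estimates = np.zeros((T, 2))
for k in range(T):
    # Prediction: m_k^- = m_{k-1}, P_k^- = P_{k-1} + Q.
    P = P + Q
    # Update with H_k = (1, t_k).
    Hk = np.array([[1.0, t[k]]])
    S = (Hk @ P @ Hk.T).item() + sigma2
    K = P @ Hk.T / S
    m = m + K[:, 0] * (y[k] - (Hk @ m).item())
    P = P - S * (K @ K.T)
    estimates[k] = m

print("final estimate:", estimates[-1], "true parameter:", theta[-1])
```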