Statistical Techniques in Robotics (16-831, F12)    Lecture #17 (Wednesday, October 31): Kalman Filters

Lecturer: Drew Bagnell    Scribe: Greydon Foil*

* Some content adapted from previous scribes: Ammar Husain, Heather Justice, Kiho Kwak, Siddharth Mehrotra.


1 Gauss Markov Model

Consider X_1, X_2, ..., X_t, X_{t+1} to be the state variables and Y_1, Y_2, ..., Y_t, Y_{t+1} the sequence of corresponding observations. As in Hidden Markov Models, the conditional independencies (see Figure 1) dictate that past and future states are conditionally independent given the current state X_t at time t. For example, if we know what X_2 is, then no information about X_1 can possibly help us reason about what X_3 should be.

[Figure 1: The independence diagram of a Gauss-Markov model: a chain X_1 -> X_2 -> ... -> X_t -> X_{t+1} -> ..., with each state X_i emitting an observation Y_i.]

The update for the state variable X_{t+1} is given by:

X_{t+1} = A X_t + ε,   X_0 ∼ N(µ_0, Σ_0),   ε ∼ N(0, Q),   X_{t+1} | X_t ∼ N(A X_t, Q)

The corresponding observation Y_{t+1} is given by:

Y_{t+1} = C X_{t+1} + δ,   δ ∼ N(0, R),   Y_{t+1} | X_{t+1} ∼ N(C X_{t+1}, R)

Each component is defined as follows:

A_t: Matrix (n x n) that describes how the state evolves from time t to t+1 without controls or noise.

C_t: Matrix (k x n) that describes how to map the state X_t to an observation Y_t, where k is the number of observations.

ε_t, δ_t: Random variables representing the process and measurement noise, assumed to be independent and normally distributed with covariances Q_t (n x n) and R_t (k x k), respectively.

We want to find p(x_t | y_{1...t}), so we need to calculate µ_{x_t} and Σ_{x_t}. Because a Gaussian will try to fit itself to all of the data, in a real situation we would first try to remove all outliers to achieve a more stable result.

Note that this parameterization is directly related to Bayesian Linear Regression (BLR) if it meets the following conditions:

- X here is equivalent to θ in BLR and Y here is equivalent to Y in BLR.
- The motion model is just the identity matrix.
- Q is 0; it would be nonzero if the state (the weights θ) might be changing as a function of time.
- C is the feature vector x_t from BLR, different here at every timestep.
- δ ∼ N(0, σ²).
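As a concrete illustration of the generative model above (my own sketch, not from the lecture; the particular A, C, Q, and R values are made up for a 2-D constant-velocity state with a noisy position-only observation), the following Python snippet samples a trajectory from X_{t+1} = A X_t + ε, Y_{t+1} = C X_{t+1} + δ:

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up example system: 2-D constant-velocity state, position-only observation.
    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])       # state transition matrix (n x n, n = 2)
    C = np.array([[1.0, 0.0]])       # observation matrix (k x n, k = 1)
    Q = 0.01 * np.eye(2)             # process noise covariance (for epsilon)
    R = np.array([[0.25]])           # measurement noise covariance (for delta)
    mu0, Sigma0 = np.zeros(2), np.eye(2)

    def simulate(T):
        """Sample X_1..X_T and Y_1..Y_T from the Gauss-Markov model."""
        x = rng.multivariate_normal(mu0, Sigma0)                  # X_0 ~ N(mu_0, Sigma_0)
        xs, ys = [], []
        for _ in range(T):
            x = A @ x + rng.multivariate_normal(np.zeros(2), Q)   # X_{t+1} | X_t ~ N(A X_t, Q)
            y = C @ x + rng.multivariate_normal(np.zeros(1), R)   # Y_{t+1} | X_{t+1} ~ N(C X_{t+1}, R)
            xs.append(x)
            ys.append(y)
        return np.array(xs), np.array(ys)

    states, observations = simulate(50)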

2 What can you do with Gaussians?

There are two common parameterizations for Gaussians, the moment parameterization and the natural parameterization. It is often most practical to switch back and forth between the representations, depending on which calculations are needed. The moment parameterization is more convenient for visualization (simply draw a Gaussian centered around the mean with width determined by the variance), for calculating expected values, and for computing marginals. The natural parameterization is more convenient for multiplying Gaussians and for conditioning on known variables. While it is often convenient to switch between the two parameterizations, it is not always efficient, as we will discuss later.

2.1 Moment Parameterization

Recall that the moment parameterization of a Gaussian is:

N(µ, Σ):   p(θ) = (1/Z) exp( -(1/2) (θ - µ)^T Σ^{-1} (θ - µ) )

Given:

[x_1; x_2] ∼ N( [µ_1; µ_2], [Σ_{11} Σ_{12}; Σ_{21} Σ_{22}] )

Marginal: computing p(x_2)

µ_2^{marg} = µ_2
Σ_2^{marg} = Σ_{22}

We find both of these from the definition of the moments, specifically the fact that the moments of x_2 do not change if x_1 is marginalized out.

Conditional: computing p(x_1 | x_2)

µ_{1|2} = µ_1 + Σ_{12} Σ_{22}^{-1} (x_2 - µ_2)

(x_2 - µ_2) is the distance of x_2 from its mean. We scale it by the inverse of the uncertainty in x_2 (Σ_{22}^{-1}), convert that value into the frame of x_1 using Σ_{12}, and add it to our best guess for x_1, µ_1.

Σ_{1|2} = Σ_{11} - Σ_{12} Σ_{22}^{-1} Σ_{21}

Here we start with the uncertainty in x_1, Σ_{11}, and subtract out the uncertainty shared between x_1 and x_2 (the part of x_1 explained by x_2), again mapping it into the frame of x_1 using Σ_{12}.

2.2 Natural Parameterization

Recall that the natural parameterization of a Gaussian is:

Ñ(J, P):   p(θ) = (1/Z) exp( J^T θ - (1/2) θ^T P θ ),   with P_0 = Σ_0^{-1} and J_0 = P_0 µ_0

Given:

[x_1; x_2] ∼ Ñ( [J_1; J_2], [P_{11} P_{12}; P_{21} P_{22}] )

Marginal: computing p(x_2)

J_2^{marg} = J_2 - P_{21} P_{11}^{-1} J_1
P_2^{marg} = P_{22} - P_{21} P_{11}^{-1} P_{12}

These are most easily calculated by deriving the marginals in the moment parameterization and converting to the natural parameterization.

Conditional: computing p(x_1 | x_2)

J_{1|2} = J_1 - P_{12} x_2
P_{1|2} = P_{11}

The derivation of these conditionals is a straightforward expansion of the full moment parameterization, and was said to be a possible test question. I encourage you to read page 7 of [1] for the full derivation. Also note that the natural parameterization is often called the canonical parameterization.
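As a quick numerical sanity check of the formulas above (again my own sketch, not part of the notes; the µ, Σ, and conditioning value are arbitrary), the following Python snippet converts a small joint Gaussian between the two parameterizations and verifies that both routes give the same marginal and conditional:

    import numpy as np

    mu = np.array([1.0, -2.0])
    Sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])

    # Moment -> natural parameters: P = Sigma^{-1}, J = P mu.
    P = np.linalg.inv(Sigma)
    J = P @ mu

    # Partition into blocks: x_1 is the first coordinate, x_2 the second.
    S11, S12, S21, S22 = Sigma[:1, :1], Sigma[:1, 1:], Sigma[1:, :1], Sigma[1:, 1:]
    P11, P12, P21, P22 = P[:1, :1], P[:1, 1:], P[1:, :1], P[1:, 1:]
    J1, J2 = J[:1], J[1:]

    # Marginal p(x_2): trivial in moment form, Schur complement in natural form.
    mu2_marg, S2_marg = mu[1:], S22
    J2_marg = J2 - P21 @ np.linalg.inv(P11) @ J1
    P2_marg = P22 - P21 @ np.linalg.inv(P11) @ P12
    assert np.allclose(np.linalg.inv(P2_marg), S2_marg)
    assert np.allclose(np.linalg.solve(P2_marg, J2_marg), mu2_marg)

    # Conditional p(x_1 | x_2): Schur complement in moment form, trivial in natural form.
    x2 = np.array([0.5])
    mu_cond = mu[:1] + S12 @ np.linalg.inv(S22) @ (x2 - mu[1:])
    S_cond = S11 - S12 @ np.linalg.inv(S22) @ S21
    J_cond = J1 - P12 @ x2
    P_cond = P11
    assert np.allclose(np.linalg.inv(P_cond), S_cond)
    assert np.allclose(np.linalg.solve(P_cond, J_cond), mu_cond)

Both routes agree because of the standard block-inverse (Schur complement) identities; note the symmetry that marginalization is trivial in moment form while conditioning is trivial in natural form.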

3 Lazy Gauss Markov Filter

Motion Model (Prediction step): Before the observation is taken (we write a superscript - for the predicted, pre-observation belief and + for the post-observation belief):

µ_{t+1}^- = A µ_t

Proof:

Mean:

E[X_{t+1}] = E[A X_t + ε] = A E[X_t] + E[ε] = A E[X_t]   (since the mean of ε is 0)
           = A µ_t

Variance:

Σ_{t+1}^- = E[(X_{t+1} - µ_{t+1}^-)(X_{t+1} - µ_{t+1}^-)^T]
          = E[(A(X_t - µ_t) + ε)(A(X_t - µ_t) + ε)^T]
          = A E[(X_t - µ_t)(X_t - µ_t)^T] A^T + Var(ε)   (ε is independent of X_t and zero mean, so the cross terms vanish)
          = A Σ_t A^T + Q

Therefore the motion update becomes:

µ_{t+1}^- = A µ_t
Σ_{t+1}^- = A Σ_t A^T + Q

3.1 Observation Model (Correction step)

For the observation model the natural parameterization is more suitable, since the update involves multiplying Gaussians. In terms of the natural parameters J and P:

P(y_{t+1} | x_{t+1}) P(x_{t+1}) ∝ exp( J^T x_{t+1} - (1/2) x_{t+1}^T P x_{t+1} ) · exp( -(1/2) (y_{t+1} - C x_{t+1})^T R^{-1} (y_{t+1} - C x_{t+1}) )

Expanding the exponent of the likelihood term:

-(1/2) [ -2 y_{t+1}^T R^{-1} C x_{t+1} + x_{t+1}^T C^T R^{-1} C x_{t+1} + y_{t+1}^T R^{-1} y_{t+1} ]
= -(1/2) [ -2 y_{t+1}^T R^{-1} C x_{t+1} + x_{t+1}^T C^T R^{-1} C x_{t+1} ] + const

The y_{t+1}^T R^{-1} y_{t+1} term is constant with respect to x_{t+1}, so it can be absorbed into the normalization constant. Matching the linear and quadratic terms in x_{t+1}, the observation update is:

J_{t+1}^+ = J_{t+1}^- + (y_{t+1}^T R^{-1} C)^T = J_{t+1}^- + C^T R^{-1} y_{t+1}
P_{t+1}^+ = P_{t+1}^- + C^T R^{-1} C
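Putting the two updates together, here is a sketch of one predict/correct cycle of the lazy Gauss Markov filter (my own code, using the notation above and made-up numbers): the prediction is done with the moment parameters and the correction with the natural parameters, converting between them by inverting the covariance or precision matrix.

    import numpy as np

    def predict(mu, Sigma, A, Q):
        """Motion update: mu^- = A mu, Sigma^- = A Sigma A^T + Q."""
        return A @ mu, A @ Sigma @ A.T + Q

    def correct(J, P, y, C, R):
        """Observation update in natural parameters:
           J^+ = J^- + C^T R^{-1} y,  P^+ = P^- + C^T R^{-1} C."""
        Rinv = np.linalg.inv(R)
        return J + C.T @ Rinv @ y, P + C.T @ Rinv @ C

    def moment_to_natural(mu, Sigma):
        P = np.linalg.inv(Sigma)
        return P @ mu, P

    def natural_to_moment(J, P):
        Sigma = np.linalg.inv(P)
        return Sigma @ J, Sigma

    # One filter step with made-up numbers (illustrative only).
    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    C = np.array([[1.0, 0.0]])
    Q, R = 0.01 * np.eye(2), np.array([[0.25]])
    mu, Sigma = np.zeros(2), np.eye(2)               # current belief N(mu, Sigma)

    mu_pred, Sigma_pred = predict(mu, Sigma, A, Q)   # prediction step (moment form)
    J_pred, P_pred = moment_to_natural(mu_pred, Sigma_pred)
    J_post, P_post = correct(J_pred, P_pred, np.array([0.7]), C, R)
    mu_post, Sigma_post = natural_to_moment(J_post, P_post)

The explicit conversion step is exactly the matrix inversion whose cost drives the tradeoff discussed in Section 3.2 below.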

3.2 Performance

The lazy Gauss Markov filter can be expressed in two forms:

- When expressed in terms of the moment parameters, µ and Σ, it acts as the Kalman Filter.
- When expressed in terms of the natural parameters, J and P, it acts as the Information Filter.

Kalman filters, as we will see, require matrix multiplications, approximately O(n^2) time, to do a prediction step, yet require matrix inversions, approximately O(n^{2.8}) time, to perform the observation update. Information filters are the exact opposite, requiring matrix inversions for the prediction step and matrix multiplications for the observation update. As mentioned above, the conversion between the moment and natural parameterizations requires an inversion of the covariance matrix, so switching between the two can be costly. Depending on the ratio of motion model updates to observation model updates, one filter may be faster than the other.

References

[1] Michael I. Jordan and Christopher M. Bishop, An Introduction to Graphical Models. July 10, 2001. http://people.csail.mit.edu/yks/documents/classes/mlbook/pdf/chapter12.pdf