
Workshop report
1. Daniel's report is on the website.
2. Don't expect to write it based on listening to only one project (we had 6 presentations; only 2 were of sufficient quality).
3. I suggest writing it on one presentation.
4. Include figures (from a related paper or from their presentation).
5. Include references.
Update: We are all set to have your students attend. We will not register them, so they can come and go as needed. Food is for the registered participants, so please allow them to eat first. Currently we have 70 registered participants and plan to order food for ~100.

May 22: Dictionary learning, Mike Bianco (half class); Bishop Ch 13.
May 24: Class HW, Bishop Ch 8/13.
May 30: CODY.
May 31: No class. Workshop: Big Data and The Earth Sciences: Grand Challenges.
June 5: Discuss workshop, discuss final project. Spiess Hall open for project discussion 11 am-7 pm.
June 7: Workshop report. No class.
June 12: Spiess Hall open for project discussion 9-11:30 am and 2-7 pm.
June 16: Final report delivered. Beer time.
For final project discussion Mark and I will be available every afternoon.

Final Report. In class on July 5 a status report from each group is mandatory. Maximum 2 min/person (i.e., a 5-member group has 10 min); shorter is fine. Have the presentation on a memory stick or email Mark. Class might run longer, so we could start earlier. For the final project (due 16 June, 5 pm), delivery is by Dropbox request, <2 GB (details to follow):
A) Deliver code:
1) Assume we have reasonable compilers installed (we use Mac OS X).
2) Give instructions if any additional software should be installed.
3) You can ask us to download a dataset, or include it in this submission.
4) Don't include all developed code, just key elements.
5) We should not have to reprogram your code.
B) Report:
1) The report should include all the following sections: Summary -> Introduction -> Physical and Mathematical Framework -> Results.
2) The Summary is a combination of an abstract and a conclusion.
3) Plagiarism is not acceptable! When citing, use quotation marks for quotes and citations for relevant papers.
4) Don't write anything you don't understand.
5) Everyone in the group should understand everything that is written. If we do not understand a section during grading, we should be able to ask any member of the group to clarify. You can delegate the writing, but not the understanding.
6) Use citations. Any concepts which are not fully explained should have a citation with an explanation.
7) Please be concise. Equations are good. Figures are essential. Write as though your report is to be published in a scientific journal.
8) I have attached a sample report from Mark, though shorter is preferred.

Discrete Variables (1). General joint distribution over two K-state variables: K^2 - 1 parameters. Independent joint distribution: 2(K - 1) parameters. General joint distribution over M variables: K^M - 1 parameters. M-node Markov chain: (K - 1) + (M - 1)K(K - 1) parameters.
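
As a quick worked check of these counts (an added illustration, not from the slide), take K = 2 states and M = 3 variables:

\[
K^M - 1 = 2^3 - 1 = 7 \quad\text{(general joint)}, \qquad
(K-1) + (M-1)K(K-1) = 1 + 2\cdot 2\cdot 1 = 5 \quad\text{(3-node Markov chain)}.
\]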

Joint Distribution: p(x) = (1/Z) prod_C psi_C(x_C), where psi_C(x_C) is the potential over clique C and Z = sum_x prod_C psi_C(x_C) is the normalization coefficient; note: for M K-state variables there are K^M terms in Z. Energies and the Boltzmann distribution: psi_C(x_C) = exp{-E(x_C)}.

Illustration: Image De-Noising (figure panels: noisy image; restored image (ICM); restored image (graph cuts)).

Inference in Graphical Models

Inference on a Chain

Inference on a Chain

Inference on a Chain. To compute local marginals: compute and store all forward messages mu_alpha(x_n); compute and store all backward messages mu_beta(x_n); compute Z at any node x_m; compute p(x_n) = (1/Z) mu_alpha(x_n) mu_beta(x_n) for all variables required.
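
As a concrete illustration of this recipe (an added sketch, not from the slides), here is a minimal NumPy implementation of the forward/backward messages for a chain with pairwise potentials; the potential matrix A and the chain length below are made-up example inputs.

import numpy as np

def chain_marginals(psi):
    """Marginals of p(x_1..x_N) proportional to prod_i psi[i][x_i, x_{i+1}].

    psi: list of (K, K) nonnegative arrays, one per edge of the chain.
    Returns (marginals, Z): a length-K marginal vector per node, and the normalizer Z.
    """
    N = len(psi) + 1                      # number of nodes
    K = psi[0].shape[0]                   # number of states per node
    # Forward messages mu_alpha: message arriving at node n from the left.
    alpha = [np.ones(K)]
    for A in psi:
        alpha.append(alpha[-1] @ A)       # sum out the previous variable
    # Backward messages mu_beta: message arriving at node n from the right.
    beta = [np.ones(K)]
    for A in reversed(psi):
        beta.append(A @ beta[-1])
    beta = beta[::-1]
    Z = float(alpha[-1] @ np.ones(K))     # same value obtained at every node
    marginals = [alpha[n] * beta[n] / Z for n in range(N)]
    return marginals, Z

# Toy usage: 4-node chain of binary variables with identical edge potentials.
A = np.array([[1.0, 0.5],
              [0.5, 2.0]])
m, Z = chain_marginals([A, A, A])
print(Z, [p.round(3) for p in m])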

The Sum-Product Algorithm (1). Objective: (i) to obtain an efficient, exact inference algorithm for finding marginals; (ii) in situations where several marginals are required, to allow computations to be shared efficiently. Key idea: the distributive law (7 versus 3 operations in the slide's example).
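
In its simplest textbook form (an illustrative instance; the 7-versus-3 count refers to the slide's own, larger example), the distributive law reads

\[
ab + ac = a(b + c),
\]

which costs two multiplications and one addition on the left but only one of each on the right; sum-product applies the same rearrangement to push sums inside products of factors, so each partial sum is computed once and reused.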

The Sum-Product Algorithm (example factor graph: variables u, w, x, y, z with factors f_1(u, w), f_2(w, x), f_3(x, y), f_4(x, z)).

How do we solve it and what does the solution look like? KFs/PFs offer solutions to dynamical systems, nonlinear in general, using prediction and update as data become available. Tracking in time or space offers an ideal framework for studying KFs/PFs.

Kalman Framework. General (nonlinear) case:
x_k = f_{k-1}(x_{k-1}, w_{k-1})  (state equation)
y_k = h_k(x_k, v_k)  (measurement equation)
Linear case:
x_k = F_{k-1} x_{k-1} + w_{k-1}  (state equation)
y_k = H_k x_k + v_k  (measurement equation)
With x, y, v, w Gaussian and F, H linear, the optimal filter is the Kalman filter (Kalman, 1960).

A Single Kalman Iteration. (Figure: scatter of previous states x_{k-2}, x_{k-1} and their estimates x̂_{k-1}, with annotations x_k ~ N(x̂_k, P_k), p(x_k | x_{k-1}, x_{k-2}, ..., x_0), and p(x_k | y_k, y_{k-1}, ...).)
1. Predict the mean using the previous history: x̂_{k|k-1} = E{x_k | x_{k-1}} = ∫ x_k p(x_k | x_{k-1}) dx_k.
2. Predict the covariance P_{k|k-1} using the previous history.
3. Correct/update the mean using the new data y_k: x̂_{k|k} = E{x_k | Y_k} = ∫ x_k p(x_k | Y_k) dx_k.
4. Correct/update the covariance P_{k|k} using y_k.
PREDICT/UPDATE: ... ⇒ p(x_{k-1} | Y_{k-1}) ⇒ (predict) ⇒ p(x_k | Y_{k-1}) ⇒ (update) ⇒ p(x_k | Y_k) ⇒ ..., a predictor-corrector density propagator, where Y_k = {y_1, ..., y_k}.

The Model. Consider the discrete, linear system
x_{k+1} = M_k x_k + w_k,  k = 0, 1, 2, ...,  (1)
where
x_k ∈ R^n is the state vector at time t_k;
M_k ∈ R^{n×n} is the state transition matrix (mapping from time t_k to t_{k+1}), or model;
{w_k ∈ R^n; k = 0, 1, 2, ...} is a white, Gaussian sequence, with w_k ~ N(0, Q_k), often referred to as model error;
Q_k ∈ R^{n×n} is a symmetric positive definite covariance matrix (known as the model error covariance matrix).
(Some of the following slides are from Sarah Dance, University of Reading.)

The Observations. We also have discrete, linear observations that satisfy
y_k = H_k x_k + v_k,  k = 1, 2, 3, ...,  (2)
where
y_k ∈ R^p is the vector of actual measurements or observations at time t_k;
H_k ∈ R^{p×n} is the observation operator; note that this is not in general a square matrix;
{v_k ∈ R^p; k = 1, 2, ...} is a white, Gaussian sequence, with v_k ~ N(0, R_k), often referred to as observation error;
R_k ∈ R^{p×p} is a symmetric positive definite covariance matrix (known as the observation error covariance matrix).
We assume that the initial state x_0 and the noise vectors at each step, {w_k}, {v_k}, are mutually independent.
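
To make equations (1)-(2) concrete, here is a minimal NumPy sketch that simulates a trajectory and its observations; the particular M, H, Q, R, dimensions, and initial state below are made-up illustrative choices, not values from the lecture.

import numpy as np

rng = np.random.default_rng(0)

# Assumed 2-D constant-velocity example: state x = [position, velocity],
# scalar position observations.
dt = 1.0
M = np.array([[1.0, dt],
              [0.0, 1.0]])          # state transition matrix M_k (eq. 1)
H = np.array([[1.0, 0.0]])          # observation operator H_k (eq. 2)
Q = 0.01 * np.eye(2)                # model error covariance Q_k
R = np.array([[0.25]])              # observation error covariance R_k

n_steps = 50
x = np.array([0.0, 1.0])            # initial state x_0
xs, ys = [], []
for k in range(n_steps):
    # x_{k+1} = M_k x_k + w_k, with w_k ~ N(0, Q_k)
    x = M @ x + rng.multivariate_normal(np.zeros(2), Q)
    # y_{k+1} = H_{k+1} x_{k+1} + v_{k+1}, with v ~ N(0, R)
    y = H @ x + rng.multivariate_normal(np.zeros(1), R)
    xs.append(x)
    ys.append(y)

xs, ys = np.array(xs), np.array(ys)
print(xs.shape, ys.shape)           # (50, 2) trajectory, (50, 1) observations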

The Prediction and Filtering Problems. We suppose that there is some uncertainty in the initial state, i.e.,
x_0 ~ N(0, P_0),  (3)
with P_0 ∈ R^{n×n} a symmetric positive definite covariance matrix. The problem is now to compute an improved estimate of the stochastic variable x_k, provided y_1, ..., y_j have been measured:
x̂_{k|j} = x̂_{k | y_1, ..., y_j}.  (4)
When j = k this is called the filtered estimate. When j = k - 1 this is the one-step predicted, or (here) the predicted estimate.

The Kalman filter (Kalman, 1960) provides estimates for the linear discrete prediction and filtering problem. We will take a minimum variance approach to deriving the filter. We assume that all the relevant probability densities are Gaussian, so that we can simply consider the mean and covariance. Rigorous justification and other approaches to deriving the filter are discussed by Jazwinski (1970), Chapter 7.

Prediction step. We first derive the equation for one-step prediction of the mean using the state propagation model (1):
x̂_{k+1|k} = E[x_{k+1} | y_1, ..., y_k] = E[M_k x_k + w_k | y_1, ..., y_k] = M_k x̂_{k|k}.  (5)

The one-step prediction of the covariance is defined by
P_{k+1|k} = E[(x_{k+1} - x̂_{k+1|k})(x_{k+1} - x̂_{k+1|k})^T | y_1, ..., y_k].  (6)
Exercise: Using the state propagation model (1) and the one-step prediction of the mean (5), show that
P_{k+1|k} = M_k P_{k|k} M_k^T + Q_k.  (7)
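
One way to work this exercise (an added sketch, using only (1), (5), and the stated independence of w_k from the earlier states):

\[
x_{k+1} - \hat{x}_{k+1|k} = M_k x_k + w_k - M_k \hat{x}_{k|k} = M_k\,(x_k - \hat{x}_{k|k}) + w_k,
\]
\[
P_{k+1|k} = E\big[\big(M_k(x_k - \hat{x}_{k|k}) + w_k\big)\big(M_k(x_k - \hat{x}_{k|k}) + w_k\big)^T \,\big|\, y_1,\dots,y_k\big] = M_k P_{k|k} M_k^T + Q_k,
\]

where the cross terms vanish because w_k has zero mean and is independent of the estimation error x_k - x̂_{k|k}.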

Product of Gaussians = Gaussian. For the general linear inverse problem (one-data-point problem) we would have
Prior: p(m) ∝ exp{ -(1/2)(m - m_0)^T C_m^{-1} (m - m_0) }
Likelihood: p(d|m) ∝ exp{ -(1/2)(d - Gm)^T C_d^{-1} (d - Gm) }
Posterior PDF:
p(m|d) ∝ exp{ -(1/2)[(d - Gm)^T C_d^{-1} (d - Gm) + (m - m_0)^T C_m^{-1} (m - m_0)] }
       ∝ exp{ -(1/2)(m - m̂)^T S^{-1} (m - m̂) },
with
S^{-1} = G^T C_d^{-1} G + C_m^{-1},
m̂ = (G^T C_d^{-1} G + C_m^{-1})^{-1} (G^T C_d^{-1} d + C_m^{-1} m_0)
   = m_0 + (G^T C_d^{-1} G + C_m^{-1})^{-1} G^T C_d^{-1} (d - G m_0).
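
A small NumPy check (added for illustration only; G, C_d, C_m, m_0, and d below are random toy values) that the two expressions for m̂ above agree:

import numpy as np

rng = np.random.default_rng(1)
n, p = 3, 5                                   # model and data dimensions (arbitrary)
G = rng.standard_normal((p, n))               # forward operator
Cd = 0.1 * np.eye(p)                          # data covariance
Cm = np.eye(n)                                # prior model covariance
m0 = rng.standard_normal(n)                   # prior mean
d = G @ rng.standard_normal(n) + 0.3 * rng.standard_normal(p)   # toy data

A = G.T @ np.linalg.inv(Cd) @ G + np.linalg.inv(Cm)   # S^{-1}
S = np.linalg.inv(A)                                  # posterior covariance

m_hat_1 = S @ (G.T @ np.linalg.inv(Cd) @ d + np.linalg.inv(Cm) @ m0)
m_hat_2 = m0 + S @ G.T @ np.linalg.inv(Cd) @ (d - G @ m0)

print(np.allclose(m_hat_1, m_hat_2))          # True: both forms give the same posterior mean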

Filtering Step. At the time of an observation, we assume that the update to the mean may be written as a linear combination of the observation and the previous estimate:
x̂_{k|k} = x̂_{k|k-1} + K_k (y_k - H_k x̂_{k|k-1}),  (8)
where K_k ∈ R^{n×p} is known as the Kalman gain and will be derived shortly.

But first we consider the covariance associated with this estimate:
P_{k|k} = E[(x_k - x̂_{k|k})(x_k - x̂_{k|k})^T | y_1, ..., y_k].  (9)
Using the observation update for the mean (8) we have
x_k - x̂_{k|k} = x_k - x̂_{k|k-1} - K_k (y_k - H_k x̂_{k|k-1})
             = x_k - x̂_{k|k-1} - K_k (H_k x_k + v_k - H_k x̂_{k|k-1}),  replacing the observations with their model equivalent,
             = (I - K_k H_k)(x_k - x̂_{k|k-1}) - K_k v_k.  (10)
Thus, since the error in the prior estimate, x_k - x̂_{k|k-1}, is uncorrelated with the measurement noise, we find
P_{k|k} = (I - K_k H_k) E[(x_k - x̂_{k|k-1})(x_k - x̂_{k|k-1})^T] (I - K_k H_k)^T + K_k E[v_k v_k^T] K_k^T.  (11)

Simplification of the a posteriori error covariance formula. Using this value of the Kalman gain we are in a position to simplify the Joseph form as
P_{k|k} = (I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T = (I - K_k H_k) P_{k|k-1}.  (15)
Exercise: Show this. Note that the covariance update equation is independent of the actual measurements, so P_{k|k} could be computed in advance.
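
One way to work this exercise (an added sketch, using the optimal gain K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1} quoted in the summary below):

\[
(I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T
= (I - K_k H_k) P_{k|k-1} - P_{k|k-1} H_k^T K_k^T + K_k \big(H_k P_{k|k-1} H_k^T + R_k\big) K_k^T
= (I - K_k H_k) P_{k|k-1},
\]

since K_k (H_k P_{k|k-1} H_k^T + R_k) = P_{k|k-1} H_k^T, so the last two terms cancel.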

Summary of the Kalman filter.
Prediction step:
  Mean update: x̂_{k+1|k} = M_k x̂_{k|k}
  Covariance update: P_{k+1|k} = M_k P_{k|k} M_k^T + Q_k
Observation update step:
  Mean update: x̂_{k|k} = x̂_{k|k-1} + K_k (y_k - H_k x̂_{k|k-1})
  Kalman gain: K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}
  Covariance update: P_{k|k} = (I - K_k H_k) P_{k|k-1}
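
A minimal NumPy sketch of the summary above (an added illustration; the constant-velocity model matrices and the toy observations are assumptions, not values from the lecture):

import numpy as np

def kalman_filter(ys, M, H, Q, R, x0, P0):
    """Linear Kalman filter following the summary equations.

    ys: sequence of observation vectors y_1, ..., y_N.
    Returns the filtered means x̂_{k|k} and covariances P_{k|k}.
    """
    x_hat, P = x0, P0
    means, covs = [], []
    I = np.eye(len(x0))
    for y in ys:
        # Prediction step: x̂_{k|k-1} = M x̂_{k-1|k-1}, P_{k|k-1} = M P M^T + Q
        x_hat = M @ x_hat
        P = M @ P @ M.T + Q
        # Observation update step: gain, mean, covariance
        S = H @ P @ H.T + R                      # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
        x_hat = x_hat + K @ (y - H @ x_hat)      # mean update
        P = (I - K @ H) @ P                      # covariance update
        means.append(x_hat)
        covs.append(P)
    return np.array(means), np.array(covs)

# Example usage with an assumed constant-velocity model and toy data.
dt = 1.0
M = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
ys = [np.array([k * dt + 0.5 * np.random.randn()]) for k in range(1, 21)]
means, covs = kalman_filter(ys, M, H, Q, R, x0=np.zeros(2), P0=np.eye(2))
print(means[-1], covs[-1])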