PROBABILISTIC REASONING OVER TIME

In which we try to interpret the present, understand the past, and perhaps predict the future, even when very little is crystal clear.

Outline

- Time and uncertainty
- Inference: filtering, prediction, smoothing
- Hidden Markov models
- Kalman filters (a brief mention)
- Dynamic Bayesian networks
- Particle filtering

Time and uncertainty

The world changes; we need to track and predict it.
Diabetes management vs. vehicle diagnosis.

Basic idea: copy state and evidence variables for each time step.

  X_t = set of unobservable state variables at time t
        e.g., BloodSugar_t, StomachContents_t, etc.
  E_t = set of observable evidence variables at time t
        e.g., MeasuredBloodSugar_t, PulseRate_t, FoodEaten_t

This assumes discrete time; step size depends on the problem.

Notation: X_{a:b} = X_a, X_{a+1}, ..., X_{b-1}, X_b

Markov processes (Markov chains)

Construct a Bayes net from these variables: parents?

Markov assumption: X_t depends on a bounded subset of X_{0:t-1}

First-order Markov process:  P(X_t | X_{0:t-1}) = P(X_t | X_{t-1})
Second-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-2}, X_{t-1})

[Figure: chains over X_{t-2}, ..., X_{t+2}; the first-order process links each X_t to X_{t-1} only, the second-order process adds a link from X_{t-2}]

Sensor Markov assumption: P(E_t | X_{0:t}, E_{0:t-1}) = P(E_t | X_t)

Stationary process: transition model P(X_t | X_{t-1}) and sensor model P(E_t | X_t) fixed for all t

Example

[Figure: umbrella network. Rain_{t-1} -> Rain_t -> Rain_{t+1}, with Umbrella_t observed from Rain_t at each step]

Transition model P(R_t | R_{t-1}):    Sensor model P(U_t | R_t):

  R_{t-1}   P(R_t)                      R_t   P(U_t)
  t         0.7                         t     0.9
  f         0.3                         f     0.2

First-order Markov assumption not exactly true in the real world!

Possible fixes:
1. Increase order of Markov process
2. Augment state, e.g., add Temp_t, Pressure_t

Example: robot motion. Augment position and velocity with Battery_t
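The sketches that follow use this umbrella model written out in code. A minimal setup, assuming NumPy; the names T, P_umbrella_given_rain, and sensor_matrix are mine, not the book's:

```python
import numpy as np

# State order is [rain, no rain]. T[i, j] = P(X_t = j | X_{t-1} = i).
T = np.array([[0.7, 0.3],
              [0.3, 0.7]])

# P(Umbrella_t = true | Rain_t) for Rain_t in [rain, no rain].
P_umbrella_given_rain = np.array([0.9, 0.2])

def sensor_matrix(umbrella_observed: bool) -> np.ndarray:
    """Diagonal observation matrix O_t for the evidence value e_t."""
    p = P_umbrella_given_rain if umbrella_observed else 1.0 - P_umbrella_given_rain
    return np.diag(p)
```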

Inference tasks

Filtering: P(X_t | e_{1:t})
  belief state, the input to the decision process of a rational agent

Prediction: P(X_{t+k} | e_{1:t}) for k > 0
  evaluation of possible action sequences; like filtering without the evidence

Smoothing: P(X_k | e_{1:t}) for 0 <= k < t
  better estimate of past states, essential for learning

Most likely explanation: arg max_{x_{1:t}} P(x_{1:t} | e_{1:t})
  speech recognition, decoding with a noisy channel

Filtering

Aim: devise a recursive state estimation algorithm:

  P(X_{t+1} | e_{1:t+1}) = f(e_{t+1}, P(X_t | e_{1:t}))

  P(X_{t+1} | e_{1:t+1}) = P(X_{t+1} | e_{1:t}, e_{t+1})
                         = α P(e_{t+1} | X_{t+1}, e_{1:t}) P(X_{t+1} | e_{1:t})
                         = α P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

I.e., prediction + estimation. Prediction by summing out X_t:

  P(X_{t+1} | e_{1:t+1}) = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t, e_{1:t}) P(x_t | e_{1:t})
                         = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})

  f_{1:t+1} = FORWARD(f_{1:t}, e_{t+1})   where f_{1:t} = P(X_t | e_{1:t})

Time and space requirements are constant (independent of t)
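In matrix form (see the HMM slide below), one FORWARD step for the umbrella model is a few lines of code. A minimal sketch, using the arrays defined above:

```python
def forward(f: np.ndarray, umbrella_observed: bool) -> np.ndarray:
    """One filtering step: f_{1:t+1} = α O_{t+1} T^T f_{1:t}."""
    prediction = T.T @ f                        # sum out x_t
    updated = sensor_matrix(umbrella_observed) @ prediction
    return updated / updated.sum()              # normalization supplies the α

# Reproduces the filtering example below:
# f = forward(np.array([0.5, 0.5]), True)    -> approx [0.818, 0.182]
# f = forward(f, True)                       -> approx [0.883, 0.117]
```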

Filtering example

[Figure: umbrella network unrolled for two steps with Umbrella_1 = Umbrella_2 = true.
 Prior P(Rain_0) = <0.500, 0.500>; prediction P(Rain_1) = <0.500, 0.500>;
 filtered P(Rain_1 | u_1) = <0.818, 0.182>; prediction P(Rain_2 | u_1) = <0.627, 0.373>;
 filtered P(Rain_2 | u_{1:2}) = <0.883, 0.117>]

Smoothing example

[Figure: umbrella network with Umbrella_1 = Umbrella_2 = true.
 forward:  <0.500, 0.500>  <0.818, 0.182>  <0.883, 0.117>
 smoothed:                 <0.883, 0.117>  <0.883, 0.117>
 backward:                 <0.690, 0.410>  <1.000, 1.000>]

Forward-backward algorithm: cache forward messages along the way

Time linear in t (polytree inference), space O(t|f|)
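A minimal sketch of forward-backward for the umbrella model, combining each cached forward message with the backward message (b_{k+1:t} = T O_{k+1} b_{k+2:t}, from the HMM slide below); the helper names are mine:

```python
def backward(b: np.ndarray, umbrella_observed: bool) -> np.ndarray:
    """One backward step: b_{k+1:t} = T O_{k+1} b_{k+2:t}."""
    return T @ sensor_matrix(umbrella_observed) @ b

def forward_backward(evidence: list, prior: np.ndarray) -> list:
    """Smoothed estimates P(X_k | e_{1:t}) for k = 1..t."""
    fs, f = [], prior
    for e in evidence:                          # forward pass, caching messages
        f = forward(f, e)
        fs.append(f)
    b = np.ones(len(prior))                     # b_{t+1:t} = 1
    smoothed = [None] * len(evidence)
    for k in range(len(evidence) - 1, -1, -1):  # backward pass
        s = fs[k] * b
        smoothed[k] = s / s.sum()
        b = backward(b, evidence[k])
    return smoothed

# forward_backward([True, True], np.array([0.5, 0.5]))[0]
#   -> approx [0.883, 0.117], matching the figure
```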

Most likely explanation

Most likely sequence ≠ sequence of most likely states!!!!

Most likely path to each x_{t+1} = most likely path to some x_t plus one more step:

  max_{x_1...x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
    = P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) ]

Identical to filtering, except f_{1:t} is replaced by

  m_{1:t} = max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, X_t | e_{1:t}),

i.e., m_{1:t}(i) gives the probability of the most likely path to state i.

Update has the sum replaced by max, giving the Viterbi algorithm:

  m_{1:t+1} = P(e_{t+1} | X_{t+1}) max_{x_t} [ P(X_{t+1} | x_t) m_{1:t}(x_t) ]

Viterbi example

[Figure: state-space trellis for Rain_1 ... Rain_5 with evidence u_{1:5} = (true, true, false, true, true); bold arcs mark the most likely path to each state.
 m_{1:1} = <.8182, .1818>   m_{1:2} = <.5155, .0491>   m_{1:3} = <.0361, .1237>
 m_{1:4} = <.0334, .0173>   m_{1:5} = <.0210, .0024>]
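A minimal Viterbi sketch for the umbrella model, again using the arrays above; it reproduces the m messages in the trellis (state 0 = rain, 1 = no rain):

```python
def viterbi(evidence: list, prior: np.ndarray) -> list:
    """Most likely state sequence given the evidence (Viterbi algorithm)."""
    m = forward(prior, evidence[0])         # m_{1:1} equals the first filter step
    backpointers = []
    for e in evidence[1:]:
        scores = T * m[:, None]             # scores[i, j] = P(j | i) * m(i)
        backpointers.append(scores.argmax(axis=0))
        m = np.diag(sensor_matrix(e)) * scores.max(axis=0)
    path = [int(m.argmax())]                # best final state, then walk back
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path))

# viterbi([True, True, False, True, True], np.array([0.5, 0.5]))
#   -> [0, 0, 1, 0, 0]: rain, rain, no rain, rain, rain
```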

Hidden Markov models

X_t is a single, discrete variable (usually E_t is too)
Domain of X_t is {1, ..., S}

Transition matrix T_{ij} = P(X_t = j | X_{t-1} = i), e.g.,

  T = ( 0.7  0.3 )
      ( 0.3  0.7 )

Sensor matrix O_t for each time step, diagonal elements P(e_t | X_t = i),
e.g., with U_1 = true:

  O_1 = ( 0.9  0   )
        ( 0    0.2 )

Forward and backward messages as column vectors:

  f_{1:t+1} = α O_{t+1} T^T f_{1:t}
  b_{k+1:t} = T O_{k+1} b_{k+2:t}

Forward-backward algorithm needs time O(S^2 t) and space O(S t)
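These two matrix equations are exactly what the forward and backward sketches above compute (with S = 2 for the umbrella world); for arbitrary S nothing changes except the shapes. A one-line check against the earlier code, under the same assumptions:

```python
# With T (S x S), O = sensor_matrix(e), and f as a length-S vector:
f_next = sensor_matrix(True) @ T.T @ np.array([0.5, 0.5])
f_next /= f_next.sum()            # same result as forward(np.array([0.5, 0.5]), True)
```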

Country dance algorithm

Can avoid storing all forward messages in smoothing by running the forward algorithm backwards:

  f_{1:t+1} = α O_{t+1} T^T f_{1:t}
  O_{t+1}^{-1} f_{1:t+1} = α T^T f_{1:t}
  α' (T^T)^{-1} O_{t+1}^{-1} f_{1:t+1} = f_{1:t}

Algorithm: forward pass computes f_t, backward pass does f_i, b_i
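A minimal sketch of the inverted forward step (helper name mine): recover f_{1:t} from f_{1:t+1}, so smoothing needs to store only the final forward message.

```python
def forward_inverse(f_next: np.ndarray, umbrella_observed: bool) -> np.ndarray:
    """f_{1:t} = α' (T^T)^{-1} O_{t+1}^{-1} f_{1:t+1}."""
    f = np.linalg.solve(T.T, np.linalg.solve(sensor_matrix(umbrella_observed), f_next))
    return f / f.sum()              # the α' renormalization

# forward_inverse(forward(f, True), True) recovers f (up to normalization)
```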


Kalman filters

Modelling systems described by a set of continuous variables,
e.g., tracking a bird flying: X_t = (X, Y, Z, Ẋ, Ẏ, Ż).
Airplanes, robots, ecosystems, economies, chemical plants, planets, ...

[Figure: DBN with continuous state chain X_t -> X_{t+1} and observations Z_t, Z_{t+1}]

Gaussian prior, linear Gaussian transition model and sensor model

Updating Gaussian distributions

Prediction step: if P(X_t | e_{1:t}) is Gaussian, then the prediction

  P(X_{t+1} | e_{1:t}) = ∫ P(X_{t+1} | x_t) P(x_t | e_{1:t}) dx_t

is Gaussian. If P(X_{t+1} | e_{1:t}) is Gaussian, then the updated distribution

  P(X_{t+1} | e_{1:t+1}) = α P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

is Gaussian.

Hence P(X_t | e_{1:t}) is multivariate Gaussian N(µ_t, Σ_t) for all t.

General (nonlinear, non-Gaussian) process: description of the posterior grows unboundedly as t → ∞

Simple 1-D example

Gaussian random walk on the X axis, step s.d. σ_x, sensor s.d. σ_z:

  µ_{t+1} = ((σ_t² + σ_x²) z_{t+1} + σ_z² µ_t) / (σ_t² + σ_x² + σ_z²)

  σ_{t+1}² = (σ_t² + σ_x²) σ_z² / (σ_t² + σ_x² + σ_z²)

[Figure: P(x_0), the flatter prediction P(x_1), and the posterior P(x_1 | z_1 = 2.5) shifted toward the observation z_1]
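The update formulas as a function; a minimal sketch (arguments are variances rather than s.d.s, and the numbers in the usage comment are hypothetical):

```python
def kalman_1d(mu: float, var: float, z: float, var_x: float, var_z: float):
    """One 1-D Kalman step: var is σ_t², var_x and var_z are σ_x², σ_z²."""
    predicted_var = var + var_x
    mu_new = (predicted_var * z + var_z * mu) / (predicted_var + var_z)
    var_new = predicted_var * var_z / (predicted_var + var_z)
    return mu_new, var_new

# e.g., kalman_1d(mu=0.0, var=1.0, z=2.5, var_x=2.0, var_z=1.0)
```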

General Kalman update

Transition and sensor models:

  P(x_{t+1} | x_t) = N(F x_t, Σ_x)(x_{t+1})
  P(z_t | x_t)     = N(H x_t, Σ_z)(z_t)

F is the matrix for the transition; Σ_x the transition noise covariance
H is the matrix for the sensors; Σ_z the sensor noise covariance

Filter computes the following update:

  µ_{t+1} = F µ_t + K_{t+1} (z_{t+1} − H F µ_t)
  Σ_{t+1} = (I − K_{t+1} H)(F Σ_t F^T + Σ_x)

where K_{t+1} = (F Σ_t F^T + Σ_x) H^T (H (F Σ_t F^T + Σ_x) H^T + Σ_z)^{-1}
is the Kalman gain matrix.

Σ_t and K_t are independent of the observation sequence, so can be computed offline
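The general update in code; a minimal sketch that mirrors the formulas above (the matrix arguments are the caller's model):

```python
def kalman_step(mu, Sigma, z, F, H, Sigma_x, Sigma_z):
    """One Kalman filter update: predict with (F, Σ_x), correct with (H, Σ_z)."""
    mu_pred = F @ mu                                # F µ_t
    Sigma_pred = F @ Sigma @ F.T + Sigma_x          # F Σ_t F^T + Σ_x
    K = Sigma_pred @ H.T @ np.linalg.inv(H @ Sigma_pred @ H.T + Sigma_z)
    mu_new = mu_pred + K @ (z - H @ mu_pred)
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new
```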

2-D tracking example: filtering

[Figure: 2-D filtering. True trajectory, noisy observations, and the filtered estimate in the X-Y plane]

2-D tracking example: smoothing

[Figure: 2-D smoothing. True trajectory, noisy observations, and the smoothed estimate in the X-Y plane]

Where it breaks

Cannot be applied if the transition model is nonlinear

Extended Kalman Filter models the transition as locally linear around x_t = µ_t
Fails if the system is locally unsmooth
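A minimal sketch of the locally-linear idea behind the EKF, assuming the caller supplies a nonlinear transition f, sensor model h, and functions computing their Jacobians (all names are mine):

```python
def ekf_step(mu, Sigma, z, f, F_jac, h, H_jac, Sigma_x, Sigma_z):
    """One extended Kalman filter step: linearize f and h around the mean."""
    F = F_jac(mu)                                   # Jacobian of f at x_t = µ_t
    mu_pred = f(mu)
    Sigma_pred = F @ Sigma @ F.T + Sigma_x
    H = H_jac(mu_pred)                              # Jacobian of h at the prediction
    K = Sigma_pred @ H.T @ np.linalg.inv(H @ Sigma_pred @ H.T + Sigma_z)
    mu_new = mu_pred + K @ (z - h(mu_pred))
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new
```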

Dynamic Bayesian networks

X_t, E_t contain arbitrarily many variables in a replicated Bayes net

[Figure: two example DBNs. Left: the umbrella DBN with prior P(R_0) = 0.7, transition P(R_1 | R_0) = 0.7 / 0.3, and sensor P(U_1 | R_1) = 0.9 / 0.2. Right: a robot DBN with Battery_0 -> Battery_1 (observed by BMeter_1) and position X_0 -> X_1 (observed by Z_1)]

DBNs vs. HMMs

Every HMM is a single-variable DBN; every discrete DBN is an HMM

[Figure: DBN with variables X_t, Y_t, Z_t collapsed into a single HMM megavariable]

Sparse dependencies mean exponentially fewer parameters;
e.g., with 20 binary state variables, three parents each:
the DBN has 20 × 2³ = 160 parameters, the HMM has 2²⁰ × 2²⁰ ≈ 10¹²

DBNs vs. Kalman filters

Every Kalman filter model is a DBN, but few DBNs are KFs;
the real world requires non-Gaussian posteriors.
E.g., where are bin Laden and my keys? What's the battery charge?

[Figure: battery-monitoring DBN with BMBroken_0 -> BMBroken_1, Battery_0 -> Battery_1, and BMeter_1; plots of E(Battery | ...5555005555...), E(Battery | ...5555000000...), and the corresponding P(BMBroken | ...) over time]

Exact inference in DBNs

Naive method: unroll the network and run any exact algorithm

[Figure: the umbrella DBN unrolled for seven steps, with the same transition CPT P(R_t | R_{t-1}) = 0.7 / 0.3 and sensor CPT P(U_t | R_t) = 0.9 / 0.2 repeated in every slice]

Problem: inference cost for each update grows with t

Rollup filtering: add slice t+1, sum out slice t using variable elimination

Largest factor is O(d^{n+1}), update cost O(d^{n+2})
(cf. HMM update cost O(d^{2n}))

Likelihood weighting for DBNs

Set of weighted samples approximates the belief state

[Figure: the unrolled umbrella DBN with samples propagated through Rain_0 ... Rain_5]

LW samples pay no attention to the evidence!
⇒ fraction agreeing falls exponentially with t
⇒ number of samples required grows exponentially with t

[Figure: RMS error vs. time step for LW(10), LW(100), LW(1000), LW(10000); error grows with t at every sample size]

Particle filtering

Basic idea: ensure that the population of samples ("particles")
tracks the high-likelihood regions of the state space

Replicate particles proportional to likelihood for e_t

[Figure: particle populations over Rain_t: (a) propagate, (b) weight by the evidence, (c) resample]

Widely used for tracking nonlinear systems, esp. in vision
Also used for simultaneous localization and mapping in mobile robots (10^5-dimensional state space)

Particle filtering contd.

Assume consistent at time t: N(x_t | e_{1:t}) / N = P(x_t | e_{1:t})

Propagate forward: populations of x_{t+1} are
  N(x_{t+1} | e_{1:t}) = Σ_{x_t} P(x_{t+1} | x_t) N(x_t | e_{1:t})

Weight samples by their likelihood for e_{t+1}:
  W(x_{t+1} | e_{1:t+1}) = P(e_{t+1} | x_{t+1}) N(x_{t+1} | e_{1:t})

Resample to obtain populations proportional to W:

  N(x_{t+1} | e_{1:t+1}) / N = α W(x_{t+1} | e_{1:t+1})
    = α P(e_{t+1} | x_{t+1}) N(x_{t+1} | e_{1:t})
    = α P(e_{t+1} | x_{t+1}) Σ_{x_t} P(x_{t+1} | x_t) N(x_t | e_{1:t})
    = α' P(e_{t+1} | x_{t+1}) Σ_{x_t} P(x_{t+1} | x_t) P(x_t | e_{1:t})
    = P(x_{t+1} | e_{1:t+1})
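One propagate / weight / resample cycle for the umbrella model; a minimal sketch using the arrays defined earlier (the rng seed and the sample count in the usage comment are arbitrary choices):

```python
rng = np.random.default_rng(0)

def particle_filter_step(particles: np.ndarray, umbrella_observed: bool) -> np.ndarray:
    """particles holds states (0 = rain, 1 = no rain); returns the resampled set."""
    # (a) Propagate each particle through the transition model.
    particles = np.array([rng.choice(2, p=T[s]) for s in particles])
    # (b) Weight each particle by the likelihood of the evidence.
    p_true = P_umbrella_given_rain[particles]
    weights = p_true if umbrella_observed else 1.0 - p_true
    # (c) Resample N particles proportional to weight.
    idx = rng.choice(len(particles), size=len(particles), p=weights / weights.sum())
    return particles[idx]

# particles = rng.choice(2, size=1000)               # samples from the uniform prior
# particles = particle_filter_step(particles, True)
# (particles == 0).mean()                            # approx 0.818 = P(rain_1 | u_1)
```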

Particle filtering performance

Approximation error of particle filtering remains bounded over time,
at least empirically (theoretical analysis is difficult)

[Figure: average absolute error vs. time step for LW(25), LW(100), LW(1000), LW(10000) and ER/SOF(25); the particle filter's error stays roughly constant while LW's grows]

Summary

Temporal models use state and sensor variables replicated over time

Markov assumptions and stationarity assumption, so we need
- transition model P(X_t | X_{t-1})
- sensor model P(E_t | X_t)

Tasks are filtering, prediction, smoothing, most likely sequence;
all can be done recursively with constant cost per time step

Hidden Markov models have a single discrete state variable; used for speech recognition

Kalman filters allow n state variables, linear Gaussian, O(n³) update

Dynamic Bayes nets subsume HMMs, Kalman filters; exact update intractable

Particle filtering is a good approximate filtering algorithm for DBNs