CSCI 360 Introduction to Artificial Intelligence Week 2: Problem Solving and Optimization. Professor Wei-Min Shen Week 8.1 and 8.2


Status Check. Projects: Project 2. Midterm is coming, please do your homework! Read the HW Guide (TJ will release that soon). Questions?

Probabilistic Model (Actions & Sensors). Readings: AIMA 15; ALFE 2, 5, (5.10).
Definitions: Markov Chains, HMM, DBN.
Examples: speech recognition ("hearing" =?= "believing"); traveling in colored rooms ("seeing" =?= "believing").
Given: a map of colored rooms, and an experience (act/see): e_1:t. Compute: which room the robot was/is/will-be in: X.
States, observations, transitions, actions, sensors, ...
Inferences:
P(X_t | e_1:t): which room I am in now (state estimation)
P(X_t+k | e_1:t): which room I will be in at time t+k (prediction)
P(X_k | e_1:t): which room I was in at time k (smoothing)
P(x_1:t | e_1:t): what rooms I have been in in the past (explanation)
P(M_t | e_1:t): can I improve my map? (learning)

Reasoning over Time. Deterministic reasoning over time: Automata, Deterministic Finite Automata (DFA). Probabilistic reasoning over time: Non-Deterministic Finite Automata (NFA), Hidden Markov Model (HMM), Dynamic Bayesian Networks (DBN).

Example of Reasoning over Time. Speech recognition: listening is not always equal to hearing :-) Traveling through rooms with colored walls: seeing does not always tell you where you are :-) [Diagram: Room-1, Room-2, Room-3 connected by actions a0, a1, a2, a3; each room labeled with a wall color]

Problem of Reasoning over Time. Given: a model of the world (e.g., a map of rooms with colored walls), and an experience of observations and actions from time 1 to t. Compute: which states (e.g., rooms) the robot was/is/will-be in? [Diagram: Room-1, Room-2, Room-3 connected by actions a0, a1, a2, a3; each room labeled with a wall color]

States, Actions and Sensors. Example: the Little Prince's planet. Percepts: {rose, volcano, nothing}. Actions: {forward, backward, turn-around}. States: {north-side, south-side, east-side, west-side}.

The Planet of the Little Prince. [Diagram of the planet: four sides labeled "See Nothing", "See Volcano", "See Rose", "See Nothing"]

A Model for the Little Prince's Planet. [Diagram: four states labeled "See Nothing", "See Volcano", "See Rose", "See Nothing"] His internal action model has only 4 states! (why?)

The Action/Sensor Model

The Little Prince's Deterministic Model. Are the actions deterministic? Are the sensors deterministic? Are there any hidden states in this example?

Two Big Uncertainties. (1) Seeing may not be believing: states and observations are not a 1-1 mapping; one state may be seen in many different ways, and the same observation may be seen in many different states. E.g., when the prince sees nothing, where is he? Facing which way? Can you make the little prince's sensors non-deterministic? (2) An action may not produce the same results: actions do not always take you to the same states. E.g., the little prince's planet may have planet-quakes :-) Transitions between states by actions may be non-deterministic now. Can you make the little prince's actions non-deterministic?

Certainty vs Uncertainty. The certain way: s1 --a1--> s2. The uncertain way: s1 is not always blue; s2 is not always red; a1 from s1 does not always go to s2 (it may end up in s3). [Diagram: from s1, action a1 branches to both s2 and s3]

Uncertainty in Sensors & Actions? [Diagram: which state? S19?? which observations????] Solution: let's use probabilities. Sensor model (for states and observations): P(Observation | State). Probabilistic transitions (for states and actions): P(Next_State | Current_State, Action).

Hidden Markov Model (with actions). [Diagram: states s23, s94, s77 connected by action a1, emitting observations z1, z2, z3] 1. Actions; 2. Percepts (observations); 3. States; 4. Appearance: states → observations; 5. Transitions: (states, actions) → states; 6. Current state.

Hidden Markov Model (1/2): the Action Model.

Hidden Markov Model (2/2): the Sensor Model. This is also called localization.

The Little Prince's Model (HMM + Actions). Transitions: {P(s3 | s0, f) = .51, P(s2 | s1, b) = .32, P(s4 | s3, t) = .89, ...}. Appearance: {P(rose | s0) = .76, P(volcano | s1) = .83, P(nothing | s3) = .42, ...}. Initial: π_1(0) = 0.25, π_2(0) = 0.25, π_3(0) = 0.25, π_4(0) = 0.25.

Little Prince Example Model.

(Action) Transition Probabilities Φ (Fwd, Back, Turn):

F     S0   S1   S2   S3
S0   0.1  0.1  0.1  0.7
S1   0.1  0.1  0.7  0.1
S2   0.7  0.1  0.1  0.1
S3   0.1  0.7  0.1  0.1

B     S0   S1   S2   S3
S0   0.1  0.1  0.7  0.1
S1   0.1  0.1  0.1  0.7
S2   0.1  0.7  0.1  0.1
S3   0.7  0.1  0.1  0.1

T     S0   S1   S2   S3
S0   0.7  0.1  0.1  0.1
S1   0.1  0.7  0.1  0.1
S2   0.1  0.1  0.1  0.7
S3   0.1  0.1  0.7  0.1

(Sensor) Appearance Probabilities θ:

θ     Rose  Volcano  Nothing
S0    0.8   0.1      0.1
S1    0.1   0.8      0.1
S2    0.1   0.1      0.8
S3    0.1   0.1      0.8

Initial State Probabilities π:

π     S0   S1   S2   S3
     .25  .25  .25  .25

Later, we may add utility to states (e.g., prefer to see the rose).
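To make these tables concrete, here is a minimal Python encoding of the model (a sketch; the names STATES, OBS, Phi, theta, and pi are my own, not from the slides). The later sketches in this deck reuse these definitions.

```python
import numpy as np

STATES = ["s0", "s1", "s2", "s3"]
OBS = ["rose", "volcano", "nothing"]

# Transition probabilities: Phi[a][i, j] = P(next = s_j | current = s_i, action = a)
Phi = {
    "forward":  np.array([[0.1, 0.1, 0.1, 0.7],
                          [0.1, 0.1, 0.7, 0.1],
                          [0.7, 0.1, 0.1, 0.1],
                          [0.1, 0.7, 0.1, 0.1]]),
    "backward": np.array([[0.1, 0.1, 0.7, 0.1],
                          [0.1, 0.1, 0.1, 0.7],
                          [0.1, 0.7, 0.1, 0.1],
                          [0.7, 0.1, 0.1, 0.1]]),
    "turn":     np.array([[0.7, 0.1, 0.1, 0.1],
                          [0.1, 0.7, 0.1, 0.1],
                          [0.1, 0.1, 0.1, 0.7],
                          [0.1, 0.1, 0.7, 0.1]]),
}

# Appearance (sensor) probabilities: theta[i, z] = P(observe z | state s_i)
theta = np.array([[0.8, 0.1, 0.1],   # s0: mostly sees the rose
                  [0.1, 0.8, 0.1],   # s1: mostly sees the volcano
                  [0.1, 0.1, 0.8],   # s2: mostly sees nothing
                  [0.1, 0.1, 0.8]])  # s3: mostly sees nothing

# Initial state probabilities pi
pi = np.array([0.25, 0.25, 0.25, 0.25])
```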

Experience and State Sequence. An experience E_1:T is a sequence of {observe_1, act_1, observe_2, act_2, ..., act_T-1, observe_T}: E_1:T = {o_1, a_1, o_2, a_2, o_3, ..., o_T-1, a_T-1, o_T}. X_1:T is a sequence of states that may correspond to the experience: X_1:T = {X_1, X_2, X_3, ..., X_T-1, X_T}.

The Little Prince in Action. E_1:T = {rose}, forward, {none}, ..., turn, {rose}, back, {volcano}, over time. States may be hidden! [Trellis diagram: candidate states S1..S4 at each time step] Given: the observable experience E_1:T (time 1 through T). Definitions: states S and actions A (Forward, Back, Turn); (action model) transition probabilities Φ; (sensor model) appearance probabilities θ (rose, volcano, none); (localization) initial/current state probabilities π.

Little Prince Example. From an experience (time 1 through t), infer the most likely sequence of states: {rose}, forward, {...}, ..., turn, ..., {rose}, backward, {volcano}. [Trellis diagram: candidate states S1..S4 at each time step] Find the best sequence of states that supports the given experience!

Possible Inferences, with names:
P(X_1:t | E_1:t): what states I have been through ("explanation")
P(X_t | E_1:t): which state I am in now ("state estimation", localization)
P(X_t+k | E_1:t): which state I will be in at time t+k ("prediction")
P(X_k | E_1:t): which state I was in at time k, k < t ("smoothing")
P(M_t | E_1:t): how correct my model is at time t ("model learning")

Temporal Models: models with actions and sensors. The next 12 slides are essential for understanding the concepts; the rest will follow naturally if you understand them. Markov Chains: no observations, no explicit actions, transitions happen randomly. Hidden Markov Model: no explicit actions, states transition randomly. Dynamic Bayesian Networks: no explicit actions, states are Bayesian networks. Continuous State Model: no explicit actions, states are continuous. POMDP.

Little Prince Action Model (S, A, Φ, θ, π): the transition probabilities Φ (Forward, Back, Turn), the appearance probabilities θ, and the initial state probabilities π are the same tables as in the Little Prince Example Model above.

The Little Prince in Action. E_1:T = {rose}, forward, {none}, ..., turn, {rose}, back, {volcano}, over time. Hidden! [Trellis diagram: candidate states S1..S4 at each time step] Given: experience E_1:T (time 1 through T), you can infer: P(X_t | E_1:t), where (which state) am I now? (estimation); P(X_t+k | E_1:t), where will I be at time t+k? (prediction); P(X_k | E_1:t), where was I at time k < t? (smoothing); P(X_1:t | E_1:t), which sequence of states did I go through? (explanation).

P(X_1:t | E_1:t), "explanation": Example. Given: experience E_1:2 = {rose, forward, nothing}. Infer: the most likely sequence of states X_1:2.
E = {rose}, forward, {nothing}
E_1:2 = {rose, forward, nothing}
O_1:2 = {rose, nothing} // observations
A_1 = {forward} // action sequence
E_1:2 = (O_1:2, A_1)
[Diagram: two time steps, candidate states S0..S3 at each]
Compute P(X | E): 16 possible X; which one is most likely? {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3}, {s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3}, {s3,s0}, {s3,s1}, {s3,s2}, {s3,s3}. Find the best sequence of states that supports the experience!

P(X_1:2 | E_1:2), Example in a Graph. [Bar graph: P(X | E) for each of the 16 sequences X: {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3}, {s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3}, {s3,s0}, {s3,s1}, {s3,s2}, {s3,s3}]

Compute P(X_1:T | E_1:T). P(X_1:T | E), where E = (O, A) and X_1:T ranges over all possible sequences of states. Let x = {i_1, i_2, ..., i_T} be a sequence of states. Then

P(x | E) = P(x, E) / P(E) = P(x, O, A) / P(O, A) = P(x | A) P(O | x, A) / P(O | A) = α P(x | A) P(O | x, A)

P(x | A) = π(i_1) p_i1,i2(b_1) p_i2,i3(b_2) ... p_iT-1,iT(b_T-1)
P(O | x, A) = θ_i1(z_1) θ_i2(z_2) ... θ_iT(z_T)

where O = {z_1, z_2, ..., z_T}, A = {b_1, b_2, ..., b_T-1}, x = {i_1, i_2, ..., i_T}. In the example: E = {rose, forward, nothing}; O = {rose, nothing} // observations; A = {forward} // action sequence; and the 16 candidate sequences are {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3}, {s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3}, {s3,s0}, {s3,s1}, {s3,s2}, {s3,s3}.

In Our Example: 16 possible x. E = {rose, forward, nothing}; O = {rose, nothing}; A = {forward}.

p(x | A):
p({s0,s0} | {f}) = π(s0) p_s0,s0(f) = .25*.1 = .025
p({s0,s1} | {f}) = π(s0) p_s0,s1(f) = .25*.1 = .025
p({s0,s2} | {f}) = π(s0) p_s0,s2(f) = .25*.1 = .025
p({s0,s3} | {f}) = π(s0) p_s0,s3(f) = .25*.7 = .175
p({s1,s0} | {f}) = π(s1) p_s1,s0(f) = .25*.1 = .025
p({s1,s1} | {f}) = π(s1) p_s1,s1(f) = .25*.1 = .025
p({s1,s2} | {f}) = π(s1) p_s1,s2(f) = .25*.7 = .175
p({s1,s3} | {f}) = π(s1) p_s1,s3(f) = .25*.1 = .025
p({s2,s0} | {f}) = π(s2) p_s2,s0(f) = .25*.7 = .175
p({s2,s1} | {f}) = π(s2) p_s2,s1(f) = .25*.1 = .025
p({s2,s2} | {f}) = π(s2) p_s2,s2(f) = .25*.1 = .025
p({s2,s3} | {f}) = π(s2) p_s2,s3(f) = .25*.1 = .025
p({s3,s0} | {f}) = π(s3) p_s3,s0(f) = .25*.1 = .025
p({s3,s1} | {f}) = π(s3) p_s3,s1(f) = .25*.7 = .175
p({s3,s2} | {f}) = π(s3) p_s3,s2(f) = .25*.1 = .025
p({s3,s3} | {f}) = π(s3) p_s3,s3(f) = .25*.1 = .025

p(O | x, A):
θ_s0(r) θ_s0(n) = .8*.1 = .08
θ_s0(r) θ_s1(n) = .8*.1 = .08
θ_s0(r) θ_s2(n) = .8*.8 = .64
θ_s0(r) θ_s3(n) = .8*.8 = .64
θ_s1(r) θ_s0(n) = .1*.1 = .01
θ_s1(r) θ_s1(n) = .1*.1 = .01
θ_s1(r) θ_s2(n) = .1*.8 = .08
θ_s1(r) θ_s3(n) = .1*.8 = .08
θ_s2(r) θ_s0(n) = .1*.1 = .01
θ_s2(r) θ_s1(n) = .1*.1 = .01
θ_s2(r) θ_s2(n) = .1*.8 = .08
θ_s2(r) θ_s3(n) = .1*.8 = .08
θ_s3(r) θ_s0(n) = .1*.1 = .01
θ_s3(r) θ_s1(n) = .1*.1 = .01
θ_s3(r) θ_s2(n) = .1*.8 = .08
θ_s3(r) θ_s3(n) = .1*.8 = .08

The best explanation is {s0, s3}: p(x | A) p(O | x, A) = .175 * .64 = .112, the maximum.
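The table above can be checked by brute force: enumerate all 16 two-state sequences and score each by p(x | A) p(O | x, A). A sketch, reusing the model encoding above:

```python
from itertools import product

r, n = OBS.index("rose"), OBS.index("nothing")   # evidence: saw rose, then nothing
F = Phi["forward"]                               # the one action taken

scores = {}
for i, j in product(range(len(STATES)), repeat=2):
    p_x_given_A = pi[i] * F[i, j]                # pi(s_i) * p_{s_i,s_j}(f)
    p_O_given_xA = theta[i, r] * theta[j, n]     # theta_{s_i}(rose) * theta_{s_j}(nothing)
    scores[(STATES[i], STATES[j])] = p_x_given_A * p_O_given_xA

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))              # ('s0', 's3') 0.112
```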

Which Explanation is the Best? Among all possible sequences of states, the best explanation is the sequence of states that gives the maximal value of P(x | E). Experience: observations o_1, o_2, ..., o_T-1, o_T and actions b_1, b_2, ..., b_T-1. Sensor models: θ_i(z) = P(z | s_i). Action models: P_ij[b] = P(s_j | s_i, b). Explanation: state sequence i_1, i_2, i_3, ..., i_T.

Compute Hidden State Sequence (1/2). The experience consists of both O and A: O is the observation sequence and A is the action sequence in the experience; M is the model, C is the background information. O = {z_1, z_2, ..., z_T}, A = {b_1, b_2, ..., b_T-1}, I = {i_1, i_2, ..., i_T}. In the Little Prince example, I used x for I, and ignored M and C there.

Compute Hidden State Sequence (2/2). O = {z_1, z_2, ..., z_T}, A = {b_1, b_2, ..., b_T-1}, I = {i_1, i_2, ..., i_T}. In the Little Prince example, I wrote the above as p(x | A) and p(O | x, A). Among all possible sequences in I, there is one with the maximal probability.

P(O | A, M, C) by the Forward Procedure. Main idea: do not consider all possible state sequences; instead, walk through the experience step by step and compute P(O | A, M, C) incrementally over time t:

α_1(i) = π(i) θ_i(z_1)
α_t+1(j) = [Σ_i α_t(i) p_ij(b_t)] θ_j(z_t+1)
P(O | A, M, C) = Σ_i α_T(i)

Forward Procedure (the α recursion above). Complexity is O(TN^2).

Forward Procedure Example. Time: t1 ------ t2 ------ t3. E = {rose, forward, none, turn, volcano, ...}

α_1(i):
α_1(s0) = π(s0) θ_s0(r) = .25*.8 = .20
α_1(s1) = π(s1) θ_s1(r) = .25*.1 = .025
α_1(s2) = π(s2) θ_s2(r) = .25*.1 = .025
α_1(s3) = π(s3) θ_s3(r) = .25*.1 = .025

α_2(i):
α_2(s0) = α_1(s0) p_s0,s0(f) θ_s0(n) + α_1(s1) p_s1,s0(f) θ_s0(n) + α_1(s2) p_s2,s0(f) θ_s0(n) + α_1(s3) p_s3,s0(f) θ_s0(n)
α_2(s1) = α_1(s0) p_s0,s1(f) θ_s1(n) + α_1(s1) p_s1,s1(f) θ_s1(n) + α_1(s2) p_s2,s1(f) θ_s1(n) + α_1(s3) p_s3,s1(f) θ_s1(n)
α_2(s2) = α_1(s0) p_s0,s2(f) θ_s2(n) + α_1(s1) p_s1,s2(f) θ_s2(n) + α_1(s2) p_s2,s2(f) θ_s2(n) + α_1(s3) p_s3,s2(f) θ_s2(n)
α_2(s3) = α_1(s0) p_s0,s3(f) θ_s3(n) + α_1(s1) p_s1,s3(f) θ_s3(n) + α_1(s2) p_s2,s3(f) θ_s3(n) + α_1(s3) p_s3,s3(f) θ_s3(n)

α_3(i):
α_3(s0) = α_2(s0) p_s0,s0(t) θ_s0(v) + α_2(s1) p_s1,s0(t) θ_s0(v) + α_2(s2) p_s2,s0(t) θ_s0(v) + α_2(s3) p_s3,s0(t) θ_s0(v)
α_3(s1) = α_2(s0) p_s0,s1(t) θ_s1(v) + α_2(s1) p_s1,s1(t) θ_s1(v) + α_2(s2) p_s2,s1(t) θ_s1(v) + α_2(s3) p_s3,s1(t) θ_s1(v)
α_3(s2) = α_2(s0) p_s0,s2(t) θ_s2(v) + α_2(s1) p_s1,s2(t) θ_s2(v) + α_2(s2) p_s2,s2(t) θ_s2(v) + α_2(s3) p_s3,s2(t) θ_s2(v)
α_3(s3) = α_2(s0) p_s0,s3(t) θ_s3(v) + α_2(s1) p_s1,s3(t) θ_s3(v) + α_2(s2) p_s2,s3(t) θ_s3(v) + α_2(s3) p_s3,s3(t) θ_s3(v)
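The α recursion is only a few lines of code. A sketch (forward is my own name; it reuses the model encoding above) that runs this t1..t3 example directly:

```python
def forward(observations, actions):
    """Return alpha_T, where alpha_t(i) = P(z_1..z_t, X_t = s_i | actions). O(T N^2)."""
    alpha = pi * theta[:, OBS.index(observations[0])]     # alpha_1(i) = pi(i) theta_i(z_1)
    for z, b in zip(observations[1:], actions):
        # alpha_{t+1}(j) = [sum_i alpha_t(i) p_ij(b_t)] * theta_j(z_{t+1})
        alpha = (alpha @ Phi[b]) * theta[:, OBS.index(z)]
    return alpha

alpha3 = forward(["rose", "nothing", "volcano"], ["forward", "turn"])
print(alpha3.round(4), "-> most likely current state:", STATES[alpha3.argmax()])
```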

Backward Procedure. Similar to the forward procedure; see ALFE 5.10.1.

Inference: State Estimation. P(X_t | E_1:t): which state am I most likely in now? It is the state s_i such that α_T(i) is maximal.

Inference: State Prediction. P(X_t+k | E_1:t): which state will I be in at time t+k? It depends on your actions during (t, t+k]. If you have an action (even a no-op), use that information and continue computing the future α values. Example/exercise: compute for k=1, k=2, ...

Inference: State Smoothing. P(X_k | E_1:t): which state was I in at time k (smoothing)? It is the state s_i such that α_k(i) P_ij(b_k+1) θ_j(z_k+1) β_k+1(j) is maximal. That is, given times 1, ..., k, k+1, ..., t: (best forward from 1 to k) + (best transition from k to k+1) + (best backward from t to k+1).
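A sketch of smoothing via backward messages (my own function names; it reuses the model encoding and numpy import above). It computes γ_k(i) ∝ α_k(i) β_k(i), which matches the expression on this slide once β_k(i) = Σ_j P_ij(b_k+1) θ_j(z_k+1) β_k+1(j) is unfolded one step:

```python
def forward_messages(observations, actions):
    alphas = [pi * theta[:, OBS.index(observations[0])]]
    for z, b in zip(observations[1:], actions):
        alphas.append((alphas[-1] @ Phi[b]) * theta[:, OBS.index(z)])
    return alphas

def backward_messages(observations, actions):
    # beta_T = 1;  beta_t(i) = sum_j p_ij(b_t) theta_j(z_{t+1}) beta_{t+1}(j)
    betas = [np.ones(len(STATES))]
    for z, b in zip(reversed(observations[1:]), reversed(actions)):
        betas.append(Phi[b] @ (theta[:, OBS.index(z)] * betas[-1]))
    return betas[::-1]

def smooth(observations, actions, k):
    """P(X_k | E_{1:T}) for 1 <= k <= T (k is 1-based, as on the slides)."""
    gamma = (forward_messages(observations, actions)[k - 1]
             * backward_messages(observations, actions)[k - 1])
    return gamma / gamma.sum()

print(smooth(["rose", "nothing", "volcano"], ["forward", "turn"], k=1).round(3))
```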

State Explanation. P(X_1:t | E_1:t): what states have I been through (explanation)? They are the following states: t=1: the state s_i such that α_1(i) is maximal; t=2: the state s_i such that α_2(i) is maximal; ...; t=T: the state s_i such that α_T(i) is maximal.

Model Learning. P(M_t | E_1:t): how correct is my model of the world (learning), after I have my experience? We will talk about this after the midterm.

Homework. Given: the Little Prince (see the model slides above), his sensor model (you complete the rest), and his action model (you complete the rest). Assume he had the following experience: Observations: {rose, nothing, volcano, nothing}; Actions: {forward, turn, backward, backward}. You compute: P(X_1 | E_1), P(X_2 | E_1:2), P(X_3 | E_1:3), P(X_1:3 | E_1:3), P(X_4 | E_1:3).

Other Models. Markov Chains: no observations, no explicit actions, transitions happen randomly. Hidden Markov Model: no explicit actions, states transition randomly. Continuous State Model: no explicit actions, states are continuous. Dynamic Bayesian Networks: no explicit actions, states are Bayesian networks. POMDP.

Hidden Markov Models. Markov chains are not so useful for most agents: eventually you don't know anything anymore; you need observations to update your beliefs. Hidden Markov models (HMMs): an underlying Markov chain over states S; you observe outputs (effects) at each time step. As a Bayes net: [hidden states X_1 → X_2 → X_3 → X_4 → X_5, each X_i emitting evidence E_i].

Rain/Umbrella Example in the Book. An HMM is defined by: an initial distribution P(X_1); transitions P(X_t | X_t-1); and emissions (sensor model) P(E_t | X_t).

Ghostbusters HMM. P(X_1) = uniform. P(X' | X) = usually move clockwise, but sometimes move in a random direction or stay in place. P(R_ij | X) = the same sensor model as before: red means close, green means far away. [Grid diagrams: P(X_1) is 1/9 in every cell; P(X' | X = <1,2>) puts mass 1/6, 1/6, 1/6, 1/2 on the neighboring cells and 0 elsewhere; readings R_i,j arrive at each step; "see ghost" / "not see ghost"]

Conditional Independence. HMMs have two important independence properties: the hidden process is Markov (the future depends only on the present, not the past), and the current observation is independent of everything else given the current state. [Bayes net: X_1 → X_2 → X_3 → X_4, each X_i emitting E_i] Quiz: does this mean that observations are independent? [No, they are correlated by the hidden state.]

Real HMM Examples. Speech recognition HMMs: observations are acoustic signals (continuous valued); states are specific positions in specific words (so, tens of thousands). Machine translation HMMs: observations are words (tens of thousands); states are translation options. Robot tracking: observations are range readings (continuous/discrete); states are positions on a map (continuous/discrete).

Filtering / Monitoring / Localization. Filtering, or monitoring, is the task of tracking the distribution P(X) (the belief state) over time. We start with P(X) in an initial setting, usually uniform. As time passes, or as we get observations, we update P(X): P(X_t | E_1:t).

Example: Robot Localization, P(X_t | E_1:t). Example from Michael Pfeiffer. [Plot: probability of each state (room)] t=0: before sensing, all states are equally likely. Sensor model: never more than 1 mistake. Motion model: may fail to execute an action with small probability.

Example: Robot Localization, P(X_t | E_1:t). [Probability plot] t=1 (sensing: no room up or down).

Example: Robot Localization, P(X_t | E_1:t). [Probability plot] t=2.

Example: Robot Localization, P(X_t | E_1:t). [Probability plot] t=3.

Example: Robot Localization, P(X_t | E_1:t). [Probability plot] t=4.

Example: Robot Localization, P(X_t | E_1:t). [Probability plot] t=5.

Filtering Example, P(X_t | E_1:t).

Day 0: P(R_0) = <0.5, 0.5>.

Day 1 (umbrella appears, U_1 = true):
P(R_1) = Σ_r0 P(r_0) P(R_1 | r_0) = 0.5<0.7, 0.3> + 0.5<0.3, 0.7> = <0.5, 0.5>
Updating with the evidence for t=1 gives:
P(R_1 | u_1) = α P(u_1 | R_1) P(R_1) = α<0.9, 0.2><0.5, 0.5> = α<0.45, 0.1> = <0.818, 0.182>

Day 2 (umbrella appears, U_2 = true):
P(R_2 | u_1) = Σ_r1 P(R_2 | r_1) P(r_1 | u_1) = 0.818<0.7, 0.3> + 0.182<0.3, 0.7> = <0.627, 0.373>
Updating with the evidence for t=2 gives:
P(R_2 | u_1, u_2) = α P(u_2 | R_2) P(R_2 | u_1) = α<0.9, 0.2><0.627, 0.373> = α<0.565, 0.075> = <0.883, 0.117>
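These numbers can be reproduced in a few lines. A standalone sketch of the umbrella model (the variable names are mine; the model itself is from AIMA Ch. 15):

```python
import numpy as np

T = np.array([[0.7, 0.3],        # P(R_t | R_{t-1}); rows/cols ordered [rain, no rain]
              [0.3, 0.7]])
O_umb = np.array([0.9, 0.2])     # P(umbrella | R_t) for R_t = [rain, no rain]

belief = np.array([0.5, 0.5])    # Day 0: P(R_0)
for day in (1, 2):
    predicted = belief @ T               # one-step prediction
    belief = O_umb * predicted           # weight by the evidence U_day = true
    belief = belief / belief.sum()       # normalize (the alpha)
    print(f"P(R_{day} | u_1:{day}) =", belief.round(3))
# -> [0.818 0.182], then [0.883 0.117]
```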

Smoothing P(X_k | E_1:t). Divide the evidence e_1:t into e_1:k and e_k+1:t:

P(X_k | e_1:t) = P(X_k | e_1:k, e_k+1:t)
  = α P(X_k | e_1:k) P(e_k+1:t | X_k, e_1:k)    (using Bayes' rule)
  = α P(X_k | e_1:k) P(e_k+1:t | X_k)           (using conditional independence)
  = α f_1:k × b_k+1:t

The backward message is computed by a backwards recursion:

P(e_k+1:t | X_k) = Σ_x_k+1 P(e_k+1:t | X_k, x_k+1) P(x_k+1 | X_k)       (conditioning on X_k+1)
  = Σ_x_k+1 P(e_k+1 | x_k+1) P(e_k+2:t | x_k+1) P(x_k+1 | X_k)          (using conditional independence)

i.e., b_k+1:t = Σ_x_k+1 P(e_k+1 | x_k+1) P(x_k+1 | X_k) b_k+2:t = BACKWARD(b_k+2:t, e_k+1:t)

Smoothing Example. Compute the estimate for rain at t=1:
P(R_1 | u_1, u_2) = α P(R_1 | u_1) P(u_2 | R_1)
P(R_1 | u_1) = <0.818, 0.182>
P(u_2 | R_1) = Σ_r2 P(u_2 | r_2) × 1 × P(r_2 | R_1) = (0.9 × 1 × <0.7, 0.3>) + (0.2 × 1 × <0.3, 0.7>) = <0.69, 0.41> (the factor 1 is the empty backward message)
Smoothed estimate: P(R_1 | u_1, u_2) = α <0.818, 0.182> × <0.69, 0.41> = <0.883, 0.117>
Forward-backward algorithm: cache the forward messages along the way. Time linear in t (polytree inference), space O(t|f|).
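The smoothed estimate can be checked the same way (a sketch continuing from the umbrella filtering code above, which defines T and O_umb):

```python
f1 = np.array([0.818, 0.182])       # P(R_1 | u_1), from the filtering step
b = T @ (O_umb * np.ones(2))        # P(u_2 | R_1) = sum_r2 P(u2|r2) * 1 * P(r2|R_1)
print(b.round(2))                   # [0.69 0.41]
smoothed = f1 * b
print((smoothed / smoothed.sum()).round(3))   # [0.883 0.117]
```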

Best Explanation Queries. [Bayes net: hidden states X_1..X_5 with evidence E_1..E_5] Query: the most likely sequence, argmax_x_1:t P(x_1:t | e_1:t).

State Path Trellis. A state trellis is a graph of states and transitions over time. [Trellis: sun/rain states at each time step] Each arc represents some transition; each arc has a weight; each path is a sequence of states; and the product of the weights on a path is that sequence's probability. We can think of the Forward (and now Viterbi) algorithms as computing sums over all paths (or over best paths) in this graph.

Viterbi Algorithm: choose the best state sequence. Similar to the forward α_t(s_t) recursion above, but use max, not sum:

m_t(s_t) = P(e_t | s_t) max_s_t-1 [ P(s_t | s_t-1) m_t-1(s_t-1) ]

[Trellis: sun/rain states at each time step]

Viterbi Example (umbrella model; evidence: umbrella on days 1, 2, 4, 5, no umbrella on day 3):

m_1 = <0.8182, 0.1818>
m_2: rain: 0.8182 × 0.7 × 0.9 = 0.5155;   no rain: 0.8182 × 0.3 × 0.2 = 0.0491
m_3: rain: 0.5155 × 0.7 × 0.1 = 0.0361;   no rain: 0.5155 × 0.3 × 0.8 = 0.1237
m_4: rain: 0.1237 × 0.3 × 0.9 = 0.0334;   no rain: 0.1237 × 0.7 × 0.2 = 0.0173
m_5: rain: 0.0334 × 0.7 × 0.9 = 0.0210;   no rain: 0.0173 × 0.7 × 0.2 = 0.0024
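A standalone sketch of this recursion in Python (my own code, assuming the evidence sequence above: umbrella on days 1, 2, 4, 5 and none on day 3):

```python
import numpy as np

T = np.array([[0.7, 0.3], [0.3, 0.7]])    # P(X_t | X_{t-1}); states [rain, sun]
O = {True: np.array([0.9, 0.2]),          # P(e | X) when the umbrella is seen
     False: np.array([0.1, 0.8])}         # ... and when it is not
evidence = [True, True, False, True, True]

m = np.array([0.5, 0.5]) * O[evidence[0]]
m /= m.sum()                              # m_1 = [0.8182, 0.1818]
pointers = []
for e in evidence[1:]:
    cand = T * m[:, None]                 # cand[i, j] = m_t(i) * P(x_{t+1}=j | x_t=i)
    pointers.append(cand.argmax(axis=0))  # best predecessor of each state
    m = O[e] * cand.max(axis=0)           # max, not sum: m_{t+1}
    print(m.round(4))

path = [int(m.argmax())]                  # best final state, then follow back-pointers
for ptr in reversed(pointers):
    path.append(int(ptr[path[-1]]))
print(["rain" if s == 0 else "sun" for s in reversed(path)])
# -> rain, rain, sun, rain, rain
```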

Other Models. Markov Chains: no observations, no explicit actions, transitions happen randomly. Hidden Markov Model: no explicit actions, states transition randomly. Dynamic Bayesian Networks: no explicit actions, states are Bayesian networks. Continuous State Model: no explicit actions, states are continuous. POMDP.

Dynamic Bayesian Networks. {evidence}, action, {evidence}, ..., action, ..., {evidence}, action, {evidence}. [Trellis: S1..S4 at each time slice] Each time slice is a Bayesian network with variables and CPTs.

Dynamic Bayes Nets (DBNs). We want to track multiple variables over time, using multiple sources of evidence. Idea: repeat a fixed Bayes net structure at each time slice; variables from time t can condition on those from t-1. Discrete-valued dynamic Bayes nets are also HMMs.

Exact Inference in DBNs. Variable elimination applies to dynamic Bayes nets. Procedure: unroll the network for T time steps, then eliminate variables until P(X_T | e_1:T) is computed. Online belief updates: eliminate all variables from the previous time step; store factors for the current time only.

Likelihood Weighting for DBNs. A set of weighted samples approximates the belief state. But LW samples pay no attention to the evidence! ⇒ the fraction of samples agreeing with the evidence falls exponentially with t ⇒ the number of samples required grows exponentially with t.

DBNs vs. HMMs. Every HMM is a single-variable DBN; every discrete DBN is an HMM. Sparse dependencies ⇒ exponentially fewer parameters; e.g., with 20 state variables, three parents each, the DBN has 20 × 2^3 = 160 parameters, while the equivalent HMM has 2^20 × 2^20 ≈ 10^12.

Other Models. Markov Chains: no observations, no explicit actions, transitions happen randomly. Hidden Markov Model: no explicit actions, states transition randomly. Dynamic Bayesian Networks: no explicit actions, states are Bayesian networks. Continuous State Model: no explicit actions, states are continuous. POMDP.

Kalman Filters. {evidence}, action, {evidence}, ..., action, ..., {evidence}, action, {evidence}. [Trellis: S1..S4 at each time slice] In each time slice, the state variables are continuous (not discrete).

Summary. Temporal models use states, transitions, and sensors. Transitions may be related to the agent's actions, or spontaneous. The Markov assumptions and the stationarity assumption mean we need: a transition model P(X_t | X_t-1) or P(X_t | X_t-1, a_t-1), and a sensor model P(E_t | X_t). Tasks: filtering, prediction, smoothing, most likely sequence; all are done recursively with constant cost per time step. Types of models: HMMs have a single discrete state variable; dynamic Bayes nets subsume HMMs (exact update is intractable); other models may have internal structure driven by actions.