CSCI 360 Introduc/on to Ar/ficial Intelligence Week 2: Problem Solving and Op/miza/on. Professor Wei-Min Shen Week 8.1 and PDF Free Download

CSCI 360 Introduc/on to Ar/ficial Intelligence Week 2: Problem Solving and Op/miza/on Professor Wei-Min Shen Week 8.1 and 8.2

Status Check Projects Project 2 Midterm is coming, please do your homework! Read the HW Guide (TJ will release that soon) Ques/ons? 2

Probabilis/c Model (Ac/ons & Sensors) Readings: AIMA 15, ALFE 2, 5, (5.10) Defini/ons Markov Chains, HMM, DBN, Examples: Speech recogni/on ( hearing =?= believing ) Traveling in colored rooms ( seeing =?= believing ) Given: a map of colored rooms, and an experience (act/see): e 1:t Compute: which room the robot was/is/will-be in: X States, observa/ons, transi/ons, ac/ons, sensors, Inferences P(X t e 1:t ) which room I am in now (state es/ma/on) P(X t+k e 1:t ) which room I will be in at /me t+k (predic/on) P(X k e 1:t ) which room I was in at /me k (smoothing) P(x 1:t e 1:t ) what rooms I have been in the past (explana/on) P(M t e 1:t ) can I improve my map (learning) 3

Reasoning over Time Determinis/c Reasoning over /me Automata, Determinis/c Finite Automata (DFA) Probabilis/c Reasoning over /me Non-Determinis/c Finite Automata (NFA) Hidden Markov Model (HMM) Dynamic Bayesian Networks (DBN) 4

Example of Reasoning over Time Speech Recogni/on Listening is not always equal to hearing J Traveling through rooms with colored walls Seeing does not always tell where you are J a 0 Room-1 a 1 Room-2 a 2 Room-3 a 3 Wall-color Wall-color Wall-color 5

Problem of Reasoning over Time Given: A model of the world (e.g., a map of rooms with colored walls) An experience of observa/ons and ac/ons from /me 1 to t Compute: Which states (e.g., rooms) the robot was/is/will-be in? a 0 Room-1 a 1 Room-2 a 2 Room-3 a 3 Wall-color Wall-color Wall-color 6

States, Ac/ons and Sensors Example of the Lijle Prince s Planet Percepts: {rose, volcano, nothing} Ac/ons: {forward, backward, turn-around} States: {north-side, south-side, east-side, west-side} 7

The Planet of Lijle Prince See Nothing See Volcano See Rose See Nothing 8

A Model for Lijle Prince s Planet See Nothing See Volcano See Rose See Nothing His internal ac/on model has only 4 states! (why?) 9

The Ac/on/Sensor Model 10

Lijle Prince s Determinis/c Model Are the ac/ons determinis/c? Are the sensors determinis/c? Is there any hidden states in this example? 11

Two Big Uncertain/es Seeing may not be believing States and observa/ons are not 1-1 mapping One state may be seen in many different ways The same observa/on may be seen in many different states E.g., When the prince sees nothing, where is he? Facing which way? Can you made the lijle prince s sensors non-determinis/c? Ac/on may not produce the same results Ac/ons do not always take you to the same states E.g., the lijle prince s planet may have planet-quakes J Transi/ons between states by ac/ons may be nondeterminis/c now Can you made the lijle prince s ac/ons non-determinis/c? 12

Certainty vs Uncertainty The certain way s1 a1 s2 The uncertain way S1 is not always blue s1 a1 a1 a1 S2 is not always red s3 s2 a1 from S1 does not always go to S2 s3 13

Uncertainty in Sensor & Ac/ons? State S19?? Observa/ons????? Solu/on: let s use probabili/es Sensor Model (for states and observa/ons) P( Observation State ) Probabilis/c Transi/ons (for states and ac/ons) P( Next_State Current_State, Ac/on) 14

Hidden Markov Model (with ac/ons) z1 z2 s23 a1 z1 s94 z3 1. Ac/ons a1 2. Percepts (observa/ons) 3. States 4. Appearance: states à observa/ons 5. Transi/ons: (states, ac/ons) à states 6. Current State a1 s77 15

Hidden Markov Model (1/2) Ac/on Model 16

Hidden Markov Model (2/2) Sensor Model This is also called localiza/on. 17

Lijle Prince s Model (HMM + Ac/ons) {P(s3 s0,f)=.51, P(s2 s1,b)=.32, P(s4 s3,t)=.89, } {P(rose s0)=.76, P(volcano s1)=.83, P(nothing s3)=.42, } π 1 (0) = 0.25,π 2 (0) = 0.25,π 3 (0) = 0.25,π 4 (0) = 0.25 18

Lijle Prince Example Model (ac/on) Transi/on Probabili/es Φ (Fwd, Back, Turn) F S0 S1 S2 S3 S0 0.1 0.1 0.1 0.7 S1 0.1 0.1 0.7 0.1 S2 0.7 0.1 0.1 0.1 S3 0.1 0.7 0.1 0.1 B S0 S1 S2 S3 S0 0.1 0.1 0.7 0.1 S1 0.1 0.1 0.1 0.7 S2 0.1 0.7 0.1 0.1 S3 0.7 0.1 0.1 0.1 T S0 S1 S2 S3 S0 0.7 0.1 0.1 0.1 S1 0.1 0.7 0.1 0.1 S2 0.1 0.1 0.1 0.7 S3 0.1 0.1 0.7 0.1 (sensor) Appearance Probabili/es θ θ Rose Volcano Nothing S0 0.8 0.1 0.1 S1 0.1 0.8 0.1 S2 0.1 0.1 0.8 S3 0.1 0.1 0.8 Ini/al State Probabili/es π π S0 S1 S2 S3.25.25.25.25 Later, we may add u/lity to states (e.g., prefer to see rose) 19

Experience and State Sequence E 1:T is an experience is a sequence of {observe 1, act 1, observe 2, act 2,., act T-1, observe T } E 1:T = {o 1, a 1, o 2, a 2, o 3,, o T-1, a T-1, o T } X 1:T is a sequence of states that may correspond to the experience X 1:T = {X 1, X 2, X 3,, X T-1, X T } 20

E 1:T Lijle Prince in Ac/on {rose}, fward, {none},, turn, {rose}, back, {volcano} /me States may be hidden! S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 Given: the observable experience E 1:T (/me 1 through T) Defini/ons: State S and ac/ons A (Forward, Back, Turn) (Ac/on model) Transi/on Probabili/es Φ (Sensor model) Appearance Probabili/es θ (rose, volcano, none) (localiza/on) Ini/al/current State Probabili/es π 21

Lijle Prince Example From an experience (/me 1 through t) Infer the most likely sequence of states {rose}, forward, {.},, turn,, {rose}, bwd, {volcano} S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 Find the best sequence of states that support the given experience! 22

Possible Inferences with names P(X 1:t E 1:t ) what states I have been through ( explana/on ) P(X t E 1:t ) which state I am in now ( state es/ma/on localiza/on ) P(X t+k E 1:t ) which state I will be in at /me t+k ( predic/on ) P(X k E 1:t ) which state I was in at /me k (k<t) ( smoothing ) P(M t E 1:t ) How correct is my model at /me t ( model learning ) 23

Temporal Models Models with ac/ons and sensors The next 12 slides are essen/al for you to understand the concepts The rest will follow naturally if you do Markov Chains No observa/on, no explicit ac/ons, transit randomly Hidden Markov Model No explicit ac/ons, state transit randomly Dynamic Bayesian Networks No explicit ac/ons, States are Bayesian Networks Con/nuous State Model No explicit ac/ons, States are con/nuous POMDP 24

Lijle Prince Ac/on Model (S,A,Φ,θ,π) Transi/on Probabili/es Φ (Forward, Back, Turn) F S0 S1 S2 S3 S0 0.1 0.1 0.1 0.7 S1 0.1 0.1 0.7 0.1 S2 0.7 0.1 0.1 0.1 S3 0.1 0.7 0.1 0.1 B S0 S1 S2 S3 S0 0.1 0.1 0.7 0.1 S1 0.1 0.1 0.1 0.7 S2 0.1 0.7 0.1 0.1 S3 0.7 0.1 0.1 0.1 T S0 S1 S2 S3 S0 0.7 0.1 0.1 0.1 S1 0.1 0.7 0.1 0.1 S2 0.1 0.1 0.1 0.7 S3 0.1 0.1 0.7 0.1 Appearance Probabili/es θ θ Rose Volcano Nothing S0 0.8 0.1 0.1 S1 0.1 0.8 0.1 S2 0.1 0.1 0.8 S3 0.1 0.1 0.8 Ini/al State Probabili/es π π S0 S1 S2 S3.25.25.25.25 25

E 1:T Lijle Prince in Ac/on {rose}, fward, {none},, turn, {rose}, back, {volcano} /me Hidden! S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 Given: experience E 1:T (/me 1 through T) You can Infer: P(X t E 1:t ), where (which state) am I at now? (es/ma/on) P(X t+k E 1:t ), where I will be at /me t+k? (predic/on) P(X k E 1:t ), where I was at /me k < t? (smooth) P(X 1:t E 1:t ) which sequence of states I went through? (explana/on) 26

P(X 1:t E 1:t ) explana/on Example Given: experience E 1:2 ={rose, forward, nothing} Infer: the most likely sequence of states X 1:2 E= {rose}, forward, {nothing} E 1:2 ={rose, forward, nothing} O 1:2 ={rose, nothing} // observa/ons A 1 ={forward} // ac/on sequence E 1:2 =O 1:2 A 1 S0 S1 S2 S3 S0 S1 S2 S3 Compute P(X E): 16 possible X, which one is most likely? {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3}, {s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3}, {s3,s0}, {s3,s1}, {s3,s2}, {s3,s3} Find the best sequence of states that support the experience! 27

P(X 1,2 E 1,2 ) Example in a Graph P(X E) {s0,s0} {s0,s1} {s0,s2} {s0,s3} {s1,s0} {s1,s1} {s1,s2} {s1,s3} {s2,s0} {s2,s1} {s2,s2} {s2,s3} {s3,s0} {s3,s1} {s3,s2} {s3,s3} X 28

Compute P(X 1:T E 1:T ) P(X 1:T E) where E=OA X1:T is all possible sequences of states Let x={i 1, i 2,, i T } be a sequence of states P(x E) = p(xe)/p(e) = P(xOA)/P(OA) = p(x A)p(O xa)/p(o A) = α p(x A)p(O xa) p(x A)= p(o xa)= O = {z 1, z 2,, z T }, A = {b 1, b 2,, b T-1 }, x = {i 1, i 2,, i T } E={rose, forward, nothing} O={rose, nothing} // observa/ons A={forward} // ac/on sequence {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3}, {s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3}, {s3,s0}, {s3,s1}, {s3,s2}, {s3,s3} 29

In Our Example 16 possible x p(x A) p({s 0,s 0 } {f})=π(s 0 )p s0,s0 (f)=.25*.1=.025 p({s 0,s 1 } {f})=π(s 0 )p s0,s1 (f)=.25*.1=.025 p({s 0,s 2 } {f})=π(s 0 )p s0,s2 (f)=.25*.1=.025 p({s 0,s 3 } {f})=π(s 0 )p s0,s3 (f)=.25*.7=.175 p({s 1,s 0 } {f})=π(s 1 )p s1,s0 (f)=.25*.1=.025 p({s 1,s 1 } {f})=π(s 1 )p s1,s1 (f)=.25*.1=.025 p({s 1,s 2 } {f})=π(s 1 )p s1,s2 (f)=.25*.1=.175 p({s 1,s 3 } {f})=π(s 1 )p s1,s3 (f)=.25*.7=.025 p({s 2,s 0 } {f})=π(s 2 )p s2,s0 (f)=.25*.1=.175 p({s 2,s 1 } {f})=π(s 2 )p s2,s1 (f)=.25*.1=.025 p({s 2,s 2 } {f})=π(s 2 )p s2,s2 (f)=.25*.1=.025 p({s 2,s 3 } {f})=π(s 2 )p s2,s3 (f)=.25*.7=.025 p({s 3,s 0 } {f})=π(s 3 )p s3,s0 (f)=.25*.1=.025 p({s 3,s 1 } {f})=π(s 3 )p s3,s1 (f)=.25*.1=.175 p({s 3,s 2 } {f})=π(s 3 )p s3,s2 (f)=.25*.1=.025 p({s 3,s 3 } {f})=π(s 3 )p s3,s3 (f)=.25*.7=.025 {s0,s0}, {s0,s1}, {s0,s2}, {s0,s3},{s1,s0}, {s1,s1}, {s1,s2}, {s1,s3}, {s2,s0}, {s2,s1}, {s2,s2}, {s2,s3},{s3,s0}, {s3,s1}, {s3,s2}, {s3,s3} p(o xa) Θs 0 (r) Θs 0 (n)=.8*.1=.08 Θs 0 (r) Θs 1 (n)=.8*.1=.08 Θs 0 (r) Θs 2 (n)=.8*.8=.64 Θs 0 (r) Θs 3 (n)=.8*.8=.64 Θs 1 (r) Θs 0 (n)=.1*.1=.01 Θs 1 (r) Θs 1 (n)=.1*.1=.01 Θs 1 (r) Θs 2 (n)=.1*.8=.08 Θs 1 (r) Θs 3 (n)=.1*.8=.08 Θs 2 (r) Θs 0 (n)=.1*.1=.01 Θs 2 (r) Θs 1 (n)=.1*.1=.01 Θs 2 (r) Θs 2 (n)=.1*.8=.08 Θs 2 (r) Θs 3 (n)=.1*.8=.08 Θs 3 (r) Θs 0 (n)=.1*.1=.01 Θs 3 (r) Θs 1 (n)=.1*.1=.08 Θs 3 (r) Θs 2 (n)=.1*.8=.08 Θs 3 (r) Θs 3 (n)=.1*.8=.08 E={rose, forward, nothing} O={rose, nothing}, A={forward} Best Explana/on is: {s0, s3} 30

Which Explana/on is the Best? Among all possible sequences of states, the best explana/on is the sequence of states that gives the maximal value for P(x E)= Experience: Observa/ons: o 1, o 2,, o T-1, o T Ac/ons: b 1, b 2,, b T-1 Sensor models: Ac/on models: P ij [b] = P(s j s i, b) Explana/on: State sequence: i 1, i 2, i 3,, i T 31

Compute Hidden State Sequence (1/2) Experience consists of both O and A O is observa/on sequence and A is ac/on sequence in experience M is the model, C is the background informa/on O = {z 1, z 2,, z T }, A = {b 1, b 2,, b T-1 }, I = {i 1, i 1,, i T } In Lijle Prince Example: I used x for I, and ignored M and C there. 32

Compute Hidden State Sequence (2/2) O = {z 1, z 2,, z T }, A = {b 1, b 2,, b T }, I = {i 1, i 1,, i T } In Lijle Prince Example, I wrote above as p(x A) and p(o xa) Among all possible sequences in I, there is one with the maximal probability. 33

P(O AMC) by Forward Procedure Main idea: not consider all possible state sequence, but every step in the experience and compute P(O AMC) incrementally on the /me t: 34

Forward Procedure Forward Procedure Complexity is O(TN 2 ) 35

Forward Procedure Example Time: t 1 --------------------------------------t 2 ------------------------------------------ t 3 ------------- E = { rose, forward, none, turn, volcano,..} α 1 (i): α 1 (s 0 ) = π(s 0 )Θs 0 (r)=.25*.8=.20 α 1 (s 1 ) = π(s 1 )Θs 1 (r)=.25*.1=.025 α 1 (s 2 ) = π(s 2 )Θs 2 (r)=.25*.1=.025 α 1 (s 3 ) = π(s 3 )Θs 3 (r)=.25*.1=.025 α 2 (i): α 2 (s 0 ) = α 1 (s 0 ) p s0,s0 (f) Θs 0 (n) + α 1 (s 1 ) p s1,s0 (f) Θs 0 (n) + α 1 (s 2 ) p s2,s0 (f) Θs 0 (n) + α 1 (s 3 ) p s3,s0 (f) Θs 0 (n) α 2 (s 1 ) = α 1 (s 0 ) p s0,s1 (f) Θs 1 (n) + α 1 (s 1 ) p s1,s1 (f) Θs 1 (n) + α 1 (s 2 ) p s2,s1 (f) Θs 1 (n) + α 1 (s 3 ) p s3,s1 (f) Θs 1 (n) α 2 (s 2 ) = α 1 (s 0 ) p s0,s2 (f) Θs 2 (n) + α 1 (s 1 ) p s1,s2 (f) Θs 2 (n) + α 1 (s 2 ) p s2,s2 (f) Θs 2 (n) + α 1 (s 3 ) p s3,s2 (f) Θs 2 (n) α 2 (s 3 ) = α 1 (s 0 ) p s0,s3 (f) Θs 3 (n) + α 1 (s 1 ) p s1,s3 (f) Θs 3 (n) + α 1 (s 2 ) p s2,s3 (f) Θs 3 (n) + α 1 (s 3 ) p s3,s3 (f) Θs 3 (n) α 3 (i): α 3 (s 0 ) = α 2 (s 0 ) p s0,s0 (t) Θs 0 (v) + α 2 (s 1 ) p s1,s0 (t) Θs 0 (v) + α 2 (s 2 ) p s2,s0 (t) Θs 0 (v) + α 2 (s 3 ) p s3,s0 (t) Θs 0 (v) α 3 (s 1 ) = α 2 (s 0 ) p s0,s1 (t) Θs 1 (v) + α 2 (s 1 ) p s1,s1 (t) Θs 1 (v) + α 2 (s 2 ) p s2,s1 (t) Θs 1 (v) + α 2 (s 3 ) p s3,s1 (t) Θs 1 (v) α 3 (s 2 ) = α 2 (s 0 ) p s0,s2 (t) Θs 2 (v) + α 2 (s 1 ) p s1,s2 (t) Θs 2 (v) + α 2 (s 2 ) p s2,s2 (t) Θs 2 (v) + α 2 (s 3 ) p s3,s2 (t) Θs 2 (v) α 3 (s 3 ) = α 2 (s 0 ) p s0,s3 (t) Θs 3 (v) + α 2 (s 1 ) p s1,s3 (t) Θs 3 (v) + α 2 (s 2 ) p s2,s3 (t) Θs 3 (v) + α 2 (s 3 ) p s3,s3 (t) Θs 3 36 (v)

Backward Procedure Similar to forward See ALFE 5.10.1 37

Inference: State Es/ma/on P(X t E 1:t ) Which state I am most likely in now? It is the state s i such that α T (i) is the maximal 38

Inference: State Predic/on P(X t+k E 1:t ) which state I will be in at /me t+k? Depending on your ac/on during (t, t+k] If you have an non-ac/on, use that informa/on and con/nue compu/ng the future α values. Example/exercise: compute when k=1, k=2, 39

Inference: State Smoothing P(X k E 1:t ) which state I was in at /me k (smoothing) It is the state s i such that α k (i) P ij (b k+1 ) θ j (z k+1 ) β k+1 (j) is the maximal This is to say, given /me 1,,k, k+1,...,t (best forward from 1 to k) + (best transi/on from k to k+1) + (best backward to k+1 from t) 40

State Explana/on P(X 1:t E 1:t ) what states I have been through (explana/on) They are the following states: t=1: the state s i such that α 1 (i) is the maximal t=2: the state s i such that α 2 (i) is the maximal t=t: the state s i such that α T (i) is the maximal 41

Model Learning P(M t E 1:t ) How correct is my model of the world (learning) a er I have my experience We will talk about this a er mid-term 42

Homework Given: the Lijle Prince (see slide 18 or 24) His sensor model (you complete the rest) His ac/on model (you complete the rest) Assume he had the following experience Observa/ons: {rose, nothing, volcano, nothing} Ac/ons: {forward, turn, backward, backward} You Compute: P(X 1 E 1 ), P(X 2 E 1:2 ), P(X 3 E 1:3 ), P(X 1:3 E 1:3 ) P(X 4 E 1:3 ) 43

Other Models Markov Chains No observa/on, no explicit ac/ons, transit randomly Hidden Markov Model No explicit ac/ons, state transit randomly Con/nuous State Model No explicit ac/ons, States are con/nuous Dynamic Bayesian Networks No explicit ac/ons, States are Bayesian Networks POMDP 44

Hidden Markov Models Markov chains not so useful for most agents Eventually you don t know anything anymore Need observa/ons to update your beliefs Hidden Markov models (HMMs) Underlying Markov chain over states S You observe outputs (effects) at each /me step As a Bayes net: Hidden States X 1 X 2 X 3 X 4 X 5 Evidence E 1 E 2 E 3 E 4 E 5 45

Rain/Umbrella Example in the Book An HMM is defined by: Ini/al distribu/on: Transi/ons: Emissions (sensor model): 46

Ghostbuster HMM P(X 1 ) = uniform P(X X ) = usually move clockwise, but some/mes move in a random direc/on or stay in place P(R ij X) = the same sensor model as before: red means close, green means far away. 1/9 1/9 1/9 1/9 1/9 1/9 1/9 1/9 1/9 P(X 1 ) X 1 X 2 X 3 X 4 1/6 1/6 0 1/6 1/2 0 R i,j R i,j R i,j R i,j 0 0 0 P(X X =<1,2>) See ghost Not see ghost 47

Condi/onal Independence HMMs have two important independence proper/es: Markov hidden process, future depends only on the present (not past) Current observa/on independent of all else given the current state X 1 X 2 X 3 X 4 E 1 E 2 E 3 E 4 Quiz: does this mean that observa/ons are independent? [No, correlated by the hidden state] 48

Real HMM Examples Speech recogni/on HMMs: Observa/ons are acous/c signals (con/nuous valued) States are specific posi/ons in specific words (so, tens of thousands) Machine transla/on HMMs: Observa/ons are words (tens of thousands) States are transla/on op/ons Robot tracking: Observa/ons are range readings (con/nuous/discrete) States are posi/ons on a map (con/nuous/discrete) 49

Filtering / Monitoring / Localiza/on Filtering, or monitoring, is the task of tracking the distribu/on P(X) (the belief state) over /me We start with P(X) in an ini/al se ng, usually uniform As /me passes, or we get observa/ons, we update P(X) P(X t E 1:t ) 50

Example: Robot Localiza/on P(X t E 1:t ) Example from Michael Pfeiffer State X: Room Prob 0 t=0 (before sensing, all states are equally likely) Sensor model: never more than 1 mistake Mo/on model: may not execute ac/on with small prob. 1 51

Example: Robot Localiza/on P(X t E 1:t ) Prob 0 1 t=1 (sensing: no room up or down) 52

Example: Robot Localiza/on P(X t E 1:t ) Prob 0 1 t=2 53

Example: Robot Localiza/on P(X t E 1:t ) Prob 0 1 t=3 54

Example: Robot Localiza/on P(X t E 1:t ) Prob 0 1 t=4 55

Example: Robot Localiza/on P(X t E 1:t ) Prob 0 1 t=5 56

Filtering Example P(X t E 1:t ) Day 0: P(R 0 )=<0.5,0.5> Day 1 (Umbrella appears è U 1 =true) P(R 1 )=P(r 0 )P(R 1 r 0 ) = α0.5<0.7,0.3> + α0.5<0.3,0.7> = <0.5,0.5> upda/ng with evidence for t=1 gives: P(R 1 u 1 )=αp(u 1 R 1 )P(R 1 )=α<0.9,0.2><0.5,0.5> =α<0.45,0.1> = <0.818,0.182> Day 2 (Umbrella appears è U 2 =true) P(R 2 u 1 )=P(R 2 r 1 ) P(r 1 u 1 ) =α<0.818<0.7,0.3> + 0.182<0.3,0.7>> = <0.627,0.373> upda/ng with evidence for t=2 gives: P(R 2 u 1,u 2 )=αp(u 2 R 2 )P(R 2 u 1 )=α<0.9,0.2><0.627,0.373> =α<0.565,0.075> = <0.883,0.117> 57

Smoothing P(X k E 1:t ) Divide evidence e 1:t into e 1:k, e k+1:t : Using Bayes rule Using condi/onal independence Backward message computed by a backwards recursion: Condi/oning on X k+1 Using condi/onal independence = BACKWARD(b k+2:t,e k+1:t ) b k+1:t =Σ Xt+1 (P(e k+1 x k+1 )P(x k+1 X k ) b k+2:t 58

Smoothing Example Compute es/mate for rain at t=1 P(R 1 u 1,u 2 )=αp(r 1 u 1 )P(u 2 R 1 ) P(R 1 u 1 ) = <0.818,0.182> P(u 2 R 1 ) =Σ r2 P(u 2 r 2 ) P( r 2 ) P(r 2 R 1 ) Smoothed estimate: =(0.9 1 <0.7,0.3>) + (0.2 1 <0.3,0.7>) = <0.69,0.41> P(R 1 u 1,u 2 )=α<0.818,0.182> <0.69,0.41> = <0.883,0.117> Forward-backward algorithm: cache forward messages along the way Time linear in t (polytree inference), space O(t j f j ) 59

Best Explana/on Queries X 1 X 2 X 3 X 4 X 5 E 1 E 2 E 3 E 4 E 5 Query: most likely seq: 60

State Path Trellis State trellis: graph of states and transi/ons over /me sun sun sun sun rain rain rain rain Each arc represents some transi/on Each arc has weight Each path is a sequence of states The product of weights on a path is the seq s probability Can think of the Forward (and now Viterbi) algorithms as compu/ng sums of all paths (best paths) in this graph 61

Viterbi Algorithm: choose the best state seq. Similar to slide 18 α t (s t ), but use max not sum sun sun sun sun rain rain rain rain α t (s t ) α t-1 (s t-1 ) 62

Viterbi Example 0.1 63

Viterbi Example 0.8182 0. 7 0.9 0.5155 0. 7 0.1 0.1237 0. 3 0.9 0.0334 0. 7 0.9 0.8182 0.3 0.2 0.5155 0.3 0.8 0.1237 0.7 0.2 0.0173 0.7 0.2 64

Other Models Markov Chains No observa/on, no explicit ac/ons, transit randomly Hidden Markov Model No explicit ac/ons, state transit randomly Dynamic Bayesian Networks No explicit ac/ons, States are Bayesian Networks Con/nuous State Model No explicit ac/ons, States are con/nuous POMDP 65

Dynamic Bayesian Networks {evidence}, ac/on, {evidence},, ac/on,, {evidence}, ac/on, {evidence} S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 Each /me slice is a Bayesian Network with variables and CPTs 66

Dynamic Bayes Nets (DBNs) We want to track mul/ple variables over /me, using mul/ple sources of evidence Idea: Repeat a fixed Bayes net structure at each /me Variables from /me t can condi/on on those from t-1 Discrete valued dynamic Bayes nets are also HMMs 67

Exact inference in DBNs Variable elimina/on applies to Dynamic Bayesian nets Procedure: unroll the network for T /me steps, then eliminate variables un/l P(X T e 1:T ) is computed Online belief updates: Eliminate all variables from the previous /me step; store factors for current /me only 68

Likelihood weigh/ng for DBNs Set of weighted samples approximates the belief state LW samples pay no attention to the evidence! ) fraction agreeing falls exponentially with t ) num. samples required grows exponentially with t 69

DBNs vs. HMMs Every HMM is a single-variable DBN; every discrete DBN is an HMM Sparse dependencies ) exponentially fewer parameters; e.g., 20 state variables, three parents each DBN has 20x2 3 =160 parameters, HMM has 2 20 x2 20 ¼10 12 70

Kalman Filters {evidence}, ac/on, {evidence},, ac/on,, {evidence}, ac/on, {evidence} S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 In each /me slice, state variables are con/nuous (not discrete) 72

Summary Temporal models use states, transitions, sensors Transitions may related to agent s actions, or spontaneous Markov assumptions and stationarity assumption, so we need - transition model P(Xt Xt-1) or P(Xt Xt-1, a t-1 ) - sensor model P(Et Xt) Tasks: filtering, prediction, smoothing, most likely seq all done recursively with constant cost per time step Types of models HMM have a single discrete state variable Dynamic Bayes nets subsume HMMs; exact update intractable Other models may have internal structure driven by actions 73

CSCI 360 Introduc/on to Ar/ficial Intelligence Week 2: Problem Solving and Op/miza/on. Professor Wei-Min Shen Week 8.1 and 8.2