Written HW 9 Sol. CS 188 Fall Introduction to Artificial Intelligence

CS 188 Fall 2018
Introduction to Artificial Intelligence
Written HW 9 Sol.

Self-assessment due: Tuesday 11/13/2018 at 11:59pm (submit via Gradescope)

For the self-assessment, fill in the self-assessment boxes in your original submission (you can download a PDF copy of your submission from Gradescope; be sure to delete any extra title pages that Gradescope attaches). For each subpart where your original answer was correct, write "correct." Otherwise, write and explain the correct answer. Do not leave any boxes empty.

If you did not submit the homework (or skipped some questions) but wish to receive credit for the self-assessment, we ask that you first complete the homework without looking at the solutions, and then perform the self-assessment afterwards.

Q1. Particle Filtering: Where are the Two Cars?

As before, we are trying to estimate the location of cars in a city, but now we model two cars jointly, i.e. car $i$ for $i \in \{1, 2\}$. The modified HMM model is as follows:

  $X^{(i)}$: the location of car $i$
  $S^{(i)}$: the noisy location of car $i$ from the signal strength at a nearby cell phone tower
  $G^{(i)}$: the noisy location of car $i$ from GPS

[Figure: the two-car HMM over time steps $t-1$, $t$, $t+1$, with hidden states $X^{(1)}$, $X^{(2)}$ and observations $S^{(1)}$, $G^{(1)}$, $S^{(2)}$, $G^{(2)}$ at each step.]

   d    D(d)    E_L(d)    E_N(d)    E_G(d)
  -4    0.05    0         0.02      0
  -3    0.10    0         0.04      0.03
  -2    0.25    0.05      0.09      0.07
  -1    0.10    0.10      0.20      0.15
   0    0       0.70      0.30      0.50
   1    0.10    0.10      0.20      0.15
   2    0.25    0.05      0.09      0.07
   3    0.10    0         0.04      0.03
   4    0.05    0         0.02      0

The signal strength from one car gets noisier if the other car is at the same location. Thus, the observation $S_t^{(i)}$ depends on the current state of the other car $X_t^{(j)}$, $j \neq i$.

The transition is modeled using a drift model $D$, the GPS observation $G_t^{(i)}$ using the error model $E_G$, and the observation $S_t^{(i)}$ using one of the error models $E_L$ or $E_N$, depending on the car's speed and the relative location of both cars. These drift and error models are in the table above. The transition and observation models are:

$$P(X_t^{(i)} \mid X_{t-1}^{(i)}) = D(X_t^{(i)} - X_{t-1}^{(i)})$$

$$P(S_t^{(i)} \mid X_{t-1}^{(i)}, X_t^{(i)}, X_t^{(j)}) = \begin{cases} E_N(X_t^{(i)} - S_t^{(i)}), & \text{if } |X_t^{(i)} - X_{t-1}^{(i)}| \geq 2 \text{ or } X_t^{(i)} = X_t^{(j)} \\ E_L(X_t^{(i)} - S_t^{(i)}), & \text{otherwise} \end{cases}$$

$$P(G_t^{(i)} \mid X_t^{(i)}) = E_G(X_t^{(i)} - G_t^{(i)})$$

Throughout this problem you may give answers either as unevaluated numeric expressions (e.g. $0.1 \cdot 0.5$) or as numeric values (e.g. 0.05). The questions are decoupled.

(a) Assume that at $t = 3$, we have the single particle $(X_3^{(1)} = -1, X_3^{(2)} = 2)$.

(i) What is the probability that this particle becomes $(X_4^{(1)} = -3, X_4^{(2)} = 3)$ after passing it through the dynamics model?

$$
\begin{aligned}
& P(X_4^{(1)} = -3, X_4^{(2)} = 3 \mid X_3^{(1)} = -1, X_3^{(2)} = 2) \\
&= P(X_4^{(1)} = -3 \mid X_3^{(1)} = -1) \cdot P(X_4^{(2)} = 3 \mid X_3^{(2)} = 2) \\
&= D(-3 - (-1)) \cdot D(3 - 2) \\
&= 0.25 \cdot 0.10 = 0.025
\end{aligned}
$$

Answer: 0.025
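The dynamics-model computation in part (a)(i) can be checked with a short Python sketch. The dictionary `D` copies the drift column of the table; the function name `transition_prob` is hypothetical, and treating displacements outside the table as probability 0 is an assumption (the listed values already sum to 1).

```python
# Drift model D(d) from the Q1 table, keyed by displacement d = X_t - X_{t-1}.
D = {-4: 0.05, -3: 0.10, -2: 0.25, -1: 0.10, 0: 0.0,
     1: 0.10, 2: 0.25, 3: 0.10, 4: 0.05}


def transition_prob(prev, nxt):
    """P(nxt | prev) for a joint particle (x1, x2).

    The two cars move independently, so the joint transition
    probability is the product of the per-car drift probabilities.
    """
    p = 1.0
    for x_prev, x_next in zip(prev, nxt):
        # Assumption: a displacement not listed in the table has probability 0.
        p *= D.get(x_next - x_prev, 0.0)
    return p


# Part (a)(i): particle (-1, 2) at t = 3 moving to (-3, 3) at t = 4.
print(transition_prob((-1, 2), (-3, 3)))  # D(-2) * D(1), approximately 0.025
```

Multiplying the results of consecutive calls gives the probability of a longer trajectory, since each step depends only on the previous state.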

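(ii) Assume that there are no sensor readings at $t = 4$. What is the joint probability that the original single particle (from $t = 3$) becomes $(X_4^{(1)} = -3, X_4^{(2)} = 3)$ and then becomes $(X_5^{(1)} = -4, X_5^{(2)} = 4)$?

$$
\begin{aligned}
& P(X_4^{(1)} = -3, X_5^{(1)} = -4, X_4^{(2)} = 3, X_5^{(2)} = 4 \mid X_3^{(1)} = -1, X_3^{(2)} = 2) \\
&= P(X_4^{(1)} = -3, X_5^{(1)} = -4 \mid X_3^{(1)} = -1) \cdot P(X_4^{(2)} = 3, X_5^{(2)} = 4 \mid X_3^{(2)} = 2) \\
&= P(X_5^{(1)} = -4 \mid X_4^{(1)} = -3) \cdot P(X_4^{(1)} = -3 \mid X_3^{(1)} = -1) \cdot P(X_5^{(2)} = 4 \mid X_4^{(2)} = 3) \cdot P(X_4^{(2)} = 3 \mid X_3^{(2)} = 2) \\
&= D(-4 - (-3)) \cdot D(-3 - (-1)) \cdot D(4 - 3) \cdot D(3 - 2) \\
&= 0.10 \cdot 0.25 \cdot 0.10 \cdot 0.10 = 0.00025
\end{aligned}
$$

Answer: 0.00025

For the remainder of this problem, we will be using 2 particles at each time step.

(b) At $t = 6$, we have particles $[(X_6^{(1)} = 3, X_6^{(2)} = 0), (X_6^{(1)} = 3, X_6^{(2)} = 5)]$. Suppose that after weighting, resampling, and transitioning from $t = 6$ to $t = 7$, the particles become $[(X_7^{(1)} = 2, X_7^{(2)} = 2), (X_7^{(1)} = 4, X_7^{(2)} = 1)]$.

(i) At $t = 7$, you get the observations $S_7^{(1)} = 2$, $G_7^{(1)} = 2$, $S_7^{(2)} = 2$, $G_7^{(2)} = 2$. What is the weight of each particle?

Particle $(X_7^{(1)} = 2, X_7^{(2)} = 2)$, weight:

$$
\begin{aligned}
& P(S_7^{(1)} = 2 \mid X_6^{(1)} = 3, X_7^{(1)} = 2, X_7^{(2)} = 2) \cdot P(G_7^{(1)} = 2 \mid X_7^{(1)} = 2) \\
& \quad \cdot P(S_7^{(2)} = 2 \mid X_6^{(2)} = 0, X_7^{(2)} = 2, X_7^{(1)} = 2) \cdot P(G_7^{(2)} = 2 \mid X_7^{(2)} = 2) \\
&= E_N(2 - 2) \cdot E_G(2 - 2) \cdot E_N(2 - 2) \cdot E_G(2 - 2) \\
&= 0.30 \cdot 0.50 \cdot 0.30 \cdot 0.50 = 0.0225
\end{aligned}
$$

Particle $(X_7^{(1)} = 4, X_7^{(2)} = 1)$, weight:

$$
\begin{aligned}
& P(S_7^{(1)} = 2 \mid X_6^{(1)} = 3, X_7^{(1)} = 4, X_7^{(2)} = 1) \cdot P(G_7^{(1)} = 2 \mid X_7^{(1)} = 4) \\
& \quad \cdot P(S_7^{(2)} = 2 \mid X_6^{(2)} = 5, X_7^{(2)} = 1, X_7^{(1)} = 4) \cdot P(G_7^{(2)} = 2 \mid X_7^{(2)} = 1) \\
&= E_L(4 - 2) \cdot E_G(4 - 2) \cdot E_N(1 - 2) \cdot E_G(1 - 2) \\
&= 0.05 \cdot 0.07 \cdot 0.20 \cdot 0.15 = 0.000105
\end{aligned}
$$

(ii) Suppose both cars' cell phones died, so you only get the observations $G_7^{(1)} = 2$, $G_7^{(2)} = 2$. What is the weight of each particle?

Particle $(X_7^{(1)} = 2, X_7^{(2)} = 2)$, weight:

$$P(G_7^{(1)} = 2 \mid X_7^{(1)} = 2) \cdot P(G_7^{(2)} = 2 \mid X_7^{(2)} = 2) = E_G(2 - 2) \cdot E_G(2 - 2) = 0.50 \cdot 0.50 = 0.25$$

Particle $(X_7^{(1)} = 4, X_7^{(2)} = 1)$, weight:

$$P(G_7^{(1)} = 2 \mid X_7^{(1)} = 4) \cdot P(G_7^{(2)} = 2 \mid X_7^{(2)} = 1) = E_G(4 - 2) \cdot E_G(1 - 2) = 0.07 \cdot 0.15 = 0.0105$$

(c) To decouple this question, assume that you got the following weights for the two particles.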

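  Particle                                Weight
  $(X_7^{(1)} = 2, X_7^{(2)} = 2)$        0.09
  $(X_7^{(1)} = 4, X_7^{(2)} = 1)$        0.01

What is the belief for the location of car 1 and car 2 at $t = 7$?

  Location           $P(X_7^{(1)})$                 $P(X_7^{(2)})$
  $X_7^{(i)} = 1$    0 / (0.09 + 0.01) = 0          0.01 / (0.09 + 0.01) = 0.1
  $X_7^{(i)} = 2$    0.09 / (0.09 + 0.01) = 0.9     0.09 / (0.09 + 0.01) = 0.9
  $X_7^{(i)} = 4$    0.01 / (0.09 + 0.01) = 0.1     0 / (0.09 + 0.01) = 0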
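The particle weights in part (b) and the beliefs in part (c) can be reproduced with the following Python sketch. The error-model dictionaries copy the nonzero entries of the Q1 table; the function names `weight` and `belief` are hypothetical, values outside the table are assumed to have probability 0, and each $t = 7$ particle is assumed to descend from the $t = 6$ particle at the same list index, which is the pairing the solution uses.

```python
from collections import defaultdict

# Error models from the Q1 table, keyed by d = X - observation (nonzero entries only).
E_L = {-2: 0.05, -1: 0.10, 0: 0.70, 1: 0.10, 2: 0.05}
E_N = {-4: 0.02, -3: 0.04, -2: 0.09, -1: 0.20, 0: 0.30,
       1: 0.20, 2: 0.09, 3: 0.04, 4: 0.02}
E_G = {-3: 0.03, -2: 0.07, -1: 0.15, 0: 0.50, 1: 0.15, 2: 0.07, 3: 0.03}


def weight(prev, cur, S=None, G=None):
    """Observation likelihood of a joint particle (x1, x2).

    prev and cur are the particle's states at t-1 and t. S and G are the
    signal-strength and GPS readings for both cars; pass None to skip one.
    """
    w = 1.0
    for i in (0, 1):
        j = 1 - i  # index of the other car
        if S is not None:
            # Noisy model if the car moved at least 2 cells or shares a cell.
            noisy = abs(cur[i] - prev[i]) >= 2 or cur[i] == cur[j]
            model = E_N if noisy else E_L
            w *= model.get(cur[i] - S[i], 0.0)
        if G is not None:
            w *= E_G.get(cur[i] - G[i], 0.0)
    return w


def belief(particles, weights, car):
    """Normalized weighted histogram of one car's location."""
    total = sum(weights)
    b = defaultdict(float)
    for particle, w in zip(particles, weights):
        b[particle[car]] += w / total
    return dict(b)


prev = [(3, 0), (3, 5)]  # particles at t = 6
cur = [(2, 2), (4, 1)]   # particles at t = 7

w_full = [weight(p, c, S=(2, 2), G=(2, 2)) for p, c in zip(prev, cur)]  # part (b)(i)
w_gps = [weight(p, c, G=(2, 2)) for p, c in zip(prev, cur)]             # part (b)(ii)
print(w_full)
print(w_gps)

# Part (c): beliefs under the given weights 0.09 and 0.01.
print(belief(cur, [0.09, 0.01], car=0))
print(belief(cur, [0.09, 0.01], car=1))
```

Locations that no particle occupies, such as location 1 for car 1, are simply absent from the returned dictionary, which corresponds to a belief of 0.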

Q2. Naive Bayes

Your friend claims that he can write an effective Naive Bayes spam detector with only three features: the hour of the day that the email was received ($H \in \{1, 2, \dots, 24\}$), whether it contains the word "viagra" ($W \in \{\text{yes}, \text{no}\}$), and whether the email address of the sender is Known in his address book, Seen before in his inbox, or Unseen before ($E \in \{K, S, U\}$).

(a) Flesh out the following information about this Bayes net:

Graph structure: the class node spam is the parent of each of the three feature nodes $H$, $W$, and $E$.

Parameters: $\theta_{spam}$, $\theta_{H,i,c}$, $\theta_{W,c}$, $\theta_{E,j,c}$, with $i \in \{1, \dots, 23\}$, $j \in \{K, S\}$, $c \in \{\text{spam}, \text{ham}\}$, is a correct minimal parameterization. Note that the sum-to-one constraint on distributions results in one fewer parameter than the number of settings of a variable. For instance, $\theta_{spam}$ suffices because $\theta_{ham} = 1 - \theta_{spam}$. Aside: a non-minimal but correct parameterization was also accepted since the question did not ask for minimal parameters.

Size of the set of parameters: $1 + 23 \cdot 2 + 2 + 2 \cdot 2 = 53$. The size of the set is the sum of the parameter sizes. Every parameter has size = (number of free values) $\cdot$ (number of settings of its parents). For instance, $\theta_{H,i,c}$ has 23 free values of the hour $H$, and its parent $c$, the class, has 2 settings.

Suppose now that you labeled three of the emails in your mailbox to test this idea:

  spam or ham?    H     W      E
  spam            3     yes    S
  ham             14    no     K
  ham             15    no     K

(b) Use the three instances to estimate the maximum likelihood parameters.

The maximum likelihood estimates are the sample proportions:

$\theta_{spam} = 1/3$, $\theta_{H,3,spam} = 1$, $\theta_{H,14,ham} = 1/2$, $\theta_{H,15,ham} = 1/2$, $\theta_{W,spam} = 1.0$, $\theta_{E,S,spam} = 1$, $\theta_{E,K,ham} = 1$. All remaining parameters are 0.

(c) Using the maximum likelihood parameters, find the predicted class of a new datapoint with $H = 3$, $W = \text{no}$, $E = U$.

No prediction can be made. Since $E = U$ is never observed, it has zero likelihood under both classes.
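The maximum likelihood estimates in part (b), and the zero-likelihood problem in part (c), can be illustrated with a small Python sketch. The tuple layout of `data` and the function name `mle` are illustrative choices, not part of the original problem.

```python
from collections import Counter

# The three labeled emails from the table above: (label, H, W, E).
data = [("spam", 3, "yes", "S"), ("ham", 14, "no", "K"), ("ham", 15, "no", "K")]


def mle(data, feature_index, label):
    """Sample proportions of one feature's values within one class."""
    values = [row[feature_index] for row in data if row[0] == label]
    counts = Counter(values)
    return {v: c / len(values) for v, c in counts.items()}


# Class prior: fraction of emails labeled spam.
prior_spam = sum(1 for row in data if row[0] == "spam") / len(data)
print(prior_spam)

# Hour distribution given ham: hours 14 and 15 each get probability 1/2.
print(mle(data, 1, "ham"))

# A value never seen in a class is missing from the estimate, i.e. probability 0,
# so E = U has zero likelihood under both spam and ham.
for label in ("spam", "ham"):
    print(label, mle(data, 3, label).get("U", 0.0))
```

Because both class scores are multiplied by 0 for $E = U$, the two posteriors tie at zero and cannot be compared.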

(d) Now use the three instances to estimate the parameters using Laplace smoothing with $k = 2$. Do not forget to smooth both the class prior parameters and the feature value parameters.

The Laplace smoothed estimate for a categorical variable $X$ with parameters $\theta_1, \dots, \theta_d$ for the values $\{1, \dots, d\}$ of $X$ is

$$\theta_i = \frac{x_i + k}{N + kd}$$

where $x_i$ is the number of times value $i$ is observed, $N$ is the total number of observations, and $d$ is the number of values of $X$.

$\theta_{spam} = 3/7$, $\theta_{H,3,spam} = 3/49$, $\theta_{H,other,spam} = 2/49$, $\theta_{H,14,ham} = 3/50$, $\theta_{H,15,ham} = 3/50$, $\theta_{H,other,ham} = 2/50$, $\theta_{W,spam} = 3/5$, $\theta_{W,ham} = 2/6$, $\theta_{E,S,spam} = 3/7$, $\theta_{E,other,spam} = 2/7$, $\theta_{E,K,ham} = 4/8$, $\theta_{E,other,ham} = 2/8$.

(e) Using the parameters obtained with Laplace smoothing, find the predicted class of a new datapoint with $H = 3$, $W = \text{no}$, $E = U$.

Ham. The probability under the model for each class is computed as the product of the class prior and the feature conditionals:

$$p(\text{ham}) \propto (1 - \theta_{spam})(\theta_{H,other,ham})(1 - \theta_{W,ham})(1 - \theta_{E,K,ham} - \theta_{E,other,ham})$$

and

$$p(\text{spam}) \propto (\theta_{spam})(\theta_{H,3,spam})(1 - \theta_{W,spam})(1 - \theta_{E,S,spam} - \theta_{E,other,spam})$$

where both are only proportional because the distribution has not been normalized. Numerically, $p(\text{ham}) \propto (4/7)(2/50)(4/6)(2/8) \approx 0.0038$ and $p(\text{spam}) \propto (3/7)(3/49)(2/5)(2/7) \approx 0.0030$, so ham has the larger score.

(f) You observe that you tend to receive spam emails in batches. In particular, if you receive one spam message, the next message is more likely to be a spam message as well. Explain a new graphical model which most naturally captures this phenomenon.

Graph structure: The structure is the same as an HMM, except each hidden state node has three observation child nodes.

Parameters: Add 2 parameters: the transition to spam from spam and from ham.

Size of the set of parameters: Add 2 to the expression in the first question.
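The Laplace-smoothed estimates in part (d) and the prediction in part (e) can be verified with exact fractions. This is a minimal sketch: the names `NUM_VALUES`, `smoothed`, and `score` are hypothetical, and features are addressed by their position in the data tuples.

```python
from fractions import Fraction

# The three labeled emails: (label, H, W, E).
data = [("spam", 3, "yes", "S"), ("ham", 14, "no", "K"), ("ham", 15, "no", "K")]

# Number of possible values per feature, indexed by tuple position: H, W, E.
NUM_VALUES = {1: 24, 2: 2, 3: 3}
K = 2  # Laplace smoothing strength


def smoothed(count, total, num_values, k=K):
    """Laplace estimate (x_i + k) / (N + k * d)."""
    return Fraction(count + k, total + k * num_values)


def score(label, x):
    """Unnormalized posterior: smoothed class prior times smoothed feature conditionals."""
    rows = [r for r in data if r[0] == label]
    s = smoothed(len(rows), len(data), 2)  # class prior over {spam, ham}
    for idx, value in x.items():
        count = sum(1 for r in rows if r[idx] == value)
        s *= smoothed(count, len(rows), NUM_VALUES[idx])
    return s


# New datapoint from part (e): H = 3, W = no, E = U.
x = {1: 3, 2: "no", 3: "U"}
scores = {label: score(label, x) for label in ("spam", "ham")}
for label, s in scores.items():
    print(label, s, float(s))
print(max(scores, key=scores.get))
```

Using `Fraction` keeps the intermediate values identical to the hand-computed ratios (for example 3/49 for $H = 3$ given spam), which makes it easy to compare against the listed parameters.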