Bayesian Networks: Independences and Inference
Scott Davies and Andrew Moore

Note to other teachers and users of these slides: Andrew and Scott would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received.

What Independences does a Bayes Net Model?

In order for a Bayesian network to model a probability distribution, the following must be true by definition: each variable is conditionally independent of all its non-descendants in the graph, given the value of all its parents.

This implies:

    P(X1, ..., Xn) = Π_{i=1..n} P(Xi | parents(Xi))

But what else does it imply?

What Independences does a Bayes Net Model?

Example: Z -> Y -> X

Given Y, does learning the value of Z tell us nothing new about X? I.e., is P(X | Y, Z) equal to P(X | Y)?

Yes. Since we know the value of all of X's parents (namely, Y), and Z is not a descendant of X, X is conditionally independent of Z.

Also, since independence is symmetric, P(Z | Y, X) = P(Z | Y).

Quick proof that independence is symmetric

Assume: P(X | Y, Z) = P(X | Y)

Then:

    P(Z | X, Y) = P(X, Y | Z) P(Z) / P(X, Y)                     (Bayes's Rule)
                = P(Y | Z) P(X | Y, Z) P(Z) / (P(X | Y) P(Y))    (Chain Rule)
                = P(Y | Z) P(X | Y) P(Z) / (P(X | Y) P(Y))       (By Assumption)
                = P(Y | Z) P(Z) / P(Y)
                = P(Z | Y)                                       (Bayes's Rule)

What Independences does a Bayes Net Model?

Let I<X, Y, Z> represent X and Z being conditionally independent given Y.

[Figure: network over X, Y and Z]

I<X, Y, Z>? Yes, just as in the previous example: all of X's parents are given, and Z is not a descendant.

What Independences does a Bayes Net Model?

[Figure: network over X, Z, U and V]

I<X, {U}, Z>? No.
I<X, {U, V}, Z>? Yes.

Maybe I<X, S, Z> iff S acts as a cutset between X and Z in an undirected version of the graph?
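The chain example and the symmetry proof above can be checked numerically. The following is a minimal sketch, assuming binary variables and made-up conditional probability tables for the chain Z -> Y -> X (the numbers and the helper names `prob`, `cond` and `max_gap` are illustrative, not from the slides). It builds the joint from the Bayes net factorization and measures how far P(X | Y, Z) is from P(X | Y), and P(Z | Y, X) from P(Z | Y).

```python
from itertools import product

# Hypothetical CPTs for the chain Z -> Y -> X (all variables binary).
p_z = {0: 0.7, 1: 0.3}
p_y_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p_y_given_z[z][y]
p_x_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_x_given_y[y][x]

# Joint distribution from the factorization P(z) P(y | z) P(x | y).
joint = {
    (x, y, z): p_z[z] * p_y_given_z[z][y] * p_x_given_y[y][x]
    for x, y, z in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Probability of a partial assignment, e.g. prob(x=1, y=0)."""
    total = 0.0
    for (x, y, z), p in joint.items():
        values = {"x": x, "y": y, "z": z}
        if all(values[name] == v for name, v in fixed.items()):
            total += p
    return total


def cond(target, given):
    """P(target | given); both arguments map variable names to values."""
    return prob(**target, **given) / prob(**given)


def max_gap():
    """Largest violation of P(X|Y,Z) = P(X|Y) or P(Z|Y,X) = P(Z|Y)."""
    gap = 0.0
    for x, y, z in product((0, 1), repeat=3):
        gap = max(gap, abs(cond({"x": x}, {"y": y, "z": z}) - cond({"x": x}, {"y": y})))
        gap = max(gap, abs(cond({"z": z}, {"y": y, "x": x}) - cond({"z": z}, {"y": y})))
    return gap


print(max_gap() < 1e-12)
```

Any other choice of tables for the same chain should give the same result, since both equalities follow from the graph structure rather than from the particular numbers.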
Things get a little more confusing

[Figure: X -> Y <- Z]

X has no parents, so we know all its parents' values trivially. Z is not a descendant of X. So I<X, {}, Z>, even though there is an undirected path from X to Z through an unknown variable Y.

What if we do know the value of Y, though? Or one of its descendants?

The "Burglar Alarm" example

[Figure: Burglar -> Alarm <- Earthquake, Alarm -> Phone Call]

Your house has a twitchy burglar alarm that is also sometimes triggered by earthquakes. Earth arguably doesn't care whether your house is currently being burgled. While you are on vacation, one of your neighbors calls and tells you your home's burglar alarm is ringing. Uh oh!

Things get a lot more confusing

But now suppose you learn that there was a medium-sized earthquake in your neighborhood. Oh, whew! Probably not a burglar after all. Earthquake "explains away" the hypothetical burglar.

But then it must not be the case that I<Burglar, {Phone Call}, Earthquake>, even though I<Burglar, {}, Earthquake>!

d-separation to the rescue

Fortunately, there is a relatively simple algorithm for determining whether two variables in a Bayesian network are conditionally independent: d-separation.

Definition: X and Z are d-separated by a set of evidence variables E iff every undirected path from X to Z is "blocked", where a path is "blocked" iff one or more of the following conditions is true.

A path is "blocked" when...

1. There exists a variable V on the path such that it is in the evidence set E, and the arcs putting V in the path are "tail-to-tail".
2. Or, there exists a variable V on the path such that it is in the evidence set E, and the arcs putting V in the path are "tail-to-head".
3. Or (the funky case), there exists a variable V on the path such that it is NOT in the evidence set E, neither are any of its descendants, and the arcs putting V on the path are "head-to-head".
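The blocking conditions above can be applied directly on small graphs. The sketch below is a brute-force check, assuming the network is given as a list of (parent, child) arcs; it enumerates every simple undirected path, so it is exponential in the worst case and is not the linear-time algorithm. The function names and the node name `PhoneCall` are illustrative choices.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following arcs forward."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(neighbors, start, goal, path=None):
    """Yield every simple undirected path from start to goal."""
    path = [start] if path is None else path
    if start == goal:
        yield list(path)
        return
    for nxt in neighbors[start]:
        if nxt not in path:
            path.append(nxt)
            yield from simple_paths(neighbors, nxt, goal, path)
            path.pop()


def path_blocked(path, arcs, children, evidence):
    """Apply the three blocking conditions to each interior node V."""
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        head_to_head = (prev, v) in arcs and (nxt, v) in arcs
        if head_to_head:
            # Funky case: blocked if neither V nor any descendant is observed.
            if v not in evidence and not (descendants(children, v) & evidence):
                return True
        elif v in evidence:
            # Tail-to-tail or tail-to-head, with V observed.
            return True
    return False


def d_separated(arcs, x, z, evidence):
    """True iff every undirected path from x to z is blocked by evidence."""
    arcs, evidence = set(arcs), set(evidence)
    children, neighbors = {}, {}
    for parent, child in arcs:
        children.setdefault(parent, set()).add(child)
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)
    neighbors.setdefault(x, set())
    neighbors.setdefault(z, set())
    return all(
        path_blocked(p, arcs, children, evidence)
        for p in simple_paths(neighbors, x, z)
    )


alarm_net = [("Burglar", "Alarm"), ("Earthquake", "Alarm"), ("Alarm", "PhoneCall")]
print(d_separated(alarm_net, "Burglar", "Earthquake", set()))          # True
print(d_separated(alarm_net, "Burglar", "Earthquake", {"PhoneCall"}))  # False
```

On the burglar alarm network this reproduces the explaining-away behaviour: Burglar and Earthquake are d-separated with no evidence, but not once the phone call (a descendant of Alarm) is observed.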
d-separation to the rescue, cont'd

Theorem [Verma & Pearl, 1998]: If a set of evidence variables E d-separates X and Z in a Bayesian network's graph, then I<X, E, Z>.

d-separation can be computed in linear time using a depth-first-search-like algorithm.

Great! We now have a fast algorithm for automatically inferring whether learning the value of one variable might give us any additional hints about some other variable, given what we already know.

"Might": variables may actually be independent when they're not d-separated, depending on the actual probabilities involved.

d-separation example

[Figure: network over the nodes A, B, C, D, E, F, G, H, I, J]

I<C, {}, D>?
I<C, {A}, D>?
I<C, {A, B}, D>?
I<C, {A, B, J}, D>?
I<C, {A, B, E, J}, D>?

Bayesian Network Inference

Inference: calculating P(X | Y) for some variables or sets of variables X and Y.

Inference in Bayesian networks is #P-hard!

[Figure: inputs I1, I2, I3, I4, I5, each with prior probability 0.5, feeding an output node O]

The question "How many satisfying assignments?" reduces to computing P(O): P(O) must be (# satisfying assignments) * 0.5^(# inputs).

Bayesian Network Inference

But inference is still tractable in some cases. Let's look at a special class of networks: trees / forests in which each node has at most one parent.

Decomposing the probabilities

Suppose we want P(Xi | E), where E is some set of evidence variables. Let's split E into two parts:

Ei- is the part consisting of assignments to variables in the subtree rooted at Xi.
Ei+ is the rest of it.
Decomposing the probabilities, cont'd

    P(Xi | E) = P(Xi | Ei-, Ei+)
              = P(Ei- | Xi, Ei+) P(Xi | Ei+) / P(Ei- | Ei+)
              = P(Ei- | Xi) P(Xi | Ei+) / P(Ei- | Ei+)
              = α π(Xi) λ(Xi)

Where:
α is a constant independent of Xi
π(Xi) = P(Xi | Ei+)
λ(Xi) = P(Ei- | Xi)

Using the decomposition for inference

We can use this decomposition to do inference as follows. First, compute λ(Xi) = P(Ei- | Xi) for all Xi recursively, using the leaves of the tree as the base case.

If Xi is a leaf:
If Xi is in E: λ(Xi) = 1 if Xi matches E, 0 otherwise.
If Xi is not in E: Ei- is the null set, so P(Ei- | Xi) = 1 (constant).

Quick aside: "Virtual evidence"

For theoretical simplicity, but without loss of generality, let's assume that all variables in E (the evidence set) are leaves in the tree.

Why can we do this WLOG: observing Xi is equivalent to observing a new leaf Xi' attached as a child of Xi, where P(Xi' | Xi) = 1 if Xi' = Xi, 0 otherwise.
Calculating λ(Xi) for non-leaves

Suppose Xi has one child, Xc. Then:

    λ(Xi) = P(Ei- | Xi)
          = Σ_j P(Ei-, Xc = j | Xi)
          = Σ_j P(Xc = j | Xi) P(Ei- | Xi, Xc = j)
          = Σ_j P(Xc = j | Xi) P(Ei- | Xc = j)
          = Σ_j P(Xc = j | Xi) λ(Xc = j)

Calculating λ(Xi) for non-leaves

Now suppose Xi has a set of children, C. Since Xi d-separates each of its subtrees, the contribution of each subtree to λ(Xi) is independent:

    λ(Xi) = P(Ei- | Xi)
          = Π_{Xj in C} λ_j(Xi)
          = Π_{Xj in C} [ Σ_{Xj} P(Xj | Xi) λ(Xj) ]

where λ_j(Xi) is the contribution to P(Ei- | Xi) of the part of the evidence lying in the subtree rooted at one of Xi's children, Xj.

We are now λ-happy

So now we have a way to recursively compute all the λ(Xi)'s, starting from the root and using the leaves as the base case.

If we want, we can think of each node in the network as an autonomous processor that passes a little "λ message" to its parent.

The other half of the problem

Remember, P(Xi | E) = α π(Xi) λ(Xi). Now that we have all the λ(Xi)'s, what about the π(Xi)'s?

π(Xi) = P(Xi | Ei+).

What about the root of the tree, Xr? In that case, Er+ is the null set, so π(Xr) = P(Xr). No sweat. Since we also know λ(Xr), we can compute the final P(Xr | E).

So for an arbitrary Xi with parent Xp, let's inductively assume we know π(Xp) and/or P(Xp | E). How do we get π(Xi)?
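Computing π(Xi)

    π(Xi) = P(Xi | Ei+)
          = Σ_j P(Xi, Xp = j | Ei+)
          = Σ_j P(Xi | Xp = j, Ei+) P(Xp = j | Ei+)
          = Σ_j P(Xi | Xp = j) P(Xp = j | Ei+)
          = Σ_j P(Xi | Xp = j) P(Xp = j | E) / λ_i(Xp = j)
          = Σ_j P(Xi | Xp = j) π_i(Xp = j)

Where π_i(Xp) is defined as P(Xp | E) / λ_i(Xp).

The division step holds up to a constant factor that does not depend on Xi, which the normalizer α absorbs.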
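The λ and π recursions can be combined into a short two-pass routine. The following is a minimal sketch, assuming a tree given by a hypothetical `parent` map, made-up conditional probability tables, and a function name (`tree_posteriors`) chosen for illustration. One detail differs from the derivation: instead of dividing P(Xp | E) by λ_i(Xp), the sketch multiplies π(Xp) by the λ messages of the other children, which is the same quantity up to a constant and avoids dividing by zero.

```python
def tree_posteriors(parent, cpt, prior, evidence, n_vals=2):
    """Posterior P(X_i | E) for every node of a tree-shaped Bayes net.

    parent[i]    : parent of node i (None for the root)
    cpt[i][j][v] : P(X_i = v | X_parent = j) for each non-root node i
    prior[v]     : P(X_root = v)
    evidence     : dict mapping observed nodes to their observed values
    """
    nodes = list(parent)
    children = {i: [c for c in nodes if parent[c] == i] for i in nodes}
    root = next(i for i in nodes if parent[i] is None)
    vals = range(n_vals)

    def indicator(i, v):
        # 1 if value v is consistent with the evidence on node i, else 0.
        return 1.0 if evidence.get(i, v) == v else 0.0

    lam, lam_msg = {}, {}  # lam[i][v] = λ(X_i = v); lam_msg[c][j] = λ_c(X_p = j)

    def compute_lambda(i):
        # Base case: evidence indicator (all ones for unobserved nodes).
        lam[i] = [indicator(i, v) for v in vals]
        for c in children[i]:
            compute_lambda(c)
            # λ message from child c: sum_v P(X_c = v | X_i = j) λ(X_c = v).
            lam_msg[c] = [sum(cpt[c][j][v] * lam[c][v] for v in vals) for j in vals]
            lam[i] = [lam[i][j] * lam_msg[c][j] for j in vals]

    pi = {}

    def compute_pi(i):
        for c in children[i]:
            # π_c(X_i = j): π(X_i = j) times the λ messages of c's siblings.
            msg = [pi[i][j] * indicator(i, j) for j in vals]
            for s in children[i]:
                if s != c:
                    msg = [msg[j] * lam_msg[s][j] for j in vals]
            pi[c] = [sum(cpt[c][j][v] * msg[j] for j in vals) for v in vals]
            compute_pi(c)

    compute_lambda(root)
    pi[root] = list(prior)
    compute_pi(root)

    posteriors = {}
    for i in nodes:
        unnorm = [pi[i][v] * lam[i][v] for v in vals]
        total = sum(unnorm)  # plays the role of 1 / α
        posteriors[i] = [u / total for u in unnorm]
    return posteriors


# Hypothetical example tree: A -> B, A -> C, B -> D, all binary.
parent = {"A": None, "B": "A", "C": "A", "D": "B"}
prior = [0.6, 0.4]
cpt = {
    "B": [[0.7, 0.3], [0.2, 0.8]],
    "C": [[0.9, 0.1], [0.5, 0.5]],
    "D": [[0.8, 0.2], [0.1, 0.9]],
}
post = tree_posteriors(parent, cpt, prior, evidence={"D": 1, "C": 0})
print({name: [round(p, 4) for p in dist] for name, dist in post.items()})
```

The messages are left unnormalized throughout and normalized only once per node at the end, which is where the constant α enters.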
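We're done. Yay!

Thus we can compute all the π(Xi)'s, and, in turn, all the P(Xi | E)'s.

We can think of nodes as autonomous processors passing λ and π messages to their neighbors.

Conjunctive queries

What if we want, e.g., P(A, B | C) instead of just the marginal distributions P(A | C) and P(B | C)?

Just use the chain rule:

    P(A, B | C) = P(A | C) P(B | A, C)

Each of the latter probabilities can be computed using the technique just discussed.

Polytrees

The technique can be generalized to polytrees: undirected versions of the graphs are still trees, but nodes can have more than one parent.

Dealing with cycles

We can deal with undirected cycles in the graph by:

Clustering variables together.
[Figure: a network containing an undirected cycle, before and after clustering]

Conditioning.
[Figure: the network instantiated twice, once with the conditioning variable set to 0 and once with it set to 1]

Join trees

An arbitrary Bayesian network can be transformed via some evil graph-theoretic magic into a join tree in which a similar method can be employed.

[Figure: a Bayesian network and the corresponding join tree, whose nodes are clusters of the original variables]

In the worst case the join tree nodes must take on exponentially many combinations of values, but the method often works well in practice.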
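The chain-rule trick for conjunctive queries from earlier in this part can be illustrated on a tiny network. This is a minimal sketch, assuming a hypothetical binary chain A -> B -> C with made-up tables; `marginal_query` answers single-variable queries by plain enumeration and merely stands in for the λ/π message-passing technique.

```python
from itertools import product

# Hypothetical CPTs for the chain A -> B -> C (all variables binary).
p_a = [0.3, 0.7]
p_b_given_a = [[0.9, 0.1], [0.4, 0.6]]  # p_b_given_a[a][b]
p_c_given_b = [[0.8, 0.2], [0.3, 0.7]]  # p_c_given_b[b][c]


def joint(a, b, c):
    """P(A=a, B=b, C=c) from the Bayes net factorization."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


def marginal_query(var, value, evidence):
    """P(var = value | evidence), computed by enumeration."""
    num = den = 0.0
    for a, b, c in product((0, 1), repeat=3):
        world = {"A": a, "B": b, "C": c}
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(a, b, c)
            den += p
            if world[var] == value:
                num += p
    return num / den


def conjunctive_query(a, b, c):
    """P(A=a, B=b | C=c) as P(A=a | C=c) * P(B=b | A=a, C=c)."""
    first = marginal_query("A", a, {"C": c})
    second = marginal_query("B", b, {"A": a, "C": c})
    return first * second


def direct(a, b, c):
    """P(A=a, B=b | C=c) computed directly from the joint, for comparison."""
    norm = sum(joint(i, j, c) for i, j in product((0, 1), repeat=2))
    return joint(a, b, c) / norm


print(all(
    abs(conjunctive_query(a, b, c) - direct(a, b, c)) < 1e-12
    for a, b, c in product((0, 1), repeat=3)
))
```

The second factor is an ordinary single-variable query in which A has simply been added to the evidence set, which is why the marginal technique suffices.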