Artificial Intelligence: Bayesian Networks
Adapted from slides by Tim Finin and Marie desJardins. Some material borrowed from Lise Getoor.

Outline
- Bayesian networks
  - Network structure
  - Conditional probability tables
  - Conditional independence
- Inference in Bayesian networks
  - Exact inference
  - Approximate inference

Bayesian Belief Networks (BNs)
Definition: BN = (DAG, CPD)
- DAG: directed acyclic graph (the BN's structure)
  - Nodes: random variables (typically binary or discrete, but methods also exist to handle continuous variables)
  - Arcs: indicate probabilistic dependencies between nodes (lack of a link signifies conditional independence)
- CPD: conditional probability distribution (the BN's parameters)
  - Conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT)
  - $P(x_i \mid \pi_i)$, where $\pi_i$ is the set of all parent nodes of $x_i$
- Root nodes are a special case: they have no parents, so just use priors in the CPD: $\pi_i = \emptyset$, so $P(x_i \mid \pi_i) = P(x_i)$

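Example BN
(Figure: network over nodes a, b, c, d, e with arcs a → b, a → c, b → d, c → d, c → e.)
- Node a: $P(A) = 0.001$
- Node b: $P(B \mid A) = 0.3$, $P(B \mid \neg A) = 0.001$
- Node c: $P(C \mid A) = 0.2$, $P(C \mid \neg A) = 0.005$
- Node d: $P(D \mid B, C) = 0.1$, $P(D \mid B, \neg C) = 0.01$, $P(D \mid \neg B, C) = 0.01$, $P(D \mid \neg B, \neg C) = 0.00001$
- Node e: $P(E \mid C) = 0.4$, $P(E \mid \neg C) = 0.002$
Note that we only specify $P(A)$ etc., not $P(\neg A)$, since they have to add to one.
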
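One possible way to hold these tables in code is sketched below in Python. The dictionary layout and the names `parents`, `cpt`, and `prob` are illustrative assumptions, not part of the slides; only the numbers come from the example. Each entry stores the probability that a node is true given its parents, and the complement is derived because the two values add to one.

```python
# Hypothetical encoding of the example network (names are illustrative).
# parents[X] lists the parent nodes of X, in a fixed order.
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C"), "E": ("C",)}

# cpt[X][parent_values] = P(X = True | parents of X = parent_values)
cpt = {
    "A": {(): 0.001},
    "B": {(True,): 0.3, (False,): 0.001},
    "C": {(True,): 0.2, (False,): 0.005},
    "D": {(True, True): 0.1, (True, False): 0.01,
          (False, True): 0.01, (False, False): 0.00001},
    "E": {(True,): 0.4, (False,): 0.002},
}


def prob(var, value, assignment):
    """Return P(var = value | parent values taken from assignment)."""
    key = tuple(assignment[p] for p in parents[var])
    p_true = cpt[var][key]
    # Only P(X = True | ...) is stored; the False case is the complement.
    return p_true if value else 1.0 - p_true


print(prob("B", True, {"A": True}))    # stored entry P(B | A)
print(prob("B", False, {"A": True}))   # derived entry P(not B | A)
```

Storing only the "true" row mirrors the note on the slide: the "false" row is redundant for binary variables.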
Conditional independence and chaining
- Conditional independence assumption: $P(x_i \mid \pi_i, q) = P(x_i \mid \pi_i)$, where $q$ is any set of variables (nodes) other than $x_i$ and its successors
- $\pi_i$ blocks the influence of other nodes on $x_i$ and its successors ($q$ influences $x_i$ only through variables in $\pi_i$)
- With this assumption, the complete joint probability distribution of all variables in the network can be represented by (recovered from) the local CPDs by chaining these CPDs:
  $P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \pi_i)$

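Chaining: Example
(Figure: the example network over nodes a, b, c, d, e.)
Computing the joint probability for all variables is easy:
  $P(a, b, c, d, e)$
  $= P(e \mid a, b, c, d)\, P(a, b, c, d)$  (by the product rule)
  $= P(e \mid c)\, P(a, b, c, d)$  (by the conditional independence assumption)
  $= P(e \mid c)\, P(d \mid a, b, c)\, P(a, b, c)$
  $= P(e \mid c)\, P(d \mid b, c)\, P(c \mid a, b)\, P(a, b)$
  $= P(e \mid c)\, P(d \mid b, c)\, P(c \mid a)\, P(b \mid a)\, P(a)$
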
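The chaining step can be checked numerically. The sketch below assumes the example network's CPT values and a hypothetical dictionary encoding (`parents`, `cpt`, `order`, and `joint` are illustrative names). It multiplies one CPT entry per node, then confirms that the 32 joint probabilities sum to one.

```python
from itertools import product

# Hypothetical encoding of the example network (illustrative names).
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C"), "E": ("C",)}
cpt = {  # cpt[X][parent_values] = P(X = True | parents)
    "A": {(): 0.001},
    "B": {(True,): 0.3, (False,): 0.001},
    "C": {(True,): 0.2, (False,): 0.005},
    "D": {(True, True): 0.1, (True, False): 0.01,
          (False, True): 0.01, (False, False): 0.00001},
    "E": {(True,): 0.4, (False,): 0.002},
}
order = ["A", "B", "C", "D", "E"]


def joint(assignment):
    """Chain the local CPDs: product over nodes of P(x_i | parents of x_i)."""
    p = 1.0
    for var in order:
        key = tuple(assignment[q] for q in parents[var])
        p_true = cpt[var][key]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


# P(a, b, c, d, e) with every variable true:
# P(e|c) P(d|b,c) P(c|a) P(b|a) P(a) = 0.4 * 0.1 * 0.2 * 0.3 * 0.001
all_true = dict.fromkeys(order, True)
print(joint(all_true))

# Sanity check: the joint over all 2^5 assignments should sum to 1.
total = sum(joint(dict(zip(order, values)))
            for values in product((True, False), repeat=len(order)))
print(total)
```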
Topological semantics
- A node is conditionally independent of its non-descendants given its parents
- A node is conditionally independent of all other nodes in the network given its parents, children, and children's parents (also known as its Markov blanket)
- The method called d-separation can be applied to decide whether a set of nodes X is independent of another set Y, given a third set Z

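As a small illustration of the Markov blanket definition, the sketch below collects a node's parents, children, and children's parents from a parent map. The graph is the five-node example network, and the helper names `children` and `markov_blanket` are assumptions made for this sketch.

```python
# Parent map of the example network (arcs a->b, a->c, b->d, c->d, c->e).
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C"), "E": ("C",)}


def children(node):
    """All nodes that list `node` among their parents."""
    return {child for child, ps in parents.items() if node in ps}


def markov_blanket(node):
    """Parents, children, and children's other parents of `node`."""
    blanket = set(parents[node])
    for child in children(node):
        blanket.add(child)
        blanket.update(parents[child])  # co-parents of this child
    blanket.discard(node)               # a node is not in its own blanket
    return blanket


print(sorted(markov_blanket("C")))  # parent A, children D and E, co-parent B
print(sorted(markov_blanket("E")))  # only its parent C
```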
Inference tasks
- Simple queries: compute the posterior marginal $P(X_i \mid E=e)$
  - E.g., P(NoGas | Gauge=empty, Lights=on, Starts=false)
- Conjunctive queries: $P(X_i, X_j \mid E=e) = P(X_i \mid E=e)\, P(X_j \mid X_i, E=e)$
- Optimal decisions: decision networks include utility information; probabilistic inference is required to find P(outcome | action, evidence)
- Value of information: which evidence should we seek next?
- Sensitivity analysis: which probability values are most critical?
- Explanation: why do I need a new starter motor?

Approaches to inference
- Exact inference
  - Enumeration
  - Belief propagation in polytrees
  - Variable elimination
  - Clustering / join tree algorithms
- Approximate inference
  - Stochastic simulation / sampling methods
  - Markov chain Monte Carlo methods
  - Genetic algorithms
  - Neural networks
  - Simulated annealing
  - Mean field theory

Direct inference with BNs
- Instead of computing the joint, suppose we just want the probability for one variable
- Exact methods of computation:
  - Enumeration
  - Variable elimination
  - Join trees: get the probabilities associated with every query variable

Inference by enumeration
- Add all of the terms (atomic event probabilities) from the full joint distribution
- If E are the evidence (observed) variables and Y are the other (unobserved) variables, then:
  $P(X \mid e) = \alpha\, P(X, e) = \alpha \sum_{y} P(X, e, y)$
  where $\alpha$ is a normalizing constant
- Each $P(X, e, y)$ term can be computed using the chain rule
- Computationally expensive!

Example: Enumeration
(Figure: the example network over nodes a, b, c, d, e.)
- $P(x_i) = \sum_{\pi_i} P(x_i \mid \pi_i)\, P(\pi_i)$
- Suppose we want P(D=true), and only the value of E is given as true
- $P(d \mid e) = \alpha \sum_{A,B,C} P(a, b, c, d, e) = \alpha \sum_{A,B,C} P(a)\, P(b \mid a)\, P(c \mid a)\, P(d \mid b, c)\, P(e \mid c)$
- With simple iteration to compute this expression, there is going to be a lot of repetition (e.g., $P(e \mid c)$ has to be recomputed every time we iterate over C=true)

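The query P(d | e) can be sketched as a brute-force enumeration in Python. This is a minimal illustration, assuming the example network's CPT values and a hypothetical dictionary encoding; `enumerate_query` and the other names are invented for this sketch. It sums the chained joint over the hidden variables A, B, and C for each value of D, then normalizes, which plays the role of $\alpha$.

```python
from itertools import product

# Hypothetical encoding of the example network (illustrative names).
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C"), "E": ("C",)}
cpt = {  # cpt[X][parent_values] = P(X = True | parents)
    "A": {(): 0.001},
    "B": {(True,): 0.3, (False,): 0.001},
    "C": {(True,): 0.2, (False,): 0.005},
    "D": {(True, True): 0.1, (True, False): 0.01,
          (False, True): 0.01, (False, False): 0.00001},
    "E": {(True,): 0.4, (False,): 0.002},
}
order = ["A", "B", "C", "D", "E"]


def joint(assignment):
    """Full joint via chaining: product of P(x_i | parents of x_i)."""
    p = 1.0
    for var in order:
        key = tuple(assignment[q] for q in parents[var])
        p_true = cpt[var][key]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def enumerate_query(query, evidence):
    """Return P(query | evidence) as {True: ..., False: ...}."""
    hidden = [v for v in order if v != query and v not in evidence]
    unnormalized = {}
    for q_value in (True, False):
        total = 0.0
        # Sum the joint over every assignment to the hidden variables.
        for values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[query] = q_value
            assignment.update(zip(hidden, values))
            total += joint(assignment)
        unnormalized[q_value] = total
    alpha = 1.0 / sum(unnormalized.values())  # normalizing constant
    return {value: alpha * p for value, p in unnormalized.items()}


posterior = enumerate_query("D", {"E": True})
print(posterior[True])   # P(d | e)
```

Every call to `joint` here recomputes factors such as P(e | c) from scratch, which is the repetition the slide points out.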
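Exercise: Enumeration
(Figure: network with nodes smart, study, prepared, fair, pass; smart and study are parents of prepared; smart, prepared, and fair are parents of pass.)
p(smart) = .8    p(study) = .6    p(fair) = .9

p(pass | smart, prep, fair):
             smart           ¬smart
           prep  ¬prep     prep  ¬prep
  fair      .9    .7        .7    .2
  ¬fair     .1    .1        .1    .1

p(prep | smart, study):
           smart  ¬smart
  study     .9     .7
  ¬study    .5     .1

Query: What is the probability that a student studied, given that they pass the exam?
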
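One way to check an answer to the exercise is to enumerate the joint directly. The sketch below assumes the tables are read with smart and study as parents of prep, and smart, prep, and fair as parents of pass. All function and variable names are illustrative.

```python
from itertools import product

priors = {"smart": 0.8, "study": 0.6, "fair": 0.9}

# P(prep = True | smart, study), keyed by (smart, study)
p_prep = {(True, True): 0.9, (False, True): 0.7,
          (True, False): 0.5, (False, False): 0.1}

# P(pass = True | smart, prep, fair), keyed by (smart, prep, fair)
p_pass = {(True, True, True): 0.9, (True, False, True): 0.7,
          (False, True, True): 0.7, (False, False, True): 0.2,
          (True, True, False): 0.1, (True, False, False): 0.1,
          (False, True, False): 0.1, (False, False, False): 0.1}


def bern(p_true, value):
    """Probability of a binary outcome given P(outcome = True)."""
    return p_true if value else 1.0 - p_true


def joint(smart, study, fair, prep, passed):
    """Chain the local CPDs of the exercise network."""
    return (bern(priors["smart"], smart)
            * bern(priors["study"], study)
            * bern(priors["fair"], fair)
            * bern(p_prep[(smart, study)], prep)
            * bern(p_pass[(smart, prep, fair)], passed))


def p_study_given_pass():
    """P(study | pass): sum out smart, fair, prep, then normalize."""
    score = {True: 0.0, False: 0.0}
    for smart, study, fair, prep in product((True, False), repeat=4):
        score[study] += joint(smart, study, fair, prep, True)
    return score[True] / (score[True] + score[False])


print(p_study_given_pass())
```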
Summary
- Bayes nets
  - Structure
  - Parameters
  - Conditional independence
  - Chaining
- BN inference
  - Enumeration
  - Variable elimination
  - Sampling methods