CS 188: Artificial Intelligence Fall 2010


CS 188: Artificial Intelligence, Fall 2010
Lecture 18: Decision Diagrams
10/28/2010
Dan Klein, UC Berkeley

Value of Information

Decision Networks

MEU: choose the action which maximizes the expected utility given the evidence. We can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions. They let us calculate the expected utility for each action.

New node types:
Chance nodes (just like BNs)
Actions (rectangles, cannot have parents, act as observed evidence)
Utility node (diamond, depends on action and chance nodes)

[DEMO: Ghostbusters]

Decision Networks

Action selection (see the sketch below):
Instantiate all evidence
Set action node(s) each possible way
Calculate the posterior for all parents of the utility node, given the evidence
Calculate the expected utility for each action
Choose the maximizing action
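As a concrete reading of the action-selection procedure, here is a minimal Python sketch. The function names (expected_utility, best_action) and the dictionary representations of the posterior and the utility table are illustrative assumptions, not from the lecture.

```python
def expected_utility(action, posterior, utility):
    """EU(a) = sum_w P(w | evidence) * U(a, w).

    posterior: dict mapping chance-node value w -> P(w | evidence)
    utility:   dict mapping (action, w) -> U(action, w)
    """
    return sum(p * utility[(action, w)] for w, p in posterior.items())

def best_action(actions, posterior, utility):
    """MEU principle: return the action maximizing expected utility."""
    return max(actions, key=lambda a: expected_utility(a, posterior, utility))
```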

Example: Decision Networks

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(leave) = 0.7 * 100 + 0.3 * 0 = 70
EU(take) = 0.7 * 20 + 0.3 * 70 = 35
Optimal decision = leave

Decisions as Outcome Trees

Starting from the empty evidence set {}, the action choice branches on the weather, with leaves U(t,s), U(t,r), U(l,s), U(l,r). Almost exactly like expectimax / MDPs. What's changed?
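Plugging the tables above into the sketch from the previous slide (reusing those hypothetical helper definitions) reproduces the slide's conclusion:

```python
posterior = {"sun": 0.7, "rain": 0.3}           # P(W), no evidence yet
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20,  ("take", "rain"): 70}

print(expected_utility("leave", posterior, utility))       # 70.0
print(expected_utility("take",  posterior, utility))       # 35.0
print(best_action(["leave", "take"], posterior, utility))  # leave
```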

Evidence in Decision Networks

Find P(W | F=bad).

W     P(W)
sun   0.7
rain  0.3

F      P(F | W=sun)
good   0.8
bad    0.2

F      P(F | W=rain)
good   0.1
bad    0.9

Select for the evidence:

W     P(F=bad | W)
sun   0.2
rain  0.9

First we join P(W) and P(F=bad | W):

W     P(W, F=bad)
sun   0.14
rain  0.27

Then we normalize:

W     P(W | F=bad)
sun   0.34
rain  0.66

Example: Decision Networks

W     P(W | F=bad)
sun   0.34
rain  0.66

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(leave | F=bad) = 0.34 * 100 + 0.66 * 0 = 34
EU(take | F=bad) = 0.34 * 20 + 0.66 * 70 = 53
Optimal decision = take
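The join-then-normalize step above is small enough to check directly. A sketch under the same assumed dictionary representation (it reuses the utility table and best_action from the earlier sketches):

```python
prior = {"sun": 0.7, "rain": 0.3}            # P(W)
likelihood_bad = {"sun": 0.2, "rain": 0.9}   # P(F=bad | W)

joint = {w: prior[w] * likelihood_bad[w] for w in prior}  # join: P(W, F=bad)
z = sum(joint.values())                                   # P(F=bad) = 0.41
posterior_bad = {w: p / z for w, p in joint.items()}      # normalize: P(W | F=bad)

print(posterior_bad)  # {'sun': 0.341..., 'rain': 0.658...} -> ~0.34 / ~0.66
print(best_action(["leave", "take"], posterior_bad, utility))  # take
```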

Decisions as Outcome Trees

With evidence {b} (a bad forecast), the tree has the same shape: the root {b} branches on the action and then on W, with leaves U(t,s), U(t,r), U(l,s), U(l,r).

Value of Information

Idea: compute the value of acquiring evidence. This can be done directly from the decision network.

Example: buying oil drilling rights. Two blocks A and B, exactly one has oil, worth k. You can drill in one location. Prior probabilities 0.5 each, mutually exclusive. Drilling in either A or B has EU = k/2, so MEU = k/2.

O   P(OilLoc)
a   1/2
b   1/2

D   O   U(D,O)
a   a   k
a   b   0
b   a   0
b   b   k

(Network: DrillLoc is the action node, OilLoc the chance node, U the utility node.)

Question: what's the value of information of OilLoc? That is, the value of knowing which of A or B has oil. The value is the expected gain in MEU from the new info. The survey may say "oil in a" or "oil in b", prob 0.5 each. If we know OilLoc, MEU is k (either way). Gain in MEU from knowing OilLoc? VPI(OilLoc) = k/2. Fair price of the information: k/2.
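The oil-rights VPI can also be checked numerically. A sketch with k set to 1.0 (variable names are illustrative):

```python
k = 1.0
prior = {"a": 0.5, "b": 0.5}                    # P(OilLoc)
utility = {("a", "a"): k, ("a", "b"): 0.0,      # U(DrillLoc, OilLoc)
           ("b", "a"): 0.0, ("b", "b"): k}

# MEU acting now: each drill site has EU = k/2.
meu_now = max(sum(prior[o] * utility[(d, o)] for o in prior)
              for d in ("a", "b"))

# If OilLoc were revealed first, we would drill the right block and get k either way.
meu_informed = sum(prior[o] * max(utility[(d, o)] for d in ("a", "b"))
                   for o in prior)

print(meu_informed - meu_now)  # VPI(OilLoc) = k - k/2 = k/2 = 0.5
```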

Value of Information

Assume we have evidence E = e. Value if we act now:

MEU(e) = max_a Σ_s P(s | e) U(s, a)

Assume we then see that E' = e'. Value if we act after that:

MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)

BUT E' is a random variable whose value is unknown, so we don't know what e' will be. Expected value if E' is revealed and then we act:

MEU(e, E') = Σ_{e'} P(e' | e) MEU(e, e')

Value of information: how much MEU goes up by revealing E' first and then acting, over acting now:

VPI(E' | e) = MEU(e, E') - MEU(e)

VPI Example: Weather

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

F      P(F)
good   0.59
bad    0.41

MEU with no evidence: MEU() = 70 (leave)
MEU if the forecast is bad: MEU(F=bad) = 53 (take)
MEU if the forecast is good: MEU(F=good) ≈ 95 (leave)
VPI(F) = 0.59 * 95 + 0.41 * 53 - 70 ≈ 7.8
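An end-to-end check of VPI(F) for the weather example, using the exact tables rather than the rounded posteriors (exact arithmetic gives ~7.7; the slide's rounded 0.34/0.66 values give ~7.8). All names are illustrative:

```python
prior = {"sun": 0.7, "rain": 0.3}                           # P(W)
p_f_given_w = {("good", "sun"): 0.8, ("bad", "sun"): 0.2,   # P(F | W)
               ("good", "rain"): 0.1, ("bad", "rain"): 0.9}
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20, ("take", "rain"): 70}
actions = ("leave", "take")

def meu(posterior):
    """MEU = max over actions of sum_w P(w) * U(a, w)."""
    return max(sum(p * utility[(a, w)] for w, p in posterior.items())
               for a in actions)

vpi = -meu(prior)                      # subtract MEU(acting now) = 70
for f in ("good", "bad"):
    p_f = sum(prior[w] * p_f_given_w[(f, w)] for w in prior)          # P(F=f)
    posterior = {w: prior[w] * p_f_given_w[(f, w)] / p_f for w in prior}
    vpi += p_f * meu(posterior)        # add sum_f P(f) * MEU(f)
print(round(vpi, 2))                   # 7.7
```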

VPI Properties

Nonnegative
Nonadditive: consider, e.g., obtaining E_j twice
Order-independent
(These are written out formally after the questions below.)

Quick VPI Questions

The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?

There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?

You're playing the lottery. The prize will be $0 or $100. You can play any number between 1 and 100 (the chance of winning is 1%). What is the value of knowing the winning number?
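For reference, the three properties written out formally; these are standard statements consistent with the VPI definition above, not verbatim from the slide:

```latex
\begin{align*}
\text{Nonnegative: } & \forall E', e:\quad \mathrm{VPI}(E' \mid e) \ge 0\\
\text{Nonadditive: } & \mathrm{VPI}(E_j, E_k \mid e) \ne \mathrm{VPI}(E_j \mid e) + \mathrm{VPI}(E_k \mid e) \quad \text{in general}\\
\text{Order-independent: } & \mathrm{VPI}(E_j, E_k \mid e) = \mathrm{VPI}(E_j \mid e) + \mathrm{VPI}(E_k \mid e, E_j)\\
& \phantom{\mathrm{VPI}(E_j, E_k \mid e)} = \mathrm{VPI}(E_k \mid e) + \mathrm{VPI}(E_j \mid e, E_k)
\end{align*}
```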

POMDPs

MDPs have:
States S
Actions A
Transition function P(s' | s, a) (or T(s, a, s'))
Rewards R(s, a, s')

POMDPs add:
Observations O
Observation function P(o | s) (or O(s, o))

POMDPs are MDPs over belief states b (distributions over S); see the belief-update sketch below. (Diagram: the MDP tree over states s and q-states (s, a) becomes a tree over beliefs b, actions, and observations o.) We'll be able to say more in a few lectures.

Example: Ghostbusters

In (static) Ghostbusters:
The belief state is determined by the evidence to date {e}
The tree is really over evidence sets
Probabilistic reasoning is needed to predict new evidence given past evidence

Solving POMDPs

One way: use truncated expectimax to compute approximate values of actions. What if you only considered busting, or one sense followed by a bust? You get a VPI-based agent! (Tree: from {e}, either bust now, with value U(bust, {e}), or sense, observe e', and then bust, with value U(bust, {e, e'}).)
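A minimal belief-update sketch for a POMDP, assuming T and O are given as dictionaries keyed by (s, a, s') and (s', o); the names are illustrative:

```python
def update_belief(b, action, obs, T, O, states):
    """b'(s') is proportional to O(s', o) * sum_s T(s, a, s') * b(s)."""
    new_b = {s2: O[(s2, obs)] * sum(T[(s, action, s2)] * b[s] for s in states)
             for s2 in states}
    z = sum(new_b.values())          # normalizer: P(o | b, a)
    return {s2: p / z for s2, p in new_b.items()}
```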

More Generally

General solutions map belief functions to actions:
Can divide belief space (the set of belief functions) into policy regions (gets complex quickly)
Can build approximate policies using discretization methods
Can factor belief functions in various ways

Overall, POMDPs are very hard (actually PSPACE-hard). Most real problems are POMDPs, but we can rarely solve them in general!