CS 188: Artificial Intelligence, Fall 2011
Lecture 17: Decision Diagrams
10/27/2011
Dan Klein, UC Berkeley

Decision Networks

MEU: choose the action which maximizes the expected utility given the evidence.
Can directly operationalize this with decision networks:
  Bayes nets with nodes for utility and actions
  Lets us calculate the expected utility for each action
New node types:
  Chance nodes (just like BNs)
  Actions (rectangles, cannot have parents, act as observed evidence)
  Utility node (diamond, depends on action and chance nodes)
Network for the running example: Umbrella (action) and Weather (chance) feed the utility node; Forecast is a child of Weather.

Decision Networks: Action Selection

  Instantiate all evidence
  Set action node(s) each possible way
  Calculate the posterior for all parents of the utility node, given the evidence
  Calculate the expected utility for each action
  Choose the maximizing action

Example: Decision Networks

  W     P(W)          A      W     U(A,W)
  sun   0.7           leave  sun   100
  rain  0.3           leave  rain  0
                      take   sun   20
                      take   rain  70

Umbrella = leave:  EU(leave) = 0.7 * 100 + 0.3 * 0 = 70
Umbrella = take:   EU(take) = 0.7 * 20 + 0.3 * 70 = 35
Optimal decision = leave the umbrella

Example: Decision Networks (evidence: Forecast = bad)

  W     P(W | F=bad)
  sun   0.34
  rain  0.66

Umbrella = leave:  EU(leave | bad) = 0.34 * 100 + 0.66 * 0 = 34
Umbrella = take:   EU(take | bad) = 0.34 * 20 + 0.66 * 70 = 53
Optimal decision = take the umbrella

Decisions as Outcome Trees

From the empty evidence set {}, branch on the action (take / leave), then on Weather, giving leaves U(t,s), U(t,r), U(l,s), U(l,r).
Almost exactly like expectimax / MDPs. What's changed?
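The action-selection recipe above can be sketched in a few lines of Python. This is a minimal illustration, not a standard API: the dictionaries and function names are invented here, while the probabilities and utilities come from the umbrella tables on the slides.

```python
# Decision-network action selection for the umbrella example (a sketch).

# Posterior over Weather with no evidence, and the utility table U(A, W)
P_W = {"sun": 0.7, "rain": 0.3}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}
ACTIONS = ("leave", "take")

def expected_utility(action, posterior):
    """EU(a) = sum_w P(w | evidence) * U(a, w)."""
    return sum(p * U[(action, w)] for w, p in posterior.items())

def best_action(posterior):
    """Set the action node each possible way and choose the maximizer."""
    return max(ACTIONS, key=lambda a: expected_utility(a, posterior))

# No evidence: EU(leave) = 70, EU(take) = 35, so leave the umbrella.
choice = best_action(P_W)

# Evidence Forecast = bad changes the posterior and flips the decision.
P_W_bad = {"sun": 0.34, "rain": 0.66}
choice_bad = best_action(P_W_bad)
```

Note that evidence enters only through the posterior over the utility node's chance parents, which is exactly the "calculate posterior, then expected utility" recipe from the slide.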
Decisions as Outcome Trees (with evidence)

With evidence {b} (a bad forecast), branch on the action, then on W given {b}, giving leaves U(t,s), U(t,r), U(l,s), U(l,r).

Value of Information

Idea: compute the value of acquiring evidence. Can be done directly from the decision network.
Example: buying oil drilling rights
  Two blocks A and B, exactly one has oil, worth k
  You can drill in one location
  Prior probabilities 0.5 each, and mutually exclusive
  Drilling in either A or B has EU = k/2, MEU = k/2
Question: what's the value of information of OilLoc?
  Value of knowing which of A or B has oil
  Value is the expected gain in MEU from the new info
  Survey may say "oil in a" or "oil in b", prob 0.5 each
  If we know OilLoc, MEU is k (either way)
  Gain in MEU from knowing OilLoc: VPI(OilLoc) = k - k/2 = k/2
  Fair price of the information: k/2
Network: DrillLoc (action) and OilLoc (chance) feed the utility node U.

  O   P(O)        D  O  U(D,O)
  a   1/2         a  a  k
  b   1/2         a  b  0
                  b  a  0
                  b  b  k

VPI Example: Weather

MEU with no evidence; MEU if the forecast is bad; MEU if the forecast is good.
Forecast distribution:
  F      P(F)          A      W     U(A,W)
  good   0.59          leave  sun   100
  bad    0.41          leave  rain  0
                       take   sun   20
                       take   rain  70

Value of Information

Assume we have evidence E = e. Value if we act now:
  MEU(e) = max_a Σ_s P(s | e) U(s, a)
Assume we see that E' = e'. Value if we act then:
  MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)
BUT E' is a random variable whose value is unknown, so we don't know what e' will be.
Expected value if E' is revealed and then we act:
  MEU(e, E') = Σ_{e'} P(e' | e) MEU(e, e')
Value of information: how much MEU goes up by revealing E' first and then acting, over acting now:
  VPI(E' | e) = MEU(e, E') - MEU(e)

VPI Properties

  Nonnegative
  Nonadditive --- consider, e.g., obtaining E_j twice
  Order-independent

Quick VPI Questions

  The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
  There are two kinds of plastic forks at a picnic. One kind is slightly sturdier. What's the value of knowing which?
  You're playing the lottery. The prize will be $0 or $100. You can play any number between 1 and 100 (chance of winning is 1%). What is the value of knowing the winning number?
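The VPI definition above can be checked numerically on the weather example. One caveat: the slides do not give P(W | F=good), so the sketch below recovers it from consistency with P(W) and P(F) (0.7 = 0.59 * x + 0.41 * 0.34 gives x ≈ 0.95); treat that row as an inferred value, and all names as illustrative.

```python
# Computing VPI(Forecast) for the weather example (a sketch; the
# P(W | F=good) row is inferred, not printed on the slides).

U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}
ACTIONS = ("leave", "take")

def meu(posterior):
    """MEU(e) = max_a sum_s P(s | e) * U(s, a)."""
    return max(sum(p * U[(a, w)] for w, p in posterior.items())
               for a in ACTIONS)

P_F = {"good": 0.59, "bad": 0.41}
P_W_given_F = {
    "bad": {"sun": 0.34, "rain": 0.66},
    "good": {"sun": 0.95, "rain": 0.05},  # inferred from P(sun) = 0.7
}

meu_now = meu({"sun": 0.7, "rain": 0.3})  # act now: MEU = 70 (leave)
# MEU(e, E') = sum_{e'} P(e' | e) MEU(e, e'): reveal the forecast, then act
meu_after = sum(P_F[f] * meu(P_W_given_F[f]) for f in P_F)
vpi = meu_after - meu_now
```

With these numbers MEU(good) = 95 and MEU(bad) = 53, so revealing the forecast is worth about 7.8 utility units, and VPI comes out nonnegative as the properties list claims.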
POMDPs

MDPs have:
  States S
  Actions A
  Transition function P(s' | s, a) (or T(s, a, s'))
  Rewards R(s, a, s')
POMDPs add:
  Observations O
  Observation function P(o | s) (or O(s, o))
POMDPs are MDPs over belief states b (distributions over S).
We'll be able to say more in a few lectures.

Example: Ghostbusters

In (static) Ghostbusters:
  Belief state determined by the evidence to date
  Tree really over evidence sets
  Probabilistic reasoning needed to predict new evidence given past evidence

Solving POMDPs

One way: use truncated expectimax to compute approximate values of actions.
What if you only considered busting, or one sense followed by a bust? You get a VPI-based agent!

More Generally

General solutions map belief functions to actions:
  Can divide regions of belief space (the set of belief functions) into policy regions (gets complex quickly)
  Can build approximate policies using discretization methods
  Can factor belief functions in various ways
Overall, POMDPs are very (actually PSPACE-) hard.
Most real problems are POMDPs, but we can rarely solve them in general!

Reasoning over Time

Often, we want to reason about a sequence of observations:
  Speech recognition
  Robot localization
  User attention
  Medical monitoring
Need to introduce time into our models.
Basic approach: hidden Markov models (HMMs). More general: dynamic Bayes nets.

Markov Models

A Markov model is a chain-structured BN:
  Each node is identically distributed (stationarity)
  The value of X at a given time is called the state
As a BN:  X1 -> X2 -> X3 -> X4
Parameters: called transition probabilities or dynamics; they specify how the state evolves over time (also, initial probs).
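To connect the POMDP definition above to the belief states it mentions, here is a minimal sketch of the belief update: after taking action a and observing o, the new belief is b'(s') ∝ P(o | s') Σ_s P(s' | s, a) b(s). The two-state ghost near/far model, its probabilities, and all names are invented for illustration.

```python
# Belief update for a tiny, invented POMDP (ghost "near" or "far").

T = {  # T[(s, a)] = distribution over next states; "wait" keeps the state
    ("near", "wait"): {"near": 1.0, "far": 0.0},
    ("far", "wait"): {"near": 0.0, "far": 1.0},
}
O = {  # observation model P(o | s): a "beep" is likelier near the ghost
    "near": {"beep": 0.8, "quiet": 0.2},
    "far": {"beep": 0.2, "quiet": 0.8},
}

def belief_update(b, a, o):
    """b'(s') ∝ P(o | s') * sum_s P(s' | s, a) * b(s)."""
    new_b = {}
    for s2 in b:
        predicted = sum(T[(s, a)][s2] * b[s] for s in b)  # prediction step
        new_b[s2] = O[s2][o] * predicted                  # weigh by observation
    z = sum(new_b.values())
    return {s: p / z for s, p in new_b.items()}           # normalize

b0 = {"near": 0.5, "far": 0.5}
b1 = belief_update(b0, "wait", "beep")  # a beep shifts belief toward "near"
```

This is exactly the b, a, o -> b' transition in the belief-state diagram: truncated expectimax over these updated beliefs gives the approximate POMDP agent described above.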
Conditional Independence

  X1 -> X2 -> X3 -> X4
Basic conditional independence:
  Past and future independent given the present
  Each time step only depends on the previous
  This is called the (first order) Markov property
Note that the chain is just a (growing) BN:
  We can always use generic BN reasoning on it if we truncate the chain at a fixed length.

Example: Markov Chain

Weather:
  States: X = {rain, sun}
  Transitions: stay in the same state with prob. 0.9, switch with prob. 0.1 (this is a CPT, not a BN!)
  Initial distribution: 1.0 sun
What's the probability distribution after one step?

Mini-Forward Algorithm

Question: what's the probability of being in state x at time t?
Slow answer:
  Enumerate all sequences of length t which end in x
  Add up their probabilities
Better way: cached incremental belief updates, an instance of variable elimination!
  P(x_t) = Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1})
This is forward simulation.

Example

From an initial observation of sun:   P(X1), P(X2), P(X3), ..., P(X∞)
From an initial observation of rain:  P(X1), P(X2), P(X3), ..., P(X∞)

Stationary Distributions

If we simulate the chain long enough:
  What happens? Uncertainty accumulates, and eventually we have no idea what the state is!
Stationary distributions:
  For most chains, the distribution we end up in is independent of the initial distribution
  It is called the stationary distribution of the chain
  Usually, we can only predict a short time out
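The mini-forward update and the stationary-distribution claim above can be checked directly on the weather chain. This sketch assumes the symmetric transitions suggested by the slide's numbers (stay with prob. 0.9, switch with 0.1); the function names are illustrative.

```python
# Mini-forward algorithm on the two-state weather chain (a sketch).

T = {"sun": {"sun": 0.9, "rain": 0.1},
     "rain": {"sun": 0.1, "rain": 0.9}}

def forward_step(b):
    """P(x_t) = sum_{x_{t-1}} P(x_t | x_{t-1}) * P(x_{t-1})."""
    return {x2: sum(T[x1][x2] * b[x1] for x1 in b) for x2 in b}

b0 = {"sun": 1.0, "rain": 0.0}  # initial observation: sun
b1 = forward_step(b0)           # after one step: P(sun) = 0.9

b = b1
for _ in range(200):            # simulate long enough...
    b = forward_step(b)
# ...and the chain forgets where it started: b approaches the
# stationary distribution, {sun: 0.5, rain: 0.5} for this chain.
```

Starting from rain instead gives the same limit, which is the "independent of the initial distribution" point from the slide.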
Web Link Analysis

PageRank over a web graph:
  Each web page is a state
  Initial distribution: uniform over pages
  Transitions:
    With prob. c, uniform jump to a random page (dotted lines)
    With prob. 1-c, follow a random outlink (solid lines)
Stationary distribution:
  Will spend more time on highly reachable pages
  E.g. many ways to get to the Acrobat Reader download page!
  Somewhat robust to link spam
  Google 1.0 returned the set of pages containing all your keywords in decreasing rank; now all search engines use link analysis along with many other factors.
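The random-surfer chain above can be simulated forward just like the weather chain, and its stationary distribution is the PageRank. Below is a sketch on a tiny invented three-page graph; the pages, their links, the value c = 0.15, and the function name are all illustrative (real implementations must also handle pages with no outlinks).

```python
# PageRank as the stationary distribution of the random-surfer chain
# (a sketch on an invented 3-page graph; every page here has outlinks).

links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}  # hypothetical web graph
c = 0.15                                           # random-jump probability

def pagerank(links, c, iters=100):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}    # initial: uniform
    for _ in range(iters):
        new = {p: c / len(pages) for p in pages}   # random-jump mass
        for p, out in links.items():
            for q in out:                          # follow a random outlink
                new[q] += (1 - c) * rank[p] / len(out)
        rank = new
    return rank

r = pagerank(links, c)
```

Page "a" ends up ranked highest here because both other pages link to it, illustrating the "more time on highly reachable pages" point.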