Decision Networks and Value of Information (CS 188: Artificial Intelligence, Fall 2011)


CS 188: Artificial Intelligence, Fall 2011
Lecture 17: Decision Diagrams (10/27/2011)
Dan Klein, UC Berkeley

Decision Networks

MEU: choose the action which maximizes the expected utility given the evidence. We can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions, which let us calculate the expected utility for each action.

New node types:
- Chance nodes (just like BNs), e.g. Weather, Forecast
- Action nodes (rectangles, cannot have parents, act as observed evidence), e.g. Umbrella
- Utility node (diamond, depends on action and chance nodes)

Action selection:
- Instantiate all evidence
- Set the action node(s) each possible way
- Calculate the posterior over all parents of the utility node, given the evidence
- Calculate the expected utility for each action
- Choose the maximizing action

Example: Decision Networks

Weather prior and utilities:

  W     P(W)        A      W     U(A,W)
  sun   0.7         leave  sun   100
  rain  0.3         leave  rain    0
                    take   sun    20
                    take   rain   70

EU(Umbrella = leave) = 0.7 * 100 + 0.3 * 0 = 70
EU(Umbrella = take) = 0.7 * 20 + 0.3 * 70 = 35
Optimal decision = leave

Decisions as Outcome Trees

The action is the root, Weather is a chance node, and the leaves are the outcomes (take, sun), (take, rain), (leave, sun), (leave, rain). Almost exactly like expectimax / MDPs. What's changed?

Example with evidence: seeing Forecast = bad gives the posterior P(W | F = bad): sun 0.34, rain 0.66.

EU(leave | F = bad) = 0.34 * 100 + 0.66 * 0 = 34
EU(take | F = bad) = 0.34 * 20 + 0.66 * 70 = 53
Optimal decision = take
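The action-selection procedure above can be sketched in a few lines. This is a minimal illustration using the probabilities and utilities from the slide's tables; the function names are mine, not part of any course-provided code.

```python
# Action selection in the umbrella decision network (numbers from the slide).
P_W = {"sun": 0.7, "rain": 0.3}                    # chance node: Weather
U = {("leave", "sun"): 100, ("leave", "rain"): 0,  # utility node U(A, W)
     ("take", "sun"): 20, ("take", "rain"): 70}

def expected_utility(action, p_w):
    """EU(a) = sum over w of P(w) * U(a, w)."""
    return sum(p * U[(action, w)] for w, p in p_w.items())

def meu(p_w):
    """Maximum expected utility and the maximizing action."""
    return max((expected_utility(a, p_w), a) for a in ("leave", "take"))

print(meu(P_W))                               # ≈ (70.0, 'leave')
P_W_bad = {"sun": 0.34, "rain": 0.66}         # posterior given Forecast = bad
print(meu(P_W_bad))                           # ≈ (53.0, 'take')
```

With no evidence the agent leaves the umbrella; conditioning on a bad forecast flips the decision to take, matching the slide.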

Decisions as Outcome Trees (with evidence)

With evidence {b} (Forecast = bad), the tree is the same, but every chance node is conditioned on the evidence; the leaves are again (take, sun), (take, rain), (leave, sun), (leave, rain).

Value of Information

Idea: compute the value of acquiring evidence. This can be done directly from the decision network.

Example: buying oil drilling rights
- Two blocks A and B, exactly one has oil, worth k
- You can drill in one location (DrillLoc)
- Prior probabilities 0.5 each, mutually exclusive
- Drilling in either A or B has EU = k/2, so MEU = k/2
- Question: what's the value of information of OilLoc, i.e. the value of knowing which of A or B has oil?
- The value is the expected gain in MEU from the new information
- The survey may say "oil in a" or "oil in b", with probability 0.5 each
- If we know OilLoc, MEU is k (either way)
- Gain in MEU from knowing OilLoc: VPI(OilLoc) = k - k/2 = k/2
- Fair price of the information: k/2

  O   P(O)        D   O   U(D,O)
  a   1/2         a   a   k
  b   1/2         a   b   0
                  b   a   0
                  b   b   k

VPI Example: Weather

Using the utility table from before: MEU with no evidence is 70 (leave); MEU if the forecast is bad is 53 (take); if the forecast is good, we act on P(W | F = good).

Forecast distribution:

  F      P(F)
  good   0.59
  bad    0.41

Value of Information (general)

Assume we have evidence E = e. Value if we act now:

  MEU(e) = max_a Σ_s P(s | e) U(s, a)

Assume we then see that E' = e'. Value if we act after seeing it:

  MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)

BUT E' is a random variable whose value is unknown, so we don't know what e' will be. Expected value if E' is revealed and then we act:

  MEU(e, E') = Σ_{e'} P(e' | e) MEU(e, e')

Value of information: how much MEU goes up by revealing E' first and then acting, over acting now:

  VPI(E' | e) = MEU(e, E') - MEU(e)

VPI Properties
- Nonnegative
- Nonadditive (consider, e.g., obtaining E_j twice)
- Order-independent

Quick VPI Questions
- The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
- There are two kinds of plastic forks at a picnic. One kind is slightly sturdier. What's the value of knowing which?
- You're playing the lottery. The prize will be $0 or $100. You can play any number between 1 and 100 (the chance of winning is 1%). What is the value of knowing the winning number?
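The VPI definition above can be checked numerically on the weather example. One caveat: the transcript gives P(W | F = bad) and P(F), but not P(W | F = good), so the sketch below derives it from consistency with the prior P(sun) = 0.7; that derivation is my assumption, not a number from the slide.

```python
# VPI of the Forecast node for the umbrella network (slide's numbers).
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def meu(p_w):
    """MEU(e) = max_a sum_s P(s | e) U(a, s)."""
    return max(sum(p * U[(a, w)] for w, p in p_w.items())
               for a in ("leave", "take"))

P_W = {"sun": 0.7, "rain": 0.3}
P_F = {"good": 0.59, "bad": 0.41}
P_W_bad = {"sun": 0.34, "rain": 0.66}
# Derived (assumption): P(sun) = P(good) P(sun|good) + P(bad) P(sun|bad)
p_sun_good = (P_W["sun"] - P_F["bad"] * P_W_bad["sun"]) / P_F["good"]
P_W_good = {"sun": p_sun_good, "rain": 1 - p_sun_good}

meu_now = meu(P_W)                              # act now: ≈ 70 (leave)
meu_after = (P_F["good"] * meu(P_W_good) +      # MEU(F) = Σ_f P(f) MEU(f)
             P_F["bad"] * meu(P_W_bad))
vpi = meu_after - meu_now                       # VPI(F) = MEU(F) - MEU(∅)
print(round(vpi, 2))
```

The forecast has positive VPI because a bad forecast changes the decision from leave to take; evidence that could never change the decision would have VPI zero.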

POMDPs

MDPs have:
- States S
- Actions A
- Transition function P(s' | s, a) (or T(s, a, s'))
- Rewards R(s, a, s')

POMDPs add:
- Observations O
- Observation function P(o | s) (or O(s, o))

POMDPs are MDPs over belief states b (distributions over S). We'll be able to say more in a few lectures.

Example: Ghostbusters

In (static) Ghostbusters:
- The belief state is determined by the evidence to date
- The tree is really over evidence sets
- Probabilistic reasoning is needed to predict new evidence given past evidence

Solving POMDPs

One way: use truncated expectimax to compute approximate values of actions. What if you only considered busting, or one sense followed by a bust? You get a VPI-based agent!

More Generally

- General solutions map belief functions to actions
- Can divide regions of belief space (the set of belief functions) into policy regions (gets complex quickly)
- Can build approximate policies using discretization methods
- Can factor belief functions in various ways
- Overall, POMDPs are very hard (actually PSPACE-hard)
- Most real problems are POMDPs, but we can rarely solve them in general!

Reasoning over Time

Often, we want to reason about a sequence of observations:
- Speech recognition
- Robot localization
- User attention
- Medical monitoring

We need to introduce time into our models. The basic approach: hidden Markov models (HMMs). More general: dynamic Bayes nets.

Markov Models

A Markov model is a chain-structured BN: X1 → X2 → X3 → X4. Each node is identically distributed (stationarity). The value of X at a given time is called the state. The parameters, called transition probabilities or dynamics, specify how the state evolves over time (there are also initial probabilities).
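The POMDP slides above define the belief state b as a distribution over S that is updated as evidence arrives. The slides do not spell out the update rule, so the sketch below uses the standard one, b'(s') ∝ P(o | s') Σ_s P(s' | s, a) b(s); the two-state example and its numbers are made up purely for illustration.

```python
# Standard POMDP belief update (rule and example are illustrative, not from
# the slides): predict through the transition model, weight by the
# observation likelihood, then renormalize.

def belief_update(b, a, o, T, O):
    """b: {s: prob}; T[(s, a)]: {s': prob}; O[s]: {o: prob}."""
    states = {sp for dist in T.values() for sp in dist}
    b_new = {}
    for s_next in states:
        pred = sum(b[s] * T[(s, a)].get(s_next, 0.0) for s in b)  # predict
        b_new[s_next] = O[s_next].get(o, 0.0) * pred              # observe
    z = sum(b_new.values())                                       # normalize
    return {s: p / z for s, p in b_new.items()}

# Made-up two-state example: a ghost on the Left or Right square.
T = {("L", "stay"): {"L": 0.9, "R": 0.1},
     ("R", "stay"): {"L": 0.1, "R": 0.9}}
O = {"L": {"ping": 0.8, "quiet": 0.2},
     "R": {"ping": 0.3, "quiet": 0.7}}
b = belief_update({"L": 0.5, "R": 0.5}, "stay", "ping", T, O)
print(b)   # belief shifts toward L after hearing "ping"
```

This is exactly the "probabilistic reasoning needed to predict new evidence given past evidence" from the Ghostbusters slide: each observation reshapes the distribution over states.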

Conditional Independence

X1 → X2 → X3 → X4

Basic conditional independence: past and future are independent given the present; each time step depends only on the previous one. This is called the (first-order) Markov property. Note that the chain is just a (growing) BN; we can always use generic BN reasoning on it if we truncate the chain at a fixed length.

Example: Markov Chain

Weather:
- States: X = {rain, sun}
- Transitions: P(sun' | sun) = 0.9, P(rain' | sun) = 0.1, and similarly from rain (this is a CPT, not a BN!)
- Initial distribution: 1.0 sun

What's the probability distribution after one step? P(sun) = 0.9, P(rain) = 0.1.

Mini-Forward Algorithm

Question: what is the probability of being in state x at time t?

Slow answer: enumerate all sequences of length t which end in x, and add up their probabilities.

Better way: cached, incremental belief updates (an instance of variable elimination!):

  P(x_t) = Σ_{x_{t-1}} P(x_{t-1}) P(x_t | x_{t-1})

This is forward simulation.

Example

From an initial observation of sun: P(X1), P(X2), P(X3), ..., P(X∞)
From an initial observation of rain: P(X1), P(X2), P(X3), ..., P(X∞)

Stationary Distributions

If we simulate the chain long enough, what happens? Uncertainty accumulates, and eventually we have no idea what the state is! For most chains, the distribution we end up in is independent of the initial distribution; it is called the stationary distribution of the chain. Usually, we can only predict a short time out.
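The mini-forward update above is easy to run directly. The sun row of the CPT (0.9 / 0.1) is visible in the transcript; the rain row below is an assumption chosen for illustration, so the stationary distribution in the comment holds only for these assumed numbers.

```python
# Mini-forward algorithm for the weather chain. The sun->* row matches the
# slide; the rain->* row (0.3 / 0.7) is assumed for illustration.
T = {"sun": {"sun": 0.9, "rain": 0.1},
     "rain": {"sun": 0.3, "rain": 0.7}}

def forward_step(b):
    """P(x_t) = sum over x_{t-1} of P(x_{t-1}) * P(x_t | x_{t-1})."""
    return {x2: sum(p * T[x1][x2] for x1, p in b.items()) for x2 in T}

b = {"sun": 1.0, "rain": 0.0}      # initial observation: sun
print(forward_step(b))             # after one step: {'sun': 0.9, 'rain': 0.1}

for _ in range(100):               # simulate long enough to converge
    b = forward_step(b)
print(b)                           # ≈ {'sun': 0.75, 'rain': 0.25} for these numbers
```

Starting from rain instead of sun converges to the same distribution, which is the point of the stationary-distribution slide: the limit forgets the initial observation.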

Web Link Analysis

PageRank over a web graph: each web page is a state.
- Initial distribution: uniform over pages
- Transitions: with probability c, uniform jump to a random page (dotted lines); with probability 1 - c, follow a random outlink (solid lines)

Stationary distribution:
- Will spend more time on highly reachable pages (e.g., there are many ways to get to the Acrobat Reader download page)
- Somewhat robust to link spam
- Google 1.0 returned the set of pages containing all your keywords, in decreasing rank; now all search engines use link analysis along with many other factors
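The random-surfer chain above can be iterated to its stationary distribution just like the weather chain. This is a minimal sketch; the tiny example graph and the choice c = 0.15 are made up for illustration.

```python
# PageRank as the stationary distribution of the slide's random-surfer
# chain: with prob. c teleport to a uniform random page, with prob. 1 - c
# follow a random outlink. Graph and c value are illustrative assumptions.

def pagerank(links, c=0.15, iters=100):
    pages = list(links)
    n = len(pages)
    r = {p: 1.0 / n for p in pages}          # initial distribution: uniform
    for _ in range(iters):
        new = {p: c / n for p in pages}      # teleport mass
        for p in pages:
            out = links[p] or pages          # dangling page: jump anywhere
            share = (1 - c) * r[p] / len(out)
            for q in out:                    # follow a random outlink
                new[q] += share
        r = new
    return r

# Tiny example graph: every page links to "hub", so it collects the most mass.
links = {"a": ["hub"], "b": ["hub"], "hub": ["a"]}
r = pagerank(links)
print(max(r, key=r.get))   # 'hub'
```

As the slide says, the chain spends more time on highly reachable pages: "hub" ranks first because every other page links to it.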