CS 188: Artificial Intelligence Fall 2010


CS 188: Artificial Intelligence, Fall 2010
Lecture 18: Decision Diagrams
10/28/2010
Dan Klein, UC Berkeley

Value of Information

Decision Networks

MEU: choose the action which maximizes the expected utility given the evidence. We can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions. They let us calculate the expected utility for each action.

New node types:
Chance nodes (just like BNs)
Actions (rectangles, cannot have parents, act as observed evidence)
Utility node (diamond, depends on action and chance nodes)

[DEMO: Ghostbusters]

Decision Networks

Action selection (see the sketch below):
Instantiate all evidence
Set action node(s) each possible way
Calculate the posterior for all parents of the utility node, given the evidence
Calculate the expected utility for each action
Choose the maximizing action
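As a concrete reading of the action-selection procedure, here is a minimal Python sketch. The function names (expected_utility, best_action) and the dictionary representations of the posterior and the utility table are illustrative assumptions, not from the lecture.

```python
def expected_utility(action, posterior, utility):
    """EU(a) = sum_w P(w | evidence) * U(a, w).

    posterior: dict mapping chance-node value w -> P(w | evidence)
    utility:   dict mapping (action, w) -> U(action, w)
    """
    return sum(p * utility[(action, w)] for w, p in posterior.items())

def best_action(actions, posterior, utility):
    """MEU principle: return the action maximizing expected utility."""
    return max(actions, key=lambda a: expected_utility(a, posterior, utility))
```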

Example: Decision Networks

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(leave) = 0.7 * 100 + 0.3 * 0 = 70
EU(take) = 0.7 * 20 + 0.3 * 70 = 35
Optimal decision = leave

Decisions as Outcome Trees

Starting from the empty evidence set {}, the action choice branches on the weather, with leaves U(t,s), U(t,r), U(l,s), U(l,r). Almost exactly like expectimax / MDPs. What's changed?
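Plugging the tables above into the sketch from the previous slide (reusing those hypothetical helper definitions) reproduces the slide's conclusion:

```python
posterior = {"sun": 0.7, "rain": 0.3}           # P(W), no evidence yet
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20,  ("take", "rain"): 70}

print(expected_utility("leave", posterior, utility))       # 70.0
print(expected_utility("take",  posterior, utility))       # 35.0
print(best_action(["leave", "take"], posterior, utility))  # leave
```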

Evidence in Decision Networks

Find P(W | F=bad).

W     P(W)
sun   0.7
rain  0.3

F      P(F | W=sun)
good   0.8
bad    0.2

F      P(F | W=rain)
good   0.1
bad    0.9

Select for the evidence:

W     P(F=bad | W)
sun   0.2
rain  0.9

First we join P(W) and P(F=bad | W):

W     P(W, F=bad)
sun   0.14
rain  0.27

Then we normalize:

W     P(W | F=bad)
sun   0.34
rain  0.66

Example: Decision Networks

W     P(W | F=bad)
sun   0.34
rain  0.66

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(leave | F=bad) = 0.34 * 100 + 0.66 * 0 = 34
EU(take | F=bad) = 0.34 * 20 + 0.66 * 70 = 53
Optimal decision = take
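The join-then-normalize step above is small enough to check directly. A sketch under the same assumed dictionary representation (it reuses the utility table and best_action from the earlier sketches):

```python
prior = {"sun": 0.7, "rain": 0.3}            # P(W)
likelihood_bad = {"sun": 0.2, "rain": 0.9}   # P(F=bad | W)

joint = {w: prior[w] * likelihood_bad[w] for w in prior}  # join: P(W, F=bad)
z = sum(joint.values())                                   # P(F=bad) = 0.41
posterior_bad = {w: p / z for w, p in joint.items()}      # normalize: P(W | F=bad)

print(posterior_bad)  # {'sun': 0.341..., 'rain': 0.658...} -> ~0.34 / ~0.66
print(best_action(["leave", "take"], posterior_bad, utility))  # take
```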

Decisions as Outcome Trees

With evidence {b} (a bad forecast), the tree has the same shape: the root {b} branches on the action and then on W, with leaves U(t,s), U(t,r), U(l,s), U(l,r).

Value of Information

Idea: compute the value of acquiring evidence. This can be done directly from the decision network.

Example: buying oil drilling rights. Two blocks A and B, exactly one has oil, worth k. You can drill in one location. Prior probabilities 0.5 each, mutually exclusive. Drilling in either A or B has EU = k/2, so MEU = k/2.

O   P(OilLoc)
a   1/2
b   1/2

D   O   U(D,O)
a   a   k
a   b   0
b   a   0
b   b   k

(Network: DrillLoc is the action node, OilLoc the chance node, U the utility node.)

Question: what's the value of information of OilLoc? That is, the value of knowing which of A or B has oil. The value is the expected gain in MEU from the new info. The survey may say "oil in a" or "oil in b", prob 0.5 each. If we know OilLoc, MEU is k (either way). Gain in MEU from knowing OilLoc? VPI(OilLoc) = k/2. Fair price of the information: k/2.
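The oil-rights VPI can also be checked numerically. A sketch with k set to 1.0 (variable names are illustrative):

```python
k = 1.0
prior = {"a": 0.5, "b": 0.5}                    # P(OilLoc)
utility = {("a", "a"): k, ("a", "b"): 0.0,      # U(DrillLoc, OilLoc)
           ("b", "a"): 0.0, ("b", "b"): k}

# MEU acting now: each drill site has EU = k/2.
meu_now = max(sum(prior[o] * utility[(d, o)] for o in prior)
              for d in ("a", "b"))

# If OilLoc were revealed first, we would drill the right block and get k either way.
meu_informed = sum(prior[o] * max(utility[(d, o)] for d in ("a", "b"))
                   for o in prior)

print(meu_informed - meu_now)  # VPI(OilLoc) = k - k/2 = k/2 = 0.5
```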

Value of Information

Assume we have evidence E = e. Value if we act now:

MEU(e) = max_a Σ_s P(s | e) U(s, a)

Assume we then see that E' = e'. Value if we act after that:

MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)

BUT E' is a random variable whose value is unknown, so we don't know what e' will be. Expected value if E' is revealed and then we act:

MEU(e, E') = Σ_{e'} P(e' | e) MEU(e, e')

Value of information: how much MEU goes up by revealing E' first and then acting, over acting now:

VPI(E' | e) = MEU(e, E') - MEU(e)

VPI Example: Weather

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

F      P(F)
good   0.59
bad    0.41

MEU with no evidence: MEU() = 70 (leave)
MEU if the forecast is bad: MEU(F=bad) = 53 (take)
MEU if the forecast is good: MEU(F=good) ≈ 95 (leave)
VPI(F) = 0.59 * 95 + 0.41 * 53 - 70 ≈ 7.8
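An end-to-end check of VPI(F) for the weather example, using the exact tables rather than the rounded posteriors (exact arithmetic gives ~7.7; the slide's rounded 0.34/0.66 values give ~7.8). All names are illustrative:

```python
prior = {"sun": 0.7, "rain": 0.3}                           # P(W)
p_f_given_w = {("good", "sun"): 0.8, ("bad", "sun"): 0.2,   # P(F | W)
               ("good", "rain"): 0.1, ("bad", "rain"): 0.9}
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20, ("take", "rain"): 70}
actions = ("leave", "take")

def meu(posterior):
    """MEU = max over actions of sum_w P(w) * U(a, w)."""
    return max(sum(p * utility[(a, w)] for w, p in posterior.items())
               for a in actions)

vpi = -meu(prior)                      # subtract MEU(acting now) = 70
for f in ("good", "bad"):
    p_f = sum(prior[w] * p_f_given_w[(f, w)] for w in prior)          # P(F=f)
    posterior = {w: prior[w] * p_f_given_w[(f, w)] / p_f for w in prior}
    vpi += p_f * meu(posterior)        # add sum_f P(f) * MEU(f)
print(round(vpi, 2))                   # 7.7
```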

VPI Properties

Nonnegative
Nonadditive: consider, e.g., obtaining E_j twice
Order-independent
(These are written out formally after the questions below.)

Quick VPI Questions

The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?

There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?

You're playing the lottery. The prize will be $0 or $100. You can play any number between 1 and 100 (the chance of winning is 1%). What is the value of knowing the winning number?
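For reference, the three properties written out formally; these are standard statements consistent with the VPI definition above, not verbatim from the slide:

```latex
\begin{align*}
\text{Nonnegative: } & \forall E', e:\quad \mathrm{VPI}(E' \mid e) \ge 0\\
\text{Nonadditive: } & \mathrm{VPI}(E_j, E_k \mid e) \ne \mathrm{VPI}(E_j \mid e) + \mathrm{VPI}(E_k \mid e) \quad \text{in general}\\
\text{Order-independent: } & \mathrm{VPI}(E_j, E_k \mid e) = \mathrm{VPI}(E_j \mid e) + \mathrm{VPI}(E_k \mid e, E_j)\\
& \phantom{\mathrm{VPI}(E_j, E_k \mid e)} = \mathrm{VPI}(E_k \mid e) + \mathrm{VPI}(E_j \mid e, E_k)
\end{align*}
```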

POMDPs

MDPs have:
States S
Actions A
Transition function P(s' | s, a) (or T(s, a, s'))
Rewards R(s, a, s')

POMDPs add:
Observations O
Observation function P(o | s) (or O(s, o))

POMDPs are MDPs over belief states b (distributions over S); see the belief-update sketch below. (Diagram: the MDP tree over states s and q-states (s, a) becomes a tree over beliefs b, actions, and observations o.) We'll be able to say more in a few lectures.

Example: Ghostbusters

In (static) Ghostbusters:
The belief state is determined by the evidence to date {e}
The tree is really over evidence sets
Probabilistic reasoning is needed to predict new evidence given past evidence

Solving POMDPs

One way: use truncated expectimax to compute approximate values of actions. What if you only considered busting, or one sense followed by a bust? You get a VPI-based agent! (Tree: from {e}, either bust now, with value U(bust, {e}), or sense, observe e', and then bust, with value U(bust, {e, e'}).)
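A minimal belief-update sketch for a POMDP, assuming T and O are given as dictionaries keyed by (s, a, s') and (s', o); the names are illustrative:

```python
def update_belief(b, action, obs, T, O, states):
    """b'(s') is proportional to O(s', o) * sum_s T(s, a, s') * b(s)."""
    new_b = {s2: O[(s2, obs)] * sum(T[(s, action, s2)] * b[s] for s in states)
             for s2 in states}
    z = sum(new_b.values())          # normalizer: P(o | b, a)
    return {s2: p / z for s2, p in new_b.items()}
```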

More Generally

General solutions map belief functions to actions:
Can divide belief space (the set of belief functions) into policy regions (gets complex quickly)
Can build approximate policies using discretization methods
Can factor belief functions in various ways

Overall, POMDPs are very hard (actually PSPACE-hard). Most real problems are POMDPs, but we can rarely solve them in general!