CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7

These lecture notes are heavily based on notes originally written by Nikhil Sharma.

Decision Networks

In the third note, we learned about game trees and algorithms such as minimax and expectimax, which we used to determine optimal actions that maximized our expected utility. Then in the sixth note, we discussed Bayes nets and how we can use evidence we know to run probabilistic inference to make predictions. Now we'll discuss a combination of both Bayes nets and expectimax known as a decision network, which we can use to model the effect of various actions on utilities based on an overarching graphical probabilistic model. Let's dive right in with the anatomy of a decision network:

- Chance nodes - Chance nodes in a decision network behave identically to Bayes nets. Each outcome in a chance node has an associated probability, which can be determined by running inference on the underlying Bayes net it belongs to. We'll represent these with ovals.

- Action nodes - Action nodes are nodes that we have complete control over; they are nodes representing a choice between any of a number of actions which we have the power to choose from. We'll represent action nodes with rectangles.

- Utility nodes - Utility nodes are children of some combination of action and chance nodes. They output a utility based on the values taken on by their parents, and are represented as diamonds in our decision networks.

Consider a situation when you're deciding whether or not to take an umbrella when you're leaving for class in the morning, and you know there's a forecasted 30% chance of rain. Should you take the umbrella? If there was an 80% chance of rain, would your answer change? This situation is ideal for modeling with a decision network, and we do it as follows:
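The original note draws this network as a figure. As a stand-in, here is a minimal structural sketch in Python; the topology and the outcome labels (e.g. sun/rain) are inferred from the surrounding text rather than taken from the omitted figure, so treat them as assumptions.

```python
# Structural sketch of the umbrella decision network (assumed topology).
chance_nodes = {                                 # ovals: probabilities come from
    "Weather":  ["sun", "rain"],                 # inference on the Bayes net
    "Forecast": ["good", "bad"],
}
action_node = {"Umbrella": ["leave", "take"]}    # rectangle: the choice we control
utility_parents = ("Umbrella", "Weather")        # diamond: utility U(action, weather)
```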

As we've done throughout this course with the various modeling techniques and algorithms we've discussed, our goal with decision networks is again to select the action which yields the maximum expected utility (MEU). This can be done with a fairly straightforward and intuitive procedure:

- Start by instantiating all evidence that's known, and run inference to calculate the posterior probabilities of all chance node parents of the utility node into which the action node feeds.

- Go through each possible action and compute the expected utility of taking that action given the posterior probabilities computed in the previous step. The expected utility of taking action $a$ given evidence $e$ and $n$ chance nodes is computed with the following formula:

$$EU(a \mid e) = \sum_{x_1, \ldots, x_n} P(x_1, \ldots, x_n \mid e)\, U(a, x_1, \ldots, x_n)$$

where each $x_i$ represents a value that the $i$-th chance node can take on. We simply take a weighted sum over the utilities of each outcome under our given action, with weights corresponding to the probabilities of each outcome.

- Finally, select the action which yielded the highest expected utility to get the MEU (a code sketch of this procedure follows the list).
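To make the procedure concrete, here is a minimal Python sketch, not part of the original note, for the case of a single chance node: the expected utility is a weighted sum over outcomes, and the MEU action is the argmax over actions. Representing the tables as plain dictionaries is a simplifying assumption.

```python
def expected_utility(action, posterior, utility):
    """EU(a | e): sum over outcomes x of P(x | e) * U(a, x).

    posterior: dict mapping each outcome x to P(x | e)
    utility:   dict mapping (action, outcome) pairs to U(a, x)
    """
    return sum(p * utility[(action, x)] for x, p in posterior.items())


def meu(actions, posterior, utility):
    """Return (MEU, optimal action) by maximizing EU(a | e) over actions."""
    return max((expected_utility(a, posterior, utility), a) for a in actions)
```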

Let's see how this actually looks by calculating the optimal action (should we leave or take our umbrella) for our weather example, using both the conditional probability table for the weather given a bad weather forecast (the forecast is our evidence variable) and the utility table given our action and the weather. Note that we have omitted the inference computation for the posterior probabilities $P(W \mid F = \text{bad})$, but we could compute these using any of the inference algorithms we discussed for Bayes nets; here we simply take the table of posterior probabilities for $P(W \mid F = \text{bad})$ as given. Going through both our actions and computing expected utilities yields:

$$EU(\text{leave} \mid \text{bad}) = \sum_w P(w \mid \text{bad})\, U(\text{leave}, w) = 0.34 \cdot 100 + 0.66 \cdot 0 = 34$$

$$EU(\text{take} \mid \text{bad}) = \sum_w P(w \mid \text{bad})\, U(\text{take}, w) = 0.34 \cdot 20 + 0.66 \cdot 70 = 53$$

All that's left to do is take the maximum over these computed utilities to determine the MEU:

$$MEU(F = \text{bad}) = \max_a EU(a \mid \text{bad}) = 53$$

The action that yields the maximum expected utility is take, and so this is the action recommended to us by the decision network. More formally, the action that yields the MEU can be determined by taking the argmax over expected utilities.
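These numbers can be checked directly (or with the helpers sketched above). The table entries below are read off the calculation; since the original tables appear only in an omitted figure, assigning the 0.34 to sun rather than rain is an assumption consistent with the arithmetic.

```python
posterior_bad = {"sun": 0.34, "rain": 0.66}        # assumed P(W | F = bad)
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take",  "sun"): 20,  ("take",  "rain"): 70}

eu = {a: sum(p * utility[(a, w)] for w, p in posterior_bad.items())
      for a in ("leave", "take")}
print(eu)                    # {'leave': 34.0, 'take': 53.0}, up to float rounding
print(max(eu, key=eu.get))   # 'take' -- the action the network recommends
```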

Outcome Trees

We mentioned at the start of this note that decision networks involve some expectimax-esque elements, so let's discuss what exactly that means. We can unravel the selection of the action that maximizes expected utility in a decision network as an outcome tree. Our weather forecast example from above unravels into the following outcome tree:

The root node at the top is a maximizer node, just like in expectimax, and is controlled by us. We select an action, which takes us to the next level in the tree, controlled by chance nodes. At this level, the chance nodes resolve to different utility nodes at the final level, with probabilities corresponding to the posterior probabilities derived from probabilistic inference run on the underlying Bayes net. What exactly makes this different from vanilla expectimax? The only real difference is that for outcome trees we annotate our nodes with what we know at any given moment (inside the curly braces).

The Value of Perfect Information

In everything we've covered up to this point, we've generally always assumed that our agent has all the information it needs for a particular problem and/or has no way to acquire new information. In practice, this is hardly the case, and one of the most important parts of decision making is knowing whether or not it's worth gathering more evidence to help decide which action to take. Observing new evidence almost always has some cost, whether it be in terms of time, money, or some other medium. In this section, we'll talk about a very important concept - the value of perfect information (VPI) - which mathematically quantifies the amount by which an agent's maximum expected utility is expected to increase if it observes some new evidence. We can compare the VPI of learning some new information with the cost associated with observing that information to make decisions about whether or not it's worthwhile to observe.

General Formula

Rather than simply presenting the formula for computing the value of perfect information for new evidence, let's walk through an intuitive derivation. We know from our above definition that the value of perfect information is the amount by which our maximum expected utility is expected to increase if we decide to observe new evidence. We know our current maximum expected utility given our current evidence $e$:

$$MEU(e) = \max_a \sum_s P(s \mid e)\, U(s, a)$$

Additionally, we know that if we observed some new evidence $e'$ before acting, the maximum expected utility of our action at that point would become

$$MEU(e, e') = \max_a \sum_s P(s \mid e, e')\, U(s, a)$$

However, note that we don't know what new evidence we'll get. For example, if we didn't know the weather forecast beforehand and chose to observe it, the forecast we observe might be either good or bad. Because we don't know what new evidence $e'$ we'll get, we must represent it as a random variable $E'$. How do we represent the new MEU we'll get if we choose to observe a new variable, when we don't know what the evidence gained from observation will tell us? The answer is to compute the expected value of the maximum expected utility - which, while a mouthful, is the natural way to go:

$$MEU(e, E') = \sum_{e'} P(e' \mid e)\, MEU(e, e')$$

Observing a new evidence variable yields a different MEU with probabilities corresponding to the probabilities of observing each value for the evidence variable, and so by computing $MEU(e, E')$ as above, we compute what we expect our new MEU will be if we choose to observe new evidence. We're just about done now - returning to our definition for VPI, we want to find the amount by which our MEU is expected to increase if we choose to observe new evidence. We know our current MEU and the expected value of the new MEU if we choose to observe, so the expected MEU increase is simply the difference of these two terms! Indeed,

$$VPI(E' \mid e) = MEU(e, E') - MEU(e)$$

where we can read $VPI(E' \mid e)$ as "the value of observing new evidence $E'$ given our current evidence $e$". Let's work our way through an example by revisiting our weather scenario one last time. If we don't observe any evidence, then our maximum expected utility can be computed as follows:

$$MEU(\varnothing) = \max_a EU(a) = \max_a \sum_w P(w)\, U(a, w) = \max\{0.7 \cdot 100 + 0.3 \cdot 0,\ 0.7 \cdot 20 + 0.3 \cdot 70\} = \max\{70, 35\} = 70$$

Note that the convention when we have no evidence is to write $MEU(\varnothing)$, denoting that our evidence is the empty set.
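The same style of check reproduces the no-evidence case; the prior below is read off the max computation above (0.7 for sun, 0.3 for rain, matching the forecasted 30% chance of rain, with the sun/rain labels again assumed):

```python
prior = {"sun": 0.7, "rain": 0.3}                  # P(W), no evidence observed
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take",  "sun"): 20,  ("take",  "rain"): 70}

meu_empty = max(sum(p * utility[(a, w)] for w, p in prior.items())
                for a in ("leave", "take"))
print(meu_empty)   # 70.0 -- MEU(empty set), achieved by 'leave'
```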

Now let's say that we're deciding whether or not to observe the weather forecast. We've already computed that $MEU(F = \text{bad}) = 53$, and let's assume that running an identical computation for $F = \text{good}$ yields $MEU(F = \text{good}) = 95$. We're now ready to compute $MEU(e, E')$:

$$MEU(e, E') = MEU(F) = \sum_{e'} P(e' \mid e)\, MEU(e, e') = \sum_f P(F = f)\, MEU(F = f)$$
$$= P(F = \text{good})\, MEU(F = \text{good}) + P(F = \text{bad})\, MEU(F = \text{bad}) = 0.59 \cdot 95 + 0.41 \cdot 53 = 77.78$$

Hence we conclude $VPI(F) = MEU(F) - MEU(\varnothing) = 77.78 - 70 = 7.78$.

Properties of VPI

The value of perfect information has several very important properties, namely:

- Nonnegativity. $\forall E', e \quad VPI(E' \mid e) \geq 0$
  Observing new information always allows you to make a more informed decision, and so your maximum expected utility can only increase (or stay the same if the information is irrelevant for the decision you must make).

- Nonadditivity. $VPI(E_j, E_k \mid e) \neq VPI(E_j \mid e) + VPI(E_k \mid e)$ in general.
  This is probably the trickiest of the three properties to understand intuitively. It's true because generally observing some new evidence $E_j$ might change how much we care about $E_k$; therefore we can't simply add the VPI of observing $E_j$ to the VPI of observing $E_k$ to get the VPI of observing both of them. Rather, the VPI of observing two new evidence variables is equivalent to observing one, incorporating it into our current evidence, then observing the other. This is encapsulated by the order-independence property of VPI, described below.

- Order-independence. $VPI(E_j, E_k \mid e) = VPI(E_j \mid e) + VPI(E_k \mid e, E_j) = VPI(E_k \mid e) + VPI(E_j \mid e, E_k)$
  Observing multiple new evidence variables yields the same gain in maximum expected utility regardless of the order of observation. This should be a fairly straightforward property - because we don't actually take any action until after observing all new evidence variables, it doesn't matter whether we observe them together or in some arbitrary sequential order.
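To close the loop on the example, here is a short sketch verifying the VPI computation end to end; $MEU(F = \text{good}) = 95$ and the forecast distribution with $P(F = \text{good}) = 0.59$ come from the worked example above.

```python
p_forecast = {"good": 0.59, "bad": 0.41}    # P(F = f)
meu_given_f = {"good": 95.0, "bad": 53.0}   # MEU(F = f), from the example
meu_empty = 70.0                            # MEU(empty set), computed earlier

meu_forecast = sum(p_forecast[f] * meu_given_f[f] for f in p_forecast)
vpi = meu_forecast - meu_empty
print(meu_forecast, vpi)   # 77.78 and 7.78, up to float rounding
assert vpi >= -1e-9        # nonnegativity, as guaranteed above
```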