Lecture 7: Valuing outcomes
Professor Scott Schmidler, Duke University
Stat 340, TuTh 11:45-1:00, Fall 2016

Reference: Lectures draw heavily from Lindley, Making Decisions, 2nd ed.
Textbook reading: Smith, Bayesian Decision Analysis, Ch 3.
See also: Clemen & Reilly, Making Hard Decisions, Ch 13, 14.

Numerical measures of outcomes

We showed that uncertainty can be given a numerical measure obeying certain laws (probabilities). We will now do the same for outcomes/values (utilities). Together these will provide the solution to a decision problem.

Consider possible decisions d_1, ..., d_m and possible outcomes o_1, ..., o_n. To fill out our decision model (influence diagram or tree) we require a numerical value for each possible (d_i, o_j) pair. Previously we used dollars and the EMV rule, but not all problems naturally lead to maximizing dollars.

Example

A manufacturer produces rolls of textiles (e.g. curtain material). Before selling, he must decide whether to inspect the rolls for flaws:

  d_1 = inspect
  d_2 = sell without inspection

The uncertain outcomes are:

  o_1 = roll is free of flaws
  o_2 = roll has >= 1 flaw

Consequences:

           o_1      o_2
  d_1   [ c_11     c_12 ]
  d_2   [ c_21     c_22 ]
Consequences

         o_1: material good        o_2: material flawed
  d_1    satisfied customer;       must make a new roll,
         cost of inspection        plus cost of inspection
  d_2    satisfied customer        customer complaints;
                                   must replace with a new roll

Clearly c_21 is best for the manufacturer; c_11 is probably next best (only the cost of inspection); c_12 next (must discard the roll); c_22 worst (only the inspection cost is saved).

Ranking: c_21 > c_11 > c_12 > c_22.

Some consequences are preferred to others. (If the inspection cost were very high but a flaw could be fixed, the ranking might be different.)

Example (cont'd)

Which decision should the manufacturer choose? Not obvious: the presence of a flaw is uncertain.

  If o_1 (no flaws), d_2 is clearly better (c_21 preferred to c_11).
  If o_2 (flaws), d_1 is better (c_12 preferred to c_22).

The manufacturer can quantify his uncertainty by assigning probabilities p(o_1) and p(o_2) = 1 - p(o_1). If p(o_1) is near 1, choose d_2; if p(o_1) is near 0, choose d_1. Somewhere in between the decision changes, but where? That depends not only on the ranking of the consequences, but on how much better one is than another.

Numerical values

Again we need a standard to compare against. We use two reference consequences:

  c_best  : better than (or not worse than) any consequence in the table
  c_worst : worse than (or not better than) any consequence in the table

We assume any pair of consequences can be compared (one preferred to the other, or both equally desirable), and that these comparisons are coherent. c_best (or c_worst) may appear in the table, but need not.

Consider any consequence c_ij, and an urn standard with U of N balls black. For a ball drawn at random, suppose:

  consequence c_best occurs if a black ball is drawn;
  consequence c_worst occurs if a white ball is drawn.

So we receive c_best with probability u = U/N, and c_worst with probability 1 - u. How does this gamble compare to c_ij?

  If u = 1, the gamble is clearly better.
  If u = 0, it is clearly worse.

As U (and therefore u = U/N) increases, the gamble gets better, so there must be a value of U (and hence of u) at which we are indifferent between c_ij and the gamble.
Hence there is a unique number u in (0, 1) such that c_ij is equally desirable as a chance u of the highly desirable outcome c_best and a chance 1 - u of the highly undesirable outcome c_worst. Write this number as µ(c_ij), and call it the utility of c_ij.

By coherence we clearly have:

  c_ij preferred to c_kl          <=>  µ(c_ij) > µ(c_kl)
  c_ij equally desirable to c_kl  <=>  µ(c_ij) = µ(c_kl)
  c_ij worse than c_kl            <=>  µ(c_ij) < µ(c_kl)

So we have a numerical measure of the desirability of any consequence in the table, called a utility. (Note that u is a probability, so it obeys the probability laws.)

In general, we will have a decision table of the form:

           o_1        o_2       ...    o_n
  d_1    µ(c_11)    µ(c_12)    ...   µ(c_1n)
  d_2    µ(c_21)    µ(c_22)    ...   µ(c_2n)
  ...
  d_m    µ(c_m1)    µ(c_m2)    ...   µ(c_mn)

E.g. for the inspection example we have µ(c_21) = 1 and µ(c_22) = 0, and might get, for example:

                         o_1: Good    o_2: Flawed
  d_1: Inspect              .9            .5
  d_2: Don't inspect        1             0

Suppose further the manufacturer assesses p(flaw) = .2.

Utilities in medicine: QALYs

Often in medical decision making we face possible outcomes to which it is difficult to assign dollar values. Examples:

  chemotherapy side effects
  chronic pain
  bilateral mastectomy

Yet we need a quantitative measure of desirability for decision making. One solution: the quality-adjusted life year (QALY).

We can assign a utility to any health state via the standard gamble. Let µ(1 year in perfect health) = 1 and µ(immediate death) = 0. Then µ(health state) is the probability at which you are indifferent between living 1 year in that health state and a gamble with probability µ of 1 year in perfect health and probability 1 - µ of immediate death. (Note: are there fates worse than death?)
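The standard gamble above suggests a simple elicitation procedure: since the gamble improves monotonically in u, the indifference point can be located by bisection. Below is a minimal sketch; `prefers_gamble` is a hypothetical preference oracle (not from the lecture) that answers whether the decision maker prefers the gamble at a given u, simulated here with an assumed true utility of 0.5.

```python
# Sketch of standard-gamble utility elicitation by bisection.
# `prefers_gamble(u)` is a hypothetical oracle: True if the decision maker
# prefers the gamble (c_best w.p. u, c_worst w.p. 1-u) to the sure consequence.

def elicit_utility(prefers_gamble, tol=1e-4):
    """Bisect on u until the indifference point is bracketed within tol."""
    lo, hi = 0.0, 1.0              # gamble is worse at u = 0, better at u = 1
    while hi - lo > tol:
        u = (lo + hi) / 2
        if prefers_gamble(u):
            hi = u                 # gamble preferred: indifference point is lower
        else:
            lo = u                 # sure thing preferred: indifference point is higher
    return (lo + hi) / 2

true_u = 0.5                       # assumed true utility, for simulation only
u_hat = elicit_utility(lambda u: u > true_u)
print(u_hat)                       # close to 0.5
```

In practice each oracle call would be a question put to the decision maker (or patient, in the QALY setting), so the tolerance is chosen to keep the number of questions small.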
Combining probabilities and utilities

We now have numerical values for both uncertainties and consequences, and wish to combine them to make decisions. Key: both are probabilities!

If we choose d_i, the outcome depends on an uncertain event: if o_j occurs, the consequence is c_ij. But we can replace c_ij with a chance µ(c_ij) of c_best and 1 - µ(c_ij) of c_worst. So for any decision taken, and any event occurring, we can think of the result as either c_best or c_worst!

Now if we take d_i and o_j occurs, the probability of c_best is

  p(c_best | d_i, o_j) = µ(c_ij).

Using the extension theorem:

  p(c_best | d_i) = Σ_j p(c_best | o_j, d_i) p(o_j | d_i) = Σ_j µ(c_ij) p(o_j | d_i).

If c_best does not result, c_worst does, so p(c_best | d_i) measures the merit of d_i: higher p(c_best | d_i) means a better d_i. The best decision is the one with maximum p(c_best | d_i).

Expected utility

Notice that p(c_best | d_i) is the expected utility under d_i:

  µ(d_i) = Σ_j µ(c_ij) p(o_j | d_i) = p(c_best | d_i).

Therefore the best decision is the one with maximum expected utility,

  arg max_{d_i in D} µ(d_i).

A decision problem is solved by maximizing expected utility.

Inspection example revisited

  µ(d_1) = .9(.8) + .5(.2) = .82
  µ(d_2) = 1.0(.8) + .0(.2) = .80

So d_1 (Inspect) is the better decision.

Notice:

  If production improves, lowering p(flaw) to .1, then µ(d_1) = .86 and µ(d_2) = .9, so we can skip inspection.

  If inspection becomes more costly, so that µ(c_11) and µ(c_12) both decrease by .1 (to .8 under o_1 and .4 under o_2), then µ(d_1) = .72 and it is better not to inspect (but note the maximum expected utility drops from .82 to .80).
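The expected-utility calculations above are easy to check numerically. The sketch below uses the utilities and probabilities from the inspection example; the dictionary keys are labels chosen here for readability, not notation from the lecture.

```python
# Expected utilities for the inspection example.
# Utility table from the slides: rows are decisions, columns are outcomes.
utilities = {
    "inspect":      {"good": 0.9, "flawed": 0.5},
    "dont_inspect": {"good": 1.0, "flawed": 0.0},
}

def expected_utility(decision, p_flaw):
    """mu(d_i) = sum_j mu(c_ij) p(o_j), with p(good) = 1 - p(flaw)."""
    u = utilities[decision]
    return u["good"] * (1 - p_flaw) + u["flawed"] * p_flaw

for p_flaw in (0.2, 0.1):
    eu = {d: expected_utility(d, p_flaw) for d in utilities}
    best = max(eu, key=eu.get)     # MEU decision
    print(p_flaw, eu, "->", best)
```

With p(flaw) = .2 this reproduces µ(d_1) = .82 versus µ(d_2) = .80, so inspection wins; lowering p(flaw) to .1 flips the decision, as noted above.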
Summary of decision analysis

A decision problem can be formulated as lists of decisions and uncertain events. Assuming coherent comparison of events and outcomes, probabilities can be assigned to events and utilities to consequences. Then each decision can be assigned a value (its expected utility), and the best decision is the one with the highest value (maximum expected utility, MEU).

So the recommended procedure for making decisions is:

  1. List the possible decisions d_1, ..., d_m.
  2. List the uncertain events o_1, ..., o_n.
  3. Assign probabilities p(o_1), p(o_2), ..., p(o_n) to the events.
  4. Assign utilities µ(d_i, o_j) to the consequences (d_i, o_j).
  5. Choose the decision that maximizes the expected utility

       µ(d_i) = Σ_j µ(d_i, o_j) p(o_j | d_i).

(Notice that all of this follows more or less inevitably from coherence!)

Arguing against the existence of utilities

Recall that µ(c_ij) is the probability p at which c_ij is equivalent to a gamble giving c_best with probability p. Suppose c_ij = $1000 and c_best = $1001. Mike prefers the certain c_ij to the gamble for any p < 1, since he prefers the certain $1000 to even the slightest risk of getting $0, the potential increase ($1) being negligible.

Claim: This is incoherent.

If Mike prefers

  (1) a certainty of $1000

to

  (2) a chance p of $1001 against 1 - p of $0

for any p < 1, then presumably he also prefers

  (3) a certainty of $1001

to

  (4) a chance p of $1002 against 1 - p of $0

for the same p, since the potential loss is greater while the gain is the same. Now alter (2) by replacing the certain $1001 (received if the probability-p event occurs) with (4), a chance p of $1002. This makes (2) even worse, since (3) is preferred to (4), and it yields $1002 with probability p^2. So to be coherent Mike must prefer (1) to

  (5) a chance p^2 of $1002 against 1 - p^2 of $0.

Repeating the argument shows Mike must prefer (1) to

  (6) a chance p^1000 of $2000 against 1 - p^1000 of $0.
But p was arbitrary, so p^1000 can be as close to 1 as we want. Now we can distinguish between the rewards ($1000 vs. $2000), and it becomes clear that, for some p, Mike should prefer (6) to (1). This contradicts the original claim: to be coherent, Mike must prefer (2) to (1) for some p sufficiently close to 1.

Choice of reference consequences

For constructing utilities we used the reference consequences c_best and c_worst. Did the choice matter? Let c_best' and c_worst' be two other reference consequences such that

  c_best'  > c_best  (and so better than all c_ij), and
  c_worst' < c_worst (and so worse than all c_ij).

Recall that c_ij was equivalent to the gamble giving c_best with probability u and c_worst with probability 1 - u. Since c_worst' < c_worst < c_best', c_worst is equivalent to c_best' with probability p and c_worst' with probability 1 - p, for some p. Similarly, c_best is equivalent to c_best' with probability p + s and c_worst' with probability 1 - p - s, for some s > 0 (s > 0 since c_best is preferred to c_worst).

Substituting these gambles into the gamble for c_ij, the total probability of c_best' is

  u(p + s) + (1 - u)p = us + p,

so the new utility for c_ij, after replacing (c_best, c_worst) with (c_best', c_worst'), is us + p. (Note that p and s do not depend on c_ij; they are the same for every consequence.)

So for any decision the new expected utility is

  Σ_j (µ(d_i, o_j) s + p) p(o_j | d_i) = s µ(d_i) + p,

where µ(d_i) is the original expected utility. The choice of reference consequences (c_best, c_worst) is therefore irrelevant: the origin and scale of utilities are irrelevant, and the MEU decision is invariant under affine transformations of the utility. It is convenient to use different origins and scales for different problems, just as we measure length in angstroms, yards, light-years, etc.
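The affine-invariance result is easy to verify numerically. The sketch below applies an arbitrary positive-scale affine transformation u -> s*u + p to the inspection example's utility table and checks that both the MEU decision and the identity EU' = s*EU + p hold; the particular values of s and p are arbitrary choices for illustration.

```python
# Numerical check that an affine change of utility scale (u -> s*u + p, s > 0)
# leaves the MEU decision unchanged (inspection example utilities).
probs = [0.8, 0.2]                  # p(o_1), p(o_2)
U = [[0.9, 0.5], [1.0, 0.0]]        # rows: inspect, don't inspect

def eus(table, probs):
    """Expected utility of each decision (row)."""
    return [sum(u * pr for u, pr in zip(row, probs)) for row in table]

s, p = 3.7, -2.1                    # arbitrary positive scale and shift
U2 = [[s * u + p for u in row] for row in U]

orig, transformed = eus(U, probs), eus(U2, probs)

# Same argmax, i.e. same MEU decision:
assert orig.index(max(orig)) == transformed.index(max(transformed))
# Each transformed EU equals s * (original EU) + p:
assert all(abs(t - (s * o + p)) < 1e-9 for o, t in zip(orig, transformed))
print("MEU decision invariant under affine transformation of utility")
```

Because s > 0 preserves the ordering of expected utilities, any choice of reference consequences yields the same recommended decision, exactly as the derivation above shows.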