Statistical Decision Theory II: Costs, Rewards, & Ideal Actors

Darleen Thomasina Turner

1 Statistical Decision Theory II: Costs, Rewards, & Ideal Actors

2 Loss Functions Statistical decision theory involves determining which decisions (or actions) are best. It formalizes good versus bad outcomes in terms of the loss function $l(\theta, a(x))$, which is simply a function of the world state $\theta$ and a decision rule or action $a(\cdot)$.

3 Binary (0-1) Loss Function $l(s, \hat{s}) = 1 - \delta(\hat{s} - s)$, where $\hat{s} = a(x)$. The 0-1 loss function represents a task whose goal is to maximize accuracy. It assigns a cost of 0 to an accurate estimate and a cost of 1 to any inaccurate estimate. It doesn't care how far inaccurate estimates are from the true value. Its expected loss is minimized by taking the MAP estimate (the posterior mode) as $\hat{s}$.
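The claim that the posterior mode minimizes expected 0-1 loss can be checked numerically. The following is a minimal sketch, assuming a made-up discrete posterior over four candidate world states; the names `states`, `posterior`, and `expected_zero_one_loss` are illustrative and do not come from the slides.

```python
import numpy as np

# Hypothetical discrete posterior p(s | x) over four candidate world states.
states = np.array([0.0, 1.0, 2.0, 3.0])
posterior = np.array([0.1, 0.2, 0.4, 0.3])

def expected_zero_one_loss(estimate_index, posterior):
    # The loss is 0 when the estimate equals the true state and 1 otherwise,
    # so the expected loss is the posterior mass on every other state.
    return 1.0 - posterior[estimate_index]

losses = np.array(
    [expected_zero_one_loss(i, posterior) for i in range(len(states))]
)
best_index = int(np.argmin(losses))    # estimate with the lowest expected loss
map_index = int(np.argmax(posterior))  # posterior mode (MAP estimate)
```

In this example `best_index` and `map_index` coincide, because minimizing one minus the posterior probability is the same as maximizing the posterior probability.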

4 Quadratic Loss Function $l(s, \hat{s}) = (\hat{s} - s)^2$. The quadratic loss function represents a task whose goal is to minimize the squared error of the estimate. Unlike the binary loss function, it does care about how far inaccurate estimates are from the true value $s$. Its expected value is minimized by taking the posterior mean as the estimate $\hat{s}$.
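A short numerical check of the posterior-mean result may help here. This is only a sketch, assuming a made-up discrete posterior and a grid search over candidate estimates; none of the names or numbers come from the slides.

```python
import numpy as np

# Hypothetical discrete posterior p(s | x) over four candidate world states.
states = np.array([0.0, 1.0, 2.0, 3.0])
posterior = np.array([0.1, 0.2, 0.4, 0.3])

def expected_quadratic_loss(estimate, states, posterior):
    # Average the squared error over the posterior distribution of s.
    return float(np.sum(posterior * (estimate - states) ** 2))

# Grid of candidate estimates; the estimate need not be one of the states.
candidates = np.linspace(0.0, 3.0, 301)
losses = np.array(
    [expected_quadratic_loss(c, states, posterior) for c in candidates]
)
best_estimate = float(candidates[np.argmin(losses)])
posterior_mean = float(np.sum(posterior * states))
```

Up to the grid resolution, `best_estimate` matches `posterior_mean`, which here lies between two of the candidate states rather than on the posterior mode.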

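5 Risk Risk (expected loss) is a measure used to characterize the quality of a particular decision rule or action. Frequentists and Bayesians approach it differently. Frequentist risk: $R(\theta, a) = E_{X \mid \theta}[\, l(\theta, a(X)) \,]$, an average over possible data $X$ for a fixed $\theta$. Bayesian (posterior) risk: $\rho(a(x) \mid x) = E_{\theta \mid x}[\, l(\theta, a(x)) \,]$, an average over $\theta$ given the observed data $x$.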
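To make the frequentist definition concrete, the risk of a decision rule can be estimated by simulation for several fixed values of $\theta$. The sketch below assumes a single Gaussian observation $X \sim N(\theta, 1)$, quadratic loss, and two hypothetical rules (`rule_identity` and `rule_shrink`); these choices are illustrative and are not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def rule_identity(x):
    # Decision rule a(x) = x: report the observation itself.
    return x

def rule_shrink(x):
    # Decision rule a(x) = x / 2: shrink the observation toward zero.
    return 0.5 * x

def frequentist_risk(rule, theta, n_samples=200_000):
    # Monte Carlo estimate of E[(a(X) - theta)^2] with theta held fixed
    # and X drawn repeatedly from N(theta, 1).
    x = rng.normal(loc=theta, scale=1.0, size=n_samples)
    return float(np.mean((rule(x) - theta) ** 2))

thetas = [0.0, 1.0, 3.0]
risk_identity = [frequentist_risk(rule_identity, t) for t in thetas]
risk_shrink = [frequentist_risk(rule_shrink, t) for t in thetas]
```

Under these assumptions the shrinking rule has lower risk near $\theta = 0$ and higher risk for large $|\theta|$, so neither rule is better for every $\theta$.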

6 For frequentists, the question is: how should we aggregate over θ to determine the best a? [Figure: risk $R$ plotted against $\theta$ for three decision rules, $R(\theta, a_1)$, $R(\theta, a_2)$, and $R(\theta, a_3)$.]

7 Bayesian Ideal Actors Trommershäuser et al., 2003

8 Defining the Task for an Ideal Actor Requires an objective world state that determines the cost (loss/gain) of an action. Requires an objective measure of success, generally quantified in terms of expected loss or expected gain: $EL(\text{action}) = \sum_{\text{world states}} \text{loss}(\text{world state}, \text{action})\, p(\text{world state} \mid \text{observations})$. Equivalently, $EG(\text{action}) = \sum_{\text{world states}} \text{gain}(\text{world state}, \text{action})\, p(\text{world state} \mid \text{observations})$.
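The sum over world states can be written as a matrix-vector product. The following is a minimal sketch with an invented two-state, two-action problem; the state names, action names, loss values, and posterior probabilities are all hypothetical.

```python
import numpy as np

# Hypothetical task: two world states and two available actions.
world_states = ["rain", "dry"]
actions = ["umbrella", "no_umbrella"]

# loss[i, j] = loss incurred when the world is in state i and action j is taken.
loss = np.array([
    [1.0, 10.0],  # rain
    [2.0, 0.0],   # dry
])

# Hypothetical posterior p(world state | observations).
posterior = np.array([0.3, 0.7])

# EL(action) = sum over world states of loss(state, action) * p(state | obs).
expected_loss = posterior @ loss
best_action = actions[int(np.argmin(expected_loss))]

# Defining gain as negative loss gives the same choice by maximizing EG.
expected_gain = posterior @ (-loss)
best_action_by_gain = actions[int(np.argmax(expected_gain))]
```

Both formulations select the same action, which is why the slide treats expected loss and expected gain as interchangeable measures of success.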

9 Back to Original Generative Model Each node represents a variable in the task: the stimulus $s$, with distribution $p(s)$, and the measurement $v$, with distribution $p(v \mid s)$. Each node is associated with an objective probability distribution. The arrow represents a conditional probability relationship, with the arrow pointing towards the conditioned variable ($s \rightarrow v$).

10 Characterizing Ideal Performance [Diagram: $s \rightarrow v \rightarrow \hat{s}$, with distributions $p(s)$ and $p(v \mid s)$.] We can expand our original (forward) generative model (for generating measurements) to generate ideal observer responses $\hat{s}$. $\hat{s}$ is a deterministic function of $v$. However, since $v$ is a random variable, so is $\hat{s}$ when conditioned on $s$. We can characterize the performance of (i.e., simulate) the ideal observer by drawing samples from this generative model.
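The sampling procedure described above can be sketched in a few lines. This example assumes a Gaussian prior on $s$ and Gaussian measurement noise, so that the ideal estimate (the posterior mean) is a weighted average of the measurement and the prior mean; the Gaussian form and all parameter values are assumptions made for illustration, not details from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible

# Assumed generative model: s ~ N(prior_mean, prior_sd^2), v | s ~ N(s, noise_sd^2).
prior_mean, prior_sd = 0.0, 2.0
noise_sd = 1.0

def ideal_estimate(v):
    # Posterior mean of s given v for a Gaussian prior and Gaussian likelihood.
    # This is a deterministic function of the measurement v.
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return w * v + (1.0 - w) * prior_mean

# Fix the true stimulus, draw many measurements, and push each one through
# the ideal observer to obtain the distribution of estimates given s.
s_true = 1.0
v_samples = rng.normal(loc=s_true, scale=noise_sd, size=100_000)
s_hat_samples = ideal_estimate(v_samples)

bias = float(s_hat_samples.mean() - s_true)  # systematic error of the estimate
spread = float(s_hat_samples.std())          # variability of the estimate
```

Although each estimate is a deterministic function of its measurement, the estimates vary from sample to sample because the measurements do; `bias` and `spread` summarize that variability for the chosen `s_true`.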

11 Characterizing Ideal Performance For purely perceptual tasks, the ideal observer's performance consists of simple estimation or classification of scene parameters based on sensory information. Ideal performance is usually characterized in terms of the expected accuracy or precision of the estimate; i.e., ideal observers tend to maximize expected accuracy. More generally, an ideal actor's performance consists of selecting actions based on sensory information. Ideal performance is characterized in terms of expected gain or expected loss; i.e., ideal actors maximize expected gain or minimize expected loss.

12 Generative Model for a Simple Sensorimotor Task
$s$ : world state, with distribution $p(s)$
$v$ : visual (sensory) measurement(s), with distribution $p(v \mid s)$
$a$ : (planned) action, $a = a(v)$
$\tau$ : trajectory (executed action), with distribution $p(\tau \mid a)$
$g$ : gain, reward, or loss, $g(s, \tau)$

13 Generative Model for a Simple Sensorimotor Task Optimal action: $a^* = \arg\max_a EG(a, v)$, where $EG(a, v) = \iint g(s, \tau)\, p(\tau \mid a(v))\, p(s \mid v)\, ds\, d\tau$.

14 Trommershäuser et al., 2008

15 A Simple Pointing Task For the Trommershäuser et al. (2003) task, the authors assume that observers/actors know the state of the world $s$ exactly. In this case the gain depends only on the selected action, and we can simplify the expression for expected gain. I.e., $EG(a) = \int g(\tau)\, p(\tau \mid a)\, d\tau$.
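The simplification can be spelled out in one step. The sketch below assumes that exact knowledge of the world state is modeled as a posterior concentrated at a single known value $s_0$ (a Dirac delta), and writes $\tau$ for the executed trajectory; the symbol $s_0$ is introduced here only for illustration.

```latex
\begin{align*}
EG(a, v) &= \iint g(s, \tau)\, p(\tau \mid a)\, p(s \mid v)\, ds\, d\tau \\
         &= \iint g(s, \tau)\, p(\tau \mid a)\, \delta(s - s_0)\, ds\, d\tau
            && \text{(world state known exactly)} \\
         &= \int g(s_0, \tau)\, p(\tau \mid a)\, d\tau \\
         &= \int g(\tau)\, p(\tau \mid a)\, d\tau = EG(a)
            && \text{(writing } g(\tau) \equiv g(s_0, \tau)\text{)}
\end{align*}
```

With the world state fixed, the only remaining source of uncertainty is the variability of the executed trajectory around the planned action.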

16 A Simple Pointing Task Possible Outcomes:
1. The target $r_0$ is hit: subject receives a reward $g_0$ (positive gain)
2. The penalty region $r_1$ is hit: subject incurs a penalty $g_1$ (negative gain)
3. The background region is hit: subject receives no reward or penalty (gain = 0)
4. The screen is not hit within the allotted time: subject receives a large penalty (large negative gain)
For simplicity, we will ignore the two latter outcomes (as the authors did).

17 A Simple Pointing Task Trommershäuser et al., 2008

18 A Simple Pointing Task Let the planned action $a$ be represented by the intended endpoint of the movement, $\mu$. Let the executed trajectory be represented by the actual endpoint of the movement, $x$. Then the expected gain for the action can be expressed as the average gain computed over possible trajectories: $EG(\mu) = g_0\, p(r_0 \mid \mu) + g_1\, p(r_1 \mid \mu) = g_0 \int_{x \in r_0} p(x \mid \mu)\, dx + g_1 \int_{x \in r_1} p(x \mid \mu)\, dx$.
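The expected-gain expression can be evaluated and maximized numerically. The following is a one-dimensional sketch, not the authors' two-dimensional task: it assumes the endpoint $x$ is Gaussian around the aim point $\mu$, treats the target and penalty regions as overlapping intervals, and uses invented values for the gains and the motor noise. All names and numbers are hypothetical.

```python
import math
import numpy as np

def normal_cdf(x, mean, sd):
    # Cumulative distribution function of N(mean, sd^2), via the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_in_interval(interval, mu, sd):
    # p(r | mu): probability that the endpoint lands inside the interval r.
    lo, hi = interval
    return normal_cdf(hi, mu, sd) - normal_cdf(lo, mu, sd)

def expected_gain(mu, sd, target, penalty, g0, g1):
    # EG(mu) = g0 * p(r0 | mu) + g1 * p(r1 | mu)
    return (g0 * prob_in_interval(target, mu, sd)
            + g1 * prob_in_interval(penalty, mu, sd))

# Assumed 1D layout: target centred at 0, penalty region overlapping its left half.
target = (-1.0, 1.0)
penalty = (-2.0, 0.0)
g0, g1 = 100.0, -100.0  # hypothetical reward and penalty
motor_sd = 0.5          # hypothetical endpoint variability

# Evaluate EG over a grid of aim points and pick the best one.
candidates = np.linspace(-1.0, 3.0, 401)
gains = np.array(
    [expected_gain(mu, motor_sd, target, penalty, g0, g1) for mu in candidates]
)
best_mu = float(candidates[np.argmax(gains)])
gain_at_centre = expected_gain(0.0, motor_sd, target, penalty, g0, g1)
```

In this sketch the best aim point lies to the right of the target centre, away from the penalty interval: the actor gives up some probability of hitting the target in exchange for a lower probability of incurring the penalty.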

19 Trommershäuser et al., 2003

20 Trommershäuser et al., 2003

21 Trommershäuser et al., 2003
