Information Theoretic Bounded Rationality. Pedro A. Ortega University of Pennsylvania

Size: px

Start display at page:

Download "Information Theoretic Bounded Rationality. Pedro A. Ortega University of Pennsylvania"

Agnes Randall
5 years ago
Views:

1 Information Theoretic Bounded Rationality Pedro A. Ortega University of Pennsylvania May 2015

2 Goal of this Talk In large scale decision problems, agents choose as Posterior Utility Inverse Temp. Why is this so? And how is this used? Prior

3 Pick a box. 50/50 25/75

4 Pick a box. 50/50?

5 Pick a box. 50/50? Unknown probabilities affect net value.

6 Pick the largest.

7 Pick the largest.

8 Pick the largest. Large scale: you cannot do better than random search.

9 Preliminaries History, variational principles and more May 2015

10 An active area of research... ermodynamics of Prediction Polani Still Regularisation in Control Peters Mitter Maccheroni Linearly Solvable Control Todorov KL Control Theory Neumann Theodorou Active Inference Bierkens Robust Decision Making Braun Bounded Rationality Rustichini Tishby Wiegerink Friston Palfrey Kappen Fleming Sims Gomez Marinacci Rational Inattention Variational Preferences Duality Control Estimation Guerra Information Flow in Perception Act McKelvey Quantum Mechanics for Control Wolpert Ortega Quantal Response Equilibria

11 History of Decision Theory Bernoulli 1738 Utility Knight Ramsey 1921 Uncertainty & Risk de Finetti Subjective Probability von Neumann Morgenstern 1944 (Objective) Expected Utility Savage 1954 Subjective Expected Utility New theory of bounded rational decisions. Simon 1957 Bounded Rationality?

12 What is boundedness? NO YES Suboptimal Heuristics Approximation Meta reasoning Intractability Model Uncertainty Risk Sensitivity ToM Bounded = Subject to information constraints.

13 Bounded Rational Decisions Assumptions, derivation and main properties May 2015

14 Planning and Searching

15 Planning and Searching

16 Planning and Searching

17 Planning and Searching Planning is searching for a good subset.

18 Straw Pile Assumptions Primitive operation: search A given B. Very large Negligible probability Irreducible Parallelisable

19 Straw Pile Assumptions Primitive operation: search A given B. Very large Negligible probability Irreducible Parallelisable Golden Rule: No Integration! Large scale planning is searching a set inside a pile of straw.

20 Search Costs

21 Search Costs Continuity Additivity Monotonicity < +

22 Cost of Probabilistic Choice a) Total Cost = Expected Cost + Waste b)

23 Bounded Rational Decision Making Deliberation transforms prior Q into posterior P. Agent extremises the free energy functional:... Trades off utility & information costs. Models: large scale choice spaces, trust & risk sensitivity, and other information constraints.

24 Optimal Choice & Certainty Equivalent Posterior (optimal choice distribution): Certainty Equivalent (value):

25 Optimal Choice and Certainty Equivalent a) b)

26 Optimal Choice and Certainty Equivalent a) b)

27 Optimal Choice and Certainty Equivalent a) b) Inverse temperature encapsulates constraints.

28 Equivalence Behaviourally indistinguishable. Have same: prior, posterior, and certainty equivalent. Many ways to represent same choice behaviour.

29 Stochastic Choices Choosing amounts to rejection sampling:

30 Stochastic Choices Choosing amounts to rejection sampling [1]: Essentially depending on inverse temperature, not on number of choices! RS is the most efficient sampler given knowledge. 1.8 [1] Ortega, Braun & Tishby, 2014.

31 Stochastic Choices II We can sample directly from an equivalent decision problem. Interpretation

32 Stochastic Choices II We can sample directly from an equivalent decision problem. Interpretation RS from equivalent problem requires equalisation.

33 Comparison Objective Function Rationality Paradigm Domain size Search strategy Search scope Functional form Utility sensitivity Expected Utility Free Energy perfect bounded small large deterministic randomised complete incomplete linear non linear first moment all moments

34 Sequential Decisions Definition, construction, and recursions May 2015

35 Classical Decision Trees max max E E min max max max E E min min Planning in classical decision trees depends on the type of system making the move in each node. Can we generalise them?

36 Certainty Equivalent/Value Classical values can be approximated b) a) 1 2 low 1 c) d) high 0 Different choices of inverse temperatures.

37 Bounded Rational Decision Trees a) E b) max E min Free energy functional: building block for game trees inverse temperatures are node specific solved using forward sampling, not dynamic programming

38 Construction of Decision Tree a) Start Startfrom fromsingle step. single step. b) c) d) Apply Applyequivalence equivalencetransformations transformationsuntil untildesired desiredtree treeisisobtained. obtained.

39 Map of Decision Rules a) b) e) anti rational rational risk seeking, cooperative d Minimax c) Expectimax c b d) risk averse, adversarial a Vary Inverse inverse temperatures temperatures determine to choose the decision decision rules. rule.

40 Recursive Equations Free Energy: Certainty Equivalent: Posterior:

41 Recursive Rejection Sampling Recursively sample a path. Equalise every time the inverse temperature changes. Implicitly evaluates feasibility of trajectories.

42 Summary Problem: Large scale decision spaces. Key ideas: Use statistical description of choices. Exploit stochastic computational model. Properties: Blurs boundary learning planning. Massively parallelisable. Controllable complexity. Implements rich class of decision rules.

43 Thank you!

44 Bonus Material...

45 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!

46 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!

47 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!

48 Interrupted Reasoning Any time optimisation: Utility and cost of searching are commensurable! commensurable.

49 Interrupted Reasoning Any time optimisation: Computation Computationreduces reduces uncertainty of solution! uncertainty of solution! Utility and cost of searching are commensurable! commensurable.

50 Interrupted Reasoning Any time optimisation: Computation Computationreduces reduces uncertainty of solution! uncertainty of solution! Utility and cost of searching are commensurable! commensurable.

51 The Rational Adaptive Robobug Design: Horizon: 120 seconds Frequency: 1 Hz Observations: 2 Actions: 2 Size of decision tree: Nodes: 1072 Atoms in the world: 1050 Computation (@ 1016 FLOPS): Time: 1050 years Age of the universe: 1010 years SEU theory is completely intractable!

52 The Rational Adaptive Robobug Design: Horizon: 120 seconds Frequency: 1 Hz Observations: 2 Actions: 2 Size of decision tree: Nodes: 1072 Atoms in the world: 1050 Computation (@ 1016 FLOPS): Time: 1050 years Age of the universe: 1010 years SEU theory is completely intractable!

53 Relation Energy Information Desiderata for costs functions of probability additive monotonic Analogy: Isothermal Transformation Result [1] Before Conversion ConversionFactor Factor (real valued) (real valued) After Work Workisisreduction reduction ofofuncertainty! uncertainty! Utility = Work = (Unit Conv.) x (Reduction of Uncertainty). [1] Ortega, P.A. & Braun, D.A., 2011

54 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3

55 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3

56 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3

57 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Knowledge shapes the perceived work! 3

58 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before Realisation.

59 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before Realisation.

60 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before knowledge of realisation.

61 Lesson from Thermodynamics No matter how hard we try, there's always a loss of energy.

Information, Utility & Bounded Rationality

Information, Utility & Bounded Rationality Pedro A. Ortega and Daniel A. Braun Department of Engineering, University of Cambridge Trumpington Street, Cambridge, CB2 PZ, UK {dab54,pao32}@cam.ac.uk Abstract.