Information Theoretic Bounded Rationality. Pedro A. Ortega University of Pennsylvania


1 Information Theoretic Bounded Rationality Pedro A. Ortega University of Pennsylvania May 2015

2 Goal of this Talk In large scale decision problems, agents choose as [equation with labelled terms: Posterior, Prior, Utility, Inverse Temp.] Why is this so? And how is this used?

3 Pick a box. 50/50 25/75

5 Pick a box. 50/50? Unknown probabilities affect net value.

8 Pick the largest. Large scale: you cannot do better than random search.

9 Preliminaries: History, variational principles and more

10 An active area of research... Topics: Thermodynamics of Prediction; Regularisation in Control; Linearly Solvable Control; KL Control Theory; Active Inference; Robust Decision Making; Bounded Rationality; Rational Inattention; Variational Preferences; Duality Control Estimation; Information Flow in Perception Act; Quantum Mechanics for Control; Quantal Response Equilibria. Researchers: Polani, Still, Peters, Mitter, Maccheroni, Todorov, Neumann, Theodorou, Bierkens, Braun, Rustichini, Tishby, Wiegerink, Friston, Palfrey, Kappen, Fleming, Sims, Gomez, Marinacci, Guerra, McKelvey, Wolpert, Ortega.

11 History of Decision Theory Bernoulli (1738): Utility. Knight (1921): Uncertainty & Risk. Ramsey, de Finetti: Subjective Probability. von Neumann & Morgenstern (1944): (Objective) Expected Utility. Savage (1954): Subjective Expected Utility. Simon (1957): Bounded Rationality. New theory of bounded rational decisions?

12 What is boundedness? NO: Suboptimal, Heuristics, Approximation, Meta reasoning. YES: Intractability, Model Uncertainty, Risk Sensitivity, ToM. Bounded = Subject to information constraints.

13 Bounded Rational Decisions: Assumptions, derivation and main properties

17 Planning and Searching Planning is searching for a good subset.

19 Straw Pile Assumptions Primitive operation: search A given B. Very large Negligible probability Irreducible Parallelisable Golden Rule: No Integration! Large scale planning is searching a set inside a pile of straw.

21 Search Costs Continuity, Additivity, Monotonicity [formulas not preserved in the transcript]

22 Cost of Probabilistic Choice a) Total Cost = Expected Cost + Waste b)

23 Bounded Rational Decision Making Deliberation transforms prior Q into posterior P. Agent extremises the free energy functional:... Trades off utility & information costs. Models: large scale choice spaces, trust & risk sensitivity, and other information constraints.
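The formula on this slide did not survive transcription. As a hedged sketch, not a reconstruction of the slide, a form of the free energy functional commonly used in the information-theoretic bounded rationality literature is shown below. Q and P are the slide's prior and posterior; U (utility) and \beta (inverse temperature) are symbols assumed here. The first term is the expected utility and the second is the information cost of moving from Q to P, scaled by 1/\beta.

```latex
F_\beta[P] \;=\; \sum_{x} P(x)\,U(x) \;-\; \frac{1}{\beta} \sum_{x} P(x)\,\log\frac{P(x)}{Q(x)}
```

Under this assumed form, the functional is maximised for \beta > 0 and minimised for \beta < 0, which is one reading of "extremises".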

24 Optimal Choice & Certainty Equivalent Posterior (optimal choice distribution): Certainty Equivalent (value):
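The two formulas named on this slide are missing from the transcript. A minimal Python sketch follows, assuming the standard exponential-family solution P(x) = Q(x) exp(beta U(x)) / Z for the posterior and (1/beta) log Z for the certainty equivalent, with Z = sum over x of Q(x) exp(beta U(x)). The function name and the list-based representation are hypothetical, chosen only for illustration.

```python
import math

def posterior_and_value(prior, utility, beta):
    """Return (posterior, certainty_equivalent) for a finite choice set.

    Assumed forms (not taken from the slide):
      posterior(x) = prior(x) * exp(beta * utility(x)) / Z
      value        = log(Z) / beta
    """
    if beta == 0:
        # Limit beta -> 0: no deliberation, posterior equals prior,
        # and the value is the prior expected utility.
        value = sum(q * u for q, u in zip(prior, utility))
        return list(prior), value
    # Work in log space so large |beta * utility| does not overflow.
    log_w = [math.log(q) + beta * u if q > 0 else -math.inf
             for q, u in zip(prior, utility)]
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(lw - m) for lw in log_w))
    posterior = [math.exp(lw - log_z) for lw in log_w]
    return posterior, log_z / beta

# Two equally likely choices with utilities 0 and 1, beta = 1.
post, value = posterior_and_value([0.5, 0.5], [0.0, 1.0], beta=1.0)
print(post, value)
```

With a positive inverse temperature the posterior shifts probability towards the higher-utility choice, and the value lies between the prior expected utility and the maximum utility.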

27 Optimal Choice and Certainty Equivalent a) b) Inverse temperature encapsulates constraints.

28 Equivalence Behaviourally indistinguishable. Have same: prior, posterior, and certainty equivalent. Many ways to represent same choice behaviour.

30 Stochastic Choices Choosing amounts to rejection sampling [1]: [equation not preserved in the transcript] Essentially depending on inverse temperature, not on number of choices! RS is the most efficient sampler given knowledge. [1] Ortega, Braun & Tishby, 2014.
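The rejection sampling rule itself is not preserved in the transcript. The sketch below shows one common way to realise it, assuming beta > 0, a target posterior proportional to Q(x) exp(beta U(x)), and a known upper bound u_target on the utility. All names (rejection_sample, prior_sampler, u_target) are hypothetical.

```python
import math
import random

def rejection_sample(prior_sampler, utility, beta, u_target, rng, max_tries=100000):
    """Draw one choice from P(x) proportional to Q(x) * exp(beta * U(x)).

    Assumptions: beta > 0 and u_target >= U(x) for every x, so the
    acceptance probability never exceeds 1. Returns (choice, tries).
    """
    for tries in range(1, max_tries + 1):
        x = prior_sampler(rng)                       # propose from the prior Q
        accept_prob = math.exp(beta * (utility(x) - u_target))
        if rng.random() <= accept_prob:              # accept or reject
            return x, tries
    raise RuntimeError("no proposal accepted within max_tries")

# Four choices under a uniform prior, with utilities 0, 1/3, 2/3, 1.
utilities = [0.0, 1 / 3, 2 / 3, 1.0]
rng = random.Random(0)
choice, tries = rejection_sample(
    lambda r: r.randrange(4), lambda x: utilities[x],
    beta=2.0, u_target=1.0, rng=rng)
print(choice, tries)
```

In this sketch the acceptance rate is the prior expectation of exp(beta (U(x) - u_target)), so the expected number of proposals is governed by beta and the utility gap rather than by how many choices exist, in line with the slide's remark.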

32 Stochastic Choices II We can sample directly from an equivalent decision problem. Interpretation RS from equivalent problem requires equalisation.

33 Comparison
Objective Function    Expected Utility   Free Energy
Rationality Paradigm  perfect            bounded
Domain size           small              large
Search strategy       deterministic      randomised
Search scope          complete           incomplete
Functional form       linear             non linear
Utility sensitivity   first moment       all moments

34 Sequential Decisions: Definition, construction, and recursions

35 Classical Decision Trees [Figure: decision tree with max, min and expectation (E) nodes] Planning in classical decision trees depends on the type of system making the move in each node. Can we generalise them?

36 Certainty Equivalent/Value Classical values can be approximated. [Figure: panels a) to d), different choices of inverse temperatures, from low to high]
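The claim that classical values can be approximated can be checked numerically. The sketch below assumes the certainty equivalent has the log-sum-exp form (1/beta) log sum_x Q(x) exp(beta U(x)), which is not shown in the transcript, and evaluates it for a strongly negative, a zero, and a strongly positive inverse temperature. The function name and the example numbers are hypothetical.

```python
import math

def certainty_equivalent(prior, utility, beta):
    """Assumed form: (1/beta) * log(sum_x Q(x) * exp(beta * U(x)))."""
    if beta == 0:
        # Limit beta -> 0 is the ordinary expectation.
        return sum(q * u for q, u in zip(prior, utility))
    pairs = [(q, beta * u) for q, u in zip(prior, utility) if q > 0]
    m = max(s for _, s in pairs)                 # shift for numerical stability
    z = sum(q * math.exp(s - m) for q, s in pairs)
    return (m + math.log(z)) / beta

prior = [1 / 3, 1 / 3, 1 / 3]
utility = [0.0, 1.0, 2.0]
for beta in (-50.0, 0.0, 50.0):
    # Strongly negative beta approaches min, zero gives the mean,
    # strongly positive beta approaches max.
    print(beta, certainty_equivalent(prior, utility, beta))
```

Under this assumption a single functional form interpolates between the min, expectation and max operators of classical decision trees as the inverse temperature varies.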

37 Bounded Rational Decision Trees [Figure: panels a) and b), tree nodes labelled E, max, min] Free energy functional: building block for game trees; inverse temperatures are node specific; solved using forward sampling, not dynamic programming.

38 Construction of Decision Tree [Figure: panels a) to d)] Start from single step. Apply equivalence transformations until desired tree is obtained.

39 Map of Decision Rules [Figure: map with regions labelled anti rational; rational; risk seeking, cooperative; risk averse, adversarial; Minimax; Expectimax] Inverse temperatures determine the decision rules. Vary inverse temperatures to choose the decision rule.

40 Recursive Equations Free Energy: Certainty Equivalent: Posterior:
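The recursive equations are not preserved in the transcript. As an illustration only, the sketch below assumes that each internal node carries its own inverse temperature and a prior over its children, that leaves carry utilities, and that a node's certainty equivalent is the log-sum-exp of its children's values. The tree encoding (dicts with "beta" and "children") and the function name soft_value are hypothetical.

```python
import math

def soft_value(node):
    """Certainty equivalent of a tree node, computed bottom-up.

    A leaf is a number (its utility). An internal node is a dict with
    "beta" (node-specific inverse temperature) and "children", a list
    of (prior_probability, child_node) pairs.
    """
    if isinstance(node, (int, float)):
        return float(node)
    beta = node["beta"]
    children = [(q, soft_value(child)) for q, child in node["children"] if q > 0]
    if beta == 0:
        return sum(q * v for q, v in children)       # expectation node
    m = max(beta * v for _, v in children)           # shift for stability
    z = sum(q * math.exp(beta * v - m) for q, v in children)
    return (m + math.log(z)) / beta

# Two chance nodes (beta = 0) below a root whose beta is varied.
chance_a = {"beta": 0.0, "children": [(0.5, 0.0), (0.5, 4.0)]}   # mean 2.0
chance_b = {"beta": 0.0, "children": [(0.5, 1.0), (0.5, 2.0)]}   # mean 1.5
for root_beta in (100.0, -100.0):
    root = {"beta": root_beta, "children": [(0.5, chance_a), (0.5, chance_b)]}
    print(root_beta, soft_value(root))
```

With a large positive root beta the result is close to the expectimax value, and with a large negative one it is close to the value under an adversarial root. The sketch uses exhaustive recursion only to make the values visible, whereas the slides describe solving such trees by forward sampling.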

41 Recursive Rejection Sampling Recursively sample a path. Equalise every time the inverse temperature changes. Implicitly evaluates feasibility of trajectories.

42 Summary Problem: large scale decision spaces. Key ideas: use a statistical description of choices; exploit a stochastic computational model. Properties: blurs the boundary between learning and planning; massively parallelisable; controllable complexity; implements a rich class of decision rules.

43 Thank you!

44 Bonus Material...

45 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!

50 Interrupted Reasoning Any time optimisation: Computation reduces uncertainty of solution! Utility and cost of searching are commensurable.

52 The Rational Adaptive Robobug Design: Horizon: 120 seconds; Frequency: 1 Hz; Observations: 2; Actions: 2. Size of decision tree: Nodes: 10^72; Atoms in the world: 10^50. Computation (@ 10^16 FLOPS): Time: 10^50 years; Age of the universe: 10^10 years. SEU theory is completely intractable!
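The order of magnitude of the tree size can be checked with a few lines of arithmetic. The sketch below assumes a branching factor of observations times actions per time step and, as a deliberately simple cost model not stated on the slide, one floating-point operation per node. All variable names are hypothetical.

```python
import math

horizon_steps = 120 * 1            # 120 seconds at 1 Hz
branching = 2 * 2                  # 2 observations x 2 actions per step
nodes = branching ** horizon_steps # exact integer, 4**120

log10_nodes = math.log10(nodes)    # exponent of the node count
flops = 10 ** 16                   # machine speed from the slide
seconds_per_year = 365.25 * 24 * 3600
# Assumption: one floating-point operation per node.
years = nodes / flops / seconds_per_year

print(f"nodes ~ 10^{log10_nodes:.1f}")
print(f"years ~ 10^{math.log10(years):.1f}")
```

Under this cost model the node count agrees with the slide's 10^72. The time estimate depends on the assumed work per node, so it need not match the slide's figure exactly, but in any case it exceeds the age of the universe by dozens of orders of magnitude.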

53 Relation Energy Information Desiderata for costs: functions of probability; additive; monotonic. Analogy: Isothermal Transformation (Before, After). Result [1]: Conversion Factor (real valued). Work is reduction of uncertainty! Utility = Work = (Unit Conv.) x (Reduction of Uncertainty). [1] Ortega, P.A. & Braun, D.A., 2011

54 Energy Information Trade Off Analogy: Isothermal Transformation [Figure: Prior State, Posterior State, Realisation] Prior Work differs from Realised Work.

57 Energy Information Trade Off Analogy: Isothermal Transformation [Figure: Prior State, Posterior State, Realisation] Knowledge shapes the perceived work!

60 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before knowledge of realisation.

61 Lesson from Thermodynamics No matter how hard we try, there's always a loss of energy.

Information, Utility & Bounded Rationality

Information, Utility & Bounded Rationality Information, Utility & Bounded Rationality Pedro A. Ortega and Daniel A. Braun Department of Engineering, University of Cambridge Trumpington Street, Cambridge, CB2 PZ, UK {dab54,pao32}@cam.ac.uk Abstract.

More information

arxiv: v1 [stat.ml] 21 Dec 2015

arxiv: v1 [stat.ml] 21 Dec 2015 Information-Theoretic Bounded Rationality arxiv:1512.06789v1 [stat.ml] 21 Dec 2015 Pedro A. Ortega University of Pennsylvania Philadelphia, PA 19104, USA Daniel A. Braun Max Planck Institute for Intelligent

More information

Free Energy and the Generalized Optimality Equations for Sequential Decision Making

Free Energy and the Generalized Optimality Equations for Sequential Decision Making JMLR: Workshop and Conference Proceedings vol: 0, 202 EWRL 202 Free Energy and the Generalized Optimality Equations for Sequential Decision Making Pedro A. Ortega Ma Planck Institute for Biological Cybernetics

More information

Rationality and Uncertainty

Rationality and Uncertainty Rationality and Uncertainty Based on papers by Itzhak Gilboa, Massimo Marinacci, Andy Postlewaite, and David Schmeidler Warwick Aug 23, 2013 Risk and Uncertainty Dual use of probability: empirical frequencies

More information

Evaluation for Pacman. CS 188: Artificial Intelligence Fall Iterative Deepening. α-β Pruning Example. α-β Pruning Pseudocode.

Evaluation for Pacman. CS 188: Artificial Intelligence Fall Iterative Deepening. α-β Pruning Example. α-β Pruning Pseudocode. CS 188: Artificial Intelligence Fall 2008 Evaluation for Pacman Lecture 7: Expectimax Search 9/18/2008 [DEMO: thrashing, smart ghosts] Dan Klein UC Berkeley Many slides over the course adapted from either

More information

CS 188: Artificial Intelligence Fall 2008

CS 188: Artificial Intelligence Fall 2008 CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore 1 1 Evaluation for

More information

Conditional and Dynamic Preferences

Conditional and Dynamic Preferences Conditional and Dynamic Preferences How can we Understand Risk in a Dynamic Setting? Samuel Drapeau Joint work with Hans Föllmer Humboldt University Berlin Jena - March 17th 2009 Samuel Drapeau Joint work

More information

Questions in Decision Theory

Questions in Decision Theory Questions in Decision Theory Itzhak Gilboa June 15, 2011 Gilboa () Questions in Decision Theory June 15, 2011 1 / 18 History Pascal and Bernoulli Gilboa () Questions in Decision Theory June 15, 2011 2

More information

Value-Ordering and Discrepancies. Ciaran McCreesh and Patrick Prosser

Value-Ordering and Discrepancies. Ciaran McCreesh and Patrick Prosser Value-Ordering and Discrepancies Maintaining Arc Consistency (MAC) Achieve (generalised) arc consistency (AC3, etc). If we have a domain wipeout, backtrack. If all domains have one value, we re done. Pick

More information

Reinforcement Learning

Reinforcement Learning 1 Reinforcement Learning Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Plan 1 Why reinforcement learning? Where does this theory come from? Markov decision

More information

6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2

6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2 6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2 Daron Acemoglu and Asu Ozdaglar MIT October 14, 2009 1 Introduction Outline Mixed Strategies Existence of Mixed Strategy Nash Equilibrium

More information

16.4 Multiattribute Utility Functions

16.4 Multiattribute Utility Functions 285 Normalized utilities The scale of utilities reaches from the best possible prize u to the worst possible catastrophe u Normalized utilities use a scale with u = 0 and u = 1 Utilities of intermediate

More information

Announcements. CS 188: Artificial Intelligence Spring Mini-Contest Winners. Today. GamesCrafters. Adversarial Games

Announcements. CS 188: Artificial Intelligence Spring Mini-Contest Winners. Today. GamesCrafters. Adversarial Games CS 188: Artificial Intelligence Spring 2009 Lecture 7: Expectimax Search 2/10/2009 John DeNero UC Berkeley Slides adapted from Dan Klein, Stuart Russell or Andrew Moore Announcements Written Assignment

More information

Logic, Knowledge Representation and Bayesian Decision Theory

Logic, Knowledge Representation and Bayesian Decision Theory Logic, Knowledge Representation and Bayesian Decision Theory David Poole University of British Columbia Overview Knowledge representation, logic, decision theory. Belief networks Independent Choice Logic

More information

Lecture 14: Introduction to Decision Making

Lecture 14: Introduction to Decision Making Lecture 14: Introduction to Decision Making Preferences Utility functions Maximizing exected utility Value of information Actions and consequences So far, we have focused on ways of modeling a stochastic,

More information

Contents Quantum-like Paradigm Classical (Kolmogorovian) and Quantum (Born) Probability

Contents Quantum-like Paradigm Classical (Kolmogorovian) and Quantum (Born) Probability 1 Quantum-like Paradigm... 1 1.1 Applications of Mathematical Apparatus of QM Outside ofphysics... 1 1.2 Irreducible Quantum Randomness, Copenhagen Interpretation... 2 1.3 Quantum Reductionism in Biology

More information

Announcements. CS 188: Artificial Intelligence Fall Adversarial Games. Computing Minimax Values. Evaluation Functions. Recap: Resource Limits

Announcements. CS 188: Artificial Intelligence Fall Adversarial Games. Computing Minimax Values. Evaluation Functions. Recap: Resource Limits CS 188: Artificial Intelligence Fall 2009 Lecture 7: Expectimax Search 9/17/2008 Announcements Written 1: Search and CSPs is up Project 2: Multi-agent Search is up Want a partner? Come to the front after

More information

CS 188: Artificial Intelligence Fall Announcements

CS 188: Artificial Intelligence Fall Announcements CS 188: Artificial Intelligence Fall 2009 Lecture 7: Expectimax Search 9/17/2008 Dan Klein UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore 1 Announcements Written

More information

Ambiguity and the Centipede Game

Ambiguity and the Centipede Game Ambiguity and the Centipede Game Jürgen Eichberger, Simon Grant and David Kelsey Heidelberg University, Australian National University, University of Exeter. University of Exeter. June 2018 David Kelsey

More information

Systems Engineering - Decision Analysis in Engineering Michael Havbro Faber Aalborg University, Denmark

Systems Engineering - Decision Analysis in Engineering Michael Havbro Faber Aalborg University, Denmark Risk and Safety Management Aalborg University, Esbjerg, 2017 Systems Engineering - Decision Analysis in Engineering Michael Havbro Faber Aalborg University, Denmark 1/50 M. H. Faber Systems Engineering,

More information

Zero-Sum Games Public Strategies Minimax Theorem and Nash Equilibria Appendix. Zero-Sum Games. Algorithmic Game Theory.

Zero-Sum Games Public Strategies Minimax Theorem and Nash Equilibria Appendix. Zero-Sum Games. Algorithmic Game Theory. Public Strategies Minimax Theorem and Nash Equilibria Appendix 2013 Public Strategies Minimax Theorem and Nash Equilibria Appendix Definition Definition A zero-sum game is a strategic game, in which for

More information

Learning Energy-Based Models of High-Dimensional Data

Learning Energy-Based Models of High-Dimensional Data Learning Energy-Based Models of High-Dimensional Data Geoffrey Hinton Max Welling Yee-Whye Teh Simon Osindero www.cs.toronto.edu/~hinton/energybasedmodelsweb.htm Discovering causal structure as a goal

More information

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning

Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands

More information

Preferences for Randomization and Anticipation

Preferences for Randomization and Anticipation Preferences for Randomization and Anticipation Yosuke Hashidate Abstract In decision theory, it is not generally postulated that decision makers randomize their choices. In contrast, in real life, even

More information

CS 4100 // artificial intelligence. Recap/midterm review!

CS 4100 // artificial intelligence. Recap/midterm review! CS 4100 // artificial intelligence instructor: byron wallace Recap/midterm review! Attribution: many of these slides are modified versions of those distributed with the UC Berkeley CS188 materials Thanks

More information

A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games

A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games Dimitrios P. Tsomocos Dimitris Voliotis Abstract In our model, we treat a market game where traders

More information

CSE 573: Artificial Intelligence

CSE 573: Artificial Intelligence CSE 573: Artificial Intelligence Autumn 2010 Lecture 5: Expectimax Search 10/14/2008 Luke Zettlemoyer Most slides over the course adapted from either Dan Klein, Stuart Russell or Andrew Moore 1 Announcements

More information

Decision Theory Intro: Preferences and Utility

Decision Theory Intro: Preferences and Utility Decision Theory Intro: Preferences and Utility CPSC 322 Lecture 29 March 22, 2006 Textbook 9.5 Decision Theory Intro: Preferences and Utility CPSC 322 Lecture 29, Slide 1 Lecture Overview Recap Decision

More information

Two-Stage-Partitional Representation and Dynamic Consistency 1

Two-Stage-Partitional Representation and Dynamic Consistency 1 Two-Stage-Partitional Representation and Dynamic Consistency 1 Suguru Ito June 5 th, 2015 Abstract In this paper, we study an individual who faces a three-period decision problem when she accepts the partitional

More information

LINEARLY SOLVABLE OPTIMAL CONTROL

LINEARLY SOLVABLE OPTIMAL CONTROL CHAPTER 6 LINEARLY SOLVABLE OPTIMAL CONTROL K. Dvijotham 1 and E. Todorov 2 1 Computer Science & Engineering, University of Washington, Seattle 2 Computer Science & Engineering and Applied Mathematics,

More information

Analysis of Algorithms. Unit 5 - Intractable Problems

Analysis of Algorithms. Unit 5 - Intractable Problems Analysis of Algorithms Unit 5 - Intractable Problems 1 Intractable Problems Tractable Problems vs. Intractable Problems Polynomial Problems NP Problems NP Complete and NP Hard Problems 2 In this unit we

More information

Recitation 7: Uncertainty. Xincheng Qiu

Recitation 7: Uncertainty. Xincheng Qiu Econ 701A Fall 2018 University of Pennsylvania Recitation 7: Uncertainty Xincheng Qiu (qiux@sas.upenn.edu 1 Expected Utility Remark 1. Primitives: in the basic consumer theory, a preference relation is

More information

The Planning-by-Inference view on decision making and goal-directed behaviour

The Planning-by-Inference view on decision making and goal-directed behaviour The Planning-by-Inference view on decision making and goal-directed behaviour Marc Toussaint Machine Learning & Robotics Lab FU Berlin marc.toussaint@fu-berlin.de Nov. 14, 2011 1/20 pigeons 2/20 pigeons

More information

Microeconomics. 2. Game Theory

Microeconomics. 2. Game Theory Microeconomics 2. Game Theory Alex Gershkov http://www.econ2.uni-bonn.de/gershkov/gershkov.htm 18. November 2008 1 / 36 Dynamic games Time permitting we will cover 2.a Describing a game in extensive form

More information

MFM Practitioner Module: Risk & Asset Allocation. John Dodson. February 18, 2015

MFM Practitioner Module: Risk & Asset Allocation. John Dodson. February 18, 2015 MFM Practitioner Module: Risk & Asset Allocation February 18, 2015 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance

More information

CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm

CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm NAME: SID#: Login: Sec: 1 CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm You have 80 minutes. The exam is closed book, closed notes except a one-page crib sheet, basic calculators only.

More information

Modeling Bounded Rationality of Agents During Interactions

Modeling Bounded Rationality of Agents During Interactions Interactive Decision Theory and Game Theory: Papers from the 2 AAAI Workshop (WS--3) Modeling Bounded Rationality of Agents During Interactions Qing Guo and Piotr Gmytrasiewicz Department of Computer Science

More information

Thermodynamics as a theory of decision-making with information processing costs

Thermodynamics as a theory of decision-making with information processing costs Thermodynamics as a theory of decision-making with information processing costs Pedro A. Ortega and Daniel A. Braun July 31, 2012 Abstract Perfectly rational decision-makers maximize expected utility,

More information

Part IB Optimisation

Part IB Optimisation Part IB Optimisation Theorems Based on lectures by F. A. Fischer Notes taken by Dexter Chua Easter 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after

More information

Learning in Zero-Sum Team Markov Games using Factored Value Functions

Learning in Zero-Sum Team Markov Games using Factored Value Functions Learning in Zero-Sum Team Markov Games using Factored Value Functions Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 27708 mgl@cs.duke.edu Ronald Parr Department of Computer

More information

Bayesian Optimization

Bayesian Optimization Practitioner Course: Portfolio October 15, 2008 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance efficient frontier

More information

Last update: April 15, Rational decisions. CMSC 421: Chapter 16. CMSC 421: Chapter 16 1

Last update: April 15, Rational decisions. CMSC 421: Chapter 16. CMSC 421: Chapter 16 1 Last update: April 15, 2010 Rational decisions CMSC 421: Chapter 16 CMSC 421: Chapter 16 1 Outline Rational preferences Utilities Money Multiattribute utilities Decision networks Value of information CMSC

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Uncertainty & Probabilities & Bandits Daniel Hennes 16.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Uncertainty Probability

More information

arxiv: v1 [cs.ai] 16 Feb 2010

arxiv: v1 [cs.ai] 16 Feb 2010 Convergence of the Bayesian Control Rule Pedro A. Ortega Daniel A. Braun Dept. of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK peortega@dcc.uchile.cl dab54@cam.ac.uk arxiv:1002.3086v1 [cs.ai]

More information

Hierarchical State Abstractions for Decision-Making Problems with Computational Constraints

Hierarchical State Abstractions for Decision-Making Problems with Computational Constraints Hierarchical State Abstractions for Decision-Making Problems with Computational Constraints Daniel T. Larsson, Daniel Braun, Panagiotis Tsiotras Abstract In this semi-tutorial paper, we first review the

More information

Rational preferences. Rational decisions. Outline. Rational preferences contd. Maximizing expected utility. Preferences

Rational preferences. Rational decisions. Outline. Rational preferences contd. Maximizing expected utility. Preferences Rational preferences Rational decisions hapter 16 Idea: preferences of a rational agent must obey constraints. Rational preferences behavior describable as maximization of expected utility onstraints:

More information

Birgit Rudloff Operations Research and Financial Engineering, Princeton University

Birgit Rudloff Operations Research and Financial Engineering, Princeton University TIME CONSISTENT RISK AVERSE DYNAMIC DECISION MODELS: AN ECONOMIC INTERPRETATION Birgit Rudloff Operations Research and Financial Engineering, Princeton University brudloff@princeton.edu Alexandre Street

More information

Optimal Control of Network Structure Growth

Optimal Control of Network Structure Growth Optimal Control of Networ Structure Growth Domini Thalmeier Radboud University Nijmegen Nijmegen, the Netherlands d.thalmeier@science.ru.nl Vicenç Gómez Universitat Pompeu Fabra Barcelona, Spain vicen.gomez@upf.edu

More information

Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS

Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Many slides adapted from Jur van den Berg Outline POMDPs Separation Principle / Certainty Equivalence Locally Optimal

More information

Choice under Uncertainty

Choice under Uncertainty In the Name of God Sharif University of Technology Graduate School of Management and Economics Microeconomics 2 44706 (1394-95 2 nd term) Group 2 Dr. S. Farshad Fatemi Chapter 6: Choice under Uncertainty

More information

Partially Observable Markov Decision Processes (POMDPs)

Partially Observable Markov Decision Processes (POMDPs) Partially Observable Markov Decision Processes (POMDPs) Geoff Hollinger Sequential Decision Making in Robotics Spring, 2011 *Some media from Reid Simmons, Trey Smith, Tony Cassandra, Michael Littman, and

More information

Separable Utility Functions in Dynamic Economic Models

Separable Utility Functions in Dynamic Economic Models Separable Utility Functions in Dynamic Economic Models Karel Sladký 1 Abstract. In this note we study properties of utility functions suitable for performance evaluation of dynamic economic models under

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Formal models of interaction Daniel Hennes 27.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Taxonomy of domains Models of

More information

Action-Independent Subjective Expected Utility without States of the World

Action-Independent Subjective Expected Utility without States of the World heoretical Economics Letters, 013, 3, 17-1 http://dxdoiorg/10436/tel01335a004 Published Online September 013 (http://wwwscirporg/journal/tel) Action-Independent Subjective Expected Utility without States

More information

Game Theory with Information: Introducing the Witsenhausen Intrinsic Model

Game Theory with Information: Introducing the Witsenhausen Intrinsic Model Game Theory with Information: Introducing the Witsenhausen Intrinsic Model Michel De Lara and Benjamin Heymann Cermics, École des Ponts ParisTech France École des Ponts ParisTech March 15, 2017 Information

More information

An Axiomatic Model of Reference Dependence under Uncertainty. Yosuke Hashidate

An Axiomatic Model of Reference Dependence under Uncertainty. Yosuke Hashidate An Axiomatic Model of Reference Dependence under Uncertainty Yosuke Hashidate Abstract This paper presents a behavioral characteization of a reference-dependent choice under uncertainty in the Anscombe-Aumann

More information

Markov decision processes and interval Markov chains: exploiting the connection

Markov decision processes and interval Markov chains: exploiting the connection Markov decision processes and interval Markov chains: exploiting the connection Mingmei Teo Supervisors: Prof. Nigel Bean, Dr Joshua Ross University of Adelaide July 10, 2013 Intervals and interval arithmetic

More information

Deliberation-aware Responder in Multi-Proposer Ultimatum Game

Deliberation-aware Responder in Multi-Proposer Ultimatum Game Deliberation-aware Responder in Multi-Proposer Ultimatum Game Marko Ruman, František Hůla, Miroslav Kárný, and Tatiana V. Guy Department of Adaptive Systems Institute of Information Theory and Automation

More information

14.12 Game Theory Lecture Notes Theory of Choice

14.12 Game Theory Lecture Notes Theory of Choice 14.12 Game Theory Lecture Notes Theory of Choice Muhamet Yildiz (Lecture 2) 1 The basic theory of choice We consider a set X of alternatives. Alternatives are mutually exclusive in the sense that one cannot

More information

This is designed for one 75-minute lecture using Games and Information. October 3, 2006

This is designed for one 75-minute lecture using Games and Information. October 3, 2006 This is designed for one 75-minute lecture using Games and Information. October 3, 2006 1 7 Moral Hazard: Hidden Actions PRINCIPAL-AGENT MODELS The principal (or uninformed player) is the player who has

More information

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti

MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti 1 MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti Historical background 2 Original motivation: animal learning Early

More information
