Information Theoretic Bounded Rationality. Pedro A. Ortega University of Pennsylvania
|
|
- Agnes Randall
- 5 years ago
- Views:
Transcription
1 Information Theoretic Bounded Rationality Pedro A. Ortega University of Pennsylvania May 2015
2 Goal of this Talk In large scale decision problems, agents choose as Posterior Utility Inverse Temp. Why is this so? And how is this used? Prior
3 Pick a box. 50/50 25/75
4 Pick a box. 50/50?
5 Pick a box. 50/50? Unknown probabilities affect net value.
6 Pick the largest.
7 Pick the largest.
8 Pick the largest. Large scale: you cannot do better than random search.
9 Preliminaries History, variational principles and more May 2015
10 An active area of research... ermodynamics of Prediction Polani Still Regularisation in Control Peters Mitter Maccheroni Linearly Solvable Control Todorov KL Control Theory Neumann Theodorou Active Inference Bierkens Robust Decision Making Braun Bounded Rationality Rustichini Tishby Wiegerink Friston Palfrey Kappen Fleming Sims Gomez Marinacci Rational Inattention Variational Preferences Duality Control Estimation Guerra Information Flow in Perception Act McKelvey Quantum Mechanics for Control Wolpert Ortega Quantal Response Equilibria
11 History of Decision Theory Bernoulli 1738 Utility Knight Ramsey 1921 Uncertainty & Risk de Finetti Subjective Probability von Neumann Morgenstern 1944 (Objective) Expected Utility Savage 1954 Subjective Expected Utility New theory of bounded rational decisions. Simon 1957 Bounded Rationality?
12 What is boundedness? NO YES Suboptimal Heuristics Approximation Meta reasoning Intractability Model Uncertainty Risk Sensitivity ToM Bounded = Subject to information constraints.
13 Bounded Rational Decisions Assumptions, derivation and main properties May 2015
14 Planning and Searching
15 Planning and Searching
16 Planning and Searching
17 Planning and Searching Planning is searching for a good subset.
18 Straw Pile Assumptions Primitive operation: search A given B. Very large Negligible probability Irreducible Parallelisable
19 Straw Pile Assumptions Primitive operation: search A given B. Very large Negligible probability Irreducible Parallelisable Golden Rule: No Integration! Large scale planning is searching a set inside a pile of straw.
20 Search Costs
21 Search Costs Continuity Additivity Monotonicity < +
22 Cost of Probabilistic Choice a) Total Cost = Expected Cost + Waste b)
23 Bounded Rational Decision Making Deliberation transforms prior Q into posterior P. Agent extremises the free energy functional:... Trades off utility & information costs. Models: large scale choice spaces, trust & risk sensitivity, and other information constraints.
24 Optimal Choice & Certainty Equivalent Posterior (optimal choice distribution): Certainty Equivalent (value):
25 Optimal Choice and Certainty Equivalent a) b)
26 Optimal Choice and Certainty Equivalent a) b)
27 Optimal Choice and Certainty Equivalent a) b) Inverse temperature encapsulates constraints.
28 Equivalence Behaviourally indistinguishable. Have same: prior, posterior, and certainty equivalent. Many ways to represent same choice behaviour.
29 Stochastic Choices Choosing amounts to rejection sampling:
30 Stochastic Choices Choosing amounts to rejection sampling [1]: Essentially depending on inverse temperature, not on number of choices! RS is the most efficient sampler given knowledge. 1.8 [1] Ortega, Braun & Tishby, 2014.
31 Stochastic Choices II We can sample directly from an equivalent decision problem. Interpretation
32 Stochastic Choices II We can sample directly from an equivalent decision problem. Interpretation RS from equivalent problem requires equalisation.
33 Comparison Objective Function Rationality Paradigm Domain size Search strategy Search scope Functional form Utility sensitivity Expected Utility Free Energy perfect bounded small large deterministic randomised complete incomplete linear non linear first moment all moments
34 Sequential Decisions Definition, construction, and recursions May 2015
35 Classical Decision Trees max max E E min max max max E E min min Planning in classical decision trees depends on the type of system making the move in each node. Can we generalise them?
36 Certainty Equivalent/Value Classical values can be approximated b) a) 1 2 low 1 c) d) high 0 Different choices of inverse temperatures.
37 Bounded Rational Decision Trees a) E b) max E min Free energy functional: building block for game trees inverse temperatures are node specific solved using forward sampling, not dynamic programming
38 Construction of Decision Tree a) Start Startfrom fromsingle step. single step. b) c) d) Apply Applyequivalence equivalencetransformations transformationsuntil untildesired desiredtree treeisisobtained. obtained.
39 Map of Decision Rules a) b) e) anti rational rational risk seeking, cooperative d Minimax c) Expectimax c b d) risk averse, adversarial a Vary Inverse inverse temperatures temperatures determine to choose the decision decision rules. rule.
40 Recursive Equations Free Energy: Certainty Equivalent: Posterior:
41 Recursive Rejection Sampling Recursively sample a path. Equalise every time the inverse temperature changes. Implicitly evaluates feasibility of trajectories.
42 Summary Problem: Large scale decision spaces. Key ideas: Use statistical description of choices. Exploit stochastic computational model. Properties: Blurs boundary learning planning. Massively parallelisable. Controllable complexity. Implements rich class of decision rules.
43 Thank you!
44 Bonus Material...
45 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!
46 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!
47 Meta Reasoning Reason about costs of reasoning Penalise search complexity Much harder problem! Meta reasoning is not permitted!
48 Interrupted Reasoning Any time optimisation: Utility and cost of searching are commensurable! commensurable.
49 Interrupted Reasoning Any time optimisation: Computation Computationreduces reduces uncertainty of solution! uncertainty of solution! Utility and cost of searching are commensurable! commensurable.
50 Interrupted Reasoning Any time optimisation: Computation Computationreduces reduces uncertainty of solution! uncertainty of solution! Utility and cost of searching are commensurable! commensurable.
51 The Rational Adaptive Robobug Design: Horizon: 120 seconds Frequency: 1 Hz Observations: 2 Actions: 2 Size of decision tree: Nodes: 1072 Atoms in the world: 1050 Computation (@ 1016 FLOPS): Time: 1050 years Age of the universe: 1010 years SEU theory is completely intractable!
52 The Rational Adaptive Robobug Design: Horizon: 120 seconds Frequency: 1 Hz Observations: 2 Actions: 2 Size of decision tree: Nodes: 1072 Atoms in the world: 1050 Computation (@ 1016 FLOPS): Time: 1050 years Age of the universe: 1010 years SEU theory is completely intractable!
53 Relation Energy Information Desiderata for costs functions of probability additive monotonic Analogy: Isothermal Transformation Result [1] Before Conversion ConversionFactor Factor (real valued) (real valued) After Work Workisisreduction reduction ofofuncertainty! uncertainty! Utility = Work = (Unit Conv.) x (Reduction of Uncertainty). [1] Ortega, P.A. & Braun, D.A., 2011
54 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3
55 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3
56 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Prior Work differs from Realised Work. 3
57 Energy Information Trade Off Analogy: Isothermal Transformation Prior State Posterior State 2 1 Realisation Knowledge shapes the perceived work! 3
58 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before Realisation.
59 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before Realisation.
60 Knowledge shapes perceived work Relation between work before/after knowledge: Free Energy = Work before knowledge of realisation.
61 Lesson from Thermodynamics No matter how hard we try, there's always a loss of energy.
Information, Utility & Bounded Rationality
Information, Utility & Bounded Rationality Pedro A. Ortega and Daniel A. Braun Department of Engineering, University of Cambridge Trumpington Street, Cambridge, CB2 PZ, UK {dab54,pao32}@cam.ac.uk Abstract.
More informationarxiv: v1 [stat.ml] 21 Dec 2015
Information-Theoretic Bounded Rationality arxiv:1512.06789v1 [stat.ml] 21 Dec 2015 Pedro A. Ortega University of Pennsylvania Philadelphia, PA 19104, USA Daniel A. Braun Max Planck Institute for Intelligent
More informationFree Energy and the Generalized Optimality Equations for Sequential Decision Making
JMLR: Workshop and Conference Proceedings vol: 0, 202 EWRL 202 Free Energy and the Generalized Optimality Equations for Sequential Decision Making Pedro A. Ortega Ma Planck Institute for Biological Cybernetics
More informationRationality and Uncertainty
Rationality and Uncertainty Based on papers by Itzhak Gilboa, Massimo Marinacci, Andy Postlewaite, and David Schmeidler Warwick Aug 23, 2013 Risk and Uncertainty Dual use of probability: empirical frequencies
More informationEvaluation for Pacman. CS 188: Artificial Intelligence Fall Iterative Deepening. α-β Pruning Example. α-β Pruning Pseudocode.
CS 188: Artificial Intelligence Fall 2008 Evaluation for Pacman Lecture 7: Expectimax Search 9/18/2008 [DEMO: thrashing, smart ghosts] Dan Klein UC Berkeley Many slides over the course adapted from either
More informationCS 188: Artificial Intelligence Fall 2008
CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore 1 1 Evaluation for
More informationConditional and Dynamic Preferences
Conditional and Dynamic Preferences How can we Understand Risk in a Dynamic Setting? Samuel Drapeau Joint work with Hans Föllmer Humboldt University Berlin Jena - March 17th 2009 Samuel Drapeau Joint work
More informationQuestions in Decision Theory
Questions in Decision Theory Itzhak Gilboa June 15, 2011 Gilboa () Questions in Decision Theory June 15, 2011 1 / 18 History Pascal and Bernoulli Gilboa () Questions in Decision Theory June 15, 2011 2
More informationValue-Ordering and Discrepancies. Ciaran McCreesh and Patrick Prosser
Value-Ordering and Discrepancies Maintaining Arc Consistency (MAC) Achieve (generalised) arc consistency (AC3, etc). If we have a domain wipeout, backtrack. If all domains have one value, we re done. Pick
More informationReinforcement Learning
1 Reinforcement Learning Chris Watkins Department of Computer Science Royal Holloway, University of London July 27, 2015 2 Plan 1 Why reinforcement learning? Where does this theory come from? Markov decision
More information6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2
6.207/14.15: Networks Lecture 10: Introduction to Game Theory 2 Daron Acemoglu and Asu Ozdaglar MIT October 14, 2009 1 Introduction Outline Mixed Strategies Existence of Mixed Strategy Nash Equilibrium
More information16.4 Multiattribute Utility Functions
285 Normalized utilities The scale of utilities reaches from the best possible prize u to the worst possible catastrophe u Normalized utilities use a scale with u = 0 and u = 1 Utilities of intermediate
More informationAnnouncements. CS 188: Artificial Intelligence Spring Mini-Contest Winners. Today. GamesCrafters. Adversarial Games
CS 188: Artificial Intelligence Spring 2009 Lecture 7: Expectimax Search 2/10/2009 John DeNero UC Berkeley Slides adapted from Dan Klein, Stuart Russell or Andrew Moore Announcements Written Assignment
More informationLogic, Knowledge Representation and Bayesian Decision Theory
Logic, Knowledge Representation and Bayesian Decision Theory David Poole University of British Columbia Overview Knowledge representation, logic, decision theory. Belief networks Independent Choice Logic
More informationLecture 14: Introduction to Decision Making
Lecture 14: Introduction to Decision Making Preferences Utility functions Maximizing exected utility Value of information Actions and consequences So far, we have focused on ways of modeling a stochastic,
More informationContents Quantum-like Paradigm Classical (Kolmogorovian) and Quantum (Born) Probability
1 Quantum-like Paradigm... 1 1.1 Applications of Mathematical Apparatus of QM Outside ofphysics... 1 1.2 Irreducible Quantum Randomness, Copenhagen Interpretation... 2 1.3 Quantum Reductionism in Biology
More informationAnnouncements. CS 188: Artificial Intelligence Fall Adversarial Games. Computing Minimax Values. Evaluation Functions. Recap: Resource Limits
CS 188: Artificial Intelligence Fall 2009 Lecture 7: Expectimax Search 9/17/2008 Announcements Written 1: Search and CSPs is up Project 2: Multi-agent Search is up Want a partner? Come to the front after
More informationCS 188: Artificial Intelligence Fall Announcements
CS 188: Artificial Intelligence Fall 2009 Lecture 7: Expectimax Search 9/17/2008 Dan Klein UC Berkeley Many slides over the course adapted from either Stuart Russell or Andrew Moore 1 Announcements Written
More informationAmbiguity and the Centipede Game
Ambiguity and the Centipede Game Jürgen Eichberger, Simon Grant and David Kelsey Heidelberg University, Australian National University, University of Exeter. University of Exeter. June 2018 David Kelsey
More informationSystems Engineering - Decision Analysis in Engineering Michael Havbro Faber Aalborg University, Denmark
Risk and Safety Management Aalborg University, Esbjerg, 2017 Systems Engineering - Decision Analysis in Engineering Michael Havbro Faber Aalborg University, Denmark 1/50 M. H. Faber Systems Engineering,
More informationZero-Sum Games Public Strategies Minimax Theorem and Nash Equilibria Appendix. Zero-Sum Games. Algorithmic Game Theory.
Public Strategies Minimax Theorem and Nash Equilibria Appendix 2013 Public Strategies Minimax Theorem and Nash Equilibria Appendix Definition Definition A zero-sum game is a strategic game, in which for
More informationLearning Energy-Based Models of High-Dimensional Data
Learning Energy-Based Models of High-Dimensional Data Geoffrey Hinton Max Welling Yee-Whye Teh Simon Osindero www.cs.toronto.edu/~hinton/energybasedmodelsweb.htm Discovering causal structure as a goal
More informationComplexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning
Complexity of stochastic branch and bound methods for belief tree search in Bayesian reinforcement learning Christos Dimitrakakis Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
More informationPreferences for Randomization and Anticipation
Preferences for Randomization and Anticipation Yosuke Hashidate Abstract In decision theory, it is not generally postulated that decision makers randomize their choices. In contrast, in real life, even
More informationCS 4100 // artificial intelligence. Recap/midterm review!
CS 4100 // artificial intelligence instructor: byron wallace Recap/midterm review! Attribution: many of these slides are modified versions of those distributed with the UC Berkeley CS188 materials Thanks
More informationA Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games
A Necessary and Sufficient Condition for Convergence of Statistical to Strategic Equilibria of Market Games Dimitrios P. Tsomocos Dimitris Voliotis Abstract In our model, we treat a market game where traders
More informationCSE 573: Artificial Intelligence
CSE 573: Artificial Intelligence Autumn 2010 Lecture 5: Expectimax Search 10/14/2008 Luke Zettlemoyer Most slides over the course adapted from either Dan Klein, Stuart Russell or Andrew Moore 1 Announcements
More informationDecision Theory Intro: Preferences and Utility
Decision Theory Intro: Preferences and Utility CPSC 322 Lecture 29 March 22, 2006 Textbook 9.5 Decision Theory Intro: Preferences and Utility CPSC 322 Lecture 29, Slide 1 Lecture Overview Recap Decision
More informationTwo-Stage-Partitional Representation and Dynamic Consistency 1
Two-Stage-Partitional Representation and Dynamic Consistency 1 Suguru Ito June 5 th, 2015 Abstract In this paper, we study an individual who faces a three-period decision problem when she accepts the partitional
More informationLINEARLY SOLVABLE OPTIMAL CONTROL
CHAPTER 6 LINEARLY SOLVABLE OPTIMAL CONTROL K. Dvijotham 1 and E. Todorov 2 1 Computer Science & Engineering, University of Washington, Seattle 2 Computer Science & Engineering and Applied Mathematics,
More informationAnalysis of Algorithms. Unit 5 - Intractable Problems
Analysis of Algorithms Unit 5 - Intractable Problems 1 Intractable Problems Tractable Problems vs. Intractable Problems Polynomial Problems NP Problems NP Complete and NP Hard Problems 2 In this unit we
More informationRecitation 7: Uncertainty. Xincheng Qiu
Econ 701A Fall 2018 University of Pennsylvania Recitation 7: Uncertainty Xincheng Qiu (qiux@sas.upenn.edu 1 Expected Utility Remark 1. Primitives: in the basic consumer theory, a preference relation is
More informationThe Planning-by-Inference view on decision making and goal-directed behaviour
The Planning-by-Inference view on decision making and goal-directed behaviour Marc Toussaint Machine Learning & Robotics Lab FU Berlin marc.toussaint@fu-berlin.de Nov. 14, 2011 1/20 pigeons 2/20 pigeons
More informationMicroeconomics. 2. Game Theory
Microeconomics 2. Game Theory Alex Gershkov http://www.econ2.uni-bonn.de/gershkov/gershkov.htm 18. November 2008 1 / 36 Dynamic games Time permitting we will cover 2.a Describing a game in extensive form
More informationMFM Practitioner Module: Risk & Asset Allocation. John Dodson. February 18, 2015
MFM Practitioner Module: Risk & Asset Allocation February 18, 2015 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance
More informationCS 188 Introduction to Fall 2007 Artificial Intelligence Midterm
NAME: SID#: Login: Sec: 1 CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm You have 80 minutes. The exam is closed book, closed notes except a one-page crib sheet, basic calculators only.
More informationModeling Bounded Rationality of Agents During Interactions
Interactive Decision Theory and Game Theory: Papers from the 2 AAAI Workshop (WS--3) Modeling Bounded Rationality of Agents During Interactions Qing Guo and Piotr Gmytrasiewicz Department of Computer Science
More informationThermodynamics as a theory of decision-making with information processing costs
Thermodynamics as a theory of decision-making with information processing costs Pedro A. Ortega and Daniel A. Braun July 31, 2012 Abstract Perfectly rational decision-makers maximize expected utility,
More informationPart IB Optimisation
Part IB Optimisation Theorems Based on lectures by F. A. Fischer Notes taken by Dexter Chua Easter 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after
More informationLearning in Zero-Sum Team Markov Games using Factored Value Functions
Learning in Zero-Sum Team Markov Games using Factored Value Functions Michail G. Lagoudakis Department of Computer Science Duke University Durham, NC 27708 mgl@cs.duke.edu Ronald Parr Department of Computer
More informationBayesian Optimization
Practitioner Course: Portfolio October 15, 2008 No introduction to portfolio optimization would be complete without acknowledging the significant contribution of the Markowitz mean-variance efficient frontier
More informationLast update: April 15, Rational decisions. CMSC 421: Chapter 16. CMSC 421: Chapter 16 1
Last update: April 15, 2010 Rational decisions CMSC 421: Chapter 16 CMSC 421: Chapter 16 1 Outline Rational preferences Utilities Money Multiattribute utilities Decision networks Value of information CMSC
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Uncertainty & Probabilities & Bandits Daniel Hennes 16.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Uncertainty Probability
More informationarxiv: v1 [cs.ai] 16 Feb 2010
Convergence of the Bayesian Control Rule Pedro A. Ortega Daniel A. Braun Dept. of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK peortega@dcc.uchile.cl dab54@cam.ac.uk arxiv:1002.3086v1 [cs.ai]
More informationHierarchical State Abstractions for Decision-Making Problems with Computational Constraints
Hierarchical State Abstractions for Decision-Making Problems with Computational Constraints Daniel T. Larsson, Daniel Braun, Panagiotis Tsiotras Abstract In this semi-tutorial paper, we first review the
More informationRational preferences. Rational decisions. Outline. Rational preferences contd. Maximizing expected utility. Preferences
Rational preferences Rational decisions hapter 16 Idea: preferences of a rational agent must obey constraints. Rational preferences behavior describable as maximization of expected utility onstraints:
More informationBirgit Rudloff Operations Research and Financial Engineering, Princeton University
TIME CONSISTENT RISK AVERSE DYNAMIC DECISION MODELS: AN ECONOMIC INTERPRETATION Birgit Rudloff Operations Research and Financial Engineering, Princeton University brudloff@princeton.edu Alexandre Street
More informationOptimal Control of Network Structure Growth
Optimal Control of Networ Structure Growth Domini Thalmeier Radboud University Nijmegen Nijmegen, the Netherlands d.thalmeier@science.ru.nl Vicenç Gómez Universitat Pompeu Fabra Barcelona, Spain vicen.gomez@upf.edu
More informationPartially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS
Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Many slides adapted from Jur van den Berg Outline POMDPs Separation Principle / Certainty Equivalence Locally Optimal
More informationChoice under Uncertainty
In the Name of God Sharif University of Technology Graduate School of Management and Economics Microeconomics 2 44706 (1394-95 2 nd term) Group 2 Dr. S. Farshad Fatemi Chapter 6: Choice under Uncertainty
More informationPartially Observable Markov Decision Processes (POMDPs)
Partially Observable Markov Decision Processes (POMDPs) Geoff Hollinger Sequential Decision Making in Robotics Spring, 2011 *Some media from Reid Simmons, Trey Smith, Tony Cassandra, Michael Littman, and
More informationSeparable Utility Functions in Dynamic Economic Models
Separable Utility Functions in Dynamic Economic Models Karel Sladký 1 Abstract. In this note we study properties of utility functions suitable for performance evaluation of dynamic economic models under
More informationGrundlagen der Künstlichen Intelligenz
Grundlagen der Künstlichen Intelligenz Formal models of interaction Daniel Hennes 27.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Taxonomy of domains Models of
More informationAction-Independent Subjective Expected Utility without States of the World
heoretical Economics Letters, 013, 3, 17-1 http://dxdoiorg/10436/tel01335a004 Published Online September 013 (http://wwwscirporg/journal/tel) Action-Independent Subjective Expected Utility without States
More informationGame Theory with Information: Introducing the Witsenhausen Intrinsic Model
Game Theory with Information: Introducing the Witsenhausen Intrinsic Model Michel De Lara and Benjamin Heymann Cermics, École des Ponts ParisTech France École des Ponts ParisTech March 15, 2017 Information
More informationAn Axiomatic Model of Reference Dependence under Uncertainty. Yosuke Hashidate
An Axiomatic Model of Reference Dependence under Uncertainty Yosuke Hashidate Abstract This paper presents a behavioral characteization of a reference-dependent choice under uncertainty in the Anscombe-Aumann
More informationMarkov decision processes and interval Markov chains: exploiting the connection
Markov decision processes and interval Markov chains: exploiting the connection Mingmei Teo Supervisors: Prof. Nigel Bean, Dr Joshua Ross University of Adelaide July 10, 2013 Intervals and interval arithmetic
More informationDeliberation-aware Responder in Multi-Proposer Ultimatum Game
Deliberation-aware Responder in Multi-Proposer Ultimatum Game Marko Ruman, František Hůla, Miroslav Kárný, and Tatiana V. Guy Department of Adaptive Systems Institute of Information Theory and Automation
More information14.12 Game Theory Lecture Notes Theory of Choice
14.12 Game Theory Lecture Notes Theory of Choice Muhamet Yildiz (Lecture 2) 1 The basic theory of choice We consider a set X of alternatives. Alternatives are mutually exclusive in the sense that one cannot
More informationThis is designed for one 75-minute lecture using Games and Information. October 3, 2006
This is designed for one 75-minute lecture using Games and Information. October 3, 2006 1 7 Moral Hazard: Hidden Actions PRINCIPAL-AGENT MODELS The principal (or uninformed player) is the player who has
More informationMARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti
1 MARKOV DECISION PROCESSES (MDP) AND REINFORCEMENT LEARNING (RL) Versione originale delle slide fornita dal Prof. Francesco Lo Presti Historical background 2 Original motivation: animal learning Early
More informationGreat Expectations. Part I: On the Customizability of Generalized Expected Utility*
Great Expectations. Part I: On the Customizability of Generalized Expected Utility* Francis C. Chu and Joseph Y. Halpern Department of Computer Science Cornell University Ithaca, NY 14853, U.S.A. Email:
More informationSmooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics
Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics Sergiu Hart August 2016 Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics Sergiu Hart Center for the Study of Rationality
More informationDe Finetti s ultimate failure. Krzysztof Burdzy University of Washington
De Finetti s ultimate failure Krzysztof Burdzy University of Washington Does philosophy matter? Global temperatures will rise by 1 degree in 20 years with probability 80%. Reading suggestions Probability
More informationarxiv: v1 [cs.ai] 10 Mar 2016
Hierarchical Linearly-Solvable Markov Decision Problems (Extended Version with Supplementary Material) Anders Jonsson and Vicenç Gómez Department of Information and Communication Technologies Universitat
More informationUniform Sources of Uncertainty for Subjective Probabilities and
Uniform Sources of Uncertainty for Subjective Probabilities and Ambiguity Mohammed Abdellaoui (joint with Aurélien Baillon and Peter Wakker) 1 Informal Central in this work will be the recent finding of
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More informationOptimization Tools in an Uncertain Environment
Optimization Tools in an Uncertain Environment Michael C. Ferris University of Wisconsin, Madison Uncertainty Workshop, Chicago: July 21, 2008 Michael Ferris (University of Wisconsin) Stochastic optimization
More informationDecision Making Under Uncertainty. First Masterclass
Decision Making Under Uncertainty First Masterclass 1 Outline A short history Decision problems Uncertainty The Ellsberg paradox Probability as a measure of uncertainty Ignorance 2 Probability Blaise Pascal
More informationA Polynomial-time Nash Equilibrium Algorithm for Repeated Games
A Polynomial-time Nash Equilibrium Algorithm for Repeated Games Michael L. Littman mlittman@cs.rutgers.edu Rutgers University Peter Stone pstone@cs.utexas.edu The University of Texas at Austin Main Result
More informationMarkov Decision Processes
Markov Decision Processes Noel Welsh 11 November 2010 Noel Welsh () Markov Decision Processes 11 November 2010 1 / 30 Annoucements Applicant visitor day seeks robot demonstrators for exciting half hour
More informationRobust Partially Observable Markov Decision Processes
Submitted to manuscript Robust Partially Observable Markov Decision Processes Mohammad Rasouli 1, Soroush Saghafian 2 1 Management Science and Engineering, Stanford University, Palo Alto, CA 2 Harvard
More informationStructural Uncertainty in Health Economic Decision Models
Structural Uncertainty in Health Economic Decision Models Mark Strong 1, Hazel Pilgrim 1, Jeremy Oakley 2, Jim Chilcott 1 December 2009 1. School of Health and Related Research, University of Sheffield,
More informationOn Prediction and Planning in Partially Observable Markov Decision Processes with Large Observation Sets
On Prediction and Planning in Partially Observable Markov Decision Processes with Large Observation Sets Pablo Samuel Castro pcastr@cs.mcgill.ca McGill University Joint work with: Doina Precup and Prakash
More informationProbabilistic Subjective Expected Utility. Pavlo R. Blavatskyy
Probabilistic Subjective Expected Utility Pavlo R. Blavatskyy Institute of Public Finance University of Innsbruck Universitaetsstrasse 15 A-6020 Innsbruck Austria Phone: +43 (0) 512 507 71 56 Fax: +43
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationWhaddya know? Bayesian and Frequentist approaches to inverse problems
Whaddya know? Bayesian and Frequentist approaches to inverse problems Philip B. Stark Department of Statistics University of California, Berkeley Inverse Problems: Practical Applications and Advanced Analysis
More informationSimple Counter-terrorism Decision
A Comparative Analysis of PRA and Intelligent Adversary Methods for Counterterrorism Risk Management Greg Parnell US Military Academy Jason R. W. Merrick Virginia Commonwealth University Simple Counter-terrorism
More informationRecursive Ambiguity and Machina s Examples
Recursive Ambiguity and Machina s Examples David Dillenberger Uzi Segal May 0, 0 Abstract Machina (009, 0) lists a number of situations where standard models of ambiguity aversion are unable to capture
More informationOnline solution of the average cost Kullback-Leibler optimization problem
Online solution of the average cost Kullback-Leibler optimization problem Joris Bierkens Radboud University Nijmegen j.bierkens@science.ru.nl Bert Kappen Radboud University Nijmegen b.kappen@science.ru.nl
More informationWhy on earth did you do that?!
Why on earth did you do that?! A simple model of obfuscation-based choice Caspar 9-5-2018 Chorus http://behave.tbm.tudelft.nl/ Delft University of Technology Challenge the future Background: models of
More informationQuantum Probability in Cognition. Ryan Weiss 11/28/2018
Quantum Probability in Cognition Ryan Weiss 11/28/2018 Overview Introduction Classical vs Quantum Probability Brain Information Processing Decision Making Conclusion Introduction Quantum probability in
More informationQuantum Games. Quantum Strategies in Classical Games. Presented by Yaniv Carmeli
Quantum Games Quantum Strategies in Classical Games Presented by Yaniv Carmeli 1 Talk Outline Introduction Game Theory Why quantum games? PQ Games PQ penny flip 2x2 Games Quantum strategies 2 Game Theory
More informationWeek 5 Consumer Theory (Jehle and Reny, Ch.2)
Week 5 Consumer Theory (Jehle and Reny, Ch.2) Serçin ahin Yldz Technical University 23 October 2012 Duality Expenditure and Consumer Preferences Choose (p 0, u 0 ) R n ++ R +, and evaluate E there to obtain
More informationLearning and Risk Aversion
Learning and Risk Aversion Carlos Oyarzun Texas A&M University Rajiv Sarin Texas A&M University January 2007 Abstract Learning describes how behavior changes in response to experience. We consider how
More informationA Decentralized Approach to Multi-agent Planning in the Presence of Constraints and Uncertainty
2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China A Decentralized Approach to Multi-agent Planning in the Presence of
More informationDecision Trees. Gavin Brown
Decision Trees Gavin Brown Every Learning Method has Limitations Linear model? KNN? SVM? Explain your decisions Sometimes we need interpretable results from our techniques. How do you explain the above
More informationFundamentals in Optimal Investments. Lecture I
Fundamentals in Optimal Investments Lecture I + 1 Portfolio choice Portfolio allocations and their ordering Performance indices Fundamentals in optimal portfolio choice Expected utility theory and its
More informationIncremental Policy Learning: An Equilibrium Selection Algorithm for Reinforcement Learning Agents with Common Interests
Incremental Policy Learning: An Equilibrium Selection Algorithm for Reinforcement Learning Agents with Common Interests Nancy Fulda and Dan Ventura Department of Computer Science Brigham Young University
More informationSecond-Order Expected Utility
Second-Order Expected Utility Simon Grant Ben Polak Tomasz Strzalecki Preliminary version: November 2009 Abstract We present two axiomatizations of the Second-Order Expected Utility model in the context
More information6.207/14.15: Networks Lecture 11: Introduction to Game Theory 3
6.207/14.15: Networks Lecture 11: Introduction to Game Theory 3 Daron Acemoglu and Asu Ozdaglar MIT October 19, 2009 1 Introduction Outline Existence of Nash Equilibrium in Infinite Games Extensive Form
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationContracts under Asymmetric Information
Contracts under Asymmetric Information 1 I Aristotle, economy (oiko and nemo) and the idea of exchange values, subsequently adapted by Ricardo and Marx. Classical economists. An economy consists of a set
More informationSpanning Tree Problem of Uncertain Network
Spanning Tree Problem of Uncertain Network Jin Peng Institute of Uncertain Systems Huanggang Normal University Hubei 438000, China Email: pengjin01@tsinghuaorgcn Shengguo Li College of Mathematics & Computer
More informationY. Xiang, Inference with Uncertain Knowledge 1
Inference with Uncertain Knowledge Objectives Why must agent use uncertain knowledge? Fundamentals of Bayesian probability Inference with full joint distributions Inference with Bayes rule Bayesian networks
More informationThe role of predictive uncertainty in the operational management of reservoirs
118 Evolving Water Resources Systems: Understanding, Predicting and Managing Water Society Interactions Proceedings of ICWRS014, Bologna, Italy, June 014 (IAHS Publ. 364, 014). The role of predictive uncertainty
More informationConfronting Theory with Experimental Data and vice versa. Lecture I Choice under risk. The Norwegian School of Economics Nov 7-11, 2011
Confronting Theory with Experimental Data and vice versa Lecture I Choice under risk The Norwegian School of Economics Nov 7-11, 2011 Preferences Let X be some set of alternatives (consumption set). Formally,
More informationPerfect simulation of repulsive point processes
Perfect simulation of repulsive point processes Mark Huber Department of Mathematical Sciences, Claremont McKenna College 29 November, 2011 Mark Huber, CMC Perfect simulation of repulsive point processes
More informationCourse 16:198:520: Introduction To Artificial Intelligence Lecture 13. Decision Making. Abdeslam Boularias. Wednesday, December 7, 2016
Course 16:198:520: Introduction To Artificial Intelligence Lecture 13 Decision Making Abdeslam Boularias Wednesday, December 7, 2016 1 / 45 Overview We consider probabilistic temporal models where the
More informationChance-constrained optimization with tight confidence bounds
Chance-constrained optimization with tight confidence bounds Mark Cannon University of Oxford 25 April 2017 1 Outline 1. Problem definition and motivation 2. Confidence bounds for randomised sample discarding
More information