Low-Regret for Online Decision-Making

Size: px
Start display at page:

Download "Low-Regret for Online Decision-Making"

Transcription

1 Siddhartha Banerjee and Alberto Vera November 6, /17

2 Introduction Compensated Coupling Bayes Selector Conclusion Motivation 2/17

3 Introduction Compensated Coupling Bayes Selector Conclusion Motivation Offline is a host who knew all the arrivals ahead of time 2/17

4 Example: Online Packing (NRM) Resources with initial budgets B i, i = 1,..., d. 3/17

5 Example: Online Packing (NRM) Resources with initial budgets B i, i = 1,..., d. Type j wants a ij units of resource i. 3/17

6 Example: Online Packing (NRM) Resources with initial budgets B i, i = 1,..., d. Type j wants a ij units of resource i. Type j arrives with probability p j and has reward r j, j = 1,..., n. 3/17

7 Example: Online Packing (NRM) Resources with initial budgets B i, i = 1,..., d. Type j wants a ij units of resource i. Type j arrives with probability p j and has reward r j, j = 1,..., n. If we had full information, solve max r x s.t. Ax B (Budget Consumption) x Z(T ) (Respect Arrivals) x N n. Paradigm: two agents solving this problem, Online and Offline 3/17

8 Traditional Approach and Our Contributions State State S1 S1 S2 S2 S3 S3 S4 Offline S4 Online Offline Time Time t1 t2 t3 t4 t1 t2 t3 t4 4/17

9 Traditional Approach and Our Contributions State State S1 S1 S2 S2 S3 S3 S4 Offline S4 Online Offline Time Time t1 t2 t3 t4 t1 t2 t3 t4 Our Contributions Compensated Coupling: for analyzing algorithms Bayes Selector: meta-algorithm based on estimators Applications: constant regret 4/17

10 Application: Constant Regret for Online Packing V on and V off collected values, Regret := V off V on. 5/17

11 Application: Constant Regret for Online Packing V on and V off collected values, Regret := V off V on. V off = max{r x : Ax B, 0 x Z(T )}. 5/17

12 Application: Constant Regret for Online Packing V on and V off collected values, Regret := V off V on. V off = max{r x : Ax B, 0 x Z(T )}. Theorem (Informal) We provide a natural policy with constant expected regret for online packing problems. 5/17

13 Application: Constant Regret for Online Packing V on and V off collected values, Regret := V off V on. V off = max{r x : Ax B, 0 x Z(T )}. Theorem (Informal) We provide a natural policy with constant expected regret for online packing problems. Regret depends on d, n, κ(a), and the distribution of Z(t) Independent of T and initial budget levels B Correlated and time-varying distributions 1 o(1) approximation whenever T and B are large 5/17

14 Related Work Constant Regret for Multi-Secretary Arlotto and Gurvich (2017) Network Revenue Management / Online Packing: Asymptotic analysis O( T ) regret Talluri and Van Ryzin (2004) Later o( T ), O(1) conditional, O(1) Poisson Reiman and Wang (2008),Jasin and Kumar (2012),Bumpensanti and Wang (2018) Prophet: worst case distribution (competitive ratio) Ratio 1 e 1 for maximum of i.i.d. Hill and Kertz (1982) Best ratio 0.745, used in posted pricing Correa et al. (2017) Online Matching Manshadi et al (2012) Approximate Dynamic Programming Powell (2011) 6/17

15 Framework for Online Decision-Making Arrival type J t J presented at t = T, T 1,..., 1 7/17

16 Framework for Online Decision-Making Arrival type J t J presented at t = T, T 1,..., 1 States s S and actions a A 7/17

17 Framework for Online Decision-Making Arrival type J t J presented at t = T, T 1,..., 1 States s S and actions a A Rewards R(a, s, j) 7/17

18 Framework for Online Decision-Making Arrival type J t J presented at t = T, T 1,..., 1 States s S and actions a A Rewards R(a, s, j) Transitions T (a, s, j) 7/17

19 Framework for Online Decision-Making Arrival type J t J presented at t = T, T 1,..., 1 States s S and actions a A Rewards R(a, s, j) Transitions T (a, s, j) Examples: Online Packing (NRM), Multi-Secretary, Online Matching, Ski-Rental, etc. 7/17

20 Example: Multi-Secretary One resource (d = 1) with capacity B. Goal: choose B arrivals to maximize their sum t = t is time to go 8/17

21 Value of Offline V off (t, s)[ω] is the optimal offline value starting on s with t periods to go on sample path ω. 9/17

22 Value of Offline V off (t, s)[ω] is the optimal offline value starting on s with t periods to go on sample path ω t B = B = /17

23 Value of Offline V off (t, s)[ω] is the optimal offline value starting on s with t periods to go on sample path ω t B = B = It satisfies the Bellman Equation V off (t, s)[ω] = max{r(a, s, J t )+V off (t 1, T (a, s, J t ))[ω] : a A}. 9/17

24 Offline Follows Online Online uses a policy π on. We want Offline to follow π on 10/17

25 Offline Follows Online Online uses a policy π on. We want Offline to follow π on What if Offline is not satisfied with a = π on (t, s, j)? V off (t, s)[ω] > R(a, s, J t ) + V off (t 1, T (a, s, J t ))[ω]. 10/17

26 Offline Follows Online Online uses a policy π on. We want Offline to follow π on What if Offline is not satisfied with a = π on (t, s, j)? V off (t, s)[ω] > R(a, s, J t ) + V off (t 1, T (a, s, J t ))[ω]. We need to compensate him by paying V off (t, s)[ω] R(a, s, J t ) V off (t 1, T (a, s, J t ))[ω]. 10/17

27 Offline Follows Online Online uses a policy π on. We want Offline to follow π on What if Offline is not satisfied with a = π on (t, s, j)? V off (t, s)[ω] > R(a, s, J t ) + V off (t 1, T (a, s, J t ))[ω]. We need to compensate him by paying V off (t, s)[ω] R(a, s, J t ) V off (t 1, T (a, s, J t ))[ω] t = /17

28 Compensations How big are the compensations? 11/17

29 Compensations How big are the compensations? r max for multi-secretary and A j r max dr max for packing. 11/17

30 Compensations How big are the compensations? r max for multi-secretary and A j r max dr max for packing. Q(t, s, a) is the disagreement set: ω Ω such that Offline loses optimality by taking a. 11/17

31 Compensations How big are the compensations? r max for multi-secretary and A j r max dr max for packing. Q(t, s, a) is the disagreement set: ω Ω such that Offline loses optimality by taking a. Lemma (Compensated Coupling) Fix any policy for Online. Let S t be Online s state over time and A t Online s action over time. If r is a bound on the compensations, then [ ] E[Regret] re P[Q(t, S t, A t )] t 11/17

32 Compensated Coupling View State State Compensation S1 S1 S2 S2 S3 S3 S4 Online Offline S4 Online Offline 1 Time Time t1 t2 t3 t4 t1 t 1 t2 t3 t4 12/17

33 Compensated Coupling View State State Compensation S1 S1 S2 S2 S3 S3 S4 Online Offline S4 Online Offline 1 Time Time State t1 t2 t3 t4 Compensation t1 t 1 t2 t3 t4 Compensation Online Offline 2 t1 t 1 t2 t3 t4 Time 12/17

34 Compensated Coupling View State State Compensation S1 S1 S2 S2 S3 S3 S4 Online Offline S4 Online Offline 1 Time Time t1 t2 t3 t4 t1 t 1 t2 t3 t4 State Compensation State Compensation Compensation S1 Compensation Compensation S2 S3 Online Offline 2 S4 Online Offline 3 t1 t 1 t2 t3 t4 Time t1 t2 t3 t4 Time 12/17

35 The Bayes Selector Policy Compensated coupling works for any policy. Can we turn it around and find a policy? 13/17

36 The Bayes Selector Policy Compensated coupling works for any policy. Can we turn it around and find a policy? Alice is making decisions and we want to copy her. An oracle tells us she picked red/blue with probabilities 0.7/0.3. What to do? 13/17

37 The Bayes Selector Policy Compensated coupling works for any policy. Can we turn it around and find a policy? Alice is making decisions and we want to copy her. An oracle tells us she picked red/blue with probabilities 0.7/0.3. What to do? Q(t, s, a) is the disagreement set: ω Ω such that Offline loses optimality by taking a. 13/17

38 The Bayes Selector Policy Compensated coupling works for any policy. Can we turn it around and find a policy? Alice is making decisions and we want to copy her. An oracle tells us she picked red/blue with probabilities 0.7/0.3. What to do? Q(t, s, a) is the disagreement set: ω Ω such that Offline loses optimality by taking a. Policy: at each time t, take action A t BS argmin{p[q(t, S t BS, a)] : a A}. 13/17

39 Regret Guarantee of Bayes Selector Ideal Policy: at time t, A t BS argmin{p[q(t, St BS, a)] : a A}. 14/17

40 Regret Guarantee of Bayes Selector Ideal Policy: at time t, A t BS argmin{p[q(t, St BS, a)] : a A}. Meta-Algorithm: Based on estimators ˆq(t, s, a) P[Q(t, s, a)] 14/17

41 Regret Guarantee of Bayes Selector Ideal Policy: at time t, A t BS argmin{p[q(t, St BS, a)] : a A}. Meta-Algorithm: Based on estimators ˆq(t, s, a) P[Q(t, s, a)] at time t, A t BS argmin{ˆq(t, St BS, a) : a A}. 14/17

42 Regret Guarantee of Bayes Selector Ideal Policy: at time t, A t BS argmin{p[q(t, St BS, a)] : a A}. Meta-Algorithm: Based on estimators ˆq(t, s, a) P[Q(t, s, a)] at time t, A t BS argmin{ˆq(t, St BS, a) : a A}. Theorem Let r be an upper bound on the compensations. The Bayes Selector achieves [ ] E[Regret] re ˆq(t, SBS, t A t BS) t 14/17

43 Regret Guarantee of Bayes Selector Ideal Policy: at time t, A t BS argmin{p[q(t, St BS, a)] : a A}. Meta-Algorithm: Based on estimators ˆq(t, s, a) P[Q(t, s, a)] at time t, A t BS argmin{ˆq(t, St BS, a) : a A}. Theorem Let r be an upper bound on the compensations. The Bayes Selector achieves [ ] E[Regret] re ˆq(t, SBS, t A t BS) t Theorem The Bayes Selector achieves constant regret for online packing and matching problems. 14/17

44 Estimators for Online Packing ˆq(t, s, a) P[Q(t, s, a)], with Q(t, s, a) is the disagreement set. Given Online s budget B t, Offline solves max{r x : Ax B t, 0 x Z(t)}. 15/17

45 Estimators for Online Packing ˆq(t, s, a) P[Q(t, s, a)], with Q(t, s, a) is the disagreement set. Given Online s budget B t, Offline solves max{r x : Ax B t, 0 x Z(t)}. Let X solve max{r x : Ax B t, 0 x E[Z(t)]} 15/17

46 Estimators for Online Packing ˆq(t, s, a) P[Q(t, s, a)], with Q(t, s, a) is the disagreement set. Given Online s budget B t, Offline solves max{r x : Ax B t, 0 x Z(t)}. Let X solve max{r x : Ax B t, 0 x E[Z(t)]} X j /E[Z j (t)] [0, 1] for all j. 15/17

47 Estimators for Online Packing ˆq(t, s, a) P[Q(t, s, a)], with Q(t, s, a) is the disagreement set. Given Online s budget B t, Offline solves max{r x : Ax B t, 0 x Z(t)}. Let X solve max{r x : Ax B t, 0 x E[Z(t)]} X j /E[Z j (t)] [0, 1] for all j. Accept j iff X j /E[Z j (t)] 1/2. 15/17

48 Estimators for Online Packing ˆq(t, s, a) P[Q(t, s, a)], with Q(t, s, a) is the disagreement set. Given Online s budget B t, Offline solves max{r x : Ax B t, 0 x Z(t)}. Let X solve max{r x : Ax B t, 0 x E[Z(t)]} X j /E[Z j (t)] [0, 1] for all j. Accept j iff X j /E[Z j (t)] 1/2. min{ˆq(t, B t, accept), ˆq(t, B t, reject)} exp( t(p j /2κ) 2 /25) 15/17

49 Numerical Results Bayes Selector Randomized Adaptive Policy Randomized Static Policy Regret Scaling 16/17

50 Conclusions and Future Directions Compensated coupling: A way to understand online decision-making 17/17

51 Conclusions and Future Directions Compensated coupling: A way to understand online decision-making Bayes Selector: Natural policy with a performance guarantee Bridge between estimation and optimization 17/17

52 Conclusions and Future Directions Compensated coupling: A way to understand online decision-making Bayes Selector: Natural policy with a performance guarantee Bridge between estimation and optimization Online packing and matching: the Bayes Selector achieves constant regret Future work: learning and multi-stage problems Promising directions: online convex optimization 17/17

53 Appendix: Details for Numerical Results Type j Valuation Resource ,2 1,2 p j /1

Allocating Resources, in the Future

Allocating Resources, in the Future Allocating Resources, in the Future Sid Banerjee School of ORIE May 3, 2018 Simons Workshop on Mathematical and Computational Challenges in Real-Time Decision Making online resource allocation: basic model......

More information

arxiv: v1 [cs.ds] 15 Jan 2019

arxiv: v1 [cs.ds] 15 Jan 2019 The Bayesian Prophet: A Low-Regret Framework for Online Decision Making arxiv:1901.05028v1 [cs.ds] 15 Jan 2019 ALBERTO VERA, Cornell University SIDDHARTHA BANERJEE, Cornell University Motivated by the

More information

arxiv: v3 [math.oc] 11 Dec 2018

arxiv: v3 [math.oc] 11 Dec 2018 A Re-solving Heuristic with Uniformly Bounded Loss for Network Revenue Management Pornpawee Bumpensanti, He Wang School of Industrial and Systems Engineering, Georgia Institute of echnology, Atlanta, GA

More information

Dynamic Optimization of Mobile Push Advertising Campaigns

Dynamic Optimization of Mobile Push Advertising Campaigns Submitted to Management Science manuscript Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However, use of a template

More information

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,

More information

An Online Algorithm for Maximizing Submodular Functions

An Online Algorithm for Maximizing Submodular Functions An Online Algorithm for Maximizing Submodular Functions Matthew Streeter December 20, 2007 CMU-CS-07-171 Daniel Golovin School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 This research

More information

Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies

Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies Reinforcement earning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies Presenter: Roi Ceren THINC ab, University of Georgia roi@ceren.net Prashant Doshi THINC ab, University

More information

Secretary Problems. Petropanagiotaki Maria. January MPLA, Algorithms & Complexity 2

Secretary Problems. Petropanagiotaki Maria. January MPLA, Algorithms & Complexity 2 January 15 2015 MPLA, Algorithms & Complexity 2 Simplest form of the problem 1 The candidates are totally ordered from best to worst with no ties. 2 The candidates arrive sequentially in random order.

More information

Dual Interpretations and Duality Applications (continued)

Dual Interpretations and Duality Applications (continued) Dual Interpretations and Duality Applications (continued) Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye (LY, Chapters

More information

A Dynamic Near-Optimal Algorithm for Online Linear Programming

A Dynamic Near-Optimal Algorithm for Online Linear Programming Submitted to Operations Research manuscript (Please, provide the manuscript number! Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the

More information

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 12: Approximate Mechanism Design in Multi-Parameter Bayesian Settings

CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 12: Approximate Mechanism Design in Multi-Parameter Bayesian Settings CS599: Algorithm Design in Strategic Settings Fall 2012 Lecture 12: Approximate Mechanism Design in Multi-Parameter Bayesian Settings Instructor: Shaddin Dughmi Administrivia HW1 graded, solutions on website

More information

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman Olin Business School, Washington University, St. Louis, MO 63130, USA

More information

Economic Growth: Lecture 8, Overlapping Generations

Economic Growth: Lecture 8, Overlapping Generations 14.452 Economic Growth: Lecture 8, Overlapping Generations Daron Acemoglu MIT November 20, 2018 Daron Acemoglu (MIT) Economic Growth Lecture 8 November 20, 2018 1 / 46 Growth with Overlapping Generations

More information

The Multi-Path Utility Maximization Problem

The Multi-Path Utility Maximization Problem The Multi-Path Utility Maximization Problem Xiaojun Lin and Ness B. Shroff School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 47906 {linx,shroff}@ecn.purdue.edu Abstract

More information

Bayesian Active Learning With Basis Functions

Bayesian Active Learning With Basis Functions Bayesian Active Learning With Basis Functions Ilya O. Ryzhov Warren B. Powell Operations Research and Financial Engineering Princeton University Princeton, NJ 08544, USA IEEE ADPRL April 13, 2011 1 / 29

More information

Lecture 2: Paging and AdWords

Lecture 2: Paging and AdWords Algoritmos e Incerteza (PUC-Rio INF2979, 2017.1) Lecture 2: Paging and AdWords March 20 2017 Lecturer: Marco Molinaro Scribe: Gabriel Homsi In this class we had a brief recap of the Ski Rental Problem

More information

Basics of reinforcement learning

Basics of reinforcement learning Basics of reinforcement learning Lucian Buşoniu TMLSS, 20 July 2018 Main idea of reinforcement learning (RL) Learn a sequential decision policy to optimize the cumulative performance of an unknown system

More information

CS261: Problem Set #3

CS261: Problem Set #3 CS261: Problem Set #3 Due by 11:59 PM on Tuesday, February 23, 2016 Instructions: (1) Form a group of 1-3 students. You should turn in only one write-up for your entire group. (2) Submission instructions:

More information

Competitive Equilibrium and the Welfare Theorems

Competitive Equilibrium and the Welfare Theorems Competitive Equilibrium and the Welfare Theorems Craig Burnside Duke University September 2010 Craig Burnside (Duke University) Competitive Equilibrium September 2010 1 / 32 Competitive Equilibrium and

More information

Online Learning Schemes for Power Allocation in Energy Harvesting Communications

Online Learning Schemes for Power Allocation in Energy Harvesting Communications Online Learning Schemes for Power Allocation in Energy Harvesting Communications Pranav Sakulkar and Bhaskar Krishnamachari Ming Hsieh Department of Electrical Engineering Viterbi School of Engineering

More information

A Robust Event-Triggered Consensus Strategy for Linear Multi-Agent Systems with Uncertain Network Topology

A Robust Event-Triggered Consensus Strategy for Linear Multi-Agent Systems with Uncertain Network Topology A Robust Event-Triggered Consensus Strategy for Linear Multi-Agent Systems with Uncertain Network Topology Amir Amini, Amir Asif, Arash Mohammadi Electrical and Computer Engineering,, Montreal, Canada.

More information

Field Exam: Advanced Theory

Field Exam: Advanced Theory Field Exam: Advanced Theory There are two questions on this exam, one for Econ 219A and another for Economics 206. Answer all parts for both questions. Exercise 1: Consider a n-player all-pay auction auction

More information

Rice University. Fall Semester Final Examination ECON501 Advanced Microeconomic Theory. Writing Period: Three Hours

Rice University. Fall Semester Final Examination ECON501 Advanced Microeconomic Theory. Writing Period: Three Hours Rice University Fall Semester Final Examination 007 ECON50 Advanced Microeconomic Theory Writing Period: Three Hours Permitted Materials: English/Foreign Language Dictionaries and non-programmable calculators

More information

arxiv: v2 [math.oc] 22 Dec 2017

arxiv: v2 [math.oc] 22 Dec 2017 Managing Appointment Booking under Customer Choices Nan Liu 1, Peter M. van de Ven 2, and Bo Zhang 3 arxiv:1609.05064v2 [math.oc] 22 Dec 2017 1 Operations Management Department, Carroll School of Management,

More information

Online Advance Admission Scheduling for Services, with Customer Preferences

Online Advance Admission Scheduling for Services, with Customer Preferences Submitted to Operations Research manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the

More information

Online bin packing 24.Januar 2008

Online bin packing 24.Januar 2008 Rob van Stee: Approximations- und Online-Algorithmen 1 Online bin packing 24.Januar 2008 Problem definition First Fit and other algorithms The asymptotic performance ratio Weighting functions Lower bounds

More information

Welfare Maximization with Friends-of-Friends Network Externalities

Welfare Maximization with Friends-of-Friends Network Externalities Welfare Maximization with Friends-of-Friends Network Externalities Extended version of a talk at STACS 2015, Munich Wolfgang Dvořák 1 joint work with: Sayan Bhattacharya 2, Monika Henzinger 1, Martin Starnberger

More information

Managing Appointment Scheduling under Patient Choices

Managing Appointment Scheduling under Patient Choices Submitted to manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However,

More information

Online algorithms for parallel job scheduling and strip packing Hurink, J.L.; Paulus, J.J.

Online algorithms for parallel job scheduling and strip packing Hurink, J.L.; Paulus, J.J. Online algorithms for parallel job scheduling and strip packing Hurink, J.L.; Paulus, J.J. Published: 01/01/007 Document Version Publisher s PDF, also known as Version of Record (includes final page, issue

More information

Assortment Optimization for Parallel Flights under a Multinomial Logit Choice Model with Cheapest Fare Spikes

Assortment Optimization for Parallel Flights under a Multinomial Logit Choice Model with Cheapest Fare Spikes Assortment Optimization for Parallel Flights under a Multinomial Logit Choice Model with Cheapest Fare Spikes Yufeng Cao, Anton Kleywegt, He Wang School of Industrial and Systems Engineering, Georgia Institute

More information

Lecture notes 2: Applications

Lecture notes 2: Applications Lecture notes 2: Applications Vincent Conitzer In this set of notes, we will discuss a number of problems of interest to computer scientists where linear/integer programming can be fruitfully applied.

More information

OPTIMALITY OF RANDOMIZED TRUNK RESERVATION FOR A PROBLEM WITH MULTIPLE CONSTRAINTS

OPTIMALITY OF RANDOMIZED TRUNK RESERVATION FOR A PROBLEM WITH MULTIPLE CONSTRAINTS OPTIMALITY OF RANDOMIZED TRUNK RESERVATION FOR A PROBLEM WITH MULTIPLE CONSTRAINTS Xiaofei Fan-Orzechowski Department of Applied Mathematics and Statistics State University of New York at Stony Brook Stony

More information

Game Theory. Monika Köppl-Turyna. Winter 2017/2018. Institute for Analytical Economics Vienna University of Economics and Business

Game Theory. Monika Köppl-Turyna. Winter 2017/2018. Institute for Analytical Economics Vienna University of Economics and Business Monika Köppl-Turyna Institute for Analytical Economics Vienna University of Economics and Business Winter 2017/2018 Static Games of Incomplete Information Introduction So far we assumed that payoff functions

More information

Lecture 10: Mechanism Design

Lecture 10: Mechanism Design Computational Game Theory Spring Semester, 2009/10 Lecture 10: Mechanism Design Lecturer: Yishay Mansour Scribe: Vera Vsevolozhsky, Nadav Wexler 10.1 Mechanisms with money 10.1.1 Introduction As we have

More information

Point Process Control

Point Process Control Point Process Control The following note is based on Chapters I, II and VII in Brémaud s book Point Processes and Queues (1981). 1 Basic Definitions Consider some probability space (Ω, F, P). A real-valued

More information

Week 6: Consumer Theory Part 1 (Jehle and Reny, Chapter 1)

Week 6: Consumer Theory Part 1 (Jehle and Reny, Chapter 1) Week 6: Consumer Theory Part 1 (Jehle and Reny, Chapter 1) Tsun-Feng Chiang* *School of Economics, Henan University, Kaifeng, China November 2, 2014 1 / 28 Primitive Notions 1.1 Primitive Notions Consumer

More information

Adaptive Rumor Spreading

Adaptive Rumor Spreading José Correa 1 Marcos Kiwi 1 Neil Olver 2 Alberto Vera 1 1 2 VU Amsterdam and CWI July 27, 2015 1/21 The situation 2/21 The situation 2/21 The situation 2/21 The situation 2/21 Introduction Rumors in social

More information

Logarithmic Regret in the Dynamic and Stochastic Knapsack Problem

Logarithmic Regret in the Dynamic and Stochastic Knapsack Problem Logarithmic Regret in the Dynamic and Stochastic Knapsack Problem Alessandro Arlotto The Fuqua School of Business, Duke University, Fuqua Drive, Durham, NC, 2778, alessandro.arlotto@duke.edu Xinchang Xie

More information

Bayesian Contextual Multi-armed Bandits

Bayesian Contextual Multi-armed Bandits Bayesian Contextual Multi-armed Bandits Xiaoting Zhao Joint Work with Peter I. Frazier School of Operations Research and Information Engineering Cornell University October 22, 2012 1 / 33 Outline 1 Motivating

More information

Department of Agricultural Economics. PhD Qualifier Examination. May 2009

Department of Agricultural Economics. PhD Qualifier Examination. May 2009 Department of Agricultural Economics PhD Qualifier Examination May 009 Instructions: The exam consists of six questions. You must answer all questions. If you need an assumption to complete a question,

More information

Assortment Optimization under the Multinomial Logit Model with Sequential Offerings

Assortment Optimization under the Multinomial Logit Model with Sequential Offerings Assortment Optimization under the Multinomial Logit Model with Sequential Offerings Nan Liu Carroll School of Management, Boston College, Chestnut Hill, MA 02467, USA nan.liu@bc.edu Yuhang Ma School of

More information

On the Tightness of an LP Relaxation for Rational Optimization and its Applications

On the Tightness of an LP Relaxation for Rational Optimization and its Applications OPERATIONS RESEARCH Vol. 00, No. 0, Xxxxx 0000, pp. 000 000 issn 0030-364X eissn 526-5463 00 0000 000 INFORMS doi 0.287/xxxx.0000.0000 c 0000 INFORMS Authors are encouraged to submit new papers to INFORMS

More information

where u is the decision-maker s payoff function over her actions and S is the set of her feasible actions.

where u is the decision-maker s payoff function over her actions and S is the set of her feasible actions. Seminars on Mathematics for Economics and Finance Topic 3: Optimization - interior optima 1 Session: 11-12 Aug 2015 (Thu/Fri) 10:00am 1:00pm I. Optimization: introduction Decision-makers (e.g. consumers,

More information

Optimization under Uncertainty: An Introduction through Approximation. Kamesh Munagala

Optimization under Uncertainty: An Introduction through Approximation. Kamesh Munagala Optimization under Uncertainty: An Introduction through Approximation Kamesh Munagala Contents I Stochastic Optimization 5 1 Weakly Coupled LP Relaxations 6 1.1 A Gentle Introduction: The Maximum Value

More information

CS261: A Second Course in Algorithms Lecture #11: Online Learning and the Multiplicative Weights Algorithm

CS261: A Second Course in Algorithms Lecture #11: Online Learning and the Multiplicative Weights Algorithm CS61: A Second Course in Algorithms Lecture #11: Online Learning and the Multiplicative Weights Algorithm Tim Roughgarden February 9, 016 1 Online Algorithms This lecture begins the third module of the

More information

Robust Partially Observable Markov Decision Processes

Robust Partially Observable Markov Decision Processes Submitted to manuscript Robust Partially Observable Markov Decision Processes Mohammad Rasouli 1, Soroush Saghafian 2 1 Management Science and Engineering, Stanford University, Palo Alto, CA 2 Harvard

More information

Managing Call Centers with Many Strategic Agents

Managing Call Centers with Many Strategic Agents Managing Call Centers with Many Strategic Agents Rouba Ibrahim, Kenan Arifoglu Management Science and Innovation, UCL YEQT Workshop - Eindhoven November 2014 Work Schedule Flexibility 86SwofwthewqBestwCompanieswtowWorkwForqwofferw

More information

Project Discussions: SNL/ADMM, MDP/Randomization, Quadratic Regularization, and Online Linear Programming

Project Discussions: SNL/ADMM, MDP/Randomization, Quadratic Regularization, and Online Linear Programming Project Discussions: SNL/ADMM, MDP/Randomization, Quadratic Regularization, and Online Linear Programming Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305,

More information

MATH4406 Assignment 5

MATH4406 Assignment 5 MATH4406 Assignment 5 Patrick Laub (ID: 42051392) October 7, 2014 1 The machine replacement model 1.1 Real-world motivation Consider the machine to be the entire world. Over time the creator has running

More information

Online (and Distributed) Learning with Information Constraints. Ohad Shamir

Online (and Distributed) Learning with Information Constraints. Ohad Shamir Online (and Distributed) Learning with Information Constraints Ohad Shamir Weizmann Institute of Science Online Algorithms and Learning Workshop Leiden, November 2014 Ohad Shamir Learning with Information

More information

Learning Optimal Online Advertising Portfolios with Periodic Budgets

Learning Optimal Online Advertising Portfolios with Periodic Budgets Learning Optimal Online Advertising Portfolios with Periodic Budgets Lennart Baardman Operations Research Center, MIT, Cambridge, MA 02139, baardman@mit.edu Elaheh Fata Department of Aeronautics and Astronautics,

More information

Generalization bounds

Generalization bounds Advanced Course in Machine Learning pring 200 Generalization bounds Handouts are jointly prepared by hie Mannor and hai halev-hwartz he problem of characterizing learnability is the most basic question

More information

Massachusetts Institute of Technology 6.854J/18.415J: Advanced Algorithms Friday, March 18, 2016 Ankur Moitra. Problem Set 6

Massachusetts Institute of Technology 6.854J/18.415J: Advanced Algorithms Friday, March 18, 2016 Ankur Moitra. Problem Set 6 Massachusetts Institute of Technology 6.854J/18.415J: Advanced Algorithms Friday, March 18, 2016 Ankur Moitra Problem Set 6 Due: Wednesday, April 6, 2016 7 pm Dropbox Outside Stata G5 Collaboration policy:

More information

Lecture 4. 1 Examples of Mechanism Design Problems

Lecture 4. 1 Examples of Mechanism Design Problems CSCI699: Topics in Learning and Game Theory Lecture 4 Lecturer: Shaddin Dughmi Scribes: Haifeng Xu,Reem Alfayez 1 Examples of Mechanism Design Problems Example 1: Single Item Auctions. There is a single

More information

Lecture 12 November 3

Lecture 12 November 3 STATS 300A: Theory of Statistics Fall 2015 Lecture 12 November 3 Lecturer: Lester Mackey Scribe: Jae Hyuck Park, Christian Fong Warning: These notes may contain factual and/or typographic errors. 12.1

More information

Bayesian Combinatorial Auctions: Expanding Single Buyer Mechanisms to Many Buyers

Bayesian Combinatorial Auctions: Expanding Single Buyer Mechanisms to Many Buyers Bayesian Combinatorial Auctions: Expanding Single Buyer Mechanisms to Many Buyers Saeed Alaei December 12, 2013 Abstract We present a general framework for approximately reducing the mechanism design problem

More information

Lecture 12. Applications to Algorithmic Game Theory & Computational Social Choice. CSC Allan Borodin & Nisarg Shah 1

Lecture 12. Applications to Algorithmic Game Theory & Computational Social Choice. CSC Allan Borodin & Nisarg Shah 1 Lecture 12 Applications to Algorithmic Game Theory & Computational Social Choice CSC2420 - Allan Borodin & Nisarg Shah 1 A little bit of game theory CSC2420 - Allan Borodin & Nisarg Shah 2 Recap: Yao s

More information

Dynamic Mechanisms without Money

Dynamic Mechanisms without Money Dynamic Mechanisms without Money Yingni Guo, 1 Johannes Hörner 2 1 Northwestern 2 Yale July 11, 2015 Features - Agent sole able to evaluate the (changing) state of the world. 2 / 1 Features - Agent sole

More information

Estimating Single-Agent Dynamic Models

Estimating Single-Agent Dynamic Models Estimating Single-Agent Dynamic Models Paul T. Scott Empirical IO Fall, 2013 1 / 49 Why are dynamics important? The motivation for using dynamics is usually external validity: we want to simulate counterfactuals

More information

Bayesian Decision Theory

Bayesian Decision Theory Bayesian Decision Theory Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Bayesian Decision Theory Bayesian classification for normal distributions Error Probabilities

More information

Advance Service Reservations with Heterogeneous Customers

Advance Service Reservations with Heterogeneous Customers Advance ervice Reservations with Heterogeneous Customers Clifford tein, Van-Anh Truong, Xinshang Wang Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027,

More information

Redistribution Mechanisms for Assignment of Heterogeneous Objects

Redistribution Mechanisms for Assignment of Heterogeneous Objects Redistribution Mechanisms for Assignment of Heterogeneous Objects Sujit Gujar Dept of Computer Science and Automation Indian Institute of Science Bangalore, India sujit@csa.iisc.ernet.in Y Narahari Dept

More information

Discussion Papers in Economics

Discussion Papers in Economics Discussion Papers in Economics No. 10/11 A General Equilibrium Corporate Finance Theorem for Incomplete Markets: A Special Case By Pascal Stiefenhofer, University of York Department of Economics and Related

More information

An Adaptive Algorithm for Selecting Profitable Keywords for Search-Based Advertising Services

An Adaptive Algorithm for Selecting Profitable Keywords for Search-Based Advertising Services An Adaptive Algorithm for Selecting Profitable Keywords for Search-Based Advertising Services Paat Rusmevichientong David P. Williamson July 6, 2007 Abstract Increases in online searches have spurred the

More information

Chapter 9: Hypothesis Testing Sections

Chapter 9: Hypothesis Testing Sections 1 / 22 : Hypothesis Testing Sections Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 Two-Sided Alternatives 9.5 The t Test 9.6 Comparing the Means of Two Normal Distributions

More information

The Algorithmic Foundations of Adaptive Data Analysis November, Lecture The Multiplicative Weights Algorithm

The Algorithmic Foundations of Adaptive Data Analysis November, Lecture The Multiplicative Weights Algorithm he Algorithmic Foundations of Adaptive Data Analysis November, 207 Lecture 5-6 Lecturer: Aaron Roth Scribe: Aaron Roth he Multiplicative Weights Algorithm In this lecture, we define and analyze a classic,

More information

Working Paper. Adaptive Parametric and Nonparametric Multi-product Pricing via Self-Adjusting Controls

Working Paper. Adaptive Parametric and Nonparametric Multi-product Pricing via Self-Adjusting Controls = = = = Working Paper Adaptive Parametric and Nonparametric Multi-product Pricing via Self-Adjusting Controls Qi George Chen Stephen M. Ross School of Business University of Michigan Stefanus Jasin Stephen

More information

GSOE9210 Engineering Decisions

GSOE9210 Engineering Decisions Sutdent ID: Name: Signature: The University of New South Wales Session 2, 2017 GSOE9210 Engineering Decisions Sample mid-term test Instructions: Time allowed: 1 hour Reading time: 5 minutes This examination

More information

Game Theory: Spring 2017

Game Theory: Spring 2017 Game Theory: Spring 2017 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam Ulle Endriss 1 Plan for Today In this second lecture on mechanism design we are going to generalise

More information

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E.

On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Tilburg University On the Unique D1 Equilibrium in the Stackelberg Model with Asymmetric Information Janssen, M.C.W.; Maasland, E. Publication date: 1997 Link to publication General rights Copyright and

More information

1 Review and Overview

1 Review and Overview DRAFT a final version will be posted shortly CS229T/STATS231: Statistical Learning Theory Lecturer: Tengyu Ma Lecture # 16 Scribe: Chris Cundy, Ananya Kumar November 14, 2018 1 Review and Overview Last

More information

Firming Renewable Power with Demand Response: An End-to-end Aggregator Business Model

Firming Renewable Power with Demand Response: An End-to-end Aggregator Business Model Firming Renewable Power with Demand Response: An End-to-end Aggregator Business Model Clay Campaigne joint work with Shmuel Oren November 5, 2015 1 / 16 Motivation Problem: Expansion of renewables increases

More information

Multi-armed Bandits in the Presence of Side Observations in Social Networks

Multi-armed Bandits in the Presence of Side Observations in Social Networks 52nd IEEE Conference on Decision and Control December 0-3, 203. Florence, Italy Multi-armed Bandits in the Presence of Side Observations in Social Networks Swapna Buccapatnam, Atilla Eryilmaz, and Ness

More information

Single Leg Airline Revenue Management With Overbooking

Single Leg Airline Revenue Management With Overbooking Single Leg Airline Revenue Management With Overbooking Nurşen Aydın, Ş. İlker Birbil, J. B. G. Frenk and Nilay Noyan Sabancı University, Manufacturing Systems and Industrial Engineering, Orhanlı-Tuzla,

More information

Adaptive Learning with Unknown Information Flows

Adaptive Learning with Unknown Information Flows Adaptive Learning with Unknown Information Flows Yonatan Gur Stanford University Ahmadreza Momeni Stanford University June 8, 018 Abstract An agent facing sequential decisions that are characterized by

More information

Technical Companion to: Sharing Aggregate Inventory Information with Customers: Strategic Cross-selling and Shortage Reduction

Technical Companion to: Sharing Aggregate Inventory Information with Customers: Strategic Cross-selling and Shortage Reduction Technical Companion to: Sharing Aggregate Inventory Information with Customers: Strategic Cross-selling and Shortage Reduction Ruomeng Cui Kelley School of Business, Indiana University, Bloomington, IN

More information

Greedy-Like Algorithms for Dynamic Assortment Planning Under Multinomial Logit Preferences

Greedy-Like Algorithms for Dynamic Assortment Planning Under Multinomial Logit Preferences Submitted to Operations Research manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the

More information

A Optimal User Search

A Optimal User Search A Optimal User Search Nicole Immorlica, Northwestern University Mohammad Mahdian, Google Greg Stoddard, Northwestern University Most research in sponsored search auctions assume that users act in a stochastic

More information

OLSO. Online Learning and Stochastic Optimization. Yoram Singer August 10, Google Research

OLSO. Online Learning and Stochastic Optimization. Yoram Singer August 10, Google Research OLSO Online Learning and Stochastic Optimization Yoram Singer August 10, 2016 Google Research References Introduction to Online Convex Optimization, Elad Hazan, Princeton University Online Learning and

More information

1 The Basic RBC Model

1 The Basic RBC Model IHS 2016, Macroeconomics III Michael Reiter Ch. 1: Notes on RBC Model 1 1 The Basic RBC Model 1.1 Description of Model Variables y z k L c I w r output level of technology (exogenous) capital at end of

More information

Multi-Armed Bandits. Credit: David Silver. Google DeepMind. Presenter: Tianlu Wang

Multi-Armed Bandits. Credit: David Silver. Google DeepMind. Presenter: Tianlu Wang Multi-Armed Bandits Credit: David Silver Google DeepMind Presenter: Tianlu Wang Credit: David Silver (DeepMind) Multi-Armed Bandits Presenter: Tianlu Wang 1 / 27 Outline 1 Introduction Exploration vs.

More information

A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions

A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions Huseyin Topaloglu School of Operations Research and Information Engineering, Cornell

More information

Bandit models: a tutorial

Bandit models: a tutorial Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses

More information

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Randomized Algorithms Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized

More information

Online algorithms December 13, 2007

Online algorithms December 13, 2007 Sanders/van Stee: Approximations- und Online-Algorithmen 1 Online algorithms December 13, 2007 Information is revealed to the algorithm in parts Algorithm needs to process each part before receiving the

More information

Sequential Decisions

Sequential Decisions Sequential Decisions A Basic Theorem of (Bayesian) Expected Utility Theory: If you can postpone a terminal decision in order to observe, cost free, an experiment whose outcome might change your terminal

More information

A tractable consideration set structure for network revenue management

A tractable consideration set structure for network revenue management A tractable consideration set structure for network revenue management Arne Strauss, Kalyan Talluri February 15, 2012 Abstract The dynamic program for choice network RM is intractable and approximated

More information

Macroeconomics Qualifying Examination

Macroeconomics Qualifying Examination Macroeconomics Qualifying Examination August 2015 Department of Economics UNC Chapel Hill Instructions: This examination consists of 4 questions. Answer all questions. If you believe a question is ambiguously

More information

Christopher Watkins and Peter Dayan. Noga Zaslavsky. The Hebrew University of Jerusalem Advanced Seminar in Deep Learning (67679) November 1, 2015

Christopher Watkins and Peter Dayan. Noga Zaslavsky. The Hebrew University of Jerusalem Advanced Seminar in Deep Learning (67679) November 1, 2015 Q-Learning Christopher Watkins and Peter Dayan Noga Zaslavsky The Hebrew University of Jerusalem Advanced Seminar in Deep Learning (67679) November 1, 2015 Noga Zaslavsky Q-Learning (Watkins & Dayan, 1992)

More information

(a) Write down the Hamilton-Jacobi-Bellman (HJB) Equation in the dynamic programming

(a) Write down the Hamilton-Jacobi-Bellman (HJB) Equation in the dynamic programming 1. Government Purchases and Endogenous Growth Consider the following endogenous growth model with government purchases (G) in continuous time. Government purchases enhance production, and the production

More information

We consider a network revenue management problem where customers choose among open fare products

We consider a network revenue management problem where customers choose among open fare products Vol. 43, No. 3, August 2009, pp. 381 394 issn 0041-1655 eissn 1526-5447 09 4303 0381 informs doi 10.1287/trsc.1090.0262 2009 INFORMS An Approximate Dynamic Programming Approach to Network Revenue Management

More information

Payment Rules for Combinatorial Auctions via Structural Support Vector Machines

Payment Rules for Combinatorial Auctions via Structural Support Vector Machines Payment Rules for Combinatorial Auctions via Structural Support Vector Machines Felix Fischer Harvard University joint work with Paul Dütting (EPFL), Petch Jirapinyo (Harvard), John Lai (Harvard), Ben

More information

Designing Competitive Online Algorithms via a Primal-Dual Approach. Niv Buchbinder

Designing Competitive Online Algorithms via a Primal-Dual Approach. Niv Buchbinder Designing Competitive Online Algorithms via a Primal-Dual Approach Niv Buchbinder Designing Competitive Online Algorithms via a Primal-Dual Approach Research Thesis Submitted in Partial Fulfillment of

More information

No-Regret Algorithms for Unconstrained Online Convex Optimization

No-Regret Algorithms for Unconstrained Online Convex Optimization No-Regret Algorithms for Unconstrained Online Convex Optimization Matthew Streeter Duolingo, Inc. Pittsburgh, PA 153 matt@duolingo.com H. Brendan McMahan Google, Inc. Seattle, WA 98103 mcmahan@google.com

More information

Mathematical Games and Random Walks

Mathematical Games and Random Walks Mathematical Games and Random Walks Alexander Engau, Ph.D. Department of Mathematical and Statistical Sciences College of Liberal Arts and Sciences / Intl. College at Beijing University of Colorado Denver

More information

STATE UNIVERSITY OF NEW YORK AT ALBANY Department of Economics

STATE UNIVERSITY OF NEW YORK AT ALBANY Department of Economics STATE UNIVERSITY OF NEW YORK AT ALBANY Department of Economics Ph. D. Comprehensive Examination: Macroeconomics Fall, 202 Answer Key to Section 2 Questions Section. (Suggested Time: 45 Minutes) For 3 of

More information

Parking Slot Assignment Problem

Parking Slot Assignment Problem Department of Economics Boston College October 11, 2016 Motivation Research Question Literature Review What is the concern? Cruising for parking is drivers behavior that circle around an area for a parking

More information

Economic Growth: Lecture 7, Overlapping Generations

Economic Growth: Lecture 7, Overlapping Generations 14.452 Economic Growth: Lecture 7, Overlapping Generations Daron Acemoglu MIT November 17, 2009. Daron Acemoglu (MIT) Economic Growth Lecture 7 November 17, 2009. 1 / 54 Growth with Overlapping Generations

More information

Parking Space Assignment Problem: A Matching Mechanism Design Approach

Parking Space Assignment Problem: A Matching Mechanism Design Approach Parking Space Assignment Problem: A Matching Mechanism Design Approach Jinyong Jeong Boston College ITEA 2017, Barcelona June 23, 2017 Jinyong Jeong (Boston College ITEA 2017, Barcelona) Parking Space

More information

Lecture Testing Hypotheses: The Neyman-Pearson Paradigm

Lecture Testing Hypotheses: The Neyman-Pearson Paradigm Math 408 - Mathematical Statistics Lecture 29-30. Testing Hypotheses: The Neyman-Pearson Paradigm April 12-15, 2013 Konstantin Zuev (USC) Math 408, Lecture 29-30 April 12-15, 2013 1 / 12 Agenda Example:

More information