Low-Regret for Online Decision-Making
Siddhartha Banerjee and Alberto Vera
November 6
Outline: Introduction, Compensated Coupling, Bayes Selector, Conclusion

Motivation
- Offline is a host who knew all the arrivals ahead of time.

Example: Online Packing (NRM)
- Resources with initial budgets $B_i$, $i = 1, \ldots, d$.
- Type $j$ wants $a_{ij}$ units of resource $i$.
- Type $j$ arrives with probability $p_j$ and has reward $r_j$, $j = 1, \ldots, n$.
- If we had full information, we would solve
  $\max\ r^\top x$ subject to $Ax \le B$ (budget consumption), $x \le Z(T)$ (respect arrivals), $x \in \mathbb{N}^n$.
- Paradigm: two agents solve this problem, Online and Offline.
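
To make the full-information problem concrete, here is a minimal sketch that solves it by brute force on a tiny instance. The function name, the instance data, and the enumeration approach are illustrative assumptions, not part of the talk; a real implementation would use an integer programming solver.

```python
from itertools import product

def offline_packing_value(r, A, B, Z):
    """Brute-force max r.x subject to A x <= B, 0 <= x <= Z, x integer.

    r: rewards per type (length n); A: d x n consumption matrix;
    B: budgets (length d); Z: number of arrivals per type (length n).
    Enumeration is exponential in n, so this is only for tiny instances.
    """
    n, d = len(r), len(B)
    best_value, best_x = 0, tuple([0] * n)
    for x in product(*(range(z + 1) for z in Z)):
        feasible = all(
            sum(A[i][j] * x[j] for j in range(n)) <= B[i] for i in range(d)
        )
        if feasible:
            value = sum(r[j] * x[j] for j in range(n))
            if value > best_value:
                best_value, best_x = value, x
    return best_value, best_x

# Hypothetical instance with d = 2 resources and n = 3 types.
r = [5, 3, 4]
A = [[1, 0, 1],
     [0, 1, 1]]
B = [2, 2]
Z = [2, 3, 1]
value, x = offline_packing_value(r, A, B, Z)
```

On this instance, serving two arrivals of each of the first two types exhausts both budgets, which is what Offline would do with hindsight.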

Traditional Approach and Our Contributions
[Figure: two state-versus-time panels (states S1 to S4, times t1 to t4) showing the trajectories of Offline and Online.]
Our Contributions
- Compensated Coupling: a technique for analyzing algorithms
- Bayes Selector: a meta-algorithm based on estimators
- Applications: constant regret

Application: Constant Regret for Online Packing
- $V^{\mathrm{on}}$ and $V^{\mathrm{off}}$ are the collected values, and $\mathrm{Regret} := V^{\mathrm{off}} - V^{\mathrm{on}}$.
- $V^{\mathrm{off}} = \max\{r^\top x : Ax \le B,\ 0 \le x \le Z(T)\}$.
Theorem (Informal). We provide a natural policy with constant expected regret for online packing problems.
- The regret depends on $d$, $n$, $\kappa(A)$, and the distribution of $Z(t)$.
- It is independent of $T$ and of the initial budget levels $B$.
- It holds for correlated and time-varying distributions.
- It gives a $1 - o(1)$ approximation whenever $T$ and $B$ are large.

Related Work
- Constant regret for multi-secretary: Arlotto and Gurvich (2017)
- Network Revenue Management / Online Packing:
  - Asymptotic analysis, $O(\sqrt{T})$ regret: Talluri and Van Ryzin (2004)
  - Later $o(\sqrt{T})$, $O(1)$ conditional, $O(1)$ Poisson: Reiman and Wang (2008), Jasin and Kumar (2012), Bumpensanti and Wang (2018)
- Prophet: worst-case distribution (competitive ratio)
  - Ratio $1 - e^{-1}$ for the maximum of i.i.d. values: Hill and Kertz (1982)
  - Best ratio 0.745, used in posted pricing: Correa et al. (2017)
- Online Matching: Manshadi et al. (2012)
- Approximate Dynamic Programming: Powell (2011)

Framework for Online Decision-Making
- Arrival type $J^t \in \mathcal{J}$ is presented at $t = T, T-1, \ldots, 1$.
- States $s \in \mathcal{S}$ and actions $a \in \mathcal{A}$.
- Rewards $R(a, s, j)$.
- Transitions $\mathcal{T}(a, s, j)$.
- Examples: Online Packing (NRM), Multi-Secretary, Online Matching, Ski-Rental, etc.

Example: Multi-Secretary
- One resource ($d = 1$) with capacity $B$.
- Goal: choose $B$ arrivals to maximize the sum of their rewards.
[Figure: arrivals along a timeline, where $t$ is the time to go.]
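
For the multi-secretary problem, Offline's task is easy to state in code: with all arrivals known, it keeps the $B$ largest rewards. The sketch below assumes nonnegative rewards given as a plain list; the function name and the sample data are illustrative.

```python
import heapq

def offline_multisecretary(rewards, B):
    """Offline value for multi-secretary: the sum of the B largest rewards."""
    return sum(heapq.nlargest(B, rewards))

# Hypothetical sample path of eight arrivals and capacity B = 3.
arrivals = [3, 1, 4, 1, 5, 9, 2, 6]
value = offline_multisecretary(arrivals, 3)  # picks 9, 6, and 5
```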

Value of Offline
- $V^{\mathrm{off}}(t, s)[\omega]$ is the optimal offline value starting in state $s$ with $t$ periods to go on sample path $\omega$.
[Figure: example values of $V^{\mathrm{off}}$ by time to go $t$ for two budget levels $B$.]
- It satisfies the Bellman equation
  $V^{\mathrm{off}}(t, s)[\omega] = \max\{R(a, s, J^t) + V^{\mathrm{off}}(t-1, \mathcal{T}(a, s, J^t))[\omega] : a \in \mathcal{A}\}$.
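
The Bellman equation for Offline can be checked on the multi-secretary example, where the state is the remaining budget and the actions are accept and reject. This is a sketch under the assumption that the sample path is a list of nonnegative rewards in arrival order, so the arrival with $t$ periods to go is `path[T - t]`; all names are illustrative.

```python
from functools import lru_cache

def offline_value_bellman(path, B):
    """Evaluate V_off(T, B) on one sample path via the Bellman recursion."""
    T = len(path)

    @lru_cache(maxsize=None)
    def v_off(t, b):
        if t == 0:
            return 0  # no periods left, nothing more to collect
        r = path[T - t]  # reward of the arrival with t periods to go
        reject = v_off(t - 1, b)
        if b == 0:
            return reject  # accepting is infeasible without budget
        accept = r + v_off(t - 1, b - 1)
        return max(accept, reject)

    return v_off(T, B)

path = [3, 1, 4, 1, 5, 9, 2, 6]
value = offline_value_bellman(path, 3)
```

The recursion returns the same value as summing the three largest rewards directly, which is a quick sanity check on the indexing.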

Offline Follows Online
- Online uses a policy $\pi^{\mathrm{on}}$. We want Offline to follow $\pi^{\mathrm{on}}$.
- What if Offline is not satisfied with $a = \pi^{\mathrm{on}}(t, s, j)$? That is,
  $V^{\mathrm{off}}(t, s)[\omega] > R(a, s, J^t) + V^{\mathrm{off}}(t-1, \mathcal{T}(a, s, J^t))[\omega]$.
- We need to compensate him by paying
  $V^{\mathrm{off}}(t, s)[\omega] - R(a, s, J^t) - V^{\mathrm{off}}(t-1, \mathcal{T}(a, s, J^t))[\omega]$.
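
A small worked example of the compensation, again for multi-secretary. The sketch assumes a sample path given as a list of rewards in arrival order and uses the fact that Offline's value with budget `b` is the sum of the `b` largest remaining rewards; the helper names are hypothetical.

```python
import heapq

def v_off(path, t, b):
    """Offline value with t periods to go and budget b on this sample path."""
    return sum(heapq.nlargest(b, path[len(path) - t:]))

def compensation(path, t, b, accept):
    """Amount paid to Offline for taking Online's action at (t, b)."""
    r = path[len(path) - t]  # reward of the arrival with t periods to go
    if accept:
        if b == 0:
            raise ValueError("cannot accept with an empty budget")
        followed = r + v_off(path, t - 1, b - 1)
    else:
        followed = v_off(path, t - 1, b)
    return v_off(path, t, b) - followed

# With one unit of budget, Offline wants to wait for the reward 9.
path = [3, 9, 4]
pay_if_accept = compensation(path, 3, 1, accept=True)   # 9 - 3
pay_if_reject = compensation(path, 3, 1, accept=False)  # Offline agrees
```

Accepting the first arrival costs Offline the difference between 9 and 3, while rejecting it agrees with Offline's own plan and needs no payment.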

Compensations
- How big are the compensations? At most $r_{\max}$ for multi-secretary and $\|A_j\|\, r_{\max} \le d\, r_{\max}$ for packing.
- $Q(t, s, a)$ is the disagreement set: the $\omega \in \Omega$ such that Offline loses optimality by taking $a$.
Lemma (Compensated Coupling). Fix any policy for Online. Let $S^t$ be Online's state over time and $A^t$ Online's action over time. If $r$ is a bound on the compensations, then
  $\mathbb{E}[\mathrm{Regret}] \le r\, \mathbb{E}\left[\sum_t \mathbb{P}[Q(t, S^t, A^t)]\right]$.
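
The idea behind the lemma can be seen on a single sample path: the compensations telescope, so their sum equals the regret, and each one is nonzero only when Offline disagrees with Online's action. The following sketch checks this for multi-secretary with a simple threshold policy. The policy, the reward range, and all names are assumptions made for illustration.

```python
import heapq
import random

def v_off(path, t, b):
    """Offline value with t periods to go and budget b on this sample path."""
    return sum(heapq.nlargest(b, path[len(path) - t:]))

def run_with_compensations(path, B, policy):
    """Run Online's policy, paying Offline to follow it at every step."""
    T = len(path)
    b, v_on, total_comp, disagreements = B, 0, 0, 0
    for t in range(T, 0, -1):
        r = path[T - t]
        accept = b > 0 and policy(t, b, r)
        if accept:
            followed = r + v_off(path, t - 1, b - 1)
        else:
            followed = v_off(path, t - 1, b)
        comp = v_off(path, t, b) - followed
        total_comp += comp
        if comp > 0:  # this sample path lies in the disagreement set
            disagreements += 1
        if accept:
            v_on += r
            b -= 1
    regret = v_off(path, T, B) - v_on
    return regret, total_comp, disagreements

rng = random.Random(0)
path = [rng.randint(1, 10) for _ in range(30)]
regret, total_comp, disagreements = run_with_compensations(
    path, 5, lambda t, b, r: r >= 5  # hypothetical threshold policy
)
```

On every path, `regret` equals `total_comp`, and it is at most `max(path) * disagreements`, which is the per-path version of the bound before taking expectations.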

Compensated Coupling View
[Figure: a sequence of state-versus-time panels (states S1 to S4, times t1 to t4) in which Offline follows Online for 1, 2, and 3 steps, with the compensations marked.]

The Bayes Selector Policy
- Compensated coupling works for any policy. Can we turn it around and find a policy?
- Alice is making decisions and we want to copy her. An oracle tells us she picked red/blue with probabilities 0.7/0.3. What should we do?
- $Q(t, s, a)$ is the disagreement set: the $\omega \in \Omega$ such that Offline loses optimality by taking $a$.
- Policy: at each time $t$, take action $A^t_{BS} \in \arg\min\{\mathbb{P}[Q(t, S^t_{BS}, a)] : a \in \mathcal{A}\}$.
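
The Alice example reduces to picking the action least likely to disagree with her. A minimal sketch, where the dictionary of disagreement probabilities and the function name are illustrative assumptions:

```python
def bayes_selector(disagreement_prob):
    """Return the action with the smallest disagreement probability."""
    return min(disagreement_prob, key=disagreement_prob.get)

# Alice picked red with probability 0.7 and blue with probability 0.3,
# so choosing red disagrees with her with probability 0.3.
alice = {"red": 1 - 0.7, "blue": 1 - 0.3}
choice = bayes_selector(alice)
```

Choosing red matches Alice with probability 0.7, which is the best we can do with this information.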

Regret Guarantee of Bayes Selector
- Ideal policy: at time $t$, $A^t_{BS} \in \arg\min\{\mathbb{P}[Q(t, S^t_{BS}, a)] : a \in \mathcal{A}\}$.
- Meta-algorithm: based on estimators $\hat{q}(t, s, a) \ge \mathbb{P}[Q(t, s, a)]$; at time $t$, $A^t_{BS} \in \arg\min\{\hat{q}(t, S^t_{BS}, a) : a \in \mathcal{A}\}$.
Theorem. Let $r$ be an upper bound on the compensations. The Bayes Selector achieves
  $\mathbb{E}[\mathrm{Regret}] \le r\, \mathbb{E}\left[\sum_t \hat{q}(t, S^t_{BS}, A^t_{BS})\right]$.
Theorem. The Bayes Selector achieves constant regret for online packing and matching problems.
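
One way to obtain estimators for the meta-algorithm is simulation. The sketch below estimates the disagreement probabilities for multi-secretary by sampling future arrivals from a known type distribution, then picks the action with the smaller estimate. This is an illustrative assumption rather than the estimator used in the talk, and Monte Carlo estimates only approximate $\mathbb{P}[Q(t, s, a)]$, so the regret theorem does not apply to them directly. All names and the instance are hypothetical.

```python
import heapq
import random

def v_off(rewards, b):
    """Offline value: the sum of the b largest rewards."""
    return sum(heapq.nlargest(b, rewards))

def estimate_disagreement(t, b, r, types, probs, n_samples=2000, seed=0):
    """Estimate P[Q(t, b, a)] for a in {accept, reject} by simulation.

    r is the current arrival's reward; the remaining t - 1 arrivals are
    drawn i.i.d. from `types` with weights `probs`.
    """
    rng = random.Random(seed)
    counts = {"accept": 0, "reject": 0}
    for _ in range(n_samples):
        future = rng.choices(types, weights=probs, k=t - 1)
        best = v_off([r] + future, b)
        # Offline loses optimality by accepting (or accepting is infeasible).
        if b == 0 or best > r + v_off(future, b - 1):
            counts["accept"] += 1
        # Offline loses optimality by rejecting.
        if best > v_off(future, b):
            counts["reject"] += 1
    return {a: c / n_samples for a, c in counts.items()}

def bayes_selector_action(t, b, r, types, probs):
    """Take the action with the smallest estimated disagreement."""
    q_hat = estimate_disagreement(t, b, r, types, probs)
    return min(q_hat, key=q_hat.get)

types, probs = [1, 5, 10], [0.5, 0.3, 0.2]
high = bayes_selector_action(10, 2, 10, types, probs)
low = bayes_selector_action(10, 2, 1, types, probs)
```

With ten periods to go and two units of budget, the selector accepts the highest reward and rejects the lowest one, since Offline never regrets either choice on these inputs.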

Estimators for Online Packing
- $\hat{q}(t, s, a) \ge \mathbb{P}[Q(t, s, a)]$, where $Q(t, s, a)$ is the disagreement set.
- Given Online's budget $B^t$, Offline solves $\max\{r^\top x : Ax \le B^t,\ 0 \le x \le Z(t)\}$.
- Let $X$ solve $\max\{r^\top x : Ax \le B^t,\ 0 \le x \le \mathbb{E}[Z(t)]\}$.
- $X_j / \mathbb{E}[Z_j(t)] \in [0, 1]$ for all $j$.
- Accept $j$ iff $X_j / \mathbb{E}[Z_j(t)] \ge 1/2$.
- $\min\{\hat{q}(t, B^t, \text{accept}), \hat{q}(t, B^t, \text{reject})\} \le \exp(-t (p_j / 2\kappa)^2 / 25)$.
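
The acceptance rule can be sketched for the single-resource case, where the fluid problem with $\mathbb{E}[Z_j(t)] = t\, p_j$ is a fractional knapsack with unit consumption and a greedy fill by decreasing reward solves it exactly. The restriction to $d = 1$ with $a_j = 1$, stationary arrival probabilities, and all names below are assumptions for illustration; the general packing case needs an LP solver.

```python
def fluid_solution(rewards, probs, t, budget):
    """Solve max r.x subject to sum(x) <= budget, 0 <= x_j <= t * p_j.

    Greedy by decreasing reward is optimal because every type consumes
    exactly one unit of the single resource.
    """
    expected = [t * p for p in probs]  # E[Z_j(t)] under i.i.d. arrivals
    x = [0.0] * len(rewards)
    remaining = float(budget)
    for j in sorted(range(len(rewards)), key=lambda j: -rewards[j]):
        x[j] = min(expected[j], remaining)
        remaining -= x[j]
    return x, expected

def accept_arrival(j, rewards, probs, t, budget):
    """Accept type j iff the fluid solution serves at least half of it."""
    if budget <= 0:
        return False
    x, expected = fluid_solution(rewards, probs, t, budget)
    return x[j] / expected[j] >= 0.5

# Hypothetical instance: three types, ten periods to go.
rewards, probs = [10, 5, 1], [0.2, 0.3, 0.5]
decisions_b4 = [accept_arrival(j, rewards, probs, 10, 4) for j in range(3)]
decisions_b3 = [accept_arrival(j, rewards, probs, 10, 3) for j in range(3)]
```

With four units of budget the fluid solution serves two thirds of the middle type, so it is accepted; with three units it serves only one third, so the same type is rejected.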

Numerical Results
[Figure: regret scaling for the Bayes Selector, the Randomized Adaptive Policy, and the Randomized Static Policy.]

Conclusions and Future Directions
- Compensated coupling: a way to understand online decision-making
- Bayes Selector: a natural policy with a performance guarantee; a bridge between estimation and optimization
- Online packing and matching: the Bayes Selector achieves constant regret
- Future work: learning and multi-stage problems
- Promising directions: online convex optimization

Appendix: Details for Numerical Results
[Table: for each type $j$, its valuation, the resources it uses, and its arrival probability $p_j$.]
More information1 The Basic RBC Model
IHS 2016, Macroeconomics III Michael Reiter Ch. 1: Notes on RBC Model 1 1 The Basic RBC Model 1.1 Description of Model Variables y z k L c I w r output level of technology (exogenous) capital at end of
More informationMulti-Armed Bandits. Credit: David Silver. Google DeepMind. Presenter: Tianlu Wang
Multi-Armed Bandits Credit: David Silver Google DeepMind Presenter: Tianlu Wang Credit: David Silver (DeepMind) Multi-Armed Bandits Presenter: Tianlu Wang 1 / 27 Outline 1 Introduction Exploration vs.
More informationA Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions
A Tighter Variant of Jensen s Lower Bound for Stochastic Programs and Separable Approximations to Recourse Functions Huseyin Topaloglu School of Operations Research and Information Engineering, Cornell
More informationBandit models: a tutorial
Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses
More informationLecture 5: The Principle of Deferred Decisions. Chernoff Bounds
Randomized Algorithms Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized
More informationOnline algorithms December 13, 2007
Sanders/van Stee: Approximations- und Online-Algorithmen 1 Online algorithms December 13, 2007 Information is revealed to the algorithm in parts Algorithm needs to process each part before receiving the
More informationSequential Decisions
Sequential Decisions A Basic Theorem of (Bayesian) Expected Utility Theory: If you can postpone a terminal decision in order to observe, cost free, an experiment whose outcome might change your terminal
More informationA tractable consideration set structure for network revenue management
A tractable consideration set structure for network revenue management Arne Strauss, Kalyan Talluri February 15, 2012 Abstract The dynamic program for choice network RM is intractable and approximated
More informationMacroeconomics Qualifying Examination
Macroeconomics Qualifying Examination August 2015 Department of Economics UNC Chapel Hill Instructions: This examination consists of 4 questions. Answer all questions. If you believe a question is ambiguously
More informationChristopher Watkins and Peter Dayan. Noga Zaslavsky. The Hebrew University of Jerusalem Advanced Seminar in Deep Learning (67679) November 1, 2015
Q-Learning Christopher Watkins and Peter Dayan Noga Zaslavsky The Hebrew University of Jerusalem Advanced Seminar in Deep Learning (67679) November 1, 2015 Noga Zaslavsky Q-Learning (Watkins & Dayan, 1992)
More information(a) Write down the Hamilton-Jacobi-Bellman (HJB) Equation in the dynamic programming
1. Government Purchases and Endogenous Growth Consider the following endogenous growth model with government purchases (G) in continuous time. Government purchases enhance production, and the production
More informationWe consider a network revenue management problem where customers choose among open fare products
Vol. 43, No. 3, August 2009, pp. 381 394 issn 0041-1655 eissn 1526-5447 09 4303 0381 informs doi 10.1287/trsc.1090.0262 2009 INFORMS An Approximate Dynamic Programming Approach to Network Revenue Management
More informationPayment Rules for Combinatorial Auctions via Structural Support Vector Machines
Payment Rules for Combinatorial Auctions via Structural Support Vector Machines Felix Fischer Harvard University joint work with Paul Dütting (EPFL), Petch Jirapinyo (Harvard), John Lai (Harvard), Ben
More informationDesigning Competitive Online Algorithms via a Primal-Dual Approach. Niv Buchbinder
Designing Competitive Online Algorithms via a Primal-Dual Approach Niv Buchbinder Designing Competitive Online Algorithms via a Primal-Dual Approach Research Thesis Submitted in Partial Fulfillment of
More informationNo-Regret Algorithms for Unconstrained Online Convex Optimization
No-Regret Algorithms for Unconstrained Online Convex Optimization Matthew Streeter Duolingo, Inc. Pittsburgh, PA 153 matt@duolingo.com H. Brendan McMahan Google, Inc. Seattle, WA 98103 mcmahan@google.com
More informationMathematical Games and Random Walks
Mathematical Games and Random Walks Alexander Engau, Ph.D. Department of Mathematical and Statistical Sciences College of Liberal Arts and Sciences / Intl. College at Beijing University of Colorado Denver
More informationSTATE UNIVERSITY OF NEW YORK AT ALBANY Department of Economics
STATE UNIVERSITY OF NEW YORK AT ALBANY Department of Economics Ph. D. Comprehensive Examination: Macroeconomics Fall, 202 Answer Key to Section 2 Questions Section. (Suggested Time: 45 Minutes) For 3 of
More informationParking Slot Assignment Problem
Department of Economics Boston College October 11, 2016 Motivation Research Question Literature Review What is the concern? Cruising for parking is drivers behavior that circle around an area for a parking
More informationEconomic Growth: Lecture 7, Overlapping Generations
14.452 Economic Growth: Lecture 7, Overlapping Generations Daron Acemoglu MIT November 17, 2009. Daron Acemoglu (MIT) Economic Growth Lecture 7 November 17, 2009. 1 / 54 Growth with Overlapping Generations
More informationParking Space Assignment Problem: A Matching Mechanism Design Approach
Parking Space Assignment Problem: A Matching Mechanism Design Approach Jinyong Jeong Boston College ITEA 2017, Barcelona June 23, 2017 Jinyong Jeong (Boston College ITEA 2017, Barcelona) Parking Space
More informationLecture Testing Hypotheses: The Neyman-Pearson Paradigm
Math 408 - Mathematical Statistics Lecture 29-30. Testing Hypotheses: The Neyman-Pearson Paradigm April 12-15, 2013 Konstantin Zuev (USC) Math 408, Lecture 29-30 April 12-15, 2013 1 / 12 Agenda Example:
More information