Lecture 14: Bandits with Budget Constraints

Size: px
Start display at page:

Download "Lecture 14: Bandits with Budget Constraints"

Transcription

1 IEOR : Learnng and Optmzaton for Sequental Decson Makng 03/07/16 Lecture 14: andts wth udget Constrants Instructor: Shpra Agrawal Scrbed by: Zhpeng Lu 1 Problem defnton In the regular Mult-armed andt problem we pull an arm I t at each round t and a reward s collected whch depend on the bandt I t. Now suppose that each round after pullng the arm a cost s also ncurred. When the total cost tll tme t surpass a gven level the algorthm stops. hs settng of problem s called andts wth constrants. he formal descrpton s followng. At tme t pull an arm I t = observe reward [0 1] and cost [0 1] d. Gven I t = D where D s a jont dstrbuton dependent on arm denote E[ I t = ] = µ E[j I t = ] = C j. Stop when any budget constrant s volated. Goal s to here 0. mze subject to t Remark. he above settng s also called andt wth Knapsacks wk. Usually we replace the constrant wth a smplfed verson by lettng = mn j j and j j = 1... d. t Example 1. Dynamc prcng wth lmted supply Suppose at prce q the product has probablty Sq to be sold observng revenue q and ncurrng nventory decreasng by 1. So the dstrbuton of reward and cost fohe bandt problem s: pullng arm { q 1 w.p. Sq = 0 0 w.p. 1 Sq 2 Optmal statc polcy A polcy s a mappng whch maps hstory to acton. It can be pure statc n whch case we keep pullng the sngle arm wth hghest expected reward. It can be mxed n whch case we pull arms randomly accordng to some dstrbuton. O can be dynamc where each turn we make decson based on the hstory and remanng resource. In ths secton we wll show that there s an optmal statc polcy whch has expected reward at least optmal dynamc and t satsfes constrants n expectaton. 1

2 Suppose each turn we pull arm wth probablty p and the constrants are all satsfed n expectaton. hen the optmal total reward OP s; subject to p µ p C j p 1. j = 1... d Notce that the second constrant mples that we allow not pullng any arm durng one round. Next we prove OP s bettehan what a optmal dynamc polcy could acheve. Defne X : total number of tmes an optmal polcy pcked arm p = E[X]. Any feasble polcy must satsfy t c sj t = ake expectatons on both sdes and we have s=1 E[ =1...N t:i t= ] E[X C j ] p C j. herefore { p } N =1 s feasble to OP and the total expected reward of ths optmal polcy E[ X µ ] = E[X µ ] = p µ OP. 3 andt algorthm: reducng to unconstraned f problem Suppose the algorthm stops at tme τ when a budget constrant s volated. Defne the regret of such polcy R = OP E[ ]. Snce the total reward of optmal dynamc polcy s bounded by OP the gap between optmal dynamc polcy and the algorthm s bounded by R. Now we reduce the constraned problem to an unconstraned problem wth nonlnear objectve functon by applyng Lagrangan multpleo the constraned problem. he new unconstraned problem wll mze f 2

3 whch s a concave and Lpschtz contnuous functon. r t f = z j=1...d 0. In the above defnton of f he frst tem s average reward and the second tem s the penalty from the mum volaton of budget constrant. We defne the penalty coeffcent z as z = 2OP whch we wll explan momentarly. Suppose we relax the budget to + and defne ɛ =. Denote the OP wth budget 1 + ɛ as OP 1+ɛ. Any p OP 1+ɛ -feasble snce p 1 + ɛ C 1 + ɛ j j = 1... d 1 + ɛ therefore p 1+ɛ s OP-feasble. hus we have N p µ = =1 N =1 p 1 + ɛ µ 1 + ɛ 1 + ɛop so f we relax the budget constrant by the optmal value of OP wll ncrease by no more than OP. So we can set z = 2OP whch guarantees that volatng the budget won t gve beneft n terms of ncreasng value of f. heorem 2 below wll provde the exact relaton between optmzng f and the andts wth Knapsacks problem. he sgnfcance of ths value of z wll be more precsely llustrated n the proof of that theorem. Now we defne the regret of the algorthm wth objectve functon f [ R f = OP f E f r ] t where OP f = p: p 1 f p µ p C = p µ Z C j j 0 And we defne R as the regret of the constraned problem wth larger budget = + 2 z R f + Õ [ τ ] R = OP E. where τ s the frst tme a budget s volated. We conclude ths lecture wth the followng theorem. heorem 2. If an algorthm acheves R f regret for unconstraned f problem then R 3R f + zõ and the algorthm wll not volate at any tme step t wth hgh probablty. 3

4 Proof. Proof outlne: he proof follows from the followng two clams: 1. OP f OP z. 2. Let be the cost of the decson at tme t for unconstraned f algorthm then wth hgh probablty j for all j. If above two clams are true then τ = wth hgh probablty and R = OP E[ τ ] OP f E[ ] = R f + z j=1...d 0 R f + z Usng = 2R f z + Õ we get the theorem statement. Next we prove the above two clams. he frst clam holds because the optmal soluton n fact any soluton p such that p 1 p 0 for OP forms a feasble soluton for OP f wth value OP z. For second clam let M be the mum budget volaton above by the algorthm fohe unconstraned f problem.e. hen M := f r t j=1...d 0 = E OP OP f [ ] zm + zm 2 zm zm 2 he second last nequalty follows usng our earler observaton that OP 1+ɛ OP1 + ɛ and z = 2OP. Last nequalty follows because OP f OP the optmal soluton p for OP forms a feasble soluton for OP f wth value OP. hen rearrangng M 2 OP f f r t = 2 z z R f herefore by defnton of M E[ t j ] + 2R f z So that usng Azuma-Hoeffdng wth hgh probablty j + 2 Z R f + Õ = provng the second clam. Why s above theorem useful? Conceptually above theorem shows that to bound regret of andts wth knapsacks we can nstead use an algorthm that has small regret bound fohe unconstraned f bandt problem. In next lecture we wll prove regret bounds fohe unconstraned f problem. In partcular fohe f dscussed above a unform bound on regret of R f Õz N can be proven though we wll not prove ths specal case n class. hen gven above theorem 4

5 we could solve the unconstraned f bandt problem wth slghtly smaller budget = 2 z Õ N Õ to get regret so that above theorem wll gve R f Õ z N Õz N Õz N R 3R f + zõ Õz N + zõ Usng z = 2OP OP we get a bound of Õ on regret for andts wth knapsacks.e. multplcatve guarantee that the reward acheved by bandt algorthm s at least OP1 Õ 1. Note that above algorthm assumed that the value of OP s known OP was used to set the value of z. If OP s not known then t needs to be estmated usng pure exploraton n the begnnng. hs ntal exploraton mght cause addtonal regret. hs s a lmtaton of the above approach of reducng the problem to unconstraned f problem. For a drect approach wthout knowng OP refeo Secton 4.2 of [1]. References [1] Shpra Agrawal and Nkhl R Devanur. andts wth concave rewards and convex knapsacks. In Proceedngs of the ffteenth ACM conference on Economcs and computaton pages ACM

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture # 15 Scribe: Jieming Mao April 1, 2013 COS 511: heoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 15 Scrbe: Jemng Mao Aprl 1, 013 1 Bref revew 1.1 Learnng wth expert advce Last tme, we started to talk about learnng wth expert advce.

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Announcements EWA with ɛ-exploration (recap) Lecture 20: EXP3 Algorithm. EECS598: Prediction and Learning: It s Only a Game Fall 2013.

Announcements EWA with ɛ-exploration (recap) Lecture 20: EXP3 Algorithm. EECS598: Prediction and Learning: It s Only a Game Fall 2013. Lecture 0: EXP3 Algorthm 1 EECS598: Predcton and Learnng: It s Only a Game Fall 013 Prof. Jacob Abernethy Lecture 0: EXP3 Algorthm Scrbe: Zhhao Chen Announcements None 0.1 EWA wth ɛ-exploraton (recap)

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Revenue comparison when the length of horizon is known in advance. The standard errors of all numbers are less than 0.1%.

Revenue comparison when the length of horizon is known in advance. The standard errors of all numbers are less than 0.1%. Golrezae, Nazerzadeh, and Rusmevchentong: Real-tme Optmzaton of Personalzed Assortments Management Scence 00(0, pp. 000 000, c 0000 INFORMS 37 Onlne Appendx Appendx A: Numercal Experments: Appendx to Secton

More information

1 The Mistake Bound Model

1 The Mistake Bound Model 5-850: Advanced Algorthms CMU, Sprng 07 Lecture #: Onlne Learnng and Multplcatve Weghts February 7, 07 Lecturer: Anupam Gupta Scrbe: Bryan Lee,Albert Gu, Eugene Cho he Mstake Bound Model Suppose there

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

k t+1 + c t A t k t, t=0

k t+1 + c t A t k t, t=0 Macro II (UC3M, MA/PhD Econ) Professor: Matthas Kredler Fnal Exam 6 May 208 You have 50 mnutes to complete the exam There are 80 ponts n total The exam has 4 pages If somethng n the queston s unclear,

More information

Stochastic Multi-armed Bandits in Constant Space

Stochastic Multi-armed Bandits in Constant Space Davd Lau davdlau@utexas.edu Erc Prce Zhao Song ecprce@cs.utexas.edu zhaos@utexas.edu The Unversty of Texas at Austn Ger Yang geryang@utexas.edu Abstract We consder the stochastc bandt problem n the sublnear

More information

PROBLEM SET 7 GENERAL EQUILIBRIUM

PROBLEM SET 7 GENERAL EQUILIBRIUM PROBLEM SET 7 GENERAL EQUILIBRIUM Queston a Defnton: An Arrow-Debreu Compettve Equlbrum s a vector of prces {p t } and allocatons {c t, c 2 t } whch satsfes ( Gven {p t }, c t maxmzes βt ln c t subject

More information

The Experts/Multiplicative Weights Algorithm and Applications

The Experts/Multiplicative Weights Algorithm and Applications Chapter 2 he Experts/Multplcatve Weghts Algorthm and Applcatons We turn to the problem of onlne learnng, and analyze a very powerful and versatle algorthm called the multplcatve weghts update algorthm.

More information

Suggested solutions for the exam in SF2863 Systems Engineering. June 12,

Suggested solutions for the exam in SF2863 Systems Engineering. June 12, Suggested solutons for the exam n SF2863 Systems Engneerng. June 12, 2012 14.00 19.00 Examner: Per Enqvst, phone: 790 62 98 1. We can thnk of the farm as a Jackson network. The strawberry feld s modelled

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Lecture 4: November 17, Part 1 Single Buffer Management

Lecture 4: November 17, Part 1 Single Buffer Management Lecturer: Ad Rosén Algorthms for the anagement of Networs Fall 2003-2004 Lecture 4: November 7, 2003 Scrbe: Guy Grebla Part Sngle Buffer anagement In the prevous lecture we taled about the Combned Input

More information

Economics 8105 Macroeconomic Theory Recitation 1

Economics 8105 Macroeconomic Theory Recitation 1 Economcs 8105 Macroeconomc Theory Rectaton 1 Outlne: Conor Ryan September 6th, 2016 Adapted From Anh Thu (Monca) Tran Xuan s Notes Last Updated September 20th, 2016 Dynamc Economc Envronment Arrow-Debreu

More information

Economics 2450A: Public Economics Section 10: Education Policies and Simpler Theory of Capital Taxation

Economics 2450A: Public Economics Section 10: Education Policies and Simpler Theory of Capital Taxation Economcs 2450A: Publc Economcs Secton 10: Educaton Polces and Smpler Theory of Captal Taxaton Matteo Parads November 14, 2016 In ths secton we study educaton polces n a smplfed verson of framework analyzed

More information

Maximal Margin Classifier

Maximal Margin Classifier CS81B/Stat41B: Advanced Topcs n Learnng & Decson Makng Mamal Margn Classfer Lecturer: Mchael Jordan Scrbes: Jana van Greunen Corrected verson - /1/004 1 References/Recommended Readng 1.1 Webstes www.kernel-machnes.org

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

CS : Algorithms and Uncertainty Lecture 14 Date: October 17, 2016

CS : Algorithms and Uncertainty Lecture 14 Date: October 17, 2016 CS 294-128: Algorthms and Uncertanty Lecture 14 Date: October 17, 2016 Instructor: Nkhl Bansal Scrbe: Antares Chen 1 Introducton In ths lecture, we revew results regardng follow the regularzed leader (FTRL.

More information

Lecture 4: Universal Hash Functions/Streaming Cont d

Lecture 4: Universal Hash Functions/Streaming Cont d CSE 5: Desgn and Analyss of Algorthms I Sprng 06 Lecture 4: Unversal Hash Functons/Streamng Cont d Lecturer: Shayan Oves Gharan Aprl 6th Scrbe: Jacob Schreber Dsclamer: These notes have not been subjected

More information

An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming

An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming An Asymptotcally Effcent Smulaton-Based Algorthm for Fnte Horzon Stochastc Dynamc Programmng Hyeong Soo Chang, Mchael C. Fu, Jaqao Hu, and Steven I. Marcus Abstract We present a smulaton-based algorthm

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

3.1 ML and Empirical Distribution

3.1 ML and Empirical Distribution 67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum

More information

ECE559VV Project Report

ECE559VV Project Report ECE559VV Project Report (Supplementary Notes Loc Xuan Bu I. MAX SUM-RATE SCHEDULING: THE UPLINK CASE We have seen (n the presentaton that, for downlnk (broadcast channels, the strategy maxmzng the sum-rate

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Equilibrium with Complete Markets. Instructor: Dmytro Hryshko

Equilibrium with Complete Markets. Instructor: Dmytro Hryshko Equlbrum wth Complete Markets Instructor: Dmytro Hryshko 1 / 33 Readngs Ljungqvst and Sargent. Recursve Macroeconomc Theory. MIT Press. Chapter 8. 2 / 33 Equlbrum n pure exchange, nfnte horzon economes,

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Pricing Problems under the Nested Logit Model with a Quality Consistency Constraint

Pricing Problems under the Nested Logit Model with a Quality Consistency Constraint Prcng Problems under the Nested Logt Model wth a Qualty Consstency Constrant James M. Davs, Huseyn Topaloglu, Davd P. Wllamson 1 Aprl 28, 2015 Abstract We consder prcng problems when customers choose among

More information

Appendix B: Resampling Algorithms

Appendix B: Resampling Algorithms 407 Appendx B: Resamplng Algorthms A common problem of all partcle flters s the degeneracy of weghts, whch conssts of the unbounded ncrease of the varance of the mportance weghts ω [ ] of the partcles

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

Online Appendix: Reciprocity with Many Goods

Online Appendix: Reciprocity with Many Goods T D T A : O A Kyle Bagwell Stanford Unversty and NBER Robert W. Stager Dartmouth College and NBER March 2016 Abstract Ths onlne Appendx extends to a many-good settng the man features of recprocty emphaszed

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

CHAPTER 17 Amortized Analysis

CHAPTER 17 Amortized Analysis CHAPTER 7 Amortzed Analyss In an amortzed analyss, the tme requred to perform a sequence of data structure operatons s averaged over all the operatons performed. It can be used to show that the average

More information

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning

Erratum: A Generalized Path Integral Control Approach to Reinforcement Learning Journal of Machne Learnng Research 00-9 Submtted /0; Publshed 7/ Erratum: A Generalzed Path Integral Control Approach to Renforcement Learnng Evangelos ATheodorou Jonas Buchl Stefan Schaal Department of

More information

Technical Note: Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model

Technical Note: Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model Techncal Note: Capacty Constrants Across Nests n Assortment Optmzaton Under the Nested Logt Model Jacob B. Feldman, Huseyn Topaloglu School of Operatons Research and Informaton Engneerng, Cornell Unversty,

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Lecture 11. minimize. c j x j. j=1. 1 x j 0 j. +, b R m + and c R n +

Lecture 11. minimize. c j x j. j=1. 1 x j 0 j. +, b R m + and c R n + Topcs n Theoretcal Computer Scence May 4, 2015 Lecturer: Ola Svensson Lecture 11 Scrbes: Vncent Eggerlng, Smon Rodrguez 1 Introducton In the last lecture we covered the ellpsod method and ts applcaton

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

EFFECTS OF JOINT REPLENISHMENT POLICY ON COMPANY COST UNDER PERMISSIBLE DELAY IN PAYMENTS

EFFECTS OF JOINT REPLENISHMENT POLICY ON COMPANY COST UNDER PERMISSIBLE DELAY IN PAYMENTS Mathematcal and Computatonal Applcatons, Vol. 5, No., pp. 8-58,. Assocaton for Scentfc Research EFFECS OF JOIN REPLENISHMEN POLICY ON COMPANY COS UNDER PERMISSIBLE DELAY IN PAYMENS Yu-Chung sao, Mng-Yu

More information

Lecture 20: November 7

Lecture 20: November 7 0-725/36-725: Convex Optmzaton Fall 205 Lecturer: Ryan Tbshran Lecture 20: November 7 Scrbes: Varsha Chnnaobreddy, Joon Sk Km, Lngyao Zhang Note: LaTeX template courtesy of UC Berkeley EECS dept. Dsclamer:

More information

Randomness and Computation

Randomness and Computation Randomness and Computaton or, Randomzed Algorthms Mary Cryan School of Informatcs Unversty of Ednburgh RC 208/9) Lecture 0 slde Balls n Bns m balls, n bns, and balls thrown unformly at random nto bns usually

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES

VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES VARIATION OF CONSTANT SUM CONSTRAINT FOR INTEGER MODEL WITH NON UNIFORM VARIABLES BÂRZĂ, Slvu Faculty of Mathematcs-Informatcs Spru Haret Unversty barza_slvu@yahoo.com Abstract Ths paper wants to contnue

More information

Vapnik-Chervonenkis theory

Vapnik-Chervonenkis theory Vapnk-Chervonenks theory Rs Kondor June 13, 2008 For the purposes of ths lecture, we restrct ourselves to the bnary supervsed batch learnng settng. We assume that we have an nput space X, and an unknown

More information

Perfect Competition and the Nash Bargaining Solution

Perfect Competition and the Nash Bargaining Solution Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange

More information

Portfolios with Trading Constraints and Payout Restrictions

Portfolios with Trading Constraints and Payout Restrictions Portfolos wth Tradng Constrants and Payout Restrctons John R. Brge Northwestern Unversty (ont wor wth Chrs Donohue Xaodong Xu and Gongyun Zhao) 1 General Problem (Very) long-term nvestor (eample: unversty

More information

CSC 411 / CSC D11 / CSC C11

CSC 411 / CSC D11 / CSC C11 18 Boostng s a general strategy for learnng classfers by combnng smpler ones. The dea of boostng s to take a weak classfer that s, any classfer that wll do at least slghtly better than chance and use t

More information

Conservative Contextual Linear Bandits

Conservative Contextual Linear Bandits Conservatve Contextual Lnear Bandts Abbas Kazeroun 1, Mohammad Ghavamzadeh 2, Yasn Abbas-Yadkor 3, and Benjamn Van Roy 4 arxv:1611.06426v2 [stat.ml] 4 Mar 2017 1 Stanford Unversty, abbask@stanford.edu

More information

A SEPARABLE APPROXIMATION DYNAMIC PROGRAMMING ALGORITHM FOR ECONOMIC DISPATCH WITH TRANSMISSION LOSSES. Pierre HANSEN, Nenad MLADENOVI]

A SEPARABLE APPROXIMATION DYNAMIC PROGRAMMING ALGORITHM FOR ECONOMIC DISPATCH WITH TRANSMISSION LOSSES. Pierre HANSEN, Nenad MLADENOVI] Yugoslav Journal of Operatons Research (00) umber 57-66 A SEPARABLE APPROXIMATIO DYAMIC PROGRAMMIG ALGORITHM FOR ECOOMIC DISPATCH WITH TRASMISSIO LOSSES Perre HASE enad MLADEOVI] GERAD and Ecole des Hautes

More information

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014

COS 511: Theoretical Machine Learning. Lecturer: Rob Schapire Lecture #16 Scribe: Yannan Wang April 3, 2014 COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #16 Scrbe: Yannan Wang Aprl 3, 014 1 Introducton The goal of our onlne learnng scenaro from last class s C comparng wth best expert and

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

Dynamic Programming. Lecture 13 (5/31/2017)

Dynamic Programming. Lecture 13 (5/31/2017) Dynamc Programmng Lecture 13 (5/31/2017) - A Forest Thnnng Example - Projected yeld (m3/ha) at age 20 as functon of acton taken at age 10 Age 10 Begnnng Volume Resdual Ten-year Volume volume thnned volume

More information

Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model

Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model Capacty Constrants Across Nests n Assortment Optmzaton Under the Nested Logt Model Jacob B. Feldman School of Operatons Research and Informaton Engneerng, Cornell Unversty, Ithaca, New York 14853, USA

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin

LOW BIAS INTEGRATED PATH ESTIMATORS. James M. Calvin Proceedngs of the 007 Wnter Smulaton Conference S G Henderson, B Bller, M-H Hseh, J Shortle, J D Tew, and R R Barton, eds LOW BIAS INTEGRATED PATH ESTIMATORS James M Calvn Department of Computer Scence

More information

Some modelling aspects for the Matlab implementation of MMA

Some modelling aspects for the Matlab implementation of MMA Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton

More information

COS 511: Theoretical Machine Learning

COS 511: Theoretical Machine Learning COS 5: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture #0 Scrbe: José Sões Ferrera March 06, 203 In the last lecture the concept of Radeacher coplexty was ntroduced, wth the goal of showng that

More information

Computing Correlated Equilibria in Multi-Player Games

Computing Correlated Equilibria in Multi-Player Games Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,

More information

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0 MODULE 2 Topcs: Lnear ndependence, bass and dmenson We have seen that f n a set of vectors one vector s a lnear combnaton of the remanng vectors n the set then the span of the set s unchanged f that vector

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

1 Gradient descent for convex functions: univariate case

1 Gradient descent for convex functions: univariate case prnceton unv. F 13 cos 51: Advanced Algorthm Desgn Lecture 19: Gong wth the slope: offlne, onlne, and randomly Lecturer: Sanjeev Arora Scrbe: hs lecture s about gradent descent, a popular method for contnuous

More information

Pricing and Resource Allocation Game Theoretic Models

Pricing and Resource Allocation Game Theoretic Models Prcng and Resource Allocaton Game Theoretc Models Zhy Huang Changbn Lu Q Zhang Computer and Informaton Scence December 8, 2009 Z. Huang, C. Lu, and Q. Zhang (CIS) Game Theoretc Models December 8, 2009

More information

Assortment Optimization under the Paired Combinatorial Logit Model

Assortment Optimization under the Paired Combinatorial Logit Model Assortment Optmzaton under the Pared Combnatoral Logt Model Heng Zhang, Paat Rusmevchentong Marshall School of Busness, Unversty of Southern Calforna, Los Angeles, CA 90089 hengz@usc.edu, rusmevc@marshall.usc.edu

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

Which Separator? Spring 1

Which Separator? Spring 1 Whch Separator? 6.034 - Sprng 1 Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng Whch Separator? Mamze the margn to closest ponts 6.034 - Sprng 3 Margn of a pont " # y (w $ + b) proportonal

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Lecture 10 Support Vector Machines. Oct

Lecture 10 Support Vector Machines. Oct Lecture 10 Support Vector Machnes Oct - 20-2008 Lnear Separators Whch of the lnear separators s optmal? Concept of Margn Recall that n Perceptron, we learned that the convergence rate of the Perceptron

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Basic Statistical Analysis and Yield Calculations

Basic Statistical Analysis and Yield Calculations October 17, 007 Basc Statstcal Analyss and Yeld Calculatons Dr. José Ernesto Rayas Sánchez 1 Outlne Sources of desgn-performance uncertanty Desgn and development processes Desgn for manufacturablty A general

More information

Game Theory. Lecture Notes By Y. Narahari. Department of Computer Science and Automation Indian Institute of Science Bangalore, India February 2008

Game Theory. Lecture Notes By Y. Narahari. Department of Computer Science and Automation Indian Institute of Science Bangalore, India February 2008 Game Theory Lecture Notes By Y. Narahar Department of Computer Scence and Automaton Indan Insttute of Scence Bangalore, Inda February 2008 Chapter 10: Two Person Zero Sum Games Note: Ths s a only a draft

More information

Approximation Methods for Pricing Problems under the Nested Logit Model with Price Bounds

Approximation Methods for Pricing Problems under the Nested Logit Model with Price Bounds Approxmaton Methods for Prcng Problems under the Nested Logt Model wth Prce Bounds W. Zachary Rayfeld School of Operatons Research and Informaton Engneerng, Cornell Unversty, Ithaca, New York 14853, USA

More information

Interactive Bi-Level Multi-Objective Integer. Non-linear Programming Problem

Interactive Bi-Level Multi-Objective Integer. Non-linear Programming Problem Appled Mathematcal Scences Vol 5 0 no 65 3 33 Interactve B-Level Mult-Objectve Integer Non-lnear Programmng Problem O E Emam Department of Informaton Systems aculty of Computer Scence and nformaton Helwan

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

arxiv: v1 [math.oc] 25 Jun 2008

arxiv: v1 [math.oc] 25 Jun 2008 Irrevocable Mult-Armed Bandt Polces Vvek F. Faras Rtesh Madan arxv:0806.4133v1 [math.oc] 25 Jun 2008 February 13, 2018 Abstract Ths paper consders the mult-armed bandt problem wth multple smultaneous arm

More information

The Expectation-Maximization Algorithm

The Expectation-Maximization Algorithm The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.

More information

Computational and Statistical Learning theory Assignment 4

Computational and Statistical Learning theory Assignment 4 Coputatonal and Statstcal Learnng theory Assgnent 4 Due: March 2nd Eal solutons to : karthk at ttc dot edu Notatons/Defntons Recall the defnton of saple based Radeacher coplexty : [ ] R S F) := E ɛ {±}

More information

Excess Error, Approximation Error, and Estimation Error

Excess Error, Approximation Error, and Estimation Error E0 370 Statstcal Learnng Theory Lecture 10 Sep 15, 011 Excess Error, Approxaton Error, and Estaton Error Lecturer: Shvan Agarwal Scrbe: Shvan Agarwal 1 Introducton So far, we have consdered the fnte saple

More information

An Experiment/Some Intuition (Fall 2006): Lecture 18 The EM Algorithm heads coin 1 tails coin 2 Overview Maximum Likelihood Estimation

An Experiment/Some Intuition (Fall 2006): Lecture 18 The EM Algorithm heads coin 1 tails coin 2 Overview Maximum Likelihood Estimation An Experment/Some Intuton I have three cons n my pocket, 6.864 (Fall 2006): Lecture 18 The EM Algorthm Con 0 has probablty λ of heads; Con 1 has probablty p 1 of heads; Con 2 has probablty p 2 of heads

More information

,, MRTS is the marginal rate of technical substitution

,, MRTS is the marginal rate of technical substitution Mscellaneous Notes on roducton Economcs ompled by eter F Orazem September 9, 00 I Implcatons of conve soquants Two nput case, along an soquant 0 along an soquant Slope of the soquant,, MRTS s the margnal

More information

Numerical Solution of Ordinary Differential Equations

Numerical Solution of Ordinary Differential Equations Numercal Methods (CENG 00) CHAPTER-VI Numercal Soluton of Ordnar Dfferental Equatons 6 Introducton Dfferental equatons are equatons composed of an unknown functon and ts dervatves The followng are examples

More information

A Simple Inventory System

A Simple Inventory System A Smple Inventory System Lawrence M. Leems and Stephen K. Park, Dscrete-Event Smulaton: A Frst Course, Prentce Hall, 2006 Hu Chen Computer Scence Vrgna State Unversty Petersburg, Vrgna February 8, 2017

More information

14 Lagrange Multipliers

14 Lagrange Multipliers Lagrange Multplers 14 Lagrange Multplers The Method of Lagrange Multplers s a powerful technque for constraned optmzaton. Whle t has applcatons far beyond machne learnng t was orgnally developed to solve

More information