Optimality of Myopic Policy for a Class of Monotone Affine Restless Multi-Armed Bandit
- Owen Chandler
- 5 years ago
Transcription
1 University of Southern California
Optimality of Myopic Policy for a Class of Monotone Affine Restless Multi-Armed Bandit
Parisa Mansourifard (USC), Tara Javidi (UCSD), Bhaskar Krishnamachari (USC)
Dec 10, 2012
2 Introduction
Multi-Armed Bandit:
- Stochastic decision problem
- Selecting from several alternative arms each time
- Playing an arm yields an immediate reward
- How to play the arms to maximize the expected discounted or average reward over a horizon
- Trade-off between exploration and exploitation
- Two categories: Rested and Restless
3 Introduction
Rested MAB:
- The state of the played arm changes according to a known Markovian rule (Bayesian)
- The remaining arms stay frozen
- Optimal policy: an index can be assigned to the state of each arm, and the arm with the largest index is played each time
- Referred to as the Gittins index
4 Introduction
Restless MAB:
- The states of all arms, even those that are not selected, evolve in a Markovian fashion each time
- An index policy is not in general optimal, while the Whittle index policy is optimal under some constraint on the average number of arms that can be played each time
- PSPACE-hard problem
- In the literature: special classes of RMAB for which particular heuristics are optimal
- Our contribution: a general class of RMAB for which a simple index policy (the myopic policy) is optimal
5 Introduction
- Myopic policy: selects an arm with the highest immediate reward each time, ignoring the impact of the current action on the future reward
- Recently, several studies: optimality of the myopic policy under certain conditions for multiple arms evolving as i.i.d. two-state discrete-time Markov chains
- Our contribution: generalizing beyond the specific setting of two-state Markov chains to real-valued states
[Figure: two-state Markov chain with states "bad" and "good" and transition probabilities $p_{00}$, $p_{01}$, $p_{10}$, $p_{11}$]
6 Problem Formulation
A class of RMAB:
- $n$ independent and stochastically identical arms
- Finite horizon $T$, time steps $t = 1, \ldots, T$
- Only one arm can be played each time
- Each arm is in a real-valued state $s \in [s_0, s_{\max}]$
- Playing an arm with state $s$ yields an immediate reward with expectation $R(s)$
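7 Problem Formulation
- $s_j(t)$: the state of arm $j$ at time $t$
- The state of the selected arm will reset stochastically
- The state of the non-played arms evolves according to a deterministic function $\tau$
- State transition of arm $j$:
$$
s_j(t+1) =
\begin{cases}
s_{\max} & \text{with probability } p(s_j(t)), \text{ if arm } j \text{ is played} \\
s_0 & \text{with probability } 1 - p(s_j(t)), \text{ if arm } j \text{ is played} \\
\tau(s_j(t)) & \text{if arm } j \text{ is not played}
\end{cases}
$$
- Prior work: a specific setting of our formulation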
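The transition rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `step_states`, `p`, `tau`, `s0`, and `s_max` are chosen here, and the affine instance on [0.2, 0.8] is a hypothetical example.

```python
import random


def step_states(states, played, p, tau, s0, s_max, rng=random):
    """Advance all arm states by one time step.

    The played arm resets to s_max with probability p(s) and to s0
    otherwise; every other arm moves deterministically to tau(s).
    """
    next_states = []
    for j, s in enumerate(states):
        if j == played:
            next_states.append(s_max if rng.random() < p(s) else s0)
        else:
            next_states.append(tau(s))
    return next_states


# Hypothetical instance (assumed values, for illustration only).
s0, s_max = 0.2, 0.8
p = lambda s: s                 # reset probability, assumed affine
tau = lambda s: 0.2 + 0.6 * s   # passive update, assumed affine

new_states = step_states([0.3, 0.5, 0.7], played=2, p=p, tau=tau,
                         s0=s0, s_max=s_max, rng=random.Random(0))
```

Only the played arm (index 2 here) is random; the other two entries of `new_states` are fully determined by `tau`.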
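8 Problem Formulation
- Policy vector: $\pi = [\pi_1, \ldots, \pi_T]$
- The policy $\pi_t$ maps the current state vector to the action of selecting an arm at time $t$: $\pi_t \in \{1, \ldots, n\}$
- The current state vector is a sufficient statistic due to the Markovian dynamics
- Goal: maximizing the total discounted expected reward:
$$
\max_{\pi} \; E^{\pi}\left[\sum_{t=1}^{T} \beta^{t-1} R\big(s_{\pi_t}(t)\big)\right]
$$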
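9 Problem Formulation
Value function $V_t(s_1, \ldots, s_n)$: maximum expected remaining reward starting from time $t$.
Recursive equations (DP), where $s_{\max}$ and $s_0$ occupy the $i$-th position and $\beta$ is the discount factor:
$$
V_T(s_1, \ldots, s_n) = \max_{i \in \{1, \ldots, n\}} R(s_i)
$$
$$
V_t(s_1, \ldots, s_n) = \max_{i \in \{1, \ldots, n\}} V_t^{i}(s_1, \ldots, s_n)
$$
$$
V_t^{i}(s_1, \ldots, s_n) = R(s_i) + \beta \Big[ p(s_i)\, V_{t+1}\big(\tau(s_1), \ldots, s_{\max}, \ldots, \tau(s_n)\big) + \big(1 - p(s_i)\big)\, V_{t+1}\big(\tau(s_1), \ldots, s_0, \ldots, \tau(s_n)\big) \Big]
$$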
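A finite-horizon recursion of this kind can be evaluated exactly for small instances, because each played arm has only two possible outcomes. The sketch below assumes a discount factor `beta` and uses hypothetical names (`make_value_function`, `R`, `p`, `tau`, `s0`, `s_max`); the numeric instance is made up for illustration.

```python
from functools import lru_cache


def make_value_function(R, p, tau, s0, s_max, beta, T):
    """Return V(t, states): max expected discounted reward from time t to T."""

    @lru_cache(maxsize=None)
    def V(t, states):
        best = float("-inf")
        for i, s in enumerate(states):
            q = R(s)  # immediate expected reward of playing arm i
            if t < T:
                moved = [tau(x) for x in states]  # passive update of all arms
                up = tuple(moved[:i] + [s_max] + moved[i + 1:])
                down = tuple(moved[:i] + [s0] + moved[i + 1:])
                q += beta * (p(s) * V(t + 1, up) + (1 - p(s)) * V(t + 1, down))
            best = max(best, q)
        return best

    return V


# Hypothetical instance (assumed values, for illustration only).
V = make_value_function(R=lambda s: s, p=lambda s: s,
                        tau=lambda s: 0.2 + 0.6 * s,
                        s0=0.2, s_max=0.8, beta=0.9, T=3)
terminal_value = V(3, (0.3, 0.5, 0.7))   # last step: best immediate reward
initial_value = V(1, (0.3, 0.5, 0.7))    # three steps remaining
```

The recursion branches into 2n successor states per step, so this direct approach is only practical for short horizons and a handful of arms.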
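10 Problem Formulation
- Optimal policy: at each time $t$, select the arm with the maximum expected remaining reward, $\pi_t^{\text{optimal}} = \arg\max_{i = 1, \ldots, n} V_t^{i}(s_1, \ldots, s_n)$, where $V_t^{i}$ is the expected remaining reward when arm $i$ is played at time $t$ and the optimal policy is followed afterwards
- Myopic policy: maximizing the current expected reward, $\pi_t^{\text{myopic}} = \arg\max_{i = 1, \ldots, n} R(s_i(t)) = \arg\max_{i = 1, \ldots, n} s_i(t)$
- $R(s)$ is assumed monotonically increasing in $s$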
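The myopic rule is simple enough to state directly in code, and its expected discounted reward can be computed exactly by following the two reset outcomes at each step. This is a hedged sketch: the function names, the discount factor `beta`, and the numeric instance are assumptions made here, not taken from the slides.

```python
def myopic_action(states, R):
    """Index of the arm with the highest immediate expected reward."""
    return max(range(len(states)), key=lambda i: R(states[i]))


def myopic_value(t, states, R, p, tau, s0, s_max, beta, T):
    """Exact expected discounted reward of the myopic policy from time t to T."""
    i = myopic_action(states, R)
    s = states[i]
    value = R(s)
    if t < T:
        moved = [tau(x) for x in states]  # passive update of all arms
        up = moved[:i] + [s_max] + moved[i + 1:]
        down = moved[:i] + [s0] + moved[i + 1:]
        value += beta * (
            p(s) * myopic_value(t + 1, up, R, p, tau, s0, s_max, beta, T)
            + (1 - p(s)) * myopic_value(t + 1, down, R, p, tau, s0, s_max, beta, T)
        )
    return value


# Hypothetical instance (assumed values, for illustration only).
R = lambda s: s
p = lambda s: s
tau = lambda s: 0.2 + 0.6 * s
choice = myopic_action([0.3, 0.7, 0.5], R)
two_step = myopic_value(1, [0.3, 0.7, 0.5], R, p, tau,
                        s0=0.2, s_max=0.8, beta=0.9, T=2)
```

Because `R` is increasing in this instance, `myopic_action` reduces to picking the arm with the largest state.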
11 Main Result
Conditions:
- $R(s)$, $p(s)$, and $\tau(s)$ are monotonically increasing and affine functions of the state $s$
- $\tau$ is a contraction mapping
Theorem: Under the above conditions, and [...], the myopic policy is optimal.
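For an affine map f(s) = a·s + b, the two properties named in the conditions are easy to check: f is monotonically increasing when a > 0 and a contraction when |a| < 1. The sketch below applies this to an example chosen here rather than taken from the slides: the belief update of a two-state Markov chain, ω ↦ ω·p11 + (1 − ω)·p01, with hypothetical transition probabilities. All helper names are illustrative.

```python
def affine(slope, intercept):
    """Return the affine map s -> slope * s + intercept."""
    return lambda s: slope * s + intercept


def is_increasing_affine(slope):
    """An affine map is strictly increasing iff its slope is positive."""
    return slope > 0


def is_contraction_affine(slope):
    """An affine map on the reals is a contraction iff |slope| < 1."""
    return abs(slope) < 1


# Hypothetical two-state chain: probability of moving to "good" from
# "bad" (p01) and from "good" (p11).
p01, p11 = 0.2, 0.8
slope, intercept = p11 - p01, p01      # belief update: p01 + (p11 - p01) * w
tau = affine(slope, intercept)

increasing = is_increasing_affine(slope)
contraction = is_contraction_affine(slope)

# A contraction has a unique fixed point; repeated passive updates converge to it.
fixed_point = intercept / (1 - slope)
s = 0.8
for _ in range(200):
    s = tau(s)
```

In this instance an arm that is never played drifts toward `fixed_point`, which is the stationary probability of the "good" state.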
12 Conclusion
- We proved the optimality of the myopic policy for a general class of restless multi-armed bandits
- Generalizing to non-identical arms, non-affine evolution
- Generalizing to multi-dimensional states
- Identifying conditions for problems in which the myopic policy is not optimal but another efficient, possibly index-based, policy is optimal