Information Relaxations and Duality in Stochastic Dynamic Programs


OPERATIONS RESEARCH, Vol. 58, No. 4, Part 1 of 2, July–August 2010, INFORMS

Information Relaxations and Duality in Stochastic Dynamic Programs

David B. Brown, James E. Smith, Peng Sun
Fuqua School of Business, Duke University, Durham, North Carolina
{dbbrown@duke.edu, jes9@duke.edu, psun@duke.edu}

We describe a general technique for determining upper bounds on maximal values (or lower bounds on minimal costs) in stochastic dynamic programs. In this approach, we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and impose a penalty that punishes violations of nonanticipativity. In applications, the hope is that this relaxed version of the problem will be simpler to solve than the original dynamic program. The upper bounds provided by this dual approach complement lower bounds on values that may be found by simulating with heuristic policies. We describe the theory underlying this dual approach and establish weak duality, strong duality, and complementary slackness results that are analogous to the duality results of linear programming. We also study properties of good penalties. Finally, we demonstrate the use of this dual approach in an adaptive inventory control problem with an unknown and changing demand distribution and in valuing options with stochastic volatilities and interest rates. These are complex problems of significant practical interest that are quite difficult to solve to optimality. In these examples, our dual approach requires relatively little additional computation and leads to tight bounds on the optimal values.

Subject classifications: dynamic programming; duality; inventory control; option pricing.
Area of review: Stochastic Models.
History: Received November 2007; revisions received November 2008, July 2009, September 2009; accepted October 2009. Published online in Articles in Advance April 9, 2010.

1. Introduction

In principle, dynamic programming provides a powerful framework for determining optimal policies in complex decision problems where uncertainty is resolved and decisions are made over time. However, the widespread use of dynamic programming is hampered by the so-called curse of dimensionality: the size of the state space typically grows exponentially in the number of state variables considered. In contrast, Monte Carlo simulation methods typically scale well with the number of state variables considered and, given a control policy, it is not difficult to simulate a complex dynamic system with many uncertainties. Simulating with a feasible policy provides a lower bound on the expected value (or upper bound on the expected costs) of an optimal policy, but Monte Carlo simulation typically does not provide a good way to identify an optimal policy or provide an upper bound on the value of an optimal policy.

In this paper, we describe a dual approach for studying stochastic dynamic programs (DPs) that focuses on providing an upper bound on the optimal expected value. This dual approach consists of two elements: (1) we relax the nonanticipativity constraints that require decisions to depend only on the information available at the time a decision is made and (2) we impose a penalty that punishes violations of the nonanticipativity constraints. By relaxing the nonanticipativity constraints, we can often greatly simplify the DP. For example, we study an adaptive inventory control problem with an unknown and changing demand distribution and stochastic ordering costs. Here a perfect information relaxation assumes the decision maker (DM) knows all demands and costs before placing any orders. With this information, the problem of choosing an optimal ordering schedule is a deterministic DP that can be solved quite easily.
In another example, we study an option-pricing model with stochastic volatilities and stochastic interest rates and consider an imperfect information relaxation where volatilities and interest rates are known in advance but the stock price is not: with the volatilities and interest rates known, we can value the option using standard lattice methods. Because these relaxations assume the DM has more information than is truly available, they lead to an upper bound on value. Without any penalty for using this additional information, the bound obtained is often quite weak. Informally, we say a penalty is dual feasible if it does not penalize any policy that is nonanticipative; the penalties may, however, punish policies that do not satisfy the nonanticipativity constraints. We will show that in principle we can always find a dual feasible penalty that provides a tight bound, i.e., strong duality holds. We view this dual approach as a complement to the use of simulation methods and modern approximate

dynamic programming methods for studying DPs (see, e.g., Bertsekas and Tsitsiklis 1996, de Farias and Van Roy 2003, Powell 2007, or Adelman and Mersereau 2008). As mentioned earlier, given a candidate policy (perhaps identified using a heuristic approach or using approximate DP techniques), we can use standard simulation techniques to estimate the expected value with this policy and thereby generate a lower bound on the expected value with an optimal policy. Our dual approach can then be used to generate an upper bound on the value of an optimal policy. If the difference between the expected value with this candidate policy and the upper bound on the optimal value is small, we may conclude that the candidate policy is good enough and not continue searching for a better policy. If the difference is large, it may be worthwhile to work harder to find a better policy and/or a tighter upper bound. In our inventory example, we will use the dual bounds to determine whether a simple myopic ordering policy is good enough or whether we need to consider more complex one- or two-period look-ahead policies. In the option-pricing example, we use the dual bounds to study the effectiveness of an exercise policy that ignores uncertainty about volatilities and interest rates. In both examples, we will also demonstrate how we can use the results of the dual problem to identify ways to improve these heuristic policies.

Our interest in this dual approach for DPs was motivated by the need to evaluate the quality of heuristic policies in applications, and inspired by Haugh and Kogan's (2004) dual approach for placing bounds on the value of an American option; Rogers (2002) independently proposed a similar dual approach, also applied to option pricing. Both Haugh and Kogan (2004) and Rogers (2002) consider the use of what we call perfect information relaxations and establish their main results using martingale arguments.
Haugh and Kogan propose a particular method for generating penalties or, in their terminology, dual martingales based on approximate value functions and demonstrate the use of this method in high-dimensional option-pricing problems. Andersen and Broadie (2004) propose an alternative method for generating dual martingales based on approximate policies. Glasserman (2004) provides a nice overview of this work. We generalize the work of Haugh and Kogan (2004), Rogers (2002), and Andersen and Broadie (2004) in several ways. First, rather than focusing exclusively on option-pricing problems, we consider general stochastic DPs. Second, rather than focusing exclusively on perfect information relaxations, we consider general information relaxations. Finally, we present a general method for constructing good penalties that includes and extends the methods proposed by Haugh and Kogan and Andersen and Broadie. These generalizations expand the scope and flexibility of this dual approach.

The idea of relaxing the nonanticipativity constraints has also been studied in the stochastic programming literature (see, e.g., Rockafellar and Wets 1991, Shapiro and Ruszczyński 2003, Shapiro et al. 2009). Rogers (2007) also recently (independently) proposed a dual approach for Markov decision processes. In short, though these alternative approaches have similarities with ours, our formulation is different and leads to results that we believe are both simpler and more general. The stochastic programming formulation requires the reward functions and sets of feasible actions to be convex and the penalties considered are linear functions of the actions; they consider only perfect information relaxations. Rogers focuses on Markov decision processes and considers only perfect information relaxations and penalties that are a function of the state variable only; Rogers does not present any example applications. In contrast, our framework allows general reward functions and action spaces, allows general penalty functions, and considers imperfect as well as perfect information relaxations.
Moreover, our duality proofs are quite simple and direct and do not rely on sophisticated convex duality or martingale arguments. Finally, our inventory control and option-pricing examples demonstrate the power of this dual approach in some complex problems of significant practical interest.

We begin in §2 by defining the basic framework and theory underlying the dual approach; the main results are analogous to the duality results of linear programming. We then illustrate the approach in the inventory control and option-pricing examples in §§3–4. We offer a few concluding remarks in §5. The electronic companion provides supporting information: Appendix A contains most of the proofs; Appendix B compares our results to similar results in stochastic programming and develops the connections to linear programming more fully; and Appendix C provides some details of the adaptive inventory example. The electronic companion to this paper is available as part of the online version that can be found at http://or.journal.informs.org/.

2. The Basic Framework and Results

We begin by describing the general formulation of the primal stochastic DP in §2.1. We then present our main duality results in §2.2 and discuss an approach for generating good penalties in §2.3.

2.1. General Framework

Uncertainty in the DP is described by a probability space (Ω, F, P), where Ω is the set of possible outcomes (with typical element ω), F is a σ-algebra that describes the set of all possible events (an event is a subset of Ω), and P is a probability measure describing the likelihoods of the various events. Time is discrete and indexed by t = 0, ..., T. The DM's state of information evolves over time and is described by a filtration 𝔽 = (F_0, ..., F_T), where the σ-algebra F_t describes the DM's state of information at the beginning of period t, i.e., F_t is the set of events that will be known to be true or false at time t. We will refer to 𝔽 as the natural filtration. We

require all filtrations to satisfy F_t ⊆ F_{t+1} ⊆ F for all t < T, so the DM does not forget what she once knew. We will assume that F_0 = {∅, Ω}, so the DM initially knows nothing about the outcome of the uncertainties. A function (or random variable) f defined on Ω is measurable with respect to a σ-algebra F_t (or F_t-measurable) if, for every Borel set R in the range of f, we have f⁻¹(R) ∈ F_t; we can interpret f being F_t-measurable as meaning the result of f depends only on the information known in period t. A sequence of functions (f_0, ..., f_T) is said to be adapted to a filtration 𝔽 (or 𝔽-adapted) if each function f_t is measurable with respect to F_t.

In the DP model, the DM will choose an action a_t in period t from the set A_t; we let A ≡ A_0 × ··· × A_T denote the set of all feasible action sequences a. The DM's choice of actions is described by a policy α that selects a sequence of actions α(ω) in A for each outcome ω in Ω (i.e., α: Ω → A). We let 𝔸 denote the set of all policies. In the primal DP, we assume that the DM's choices are nonanticipative in that the choice of action a_t in period t depends only on what is known at the beginning of period t. More formally, we require policies to be adapted to the natural filtration 𝔽 in that a policy's selection of the first t+1 actions (a_0, ..., a_t) must be measurable with respect to F_t. We let 𝔸_𝔽 be the set of all nonanticipative policies.¹

The goal of the DP is to select a nonanticipative policy to maximize the expected total reward. The rewards are defined by a sequence of reward functions r_0(a, ω), ..., r_T(a, ω), where the reward in period t depends on the action sequence a selected and the outcome ω. We let r(a, ω) = Σ_{t=0}^T r_t(a, ω) denote the total reward; discounting can be incorporated into the period-t reward function r_t. The primal DP is then:

    sup_{α ∈ 𝔸_𝔽} E[r(α)].    (1)

Here E[r(α)] could be written more explicitly as E[r(α(ω), ω)], where policy α selects an action sequence α(ω) that depends on the random outcome ω, and the rewards depend on the action sequence selected by α and the outcome. We will typically suppress the dependence on ω and interpret r(α) as a random variable representing the reward generated with policy α.
It is instructive to write the primal DP (1) in the standard Bellman-style recursive form. First, we will assume that the period-t rewards r_t are F_t-measurable for each set of actions and depend only on the first t+1 actions (a_0, ..., a_t); we will write r_t(a, ω) as r_t(a_0, ..., a_t, ω) with the understanding that (a_0, ..., a_t) is selected from the full sequence of actions a. For t > 0, let A_t(a_0, ..., a_{t−1}) be the subset of period-t actions in A_t that are feasible given the prior choice of actions (a_0, ..., a_{t−1}). We take the terminal value function V_{T+1}(a_0, ..., a_T) = 0 and, for t = 0, ..., T, we define

    V_t(a_0, ..., a_{t−1}) = sup_{a_t ∈ A_t(a_0, ..., a_{t−1})} { r_t(a_0, ..., a_t) + E[V_{t+1}(a_0, ..., a_t) | F_t] }.    (2)

Here both sides are random variables (and therefore implicitly functions of the outcome ω) and we select an optimal action a_t for each outcome. Because the rewards r_t are assumed to be F_t-measurable and the expected continuation values are conditioned on F_t, and thus F_t-measurable, the objective function on the right is F_t-measurable for each set of actions (a_0, ..., a_t). Thus, the supremum over actions a_t is also F_t-measurable, which implies that V_t is F_t-measurable. There is no loss in restricting the choice of actions a_t to be F_t-measurable; therefore, if the suprema on the right side of (2) are attained, we can construct a nonanticipative optimal policy using this recursion. The final value V_0 is equal to the optimal value of (1).

2.2. The Dual Approach

In our dual approach to the DP (1), we relax the requirement that the policies be nonanticipative and impose penalties that punish violations of the nonanticipativity constraints. We define relaxations of the nonanticipativity requirement by considering alternative information structures. We say that a filtration 𝔾 = (G_0, ..., G_T) is a relaxation of the natural filtration 𝔽 = (F_0, ..., F_T) if, for each t, F_t ⊆ G_t; we abbreviate this by writing 𝔽 ⊆ 𝔾. 𝔾 being a relaxation of 𝔽 means that the DM knows more in every period under 𝔾 than she knows under 𝔽. The perfect information filtration 𝕀 = (I_0, ..., I_T) is given by taking I_t = F for all t. We let 𝔸_𝔾 denote the set of policies that are adapted to 𝔾. For any relaxation 𝔾 of 𝔽, we have 𝔸_𝔽 ⊆ 𝔸_𝔾 ⊆ 𝔸_𝕀 = 𝔸; thus, as we relax the filtration, we expand the set of feasible policies.
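When the problem has Markov structure, so that period-t rewards and feasible transitions depend on a state variable rather than the full action history, recursion (2) reduces to familiar backward induction. A minimal sketch for a finite-state, finite-action problem (all function and parameter names here are hypothetical, not from the paper):

```python
def backward_induction(T, states, actions, reward, trans):
    """Solve V_t(s) = max_a { r_t(s, a) + E[V_{t+1}(s') | s, a] } backwards in
    time; trans(t, s, a) yields (probability, next_state) pairs."""
    V = {(T + 1, s): 0.0 for s in states}   # terminal values, as in V_{T+1} = 0
    policy = {}
    for t in range(T, -1, -1):
        for s in states:
            best, best_a = float("-inf"), None
            for a in actions:
                q = reward(t, s, a) + sum(p * V[(t + 1, s2)]
                                          for p, s2 in trans(t, s, a))
                if q > best:
                    best, best_a = q, a
            V[(t, s)], policy[(t, s)] = best, best_a
    return V, policy

# Trivial sanity check: one state, reward equal to the action chosen.
V, pol = backward_induction(
    T=1, states=["s"], actions=[0, 1],
    reward=lambda t, s, a: float(a),
    trans=lambda t, s, a: [(1.0, "s")])
# V[(0, "s")] == 2.0 (choose action 1 in both periods)
```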
The set of penalties 𝒵 is the set of all functions z(a, ω) that, like the total rewards, depend on the choice of action sequence a and the outcome ω. As with rewards, we will typically write the penalties as an action-dependent random variable z(a) (= z(a, ω)) or a policy-dependent random variable z(α) (= z(α(ω), ω)), suppressing the dependence on the outcome ω. We define the set of dual feasible penalties 𝒵_𝔽 to be those penalties that do not penalize nonanticipative policies (in expectation), that is,

    𝒵_𝔽 = { z ∈ 𝒵 : E[z(α_F)] ≤ 0 for all α_F in 𝔸_𝔽 }.    (3)

Policies that do not satisfy the nonanticipativity constraints (and thus are not feasible to implement) may have positive expected penalties. We can place an upper bound on the expected reward associated with any nonanticipative policy by relaxing the nonanticipativity constraint on policies and imposing a dual feasible penalty. This simple result can be viewed as a version of the weak duality lemma for linear programming:

Lemma 2.1 (Weak Duality). If α_F and z are primal and dual feasible, respectively (i.e., α_F ∈ 𝔸_𝔽 and z ∈ 𝒵_𝔽), and 𝔾 is a relaxation of 𝔽, then

    E[r(α_F)] ≤ sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].    (4)

Proof. With z, α_F, and 𝔾 as defined in the lemma, we have

    E[r(α_F)] ≤ E[r(α_F) − z(α_F)] ≤ sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].

The first inequality holds because z ∈ 𝒵_𝔽 (thus E[z(α_F)] ≤ 0) and the second because α_F ∈ 𝔸_𝔾.

Thus, any information relaxation with any dual feasible penalty provides an upper bound on all DP solutions. With a fixed penalty z, weaker relaxations lead to larger sets of feasible policies and weaker bounds. For example, if we consider the perfect information relaxation, the set of relaxed policies is simply the set 𝔸 of all policies and all actions are selected with full knowledge of the outcome ω. Thus, the weak duality lemma implies that for any α_F in 𝔸_𝔽 and z in 𝒵_𝔽,

    E[r(α_F)] ≤ sup_{α ∈ 𝔸} E[r(α) − z(α)] = E[ sup_{a ∈ A} ( r(a, ω) − z(a, ω) ) ].    (5)

If we take the penalty z = 0, this upper bound is the expected value with perfect information. Note that the upper bound (5) is in a form that is convenient for Monte Carlo simulation: we can estimate the expected value on the right side of (5) by randomly generating outcomes ω and solving a deterministic inner problem of choosing an action sequence a to maximize the penalized objective r(a, ω) − z(a, ω) for each ω. For instance, in our inventory example, the perfect information relaxation assumes the DM has knowledge of all demands and costs before making any ordering decisions. We estimate the dual bound by randomly generating demand/cost scenarios in the outer simulation, and the inner problem is a simple deterministic DP that chooses optimal ordering quantities in each demand/cost scenario. With imperfect information relaxations, we can often still use Monte Carlo simulation to estimate the upper bounds. For instance, in our option-pricing example, we will randomly generate interest rates and volatilities in the outer simulation, and the inner problem is a one-dimensional DP that considers uncertainty in stock prices.
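To make the simulation procedure concrete, here is a minimal sketch for a toy optimal-stopping problem (the price model, threshold heuristic, and all parameter values are illustrative assumptions of ours, not the paper's examples). The outer loop samples outcomes; with zero penalty, the perfect-information inner problem simply exercises at the best time along each path, and estimating both bounds from the same paths guarantees the ordering path by path:

```python
import random

def simulate_path(T=10, s0=100.0, sigma=0.05):
    """One sample path of a (hypothetical) multiplicative random walk."""
    path, s = [s0], s0
    for _ in range(T):
        s *= 1.0 + random.gauss(0.0, sigma)
        path.append(s)
    return path

def payoff(s, strike=100.0):
    """Put-style exercise payoff."""
    return max(strike - s, 0.0)

def bounds(n_paths=2000, threshold=95.0, seed=1):
    """Lower bound: a nonanticipative threshold heuristic.
    Upper bound: perfect information relaxation with zero penalty, as in (5)."""
    random.seed(seed)
    lb = ub = 0.0
    for _ in range(n_paths):
        path = simulate_path()
        # Inner problem with perfect information: exercise at the best time.
        ub += max(payoff(s) for s in path)
        # Heuristic: exercise the first time the price falls below threshold,
        # otherwise at expiration.
        value = payoff(path[-1])
        for s in path:
            if s < threshold:
                value = payoff(s)
                break
        lb += value
    return lb / n_paths, ub / n_paths

lb, ub = bounds()
# Weak duality: the heuristic's value never exceeds the perfect-information bound.
assert lb <= ub
```

The gap between `lb` and `ub` is what a good penalty is meant to close.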
If we minimize over the dual feasible penalties in (4), we obtain the dual of the primal DP (1):

    inf_{z ∈ 𝒵_𝔽} sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].    (6)

By the weak duality lemma, if we identify a policy α_F and penalty z that are primal and dual feasible, respectively, such that equality holds in (4), then α_F and z must be optimal for their respective problems. In such a case, there would be no gap between the values given by these primal and dual solutions. If the primal solution is bounded, there is always a dual feasible penalty that yields no gap. For example, consider the penalty z*(a, ω) = r(a, ω) − v*, where v* is the optimal value of the primal DP (1). This z* is dual feasible (because E[r(α_F)] ≤ v* for all α_F ∈ 𝔸_𝔽) and trivially optimal: no matter what policy is selected, the penalized objective function r(a, ω) − z*(a, ω) is equal to v*. The existence of this trivially optimal penalty is not helpful in practice because it requires knowing the optimal value v* of the primal DP. It does, however, show that there is no gap between the solutions to the primal and dual problems and that, in principle, we could determine the maximal expected reward in the primal DP (1) by solving the dual problem (6). This result is analogous to the strong duality theorem of linear programming.

Theorem 2.1 (Strong Duality). Let 𝔾 be a relaxation of 𝔽. Then

    sup_{α_F ∈ 𝔸_𝔽} E[r(α_F)] = inf_{z ∈ 𝒵_𝔽} sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].    (7)

Furthermore, if the primal problem on the left is bounded, the dual problem on the right has an optimal solution z* that achieves this bound.

The complementary slackness condition further characterizes the relationship between the primal and dual problems, saying that for a primal-dual pair (α_F, z) to be optimal, it is necessary and sufficient for α_F to have zero expected penalty and for α_F to solve the dual problem in the following sense.

Theorem 2.2 (Complementary Slackness). Let α_F and z be feasible solutions for the primal and dual problems, respectively (i.e., α_F ∈ 𝔸_𝔽 and z ∈ 𝒵_𝔽), with information relaxation 𝔾.
A necessary and sufficient condition for these to be optimal solutions for their respective problems is that E[z(α_F)] = 0 and

    E[r(α_F) − z(α_F)] = sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].    (8)

Equation (8) can be interpreted as saying that with an optimal penalty, in the dual problem the DM will be content to choose a policy that is nonanticipative even though she has the option of choosing a policy that is not. In applications, we will compare the heuristic policies α_F used to compute a lower bound with the policies α_G selected in the dual problem to see if we can identify some way to improve the heuristic policy.

Finally, we note a useful property of this dual approach: if we can simplify the primal problem by focusing on some subset of policies, we can restrict the dual problem to focus on policies in this same set. For example, if we know that the optimal policy for the primal problem is myopic or has a threshold structure, we can simplify the dual problem by considering only policies that have the same structure. This leads to dual bounds that are at least as tight and perhaps easier to compute than the dual bounds that do not include this constraint. We summarize this property as follows.

Proposition 2.1 (Structured Policies). If for some 𝔸' ⊆ 𝔸 we have sup_{α_F ∈ 𝔸_𝔽} E[r(α_F)] = sup_{α_F ∈ 𝔸_𝔽 ∩ 𝔸'} E[r(α_F)], then, for any dual feasible z, we have

    sup_{α_F ∈ 𝔸_𝔽} E[r(α_F)] ≤ sup_{α_G ∈ 𝔸_𝔾 ∩ 𝔸'} E[r(α_G) − z(α_G)] ≤ sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z(α_G)].    (9)

Moreover, the inequalities also hold for all z such that E[z(α_F)] ≤ 0 for all α_F in 𝔸_𝔽 ∩ 𝔸'.

For instance, in our option-pricing example, in the primal problem it is never optimal to exercise a call option prior to expiration, except possibly just before a dividend is paid. However, in the dual problem with a relaxed filtration, early exercise may be optimal. In our numerical experiments for this example, we will use this structural result and impose a "no early exercise" constraint in the dual problem for call options. The resulting bounds are both tighter and easier to compute than they would be without this constraint.

2.3. Good Penalties

In our discussion so far, we have considered the set of all dual feasible penalties. We now focus on identifying good penalties that are likely to be useful in practice. The main approach we will use to generate penalties is described in the following proposition. We will show shortly that we can, in principle, generate an optimal dual penalty using this approach, so that strong duality holds even when restricted to these good penalties.

Proposition 2.2 (Constructing Good Penalties). Let 𝔾 be a relaxation of 𝔽 and let w_0(a, ω), ..., w_T(a, ω) be a sequence of generating functions defined on A × Ω, where each w_t depends only on the first t+1 actions (a_0, ..., a_t) of a. Define

    z_t(a) = E[w_t(a) | G_t] − E[w_t(a) | F_t]  and  z(a) = Σ_{t=0}^T z_t(a).

Then: (i) For all α_F in 𝔸_𝔽, we have E[z_t(α_F) | F_t] = 0 for all t, and E[z(α_F)] = 0; and (ii) (z_0(a), ..., z_T(a)) is adapted to 𝔾 and z_t depends only on the first t+1 actions (a_0, ..., a_t) of a.

Property (i) of the proposition implies that the penalties z generated using the proposition will always be dual feasible in that E[z(α_F)] ≤ 0 for α_F in 𝔸_𝔽, but is stronger in that it implies the inequality defining feasibility holds with equality. The complementary slackness condition (Theorem 2.2) shows that an optimal penalty z will assign zero expected penalty to an optimal primal policy.
Penalties generated using Proposition 2.2 will assign zero expected penalty to all nonanticipative policies. Property (ii) of the proposition implies that the penalized objective function can be decomposed into period-t components r_t − z_t that depend only on what is known at period t under 𝔾 and the actions chosen in or before period t. This means we can solve the dual problem using a DP recursion like that of Equation (2), using the penalized rewards and based on filtration 𝔾 rather than 𝔽. Specifically, the terminal dual value function is V*_{T+1}(a_0, ..., a_T) = 0 and, for t = 0, ..., T, we have

    V*_t(a_0, ..., a_{t−1})
      = sup_{a_t ∈ A_t(a_0, ..., a_{t−1})} { r_t(a_0, ..., a_t) − z_t(a_0, ..., a_t) + E[V*_{t+1}(a_0, ..., a_t) | G_t] }
      = sup_{a_t ∈ A_t(a_0, ..., a_{t−1})} { r_t(a_0, ..., a_t) − E[w_t(a_0, ..., a_t) | G_t] + E[w_t(a_0, ..., a_t) | F_t] + E[V*_{t+1}(a_0, ..., a_t) | G_t] }.    (10)

The initial value, V*_0, provides an upper bound on the primal DP (1) or, equivalently, (2).

We can construct an optimal penalty using Proposition 2.2 by taking the generating functions to be based on the optimal DP value functions given by (2). Specifically, if we take w_t(a) = V_{t+1}(a_0, ..., a_t), we arrive at an optimal dual penalty z*(a) that we will refer to as the ideal penalty. It is easy to show by induction that with this choice of generating function, the dual value functions are equal to the corresponding primal value functions, i.e., V*_t = V_t. This is trivially true for the terminal values (both are zero). If we assume that V*_{t+1} = V_{t+1}, terms cancel and (10) reduces to the expression for V_t given in Equation (2). Thus, with this choice of generating function, we obtain an optimal penalty for any information relaxation. The following theorem summarizes this result and adds a bit more.

Theorem 2.3 (The Ideal Penalty). Let 𝔾 be a relaxation of 𝔽 and let z* be defined as in Proposition 2.2 by taking w_t(a) = V_{t+1}(a_0, ..., a_t). Then z* is dual feasible and optimal in that

    sup_{α_F ∈ 𝔸_𝔽} E[r(α_F)] = sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z*(α_G)].    (11)

Moreover, if α_F achieves the supremum for the primal problem on the left side of (11), then α_F is also optimal for the dual problem on the right. Finally, if 𝔾 is the perfect information relaxation and α_F is an optimal policy, then r(α_F) − z*(α_F) = E[r(α_F)] almost always.
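The zero-gap property of the ideal penalty is easy to verify by hand on a toy problem. In the sketch below (an illustrative example of ours, not one from the paper), a single action must be chosen before a fair coin flip is observed; the perfect-information bound with zero penalty is loose, while the ideal penalty built from the value function recovers the primal optimal value exactly, with zero variance:

```python
# Toy DP: choose a0 in {0, 1} before observing omega in {0, 1} (fair coin);
# reward 1 if the action matches the outcome. Primal optimal value = 0.5.
P = 0.5
OUTCOMES = (0, 1)

def reward(a0, omega):
    return 1.0 if a0 == omega else 0.0

# Perfect information, zero penalty: E[max_a r(a, omega)] -- a loose bound.
loose = sum(P * max(reward(a, w) for a in (0, 1)) for w in OUTCOMES)

# Ideal penalty from the value function: w0(a) = V1(a0, omega) = reward(a0, omega),
# so z0(a, omega) = E[w0 | G0] - E[w0 | F0] = reward(a0, omega) - 0.5.
def z(a0, omega):
    return reward(a0, omega) - 0.5

# Penalized perfect-information bound: E[max_a (r - z)] -- tight, zero variance,
# since r - z = 0.5 for every action and every outcome.
tight = sum(P * max(reward(a, w) - z(a, w) for a in (0, 1)) for w in OUTCOMES)

print(loose, tight)  # -> 1.0 0.5
```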
Although the value functions will not be known in applications, the form of z* illustrates the ideal that we would like to approximate with our choice of penalties. Intuitively, we would like to choose penalties that eliminate the benefit of choosing actions based on the information in 𝔾 rather than relying on the information in the natural filtration 𝔽. That is, we want to choose a generating function w_t so that the differences E[w_t(a) | G_t] − E[w_t(a) | F_t] approximate the differences E[V_{t+1}(a) | G_t] − E[V_{t+1}(a) | F_t] and the

conditional expectations (E[w_t(a) | G_t] and E[w_t(a) | F_t]) are not too difficult to compute.

In applications, we can approximate z* in a variety of ways. Haugh and Kogan (2004) and Andersen and Broadie (2004) proposed methods for generating penalties (or dual martingales) in the option-pricing context that can be generalized to our setting. Generalizing Haugh and Kogan's approach, we can approximate the ideal penalty z* by using an approximate value function v̂_t(a, ω) in place of the true value function V_t(a, ω). This leads to period-t penalties of the form z_t(a) = E[v̂_{t+1}(a) | G_t] − E[v̂_{t+1}(a) | F_t]. To use this approach, we must somehow estimate or calculate the conditional expectations E[v̂_{t+1}(a) | G_t] and E[v̂_{t+1}(a) | F_t]. Haugh and Kogan consider the perfect information relaxation (G_t = F), so E[v̂_{t+1}(a) | G_t] = v̂_{t+1}(a) can be evaluated directly for any sample path. They estimate E[v̂_{t+1}(a) | F_t] using a nested simulation procedure: for each sample path and each period t, they estimate E[v̂_{t+1}(a) | F_t] by generating random successors to the period-t state and averaging the next-period values v̂_{t+1}(a) in these successor states. The penalties generated using this approach will lead to valid bounds as long as the nested estimates of these conditional expectations are unbiased; see Proposition 2.3(iv) below.

Andersen and Broadie (2004) also consider a perfect information relaxation, but base their penalty on a given policy rather than an approximate value function. In our framework, their approach can be seen as approximating the value function V_t(a, ω) with v̂_t(a, ω) = E[r(α_t(a)) | F_t], where α_t(a) denotes a policy that takes the first t actions (a_0, ..., a_{t−1}) to match those of a and then continues according to some given rule. The penalty is then z_t(a) = E[v̂_{t+1}(a) | G_t] − E[v̂_{t+1}(a) | F_t]; with a perfect information relaxation, this is equivalent to z_t(a) = E[r(α_{t+1}(a)) | F_{t+1}] − E[r(α_{t+1}(a)) | F_t]. Andersen and Broadie generate sample paths in the outer simulation and estimate the conditional expectations using nested simulation.
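In simulation terms, the Haugh and Kogan construction can be sketched as follows (the function names and the successor sampler are hypothetical placeholders for a model-specific implementation). Along each outer path, the period-t penalty term is v̂_{t+1} evaluated at the realized next state, minus a nested-simulation average of v̂_{t+1} over random successors of the period-t state:

```python
def hk_penalty_terms(path, v_hat, sample_successor, n_inner=100):
    """For a perfect information relaxation, estimate the penalty terms
    z_t = v_hat(t+1, realized successor) - E[v_hat(t+1, successor) | period-t state]
    along one outer sample path. The conditional expectation is replaced by a
    nested-simulation average; unbiasedness of that estimate is what keeps the
    resulting bound valid (Proposition 2.3(iv))."""
    terms = []
    for t in range(len(path) - 1):
        realized = v_hat(t + 1, path[t + 1])
        nested = sum(v_hat(t + 1, sample_successor(t, path[t]))
                     for _ in range(n_inner)) / n_inner
        terms.append(realized - nested)
    return terms

# Sanity check: with a deterministic "successor" the conditional expectation is
# exact and every penalty term is exactly zero.
terms = hk_penalty_terms([0.0, 1.0, 2.0],
                         v_hat=lambda t, s: s,
                         sample_successor=lambda t, s: s + 1.0)
print(terms)  # -> [0.0, 0.0]
```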
Whereas the nested simulations in the Haugh-Kogan approach consider a single period, here each period's nested simulation follows the specified policy through the end of the horizon or until the policy calls for stopping. Because each future period is considered in each nested simulation, the work involved in the Andersen-Broadie approach potentially grows with T², where T is the number of periods considered in the model. Again, these penalties will lead to valid bounds as long as the estimates of these conditional expectations are unbiased.

In practice, there will typically be a trade-off between the quality of the bound and the computational effort required to compute it. We can control this trade-off through our choice of information relaxation and penalty. The following proposition provides some properties of penalties and information relaxations that are useful in understanding these trade-offs.

Proposition 2.3 (Properties of Penalties and Relaxations).
(i) Let 𝔾₁ and 𝔾₂ be filtrations satisfying 𝔽 ⊆ 𝔾₁ ⊆ 𝔾₂ and let z¹ and z² be penalties constructed using Proposition 2.2 with relaxations 𝔾₁ and 𝔾₂ and a common sequence of generating functions w_0, ..., w_T. Then

    sup_{α ∈ 𝔸_𝔾₁} E[r(α) − z¹(α)] ≤ sup_{α ∈ 𝔸_𝔾₂} E[r(α) − z²(α)].    (12)

(ii) For any two dual feasible penalties z¹ and z² and information relaxation 𝔾, we have

    inf_{α_G ∈ 𝔸_𝔾} E[z²(α_G) − z¹(α_G)] ≤ sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z¹(α_G)] − sup_{α_G ∈ 𝔸_𝔾} E[r(α_G) − z²(α_G)] ≤ sup_{α_G ∈ 𝔸_𝔾} E[z²(α_G) − z¹(α_G)].    (13)

(iii) Let 𝔾 and ℍ be filtrations satisfying 𝔽 ⊆ ℍ ⊆ 𝔾 and let w_0, ..., w_T be a sequence of generating functions satisfying the conditions of Proposition 2.2. The penalty z given by z_t(a) = E[w_t(a) | G_t] − E[w_t(a) | H_t] and z(a) = Σ_{t=0}^T z_t(a) satisfies the results of Proposition 2.2.

(iv) Let 𝔾 be a relaxation of 𝔽 and z(a) = Σ_{t=0}^T z_t(a) be a dual feasible penalty such that z_t(a) is G_t-measurable and depends only on the first t+1 actions of a. Suppose ẑ(a) = Σ_{t=0}^T ẑ_t(a), where each ẑ_t(a) depends only on the first t+1 actions of a, and further suppose that each ẑ_t(a) is an unbiased estimate of z_t(a) in that E[ẑ_t(a) | G_t] = z_t(a). Let 𝔾̂ be a relaxation of 𝔾 that assumes that, in addition to what is known under 𝔾, the values of ẑ_t(a) are revealed in period t.
Then

    sup_{α ∈ 𝔸_𝔾} E[r(α) − z(α)] ≤ sup_{α ∈ 𝔸_𝔾̂} E[r(α) − ẑ(α)].    (14)

The first result of the proposition says that if we generate penalties with a common set of generating functions, looser relaxations lead to weaker bounds. For example, we may find that the bounds given by using a simple generating function (say, w = 0) may be good enough with one information relaxation, but not good enough with a looser relaxation. The second result of the proposition can be viewed as a continuity property: if the penalties z¹ and z² are close in that the difference E[z²(α_G) − z¹(α_G)] is small for all α_G, then the bounds provided by the two penalties will also be close. For example, if z² is the ideal penalty z* and therefore yields the optimal upper bound, the bound given by some other penalty z¹ will exceed the optimal bound by no more than sup_{α_G} E[z*(α_G) − z¹(α_G)]. In this sense, penalties that are close to the ideal penalty will lead to bounds that are close to optimal. The third result can be helpful for determining penalties when E[w_t(a) | F_t] is difficult to calculate. For instance, in the option-pricing example, if we assume that under the natural filtration volatility is unobserved, we may be able

to simplify the computation of bounds by calculating penalties using a filtration that assumes that the volatility is observed. The final result of Proposition 2.3 concerns the effects of errors when penalties are estimated, for example, using nested simulations as in Haugh and Kogan (2004) and Andersen and Broadie (2004). Here we can imagine the probability space (Ω, F, P) as including the uncertainties associated with the estimation of penalties as well as the original model uncertainties. These estimation uncertainties are not revealed under filtrations 𝔽 or 𝔾 and do not affect the rewards or penalties and thus are irrelevant to the primal and true dual problem. The estimates are, however, revealed under 𝔾̂, and actions are selected to maximize the estimated penalized reward r(a) − ẑ(a) rather than the true penalized reward r(a) − z(a). Here we see that when these estimated penalties are unbiased, we obtain estimates of the bounds that are valid but weaker than the bounds given by using the penalty z itself. Glasserman (2004) provides some numerical results studying the quality of the bounds in an option-pricing example with varying numbers of trials in the nested simulations. His results (and others') show the importance of estimating penalties accurately. Our results for the option example in §4.7 confirm this finding.

2.4. Summary of Approach

Before turning to our examples, it may be worthwhile to summarize the steps involved in our approach. Given a dynamic programming model:
- Identify a heuristic policy that can be used in a simulation study to estimate a lower bound on the optimal value (or upper bound on the optimal cost) for the problem.
- Choose an information relaxation that makes it easy to determine optimal decisions given the additional information in the relaxation. It is often natural to start by considering a perfect information relaxation, although in some problems there may be other natural starting points.
- Find a penalty that does not greatly complicate the calculation of optimal decisions with the chosen information relaxation.
We can start with zero penalty, but this may lead to weak upper bounds.
- Estimate lower and upper bounds on the optimal value. In our examples, we will typically estimate the upper and lower bounds simultaneously in a single simulation.
- If the gap between bounds is sufficiently small, we may conclude that the heuristic policy is good enough for use in practice, and we are done. If not, we can study the differences between the heuristic policies and the dual policies and see if these suggest some ideas for improving the heuristic policies, relaxations, or penalties.

In the next two sections, we will study two complex examples and discuss issues involved in choosing heuristic policies, information relaxations, and penalties in these applications.

3. Example: Adaptive Inventory Control

Our first example is an adaptive inventory control model where demand is nonstationary and partially observed, meaning the probability distribution for demand changes over time and the true demand distribution is not known. These kinds of models are of significant practical interest, but are quite difficult to solve. Treharne and Sox (2002) consider several heuristic policies and evaluate the performance of these policies in a set of five-period examples that they were able to solve exactly. We illustrate our dual bounding approach by evaluating some of these heuristic policies in larger versions of Treharne and Sox's examples.

3.1. The Model

The goal is to find a policy for ordering goods over T periods (t = 0, ..., T−1) to minimize the expected total costs. The inventory level at the beginning of period t is denoted by x_t and the amount ordered in period t is a_t. The demand in period t is uncertain and denoted by d_t. The inventory level evolves according to x_{t+1} = x_t + a_t − d_t = x_0 + Σ_{τ=0}^{t} (a_τ − d_τ), where x_0 is the initial inventory level. This evolution equation assumes unmet demand is backordered and appears as a negative inventory level entering the next period; the equation also assumes there is no lead time required to fulfill the orders. The order quantities and demands are assumed to be nonnegative integers.
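The evolution equation and cost structure just described are straightforward to simulate under a heuristic policy, which is how the primal-side bound (an upper bound on cost, since this is a minimization problem) is estimated. Below is a minimal sketch using an order-up-to (base-stock) heuristic; the parameter values and the uniform demand model are illustrative assumptions of ours, not Treharne and Sox's calibration:

```python
import random

def simulate_order_up_to(T=10, x0=0, base_stock=5, c=1.0, h=0.2, p=1.0,
                         demand=lambda t: random.randint(0, 9), seed=0):
    """Simulate x_{t+1} = x_t + a_t - d_t under an order-up-to heuristic and
    return the total cost (ordering plus holding/backorder) for one scenario."""
    random.seed(seed)
    x, total = x0, 0.0
    for t in range(T):
        a = max(0, base_stock - x)   # order up to the base-stock level
        d = demand(t)
        x = x + a - d                # unmet demand backordered (x may go negative)
        total += c * a + h * max(0, x) + p * max(0, -x)
    return total
```

Averaging this total over many scenarios sampled from the demand model gives the heuristic's estimated expected cost; in the paper's setting the demand scenarios would be drawn from the belief/Markov demand model rather than this fixed distribution.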
The period-t demand d_t is drawn from a distribution θ_t that changes stochastically, following a Markov process. The demand d_t is observed at the end of period t, but the distribution θ_t is never observed. We begin with a prior distribution π_0 on the initial demand distribution θ_0 and update this over time, with the period-(t + 1) distribution π_{t+1}(π_t, d_t) taking into account the prior beliefs π_t, the observed demand d_t, and the possibility of the distribution changing. In each period, there are ordering costs as well as costs associated with holding inventory or failing to meet demand. The cost of ordering a_t units is c_t a_t, where c_t is the cost of ordering one item or unit. The cost of holding inventory x_{t+1} from period t into period t + 1 is f_t(x_{t+1}) = h_t max(0, x_{t+1}) + p_t max(0, −x_{t+1}), where h_t is the per-unit cost of holding excess inventory in period t and p_t is the per-unit penalty associated with backordering unmet demand in period t. Treharne and Sox assume a terminal cost of −c_T x_T to capture the value (or cost) of holding inventory (or unmet demand) at the end of the planning horizon. We generalize Treharne and Sox's model by allowing the ordering costs c_t to vary following a Markov chain that is independent of the demands d_t and demand distributions θ_t; we assume that the period-t ordering cost c_t is known at the beginning of period t. This generalization will allow us to consider a broader range of information relaxations and makes the problem harder to solve. Placing this model in the general framework of §2.1, the actions a_0, ..., a_{T−1} are the order quantities for each

period and the action sequences a are drawn from the set A of T-vectors of nonnegative integers. An outcome ω is a sample path that includes the demands, demand distributions, and ordering costs for each period and a terminal cost c_T; that is, the outcomes are of the form ω = (d_0, θ_0, c_0, ..., d_{T−1}, θ_{T−1}, c_{T−1}, c_T). The natural filtration corresponds to knowing the demands d_0, ..., d_{t−1} and costs c_0, ..., c_t at the beginning of period t. Because the goal here is to minimize costs, we can either rewrite the primal DP (1) as a minimization problem or else take the rewards in (1) to be the negative costs.

The structure of the adaptive inventory model is perhaps clearer if we view the problem as a partially observable Markov decision process and write it recursively. The period-t state variable is (x_t, c_t, π_t), where x_t is the inventory level at the beginning of period t, c_t is the ordering cost in period t, and π_t is the probability distribution on the period-t demand distribution θ_t. In this recursive formulation, it is convenient to take the decision variables to be the order-up-to levels y_t = x_t + a_t rather than the order quantities a_t. We can then write the period-t cost-to-go function J_t, for t = 0, ..., T − 1, as

J_t(x_t, c_t, π_t) = −c_t x_t + min_{y_t ≥ x_t} { c_t y_t + E[ f_t(y_t − d_t) + J_{t+1}(y_t − d_t, c_{t+1}, π_{t+1}(π_t, d_t)) | π_t, c_t ] }.   (15)

Here d_t and c_{t+1} denote the random period-t demand and period-(t + 1) cost, and the terminal cost function is J_T(x_T, c_T, π_T) = −c_T x_T. What makes this problem difficult to solve is that each demand sequence (d_0, ..., d_{t−1}) leads to a different π_t and, consequently, the number of scenarios that must be considered grows exponentially in the number of periods considered. For instance, the problems that Treharne and Sox solved to optimality had 5 time periods, 19 possible demand levels, and one ordering cost level. To find an optimal policy, they had to solve the optimization problem (15) for approximately 138,000 different (π_t, c_t)-scenarios.
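The belief update π_{t+1}(π_t, d_t) behind (15) is a standard hidden-Markov filtering step: reweight the current belief by the likelihood of the observed demand, normalize, and push the result through the transition matrix of the unobserved distribution process. A minimal sketch, with a made-up two-regime example rather than the paper's three truncated negative binomial distributions:

```python
def update_belief(pi, P, lik, d):
    """One step of pi_{t+1}(pi_t, d_t): pi is the belief over demand
    regimes, P[i][j] the regime transition probability, and lik[i][d]
    the probability of observing demand d under regime i."""
    n = len(pi)
    # Bayes step: condition the belief on the observed demand d.
    post = [pi[i] * lik[i][d] for i in range(n)]
    z = sum(post)
    post = [w / z for w in post]
    # Prediction step: the regime may change before the next period.
    return [sum(post[i] * P[i][j] for i in range(n)) for j in range(n)]

# Regime 0 favors demand 0; regime 1 favors demand 1 and is absorbing.
P = [[0.9, 0.1], [0.0, 1.0]]
lik = [[0.7, 0.3], [0.2, 0.8]]
pi1 = update_belief([0.5, 0.5], P, lik, d=1)

# Counting distinct demand histories reproduces the scenario count
# above: 1 + 19 + 19^2 + 19^3 + 19^4 = 137,561, i.e., about 138,000.
n_scenarios = sum(19 ** t for t in range(5))
```

Because π_t is a deterministic function of the observed demand history, every distinct demand sequence produces a distinct belief state, which is exactly why the scenario tree grows exponentially.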
In our numerical examples, we will consider 10 time periods, 19 demand levels, and three cost levels; we would have to solve on the order of 10^15 such optimization problems to find an optimal policy.

3.2. Heuristic Policies

Because of the complexity of the primal problem, Treharne and Sox propose using simpler limited-look-ahead policies that choose an order quantity that is optimal for a truncated version of the model that looks only zero, one, or two periods into the future. For t = 0, ..., T − 1, the L-period look-ahead cost-to-go function is defined as

J^L_t(x_t, c_t, π_t) = −c_t x_t + min_{y_t ≥ x_t} { c_t y_t + E[ f_t(y_t − d_t) + J^{L−1}_{t+1}(y_t − d_t, c_{t+1}, π_{t+1}(π_t, d_t)) | π_t, c_t ] }.   (16)

In the terminal cases with t = T or L = −1, we take J^L_t(x_t, c_t, π_t) = −c_t x_t. When simulating the inventory system using an L-period look-ahead policy, we determine the order quantity for a particular (π_t, c_t)-scenario by solving (16) for the optimal order-up-to level y_t. We then draw the random demand d_t and next-period cost c_{t+1}, calculate the updated probability distribution π_{t+1}, and repeat the process by finding the order quantity for the next period using the L-period look-ahead value function starting at (x_{t+1}, c_{t+1}, π_{t+1}).

The complexity of these limited-look-ahead policies grows exponentially with the look-ahead horizon L. In our numerical examples, we take L = 0, 1, and 2, and we must solve 1, 58, and 1,141 scenario-specific optimization problems (respectively) to determine the recommended order quantity for each period. If we estimate the expected costs of these policies using a simulation with T periods and K trials, we must solve KT, 58KT, or 1,141KT optimization problems for the 0-, 1-, and 2-period look-ahead policies, respectively.

3.3. Information Relaxations

We will study three different information relaxations in this example, each of which allows us to avoid considering the full tree of all possible cost/demand scenarios. First, we will consider the perfect information relaxation. In this case, in the outer simulation we randomly generate the full sequence of ordering costs c_0, ..., c_T, demand distributions θ_0, ..., θ_{T−1}, and actual demands d_0, ..., d_{T−1}. In the inner problem, we determine optimal order quantities by solving a simple deterministic DP.
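For intuition, the perfect-information inner problem (with zero penalty) can be sketched as a small backward induction over integer inventory levels for one sampled cost/demand path. The inventory grid bound and the parameters in the example are illustrative assumptions, not the paper's settings:

```python
def perfect_info_cost(x0, demands, costs, cT, h, p, y_max):
    """Deterministic DP for one known demand/cost path: choose
    order-up-to levels y_t >= x_t to minimize total cost, with a
    terminal credit -cT * x_T for leftover inventory."""
    T = len(demands)
    grid = range(-y_max, y_max + 1)
    J = {x: -cT * x for x in grid}            # terminal values
    for t in reversed(range(T)):
        Jt = {}
        for x in grid:
            best = float("inf")
            for y in range(max(x, -y_max), y_max + 1):
                xn = y - demands[t]
                if xn < -y_max:
                    continue                  # skip off-grid states
                stage = (costs[t] * (y - x)
                         + h * max(0, xn) + p * max(0, -xn))
                best = min(best, stage + J[xn])
            Jt[x] = best
        J = Jt
    return J[x0]

# With equal costs and no terminal credit, ordering exactly the demand
# in each period is optimal here: total cost 2 + 3 = 5.
best_cost = perfect_info_cost(0, [2, 3], [1.0, 1.0], 0.0, 1.0, 10.0, y_max=10)
```

Averaging this path-wise minimal cost over many sampled paths gives the (zero-penalty) perfect-information lower bound on the optimal expected cost.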
With this relaxation, we will be selecting random samples from the large tree of possible cost/demand scenarios. Second, we will consider a tighter, imperfect information relaxation that assumes the demand distributions θ_0, ..., θ_{T−1} and actual demands d_0, ..., d_{T−1} are known in advance, but assumes the ordering costs c_t are not known until period t. In this case, we randomly generate the demand distributions and demands in the outer simulation. In the inner problem, we solve a small stochastic DP that determines cost-dependent order quantities for each period. The third relaxation is tighter than the first two: it assumes that the actual costs c_t and demands d_t are revealed as in the natural filtration (in period t and period t + 1, respectively), but the demand distribution θ_t is known in period t; the natural filtration assumes θ_t is never observed. In this case, if we assume zero penalty, the dual problem can be formulated as a Markov DP with state variable (x_t, c_t, θ_t); the number of scenarios that must be considered no longer grows exponentially in T, and this DP is easy to solve.

3.4. Penalties

As discussed in §2.3, the ideal penalty takes the generating function w_t to be the optimal continuation value, i.e.,

the period-(t + 1) value function V_{t+1}. Here we will take the generating function w^L_t for the L-period look-ahead penalty to be given by the limited-look-ahead cost-to-go functions defined by Equation (16),

w^L_t = J^{L−1}_{t+1}(y_t − d_t, c_{t+1}, π_{t+1}(π_t, d_t)).   (17)

For example, in the myopic case with L = 0, the generating function is simply −c_{t+1}(y_t − d_t). Although we would not expect the (L − 1)-period look-ahead cost-to-go functions to provide a very good approximation of the actual cost-to-go functions J_{t+1} (they consider the costs over a small fraction of the total time frame), these limited-look-ahead cost functions may provide a reasonable approximation of the change in costs due to having the additional information provided by the relaxation instead of the natural filtration.

In the perfect information relaxation, the full sequence of demands and costs is known in advance and generated in the outer simulation. Let d̂^k_0, ..., d̂^k_{T−1} and ĉ^k_0, ..., ĉ^k_T denote the sequences of these values generated in the kth trial of the simulation, and let π̂^k_t denote the period-t probability distribution on θ_t given by starting with the prior distribution π_0 and updating based on seeing d̂^k_0, ..., d̂^k_{t−1}. Following Equation (10), we can write the inner problem in the kth trial with the L-period look-ahead penalty as

Ĵ^L_{t,k}(x_t) = −ĉ^k_t x_t + min_{y_t ≥ x_t} { ĉ^k_t y_t − J^{L−1}_{t+1}(y_t − d̂^k_t, ĉ^k_{t+1}, π̂^k_{t+1}) + Ĵ^L_{t+1,k}(y_t − d̂^k_t) + E[ f_t(y_t − d_t) + J^{L−1}_{t+1}(y_t − d_t, c_{t+1}, π_{t+1}(π̂^k_t, d_t)) | π̂^k_t, ĉ^k_t ] },   (18)

with terminal value Ĵ^L_{T,k}(x_T) = −ĉ^k_T x_T. Note that the limited-look-ahead cost-to-go function J^{L−1}_{t+1} and the expectation in (18) would be calculated when determining the limited-look-ahead order quantity for this trial (see Equation (16)). Consequently, when simulating to estimate the expected costs with an L-period look-ahead policy, it is not difficult to simultaneously estimate the corresponding dual bound: we need only solve one additional scenario-specific optimization problem for each period.
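A useful sanity check on penalties of this form: because the penalty is a generating function minus its conditional expectation given the natural filtration, any decision that does not depend on the yet-unrevealed demand incurs zero penalty on average. A toy check using the myopic generating function −c_{t+1}(y_t − d_t) and two made-up, equally likely demand scenarios:

```python
c_next = 0.6
scenarios = {3: 0.5, 7: 0.5}              # demand -> probability

def w(y, d):
    """Myopic generating function -c_{t+1} * (y - d)."""
    return -c_next * (y - d)

def penalty(y, d):
    """z = w(y, d) - E[w(y, .)]: charges decisions tailored to d."""
    ew = sum(q * w(y, s) for s, q in scenarios.items())
    return w(y, d) - ew

# A nonanticipative order-up-to level uses the same y in every
# scenario, so its expected penalty is exactly zero:
expected_z = sum(q * penalty(5, d) for d, q in scenarios.items())
```

This zero-conditional-mean property is what keeps the penalized inner problem a valid bound on the optimal expected cost.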
The dual bounds are also easy to calculate with the generating function of Equation (17) for the imperfect information relaxation that assumes that the demands d_0, ..., d_{T−1} and demand distributions θ_0, ..., θ_{T−1} are known in advance, but the ordering costs c_t are revealed over time as in the natural filtration. In this case, the inner problem is a stochastic DP that explicitly considers the uncertainty about the ordering costs; see Appendix A.7 for details. By Proposition 2.2(i), we know that the bounds given by using this imperfect information relaxation will be at least as good as those given by the perfect information relaxation.

The third information relaxation we consider in this problem assumes the demand distributions θ_t are observed in period t, but c_t and d_t are revealed over time as in the natural filtration. As discussed in §3.3, with zero penalty, this dual problem can be formulated as a Markov decision problem that is not difficult to solve. However, with this relaxation, the generating function of Equation (17) leads to an inner problem that is not easy to solve. The difficulty is that the generating functions depend on the probability distributions π_{t+1} that, in turn, depend on the whole history of demands d_0, ..., d_t. This dependence destroys the Markovian structure that makes it easy to solve the inner problem with no penalty. Thus, the generating function (17) works well with the first two relaxations, but not with the third.

3.5. Numerical Results

In this section, we describe numerical results for the adaptive inventory control example. Our choice of parameters closely follows Treharne and Sox (2002). Specifically, following Treharne and Sox, we assume that there are three possible random demand distributions, each of which is a truncated negative binomial distribution that ranges from 0 to 18 units. The three distributions are low, medium, and high, with means of 1, 9, and 16 units, respectively, before truncation. We consider seven different transition probability matrices representing various trends for the demand distributions.
The holding costs h_t are set to $1.00 per unit and the backorder costs p_t are $1.00, $1.86, or $4.00 per unit. Finally, we consider four different priors on the initial demand distribution θ_0: the first, third, and fourth represent cases where the demands are most likely to be high, medium, or low, respectively; the second prior is a uniform distribution across the three different demand distributions. In total, there are 84 different combinations of parameters to consider (7 transition matrices × 3 backorder costs × 4 priors). In each case, we assume the initial ordering costs c_0 are $0.60 per unit and later costs take values $0.00, $0.60, or $1.20, following a Markov chain. (These assumptions are described in detail in Appendix C.) Finally, we take the planning horizon T to be 10 periods and assume zero initial inventory.

In our numerical experiments, we calculate upper and lower bounds on the optimal expected costs using the zero-, one-, or two-step look-ahead policies and penalties. For each combination of model parameters, we estimate the bounds using a simulation of 1,000 trials. Figure 1 summarizes the results for the perfect information relaxation. Appendix C provides the numbers underlying this figure (estimated means and standard errors) as well as results for the imperfect information relaxation described in §3.3. We call the plot of Figure 1 an "aquarium plot." In the figure, there are 84 sets of bars, each appearing (if you have bad eyesight!) like a tropical fish. Each fish represents the results for a particular set of parameters and

[Figure 1. Upper and lower bounds with the perfect information relaxation. Vertical axis: expected inventory cost. Legend: blue (left) bars, myopic upper and lower bounds; black (middle) bars, one-period look-ahead upper and lower bounds; red (right) bars, two-period look-ahead upper and lower bounds; black dotted line, observable demand distribution lower bounds. Cases grouped by transition matrix: stable with positive, negative, or zero correlation; upward, slow; upward, fast; downward, slow; downward, fast.]

consists of three vertical bars with blue, black, and red colors and horizontal markers on each end. The blue bars on the left of each fish represent the myopic (or 0-period look-ahead) upper and lower bounds; the black bars in the middle represent the 1-period look-ahead upper and lower bounds; and the red bars on the right represent the two-period look-ahead upper and lower bounds. The different sets of parameters are grouped first according to the transition matrices (indicated at the bottom), then by backorder costs (with left to right representing high to low costs), and last by the initial priors.

In most cases, the gaps between bounds narrow as we increase the look-ahead horizon, albeit at varying rates. In many cases, the bounds are all quite narrow and the fish look like minnows; in these cases, we could probably assume that the myopic policies are good enough and not consider more complex policies. In the cases with a stable transition matrix with positive correlation (on the left of the figure), the fish have relatively wide tails on the left, but narrow quickly: here we may not be satisfied with the quality of the myopic policy, but may find the one- or two-period look-ahead policies to be good enough. There are, however, a few cases with "downward, slow" and "downward, fast" transitions (on the right side of the figure) where the gaps remain relatively large even with a two-period look-ahead policy. We will return to these cases in §3.6 below.
Appendix C provides results for the imperfect information relaxation where all demands are assumed to be known in advance, but costs are revealed sequentially over time. The estimated bounds with imperfect information are quite similar to those with perfect information, but the imperfect information bounds are more precisely estimated. Across the 84 cases, the mean standard errors for the dual bounds with the imperfect information relaxation average $0.216, $0.172, and $0.137 for the zero-, one-, and two-period look-ahead bounds, respectively. With the perfect information relaxation, the corresponding mean standard errors for the dual bounds average $0.821, $0.554, and $ . Intuitively, the improved precision in the imperfect information bounds comes from eliminating random sampling variations associated with costs by explicitly enumerating the cost scenarios. In the imperfect information case, we also enumerate the cost scenarios when estimating the expected cost of the heuristic policy; this is somewhat more time consuming (for a fixed number of samples) but improves the precision of the estimated bounds.
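The variance-reduction idea in the last paragraph, enumerating the cost scenarios exactly instead of sampling them, can be sketched directly. The chain below uses the three cost levels from the experiments but an invented transition matrix:

```python
# Illustrative transition matrix over cost levels $0.00, $0.60, $1.20.
P = {0.60: {0.00: 0.2, 0.60: 0.6, 1.20: 0.2},
     0.00: {0.00: 0.7, 0.60: 0.3},
     1.20: {0.60: 0.3, 1.20: 0.7}}

def cost_paths(c0, T):
    """Enumerate every cost path of length T with its probability."""
    paths = [((c0,), 1.0)]
    for _ in range(T - 1):
        paths = [(path + (nxt,), pr * q)
                 for path, pr in paths
                 for nxt, q in P[path[-1]].items()]
    return paths

def exact_expectation(g, c0, T):
    """Zero-variance 'estimate' of E[g(cost path)] by enumeration,
    replacing the Monte Carlo sampling of cost scenarios."""
    return sum(pr * g(path) for path, pr in cost_paths(c0, T))

avg_total_cost = exact_expectation(sum, 0.60, T=2)   # 0.60 + E[c_1]
```

Because the costs are enumerated rather than sampled, the only remaining Monte Carlo noise in the bound comes from the demand side of the simulation.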

Table 1. Computation times (in seconds) for calculating bounds in the inventory example. For each look-ahead horizon (L = 0, 1, or 2 periods), the table reports, under both the perfect and the imperfect information relaxations, the time required to evaluate the heuristic policy and the additional time required to compute the dual bound. [Numerical entries omitted.]

The run times required to calculate these bounds are shown in Table 1. We show the time required to evaluate the zero-, one-, or two-period look-ahead heuristic policies using 1,000 trials for one set of model parameters and the additional time required to calculate the dual bounds with these same 1,000 trials. Here we see that once we have calculated the bounds associated with the heuristic policies (and the associated look-ahead value functions), it takes little additional time to compute the dual bounds. The myopic dual bounds are somewhat faster to compute than the one- and two-period look-ahead bounds because in the myopic case we know the objective function in Equation (18) is convex and can simplify the optimization problem. The imperfect information bounds take somewhat longer to compute than the perfect information bounds, because we must solve for dual optimal actions in each of the three possible cost states in each period rather than the one randomly chosen cost state that is considered in the perfect information case.

As discussed in Section 3.3, we can construct an alternative lower bound on expected costs by considering an information relaxation where the demands d_t and costs c_t are revealed over time according to the natural filtration but the demand distributions θ_t are observed in period t (rather than never observed, as assumed in the natural filtration). If we take the penalty to be zero, this problem can be formulated as a Markov DP that takes approximately 0.08 seconds to solve. These "observable demand distribution" bounds are shown as connected dotted lines in Figure 1. These lines are well below the fish representing the limited-look-ahead bounds.
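The zero-penalty, observable-distribution lower bound is just a modest Markov DP on the state (x_t, c_t, θ_t). A compact sketch, with tiny illustrative chains and grids rather than the 19-demand-level, three-cost-level setting of the experiments:

```python
def observable_regime_dp(T, grid, costs, Pc, Ps, dist, h, p, cT):
    """Backward induction on the observed state (inventory x, cost
    index i, demand-regime index j); dist[j][d] = P(demand = d | j)."""
    nc, ns = len(costs), len(dist)
    V = {(x, i, j): -cT * x for x in grid for i in range(nc) for j in range(ns)}
    for _ in range(T):
        Vn = {}
        for x in grid:
            for i in range(nc):
                for j in range(ns):
                    best = float("inf")
                    for y in (yy for yy in grid if yy >= x):   # order up to y
                        val = costs[i] * (y - x)               # ordering cost
                        for d, pd in enumerate(dist[j]):
                            if pd == 0.0:
                                continue
                            xn = max(min(y - d, grid[-1]), grid[0])  # clamp
                            stage = h * max(0, xn) + p * max(0, -xn)
                            cont = sum(Pc[i][i2] * Ps[j][j2] * V[(xn, i2, j2)]
                                       for i2 in range(nc) for j2 in range(ns))
                            val += pd * (stage + cont)
                        best = min(best, val)
                    Vn[(x, i, j)] = best
        V = Vn
    return V

# Degenerate check: one cost level, one regime, demand always 1 unit.
grid = list(range(-3, 4))
V0 = observable_regime_dp(T=1, grid=grid, costs=[1.0], Pc=[[1.0]],
                          Ps=[[1.0]], dist=[[0.0, 1.0]], h=1.0, p=5.0, cT=0.0)
```

One backward pass over this state space yields the observable-demand-distribution lower bound directly, with no simulation, which is consistent with the small solve time reported above.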
Thus, in these examples, observing the demand distribution is quite valuable and, with no penalty, the corresponding bounds are quite weak.

3.6. Improving the Heuristic Policies and Bounds

We now consider the use of the dual results to identify better policies and bounds when the gaps are relatively large. We will focus on the cases with the "downward, slow" and "downward, fast" transition matrices. In these cases, demand may initially be high (with mean 16), but it may drop to medium (with mean 9) or low (with mean 1) this period, and when demand drops, it will not increase again. Comparing the order-up-to quantities (the y_t's) selected by the myopic policy with those selected in the corresponding dual bound, we find that the dual problem takes advantage of the perfect information to reduce the order in the period when demand drops to the low demand state, thereby avoiding the cost of carrying excess inventory when the system enters the low state. It appears that the myopic policies order too much when the system is not in the low demand state and the dual penalties do not appropriately punish the DM in the dual problem for taking advantage of the perfect information about demand.

To understand why this is the case, note that the terminal value used in determining myopic policies and used as the generating function for the myopic dual bound, J^{−1}_t(x_t, c_t, π_t) = −c_t x_t, implicitly assumes that leftover inventory substitutes for future purchases. One way to perhaps improve the policies and bounds is to use terminal values based on a model that assumes the demand distribution θ_t is observable. Specifically, we take the limited-look-ahead terminal value J^{−1}_t(x_t, c_t, π_t) to be E_{π_t}[J^o_t(x_t, c_t, θ_t)], where J^o_t is the value function for a Markov DP that assumes the demand distribution is observed in each period; this model was used to calculate the observable demand distribution bounds described in §3.5.
As is evident in Figure 1, these observable demand value functions are not very good approximations of the true value functions (they greatly underestimate costs), but they are easy to compute and, unlike the original terminal values, they include the holding costs associated with having excess inventory in a low demand state. This modification leads to dramatic improvements for the cases with the "downward, slow" and "downward, fast" transition matrices, with little additional work. For example, in the case with the "downward, slow" transition matrix, high backorder costs, and a high prior distribution, the myopic bounds with the modified terminal values were $107.0 and $107.5, as compared to $92 and $111 for the myopic bounds with the original terminal values; the run times were 7.7 and 7.4 seconds, respectively. (These results are for the perfect information relaxation and a simulation of 1,000 trials.) The myopic bounds for the other cases with "downward, slow" and "downward, fast" transition matrices are also much improved. In these cases, these modified myopic policies not only outperform the original myopic policies, they also outperform the significantly more complex one- and two-period look-ahead policies based on the original terminal values. (See Appendix C for detailed results for all cases.)

Although this modification of the myopic policies greatly improves the results for the cases with the "downward, slow" and "downward, fast" transition matrices, the modified myopic policies perform worse than the original myopic policies in some other cases, where the original myopic policies performed quite well. In all, comparing across the 84 different sets of parameters, we find that we can get within 2% of the optimal costs (and typically closer) using one of these two myopic policies. Thus, by


Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles Diebold, Chaper 7 Francis X. Diebold, Elemens of Forecasing, 4h Ediion (Mason, Ohio: Cengage Learning, 006). Chaper 7. Characerizing Cycles Afer compleing his reading you should be able o: Define covariance

More information

5. Stochastic processes (1)

5. Stochastic processes (1) Lec05.pp S-38.45 - Inroducion o Teleraffic Theory Spring 2005 Conens Basic conceps Poisson process 2 Sochasic processes () Consider some quaniy in a eleraffic (or any) sysem I ypically evolves in ime randomly

More information

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017 Two Popular Bayesian Esimaors: Paricle and Kalman Filers McGill COMP 765 Sep 14 h, 2017 1 1 1, dx x Bel x u x P x z P Recall: Bayes Filers,,,,,,, 1 1 1 1 u z u x P u z u x z P Bayes z = observaion u =

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information

) were both constant and we brought them from under the integral.

) were both constant and we brought them from under the integral. YIELD-PER-RECRUIT (coninued The yield-per-recrui model applies o a cohor, bu we saw in he Age Disribuions lecure ha he properies of a cohor do no apply in general o a collecion of cohors, which is wha

More information

Two Coupled Oscillators / Normal Modes

Two Coupled Oscillators / Normal Modes Lecure 3 Phys 3750 Two Coupled Oscillaors / Normal Modes Overview and Moivaion: Today we ake a small, bu significan, sep owards wave moion. We will no ye observe waves, bu his sep is imporan in is own

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details!

Finish reading Chapter 2 of Spivak, rereading earlier sections as necessary. handout and fill in some missing details! MAT 257, Handou 6: Ocober 7-2, 20. I. Assignmen. Finish reading Chaper 2 of Spiva, rereading earlier secions as necessary. handou and fill in some missing deails! II. Higher derivaives. Also, read his

More information

= ( ) ) or a system of differential equations with continuous parametrization (T = R

= ( ) ) or a system of differential equations with continuous parametrization (T = R XIII. DIFFERENCE AND DIFFERENTIAL EQUATIONS Ofen funcions, or a sysem of funcion, are paramerized in erms of some variable, usually denoed as and inerpreed as ime. The variable is wrien as a funcion of

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes

23.2. Representing Periodic Functions by Fourier Series. Introduction. Prerequisites. Learning Outcomes Represening Periodic Funcions by Fourier Series 3. Inroducion In his Secion we show how a periodic funcion can be expressed as a series of sines and cosines. We begin by obaining some sandard inegrals

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

Excel-Based Solution Method For The Optimal Policy Of The Hadley And Whittin s Exact Model With Arma Demand

Excel-Based Solution Method For The Optimal Policy Of The Hadley And Whittin s Exact Model With Arma Demand Excel-Based Soluion Mehod For The Opimal Policy Of The Hadley And Whiin s Exac Model Wih Arma Demand Kal Nami School of Business and Economics Winson Salem Sae Universiy Winson Salem, NC 27110 Phone: (336)750-2338

More information

Notes for Lecture 17-18

Notes for Lecture 17-18 U.C. Berkeley CS278: Compuaional Complexiy Handou N7-8 Professor Luca Trevisan April 3-8, 2008 Noes for Lecure 7-8 In hese wo lecures we prove he firs half of he PCP Theorem, he Amplificaion Lemma, up

More information

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs PROC. IEEE CONFERENCE ON DECISION AND CONTROL, 06 A Primal-Dual Type Algorihm wih he O(/) Convergence Rae for Large Scale Consrained Convex Programs Hao Yu and Michael J. Neely Absrac This paper considers

More information

20. Applications of the Genetic-Drift Model

20. Applications of the Genetic-Drift Model 0. Applicaions of he Geneic-Drif Model 1) Deermining he probabiliy of forming any paricular combinaion of genoypes in he nex generaion: Example: If he parenal allele frequencies are p 0 = 0.35 and q 0

More information

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015

Explaining Total Factor Productivity. Ulrich Kohli University of Geneva December 2015 Explaining Toal Facor Produciviy Ulrich Kohli Universiy of Geneva December 2015 Needed: A Theory of Toal Facor Produciviy Edward C. Presco (1998) 2 1. Inroducion Toal Facor Produciviy (TFP) has become

More information

Comparing Means: t-tests for One Sample & Two Related Samples

Comparing Means: t-tests for One Sample & Two Related Samples Comparing Means: -Tess for One Sample & Two Relaed Samples Using he z-tes: Assumpions -Tess for One Sample & Two Relaed Samples The z-es (of a sample mean agains a populaion mean) is based on he assumpion

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should

In this chapter the model of free motion under gravity is extended to objects projected at an angle. When you have completed it, you should Cambridge Universiy Press 978--36-60033-7 Cambridge Inernaional AS and A Level Mahemaics: Mechanics Coursebook Excerp More Informaion Chaper The moion of projeciles In his chaper he model of free moion

More information

KINEMATICS IN ONE DIMENSION

KINEMATICS IN ONE DIMENSION KINEMATICS IN ONE DIMENSION PREVIEW Kinemaics is he sudy of how hings move how far (disance and displacemen), how fas (speed and velociy), and how fas ha how fas changes (acceleraion). We say ha an objec

More information

Stochastic Perishable Inventory Systems: Dual-Balancing and Look-Ahead Approaches

Stochastic Perishable Inventory Systems: Dual-Balancing and Look-Ahead Approaches Sochasic Perishable Invenory Sysems: Dual-Balancing and Look-Ahead Approaches by Yuhe Diao A hesis presened o he Universiy Of Waerloo in fulfilmen of he hesis requiremen for he degree of Maser of Applied

More information

The Arcsine Distribution

The Arcsine Distribution The Arcsine Disribuion Chris H. Rycrof Ocober 6, 006 A common heme of he class has been ha he saisics of single walker are ofen very differen from hose of an ensemble of walkers. On he firs homework, we

More information

Some Basic Information about M-S-D Systems

Some Basic Information about M-S-D Systems Some Basic Informaion abou M-S-D Sysems 1 Inroducion We wan o give some summary of he facs concerning unforced (homogeneous) and forced (non-homogeneous) models for linear oscillaors governed by second-order,

More information

3.1 More on model selection

3.1 More on model selection 3. More on Model selecion 3. Comparing models AIC, BIC, Adjused R squared. 3. Over Fiing problem. 3.3 Sample spliing. 3. More on model selecion crieria Ofen afer model fiing you are lef wih a handful of

More information

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models.

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models. Technical Repor Doc ID: TR--203 06-March-203 (Las revision: 23-Februar-206) On formulaing quadraic funcions in opimizaion models. Auhor: Erling D. Andersen Convex quadraic consrains quie frequenl appear

More information

13.3 Term structure models

13.3 Term structure models 13.3 Term srucure models 13.3.1 Expecaions hypohesis model - Simples "model" a) shor rae b) expecaions o ge oher prices Resul: y () = 1 h +1 δ = φ( δ)+ε +1 f () = E (y +1) (1) =δ + φ( δ) f (3) = E (y +)

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),

More information

The expectation value of the field operator.

The expectation value of the field operator. The expecaion value of he field operaor. Dan Solomon Universiy of Illinois Chicago, IL dsolom@uic.edu June, 04 Absrac. Much of he mahemaical developmen of quanum field heory has been in suppor of deermining

More information

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance.

An Introduction to Backward Stochastic Differential Equations (BSDEs) PIMS Summer School 2016 in Mathematical Finance. 1 An Inroducion o Backward Sochasic Differenial Equaions (BSDEs) PIMS Summer School 2016 in Mahemaical Finance June 25, 2016 Chrisoph Frei cfrei@ualbera.ca This inroducion is based on Touzi [14], Bouchard

More information

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course OMP: Arificial Inelligence Fundamenals Lecure 0 Very Brief Overview Lecurer: Email: Xiao-Jun Zeng x.zeng@mancheser.ac.uk Overview This course will focus mainly on probabilisic mehods in AI We shall presen

More information

A Dynamic Model of Economic Fluctuations

A Dynamic Model of Economic Fluctuations CHAPTER 15 A Dynamic Model of Economic Flucuaions Modified for ECON 2204 by Bob Murphy 2016 Worh Publishers, all righs reserved IN THIS CHAPTER, OU WILL LEARN: how o incorporae dynamics ino he AD-AS model

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information

Cash Flow Valuation Mode Lin Discrete Time

Cash Flow Valuation Mode Lin Discrete Time IOSR Journal of Mahemaics (IOSR-JM) e-issn: 2278-5728,p-ISSN: 2319-765X, 6, Issue 6 (May. - Jun. 2013), PP 35-41 Cash Flow Valuaion Mode Lin Discree Time Olayiwola. M. A. and Oni, N. O. Deparmen of Mahemaics

More information

Bias-Variance Error Bounds for Temporal Difference Updates

Bias-Variance Error Bounds for Temporal Difference Updates Bias-Variance Bounds for Temporal Difference Updaes Michael Kearns AT&T Labs mkearns@research.a.com Sainder Singh AT&T Labs baveja@research.a.com Absrac We give he firs rigorous upper bounds on he error

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

A Shooting Method for A Node Generation Algorithm

A Shooting Method for A Node Generation Algorithm A Shooing Mehod for A Node Generaion Algorihm Hiroaki Nishikawa W.M.Keck Foundaion Laboraory for Compuaional Fluid Dynamics Deparmen of Aerospace Engineering, Universiy of Michigan, Ann Arbor, Michigan

More information

Lecture 33: November 29

Lecture 33: November 29 36-705: Inermediae Saisics Fall 2017 Lecurer: Siva Balakrishnan Lecure 33: November 29 Today we will coninue discussing he boosrap, and hen ry o undersand why i works in a simple case. In he las lecure

More information

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach 1 Decenralized Sochasic Conrol wih Parial Hisory Sharing: A Common Informaion Approach Ashuosh Nayyar, Adiya Mahajan and Demoshenis Tenekezis arxiv:1209.1695v1 [cs.sy] 8 Sep 2012 Absrac A general model

More information

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Kriging Models Predicing Arazine Concenraions in Surface Waer Draining Agriculural Waersheds Paul L. Mosquin, Jeremy Aldworh, Wenlin Chen Supplemenal Maerial Number

More information

OBJECTIVES OF TIME SERIES ANALYSIS

OBJECTIVES OF TIME SERIES ANALYSIS OBJECTIVES OF TIME SERIES ANALYSIS Undersanding he dynamic or imedependen srucure of he observaions of a single series (univariae analysis) Forecasing of fuure observaions Asceraining he leading, lagging

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON4325 Moneary Policy Dae of exam: Tuesday, May 24, 206 Grades are given: June 4, 206 Time for exam: 2.30 p.m. 5.30 p.m. The problem se covers 5 pages

More information

Phys1112: DC and RC circuits

Phys1112: DC and RC circuits Name: Group Members: Dae: TA s Name: Phys1112: DC and RC circuis Objecives: 1. To undersand curren and volage characerisics of a DC RC discharging circui. 2. To undersand he effec of he RC ime consan.

More information

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic

More information

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC This documen was generaed a :37 PM, 1/11/018 Copyrigh 018 Richard T. Woodward 1. An inroducion o dynamic opimiaion -- Opimal Conrol and Dynamic Programming AGEC 64-018 I. Overview of opimiaion Opimiaion

More information

Solutions Problem Set 3 Macro II (14.452)

Solutions Problem Set 3 Macro II (14.452) Soluions Problem Se 3 Macro II (14.452) Francisco A. Gallego 04/27/2005 1 Q heory of invesmen in coninuous ime and no uncerainy Consider he in nie horizon model of a rm facing adjusmen coss o invesmen.

More information

Optimality Conditions for Unconstrained Problems

Optimality Conditions for Unconstrained Problems 62 CHAPTER 6 Opimaliy Condiions for Unconsrained Problems 1 Unconsrained Opimizaion 11 Exisence Consider he problem of minimizing he funcion f : R n R where f is coninuous on all of R n : P min f(x) x

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Let us start with a two dimensional case. We consider a vector ( x,

Let us start with a two dimensional case. We consider a vector ( x, Roaion marices We consider now roaion marices in wo and hree dimensions. We sar wih wo dimensions since wo dimensions are easier han hree o undersand, and one dimension is a lile oo simple. However, our

More information

Air Traffic Forecast Empirical Research Based on the MCMC Method

Air Traffic Forecast Empirical Research Based on the MCMC Method Compuer and Informaion Science; Vol. 5, No. 5; 0 ISSN 93-8989 E-ISSN 93-8997 Published by Canadian Cener of Science and Educaion Air Traffic Forecas Empirical Research Based on he MCMC Mehod Jian-bo Wang,

More information

A Hop Constrained Min-Sum Arborescence with Outage Costs

A Hop Constrained Min-Sum Arborescence with Outage Costs A Hop Consrained Min-Sum Arborescence wih Ouage Coss Rakesh Kawara Minnesoa Sae Universiy, Mankao, MN 56001 Email: Kawara@mnsu.edu Absrac The hop consrained min-sum arborescence wih ouage coss problem

More information

Problem 1 / 25 Problem 2 / 20 Problem 3 / 10 Problem 4 / 15 Problem 5 / 30 TOTAL / 100

Problem 1 / 25 Problem 2 / 20 Problem 3 / 10 Problem 4 / 15 Problem 5 / 30 TOTAL / 100 eparmen of Applied Economics Johns Hopkins Universiy Economics 602 Macroeconomic Theory and Policy Miderm Exam Suggesed Soluions Professor Sanjay hugh Fall 2008 NAME: The Exam has a oal of five (5) problems

More information

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 175 CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 10.1 INTRODUCTION Amongs he research work performed, he bes resuls of experimenal work are validaed wih Arificial Neural Nework. From he

More information

2. Nonlinear Conservation Law Equations

2. Nonlinear Conservation Law Equations . Nonlinear Conservaion Law Equaions One of he clear lessons learned over recen years in sudying nonlinear parial differenial equaions is ha i is generally no wise o ry o aack a general class of nonlinear

More information

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006 2.160 Sysem Idenificaion, Esimaion, and Learning Lecure Noes No. 8 March 6, 2006 4.9 Eended Kalman Filer In many pracical problems, he process dynamics are nonlinear. w Process Dynamics v y u Model (Linearized)

More information

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED 0.1 MAXIMUM LIKELIHOOD ESTIMATIO EXPLAIED Maximum likelihood esimaion is a bes-fi saisical mehod for he esimaion of he values of he parameers of a sysem, based on a se of observaions of a random variable

More information

Economic Growth & Development: Part 4 Vertical Innovation Models. By Kiminori Matsuyama. Updated on , 11:01:54 AM

Economic Growth & Development: Part 4 Vertical Innovation Models. By Kiminori Matsuyama. Updated on , 11:01:54 AM Economic Growh & Developmen: Par 4 Verical Innovaion Models By Kiminori Masuyama Updaed on 20-04-4 :0:54 AM Page of 7 Inroducion In he previous models R&D develops producs ha are new ie imperfec subsiues

More information

Appendix 14.1 The optimal control problem and its solution using

Appendix 14.1 The optimal control problem and its solution using 1 Appendix 14.1 he opimal conrol problem and is soluion using he maximum principle NOE: Many occurrences of f, x, u, and in his file (in equaions or as whole words in ex) are purposefully in bold in order

More information