arxiv: v1 [math.oc] 25 Jun 2008


Irrevocable Multi-Armed Bandit Policies

Vivek F. Farias    Ritesh Madan

February 13, 2018

Abstract

This paper considers the multi-armed bandit problem with multiple simultaneous arm pulls. We develop a new irrevocable heuristic for this problem. In particular, we do not allow recourse to arms that were pulled at some point in the past but then discarded. This irrevocable property is highly desirable from a practical perspective. As a consequence of this property, our heuristic entails a minimum amount of exploration. At the same time, we find that the price of irrevocability is limited for a broad useful class of bandits we characterize precisely. This class includes one of the most common applications of the bandit model, namely, bandits whose arms are coins of unknown biases. Computational experiments with a generative family of large scale problems within this class indicate losses of up to 5-10% relative to an upper bound on the performance of an optimal policy with no restrictions on exploration. We also provide a worst-case theoretical analysis that shows that for this class of bandit problems, the price of irrevocability is uniformly bounded: our heuristic earns expected rewards that are always within a factor of 1/8 of an optimal policy with no restrictions on exploration. In addition to being an indicator of robustness across all parameter regimes, this analysis sheds light on the structural properties that afford a low price of irrevocability.

Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, email: vivekf@mit.edu
Qualcomm-Flarion Technologies, email: rkmadan@stanfordalumni.org

1 Introduction

Consider the operations of a fast-fashion retailer such as Zara or H&M. Such retailers have developed and invested in merchandise procurement strategies that permit lead times for new fashions as short as two weeks. As a consequence of this flexibility, such retailers are able to adjust the assortment of products offered on sale at their stores to quickly adapt to popular fashion trends. In particular, such retailers use weekly sales data to refine their estimates of an item's popularity, and based on such revised estimates weed out unpopular items, or else re-stock demonstrably popular ones on a week-by-week basis. In sharp contrast, traditional retailers such as J.C. Penney or Marks and Spencer face lead times on the order of several months. As such these retailers need to predict popular fashions months in advance and are allowed virtually no changes to their product assortments over the course of a sales season, which is typically several months in length. Understandably, this approach is not nearly as successful at identifying high selling fashions and also results in substantial unsold inventories at the end of a sales season.

In view of the great deal of a-priori uncertainty in the popularity of a new fashion and the speed at which fashion trends evolve, the fast-fashion operations model is highly desirable and is emerging as the de-facto operations model for large fashion retailers. Among other things, the fast-fashion model relies crucially on an effective technology to learn from purchase data, and adjust product assortments based on such data. Such a technology must strike a balance between exploring potentially successful products and exploiting products that are demonstrably popular. A convenient mathematical model within which to design algorithms capable of accomplishing such a task is that of the multi-armed bandit. While we defer a precise mathematical discussion to a later section, a multi-armed bandit consists of multiple (say $n$) arms, each corresponding to a Markov Decision Process.
As a special case, one may think of each arm as an independent binomial coin with an uncertain bias specified via some prior distribution. At each point in time, one may pull up to a certain number of arms (say $k < n$) simultaneously, or equivalently, toss up to a certain number of coins. For each tossed coin, we earn a reward proportional to its realization and are able to refine our estimate of its bias based on this realization. We neither learn about, nor earn rewards from, coins that are not tossed. The multi-armed bandit problem requires finding a policy that adaptively selects $k$ arms to pull at every point in time with a view to maximizing total expected reward earned over some finite time horizon or, alternatively, discounted rewards earned over an infinite horizon or perhaps even long term average rewards.

With multiple simultaneous pulls allowed, the multi-armed bandit problem we have described is computationally hard. A popular and empirically successful heuristic for this problem was proposed several decades ago by Whittle. Whittle's heuristic produces an index for every arm based on the state of that arm and simply calls for pulling the $k$ arms with the highest index at every point in time. While it has been empirically and computationally observed that Whittle's heuristic provides excellent performance, the heuristic typically calls for frequent changes to the set of arms pulled that might, in hindsight, have been unnecessary. For instance, in the retail context, such a heuristic may choose to discard from the assortment a product presently being offered for sale in favor of a new product whose popularity is not known precisely. Later, the heuristic may well choose to reintroduce the discarded product. While such exploration may appear necessary if one is to discover profitable bandit arms (or popular products), enabling such a heuristic in practice will typically call for a great number of adjustments to the product assortment, a requirement that is both expensive and undesirable. This raises the following question: Is it possible to design a heuristic for the multi-armed bandit problem that comes close to being optimal with a minimal number of adjustments to the set of arms pulled over time?

This paper introduces a new irrevocable heuristic for the multi-armed bandit problem we call the packing heuristic. The packing heuristic establishes a static ranking of bandit arms based on a measure of their potential value relative to the time required to realize that value, and pulls arms in the order prescribed by this ranking. For an arm currently being pulled, the heuristic may either choose to continue pulling that arm in the next time step or else discard the arm in favor of the next highest ranked arm not currently being pulled. Once discarded, an arm will never be chosen again; hence the term irrevocable. Irrevocability is an attractive structural constraint to impose on arm selection policies in a number of practical applications of the bandit model, such as the dynamic assortment problem we have discussed or sequential drug trials, where recourse to drugs whose testing was discontinued in the past is socially unacceptable. It is clear that an irrevocable heuristic makes a minimal number of changes to the set of arms pulled. What is perhaps surprising is that the restriction to an irrevocable policy is typically far less expensive than one might expect. In particular, we demonstrate via a theoretical analysis and computational experiments that the use of the packing heuristic incurs a small performance loss relative to an optimal bandit policy with no restriction on exploration, i.e., an optimal strategy that is allowed recourse to arms that were pulled but discarded in the past. More specifically, the present work makes the following contributions:

• We introduce a new irrevocable heuristic, the packing heuristic, for the multi-armed bandit problem with multiple simultaneous arm-pulls. The packing heuristic is irrevocable in that if an arm being pulled is at some point discarded from the set of arms being pulled, it is never pulled again. At the same time, the performance loss incurred relative to an optimal, potentially non-irrevocable, control policy is limited. In particular, computational experiments with the packing heuristic for a generative family of large scale bandit problems indicate performance losses of up to about a few percent relative to an upper bound on the performance of an optimal policy with no restrictions on exploration. This level of performance suggests that the packing heuristic is likely to serve as a viable heuristic for the multi-armed bandit with multiple plays even when irrevocability is not a concern.

• In addition to our computational study, we are able to demonstrate a uniform bound on the price of irrevocability for a broad, interesting class of bandits. This class includes most commonly used applications of the bandit model, such as bandits whose arms are coins of unknown biases. We demonstrate that the packing heuristic earns expected rewards that are always within a factor of 1/8 of an optimal, potentially non-irrevocable policy. Such a uniform bound guarantees robust performance across all parameter regimes; in particular, the packing heuristic will track the performance of an optimal, potentially non-irrevocable policy across all parameter regimes. In addition, our analysis sheds light on the structural properties that afford the surprising efficacy of the irrevocable policies considered here.

• In the interest of practical applicability, we develop a fast combinatorial implementation of the packing heuristic. Assuming that an individual arm has $O(\Sigma)$ states, and given a time horizon of $T$ steps, optimal solution of the multi-armed bandit problem under consideration requires $O(\Sigma^n T^n)$ computations. The main computational step in the packing heuristic calls for the one time solution of a linear program with $O(n\Sigma T)$ variables, whose solution via a generic LP solver requires $O(n^3\Sigma^3T^3)$ computations. We develop a novel combinatorial algorithm that solves this linear program in $O(n\Sigma^2 T\log T)$ steps by solving a sequence of dynamic programs for each bandit arm. The technique we develop here is potentially of independent interest for the solution of weakly coupled optimal control problems with coupling constraints that must be met in expectation. Employing this solution technique, our heuristic requires a total of $O(n\Sigma^2\log T)$ computations per time step amortized over the time horizon. In comparison, the simplest theoretically sound heuristics in existence for this multi-armed bandit problem (such as Whittle's heuristic) require $O(n\Sigma^2 T)$ computations per time step. As such, we establish that the packing heuristic is computationally attractive.

1.1 Relevant Literature

The multi-armed bandit problem has a rich history, and a number of excellent references (such as Gittins (1989)) provide a thorough treatment of the subject. We review here literature especially relevant to the present work. In the case where $k = 1$, that is, allowing for a single arm to be pulled in a given time step, Gittins and Jones (1974) developed an elegant index based policy that was shown to be optimal for the problem of maximizing discounted rewards over an infinite horizon. Their index policy is known to be suboptimal if one is allowed to pull more than a single arm in a given time step. Whittle (1988) developed a simple index based heuristic for a more general bandit problem (the restless bandit problem) allowing for multiple arms to be pulled in a given time step. While his original paper was concerned with maximizing long-term average rewards, his heuristic is easily adapted to other objectives such as discounted infinite horizon rewards or expected rewards over a finite horizon (see for instance Caro and Gallien (2007), Bertsimas and Nino-Mora (2000)).
Weiss (1992) subsequently established that under suitable conditions, Whittle's heuristic was asymptotically optimal (in a regime where $n$ and $k$ go to infinity keeping $n/k$ constant). Whittle's heuristic may be viewed as a modification to the optimal control policy one obtains upon relaxing the requirement that at most $k$ arms be pulled in a given time step to requiring that at most $k$ arms be pulled in expectation in any given time step. The packing heuristic we introduce is motivated by a similar relaxation. In particular, we restrict attention to policies that entail a total of at most $kT$ arm pulls over the entire horizon in expectation while allowing for no more than $T$ pulls of any given arm. Where we differ substantially from Whittle's heuristic is the manner in which we construct a feasible policy (one where at most $k$ arms are pulled in a given time step) from the relaxed policy. In fact there are potentially many reasonable ways of transforming an optimal policy for the relaxed problem to a feasible policy for the multi-armed bandit; for instance, Bertsimas and Nino-Mora (2000) use a scheme distinct from both Whittle's and ours that employs optimal primal and dual solutions to a linear programming formulation of Whittle's relaxation to construct an index heuristic for arm selection. Nonetheless, none of these schemes is irrevocable, nor do they offer non-asymptotic performance guarantees, if any.

The packing heuristic policy builds upon recent insights on the adaptivity gap for stochastic packing problems. In particular, Dean et al. (2004) recently established that a simple static rule (Smith's rule) for packing a knapsack with items of fixed reward (known a-priori), but whose sizes were stochastic and unknown a-priori, was within a constant factor of the optimal adaptive packing policy. Guha and Munagala (2007) used this insight to establish a similar static rule for budgeted learning problems. In such a problem one is interested in finding a coin with highest bias from a set of coins of uncertain bias, assuming one is allowed to toss a single coin in a given time step and that one has a finite budget on the number of such experimental tosses allowed. Our work parallels that work in that we draw on the insights of the stochastic packing results of Dean et al. (2004). In addition, we must address two significant hurdles: first, correlations between the total reward earned from pulls of a given arm and the total number of pulls of that arm (these turn out not to matter in the budgeted learning setting, but are crucial to our setting), and second, the fact that multiple arms may be pulled simultaneously (only a single arm may be pulled at any time in the budgeted learning setting). Finally, a working paper (Bhattacharjee et al. (2007)), brought to our attention by the authors of that work, considers a variant of the budgeted learning problem of Guha and Munagala (2007) wherein one is allowed to toss multiple coins simultaneously. While it is conceivable that their heuristic may be modified to apply to the multi-armed bandit problem we address, the heuristic they develop is also not irrevocable.

Restricted to coins, our work takes an inherently Bayesian view of the multi-armed bandit problem. It is worth mentioning that there are a number of non-parametric formulations of such problems with a vast associated literature. Most relevant to the present model are the papers by Anantharam et al. (1987a,b) that develop simple regret-optimal strategies for multi-armed bandit problems with multiple simultaneous plays.

Our development of an irrevocable policy for the multi-armed bandit problem was originally motivated by applications of this framework to dynamic assortment problems of the type mentioned in the introduction.
In particular, Caro and Gallien (2007) computationally explore the use of a number of simple index-type heuristics (similar to Whittle's heuristic) for such problems, none of which are irrevocable; nonetheless, they stress the importance of a minimal number of changes to the assortment if any such heuristic is to be practical.

The remainder of this paper is organized as follows. Section 2 presents the multi-armed bandit model we consider and develops an (intractable) LP whose solution yields an optimal control policy for this bandit problem. Section 3 develops the packing heuristic by considering a suitable relaxation of the multi-armed bandit problem. Section 4 introduces a structural property for bandit arms we call the decreasing returns property. It is shown that a useful class of bandits, namely the coin bandits relevant to the applications that motivate us, possess this property. That section then establishes that the price of irrevocability for bandits possessing the decreasing returns property is uniformly bounded. Section 5 presents very encouraging computational experiments for large scale bandit problems drawn from a generative family of coin type bandits. In the interest of implementability, Section 6 develops a combinatorial algorithm for the fast computation of packing heuristic policies for multi-armed bandits. Section 7 concludes with a perspective on interesting directions for future work.

2 Model

We consider a multi-armed bandit problem with multiple simultaneous pulls permitted at every time step. A single bandit arm (indexed by $i$) is a Markov Decision Process (MDP) specified by a state space $S_i$, an action space $A_i$, a reward function $r_i : S_i \times A_i \to \mathbb{R}_+$, and a transition kernel $P_i : S_i \times A_i \to \Delta^{S_i}$ (where $\Delta^{S_i}$ is the $|S_i|$-dimensional unit simplex), yielding a probability distribution over next states should one choose some action $a_i \in A_i$ in state $s_i \in S_i$.

Every bandit arm is endowed with a distinguished idle action $\phi_i$. Should a bandit be idled in some time period, it yields no rewards in that period and transitions to the same state with probability 1 in the next period. More precisely,
\[ r_i(s_i, \phi_i) = 0, \quad \forall\, s_i \in S_i, \]
\[ P_i(s_i, \phi_i, s_i) = 1, \quad \forall\, s_i \in S_i. \]

We consider a bandit problem with $n$ arms. In each time step one must select a subset of up to $k\ (\le n)$ arms for which one may pick any action available at those respective arms. Should an action other than the idle action be selected at any of these $k$ arms, we refer to such a selection as a pull of that arm. That is, any action $a_i \in A_i \setminus \{\phi_i\}$ would be considered a pull of the $i$th arm. One is forced to pick the idle action for the remaining $n - k$ arms. We wish to find an action selection (or control) policy that maximizes expected rewards earned over $T$ time periods.

Our problem may be cast as an optimal control problem. In particular, we define as our state space the set $S = \prod_i S_i$ and as our action space the set $A = \prod_i A_i$. We let $\mathcal{T} = \{0, 1, \dots, T-1\}$. We understand by $s_i$ the $i$th component of $s \in S$ and similarly let $a_i$ denote the $i$th component of $a \in A$. A feasible action is one which calls for simultaneously pulling at most $k$ arms. In particular, we let
\[ A^{\mathrm{feas}} = \Bigl\{ a \in A : \sum_i \mathbf{1}_{a_i \neq \phi_i} \le k \Bigr\} \]
denote the set of all feasible actions. We define a reward function $r : S \times A \to \mathbb{R}_+$, given by $r(s,a) = \sum_i r_i(s_i, a_i)$, and a system transition kernel $P : S \times A \to \Delta^{S}$, given by $P(s,a,s') = \prod_i P_i(s_i, a_i, s_i')$.

We now formally develop what we mean by a control policy. The arm selection policy we will eventually develop will use auxiliary information aside from the current state of the system, and so we require a general definition.
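Before policies are defined, the arm model and the feasibility constraint introduced above can be illustrated with a small sketch. This is only an illustration under stated assumptions: the names `Arm`, `IDLE`, `is_feasible` and `joint_reward`, and the dictionary-based encoding of $r_i$ and $P_i$, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Sequence, Tuple

IDLE = "idle"  # hypothetical stand-in for the idle action phi_i


@dataclass
class Arm:
    """One bandit arm as a finite MDP (field names are illustrative)."""
    # reward[(state, action)] -> nonnegative reward of a non-idle action
    reward: Dict[Tuple[Hashable, Hashable], float]
    # kernel[(state, action)] -> {next_state: probability}
    kernel: Dict[Tuple[Hashable, Hashable], Dict[Hashable, float]]

    def r(self, s, a) -> float:
        # Idling earns nothing: r_i(s_i, phi_i) = 0.
        return 0.0 if a == IDLE else self.reward[(s, a)]

    def P(self, s, a) -> Dict[Hashable, float]:
        # Idling leaves the state unchanged: P_i(s_i, phi_i, s_i) = 1.
        return {s: 1.0} if a == IDLE else self.kernel[(s, a)]


def is_feasible(action: Sequence[Hashable], k: int) -> bool:
    """Membership in A^feas: at most k components are non-idle (pulls)."""
    return sum(a != IDLE for a in action) <= k


def joint_reward(arms: Sequence[Arm], state, action) -> float:
    """System reward r(s, a) = sum_i r_i(s_i, a_i)."""
    return sum(arm.r(s, a) for arm, s, a in zip(arms, state, action))
```

For instance, with two arms and $k = 1$, the joint action `("pull", IDLE)` is feasible while `("pull", "pull")` is not.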
Let $X_0$ be a random variable that encapsulates any endogenous randomization in selecting an action, and define the filtration generated by $X_0$ and the history of visited states and actions by
\[ \mathcal{F}_t = \sigma\bigl(X_0, (s_0), (s_1, a_0), \dots, (s_t, a_{t-1})\bigr), \]
where $s_t$ and $a_t$ denote the state and action at time $t$, respectively. We assume that
\[ \mathbb{P}(s_{t+1} = s' \mid s_t = s, a_t = a, H_t = h_t) = P(s,a,s') \]
for all $s, s' \in S$, $a \in A$, $t \in \mathcal{T}$ and any $\mathcal{F}_t$-measurable random variable $H_t$. A feasible policy simply specifies a sequence of $A^{\mathrm{feas}}$-valued actions $\{a_t\}$ adapted to $\mathcal{F}_t$. In particular, such a policy may be specified by a collection of $\sigma(X_0)$-measurable, $A^{\mathrm{feas}}$-valued random variables $\{\mu(s_0, \dots, s_t, a_0, \dots, a_{t-1}, t)\}$, one for each possible state-action history of the system. We let $\mathcal{M}$ denote the set of all such policies $\mu$, and denote by $J^{\mu}(s,0)$ the expected value of using policy $\mu$ starting in state $s$ at time 0; in particular,
\[ J^{\mu}(s,0) = E\left[ \sum_{t=0}^{T-1} r(s_t, a_t) \,\Big|\, s_0 = s \right], \]
where $a_t = \mu(s_0, \dots, s_t, a_0, \dots, a_{t-1}, t)$. Our goal is to compute an optimal admissible policy.

Markovian policies, i.e., policies under which $a_t$ is measurable with respect to $\sigma(X_0, s_t)$, are particularly useful. A Markovian policy is specified as a collection of independent $A^{\mathrm{feas}}$-valued random variables $\{\mu(s,t)\}$, each measurable with respect to $\sigma(X_0)$. In particular, assuming the system is in state $s$ at time $t$, such a policy selects an action $a_t$ as the random variable $\mu(s,t)$, independent of past states and actions. We let $\mathcal{M}_m$ denote the set of all such admissible Markovian policies. Every $\mu \in \mathcal{M}_m$ is associated with a value function $J^{\mu} : S \times \mathcal{T} \to \mathbb{R}_+$ which, for every $(s,t) \in S \times \mathcal{T}$, gives the expected value of using control policy $\mu$ starting at that state:
\[ J^{\mu}(s,t) = E\left[ \sum_{t'=t}^{T-1} r\bigl(s_{t'}, \mu(s_{t'}, t')\bigr) \,\Big|\, s_t = s \right]. \]
We denote by $J^*$ the optimal value function. In particular, $J^*(s,t) = \sup_{\mu \in \mathcal{M}_m} J^{\mu}(s,t)$. The preceding supremum is always achieved and we denote by $\mu^*$ a corresponding optimal Markovian control policy. That is, $\mu^* \in \arg\sup_{\mu \in \mathcal{M}_m} J^{\mu}(s,t)$ for all $(s,t) \in S \times \mathcal{T}$. Our restriction to Markovian policies is without loss; $\mathcal{M}_m$ always contains an optimal policy among the broader class of admissible policies, so that
\[ \sup_{\mu \in \mathcal{M}_m} J^{\mu}(s,0) = \sup_{\mu \in \mathcal{M}} J^{\mu}(s,0) \]
for all states $s$. We next formulate a mathematical program to compute such an optimal policy.

2.1 Computing an Optimal Policy

An optimal policy $\mu^*$ may be found via the solution of the following linear program, $LP(\tilde{\pi}_0)$, specified by a parameter $\tilde{\pi}_0 \in \Delta^{S}$ that specifies the distribution of arm states at time $t = 0$:
\[
\begin{array}{lll}
\max & \sum_{t} \sum_{s,a} \pi(s,a,t)\, r(s,a), & \\
\text{s.t.} & \sum_{a} \pi(s,a,t) = \sum_{s',a'} P(s',a',s)\, \pi(s',a',t-1), & \forall\, t > 0,\ s \in S, \\
 & \pi(s,a,t) = 0, & \forall\, s,\ t,\ a \notin A^{\mathrm{feas}}, \\
 & \sum_{a} \pi(s,a,0) = \tilde{\pi}_0(s), & \forall\, s \in S, \\
 & \pi \ge 0,
\end{array}
\]
where the variables are the state-action frequencies $\pi(s,a,t)$, which give the probability of being in state $s$ at time $t$ and choosing action $a$. The first set of constraints in the above program simply enforces the dynamics of the system, while the second set of constraints enforces the requirement that at most $k$ arms are simultaneously pulled at any point in time. An optimal solution to the program above may be used to construct a policy $\mu^*$ that attains expected value $J^*(s,0)$ starting at any state $s$ for which $\tilde{\pi}_0(s) > 0$.
In particular, given an optimal solution $\pi^{\mathrm{opt}}$ to $LP(\tilde{\pi}_0)$, one obtains such a policy by defining $\mu^*(s,t)$ as a random variable that takes value $a \in A$ with probability $\pi^{\mathrm{opt}}(s,a,t) / \sum_{a} \pi^{\mathrm{opt}}(s,a,t)$. By construction, we have $E[J^*(s,0) \mid s \sim \tilde{\pi}_0] = OPT(LP(\tilde{\pi}_0))$. Of course, efficient solution of the above program is not a tractable task, which forces us to seek approximations to an optimal policy. The next section will present one such policy with an appealing structural property we term irrevocability.

3 An Irrevocable Approximation to the Optimal Policy

This section develops an approximation to the optimal multi-armed bandit control policy that we will subsequently establish performs adequately relative to the optimal policy. This approximation will possess a desirable property we term irrevocability. In particular, the policy we develop will, at any time, be permitted to pull an arm only if that arm was pulled in the prior time step, or else never pulled in the past.

We first develop a control policy for a related bandit problem, where the requirement that at most $k$ arms be pulled in any time step is relaxed. As we will see, this is essentially Whittle's relaxation, and the value of the policy developed for this relaxation is an upper bound on the value of the optimal policy. We will then use the control policy developed for this relaxed control problem to design a policy for the multi-armed bandit problem that is irrevocable and also offers good performance relative to the optimal policy for a broad class of bandits.

Consider the following relaxation of the program $LP(\tilde{\pi}_0)$, $RLP(\tilde{\pi}_0)$. $RLP(\tilde{\pi}_0)$ may be viewed as a primal formulation of Whittle's relaxation:
\[
\begin{array}{lll}
\max & \sum_{i} \sum_{t} \sum_{s_i,a_i} \pi_i(s_i,a_i,t)\, r_i(s_i,a_i), & \\
\text{s.t.} & \sum_{a_i} \pi_i(s_i,a_i,t) = \sum_{s_i',a_i'} P_i(s_i',a_i',s_i)\, \pi_i(s_i',a_i',t-1), & \forall\, t > 0,\ s_i \in S_i,\ i, \\
 & \sum_{i} \Bigl[ T - \sum_{s_i} \sum_{t} \pi_i(s_i,\phi_i,t) \Bigr] \le kT, & \\
 & \sum_{a_i} \pi_i(s_i,a_i,0) = \sum_{\bar{s} :\, \bar{s}_i = s_i} \tilde{\pi}_0(\bar{s}), & \forall\, s_i \in S_i,\ i, \\
 & \pi \ge 0,
\end{array}
\]
where $\pi_i(s_i,a_i,t)$ is the probability of the $i$th bandit being in state $s_i$ at time $t$ and choosing action $a_i$. The program above relaxes the requirement that at most $k$ arms be pulled in a given time step; instead we now require that over the entire horizon at most $kT$ arms are pulled in expectation, where the expectation is over policy randomization and state evolution. The first set of equality constraints enforces individual arm dynamics, whereas the first inequality constraint enforces the requirement that at most $kT$ arms be pulled in expectation over the entire time horizon. The following lemma makes the notion of a relaxation to $LP(\tilde{\pi}_0)$ precise; the proof may be found in the appendix.

Lemma 1. $OPT(RLP(\tilde{\pi}_0)) \ge OPT(LP(\tilde{\pi}_0))$.

Given an optimal solution $\bar{\pi}$ to $RLP(\tilde{\pi}_0)$, one may consider the policy $\mu^R$ that, assuming we are in state $s$ at time $t$, selects a random action $\mu^R(s,t)$, where $\mu^R_i(s,t) = a_i$ with probability $\bar{\pi}_i(s_i,a_i,t) / \sum_{a_i} \bar{\pi}_i(s_i,a_i,t)$, independent of the past.
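The randomized rule just described (pick an action with probability proportional to its state-action frequency) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_action`, the dictionary layout keyed by `(state, action, time)`, and the fallback used when the state has zero frequency are all assumptions, since the source does not specify the latter case.

```python
import random


def sample_action(pi_i, s_i, t, actions, rng=random):
    """Draw a_i with probability pi_i[s_i, a_i, t] / sum_a pi_i[s_i, a, t].

    pi_i    : dict mapping (state, action, time) -> frequency (assumed layout)
    actions : list of actions of the arm; actions[0] is assumed to be idle
    """
    weights = [pi_i.get((s_i, a, t), 0.0) for a in actions]
    total = sum(weights)
    if total <= 0.0:
        # The state has zero frequency under the relaxed solution. The source
        # does not say what to do here; idling is an assumption of this sketch.
        return actions[0]
    # random.choices normalizes the weights internally.
    return rng.choices(actions, weights=weights, k=1)[0]
```

With frequencies `{("s", "pull", 0): 0.3, ("s", "idle", 0): 0.0}`, the call `sample_action(pi, "s", 0, ["idle", "pull"])` always returns `"pull"`, since all of the mass at that state and time sits on the pull action.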
Noting that the action for each arm $i$ is chosen independently of all other arms, we use $\mu^R_i(s_i,t)$ to denote the induced policy for arm $i$. By construction, $E[J^{\mu^R}(s,0) \mid s \sim \tilde{\pi}_0] = OPT(RLP(\tilde{\pi}_0))$. Moreover, we have that $\mu^R$ satisfies the constraint
\[ E\left[ \sum_{t=0}^{T-1} \sum_{i} \mathbf{1}_{\mu^R_i(s_{i,t},\,t) \neq \phi_i} \,\Big|\, s_0 = s \right] \le kT, \]
where $s_{i,t}$ denotes the state of arm $i$ at time $t$ and the expectation is over random state transitions and endogenous policy randomization. Of course, $\mu^R$ is not necessarily feasible; we ultimately require a policy that entails at most $k$ arm pulls in any time step. We will use $\mu^R$ to construct such a feasible policy. In addition, we will see that if an arm is pulled and then idled in some subsequent time step, it will never again be pulled, so that the policy we construct will be irrevocable.

In what follows we will assume for convenience that $\tilde{\pi}_0$ is degenerate and puts mass 1 on a single starting state. That is, $\tilde{\pi}_0(s_i) = 1$ for some $s_i \in S_i$, for all $i$. We first introduce some relevant notation. Given an optimal solution $\bar{\pi}$ to $RLP(\tilde{\pi}_0)$, define the value generated by arm $i$ as the random variable
\[ R_i = \sum_{t=0}^{T-1} r_i\bigl(s_{i,t}, \mu^R_i(s_{i,t}, t)\bigr), \]
and the active time of arm $i$, $T_i$, as the total number of pulls of arm $i$ entailed under that policy:
\[ T_i = \sum_{t=0}^{T-1} \mathbf{1}_{\mu^R_i(s_{i,t},\,t) \neq \phi_i}. \]
The expected value of arm $i$ is $E[R_i] = \sum_{s_i,a_i,t} \bar{\pi}_i(s_i,a_i,t)\, r_i(s_i,a_i)$, and the expected active time is $E[T_i] = \sum_{s_i,a_i,t :\, a_i \neq \phi_i} \bar{\pi}_i(s_i,a_i,t)$. We will assume in what follows that $E[T_i] > 0$ for all $i$; otherwise, we simply consider eliminating those $i$ for which $E[T_i] = 0$. We will also assume for analytical convenience that $\sum_i E[T_i] = kT$. Neither assumption results in a loss of generality.

To motivate our policy we begin with the following analogy with a packing problem: Imagine packing $n$ objects into a knapsack of size $B$. Each object $i$ has size $\tilde{T}_i$ and value $\tilde{R}_i$. Moreover, we assume that we are allowed to pack fractional quantities of an object into the knapsack and that packing a fraction $\alpha$ of the $i$th object requires space $\alpha \tilde{T}_i$ and generates value $\alpha \tilde{R}_i$. An optimal policy is then given by the following greedy procedure: select objects in decreasing order of the ratio $\tilde{R}_i / \tilde{T}_i$ and place them into the knapsack to the extent that there is room available. If one had more than a single knapsack and the additional constraint that an item could not be placed in more than a single knapsack, then the situation is more complicated. One may consider a greedy procedure that, as before, considers items in decreasing order of the ratio $\tilde{R}_i / \tilde{T}_i$ and places them (possibly fractionally) in sequence into the least loaded of the bins at that point. This generalization of the greedy procedure for the simple knapsack is suboptimal, but still a reasonable heuristic.

Thus motivated, we begin with a loose high level description of our control policy, which we call the packing heuristic. We think of each bandit arm as an item of value $E[R_i]$ with size $E[T_i]$. For the purposes of this explanation alone, we will assume for convenience that should policy $\mu^R$ call for an arm that was pulled in the past to be idled, it will never again call for that arm to be pulled; we will momentarily remove that assumption.
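The single-knapsack greedy procedure from the packing analogy above can be sketched in a few lines. This is an illustration of the analogy only, not of the packing heuristic itself; the function name `greedy_fractional_knapsack` and its argument names are hypothetical.

```python
def greedy_fractional_knapsack(sizes, values, capacity):
    """Fill a knapsack greedily by decreasing value-to-size ratio.

    sizes[i], values[i] play the roles of the object sizes and values in the
    analogy; sizes are assumed strictly positive. Returns the fraction packed
    of each object and the total value packed.
    """
    n = len(sizes)
    # Rank objects by value per unit of size, best first.
    order = sorted(range(n), key=lambda i: values[i] / sizes[i], reverse=True)
    fractions = [0.0] * n
    remaining = capacity
    total_value = 0.0
    for i in order:
        if remaining <= 0.0:
            break
        # Pack as much of object i as still fits (possibly a fraction).
        alpha = min(1.0, remaining / sizes[i])
        fractions[i] = alpha
        total_value += alpha * values[i]
        remaining -= alpha * sizes[i]
    return fractions, total_value
```

For example, with sizes `[2, 3, 5]`, values `[10, 9, 10]` and capacity `4`, the first object is packed whole, two thirds of the second object fills the remaining space, and the total value is 16 (up to floating-point rounding).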
Our control policy will operate as follows: we will order arms in decreasing order of the ratio $E[R_i]/E[T_i]$. We begin with the top $k$ arms according to this ordering. For each such arm we will select an action according to the policy specified for that arm by $\mu^R$; should this policy call for the arm to be idled, we discard that arm and will never again consider pulling it. We replace the discarded arm with the next available arm (in order of initial arm rankings) and select an action for the arm according to $\mu^R$. We repeat this procedure until we have selected non-idle actions for up to $k$ arms (or no arms are available). We then let time advance, earn rewards, and repeat the procedure described above until the end of the time horizon. Algorithm 1 describes the packing heuristic policy precisely, addressing the fact that $\mu^R$ may call for an arm to be idled but then pulled in some subsequent time step.

In the event that we placed no restriction on the time horizon (i.e., we set $T = \infty$ in the algorithm above), we have by construction that the expected total reward earned under the above policy is precisely $OPT(RLP(\tilde{\pi}_0))$. In essence, $RLP(\tilde{\pi}_0)$ prescribes a policy wherein each arm generates a total reward with mean $E[R_i]$ using an expected total number of pulls $E[T_i]$, independent of other arms. The above scheme may be visualized as one which packs as many of the pulls of various arms as possible in a manner so as to meet feasibility constraints.

Algorithm 1 The Packing Heuristic

  Renumber bandits so that $E[R_1]/E[T_1] \ge E[R_2]/E[T_2] \ge \dots \ge E[R_n]/E[T_n]$. Index bandits by variable $i$.
  $l_i \leftarrow 0$, $a_i \leftarrow \phi_i$ for all $i$, $s \sim \tilde{\pi}_0(\cdot)$ {The local time of every arm is set to 0 and its designated action to the idle action. An initial state is drawn according to the initial state distribution $\tilde{\pi}_0$.}
  $J \leftarrow 0$ {Total reward earned is initialized to 0.}
  $X \leftarrow \{1, 2, \dots, k\}$, $A \leftarrow \{k+1, \dots, n\}$, $D \leftarrow \emptyset$ {Initialize the sets of active ($X$), available ($A$), and discarded ($D$) arms.}
  for $t = 0$ to $T-1$ do
    while there exists an arm $i \in X$ with $a_i = \phi_i$ do {Select up to $k$ arms to pull.}
      Select an $i \in X$ with $a_i = \phi_i$ {In what follows, either select an action for arm $i$ or else discard it.}
      while $a_i = \phi_i$ and $l_i < T$ do {Attempt to select a pull action for arm $i$.}
        Select $a_i$ with probability proportional to $\bar{\pi}_i(s_i, \cdot, l_i)$ {Select an action according to the solution to $RLP(\tilde{\pi}_0)$.}
        $l_i \leftarrow l_i + 1$ {Increment arm $i$'s local time.}
      end while
      if $l_i = T$ and $a_i = \phi_i$ then {Discard arm $i$ and activate the next highest ranked arm available.}
        $X \leftarrow X \setminus \{i\}$, $D \leftarrow D \cup \{i\}$ {Discard arm $i$.}
        if $A \neq \emptyset$ then {There are available arms.}
          $j \leftarrow \min A$ {Select the highest ranked available arm.}
          $X \leftarrow X \cup \{j\}$, $A \leftarrow A \setminus \{j\}$ {Add arm $j$ to the active set.}
        end if
      end if
    end while
    for every $i \in X$ do {Pull selected arms.}
      $J \leftarrow J + r_i(s_i, a_i)$ {Earn the reward for action $a_i$ in the current state.}
      $s_i \sim P_i(s_i, a_i, \cdot)$ {Pull arm $i$; select the next arm state according to its transition kernel assuming the use of action $a_i$.}
      $a_i \leftarrow \phi_i$
    end for
  end for

It is clear that the heuristic we have constructed entails a minimal amount of arm exploration. In particular, we are guaranteed at most $n - k$ changes to the set of pulled arms. One may naturally ask what the limited exploration permitted under this policy costs us in terms of performance. In addition, is this scheme computationally practical? In particular, the linear programming relaxation we must solve is still a fairly large program. In subsequent sections we address these issues. First, we present a theoretical analysis that demonstrates that the price of irrevocability is uniformly bounded for an important general class of bandits. Our analysis sheds light on the structural properties that are likely to afford a low price of irrevocability in practice. We then present results of computational experiments with a generative family of large-scale problems demonstrating performance losses of up to 5-10% relative to an upper bound on the performance of the optimal policy (which is potentially non-irrevocable and has no restrictions on exploration). Finally, we address computational issues relevant to the packing heuristic and develop a computational scheme that is substantially quicker than heuristics such as Whittle's heuristic.

4 The Price of Irrevocability

This section establishes a uniform bound on the performance loss incurred in using the irrevocable packing heuristic relative to an optimal, potentially non-irrevocable scheme for a useful family of bandits whose arms exhibit a certain decreasing returns property. This class includes bandits whose arms are coins of unknown biases, a family particularly relevant to a number of applications including those discussed in the introduction. We establish that the packing heuristic always earns expected rewards that are within a factor of 1/8 of an optimal scheme. Our analysis sheds light on those structural properties that likely afford a low price to irrevocability.
In addition to being an indicator of robustness across all parameter regimes, this bound on the price of irrevocability is remarkable for two reasons. First, it does not rely on an asymptotic scaling of the system; the performance of the packing heuristic will track that of an optimal, potentially non-irrevocable heuristic across all regimes. Second, the bound represents a comparison with a system where one is allowed recourse to arms that were pulled in the past and discarded. In particular, the bound thus highlights the fact that for a useful class of bandits, one may achieve reasonable performance with very limited exploration. The typical performance we expect from the heuristic is likely to be far superior (as it generally is in the case of problems for which such worst case guarantees can be established); in a subsequent section we will present computational experiments indicating a performance loss of 5-10% relative to an optimal policy with no restrictions on exploration.

In what follows we first specify the decreasing returns property and explicitly identify a class of bandits that possess this property. We then present our performance analysis, which will proceed as follows: we first consider pulling bandit arms serially, i.e., at most one arm at a time, in order of their rank and show that the total reward earned from bandits that were first pulled within the first $kT/2$ pulls is within a factor of 1/8 of that earned by an optimal policy. This result relies on the static ranking of bandit arms used, and a symmetrization idea exploited by Dean et al. (2004) in their result on stochastic packing where rewards are statistically independent of item size. In contrast to that work, we must address the fact that the rewards earned from a bandit are statistically dependent on the number of pulls of that bandit, and to this end we exploit the decreasing returns property that establishes the nature of this correlation. We then show via a combinatorial sample path argument that the expected reward earned from bandits pulled within the first $T/2$ time steps of the packing heuristic, i.e., with arms being pulled in parallel, is at least as much as that earned in the setting above where arms are pulled serially, thereby establishing our performance guarantee.

4.1 The Decreasing Returns Property

Define for every $i$ and $l < T$ the random variable
\[ L_i(l) = \sum_{t=0}^{l} \mathbf{1}_{\mu^R_i(s_{i,t},\,t) \neq \phi_i}. \]
$L_i(l)$ tracks the number of times a given arm $i$ has been pulled under policy $\mu^R$ among the first $l+1$ steps of selecting an action for that arm. Further, define
\[ R^m_i = \sum_{l=0}^{T-1} \mathbf{1}_{L_i(l) \le m}\; r_i\bigl(s_{i,l}, \mu^R_i(s_{i,l}, l)\bigr). \]
$R^m_i$ is the random reward earned within the first $m$ pulls of arm $i$ under the policy $\mu^R_i$. The decreasing returns property roughly states that the expected incremental returns from allowing an additional pull of a bandit arm are, on average, decreasing. More precisely, we have:

Property 1. (Decreasing Returns) $E[R^{m+1}_i] - E[R^m_i] \le E[R^m_i] - E[R^{m-1}_i]$ for all $0 < m < T$.

One useful class of bandits from a modeling perspective that satisfies this property is bandits whose arms are coins of unknown bias. The following discussion makes this notion more precise.

An example of a bandit with decreasing returns: Coins

We define a coin to be any multi-armed bandit for which every arm $i$ has action space $A_i = \{p, \phi_i\}$, with $r_i(s_i, p) > 0$ for all $s_i \in S_i$, and satisfies the following property:
\[ r_i(s_i, p) \ge \sum_{s_i' \in S_i} P_i(s_i, p, s_i')\, r_i(s_i', p), \quad \forall\, i,\ s_i \in S_i. \]
The above super-martingale characterization of rewards intuitively suggests the decreasing returns property. In particular, it suggests that the returns from a pull in the current state are at least as large as the expected returns to a pull in a state reached subsequent to the current pull. The decreasing returns property for coins is established in the following lemma, whose proof may be found in the appendix:

Lemma 2. Coins satisfy the decreasing returns property.
That s, f A = {p,φ }, and r(s,p) s S P(s,p,s )r(s,p),,s S, 11

13 then for all 0 < m < T. E[R m+1 ] E[R m ] E[Rm ] E[Rm 1 ] Returnng to our motvatng example of dynamc product assortment selecton, we note that n estmatng the bas of a bnomal con of unknown bas gven some ntal pror on con bas, Bayes rule mples that the estmated bas after n observatons (whch generate the fltraton F n ), µ n+1 satsfes E[µ n+1 F n ] = µ n. Thus, bandts wth such arms wheren the reward from an arm s some non-negatve scalar tmes the bas, automatcally possess the decreasng returns property. 4.2 A Unform Bound on the Prce of Irrevocablty for Bandts wth Decreasng Returns For convenence of exposton we assume that T s even; addressng the odd case requres essentally dentcal proofs but cumbersome notaton. We re-order the bandts n decreasng order of E[R ]/E[T ] as n the packng heurstc. Let us defne { } j H = mn j : E[T ] kt/2. Thus, H s the set of bandts that take up approxmately half the budget on total expected pulls. Next, let us defne for all, random varables R and T accordng to R = R, T = T for all < H. We defne R H = αr H and T H = αt H, where α = kt/2 P H 1 E[T ] E[T. H ] We begn wth a prelmnary lemma: Lemma 3. Proof. Defne a functon f(t) = E[ R ] 1 2 OPT(RLP( π 0)). H n E[R ] E[T ] 1 t E[T ] j=1 + E[T ], where (a b) = mn(a,b). By constructon (.e. snce E[R ] E[T ] s non-ncreasng n ), we have that f s a concave functon on [0,kT]. Now observe that E[ R ] = H Next, observe that H 1 E[R ] E[T ] E[T ]+ E[R H ] H 1 kt/2 E[T H ] OPT(RLP( π 0 )) = n j=1 E[R ] E[T ] E[T ] = f(kt). E[T ] = f(kt/2). By the concavty of f and snce f(0) = 0, we have that f(kt/2) 1 2f(kT), whch yelds the result. 12
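The concavity argument in the proof of Lemma 3 can be checked numerically on a small example. The following Python sketch is only an illustration under stated assumptions: the four expected rewards, the expected pull counts and the budget $kT = 15/2$ are hypothetical values chosen so that the arms are already sorted by $E[R_i]/E[T_i]$ and $\sum_i E[T_i] = kT$, and the function names `greedy_value` and `truncated_value` are ours, not part of the packing heuristic.

```python
from fractions import Fraction as F

def greedy_value(t, exp_reward, exp_pulls):
    """f(t): value collected when a pull budget t is filled in ratio order."""
    total, used = F(0), F(0)
    for r, tau in zip(exp_reward, exp_pulls):
        take = min(tau, max(t - used, F(0)))  # (t - sum_{j<i} E[T_j])^+ ^ E[T_i]
        total += r / tau * take
        used += tau
    return total

def truncated_value(half_budget, exp_reward, exp_pulls):
    """Sum of E[R~_i] over i <= H, with arm H scaled by alpha."""
    total, used = F(0), F(0)
    for r, tau in zip(exp_reward, exp_pulls):
        if used + tau >= half_budget:          # this arm is H
            alpha = (half_budget - used) / tau
            return total + alpha * r
        total += r
        used += tau
    raise ValueError("expected pulls never reach half the budget")

# Hypothetical instance (illustration only), sorted by decreasing E[R_i]/E[T_i].
exp_reward = [F(6), F(5), F(2), F(1, 2)]
exp_pulls = [F(2), F(5, 2), F(2), F(1)]
kT = F(15, 2)

half = greedy_value(kT / 2, exp_reward, exp_pulls)
full = greedy_value(kT, exp_reward, exp_pulls)
print(half, full, truncated_value(kT / 2, exp_reward, exp_pulls))
```

On this instance $H = 2$ and $\alpha = 7/10$, so the truncated sum coincides with $f(kT/2)$, and $f(kT/2)$ is more than half of $f(kT)$, as the lemma asserts.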

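We next compare the expected reward earned by a certain subset of bandits with indices no larger than $H$. The significance of the subset of bandits we define will be seen later in the proof of Lemma 6; we will see there that all bandits in this subset will begin operation prior to time $T/2$ in a run of the packing heuristic. In particular, define
\[ R_{1/2} = \sum_{i=1}^{H} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R_i. \]

Lemma 4.
\[ E[R_{1/2}] \ge \frac{1}{4}\, \mathrm{OPT}(RLP(\tilde{\pi}_0)). \]

Proof. We have:
\begin{align*}
E[R_{1/2}] &\overset{(a)}{=} \sum_{i=1}^{H} \Pr\Big( \sum_{j=1}^{i-1} T_j < kT/2 \Big) E[R_i] \\
&\overset{(b)}{\ge} \sum_{i=1}^{H} \Pr\Big( \sum_{j=1}^{i-1} T_j < kT/2 \Big) E[\tilde{R}_i] \\
&\overset{(c)}{=} \sum_{i=1}^{H} \Pr\Big( \sum_{j=1}^{i-1} \tilde{T}_j < kT/2 \Big) E[\tilde{R}_i] \\
&\overset{(d)}{\ge} \sum_{i=1}^{H} \Big( 1 - \sum_{j=1}^{i-1} \frac{E[\tilde{T}_j]}{kT/2} \Big) E[\tilde{R}_i] \\
&= \sum_{i=1}^{H} E[\tilde{R}_i] - \sum_{i=1}^{H} \sum_{j=1}^{i-1} \frac{E[\tilde{T}_j]\, E[\tilde{R}_i]}{kT/2} \\
&\overset{(e)}{\ge} \sum_{i=1}^{H} E[\tilde{R}_i] - \frac{1}{2} \sum_{i=1}^{H} \sum_{j=1, j \ne i}^{H} \frac{E[\tilde{T}_j]\, E[\tilde{R}_i]}{kT/2} \\
&\overset{(f)}{\ge} \frac{1}{2} \sum_{i=1}^{H} E[\tilde{R}_i] \\
&\overset{(g)}{\ge} \frac{1}{4}\, \mathrm{OPT}(RLP(\tilde{\pi}_0)).
\end{align*}
Equality (a) follows from the fact that under policy $\mu^R$, $R_i$ is independent of $T_j$ for $j < i$. Inequality (b) follows from our definition of $\tilde{R}_i$: $\tilde{R}_i \le R_i$. Equality (c) follows from the fact that by definition $\tilde{T}_i = T_i$ for all $i < H$. Inequality (d) invokes Markov's inequality. Inequality (e) is the critical step in establishing the result and uses the simple symmetrization idea exploited by Dean et al. (2004): in particular, we observe that since $\frac{E[R_i]}{E[T_i]} \le \frac{E[R_j]}{E[T_j]}$ for $i > j$, it follows that $E[R_i]E[T_j] \le \frac{1}{2}\big(E[R_i]E[T_j] + E[R_j]E[T_i]\big)$ for $i > j$ (the same holds for the scaled quantities $\tilde{R}_i$, $\tilde{T}_i$, whose ratios are unchanged). Replacing every term of the form $E[R_i]E[T_j]$ (with $i > j$) in the expression preceding inequality (e) with the upper bound $\frac{1}{2}\big(E[R_i]E[T_j] + E[R_j]E[T_i]\big)$ yields inequality (e). Inequality (f) follows from the fact that $\sum_{i=1}^{H} E[\tilde{T}_i] = kT/2$. Inequality (g) follows from Lemma 3.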
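To make steps (d) through (f) of the proof of Lemma 4 concrete, the following Python sketch evaluates the three successive lower bounds on a hypothetical two-arm instance with $\sum_i E[\tilde{T}_i] = kT/2 = 15/4$. The numbers and the function name `lemma4_chain` are assumptions made for illustration only; the sketch checks the deterministic inequalities on expectations, not the Markov step itself.

```python
from fractions import Fraction as F

def lemma4_chain(exp_reward, exp_pulls, half_budget):
    """Return the bounds appearing after steps (d), (e) and (f).

    exp_reward[i], exp_pulls[i] play the role of E[R~_i], E[T~_i], sorted by
    decreasing ratio, with sum(exp_pulls) == half_budget (= kT/2).
    """
    n = len(exp_reward)
    total = sum(exp_reward)
    # After (d): sum_i (1 - sum_{j<i} E[T~_j] / (kT/2)) E[R~_i]
    after_d = sum((1 - sum(exp_pulls[:i]) / half_budget) * exp_reward[i]
                  for i in range(n))
    # After (e): the j < i cross terms are replaced by half of all j != i terms
    cross = sum(exp_pulls[j] * exp_reward[i]
                for i in range(n) for j in range(n) if j != i)
    after_e = total - F(1, 2) * cross / half_budget
    # After (f): use sum_j E[T~_j] = kT/2
    after_f = F(1, 2) * total
    return after_d, after_e, after_f

# Hypothetical values (illustration only): ratios 3 and 2, pulls summing to 15/4.
r_tilde = [F(6), F(7, 2)]
t_tilde = [F(2), F(7, 4)]
after_d, after_e, after_f = lemma4_chain(r_tilde, t_tilde, F(15, 4))
print(after_d, after_e, after_f)
```

The three values are non-increasing, which is exactly what inequalities (e) and (f) require on this instance.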

Before moving on to our main Lemma that translates the above guarantees to a guarantee on the performance of the packing heuristic, we need to establish one additional technical fact. Recall that $R^m_i$ is the reward earned by bandit $i$ in the first $m$ pulls of this bandit. Exploiting the assumed decreasing returns property, we have the following Lemma whose proof may be found in the appendix:

Lemma 5. For bandits satisfying the decreasing returns property (Property 1),
\[ E\Big[ \sum_{i=1}^{H} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R^{T/2}_i \Big] \ge \frac{1}{2} E[R_{1/2}]. \]

We have thus far established estimates for total expected rewards earned assuming implicitly that bandits are pulled in a serial fashion in order of their rank. The following Lemma connects these estimates to the expected reward earned under the $\mu^{\mathrm{packing}}$ policy (given by the packing heuristic) using a simple sample path argument. In particular, the following Lemma shows that the expected rewards under the $\mu^{\mathrm{packing}}$ policy are at least as large as $E\big[ \sum_{i=1}^{H} \mathbf{1}\{ \sum_{j=1}^{i-1} T_j < kT/2 \} R^{T/2}_i \big]$.

Lemma 6.
\[ E[J^{\mu^{\mathrm{packing}}}(s, 0) \mid s \sim \pi_0] \ge E\Big[ \sum_{i=1}^{H} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R^{T/2}_i \Big]. \]

Proof. For a given sample path of the system define
\[ h = H \wedge \min\Big\{ i : \sum_{j=1}^{i} T_j \ge kT/2 \Big\}. \]
On this sample path, it must be that:
\[ \sum_{i=1}^{H} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R^{T/2}_i = \sum_{i=1}^{h} R^{T/2}_i. \tag{4.1} \]
We claim that arms $1, 2, \dots, h$ are all first pulled at times $t < T/2$ under $\mu^{\mathrm{packing}}$. Assume to the contrary that this were not the case and recall that arms are considered in order of index under $\mu^{\mathrm{packing}}$, so that an arm with index $i$ is pulled for the first time no later than the first time arm $l$ is pulled for $l > i$. Let $h'$ be the highest arm index among the arms pulled at time $t = T/2 - 1$, so that $h' < h$. It must be that $\sum_{j=1}^{h'} T_j \ge kT/2$. But then,
\[ H \wedge \min\Big\{ i : \sum_{j=1}^{i} T_j \ge kT/2 \Big\} \le h', \]
which is a contradiction. Thus, since every one of the arms $1, 2, \dots, h$ is first pulled at times $t < T/2$, each such arm may be pulled for at least $T/2$ time steps prior to time $T$ (the horizon). Consequently, we have that the total rewards earned on this sample path under policy $\mu^{\mathrm{packing}}$ are at least
\[ \sum_{i=1}^{h} R^{T/2}_i. \]
Using identity (4.1) and taking an expectation over sample paths yields the result.

We are ready to establish our main Theorem that provides a uniform bound on the performance loss incurred in using the packing heuristic policy relative to an optimal policy with no restrictions on exploration. In particular, we have that the price of irrevocability is uniformly bounded for bandits satisfying the decreasing returns property.

Theorem 1. For multi-armed bandits satisfying the decreasing returns property (Property 1),
\[ E[J^{\mu^{\mathrm{packing}}}(s, 0) \mid s \sim \pi_0] \ge \frac{1}{8} E[J^{*}(s, 0) \mid s \sim \pi_0] \]
for all initial state distributions $\pi_0$.

Proof. We have from Lemmas 1, 4, 5 and 6 that
\[ E[J^{\mu^{\mathrm{packing}}}(s, 0) \mid s \sim \pi_0] \ge \frac{1}{8}\, \mathrm{OPT}(RLP(\tilde{\pi}_0)). \]
We know from Lemma 1 that $\mathrm{OPT}(RLP(\tilde{\pi}_0)) \ge \mathrm{OPT}(LP(\tilde{\pi}_0)) = E[J^{*}(s, 0) \mid s \sim \pi_0]$, from which the result follows.

Our analysis highlighted a structural property, decreasing returns, that is likely to afford a low price of irrevocability. The next section demonstrates computational results that suggest that in practice we may expect this price to be quite small (on the order of 5-10%) for bandits possessing this property.

5 Computational Experiments

This section presents computational experiments with the packing heuristic. We consider a number of large scale bandit problems drawn from a generative family of problems to be discussed shortly and demonstrate that the packing heuristic consistently demonstrates performance within about 5-10% of an upper bound on the performance of an unrestricted (i.e. potentially non-irrevocable) optimal solution to the multi-armed bandit problem. In particular, this suggests that the price of irrevocability is likely to be small in practice, at least for models of the type we consider here. Since the bandits considered in our experiments, Binomial coins of uncertain bias, are among the most widely used applications of the multi-armed bandit model, we view this to be a positive result.

The Generative Model: We consider multi-armed bandit problems with $n$ arms, up to $k$ of which may be pulled simultaneously at any time.
The $i$th arm corresponds to a Binomial($m$, $P_i$) coin where $m$ is fixed and known, and $P_i$ is unknown but drawn from a Dirichlet($\alpha_i$, $\beta_i$) prior distribution. Assuming we choose to pull arm $i$ at some point, we realize a random outcome $M_i \in \{0, 1, \dots, m\}$. $M_i$ is a Binomial($m$, $P_i$) random variable where $P_i$ is itself a Dirichlet($\alpha_i$, $\beta_i$) random variable. We receive a reward of $r_i M_i$ and update the prior distribution parameters according to $\alpha_i \leftarrow \alpha_i + M_i$, $\beta_i \leftarrow \beta_i + m - M_i$. By selecting the initial values of $\alpha_i$ and $\beta_i$ for each arm appropriately we can control for the initial uncertainty in the value of $P_i$. This model is, for instance, applicable to the dynamic assortment selection problem discussed earlier (see Caro and Gallien (2007)) with each coin representing a product of uncertain popularity and $M_i$ representing the uncertain number of product sales over a single period in which that product is offered for sale. We recall from our previous discussion that this family of bandits satisfies the decreasing returns property, and from our performance analysis we expect a reasonable price of irrevocability.
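The update rule of the generative model, and the martingale property of the estimated bias that underlies decreasing returns, can be sketched in a few lines of Python. This is a minimal sketch under assumptions: a two-parameter Dirichlet prior is treated as a Beta($\alpha$, $\beta$) distribution, the prior parameters $\alpha = 1/5$, $\beta = 4/5$ are hypothetical, and the helper names (`rising`, `outcome_pmf`, `update`, `mean`) are ours. Exact rational arithmetic is used so that the identity can be checked without rounding.

```python
from fractions import Fraction
from math import comb

def rising(x, k):
    """Rising factorial x (x+1) ... (x+k-1)."""
    out = Fraction(1)
    for j in range(k):
        out *= x + j
    return out

def outcome_pmf(alpha, beta, m):
    """Marginal pmf of M ~ Binomial(m, P) when P ~ Beta(alpha, beta)."""
    denom = rising(alpha + beta, m)
    return [comb(m, k) * rising(alpha, k) * rising(beta, m - k) / denom
            for k in range(m + 1)]

def update(alpha, beta, m, outcome):
    """Posterior parameters after observing `outcome` successes in m trials."""
    return alpha + outcome, beta + m - outcome

def mean(alpha, beta):
    """Estimated bias under a Beta(alpha, beta) belief."""
    return alpha / (alpha + beta)

# Hypothetical prior (illustration only); m = 2 as in the experiments.
alpha, beta, m = Fraction(1, 5), Fraction(4, 5), 2
pmf = outcome_pmf(alpha, beta, m)
expected_next_mean = sum(p * mean(*update(alpha, beta, m, k))
                         for k, p in enumerate(pmf))
print(pmf, expected_next_mean)
```

The expected posterior mean equals the prior mean, so a reward proportional to the estimated bias does not increase in expectation from one pull to the next.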

Summary of Computational Experiments

Coeff. of Variation (cv)    Arms (n)    Simultaneous Pulls (k)    Horizon (T)    Performance ($J^{\mu^{\mathrm{packing}}}/J^{*}$)
Moderate (1)
High (2.5)

Table 1: Computational Summary. Each row represents summary statistics for 100 distinct random bandit problems with the specified $n$, $k$, $T$ and cv parameters. Performance for each instance was computed from 3000 simulations of that instance. Performance figures thus represent an average over the generative family with the specified $n$, $k$, $T$ and cv parameters as well as over system randomness.

We consider the following random instances of the above problem. We consider bandits with $(n, k) \in \{(500, 50), (500, 100), (100, 10), (100, 20)\}$. These dimensions are representative of large scale applications of which the dynamic assortment problem is an example. For each value of $(n, k)$ we consider time horizons $T = 25$ and $T = 40$. For every bandit problem we consider, we subdivide the arms of the bandit into 10 groups. All arms within a group have identical statistical structure, that is, identical $r_i$ values and identical initial values of $\alpha_i$ and $\beta_i$. For each value of $(n, k, T)$, we generate a number of problem instances by randomly drawing prior parameters for bandit arms. In particular, for all arms in a given group we select $\alpha$ uniformly in the interval $[0.05, 0.35]$ and then select that value of $\beta$ which results in a prior co-efficient of variation $cv \in \{1, 2.5\}$. These co-efficients of variation represent, respectively, a moderate and high degree of a-priori uncertainty in coin bias (or in the context of the dynamic assortment application, product popularity). In addition, $r_i$ is drawn uniformly on $[0, 2]$ and we take $m = 2$. We generate 100 random problem instances for each co-efficient of variation. Control policies for a given bandit problem instance are evaluated over 3000 random state trajectories (which resulted in 98% confidence intervals that were within +/-1% of the sample average).
Evaluating Performance: A striking feature of our performance results is that the price of irrevocability is quite small, a trend that appears to hold over varying parameter regimes. In particular, we make the following observations:

Consider problems with a small number of arms (100) with a large number of simultaneous pulls (20) allowed. Intuitively, an optimal policy could reasonably explore all arms in this setting before settling on the best arms. We thus expect the price of irrevocability to be high here. Even in this regime we find that the price of irrevocability is only about % of optimal performance.

Consider problems with a high degree of a-priori uncertainty in coin bias. Mistakes (that is, discarding an arm that is performing reasonably in favor of an unexplored arm that turns out to perform poorly) are particularly expensive in such problems. With a high co-efficient of variation in the prior on initial arm bias, the price of irrevocability is indeed somewhat higher but continues to remain within % of optimal performance.

For each of our experiments, we observe that keeping all other parameters fixed, relative performance improves with a longer time horizon. This is intuitive; with longer horizons, one may delay discarding an arm until one is sure that the arm performs poorly relative to the expected value of the available alternatives.

Finally, we note that the performance figures we report are relative to an upper bound on optimal policy performance. Computing the optimal policy is itself an intractable task. The performance observed here suggests that at least for bandit problems with decreasing returns the packing heuristic is a viable approximation scheme even when irrevocability is not necessarily a concern.

We can thus conclude that the price of irrevocability is small for a useful class of multi-armed bandit problems and that the packing heuristic performs well for this class of problems. A final concern is computational effort. In particular, for the largest problem instance we considered ($n = 500$), the linear program we need to solve has 3.2 million variables and about the same number of constraints. Even a commercial linear programming solver (such as CPLEX) equipped with the ability to exploit structure in this program will require several hours on a powerful computer to solve this program.
This is in stark contrast with an index based heuristic (such as Whittle's heuristic) that solves a simple dynamic program for each arm at every time step. In the next section we develop an efficient computational algorithm for the solution of $RLP(\tilde{\pi}_0)$ that requires substantially less effort than even Whittle's heuristic and takes a few minutes to solve the aforementioned program on a laptop computer.

6 Fast Computation

This section considers the computational effort required to implement the packing heuristic. We develop a computational scheme that makes the packing heuristic substantially easier to implement than popular index heuristics such as Whittle's heuristic and thus establish that the heuristic is viable from a computational perspective.

The key computational step in implementing the packing heuristic is the solution of the linear program $RLP(\tilde{\pi}_0)$. Assuming that $|S_i| = O(S)$ and $|A_i| = O(A)$ for all $i$, this linear program has $O(nTAS)$ variables and each Newton iteration of a general purpose interior point method will require $O\big((nTAS)^3\big)$ steps. An interior point method that exploits the fact that bandit arms are coupled via a single constraint will require $O\big(n(TAS)^3\big)$ computational steps at each iteration. We develop a combinatorial scheme to solve this linear program that is in spirit similar to the classical Dantzig-Wolfe dual decomposition algorithm. In contrast with Dantzig-Wolfe decomposition, our scheme is efficient. In particular, the scheme requires $O(nTAS^2 \log(kT))$ computational steps to solve $RLP(\tilde{\pi}_0)$, making it a significantly faster solution alternative to the schemes alluded to above. Equipped with this fast scheme, it is notable that using the packing heuristic requires $O(nAS^2 \log(kT))$ computations per time step amortized over the time horizon, which will typically be substantially less than the $\Theta(nAS^2 T)$ computations required per time step for index policy heuristics such as Whittle's heuristic.

Our scheme employs a dual decomposition of $RLP(\tilde{\pi}_0)$. The key technical difficulty we must overcome in developing our computational scheme for the solution of $RLP(\tilde{\pi}_0)$ is the non-differentiability of the dual function corresponding to $RLP(\tilde{\pi}_0)$ at an optimal dual solution, which prevents us from recovering an optimal or near optimal policy by direct minimization of the dual function.

6.1 An Overview of the Scheme

For each bandit arm $i$, define the polytope $D_i(\tilde{\pi}_0) \subset \mathbb{R}^{|S_i||A_i|T}$ of permissible state-action frequencies for that bandit arm, specified via the constraints of $RLP(\tilde{\pi}_0)$ relevant to that arm. A point within this polytope, $\pi_i$, corresponds to a set of valid state-action frequencies for the $i$th bandit arm. With some abuse of notation, we denote the expected reward from this arm under $\pi_i$ by the value function:
\[ R_i(\pi_i) = \sum_{t=0}^{T-1} \sum_{s_i, a_i} \pi_i(s_i, a_i, t)\, r_i(s_i, a_i). \]
In addition denote the expected number of pulls of bandit arm $i$ under $\pi_i$ by
\[ T_i(\pi_i) = T - \sum_{s_i} \sum_{t=0}^{T-1} \pi_i(s_i, \phi_i, t). \]
We understand that both $R_i(\cdot)$ and $T_i(\cdot)$ are defined over the domain $D_i(\tilde{\pi}_0)$. We may thus rewrite $RLP(\tilde{\pi}_0)$ in the following form:
\[ \begin{array}{ll} \max. & \sum_i R_i(\pi_i), \\ \text{s.t.} & \sum_i T_i(\pi_i) \le kT. \end{array} \tag{6.1} \]
The Lagrangian dual of this program is $DRLP(\tilde{\pi}_0)$:
\[ \begin{array}{ll} \min. & \lambda kT + \sum_i \max_{\pi_i} \big( R_i(\pi_i) - \lambda T_i(\pi_i) \big), \\ \text{s.t.} & \lambda \ge 0. \end{array} \]
The above program is convex. In particular, the objective is a convex function of $\lambda$.
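To illustrate the structure of the dual objective, the following Python sketch evaluates $\lambda kT + \sum_i \max_{\pi_i}(R_i(\pi_i) - \lambda T_i(\pi_i))$ by backward induction on each arm separately. It is a sketch under stated assumptions rather than the scheme developed in this section: each arm is assumed to have a single pull action with a time-homogeneous transition matrix `P` and reward vector `r`, the idle action is assumed to earn nothing and leave the state unchanged, and the names `arm_value` and `dual_objective` are hypothetical.

```python
def arm_value(P, r, pi0, T, lam):
    """max over policies of E[reward] - lam * E[number of pulls] for one arm."""
    n_states = len(r)
    V = [0.0] * n_states  # value-to-go at the horizon
    for _ in range(T):    # backward induction over T decision epochs
        V = [max(V[s],    # idle: no reward, state unchanged (assumption)
                 r[s] - lam + sum(P[s][s2] * V[s2] for s2 in range(n_states)))
             for s in range(n_states)]
    return sum(pi0[s] * V[s] for s in range(n_states))

def dual_objective(arms, k, T, lam):
    """lam * k * T plus the sum of per-arm penalized values."""
    return lam * k * T + sum(arm_value(P, r, pi0, T, lam) for P, r, pi0 in arms)

# Toy instance (illustration only): two identical one-state arms that pay 1 per
# pull, with k = 1 simultaneous pull and horizon T = 3.
arms = [([[1.0]], [1.0], [1.0])] * 2
values = [dual_objective(arms, 1, 3, lam) for lam in (0.0, 0.5, 1.0, 2.0)]
print(values)
```

On this toy instance the dual objective is piecewise linear with a kink at $\lambda = 1$, where it attains its minimum value of 3; this matches the best achievable reward of pulling one arm in each of the three periods, and the kink is a small example of the non-differentiability at the optimal dual solution mentioned above.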
We will show that strong duality applies to the dual pair of programs above, so that the optimal solutions to the two programs have identical value. Next, we will observe that for a given value of $\lambda$, it is simple to compute $\max_{\pi_i} \big( R_i(\pi_i) - \lambda T_i(\pi_i) \big)$ via the solution of a dynamic program over the state space of arm $i$ (a fast procedure). Finally it is simple to derive useful a-priori lower and upper

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

Lecture 14: Bandits with Budget Constraints

Lecture 14: Bandits with Budget Constraints IEOR 8100-001: Learnng and Optmzaton for Sequental Decson Makng 03/07/16 Lecture 14: andts wth udget Constrants Instructor: Shpra Agrawal Scrbed by: Zhpeng Lu 1 Problem defnton In the regular Mult-armed

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 7: LP Duality Lecturer: Matt Weinberg prnceton unv. F 17 cos 521: Advanced Algorthm Desgn Lecture 7: LP Dualty Lecturer: Matt Wenberg Scrbe: LP Dualty s an extremely useful tool for analyzng structural propertes of lnear programs. Whle there

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

The Minimum Universal Cost Flow in an Infeasible Flow Network

The Minimum Universal Cost Flow in an Infeasible Flow Network Journal of Scences, Islamc Republc of Iran 17(2): 175-180 (2006) Unversty of Tehran, ISSN 1016-1104 http://jscencesutacr The Mnmum Unversal Cost Flow n an Infeasble Flow Network H Saleh Fathabad * M Bagheran

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 12 10/21/2013. Martingale Concentration Inequalities and Applications MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.65/15.070J Fall 013 Lecture 1 10/1/013 Martngale Concentraton Inequaltes and Applcatons Content. 1. Exponental concentraton for martngales wth bounded ncrements.

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Technical Note: Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model

Technical Note: Capacity Constraints Across Nests in Assortment Optimization Under the Nested Logit Model Techncal Note: Capacty Constrants Across Nests n Assortment Optmzaton Under the Nested Logt Model Jacob B. Feldman, Huseyn Topaloglu School of Operatons Research and Informaton Engneerng, Cornell Unversty,

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

THE SUMMATION NOTATION Ʃ
