arxiv: v1 [math.oc] 25 Jun 2008
Irrevocable Multi-Armed Bandit Policies

Vivek F. Farias (Sloan School of Management and Operations Research Center, Massachusetts Institute of Technology, email: vivekf@mit.edu)
Ritesh Madan (Qualcomm-Flarion Technologies, email: rkmadan@stanfordalumni.org)

February 13, 2018

Abstract

This paper considers the multi-armed bandit problem with multiple simultaneous arm pulls. We develop a new irrevocable heuristic for this problem. In particular, we do not allow recourse to arms that were pulled at some point in the past but then discarded. This irrevocable property is highly desirable from a practical perspective. As a consequence of this property, our heuristic entails a minimum amount of exploration. At the same time, we find that the price of irrevocability is limited for a broad useful class of bandits we characterize precisely. This class includes one of the most common applications of the bandit model, namely, bandits whose arms are coins of unknown biases. Computational experiments with a generative family of large scale problems within this class indicate losses of up to 5-10% relative to an upper bound on the performance of an optimal policy with no restrictions on exploration. We also provide a worst-case theoretical analysis that shows that for this class of bandit problems, the price of irrevocability is uniformly bounded: our heuristic earns expected rewards that are always within a factor of 1/8 of an optimal policy with no restrictions on exploration. In addition to being an indicator of robustness across all parameter regimes, this analysis sheds light on the structural properties that afford a low price of irrevocability.
1 Introduction

Consider the operations of a fast-fashion retailer such as Zara or H&M. Such retailers have developed and invested in merchandize procurement strategies that permit lead times for new fashions as short as two weeks. As a consequence of this flexibility, such retailers are able to adjust the assortment of products offered on sale at their stores to quickly adapt to popular fashion trends. In particular, such retailers use weekly sales data to refine their estimates of an item's popularity, and based on such revised estimates weed out unpopular items, or else re-stock demonstrably popular ones on a week-by-week basis. In sharp contrast, traditional retailers such as J.C. Penney or Marks and Spencer face lead times on the order of several months. As such, these retailers need to predict popular fashions months in advance and are allowed virtually no changes to their product assortments over the course of a sales season, which is typically several months in length. Understandably, this approach is not nearly as successful at identifying high selling fashions and also results in substantial unsold inventories at the end of a sales season. In view of the great deal of a-priori uncertainty in the popularity of a new fashion and the speed at which fashion trends evolve, the fast-fashion operations model is highly desirable and emerging as the de-facto operations model for large fashion retailers.

Among other things, the fast-fashion model relies crucially on an effective technology to learn from purchase data, and adjust product assortments based on such data. Such a technology must strike a balance between exploring potentially successful products and exploiting products that are demonstrably popular. A convenient mathematical model within which to design algorithms capable of accomplishing such a task is that of the multi-armed bandit. While we defer a precise mathematical discussion to a later section, a multi-armed bandit consists of multiple (say $n$) arms, each corresponding to a Markov Decision Process.
As a special case, one may think of each arm as an independent binomial coin with an uncertain bias specified via some prior distribution. At each point in time, one may pull up to a certain number of arms (say $k < n$) simultaneously, or equivalently, toss up to a certain number of coins. For each tossed coin, we earn a reward proportional to its realization and are able to refine our estimate of its bias based on this realization. We neither learn about, nor earn rewards from, coins that are not tossed. The multi-armed bandit problem requires finding a policy that adaptively selects $k$ arms to pull at every point in time with a view to maximizing total expected reward earned over some finite time horizon or, alternatively, discounted rewards earned over an infinite horizon or perhaps even long term average rewards.

With multiple simultaneous pulls allowed, the multi-armed bandit problem we have described is computationally hard. A popular and empirically successful heuristic for this problem was proposed several decades ago by Whittle. Whittle's heuristic produces an index for every arm based on the state of that arm and simply calls for pulling the $k$ arms with the highest index at every point in time. While it has been empirically and computationally observed that Whittle's heuristic provides excellent performance, the heuristic typically calls for frequent changes to the set of arms pulled that might, in hindsight, have been unnecessary. For instance, in the retail context, such a heuristic may choose to discard from the assortment a product presently being offered for sale in favor of a new product whose popularity is not known precisely. Later, the heuristic may well choose to reintroduce the discarded product. While such exploration may appear necessary if one is to discover profitable bandit arms (or popular products), enabling such a heuristic in practice will typically call for a great number of adjustments to the product assortment, a requirement that is both
expensive and undesirable. This raises the following question: Is it possible to design a heuristic for the multi-armed bandit problem that comes close to being optimal with a minimal number of adjustments to the set of arms pulled over time?

This paper introduces a new irrevocable heuristic for the multi-armed bandit problem we call the packing heuristic. The packing heuristic establishes a static ranking of bandit arms based on a measure of their potential value relative to the time required to realize that value, and pulls arms in the order prescribed by this ranking. For an arm currently being pulled, the heuristic may either choose to continue pulling that arm in the next time step or else discard the arm in favor of the next highest ranked arm not currently being pulled. Once discarded, an arm will never be chosen again; hence the term irrevocable. Irrevocability is an attractive structural constraint to impose on arm selection policies in a number of practical applications of the bandit model, such as the dynamic assortment problem we have discussed or sequential drug trials where recourse to drugs whose testing was discontinued in the past is socially unacceptable. It is clear that an irrevocable heuristic makes a minimal number of changes to the set of arms pulled. What is perhaps surprising is that the restriction to an irrevocable policy is typically far less expensive than one might expect. In particular, we demonstrate via a theoretical analysis and computational experiments that the use of the packing heuristic incurs a small performance loss relative to an optimal bandit policy with no restriction on exploration, i.e., an optimal strategy that is allowed recourse to arms that were pulled but discarded in the past. More specifically, the present work makes the following contributions:

We introduce a new irrevocable heuristic, the packing heuristic, for the multi-armed bandit problem with multiple simultaneous arm-pulls. The packing heuristic is irrevocable in that if an arm being pulled is at some point discarded from the set of arms being pulled, it is never pulled again.
At the same time, the performance loss incurred relative to an optimal, potentially non-irrevocable, control policy is limited. In particular, computational experiments with the packing heuristic for a generative family of large scale bandit problems indicate performance losses of up to about a few percent relative to an upper bound on the performance of an optimal policy with no restrictions on exploration. This level of performance suggests that the packing heuristic is likely to serve as a viable heuristic for the multi-armed bandit with multiple plays even when irrevocability is not a concern.

In addition to our computational study, we are able to demonstrate a uniform bound on the price of irrevocability for a broad, interesting class of bandits. This class includes most commonly used applications of the bandit model, such as bandits whose arms are coins of unknown biases. We demonstrate that the packing heuristic earns expected rewards that are always within a factor of 1/8 of an optimal, potentially non-irrevocable policy. Such a uniform bound guarantees robust performance across all parameter regimes; in particular, the packing heuristic will track the performance of an optimal, potentially non-irrevocable policy across all parameter regimes. In addition, our analysis sheds light on the structural properties that afford the surprising efficacy of the irrevocable policies considered here.

In the interest of practical applicability, we develop a fast combinatorial implementation of the packing heuristic. Assuming that an individual arm has $O(\Sigma)$ states, and given a time horizon of $T$ steps, optimal solution to the multi-armed bandit problem under consideration requires
$O(\Sigma^n T^n)$ computations. The main computational step in the packing heuristic calls for the one time solution of a linear program with $O(n\Sigma T)$ variables, whose solution via a generic LP solver requires $O(n^3\Sigma^3T^3)$ computations. We develop a novel combinatorial algorithm that solves this linear program in $O(n\Sigma^2 T \log T)$ steps by solving a sequence of dynamic programs for each bandit arm. The technique we develop here is potentially of independent interest for the solution of weakly coupled optimal control problems with coupling constraints that must be met in expectation. Employing this solution technique, our heuristic requires a total of $O(n\Sigma^2 \log T)$ computations per time step amortized over the time horizon. In comparison, the simplest theoretically sound heuristics in existence for this multi-armed bandit problem (such as Whittle's heuristic) require $O(n\Sigma^2 T)$ computations per time step. As such, we establish that the packing heuristic is computationally attractive.

1.1 Relevant Literature

The multi-armed bandit problem has a rich history, and a number of excellent references (such as Gittins (1989)) provide a thorough treatment of the subject. We review here literature especially relevant to the present work. In the case where $k = 1$, that is, allowing for a single arm to be pulled in a given time step, Gittins and Jones (1974) developed an elegant index based policy that was shown to be optimal for the problem of maximizing discounted rewards over an infinite horizon. Their index policy is known to be suboptimal if one is allowed to pull more than a single arm in a given time step. Whittle (1988) developed a simple index based heuristic for a more general bandit problem (the restless bandit problem) allowing for multiple arms to be pulled in a given time step. While his original paper was concerned with maximizing long-term average rewards, his heuristic is easily adapted to other objectives such as discounted infinite horizon rewards or expected rewards over a finite horizon (see for instance Caro and Gallien (2007), Bertsimas and Nino-Mora (2000)).
Weiss (1992) subsequently established that under suitable conditions, Whittle's heuristic was asymptotically optimal (in a regime where $n$ and $k$ go to infinity keeping $n/k$ constant). Whittle's heuristic may be viewed as a modification to the optimal control policy one obtains upon relaxing the requirement that at most $k$ arms be pulled in a given time step to requiring that at most $k$ arms be pulled in expectation in any given time step. The packing heuristic we introduce is motivated by a similar relaxation. In particular, we restrict attention to policies that entail a total of at most $kT$ arm pulls over the entire horizon in expectation while allowing for no more than $T$ pulls of any given arm. Where we differ substantially from Whittle's heuristic is the manner in which we construct a feasible policy (one where at most $k$ arms are pulled in a given time step) from the relaxed policy. In fact there are potentially many reasonable ways of transforming an optimal policy for the relaxed problem to a feasible policy for the multi-armed bandit; for instance, Bertsimas and Nino-Mora (2000) use a scheme distinct from both Whittle's and ours that employs optimal primal and dual solutions to a linear programming formulation of Whittle's relaxation to construct an index heuristic for arm selection. Nonetheless, none of these schemes are irrevocable, nor do they offer non-asymptotic performance guarantees, if any.

The packing heuristic policy builds upon recent insights on the adaptivity gap for stochastic packing problems. In particular, Dean et al. (2004) recently established that a simple static rule (Smith's rule) for packing a knapsack with items of fixed reward (known a-priori), but whose sizes were stochastic and unknown a-priori, was within a constant factor of the optimal adaptive packing
policy. Guha and Munagala (2007) used this insight to establish a similar static rule for budgeted learning problems. In such a problem one is interested in finding a coin with highest bias from a set of coins of uncertain bias, assuming one is allowed to toss a single coin in a given time step and that one has a finite budget on the number of such experimental tosses allowed. Our work parallels that work in that we draw on the insights of the stochastic packing results of Dean et al. (2004). In addition, we must address two significant hurdles: first, correlations between the total reward earned from pulls of a given arm and the total number of pulls of that arm (these turn out not to matter in the budgeted learning setting, but are crucial to our setting), and second, the fact that multiple arms may be pulled simultaneously (only a single arm may be pulled at any time in the budgeted learning setting). Finally, a working paper (Bhattacharjee et al. (2007)), brought to our attention by the authors of that work, considers a variant of the budgeted learning problem of Guha and Munagala (2007) wherein one is allowed to toss multiple coins simultaneously. While it is conceivable that their heuristic may be modified to apply to the multi-armed bandit problem we address, the heuristic they develop is also not irrevocable.

Restricted to coins, our work takes an inherently Bayesian view of the multi-armed bandit problem. It is worth mentioning that there are a number of non-parametric formulations to such problems with a vast associated literature. Most relevant to the present model are the papers by Anantharam et al. (1987a,b) that develop simple regret-optimal strategies for multi-armed bandit problems with multiple simultaneous plays.

Our development of an irrevocable policy for the multi-armed bandit problem was originally motivated by applications of this framework to dynamic assortment problems of the type mentioned in the introduction.
In particular, Caro and Gallien (2007) computationally explore the use of a number of simple index-type heuristics (similar to Whittle's heuristic) for such problems, none of which are irrevocable; nonetheless, they stress the importance of a minimal number of changes to the assortment if any such heuristic is to be practical.

The remainder of this paper is organized as follows. Section 2 presents the multi-armed bandit model we consider and develops an (intractable) LP whose solution yields an optimal control policy for this bandit problem. Section 3 develops the packing heuristic by considering a suitable relaxation of the multi-armed bandit problem. Section 4 introduces a structural property for bandit arms we call the decreasing returns property. It is shown that a useful class of bandits, namely the coin bandits relevant to the applications that motivate us, possess this property. That section then establishes that the price of irrevocability for bandits possessing the decreasing returns property is uniformly bounded. Section 5 presents very encouraging computational experiments for large scale bandit problems drawn from a generative family of coin type bandits. In the interest of implementability, Section 6 develops a combinatorial algorithm for the fast computation of packing heuristic policies for multi-armed bandits. Section 7 concludes with a perspective on interesting directions for future work.

2 Model

We consider a multi-armed bandit problem with multiple simultaneous pulls permitted at every time step. A single bandit arm (indexed by $i$) is a Markov Decision Process (MDP) specified by a state space $S_i$, an action space $A_i$, a reward function $r_i : S_i \times A_i \to \mathbb{R}_+$, and a transition
kernel $P_i : S_i \times A_i \to \Delta_{|S_i|}$ (where $\Delta_{|S_i|}$ is the $|S_i|$-dimensional unit simplex), yielding a probability distribution over next states should one choose some action $a_i \in A_i$ in state $s_i \in S_i$. Every bandit arm is endowed with a distinguished idle action $\phi_i$. Should a bandit be idled in some time period, it yields no rewards in that period and transitions to the same state with probability 1 in the next period. More precisely,

\[ r_i(s_i,\phi_i) = 0, \quad \forall s_i \in S_i, \]
\[ P_i(s_i,\phi_i,s_i) = 1, \quad \forall s_i \in S_i. \]

We consider a bandit problem with $n$ arms. In each time step one must select a subset of up to $k\ (\le n)$ arms for which one may pick any action available at those respective arms. Should an action other than the idle action be selected at any of these $k$ arms, we refer to such a selection as a pull of that arm. That is, any action $a_i \in A_i \setminus \{\phi_i\}$ would be considered a pull of the $i$th arm. One is forced to pick the idle action for the remaining $n-k$ arms. We wish to find an action selection (or control) policy that maximizes expected rewards earned over $T$ time periods.

Our problem may be cast as an optimal control problem. In particular, we define as our state space the set $S = \prod_i S_i$ and as our action space the set $A = \prod_i A_i$. We let $\mathcal{T} = \{0,1,\dots,T-1\}$. We understand by $s_i$ the $i$th component of $s \in S$ and similarly let $a_i$ denote the $i$th component of $a \in A$. A feasible action is one which calls for simultaneously pulling at most $k$ arms. In particular, we let

\[ A^{\text{feas}} = \Big\{ a \in A : \sum_i \mathbf{1}_{a_i \neq \phi_i} \le k \Big\} \]

denote the set of all feasible actions. We define a reward function $R : S \times A \to \mathbb{R}_+$, given by $R(s,a) = \sum_i r_i(s_i,a_i)$, and a system transition kernel $P : S \times A \to \prod_i \Delta_{|S_i|}$, given by $P(s,a,s') = \prod_i P_i(s_i,a_i,s_i')$.

We now formally develop what we mean by a control policy. The arm selection policy we will eventually develop will use auxiliary information aside from the current state of the system, and so we require a general definition.
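The model above can be made concrete with a small sketch. The code below is illustrative only: the helper names (`make_arm`, `is_feasible`, `joint_reward`, `joint_transition_prob`) and the dictionary-based representation of an arm are assumptions made for this example and do not come from the paper. It encodes the idle action (zero reward, state unchanged with probability 1), the feasibility condition of at most $k$ non-idle actions, and the additive reward and product-form kernel of the joint system.

```python
IDLE = "idle"  # stands in for the idle action phi_i (hypothetical encoding)


def make_arm(pull_reward, pull_kernel):
    """Build (r_i, P_i) for an arm with a single non-idle action.

    pull_reward[s] is the reward of pulling in state s;
    pull_kernel[s][s_next] is the transition probability under a pull.
    """
    def r(s, a):
        # Idling earns nothing, as required by r_i(s_i, phi_i) = 0.
        return 0.0 if a == IDLE else pull_reward[s]

    def P(s, a, s_next):
        # Idling leaves the state unchanged with probability 1.
        if a == IDLE:
            return 1.0 if s_next == s else 0.0
        return pull_kernel[s][s_next]

    return r, P


def is_feasible(joint_action, k):
    """A joint action is feasible if it pulls at most k arms."""
    return sum(1 for a in joint_action if a != IDLE) <= k


def joint_reward(arm_rewards, s, a):
    """R(s, a) is the sum over arms of r_i(s_i, a_i)."""
    return sum(r(si, ai) for r, si, ai in zip(arm_rewards, s, a))


def joint_transition_prob(arm_kernels, s, a, s_next):
    """P(s, a, s') is the product over arms of P_i(s_i, a_i, s_i')."""
    prob = 1.0
    for P, si, ai, sn in zip(arm_kernels, s, a, s_next):
        prob *= P(si, ai, sn)
    return prob
```

For instance, with two copies of a two-state arm, pulling the first arm in state 0 while idling the second in state 1 earns only the first arm's reward, and the second arm stays in state 1 with certainty.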
Let $X_0$ be a random variable that encapsulates any endogenous randomization in selecting an action, and define the filtration generated by $X_0$ and the history of visited states and actions by

\[ \mathcal{F}_t = \sigma\big(X_0, (s^0), (s^1,a^0), \dots, (s^t,a^{t-1})\big), \]

where $s^t$ and $a^t$ denote the state and action at time $t$, respectively. We assume that $\mathbb{P}(s^{t+1} = s' \mid s^t = s, a^t = a, H^t = h^t) = P(s,a,s')$ for all $s,s' \in S$, $a \in A$, $t \in \mathcal{T}$ and any $\mathcal{F}_t$-measurable random variable $H^t$. A feasible policy simply specifies a sequence of $A^{\text{feas}}$-valued actions $\{a^t\}$ adapted to $\mathcal{F}_t$. In particular, such a policy may be specified by a collection of $\sigma(X_0)$-measurable, $A^{\text{feas}}$-valued random variables, $\{\mu(s^0,\dots,s^t,a^0,\dots,a^{t-1},t)\}$, one for each possible state-action history of the system. We let $\mathcal{M}$ denote the set of all such policies $\mu$, and denote by $J^\mu(s,0)$ the expected value of using policy $\mu$ starting in state $s$ at time 0; in particular,

\[ J^\mu(s,0) = E\Big[ \sum_{t=0}^{T-1} R(s^t,a^t) \,\Big|\, s^0 = s \Big], \]

where $a^t = \mu(s^0,\dots,s^t,a^0,\dots,a^{t-1},t)$. Our goal is to compute an optimal admissible policy.

Markovian policies, i.e., policies under which $a^t$ is measurable with respect to $\sigma(X_0,s^t)$, are particularly useful. A Markovian policy is
specified as a collection of independent $A^{\text{feas}}$-valued random variables $\{\mu(s,t)\}$, each measurable with respect to $\sigma(X_0)$. In particular, assuming the system is in state $s$ at time $t$, such a policy selects an action $a^t$ as the random variable $\mu(s,t)$, independent of past states and actions. We let $\mathcal{M}_m$ denote the set of all such admissible Markovian policies. Every $\mu \in \mathcal{M}_m$ is associated with a value function, $J^\mu : S \times \mathcal{T} \to \mathbb{R}_+$, which, for every $(s,t) \in S \times \mathcal{T}$, gives the expected value of using control policy $\mu$ starting at that state:

\[ J^\mu(s,t) = E\Big[ \sum_{t'=t}^{T-1} R\big(s^{t'},\mu(s^{t'},t')\big) \,\Big|\, s^t = s \Big]. \]

We denote by $J^*$ the optimal value function. In particular, $J^*(s,t) = \sup_{\mu \in \mathcal{M}_m} J^\mu(s,t)$. The preceding supremum is always achieved and we denote by $\mu^*$ a corresponding optimal Markovian control policy. That is, $\mu^* \in \arg\sup_{\mu \in \mathcal{M}_m} J^\mu(s,t)$ for all $(s,t) \in S \times \mathcal{T}$. Our restriction to Markovian policies is without loss; $\mathcal{M}_m$ always contains an optimal policy among the broader class of admissible policies, so that $\sup_{\mu \in \mathcal{M}_m} J^\mu(s,0) = \sup_{\mu \in \mathcal{M}} J^\mu(s,0)$ for all states $s$. We next formulate a mathematical program to compute such an optimal policy.

2.1 Computing an Optimal Policy

An optimal policy $\mu^*$ may be found via the solution of the following linear program, $LP(\pi_0)$, specified by a parameter $\pi_0$, a probability distribution over $S$ that specifies the distribution of arm states at time $t = 0$.

\[
\begin{array}{ll}
\max. & \sum_t \sum_{s,a} \pi(s,a,t) R(s,a), \\
\text{s.t.} & \sum_a \pi(s,a,t) = \sum_{s',a'} P(s',a',s)\,\pi(s',a',t-1), \quad \forall t > 0,\ s \in S, \\
& \pi(s,a,t) = 0, \quad \forall s,\ t,\ a \notin A^{\text{feas}}, \\
& \sum_a \pi(s,a,0) = \pi_0(s), \quad \forall s \in S, \\
& \pi \ge 0,
\end{array}
\]

where the variables are the state-action frequencies $\pi(s,a,t)$, which give the probability of being in state $s$ at time $t$ and choosing action $a$. The first set of constraints in the above program simply enforce the dynamics of the system, while the second set of constraints enforces the requirement that at most $k$ arms are simultaneously pulled at any point in time. An optimal solution to the program above may be used to construct a policy $\mu^*$ that attains expected value $J^*(s,0)$ starting at any state $s$ for which $\pi_0(s) > 0$.
In particular, given an optimal solution $\pi^{\text{opt}}$ to $LP(\pi_0)$, one obtains such a policy by defining $\mu^*(s,t)$ as a random variable that takes value $a \in A$ with probability $\pi^{\text{opt}}(s,a,t) / \sum_{a'} \pi^{\text{opt}}(s,a',t)$. By construction, we have $E[J^*(s,0) \mid s \sim \pi_0] = OPT(LP(\pi_0))$. Of course, efficient solution of the above program is not a tractable task, which forces us to seek approximations to an optimal policy. The next section will present one such policy with an appealing structural property we term irrevocability.

3 An Irrevocable Approximation to the Optimal Policy

This section develops an approximation to the optimal multi-armed bandit control policy that we will subsequently establish performs adequately relative to the optimal policy. This approximation
will possess a desirable property we term irrevocability. In particular, the policy we develop will, at any time, be permitted to pull an arm only if that arm was pulled in the prior time step, or else never pulled in the past.

We first develop a control policy for a related bandit problem, where the requirement that at most $k$ arms be pulled in any time step is relaxed. As we will see, this is essentially Whittle's relaxation, and the value of the policy developed for this relaxation is an upper bound on the value of the optimal policy. We will then use the control policy developed for this relaxed control problem to design a policy for the multi-armed bandit problem that is irrevocable and also offers good performance relative to the optimal policy for a broad class of bandits.

Consider the following relaxation of the program $LP(\pi_0)$, $RLP(\pi_0)$. $RLP(\pi_0)$ may be viewed as a primal formulation of Whittle's relaxation:

\[
\begin{array}{ll}
\max. & \sum_i \sum_t \sum_{s_i,a_i} \pi_i(s_i,a_i,t)\, r_i(s_i,a_i), \\
\text{s.t.} & \sum_{a_i} \pi_i(s_i,a_i,t) = \sum_{s_i',a_i'} P_i(s_i',a_i',s_i)\,\pi_i(s_i',a_i',t-1), \quad \forall t > 0,\ s_i \in S_i,\ i, \\
& \sum_i \Big[ T - \sum_{s_i} \sum_t \pi_i(s_i,\phi_i,t) \Big] \le kT, \\
& \sum_{a_i} \pi_i(s_i,a_i,0) = \sum_{\bar s : \bar s_i = s_i} \pi_0(\bar s), \quad \forall s_i \in S_i,\ i, \\
& \pi \ge 0,
\end{array}
\]

where $\pi_i(s_i,a_i,t)$ is the probability of the $i$th bandit being in state $s_i$ at time $t$ and choosing action $a_i$. The program above relaxes the requirement that at most $k$ arms be pulled in a given time step; instead we now require that over the entire horizon at most $kT$ arms are pulled in expectation, where the expectation is over policy randomization and state evolution. The first set of equality constraints enforce individual arm dynamics, whereas the first inequality constraint enforces the requirement that at most $kT$ arms be pulled in expectation over the entire time horizon. The following lemma makes the notion of a relaxation to $LP(\pi_0)$ precise; the proof may be found in the appendix.

Lemma 1. $OPT(RLP(\pi_0)) \ge OPT(LP(\pi_0))$.

Given an optimal solution $\bar\pi$ to $RLP(\pi_0)$, one may consider the policy $\mu^R$ that, assuming we are in state $s$ at time $t$, selects a random action $\mu^R(s,t)$, where $\mu^R(s,t) = a$ with probability $\prod_i \big( \bar\pi_i(s_i,a_i,t) / \sum_{a_i'} \bar\pi_i(s_i,a_i',t) \big)$, independent of the past.
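The construction of a randomized policy from state-action frequencies can be sketched for a single arm as follows. This is a minimal illustration, assuming the frequencies are stored in a dictionary keyed by (state, action, time); the function names `action_distribution` and `sample_action` are hypothetical and not taken from the paper.

```python
import random


def action_distribution(pi, s, t):
    """Normalize the frequencies pi[(s, a, t)] over actions a.

    Returns a dict action -> probability, or None when state s carries
    no mass at time t (the policy is then unconstrained at that state).
    """
    mass = {a: p for (s2, a, t2), p in pi.items() if s2 == s and t2 == t}
    total = sum(mass.values())
    if total == 0:
        return None
    return {a: p / total for a, p in mass.items()}


def sample_action(pi, s, t, rng):
    """Draw an action with probability pi(s, a, t) / sum_a' pi(s, a', t)."""
    dist = action_distribution(pi, s, t)
    if dist is None:
        raise ValueError("state has zero frequency at this time")
    actions = list(dist)
    return rng.choices(actions, weights=[dist[a] for a in actions])[0]
```

As a usage example, frequencies of 0.375 for pulling and 0.125 for idling in some state at time 0 normalize to a pull probability of 0.75 in that state.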
Noting that the action for each arm $i$ is chosen independently of all other arms, we use $\mu_i^R(s_i,t)$ to denote the induced policy for arm $i$. By construction, $E[J^{\mu^R}(s,0) \mid s \sim \pi_0] = OPT(RLP(\pi_0))$. Moreover, we have that $\mu^R$ satisfies the constraint

\[ E\Big[ \sum_{t=0}^{T-1} \sum_i \mathbf{1}_{\mu_i^R(s_i^t,t) \neq \phi_i} \,\Big|\, s^0 = s \Big] \le kT, \]

where the expectation is over random state transitions and endogenous policy randomization. Of course, $\mu^R$ is not necessarily feasible; we ultimately require a policy that entails at most $k$ arm pulls in any time step. We will use $\mu^R$ to construct such a feasible policy. In addition, we will see that if an arm is pulled and then idled in some subsequent time step, it will never again be pulled, so that the policy we construct will be irrevocable.

In what follows we will assume for convenience that $\pi_0$ is degenerate and puts mass 1 on a single starting state. That is, $\pi_{0,i}(s_i) = 1$ for some $s_i \in S_i$ for all $i$, where $\pi_{0,i}$ denotes the marginal of $\pi_0$ on arm $i$. We first introduce some relevant notation. Given an optimal solution $\bar\pi$
to $RLP(\pi_0)$, define the value generated by arm $i$ as the random variable

\[ R_i = \sum_{t=0}^{T-1} r_i\big(s_i^t, \mu_i^R(s_i^t,t)\big), \]

and the active time of arm $i$, $T_i$, as the total number of pulls of arm $i$ entailed under that policy,

\[ T_i = \sum_{t=0}^{T-1} \mathbf{1}_{\mu_i^R(s_i^t,t) \neq \phi_i}. \]

The expected value of arm $i$ is $E[R_i] = \sum_{s_i,a_i,t} \bar\pi_i(s_i,a_i,t)\, r_i(s_i,a_i)$, and the expected active time is $E[T_i] = \sum_{s_i,a_i,t : a_i \neq \phi_i} \bar\pi_i(s_i,a_i,t)$. We will assume in what follows that $E[T_i] > 0$ for all $i$; otherwise, we simply consider eliminating those $i$ for which $E[T_i] = 0$. We will also assume for analytical convenience that $\sum_i E[T_i] = kT$. Neither assumption results in a loss of generality.

To motivate our policy we begin with the following analogy with a packing problem: Imagine packing $n$ objects into a knapsack of size $B$. Each object $i$ has size $\bar T_i$ and value $\bar R_i$. Moreover, we assume that we are allowed to pack fractional quantities of an object into the knapsack and that packing a fraction $\alpha$ of the $i$th object requires space $\alpha \bar T_i$ and generates value $\alpha \bar R_i$. An optimal policy is then given by the following greedy procedure: select objects in decreasing order of the ratio $\bar R_i / \bar T_i$ and place them into the knapsack to the extent that there is room available. If one had more than a single knapsack and the additional constraint that an item could not be placed in more than a single knapsack, then the situation is more complicated. One may consider a greedy procedure that, as before, considers items in decreasing order of the ratio $\bar R_i / \bar T_i$ and places them (possibly fractionally) in sequence into the least loaded of the bins at that point. This generalization of the greedy procedure for the simple knapsack is suboptimal, but still a reasonable heuristic.

Thus motivated, we begin with a loose high level description of our control policy, which we call the packing heuristic. We think of each bandit arm as an item of value $E[R_i]$ with size $E[T_i]$. For the purposes of this explanation alone, we will assume for convenience that should policy $\mu^R$ call for an arm that was pulled in the past to be idled, it will never again call for that arm to be pulled; we will momentarily remove that assumption.
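The single-knapsack analogy can be sketched directly. The code below is a minimal illustration of the greedy rule for the fractional knapsack, assuming strictly positive sizes; the function name `greedy_fractional_knapsack` and the example numbers are hypothetical.

```python
def greedy_fractional_knapsack(values, sizes, capacity):
    """Pack items in decreasing order of value/size, splitting the last one.

    values[i] and sizes[i] play the roles of the value and size of object i;
    sizes are assumed strictly positive. Returns the fraction of each item
    packed and the total value obtained.
    """
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / sizes[i], reverse=True)
    fractions = [0.0] * len(values)
    remaining = capacity
    for i in order:
        if remaining <= 0:
            break
        # Take the whole item if it fits, otherwise the fraction that does.
        take = min(1.0, remaining / sizes[i])
        fractions[i] = take
        remaining -= take * sizes[i]
    total = sum(f * v for f, v in zip(fractions, values))
    return fractions, total
```

With values (6, 10, 12), sizes (1, 2, 4) and capacity 5, the rule packs the first two items whole and half of the third, for a total value of 22.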
Our control policy will operate as follows: we will order arms in decreasing order of the ratio $E[R_i]/E[T_i]$. We begin with the top $k$ arms according to this ordering. For each such arm we will select an action according to the policy specified for that arm by $\mu^R$; should this policy call for the arm to be idled, we discard that arm and will never again consider pulling it. We replace the discarded arm with the next available arm (in order of initial arm rankings) and select an action for the arm according to $\mu^R$. We repeat this procedure until we have selected non-idle actions for up to $k$ arms (or no arms are available). We then let time advance, earn rewards, and repeat the procedure described above until the end of the time horizon. Algorithm 1 describes the packing heuristic policy precisely, addressing the fact that $\mu^R$ may call for an arm to be idled but then pulled in some subsequent time step.

In the event that we placed no restriction on the time horizon (i.e., we set $T = \infty$ in the algorithm above), we have by construction that the expected total reward earned under the above policy is precisely $OPT(RLP(\pi_0))$. In essence, $RLP(\pi_0)$ prescribes a policy wherein each arm generates a total reward with mean $E[R_i]$ using an expected total number of pulls $E[T_i]$, independent of other
Algorithm 1 The Packing Heuristic

Renumber bandits so that $E[R_1]/E[T_1] \ge E[R_2]/E[T_2] \ge \dots \ge E[R_n]/E[T_n]$. Index bandits by variable $i$.
$l_i \leftarrow 0$, $a_i \leftarrow \phi_i$ for all $i$, $s \sim \pi_0(\cdot)$ {The local time of every arm is set to 0 and its designated action to the idle action. An initial state is drawn according to the initial state distribution $\pi_0$.}
$J \leftarrow 0$ {Total reward earned is initialized to 0.}
$X \leftarrow \{1,2,\dots,k\}$, $A \leftarrow \{k+1,\dots,n\}$, $D \leftarrow \emptyset$ {Initialize the set of active ($X$), available ($A$), and discarded ($D$) arms.}
for $t = 0$ to $T-1$ do
    while there exists an arm $i \in X$ with $a_i = \phi_i$ do {Select up to $k$ arms to pull.}
        Select an $i \in X$ with $a_i = \phi_i$ {In what follows, either select an action for arm $i$ or else discard it.}
        while $a_i = \phi_i$ and $l_i < T$ do {Attempt to select a pull action for arm $i$.}
            Select $a_i \propto \bar\pi_i(s_i,\cdot,l_i)$ {Select an action according to the solution to $RLP(\pi_0)$.}
            $l_i \leftarrow l_i + 1$ {Increment arm $i$'s local time.}
        end while
        if $l_i = T$ and $a_i = \phi_i$ then {Discard arm $i$ and activate the next highest ranked arm available.}
            $X \leftarrow X \setminus \{i\}$, $D \leftarrow D \cup \{i\}$ {Discard arm $i$.}
            if $A \neq \emptyset$ then {There are available arms.}
                $j \leftarrow \min A$ {Select highest ranked available arm.}
                $X \leftarrow X \cup \{j\}$, $A \leftarrow A \setminus \{j\}$ {Add arm $j$ to active set.}
            end if
        end if
    end while
    for every $i \in X$ do {Pull selected arms.}
        $s_i \sim P_i(s_i,a_i,\cdot)$ {Pull arm $i$; select next arm state according to its transition kernel assuming the use of action $a_i$.}
        $J \leftarrow J + r_i(s_i,a_i)$ {Earn rewards.}
        $a_i \leftarrow \phi_i$
    end for
end for
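A runnable sketch of the control flow of Algorithm 1 is given below. It is an illustration under stated assumptions rather than the authors' implementation: arms are assumed to be passed in already sorted by the ratio $E[R_i]/E[T_i]$, the per-arm policy is supplied as a function standing in for the solution to the relaxed LP, and the coin arm built by `make_coin` (a Beta-posterior coin with a threshold rule) is a hypothetical stand-in used only to exercise the loop. Rewards are collected from the state in which the pull is made.

```python
import random

IDLE = None  # stands in for the idle action phi_i


def make_coin(a, b, threshold):
    """Hypothetical coin arm: state (a, b) holds Beta posterior counts.

    The threshold rule below is a placeholder for the relaxed-LP policy.
    """
    def mean(s):
        return s[0] / (s[0] + s[1])

    def policy(s, local_time, rng):
        return "pull" if mean(s) >= threshold else IDLE

    def reward(s, action):
        return mean(s)  # expected reward of a toss in state s

    def step(s, action, rng):
        heads = rng.random() < mean(s)
        return (s[0] + 1, s[1]) if heads else (s[0], s[1] + 1)

    return {"s0": (a, b), "policy": policy, "reward": reward, "step": step}


def packing_heuristic(arms, k, T, rng):
    """Run the packing heuristic; arms must be sorted by decreasing rank."""
    n = len(arms)
    state = [arm["s0"] for arm in arms]
    local = [0] * n                          # local times l_i
    action = [IDLE] * n                      # designated actions a_i
    active = list(range(min(k, n)))          # X
    available = list(range(min(k, n), n))    # A, kept in rank order
    discarded = []                           # D
    total, pulls = 0.0, []
    for t in range(T):
        pos = 0
        while pos < len(active):
            i = active[pos]
            # Advance arm i's local time until it asks for a pull.
            while action[i] is IDLE and local[i] < T:
                action[i] = arms[i]["policy"](state[i], local[i], rng)
                local[i] += 1
            if action[i] is IDLE:
                # Local time exhausted: discard i, activate next ranked arm.
                active.pop(pos)
                discarded.append(i)
                if available:
                    active.insert(pos, available.pop(0))
            else:
                pos += 1
        for i in active:                     # pull the selected arms
            total += arms[i]["reward"](state[i], action[i])
            state[i] = arms[i]["step"](state[i], action[i], rng)
            action[i] = IDLE
        pulls.append(list(active))
    return total, pulls, discarded
```

A call such as `packing_heuristic([make_coin(3, 1, 0.5), make_coin(2, 1, 0.5), make_coin(1, 1, 0.5)], k=2, T=20, rng=random.Random(0))` returns the total reward, the list of arms pulled at each step, and the discarded arms. In this sketch every active arm is pulled at each step until it is discarded, so each arm's pull times form one contiguous block.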
arms. The above scheme may be visualized as one which packs as many of the pulls of various arms as possible in a manner so as to meet feasibility constraints.

It is clear that the heuristic we have constructed entails a minimal amount of arm exploration. In particular, we are guaranteed at most $n-k$ changes to the set of pulled arms. One may naturally ask what the limited exploration permitted under this policy costs us in terms of performance. In addition, is this scheme computationally practical? In particular, the linear programming relaxation we must solve is still a fairly large program. In subsequent sections we address these issues. First, we present a theoretical analysis that demonstrates that the price of irrevocability is uniformly bounded for an important general class of bandits. Our analysis sheds light on the structural properties that are likely to afford a low price of irrevocability in practice. We then present results of computational experiments with a generative family of large-scale problems demonstrating performance losses of up to 5-10% relative to an upper bound on the performance of the optimal policy (which is potentially non-irrevocable and has no restrictions on exploration). Finally, we address computational issues relevant to the packing heuristic and develop a computational scheme that is substantially quicker than heuristics such as Whittle's heuristic.

4 The Price of Irrevocability

This section establishes a uniform bound on the performance loss incurred in using the irrevocable packing heuristic relative to an optimal, potentially non-irrevocable scheme for a useful family of bandits whose arms exhibit a certain decreasing returns property. This class includes bandits whose arms are coins of unknown biases, a family particularly relevant to a number of applications including those discussed in the introduction. We establish that the packing heuristic always earns expected rewards that are within a factor of 1/8 of an optimal scheme. Our analysis sheds light on those structural properties that likely afford a low price to irrevocability.
In addition to being an indicator of robustness across all parameter regimes, this bound on the price of irrevocability is remarkable for two reasons. First, it does not rely on an asymptotic scaling of the system; the performance of the packing heuristic will track that of an optimal, potentially non-irrevocable heuristic across all regimes. Second, the bound represents a comparison with a system where one is allowed recourse to arms that were pulled in the past and discarded. In particular, the bound thus highlights the fact that for a useful class of bandits, one may achieve reasonable performance with very limited exploration. The typical performance we expect from the heuristic is likely to be far superior (as it generally is in the case of problems for which such worst case guarantees can be established); in a subsequent section we will present computational experiments indicating a performance loss of 5-10% relative to an optimal policy with no restrictions on exploration.

In what follows we first specify the decreasing returns property and explicitly identify a class of bandits that possess this property. We then present our performance analysis, which will proceed as follows: we first consider pulling bandit arms serially, i.e., at most one arm at a time, in order of their rank and show that the total reward earned from bandits that were first pulled within the first $kT/2$ pulls is at least within a factor of 1/8 of an optimal policy. This result relies on the static ranking of bandit arms used, and a symmetrization idea exploited by Dean et al. (2004) in their result on stochastic packing where rewards are statistically independent of item size. In contrast to that work, we must address the fact that the rewards earned from a bandit are statistically
dependent on the number of pulls of that bandit, and to this end we exploit the decreasing returns property, which establishes the nature of this correlation. We then show via a combinatorial sample path argument that the expected reward earned from bandits pulled within the first $T/2$ time steps of the packing heuristic, i.e., with arms being pulled in parallel, is at least as much as that earned in the setting above where arms are pulled serially, thereby establishing our performance guarantee.

4.1 The Decreasing Returns Property

Define for every $i$ and $l < T$ the random variable

\[ L_i(l) = \sum_{t=0}^{l} \mathbf{1}\{\mu^R_i(s_i^t, t) \neq \phi_i\}. \]

$L_i(l)$ tracks the number of times arm $i$ has been pulled under policy $\mu^R$ among the first $l+1$ steps of selecting an action for that arm. Further, define

\[ R_i^m = \sum_{l=0}^{T-1} \mathbf{1}\{L_i(l) \le m\}\, r_i\big(s_i^l, \mu^R_i(s_i^l, l)\big). \]

$R_i^m$ is the random reward earned within the first $m$ pulls of arm $i$ under the policy $\mu^R$. The decreasing returns property roughly states that the expected incremental returns from allowing an additional pull of a bandit arm are, on average, decreasing. More precisely, we have:

Property 1. (Decreasing Returns)

\[ E[R_i^{m+1}] - E[R_i^m] \le E[R_i^m] - E[R_i^{m-1}] \quad \text{for all } 0 < m < T. \]

One useful class of bandits from a modeling perspective that satisfies this property is bandits whose arms are coins of unknown bias. The following discussion makes this notion more precise.

An example of a bandit with decreasing returns: Coins

We define a coin to be any multi-armed bandit for which every arm $i$ has action space $A_i = \{p_i, \phi_i\}$, with $r_i(s_i, p_i) > 0$ for all $s_i \in S_i$, and satisfies the following property:

\[ r_i(s_i, p_i) \ge \sum_{s_i' \in S_i} P_i(s_i, p_i, s_i')\, r_i(s_i', p_i), \quad \forall s_i \in S_i. \]

The above super-martingale characterization of rewards intuitively suggests the decreasing returns property. In particular, it suggests that the returns from a pull in the current state are at least as large as the expected returns to a pull in a state reached subsequent to the current pull. The decreasing returns property for coins is established in the following Lemma whose proof may be found in the appendix:

Lemma 2. Coins satisfy the decreasing returns property.
That is, if $A_i = \{p_i, \phi_i\}$ and

\[ r_i(s_i, p_i) \ge \sum_{s_i' \in S_i} P_i(s_i, p_i, s_i')\, r_i(s_i', p_i), \quad \forall i,\ s_i \in S_i, \]

then

\[ E[R_i^{m+1}] - E[R_i^m] \le E[R_i^m] - E[R_i^{m-1}] \]

for all $0 < m < T$.

Returning to our motivating example of dynamic product assortment selection, we note that in estimating the bias of a binomial coin of unknown bias given some initial prior on coin bias, Bayes' rule implies that the estimated bias after $n$ observations (which generate the filtration $F_n$), $\mu_{n+1}$, satisfies $E[\mu_{n+1} \mid F_n] = \mu_n$. Thus, bandits with such arms, wherein the reward from an arm is some non-negative scalar times the bias, automatically possess the decreasing returns property.

4.2 A Uniform Bound on the Price of Irrevocability for Bandits with Decreasing Returns

For convenience of exposition we assume that $T$ is even; addressing the odd case requires essentially identical proofs but cumbersome notation. We re-order the bandits in decreasing order of $E[R_i]/E[T_i]$ as in the packing heuristic. Let us define

\[ H^* = \min\Big\{ j : \sum_{i=1}^{j} E[T_i] \ge kT/2 \Big\}. \]

Thus, bandits $1, \ldots, H^*$ are those that take up approximately half the budget on total expected pulls. Next, let us define for all $i \le H^*$ random variables $\tilde{R}_i$ and $\tilde{T}_i$ according to $\tilde{R}_i = R_i$, $\tilde{T}_i = T_i$ for all $i < H^*$. We define $\tilde{R}_{H^*} = \alpha R_{H^*}$ and $\tilde{T}_{H^*} = \alpha T_{H^*}$, where

\[ \alpha = \frac{kT/2 - \sum_{i=1}^{H^*-1} E[T_i]}{E[T_{H^*}]}. \]

We begin with a preliminary lemma:

Lemma 3.

\[ \sum_{i=1}^{H^*} E[\tilde{R}_i] \ge \frac{1}{2}\, OPT(RLP(\tilde{\pi}_0)). \]

Proof. Define a function

\[ f(t) = \sum_{i=1}^{n} \frac{E[R_i]}{E[T_i]} \left( \Big(t - \sum_{j=1}^{i-1} E[T_j]\Big)^{+} \wedge E[T_i] \right), \]

where $a \wedge b = \min(a, b)$. By construction (i.e., since $E[R_i]/E[T_i]$ is non-increasing in $i$), we have that $f$ is a concave function on $[0, kT]$. Now observe that

\[ \sum_{i=1}^{H^*} E[\tilde{R}_i] = \sum_{i=1}^{H^*-1} \frac{E[R_i]}{E[T_i]} E[T_i] + \frac{E[R_{H^*}]}{E[T_{H^*}]} \Big( kT/2 - \sum_{i=1}^{H^*-1} E[T_i] \Big) = f(kT/2). \]

Next, observe that

\[ OPT(RLP(\tilde{\pi}_0)) = \sum_{i=1}^{n} \frac{E[R_i]}{E[T_i]} E[T_i] = f(kT). \]

By the concavity of $f$ and since $f(0) = 0$, we have that $f(kT/2) \ge \frac{1}{2} f(kT)$, which yields the result.
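The concavity argument in the proof of Lemma 3 can be checked numerically on a small instance. The sketch below is illustrative only: the function name `f` mirrors the proof, but the expected rewards and expected pull counts are hypothetical values chosen so that the ratios $E[R_i]/E[T_i]$ are non-increasing, and the budget is assumed to equal the total expected number of pulls (playing the role of $kT$).

```python
def f(t, exp_rewards, exp_pulls):
    """Greedy fractional packing value f(t) from the proof of Lemma 3.

    Arms are assumed to be sorted so that exp_rewards[i] / exp_pulls[i]
    is non-increasing in i.
    """
    total = 0.0
    used = 0.0  # sum of E[T_j] over arms j preceding the current arm
    for r, tau in zip(exp_rewards, exp_pulls):
        filled = min(max(t - used, 0.0), tau)  # (t - used)^+ capped at E[T_i]
        total += (r / tau) * filled
        used += tau
    return total


# Hypothetical example values (not taken from the paper).
exp_rewards = [6.0, 4.0, 3.0, 1.0]  # E[R_i]
exp_pulls = [2.0, 2.0, 3.0, 3.0]    # E[T_i]; ratios are 3, 2, 1, 1/3
budget = sum(exp_pulls)             # plays the role of kT

half_value = f(budget / 2, exp_rewards, exp_pulls)
full_value = f(budget, exp_rewards, exp_pulls)
print(half_value, full_value, half_value >= 0.5 * full_value)
```

Because the arms are filled in order of decreasing reward per expected pull, the first half of the budget captures at least half of the total value, which is exactly the inequality $f(kT/2) \ge \frac{1}{2} f(kT)$.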
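We next compare the expected reward earned by a certain subset of bandits with indices no larger than $H^*$. The significance of the subset of bandits we define will be seen later in the proof of Lemma 6: we will see there that all bandits in this subset begin operation prior to time $T/2$ in a run of the packing heuristic. In particular, define

\[ R_{1/2} = \sum_{i=1}^{H^*} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R_i. \]

Lemma 4.

\[ E[R_{1/2}] \ge \frac{1}{4}\, OPT(RLP(\tilde{\pi}_0)). \]

Proof. We have:

\[
\begin{aligned}
E[R_{1/2}] &\overset{(a)}{=} \sum_{i=1}^{H^*} \Pr\Big( \sum_{j=1}^{i-1} T_j < kT/2 \Big) E[R_i] \\
&\overset{(b)}{\ge} \sum_{i=1}^{H^*} \Pr\Big( \sum_{j=1}^{i-1} T_j < kT/2 \Big) E[\tilde{R}_i] \\
&\overset{(c)}{=} \sum_{i=1}^{H^*} \Pr\Big( \sum_{j=1}^{i-1} \tilde{T}_j < kT/2 \Big) E[\tilde{R}_i] \\
&\overset{(d)}{\ge} \sum_{i=1}^{H^*} \Big( 1 - \frac{\sum_{j=1}^{i-1} E[\tilde{T}_j]}{kT/2} \Big) E[\tilde{R}_i] \\
&= \sum_{i=1}^{H^*} E[\tilde{R}_i] - \sum_{i=1}^{H^*} \sum_{j=1}^{i-1} \frac{E[\tilde{T}_j]\, E[\tilde{R}_i]}{kT/2} \\
&\overset{(e)}{\ge} \sum_{i=1}^{H^*} E[\tilde{R}_i] - \frac{1}{2} \sum_{i=1}^{H^*} \sum_{j=1,\, j \ne i}^{H^*} \frac{E[\tilde{T}_j]\, E[\tilde{R}_i]}{kT/2} \\
&\overset{(f)}{\ge} \sum_{i=1}^{H^*} E[\tilde{R}_i] - \frac{1}{2} \sum_{i=1}^{H^*} E[\tilde{R}_i] = \frac{1}{2} \sum_{i=1}^{H^*} E[\tilde{R}_i] \\
&\overset{(g)}{\ge} \frac{1}{4}\, OPT(RLP(\tilde{\pi}_0)).
\end{aligned}
\]

Equality (a) follows from the fact that under policy $\mu^R$, $R_i$ is independent of $T_j$ for $j < i$. Inequality (b) follows from our definition of $\tilde{R}_i$: $\tilde{R}_i \le R_i$. Equality (c) follows from the fact that by definition $\tilde{T}_i = T_i$ for all $i < H^*$. Inequality (d) invokes Markov's inequality. Inequality (e) is the critical step in establishing the result and uses the simple symmetrization idea exploited by Dean et al. (2004). In particular, we observe that since $E[R_i]/E[T_i] \le E[R_j]/E[T_j]$ for $i > j$, it follows that $E[R_i]E[T_j] \le \frac{1}{2}\big(E[R_i]E[T_j] + E[R_j]E[T_i]\big)$ for $i > j$; the same holds for $\tilde{R}$ and $\tilde{T}$, since scaling both by $\alpha$ leaves the ratios unchanged. Replacing every term of the form $E[\tilde{R}_i]E[\tilde{T}_j]$ (with $i > j$) in the expression preceding inequality (e) with the upper bound $\frac{1}{2}\big(E[\tilde{R}_i]E[\tilde{T}_j] + E[\tilde{R}_j]E[\tilde{T}_i]\big)$ yields inequality (e). Inequality (f) follows from the fact that $\sum_{i=1}^{H^*} E[\tilde{T}_i] = kT/2$. Inequality (g) follows from Lemma 3.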
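To make the symmetrization behind inequality (e) explicit, the following short derivation may help. The shorthand $a_i = E[\tilde{R}_i]$ and $b_j = E[\tilde{T}_j]$ is introduced here for illustration only and does not appear in the original; the only fact used is that $a_i / b_i$ is non-increasing in $i$ with $b_i > 0$, so that $a_i b_j \le a_j b_i$ whenever $i > j$.

```latex
\begin{align*}
\sum_{i=1}^{H^*} \sum_{j=1}^{i-1} a_i b_j
  &= \sum_{i > j} \left( \tfrac{1}{2} a_i b_j + \tfrac{1}{2} a_i b_j \right) \\
  &\le \sum_{i > j} \tfrac{1}{2} \left( a_i b_j + a_j b_i \right)
     && \text{since } a_i b_j \le a_j b_i \text{ for } i > j \\
  &= \tfrac{1}{2} \sum_{i > j} a_i b_j + \tfrac{1}{2} \sum_{j > i} a_i b_j
     && \text{relabeling } i \leftrightarrow j \text{ in the second sum} \\
  &= \tfrac{1}{2} \sum_{i=1}^{H^*} \sum_{j=1,\, j \ne i}^{H^*} a_i b_j .
\end{align*}
```

Dividing through by $kT/2$ gives the double sum that is subtracted after inequality (e).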
Before moving on to our main Lemma, which translates the above guarantees into a guarantee on the performance of the packing heuristic, we need to establish one additional technical fact. Recall that $R_i^m$ is the reward earned by bandit $i$ in the first $m$ pulls of this bandit. Exploiting the assumed decreasing returns property, we have the following Lemma whose proof may be found in the appendix:

Lemma 5. For bandits satisfying the decreasing returns property (Property 1),

\[ E\Big[ \sum_{i=1}^{H^*} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R_i^{T/2} \Big] \ge \frac{1}{2} E[R_{1/2}]. \]

We have thus far established estimates for total expected rewards earned assuming implicitly that bandits are pulled in a serial fashion in order of their rank. The following Lemma connects these estimates to the expected reward earned under the $\mu^{packing}$ policy (given by the packing heuristic) using a simple sample path argument. In particular, the following Lemma shows that the expected rewards under the $\mu^{packing}$ policy are at least as large as $E\big[ \sum_{i=1}^{H^*} \mathbf{1}\{ \sum_{j=1}^{i-1} T_j < kT/2 \} R_i^{T/2} \big]$.

Lemma 6.

\[ E[J^{\mu^{packing}}(s, 0) \mid s \sim \tilde{\pi}_0] \ge E\Big[ \sum_{i=1}^{H^*} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R_i^{T/2} \Big]. \]

Proof. For a given sample path of the system define

\[ h = H^* \wedge \min\Big\{ i : \sum_{j=1}^{i} T_j \ge kT/2 \Big\}. \]

On this sample path, it must be that:

\[ \sum_{i=1}^{H^*} \mathbf{1}\Big\{ \sum_{j=1}^{i-1} T_j < kT/2 \Big\} R_i^{T/2} = \sum_{i=1}^{h} R_i^{T/2}. \tag{4.1} \]

We claim that arms $1, 2, \ldots, h$ are all first pulled at times $t < T/2$ under $\mu^{packing}$. Assume to the contrary that this were not the case, and recall that arms are considered in order of index under $\mu^{packing}$, so that an arm with index $i$ is pulled for the first time no later than the first time arm $l$ is pulled for $l > i$. Let $h'$ be the highest arm index among the arms pulled at time $t = T/2 - 1$, so that $h' < h$. It must be that $\sum_{j=1}^{h'} T_j \ge kT/2$. But then

\[ H^* \wedge \min\Big\{ i : \sum_{j=1}^{i} T_j \ge kT/2 \Big\} \le h', \]

which is a contradiction. Thus, since every one of the arms $1, 2, \ldots, h$ is first pulled at times $t < T/2$, each such arm may be pulled for at least $T/2$ time steps prior to time $T$ (the horizon). Consequently, we have that the total rewards earned on this sample path under policy $\mu^{packing}$ are at least

\[ \sum_{i=1}^{h} R_i^{T/2}. \]
Using identity (4.1) and taking an expectation over sample paths yields the result.

We are ready to establish our main Theorem, which provides a uniform bound on the performance loss incurred in using the packing heuristic policy relative to an optimal policy with no restrictions on exploration. In particular, we have that the price of irrevocability is uniformly bounded for bandits satisfying the decreasing returns property.

Theorem 1. For multi-armed bandits satisfying the decreasing returns property (Property 1),

\[ E[J^{\mu^{packing}}(s, 0) \mid s \sim \tilde{\pi}_0] \ge \frac{1}{8}\, E[J^*(s, 0) \mid s \sim \tilde{\pi}_0] \]

for all initial state distributions $\tilde{\pi}_0$.

Proof. We have from Lemmas 1, 4, 5 and 6 that

\[ E[J^{\mu^{packing}}(s, 0) \mid s \sim \tilde{\pi}_0] \ge \frac{1}{8}\, OPT(RLP(\tilde{\pi}_0)). \]

We know from Lemma 1 that $OPT(RLP(\tilde{\pi}_0)) \ge OPT(LP(\tilde{\pi}_0)) = E[J^*(s, 0) \mid s \sim \tilde{\pi}_0]$, from which the result follows.

Our analysis highlighted a structural property, decreasing returns, that is likely to afford a low price of irrevocability. The next section demonstrates computational results that suggest that in practice we may expect this price to be quite small (on the order of 5-10%) for bandits possessing this property.

5 Computational Experiments

This section presents computational experiments with the packing heuristic. We consider a number of large scale bandit problems drawn from a generative family of problems to be discussed shortly and demonstrate that the packing heuristic consistently demonstrates performance within about 5-10% of an upper bound on the performance of an unrestricted (i.e., potentially non-irrevocable) optimal solution to the multi-armed bandit problem. In particular, this suggests that the price of irrevocability is likely to be small in practice, at least for models of the type we consider here. Since the bandits considered in our experiments, Binomial coins of uncertain bias, are among the most widely used applications of the multi-armed bandit model, we view this to be a positive result.

The Generative Model: We consider multi-armed bandit problems with $n$ arms, up to $k$ of which may be pulled simultaneously at any time.
The $i$th arm corresponds to a Binomial($m$, $P_i$) coin where $m$ is fixed and known, and $P_i$ is unknown but drawn from a Dirichlet($\alpha_i$, $\beta_i$) prior distribution. Assuming we choose to pull arm $i$ at some point, we realize a random outcome $M_i \in \{0, 1, \ldots, m\}$. $M_i$ is a Binomial($m$, $P_i$) random variable where $P_i$ is itself a Dirichlet($\alpha_i$, $\beta_i$) random variable. We receive a reward of $r_i M_i$ and update the prior distribution parameters according to $\alpha_i \leftarrow \alpha_i + M_i$, $\beta_i \leftarrow \beta_i + m - M_i$. By selecting the initial values of $\alpha_i$ and $\beta_i$ for each arm appropriately we can control for the initial uncertainty in the value of $P_i$. This model is, for instance, applicable to the dynamic assortment selection problem discussed earlier (see Caro and Gallien (2007)), with each coin representing a product of uncertain popularity and $M_i$ representing the uncertain number of product sales over a single period in which that product is offered for sale. We recall from our previous discussion that this family of bandits satisfies the decreasing returns property, and from our performance analysis we expect a reasonable price of irrevocability.
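The update rule above, and the martingale property of the estimated bias that underlies decreasing returns for coins, can be illustrated with a few lines of exact arithmetic. This is a minimal sketch under stated assumptions: the two-parameter Dirichlet prior is treated as a Beta($\alpha$, $\beta$) distribution, the function names are hypothetical, and the prior values are made up for the example (only $m = 2$ comes from the experiments).

```python
from fractions import Fraction
from math import comb


def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1)."""
    out = Fraction(1)
    for step in range(n):
        out *= x + step
    return out


def beta_binomial_pmf(j, m, alpha, beta):
    """P(M = j) when M ~ Binomial(m, P) and P ~ Beta(alpha, beta)."""
    return comb(m, j) * rising(alpha, j) * rising(beta, m - j) / rising(alpha + beta, m)


def update(alpha, beta, m, sales):
    """Posterior parameters after observing `sales` successes in m trials."""
    return alpha + sales, beta + m - sales


def posterior_mean(alpha, beta):
    """Current estimate of the coin's bias."""
    return alpha / (alpha + beta)


# Hypothetical prior for one arm; m = 2 as in the experiments.
alpha, beta, m = Fraction(1, 5), Fraction(4, 5), 2

# Expected value of the updated bias estimate after one pull.
expected_next_mean = sum(
    beta_binomial_pmf(j, m, alpha, beta) * posterior_mean(*update(alpha, beta, m, j))
    for j in range(m + 1)
)
print(posterior_mean(alpha, beta), expected_next_mean)
```

Exact fractions are used so that the equality between the current estimate and the expected updated estimate can be checked without floating-point tolerance.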
Summary of Computational Experiments

Columns: Coeff. of Variation (cv), Arms ($n$), Simultaneous Pulls ($k$), Horizon ($T$), Performance ($J^{\mu^{packing}}/J^*$). Row groups: Moderate (cv = 1), High (cv = 2.5).

Table 1: Computational Summary. Each row represents summary statistics for 100 distinct random bandit problems with the specified $n$, $k$, $T$ and cv parameters. Performance for each instance was computed from 3000 simulations of that instance. Performance figures thus represent an average over the generative family with the specified $n$, $k$, $T$ and cv parameters as well as over system randomness.

We consider the following random instances of the above problem. We consider bandits with $(n, k) \in \{(500, 50), (500, 100), (100, 10), (100, 20)\}$. These dimensions are representative of large scale applications, of which the dynamic assortment problem is an example. For each value of $(n, k)$ we consider time horizons $T = 25$ and $T = 40$. For every bandit problem we consider, we subdivide the arms of the bandit into 10 groups. All arms within a group have identical statistical structure, that is, identical $r_i$ values and identical initial values of $\alpha_i$ and $\beta_i$. For each value of $(n, k, T)$, we generate a number of problem instances by randomly drawing prior parameters for bandit arms. In particular, for all arms in a given group we select $\alpha$ uniformly in the interval $[0.05, 0.35]$ and then select that value of $\beta$ which results in a prior co-efficient of variation cv $\in \{1, 2.5\}$. These co-efficients of variation represent, respectively, a moderate and a high degree of a-priori uncertainty in coin bias (or, in the context of the dynamic assortment application, product popularity). In addition, $r_i$ is drawn uniformly on $[0, 2]$ and we take $m = 2$. We generate 100 random problem instances for each co-efficient of variation. Control policies for a given bandit problem instance are evaluated over 3000 random state trajectories (which resulted in 98% confidence intervals that were at least within +/-1% of the sample average).
Evaluating Performance: A striking feature of our performance results is that the price of irrevocability is quite small, a trend that appears to hold over varying parameter regimes. In particular, we make the following observations:
- Consider problems with a small number of arms (100) with a large number of simultaneous pulls (20) allowed. Intuitively, an optimal policy could reasonably explore all arms in this setting before settling on the best arms. We thus expect the price of irrevocability to be high here. Even in this regime we find that the price of irrevocability is only about % of optimal performance.
- Consider problems with a high degree of a-priori uncertainty in coin bias. Mistakes, that is, discarding an arm that is performing reasonably in favor of an unexplored arm that turns out to perform poorly, are particularly expensive in such problems. With a high co-efficient of variation in the prior on initial arm bias, the price of irrevocability is indeed somewhat higher but continues to remain within % of optimal performance.
- For each of our experiments, we observe that keeping all other parameters fixed, relative performance improves with a longer time horizon. This is intuitive; with longer horizons, one may delay discarding an arm until one is sure that the arm performs poorly relative to the expected value of the available alternatives.

Finally, we note that the performance figures we report are relative to an upper bound on optimal policy performance. Computing the optimal policy is itself an intractable task. The performance observed here suggests that, at least for bandit problems with decreasing returns, the packing heuristic is a viable approximation scheme even when irrevocability is not necessarily a concern. We can thus conclude that the price of irrevocability is small for a useful class of multi-armed bandit problems and that the packing heuristic performs well for this class of problems.

A final concern is computational effort. In particular, for the largest problem instance we considered ($n = 500$), the linear program we need to solve has 3.2 million variables and about the same number of constraints. Even a commercial linear programming solver (such as CPLEX) equipped with the ability to exploit structure in this program will require several hours on a powerful computer to solve this program.
This is in stark contrast with an index based heuristic (such as Whittle's heuristic) that solves a simple dynamic program for each arm at every time step. In the next section we develop an efficient computational algorithm for the solution of $RLP(\tilde{\pi}_0)$ that requires substantially less effort than even Whittle's heuristic and takes a few minutes to solve the aforementioned program on a laptop computer.

6 Fast Computation

This section considers the computational effort required to implement the packing heuristic. We develop a computational scheme that makes the packing heuristic substantially easier to implement than popular index heuristics such as Whittle's heuristic, and thus establish that the heuristic is viable from a computational perspective.

The key computational step in implementing the packing heuristic is the solution of the linear program $RLP(\tilde{\pi}_0)$. Assuming that $|S_i| = O(S)$ and $|A_i| = O(A)$ for all $i$, this linear program has $O(nTAS)$ variables, and each Newton iteration of a general purpose interior point method will require $O((nTAS)^3)$ steps. An interior point method that exploits the fact that bandit arms are
coupled via a single constraint will require $O(n(TAS)^3)$ computational steps at each iteration. We develop a combinatorial scheme to solve this linear program that is in spirit similar to the classical Dantzig-Wolfe dual decomposition algorithm. In contrast with Dantzig-Wolfe decomposition, our scheme is efficient. In particular, the scheme requires $O(nTAS^2 \log(kT))$ computational steps to solve $RLP(\tilde{\pi}_0)$, making it a significantly faster solution alternative to the schemes alluded to above. Equipped with this fast scheme, it is notable that using the packing heuristic requires $O(nAS^2 \log(kT))$ computations per time step amortized over the time horizon, which will typically be substantially less than the $\Theta(nAS^2 T)$ computations required per time step for index policy heuristics such as Whittle's heuristic.

Our scheme employs a dual decomposition of $RLP(\tilde{\pi}_0)$. The key technical difficulty we must overcome in developing our computational scheme for the solution of $RLP(\tilde{\pi}_0)$ is the non-differentiability of the dual function corresponding to $RLP(\tilde{\pi}_0)$ at an optimal dual solution, which prevents us from recovering an optimal or near optimal policy by direct minimization of the dual function.

6.1 An Overview of the Scheme

For each bandit arm $i$, define the polytope $D_i(\tilde{\pi}_0) \subset \mathbb{R}^{|S_i||A_i|T}$ of permissible state-action frequencies for that bandit arm, specified via the constraints of $RLP(\tilde{\pi}_0)$ relevant to that arm. A point within this polytope, $\pi_i$, corresponds to a set of valid state-action frequencies for the $i$th bandit arm. With some abuse of notation, we denote the expected reward from this arm under $\pi_i$ by the value function:

\[ R_i(\pi_i) = \sum_{t=0}^{T-1} \sum_{s_i, a_i} \pi_i(s_i, a_i, t)\, r_i(s_i, a_i). \]

In addition, denote the expected number of pulls of bandit arm $i$ under $\pi_i$ by

\[ T_i(\pi_i) = T - \sum_{s_i} \sum_{t} \pi_i(s_i, \phi_i, t). \]

We understand that both $R_i(\cdot)$ and $T_i(\cdot)$ are defined over the domain $D_i(\tilde{\pi}_0)$. We may thus rewrite $RLP(\tilde{\pi}_0)$ in the following form:

\[ \max\ \sum_i R_i(\pi_i) \quad \text{s.t.}\ \sum_i T_i(\pi_i) \le kT. \tag{6.1} \]

The Lagrangian dual of this program is $DRLP(\tilde{\pi}_0)$:

\[ \min\ \lambda kT + \sum_i \max_{\pi_i} \big( R_i(\pi_i) - \lambda T_i(\pi_i) \big) \quad \text{s.t.}\ \lambda \ge 0. \]

The above program is convex. In particular, the objective is a convex function of $\lambda$.
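The structure of the dual objective can be sketched on a toy instance. The code below is not the computational scheme developed in this section; it is a plain evaluation of the dual objective on a grid of $\lambda$ values, under several assumptions that are ours rather than the paper's: each arm is a Bernoulli coin ($m = 1$) with a Beta belief, idling leaves an arm's belief unchanged, and the inner maximization over $\pi_i$ is taken over all single-arm policies. All names and numbers are hypothetical.

```python
from functools import lru_cache


def arm_value(alpha, beta, reward, horizon, lam):
    """Max over single-arm policies of E[reward] - lam * E[number of pulls]."""

    @lru_cache(maxsize=None)
    def value(t, a, b):
        if t == horizon:
            return 0.0
        p = a / (a + b)              # current estimate of the coin's bias
        idle = value(t + 1, a, b)    # assumption: idling leaves the belief unchanged
        pull = (reward * p - lam
                + p * value(t + 1, a + 1, b)
                + (1 - p) * value(t + 1, a, b + 1))
        return max(idle, pull)

    return value(0, alpha, beta)


def dual(lam, arms, k, horizon):
    """Dual objective lam * k * T + sum_i max_{pi_i} (R_i(pi_i) - lam * T_i(pi_i))."""
    return lam * k * horizon + sum(
        arm_value(a, b, r, horizon, lam) for (a, b, r) in arms
    )


# Hypothetical toy instance: (alpha_i, beta_i, r_i) for each arm.
arms = [(1.0, 1.0, 1.0), (1.0, 3.0, 2.0), (2.0, 1.0, 0.5)]
k, horizon = 1, 6

grid = [0.05 * j for j in range(41)]  # lambda values in [0, 2]
values = [dual(lam, arms, k, horizon) for lam in grid]
best_lam = grid[values.index(min(values))]
print(best_lam, min(values))
```

For each fixed $\lambda$ the problem decouples across arms, and each arm's term is a maximum of functions that are affine in $\lambda$, which is why the dual objective is convex. By weak duality, the smallest value found on the grid is an upper bound on the optimal value of the relaxed program for this toy instance.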
We will show that strong duality applies to the dual pair of programs above, so that the optimal solutions to the two programs have identical value. Next, we will observe that for a given value of $\lambda$, it is simple to compute $\max_{\pi_i} \big( R_i(\pi_i) - \lambda T_i(\pi_i) \big)$ via the solution of a dynamic program over the state space of arm $i$ (a fast procedure). Finally it is simple to derive useful a-priori lower and upper
More informationChapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems
Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons
More informationMarkov Chain Monte Carlo Lecture 6
where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways
More informationBasically, if you have a dummy dependent variable you will be estimating a probability.
ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy
More informationPsychology 282 Lecture #24 Outline Regression Diagnostics: Outliers
Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.
More informationSimultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals
Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,
More informationLimited Dependent Variables
Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages
More informationEconomics 101. Lecture 4 - Equilibrium and Efficiency
Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017
U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that
More informationReal-Time Systems. Multiprocessor scheduling. Multiprocessor scheduling. Multiprocessor scheduling
Real-Tme Systems Multprocessor schedulng Specfcaton Implementaton Verfcaton Multprocessor schedulng -- -- Global schedulng How are tasks assgned to processors? Statc assgnment The processor(s) used for
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationStanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011
Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected
More informationOn the Multicriteria Integer Network Flow Problem
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of
More informationPerfect Competition and the Nash Bargaining Solution
Perfect Competton and the Nash Barganng Soluton Renhard John Department of Economcs Unversty of Bonn Adenauerallee 24-42 53113 Bonn, Germany emal: rohn@un-bonn.de May 2005 Abstract For a lnear exchange
More informationThe Geometry of Logit and Probit
The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationP R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /
Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons
More informationDifference Equations
Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1
More informationa b a In case b 0, a being divisible by b is the same as to say that
Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :
More informationA PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS
HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,
More informationInteractive Bi-Level Multi-Objective Integer. Non-linear Programming Problem
Appled Mathematcal Scences Vol 5 0 no 65 3 33 Interactve B-Level Mult-Objectve Integer Non-lnear Programmng Problem O E Emam Department of Informaton Systems aculty of Computer Scence and nformaton Helwan
More informationTime-Varying Systems and Computations Lecture 6
Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy
More informationANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)
Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of
More informationYong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )
Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often
More informationCase A. P k = Ni ( 2L i k 1 ) + (# big cells) 10d 2 P k.
THE CELLULAR METHOD In ths lecture, we ntroduce the cellular method as an approach to ncdence geometry theorems lke the Szemeréd-Trotter theorem. The method was ntroduced n the paper Combnatoral complexty
More informationGrover s Algorithm + Quantum Zeno Effect + Vaidman
Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the
More informationSome modelling aspects for the Matlab implementation of MMA
Some modellng aspects for the Matlab mplementaton of MMA Krster Svanberg krlle@math.kth.se Optmzaton and Systems Theory Department of Mathematcs KTH, SE 10044 Stockholm September 2004 1. Consdered optmzaton
More informationOnline Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting
Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group
More informationMotion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong
Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal
More informationSupplementary Notes for Chapter 9 Mixture Thermodynamics
Supplementary Notes for Chapter 9 Mxture Thermodynamcs Key ponts Nne major topcs of Chapter 9 are revewed below: 1. Notaton and operatonal equatons for mxtures 2. PVTN EOSs for mxtures 3. General effects
More informationEquilibrium with Complete Markets. Instructor: Dmytro Hryshko
Equlbrum wth Complete Markets Instructor: Dmytro Hryshko 1 / 33 Readngs Ljungqvst and Sargent. Recursve Macroeconomc Theory. MIT Press. Chapter 8. 2 / 33 Equlbrum n pure exchange, nfnte horzon economes,
More information6.854J / J Advanced Algorithms Fall 2008
MIT OpenCourseWare http://ocw.mt.edu 6.854J / 18.415J Advanced Algorthms Fall 2008 For nformaton about ctng these materals or our Terms of Use, vst: http://ocw.mt.edu/terms. 18.415/6.854 Advanced Algorthms
More informationComparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method
Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method
More informationSingular Value Decomposition: Theory and Applications
Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real
More information18.1 Introduction and Recap
CS787: Advanced Algorthms Scrbe: Pryananda Shenoy and Shjn Kong Lecturer: Shuch Chawla Topc: Streamng Algorthmscontnued) Date: 0/26/2007 We contnue talng about streamng algorthms n ths lecture, ncludng
More informationConvergence of random processes
DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large
More informationComputing Correlated Equilibria in Multi-Player Games
Computng Correlated Equlbra n Mult-Player Games Chrstos H. Papadmtrou Presented by Zhanxang Huang December 7th, 2005 1 The Author Dr. Chrstos H. Papadmtrou CS professor at UC Berkley (taught at Harvard,
More informationEstimation: Part 2. Chapter GREG estimation
Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the
More informationStatistics II Final Exam 26/6/18
Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the
More informationLecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.
prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove
More informationFor now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.
Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson
More information3.1 ML and Empirical Distribution
67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum
More information