arXiv: v2 [cs.LG] 19 Dec 2016


Satisficing in multi-armed bandit problems

Paul Reverdy, Vaibhav Srivastava, and Naomi Ehrich Leonard

Abstract: Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty. We propose two sets of satisficing objectives for the multi-armed bandit problem, where the objective is to achieve reward-based decision-making performance above a given threshold. We show that these new problems are equivalent to various standard multi-armed bandit problems with maximizing objectives and use the equivalence to find bounds on performance. The different objectives can result in qualitatively different behavior; for example, agents explore their options continually in one case and only a finite number of times in another. For the case of Gaussian rewards we show an additional equivalence between the two sets of satisficing objectives that allows algorithms developed for one set to be applied to the other. We then develop variants of the Upper Credible Limit (UCL) algorithm that solve the problems with satisficing objectives and show that these modified UCL algorithms achieve efficient satisficing performance.

I. INTRODUCTION

Engineering solutions to decision-making problems are often designed to maximize an objective function. However, in many contexts maximization of an objective function is an unreasonable goal, either because the objective itself is poorly defined or because solving the resulting optimization problem is intractable or costly. In these contexts, it is valuable to consider alternative decision-making frameworks. Herbert Simon considered alternative models of rational decision-making [30] with the goal of making them "compatible with the access to information and the computational capacities that are actually possessed by organisms, including man, in the kinds of environments in which such organisms exist." A major feature of the models he considered is what he called "satisficing." In [30], he discussed in very broad terms a variety of simplifications to the classical economic concept of rationality, most importantly the idea that payoffs should be simple, defined by doing well relative to some threshold value. In [31], he introduced the word "satisficing," a combination of the words "satisfy" and "suffice," to refer to this thresholding concept and illustrated it using a mathematical model of foraging.
He also briefly discussed how satisficing relates to problems in inventory control and more complicated decision processes like playing chess. Since Simon's pioneering work, satisficing has been studied in many fields such as psychology [29], economics [6], management science [23], [37], and ecology [36], [8].

This research has been supported in part by ONR grant N and ARO grant W911NF. P. Reverdy was supported through an NDSEG Fellowship. P. Reverdy is with the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA, preverdy@seas.upenn.edu. V. Srivastava is with the Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA, vaibhav@egr.msu.edu. N. E. Leonard is with the Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, USA, naomi@princeton.edu.

In engineering, satisficing is of interest for the same reasons that motivated its introduction in the social science literature, specifically that it can simplify decision-making problems: as compared to maximizing, it allows for less risky decision making in the face of uncertainty. Furthermore, many engineering problems are naturally posed using a satisficing objective, such as choosing a design that meets given specifications, but where the designers may be indifferent among any such designs. Satisficing is well defined even if there are several competing performance measures that trade off in complicated ways, whereas maximizing may be poorly defined without additional information about preferences.

Satisficing has been studied in the engineering literature in several contexts. In [25], the authors studied design optimization using a satisficing objective and found that it is effective in many practical fields. In [14], the authors studied control theory using a satisficing objective function, and in [38], the authors used satisficing to study optimal software design. In [10], the authors used a multi-armed bandit algorithm to construct robots that actively adapt their control policies to mitigate damage, such as actuator failures. In order to speed the convergence of their algorithm, they only sought to identify control policies with performance above a set threshold, rather than to identify an optimal policy. The theory that we develop in this paper formalizes their notion of thresholding and provides bounds on performance.
In this paper, we consider satisficing in the stochastic multi-armed bandit problem [28], for which a decision maker sequentially chooses one of a set of alternative options, called arms, and earns a reward drawn from a stationary probability distribution associated with that arm. The standard multi-armed bandit problem uses a maximizing objective on accumulated reward. For this objective there is a known performance bound in terms of expected regret, which is the expected difference between the reward received by the decision maker and the maximum reward possible. Since the standard notion of regret is defined relative to the unknown optimum, it can only be computed by an omniscient agent; this notion of regret is not computable by a decision maker faced with a multi-armed bandit problem. Nevertheless, it is a useful theoretical concept, which facilitates the analysis of algorithms designed to solve bandit problems. We extend the notion of regret to satisficing objectives and use it to analyze new algorithms.

In contrast to the standard stochastic multi-armed bandit problem, in which the agent seeks to determine, with certainty, the option with maximum mean reward, the satisficing multi-armed bandit problem seeks to determine, with a desired confidence, a satisfying option. We characterize satisficing multi-armed bandit problems using three separate features of the satisficing objective.

The first feature selects the quantity on which the satisficing objective is defined. We consider two such quantities: (i) the unknown mean reward of the selected option, and (ii) the instantaneous observed reward. The second feature retains the satisfaction aspect of the satisficing problem. In particular, it selects if the objective function should be optimizing, or if it should be satisfying. The third feature retains the sufficing aspect of the satisficing problem. In particular, it selects if the decision-making algorithm should be certain that the optimizing/satisfying criterion is met, or if it is sufficient for the algorithm to meet a desired threshold in confidence about the criterion.

Different combinations of the above three features of satisficing lead to eight satisficing objectives that we discuss in this paper. We begin by defining the four objectives for the case where the satisficing quantity is the unknown mean reward. We show that the bandit problem with each of these four objectives is equivalent to a previously-studied bandit problem and use the equivalence to derive a performance bound for the satisficing problems. These four objectives seek an arm with satisfyingly high mean reward without regard to that reward's dispersion.

To develop objectives with improved robustness properties, we then consider the case where the satisficing quantity is the instantaneous observed reward. We extend the first four objectives to this case by adding an additional layer of thresholding, which defines four more objectives. When the reward distributions belong to location-scale families, there is an equivalence between the objectives defined in terms of mean reward and the robust objectives defined in terms of instantaneous reward, which we prove for Gaussian rewards.

For simplicity of exposition, we then specialize to Gaussian multi-armed bandit problems, where the reward distributions are Gaussian with unknown mean and known variance. For such problems, we develop several modifications of the UCL algorithm that we developed in previous work [27]. These algorithms solve the problem with the satisficing-in-mean-reward objectives (and thus also with the robust objectives), and we show that these algorithms achieve efficient performance. These results extend our previous work [26] by incorporating the concept of sufficiency into the satisficing objective, as well as adding several new algorithms and their associated analysis.
The assumption of Gaussian rewards with known variance is not required, but allows us to focus on the different notions of regret, which is the main contribution of this paper. We later show how the known variance assumption can be relaxed. Our methods also extend immediately to many other important classes of reward distributions, including distributions with bounded support and sub-Gaussian distributions. We show how to extend our methods in these cases and provide references to the relevant literature for other extensions.

The remainder of the paper is structured as follows. In Section II we review the standard stochastic multi-armed bandit problem and the associated performance bounds. In Section III we propose the satisficing objectives and bound performance in terms of these objectives. In Section IV we specialize to the case of Gaussian rewards and show the equivalence between the satisficing-in-mean-reward objectives and the satisficing-in-instantaneous-observed-reward objectives. In Section V we review the UCL algorithm, and in Section VI we design modified versions of the UCL algorithm for the satisficing objectives. We show that these modified algorithms achieve efficient performance for Gaussian rewards. We show the results of numerical simulations in Section VII and in Section VIII we conclude.

II. THE STOCHASTIC MULTI-ARMED BANDIT PROBLEM

In the stochastic multi-armed bandit problem a decision-making agent sequentially chooses one among a set of $N$ options called arms in analogy with the lever of a slot machine. A single-levered slot machine is called a one-armed bandit, so the case of $N \ge 2$ options is called a multi-armed bandit. The decision-making agent collects reward $r_t \in \mathbb{R}$ by choosing arm $i_t$ at each time $t \in \{1, \ldots, T\}$, where $T \in \mathbb{N}$ is the horizon length for the sequential decision process. The reward from option $i \in \{1, \ldots, N\}$ is sampled from a stationary probability distribution $\nu_i$ and has an unknown mean $m_i \in \mathbb{R}$. The decision-maker's objective is to maximize some function of the sequence of rewards $\{r_t\}$ by sequentially picking arms $i_t$ using only the information available at time $t$.

A. Maximization objective

In the standard multi-armed bandit problem, the agent's objective is to maximize the expected cumulative reward

\[ J = \mathbb{E}\left[ \sum_{t=1}^{T} r_t \right] = \sum_{t=1}^{T} m_{i_t}. \tag{1} \]

Equivalently, by defining $m_{i^*} = \max_i m_i$ and $R_t = m_{i^*} - m_{i_t}$, the expected regret at time $t$, maximizing (1) can be formulated as minimizing the cumulative expected regret defined by

\[ \sum_{t=1}^{T} R_t = T m_{i^*} - \sum_{i=1}^{N} m_i \, \mathbb{E}\left[ n_i^T \right] = \sum_{i=1}^{N} \Delta_i \, \mathbb{E}\left[ n_i^T \right], \tag{2} \]

where $n_i^T$ is the number of times arm $i$ has been chosen up to time $T$, $\Delta_i = m_{i^*} - m_i$ is the expected regret due to picking arm $i$ instead of arm $i^*$, and the expectation is over the possible rewards and decisions made by the agent.

The interpretation of (2) is that suboptimal arms should be chosen as rarely as possible. This is a non-trivial task since the mean rewards $m_i$ are initially unknown to the decision-maker, who must try arms to learn about their rewards while preferentially picking arms that appear more rewarding. The tension between these requirements is known as the explore-exploit tradeoff and is common to many problems in machine learning and adaptive control.

B. Bound on optimal performance

Optimal performance in a bandit problem corresponds to picking suboptimal arms as rarely as possible, as shown by (2). Lai and Robbins [20] studied the standard stochastic multi-armed bandit problem and showed that any policy solving the problem must pick each suboptimal arm $i \ne i^*$ a number of times that is at least logarithmic in the time horizon $T$, i.e.,

\[ \mathbb{E}\left[ n_i^T \right] \ge \left( \frac{1}{D(\nu_i \| \nu_{i^*})} + o(1) \right) \log T, \tag{3} \]

where $o(1) \to 0$ as $T \to +\infty$. The quantity $D(\nu_i \| \nu_{i^*}) := \int \nu_i(r) \log \frac{\nu_i(r)}{\nu_{i^*}(r)} \, dr$ is the Kullback-Leibler divergence between the reward density $\nu_i$ of any suboptimal arm and the reward density $\nu_{i^*}$ of the optimal arm. Equation (3) implies that cumulative expected regret must grow at least logarithmically in time.

The bound (3) is asymptotic in time, but researchers (e.g., [4], [13], [27]) have developed algorithms that achieve cumulative expected regret that is bounded by a logarithmic term uniformly in time, sometimes with the same constant as in (3). Cumulative expected regret that is uniformly bounded in time by a logarithmic term is often called logarithmic regret for short. In the literature, algorithms that achieve logarithmic regret with a leading term that is within a constant factor of that in (3) are considered to have optimal performance.

C. Multiple plays

Anantharam et al. [2] studied a generalization of the multi-armed bandit problem in which the agent picks $k \ge 1$ arms at each time $t$, which they called the multi-armed bandit problem with multiple plays. The case $k = 1$ corresponds to the standard multi-armed bandit problem defined above. In the spirit of [2], let $\sigma$ be a permutation of $\{1, \ldots, N\}$ such that $m_{\sigma(1)} \ge m_{\sigma(2)} \ge \cdots \ge m_{\sigma(N)}$. For the multi-armed bandit problem with $k$ plays, the optimal policy with full information corresponds to picking the arms $\sigma(1), \ldots, \sigma(k)$, called the $k$-best arms [2]. In the case $k = 1$, $\sigma(1) = i^*$, the optimal arm defined above. For the case of general $k \ge 1$, the cumulative expected regret for the multi-armed bandit problem with multiple plays is defined as follows [2]:

\[ T \sum_{i=1}^{k} m_{\sigma(i)} - \sum_{i=1}^{N} m_i \, \mathbb{E}\left[ n_i^T \right], \tag{4} \]

which is a straightforward generalization of the regret (2). The suboptimal arms $\sigma(k+1), \ldots, \sigma(N)$ are called the $k$-worst arms [2]. Define $\Delta_i^{(k)} = m_{\sigma(k)} - m_i$ for each $k$-worst arm $i$. The quantity $\Delta_i^{(k)}$ is the generalization of the expected regret $\Delta_i$ for the problem with multiple plays, where the expected value of the optimal policy is that of the $k$ best arms.

As in the case of a single play, optimal performance corresponds to picking suboptimal (i.e., $k$-worst) arms as rarely as possible. By [2] each $k$-worst arm $i$ must be picked a number of times that is at least logarithmic in the time horizon $T$, i.e.,

\[ \mathbb{E}\left[ n_i^T \right] \ge \left( \frac{1}{D(\nu_i \| \nu_{\sigma(k)})} + o(1) \right) \log T. \tag{5} \]

This bound can be interpreted as a generalization of the Lai-Robbins bound (3) where the Kullback-Leibler divergence is taken with respect to the $k$-th best arm $\sigma(k)$ rather than the first best arm $\sigma(1)$ (i.e., the case $k = 1$).

D. PAC bounds

In the standard multi-armed bandit problem and the multi-armed bandit problem with multiple plays, regret is defined in terms of the unknown mean reward values $m_i$. These regret definitions imply that avoiding regret requires identifying optimal arms with certainty. The requirement to identify optimal arms with certainty is characteristic of a maximizing decision-making strategy. In contrast, a satisficing decision-making agent should seek arms that are good enough. In this context, satisficing corresponds to finding arms that are optimal with high probability rather than with certainty. The Probably Approximately Correct (PAC) model for learning introduced by Valiant [34] provides a natural way to capture this aspect of satisficing.

Even-Dar et al. [11], [12] and Mannor and Tsitsiklis [22] studied the multi-armed bandit problem using the PAC model and defined an $\epsilon$-optimal arm as one for which $m_i > m_{i^*} - \epsilon$, i.e., the mean reward is within $\epsilon$ of the optimum value. Equivalently, an $\epsilon$-optimal arm is an arm for which the expected regret $\Delta_i$ is at most $\epsilon$. Under the PAC model one wishes to find an $\epsilon$-optimal arm with probability of at least $1 - \delta$. With probability one, this can be achieved in a finite number of samples, so performance guarantees take the form of bounds on the number of samples required, which is referred to as sample complexity. In our notation, we denote sample complexity by $T$, as it is the value of the horizon length at which sampling terminates. When the rewards are Bernoulli distributed with unknown success probabilities $p_i$, the following lower bound holds [22]:

\[ \mathbb{E}[T] \ge O\left( \frac{1}{\epsilon^2} \log(1/\delta) \right). \tag{6} \]

A similar result was reported in [11] for $T$, rather than its expected value. In other words, one must sample an arm at least $\log(1/\delta)/\epsilon^2$ times to be able to declare that it is $\epsilon$-optimal with probability at least $1 - \delta$.

Similar to the work of [2] in extending Lai and Robbins' bounds [20] to the case of multiple plays, Kalyanakrishnan et al. [15] extended the work of [12] from finding the $\epsilon$-optimal arm to finding the $m$ $\epsilon$-best arms with probability at least $1 - \delta$. In [15] this problem is called Explore-$m$, and an algorithm that solves it is called $(\epsilon, m, \delta)$-optimal. Note that the problem in [12] is the special case Explore-1.
The Explore-$m$ problem is studied in [15] for rewards that are Bernoulli distributed. It is proved that, for every $(\epsilon, m, \delta)$-optimal algorithm, there exists a bandit problem on which that algorithm has worst-case sample complexity that scales at least as $(N/\epsilon^2) \log(m/8\delta)$. Specifically, it is shown that there exists a bandit problem such that the number of samples $T$ required to identify $m$ $\epsilon$-best arms obeys

\[ T \ge \frac{1}{c} \, \frac{N}{\epsilon^2} \log\left( \frac{m}{8\delta} \right), \tag{7} \]

where $c > 0$ denotes the numerical constant given in [15]. This gives a worst-case bound on the number of times all arms need to be sampled to achieve $(\epsilon, m, \delta)$-optimality. The bounds (6) and (7) were both formulated for the case of Bernoulli rewards, but it is straightforward to extend them to the case where the rewards are Gaussian distributed with unknown mean and known variance.

E. Gaussian rewards

In this paper we focus on the case of Gaussian reward distributions, where the distribution $\nu_i$ of rewards associated with arm $i$ is Gaussian with mean $m_i$, which is unknown to the decision maker, and variance $\sigma_{s,i}^2$, which is known to the decision maker from, e.g., previous observations or known measurement characteristics. Relaxation of the assumption of known variance is discussed in Remark 12. For the given case, the Kullback-Leibler divergence in (3) takes the value

\[ D(\nu_i \| \nu_{i^*}) = \frac{1}{2} \left( \frac{\Delta_i^2}{\sigma_{s,i^*}^2} + \frac{\sigma_{s,i}^2}{\sigma_{s,i^*}^2} - 1 - \log \frac{\sigma_{s,i}^2}{\sigma_{s,i^*}^2} \right). \tag{8} \]

This equation is more easily interpreted when the reward variances are uniform, i.e., $\sigma_{s,i}^2 = \sigma_s^2$ for each $i$. In some cases we assume uniform variance for simplicity of exposition, but the relevant results are readily generalized to the case of nonuniform variance. Assuming uniform variance, $D(\nu_i \| \nu_{i^*}) = \Delta_i^2 / 2\sigma_s^2$, so the bound (3) is

\[ \mathbb{E}\left[ n_i^T \right] \ge \left( \frac{2\sigma_s^2}{\Delta_i^2} + o(1) \right) \log T. \tag{9} \]

This result can be interpreted as follows. For a given value of $\Delta_i$, a larger variance $\sigma_s^2$ makes the rewards more variable and therefore it is more difficult to distinguish between the arms. For a given value of $\sigma_s^2$, a larger value of $\Delta_i$ makes it easier to distinguish arm $i$ from the optimal arm $i^*$. The expressions for the problem with multiple plays (i.e., (5)) are identical except for substituting $\sigma(k)$ for $i^*$ and $\Delta_i^{(k)}$ for $\Delta_i$.

III. THE MULTI-ARMED BANDIT PROBLEM WITH SATISFICING OBJECTIVES

We now define the multi-armed bandit problem with satisficing objectives. We propose several new satisficing notions of regret and find associated bounds on optimal performance. These notions capture two dimensions of the satisficing problem: satisfaction, i.e., the agent's desire to obtain a reward that is above a certain threshold, and sufficiency, i.e., the agent's desire to attain a level of confidence that their choice of a given arm will bring them satisfaction. We define these notions first for satisficing in mean reward and then extend them to satisficing in instantaneous reward, which we refer to as robust satisficing.

A. Satisficing in mean reward

We define satisfaction in mean reward as having an expected reward $m_{i_t}$ that is above a specified threshold value $M$. Formally, we represent satisfaction in mean reward at time $t$ by the variable $s_t$, defined as

\[ s_t = \begin{cases} 1, & \text{if } m_{i_t} \ge M \\ 0, & \text{otherwise.} \end{cases} \tag{10} \]

The threshold $M$ is a free parameter that must be specified by the decision-making agent. Let $m_{i^*} = \max_i m_i$ be the maximum expected reward from any arm. The agent can never be satisfied if $M$ is greater than $m_{i^*}$, so we assume that $M \le m_{i^*}$ to make the problem feasible.
If $M > m_{\sigma(2)}$, i.e., greater than the mean reward of the second-best arm, the arm $\sigma(1) = i^*$ is the only one that is satisfying in mean reward. As in the multi-armed bandit problem with multiple plays, let $\sigma$ be a permutation of $\{1, \ldots, N\}$ such that $m_{\sigma(1)} \ge m_{\sigma(2)} \ge \cdots \ge m_{\sigma(N)}$. Let $k$ be the largest integer such that $m_{\sigma(k)} \ge M$. The arms $\{\sigma(1), \ldots, \sigma(k)\}$ are the $k$-best arms defined by the satisfaction threshold $M$.

For each arm $i$, define the thresholded expected regret $\Delta_i^M = \max\{M - m_i, 0\}$. For each $k$-best arm, the thresholded regret is zero, and for each $k$-worst arm $i \in \{\sigma(k+1), \ldots, \sigma(N)\}$, the value $\Delta_i^M > 0$ quantifies the extent to which the arm is unsatisfying in mean rewards. Note that if $M = m_{i^*}$, $\Delta_i^M = \Delta_i$, which is the standard measure of expected regret. We refer to the $k$-best and $k$-worst arms as satisfying and non-satisfying arms, respectively. The satisfaction variable $s_t$ defined in (10) can be written as a function of the sign of $\Delta_{i_t}^M$:

\[ s_t = \begin{cases} 1, & \text{if } \Delta_{i_t}^M = 0 \\ 0, & \text{otherwise.} \end{cases} \]

The quantity $s_t$ is deterministic. However, since the agent does not know the value of $\Delta_i^M$ associated with any given arm $i$, they must learn it by sampling rewards from the various arms and updating their beliefs accordingly. Adopting a Bayesian framework, we assume $s_t$ is a realization of a binary random variable $S_t$. Due to the stochastic nature of the rewards, the agent will have less than perfect confidence in their beliefs about the value of $s_t$.

We distinguish satisficing objectives in mean reward according to the degree $\delta \in [0, 1]$ of confidence the agent seeks in their beliefs, which we call sufficiency in mean reward. We define an arm to be ($\delta$-)sufficing in mean reward if $\Pr[S_t = 1] \ge 1 - \delta$, where the probability is evaluated based on the agent's current beliefs. For non-zero values of $\delta$, the agent finds it sufficient to have finite confidence that they are satisfied, while for $\delta = 0$, the agent wants certainty that they are satisfied. The agent cannot achieve certainty in finite time, so these two cases result in qualitatively different behavior: $\delta = 0$ means the agent will never stop exploring, while $\delta > 0$ means the agent will settle on a set of acceptable options after finite time.

The satisficing-in-mean-reward objective is

\[ \sum_{t=1}^{T} \mathbf{1}\left( s_t = 1 \text{ or } \Pr[S_t = 1] > 1 - \delta \right), \tag{11} \]

where $\mathbf{1}(\cdot)$ is the indicator function. The objective (11) is maximized if, at each time $t$, a satisfying option is selected, or the probability that the option is satisfying is sufficiently high.
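As an illustration of the definitions above, the following Python sketch computes the satisfying arms, the thresholded expected regret, and a sufficiency check for a given threshold. The function names and the example values are hypothetical, and `prob_satisfying` assumes that the agent's belief about a mean reward is Gaussian, which the definitions in this section do not require.

```python
from statistics import NormalDist


def satisfying_arms(means, M):
    """Indices of the arms whose mean reward is at least the threshold M."""
    return [i for i, m in enumerate(means) if m >= M]


def thresholded_regret(means, M):
    """Thresholded expected regret max(M - m_i, 0) for every arm."""
    return [max(M - m, 0.0) for m in means]


def prob_satisfying(belief_mean, belief_std, M):
    """Pr[m_i >= M] under an ASSUMED Gaussian belief N(belief_mean, belief_std^2)."""
    return 1.0 - NormalDist(belief_mean, belief_std).cdf(M)


def is_sufficing(belief_mean, belief_std, M, delta):
    """True if the arm is delta-sufficing in mean reward under that belief."""
    return prob_satisfying(belief_mean, belief_std, M) >= 1.0 - delta


means = [1.0, 2.0, 4.0]  # hypothetical mean rewards, unknown to the agent
print(satisfying_arms(means, M=1.5))             # [1, 2]
print(thresholded_regret(means, M=1.5))          # [0.5, 0.0, 0.0]
print(thresholded_regret(means, M=max(means)))   # [3.0, 2.0, 0.0]
```

The last call sets the threshold equal to the largest mean, in which case the thresholded regret coincides with the standard expected regret, as noted in the text.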
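The event that an option is satisfying is not known a priori and must be learned by exploration. This results in an explore-exploit tradeoff as in the standard multi-armed bandit problem. To quantify the optimal explore-exploit tradeoff in the spirit of the Lai-Robbins bound we introduce the following notion of the expected satisficing regret at time $t$, $R_t$, defined by

\[ R_t = \begin{cases} \Delta_{i_t}^M, & \text{if } \Pr[S_t = 1] \le 1 - \delta, \\ 0, & \text{otherwise.} \end{cases} \tag{12} \]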

If the agent is insufficiently certain of being satisfied by the choice of $i_t$, they incur expected regret of $\Delta_{i_t}^M$. Otherwise, they incur no regret. We define the satisficing-in-mean-reward multi-armed bandit problem in terms of minimizing cumulative expected satisficing regret.

Definition 1 (Satisficing-in-mean-reward multi-armed bandit problem). The satisficing-in-mean-reward multi-armed bandit problem is to minimize the cumulative sum of the expected satisficing regret (12):

\[ J_R = \mathbb{E}\left[ \sum_{t=1}^{T} R_t \right]. \tag{13} \]

The satisficing-in-mean-reward bandit problem has two parameters: $M$ and $\delta$. These parameters characterize the agent's thresholds for satisfaction and sufficiency, respectively. For purposes of analysis we distinguish four cases as a function of the parameter values. For the satisfaction threshold $M \in \mathbb{R}$, the first case is setting $M > m_{\sigma(2)}$, while the second case is setting $M \le m_{\sigma(2)}$. For the sufficiency threshold $\delta \in [0, 1]$, the first case is the certainty value $\delta = 0$, while the second case is $\delta \in (0, 1]$.

Table I summarizes the four problems that result from the interaction of the two dimensions of satisfaction and sufficiency. Problem 1 sets the satisfaction threshold $M > m_{\sigma(2)}$ and the sufficiency threshold $\delta = 0$, which results in a standard bandit problem. We call Problem 2, with $M \le m_{\sigma(2)}$ and $\delta = 0$, satisfaction-in-mean-reward. We call Problem 3, with $M > m_{\sigma(2)}$ and $\delta \in (0, 1]$, $\delta$-sufficing. Finally, we call Problem 4, with $M \le m_{\sigma(2)}$ and $\delta \in (0, 1]$, $(M, \delta)$-satisficing.

Remark 1. We note that the distinction between Problems 1 and 2 and between Problems 3 and 4 is only due to the range of values $M$ can take. These problems can be thought of as a single problem in which the choice of $M$ dictates the cardinality of the set of satisfying arms. However, the two ranges of thresholds $M > m_{\sigma(2)}$ and $M \le m_{\sigma(2)}$ allow us to clearly contrast the satisficing problem with the standard problem. Assuming $M > m_{\sigma(2)}$ in Problems 1 and 3 is equivalent to assuming that the agent seeks the (unknown) highest mean reward, which is consistent with the standard problem. The policies we define for Problems 1 and 3 do not rely on a known threshold $M$. Assuming $M \le m_{\sigma(2)}$ is equivalent to assuming that the agent seeks to meet a (known) desired mean reward threshold. The policies we define for Problems 2 and 4 do rely on the threshold $M$. These same assumptions analogously distinguish Problems 5 and 7 from Problems 6 and 8 defined in Section III-B.
However, unlike the policies for Problems 1 and 3, the policies defined for Problems 5 and 7 do rely on $M > m_{\sigma(2)}$ being known. We do not assume in any of the problems that the agent knows the permutation $\sigma$, so no policies depend on $\sigma$.

We develop performance bounds for each of these problems in terms of corollaries of the performance bounds presented in Section II. For the problems with $\delta = 0$, these bounds show that cumulative expected regret must grow at least at a logarithmic rate, while for the problems with $\delta > 0$, finite regret is possible.

Problem 1: Standard bandit. The satisficing-in-mean-reward multi-armed bandit problem with $M > m_{\sigma(2)}$ and $\delta = 0$ is a standard multi-armed bandit problem. Therefore, for this problem, the Lai-Robbins bound (3) holds, and the expected number of times a suboptimal arm $i$ is chosen obeys

\[ \mathbb{E}\left[ n_i^T \right] \ge \left( \frac{1}{D(\nu_i \| \nu_{i^*})} + o(1) \right) \log T. \]

As a direct consequence, the cumulative expected satisficing regret (13) grows at least logarithmically with time horizon $T$:

\[ J_R \ge \sum_{i \ne i^*} \left( \frac{\Delta_i^M}{D(\nu_i \| \nu_{i^*})} + o(1) \right) \log T. \]

Problem 2: Satisfaction-in-mean-reward. The satisfaction-in-mean-reward problem, defined as the satisficing-in-mean-reward multi-armed bandit problem where $M \le m_{\sigma(2)}$ and $\delta = 0$, also has a logarithmic lower bound on the cumulative expected satisficing regret:

Corollary 2 (Satisfaction-in-mean-reward regret bound). The satisfaction-in-mean-reward problem is a satisficing-in-mean-reward multi-armed bandit problem where the objective (13) is defined with $M \le m_{\sigma(2)}$ and $\delta = 0$. Any policy solving the satisfaction-in-mean-reward problem obeys

\[ \mathbb{E}\left[ n_i^T \right] \ge \left( \frac{1}{D(\nu_i \| \nu_{\sigma(k)})} + o(1) \right) \log T \tag{14} \]

for each non-satisfying arm $i$, where $\sigma$ is a permutation of $\{1, \ldots, N\}$ such that $m_{\sigma(1)} \ge m_{\sigma(2)} \ge \cdots \ge m_{\sigma(N)}$ and $k$ is the largest integer such that $m_{\sigma(k)} \ge M$.

Proof: The definition of satisfaction (10) implies that performance bounds for the satisfaction-in-mean-reward problem and the multi-armed bandit problem with multiple plays are equivalent. Given a problem instance, the threshold $M$ induces the number $k$ of satisfying arms, so performance can be analyzed as in the problem with multiple plays. The bound (5) applies to the problem with multiple plays and the equivalence implies the result.
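To give a sense of scale for the bound (14), the following Python sketch evaluates its leading term for Gaussian rewards with uniform known variance, using the substitution described at the end of Section II-E (the $k$-th best arm in place of the optimal arm in (9)). The function name and example values are hypothetical, and the $o(1)$ term is ignored, so the numbers are only indicative.

```python
import math


def pull_lower_bounds(means, M, sigma_s, horizon):
    """Leading term 2*sigma_s^2 / gap^2 * log(T) for each non-satisfying arm.

    gap is the difference between the mean of the k-th best arm (the
    lowest-mean arm that still meets the threshold M) and the arm's mean.
    """
    satisfying = [m for m in means if m >= M]
    if not satisfying:
        raise ValueError("infeasible threshold: no arm has mean >= M")
    m_kth_best = min(satisfying)
    bounds = {}
    for i, m in enumerate(means):
        if m < M:  # arm i is non-satisfying (k-worst)
            gap = m_kth_best - m
            bounds[i] = 2.0 * sigma_s ** 2 / gap ** 2 * math.log(horizon)
    return bounds


# Hypothetical instance: arms 1 and 2 satisfy M = 0.5, arm 0 does not.
print(pull_lower_bounds([0.0, 1.0, 2.0], M=0.5, sigma_s=1.0, horizon=1000))
```

Only the non-satisfying arms appear in the result, since satisfying arms incur no thresholded regret and are not constrained by (14).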
Problem 3: $\delta$-sufficing. The $\delta$-sufficing problem, defined as the satisficing-in-mean-reward multi-armed bandit problem where $M > m_{\sigma(2)}$ and $\delta \in (0, 1]$, admits policies that achieve cumulative expected regret that is a bounded function of $T$:

Corollary 3 ($\delta$-sufficing regret bound). The $\delta$-sufficing problem is a satisficing-in-mean-reward multi-armed bandit problem where the objective (13) is defined with $M > m_{\sigma(2)}$ and $\delta \in (0, 1]$. Any policy solving the $\delta$-sufficing problem obeys

\[ n_i^T \ge O\left( \frac{1}{\epsilon^2} \log(1/\delta) \right) \tag{15} \]

for each suboptimal arm $i$, where $\epsilon = \Delta_i^M = M - m_i$.

Proof: The definition of satisfaction (10) in the $\delta$-sufficing problem implies that the agent incurs regret if the arm selected is not $(\epsilon = 0, \delta)$-optimal. The bound (6) thus provides a lower bound on the number of times the agent must incur regret.
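The scaling in (15) can be made concrete with a short Python sketch. The function name is hypothetical, and the constant hidden by the $O(\cdot)$ notation is not specified in the text, so the returned value should be read only as a growth rate, not as a sample count.

```python
import math


def sufficing_sample_scaling(epsilon, delta):
    """Growth rate log(1/delta) / epsilon^2 appearing in (6) and (15)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 < delta <= 1:
        raise ValueError("delta must lie in (0, 1]")
    return math.log(1.0 / delta) / epsilon ** 2


# Halving the gap epsilon quadruples the rate; delta enters only logarithmically.
print(sufficing_sample_scaling(0.5, 0.05))
print(sufficing_sample_scaling(0.25, 0.05))
```

The rate does not depend on the horizon $T$, which is the sense in which bounded regret is possible when $\delta > 0$.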

Problem 4: $(M, \delta)$-satisficing. The $(M, \delta)$-satisficing problem, defined as the satisficing-in-mean-reward multi-armed bandit problem where $M \le m_{\sigma(2)}$ and $\delta \in (0, 1]$, admits policies that achieve cumulative expected regret that is a bounded function of $T$:

Corollary 4 ($(M, \delta)$-satisficing regret bound). The $(M, \delta)$-satisficing problem is a satisficing-in-mean-reward multi-armed bandit problem where the objective (13) is defined with $M \le m_{\sigma(2)}$ and $\delta \in (0, 1]$. Any policy solving the $(M, \delta)$-satisficing multi-armed bandit problem obeys

\[ \sum_{i=1}^{N} n_i^T = T \ge \frac{1}{c} \, \frac{N}{\epsilon^2} \log\left( \frac{k}{8\delta} \right), \tag{16} \]

where $c > 0$ denotes the numerical constant of the bound in [15], $\sigma$ is a permutation of $\{1, \ldots, N\}$ such that $m_{\sigma(1)} \ge m_{\sigma(2)} \ge \cdots \ge m_{\sigma(N)}$, $k$ is the largest integer such that $m_{\sigma(k)} \ge M$, and $\epsilon = m_{\sigma(k)} - M$. Since only arms $\{\sigma(k+1), \ldots, \sigma(N)\}$ result in regret, the left hand side of (16) is an upper bound on the expected satisficing regret (13).

Proof: The definition of satisfaction (10) in the $(M, \delta)$-satisficing problem implies that an algorithm that minimizes satisficing regret is equivalent to an $(\epsilon = m_{\sigma(k)} - M, k, \delta)$-optimal algorithm in the sense of [15]. Therefore, the bound (7) applies to the $(M, \delta)$-satisficing problem.

Recall that $T$ is the number of times all arms (including the optimal one) should be cumulatively sampled such that following $T$ an $(M, \epsilon)$-optimal decision can be made. The lower bounds on both $n_i^T$ in (15) and $T$ in (16) are independent of the horizon length, suggesting that for $(M, \delta)$-satisficing a bounded regret can be achieved.

Corollaries 3 and 4 show that the worst-case regret is a bounded function of $T$ for the sufficing problems, where $\delta > 0$. Therefore we can conclude that the expected regret for such problems can also be a bounded function of $T$. This is an important distinction from the maximizing problems, where $\delta = 0$: in such problems, the Lai-Robbins bound (3) implies that the expected regret must grow logarithmically with $T$. As is standard in the bandit literature, we say an algorithm has efficient performance if its regret matches, up to constant factors, the relevant growth rates: $\log T$ for maximizing problems and $\log(k/\delta)/\epsilon^2$ for sufficing problems.

B. Robust satisficing in instantaneous reward

The four objectives defined in Section III-A above define satisfaction (10) in terms of the mean reward $m_i$ from an arm $i$. This captures situations where the time scale for satisfaction spans numerous decision times. For example, consider foraging, where an animal must consume a minimum amount of food each day.
If each decision time represents a small portion of the day, the total food consumed during the day represents the sum of numerous small rewards from each decision time. As long as the mean reward at each decision time is sufficiently high, the animal will meet its daily food requirement.

If, instead, the decision time scale is the same as the satisfaction time scale, it is more appropriate to define satisfaction at time $t$ in terms of the reward $r_t$ received at that time. This requires more robust algorithms, in the sense that they must ensure that each reward, rather than simply the mean reward, is satisfying with high probability. In this context we define satisfaction in two stages. First, we define happiness as receiving a reward $r_t$ that is at least a threshold value $M \in \mathbb{R}$. We represent happiness at time $t$ as the Bernoulli random variable $h_t$, defined as

\[ h_t = \begin{cases} 1, & \text{if } r_t \ge M \\ 0, & \text{otherwise.} \end{cases} \tag{17} \]

We define the success probability of the happiness random variable $h_t$ as

\[ p_i = \Pr\left[ h_t = 1 \mid i_t = i \right]. \tag{18} \]

The success probability $p_i$ is the expected rate of happiness due to picking arm $i$. This defines a Bernoulli multi-armed bandit problem where the mean reward (i.e., happiness rate) is $p_i$. We then define satisfaction in terms of a threshold $\Pi$ for this Bernoulli multi-armed bandit problem as we did in (10):

\[ s_t = \begin{cases} 1, & \text{if } p_{i_t} \ge \Pi \\ 0, & \text{otherwise.} \end{cases} \tag{19} \]

Given the happiness threshold $M$, this definition is identical to the definition (10) of satisfaction where $m_i = p_i$, $p_{i^*} = \max_i p_i$, and $M = \Pi$. Therefore the four satisficing multi-armed bandit problems defined in Table I can be used to define four additional problems in this context, which we call robust satisficing.

Definition 2 (Robust satisficing multi-armed bandit problem). The robust satisficing multi-armed bandit problem is to minimize the cumulative sum of the expected satisficing regret (12):

\[ J_R = \mathbb{E}\left[ \sum_{t=1}^{T} R_t \right], \]

where the regret $R_t$ is defined using the notion of satisfaction defined by (17)-(19).

A robust satisficing multi-armed bandit problem has three parameters: $M$, $\Pi$, and $\delta$. We assume that $M$ and $\Pi$ are chosen such that there is at least one satisfying arm; otherwise, the expected regret must grow indefinitely. Table II summarizes the four robust satisficing multi-armed bandit problems that result from the interaction of the two dimensions of satisfaction and sufficiency, which we list below. We assume that $\varsigma$ is a permutation of $\{1, \ldots, N\}$ such that $p_{\varsigma(1)} \ge p_{\varsigma(2)} \ge \cdots \ge p_{\varsigma(N)}$.
Problem 5: Robust bandit. The robust bandit problem is defined as the robust satisficing multi-armed bandit problem where $\Pi > p_{\varsigma(2)}$ and $\delta = 0$.

Problem 6: Robust satisfaction. The robust satisfaction problem is defined as the robust satisficing multi-armed bandit problem where $\Pi \le p_{\varsigma(2)}$ and $\delta = 0$.

Problem 7: $\delta$-robust sufficing. The $\delta$-robust sufficing problem is defined as the robust satisficing multi-armed bandit problem where $\Pi > p_{\varsigma(2)}$ and $\delta \in (0, 1]$.

Problem 8: $(\Pi, \delta)$-robust satisficing. The $(\Pi, \delta)$-robust satisficing problem is defined as the robust satisficing multi-armed bandit problem where $\Pi \le p_{\varsigma(2)}$ and $\delta \in (0, 1]$.
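The difference between satisfaction in mean reward and the robust notion built on (17)-(19) can be seen in a small Python sketch. It assumes Gaussian rewards so that the probability of happiness in (18) has a closed form; the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def happiness_probability(mean, sigma, M):
    """p_i = Pr[r >= M] for an ASSUMED Gaussian reward r ~ N(mean, sigma^2)."""
    return 1.0 - NormalDist(mean, sigma).cdf(M)


def robust_satisfying_arms(means, sigmas, M, Pi):
    """Indices of the arms whose probability of happiness is at least Pi."""
    return [
        i
        for i, (m, s) in enumerate(zip(means, sigmas))
        if happiness_probability(m, s, M) >= Pi
    ]


# Two arms with the same mean reward but different dispersion.
means = [1.0, 1.0]
sigmas = [0.5, 2.0]
print(robust_satisfying_arms(means, sigmas, M=0.0, Pi=0.9))  # [0]
```

Both arms would be satisfying in mean reward for a threshold of 0, but only the low-variance arm delivers a reward above the happiness threshold with probability at least 0.9.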

Threshold level           | Seek certainty ($\delta = 0$)      | Suffice ($\delta > 0$)
$M > m_{\sigma(2)}$       | 1. Standard bandit                 | 3. $\delta$-sufficing
$M \le m_{\sigma(2)}$     | 2. Satisfaction-in-mean-reward     | 4. $(M, \delta)$-satisficing

TABLE I: The four different regret concepts, and resulting problems, associated with the satisficing-in-mean-reward multi-armed bandit problem.

Threshold level             | Seek certainty ($\delta = 0$)    | Suffice ($\delta > 0$)
$\Pi > p_{\varsigma(2)}$    | 5. Robust bandit                 | 7. $\delta$-robust sufficing
$\Pi \le p_{\varsigma(2)}$  | 6. Robust satisfaction           | 8. $(\Pi, \delta)$-robust satisficing

TABLE II: The four different regret concepts, and resulting problems, associated with the robust satisficing multi-armed bandit problem. The quantity $p_i$ represents the probability of happiness (i.e., receiving a reward of at least $M$) due to choosing arm $i$.

For a large class of reward distributions, there is an equivalence between Problems 5-8 defined in terms of $r_t$ and Problems 1-4 defined in terms of $m_i$. By Lemma 5 below, when the rewards $r_t$ follow a Gaussian distribution with unknown mean $m_i$ and known variance $\sigma_{s,i}^2$, each problem in Table II is equivalent to the analogous problem in Table I.

IV. SATISFICING WITH GAUSSIAN REWARDS

In this section we study the Gaussian satisficing multi-armed bandit problem. This is the satisficing multi-armed bandit problem where the reward $r_t$ due to selecting arm $i_t$ is $r_t \sim \mathcal{N}(m_{i_t}, \sigma_{s,i_t}^2)$ and $\sigma_{s,i}^2$ is the known variance of arm $i$. In this case, we show a formal equivalence between the satisficing-in-mean-reward multi-armed bandit problems and the robust satisficing multi-armed bandit problems. The choice of Gaussian rewards facilitates modeling correlation dependencies among arms, which can be useful in applications.

A. Equivalence lemma for Gaussian rewards

For the Gaussian robust satisficing multi-armed bandit problem, define the quantity

\[ x_i = \frac{m_i - M}{\sigma_{s,i}}, \tag{20} \]

which we call the standardized mean reward, for each arm $i$. The following lemma states that each Gaussian robust satisficing multi-armed bandit problem where satisfaction is defined by (19) is equivalent to a Gaussian satisficing-in-mean-reward multi-armed bandit problem where satisfaction is defined by (10) with standardized reward distributions.

Lemma 5 (Equivalence for Gaussian rewards). Each Gaussian robust satisficing multi-armed bandit problem is equivalent to a Gaussian satisficing-in-mean-reward multi-armed bandit problem with rewards $r_t \sim \mathcal{N}(x_{i_t}, 1)$ with $x_i$ given by (20).
That is, the ordering of the arms in terms of $x_i$ is identical to the ordering in terms of $p_i$, and, in particular, the arm with maximal $x_i$ is the arm with maximal $p_i$.

Proof: With Gaussian rewards, the probability (18) of happiness due to choosing arm $i$ is

\[ p_i = \Pr\left[ m_i + \sigma_{s,i} z \ge M \right] = \Phi\left( \frac{m_i - M}{\sigma_{s,i}} \right) = \Phi(x_i), \tag{21} \]

where $z \sim \mathcal{N}(0, 1)$ is a standard normal random variable and $\Phi(z)$ is its cumulative distribution function. Let $i^* = \arg\max_i p_i$. The key insight is that $\Phi(\cdot)$ is a monotonically increasing function, which implies that the ordering of arms in terms of $p_i$ is identical to the ordering in terms of $x_i$. In particular, arm $i^*$ is the arm with maximal $x_i$. Therefore, satisfaction in terms of $r_t$ is equivalent to satisfaction in terms of the mean reward $x_i$. This is again a Gaussian bandit problem: consider the standardized reward

\[ \tilde{r}_t = \frac{r_t - M}{\sigma_{s,i_t}}, \tag{22} \]

which is a Gaussian random variable $\tilde{r}_t \sim \mathcal{N}(x_{i_t}, 1)$. The quantity $x_i$ plays the role of the mean reward $m_i$ and the transformed rewards have uniform variance $\sigma_s^2 = 1$. Minimizing the robust satisficing regret in terms of $r_t$ is equivalent to minimizing the satisficing regret in terms of $x_i$.

Lemma 5 has two implications for the relationship between Problems 5-8 and Problems 1-4 when rewards are Gaussian distributed. First, each Problem 5-8 inherits a regret bound from the corresponding Problem 1-4. Second, each Problem 5-8 can be solved by applying the algorithm developed for the corresponding Problem 1-4 after first applying the standardization transformation (22) to the observed rewards.

Remark 6 (Location-scale families). Lemma 5 is easily generalized to reward distributions belonging to location-scale families. A location-scale family is a set of probability distributions closed under affine transformations, i.e., if the random variable $X$ is in the family, so is the variable $Y = a + bX$, where $a, b \in \mathbb{R}$. Any random variable $X$ in such a family with mean $\mu$ and standard deviation $\sigma$ can be written as $X = \mu + \sigma Z$, where $Z$ is a zero-mean, unit-variance member of the family. Examples include the uniform and Student's $t$-distributions.

B. Application to the Gaussian robust satisficing problems

In this section we show how to use the equivalence result of Lemma 5 for the full set of robust satisficing problems in the case of Gaussian rewards. Recall from Lemma 5 that the probability of happiness (18) due to picking an arm $i$ is $p_i$.
In the proof of the lemma, we show that maximizing the probability of happiness is equivalent to maximizing the mean reward in a Gaussian multi-armed bandit problem with mean rewards $x_i = \Phi^{-1}(p_i)$, where $x_i$ is the standardized mean reward $(m_i - M)/\sigma_{s,i}$.

Given an algorithm developed for one of the Problems 1-4 defined in Table I, it can be applied to the corresponding Problem 5-8 defined in Table II as follows. Standardize the observed rewards $r_t$ and run the algorithm using the standardized rewards $\tilde{r}_t = (r_t - M)/\sigma_{s,i_t}$ as input. For example, Problem 5, the robust multi-armed bandit problem, can be solved by an algorithm designed to solve Problem 1, the standard bandit problem, where rewards are transformed according to (22) before being input to the algorithm. The same procedure allows one to apply algorithms developed for Problem 3, $\delta$-sufficing, to Problem 7, $\delta$-robust sufficing.

For Problem 6, robust satisfaction, and Problem 8, $(\Pi, \delta)$-robust satisficing, we need a threshold $X$ that is analogous to the threshold $M$ defined for Problem 2, satisfaction-in-mean-reward, and Problem 4, $(M, \delta)$-satisficing. We use the relationship between $x_i$ and $p_i$ to derive the threshold. In particular, for a robust satisficing problem with probability of happiness threshold $\Pi$, define the threshold $X$ by

\[ X = \Phi^{-1}(\Pi). \tag{23} \]

When the rewards are Gaussian distributed, we can apply algorithms developed for Problems 2 and 4 to the corresponding robust satisficing Problems 6 and 8 by standardizing rewards and using the threshold $X$ defined in (23) in place of the threshold $M$. Lemma 5 implies that the efficient performance guarantees for algorithms designed for Problems 1-4 also hold when they are used to solve the robust satisficing Problems 5-8.

V. THE UCL ALGORITHM FOR GAUSSIAN MULTI-ARMED BANDIT PROBLEMS

In this section we review the UCL algorithm, a Bayesian algorithm we developed and analyzed in [27] to solve the standard Gaussian multi-armed bandit problem. The UCL algorithm was developed by applying the Bayesian upper confidence bound approach of [16] to the case of Gaussian rewards; the choice of Gaussian rewards facilitated the modeling of human decision-making behavior. The UCL algorithm maintains a belief about the mean rewards $m$ by starting with a prior and updating it using Bayesian inference as new rewards are received. At each time the algorithm chooses arm $i_t$ using a heuristic that is a simple function of the current belief state. For uninformative priors, the UCL algorithm achieves logarithmic regret, i.e., optimal performance. Uninformative priors correspond to having no information about the mean rewards. A major advantage of the UCL algorithm is its ability to incorporate information about the mean rewards through the use of a so-called informative prior. In [27], we showed that an appropriately-chosen prior can significantly increase the performance of the UCL algorithm.
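Several different UCL algorithms were developed in [27], including a stochastic decision rule to model human behavior; here we cover only the deterministic UCL algorithm, which, for brevity, we refer to as the UCL algorithm.

A. Prior

The prior distribution captures the agent's knowledge about the vector of mean rewards $m$ before beginning the task. We assume that the prior distribution is multivariate Gaussian with mean $\mu_0 \in \mathbb{R}^N$ and covariance $\Sigma_0 \in \mathbb{R}^{N \times N}$:
\[ m \sim \mathcal{N}(\mu_0, \Sigma_0). \tag{24} \]
The $i$-th element of $\mu_0$, denoted by $\mu_i^0$, represents the agent's mean belief of the reward $m_i$ associated with arm $i$. The $(i, i)$ element of $\Sigma_0$, denoted by $(\sigma_i^0)^2$, represents the agent's uncertainty associated with that belief. Off-diagonal elements of $\Sigma_0$, e.g., $\sigma_{ij}^0$, represent the agent's perceived relationship between $m_i$ and $m_j$: if $\sigma_{ij}^0$ is positive, high values of $m_i$ are correlated with high values of $m_j$, while if it is negative, high values of $m_i$ correlate with low values of $m_j$. Any positive-definite matrix can be used as $\Sigma_0$, but it is often useful to consider a structured parametrization, such as $\Sigma_0 = \sigma_0^2 \Sigma$, where $\sigma_0^2 > 0$ encodes the agent's uncertainty. One important special case is an uncorrelated prior, where $\Sigma$ is diagonal, which corresponds to the agent perceiving the rewards associated with different arms to be independent. Another important special case is an uninformative prior, which corresponds to complete uncertainty, i.e., the limit $\sigma_0^2 \to +\infty$; an uninformative prior can be thought of as a special case of an uncorrelated prior.

B. Inference update

At each time $t$ the agent picks an arm $i_t$ and receives a reward $r_t$ that is Gaussian distributed: $r_t \sim \mathcal{N}(m_{i_t}, \sigma_{s,i_t}^2)$. Bayesian inference provides an optimal solution to the problem of updating the belief state $(\mu_t, \Sigma_t)$ (i.e., the sufficient statistics for estimating $m$) to incorporate this new information. Let $\Lambda_t = \Sigma_t^{-1}$, and let $\phi_t \in \mathbb{R}^N$ be the vector with element $i_t$ equal to 1 and all other elements equal to zero. Then, given the Gaussian prior (24), the Bayesian update equations are linear [17]:
\[ q = \frac{r_t \phi_t}{\sigma_{s,i_t}^2} + \Lambda_{t-1}\mu_{t-1}, \qquad \Lambda_t = \frac{\phi_t \phi_t^T}{\sigma_{s,i_t}^2} + \Lambda_{t-1}, \qquad \Sigma_t = \Lambda_t^{-1}, \qquad \mu_t = \Sigma_t q. \tag{25} \]

C. Decision heuristic

At each time $t$ the UCL algorithm computes a value $Q_i^t$ for each arm $i$. The algorithm then picks the arm that maximizes $Q_i^t$. That is, it picks
\[ i_t = \arg\max_i Q_i^t. \tag{26} \]
The heuristic value $Q_i^t$ is
\[ Q_i^t = \mu_i^t + \sigma_i^t \Phi^{-1}(1 - \alpha_t), \tag{27} \]
where $\mu_i^t = (\mu_t)_i$, $(\sigma_i^t)^2 = (\Sigma_t)_{ii}$, $\alpha_t = 1/(Kt)$, $K > 0$ is a tunable parameter, and $\Phi^{-1}(\cdot)$ is the quantile function of the standard normal random variable. The heuristic $Q_i^t$ is a Bayesian upper limit for the value of $m_i$ based on the information available at time $t$. It represents an optimistic assessment of the value of $m_i$. The decision made can be thought of as the most optimistic one consistent with the current information.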
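The inference update and decision heuristic of this section can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: it assumes an uncorrelated (diagonal) prior, so that update (25) reduces to a scalar update of the sampled arm only, and the function names `ucl_update`, `ucl_heuristic`, and `ucl_choose` are hypothetical. It also assumes, as a convention, that the quantile is taken to be minus infinity whenever $\alpha_t \ge 1$ (for example $t = 1$ with $K = 1$).

```python
import math
from statistics import NormalDist


def ucl_update(mu, var, arm, reward, sigma_s2):
    """Bayesian update (25), specialized to a diagonal covariance.

    mu, var: lists of belief means and variances, one entry per arm.
    Only the entry of the sampled arm changes when the prior is uncorrelated.
    """
    precision = 1.0 / var[arm] + 1.0 / sigma_s2  # (arm, arm) entry of Lambda_t
    q = reward / sigma_s2 + mu[arm] / var[arm]   # arm entry of the vector q
    mu, var = list(mu), list(var)                # copy so the inputs are not mutated
    var[arm] = 1.0 / precision
    mu[arm] = var[arm] * q
    return mu, var


def ucl_heuristic(mu, var, t, K=1.0):
    """Heuristic (27): Q_i = mu_i + sigma_i * Phi^{-1}(1 - alpha_t), alpha_t = 1/(K t)."""
    alpha = 1.0 / (K * t)
    # Assumed convention: Phi^{-1}(0) is treated as -infinity when alpha >= 1.
    quantile = -math.inf if alpha >= 1.0 else NormalDist().inv_cdf(1.0 - alpha)
    return [m + math.sqrt(v) * quantile for m, v in zip(mu, var)]


def ucl_choose(mu, var, t, K=1.0):
    """Decision rule (26): pick the arm with maximal heuristic value."""
    q = ucl_heuristic(mu, var, t, K)
    return max(range(len(q)), key=q.__getitem__)
```

A typical loop would call `ucl_choose`, observe a reward from the chosen arm, and pass it to `ucl_update`. A correlated prior would need the full matrix form of (25) instead of this scalar shortcut.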

D. Performance

In [27], we studied the case of homogeneous sampling noise (i.e., $\sigma_{s,i}^2 = \sigma_s^2$ for each $i$) and showed that the UCL algorithm achieves logarithmic cumulative expected regret uniformly in time. In particular, we proved that the following theorem holds. We define $\{R_t^{UCL}\}_{t \in \{1,\dots,T\}}$ as the sequence of expected regret for the deterministic UCL algorithm.

Theorem 7 (Regret of the deterministic UCL algorithm [27]). The following statements hold for the Gaussian multi-armed bandit problem and the deterministic UCL algorithm with uncorrelated uninformative prior and $K = 1$:

1) the expected number of times a suboptimal arm $i$ is chosen until time $T$ satisfies
\[ E[n_i^T] \le \left( \frac{8\sigma_s^2}{\Delta_i^2} + 2 \right) \log T + 3; \]

2) the cumulative expected regret until time $T$ satisfies
\[ J_R = \sum_{t=1}^{T} R_t^{UCL} \le \sum_{i=1}^{N} \Delta_i \left( \left( \frac{8\sigma_s^2}{\Delta_i^2} + 2 \right) \log T + 3 \right). \]

The implication of this theorem can be seen by comparing 1) with the Lai-Robbins bound (9): the UCL algorithm achieves logarithmic regret uniformly in time with a constant that differs from the optimal asymptotic one by a constant factor, and thus is considered to have optimal performance.

VI. ALGORITHMS FOR SATISFICING IN GAUSSIAN MULTI-ARMED BANDIT PROBLEMS

In this section we develop algorithms for solving Gaussian multi-armed bandit problems with the satisficing objectives proposed in Section III. All the algorithms consist of modified versions of the UCL algorithm. We analyze the algorithms and show that they achieve efficient performance. The UCL algorithm solves the standard Gaussian multi-armed bandit problem, i.e., the satisficing Gaussian multi-armed bandit problem with $M > m_{\sigma(2)}$ and $\delta = 0$ (Problem 1). We develop three new UCL variants for Problems 2–4 in Table I. These algorithms can then be applied to Problems 5–8 in Table II. At the end of the section, we consider extensions to reward distributions other than the Gaussian with known variance.

A. Problem 2: Satisfaction-in-mean-reward UCL algorithm

A simple modification of the UCL algorithm achieves logarithmic regret for the Gaussian satisfaction-in-mean-reward problem, which is the satisficing-in-mean-reward multi-armed bandit problem with $M \le m_{\sigma(2)}$ and $\delta = 0$ (Problem 2). We define this algorithm, which we refer to as the satisfaction-in-mean-reward UCL algorithm, as follows. As in (27), define the heuristic value $Q_i^t$ as
\[ Q_i^t = \mu_i^t + \sigma_i^t \Phi^{-1}(1 - \alpha_t), \]
where $\alpha_t = 1/(Kt)$ and $K > 0$ is again a tunable parameter.
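Let $M \in \mathbb{R}$ be the satisfaction threshold, so the agent is satisfied if it picks an arm $i$ with $m_i \ge M$. Let the eligible set at time $t$ be $\{i \mid Q_i^t \ge M\}$. In contrast to the UCL selection scheme (26) that picks the arm with maximal $Q_i^t$, satisfaction-in-mean-reward UCL picks any arm in the eligible set. That is, if the eligible set is non-empty, then
\[ i_t \in \{ i \mid Q_i^t \ge M \}, \tag{28} \]
or if the eligible set is empty, then satisfaction-in-mean-reward UCL picks the arm with maximal $Q_i^t$. Thus, if the most recently selected arm is in the eligible set, it may be selected again even if it does not have the maximal $Q_i^t$. The satisfaction-in-mean-reward UCL algorithm achieves logarithmic cumulative expected satisfaction-in-mean-reward regret, as guaranteed by the following theorem.

Theorem 8 (Regret of the satisfaction-in-mean-reward UCL algorithm). Let a Gaussian multi-armed bandit problem with the satisfaction-in-mean-reward objective have at least one arm that obeys $m_i > M$, and, without loss of generality, assume $\sigma_{s,i}^2 = 1$ for each arm $i$. Then, the following statements hold for the satisfaction-in-mean-reward UCL algorithm with uncorrelated uninformative prior and $K = 1$:

1) the expected number of times a non-satisfying arm $i$ is chosen until time $T$ satisfies
\[ E[n_i^T] \le \left( \frac{8}{(\Delta_i^M)^2} + 3 \right) \log T + 4; \]

2) the cumulative expected satisfaction-in-mean-reward regret until time $T$ satisfies
\[ J_{SM} \le \sum_{i=1}^{N} \Delta_i^M \left( \left( \frac{8}{(\Delta_i^M)^2} + 3 \right) \log T + 4 \right). \]

To prove Theorem 8 we use the following bound from [1].

Lemma 9 (Bounds on the inverse Gaussian cdf). For the standard normal (i.e., Gaussian) random variable $z$ and a constant $w \in \mathbb{R}_{\ge 0}$,
\[ \Pr[z \ge w] \le \frac{2 e^{-w^2/2}}{\sqrt{2\pi}\,\bigl(w + \sqrt{w^2 + 8/\pi}\bigr)} \le \frac{1}{2} e^{-w^2/2}. \tag{29} \]
It follows from (29) that for any $\alpha \in [0.5, 1]$,
\[ \Phi^{-1}(1 - \alpha) \le \sqrt{-2 \log \alpha}. \tag{30} \]

Proof of Theorem 8: The proof proceeds as in the proof of Theorem 7 in [27], which itself follows the proofs in [4]. Let $i$ be a non-satisfying arm, i.e., $m_i < M$, and recall that $i^*$ designates the arm with maximum mean reward. Then
\[ E[n_i^T] = \sum_{t=1}^{T} \Pr[i_t = i] \le \sum_{t=1}^{T} \Bigl( \Pr[Q_i^t \ge M] + \Pr\bigl[Q_i^t \ge Q_{i^*}^t \ \&\ \max_j Q_j^t < M\bigr] \Bigr) \le \eta + \sum_{t=1}^{T} \Bigl( \Pr[Q_i^t \ge M,\ n_i^{t-1} \ge \eta] + \Pr[Q_i^t \ge Q_{i^*}^t,\ n_i^{t-1} \ge \eta] \Bigr). \]
The first term in the summand corresponds to the probability that the non-satisfying arm $i$ is in the eligible set, while the second term corresponds to the probability that the eligible set is empty and that a non-satisfying arm appears better than an optimal arm.

The statement $Q_i^t \ge Q_{i^*}^t$ implies that at least one of the following inequalities holds:
\[ \mu_i^t \ge m_i + C_i^t \tag{31} \]
\[ \mu_{i^*}^t \le m_{i^*} - C_{i^*}^t \tag{32} \]
\[ m_{i^*} < m_i + 2 C_i^t, \tag{33} \]
where $C_i^t = \sigma_i^t \Phi^{-1}(1 - \alpha_t)$ and $\alpha_t = 1/(Kt)$. Otherwise, if none of (31)–(33) holds, then
\[ Q_{i^*}^t = \mu_{i^*}^t + C_{i^*}^t > m_{i^*} \ge m_i + 2 C_i^t > \mu_i^t + C_i^t = Q_i^t. \]

We first analyze the probability that (31) holds. For an uncorrelated uninformative prior, $\mu_i^t$ is equal to $\bar m_i^t$, the empirical mean reward observed at arm $i$ until time $t$, and $\sigma_i^t = 1/\sqrt{n_i^t}$. Therefore, for an uncorrelated uninformative prior,
\[ Q_i^t = \bar m_i^t + \frac{1}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t). \]
Conditional on $n_i^t$, the empirical mean reward $\bar m_i^t$ is itself a Gaussian random variable with mean $m_i$ and standard deviation $1/\sqrt{n_i^t}$, so (31) holds if
\[ m_i + \frac{z}{\sqrt{n_i^t}} \ge m_i + \frac{1}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t) \iff z \ge \Phi^{-1}(1 - \alpha_t), \]
where $z \sim \mathcal{N}(0, 1)$ is a standard normal random variable. Thus, for an uninformative prior,
\[ \Pr[\text{(31) holds}] = \alpha_t = \frac{1}{Kt}. \]
Similarly, (32) holds if
\[ m_{i^*} + \frac{z}{\sqrt{n_{i^*}^t}} \le m_{i^*} - \frac{1}{\sqrt{n_{i^*}^t}} \Phi^{-1}(1 - \alpha_t) \iff z \le -\Phi^{-1}(1 - \alpha_t), \]
where $z \sim \mathcal{N}(0, 1)$ is a standard normal random variable. Thus, for an uninformative prior,
\[ \Pr[\text{(32) holds}] = \alpha_t = \frac{1}{Kt}. \]
Inequality (33) holds if
\[ m_{i^*} < m_i + \frac{2}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t) \implies \Delta_i < \frac{2}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t) \implies \frac{\Delta_i^2 n_i^t}{4} < -2 \log \alpha_t, \tag{34} \]
and, with $K = 1$, $-2 \log \alpha_t = 2 \log t \le 2 \log T$. Here $\Delta_i = m_{i^*} - m_i$ and inequality (34) follows from bound (30). Thus, for an uninformative prior, (33) never holds if
\[ n_i^t \ge \frac{8}{\Delta_i^2} \log T. \tag{35} \]
Thus, for sufficiently large $n_i^t$, $\Pr[Q_i^t \ge Q_{i^*}^t] \le 2/(Kt)$.

We now bound the probability $\Pr[Q_i^t \ge M]$ that a non-satisfying arm $i$ is in the eligible set. Note that $Q_i^t \ge M$ implies that at least one of the following inequalities holds:
\[ \mu_i^t \ge m_i + C_i^t \tag{36} \]
\[ M < m_i + 2 C_i^t. \tag{37} \]
Otherwise, if neither (36) nor (37) holds, $M \ge m_i + 2 C_i^t > \mu_i^t + C_i^t = Q_i^t$ and arm $i$ is not in the eligible set. (36) is identical to (31), and (37) is (33) with $M$ in place of $m_{i^*}$. For an uninformative prior,
\[ \Pr[\text{(36) holds}] = \alpha_t = \frac{1}{Kt}. \]
And (37) holds if
\[ M < m_i + \frac{2}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t) \implies \Delta_i^M < \frac{2}{\sqrt{n_i^t}} \Phi^{-1}(1 - \alpha_t) \implies \frac{(\Delta_i^M)^2 n_i^t}{4} < -2 \log \alpha_t = 2 \log t \le 2 \log T. \]
Thus, for an uninformative prior, (37) never holds if
\[ n_i^t \ge \frac{8}{(\Delta_i^M)^2} \log T. \tag{38} \]
Since $m_{i^*} \ge M$, for each non-satisfying arm $i$, $\Delta_i \ge \Delta_i^M$. Thus, $1/(\Delta_i^M)^2 \ge 1/\Delta_i^2$ and (38) implies (35). So setting
\[ \eta = \left\lceil \frac{8}{(\Delta_i^M)^2} \log T \right\rceil \tag{39} \]
yields the bound
\[ E[n_i^T] \le \eta + \sum_{t=1}^{T} \Bigl( \Pr[Q_i^t \ge M,\ n_i^{t-1} \ge \eta] + \Pr[Q_i^t \ge Q_{i^*}^t,\ n_i^{t-1} \ge \eta] \Bigr) < \frac{8}{(\Delta_i^M)^2} \log T + 1 + 3 \sum_{t=1}^{T} \frac{1}{t}. \]
The sum can be bounded by the integral
\[ \sum_{t=1}^{T} \frac{1}{t} \le 1 + \int_1^T \frac{1}{t}\, dt = 1 + \log T, \]
yielding the bound in the first statement of the theorem:
\[ E[n_i^T] \le \left( \frac{8}{(\Delta_i^M)^2} + 3 \right) \log T + 4. \]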
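The second statement of the theorem follows from the definition (12) of expected satisficing regret.

B. Problem 3: δ-sufficing UCL algorithm

An alternative modification of the UCL algorithm achieves finite satisficing regret in the Gaussian δ-sufficing problem, which is the satisficing-in-mean-reward multi-armed bandit with $M > m_{\sigma(2)}$ and $\delta \in (0, 1]$ (Problem 3). For the agent, this can be thought of as wanting to have finite confidence that it has found the unknown optimal arm $\sigma(1)$. For the δ-sufficing problem, define the heuristic function
\[ Q_i^t = \mu_i^t + \sigma_i^t \Phi^{-1}\left(1 - \frac{\delta}{2}\right). \]
We define the δ-sufficing UCL algorithm as the algorithm that selects arm $i_t = \arg\max_i Q_i^t$ at each decision time $t$. The δ-sufficing UCL algorithm achieves finite cumulative satisficing regret, as guaranteed by the following theorem.

Theorem 10. Consider the δ-sufficing UCL algorithm with an uninformative prior. The number of times the picked arm is non-satisfying with probability greater than δ is upper bounded as
\[ n_i^T < \frac{4\sigma_s^2}{\Delta_i^2} \left( \Phi^{-1}\left(1 - \frac{\delta}{2}\right) \right)^2 + 1. \]

Proof: We bound $n_i^T$ by noting that a non-satisfying arm $i$ is picked only if $Q_i^t \ge Q_{i^*}^t$, which can be decomposed as in the proof of Theorem 8 into the three conditions
\[ \mu_i^t \ge m_i + C_i^t \tag{40} \]
\[ \mu_{i^*}^t \le m_{i^*} - C_{i^*}^t \tag{41} \]
\[ m_{i^*} < m_i + 2 C_i^t, \tag{42} \]
where here $C_i^t = \sigma_i^t \Phi^{-1}(1 - \delta/2)$. (42) is equivalent to
\[ \Delta_i = m_{i^*} - m_i < 2 C_i^t = \frac{2\sigma_s}{\sqrt{n_i^t}} \Phi^{-1}(1 - \delta/2). \]
Squaring and rearranging, we see that this never holds if
\[ n_i^t > \frac{4\sigma_s^2}{\Delta_i^2} \left( \Phi^{-1}(1 - \delta/2) \right)^2 = \eta. \]
The same argument as in the proof of Theorem 8 shows that for $n_i^t \ge 1$, (40) and (41) each hold with probability at most $\delta/2$. Therefore, for $n_i^t > \eta + 1$, a non-satisfying arm is selected with probability at most δ.

Theorem 10 guarantees that the δ-sufficing UCL algorithm achieves finite regret. Furthermore, the algorithm is efficient in that the regret matches the dependence on ε and δ in the bound (15). To see this, note that a non-satisfying arm $i$ with gap $\Delta_i$ is an $\epsilon = \Delta_i$-suboptimal arm, so Corollary 3 implies that $n_i^T$ is lower bounded by $O(\log(1/\delta)/\epsilon^2)$. The statement of Theorem 10 combined with the bound (30) on the inverse Gaussian cdf implies that $n_i^T$ is upper bounded by $8\sigma_s^2 \log(2/\delta)/\Delta_i^2 + 1 = 8\sigma_s^2 \log(2/\delta)/\epsilon^2 + 1$, which matches the lower bound (15) up to constant factors.

C. Problem 4: (M, δ)-satisficing UCL algorithm

A third modification of the UCL algorithm achieves finite satisficing regret in the Gaussian (M, δ)-satisficing problem, which is the satisficing-in-mean-reward multi-armed bandit with $M \le m_{\sigma(2)}$ and $\delta \in (0, 1]$ (Problem 4). For the agent, this can be thought of as wanting to have finite confidence that it has found an arm whose mean reward is above a known threshold. For the (M, δ)-satisficing problem, define the heuristic function
\[ Q_i^t = \mu_i^t + \sigma_i^t \Phi^{-1}\left(1 - \frac{\delta}{3}\right). \]
Let the eligible set at time $t$ be $\{i \mid Q_i^t \ge M\}$. We define the (M, δ)-satisficing UCL algorithm as the algorithm that selects an arm $i_t \in \{i \mid Q_i^t \ge M\}$ if the eligible set at time $t$ is non-empty. Otherwise, if the eligible set is empty, the algorithm picks the arm with maximal $Q_i^t$.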
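The eligible-set selection rule shared by the satisfaction-in-mean-reward and (M, δ)-satisficing UCL algorithms can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `satisficing_choose` and the `last_arm` argument are hypothetical, and because the text only requires picking any eligible arm, the tie-breaking used here (stay on the previous arm when it is eligible, otherwise take the lowest-index eligible arm) is one possible choice among many.

```python
import math
from statistics import NormalDist


def satisficing_choose(mu, var, M, delta, last_arm=None):
    """Pick an arm for the (M, delta)-satisficing problem.

    mu, var: belief means and variances per arm (uncorrelated belief assumed).
    M: satisfaction threshold; delta: sufficiency threshold in (0, 1].
    last_arm: previously selected arm, used only for tie-breaking (assumption).
    """
    # Heuristic Q_i = mu_i + sigma_i * Phi^{-1}(1 - delta/3).
    quantile = NormalDist().inv_cdf(1.0 - delta / 3.0)
    q = [m + math.sqrt(v) * quantile for m, v in zip(mu, var)]

    # Eligible set: arms whose optimistic value reaches the threshold M.
    eligible = [i for i, qi in enumerate(q) if qi >= M]
    if eligible:
        # Any eligible arm is allowed; staying put avoids an extra switch.
        return last_arm if last_arm in eligible else eligible[0]

    # Empty eligible set: fall back to the arm with maximal Q.
    return max(range(len(q)), key=q.__getitem__)
```

Replacing `delta / 3.0` with the time-varying $\alpha_t = 1/(Kt)$ would give the corresponding sketch for the satisfaction-in-mean-reward rule (28).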
The (M, δ)-satisficing UCL algorithm achieves efficient performance as guaranteed by the following theorem.

Theorem 11. Consider the (M, δ)-satisficing UCL algorithm with an uninformative prior. The number of times the picked arm is non-satisfying with probability greater than δ is upper bounded as
\[ n_i^T < \frac{4\sigma_s^2}{(\Delta_i^M)^2} \left( \Phi^{-1}(1 - \delta/3) \right)^2 + 1. \]

Proof: The proof is very similar to the proofs of Theorems 8 and 10. As in Theorem 8, we bound $n_i^T$ by
\[ n_i^T = \sum_{t=1}^{T} \mathbf{1}(i_t = i) \le \eta + \sum_{t=1}^{T} \Bigl( \mathbf{1}(Q_i^t \ge M,\ n_i^{t-1} \ge \eta) + \mathbf{1}(Q_i^t \ge Q_{i^*}^t,\ n_i^{t-1} \ge \eta) \Bigr). \]
The condition $Q_i^t \ge M$, which means arm $i$ is in the eligible set, can be decomposed into the two conditions
\[ \mu_i^t \ge m_i + C_i^t \tag{43} \]
\[ M < m_i + 2 C_i^t, \tag{44} \]
where here $C_i^t = \sigma_i^t \Phi^{-1}(1 - \delta/3)$. (44) is equivalent to
\[ \Delta_i^M = M - m_i < 2 C_i^t = \frac{2\sigma_s}{\sqrt{n_i^t}} \Phi^{-1}(1 - \delta/3). \]
Squaring and rearranging, we see that (44) never holds if
\[ n_i^t > \frac{4\sigma_s^2}{(\Delta_i^M)^2} \left( \Phi^{-1}(1 - \delta/3) \right)^2 = \eta. \]
The same argument as in the proof of Theorem 10 shows that for $n_i^t \ge 1$, (43) holds with probability at most $\delta/3$, so $n_i^t > \eta$ implies that a non-satisfying arm $i$ is in the eligible set with probability at most $\delta/3$.

As in the proof of Theorem 10, a non-satisfying arm $i$ is picked due to the eligible set being empty only if $Q_i^t \ge Q_{i^*}^t$, where $i^*$ is the arm with maximal mean reward. This condition can again be decomposed into the three conditions (40)–(42). (42) does not hold if $n_i^t > \eta$, so the probability that $Q_i^t \ge Q_{i^*}^t$ is bounded by the probability that either (40) or (41) holds. For $n_i^t \ge 1$, each of these holds with probability at most $\delta/3$, so the probability of a non-satisfying arm being chosen due to the eligible set being empty is at most $2\delta/3$. Thus, for $n_i^t > \eta + 1$, a non-satisfying arm is selected with probability at most δ.

Theorem 11 guarantees that the (M, δ)-satisficing UCL algorithm achieves finite regret. Furthermore, the algorithm is efficient in that the regret matches the dependence on ε and δ in the bound (16). Applying the bound (30) on the inverse Gaussian cdf to the statement in the theorem, we see that $n_i^T$ is upper bounded by $8\sigma_s^2 \log(3/\delta)/(\Delta_i^M)^2$. Summing this bound over non-satisfying arms shows that the total number of times the algorithm incurs regret is at most
\[ 8\sigma_s^2 \log(3/\delta) \sum_{\{i \mid \Delta_i^M > 0\}} \frac{1}{(\Delta_i^M)^2}. \]
This matches the dependence on ε and δ in the bound (16) up to constant factors. Note that the lower bound (16) counts the number of selections of all arms including the optimal arm, while the upper bound counts only the suboptimal arms. Hence, we can only claim that we achieve cumulative regret bounded in $T$. With a better lower bound on $n_i^T$, we may be able to claim that, similar to δ-sufficing UCL, (M, δ)-satisficing UCL achieves the optimal dependence on ε and δ. However, this remains an open problem to pursue.

D. Robust satisficing UCL algorithms

The UCL algorithm solves Problem 1, the Gaussian standard problem. The modified versions of the UCL algorithm in Sections VI-A, VI-B, and VI-C solve the other three Gaussian satisficing-in-mean-reward Problems 2–4. All four UCL algorithms achieve efficient performance in solving their respective problems, as guaranteed by Theorems 8, 10, and 11. The equivalence result of Lemma 5 shows for Gaussian distributed rewards that we can modify the four UCL algorithms developed for Problems 1–4 to solve Problems 5–8 as follows. The modified UCL algorithms make decisions based on the standardized mean reward (20) using priors on the standardized mean rewards. A prior belief $m \sim \mathcal{N}(\mu_0, \Sigma_0)$ on the mean rewards $m$ is transformed into a prior belief on the standardized mean rewards $x \sim \mathcal{N}(\tilde\mu_0, \tilde\Sigma_0)$ by
\[ (\tilde\mu_0)_i = \frac{(\mu_0)_i - M}{\sigma_{s,i}}, \qquad (\tilde\Sigma_0)_{ij} = \frac{(\Sigma_0)_{ij}}{\sigma_{s,i}\,\sigma_{s,j}}. \]

1) Problem 5: Robust UCL algorithm: The robust UCL algorithm is the UCL algorithm where the prior is given in terms of the standardized mean rewards, and the observed reward $r_t$ is standardized according to the transformation (22) before being input to the inference equations (25).

2) Problem 6: Robust satisfaction UCL algorithm: The robust satisfaction UCL algorithm is the satisfaction-in-mean-reward UCL algorithm where the prior is given in terms of the standardized mean rewards, the observed reward $r_t$ is standardized according to the transformation (22) before being input to the inference equations (25), and the parameter $M$ is set equal to $X = \Phi^{-1}(\Pi)$ defined in (23).
3) Problem 7: δ-robust sufficing: The δ-robust sufficing UCL algorithm is the δ-sufficing UCL algorithm where the prior is given in terms of the standardized mean rewards, and the observed reward $r_t$ is standardized according to the transformation (22) before being input to the inference equations (25).

4) Problem 8: (Π, δ)-robust satisficing: The (Π, δ)-robust satisficing UCL algorithm is the (M, δ)-satisficing UCL algorithm where the prior is given in terms of the standardized mean rewards, the observed reward $r_t$ is standardized according to the transformation (22) before being input to the inference equations (25), and the parameter $M$ is set equal to $X = \Phi^{-1}(\Pi)$ defined in (23).

Lemma 5 implies that the performance guarantees that hold for the UCL algorithms developed for Problems 1–4 also hold for the four new UCL algorithms defined above when applied to Problems 5–8.

E. Relaxations of Gaussian and known variance assumptions

The algorithms presented so far have been developed assuming that the reward distribution associated with each arm $i$ is Gaussian with unknown mean $m_i$ and known variance $\sigma_{s,i}^2$. The reward variance may be known, e.g., estimated from known sensor characteristics or prior data. When the reward variance is not known, a simple modification to the heuristic (27) yields an algorithm that achieves efficient performance. Similar simple modifications extend our results to the case where the reward distribution is sub-Gaussian, which includes distributions with bounded support. We state modifications for the case of an uninformative prior. Prior information can be incorporated using a conjugate prior, as discussed in [24].

Remark 12 (Gaussian rewards with unknown variance). When the reward distribution is Gaussian with unknown variance, the heuristic developed by Auer et al. [4] for their algorithm UCB1-NORMAL results in algorithms that achieve efficient performance. Recall that $n_i^t$ is the number of times arm $i$ has been selected up to time $t$, and $\bar m_i^t$ is the empirical mean reward observed at arm $i$ up to time $t$. Define $q_i^t = \sum_{\tau \le t:\ i_\tau = i} r_\tau^2$ as the sum of the squared rewards obtained from arm $i$. The UCB1-NORMAL algorithm is composed of two rules: if there is an arm that has been played less than $\lceil 8 \log t \rceil$ times, it selects that arm. Otherwise it selects the arm $i$ that maximizes the heuristic
\[ Q_{i,\text{UCB1-NORMAL}}^t = \bar m_i^t + \sqrt{16 \cdot \frac{q_i^t - n_i^t (\bar m_i^t)^2}{n_i^t - 1} \cdot \frac{\log(t - 1)}{n_i^t}}. \]
This heuristic can be used directly in the standard and satisfaction-in-mean-reward UCL algorithms.
For the δ-sufficing and (M, δ)-satisficing UCL algorithms, use $k = 2$ and $k = 3$, respectively, in the heuristic
\[ Q_i^t = \bar m_i^t + \sqrt{4 \cdot \frac{q_i^t - n_i^t (\bar m_i^t)^2}{n_i^t - 1} \cdot \frac{\log(k/\delta)}{n_i^t}}. \]
The Gaussian distribution with unknown mean and variance is again a location-scale family, so Lemma 5 implies that these modified algorithms can be used to solve the robust satisficing problems as well. Prior information can be incorporated by means of a conjugate prior, as discussed in [24].

Remark 13 (Sub-Gaussian rewards). Another generalization of Gaussian rewards with known variance is the case where the reward distribution is sub-Gaussian, also known as light-tailed. The distribution of a random variable $X$ is called sub-Gaussian if its moment generating function $M(u) = E[\exp(uX)]$ is finite for all $u \in \mathbb{R}$. Then, one can find a constant ζ such that $M(u) \le \exp(\zeta u^2/2)$ [9]. In this case, a heuristic function due to Liu and Zhou [21],
\[ Q_{i,SG}^t = \bar m_i^t + \sqrt{\frac{8\zeta \log t}{n_i^t}}, \]
can be used to achieve efficient performance.
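The heuristics of Remarks 12 and 13 depend only on per-arm summary statistics, which the following sketch makes explicit. It is an illustration under assumptions rather than a reference implementation: the function names are hypothetical, the UCB1-NORMAL index follows the form published by Auer et al. (including the $\log(t-1)$ term), and the sample-variance term is clamped at zero to guard against floating-point round-off, which is a choice made here and not something stated in the text.

```python
import math


def q_ucb1_normal(mean, sumsq, n, t):
    """UCB1-NORMAL style index for one arm (unknown variance).

    mean: empirical mean reward of the arm; sumsq: sum of squared rewards;
    n: number of pulls of the arm (n >= 2); t: current time (t >= 2).
    """
    # Unbiased sample variance, clamped at 0 against round-off (assumption).
    sample_var = max(0.0, (sumsq - n * mean ** 2) / (n - 1))
    return mean + math.sqrt(16.0 * sample_var * math.log(t - 1) / n)


def q_sub_gaussian(mean, n, t, zeta):
    """Sub-Gaussian index: mean + sqrt(8 * zeta * log(t) / n)."""
    return mean + math.sqrt(8.0 * zeta * math.log(t) / n)


def needs_forced_pull(n, t):
    """First UCB1-NORMAL rule: an arm played fewer than ceil(8 log t) times is pulled."""
    return n < math.ceil(8.0 * math.log(t))
```

Either index can be substituted for the Gaussian heuristic (27) when selecting the maximizing arm or when building the eligible set.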

Remark 14 (Reward distributions with bounded support). Another common assumption in the bandit literature is that the reward distributions are arbitrary but have a known bounded support $[a, b] \subset \mathbb{R}$. Without loss of generality, we assume that the support is contained in the unit interval $[0, 1]$. In this case the UCB1 heuristic due to Auer et al. [4],
\[ Q_{i,\text{UCB1}}^t = \bar m_i^t + \sqrt{\frac{2 \log t}{n_i^t}}, \]
can be used in the standard and satisfaction-in-mean-reward UCL algorithms. For the δ-sufficing and (M, δ)-satisficing UCL algorithms, use $k = 2$ and $k = 3$, respectively, in the heuristic
\[ Q_i^t = \bar m_i^t + \sqrt{\frac{\log(k/\delta)}{2 n_i^t}}. \]
For the robust satisficing problems the relevant reward, happiness $h_t$ (17), is a Bernoulli random variable which is supported on $[0, 1]$. Therefore, each robust satisficing problem can be solved by the appropriate variant of UCB1. However, if additional information is available about the distribution of the raw rewards $r_t$, e.g., that they are Gaussian with known variance, then the robust UCL algorithms can achieve improved performance relative to UCB1, for example if the Kullback-Leibler divergence between the $r_t$ distributions is larger than that between the Bernoulli distributions associated with $h_t$. Additional extensions to heavy-tailed distributions may be possible using the techniques of [7].

VII. NUMERICAL EXAMPLES

In this section, we present the results of numerical simulations of the modified UCL algorithms solving multi-armed bandit problems with Gaussian rewards and satisficing objectives. We consider both thresholding in the mean rewards $m_i$, as in Problems 1–4 (Table I), and thresholding in the instantaneous rewards, as in Problems 5–8 (Table II). In all of the cases presented, the algorithms used an uninformative prior. We use the simulations to illustrate performance of the algorithms relative to the bounds proved in the theorems of Section IV. We also use the simulations to compare how the different algorithms trade off accumulation of reward with reduction in exploration cost as measured by number of switches among arms. As shown in the figures, satisficing can significantly decrease the exploration cost while incurring little cost in terms of the rewards received by the agent.

We first consider the satisficing objectives with thresholding in the mean rewards. We illustrate how the objectives of Problems 1 and 2 yield logarithmic regret (Figure 1) whereas the objectives of Problems 3 and 4 yield finite regret (Figure 2), as predicted by the bounds proved in Theorems 7, 8, 10 and 11.
For the simulations presented in Figures 1 and 2, we set $N = 4$. The mean rewards $m_i$ were set equal to [ ] and the standard deviations $\sigma_{s,i}$ were each set equal to 1. In Figure 1, the agent's regret is defined by comparing the mean rewards $m_i$ with the satisfaction threshold $M$. For the standard objective (Problem 1) the satisfaction level $M$ was set equal to $m_{i^*} = 4$, so the agent incurred regret if it selected any arm other than $i^* = 4$. For the satisfaction-in-mean-reward objective (Problem 2) the satisfaction level $M$ was set equal to 2.5, so the agent incurred regret if it selected arms 1 or 2. Figure 1 plots mean cumulative regret from 100 simulations (solid lines) and the bounds on regret from Theorems 7 and 8 (dashed lines) for the standard and satisfaction-in-mean-reward UCL algorithms, respectively. We observe from Figure 1 that the algorithms' regret is significantly below the bounds, indicating that both bounds are conservative. Because both objectives set the sufficiency threshold $\delta = 0$ and define regret in terms of the unknown mean rewards $m_i$, the agent must achieve certainty about the mean reward values to stop incurring regret. It is impossible for the agent to achieve this certainty in finite time, so the mean regret and its bound both increase indefinitely at a logarithmic rate for both objectives.

Figure 2, in contrast, shows that by setting the sufficiency threshold δ to a non-zero value, one can achieve finite regret. All the parameters for the simulations shown in Figure 2 were identical to those for the simulations in Figure 1, except that the sufficiency threshold δ was set to a non-zero value, which transformed the standard and satisfaction-in-mean-reward objectives into the δ-sufficing and (M, δ)-satisficing objectives, Problems 3 and 4, respectively. For these objectives, regret is again defined by thresholding the mean rewards $m_i$. However, rather than seeking certainty that the threshold is met, the agent only seeks to ensure that its threshold is met with a probability of at least $1 - \delta$. Because of the allowance of uncertainty, the agent only needs to perform a finite amount of exploration before settling on an arm that appears satisfying. The regret bounds in the figure follow from Theorems 10 and 11 for the δ-sufficing and (M, δ)-satisficing objectives, respectively. As in Figure 1, we observe from Figure 2 that both bounds are conservative.
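A small simulation in the spirit of Figures 1 and 2 can be sketched as below. This is a hedged illustration, not a reproduction of the paper's experiments: the function name `run_ucl`, the arm means passed to it, the seed, and the horizon are all hypothetical choices made for the example. The sketch assumes unit sampling variance, approximates the uninformative prior by sampling each arm once before applying the heuristic, and uses the fixed quantile $1 - \delta/2$ for δ-sufficing and $\alpha_t = 1/t$ for the standard objective.

```python
import math
import random
from statistics import NormalDist


def run_ucl(means, T, delta=None, seed=0):
    """Simulate standard UCL (delta=None) or delta-sufficing UCL on Gaussian arms.

    Returns (pull counts per arm, cumulative expected regret, number of switches).
    """
    rng = random.Random(seed)
    N = len(means)
    counts, sums = [0] * N, [0.0] * N
    best = max(means)
    regret, switches, prev = 0.0, 0, None
    for t in range(1, T + 1):
        if t <= N:
            arm = t - 1  # initial round: sample every arm once (assumption)
        else:
            p = 1.0 - delta / 2.0 if delta is not None else 1.0 - 1.0 / t
            z = NormalDist().inv_cdf(p)
            # Empirical mean plus z / sqrt(n): heuristic with unit sampling variance.
            q = [sums[i] / counts[i] + z / math.sqrt(counts[i]) for i in range(N)]
            arm = max(range(N), key=q.__getitem__)
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
        if prev is not None and arm != prev:
            switches += 1
        prev = arm
    return counts, regret, switches
```

Calling `run_ucl(means, T)` and `run_ucl(means, T, delta=...)` with the same hypothetical means and averaging over several seeds gives a rough way to compare regret and switching between the two objectives.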
The reduction in exploring that comes with sufficing can be advantageous when exploring is costly, e.g., when there is a cost associated with making a switch from one arm to another. Figure 3 suggests that this reduction in cost may require little sacrifice in reward. The upper panel plots mean cumulative reward from 100 simulations for both the standard bandit problem (Problem 1) and the δ-sufficing problem (Problem 3). The curve for δ-sufficing is slightly below that for the standard bandit problem, showing that it results in slightly lower cumulative rewards, but the difference is insignificant in comparison to the overall magnitude of the cumulative rewards. The lower panel plots the mean cumulative number of switches between arms for both algorithms and shows that δ-sufficing requires roughly half as much exploration.

We next consider the satisficing objectives with thresholding in the instantaneous rewards. Figure 4 presents a simulation that demonstrates the equivalence result of Lemma 5 for the robust bandit (Problem 5) using the robust UCL algorithm. The happiness threshold $M$ was set equal to 2. As in Figures 1 and 2, the mean rewards $m_i$ were set equal to [ ], but for this simulation the standard deviations were set equal to [ ]. So the standardized mean rewards were $x$ = [ ] and $i^* = 3$ was the optimal arm, i.e., the arm with maximal happiness probability. Figure 4 shows mean cumulative regret from 100 simulations (solid line) and the regret bound (dashed line

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 62, NO. 8, AUGUST 2017, p. 3788. Satisficing in Multi-Armed Bandit Problems. Paul Reverdy, Member, IEEE, Vaibhav Srivastava, and Naomi Ehrich Leonard, Fellow, IEEE.


More information
