Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary


H. Brendan McMahan and Avrim Blum
Carnegie Mellon University, Pittsburgh, PA, 15213

Abstract. We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this problem we are given a bounded set $S \subseteq \mathbb{R}^n$ of feasible points. At each time step $t$, the online algorithm must select a point $x^t \in S$ while simultaneously an adversary selects a cost vector $c^t \in \mathbb{R}^n$. The algorithm then incurs cost $c^t \cdot x^t$. Kalai and Vempala show that even if $S$ is exponentially large (or infinite), so long as we have an efficient algorithm for the offline problem (given $c \in \mathbb{R}^n$, find $x \in S$ to minimize $c \cdot x$) and so long as the cost vectors are bounded, one can efficiently solve the online problem of performing nearly as well as the best fixed $x \in S$ in hindsight. The Kalai-Vempala algorithm assumes that the cost vectors $c^t$ are given to the algorithm after each time step. In the bandit version of the problem, the algorithm only observes its cost, $c^t \cdot x^t$. Awerbuch and Kleinberg [2] give an algorithm for the bandit version for the case of an oblivious adversary, and an algorithm that works against an adaptive adversary for the special case of the shortest path problem. They leave open the problem of handling an adaptive adversary in the general case. In this paper, we solve this open problem, giving a simple online algorithm for the bandit problem in the general case in the presence of an adaptive adversary. Ignoring a (polynomial) dependence on $n$, we achieve a regret bound of $O(T^{3/4}\sqrt{\ln T})$.

1 Introduction

Kalai and Vempala [1] give an elegant, efficient algorithm for a broad class of online optimization problems. In their setting, we have an arbitrary (bounded) set $S \subseteq \mathbb{R}^n$ of feasible points. At each time step $t$, an online algorithm must select a point $x^t \in S$ and simultaneously an adversary selects a cost vector $c^t \in \mathbb{R}^n$ (throughout the paper we use superscripts to index iterations). The algorithm then observes $c^t$ and incurs cost $c^t \cdot x^t$.

Kalai and Vempala show that so long as we have an efficient algorithm for the offline problem (given $c \in \mathbb{R}^n$, find $x \in S$ to minimize $c \cdot x$) and so long as the cost vectors are bounded, we can efficiently solve the online problem of performing nearly as well as the best fixed $x \in S$ in hindsight. This generalizes the classic expert advice problem, because we do not require the set $S$ to be represented explicitly: we just need an efficient oracle for selecting the best $x \in S$ in hindsight. Further, it decouples the number of experts from the underlying dimensionality $n$ of the decision set, under the assumption that the cost of a decision is a linear function of $n$ features of the decision. The standard experts setting can be recovered by letting $S = \{e_1, \ldots, e_n\}$, the columns of the $n \times n$ identity matrix.
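As a concrete illustration of the offline problem and of the experts special case, here is a minimal Python sketch. It assumes $S$ is small enough to list explicitly, which the paper does not require; the names `offline_oracle` and `best_fixed_in_hindsight` and the cost values are hypothetical and chosen only for this example.

```python
def dot(c, x):
    """Inner product c . x of two equal-length sequences."""
    return sum(ci * xi for ci, xi in zip(c, x))

def offline_oracle(S, c):
    """Return a point of S minimizing c . x.

    Brute force over an explicit finite S (an assumption of this sketch);
    for shortest paths one would call a shortest-path routine instead.
    """
    return min(S, key=lambda x: dot(c, x))

def best_fixed_in_hindsight(S, costs):
    """Best single decision against the summed cost vector c^{1:T}."""
    total = [sum(col) for col in zip(*costs)]
    x = offline_oracle(S, total)
    return x, dot(total, x)

# Experts setting: S is the set of columns of the n x n identity matrix.
n = 3
experts = [tuple(1.0 if i == j else 0.0 for i in range(n)) for j in range(n)]
costs = [(0.2, 0.9, 0.5), (0.7, 0.1, 0.5), (0.6, 0.2, 0.5)]
best, opt = best_fixed_in_hindsight(experts, costs)
print(best, opt)
```

With $S$ equal to the identity columns, $c \cdot e_j$ is just the $j$-th coordinate of $c$, so the oracle reduces to picking the expert with the smallest total cost.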

A problem that fits naturally into this framework is an online shortest path problem where we repeatedly travel between two points $a$ and $b$ in some graph whose edge costs change each day (say, due to traffic). In this case, we can view the set of paths as a set $S$ of points in a space of dimension equal to the number of edges in the graph, and $c^t$ is simply the vector of edge costs on day $t$. Even though the number of paths in a graph can be exponential in the number of edges (i.e., the set $S$ is of exponential size), since we can solve the shortest path problem for any given set of edge lengths, we can apply the Kalai-Vempala algorithm. (Note that a different algorithm for the special case of the online shortest path problem is given by Takimoto and Warmuth [3].)

A natural generalization of the above problem, considered by Awerbuch and Kleinberg [2], is to imagine that rather than being given the entire cost vector $c^t$, the algorithm is simply told the cost incurred $c^t \cdot x^t$. For example, in the case of shortest paths, rather than being told the lengths of all edges at time $t$, this would correspond to just being told the total time taken to reach the destination. Thus, this is the bandit version of the Kalai-Vempala setting. Awerbuch and Kleinberg present two results: an algorithm for the general problem in the presence of an oblivious adversary, and an algorithm for the special case of the shortest path problem that works in the presence of an adaptive adversary. The difference between the two adversaries is that an oblivious adversary must commit to the entire sequence of cost vectors in advance, whereas an adaptive adversary may determine the next cost vector based on the online algorithm's play (and hence, the information the algorithm received) in the previous time steps. Thus, an adaptive adversary is in essence playing a repeated game. They leave open the question of achieving good regret guarantees for an adaptive adversary in the general setting.

In this paper we solve the open question of [2], giving an algorithm for the general bandit setting in the presence of an adaptive adversary. Moreover, our method is significantly simpler than the special-purpose algorithm of Awerbuch and Kleinberg for shortest paths. Our bounds are somewhat worse: we achieve regret bounds of $O(T^{3/4}\sqrt{\ln T})$ compared to the $O(T^{2/3})$ bounds of [2]. We believe improvement in this direction may be possible, and present some discussion of this issue at the end of the paper.

The basic idea of our approach is as follows. We begin by noticing that the only history information used by the Kalai-Vempala algorithm in determining its action at time $t$ is the sum $c^{1:t-1} = \sum_{\tau=1}^{t-1} c^\tau$ of all cost vectors received so far (we use this abbreviated notation for sums over iteration indexes throughout the paper). Furthermore, the way this is used in the algorithm is by adding random noise $\mu$ to this vector, and then calling the offline oracle to find the $x^t \in S$ that minimizes $(c^{1:t-1} + \mu) \cdot x^t$. So, if we can design a bandit algorithm that produces an estimate $\hat c^{1:t-1}$ of $c^{1:t-1}$, and show that with high probability even an adaptive adversary will not cause $\hat c^{1:t-1}$ to differ too substantially from $c^{1:t-1}$, we can then argue that the distribution $\hat c^{1:t-1} + \mu$ is close enough to $c^{1:t-1} + \mu$ for the Kalai-Vempala analysis to apply.

In fact, to make our analysis a bit more general, so that we could potentially use other algorithms as subroutines, we will argue a little differently. Let $\mathrm{OPT}(c) = \min_{x \in S} (c \cdot x)$. We will show that with high probability, $\mathrm{OPT}(\hat c^{1:T})$ is close to $\mathrm{OPT}(c^{1:T})$ and $\hat c^{1:T}$ satisfies conditions needed for the subroutine to achieve low regret on $\hat c^{1:T}$. This means that our subroutine, which believes it has seen $\hat c^{1:T}$, will achieve performance on $\hat c^{1:T}$ close to $\mathrm{OPT}(c^{1:T})$. We then finish off by arguing that our performance on $c^{1:T}$ is close to its performance on $\hat c^{1:T}$.
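The perturb-then-optimize step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $S$ is an explicit finite list, the noise is drawn uniformly from $[0, 1/\varepsilon]^n$ as in the appendix's description of FPL, and the function names are hypothetical.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def offline_oracle(S, c):
    # Brute-force stand-in for an efficient offline optimizer over S.
    return min(S, key=lambda x: dot(c, x))

def perturbed_leader_step(S, cum_cost, eps, rng):
    """One decision: add fresh noise mu to the cumulative cost, call the oracle.

    cum_cost plays the role of c^{1:t-1} (or of its estimate in the bandit case).
    """
    mu = [rng.uniform(0.0, 1.0 / eps) for _ in cum_cost]
    perturbed = [c + m for c, m in zip(cum_cost, mu)]
    return offline_oracle(S, perturbed)

S = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rng = random.Random(0)
# With a very large eps the noise is negligible and the leader is played.
print(perturbed_leader_step(S, [3.0, 1.0, 2.0], eps=1e9, rng=rng))
```

Because the decision depends on the history only through the cumulative vector, feeding the step an estimate that is close to the true sum changes the perturbed objective only slightly, which is the intuition the analysis makes precise.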

The behavior of the bandit algorithm will in fact be fairly simple. We begin by choosing a basis $B$ of (at most) $n$ points in $S$ to use for sampling (we address the issue of how $B$ is chosen when we describe our algorithm in detail). Then, at each time step $t$, with probability $\gamma$ we explore by playing a random basis element, and otherwise (with probability $1 - \gamma$) we exploit by playing according to the Kalai-Vempala algorithm. For each basis element $b_j$, we use our cost incurred while exploring with that basis element, scaled by $n/\gamma$, as an estimate of $c^{1:t-1} \cdot b_j$. Using martingale tail inequalities, we argue that even an adaptive adversary cannot make our estimate differ too wildly from the true value of $c^{1:t-1} \cdot b_j$, and use this to show that after matrix inversion, our estimate $\hat c^{1:t-1}$ is close to its correct value with high probability.

2 Problem Formalization

We can now fully formalize the problem. First, however, we establish a few notational conventions. As mentioned previously, we use superscripts to index iterations (or rounds) of our algorithm, and use the abbreviated summation notation $c^{1:t}$ when summing variables over iterations. Vector quantities are indicated in bold, and subscripts index into vectors or sets. Hats (such as $\hat c^t$) denote estimates of the corresponding actual quantities. The variables and constants used in the paper are summarized in Table (1).

As mentioned above, we consider the setting of [1] in which we have an arbitrary (bounded) set $S \subseteq \mathbb{R}^n$ of feasible points. At each time step $t$, the online algorithm $\mathcal{G}$ must select a point $x^t \in S$ and simultaneously an adversary selects a cost vector $c^t \in \mathbb{R}^n$. The algorithm then incurs cost $c^t \cdot x^t$. Unlike [1], however, rather than being told $c^t$, the algorithm simply learns its cost $c^t \cdot x^t$.

For simplicity, we assume a fixed adaptive adversary $\mathcal{A}$ and time horizon $T$ for the duration of this paper. Since our choice of algorithm parameters depends on $T$, we assume $T$ is known to the algorithm (see footnote 1). We refer to the sequence of decisions made by the algorithm so far as a decision history, which can be written $h^t = [x^1, \ldots, x^t]$. Let $H^*$ be the set of all possible decision histories of length 0 through $T - 1$.
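Without loss of generality (e.g., see [5]), we assume our adaptive adversary is deterministic, as specified by a function $\mathcal{A} : H^* \to \mathbb{R}^n$, a mapping from decision histories to cost vectors. Thus, $\mathcal{A}(h^{t-1}) = c^t$ is the cost vector for timestep $t$. We can view our online decision problem as a game, where on each iteration $t$ the adversary $\mathcal{A}$ selects a new cost vector $c^t$ based on $h^{t-1}$, and the online algorithm $\mathcal{G}$ selects a decision $x^t \in S$ based on its past plays and observations, and possibly additional hidden state or randomness. Then, $\mathcal{G}$ pays $c^t \cdot x^t$ and observes this cost.

For our analysis, we assume an $L_1$ bound on $S$, namely $\|x\|_1 \le D/2$ for all $x \in S$, so $\|x - y\|_1 \le D$ for all $x, y \in S$. We also assume that $|c \cdot x| \le M$ for all $x \in S$ and all $c$ played by $\mathcal{A}$. We also assume $S$ is full rank; if it is not, we simply project to a lower-dimensional representation. Some of these assumptions can be lifted or modified, but this set of assumptions simplifies the analysis.

For a fixed decision history $h^T$ and cost history $k^T = (c^1, \ldots, c^T)$, we define $\mathrm{loss}(h^T, k^T) = \sum_{t=1}^{T} (c^t \cdot x^t)$. For a randomized algorithm $\mathcal{G}$ and adversary $\mathcal{A}$, we define the random variable $\mathrm{loss}(\mathcal{G}, \mathcal{A})$ to be $\mathrm{loss}(h^T, k^T)$, where $h^T$ is drawn from the distribution over histories defined by $\mathcal{G}$ and $\mathcal{A}$, and $k^T = (\mathcal{A}(h^0), \ldots, \mathcal{A}(h^{T-1}))$. When it is clear from context, we will omit the dependence on $\mathcal{A}$, writing only $\mathrm{loss}(\mathcal{G})$.

Footnote 1: We can remove this requirement by guessing $T$, and doubling the guess each time we play longer than expected (see, for example, Theorem 6.4 from [4]).

Our goal is to define an online algorithm with low regret. That is, we want a guarantee that the total loss incurred will, in expectation, not be much larger than the optimal strategy in hindsight against the cost sequence we actually faced. To formalize this, first define an oracle $\mathcal{R} : \mathbb{R}^n \to S$ that solves the offline optimization problem, $\mathcal{R}(c) = \arg\min_{x \in S} (c \cdot x)$. We then define $\mathrm{OPT}(k^T) = \mathcal{R}(c^{1:T}) \cdot c^{1:T}$. Similarly, $\mathrm{OPT}(\mathcal{G}, \mathcal{A})$ is the random variable $\mathrm{OPT}(k^T)$ when $k^T$ is generated by playing $\mathcal{A}$ against $\mathcal{G}$. We again drop the dependence on $\mathcal{A}$ and $\mathcal{G}$ when it is clear from context. Formally, we define expected regret as

$E[\mathrm{loss}(\mathcal{G}, \mathcal{A}) - \mathrm{OPT}(\mathcal{G}, \mathcal{A})] = E[\mathrm{loss}(\mathcal{G}, \mathcal{A})] - E\left[\min_{x \in S} \sum_{t=1}^{T} (c^t \cdot x)\right]$   (1)

Note that the $E[\mathrm{OPT}(\mathcal{G}, \mathcal{A})]$ term corresponds to applying the min operator separately to each possible cost history to find the best fixed decision with respect to that particular cost history, and then taking the expectation with respect to these histories. In [5], an alternative weaker definition of regret is given. We discuss relationships between the definitions in Appendix B.

3 Algorithm

  Choose parameters $\gamma$ and $\varepsilon$, where $\varepsilon$ is a parameter of GEX
  $t = 1$
  Fix a basis $B = \{b_1, \ldots, b_n\} \subseteq S$
  while playing do
    Let $\chi^t = 1$ with probability $\gamma$ and $\chi^t = 0$ otherwise
    if $\chi^t = 0$ then
      Select $x^t$ from the distribution GEX$(\hat c^1, \ldots, \hat c^{t-1})$
      Incur cost $z^t = c^t \cdot x^t$
      $\hat c^t = 0 \in \mathbb{R}^n$
    else
      Draw $j$ uniformly at random from $\{1, \ldots, n\}$
      $x^t = b_j$
      Incur and observe cost $z^t = c^t \cdot x^t$
      Define $\hat\ell^t$ by $\hat\ell^t_i = 0$ for $i \ne j$ and $\hat\ell^t_j = \frac{n}{\gamma} z^t$
      $\hat c^t = (B^\dagger)^{-1} \hat\ell^t$
    end if
    $\hat c^{1:t} = \hat c^{1:t-1} + \hat c^t$
    $t = t + 1$
  end while

Algorithm 1: BGA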
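The pseudocode of Algorithm 1 can be turned into a short runnable sketch. The version below is an illustration, not the authors' implementation: it assumes an explicit finite $S$, uses a brute-force oracle, plugs in a perturbed-leader rule as a stand-in for GEX, and solves $B^\dagger \hat c = \hat\ell$ with a small Gaussian elimination routine. All function names (`solve`, `gex_fpl`, `bga`) are hypothetical.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gex_fpl(S, cum_cost, eps, rng):
    """Stand-in for GEX: perturb the estimated cumulative cost, then optimize."""
    perturbed = [c + rng.uniform(0.0, 1.0 / eps) for c in cum_cost]
    return min(S, key=lambda x: dot(perturbed, x))

def bga(S, basis, adversary, T, gamma, eps, seed=0):
    """Run the explore/exploit loop for T rounds.

    basis: list of n points b_1..b_n from S (its rows form B transposed).
    adversary: function from the decision history to the next cost vector.
    Returns the total loss and the cumulative estimate c_hat^{1:T}.
    """
    rng = random.Random(seed)
    n = len(basis)
    c_hat_sum = [0.0] * n
    history, total_loss = [], 0.0
    for _ in range(T):
        c = adversary(history)            # hidden from the algorithm
        if rng.random() < gamma:          # explore: chi^t = 1
            j = rng.randrange(n)
            x = basis[j]
            z = dot(c, x)                 # the only feedback observed
            l_hat = [0.0] * n
            l_hat[j] = (n / gamma) * z
            c_hat = solve(basis, l_hat)   # solves b_i . c_hat = l_hat_i for all i
            c_hat_sum = [s + e for s, e in zip(c_hat_sum, c_hat)]
        else:                             # exploit: chi^t = 0, c_hat^t = 0
            x = gex_fpl(S, c_hat_sum, eps, rng)
            z = dot(c, x)
        total_loss += z
        history.append(x)
    return total_loss, c_hat_sum

S = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
loss, est = bga(S, S, lambda h: (0.5, 0.1, 0.9), T=500, gamma=0.2, eps=0.01)
print(loss, est)
```

The adversary here ignores the history, so it is oblivious; any function of `history` could be substituted to model an adaptive one. Note that the cost vector `c` is used only to compute the scalar `z`, which mirrors the bandit feedback model.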

We introduce an algorithm we call BGA, standing for Bandit-style Geometric decision algorithm against an Adaptive adversary. The algorithm alternates between playing decisions from a fixed basis to get unbiased estimates of costs, and playing (hopefully) good decisions based on those estimates. In order to determine the good decisions to play, it uses some online geometric optimization algorithm for the full observation problem. We denote this algorithm by GEX (Geometric Experts algorithm). The implementation of GEX we analyze is based on the FPL algorithm of Kalai and Vempala [1]; we detail this implementation and analysis in Appendix A. However, other algorithms could be used, for example the algorithm of Zinkevich [6] when $S$ is convex. We view GEX as a function from the sequence of previous cost vectors $(\hat c^1, \ldots, \hat c^{t-1})$ to distributions over decisions.

Pseudocode for our algorithm is given in Algorithm (1). On each timestep, we make decision $x^t$. With probability $(1 - \gamma)$, BGA plays a recommendation $x^t = \hat x^t \in S$ from GEX. With probability $\gamma$, we ignore $\hat x^t$ and play a basis decision, $x^t = b_i$, chosen uniformly at random from a sampling basis $B = \{b_1, \ldots, b_n\}$. The indicator variable $\chi^t$ is 1 on exploration iterations and 0 otherwise.

Our sampling basis $B$ is an $n \times n$ matrix with columns $b_i \in S$, so we can write $x = Bw$ for any $x \in \mathbb{R}^n$ and weights $w \in \mathbb{R}^n$. For a given cost vector $c^t$, let $\ell^t = B^\dagger c^t$ (the superscript $\dagger$ indicates transpose). This is the vector of decision costs for the basis decisions, so $\ell^t_i = c^t \cdot b_i$. We define $\hat\ell^t$, an estimate of $\ell^t$, as follows: let $\hat\ell^t = 0 \in \mathbb{R}^n$ on exploitation iterations. If on an exploration iteration we play $b_j$, then $\hat\ell^t$ is the vector where $\hat\ell^t_i = 0$ for $i \ne j$ and $\hat\ell^t_j = \frac{n}{\gamma}(c^t \cdot b_j)$. Note that $c^t \cdot b_j$ is the observed quantity, the cost of basis decision $b_j$. On each iteration, we estimate $c^t$ by $\hat c^t = (B^\dagger)^{-1} \hat\ell^t$. It is straightforward to show that $\hat\ell^t$ is an unbiased estimate of basis decision costs and that $\hat c^t$ is an unbiased estimate of $c^t$ on each timestep $t$.

The choice of the sampling basis plays an important role in the analysis of our algorithm. In particular, we use a barycentric spanner, introduced in [2].
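The unbiasedness claim made above can be checked in two lines. The derivation below is a sketch that conditions on everything that happened before timestep $t$ (written "past" here), so that $c^t$ is fixed; it uses only the facts that coordinate $i$ is sampled with probability $\gamma / n$ and that the observed cost is scaled by $n / \gamma$.

```latex
\begin{align*}
E\bigl[\hat\ell^t_i \mid \text{past}\bigr]
  &= \Bigl(1 - \tfrac{\gamma}{n}\Bigr)\cdot 0
     + \tfrac{\gamma}{n}\cdot\tfrac{n}{\gamma}\,(c^t \cdot b_i)
   = c^t \cdot b_i = \ell^t_i, \\
E\bigl[\hat c^t \mid \text{past}\bigr]
  &= (B^\dagger)^{-1}\, E\bigl[\hat\ell^t \mid \text{past}\bigr]
   = (B^\dagger)^{-1} B^\dagger c^t = c^t .
\end{align*}
```

The second line uses only linearity of expectation, which is why any invertible basis gives an unbiased estimate; the particular choice of basis matters for the size of the estimates, not for their mean.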
A barycentric spanner $B = \{b_1, \ldots, b_n\}$ is a basis for $S$ such that $b_i \in S$ and for all $x \in S$ we can write $x = Bw$ with coefficients $w_i \in [-1, 1]$. It may not be easy to find exact barycentric spanners in all cases, but [2] proves they always exist and gives an algorithm for finding 2-approximate barycentric spanners (where the weights $w_i \in [-2, 2]$), which is sufficient for our purposes.

4 Analysis

4.1 Preliminaries

At each time step, BGA either (with probability $1 - \gamma$) plays the recommendation $\hat x^t$ from GEX, or else (with probability $\gamma$) plays a random basis vector from $B$. For purposes of analysis, however, it will be convenient to imagine that we request a recommendation $\hat x^t$ from GEX on every iteration, and also that we randomly pick a basis element to explore, $\hat b^t \in \{b_1, \ldots, b_n\}$, on each iteration. We then decide to play either $\hat x^t$ or $\hat b^t$ based on the outcome $\chi^t$ of a coin of bias $\gamma$. Thus, the complete history of the algorithm is specified by the algorithm history $G^{t-1} = [\chi^1, \hat x^1, \hat b^1, \chi^2, \hat x^2, \hat b^2, \ldots, \chi^{t-1}, \hat x^{t-1}, \hat b^{t-1}]$, which encodes all previous random choices. The sample space for all probabilities and expectations is the set of all possible algorithm histories of length $T$. Thus, for a given adversary $\mathcal{A}$, the various random variables and vectors we consider, such as $x^t$, $c^t$, $\hat c^t$, $\hat x^t$, and others, can all be viewed as functions on the set of possible algorithm histories. Unless otherwise stated, our expectations and probabilities are with respect to the distribution over these histories. A partial history $G^{t-1}$ can be viewed as a subset of the sample space (an event) consisting of all complete histories that have $G^{t-1}$ as a prefix. We frequently consider conditional distributions and corresponding expectations with respect to partial algorithm histories. For instance, if we condition on a history $G^{t-1}$, the random variables $c^1, \ldots, c^t$, $\ell^1, \ldots, \ell^t$, $\hat\ell^1, \ldots, \hat\ell^{t-1}$, $\hat c^1, \ldots, \hat c^{t-1}$, $x^1, \ldots, x^{t-1}$, and $\chi^1, \ldots, \chi^{t-1}$ are fully determined.

Table 1. Summary of notation

  $S \subseteq \mathbb{R}^n$                         set of decisions, a compact subset of $\mathbb{R}^n$
  $D$                                                $L_1$ bound on diameter of $S$: for all $x, y \in S$, $\|x - y\|_1 \le D$
  $n$                                                dimension of decision space
  $h^t$                                              decision history, $h^t = x^1, \ldots, x^t$
  $H^*$                                              set of possible decision histories
  $\mathcal{A} : H^* \to \mathbb{R}^n$               adversary, function from decision histories to cost vectors
  $\mathcal{G}$                                      an online optimization algorithm
  $G^{t-1}$                                          history of BGA randomness for timesteps 1 through $t - 1$
  $c^t$                                              cost vector on time $t$
  $\hat c^t$                                         BGA's estimate of the cost vector on time $t$
  $M$                                                bound on single-iteration cost, $|c^t \cdot x^t| \le M$
  $B \subseteq S$                                    sampling basis $B = \{b_1, \ldots, b_n\}$
  $\beta$                                            matrix max norm on $(B^\dagger)^{-1}$
  $\ell^t \in [-M, M]^n$                             vector, $\ell^t_i = c^t \cdot b_i$ for $b_i \in B$
  $\hat\ell^t$                                       BGA's estimate of $\ell^t$
  $T$                                                end of time, index of final iteration
  $x^t \in S$                                        BGA's decision on time $t$
  $\hat x^t \in S$                                   decision recommended by GEX on time $t$
  $\chi^t \in \{0, 1\}$                              indicator, $\chi^t = 1$ if BGA explores on $t$, 0 otherwise
  $\gamma \in [0, 1]$                                the probability BGA explores on each timestep
  $z^t \in [-M, M]$                                  BGA's loss on iteration $t$, $z^t = c^t \cdot x^t$
  $\hat z^t \in [-R, R]$                             loss of GEX, $\hat z^t = \hat c^t \cdot \hat x^t$

We now outline the general structure of our argument. Let $\hat z^t = \hat c^t \cdot \hat x^t$ be the loss perceived by GEX on iteration $t$. In keeping with earlier definitions, $\mathrm{loss}(\mathrm{BGA}) = z^{1:T}$ and $\mathrm{loss}(\mathrm{GEX}) = \hat z^{1:T}$. We also let $\mathrm{OPT} = \mathrm{OPT}(\mathrm{BGA}, \mathcal{A}) = \mathcal{R}(c^{1:T}) \cdot c^{1:T}$, the performance of the best post-hoc decision, and similarly $\widehat{\mathrm{OPT}} = \mathrm{OPT}(\hat c^1, \ldots, \hat c^T) = \mathcal{R}(\hat c^{1:T}) \cdot \hat c^{1:T}$. The base of our analysis is a bound on the loss of GEX with respect to the cost vectors $\hat c^t$ of the form

$E[\mathrm{loss}(\mathrm{GEX})] \le E[\widehat{\mathrm{OPT}}] + (\text{terms})$   (2)

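Such a result is given in Appendix A, and follows from an adaptation of the analysis from [1]. We then prove statements having the general form

$E[\mathrm{loss}(\mathrm{BGA})] \le E[\mathrm{loss}(\mathrm{GEX})] + (\text{terms})$   (3)

and

$E[\widehat{\mathrm{OPT}}] \le E[\mathrm{OPT}] + (\text{terms})$   (4)

These statements connect our real loss to the imaginary loss of GEX, and similarly connect the loss of the best decision in GEX's imagined world with the loss of the best decision in the real world. Combining the results corresponding to Equations (2), (3), and (4) leads to an overall bound on the regret of BGA.

4.2 High Probability Bounds on Estimates

We prove a bound on the accuracy of BGA's estimates $\hat\ell^t$, and use this to show a relationship between $\mathrm{OPT}$ and $\widehat{\mathrm{OPT}}$ of the form in Equation (4). Define random variables $e^0 = 0$ and $e^t = \ell^t - \hat\ell^t$. We are really interested in the corresponding sums $e^{1:t}$, where $e^{1:t}_i$ is the total error in our estimate of $c^{1:t} \cdot b_i$. We now bound $|e^{1:t}_i|$.

Theorem 1. For $\lambda > 0$,

$\Pr\left[\,|e^{1:t}_i| \ge \lambda \frac{nM}{\gamma} \sqrt{t}\,\right] \le 2e^{-\lambda^2/2}$

Proof. It is sufficient to show the sequence $e^0_i, e^1_i, e^{1:2}_i, e^{1:3}_i, \ldots, e^{1:T}_i$ of random variables is a bounded martingale sequence with respect to the filter $G^0, G^1, \ldots, G^T$; that is, $E[e^{1:t}_i \mid G^{t-1}] = e^{1:t-1}_i$. The result then follows from Azuma's Inequality (see, for example, [7]).

First, observe that $e^{1:t}_i = \ell^t_i - \hat\ell^t_i + e^{1:t-1}_i$. Further, the cost vector $c^t$ is determined if we know $G^{t-1}$, and so $\ell^t_i$ is also fixed. Thus, accounting for the $\gamma/n$ probability we explore a particular basis decision $b_i$, we have

$E[e^{1:t}_i \mid G^{t-1}] = \frac{\gamma}{n}\left(\ell^t_i - \frac{n}{\gamma}\ell^t_i + e^{1:t-1}_i\right) + \left(1 - \frac{\gamma}{n}\right)\left(\ell^t_i - 0 + e^{1:t-1}_i\right) = e^{1:t-1}_i$

and so we conclude that the $e^{1:t}_i$ form a martingale sequence. Notice that $|e^{1:t}_i - e^{1:t-1}_i| = |\ell^t_i - \hat\ell^t_i|$. If we don't sample, $\hat\ell^t_i = 0$ and so $|e^{1:t}_i - e^{1:t-1}_i| \le M$. If we do sample, we have $\hat\ell^t_i = \frac{n}{\gamma}\ell^t_i$, and so $|e^{1:t}_i - e^{1:t-1}_i| \le \frac{n}{\gamma}M$. This bound is worse, so it holds in both cases. The result now follows from Azuma's inequality. ∎

Let $\beta = \|(B^\dagger)^{-1}\|_\infty$, a matrix $L_\infty$-norm on $(B^\dagger)^{-1}$, so that for any $w$, $\|(B^\dagger)^{-1} w\|_\infty \le \beta \|w\|_\infty$.

Corollary 1. For $\delta \in (0, 1]$, and all $t$ from 1 to $T$,

$\Pr\left[\,\|\hat c^{1:t} - c^{1:t}\|_\infty \ge \beta J(\delta, \gamma) \sqrt{t}\,\right] \le \delta$

where $J(\delta, \gamma) = \frac{1}{\gamma} n M \sqrt{2 \ln(2n/\delta)}$.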

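Proof. Solving $\delta/n = 2e^{-\lambda^2/2}$ yields $\lambda = \sqrt{2 \ln(2n/\delta)}$, and then using this value in Theorem (1) gives

$\Pr\left[\,|e^{1:t}_i| \ge J(\delta, \gamma)\sqrt{t}\,\right] \le \delta/n$

for all $i \in \{1, 2, \ldots, n\}$. Then,

$\Pr\left[\,\|e^{1:t}\|_\infty \ge J(\delta, \gamma)\sqrt{t}\,\right] \le \sum_{i=1}^{n} \Pr\left[\,|e^{1:t}_i| \ge J(\delta, \gamma)\sqrt{t}\,\right] \le \delta$

by the union bound. Now, notice that we can relate $\hat\ell^{1:t}$ and $\hat c^{1:t}$ by

$(B^\dagger)^{-1} \hat\ell^{1:t} = (B^\dagger)^{-1} \sum_{\tau=1}^{t} \hat\ell^\tau = \sum_{\tau=1}^{t} (B^\dagger)^{-1} \hat\ell^\tau = \sum_{\tau=1}^{t} \hat c^\tau = \hat c^{1:t}$

and similarly for $\ell^{1:t}$ and $c^{1:t}$. Then

$\Pr\left[\,\|\hat c^{1:t} - c^{1:t}\|_\infty \ge \beta J(\delta, \gamma)\sqrt{t}\,\right] = \Pr\left[\,\|(B^\dagger)^{-1}(\hat\ell^{1:t} - \ell^{1:t})\|_\infty \ge \beta J(\delta, \gamma)\sqrt{t}\,\right]$
$\le \Pr\left[\,\beta \|e^{1:t}\|_\infty \ge \beta J(\delta, \gamma)\sqrt{t}\,\right] = \Pr\left[\,\|e^{1:t}\|_\infty \ge J(\delta, \gamma)\sqrt{t}\,\right] \le \delta$ ∎

We can now prove our main result for the section, a statement of the form of Equation (4) relating $\mathrm{OPT}$ and $\widehat{\mathrm{OPT}}$:

Theorem 2. If we play $\mathcal{A}$ against BGA for $T$ timesteps,

$E[\widehat{\mathrm{OPT}}] \le E[\mathrm{OPT}] + (1 - \delta)\,\frac{3}{2}\, D \beta J(\delta, \gamma) \sqrt{T} + \delta M T$

Proof. Let $\Phi = \hat c^{1:T} - c^{1:T}$. By definition of $\mathcal{R}$, $\mathcal{R}(\hat c^{1:T}) \cdot \hat c^{1:T} \le \mathcal{R}(c^{1:T}) \cdot \hat c^{1:T}$, or equivalently $\mathcal{R}(c^{1:T} + \Phi) \cdot (c^{1:T} + \Phi) \le \mathcal{R}(c^{1:T}) \cdot (c^{1:T} + \Phi)$, and so by expanding and rearranging we have

$\mathcal{R}(c^{1:T} + \Phi) \cdot c^{1:T} - \mathcal{R}(c^{1:T}) \cdot c^{1:T} \le \left(\mathcal{R}(c^{1:T}) - \mathcal{R}(c^{1:T} + \Phi)\right) \cdot \Phi \le D \|\Phi\|_\infty$   (5)

Then,

$|\mathrm{OPT} - \widehat{\mathrm{OPT}}| = \left|\mathcal{R}(c^{1:T}) \cdot c^{1:T} - \mathcal{R}(c^{1:T} + \Phi) \cdot (c^{1:T} + \Phi)\right|$
$\le \left|\left(\mathcal{R}(c^{1:T}) - \mathcal{R}(c^{1:T} + \Phi)\right) \cdot c^{1:T}\right| + \left|\mathcal{R}(c^{1:T} + \Phi) \cdot \Phi\right|$
$\le (D + D/2)\, \|\Phi\|_\infty$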

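where we have used Equation (5). Recall from Section (2), we assume $\|x\|_1 \le D/2$ for all $x \in S$, so $\|x - y\|_1 \le D$ for all $x, y \in S$. The theorem follows by applying the bound on $\Phi$ given by Corollary (1), and then observing that the above relationship holds for at least a $1 - \delta$ fraction of the possible algorithm histories. For the other $\delta$ fraction, the difference might be as much as $\delta M T$. Writing the overall expectation as the sum of two expectations conditioned on whether or not the bound holds gives the result. ∎

4.3 Relating the Loss of BGA and its GEX Subroutine

Now we prove a statement like Equation (3), relating $\mathrm{loss}(\mathrm{BGA})$ to $\mathrm{loss}(\mathrm{GEX})$.

Theorem 3. If we run BGA with parameter $\gamma$ against $\mathcal{A}$ for $T$ timesteps,

$E[\mathrm{loss}(\mathrm{BGA})] \le (1 - \gamma)\, E[\mathrm{loss}(\mathrm{GEX})] + \gamma M T$

Proof. For a given adversary $\mathcal{A}$, $G^{t-1}$ fully determines the sequence of cost vectors given to algorithm GEX. So, we can view GEX as a function from $G^{t-1}$ to probability distributions over $S$. If we present a cost vector $\hat c^t$ to GEX, then the expected cost to GEX given history $G^{t-1}$ is $\sum_{\hat x \in S} \Pr(\hat x \mid G^{t-1})(\hat c^t \cdot \hat x)$. If we define $\bar x^t = \sum_{\hat x \in S} \Pr(\hat x \mid G^{t-1})\, \hat x$, we can re-write the expected loss of GEX against $\hat c^t$ as $\hat c^t \cdot \bar x^t$; that is, we can view GEX as incurring the cost of some convex combination of the possible decisions in expectation. Let $\hat\ell^{t|j}$ be $\hat\ell^t$ given that we explore by playing basis vector $b_j$ on time $t$, and similarly let $\hat c^{t|j} = (B^\dagger)^{-1} \hat\ell^{t|j}$. Observe that $\hat\ell^{t|j}_i = \frac{n}{\gamma}\ell^t_i$ for $j = i$ and 0 otherwise, and so

$\sum_{j=1}^{n} \hat\ell^{t|j} = \frac{n}{\gamma}\, \ell^t = \frac{n}{\gamma}\, B^\dagger c^t$   (6)

Now, we can write

$E[\hat z^t \mid G^{t-1}] = (1 - \gamma) \cdot 0 + \gamma \sum_{j=1}^{n} \frac{1}{n} \sum_{\hat x \in S} \Pr(\hat x \mid G^{t-1})\left(\hat c^{t|j} \cdot \hat x\right)$
$= \frac{\gamma}{n} \sum_{j=1}^{n} \hat c^{t|j} \cdot \bar x^t$
$= \frac{\gamma}{n} \left[(B^\dagger)^{-1} \sum_{j=1}^{n} \hat\ell^{t|j}\right] \cdot \bar x^t$

and using Equation (6),

$= \frac{\gamma}{n} \left[(B^\dagger)^{-1} \frac{n}{\gamma} B^\dagger c^t\right] \cdot \bar x^t = c^t \cdot \bar x^t.$

Now, we consider the conditional expectation of $z^t$ and see that

$E[z^t \mid G^{t-1}] = (1 - \gamma)(c^t \cdot \bar x^t) + \gamma \sum_{i=1}^{n} \frac{1}{n}(c^t \cdot b_i) \le (1 - \gamma)\, E[\hat z^t \mid G^{t-1}] + \gamma M$   (7)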

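Then we have,

$E[z^t] = E\left[E[z^t \mid G^{t-1}]\right] \le E\left[(1 - \gamma)\, E[\hat z^t \mid G^{t-1}] + \gamma M\right] = (1 - \gamma)\, E\left[E[\hat z^t \mid G^{t-1}]\right] + \gamma M = (1 - \gamma)\, E[\hat z^t] + \gamma M$   (8)

by using the inequality from Equation (7). The theorem follows by summing the inequality (8) over $t$ from 1 to $T$ and applying linearity of expectation. ∎

4.4 A Bound on the Expected Regret of BGA

Theorem 4. If we run BGA with parameter $\gamma$ using subroutine GEX with parameter $\varepsilon$ (as defined in Appendix A), then for all $\delta \in (0, 1]$,

$E[\mathrm{loss}(\mathrm{BGA})] \le E[\mathrm{OPT}] + O\left( D \beta\, \frac{1}{\gamma}\, n M \sqrt{2 \ln(2n/\delta)}\, \sqrt{T} + \delta M T + \frac{\varepsilon n^3 M^2 T}{\gamma^2} + \frac{n}{\varepsilon} + \gamma M T \right)$

Proof. In Appendix A, we show an algorithm to plug in for GEX, based on the FPL algorithm of [1], and give bounds on regret against a deterministic adaptive adversary. We first show how to apply that analysis to GEX running as a subroutine to BGA.

First, we need to bound $|\hat c^t \cdot \hat x^t|$. By definition, for any $x \in S$, we can write $x = Bw$ for weights $w$ with $w_i \in [-1, 1]$ (or $[-2, 2]$ if it is an approximate barycentric spanner). Note that $\|\hat\ell^t\|_1 \le \frac{n}{\gamma} M$, and for any $x \in S$, we can write $x$ as $Bw$ where $w_i \in [-2, 2]$. Thus,

$|\hat c^t \cdot \hat x^t| = |(B^\dagger)^{-1} \hat\ell^t \cdot Bw| = |(\hat\ell^t)^\dagger B^{-1} B w| = |\hat\ell^t \cdot w| \le \|\hat\ell^t\|_1 \|w\|_\infty \le \frac{2nM}{\gamma}.$

Let $R = 2nM/\gamma$.

Suppose at the beginning of time we fix the random decisions of BGA that are not made by GEX, that is, we fix a sequence $X = [\chi^1, \hat b^1, \ldots, \chi^T, \hat b^T]$. Fixing this randomness together with $\mathcal{A}$ determines a new deterministic adaptive adversary $\hat{\mathcal{A}}$ that GEX is effectively playing against. To see this, let $\hat h^{t-1} = [\hat x^1, \ldots, \hat x^{t-1}]$. If we combine $\hat h^{t-1}$ with the information in $X$, it fully determines a partial history $G^{t-1}$. If we let $h^{t-1} = [x^1, \ldots, x^{t-1}]$ be the partial decision history that can be recovered from $G^{t-1}$, then $\hat{\mathcal{A}}(\hat h^{t-1}) = \hat c^t$, the estimate that BGA computes from $\chi^t$, $\hat b^t$, and the cost vector $\mathcal{A}(h^{t-1})$: it is 0 when $\chi^t = 0$, and when $\chi^t = 1$ and $\hat b^t = b_j$ it is $(B^\dagger)^{-1}$ applied to the vector whose only nonzero entry is $\frac{n}{\gamma}\left(\mathcal{A}(h^{t-1}) \cdot b_j\right)$ in coordinate $j$. Thus, when GEX is run as a subroutine of BGA, we can apply Lemma (3) from the Appendix and conclude

$E[\mathrm{loss}(\mathrm{GEX}) \mid X] \le E[\widehat{\mathrm{OPT}} \mid X] + \varepsilon (4n + 2) R^2 T + \frac{4n}{\varepsilon}$   (9)

For the remainder of this proof, we use big-Oh notation to simplify the presentation. Now, taking the expectation of both sides of Equation (9),

$E[\mathrm{loss}(\mathrm{GEX})] \le E[\widehat{\mathrm{OPT}}] + O\left(\varepsilon n R^2 T + \frac{n}{\varepsilon}\right)$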

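Applying Theorem (3),

$E[\mathrm{loss}(\mathrm{BGA})] \le (1 - \gamma)\left(E[\widehat{\mathrm{OPT}}] + O\left(\varepsilon n R^2 T + \frac{n}{\varepsilon}\right)\right) + \gamma M T$

and then using Theorem (2) we have

$E[\mathrm{loss}(\mathrm{BGA})] \le (1 - \gamma)\left(E[\mathrm{OPT}] + O\left(J(\delta, \gamma)\, D \beta \sqrt{T} + \delta M T + \varepsilon n R^2 T + \frac{n}{\varepsilon}\right)\right) + \gamma M T$
$\le E[\mathrm{OPT}] + O\left( D \beta\, \frac{1}{\gamma}\, n M \sqrt{2 \ln(2n/\delta)}\, \sqrt{T} + \delta M T + \frac{\varepsilon n^3 M^2 T}{\gamma^2} + \frac{n}{\varepsilon} + \gamma M T \right)$

For the last line, note that while $E[\mathrm{OPT}]$ could be negative, it is still bounded by $MT$, and so this just adds another $\gamma M T$ term, which is captured in the big-Oh term. ∎

Ignoring the dependence on $n$, $M$, $D$, and $\beta$ and simplifying, we see BGA's expected regret is bounded by

$E[\mathrm{regret}(\mathrm{BGA})] = O\left( \frac{\sqrt{T \ln(1/\delta)}}{\gamma} + \delta T + \frac{\varepsilon T}{\gamma^2} + \frac{1}{\varepsilon} + \gamma T \right)$

Setting $\gamma = \delta = T^{-1/4}$ and $\varepsilon = T^{-3/4}$, every term is at most of order $T^{3/4}\sqrt{\ln T}$, and we get a bound on our loss of order $O(T^{3/4}\sqrt{\ln T})$.

5 Conclusions and Open Problems

We have presented a general algorithm for online optimization over an arbitrary set of decisions $S \subseteq \mathbb{R}^n$, and proved regret bounds for our algorithm that hold against an adaptive adversary.

A number of questions are raised by this work. In the flat bandits problem, bounds of the form $O(\sqrt{T})$ are possible against an adaptive adversary [4]. Against an oblivious adversary in the geometric case, a bound of $O(T^{2/3})$ is achieved in [2]. We achieve a bound of $O(T^{3/4}\sqrt{\ln T})$ for this problem against an adaptive adversary. In [4], lower bounds are given showing that the $O(\sqrt{T})$ result is tight, but no such bounds are known for the geometric decision-space problem. Can the $O(T^{3/4}\sqrt{\ln T})$ and possibly the $O(T^{2/3})$ bounds be tightened to $O(\sqrt{T})$? A related issue is the use of information received by the algorithm; our algorithm and the algorithm of [2] only use a $\gamma$ fraction of the feedback they receive, which is intuitively unappealing. It seems plausible that an algorithm can be found that uses all of the feedback, possibly achieving tighter bounds.

Acknowledgments

The authors wish to thank Adam Kalai, Geoff Gordon, Bobby Kleinberg, Tom Hayes, and Varsha Dani for useful conversations and correspondence. Funding provided by NSF grants CCR, NSF-ITR CCR, and NSF-ITR IIS.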

References

1. Kalai, A., Vempala, S.: Efficient algorithms for on-line optimization. In: Proceedings of the 16th Annual Conference on Learning Theory. (2003)
2. Awerbuch, B., Kleinberg, R.: Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches. In: Proceedings of the 36th ACM Symposium on Theory of Computing. (2004) To appear.
3. Takimoto, E., Warmuth, M.K.: Path kernels and multiplicative updates. In: Proceedings of the 15th Annual Conference on Computational Learning Theory. Lecture Notes in Artificial Intelligence, Springer (2002)
4. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32 (2002)
5. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: Gambling in a rigged casino: the adversarial multi-armed bandit problem. In: Proceedings of the 36th Annual Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA (1995)
6. Zinkevich, M.: Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings of the Twentieth International Conference on Machine Learning. (2003)
7. Motwani, R., Raghavan, P.: Randomized algorithms. Cambridge University Press (1995)

A Specification of a Geometric Experts Algorithm

In this section we point out how the FPL algorithm and analysis of [1] can be adapted to our setting to use as the GEX subroutine, and prove the corresponding bound needed for Theorem (4). In particular, we need a bound for an arbitrary $S \subseteq \mathbb{R}^n$ and arbitrary cost vectors, requiring only that on each timestep, $|c^t \cdot x| \le R$. Further, the bound must hold against an adaptive adversary.

FPL solves the online optimization problem when the entire cost vector $c^t$ is observed at each timestep. It maintains the sum $c^{1:t-1}$, and on each timestep plays decision $x^t = \mathcal{R}(c^{1:t-1} + \mu)$, where $\mu$ is chosen uniformly at random from $[0, 1/\varepsilon]^n$, given $\varepsilon$, a parameter of the algorithm. The analysis of FPL in [1] assumes positive cost vectors $c^t$ satisfying $\|c^t\|_1 \le A$, and positive decision vectors from $S$ with $\|x - y\|_1 \le D$ for all $x, y \in S$ and $|c \cdot x - c \cdot y| \le R$ for all cost vectors $c$ and $x, y \in S$. Further, the bounds proved are with respect to a fixed series of cost vectors, not an adaptive adversary.

We now show how to bridge the gap from these assumptions to our assumptions. First, we adapt an argument from [2], showing that by using our barycentric spanner basis, we can transform our problem into one where the assumptions of FPL are met. We then argue that a corresponding bound holds against an adaptive adversary.

Lemma 1. Let $S \subseteq \mathbb{R}^n$ be a set of (not necessarily positive) decisions, and $k = [c^1, \ldots, c^T]$ a set of cost vectors on those decisions, such that $|c^t \cdot x| \le R$ for all $x \in S$ and $c^t \in k$. Then, there is an algorithm $\mathcal{G}(\varepsilon)$ that achieves

$E[\mathrm{loss}(\mathcal{G}(\varepsilon), k)] \le \mathrm{OPT}(k) + \varepsilon (4n + 2) R^2 T + \frac{4n}{\varepsilon}$

Proof. This is an adaptation of the arguments of Appendix A of [2]. Fix a barycentric spanner $B = \{b_1, \ldots, b_n\}$ for $S$. Then, for each $x \in S$, let $x = Bw$ and define $f(x) = [-\sum_{i=1}^{n} w_i, w_1, \ldots, w_n]$. Let $f(S) = S'$. For each cost vector $c^t$ define $g(c^t) = [R, R + c^t \cdot b_1, \ldots, R + c^t \cdot b_n]$. It is straightforward to verify that $c^t \cdot x = g(c^t) \cdot f(x)$, and further $g(c^t) \ge 0$, $\|g(c^t)\|_1 \le (2n + 1) R$, and the difference in cost of any two decisions against a fixed $g(c^t)$ is at most $2R$. By definition of a barycentric spanner, $w_i \in [-1, 1]$ and so the $L_1$ diameter of $S'$ is at most $4n$.

Note the assumption of positive decision vectors in Theorem 1 of [1] can easily be lifted by additively shifting the space of decision vectors until it is positive. This changes the loss of the algorithm and of the best decision by the same amount, so additive regret bounds are unchanged. The result of this lemma then follows from the bound of Theorem 1 from [1]. ∎

We now need to extend the above bound to adaptive adversaries. The key point here is that the algorithm is self-oblivious. A self-oblivious algorithm always plays a decision from some distribution that depends only on the cost history so far and not on the outcome of its previous probabilistic choices. Thus, a self-oblivious algorithm can be viewed as a function from cost histories to distributions over decisions. For such algorithms, for any (possibly adaptive) adversary $\mathcal{A}$ there always exists an oblivious adversary that causes at least as much regret. The idea for the proof below is due to Adam Kalai (see footnote 2).

Footnote 2: We thank Tom Hayes and Varsha Dani for pointing out a bug in the proof we had in the original version of this paper.

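Lemma 2. Fix $T$, let $H^*$ be the set of decision histories of length 0 to $T - 1$, and let $K^*$ be the set of all cost histories of length 0 to $T - 1$. Then, fix a decision algorithm $\mathcal{G} : K^* \to \Delta(S)$, where $\Delta(S)$ is the set of probability distributions on the set $S$ of possible decisions. Let $\mathcal{A}$ be an arbitrary adversary. Define

$R(\mathcal{G}, \mathcal{A}) = E\left[\sum_{t=1}^{T} c^t \cdot x^t - \min_{x \in S} \sum_{t=1}^{T} c^t \cdot x\right]$

Then, there exists an oblivious adversary $\mathcal{A}'$ such that $R(\mathcal{G}, \mathcal{A}') \ge R(\mathcal{G}, \mathcal{A})$.

Proof. An adversary is $k$-oblivious if its first $k$ costs are chosen obliviously; note all adversaries are 1-oblivious. Let $\mathcal{A}$ be an arbitrary adversary, and suppose it is $k$-oblivious. If $k = T$, we are done. Otherwise, let $c_o^1, \ldots, c_o^k$ be the first $k$ (obliviously chosen) costs selected by $\mathcal{A}$. Expectations are over the random variables $x^1, \ldots, x^T$ and $c^1, \ldots, c^T$ when $\mathcal{A}$ plays against $\mathcal{G}$, though in this case $c^1, \ldots, c^k$ are fully determined. Let $K^T = (c^1, \ldots, c^T)$, the random vector corresponding to the cost history. Let $g(K^T) = \min_{x \in S} \sum_{t=1}^{T} c^t \cdot x$. Using linearity of expectation, we can split the expected regret $R(\mathcal{G}, \mathcal{A})$ into 3 terms:

$E\left[\sum_{t=1}^{k} c_o^t \cdot x^t\right] + E\left[c^{k+1} \cdot x^{k+1}\right] + E\left[\sum_{t=k+2}^{T} c^t \cdot x^t - g(K^T)\right]$

Since $\mathcal{G}$ and $c^1, \ldots, c^k$ are fixed, $\bar x = E[x^{k+1}] = E[\mathcal{G}(c_o^1, \ldots, c_o^k)]$ is also known. Since $\mathcal{A}$ is only $k$-oblivious, it gets to pick $c^{k+1}$ with knowledge of $x^1, \ldots, x^k$. We have

$\Pr(c^{k+1}) = \sum_{x^1, \ldots, x^k} \Pr(x^1, \ldots, x^k)\, I[\mathcal{A}(x^1, \ldots, x^k) = c^{k+1}]$

where $I$ is an indicator function, returning 1 if $\mathcal{A}(x^1, \ldots, x^k) = c^{k+1}$ and zero otherwise. The probability $\Pr(x^1, \ldots, x^k)$ is well defined because $\mathcal{A}$ and $\mathcal{G}$ are fixed. Importantly, note that the distribution over $c^{k+1}$ is independent of the distribution over $x^{k+1}$; this follows from the assumption that $\mathcal{G}$ is self-oblivious, that is, it picks its distributions based only on the past cost vectors, not on its own actions. Thus, letting $L_k = E\left[\sum_{t=1}^{k} c_o^t \cdot x^t\right]$ we can write

$R(\mathcal{G}, \mathcal{A}) = L_k + \bar x \cdot E[c^{k+1}] + E\left[\sum_{t=k+2}^{T} c^t \cdot x^t - g(K^T)\right]$   (10)

$= L_k + \int_{c^{k+1}} \Pr(c^{k+1}) \left( c^{k+1} \cdot \bar x + E\left[\sum_{t=k+2}^{T} c^t \cdot x^t - g(K^T) \,\Big|\, c^{k+1}\right] \right) dc^{k+1}$   (11)

$\le L_k + \sup_{c^{k+1}} \left( c^{k+1} \cdot \bar x + E\left[\sum_{t=k+2}^{T} c^t \cdot x^t - g(K^T) \,\Big|\, c^{k+1}\right] \right)$   (12)

where the sup is over all $c^{k+1}$ with $\Pr(c^{k+1}) > 0$. Observe that the quantity inside the supremum is well defined before any costs or decisions are selected, and so $\mathcal{A}$ could do at least as well by selecting $c^{k+1}$ obliviously to be some $c$ that achieves the supremum. Thus, there is a $(k + 1)$-oblivious adversary that causes at least as much regret as $\mathcal{A}$. Extending this result inductively, we conclude there is a fully oblivious ($T$-oblivious) adversary $\mathcal{A}'$ such that $R(\mathcal{G}, \mathcal{A}') \ge R(\mathcal{G}, \mathcal{A})$. ∎

Lemma 3. The regret bound from Lemma 1 applies even if the adversary is adaptive.

Proof. First, observe that as long as FPL re-randomizes at each timestep, it is self-oblivious, and so Lemma 2 applies. Suppose some adaptive adversary $\mathcal{A}$ causes regret that exceeds the bound in Lemma 1. We can apply Lemma 2 to $\mathcal{A}$ and construct an oblivious $\mathcal{A}'$ that also exceeds the bound, a contradiction. ∎

Thus, we can use $\mathcal{G}(\varepsilon)$ as our GEX subroutine for full-observation online geometric optimization.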
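The change of variables used in the proof of Lemma 1, which is what lets this subroutine meet the assumptions of FPL, can be checked on a tiny instance. The sketch below uses exact rational arithmetic; the two-dimensional basis, the weights, the cost vector, and the value of $R$ are made-up illustration values, and `combine`, `f`, and `g` are hypothetical helper names mirroring the maps in the proof.

```python
from fractions import Fraction

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def combine(basis, w):
    """x = B w, where the basis vectors b_i are the columns of B."""
    n = len(basis[0])
    return [sum(wi * b[k] for wi, b in zip(w, basis)) for k in range(n)]

def f(w):
    """Transformed decision [-sum_i w_i, w_1, ..., w_n]."""
    return [-sum(w)] + list(w)

def g(c, basis, R):
    """Transformed cost [R, R + c.b_1, ..., R + c.b_n]."""
    return [R] + [R + dot(c, b) for b in basis]

basis = [(1, 1), (1, -1)]            # b_1, b_2 (illustrative spanner)
w = [Fraction(1), Fraction(-1)]      # weights in [-1, 1]
x = combine(basis, w)                # x = b_1 - b_2
c = (3, -2)
R = 5                                # |c . x| <= R holds for b_1, b_2 and x

original = dot(c, x)
transformed = dot(g(c, basis, R), f(w))
print(original, transformed, g(c, basis, R))
```

The identity holds because the $-R \sum_i w_i$ contribution of the first coordinate cancels the $R$ offsets added to the remaining coordinates, leaving $\sum_i w_i (c \cdot b_i) = c \cdot Bw$.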

B Notions of Regret

In [5], an alternative definition of regret is given, namely,

$E[\mathrm{loss}(h^T, k^T)] - \min_{x \in S} E\left[\sum_{t=1}^{T} c^t \cdot x\right]$   (13)

This definition is equivalent to ours in the case of an oblivious adversary, but against an adaptive adversary the best decision for this definition is not the best decision for a particular decision history, but the best decision if the decision must be chosen before a cost history is selected according to the distribution over such histories. In particular,

$E\left[\min_{x \in S} \sum_{t=1}^{T} c^t \cdot x\right] \le \min_{x \in S} E\left[\sum_{t=1}^{T} c^t \cdot x\right]$

and so a bound on Equation (1) is at least as strong as a bound on Equation (13). In fact, bounds on Equation (13) can be very poor when the adversary is adaptive. There are natural examples where the stronger definition (1) gives regret $O(T)$ while the weaker definition (13) indicates no regret. Adapting an example from [5], let $S = \{e_1, \ldots, e_n\}$ (the flat bandit setting) and consider the algorithm $\mathcal{G}$ that plays uniformly at random from $S$. The adversary $\mathcal{A}$ gives $c^1 = 0$, and if $\mathcal{G}$ then plays $e_i$ on the first iteration, thereafter the adversary plays the cost vector $c^t$ where $c^t_i = 0$ and $c^t_j = 1$ for $j \ne i$. The expected loss of $\mathcal{G}$ is $\frac{n-1}{n}(T - 1)$, since the first round costs nothing. For regret as defined by Equation (13), $\min_{x \in S} E[c^{1:T} \cdot x] = \frac{n-1}{n}(T - 1)$, indicating no regret, while $E[\min_{x \in S}(c^{1:T} \cdot x)] = 0$, and so the stronger definition indicates $O(T)$ regret.

Unfortunately, this implies that proof techniques for bounds on expected weak regret, like those in [4] and [2], cannot be used to get bounds on regret as defined by Equation (1). The problem is that even if we have unbiased estimates of the costs, these cannot be used to evaluate the term $E\left[\min_{x \in S} \sum_{t=1}^{T} (c^t \cdot x)\right]$ in (1) because min is a nonlinear operator. We surmount this problem by proving high-probability bounds on our estimates of $c^t$, which allows us to use a union bound to evaluate the expectation over the min operator. Note that the high probability bounds proved in [4] and [2] can be seen as corresponding to our definition of expected regret.
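The gap between the two definitions in the flat-bandit example can be computed exactly by enumerating the $n$ equally likely first plays. The sketch below does this with rational arithmetic; the function name `regret_notions` and the values $n = 4$, $T = 101$ are chosen only for illustration.

```python
from fractions import Fraction

def regret_notions(n, T):
    """Exact (strong, weak) expected regret for the uniform-play example.

    Round 1 has cost vector 0. If the algorithm first plays e_i, every later
    cost vector has 0 in coordinate i and 1 elsewhere.
    """
    p = Fraction(1, n)                   # probability of each first play
    exp_loss = Fraction(0)               # E[loss of uniform play]
    exp_fixed = [Fraction(0)] * n        # E[c^{1:T} . e_k] for each fixed k
    exp_min = Fraction(0)                # E[min_x c^{1:T} . x]
    for i in range(n):
        # Cumulative cost vector realized when the first play is e_i.
        cum = [0 if k == i else T - 1 for k in range(n)]
        # Uniform play pays 1 with probability (n-1)/n on rounds 2..T.
        exp_loss += p * (T - 1) * Fraction(n - 1, n)
        exp_fixed = [e + p * ck for e, ck in zip(exp_fixed, cum)]
        exp_min += p * min(cum)
    strong = exp_loss - exp_min          # regret as in Equation (1)
    weak = exp_loss - min(exp_fixed)     # regret as in Equation (13)
    return strong, weak

strong, weak = regret_notions(4, 101)
print(strong, weak)  # 75 and 0
```

For every realized cost history there is a zero-cost decision, so the expectation of the minimum is 0, while every decision fixed in advance has the same expected cost as uniform play; this is the nonlinearity of min that the high-probability argument is designed to handle.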
