Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence


Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric
INRIA Lille - Nord Europe, Team SequeL
{victor.gabillon,mohammad.ghavamzadeh,alessandro.lazaric}@inria.fr

Abstract

We study the problem of identifying the best arm(s) in the stochastic multi-armed bandit setting. This problem has been studied in the literature from two different perspectives: fixed budget and fixed confidence. We propose a unifying approach that leads to a meta-algorithm called unified gap-based exploration (UGapE), with a common structure and similar theoretical analysis for these two settings. We prove a performance bound for the two versions of the algorithm showing that the two problems are characterized by the same notion of complexity. We also show how the UGapE algorithm, as well as its theoretical analysis, can be extended to take into account the variance of the arms and to multiple bandits. Finally, we evaluate the performance of UGapE and compare it with a number of existing fixed budget and fixed confidence algorithms.

1 Introduction

The problem of best arm(s) identification [6, 3, 1] in the stochastic multi-armed bandit setting has recently received much attention. In this problem, a forecaster repeatedly selects an arm and observes a sample drawn from its reward distribution during an exploration phase, and is then asked to return the best arm(s). Unlike the standard multi-armed bandit problem, where the goal is to maximize the cumulative sum of rewards obtained by the forecaster (see e.g., [15, 2]), in this problem the forecaster is evaluated on the quality of the arm(s) returned at the end of the exploration phase. This abstract problem models a wide range of applications. For instance, consider a company that has K different variants of a product and needs to identify the best one(s) before actually placing it on the market. The company sets up a testing phase in which the products are tested by potential customers. Each customer tests one product at a time and gives it a score (a reward). The objective of the company is to return a product at the end of the test phase which is likely to be successful once placed on the market (i.e., best arm identification); it is not interested in the scores collected during the test phase (i.e., the cumulative reward).

The problem of best arm(s) identification has been studied in two distinct settings in the literature.

Fixed budget. In the fixed budget setting (see e.g., [3, 1]), the number of rounds of the exploration phase is fixed and known by the forecaster, and the objective is to maximize the probability of returning the best arm(s). In the above example, the company fixes the length of the test phase beforehand (e.g., enrolls a fixed number of customers) and defines a strategy to choose which products to show to the testers, so that the final selected product is the best with the highest probability. Audibert et al. [1] proposed two different strategies to solve this problem. They defined a strategy based on upper confidence bounds, called UCB-E, whose optimal parameterization is strictly related to a measure of the complexity of the problem. They also introduced an elimination algorithm, called Successive Rejects, which divides the budget n into phases and discards one arm per phase. Both algorithms were shown to have nearly optimal probability of returning the best arm. Deng et al. [5] and Gabillon et al. [8] considered the extension of the best arm identification problem to the

multi-bandit setting, where the objective is to return the best arm for each bandit. Recently, Bubeck et al. [4] extended the previous results to the problem of m-best arm identification and introduced a new version of the Successive Rejects algorithm (with accept and reject) that is able to return the set of the m best arms with high probability.

Fixed confidence. In the fixed confidence setting (see e.g., [12, 6]), the forecaster tries to minimize the number of rounds needed to achieve a fixed confidence about the quality of the returned arm(s). In the above example, the company keeps enrolling customers in the test until it is, e.g., 95% confident that the best product has been identified. Maron & Moore [12] considered a slightly different setting where, besides a fixed confidence, the maximum number of rounds is also fixed. They designed an elimination algorithm, called Hoeffding Races, based on progressively discarding the arms that are suboptimal with enough confidence. Mnih et al. [14] introduced an improved algorithm, built on the Bernstein concentration inequality, which takes into account the empirical variance of each arm. Even-Dar et al. [6] studied the fixed confidence setting without any budget constraint and designed an elimination algorithm able to return an arm with a required accuracy ε (i.e., whose performance is at least ε-close to the optimal arm). Kalyanakrishnan & Stone [10] further extended this approach to the case where the m best arms must be returned with a given confidence. Finally, Kalyanakrishnan et al. [11] recently introduced an algorithm for the case of m-best arm identification, along with a thorough theoretical analysis showing the number of rounds needed to achieve the desired confidence.

Although the fixed budget and fixed confidence problems have been studied separately, they display several similarities. In this paper, we propose a unified approach to these two settings in the general case of m-best arm identification with accuracy ε (note that when ε = 0 and m = 1, this reduces to the standard best arm identification problem). The main contributions of the paper can be summarized as follows:

Algorithm. In Section 3, we propose a novel meta-algorithm, called unified gap-based exploration (UGapE), which uses the same arm selection and (arm) return strategies for the two settings. This algorithm allows us to solve settings that have not been covered in previous work (e.g., the case of ε > 0 has not been studied in the fixed budget setting). Furthermore, we show in Appendix C of [7] that UGapE outperforms existing algorithms in some settings (e.g., it improves the performance of the algorithm by Mnih et al. [14] in the fixed confidence setting). We also provide a thorough empirical evaluation of UGapE and compare it with a number of existing fixed budget and fixed confidence algorithms in Appendix C of [7].

Theoretical analysis. Similar to the algorithmic contribution, in Section 4, we show that a large portion of the theoretical analysis required to study the behavior of the two settings of the UGapE algorithm can be unified in a series of lemmas. The final theoretical guarantees are thus a direct consequence of these lemmas when used in the two specific settings.

Problem complexity. In Section 4.4, we show that the theoretical analysis indicates that the two problems share exactly the same definition of complexity. In particular, we show that the probability of success in the fixed budget setting, as well as the sample complexity in the fixed confidence setting, strictly depends on the inverse of the gaps of the arms and the desired accuracy ε.

Extensions.
Finally, in Appendix B of [7], we discuss how the proposed algorithm and analysis can be extended to improved definitions of the confidence interval (e.g., Bernstein-based bounds) and to more complex settings, such as the multi-bandit best arm identification problem introduced in [8].

2 Problem Formulation

In this section, we introduce the notation used throughout the paper. Let $A = \{1, \ldots, K\}$ be the set of arms, such that each arm $k \in A$ is characterized by a distribution $\nu_k$ bounded in $[0, b]$ with mean $\mu_k$ and variance $\sigma_k^2$. We define the $m$-max and $m$-argmax operators as

$$\mu_{(m)} = \max_{k \in A}{}^{(m)} \mu_k \quad \text{and} \quad (m) = \operatorname*{arg\,max}_{k \in A}{}^{(m)} \mu_k,$$

where $(m)$ denotes the index of the $m$-th best arm in $A$ and $\mu_{(m)}$ is its corresponding mean, so that $\mu_{(1)} \geq \mu_{(2)} \geq \ldots \geq \mu_{(K)}$. We denote by $S^m \subseteq A$ any subset of $m$ arms (i.e., $|S^m| = m < K$) and by $S^{m,*}$ the subset of the $m$ best arms (i.e., $k \in S^{m,*}$ iff $\mu_k \geq \mu_{(m)}$; ties are broken in an arbitrary but consistent manner). Without loss of generality, we

assume there exists a unique set $S^{m,*}$. In the following, we drop the superscript $m$ and use $S = S^m$ and $S^* = S^{m,*}$ whenever $m$ is clear from the context. With a slight abuse of notation, we further extend the $m$-max operator to an operator returning a set of arms, such that

$$\{\mu_{(1)}, \ldots, \mu_{(m)}\} = \max_{k \in A}{}^{1..m} \mu_k \quad \text{and} \quad S^* = \operatorname*{arg\,max}_{k \in A}{}^{1..m} \mu_k.$$

For each arm $k \in A$, we define the gap $\Delta_k$ as

$$\Delta_k = \begin{cases} \mu_k - \mu_{(m+1)} & \text{if } k \in S^* \\ \mu_{(m)} - \mu_k & \text{if } k \notin S^*. \end{cases}$$

This definition of gap indicates that if $k \in S^*$, $\Delta_k$ represents the advantage of arm $k$ over the suboptimal arms, and if $k \notin S^*$, $\Delta_k$ denotes how suboptimal arm $k$ is. Note that we can also write the gap as $\Delta_k = \big|\max_{i \neq k}^{(m)} \mu_i - \mu_k\big|$. Given an accuracy $\epsilon$ and a number of arms $m$, we say that an arm $k$ is $(\epsilon, m)$-optimal if $\mu_k \geq \mu_{(m)} - \epsilon$. Thus, we define the $(\epsilon, m)$-best arm identification problem as the problem of finding a set $S$ of $m$ $(\epsilon, m)$-optimal arms.

The $(\epsilon, m)$-best arm identification problem can be formalized as a game between a stochastic bandit environment and a forecaster. The distributions $\{\nu_k\}$ are unknown to the forecaster. At each round $t$, the forecaster pulls an arm $I(t) \in A$ and observes an independent sample drawn from the distribution $\nu_{I(t)}$. The forecaster estimates the expected value of each arm by computing the average of the samples observed over time. Let $T_k(t)$ be the number of times that arm $k$ has been pulled by the end of round $t$; then the mean of this arm is estimated as $\hat\mu_k(t) = \frac{1}{T_k(t)} \sum_{s=1}^{T_k(t)} X_k(s)$, where $X_k(s)$ is the $s$-th sample observed from $\nu_k$. For any arm $k \in A$, we define the notion of arm simple regret as

$$r_k = \mu_{(m)} - \mu_k, \tag{1}$$

and for any set $S \subseteq A$ of $m$ arms, we define the simple regret as

$$r_S = \max_{k \in S} r_k = \mu_{(m)} - \min_{k \in S} \mu_k. \tag{2}$$

We denote by $\Omega(t) \subseteq A$ the set of $m$ arms returned by the forecaster at the end of the exploration phase (when the algorithm stops after $t$ rounds), and by $r_{\Omega(t)}$ its corresponding simple regret. Returning $(\epsilon, m)$-optimal arms is then equivalent to having $r_{\Omega(t)}$ smaller than $\epsilon$. Given an accuracy $\epsilon$ and a number $m$ of arms to return, we now formalize the two settings of fixed budget and fixed confidence.

Fixed budget. The objective is to design a forecaster capable of returning a set of $m$ $(\epsilon, m)$-optimal arms with the largest possible confidence using a fixed budget of $n$ rounds. More formally, given a budget $n$, the performance of the forecaster is measured by the probability $\delta$ of not meeting the $(\epsilon, m)$ requirement, i.e., $\delta = \mathbb{P}\big[r_{\Omega(n)} \geq \epsilon\big]$; the smaller $\delta$, the better the algorithm.

Fixed confidence. The goal is to design a forecaster that stops as soon as possible and returns a set of $m$ $(\epsilon, m)$-optimal arms with a fixed confidence. We denote by $\tilde n$ the time when the algorithm stops and by $\Omega(\tilde n)$ its set of returned arms. Given a confidence level $\delta$, the forecaster has to guarantee that $\mathbb{P}\big[r_{\Omega(\tilde n)} \geq \epsilon\big] \leq \delta$. The performance of the forecaster is then measured by the number of rounds $\tilde n$, either in expectation or in high probability.

Although these settings have been considered as two distinct problems, in Section 3 we introduce a unified arm selection strategy that can be used in both cases by simply changing the stopping criterion. Moreover, we show in Section 4 that the bounds on the performance of the algorithm in the two settings share the same notion of complexity and can be derived using very similar arguments.
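As an illustration of these definitions (not part of the original paper), the following minimal sketch computes the gaps $\Delta_k$ and the simple regret $r_S$ for a small instance with hypothetical means and $b = 1$:

```python
import numpy as np

# Hypothetical 6-arm instance with m = 2 arms to identify.
mu = np.array([0.9, 0.8, 0.7, 0.5, 0.4, 0.2])
m = 2

order = np.argsort(mu)[::-1]                 # arm indices by decreasing mean
mu_m, mu_m1 = mu[order[m - 1]], mu[order[m]] # mu_(m) and mu_(m+1)
S_star = set(order[:m])                      # S^{m,*}, the m best arms

# Gap: advantage over the (m+1)-th best arm if k is in S^{m,*},
# distance to the m-th best arm otherwise.
gaps = np.array([mu[k] - mu_m1 if k in S_star else mu_m - mu[k]
                 for k in range(len(mu))])

def simple_regret(S):
    """r_S = mu_(m) - min_{k in S} mu_k (Eq. 2)."""
    return mu_m - min(mu[k] for k in S)

print(gaps)                    # [0.2 0.1 0.1 0.3 0.4 0.6]
print(simple_regret({0, 2}))   # ~0.1: arm 2 sits 0.1 below mu_(2) = 0.8
```

Returning $S = \{0, 1\}$ in this instance would give $r_S = 0$, so that set is $(\epsilon, m)$-optimal for any $\epsilon \geq 0$.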
3 Unified Gap-based Exploration Algorithm

In this section, we describe the unified gap-based exploration (UGapE) meta-algorithm and show how it is implemented in the fixed-budget and fixed-confidence settings. As shown in Figure 1, both the fixed-budget (UGapEb) and fixed-confidence (UGapEc) instances of UGapE use the same arm-selection strategy, SELECT-ARM (described in Figure 2), and, upon stopping, return the m best arms in the same manner (using Ω). The two algorithms only differ in their stopping criteria. More precisely, both algorithms receive as input the definition of the problem (ε, m), a constraint (the

budget n in UGapEb and the confidence level δ in UGapEc), and a parameter (a or c). While UGapEb runs for n rounds and then returns the set of arms Ω(n), UGapEc runs until it achieves the desired accuracy ε with the requested confidence level δ. This difference is due to the two different objectives targeted by the algorithms: while UGapEc optimizes its budget for a given confidence level, UGapEb's goal is to optimize the quality of its recommendation for a fixed budget.

UGapEb(ε, m, n, a)
  Parameters: accuracy ε, number of arms m, budget n, exploration parameter a
  Initialize: Pull each arm k once, update $\hat\mu_k(K)$, and set $T_k(K) = 1$
  for t = K+1, ..., n do
    SELECT-ARM(t)
  end for
  Return $\Omega(n) = \arg\min_{J(t)} B_{J(t)}(t)$

UGapEc(ε, m, δ, c)
  Parameters: accuracy ε, number of arms m, confidence level δ, exploration parameter c
  Initialize: Pull each arm k once, update $\hat\mu_k(K)$, set $T_k(K) = 1$ and $t \leftarrow K+1$
  while $B_{J(t)}(t) \geq \epsilon$ do
    SELECT-ARM(t); $t \leftarrow t+1$
  end while
  Return $\Omega(t) = J(t)$

Figure 1: The pseudo-code for the UGapE algorithm in the fixed-budget (UGapEb) (left) and fixed-confidence (UGapEc) (right) settings.

Regardless of the final objective, how to select an arm at each round (the arm-selection strategy) is the key component of any multi-armed bandit algorithm. One of the most important features of UGapE is having a unique arm-selection strategy for the fixed-budget and fixed-confidence settings. We now describe UGapE's arm-selection strategy, whose pseudo-code is reported in Figure 2. At each time step t, UGapE first uses the observations up to time t−1 and computes an index

$$B_k(t) = \max_{i \neq k}{}^{(m)} U_i(t) - L_k(t)$$

for each arm $k \in A$, where, for all $t$ and all $k \in A$,

$$U_k(t) = \hat\mu_k(t-1) + \beta_k(t-1), \qquad L_k(t) = \hat\mu_k(t-1) - \beta_k(t-1). \tag{3}$$

SELECT-ARM(t)
  Compute $B_k(t)$ for each arm $k \in A$
  Identify the set of $m$ arms $J(t) = \arg\min^{1..m}_{k \in A} B_k(t)$
  Identify the arms $u_t = \arg\max_{j \notin J(t)} U_j(t)$ and $l_t = \arg\min_{i \in J(t)} L_i(t)$
  Pull the arm $I(t) = \arg\max_{k \in \{l_t, u_t\}} \beta_k(t-1)$
  Observe $X_{I(t)}\big(T_{I(t)}(t-1)+1\big) \sim \nu_{I(t)}$
  Update $\hat\mu_{I(t)}(t)$ and $T_{I(t)}(t)$

Figure 2: The pseudo-code for UGapE's arm-selection strategy. This routine is used in both the UGapEb and UGapEc instances of UGapE.

In Eq. 3, $\beta_k(t-1)$ is a confidence interval (more precisely, the width of a confidence interval, or a confidence radius), and $U_k(t)$ and $L_k(t)$ are high-probability upper and lower bounds on the mean of arm $k$, $\mu_k$, after $t-1$ rounds. Note that the parameters a and c are used in the definition of the confidence interval $\beta_k$, whose shape strictly depends on the concentration bound used by the algorithm. For example, we can derive $\beta_k$ from the Chernoff-Hoeffding bound as

$$\text{UGapEb:}\ \ \beta_k(t-1) = b\sqrt{\frac{a}{T_k(t-1)}}, \qquad \text{UGapEc:}\ \ \beta_k(t-1) = b\sqrt{\frac{c \log \frac{4K(t-1)^3}{\delta}}{T_k(t-1)}}. \tag{4}$$

In Section 4, we discuss how the parameters a and c can be tuned, and we show that while a should be tuned as a function of n and ε in UGapEb, c = 1/2 is always a good choice for UGapEc. Defining the confidence interval in a general form $\beta_k(t-1)$ allows us to easily extend the algorithm to take into account different (higher) moments of the arms (see Appendix B of [7] for the case of variance, where $\beta_k(t-1)$ is obtained from the Bernstein inequality).

From Eq. 3, we may see that the index $B_k(t)$ is an upper bound on the simple regret $r_k$ of the $k$-th arm (see Eq. 1). We also define an index for a set $S$ as $B_S(t) = \max_{i \in S} B_i(t)$. Similar to the arm index, $B_S$ is defined in order to upper-bound the simple regret $r_S$ with high probability (see Lemma 1). After computing the arm indices, UGapE finds a set of $m$ arms $J(t)$ with minimum upper bound on their simple regrets, i.e., $J(t) = \arg\min^{1..m}_{k \in A} B_k(t)$.
From $J(t)$, it computes two arm indices, $u_t = \arg\max_{j \notin J(t)} U_j(t)$ and $l_t = \arg\min_{i \in J(t)} L_i(t)$, where in both cases ties are broken in favor of the arm with the largest uncertainty $\beta(t-1)$.
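To make one round of SELECT-ARM concrete, here is a minimal sketch (not from the paper) of the index computation and the choice of $u_t$, $l_t$, and $I(t)$, using the fixed-budget radius of Eq. 4; the state values and the parameters a and b are hypothetical, and ties are broken by arm index rather than by largest uncertainty:

```python
import numpy as np

def select_arm(mu_hat, T, m, beta):
    """One SELECT-ARM step (Figure 2); returns the arm to pull and B_J(t)."""
    U, L = mu_hat + beta, mu_hat - beta          # bounds of Eq. 3
    K = len(mu_hat)
    # B_k(t): m-th largest upper bound among the other arms, minus L_k(t)
    B = np.array([np.sort(np.delete(U, k))[-m] - L[k] for k in range(K)])
    J = np.argsort(B)[:m]                        # m arms with smallest index
    out = np.setdiff1d(np.arange(K), J)
    u = out[np.argmax(U[out])]                   # best arm left outside J(t)
    l = J[np.argmin(L[J])]                       # worst arm inside J(t)
    I = u if beta[u] >= beta[l] else l           # pull the more uncertain one
    return I, B[J].max()

# Hypothetical state after a few rounds (K = 4, m = 2, a = 2, b = 1):
mu_hat = np.array([0.82, 0.75, 0.70, 0.30])
T = np.array([10, 10, 8, 5])
beta = 1.0 * np.sqrt(2.0 / T)                    # UGapEb radius of Eq. 4
arm, B_J = select_arm(mu_hat, T, m=2, beta=beta)  # here arm = 2 (= u_t)
```

UGapEc would additionally compare the returned $B_{J(t)}$ against ε to decide whether to stop, while UGapEb would simply iterate until the budget is exhausted.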

Arms $l_t$ and $u_t$ are the worst possible arm among those in $J(t)$ and the best possible arm left outside $J(t)$, respectively; together they represent how bad the choice of $J(t)$ could be. Intuitively, pulling the more uncertain of $u_t$ and $l_t$ allows the algorithm to reduce this potential error as quickly as possible. Finally, the algorithm selects and pulls the arm $I(t)$ with the larger $\beta(t-1)$ among $u_t$ and $l_t$, observes a sample $X_{I(t)}\big(T_{I(t)}(t-1)+1\big)$ from the distribution $\nu_{I(t)}$, and updates the empirical mean $\hat\mu_{I(t)}(t)$ and the number of pulls $T_{I(t)}(t)$ of the selected arm $I(t)$.

There are two more points that need to be discussed about the UGapE algorithm. 1) While UGapEc defines the set of returned arms as $\Omega(t) = J(t)$, UGapEb returns the set of arms $J(t)$ with the smallest index, i.e., $\Omega(n) = \arg\min_{J(t),\ t \in \{1,\ldots,n\}} B_{J(t)}(t)$. 2) UGapEc stops (we refer to the number of rounds before stopping as $\tilde n$) when $B_{J(\tilde n+1)}(\tilde n+1)$ is less than the given accuracy $\epsilon$, i.e., when even the worst upper bound on the arm simple regret among the arms in the selected set $J(\tilde n+1)$ is smaller than $\epsilon$. This guarantees that the simple regret (see Eq. 2) of the set returned by the algorithm, $\Omega(\tilde n) = J(\tilde n+1)$, is smaller than $\epsilon$ with probability larger than $1-\delta$.

4 Theoretical Analysis

In this section, we provide high-probability upper bounds on the performance of the two instances of the UGapE algorithm, UGapEb and UGapEc, introduced in Section 3. An important feature of UGapE is that, since its fixed-budget and fixed-confidence versions share the same arm-selection strategy, a large part of their theoretical analysis can be unified. We first report this unified part of the proof in Section 4.1, and then provide the final performance bound for each of the algorithms, UGapEb and UGapEc, separately, in Sections 4.2 and 4.3, respectively.

Before moving to the main results, we define additional notation used in the analysis. We first define the event E as

$$E = \big\{\forall k \in A,\ \forall t \in \{1, \ldots, T\},\ |\hat\mu_k(t) - \mu_k| < \beta_k(t)\big\}, \tag{5}$$

where the values of $T$ and $\beta_k$ are defined for each specific setting separately. Note that event E plays an important role in the sequel, since it allows us to first derive a series of results which are directly implied by E and to postpone the study of the stochastic nature of the problem (i.e., the probability of E) to the two specific settings. In particular, when E holds, we have that for any arm $k \in A$ and at any time $t$, $L_k(t) \leq \mu_k \leq U_k(t)$. Finally, we define the complexity of the problem as

$$H_\epsilon = \sum_{i=1}^{K} \frac{b^2}{\max\big(\frac{\Delta_i+\epsilon}{2},\ \epsilon\big)^2}. \tag{6}$$

Note that although the complexity has an explicit dependence on ε, it also depends on the number of arms m through the definition of the gaps $\Delta_i$, thus making it a complexity measure of the (ε, m) best arm identification problem. In Section 4.4, we discuss why the complexity of the two instances of the problem is measured by this quantity.
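Since the rest of the analysis revolves around $H_\epsilon$, it may help to see Eq. 6 computed explicitly; the following sketch (hypothetical gaps, b = 1) also previews the two per-arm regimes discussed in Section 4.4:

```python
import numpy as np

def complexity(gaps, eps, b=1.0):
    """H_eps = sum_i b^2 / max((Delta_i + eps)/2, eps)^2 (Eq. 6)."""
    return np.sum(b ** 2 / np.maximum((gaps + eps) / 2.0, eps) ** 2)

gaps = np.array([0.2, 0.1, 0.1, 0.3, 0.4, 0.6])
print(complexity(gaps, eps=0.15))
# An arm with gap 0.1 <= eps contributes b^2/eps^2, while the arm with
# gap 0.4 > eps contributes 4*b^2/(0.4 + eps)^2, as in Section 4.4.
```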
4.1 Analysis of the Arm-Selection Strategy

Here we report lower (Lemma 1) and upper (Lemma 2) bounds for the indices $B_S$ on the event E, which show their connection with the regret and the gaps. The technical lemmas used in the proofs (Lemmas 3 and 4 and Corollary 1) are reported in Appendix A of [7]. We first prove that for any set $S \neq S^*$ and any time $t \in \{1, \ldots, T\}$, the index $B_S(t)$ is an upper bound on the simple regret $r_S$ of this set.

Lemma 1. On event E, for any set $S \neq S^*$ and any time $t \in \{1, \ldots, T\}$, we have $B_S(t) \geq r_S$.

Proof. On event E, for any arm $i \notin S^*$ and each time $t \in \{1, \ldots, T\}$, we may write

$$B_i(t) = \max_{j \neq i}{}^{(m)} U_j(t) - L_i(t) = \max_{j \neq i}{}^{(m)} \big(\hat\mu_j(t-1) + \beta_j(t-1)\big) - \big(\hat\mu_i(t-1) - \beta_i(t-1)\big) \geq \max_{j \neq i}{}^{(m)} \mu_j - \mu_i = \mu_{(m)} - \mu_i = r_i. \tag{7}$$

Using Eq. 7, we have

$$B_S(t) = \max_{i \in S} B_i(t) \geq \max_{i \in (S \setminus S^*)} B_i(t) \geq \max_{i \in (S \setminus S^*)} r_i = r_S,$$

where the last passage follows from the fact that $r_i \leq 0$ for any $i \in S^*$.

Lemma 2. On event E, if arm $k \in \{l_t, u_t\}$ is pulled at time $t \in \{1, \ldots, T\}$, we have

$$B_{J(t)}(t) \leq \min\big(0,\ -\Delta_k + 2\beta_k(t-1)\big) + 2\beta_k(t-1). \tag{8}$$

Proof. We first prove the statement for $B(t) = U_{u_t}(t) - L_{l_t}(t)$, i.e.,

$$B(t) \leq \min\big(0,\ -\Delta_k + 2\beta_k(t-1)\big) + 2\beta_k(t-1). \tag{9}$$

We consider the following cases:

Case 1. $k = u_t$:

Case 1.1. $u_t \in S^*$: Since by definition $u_t \notin J(t)$, there exists an arm $j \notin S^*$ such that $j \in J(t)$. Now we may write

$$\mu_{(m+1)} \geq \mu_j \overset{(a)}{\geq} L_j(t) \overset{(b)}{\geq} L_{l_t}(t) \overset{(c)}{\geq} L_{u_t}(t) = \hat\mu_k(t-1) - \beta_k(t-1) \overset{(d)}{\geq} \mu_k - 2\beta_k(t-1). \tag{10}$$

(a) and (d) hold because of event E, (b) follows from the fact that $j \in J(t)$ and from the definition of $l_t$, and (c) is the result of Lemma 4. From Eq. 10, we may deduce that $-\Delta_k + 2\beta_k(t-1) \geq 0$, which together with Corollary 1 gives us the desired result (Eq. 9).

Case 1.2. $u_t \notin S^*$:

Case 1.2.1. $l_t \in S^*$: In this case, we may write

$$B(t) = U_{u_t}(t) - L_{l_t}(t) \overset{(a)}{\leq} \mu_{u_t} + 2\beta_{u_t}(t-1) - \mu_{l_t} + 2\beta_{l_t}(t-1) \overset{(b)}{\leq} \mu_{u_t} + 2\beta_{u_t}(t-1) - \mu_{(m)} + 2\beta_{l_t}(t-1) \overset{(c)}{\leq} -\Delta_{u_t} + 4\beta_{u_t}(t-1). \tag{11}$$

(a) holds because of event E, (b) follows from the fact that $l_t \in S^*$, and (c) holds because $u_t$ is pulled, and thus $\beta_{u_t}(t-1) \geq \beta_{l_t}(t-1)$. The final result follows from Eq. 11 and Corollary 1.

Case 1.2.2. $l_t \notin S^*$: Since $l_t \notin S^*$ and, by definition, $l_t \in J(t)$, there exists an arm $j \in S^*$ such that $j \notin J(t)$. Now we may write

$$\mu_{u_t} + 2\beta_{u_t}(t-1) \overset{(a)}{\geq} U_{u_t}(t) \overset{(b)}{\geq} U_j(t) \overset{(c)}{\geq} \mu_j \overset{(d)}{\geq} \mu_{(m)}. \tag{12}$$

(a) and (c) hold because of event E, (b) follows from the definition of $u_t$ and the fact that $j \notin J(t)$, and (d) holds because $j \in S^*$. From Eq. 12, we may deduce that $-\Delta_{u_t} + 2\beta_{u_t}(t-1) \geq 0$, which together with Corollary 1 gives us the final result (Eq. 9).

With similar arguments and cases, we prove the result of Eq. 9 for $k = l_t$. The final statement of the lemma (Eq. 8) follows directly from $B_{J(t)}(t) \leq B(t)$, as shown in Lemma 3.

Lemmas 1 and 2 thus provide upper and lower bounds on $B_{J(t)}$ in terms of quantities related to the regret of $J(t)$. Lemma 1 confirms the intuition that the B-values upper-bound the regret of the corresponding set of arms (with high probability). Unfortunately, this is not enough to claim that selecting $J(t)$ as the set of arms with the smallest B-values actually corresponds to choosing arms with small regret, since $B_{J(t)}$ could be an arbitrarily loose bound on the regret. Lemma 2 provides this complementary guarantee specifically for the set $J(t)$, in the form of an upper bound on $B_{J(t)}$ with respect to the gap of $k \in \{u_t, l_t\}$. This implies that as the algorithm runs, the choice of $J(t)$ becomes more and more accurate, since $B_{J(t)}$ is constrained between $r_{J(t)}$ and a quantity (Eq. 8) that gets smaller and smaller; the arms with the smallest B-values, i.e., the set $J(t)$, are thus eventually those which actually have the smallest regret, i.e., the arms in $S^*$. This argument is implicitly at the basis of the proofs of the two following theorems.

4.2 Regret Bound for the Fixed-Budget Setting

Here we prove an upper bound on the simple regret of UGapEb. Since the setting considered by the algorithm is fixed-budget, we may set $T = n$. From the definition of the confidence interval $\beta_i(t)$ in Eq. 4 and a union bound, we have that $\mathbb{P}(E) \geq 1 - 2Kn\exp(-2a)$. (The extension to a confidence interval that takes into account the variance of the arms is discussed in Appendix B of [7].) We now have all the tools needed to prove the performance of UGapEb for the (ε, m)-best arm identification problem.

Theorem 1. If we run UGapEb with parameter $0 < a \leq \frac{n-K}{4H_\epsilon}$, its simple regret $r_{\Omega(n)}$ satisfies

$$\delta = \mathbb{P}\big(r_{\Omega(n)} \geq \epsilon\big) \leq 2Kn\exp(-2a),$$

and in particular this probability is minimized for $a = \frac{n-K}{4H_\epsilon}$.

Proof. The proof is by contradiction. We assume that $r_{\Omega(n)} > \epsilon$ on event E and consider the following two steps:

Step 1: Here we show that on event E, we have the following upper bound on the number of pulls of any arm $i \in A$:

$$T_i(n) < \frac{4ab^2}{\max\big(\frac{\Delta_i+\epsilon}{2},\ \epsilon\big)^2} + 1. \tag{13}$$

Let $t_i$ be the last time that arm $i$ is pulled. If arm $i$ has been pulled only during the initialization phase, $T_i(n) = 1$ and Eq. 13 trivially holds. If $i$ has been selected by SELECT-ARM, then we have

$$\min\big(-\Delta_i + 2\beta_i(t_i-1),\ 0\big) + 2\beta_i(t_i-1) \overset{(a)}{\geq} B(t_i) \overset{(b)}{\geq} B_{J(t_i)}(t_i) \overset{(c)}{\geq} B_{\Omega(n)}(t_l) \overset{(d)}{>} \epsilon, \tag{14}$$

where $t_l \in \{1, \ldots, n\}$ is the time such that $\Omega(n) = J(t_l)$. (a) and (b) are the results of Lemmas 2 and 3, (c) is by the definition of $\Omega(n)$, and (d) holds because, using Lemma 1, we know that if the algorithm suffers a simple regret $r_{\Omega(n)} > \epsilon$ (as assumed at the beginning of the proof), then $B_{\Omega(n)}(t) > \epsilon$ for all $t = 1, \ldots, n+1$. By the definition of $t_i$, we know that $T_i(n) = T_i(t_i-1) + 1$. Using this fact, the definition of $\beta_i(t_i-1)$, and Eq. 14, it is straightforward to show that Eq. 13 holds.

Step 2: We know that $\sum_{i=1}^K T_i(n) = n$. Using Eq. 13, we have

$$\sum_{i=1}^K \frac{4ab^2}{\max\big(\frac{\Delta_i+\epsilon}{2},\ \epsilon\big)^2} + K > n$$

on event E. It is easy to see that by selecting $a \leq \frac{n-K}{4H_\epsilon}$, the left-hand side of this inequality becomes smaller than or equal to $n$, which is a contradiction. Thus, we conclude that $r_{\Omega(n)} \leq \epsilon$ on event E. The final result follows from the probability of event E defined at the beginning of this section.
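As a worked illustration of Theorem 1 (again a sketch, reusing the hypothetical gaps above with b = 1), one can compute the optimal exploration parameter $a = \frac{n-K}{4H_\epsilon}$ and the resulting bound on the failure probability:

```python
import numpy as np

def budget_bound(gaps, n, eps, b=1.0):
    """Optimal a and the Theorem 1 bound on P(r_Omega(n) >= eps)."""
    K = len(gaps)
    H = np.sum(b ** 2 / np.maximum((gaps + eps) / 2.0, eps) ** 2)  # Eq. 6
    a = (n - K) / (4.0 * H)          # maximizes the exponent in the bound
    return a, 2 * K * n * np.exp(-2 * a)

a, delta = budget_bound(np.array([0.2, 0.1, 0.1, 0.3, 0.4, 0.6]),
                        n=100_000, eps=0.05)
# Here H ~ 481, a ~ 52, and delta ~ 1e-39: with this budget the returned
# set is (eps, m)-optimal with overwhelming probability.
```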

4.3 Regret Bound for the Fixed-Confidence Setting

Here we prove an upper bound on the simple regret and the sample complexity of UGapEc. Since the setting considered by the algorithm is fixed-confidence, we may set $T = +\infty$. From the definition of the confidence interval $\beta_i(t)$ in Eq. 4 and a union bound on $T_k(t) \in \{0, \ldots, t\}$, $t = 1, \ldots, \infty$, we have that $\mathbb{P}(E) \geq 1 - \delta$.

Theorem 2. The UGapEc algorithm stops after $\tilde n$ rounds and returns a set of $m$ arms, $\Omega(\tilde n)$, that satisfies

$$\mathbb{P}\big(r_{\Omega(\tilde n+1)} \leq \epsilon\ \wedge\ \tilde n \leq N\big) \geq 1 - \delta,$$

where $N = K + O\big(H_\epsilon \log \frac{H_\epsilon}{\delta}\big)$ and c has been set to its optimal value 1/2.

Proof. We first prove the bound on the simple regret of UGapEc. Using Lemma 1, we have that on event E, the simple regret of UGapEc upon stopping satisfies $B_{J(\tilde n+1)}(\tilde n+1) = B_{\Omega(\tilde n+1)}(\tilde n+1) \geq r_{\Omega(\tilde n+1)}$. As a result, on event E, the regret of UGapEc cannot be bigger than ε, since that would contradict the stopping condition of the algorithm, i.e., $B_{J(\tilde n+1)}(\tilde n+1) < \epsilon$. Therefore, we have $\mathbb{P}\big(r_{\Omega(\tilde n+1)} \leq \epsilon\big) \geq 1 - \delta$.

Now we prove the bound on the sample complexity. Similar to the proof of Theorem 1, we consider the following two steps:

Step 1: Here we show that on event E, we have the following upper bound on the number of pulls of any arm $i \in A$:

$$T_i(\tilde n) \leq \frac{b^2 \log\big(4K(\tilde n-1)^3/\delta\big)}{\max\big(\frac{\Delta_i+\epsilon}{2},\ \epsilon\big)^2} + 1. \tag{15}$$

Let $t_i$ be the last time that arm $i$ is pulled. If arm $i$ has been pulled only during the initialization phase, $T_i(\tilde n) = 1$ and Eq. 15 trivially holds. If $i$ has been selected by SELECT-ARM, then we have $B_{J(t_i)}(t_i) \geq \epsilon$. Now, using Lemma 2, we may write

$$\epsilon \leq B_{J(t_i)}(t_i) \leq \min\big(0,\ -\Delta_i + 2\beta_i(t_i-1)\big) + 2\beta_i(t_i-1). \tag{16}$$

We can prove Eq. 15 by plugging in the value of $\beta_i(t_i-1)$ from Eq. 4 and solving Eq. 16 for $T_i(t_i)$, taking into account that $T_i(t_i-1) + 1 = T_i(t_i)$.

Step 2: We know that $\sum_{i=1}^K T_i(\tilde n) = \tilde n$. Using Eq. 15, on event E, we have $H_\epsilon \log\big(4K(\tilde n-1)^3/\delta\big) + K \geq \tilde n$. Solving this inequality for $\tilde n$ gives us $\tilde n \leq N$.

4.4 Problem Complexity

Theorems 1 and 2 indicate that both the probability of success and the sample complexity of UGapE are directly related to the complexity $H_\epsilon$ defined by Eq. 6. This implies that $H_\epsilon$ captures the intrinsic difficulty of the (ε, m)-best arm(s) identification problem, independently of the specific setting considered. Furthermore, note that this definition generalizes existing notions of complexity. For example, for ε = 0 and m = 1 we recover the complexity used in the definition of UCB-E [1] for the fixed budget setting and the one defined in [6] for the fixed accuracy problem.

Let us analyze $H_\epsilon$ in the general case of ε > 0. We define the complexity of a single arm $i \in A$ as $H_{\epsilon,i} = b^2 / \max\big(\frac{\Delta_i+\epsilon}{2}, \epsilon\big)^2$. When the gap $\Delta_i$ is smaller than the desired accuracy ε, i.e., $\Delta_i \leq \epsilon$, the complexity reduces to $H_{\epsilon,i} = b^2/\epsilon^2$. In fact, the algorithm can stop as soon as the desired accuracy ε is achieved, which means that there is no need to exactly discriminate between arm $i$ and the best arm. On the other hand, when $\Delta_i > \epsilon$, the complexity becomes $H_{\epsilon,i} = 4b^2/(\Delta_i+\epsilon)^2$. This shows that when the desired accuracy is smaller than the gap, the complexity of the problem is smaller than in the case of ε = 0, for which we have $H_{0,i} = 4b^2/\Delta_i^2$.

More generally, the analysis reported in the paper suggests that the performance of an upper-confidence-bound-based algorithm such as UGapE is characterized by the same notion of complexity in both settings. Thus, whenever the complexity is known, it is possible to exploit the theoretical analysis (bounds on the performance) to easily switch from one setting to the other. For instance, as also suggested in Section 5.4 of [9], if the complexity $H_\epsilon$ is known, an algorithm like UGapEc can be adapted to run in the fixed budget setting by inverting the bound on its sample complexity. This would lead to an algorithm similar to UGapEb with similar performance, although the parameter tuning could be more difficult because of the intrinsically poor accuracy of the constants in the bound. On the other hand, it is an open question whether it is possible to find an equivalence between algorithms for the two different settings when the complexity is not known. In particular, it would be important to derive a distribution-dependent lower bound of the form of the one reported in [1] for the general case of ε ≥ 0 and m ≥ 1, for both the fixed budget and fixed confidence settings.

5 Summary and Discussion

We proposed a meta-algorithm, called unified gap-based exploration (UGapE), that unifies the two settings of the best arm(s) identification problem in stochastic multi-armed bandits: fixed budget and fixed confidence. UGapE can be instantiated as two algorithms with a common structure (the same arm-selection and arm-return strategies) corresponding to these two settings, whose performance can be analyzed in a unified way, i.e., a large portion of their theoretical analysis can be unified in a series of lemmas. We proved a performance bound for the UGapE algorithm in the two settings. We also showed how UGapE and its theoretical analysis can be extended to take into account the variance of the arms and to multiple bandits. Finally, we evaluated the performance of UGapE and compared it with a number of existing fixed budget and fixed confidence algorithms. This unification is important for both theoretical and algorithmic reasons.
Despite their similarities, the fixed budget and fixed confidence settings have been treated differently in the literature. We believe that this unification provides a better understanding of the intrinsic difficulties of the best arm(s) identification problem. In particular, our analysis shows that the same complexity term characterizes the hardness of both settings. As mentioned in the introduction, no algorithm was available for several of the settings considered in this paper, e.g., (ε, m)-best arm identification with fixed budget. With UGapE, we introduced an algorithm that can be easily adapted to all these settings.

Acknowledgments This work was supported by the Ministry of Higher Education and Research, the Nord-Pas de Calais Regional Council and FEDER through the "Contrat de Projets État Région (CPER) 2007-2013", the French National Research Agency (ANR) under project LAMPADA n° ANR-09-EMER-007, the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 270327, and the PASCAL2 European Network of Excellence.

References

[1] J.-Y. Audibert, S. Bubeck, and R. Munos. Best arm identification in multi-armed bandits. In Proceedings of the Twenty-Third Annual Conference on Learning Theory, pages 41-53, 2010.

[2] P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multi-armed bandit problem. Machine Learning, 47:235-256, 2002.

[3] S. Bubeck, R. Munos, and G. Stoltz. Pure exploration in multi-armed bandit problems. In Proceedings of the Twentieth International Conference on Algorithmic Learning Theory, pages 23-37, 2009.

[4] S. Bubeck, T. Wang, and N. Viswanathan. Multiple identifications in multi-armed bandits. CoRR, abs/1205.3181, 2012.

[5] K. Deng, J. Pineau, and S. Murphy. Active learning for developing personalized treatment. In Proceedings of the Twenty-Seventh International Conference on Uncertainty in Artificial Intelligence, 2011.

[6] E. Even-Dar, S. Mannor, and Y. Mansour. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research, 7:1079-1105, 2006.

[7] V. Gabillon, M. Ghavamzadeh, and A. Lazaric. Best arm identification: A unified approach to fixed budget and fixed confidence. Technical report, INRIA, October 2012.

[8] V. Gabillon, M. Ghavamzadeh, A. Lazaric, and S. Bubeck. Multi-bandit best arm identification. In Advances in Neural Information Processing Systems, 2011.

[9] S. Kalyanakrishnan. Learning Methods for Sequential Decision Making with Imperfect Representations. PhD thesis, Department of Computer Science, The University of Texas at Austin, Austin, Texas, USA, December 2011. Published as UT Austin Computer Science technical report.

[10] S. Kalyanakrishnan and P. Stone. Efficient selection of multiple bandit arms: Theory and practice. In Proceedings of the Twenty-Seventh International Conference on Machine Learning, 2010.

[11] S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone. PAC subset selection in stochastic multi-armed bandits. In Proceedings of the Twenty-Ninth International Conference on Machine Learning, 2012.

[12] O. Maron and A. Moore. Hoeffding races: Accelerating model selection search for classification and function approximation. In Advances in Neural Information Processing Systems 6, pages 59-66, 1994.

[13] A. Maurer and M. Pontil. Empirical Bernstein bounds and sample-variance penalization. In Proceedings of the Twenty-Second Annual Conference on Learning Theory, 2009.

[14] V. Mnih, Cs. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In Proceedings of the Twenty-Fifth International Conference on Machine Learning, 2008.

[15] H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58:527-535, 1952.


More information

1 Rademacher Complexity Bounds

1 Rademacher Complexity Bounds COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #10 Scribe: Max Goer March 07, 2013 1 Radeacher Coplexity Bounds Recall the following theore fro last lecture: Theore 1. With probability

More information

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine CSCI699: Topics in Learning and Gae Theory Lecture October 23 Lecturer: Ilias Scribes: Ruixin Qiang and Alana Shine Today s topic is auction with saples. 1 Introduction to auctions Definition 1. In a single

More information

Hybrid System Identification: An SDP Approach

Hybrid System Identification: An SDP Approach 49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The

More information

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1.

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1. Notes on Coplexity Theory Last updated: October, 2005 Jonathan Katz Handout 7 1 More on Randoized Coplexity Classes Reinder: so far we have seen RP,coRP, and BPP. We introduce two ore tie-bounded randoized

More information

Support Vector Machines. Maximizing the Margin

Support Vector Machines. Maximizing the Margin Support Vector Machines Support vector achines (SVMs) learn a hypothesis: h(x) = b + Σ i= y i α i k(x, x i ) (x, y ),..., (x, y ) are the training exs., y i {, } b is the bias weight. α,..., α are the

More information

When Short Runs Beat Long Runs

When Short Runs Beat Long Runs When Short Runs Beat Long Runs Sean Luke George Mason University http://www.cs.gu.edu/ sean/ Abstract What will yield the best results: doing one run n generations long or doing runs n/ generations long

More information

Optimum Value of Poverty Measure Using Inverse Optimization Programming Problem

Optimum Value of Poverty Measure Using Inverse Optimization Programming Problem International Journal of Conteporary Matheatical Sciences Vol. 14, 2019, no. 1, 31-42 HIKARI Ltd, www.-hikari.co https://doi.org/10.12988/ijcs.2019.914 Optiu Value of Poverty Measure Using Inverse Optiization

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

The Multi-Arm Bandit Framework

The Multi-Arm Bandit Framework The Multi-Arm Bandit Framework A. LAZARIC (SequeL Team @INRIA-Lille) ENS Cachan - Master 2 MVA SequeL INRIA Lille MVA-RL Course In This Lecture A. LAZARIC Reinforcement Learning Algorithms Oct 29th, 2013-2/94

More information

Analyzing Simulation Results

Analyzing Simulation Results Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient

More information

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109

More information

Mixed Robust/Average Submodular Partitioning

Mixed Robust/Average Submodular Partitioning Mixed Robust/Average Subodular Partitioning Kai Wei 1 Rishabh Iyer 1 Shengjie Wang 2 Wenruo Bai 1 Jeff Biles 1 1 Departent of Electrical Engineering, University of Washington 2 Departent of Coputer Science,

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

IN modern society that various systems have become more

IN modern society that various systems have become more Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Multi-Dimensional Hegselmann-Krause Dynamics

Multi-Dimensional Hegselmann-Krause Dynamics Multi-Diensional Hegselann-Krause Dynaics A. Nedić Industrial and Enterprise Systes Engineering Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu B. Touri Coordinated Science Laboratory

More information

SPECTRUM sensing is a core concept of cognitive radio

SPECTRUM sensing is a core concept of cognitive radio World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair Proceedings of the 6th SEAS International Conference on Siulation, Modelling and Optiization, Lisbon, Portugal, Septeber -4, 006 0 A Siplified Analytical Approach for Efficiency Evaluation of the eaving

More information

New Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors

New Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors New Slack-Monotonic Schedulability Analysis of Real-Tie Tasks on Multiprocessors Risat Mahud Pathan and Jan Jonsson Chalers University of Technology SE-41 96, Göteborg, Sweden {risat, janjo}@chalers.se

More information