Minimax strategy for prediction with expert advice under stochastic assumptions
Wojciech Kotłowski
Poznań University of Technology, Poland

Abstract

We consider the setting of prediction with expert advice with an additional assumption that each expert generates its losses i.i.d. according to some distribution. We first identify a class of admissible strategies, which we call permutation invariant, and show that every strategy outside this class performs no better than some permutation invariant strategy. We then show that when the losses are binary, a simple Follow the Leader (FL) algorithm is the minimax strategy for this game, where minimaxity is achieved simultaneously for the expected regret, the expected redundancy, and the excess risk. Furthermore, FL also has the smallest regret, redundancy, and excess risk over all permutation invariant prediction strategies, simultaneously for all distributions over binary losses. When the losses are continuous in $[0,1]$, FL remains minimax only when an additional trick called loss binarization is applied.

1 Introduction

In the game of prediction with expert advice [1, 2], the algorithm sequentially decides on one of $K$ experts to follow, and suffers the loss associated with the chosen expert. The difference between the algorithm's cumulative loss and the cumulative loss of the best expert is called regret. The goal is to minimize the regret in the worst case over all possible loss sequences. An algorithm which achieves this goal (i.e., minimizes the worst-case regret) is called minimax. While there is no known solution to this problem in the general setting, it is possible to derive minimax algorithms for some special variants of this game [1, 2, 3, 4]. Interestingly, all these algorithms share a similar strategy of playing against a maximin adversary which assigns losses uniformly at random. They often have the equalization property: all data sequences lead to the same value of the regret.
While this property makes them robust against the worst-case sequence, it also makes them over-conservative, preventing them from exploiting the case when the actual data are easy. There are various algorithms which combine almost optimal worst-case performance with good performance on easy sequences [5, 6, 7, 8, 9]; these algorithms, however, are not motivated from the minimax principle. In this paper, we drop the analysis of worst-case performance entirely, and explore the minimax principle in a more constrained setting, in which the adversary is assumed to be stochastic. In particular, we associate with each expert $k$ a fixed distribution $P_k$ over loss values, and assume the observed losses of expert $k$ are generated independently from $P_k$. We believe this setting might be practically useful, and that it is interesting to determine the minimax algorithm under this assumption. We immediately face two difficulties here. First, due to the stochastic nature of the adversary, it is no longer possible to follow standard approaches of minimax analysis, such as backward induction [1, 2] or sequential minimax duality [10, 3], and we need to resort to a different technique. We define the notion of permutation invariance of prediction strategies. This lets us identify a class of admissible strategies (which we call permutation invariant), and show that every strategy outside this class performs no better than some permutation invariant strategy. Secondly, while the regret is the single, commonly used performance metric in the worst-case setting, the situation is different in
the stochastic case. We know at least three potentially useful metrics in the stochastic setting: the expected regret, the expected redundancy, and the excess risk [11], and it is not clear which of them should be used to define the minimax strategy. Fortunately, it turns out that there exists a single strategy which is minimax with respect to all three metrics simultaneously. In the case of binary losses, which take values in $\{0,1\}$, this strategy turns out to be the Follow the Leader (FL) algorithm, which chooses an expert with the smallest cumulative loss at a given trial (with ties broken randomly). Interestingly, FL is known to perform poorly in the worst case, as its worst-case regret grows linearly with $T$ [2]. On the contrary, in the stochastic setting with binary losses, FL also has the smallest regret, redundancy, and excess risk over all permutation invariant prediction strategies, simultaneously for all distributions over binary losses! In the more general case of continuous losses in the range $[0,1]$, FL is provably suboptimal. However, by applying a binarization trick to the losses [6], i.e., randomly setting them to $\{0,1\}$ such that the expectation matches the actual loss, and using FL on the binarized sequence, we obtain the minimax strategy in the continuous case.

2 Problem Setting

In the game of prediction with expert advice, at each trial $t = 1, \ldots, T$, the algorithm predicts with a distribution $w_t \in \Delta_K$ over $K$ experts (where $\Delta_K$ denotes the probability simplex). Then the loss vector $l_t \in X^K$ is revealed (where $X$ is either $\{0,1\}$ or $[0,1]$), and the algorithm suffers loss $w_t \cdot l_t$. The sequence of outcomes $l_1, \ldots, l_t$ is abbreviated as $l^t$. We let $L_{t,k}$ denote the cumulative loss of expert $k$ at iteration $t$: $L_{t,k} = \sum_{q \le t} l_{q,k}$. We assume there are $K$ distributions $P = (P_1, \ldots, P_K)$ over $X$, such that for each $k$, the losses $l_{t,k}$, $t = 1, \ldots, T$, are generated i.i.d. from $P_k$. Note that this implies that $l_{t,k}$ is independent from $l_{t',k'}$ whenever $t \ne t'$ or $k \ne k'$.
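To make the setting concrete, here is a minimal simulation sketch (our own illustration, not from the paper; `fl_weights` and `play_game` are hypothetical helper names): $K$ Bernoulli experts generate i.i.d. binary losses, and the Follow the Leader rule from the introduction plays uniformly over the current leaders.

```python
import random

def fl_weights(cum_losses):
    """Uniform distribution over the experts with the smallest
    cumulative loss (Follow the Leader, ties broken uniformly)."""
    m = min(cum_losses)
    n = sum(1 for L in cum_losses if L == m)
    return [1.0 / n if L == m else 0.0 for L in cum_losses]

def play_game(p, T, rng):
    """Run T trials with Bernoulli(p[k]) experts; return the algorithm's
    cumulative loss (sum over t of w_t . l_t) and the experts' final losses."""
    K = len(p)
    cum = [0] * K
    alg_loss = 0.0
    for _ in range(T):
        w = fl_weights(cum)                       # predict before seeing l_t
        l = [int(rng.random() < p[k]) for k in range(K)]
        alg_loss += sum(wk * lk for wk, lk in zip(w, l))
        cum = [c + lk for c, lk in zip(cum, l)]
    return alg_loss, cum

rng = random.Random(0)
loss, cum = play_game([0.2, 0.5], T=100, rng=rng)
regret = loss - min(cum)    # realized regret on this random sequence
```

Averaging `regret` over many independent runs estimates the expected regret of FL under this pair of distributions.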
We formally define a prediction strategy as a sequence of $T$ functions $\omega = (w_1, \ldots, w_T)$, such that $w_t : X^{(t-1) \times K} \to \Delta_K$. The performance of the strategy $\omega$ on the set of distributions $P$ can be measured by one of the three following metrics:

the expected regret:
$$R_{\mathrm{reg}}(\omega, P) = \mathbb{E}\Big[ \sum_{t=1}^T w_t(l^{t-1}) \cdot l_t - \min_k L_{T,k} \Big],$$

the expected redundancy:
$$R_{\mathrm{red}}(\omega, P) = \mathbb{E}\Big[ \sum_{t=1}^T w_t(l^{t-1}) \cdot l_t \Big] - \min_k \mathbb{E}\big[ L_{T,k} \big],$$

the excess risk:
$$R_{\mathrm{risk}}(\omega, P) = \mathbb{E}\big[ w_T(l^{T-1}) \cdot l_T \big] - \min_k \mathbb{E}\big[ l_{T,k} \big],$$

where in each case the expectation is over the sequence $l^T$ with respect to the distributions $P_1, \ldots, P_K$. Given a performance measure $R$, we say that a strategy $\omega$ is minimax with respect to $R$ if:
$$\sup_P R(\omega, P) = \inf_{\omega'} \sup_P R(\omega', P),$$
where the supremum is over all $K$-sets of distributions $(P_1, \ldots, P_K)$ on $X$, and the infimum is over all prediction strategies.

We say that a strategy $\omega$ is permutation invariant if for any $t = 1, \ldots, T$ and any permutation $\sigma \in S_K$, where $S_K$ denotes the group of permutations over $\{1, \ldots, K\}$,
$$w_t\big(\sigma(l^{t-1})\big) = \sigma\big(w_t(l^{t-1})\big),$$
where for any vector $v = (v_1, \ldots, v_K)$ we define $\sigma(v) = (v_{\sigma(1)}, \ldots, v_{\sigma(K)})$ and abbreviate $\sigma(l^{t-1}) = (\sigma(l_1), \ldots, \sigma(l_{t-1}))$. In words, if we $\sigma$-permute the indices of all past loss vectors, the resulting weight vector is the $\sigma$-permutation of the original weight vector. Permutation invariant strategies are natural, as they rely only on the observed outcomes, not on the expert indices.

Lemma 1. Let $\omega$ be permutation invariant. Then, for any permutation $\sigma \in S_K$,
$$\mathbb{E}_{\sigma(P)}\big[ w_t(l^{t-1}) \cdot l_t \big] = \mathbb{E}_P\big[ w_t(l^{t-1}) \cdot l_t \big],$$
and moreover $R(\omega, \sigma(P)) = R(\omega, P)$, where $R$ is the expected regret, expected redundancy, or excess risk, and $\sigma(P) = (P_{\sigma(1)}, \ldots, P_{\sigma(K)})$.

We now show that permutation invariant strategies are admissible in the following sense:

Theorem 2. For any strategy $\omega$, there exists a permutation invariant strategy $\tilde\omega$ such that $\sup_P R(\tilde\omega, P) \le \sup_P R(\omega, P)$, where $R$ is either the expected regret, the expected redundancy, or the excess risk.
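As a sanity check (our own sketch, not part of the paper; the helper names are ours), permutation invariance of FL can be verified exhaustively on a small example: permuting the expert indices of all past loss vectors permutes the resulting weight vector the same way.

```python
from itertools import permutations

def fl_weights(cum_losses):
    # Follow the Leader: uniform over the leader set
    m = min(cum_losses)
    n = sum(1 for L in cum_losses if L == m)
    return [1.0 / n if L == m else 0.0 for L in cum_losses]

def weights_from_history(history):
    # history is the list of past loss vectors l_1, ..., l_{t-1}
    K = len(history[0])
    cum = [sum(l[k] for l in history) for k in range(K)]
    return fl_weights(cum)

history = [[1, 0, 1], [0, 0, 1]]      # two past loss vectors, K = 3
w = weights_from_history(history)
for sigma in permutations(range(3)):
    permuted = [[l[sigma[k]] for k in range(3)] for l in history]   # sigma(l^{t-1})
    lhs = weights_from_history(permuted)      # w_t(sigma(l^{t-1}))
    rhs = [w[sigma[k]] for k in range(3)]     # sigma(w_t(l^{t-1}))
    assert lhs == rhs
```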
Theorem 2 states that it suffices to search for a minimax strategy only within the set of permutation invariant strategies.

Given a loss sequence $l^{t-1}$, let $N = |\mathrm{argmin}_{j=1,\ldots,K} L_{t-1,j}|$ be the size of the leader set at the beginning of trial $t$. We define the Follow the Leader (FL) strategy $w^*_t$ by $w^*_{t,k} = 1/N$ if $k \in \mathrm{argmin}_j L_{t-1,j}$, and $w^*_{t,k} = 0$ otherwise. In other words, FL predicts with the current leader, breaking ties uniformly at random. It is easy to show that the FL strategy is permutation invariant.

3 Binary losses

In this section, we set $X = \{0,1\}$, so that all losses are binary. In this case, each $P_k$ is a Bernoulli distribution. Take any permutation invariant strategy $\omega$. Using Lemma 1 on all $\sigma \in S_K$:
$$\mathbb{E}_P\big[ w_t(l^{t-1}) \cdot l_t \big] = \underbrace{\frac{1}{K!} \sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\big[ w_t(l^{t-1}) \cdot l_t \big]}_{=:\, \mathrm{loss}_t(w_t, P)}. \tag{1}$$

We now show the main result of this paper, a surprisingly strong property of the FL strategy, which states that FL minimizes $\mathrm{loss}_t(w_t, P)$ over all $K$-sets of distributions simultaneously. Hence, FL is not only optimal in the worst case, but is actually optimal for the permutation-averaged expected loss for any $P$, which implies by (1) that FL has the smallest expected loss among all permutation invariant strategies for any $P$.

Theorem 3. Let $\omega^* = (w^*_1, \ldots, w^*_T)$ be the FL strategy. Then, for any $K$-set of distributions $P = (P_1, \ldots, P_K)$ over binary losses, any strategy $\omega = (w_1, \ldots, w_T)$, and any $t = 1, \ldots, T$:
$$\mathrm{loss}_t(w^*_t, P) \le \mathrm{loss}_t(w_t, P).$$

The consequence of Theorem 3 is the following corollary, which states the minimaxity of the FL strategy for binary losses:

Corollary 4. Let $\omega^* = (w^*_1, \ldots, w^*_T)$ be the FL strategy. Then, for any $P$ over binary losses and any permutation invariant strategy $\omega$: $R(\omega^*, P) \le R(\omega, P)$, where $R$ is the expected regret, expected redundancy, or excess risk. This implies:
$$\sup_P R(\omega^*, P) = \inf_\omega \sup_P R(\omega, P),$$
where the supremum is over all distributions on binary losses, and the infimum over all (not necessarily permutation invariant) strategies.

Proof. The second statement immediately follows from the first statement and Theorem 2.
For the first statement, note that the "loss of the best expert" part of each measure depends only on $P$. Hence, we only need to show that for any $t = 1, \ldots, T$, $\mathbb{E}_P[w^*_t \cdot l_t] \le \mathbb{E}_P[w_t \cdot l_t]$. Since $w^*_t$ and $w_t$ are permutation invariant, Lemma 1 shows that $\mathbb{E}_P[w^*_t \cdot l_t] = \mathrm{loss}_t(w^*_t, P)$, and similarly, $\mathbb{E}_P[w_t \cdot l_t] = \mathrm{loss}_t(w_t, P)$. Application of Theorem 3 finishes the proof.

4 Continuous losses

In this section, we consider the general case $X = [0,1]$ of continuous loss vectors. We give a modification of FL and prove its minimaxity. In the appendix, we justify the modification by arguing that the vanilla FL strategy is not minimax for continuous losses.
The modification of FL is based on a procedure we call binarization. A similar trick has already been used in [6] to deal with non-integer losses in a different context. We define the binarization of any loss value $l_{t,k} \in [0,1]$ as a Bernoulli random variable $b_{t,k}$ which takes value 1 with probability $l_{t,k}$ and value 0 with probability $1 - l_{t,k}$. In other words, we replace each non-binary loss $l_{t,k}$ by a random binary outcome $b_{t,k}$ such that $\mathbb{E}[b_{t,k}] = l_{t,k}$. Note that if $l_{t,k} \in \{0,1\}$, then $b_{t,k} = l_{t,k}$, i.e., binarization has no effect on losses which are already binary. Let us also define $b_t = (b_{t,1}, \ldots, b_{t,K})$, where all $K$ Bernoulli random variables $b_{t,k}$ are independent. Similarly, $b^t$ will denote the binary loss sequence $b_1, \ldots, b_t$, where the binarization procedure is applied independently (with a new set of Bernoulli variables) for each trial $t$. Now, given the loss sequence $l^{t-1}$, we define the binarized FL strategy $\omega^b$ by:
$$w^b_t(l^{t-1}) = \mathbb{E}_{b^{t-1}}\big[ w^*_t(b^{t-1}) \big],$$
where $w^*_t(b^{t-1})$ is the standard FL strategy applied to the binarized losses $b^{t-1}$, and the expectation is over the internal randomization of the algorithm (the binarization variables). Note that if the set of distributions $P$ has support only on $\{0,1\}$, then $w^b_t \equiv w^*_t$. On the other hand, these two strategies may differ significantly for non-binary losses. However, we will show that for any $K$-set of distributions $P$ (with support in $[0,1]$), $w^b_t$ behaves in the same way as $w^*_t$ would behave on some particular $K$-set of distributions over binary losses. To this end, we introduce the binarization of a $K$-set of distributions $P$, defined as $P^{\mathrm{bin}} = (P^{\mathrm{bin}}_1, \ldots, P^{\mathrm{bin}}_K)$, where $P^{\mathrm{bin}}_k$ is a distribution with support $\{0,1\}$ such that:
$$\mathbb{E}_{P^{\mathrm{bin}}_k}\big[ l_{t,k} \big] = P^{\mathrm{bin}}_k(l_{t,k} = 1) = \mathbb{E}_{P_k}\big[ l_{t,k} \big].$$
In other words, $P^{\mathrm{bin}}_k$ is a Bernoulli distribution which has the same expectation as the original distribution $P_k$ (over continuous losses). We now show the following results:

Lemma 5. For any $K$-set of distributions $P = (P_1, \ldots, P_K)$ with support on $X = [0,1]$,
$$\mathbb{E}_{l^t \sim P^t}\big[ w^b_t(l^{t-1}) \cdot l_t \big] = \mathbb{E}_{l^t \sim (P^{\mathrm{bin}})^t}\big[ w^*_t(l^{t-1}) \cdot l_t \big].$$

Lemma 6.
For any $K$-set of distributions $P = (P_1, \ldots, P_K)$ with support on $X = [0,1]$, $R(\omega^b, P) \le R(\omega^*, P^{\mathrm{bin}})$, where $R$ is either the expected regret, the expected redundancy, or the excess risk.

Theorem 7. Let $\omega^b = (w^b_1, \ldots, w^b_T)$ be the binarized FL strategy. Then:
$$\sup_P R(\omega^b, P) = \inf_\omega \sup_P R(\omega, P),$$
where $R$ is the expected regret, expected redundancy, or excess risk, the supremum is over all $K$-sets of distributions on $[0,1]$, and the infimum is over all prediction strategies.

In the appendix, we argue that vanilla FL is not the minimax strategy for continuous losses, so that the binarization procedure is justified.

5 Open Problem

The setting considered in this paper is quite limited even in the stochastic case, as it does not consider distributions over loss vectors which are i.i.d. between trials but not necessarily independent between experts. It would be interesting to determine the minimax strategy in this more general setting. Preliminary computational experiments suggest that FL is not minimax even for binary losses when dependencies between experts are allowed.

Acknowledgments. The author was supported by the Polish National Science Centre under grant no. 2013/11/D/ST6/
References

[1] Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. Journal of the ACM, 44(3), 1997.
[2] Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
[3] Wouter M. Koolen. Combining Strategies Efficiently: High-quality Decisions from Conflicting Advice. PhD thesis, ILLC, University of Amsterdam, 2011.
[4] Jacob Abernethy, Manfred K. Warmuth, and Joel Yellin. When random play is optimal against an adversary. In COLT, July 2008.
[5] Steven de Rooij, Tim van Erven, Peter D. Grünwald, and Wouter M. Koolen. Follow the leader if you can, hedge if you must. Journal of Machine Learning Research, 15(1), 2014.
[6] Tim van Erven, Wojciech Kotłowski, and Manfred K. Warmuth. Follow the leader with dropout perturbations. In COLT, 2014.
[7] Amir Sani, Gergely Neu, and Alessandro Lazaric. Exploiting easy data in online optimization. In NIPS, 2014.
[8] Wouter M. Koolen and Tim van Erven. Second-order quantile methods for experts and combinatorial games. In COLT, 2015.
[9] Haipeng Luo and Robert E. Schapire. Achieving all with no parameters: AdaNormalHedge. In COLT, 2015.
[10] Jacob Abernethy, Alekh Agarwal, Peter L. Bartlett, and Alexander Rakhlin. A stochastic view of optimal regret through minimax duality. In COLT, 2009.
[11] Peter D. Grünwald. The Minimum Description Length Principle. MIT Press, Cambridge, MA, 2007.
Proof of Lemma 1

We first show that the expected loss of the algorithm at any iteration $t = 1, \ldots, T$ is the same for both $\sigma(P)$ and $P$:
$$\mathbb{E}_{\sigma(P)}\big[ w_t(l^{t-1}) \cdot l_t \big] = \mathbb{E}_P\big[ w_t(\sigma(l^{t-1})) \cdot \sigma(l_t) \big] = \mathbb{E}_P\big[ \sigma(w_t(l^{t-1})) \cdot \sigma(l_t) \big] = \mathbb{E}_P\big[ w_t(l^{t-1}) \cdot l_t \big],$$
where the first equality is due to the fact that permuting the distributions is equivalent to permuting the coordinates of the losses (which are random variables with respect to these distributions), the second equality exploits the permutation invariance of $\omega$, while the third equality uses the simple fact that the dot product is invariant under permuting both of its arguments. Therefore, the "loss of the algorithm" part of any of the three measures (regret, redundancy, risk) remains the same. To show that the "loss of the best expert" part of each measure is also the same, note that for any $t = 1, \ldots, T$ and $k = 1, \ldots, K$, $\mathbb{E}_{\sigma(P)}[l_{t,k}] = \mathbb{E}_P[l_{t,\sigma(k)}]$, which implies:
$$\min_k \mathbb{E}_{\sigma(P)}[l_{T,k}] = \min_k \mathbb{E}_P[l_{T,\sigma(k)}] = \min_k \mathbb{E}_P[l_{T,k}],$$
$$\min_k \mathbb{E}_{\sigma(P)}[L_{T,k}] = \min_k \mathbb{E}_P[L_{T,\sigma(k)}] = \min_k \mathbb{E}_P[L_{T,k}],$$
$$\mathbb{E}_{\sigma(P)}\big[ \min_k L_{T,k} \big] = \mathbb{E}_P\big[ \min_k L_{T,\sigma(k)} \big] = \mathbb{E}_P\big[ \min_k L_{T,k} \big],$$
so that the "loss of the best expert" parts of all measures are the same for both $\sigma(P)$ and $P$.

Proof of Theorem 2

Define $\tilde\omega = (\tilde w_1, \ldots, \tilde w_T)$ as:
$$\tilde w_t(l^{t-1}) = \frac{1}{K!} \sum_{\tau \in S_K} \tau^{-1}\big( w_t(\tau(l^{t-1})) \big).$$
Note that $\tilde\omega$ is a valid prediction strategy, since $\tilde w_t$ is a function of $l^{t-1}$, and $\tilde w_t \in \Delta_K$ ($\tilde w_t$ is a convex combination of distributions, so it is a distribution itself). Moreover, $\tilde\omega$ is permutation invariant:
$$\tilde w_t(\sigma(l^{t-1})) = \frac{1}{K!} \sum_{\tau \in S_K} \tau^{-1}\big( w_t(\tau\sigma(l^{t-1})) \big) = \frac{1}{K!} \sum_{\tau' \in S_K} \sigma\Big( (\tau')^{-1}\big( w_t(\tau'(l^{t-1})) \big) \Big) = \sigma\big( \tilde w_t(l^{t-1}) \big),$$
where the second equality is from replacing the summation index $\tau \to \tau' = \tau\sigma$. Now, note that the expected loss of $\tilde w_t$ is:
$$\mathbb{E}_P\big[ \tilde w_t(l^{t-1}) \cdot l_t \big] = \frac{1}{K!} \sum_{\tau \in S_K} \mathbb{E}_P\big[ \tau^{-1}\big( w_t(\tau(l^{t-1})) \big) \cdot l_t \big] = \frac{1}{K!} \sum_{\tau \in S_K} \mathbb{E}_P\big[ w_t(\tau(l^{t-1})) \cdot \tau(l_t) \big] = \frac{1}{K!} \sum_{\tau \in S_K} \mathbb{E}_{\tau(P)}\big[ w_t(l^{t-1}) \cdot l_t \big].$$
Since the "loss of the best expert" parts of all three measures are invariant under any permutation of $P$ (see the proof of Lemma 1), we have:
$$R(\tilde\omega, P) = \frac{1}{K!} \sum_{\sigma \in S_K} R(\omega, \sigma(P)) \le \max_{\sigma \in S_K} R(\omega, \sigma(P)).$$
Hence,
$$\sup_P R(\tilde\omega, P) \le \sup_P \max_{\sigma \in S_K} R(\omega, \sigma(P)) = \sup_P R(\omega, P).$$
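The symmetrization $\tilde\omega$ used in this proof can be illustrated numerically (our own sketch; the helper names are ours, and the base strategy is a deliberately non-invariant variant of FL that breaks ties toward the lowest index):

```python
from itertools import permutations

def apply_perm(sigma, v):
    # sigma(v) = (v_{sigma(1)}, ..., v_{sigma(K)})
    return [v[sigma[k]] for k in range(len(v))]

def inverse(sigma):
    inv = [0] * len(sigma)
    for k, s in enumerate(sigma):
        inv[s] = k
    return tuple(inv)

def base(history):
    # NOT permutation invariant: ties broken toward the lowest expert index
    K = 3
    cum = [sum(l[k] for l in history) for k in range(K)]
    j = cum.index(min(cum))
    return [1.0 if k == j else 0.0 for k in range(K)]

def symmetrize(strategy, K):
    """w~(l^{t-1}) = (1/K!) * sum over tau of tau^{-1}(strategy(tau(l^{t-1})))."""
    perms = list(permutations(range(K)))
    def tilde(history):
        acc = [0.0] * K
        for tau in perms:
            w = strategy([apply_perm(tau, l) for l in history])
            w = apply_perm(inverse(tau), w)
            acc = [a + x / len(perms) for a, x in zip(acc, w)]
        return acc
    return tilde

tilde = symmetrize(base, K=3)
history = [[1, 0, 1], [0, 1, 1]]      # experts 0 and 1 are tied leaders
# tilde splits the tie evenly, and is permutation invariant:
for sigma in permutations(range(3)):
    lhs = tilde([apply_perm(sigma, l) for l in history])
    rhs = apply_perm(sigma, tilde(history))
    assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```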
Proof of Theorem 3

For any distribution $P_k$ over binary losses, let $p_k := P_k(l_{t,k} = 1) = \mathbb{E}_{P_k}[l_{t,k}]$. We have:
$$\mathrm{loss}_t(w_t, P) = \frac{1}{K!} \sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\big[ w_t(l^{t-1}) \cdot l_t \big] \tag{2}$$
$$= \frac{1}{K!} \sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\Big[ w_t(l^{t-1}) \cdot \mathbb{E}_{\sigma(P)}[l_t] \Big]$$
$$= \frac{1}{K!} \sum_{\sigma \in S_K} \sum_{l^{t-1}} \Bigg( \prod_{k=1}^K p_{\sigma(k)}^{\sum_{q=1}^{t-1} l_{q,k}} (1 - p_{\sigma(k)})^{t-1-\sum_{q=1}^{t-1} l_{q,k}} \Bigg) \sum_{k=1}^K w_{t,k}(l^{t-1})\, p_{\sigma(k)}$$
$$= \sum_{l^{t-1}} \underbrace{\frac{1}{K!} \sum_{k=1}^K w_{t,k}(l^{t-1}) \sum_{\sigma \in S_K} \Bigg( \prod_{j=1}^K p_{\sigma(j)}^{L_j} (1 - p_{\sigma(j)})^{t-1-L_j} \Bigg) p_{\sigma(k)}}_{=:\, \mathrm{loss}_t(w_t, P \mid l^{t-1})},$$
where in the second equality we used the fact that $w_t$ depends on $l^{t-1}$ and does not depend on $l_t$. Fix $l^{t-1}$ and consider the term $\mathrm{loss}_t(w_t, P \mid l^{t-1})$. This term is linear in $w_t$, hence it is minimized by $w_t = e_k$ for some $k = 1, \ldots, K$, where $e_k$ is the $k$-th standard basis vector, with 1 on the $k$-th coordinate and zeros on the remaining coordinates. We use the shorthand notation $L_j = \sum_{q=1}^{t-1} l_{q,j}$ for $j = 1, \ldots, K$, and $L = (L_1, \ldots, L_K)$. In this notation, we rewrite $\mathrm{loss}_t(w_t, P \mid l^{t-1})$ as:
$$\mathrm{loss}_t(w_t, P \mid l^{t-1}) = \frac{1}{K!} \sum_{k=1}^K w_{t,k}(l^{t-1}) \sum_{\sigma \in S_K} \prod_{j=1}^K p_{\sigma(j)}^{L_j} (1 - p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k)}. \tag{3}$$
We will show that for any $P$ and any $l^{t-1}$ (and hence any $L$), $\mathrm{loss}_t(w_t, P \mid l^{t-1})$ is minimized by setting $w_t = e_{k^*}$ for any $k^* \in \mathrm{argmin}_j L_j$. In other words, we will show that for any $P$, $L$, any $k^* \in \mathrm{argmin}_j L_j$, and any $k = 1, \ldots, K$,
$$\mathrm{loss}_t(e_{k^*}, P \mid l^{t-1}) \le \mathrm{loss}_t(e_k, P \mid l^{t-1}),$$
or equivalently, using (3), that for any $P$, $L$, $k^* \in \mathrm{argmin}_j L_j$, and $k = 1, \ldots, K$,
$$\sum_{\sigma \in S_K} \prod_{j=1}^K p_{\sigma(j)}^{L_j} (1 - p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k^*)} \le \sum_{\sigma \in S_K} \prod_{j=1}^K p_{\sigma(j)}^{L_j} (1 - p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k)}. \tag{4}$$
We proceed by induction on $K$. Take $K = 2$ and note that when $k = k^*$, there is nothing to prove, as both sides of (4) are identical. Therefore, without loss of generality, assume $k^* = 1$ and $k = 2$, which implies $L_1 \le L_2$. Then, (4) reduces to:
$$p_1^{L_1} p_2^{L_2} (1-p_1)^{t-1-L_1} (1-p_2)^{t-1-L_2}\, p_1 + p_2^{L_1} p_1^{L_2} (1-p_2)^{t-1-L_1} (1-p_1)^{t-1-L_2}\, p_2$$
$$\le\; p_1^{L_1} p_2^{L_2} (1-p_1)^{t-1-L_1} (1-p_2)^{t-1-L_2}\, p_2 + p_2^{L_1} p_1^{L_2} (1-p_2)^{t-1-L_1} (1-p_1)^{t-1-L_2}\, p_1.$$
After rearranging the terms, it amounts to showing that:
$$(p_1 p_2)^{L_1} \big( (1-p_1)(1-p_2) \big)^{t-1-L_2} (p_1 - p_2) \Big( \big( p_2(1-p_1) \big)^{L_2-L_1} - \big( p_1(1-p_2) \big)^{L_2-L_1} \Big) \le 0.$$
But this will hold if:
$$(p_1 - p_2) \Big( \big( p_2(1-p_1) \big)^{L_2-L_1} - \big( p_1(1-p_2) \big)^{L_2-L_1} \Big) \le 0. \tag{5}$$
If $L_1 = L_2$, (5) clearly holds; therefore assume $L_1 < L_2$.
We prove the validity of (5) by noticing that:
$$p_2(1-p_1) > p_1(1-p_2) \iff p_2 > p_1,$$
which means that the two factors of the product on the left-hand side of (5) have opposite signs (when $p_1 \ne p_2$) or are zero at the same time (when $p_1 = p_2$). Hence, we proved (5), which implies
(4) when $k^* = 1$ and $k = 2$. The opposite case $k^* = 2$, $k = 1$ with $L_2 \le L_1$ can be shown with exactly the same line of arguments by simply exchanging the indices 1 and 2.

Now, we assume (4) holds for $K - 1 \ge 2$ experts and any $P = (P_1, \ldots, P_{K-1})$, any $L = (L_1, \ldots, L_{K-1})$, any $k^* \in \mathrm{argmin}_{j=1,\ldots,K-1} L_j$, and any $k = 1, \ldots, K-1$, and we show that it also holds for $K$ experts. Take any $k^* \in \mathrm{argmin}_{j=1,\ldots,K} L_j$ and any $k = 1, \ldots, K$. Without loss of generality, assume that $k^* \ne 1$ and $k \ne 1$ (it is always possible to find an expert different than $k^*$ and $k$, because there are $K \ge 3$ experts). We expand the sum over permutations on the left-hand side of (4) with respect to the value of $\sigma(1)$:
$$\sum_{s=1}^K p_s^{L_1} (1-p_s)^{t-1-L_1} \sum_{\sigma:\, \sigma(1) = s}\; \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k^*)},$$
and we also expand the sum on the right-hand side of (4) in the same way. To prove (4), it suffices to show that every term in the sum over $s$ on the left-hand side is not greater than the corresponding term in the sum on the right-hand side, i.e., to show that for any $s = 1, \ldots, K$:
$$\sum_{\sigma:\, \sigma(1) = s} \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k^*)} \le \sum_{\sigma:\, \sigma(1) = s} \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k)}. \tag{6}$$
We now argue that this inequality follows directly from the inductive assumption, by dropping $L_1$ and $P_s$ and applying (4) to the resulting $(K-1)$-expert case. More precisely, note that the sum on both sides of (6) goes over all permutations of the indices $(1, \ldots, s-1, s+1, \ldots, K)$, and since $k^*, k \ne 1$, we have $k^* \in \mathrm{argmin}_{j=2,\ldots,K} L_j$ and $k \ge 2$. Hence, applying (4) to the $(K-1)$-expert case with the $K-1$ distributions $(P_1, \ldots, P_{s-1}, P_{s+1}, \ldots, P_K)$ (or any permutation thereof) and the $K-1$ integers $(L_2, \ldots, L_K)$ immediately implies (6). Thus, we proved (4), which states that $\mathrm{loss}_t(w_t, P \mid l^{t-1})$ is minimized by any leader $k^* \in \mathrm{argmin}_j L_j$, where $L_j = \sum_{q=1}^{t-1} l_{q,j}$. This means $\mathrm{loss}_t(w_t, P \mid l^{t-1})$ is also minimized by the FL strategy $w^*_t$, which distributes its mass uniformly over all leaders. Since FL minimizes $\mathrm{loss}_t(w_t, P \mid l^{t-1})$ for any $l^{t-1}$, by (2) it also minimizes $\mathrm{loss}_t(w_t, P)$. Note that the proof did not require uniform tie-breaking over the leaders, as any distribution over leaders would work as well.
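Inequality (4) can be spot-checked by direct enumeration over permutations; the following sketch (our own verification code, not part of the proof; indices are 0-based) compares the leader's term with every other expert's for one arbitrary choice of parameters:

```python
from itertools import permutations

def side(p, L, k, t):
    """Sum over sigma of prod_j p_{sigma(j)}^{L_j} (1 - p_{sigma(j)})^{t-1-L_j} * p_{sigma(k)},
    i.e., one side of inequality (4)."""
    K = len(p)
    total = 0.0
    for sigma in permutations(range(K)):
        prob = 1.0
        for j in range(K):
            prob *= p[sigma[j]] ** L[j] * (1.0 - p[sigma[j]]) ** (t - 1 - L[j])
        total += prob * p[sigma[k]]
    return total

p = [0.1, 0.6, 0.35]     # arbitrary Bernoulli parameters p_k
L = [2, 0, 1]            # cumulative losses after t - 1 = 3 trials
t = 4
leader = min(range(len(L)), key=lambda j: L[j])   # expert with the smallest L_j
for k in range(len(p)):
    assert side(p, L, leader, t) <= side(p, L, k, t) + 1e-12
```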
Uniform distribution, however, makes the FL strategy permutation invariant.

Proof of Lemma 5

Let $p_k$ be the expectation of $l_{t,k}$ according to either $P_k$ or $P^{\mathrm{bin}}_k$: $p_k := \mathbb{E}_{P_k}[l_{t,k}] = \mathbb{E}_{P^{\mathrm{bin}}_k}[l_{t,k}]$. Since for any prediction strategy $\omega$, $w_t$ depends on $l^{t-1}$ and does not depend on $l_t$, we have
$$\mathbb{E}_P\big[ w^b_t \cdot l_t \big] = \mathbb{E}_P\big[ w^b_t \big] \cdot \mathbb{E}_P[l_t] = \mathbb{E}_P\big[ w^b_t \big] \cdot p,$$
where $p = (p_1, \ldots, p_K)$. Similarly, $\mathbb{E}_{P^{\mathrm{bin}}}[w^*_t \cdot l_t] = \mathbb{E}_{P^{\mathrm{bin}}}[w^*_t] \cdot p$. Hence, we only need to show that $\mathbb{E}_P[w^b_t] = \mathbb{E}_{P^{\mathrm{bin}}}[w^*_t]$. This holds because $w^b_t$ sees only binary outcomes resulting from the joint distribution of $P$ and the distribution of the binarization variables:
$$\mathbb{E}_{l^{t-1} \sim P^{t-1}}\big[ w^b_t(l^{t-1}) \big] = \mathbb{E}_{l^{t-1} \sim P^{t-1},\, b^{t-1}}\big[ w^*_t(b^{t-1}) \big],$$
and for any $b_{t,k}$, the probability (jointly over $P$ and the binarization variables) of $b_{t,k} = 1$ is the same as the probability of $l_{t,k} = 1$ under the distribution $P^{\mathrm{bin}}_k$:
$$P(b_{t,k} = 1) = \int_{[0,1]} P(b_{t,k} = 1 \mid l_{t,k})\, \mathrm{d}P_k(l_{t,k}) = \int_{[0,1]} l_{t,k}\, \mathrm{d}P_k(l_{t,k}) = p_k = P^{\mathrm{bin}}_k(l_{t,k} = 1).$$
Hence,
$$\mathbb{E}_{l^{t-1} \sim P^{t-1},\, b^{t-1}}\big[ w^*_t(b^{t-1}) \big] = \mathbb{E}_{l^{t-1} \sim (P^{\mathrm{bin}})^{t-1}}\big[ w^*_t(l^{t-1}) \big].$$
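Lemma 5 can be illustrated with a quick Monte Carlo check (our own sketch under assumed uniform continuous distributions; the helper names are ours): the expected FL weights under $P$ with binarization match the expected FL weights under the Bernoulli set $P^{\mathrm{bin}}$ with the same means, up to sampling error.

```python
import random

def fl_weights(cum):
    # Follow the Leader: uniform over the leader set
    m = min(cum)
    n = sum(1 for L in cum if L == m)
    return [1.0 / n if L == m else 0.0 for L in cum]

def binarize(l, rng):
    # Bernoulli rounding: b = 1 with probability l, so E[b] = l
    return [int(rng.random() < x) for x in l]

rng = random.Random(1)
N = 50_000
means = [0.3, 0.45]     # common expectations of P_k and P_k^bin
T = 3                   # weights computed from t - 1 = 2 past trials

lhs = [0.0, 0.0]        # estimates E_{P, binarization}[w_t^*(b^{t-1})]
rhs = [0.0, 0.0]        # estimates E_{P^bin}[w_t^*(l^{t-1})]
for _ in range(N):
    # P_k: uniform on [0, 2 * means[k]], a continuous distribution with mean means[k]
    cont = [[rng.uniform(0.0, 2.0 * m) for m in means] for _ in range(T - 1)]
    b = [binarize(l, rng) for l in cont]
    w = fl_weights([sum(l[k] for l in b) for k in range(2)])
    lhs = [a + x / N for a, x in zip(lhs, w)]
    # P_k^bin: Bernoulli(means[k]) directly
    binary = [[int(rng.random() < m) for m in means] for _ in range(T - 1)]
    w2 = fl_weights([sum(l[k] for l in binary) for k in range(2)])
    rhs = [a + x / N for a, x in zip(rhs, w2)]

assert all(abs(x - y) < 0.03 for x, y in zip(lhs, rhs))
```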
Proof of Lemma 6

Lemma 5 shows that the expected loss of $\omega^b$ on $P$ is the same as the expected loss of $\omega^*$ on $P^{\mathrm{bin}}$. Hence, to prove the inequality, we only need to consider the "loss of the best expert" part of each measure. For the expected redundancy and the excess risk, it directly follows from the definition of $P^{\mathrm{bin}}$ that for any $t, k$, $\mathbb{E}_P[l_{t,k}] = \mathbb{E}_{P^{\mathrm{bin}}}[l_{t,k}]$; hence $\min_k \mathbb{E}_P[l_{T,k}] = \min_k \mathbb{E}_{P^{\mathrm{bin}}}[l_{T,k}]$, and similarly, $\min_k \mathbb{E}_P[L_{T,k}] = \min_k \mathbb{E}_{P^{\mathrm{bin}}}[L_{T,k}]$. Thus, for the expected redundancy and the excess risk, the lemma actually holds with equality. For the expected regret, we will show that $\mathbb{E}_P[\min_k L_{T,k}] \ge \mathbb{E}_{P^{\mathrm{bin}}}[\min_k L_{T,k}]$, which will finish the proof. Using the argument from the proof of Lemma 5, and denoting $B_{T,k} = \sum_{t=1}^T b_{t,k}$, we have:
$$\mathbb{E}_{l^T \sim (P^{\mathrm{bin}})^T}\big[ \min_k L_{T,k} \big] = \mathbb{E}_{l^T \sim P^T,\, b^T}\big[ \min_k B_{T,k} \big] \le \mathbb{E}_{l^T \sim P^T}\big[ \min_k \mathbb{E}_{b^T}[B_{T,k}] \big] = \mathbb{E}_{l^T \sim P^T}\big[ \min_k L_{T,k} \big],$$
where the inequality follows from Jensen's inequality applied to the concave function $\min(\cdot)$.

Proof of Theorem 7

Since $\omega^b$ is the same as $\omega^*$ when all the losses are binary, $R(\omega^b, P) = R(\omega^*, P)$ for every $P$ over binary losses. Furthermore, Lemma 6 states that $R(\omega^b, P) \le R(\omega^*, P^{\mathrm{bin}})$, i.e., for every $P$ over continuous losses there is a corresponding $P^{\mathrm{bin}}$ over binary losses which incurs at least the same regret/redundancy/risk to $\omega^b$. Therefore,
$$\sup_{P \text{ on } [0,1]} R(\omega^b, P) = \sup_{P \text{ on } \{0,1\}} R(\omega^b, P) = \sup_{P \text{ on } \{0,1\}} R(\omega^*, P).$$
By the second part of Corollary 4, for any prediction strategy $\omega$:
$$\sup_{P \text{ on } \{0,1\}} R(\omega^*, P) \le \sup_{P \text{ on } \{0,1\}} R(\omega, P) \le \sup_{P \text{ on } [0,1]} R(\omega, P),$$
which finishes the proof.

A counterexample to the FL strategy

We now argue that the vanilla FL is not the minimax strategy for continuous losses, so that the binarization procedure is justified. We will only consider the excess risk for simplicity, but similar arguments give a counterexample for the expected regret and the expected redundancy as well. Take $K = 2$ experts, $T = 2$ iterations, and distributions $P_1, P_2$ on binary losses. Denote $p_1 = P_1(l_{t,1} = 1)$ and $p_2 = P_2(l_{t,2} = 1)$, and assume $p_1 \le p_2$.
The excess risk of the FL strategy is given by:
$$\Big( \underbrace{p_1 p_2 + (1-p_1)(1-p_2)}_{\text{ties}} \Big) \frac{p_1 + p_2}{2} + p_1(1-p_2)\, p_2 + p_2(1-p_1)\, p_1 - p_1 = \frac{\delta}{2} - \frac{\delta^2}{2},$$
where $\delta = p_2 - p_1$. Maximizing the expression above over $\delta$ gives $\delta = \frac{1}{2}$, and the maximum risk of FL on binary losses is equal to $\frac{1}{8}$. This is also the minimax risk on continuous losses, as follows from the proof of Theorem 7 (because the binarized FL is the minimax strategy on continuous losses, and it achieves its maximum risk on binary losses). We now show that there exist distributions $P_1, P_2$ on continuous losses which force FL to suffer excess risk greater than $\frac{1}{8}$. We take $P_1$ with support on the two points $\{\epsilon, 1\}$, where $\epsilon$ is a very small positive number, and $p_1 = P_1(l_{t,1} = 1)$. Note that $\mathbb{E}[l_{t,1}] = p_1 + \epsilon(1-p_1)$. $P_2$ has support on $\{0, 1-\epsilon\}$, and we let $p_2 = P_2(l_{t,2} = 1-\epsilon)$. We also assume $p_1 < p_2$, i.e., expert 1 is the better expert. The main idea in this counterexample is that, due to the $\epsilon$ shifts, all ties (in the sense of the binary outcomes) are resolved in favor of expert 2, which makes the FL algorithm suffer more loss. More precisely, the risk of FL is now given by:
$$\Big( \underbrace{p_1 p_2 + (1-p_1)(1-p_2)}_{\text{ties}} \Big) p_2 + p_1(1-p_2)\, p_2 + p_2(1-p_1)\, p_1 - p_1 + O(\epsilon).$$
Choosing, e.g., $p_1 = 0$ and $p_2 = 0.5$, this gives $\frac{1}{4} - O(\epsilon)$ excess risk, which is more than $\frac{1}{8}$ given that we take $\epsilon$ sufficiently small.
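The two risk expressions above are easy to verify numerically; the sketch below (our own check, with hypothetical helper names) evaluates the exact $T = 2$ excess risk in the binary case and in the $\epsilon \to 0$ limit of the shifted distributions:

```python
def fl_binary_risk(p1, p2):
    """Exact excess risk of FL for K = 2, T = 2 with Bernoulli(p1), Bernoulli(p2),
    assuming p1 <= p2 (expert 1 is the better expert)."""
    ties = p1 * p2 + (1 - p1) * (1 - p2)          # both experts got the same loss
    return (ties * (p1 + p2) / 2                   # tie -> play uniformly in round 2
            + p1 * (1 - p2) * p2                   # expert 1 lost -> follow expert 2
            + p2 * (1 - p1) * p1                   # expert 2 lost -> follow expert 1
            - p1)                                  # minus the best expert's risk

# delta/2 - delta^2/2 is maximized at delta = p2 - p1 = 1/2, giving 1/8
assert abs(fl_binary_risk(0.0, 0.5) - 0.125) < 1e-9
assert abs(fl_binary_risk(0.25, 0.75) - 0.125) < 1e-9

def fl_shifted_risk(p1, p2):
    """eps -> 0 limit of the excess risk when P1 has support {eps, 1} and
    P2 has support {0, 1 - eps}: every tie is resolved in favor of expert 2."""
    ties = p1 * p2 + (1 - p1) * (1 - p2)
    return ties * p2 + p1 * (1 - p2) * p2 + p2 * (1 - p1) * p1 - p1

assert abs(fl_shifted_risk(0.0, 0.5) - 0.25) < 1e-9   # 1/4 > 1/8
```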
Randomized Smoothing Netorks Maurice Herlihy Computer Science Dept., Bron University, Providence, RI, USA Srikanta Tirthapura Dept. of Electrical and Computer Engg., Ioa State University, Ames, IA, USA
More informationPerceptron Mistake Bounds
Perceptron Mistake Bounds Mehryar Mohri, and Afshin Rostamizadeh Google Research Courant Institute of Mathematical Sciences Abstract. We present a brief survey of existing mistake bounds and introduce
More informationA Generalization of a result of Catlin: 2-factors in line graphs
AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 72(2) (2018), Pages 164 184 A Generalization of a result of Catlin: 2-factors in line graphs Ronald J. Gould Emory University Atlanta, Georgia U.S.A. rg@mathcs.emory.edu
More informationOptimal and Adaptive Online Learning
Optimal and Adaptive Online Learning Haipeng Luo Advisor: Robert Schapire Computer Science Department Princeton University Examples of Online Learning (a) Spam detection 2 / 34 Examples of Online Learning
More informationUC Berkeley Department of Electrical Engineering and Computer Sciences. EECS 126: Probability and Random Processes
UC Berkeley Department of Electrical Engineering and Computer Sciences EECS 26: Probability and Random Processes Problem Set Spring 209 Self-Graded Scores Due:.59 PM, Monday, February 4, 209 Submit your
More informationMinimax Fixed-Design Linear Regression
JMLR: Workshop and Conference Proceedings vol 40:1 14, 2015 Mini Fixed-Design Linear Regression Peter L. Bartlett University of California at Berkeley and Queensland University of Technology Wouter M.
More informationGeneralization bounds
Advanced Course in Machine Learning pring 200 Generalization bounds Handouts are jointly prepared by hie Mannor and hai halev-hwartz he problem of characterizing learnability is the most basic question
More informationEfficient Minimax Strategies for Square Loss Games
Efficient Minimax Strategies for Square Loss Games Wouter M. Koolen Queensland University of Technology and UC Berkeley wouter.koolen@qut.edu.au Alan Malek University of California, Berkeley malek@eecs.berkeley.edu
More informationFoundations of Machine Learning On-Line Learning. Mehryar Mohri Courant Institute and Google Research
Foundations of Machine Learning On-Line Learning Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Motivation PAC learning: distribution fixed over time (training and test). IID assumption.
More informationAdvanced Machine Learning
Advanced Machine Learning Learning and Games MEHRYAR MOHRI MOHRI@ COURANT INSTITUTE & GOOGLE RESEARCH. Outline Normal form games Nash equilibrium von Neumann s minimax theorem Correlated equilibrium Internal
More informationOnline Bounds for Bayesian Algorithms
Online Bounds for Bayesian Algorithms Sham M. Kakade Computer and Information Science Department University of Pennsylvania Andrew Y. Ng Computer Science Department Stanford University Abstract We present
More informationApplications of on-line prediction. in telecommunication problems
Applications of on-line prediction in telecommunication problems Gábor Lugosi Pompeu Fabra University, Barcelona based on joint work with András György and Tamás Linder 1 Outline On-line prediction; Some
More informationBandits for Online Optimization
Bandits for Online Optimization Nicolò Cesa-Bianchi Università degli Studi di Milano N. Cesa-Bianchi (UNIMI) Bandits for Online Optimization 1 / 16 The multiarmed bandit problem... K slot machines Each
More informationUnconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations
JMLR: Workshop and Conference Proceedings vol 35:1 20, 2014 Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations H. Brendan McMahan MCMAHAN@GOOGLE.COM Google,
More informationBandit models: a tutorial
Gdt COS, December 3rd, 2015 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions) Bandit game: a each round t, an agent chooses
More informationGambling in a rigged casino: The adversarial multi-armed bandit problem
Gambling in a rigged casino: The adversarial multi-armed bandit problem Peter Auer Institute for Theoretical Computer Science University of Technology Graz A-8010 Graz (Austria) pauer@igi.tu-graz.ac.at
More informationPreliminaries. Definition: The Euclidean dot product between two vectors is the expression. i=1
90 8 80 7 70 6 60 0 8/7/ Preliminaries Preliminaries Linear models and the perceptron algorithm Chapters, T x + b < 0 T x + b > 0 Definition: The Euclidean dot product beteen to vectors is the expression
More informationMax-Margin Ratio Machine
JMLR: Workshop and Conference Proceedings 25:1 13, 2012 Asian Conference on Machine Learning Max-Margin Ratio Machine Suicheng Gu and Yuhong Guo Department of Computer and Information Sciences Temple University,
More informationCS264: Beyond Worst-Case Analysis Lecture #20: From Unknown Input Distributions to Instance Optimality
CS264: Beyond Worst-Case Analysis Lecture #20: From Unknown Input Distributions to Instance Optimality Tim Roughgarden December 3, 2014 1 Preamble This lecture closes the loop on the course s circle of
More informationBandit Convex Optimization: T Regret in One Dimension
Bandit Convex Optimization: T Regret in One Dimension arxiv:1502.06398v1 [cs.lg 23 Feb 2015 Sébastien Bubeck Microsoft Research sebubeck@microsoft.com Tomer Koren Technion tomerk@technion.ac.il February
More informationCS281B/Stat241B. Statistical Learning Theory. Lecture 14.
CS281B/Stat241B. Statistical Learning Theory. Lecture 14. Wouter M. Koolen Convex losses Exp-concave losses Mixable losses The gradient trick Specialists 1 Menu Today we solve new online learning problems
More informationEfficient learning by implicit exploration in bandit problems with side observations
Efficient learning by implicit exploration in bandit problems with side observations omáš Kocák Gergely Neu Michal Valko Rémi Munos SequeL team, INRIA Lille Nord Europe, France {tomas.kocak,gergely.neu,michal.valko,remi.munos}@inria.fr
More informationTowards a Theory of Societal Co-Evolution: Individualism versus Collectivism
Toards a Theory of Societal Co-Evolution: Individualism versus Collectivism Kartik Ahuja, Simpson Zhang and Mihaela van der Schaar Department of Electrical Engineering, Department of Economics, UCLA Theorem
More informationTime Series Prediction & Online Learning
Time Series Prediction & Online Learning Joint work with Vitaly Kuznetsov (Google Research) MEHRYAR MOHRI MOHRI@ COURANT INSTITUTE & GOOGLE RESEARCH. Motivation Time series prediction: stock values. earthquakes.
More informationOnline Learning for Time Series Prediction
Online Learning for Time Series Prediction Joint work with Vitaly Kuznetsov (Google Research) MEHRYAR MOHRI MOHRI@ COURANT INSTITUTE & GOOGLE RESEARCH. Motivation Time series prediction: stock values.
More informationA survey: The convex optimization approach to regret minimization
A survey: The convex optimization approach to regret minimization Elad Hazan September 10, 2009 WORKING DRAFT Abstract A well studied and general setting for prediction and decision making is regret minimization
More informationExplore no more: Improved high-probability regret bounds for non-stochastic bandits
Explore no more: Improved high-probability regret bounds for non-stochastic bandits Gergely Neu SequeL team INRIA Lille Nord Europe gergely.neu@gmail.com Abstract This work addresses the problem of regret
More informationStat 260/CS Learning in Sequential Decision Problems. Peter Bartlett
Stat 260/CS 294-102. Learning in Sequential Decision Problems. Peter Bartlett 1. Adversarial bandits Definition: sequential game. Lower bounds on regret from the stochastic case. Exp3: exponential weights
More informationk-fold unions of low-dimensional concept classes
k-fold unions of low-dimensional concept classes David Eisenstat September 2009 Abstract We show that 2 is the minimum VC dimension of a concept class whose k-fold union has VC dimension Ω(k log k). Keywords:
More informationBeyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
JMLR: Workshop and Conference Proceedings vol (2010) 1 16 24th Annual Conference on Learning heory Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
More informationModèles stochastiques II
Modèles stochastiques II INFO 154 Gianluca Bontempi Département d Informatique Boulevard de Triomphe - CP 1 http://ulbacbe/di Modéles stochastiques II p1/50 The basics of statistics Statistics starts ith
More informationCompeting With Strategies
JMLR: Workshop and Conference Proceedings vol (203) 27 Competing With Strategies Wei Han Alexander Rakhlin Karthik Sridharan wehan@wharton.upenn.edu rakhlin@wharton.upenn.edu skarthik@wharton.upenn.edu
More informationLecture 3: Lower Bounds for Bandit Algorithms
CMSC 858G: Bandits, Experts and Games 09/19/16 Lecture 3: Lower Bounds for Bandit Algorithms Instructor: Alex Slivkins Scribed by: Soham De & Karthik A Sankararaman 1 Lower Bounds In this lecture (and
More informationOnline Learning with Predictable Sequences
JMLR: Workshop and Conference Proceedings vol (2013) 1 27 Online Learning with Predictable Sequences Alexander Rakhlin Karthik Sridharan rakhlin@wharton.upenn.edu skarthik@wharton.upenn.edu Abstract We
More information0.1 Motivating example: weighted majority algorithm
princeton univ. F 16 cos 521: Advanced Algorithm Design Lecture 8: Decision-making under total uncertainty: the multiplicative weight algorithm Lecturer: Sanjeev Arora Scribe: Sanjeev Arora (Today s notes
More informationThe best expert versus the smartest algorithm
Theoretical Computer Science 34 004 361 380 www.elsevier.com/locate/tcs The best expert versus the smartest algorithm Peter Chen a, Guoli Ding b; a Department of Computer Science, Louisiana State University,
More informationLong run input use-input price relations and the cost function Hessian. Ian Steedman Manchester Metropolitan University
Long run input use-input price relations and the cost function Hessian Ian Steedman Manchester Metropolitan University Abstract By definition, to compare alternative long run equilibria is to compare alternative
More informationThe Online Approach to Machine Learning
The Online Approach to Machine Learning Nicolò Cesa-Bianchi Università degli Studi di Milano N. Cesa-Bianchi (UNIMI) Online Approach to ML 1 / 53 Summary 1 My beautiful regret 2 A supposedly fun game I
More informationA Second-order Bound with Excess Losses
A Second-order Bound with Excess Losses Pierre Gaillard 12 Gilles Stoltz 2 Tim van Erven 3 1 EDF R&D, Clamart, France 2 GREGHEC: HEC Paris CNRS, Jouy-en-Josas, France 3 Leiden University, the Netherlands
More informationToward a Classification of Finite Partial-Monitoring Games
Toward a Classification of Finite Partial-Monitoring Games Gábor Bartók ( *student* ), Dávid Pál, and Csaba Szepesvári Department of Computing Science, University of Alberta, Canada {bartok,dpal,szepesva}@cs.ualberta.ca
More informationCHAPTER 3 THE COMMON FACTOR MODEL IN THE POPULATION. From Exploratory Factor Analysis Ledyard R Tucker and Robert C. MacCallum
CHAPTER 3 THE COMMON FACTOR MODEL IN THE POPULATION From Exploratory Factor Analysis Ledyard R Tucker and Robert C. MacCallum 1997 19 CHAPTER 3 THE COMMON FACTOR MODEL IN THE POPULATION 3.0. Introduction
More informationEndogeny for the Logistic Recursive Distributional Equation
Zeitschrift für Analysis und ihre Anendungen Journal for Analysis and its Applications Volume 30 20, 237 25 DOI: 0.47/ZAA/433 c European Mathematical Society Endogeny for the Logistic Recursive Distributional
More informationBregman Divergence and Mirror Descent
Bregman Divergence and Mirror Descent Bregman Divergence Motivation Generalize squared Euclidean distance to a class of distances that all share similar properties Lots of applications in machine learning,
More informationLecture 8: Decision-making under total uncertainty: the multiplicative weight algorithm. Lecturer: Sanjeev Arora
princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 8: Decision-making under total uncertainty: the multiplicative weight algorithm Lecturer: Sanjeev Arora Scribe: (Today s notes below are
More informationAsymptotic redundancy and prolixity
Asymptotic redundancy and prolixity Yuval Dagan, Yuval Filmus, and Shay Moran April 6, 2017 Abstract Gallager (1978) considered the worst-case redundancy of Huffman codes as the maximum probability tends
More informationLecture 4: Lower Bounds (ending); Thompson Sampling
CMSC 858G: Bandits, Experts and Games 09/12/16 Lecture 4: Lower Bounds (ending); Thompson Sampling Instructor: Alex Slivkins Scribed by: Guowei Sun,Cheng Jie 1 Lower bounds on regret (ending) Recap from
More informationMulti-armed bandit models: a tutorial
Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)
More informationarxiv: v1 [cs.lg] 1 Jun 2016
Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits arxiv:1606.00313v1 cs.g 1 Jun 2016 Vasilis Syrgkanis Microsoft Research Haipeng uo Princeton University Robert E. Schapire Microsoft
More informationOnline prediction with expert advise
Online prediction with expert advise Jyrki Kivinen Australian National University http://axiom.anu.edu.au/~kivinen Contents 1. Online prediction: introductory example, basic setting 2. Classification with
More informationOn a generalized combinatorial conjecture involving addition mod 2 k 1
On a generalized combinatorial conjecture involving addition mod k 1 Gérard Cohen Jean-Pierre Flori Tuesday 14 th February, 01 Abstract In this note, e give a simple proof of the combinatorial conjecture
More informationBivariate Uniqueness in the Logistic Recursive Distributional Equation
Bivariate Uniqueness in the Logistic Recursive Distributional Equation Antar Bandyopadhyay Technical Report # 629 University of California Department of Statistics 367 Evans Hall # 3860 Berkeley CA 94720-3860
More informationCombinatorial Bandits
Combinatorial Bandits Nicolò Cesa-Bianchi Università degli Studi di Milano, Italy Gábor Lugosi 1 ICREA and Pompeu Fabra University, Spain Abstract We study sequential prediction problems in which, at each
More informationOn the approximation of real powers of sparse, infinite, bounded and Hermitian matrices
On the approximation of real poers of sparse, infinite, bounded and Hermitian matrices Roman Werpachoski Center for Theoretical Physics, Al. Lotnikó 32/46 02-668 Warszaa, Poland Abstract We describe a
More informationAdaptive Online Learning in Dynamic Environments
Adaptive Online Learning in Dynamic Environments Lijun Zhang, Shiyin Lu, Zhi-Hua Zhou National Key Laboratory for Novel Software Technology Nanjing University, Nanjing 210023, China {zhanglj, lusy, zhouzh}@lamda.nju.edu.cn
More informationLecture 16: Perceptron and Exponential Weights Algorithm
EECS 598-005: Theoretical Foundations of Machine Learning Fall 2015 Lecture 16: Perceptron and Exponential Weights Algorithm Lecturer: Jacob Abernethy Scribes: Yue Wang, Editors: Weiqing Yu and Andrew
More informationy(x) = x w + ε(x), (1)
Linear regression We are ready to consider our first machine-learning problem: linear regression. Suppose that e are interested in the values of a function y(x): R d R, here x is a d-dimensional vector-valued
More informationLecture 19: UCB Algorithm and Adversarial Bandit Problem. Announcements Review on stochastic multi-armed bandit problem
Lecture 9: UCB Algorithm and Adversarial Bandit Problem EECS598: Prediction and Learning: It s Only a Game Fall 03 Lecture 9: UCB Algorithm and Adversarial Bandit Problem Prof. Jacob Abernethy Scribe:
More informationThe sample complexity of agnostic learning with deterministic labels
The sample complexity of agnostic learning with deterministic labels Shai Ben-David Cheriton School of Computer Science University of Waterloo Waterloo, ON, N2L 3G CANADA shai@uwaterloo.ca Ruth Urner College
More informationEarly & Quick COSMIC-FFP Analysis using Analytic Hierarchy Process
Early & Quick COSMIC-FFP Analysis using Analytic Hierarchy Process uca Santillo (luca.santillo@gmail.com) Abstract COSMIC-FFP is a rigorous measurement method that makes possible to measure the functional
More informationExtracting Certainty from Uncertainty: Regret Bounded by Variation in Costs
Extracting Certainty from Uncertainty: Regret Bounded by Variation in Costs Elad Hazan IBM Almaden 650 Harry Rd, San Jose, CA 95120 hazan@us.ibm.com Satyen Kale Microsoft Research 1 Microsoft Way, Redmond,
More informationOnline Learning with Feedback Graphs
Online Learning with Feedback Graphs Nicolò Cesa-Bianchi Università degli Studi di Milano Joint work with: Noga Alon (Tel-Aviv University) Ofer Dekel (Microsoft Research) Tomer Koren (Technion and Microsoft
More informationOnline Learning and Sequential Decision Making
Online Learning and Sequential Decision Making Emilie Kaufmann CNRS & CRIStAL, Inria SequeL, emilie.kaufmann@univ-lille.fr Research School, ENS Lyon, Novembre 12-13th 2018 Emilie Kaufmann Online Learning
More informationNeural Networks. Associative memory 12/30/2015. Associative memories. Associative memories
//5 Neural Netors Associative memory Lecture Associative memories Associative memories The massively parallel models of associative or content associative memory have been developed. Some of these models
More informationThe Free Matrix Lunch
The Free Matrix Lunch Wouter M. Koolen Wojciech Kot lowski Manfred K. Warmuth Tuesday 24 th April, 2012 Koolen, Kot lowski, Warmuth (RHUL) The Free Matrix Lunch Tuesday 24 th April, 2012 1 / 26 Introduction
More informationCHARACTERIZATION OF ULTRASONIC IMMERSION TRANSDUCERS
CHARACTERIZATION OF ULTRASONIC IMMERSION TRANSDUCERS INTRODUCTION David D. Bennink, Center for NDE Anna L. Pate, Engineering Science and Mechanics Ioa State University Ames, Ioa 50011 In any ultrasonic
More informationRegret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Part I. Sébastien Bubeck Theory Group
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Part I Sébastien Bubeck Theory Group i.i.d. multi-armed bandit, Robbins [1952] i.i.d. multi-armed bandit, Robbins [1952] Known
More informationHedging Structured Concepts
Hedging Structured Concepts Wouter M. Koolen Advanced Systems Research Centrum Wiskunde en Informatica wmkoolen@cwi.nl Manfred K. Warmuth Department of Computer Science UC Santa Cruz manfred@cse.ucsc.edu
More informationTheory and Applications of A Repeated Game Playing Algorithm. Rob Schapire Princeton University [currently visiting Yahoo!
Theory and Applications of A Repeated Game Playing Algorithm Rob Schapire Princeton University [currently visiting Yahoo! Research] Learning Is (Often) Just a Game some learning problems: learn from training
More information