Minimax strategy for prediction with expert advice under stochastic assumptions


Wojciech Kotłowski
Poznań University of Technology, Poland

Abstract

We consider the setting of prediction with expert advice with an additional assumption that each expert generates its losses i.i.d. according to some distribution. We first identify a class of admissible strategies, which we call permutation invariant, and show that every strategy outside this class will perform not better than some permutation invariant strategy. We then show that when the losses are binary, a simple Follow the Leader (FL) algorithm is the minimax strategy for this game, where minimaxity is simultaneously achieved for the expected regret, the expected redundancy, and the excess risk. Furthermore, FL also has the smallest regret, redundancy, and excess risk over all permutation invariant prediction strategies, simultaneously for all distributions over binary losses. When the losses are continuous in [0, 1], FL remains minimax only when an additional trick called loss binarization is applied.

1 Introduction

In the game of prediction with expert advice [1, 2], the algorithm sequentially decides on one of K experts to follow, and suffers the loss associated with the chosen expert. The difference between the algorithm's cumulative loss and the cumulative loss of the best expert is called regret. The goal is to minimize the regret in the worst case over all possible loss sequences. An algorithm which achieves this goal (i.e., minimizes the worst-case regret) is called minimax. While there is no known solution to this problem in the general setting, it is possible to derive minimax algorithms for some special variants of this game [1, 2, 3, 4]. Interestingly, all these algorithms share a similar strategy of playing against a maximin adversary which assigns losses uniformly at random. They often have the equalization property: all data sequences lead to the same value of the regret. While this property makes them robust against the worst-case sequence, it also makes them over-conservative, preventing them from exploiting the case when the actual data are easy. There are various algorithms which combine almost optimal worst-case performance with good performance on easy sequences [5, 6, 7, 8, 9]; these algorithms, however, are not motivated from the minimax principle.

In this paper, we drop the analysis of worst-case performance entirely, and explore the minimax principle in a more constrained setting, in which the adversary is assumed to be stochastic. In particular, we associate with each expert $k$ a fixed distribution $P_k$ over loss values, and assume the observed losses of expert $k$ are generated independently from $P_k$. We believe this setting might be practically useful, and that it is interesting to determine the minimax algorithm under this assumption. We immediately face two difficulties here. First, due to the stochastic nature of the adversary, it is no longer possible to follow standard approaches of minimax analysis, such as backward induction [1, 2] or sequential minimax duality [10, 3], and we need to resort to a different technique. We define the notion of permutation invariance of prediction strategies. This lets us identify a class of admissible strategies (which we call permutation invariant), and show that every strategy outside this class will perform not better than some permutation invariant strategy. Secondly, while the regret is a single, commonly used performance metric in the worst-case setting, the situation is different in the stochastic case.

We know of at least three potentially useful metrics in the stochastic setting: the expected regret, the expected redundancy, and the excess risk [11], and it is not clear which of them should be used to define the minimax strategy. Fortunately, it turns out that there exists a single strategy which is minimax with respect to all three metrics simultaneously. In the case of binary losses, which take values in {0, 1}, this strategy turns out to be the Follow the Leader (FL) algorithm, which chooses an expert with the smallest cumulative loss at a given trial (with ties broken randomly). Interestingly, FL is known to perform poorly in the worst case, as its worst-case regret will grow linearly with T [2]. On the contrary, in the stochastic setting with binary losses, FL also has the smallest regret, redundancy, and excess risk over all permutation invariant prediction strategies, simultaneously for all distributions over binary losses! In the more general case of continuous losses in the range [0, 1], FL is provably suboptimal. However, by applying the binarization trick to the losses [6], i.e. randomly setting them to {0, 1} such that the expectation matches the actual loss, and using FL on the binarized sequence, we obtain the minimax strategy in the continuous case.

2 Problem Setting

In the game of prediction with expert advice, at each trial t = 1, ..., T, the algorithm predicts with a distribution $w_t \in \Delta_K$ over K experts. Then the loss vector $\ell_t \in X^K$ is revealed (where X is either $\{0,1\}$ or $[0,1]$), and the algorithm suffers loss $w_t^\top \ell_t$. The sequence of outcomes $\ell_1, \ldots, \ell_t$ is abbreviated as $\ell^t$. We let $L_{t,k}$ denote the cumulative loss of expert k at iteration t, $L_{t,k} = \sum_{q \le t} \ell_{q,k}$. We assume there are K distributions $P = (P_1, \ldots, P_K)$ over X, such that for each k, the losses $\ell_{t,k}$, t = 1, ..., T, are generated i.i.d. from $P_k$. Note that this implies that $\ell_{t,k}$ is independent from $\ell_{t',k'}$ whenever $t \ne t'$ or $k \ne k'$. We formally define a prediction strategy as a sequence of T functions $\omega = (w_1, \ldots, w_T)$, such that $w_t \colon X^{t-1} \to \Delta_K$. The performance of the strategy ω on the set of distributions P can be measured by one of the three following metrics:

the expected regret:
$$\mathrm{Reg}(\omega, P) = \mathbb{E}\Big[\sum_{t=1}^T w_t(\ell^{t-1})^\top \ell_t - \min_k L_{T,k}\Big],$$

the expected redundancy:
$$\mathrm{Red}(\omega, P) = \mathbb{E}\Big[\sum_{t=1}^T w_t(\ell^{t-1})^\top \ell_t\Big] - \min_k \mathbb{E}\big[L_{T,k}\big],$$

the excess risk:
$$\mathrm{Risk}(\omega, P) = \mathbb{E}\big[w_T(\ell^{T-1})^\top \ell_T\big] - \min_k \mathbb{E}\big[\ell_{T,k}\big],$$

where in each case, the expectation is over the sequence $\ell^T$ with respect to distributions $P_1, \ldots, P_K$. Given a performance measure R, we say that a strategy ω is minimax with respect to R if:
$$\sup_P R(\omega, P) = \inf_{\omega'} \sup_P R(\omega', P),$$
where the supremum is over all K-sets of distributions $(P_1, \ldots, P_K)$ on X, and the infimum is over all prediction strategies.

We say that a strategy ω is permutation invariant if for any t = 1, ..., T, and any permutation $\sigma \in S_K$, where $S_K$ denotes the group of permutations over {1, ..., K},
$$w_t(\sigma(\ell^{t-1})) = \sigma\big(w_t(\ell^{t-1})\big),$$
where for any vector $v = (v_1, \ldots, v_K)$, we define $\sigma(v) = (v_{\sigma(1)}, \ldots, v_{\sigma(K)})$ and abbreviate $\sigma(\ell^{t-1}) = (\sigma(\ell_1), \ldots, \sigma(\ell_{t-1}))$. In words, if we σ-permute the indices of all past loss vectors, the resulting weight vector will be the σ-permutation of the original weight vector. Permutation invariant strategies are natural, as they only rely on the observed outcomes, not on the expert indices.

Lemma 1. Let ω be permutation invariant. Then, for any permutation $\sigma \in S_K$,
$$\mathbb{E}_{\sigma(P)}\big[w_t(\ell^{t-1})^\top \ell_t\big] = \mathbb{E}_P\big[w_t(\ell^{t-1})^\top \ell_t\big],$$
and moreover $R(\omega, \sigma(P)) = R(\omega, P)$, where R is the expected regret, expected redundancy, or excess risk, and $\sigma(P) = (P_{\sigma(1)}, \ldots, P_{\sigma(K)})$.

We now show that permutation invariant strategies are admissible in the following sense:

Theorem 2.
For any strategy ω, there exists a permutation invariant strategy $\tilde\omega$, such that $\sup_P R(\tilde\omega, P) \le \sup_P R(\omega, P)$, where R is either the expected regret, the expected redundancy, or the excess risk.
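As an aside, the three metrics above can be estimated by straightforward simulation. The sketch below is our own illustration (the function `simulate` and its sampling scheme are assumptions, not part of the paper); it plays a given strategy against K experts with i.i.d. Bernoulli losses and returns Monte Carlo estimates of the expected regret, redundancy, and excess risk.

```python
import numpy as np

def simulate(strategy, P, T, n_runs=10000, seed=0):
    """Monte Carlo estimates of expected regret, redundancy and excess risk
    for a strategy against K experts with i.i.d. Bernoulli(p_k) losses.
    strategy(past_losses) must return a weight vector in the simplex."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    K = len(P)
    alg_loss = np.zeros(T)   # estimates of E[w_t . l_t] for each trial t
    best_final = 0.0         # estimate of E[min_k L_{T,k}]
    for _ in range(n_runs):
        losses = np.zeros((0, K))
        for t in range(T):
            w = strategy(losses)
            l = (rng.random(K) < P).astype(float)   # l_{t,k} ~ Bernoulli(p_k)
            alg_loss[t] += (w @ l) / n_runs
            losses = np.vstack([losses, l])
        best_final += losses.sum(axis=0).min() / n_runs
    regret = alg_loss.sum() - best_final        # E[sum_t w_t.l_t - min_k L_{T,k}]
    redundancy = alg_loss.sum() - T * P.min()   # E[sum_t w_t.l_t] - min_k E[L_{T,k}]
    risk = alg_loss[-1] - P.min()               # E[w_T.l_T] - min_k E[l_{T,k}]
    return regret, redundancy, risk
```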

Theorem 2 states that it suffices to search for a minimax strategy only within the set of permutation invariant strategies.

Given a loss sequence $\ell^{t-1}$, let $N = |\arg\min_{j=1,\ldots,K} L_{t-1,j}|$ be the size of the leader set at the beginning of trial t. We define the Follow the Leader (FL) strategy $w^*_t$ such that $w^*_{t,k} = \frac{1}{N}$ if $k \in \arg\min_j L_{t-1,j}$ and $w^*_{t,k} = 0$ otherwise. In other words, FL predicts with the current leader, breaking ties uniformly at random. It is easy to show that the FL strategy is permutation invariant (a code sketch of FL appears at the end of this section).

3 Binary losses

In this section, we set X = {0, 1}, so that all losses are binary. In this case, each $P_k$ is a Bernoulli distribution. Take any permutation invariant strategy ω. Using Lemma 1 on all $\sigma \in S_K$:
$$\mathbb{E}_P\big[w_t(\ell^{t-1})^\top \ell_t\big] = \underbrace{\frac{1}{K!}\sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\big[w_t(\ell^{t-1})^\top \ell_t\big]}_{=:\ \mathrm{loss}_t(w_t,\,P)}. \quad (1)$$

We now show the main result of this paper, a surprisingly strong property of the FL strategy, which states that FL minimizes $\mathrm{loss}_t(w_t, P)$ over all K-sets of distributions simultaneously. Hence, FL is not only optimal in the worst case, but is actually optimal for the permutation-averaged expected loss for any P, which implies by (1) that FL has the smallest expected loss among all permutation invariant strategies for any P.

Theorem 3. Let $\omega^* = (w^*_1, \ldots, w^*_T)$ be the FL strategy. Then, for any K-set of distributions $P = (P_1, \ldots, P_K)$ over binary losses, for any strategy $\omega = (w_1, \ldots, w_T)$, and any t = 1, ..., T:
$$\mathrm{loss}_t(w^*_t, P) \le \mathrm{loss}_t(w_t, P).$$

The consequence of Theorem 3 is the following corollary, which states the minimaxity of the FL strategy for binary losses:

Corollary 4. Let $\omega^* = (w^*_1, \ldots, w^*_T)$ be the FL strategy. Then, for any P over binary losses, and any permutation invariant strategy ω:
$$R(\omega^*, P) \le R(\omega, P),$$
where R is the expected regret, expected redundancy, or excess risk. This implies:
$$\sup_P R(\omega^*, P) = \inf_\omega \sup_P R(\omega, P),$$
where the supremum is over all distributions on binary losses, and the infimum is over all (not necessarily permutation invariant) strategies.

Proof. The second statement immediately follows from the first statement and Theorem 2. For the first statement, note that the "loss of the best expert" part of each measure only depends on P. Hence, we only need to show that for any t = 1, ..., T, $\mathbb{E}_P[w^{*\top}_t \ell_t] \le \mathbb{E}_P[w_t^\top \ell_t]$. Since $w^*_t$ and $w_t$ are permutation invariant, Lemma 1 shows that $\mathbb{E}_P[w^{*\top}_t \ell_t] = \mathrm{loss}_t(w^*_t, P)$, and similarly, $\mathbb{E}_P[w_t^\top \ell_t] = \mathrm{loss}_t(w_t, P)$. Application of Theorem 3 finishes the proof.
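The code sketch referred to above: a direct implementation of the FL weights $w^*_t$ with uniform tie-breaking (our illustration; it can be plugged into the hypothetical `simulate` helper from Section 2).

```python
import numpy as np

def follow_the_leader(losses):
    """FL strategy w*_t: uniform over the leader set argmin_j L_{t-1,j},
    zero elsewhere. `losses` holds the past loss vectors, shape (t-1, K)."""
    K = losses.shape[1]
    if losses.shape[0] == 0:              # trial t = 1: every expert is a leader
        return np.full(K, 1.0 / K)
    L = losses.sum(axis=0)                # cumulative losses L_{t-1,j}
    leaders = np.isclose(L, L.min())      # indicator of the leader set, size N
    return leaders / leaders.sum()        # 1/N on leaders, 0 otherwise
```

Permutation invariance is visible directly in the code: permuting the columns of `losses` permutes `L`, and hence the returned weight vector, in exactly the same way.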

4 Continuous losses

In this section, we consider the general case X = [0, 1] of continuous loss vectors. We give a modification of FL and prove its minimaxity. In the appendix, we justify the modification by arguing that the vanilla FL strategy is not minimax for continuous losses.

The modification of FL is based on a procedure we call binarization. A similar trick has already been used in [6] to deal with non-integer losses in a different context. We define a binarization of any loss value $\ell_{t,k} \in [0,1]$ as a Bernoulli random variable $b_{t,k}$ which takes value 1 with probability $\ell_{t,k}$ and value 0 with probability $1 - \ell_{t,k}$. In other words, we replace each non-binary loss $\ell_{t,k}$ by a random binary outcome $b_{t,k}$, such that $\mathbb{E}[b_{t,k}] = \ell_{t,k}$. Note that if $\ell_{t,k} \in \{0,1\}$, then $b_{t,k} = \ell_{t,k}$, i.e. binarization has no effect on losses which are already binary. Let us also define $b_t = (b_{t,1}, \ldots, b_{t,K})$, where all K Bernoulli random variables $b_{t,k}$ are independent. Similarly, $b^t$ will denote a binary loss sequence $b_1, \ldots, b_t$, where the binarization procedure was applied independently (with a new set of Bernoulli variables) for each trial t. Now, given the loss sequence $\ell^{t-1}$, we define the binarized FL strategy $\omega^b$ by:
$$w^b_t(\ell^{t-1}) = \mathbb{E}_{b^{t-1}}\big[w^*_t(b^{t-1})\big],$$
where $w^*_t(b^{t-1})$ is the standard FL strategy applied to the binarized losses $b^{t-1}$, and the expectation is over the internal randomization of the algorithm (the binarization variables). Note that if the set of distributions P has support only on {0, 1}, then $w^b_t \equiv w^*_t$. On the other hand, these two strategies may differ significantly for non-binary losses. However, we will show that for any K-set of distributions P (with support in [0, 1]), $w^b_t$ will behave in the same way as $w^*_t$ would behave on some particular K-set of distributions over binary losses. To this end, we introduce the binarization of a K-set of distributions P, defined as $P^{\mathrm{bin}} = (P^{\mathrm{bin}}_1, \ldots, P^{\mathrm{bin}}_K)$, where $P^{\mathrm{bin}}_k$ is a distribution with support {0, 1} such that:
$$\mathbb{E}_{P^{\mathrm{bin}}_k}\big[\ell_{t,k}\big] = P^{\mathrm{bin}}_k(\ell_{t,k} = 1) = \mathbb{E}_{P_k}\big[\ell_{t,k}\big].$$
In other words, $P^{\mathrm{bin}}_k$ is a Bernoulli distribution which has the same expectation as the original distribution $P_k$ (over continuous losses). We now show the following results:

Lemma 5. For any K-set of distributions $P = (P_1, \ldots, P_K)$ with support on X = [0, 1],
$$\mathbb{E}_{\ell^t \sim P^t}\big[w^b_t(\ell^{t-1})^\top \ell_t\big] = \mathbb{E}_{\ell^t \sim (P^{\mathrm{bin}})^t}\big[w^*_t(\ell^{t-1})^\top \ell_t\big].$$

Lemma 6. For any K-set of distributions $P = (P_1, \ldots, P_K)$ with support on X = [0, 1],
$$R(\omega^b, P) \le R(\omega^*, P^{\mathrm{bin}}),$$
where R is either the expected regret, the expected redundancy, or the excess risk.

Theorem 7. Let $\omega^b = (w^b_1, \ldots, w^b_T)$ be the binarized FL strategy. Then:
$$\sup_P R(\omega^b, P) = \inf_\omega \sup_P R(\omega, P),$$
where R is the expected regret, expected redundancy, or excess risk, the supremum is over all K-sets of distributions on [0, 1], and the infimum is over all prediction strategies.

In the appendix, we argue that vanilla FL is not the minimax strategy for continuous losses, so that the binarization procedure is justified.
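As an illustration of the binarized strategy $w^b_t$, the sketch below estimates the expectation over the binarization variables by sampling (an exact computation would enumerate, or dynamically program over, the binary histories). This is our own sketch; the helper `follow_the_leader` is the hypothetical implementation from Section 3, and the sample count is arbitrary.

```python
import numpy as np

def binarized_fl(losses, n_samples=1000, seed=0):
    """Binarized FL: w^b_t = E_{b^{t-1}}[ w*_t(b^{t-1}) ], estimated by
    sampling binarizations b_{q,k} ~ Bernoulli(l_{q,k}) independently."""
    rng = np.random.default_rng(seed)
    if losses.shape[0] == 0:
        return follow_the_leader(losses)   # no history: identical to vanilla FL
    w = np.zeros(losses.shape[1])
    for _ in range(n_samples):
        b = (rng.random(losses.shape) < losses).astype(float)  # E[b] = losses
        w += follow_the_leader(b) / n_samples
    return w
```

Note that on binary losses `b` always equals `losses`, so the sketch returns exactly the FL weights, matching the remark that $w^b_t \equiv w^*_t$ on {0, 1}-valued sequences.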

5 Open Problem

The setting considered in this paper is quite limited even in the stochastic case, as it does not consider distributions over loss vectors which are i.i.d. between trials, but not necessarily i.i.d. between experts. It would be interesting to determine the minimax strategy in this more general setting. Preliminary computational experiments suggest that FL is not minimax even for binary losses, when dependencies between experts are allowed.

Acknowledgments. The author was supported by the Polish National Science Centre under grant no. 2013/11/D/ST6/.

References

[1] Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427-485, 1997.
[2] Nicolò Cesa-Bianchi and Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
[3] Wouter M. Koolen. Combining Strategies Efficiently: High-quality Decisions from Conflicting Advice. PhD thesis, ILLC, University of Amsterdam, 2011.
[4] Jacob Abernethy, Manfred K. Warmuth, and Joel Yellin. When random play is optimal against an adversary. In COLT, July 2008.
[5] Steven de Rooij, Tim van Erven, Peter D. Grünwald, and Wouter M. Koolen. Follow the leader if you can, hedge if you must. Journal of Machine Learning Research, 15(1):1281-1316, 2014.
[6] Tim van Erven, Wojciech Kotłowski, and Manfred K. Warmuth. Follow the leader with dropout perturbations. In COLT, 2014.
[7] Amir Sani, Gergely Neu, and Alessandro Lazaric. Exploiting easy data in online optimization. In NIPS, 2014.
[8] Wouter M. Koolen and Tim van Erven. Second-order quantile methods for experts and combinatorial games. In COLT, 2015.
[9] Haipeng Luo and Robert E. Schapire. Achieving all with no parameters: AdaNormalHedge. In COLT, 2015.
[10] Jacob Abernethy, Alekh Agarwal, Peter L. Bartlett, and Alexander Rakhlin. A stochastic view of optimal regret through minimax duality. In COLT, 2009.
[11] Peter D. Grünwald. The Minimum Description Length Principle. MIT Press, Cambridge, MA, 2007.

Proof of Lemma 1

We first show that the expected loss of the algorithm at any iteration t = 1, ..., T is the same for both σ(P) and P:
$$\mathbb{E}_{\sigma(P)}\big[w_t(\ell^{t-1})^\top \ell_t\big] = \mathbb{E}_P\big[w_t(\sigma(\ell^{t-1}))^\top \sigma(\ell_t)\big] = \mathbb{E}_P\big[\sigma(w_t(\ell^{t-1}))^\top \sigma(\ell_t)\big] = \mathbb{E}_P\big[w_t(\ell^{t-1})^\top \ell_t\big],$$
where the first equality is due to the fact that permuting the distributions is equivalent to permuting the coordinates of the losses (which are random variables with respect to these distributions), the second equality exploits the permutation invariance of ω, while the third equality uses the simple fact that the dot product is invariant under permuting both arguments. Therefore, the "loss of the algorithm" part of any of the three measures (regret, redundancy, risk) remains the same. To show that the "loss of the best expert" part of each measure is the same, note that for any t = 1, ..., T and k = 1, ..., K, $\mathbb{E}_{\sigma(P)}[\ell_{t,k}] = \mathbb{E}_P[\ell_{t,\sigma(k)}]$, which implies:
$$\min_k \mathbb{E}_{\sigma(P)}\big[\ell_{T,k}\big] = \min_k \mathbb{E}_P\big[\ell_{T,\sigma(k)}\big] = \min_k \mathbb{E}_P\big[\ell_{T,k}\big],$$
$$\min_k \mathbb{E}_{\sigma(P)}\big[L_{T,k}\big] = \min_k \mathbb{E}_P\big[L_{T,\sigma(k)}\big] = \min_k \mathbb{E}_P\big[L_{T,k}\big],$$
$$\mathbb{E}_{\sigma(P)}\big[\min_k L_{T,k}\big] = \mathbb{E}_P\big[\min_k L_{T,\sigma(k)}\big] = \mathbb{E}_P\big[\min_k L_{T,k}\big],$$
so that the "loss of the best expert" parts of all measures are also the same for both σ(P) and P.

Proof of Theorem 2

Define $\tilde\omega = (\tilde w_1, \ldots, \tilde w_T)$ as:
$$\tilde w_t(\ell^{t-1}) = \frac{1}{K!}\sum_{\tau \in S_K} \tau^{-1}\big(w_t(\tau(\ell^{t-1}))\big).$$
Note that $\tilde\omega$ is a valid prediction strategy, since $\tilde w_t$ is a function of $\ell^{t-1}$, and $\tilde w_t \in \Delta_K$ ($\tilde w_t$ is a convex combination of distributions, so it is a distribution itself). Moreover, $\tilde\omega$ is permutation invariant:
$$\tilde w_t(\sigma(\ell^{t-1})) = \frac{1}{K!}\sum_{\tau} \tau^{-1}\big(w_t(\tau(\sigma(\ell^{t-1})))\big) = \frac{1}{K!}\sum_{\tau} \sigma\Big((\sigma\tau)^{-1}\big(w_t((\sigma\tau)(\ell^{t-1}))\big)\Big) = \sigma\Big(\frac{1}{K!}\sum_{\tau'} \tau'^{-1}\big(w_t(\tau'(\ell^{t-1}))\big)\Big) = \sigma\big(\tilde w_t(\ell^{t-1})\big),$$
where the second equality uses that, with our notation, $\tau(\sigma(\ell^{t-1})) = (\sigma\tau)(\ell^{t-1})$ and $\sigma\big((\sigma\tau)^{-1}(v)\big) = \tau^{-1}(v)$, and the third follows from replacing the summation index by $\tau' = \sigma\tau$. Now, note that the expected loss of $\tilde w_t$ is:
$$\mathbb{E}_P\big[\tilde w_t(\ell^{t-1})^\top \ell_t\big] = \frac{1}{K!}\sum_{\tau \in S_K} \mathbb{E}_P\Big[\tau^{-1}\big(w_t(\tau(\ell^{t-1}))\big)^\top \ell_t\Big] = \frac{1}{K!}\sum_{\tau \in S_K} \mathbb{E}_P\Big[w_t(\tau(\ell^{t-1}))^\top \tau(\ell_t)\Big] = \frac{1}{K!}\sum_{\tau \in S_K} \mathbb{E}_{\tau(P)}\big[w_t(\ell^{t-1})^\top \ell_t\big],$$
where the second equality uses $\tau^{-1}(u)^\top v = u^\top \tau(v)$, and the third the fact (as in the proof of Lemma 1) that permuting the coordinates of all loss vectors under P is equivalent to sampling from τ(P). Since the "loss of the best expert" parts of all three measures are invariant under any permutation of P (see the proof of Lemma 1), we have:
$$R(\tilde\omega, P) = \frac{1}{K!}\sum_{\sigma \in S_K} R(\omega, \sigma(P)) \le \max_{\sigma \in S_K} R(\omega, \sigma(P)).$$
Hence,
$$\sup_P R(\tilde\omega, P) \le \sup_P \max_{\sigma \in S_K} R(\omega, \sigma(P)) = \sup_P R(\omega, P).$$
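The construction of $\tilde\omega$ in this proof can be written down directly. The following sketch is our own illustration (exponential in K, so only usable for small K): it wraps an arbitrary strategy into its permutation-symmetrized version.

```python
import itertools
import numpy as np

def symmetrize(strategy):
    """Build the permutation invariant strategy from the proof of Theorem 2:
    average tau^{-1}(w_t(tau(l^{t-1}))) over all tau in S_K."""
    def invariant_strategy(losses):
        K = losses.shape[1]
        perms = list(itertools.permutations(range(K)))
        w = np.zeros(K)
        for tau in perms:
            tau = np.array(tau)
            w_perm = strategy(losses[:, tau])  # w_t(tau(l^{t-1})): permute columns
            w_back = np.empty(K)
            w_back[tau] = w_perm               # apply tau^{-1} to the weights
            w += w_back / len(perms)
        return w
    return invariant_strategy
```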

Proof of Theorem 3

For any distribution $P_k$ over binary losses, let $p_k := P_k(\ell_{t,k} = 1) = \mathbb{E}_{P_k}[\ell_{t,k}]$, and let $L_j = \sum_{q=1}^{t-1} \ell_{q,j}$ denote the cumulative loss of expert j on the sequence $\ell^{t-1}$. We have:
$$\mathrm{loss}_t(w_t, P) = \frac{1}{K!}\sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\big[w_t(\ell^{t-1})^\top \ell_t\big] \quad (2)$$
$$= \frac{1}{K!}\sum_{\sigma \in S_K} \mathbb{E}_{\sigma(P)}\big[w_t(\ell^{t-1})\big]^\top \mathbb{E}_{\sigma(P)}\big[\ell_t\big]$$
$$= \frac{1}{K!}\sum_{\sigma \in S_K} \sum_{\ell^{t-1}} \Big(\prod_{k=1}^K p_{\sigma(k)}^{L_k} (1-p_{\sigma(k)})^{t-1-L_k}\Big) \sum_{k=1}^K w_{t,k}(\ell^{t-1})\, p_{\sigma(k)}$$
$$= \frac{1}{K!}\sum_{\ell^{t-1}} \underbrace{\sum_{\sigma \in S_K} \sum_{k=1}^K w_{t,k}(\ell^{t-1}) \prod_{j=1}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k)}}_{=:\ \mathrm{loss}_t(w_t,\,P \mid \ell^{t-1})},$$
where in the second equality we used the fact that $w_t$ depends on $\ell^{t-1}$ and does not depend on $\ell_t$.

Fix $\ell^{t-1}$ and consider the term $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$. This term is linear in $w_t$, hence it is minimized by $w_t = e_k$ for some k = 1, ..., K, where $e_k$ is the k-th standard basis vector with 1 on the k-th coordinate, and zeros on the remaining coordinates. With $L = (L_1, \ldots, L_K)$, we rewrite $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$ as:
$$\mathrm{loss}_t(w_t, P \mid \ell^{t-1}) = \sum_{k=1}^K w_{t,k}(\ell^{t-1}) \sum_{\sigma \in S_K} \Big(\prod_{j=1}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\Big) p_{\sigma(k)}. \quad (3)$$

We will show that for any P, and any $\ell^{t-1}$ (and hence, any L), $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$ is minimized by setting $w_t = e_{k^*}$ for any $k^* \in \arg\min_j L_j$. In other words, we will show that for any P, L, any $k^* \in \arg\min_j L_j$, and any k = 1, ..., K,
$$\mathrm{loss}_t(e_{k^*}, P \mid \ell^{t-1}) \le \mathrm{loss}_t(e_k, P \mid \ell^{t-1}),$$
or equivalently, using (3), that for any P, L, $k^* \in \arg\min_j L_j$, and k = 1, ..., K,
$$\sum_{\sigma \in S_K} \Big(\prod_{j=1}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\Big) p_{\sigma(k^*)} \le \sum_{\sigma \in S_K} \Big(\prod_{j=1}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\Big) p_{\sigma(k)}. \quad (4)$$

We proceed by induction on K. Take K = 2 and note that when $k^* = k$, there is nothing to prove, as both sides of (4) are identical. Therefore, without loss of generality, assume $k^* = 1$ and $k = 2$, which implies $L_1 \le L_2$. Then, (4) reduces to:
$$p_1^{L_1} p_2^{L_2} (1-p_1)^{t-1-L_1} (1-p_2)^{t-1-L_2}\, p_1 + p_2^{L_1} p_1^{L_2} (1-p_2)^{t-1-L_1} (1-p_1)^{t-1-L_2}\, p_2$$
$$\le\; p_1^{L_1} p_2^{L_2} (1-p_1)^{t-1-L_1} (1-p_2)^{t-1-L_2}\, p_2 + p_2^{L_1} p_1^{L_2} (1-p_2)^{t-1-L_1} (1-p_1)^{t-1-L_2}\, p_1.$$
After rearranging the terms, it amounts to showing that:
$$(p_1 p_2)^{L_1} \big((1-p_1)(1-p_2)\big)^{t-1-L_2} (p_1 - p_2) \Big(\big(p_2(1-p_1)\big)^{L_2-L_1} - \big(p_1(1-p_2)\big)^{L_2-L_1}\Big) \le 0.$$
But this will hold if:
$$(p_1 - p_2)\Big(\big(p_2(1-p_1)\big)^{L_2-L_1} - \big(p_1(1-p_2)\big)^{L_2-L_1}\Big) \le 0. \quad (5)$$
If $L_1 = L_2$, (5) clearly holds; therefore assume $L_1 < L_2$. We prove the validity of (5) by noticing that:
$$p_2(1-p_1) > p_1(1-p_2) \iff p_2 > p_1,$$
which means that the two factors of the product on the left-hand side of (5) have opposite signs (when $p_1 \ne p_2$) or are zero at the same time (when $p_1 = p_2$). Hence, we proved (5), which implies (4) when $k^* = 1$ and $k = 2$. The opposite case $k^* = 2$, $k = 1$ with $L_2 \le L_1$ can be shown with exactly the same line of arguments by simply exchanging the indices 1 and 2.
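Before the induction step, inequality (4) can also be checked numerically for small K. The sketch below is our own brute-force sanity check (the helper `perm_term` and the instance sizes are assumptions): it compares the two sides of (4) on random instances with K = 3.

```python
import itertools
import numpy as np

def perm_term(p, L, t, k):
    """Sum over sigma in S_K of prod_j p_{sigma(j)}^{L_j} (1-p_{sigma(j)})^{t-1-L_j}
    times p_{sigma(k)}, i.e. one side of inequality (4)."""
    K = len(p)
    total = 0.0
    for sigma in itertools.permutations(range(K)):
        prob = 1.0
        for j in range(K):
            prob *= p[sigma[j]] ** L[j] * (1 - p[sigma[j]]) ** (t - 1 - L[j])
        total += prob * p[sigma[k]]
    return total

rng = np.random.default_rng(0)
K, t = 3, 6
for _ in range(1000):
    p = rng.random(K)
    L = rng.integers(0, t, size=K)        # cumulative losses, 0 <= L_j <= t-1
    k_star = int(np.argmin(L))            # a leader
    assert all(perm_term(p, L, t, k_star) <= perm_term(p, L, t, k) + 1e-12
               for k in range(K))         # (4): the leader side is never larger
```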

Now, we assume (4) holds for $K - 1 \ge 2$ experts and any $P = (P_1, \ldots, P_{K-1})$, any $L = (L_1, \ldots, L_{K-1})$, any $k^* \in \arg\min_{j=1,\ldots,K-1} L_j$, and any k = 1, ..., K-1, and we show that it also holds for K experts. Take any $k^* \in \arg\min_{j=1,\ldots,K} L_j$, and any k = 1, ..., K. Without loss of generality, assume that $k^* \ne 1$ and $k \ne 1$ (it is always possible to find an expert different than $k^*$ and k, because there are $K \ge 3$ experts). We expand the sum over permutations on the left-hand side of (4) with respect to the value of $\sigma(1)$:
$$\sum_{s=1}^K p_s^{L_1} (1-p_s)^{t-1-L_1} \sum_{\sigma:\ \sigma(1)=s}\, \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k^*)},$$
and we also expand the sum on the right-hand side of (4) in the same way. To prove (4), it suffices to show that every term in the sum over s on the left-hand side is not greater than the corresponding term in the sum on the right-hand side, i.e. to show that for any s = 1, ..., K:
$$\sum_{\sigma:\ \sigma(1)=s}\, \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k^*)} \le \sum_{\sigma:\ \sigma(1)=s}\, \prod_{j=2}^K p_{\sigma(j)}^{L_j} (1-p_{\sigma(j)})^{t-1-L_j}\, p_{\sigma(k)}. \quad (6)$$
We now argue that this inequality follows directly from the inductive assumption by dropping $L_1$ and $P_s$, and applying (4) to such a (K-1)-expert case. More precisely, note that the sum on both sides of (6) goes over all permutations of the indices $(1, \ldots, s-1, s+1, \ldots, K)$, and since $k^*, k \ne 1$, we have $k^* \in \arg\min_{j=2,\ldots,K} L_j$ and $k \ge 2$. Hence, applying (4) to the (K-1)-expert case with K-1 distributions $(P_1, P_2, \ldots, P_{s-1}, P_{s+1}, \ldots, P_K)$ (or any permutation thereof) and K-1 integers $(L_2, \ldots, L_K)$ immediately implies (6).

Thus, we proved (4), which states that $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$ is minimized by any leader $k^* \in \arg\min_j L_j$, where $L_j = \sum_{q=1}^{t-1} \ell_{q,j}$. This means $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$ is also minimized by the FL strategy $w^*_t$, which distributes its mass uniformly over all leaders. Since FL minimizes $\mathrm{loss}_t(w_t, P \mid \ell^{t-1})$ for any $\ell^{t-1}$, by (2) it also minimizes $\mathrm{loss}_t(w_t, P)$. Note that the proof did not require uniform tie-breaking over leaders, as any distribution over leaders would work as well. The uniform distribution, however, makes the FL strategy permutation invariant.

Proof of Lemma 5

Let $p_k$ be the expectation of $\ell_{t,k}$ according to either $P_k$ or $P^{\mathrm{bin}}_k$, $p_k := \mathbb{E}_{P_k}[\ell_{t,k}] = \mathbb{E}_{P^{\mathrm{bin}}_k}[\ell_{t,k}]$. Since for any prediction strategy ω, $w_t$ depends on $\ell^{t-1}$ and does not depend on $\ell_t$, we have
$$\mathbb{E}_P\big[w_t^{b\,\top} \ell_t\big] = \mathbb{E}_P\big[w^b_t\big]^\top \mathbb{E}_P\big[\ell_t\big] = \mathbb{E}_P\big[w^b_t\big]^\top p,$$
where $p = (p_1, \ldots, p_K)$. Similarly, $\mathbb{E}_{P^{\mathrm{bin}}}\big[w_t^{*\top} \ell_t\big] = \mathbb{E}_{P^{\mathrm{bin}}}\big[w^*_t\big]^\top p$. Hence, we only need to show that $\mathbb{E}_P[w^b_t] = \mathbb{E}_{P^{\mathrm{bin}}}[w^*_t]$. This holds because $w^b_t$ sees only binary outcomes resulting from the joint distribution of P and the distribution of the binarization variables:
$$\mathbb{E}_{\ell^{t-1} \sim P^{t-1}}\big[w^b_t(\ell^{t-1})\big] = \mathbb{E}_{\ell^{t-1} \sim P^{t-1},\, b^{t-1}}\big[w^*_t(b^{t-1})\big],$$
and for any $b_{t,k}$, the probability (jointly over P and the binarization variables) of $b_{t,k} = 1$ is the same as the probability of $\ell_{t,k} = 1$ over the distribution $P^{\mathrm{bin}}_k$:
$$P(b_{t,k} = 1) = \int_{[0,1]} P(b_{t,k} = 1 \mid \ell_{t,k})\, P_k(\mathrm{d}\ell_{t,k}) = \int_{[0,1]} \ell_{t,k}\, P_k(\mathrm{d}\ell_{t,k}) = p_k = P^{\mathrm{bin}}_k(\ell_{t,k} = 1).$$
Hence,
$$\mathbb{E}_{\ell^{t-1} \sim P^{t-1},\, b^{t-1}}\big[w^*_t(b^{t-1})\big] = \mathbb{E}_{\ell^{t-1} \sim (P^{\mathrm{bin}})^{t-1}}\big[w^*_t(\ell^{t-1})\big].$$
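The key identity $\mathbb{E}_P[w^b_t] = \mathbb{E}_{P^{\mathrm{bin}}}[w^*_t]$ can also be checked empirically. The sketch below is our own illustration, reusing the hypothetical `follow_the_leader` and `binarized_fl` helpers from earlier sketches, with a made-up pair of distributions: expert 1 has loss uniform on [0, 1] (so $p_1 = 0.5$) and expert 2 has constant loss 0.4 (so $p_2 = 0.4$).

```python
import numpy as np

rng = np.random.default_rng(1)
t, n = 4, 20000
w_cont, w_bin = np.zeros(2), np.zeros(2)
for i in range(n):
    cont = np.column_stack([rng.random(t - 1), np.full(t - 1, 0.4)])  # l^{t-1} ~ P
    w_cont += binarized_fl(cont, n_samples=50, seed=i) / n            # E_P[w^b_t]
    binl = (rng.random((t - 1, 2)) < np.array([0.5, 0.4])).astype(float)
    w_bin += follow_the_leader(binl) / n                              # E_{P^bin}[w*_t]
print(w_cont, w_bin)  # the two estimates agree up to Monte Carlo noise
```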

Proof of Lemma 6

Lemma 5 shows that the expected loss of $\omega^b$ on P is the same as the expected loss of $\omega^*$ on $P^{\mathrm{bin}}$. Hence, to prove the inequality, we only need to consider the "loss of the best expert" part of each measure. For the expected redundancy and the excess risk, it directly follows from the definition of $P^{\mathrm{bin}}$ that for any t, k, $\mathbb{E}_{P_k}[\ell_{t,k}] = \mathbb{E}_{P^{\mathrm{bin}}_k}[\ell_{t,k}]$, hence $\min_k \mathbb{E}_P[\ell_{T,k}] = \min_k \mathbb{E}_{P^{\mathrm{bin}}}[\ell_{T,k}]$, and similarly, $\min_k \mathbb{E}_P[L_{T,k}] = \min_k \mathbb{E}_{P^{\mathrm{bin}}}[L_{T,k}]$. Thus, for the expected redundancy and the excess risk, the lemma actually holds with equality. For the expected regret, we will show that $\mathbb{E}_P[\min_k L_{T,k}] \ge \mathbb{E}_{P^{\mathrm{bin}}}[\min_k L_{T,k}]$, which will finish the proof. Using the argument from the proof of Lemma 5, and denoting $B_{T,k} = \sum_{t=1}^T b_{t,k}$, we have:
$$\mathbb{E}_{\ell^T \sim (P^{\mathrm{bin}})^T}\big[\min_k L_{T,k}\big] = \mathbb{E}_{\ell^T \sim P^T,\, b^T}\big[\min_k B_{T,k}\big] \le \mathbb{E}_{\ell^T \sim P^T}\big[\min_k \mathbb{E}_{b^T}[B_{T,k}]\big] = \mathbb{E}_{\ell^T \sim P^T}\big[\min_k L_{T,k}\big],$$
where the inequality follows from Jensen's inequality applied to the concave function $\min(\cdot)$.

Proof of Theorem 7

Since $\omega^b$ is the same as $\omega^*$ when all the losses are binary, $R(\omega^b, P) = R(\omega^*, P)$ for every P over binary losses. Furthermore, Lemma 6 states that $R(\omega^b, P) \le R(\omega^*, P^{\mathrm{bin}})$, i.e. for every P over continuous losses, there is a corresponding $P^{\mathrm{bin}}$ over binary losses which incurs at least the same regret/redundancy/risk to $\omega^b$. Therefore,
$$\sup_{P \text{ on } [0,1]} R(\omega^b, P) = \sup_{P \text{ on } \{0,1\}} R(\omega^b, P) = \sup_{P \text{ on } \{0,1\}} R(\omega^*, P).$$
By the second part of Corollary 4, for any prediction strategy ω:
$$\sup_{P \text{ on } \{0,1\}} R(\omega^*, P) \le \sup_{P \text{ on } \{0,1\}} R(\omega, P) \le \sup_{P \text{ on } [0,1]} R(\omega, P),$$
which finishes the proof.

A counterexample to the FL strategy

We now argue that vanilla FL is not the minimax strategy for continuous losses, so that the binarization procedure is justified. We will only consider the excess risk for simplicity, but we can use similar arguments to show a counterexample for the expected regret and the expected redundancy as well.

Take K = 2 experts, T = 2 iterations and distributions $P_1, P_2$ on binary losses. Denote $p_1 = P_1(\ell_{t,1} = 1)$ and $p_2 = P_2(\ell_{t,2} = 1)$, and assume $p_1 \le p_2$. The risk of the FL strategy is given by:
$$\big(p_1 p_2 + (1-p_1)(1-p_2)\big) \underbrace{\frac{p_1 + p_2}{2}}_{\text{ties}} + p_1(1-p_2)\, p_2 + p_2(1-p_1)\, p_1 - p_1 = \frac{\delta}{2} - \frac{\delta^2}{2},$$
where $\delta = p_2 - p_1$. Maximizing the expression above over δ gives $\delta = \frac{1}{2}$, and the maximum risk of FL on binary losses is equal to $\frac{1}{8}$. This is also the minimax risk on continuous losses, as follows from the proof of Theorem 7 (because binarized FL is the minimax strategy on continuous losses, and it achieves its maximum risk on binary losses).

We now show that there exist distributions $P_1, P_2$ on continuous losses which force FL to suffer more excess risk than $\frac{1}{8}$. We take $P_1$ with support on the two points $\{\epsilon, 1\}$, where ε is a very small positive number, and $p_1 = P_1(\ell_{t,1} = 1)$. Note that $\mathbb{E}[\ell_{t,1}] = p_1 + \epsilon(1 - p_1)$. $P_2$ has support on $\{0, 1-\epsilon\}$, and let $p_2 = P_2(\ell_{t,2} = 1-\epsilon)$. We also assume $p_1 < p_2$, i.e. expert 1 is the better expert. The main idea in this counterexample is that by using the ε values, all ties are resolved in favor of expert 2, which makes the FL algorithm suffer more loss. More precisely, the risk of FL is now given by:
$$\big(p_1 p_2 + (1-p_1)(1-p_2)\big) \underbrace{p_2}_{\text{ties}} + p_1(1-p_2)\, p_2 + p_2(1-p_1)\, p_1 - p_1 + O(\epsilon).$$
Choosing, e.g., $p_1 = 0$ and $p_2 = 0.5$, this gives $\frac{1}{4} + O(\epsilon)$ excess risk, which is more than $\frac{1}{8}$, provided we take ε sufficiently small.
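The two risk formulas in this counterexample are easy to evaluate numerically. The sketch below is our own illustration, taking the ε → 0 limit so the O(ε) terms vanish; it reproduces the 1/8 and 1/4 values.

```python
import numpy as np

def fl_risk_binary(p1, p2):
    """Excess risk of FL at K=2, T=2 on binary losses (p1 <= p2),
    with ties at trial 2 broken uniformly."""
    tie = p1 * p2 + (1 - p1) * (1 - p2)
    return tie * (p1 + p2) / 2 + p1 * (1 - p2) * p2 + p2 * (1 - p1) * p1 - p1

def fl_risk_shifted(p1, p2):
    """Same game with the epsilon-shifted supports (limit eps -> 0):
    every tie is now resolved in favor of expert 2."""
    tie = p1 * p2 + (1 - p1) * (1 - p2)
    return tie * p2 + p1 * (1 - p2) * p2 + p2 * (1 - p1) * p1 - p1

deltas = np.linspace(0, 1, 1001)
print(max(fl_risk_binary(0.0, d) for d in deltas))  # -> 0.125, at delta = 1/2
print(fl_risk_shifted(0.0, 0.5))                    # -> 0.25 > 1/8
```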
