Concentration, self-bounding functions


1 Concentration, self-bounding functions
S. Boucheron (Laboratoire de Probabilités et Modèles Aléatoires, Université Paris-Diderot), G. Lugosi (Economics, University Pompeu Fabra), P. Massart (Département de Mathématiques, Université Paris-Sud)
Cachan, 03/02/2011

2 Context: motivations
X_1, ..., X_n: X-valued, independent random variables; F : X^n → R; Z = F(X_1, ..., X_n).
Goal: upper bounds on log E[e^{λ(Z−EZ)}] and, for t > 0, on P{Z ≥ EZ + t} and P{Z ≤ EZ − t}.
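In code, the step from a log-moment-generating-function bound to a tail bound is the Cramér-Chernoff calculation P{Z ≥ EZ + t} ≤ exp(−sup_{λ≥0}(λt − log E[e^{λ(Z−EZ)}])). A minimal numerical sketch (the helper name chernoff_tail and the sub-Gaussian example are my own, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def chernoff_tail(log_mgf, t, lam_max=50.0):
    """Upper bound P{Z >= EZ + t} <= exp(-sup_lam (lam*t - log_mgf(lam)))."""
    res = minimize_scalar(lambda lam: -(lam * t - log_mgf(lam)),
                          bounds=(0.0, lam_max), method="bounded")
    return np.exp(res.fun)   # res.fun = -sup, so exp(res.fun) is the bound

# Example: the sub-Gaussian bound log E[e^{lam(Z-EZ)}] <= lam^2 v / 2 with v = 1
# recovers the familiar exp(-t^2 / (2v)).
for t in (1.0, 2.0, 3.0):
    print(t, chernoff_tail(lambda lam: lam**2 / 2, t), np.exp(-t**2 / 2))
```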

3 Motivations: context
Non-asymptotic tail bounds for functions of many independent random variables that do not depend too much on any of them. Applications:
1. high-dimensional geometry;
2. random combinatorics;
3. statistics.
A variety of methods:
1. Martingales
2. Talagrand's induction method
3. Transportation method
4. Entropy method
5. Chatterjee's method (exchangeable pairs)

4 Motivations: inspiration, Gaussian concentration
Theorem (Tsirelson, Borell, Gross, ..., 1975). Let X_1, ..., X_n be i.i.d. N(0,1), F : R^n → R be L-Lipschitz w.r.t. the Euclidean distance, and Z = F(X_1, ..., X_n). Then
Var[Z] ≤ L²  (Poincaré's inequality),
log E[e^{λ(Z−EZ)}] ≤ λ²L²/2,
P{Z ≥ EZ + t} ≤ e^{−t²/(2L²)}.
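A quick simulation sketch of the theorem for the 1-Lipschitz choice F(x) = ‖x‖₂ (an illustrative example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 200_000
# F(x) = ||x||_2 is 1-Lipschitz w.r.t. the Euclidean distance, so L = 1.
Z = np.linalg.norm(rng.standard_normal((trials, n)), axis=1)
print("Var[Z] =", Z.var(), "<= L^2 = 1")
t = 1.5
print("P{Z >= EZ + t} =", (Z >= Z.mean() + t).mean(),
      "<= exp(-t^2/2) =", np.exp(-t**2 / 2))
```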

5 Motivations: Efron-Stein inequalities (1981)
Z = F(X_1, X_2, ..., X_n), with X_1, ..., X_n independent random variables. Let X'_1, ..., X'_n be distributed as X_1, ..., X_n but independent from X_1, ..., X_n. For each i ∈ {1, ..., n},
Z'_i = F(X_1, ..., X_{i−1}, X'_i, X_{i+1}, ..., X_n),
X^{(i)} = (X_1, ..., X_{i−1}, X_{i+1}, ..., X_n),
and, for F_i a function of n−1 arguments, Z_i = F_i(X_1, ..., X_{i−1}, X_{i+1}, ..., X_n) = F_i(X^{(i)}).
Theorem (jackknife estimates of variance are biased). With
V_+ = ∑_{i=1}^n E[(Z − Z'_i)²_+ | X_1, ..., X_n]  and  V = ∑_i (Z − Z_i)²,
Var[Z] ≤ E[V_+] ≤ E[V].
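A Monte Carlo sketch of the jackknife bound Var[Z] ≤ E[V_+], with the illustrative choice Z = max_i X_i for uniform X_i (the example and all names are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 20, 200_000
X = rng.random((trials, n))
Z = X.max(axis=1)

# E[V+] with V+ = sum_i E[(Z - Z'_i)^2_+ | X]; a single fresh copy per
# coordinate already gives an unbiased Monte Carlo estimate of E[V+].
EV_plus = 0.0
for i in range(n):
    Xp = X.copy()
    Xp[:, i] = rng.random(trials)          # resample coordinate i
    Zp = Xp.max(axis=1)
    EV_plus += (np.maximum(Z - Zp, 0.0) ** 2).mean()

print("Var[Z] =", Z.var(), " <= E[V+] =", EV_plus)
```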

6 Motivations: exponential Efron-Stein inequalities
Theorem (sub-Gaussian behavior). If V_+ ≤ v, then, for λ ≥ 0,
log E[e^{λ(Z−EZ)}] ≤ λ²v/2.
Theorem (B., Lugosi and Massart, 2003). For 0 < λ < 1/θ,
log E[e^{λ(Z−EZ)}] ≤ (λθ/(1 − λθ)) log E[e^{λV_+/θ}].

7 Motivations: the entropy method
Entropy. For Y an X-valued random variable and f a non-negative (measurable) function over X,
Ent[f] = E[f(Y) log f(Y)] − E[f(Y)] log E[f(Y)].
Why? If Y = exp(λ(Z − EZ)), let G(λ) = (1/λ) log E[e^{λ(Z−EZ)}]; then
Ent[e^{λ(Z−EZ)}] / (λ² E[e^{λ(Z−EZ)}]) = dG(λ)/dλ.
Basis of Herbst's argument: bounds on entropy can be translated into differential inequalities for logarithmic moment generating functions.

8 Motivations: Gross's logarithmic Sobolev inequality
Theorem (Gross, ..., 1975). Let X_1, ..., X_n be i.i.d. N(0,1), F : R^n → R differentiable, and Z = F(X_1, ..., X_n). Then
Var[Z] ≤ E[‖∇F(X_1, ..., X_n)‖²],
Ent[Z²] ≤ 2 E[‖∇F‖²].

9 Motivations: bounds on entropy
Subadditivity. X_1, ..., X_n independent random variables, Z = f(X_1, ..., X_n) ≥ 0. With
Ent^{(i)}[Z] = E^{(i)}[Z log Z] − E^{(i)}[Z] log E^{(i)}[Z]
(E^{(i)} denoting expectation with respect to X_i alone),
Ent[f(X_1, ..., X_n)] ≤ ∑_{i=1}^n E[Ent^{(i)}[Z]].
Upper-bounding the entropy of a function of a single random variable: the expected value minimizes the expected Bregman divergence with respect to the convex function x ↦ x log x, hence
Ent[Z] = inf_{u>0} E[Z(log Z − log u) − (Z − u)].
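The variational formula can be sanity-checked numerically; a sketch assuming an arbitrary positive test distribution (all choices below are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.exponential(size=1_000_000) + 0.1      # any positive random variable

ent = (Z * np.log(Z)).mean() - Z.mean() * np.log(Z.mean())

def bregman_gap(u):
    # E[Z(log Z - log u) - (Z - u)], the expected Bregman divergence at u
    return (Z * (np.log(Z) - np.log(u)) - (Z - u)).mean()

print("Ent[Z] =", ent)
for u in np.linspace(0.5, 2.0, 7):
    print(f"u={u:.2f}  E[Bregman] = {bregman_gap(u):.5f}")
print("minimum at u = EZ =", Z.mean(), "->", bregman_gap(Z.mean()))
```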

10 Motivations: the entropy method in a nutshell
Summary. The entropy method converts a modified logarithmic Sobolev inequality into a differential inequality involving the logarithm of the moment generating function of Z.
Starting point. Theorem (a modified logarithmic Sobolev inequality). Let φ(x) = e^x − x − 1. For any λ ∈ R,
λE[Ze^{λZ}] − E[e^{λZ}] log E[e^{λZ}] ≤ ∑_{i=1}^n E[e^{λZ} φ(−λ(Z − Z_i))].
Use different conditions to upper-bound ∑_{i=1}^n φ(−λ(Z − Z_i))...

11 Square roots: variance stabilization
Folklore. If Z ≥ 0 and Var[Z] ≤ a·EZ, then Var[√Z] ≤ a.
If Z_n ~ Pois(nµ), then √Z_n − E[√Z_n] ⇝ N(0, 1/4) (Cramér's delta method).
Lemma. If X ~ Pois, let v = (EX)·E[1/(4X + 1)]. Then for λ ≥ 0,
log E[e^{λ(√X − E[√X])}] ≤ vλ(e^λ − 1),
and for t > 0,
P{√X ≥ E√X + t} ≤ exp(−(t/2) log(1 + t/(2v))).
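A simulation sketch of the lemma (the mean µ = 5 and the sample size are arbitrary choices of this sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, trials = 5.0, 1_000_000
X = rng.poisson(mu, size=trials)
sq = np.sqrt(X)

v = mu * (1.0 / (4 * X + 1)).mean()        # v = (EX) E[1/(4X+1)]
print("Var[sqrt(X)] =", sq.var(), "<= v =", v)

t = 0.8
emp = (sq >= sq.mean() + t).mean()
bound = np.exp(-(t / 2) * np.log(1 + t / (2 * v)))
print("P{sqrt(X) >= E sqrt(X) + t} =", emp, "<=", bound)
```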

12 Square roots: variance stabilization, proof
Poisson Poincaré inequality (Klaassen 1985). If X ~ Pois and Z = f(X), then
Var[Z] ≤ EX · E[|Df(X)|²], with Df(X) = f(X + 1) − f(X).
Consequence: Var[√X] ≤ v.
Poisson logarithmic Sobolev inequality (L. Wu, Bobkov-Ledoux):
Ent[Z] ≤ EX · E[Df · D log f].

13 Self-bounding property (I): flavors of self-bounding
f : X^n → R is said to have the self-bounding property if there exist f_i : X^{n−1} → R such that for all x = (x_1, ..., x_n) ∈ X^n and all i = 1, ..., n,
1. 0 ≤ f(x) − f_i(x^{(i)}) ≤ 1;
2. ∑_{i=1}^n (f(x) − f_i(x^{(i)})) ≤ f(x);
where x^{(i)} = (x_1, ..., x_{i−1}, x_{i+1}, ..., x_n).

14 Self-bounding: flavors of self-bounding, examples
1. Suprema of positive bounded empirical processes: X_i = (X_{i,s})_{s∈T}, T finite, 0 ≤ X_{i,s} ≤ 1, X_i independent, Z = sup_{s∈T} ∑_{i=1}^n X_{i,s}.
2. Suprema of bounded empirical processes: |X_{i,s}| ≤ 1 (relaxing the positivity assumption).
3. Largest eigenvalue of a Gram matrix: sup_{u: ‖u‖₂=1} uᵀ(∑_{i=1}^n X_i X_iᵀ)u.
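For example 1, the natural choice f_i(x^{(i)}) = sup_s ∑_{j≠i} x_{j,s} makes both self-bounding conditions checkable numerically; a sketch on random [0,1] data (sizes and names are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 15, 8
x = rng.random((n, T))                 # x[i, s] in [0, 1]

f = x.sum(axis=0).max()                # f(x) = sup_s sum_i x[i, s]
f_i = np.array([np.delete(x, i, axis=0).sum(axis=0).max() for i in range(n)])
d = f - f_i

assert np.all(d >= 0) and np.all(d <= 1)    # 0 <= f - f_i <= 1
assert d.sum() <= f + 1e-12                 # sum_i (f - f_i) <= f
print("increments:", np.round(d, 3), " sum =", d.sum(), " f =", f)
```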

15 Self-bounding: concentration inequalities, binomial-Poisson tails
If |T| = 1, the Bennett inequality holds. With
h(u) = (1 + u) log(1 + u) − u for u ≥ −1, and φ(v) = sup_{u ≥ −1}(uv − h(u)) = e^v − v − 1,
log E[e^{λ(Z−EZ)}] ≤ φ(λ)·EZ for all λ ∈ R,
P{Z ≥ EZ + t} ≤ exp(−EZ·h(t/EZ)) for all t > 0.
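A numerical sketch comparing the Bennett tail bound with the exact binomial tail when Z is a sum of i.i.d. Bernoulli variables (the parameters are arbitrary):

```python
import numpy as np
from scipy.stats import binom

def h(u):
    return (1 + u) * np.log1p(u) - u

n, p = 100, 0.05           # Z ~ Bin(n, p), a sum of [0,1]-valued variables
EZ = n * p
for t in (3, 6, 9):
    exact = binom.sf(EZ + t - 1, n, p)        # P{Z >= EZ + t}
    bennett = np.exp(-EZ * h(t / EZ))
    print(f"t={t}: exact tail = {exact:.4g} <= Bennett = {bennett:.4g}")
```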

16 Self-bounding: concentration inequalities, chi-square tails
X_i ~ µ_i χ²₁: weighted chi-square random variables. Let Z = ∑_{i=1}^n X_i, v = 2∑_{i=1}^n µ_i² and c = 2 max_i µ_i.
Bernstein inequality:
log E[e^{λ(Z−EZ)}] ≤ vλ²/(2(1 − cλ)),
P{Z ≥ EZ + t} ≤ exp(−t²/(2(v + ct))).
The Bennett inequality entails the Bernstein inequality (with scale factor 1/3).
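A simulation sketch of the Bernstein bound for a weighted chi-square sum; the scale factor c = 2 max_i µ_i follows the convention adopted above, and the weights are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
mu = np.array([2.0, 1.0, 0.5, 0.25])
trials = 1_000_000
Z = (mu * rng.standard_normal((trials, mu.size)) ** 2).sum(axis=1)

v = 2 * (mu ** 2).sum()
c = 2 * mu.max()            # scale factor convention used in the text
EZ = mu.sum()
for t in (3.0, 6.0, 9.0):
    emp = (Z >= EZ + t).mean()
    bern = np.exp(-t**2 / (2 * (v + c * t)))
    print(f"t={t}: empirical = {emp:.4g} <= Bernstein = {bern:.4g}")
```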

17 Self-bounding: concentration inequalities
Self-bounding property and concentration inequalities. Recall
h(u) = (1 + u) log(1 + u) − u for u ≥ −1, and φ(v) = sup_{u ≥ −1}(uv − h(u)) = e^v − v − 1.
Theorem (B., Lugosi and Massart). If Z satisfies the self-bounding property, then
log E[e^{λ(Z−EZ)}] ≤ φ(λ)·EZ for all λ ∈ R,
P{Z ≥ EZ + t} ≤ exp(−EZ·h(t/EZ)) for t > 0,
P{Z ≤ EZ − t} ≤ exp(−EZ·h(−t/EZ)) for 0 < t ≤ EZ.

18 ... with applications
Definition (conditional Rademacher averages). Let ε_1, ..., ε_n be Rademacher variables and, for (x_1, ..., x_n),
F(x_1, ..., x_n) = E[sup_{s∈T} ∑_{i=1}^n ε_i x_{i,s}].
Symmetrization inequalities (Giné and Zinn, 1984):
(1/2) E[F(X_1, ..., X_n)] ≤ E[sup_{s∈T} ∑_{i=1}^n X_{i,s}] ≤ 2 E[F(X_1, ..., X_n)].

19 Self-bounding: concentration inequalities
Theorem (B., Lugosi and Massart 2003). Conditional Rademacher averages are self-bounding.
Consequences:
Var[F(X_1, ..., X_n)] ≤ E[F(X_1, ..., X_n)],
while
Var[sup_{s∈T} ∑_{i=1}^n X_{i,s}] ≤ sup_{s∈T} ∑_{i=1}^n Var[X_{i,s}] + 2E[sup_{s∈T} ∑_{i=1}^n X_{i,s}].
Conditional Rademacher averages: a kind of weighted bootstrap estimate.
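For a small sample the conditional Rademacher average can be computed exactly by enumerating all sign vectors, which lets one check the self-bounding increments directly; a sketch (sizes and data are arbitrary, and f_i is the conditional Rademacher average of the reduced sample):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(6)
n, T = 10, 5
x = rng.uniform(-1, 1, size=(n, T))            # |x[i, s]| <= 1

def cond_rademacher(rows):
    """F(x) = E_eps[ sup_s sum_i eps_i x[i, s] ], computed exactly."""
    eps = np.array(list(product([-1, 1], repeat=rows.shape[0])))
    return (eps @ rows).max(axis=1).mean()

F = cond_rademacher(x)
F_i = np.array([cond_rademacher(np.delete(x, i, axis=0)) for i in range(n)])
d = F - F_i
print("0 <= F - F_i <= 1:", d.min() >= 0, d.max() <= 1)
print("sum_i (F - F_i) =", d.sum(), "<= F =", F)
```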

20 Variations on a theme
Definition (weakly (a, b)-self-bounding). f : X^n → [0, ∞) is weakly (a, b)-self-bounding if there exist f_i : X^{n−1} → [0, ∞) such that for all x ∈ X^n,
∑_{i=1}^n (f(x) − f_i(x^{(i)}))² ≤ a f(x) + b.
Definition (strongly (a, b)-self-bounding). f : X^n → [0, ∞) is strongly (a, b)-self-bounding if there exist f_i : X^{n−1} → [0, ∞) such that for all i = 1, ..., n and all x ∈ X^n,
0 ≤ f(x) − f_i(x^{(i)}) ≤ 1, and
∑_{i=1}^n (f(x) − f_i(x^{(i)})) ≤ a f(x) + b.

21 Self-bounding: concentration inequalities
Definition (submodular function). f : 2^n → R is submodular if
f(A ∪ B) + f(A ∩ B) ≤ f(A) + f(B).
Submodularity implies neither monotonicity nor non-negativity.
Example: the capacity of cuts in a directed graph is submodular.
Lemma (Vondrák). Non-negative 1-Lipschitz submodular functions are (2, 0)-self-bounding.
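A brute-force sketch checking submodularity of the cut capacity on a small random directed graph (graph size and capacities are arbitrary choices of this sketch):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(7)
n = 7
cap = rng.random((n, n)) * (rng.random((n, n)) < 0.4)   # random edge capacities
np.fill_diagonal(cap, 0.0)

def cut(A):
    """Capacity of the cut (A, complement of A) in the directed graph."""
    A = set(A)
    return sum(cap[i, j] for i in A for j in range(n) if j not in A)

subsets = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]
ok = all(cut(A | B) + cut(A & B) <= cut(A) + cut(B) + 1e-9
         for A in subsets for B in subsets)
print("cut capacity is submodular on all pairs:", ok)
```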

22 Self-bounding: concentration inequalities
Efron-Stein inequality:
Var[Z] ≤ E[∑_{i=1}^n (f(X) − f_i(X^{(i)}))²].
Remark. Both definitions imply that Z = f(X) satisfies Var[Z] ≤ a EZ + b.

23 Self-bounding: concentration inequalities
Theorem (Maurer 2006). Let X = (X_1, ..., X_n) be X-valued independent random variables and f : X^n → [0, ∞) a weakly (a, b)-self-bounding function (a, b ≥ 0). Let Z = f(X). If, for all i ≤ n and x ∈ X^n, f_i(x^{(i)}) ≤ f(x), then for all 0 ≤ λ ≤ 2/a,
log E[e^{λ(Z−EZ)}] ≤ (a EZ + b)λ² / (2(1 − aλ/2)),
and for all t > 0,
P{Z ≥ EZ + t} ≤ exp(−t² / (2(a EZ + b + at/2))).

24 Self-bounding: concentration inequalities, lower tails
Theorem (McDiarmid and Reed 2008; B., Lugosi, Massart, 2009). Let X = (X_1, ..., X_n) be X-valued independent random variables and f : X^n → [0, ∞) a weakly (a, b)-self-bounding function (a, b ≥ 0). Let Z = f(X) and define c = (3a − 1)/6 and c_− = max(−c, 0). If f(x) − f_i(x^{(i)}) ≤ 1 for each i ≤ n and x ∈ X^n, then for 0 < t ≤ EZ,
P{Z ≤ EZ − t} ≤ exp(−t² / (2(a EZ + b + c_− t))).
If a ≥ 1/3: sub-Gaussian behavior.

25 Self-bounding: concentration inequalities, upper tails
Theorem (McDiarmid and Reed 2008; B., Lugosi, Massart, 2009). Let X = (X_1, ..., X_n) be X-valued independent random variables and f : X^n → [0, ∞) a weakly (a, b)-self-bounding function (a, b ≥ 0). Let Z = f(X) and define c = (3a − 1)/6 and c_+ = max(c, 0). Then for all λ ≥ 0 (λ < 1/c_+ when c_+ > 0),
log E[e^{λ(Z−EZ)}] ≤ (a EZ + b)λ² / (2(1 − c_+λ)),
and for all t > 0,
P{Z ≥ EZ + t} ≤ exp(−t² / (2(a EZ + b + c_+ t))).
If a ≤ 1/3: sub-Gaussian behavior.

26 Self-bounding: concentration inequalities, proofs
Remark. The entropy method converts a modified logarithmic Sobolev inequality into a differential inequality involving the logarithm of the moment generating function of Z.
Starting point. Theorem (a modified logarithmic Sobolev inequality). For any λ ∈ R,
λE[Ze^{λZ}] − E[e^{λZ}] log E[e^{λZ}] ≤ ∑_{i=1}^n E[e^{λZ} φ(−λ(Z − Z_i))].
Use different conditions to upper-bound ∑_{i=1}^n φ(−λ(Z − Z_i))...

27 Self-bounding: concentration inequalities, proofs (...)
Establishing differential inequalities for G(λ) = log E[e^{λ(Z−EZ)}]:
1. (a, b)-self-bounding: Ent[e^{λZ}] ≤ φ(−λ) E[(aZ + b) e^{λZ}];
2. (a, b)-weakly self-bounding, λ ≥ 0: Ent[e^{λZ}] ≤ (λ²/2) E[(aZ + b) e^{λZ}];
3. (a, b)-weakly self-bounding, λ ≤ 0: Ent[e^{λZ}] ≤ φ(−λ) E[(aZ + b) e^{λZ}].
Key differential inequality: with v = a EZ + b,
[λ − aφ(−λ)] G′(λ) − G(λ) ≤ vφ(−λ).   (1)

28 Self-bounding: around the Herbst argument
Lemma. Let f : I → R be C¹ on an interval I containing 0, with f(0) = 0 and f(x) ≠ 0 for x ≠ 0. Let g be continuous on I and G be C¹ on I with G(0) = G′(0) = 0, and assume that for every λ ∈ I,
f(λ)G′(λ) − f′(λ)G(λ) ≤ f²(λ)g(λ).
Then, for every λ ∈ I,
G(λ) ≤ f(λ) ∫₀^λ g(x) dx.

29 Self-bounding: concentration inequalities, comparisons
Let ρ be continuous on an interval I containing 0, and let a ≥ 0. Let H : I → R be C¹, satisfying
λH′(λ) − H(λ) ≤ ρ(λ)(aH′(λ) + 1),
with aH′(λ) + 1 > 0 for all λ ∈ I and H(0) = H′(0) = 0. Let ρ₀ : I → R, and assume that G₀ : I → R is C¹ with aG₀′(λ) + 1 > 0 for all λ ∈ I, G₀(0) = G₀′(0) = 0 and G₀″(0) = 1. Assume also that G₀ solves the differential equation
λG₀′(λ) − G₀(λ) = ρ₀(λ)(aG₀′(λ) + 1).
If ρ(λ) ≤ ρ₀(λ) for every λ ∈ I, then H ≤ G₀.

30 Self-bounding: concentration inequalities, sketch of proof
Key differential inequality:
λG′(λ) − G(λ) ≤ φ(−λ)(1 + aG′(λ)).
1. 2G_γ(λ) = λ²/(1 − γλ) solves λH′(λ) − H(λ) ≥ (λ²/2)(1 + γH′(λ)).
2. Choosing γ = a works for λ ≥ 0.
3. It may not be the best choice for ρ_γ...
4. Optimizing ρ_γ = (λG_γ′(λ) − G_γ(λ))/(1 + aG_γ′(λ)) leads to the desired result.

31 Talagrand's convex distance: definition and motivations
Definition. For A ⊆ X^n and B₂^n the unit ball in R^n endowed with the Euclidean metric,
d_T(X, A) = inf_{y∈A} sup_{α∈B₂^n} ∑_{i=1}^n α_i 1_{X_i ≠ y_i}.
Theorem. With M(A) the probability distributions supported by A,
d_T(X, A) = sup_{α∈B₂^n} inf_{ν∈M(A)} ∑_{i=1}^n α_i ν{X_i ≠ Y_i}.

32 Talagrand's convex distance: modus operandi
Lemma. For any A ⊆ X^n and x ∈ X^n, the function f(x) = d_T(x, A)² satisfies
0 ≤ f(x) − f_i(x^{(i)}) ≤ 1,
where f_i is defined by
f_i(x^{(i)}) = inf_{x'_i ∈ X} f(x_1, ..., x_{i−1}, x'_i, x_{i+1}, ..., x_n).   (2)
Moreover, f is weakly (4, 0)-self-bounding.

33 Talagrand's convex distance: Efron-Stein estimates of the variance of d_T
Efron-Stein estimate of the variance of d_T(·, A):
V_+ = ∑_{i=1}^n (f(x) − f_i(x^{(i)}))².
Lemma. For all A ⊆ X^n, V_+ is bounded by 1: V_+ ≤ 1. Hence Var[d_T(X, A)] ≤ 1.

34 Talagrand's convex distance: a consequence of the minmax characterization of d_T
1. M(A): the set of probability measures on A.
2. We may rewrite d_T as
d_T(x, A) = inf_{ν∈M(A)} sup_{α: ‖α‖₂ ≤ 1} ∑_{j=1}^n α_j E_ν[1_{x_j ≠ Y_j}],   (3)
where Y = (Y_1, ..., Y_n) is distributed according to ν.
3. By the Cauchy-Schwarz inequality,
d_T(x, A)² = inf_{ν∈M(A)} ∑_{j=1}^n (E_ν[1_{x_j ≠ Y_j}])².
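The last characterization makes d_T(x, A)² computable for a small finite A as a convex program over the simplex M(A); a sketch using scipy's SLSQP (the binary instance below is an arbitrary choice of this sketch):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
n = 6
A = rng.integers(0, 2, size=(5, n))        # a small finite set A of binary vectors
x = rng.integers(0, 2, size=n)

H = (A != x).astype(float)                 # H[k, j] = 1{x_j != y_j} for y = A[k]

def objective(nu):
    # sum_j (E_nu[1{x_j != Y_j}])^2 for Y ~ nu supported on A
    return ((nu @ H) ** 2).sum()

m = A.shape[0]
res = minimize(objective, np.full(m, 1 / m),
               bounds=[(0, 1)] * m,
               constraints=[{"type": "eq", "fun": lambda nu: nu.sum() - 1}])
print("d_T(x, A)^2 =", res.fun)
# sanity check: at most the Hamming distance to the closest single point of A
print("min over point masses:", (H ** 2).sum(axis=1).min())
```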

35 Talagrand's convex distance: weak self-boundedness of d_T²
Denote by (ν̂, α̂) the pair at which the saddle point is achieved. For all x,
∑_{i=1}^n (√f(x) − √f_i(x^{(i)}))² ≤ ‖α̂‖² ≤ 1, since √f(x) − √f_i(x^{(i)}) ≤ α̂_i.
Hence
∑_{i=1}^n (f(x) − f_i(x^{(i)}))² = ∑_{i=1}^n (√f(x) − √f_i(x^{(i)}))² (√f(x) + √f_i(x^{(i)}))² ≤ 4f(x) ∑_i α̂_i² ≤ 4f(x).

36 Talagrand's convex distance, self-bounding example: Talagrand's convex distance inequality
Theorem (Talagrand 1995).
P{A} · E[e^{d_T²(X,A)/4}] ≤ 1.
Proof. Let Z = d_T(X, A)². Then
P{X ∈ A} = P{d_T(X, A)² ≤ E[d_T(X, A)²] − t} with t = E[d_T(X, A)²], which is at most exp(−E[d_T(X, A)²]/8).
For 0 ≤ λ ≤ 1/2,
log E[e^{λ(Z−EZ)}] ≤ 2EZ·λ²/(1 − 2λ).
Choosing λ = 1/10 leads to the desired result (with the weaker exponent d_T²/10 in place of d_T²/4).

37 Suprema of non-centered empirical processes: Talagrand-Bousquet
Well-understood scenarios:
1. Suprema of positive bounded empirical processes: self-bounding property.
2. Suprema of centered bounded empirical processes: Talagrand's inequality (revisited by Ledoux, Massart, Rio, Klein, Bousquet, ...); a Bennett inequality with variance factor coinciding with the Efron-Stein estimate of variance.

38 Suprema of non-centered empirical processes: Talagrand-...-Bousquet inequality
Z = sup_{s∈T} ∑_{i=1}^n X_{i,s}, with X_1, ..., X_n identically distributed, EX_{i,s} = 0 and −1 ≤ X_{i,s} ≤ 1. Let σ² = sup_{s∈T} ∑_{i=1}^n Var[X_{i,s}].
Efron-Stein estimate of variance: Var[Z] ≤ v = 2EZ + σ².
log E[e^{λ(Z−EZ)}] ≤ vφ(λ).
For x > 0,
P{Z ≥ EZ + √(2vx) + x/3} ≤ e^{−x}.
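A simulation sketch of this tail bound for a small centered bounded process (the process X_{i,s} = ε_{i,s} w_s and all sizes are my own choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(9)
n, T, trials = 30, 10, 50_000
w = rng.random(T)                        # X_{i,s} = eps_{i,s} * w_s, |X| <= 1

def sample_Z(size):
    eps = rng.choice([-1.0, 1.0], size=(size, n, T))
    return (eps * w).sum(axis=1).max(axis=1)     # sup_s sum_i X_{i,s}

Z = np.concatenate([sample_Z(10_000) for _ in range(trials // 10_000)])
EZ = Z.mean()
sigma2 = n * (w ** 2).max()              # sup_s sum_i Var[X_{i,s}]
v = 2 * EZ + sigma2
for x in (1.0, 3.0):
    tail = (Z >= EZ + np.sqrt(2 * v * x) + x / 3).mean()
    print(f"x={x}: empirical = {tail:.4g} <= e^-x = {np.exp(-x):.4g}")
```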

39 Suprema of non-centered empirical processes: empirical excess risk
Another scenario: excess empirical risk. For s ∈ T, risk R(s) = E[X_{i,s}] and empirical risk R_n(s) = (1/n)∑_{i=1}^n X_{i,s}.
s̄: R(s̄) = EX_{i,s̄} = inf_{s∈T} EX_{i,s} = inf_{s∈T} R(s);
ŝ: nR_n(ŝ) = ∑_{i=1}^n X_{i,ŝ} = inf_{s∈T} ∑_{i=1}^n X_{i,s} = n inf_{s∈T} R_n(s).
Excess risk and its empirical counterpart:
Excess risk: R(ŝ) − R(s̄).
Excess empirical risk (EER): Z = n(R_n(s̄) − R_n(ŝ)) = sup_{s∈T} ∑_{i=1}^n (X_{i,s̄} − X_{i,s}) = ∑_{i=1}^n (X_{i,s̄} − X_{i,ŝ}).

40 Suprema of non-centered empirical processes: variance bounds for EER
Consequences of the Efron-Stein inequalities:
Var[Z] ≤ 2 E[∑_{i=1}^n ((X_{i,s̄} − EX_{i,s̄}) − (X_{i,ŝ} − EX_{i,ŝ}))²]
and
Var[Z] ≤ 2 E[∑_{i=1}^n (X_{i,s̄} − X_{i,ŝ})²].

41 Suprema of non-centered empirical processes: empirical excess risk
Consequences of Talagrand's inequalities (and peeling). Assumptions: d a distance over T; ψ, ω : [0, 1] → R₊ with ψ(x)/x and ω(x)/x non-increasing;
√n E[sup_{s: d(s,s̄) ≤ r} ((R(s) − R_n(s)) − (R(s̄) − R_n(s̄)))] ≤ ψ(r);
E[(X_{i,s} − X_{i,s̄})²] ≤ d(s, s̄)²;
d(s, s̄) ≤ ω(√(R(s) − R(s̄))).
Definition. r⋆ is the positive solution of √n r² = ψ(ω(r)).

42 Suprema of non-centered empirical processes: empirical excess risk, consequences (cont'd)
With probability larger than 1 − δ,
max(R(ŝ) − R(s̄), R_n(s̄) − R_n(ŝ)) ≤ κ (r⋆² + (ω(r⋆)²/(n r⋆²)) log(1/δ)).
r⋆² is called the rate of the estimation problem:
max(E[R(ŝ) − R(s̄)], E[R_n(s̄) − R_n(ŝ)]) ≤ κ r⋆².
Combining...
Var[Z] = Var[n(R_n(s̄) − R_n(ŝ))] ≤ nκ ω(r⋆)².

43 Suprema of non-centered empirical processes: a Bernstein inequality for EER
Theorem (B., Bousquet, Lugosi, Massart, 2005).
‖(Z − EZ)_+‖_q ≤ √(3q ‖V_+‖_{q/2}).
Bernstein-like inequality:
‖(n(R_n(s̄) − R_n(ŝ)) − EZ)_+‖_q ≤ κ (√(nq) ω(r⋆) + q √n ω(ω(r⋆)/(√n r⋆))).
Variance factor: nω(r⋆)². Scale factor: √n ω(ω(r⋆)/(√n r⋆)).
Works for some statistical learning problems (learning VC classes under good noise conditions).

44 References
1. B. and Massart. A high-dimensional Wilks phenomenon. Probability Theory and Related Fields, online (2010).
2. B., Lugosi and Massart. On concentration of self-bounding functions. Electronic Journal of Probability, 14 (2009).
3. Maurer. Concentration inequalities for functions of independent variables. Random Structures and Algorithms, 29 (2006).
4. McDiarmid and Reed. Concentration for self-bounding functions and an inequality of Talagrand. Random Structures and Algorithms, 29 (2006).
5. B., Lugosi and Massart. A sharp concentration inequality with applications. Random Structures and Algorithms, 16 (2000).
