Sub-Gaussian estimators under heavy tails


1 Sub-Gaussian estimators under heavy tails Roberto Imbuzeiro Oliveira XIX Escola Brasileira de Probabilidade Maresias, August 6th 2015

2 Joint with Luc Devroye (McGill) Matthieu Lerasle (CNRS/Nice) Gábor Lugosi (ICREA/UPF)

3 Our problem (and why it's interesting)

4 Our problem We want to estimate the mean of a probability distribution over the real line from an i.i.d. sample. This is related to many fundamental statistical tasks.

5 Our problem We assume finite variances, but as little else as possible. Interesting in theory, important in practice.

6 Our problem Want nearly optimal tail bounds, uniformly over large classes of distributions. High-confidence estimates are sometimes necessary.

7 Formal statement Given: $\mathcal{P}$, a family of probability distributions over $\mathbb{R}$; it should be very large (nonparametric). For $P \in \mathcal{P}$, $\mu_P$ and $\sigma_P^2$ are the mean and variance of $P$. Want: for each large enough $n \in \mathbb{N}$, an estimator $\hat{E}_n : \mathbb{R}^n \to \mathbb{R}$ and a parameter $\delta_{\min} = \delta_{\min,n} \in [0,1)$, which should be very small (exponentially small in $n$?), such that, if $X_1^n = (X_1, \dots, X_n)$ is i.i.d. from $P \in \mathcal{P}$, then
$$\forall \delta \in [\delta_{\min}, 1): \quad \mathbb{P}\left( \left|\hat{E}_n(X_1^n) - \mu_P\right| > L \sigma_P \sqrt{\frac{1 + \ln(1/\delta)}{n}} \right) \le \delta,$$
where $L$ is a constant that may depend on the family.

11 Why sub-Gaussian? What we ask for is basically that the estimator has Gaussian-like fluctuations around the mean:
$$\mathbb{P}\left( \left|\hat{E}_n(X_1^n) - \mu_P\right| > t\, \frac{\sigma_P}{\sqrt{n}} \right) \le C_1 e^{-C_2 t^2}.$$
Catoni: Gaussian-like fluctuations are optimal for "reasonable" families of distributions (more on this below).

12 Why is this interesting? Estimator must turn heavy tails into light tails! (Tail surgery?)


14 Why is this interesting? Related (weaker) estimators have been applied to problems in statistics and machine learning; our notion could improve these results. Audibert and Catoni + Hsu and Sabato (least squares), Bubeck et al. (bandits), Brownlees et al. (empirical risk minimization).

15 When is this possible? This is the main subject of our paper. We present our results before we move on.

16 Our results

17 First result Assumption: variance known up to an interval.

18 Partially known variance Example: $\mathcal{P}_{[\sigma_1^2, \sigma_2^2]}$ := all distributions with variance $\sigma_P^2 \in [\sigma_1^2, \sigma_2^2]$. We let $R := \sigma_2/\sigma_1$ (may depend on $n$). Theorem: If $R$ is bounded, then for all large enough $n$ there exist $\hat{E}_n : \mathbb{R}^n \to \mathbb{R}$, $\delta_{\min} = e^{-cn}$ and a constant $L$ such that, when $P \in \mathcal{P}_{[\sigma_1^2, \sigma_2^2]}$ and $X_1^n \sim P^n$,
$$\forall \delta \in [\delta_{\min}, 1): \quad \mathbb{P}\left( \left|\hat{E}_n(X_1^n) - \mu_P\right| > L \sigma_P \sqrt{\frac{1+\ln(1/\delta)}{n}} \right) \le \delta.$$
If $R$ is unbounded, any sequence $\delta_{\min} \to 0$ fails. This is optimal up to the exact values of $c > 0$ and $L > 0$, and the two regimes show truly different behavior!

21 Second result Assumption: (slightly) higher moments.

22 Higher moments Example: $\mathcal{P}_{\eta,\alpha}$ := all distributions with $\mathbb{E}_P |X - \mu_P|^\alpha \le (\eta\, \sigma_P)^\alpha$ (here $\alpha \in (2,3)$ is fixed; $\eta \ge 0$ may depend on $n$). Theorem: for all large enough $n$, if $k_{\eta,\alpha} := (C\eta)^{2\alpha/(\alpha-2)}$, there exist $\hat{E}_n : \mathbb{R}^n \to \mathbb{R}$, $\delta_{\min} = e^{-cn/k_{\eta,\alpha}}$, and a constant $L$ such that, when $P \in \mathcal{P}_{\eta,\alpha}$ and $X_1^n \sim P^n$,
$$\forall \delta \in [\delta_{\min}, 1): \quad \mathbb{P}\left( \left|\hat{E}_n(X_1^n) - \mu_P\right| > L \sigma_P \sqrt{\frac{1+\ln(1/\delta)}{n}} \right) \le \delta.$$
This is optimal up to the value of $c > 0$.

24 An extension It suffices to assume that the family consists of distributions that are $k$-regular: $\exists k \in \mathbb{N}$, $\forall P \in \mathcal{P}$, $\forall j \ge k$: if $X_1^j \sim P^j$, then
$$\mathbb{P}\left( \pm \sum_{i=1}^j (X_i - \mu_P) \le 0 \right) \ge \frac{1}{3} \quad \text{(for both choices of sign)}.$$
For instance, symmetric distributions are 1-regular.

25 Third result Assumption: bounded kurtosis. Under this assumption one can get the nearly optimal constant $L = \sqrt{2} + \varepsilon$. This will be discussed further later.

26 Some background

27 History Typical analyses of estimators for means are based on expectations, not deviations. Exceptions do exist (e.g. Kolmogorov's CLT for medians), but the assumptions and goals are different.

28 History Catoni's paper (AIHP Prob. Stat. 2012) seems to be the first to focus on deviations as a fundamental problem. We'll mention some more applied results later.

29 Gaussian lower bound Recall the normal cumulative distribution function
$$\Phi(r) := \int_{-\infty}^r e^{-x^2/2}\, \frac{dx}{\sqrt{2\pi}}, \qquad \Phi^{-1}(1-\delta) \sim \sqrt{2 \ln(1/\delta)} \ \text{ for } \delta \ll 1.$$
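The quantile asymptotic above can be checked directly; here is a small numeric sketch (ours, not from the slides), using only the Python standard library:

```python
# Check Phi^{-1}(1 - delta) ~ sqrt(2 ln(1/delta)) as delta -> 0.
import math
from statistics import NormalDist

std_normal = NormalDist()        # Phi is .cdf, Phi^{-1} is .inv_cdf

ratios = []
for delta in (1e-2, 1e-6, 1e-12):
    exact = std_normal.inv_cdf(1 - delta)         # Phi^{-1}(1 - delta)
    approx = math.sqrt(2 * math.log(1 / delta))   # sqrt(2 ln(1/delta))
    ratios.append(exact / approx)
    print(f"delta={delta:g}  Phi^-1={exact:.3f}  approx={approx:.3f}")
```

The ratio climbs toward 1 from below (roughly 0.77, 0.90, 0.95 here), reflecting the lower-order correction hidden in the $\sim$.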

30 Gaussian lower bound Family: $\mathcal{P}^{\mathrm{Gauss}}_{\sigma^2}$, all Gaussian distributions over $\mathbb{R}$ with variance $\sigma^2 > 0$. Thm (Catoni): for any $n$ and $\delta$,
$$\inf_{\hat{E}_n}\ \sup_{\substack{P \in \mathcal{P}^{\mathrm{Gauss}}_{\sigma^2} \\ X_1^n \sim P^n}} \mathbb{P}\left( \hat{E}_n(X_1^n) - \mu_P \ge \Phi^{-1}(1-\delta)\, \frac{\sigma_P}{\sqrt{n}} \right) = \delta.$$
A similar result holds for the lower tail. This deviation is asymptotic to $L \sqrt{\ln(1/\delta)}\, \sigma_P/\sqrt{n}$ with $L = \sqrt{2}$: compare with the definition of a sub-Gaussian estimator above, which asks for $L \sigma_P \sqrt{(1+\ln(1/\delta))/n}$.

34 The empirical mean $\hat{E}_n(X_1^n) := \frac{1}{n} \sum_{i=1}^n X_i$. It follows from Catoni's result that the empirical mean has optimal deviations for all Gaussian distributions. This is the exception rather than the rule.

35 Empirical mean fails Example: $\mathcal{P}_{\sigma^2}$, all distributions with variance $\sigma_P^2 = \sigma^2$. Thm (Catoni): Chebyshev is basically optimal, i.e.
$$\sup_{\substack{P \in \mathcal{P}_{\sigma^2} \\ X_1^n \sim P^n}} \mathbb{P}\left( \frac{1}{n} \sum_{i=1}^n X_i - \mu_P > c\, \frac{\sigma_P}{\sqrt{\delta n}} \right) \ge \delta.$$

36 Empirical mean fails Example: $\mathcal{P}_{\mathrm{krt} \le \kappa}$, all distributions with kurtosis $\kappa_P := \mathbb{E}_P |X - \mu_P|^4 / \sigma_P^4 \le \kappa$. Thm (Catoni): If $n$ is large and $\delta \le 1/n$,
$$\sup_{\substack{P \in \mathcal{P}_{\mathrm{krt} \le \kappa} \\ X_1^n \sim P^n}} \mathbb{P}\left( \frac{1}{n} \sum_{i=1}^n X_i - \mu_P > c\, \frac{\sigma_P}{(\delta n)^{1/4}} \right) \ge \delta.$$

37 Positive results Catoni obtained sharp sub-gaussian estimators in some settings. Unfortunately, they depend on the confidence level!

38 One example Example: $\mathcal{P}_{\sigma^2}$, all distributions with variance $\sigma_P^2 = \sigma^2$. Thm (Catoni): Set $\delta_{\min} := e^{-o(n)}$. Then $\forall \delta \in [\delta_{\min}, 1)$, there exists a $\delta$-dependent $\hat{E}_{n,\delta}$ with
$$\sup_{\substack{P \in \mathcal{P}_{\sigma^2} \\ X_1^n \sim P^n}} \mathbb{P}\left( \left|\hat{E}_{n,\delta}(X_1^n) - \mu_P\right| > \sigma_P \sqrt{\frac{(2+o(1)) \ln(2/\delta)}{n}} \right) \le \delta.$$

39 Why is this bad? Suppose you want high confidence. The only guarantee is that the probability of a huge error is very low. Nothing is known about the probability of average-to-large errors in more typical events.

40 Why is this bad? Statistical and machine learning applications (Bubeck et al., Brownlees et al., Hsu/Sabato) had to cope with this dependence on the confidence level. In all cases, something was lost.

41 Our results are better, or rather genuinely different. Our results imply that parameter-dependent estimators are easier to obtain. We'll see that right now.

42 Median of means

43 Median of means A simple construction of a sub-Gaussian parameter-dependent estimator that only requires finite second moments. Known for a long time, in many forms, in different communities (Nemirovski/Yudin, Alon/Matias/Szegedy, Levin, Jerrum/Sinclair, Hsu). Pre-history.

44 Median of means Example: $\mathcal{P}_{\sigma^2}$, all distributions with variance $\sigma_P^2 = \sigma^2$. Thm: Set $\delta_{\min} := e^{1-n/2}$. Then $\forall \delta \in [\delta_{\min}, 1)$, there exists a $\delta$-dependent $\hat{E}_{n,\delta}$ with
$$\sup_{\substack{P \in \mathcal{P}_{\sigma^2} \\ X_1^n \sim P^n}} \mathbb{P}\left( \left|\hat{E}_{n,\delta}(X_1^n) - \mu_P\right| > L \sigma_P \sqrt{\frac{1+\ln(2/\delta)}{n}} \right) \le \delta.$$

45 Median of means Sample: $X_1^n := (X_1, X_2, \dots, X_n)$ from distribution $P$. Blocks: split $\{1, 2, \dots, n\} = B_1 \cup B_2 \cup \dots \cup B_b$ into disjoint blocks of size $\approx n/b$. Means: for each block $B_\ell$, define $Y_\ell := \frac{b}{n} \sum_{i \in B_\ell} X_i$. Median of means: $\hat{E}_{n,\delta}(X_1^n) :=$ median of $(Y_1, Y_2, \dots, Y_b)$.
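The construction on this slide is short in code; here is a minimal sketch (function and variable names are ours, not from the paper), using only the Python standard library, with a heavy-tailed Pareto demo:

```python
# Minimal median-of-means sketch (names ours).  Blocks are contiguous
# slices; a leftover remainder of fewer than block_size points is dropped.
import random
import statistics

def median_of_means(sample, n_blocks):
    """Split the sample into n_blocks disjoint blocks, average each block,
    and return the median of the block means."""
    block_size = len(sample) // n_blocks
    block_means = [
        statistics.fmean(sample[k * block_size:(k + 1) * block_size])
        for k in range(n_blocks)
    ]
    return statistics.median(block_means)

# Demo: Pareto with tail index 2.5 has finite variance but infinite third
# moment, so it is a natural heavy-tailed test case.
rng = random.Random(0)
sample = [rng.paretovariate(2.5) for _ in range(10_000)]
true_mean = 2.5 / 1.5            # alpha / (alpha - 1) for Pareto(alpha)
est = median_of_means(sample, n_blocks=25)   # think b ~ ln(1/delta)
print(f"median of means: {est:.3f}  (true mean {true_mean:.3f})")
```

With $b$ blocks the estimator targets confidence level $\delta \approx e^{-b}$, matching the analysis on the following slides.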

46 Analysis Consider the interval $\left[\mu_P - L\sigma_P\sqrt{b/n},\ \mu_P + L\sigma_P\sqrt{b/n}\right] \subset \mathbb{R}$ around $\mu_P$.

47 Analysis Want: the median of $Y_1, \dots, Y_b$ in the interval. Sufficient: more than half of the $Y_\ell$'s are in there.

48 Analysis $Y_\ell = \frac{b}{n} \sum_{i \in B_\ell} X_i$, with the $X_i$ i.i.d. $\sim P$, so $\mathbb{E}(Y_\ell) = \mu_P$ and $\mathrm{Var}(Y_\ell) = b\, \sigma_P^2 / n$.

49 Analysis By Chebyshev, $\mathbb{P}(Y_\ell \notin \text{interval}) \le 1/L^2$. Disjoint blocks $\Rightarrow$ these events are independent.

50 Analysis The probability that at least $b/2$ of the $Y_\ell$'s fall outside the interval is bounded by a binomial tail probability. If $L$ is large enough,
$$\mathbb{P}\left( \mathrm{Bin}(b, 1/L^2) \ge b/2 \right) \le e^{-b}.$$
Take $b \approx \ln(1/\delta)$ and we're done.
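The binomial step can be checked exactly. A small numeric sketch (ours; the slide only says "$L$ large", and $L = 6$ is our choice that makes the inequality hold):

```python
# Exact check that P(Bin(b, 1/L^2) >= b/2) <= e^{-b} once L is large
# enough; L = 6 is our arbitrary-but-sufficient choice.
import math

def binom_upper_tail(b, p, threshold):
    """Exact P(Bin(b, p) >= threshold)."""
    return sum(
        math.comb(b, k) * p**k * (1 - p)**(b - k)
        for k in range(threshold, b + 1)
    )

L = 6
p = 1 / L**2
for b in (1, 5, 10, 20, 40):
    tail = binom_upper_tail(b, p, math.ceil(b / 2))
    assert tail <= math.exp(-b)
    print(f"b={b:2d}  P(Bin >= b/2) = {tail:.3e} <= e^-b = {math.exp(-b):.3e}")
```

A Chernoff calculation suggests $L$ somewhat above $2e$ suffices for the $e^{-b}$ rate, which is why a mid-size constant like 6 works here.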

52 Our proof ideas

53 Exponential is optimal Family: $\mathcal{P}^{\mathrm{La}}$, all Laplace distributions $\mathrm{La}_\mu$ with $\mu \in \mathbb{R}$ and density $\frac{d\mathrm{La}_\mu}{dx}(x) = \frac{e^{-|x-\mu|}}{2}$. Property: $e^{-n|\mu|} \le \frac{d\mathrm{La}_\mu^{\otimes n}}{d\mathrm{La}_0^{\otimes n}}(x) \le e^{n|\mu|}$. Consequence: any estimator with constant $L$ will mistake a $\mathrm{La}_0$ sample for a $\mathrm{La}_{10L^2}$ sample with probability at least $e^{-1-10L^2 n}$, so no sequence $\delta_{\min}$ below $e^{-cn}$ can work.

54 Partially known variance (recall) Example: $\mathcal{P}_{[\sigma_1^2, \sigma_2^2]}$ := all distributions with variance $\sigma_P^2 \in [\sigma_1^2, \sigma_2^2]$, with $R := \sigma_2/\sigma_1$ (may depend on $n$). Recall the theorem: if $R$ is bounded, a sub-Gaussian estimator exists with $\delta_{\min} = e^{-cn}$; if $R$ is unbounded, any sequence $\delta_{\min} \to 0$ fails.

55 Why unbounded fails Family: $\mathcal{P}^{\mathrm{Po}}_{[c/n,\, Rc/n]}$, Poisson random variables with very small means $c/n \le \mu_P \le Rc/n$. Recall that mean = variance for the Poisson! $X_1^n$ := sample with mean $c/n$, $S_X := X_1 + \dots + X_n$. $Y_1^n$ := sample with mean $Rc/n$, $S_Y := Y_1 + \dots + Y_n$.

56 Why unbounded fails Assume a good estimator $\hat{E}_n$ with constant $L$. Then
$$\mathbb{P}\left( n \hat{E}_n(Y_1^n) \ge Rc/2 \right) \ge 1 - e^{1 - Rc/(4L^2)}.$$
In particular, $\mathbb{P}\left( n \hat{E}_n(Y_1^n) \ge Rc/2 \,\middle|\, S_Y = Rc \right) \approx 1$. The same holds for $X$ as for $Y$: the sample sum is a sufficient statistic, so the conditional law of the sample given its sum is the same in both cases.

58 Why unbounded fails So $\mathbb{P}\left( n \hat{E}_n(X_1^n) \ge Rc/2 \,\middle|\, S_X = Rc \right) \approx 1$, hence
$$\mathbb{P}\left( n \hat{E}_n(X_1^n) \ge Rc/2 \right) \gtrsim \mathbb{P}(S_X = Rc) \ge e^{-Rc \ln R}.$$
On the other hand, this probability should be at most $e^{1 - R^2 c/(4L^2)}$ by the sub-Gaussian estimation property. Contradiction for $R$ large.
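The clash between the two exponents can be seen numerically; in this sketch (ours) the constants $c = 1$ and $L = 2$ are arbitrary demo choices:

```python
# Compare ln P(S_X = Rc) for S_X ~ Poisson(c) against the log of the
# bound a sub-Gaussian estimator would force, -R^2 c / (4 L^2).  For
# large R the exact Poisson probability is far bigger than the
# sub-Gaussian bound allows: contradiction.  c = 1, L = 2 are ours.
import math

c, L = 1, 2
for R in (8, 16, 32, 64):
    k = R * c                                    # target value of the sum
    log_poisson = -c + k * math.log(c) - math.lgamma(k + 1)  # ln P(Pois(c)=k)
    log_bound = -R**2 * c / (4 * L**2)           # ln e^{-R^2 c / (4 L^2)}
    print(f"R={R:3d}  ln P(S_X=Rc) = {log_poisson:8.1f}   ln bound = {log_bound:8.1f}")
```

By $R = 64$ the exact Poisson log-probability (about $-206$) already exceeds the sub-Gaussian log-bound ($-256$), which is the contradiction the slide invokes; the gap only widens as $R$ grows.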

59 The positive result Recall: if $R = \sigma_2/\sigma_1$ is bounded, then for all large enough $n$ there exist $\hat{E}_n : \mathbb{R}^n \to \mathbb{R}$, $\delta_{\min} = e^{-cn}$ and a constant $L$ with the sub-Gaussian property over $\mathcal{P}_{[\sigma_1^2, \sigma_2^2]}$. Here is how to prove it.

60 Confidence intervals Use median of means to get a confidence interval:
$$\hat{I}_{n,\delta}(X_1^n) := \left[ \hat{E}_{n,\delta}(X_1^n) \pm L \sigma_2 \sqrt{\frac{1+\ln(1/\delta)}{n}} \right],$$
which satisfies
$$\mathbb{P}\left( \mu_P \in \hat{I}_{n,\delta}(X_1^n) \text{ and } \left|\hat{I}_{n,\delta}(X_1^n)\right| \le 2LR\, \sigma_P \sqrt{\frac{1+\ln(1/\delta)}{n}} \right) \ge 1 - \delta.$$

61 Confidence intervals We'll combine sub-Gaussian confidence intervals to obtain a single sub-Gaussian estimator. Similar in spirit to Lepskii's adaptation method from nonparametric statistics.

62 Confidence intervals Lemma: let $I_1, I_2, \dots, I_K$ be random nonempty closed intervals. Assume $\mu \in \mathbb{R}$ satisfies $\mathbb{P}(\mu \notin I_k) \le 2^{-k}$ for $1 \le k \le K$. Set $\hat{K} := \min\{k \le K : \bigcap_{j=k}^K I_j \ne \emptyset\}$ and let $\hat{E}$ := midpoint of $\bigcap_{j=\hat{K}}^K I_j$. Then for all $1 \le k \le K$: $\mathbb{P}\left( |\hat{E} - \mu| > |I_k| \right) \le 2^{1-k}$, where $|I_k|$ denotes the length of $I_k$.
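The lemma's combination rule is short in code; a sketch (function name ours):

```python
# Combination rule from the lemma: find the smallest k whose suffix
# I_k, ..., I_K has nonempty intersection, and return the midpoint of
# that intersection.
def combine_intervals(intervals):
    """intervals: list of (lo, hi) pairs I_1, ..., I_K (lo <= hi).
    Returns the midpoint of the intersection of I_khat, ..., I_K, where
    khat is minimal such that this suffix intersection is nonempty."""
    K = len(intervals)
    for k in range(K):                       # k is khat - 1 (0-based)
        lo = max(l for l, _ in intervals[k:])
        hi = min(h for _, h in intervals[k:])
        if lo <= hi:                         # suffix intersection nonempty
            return (lo + hi) / 2
    raise ValueError("unreachable for nonempty inputs: I_K alone is a valid suffix")

# I_1 misses; I_2 and I_3 overlap, so khat = 2 and the answer is the
# midpoint of [10, 11].
print(combine_intervals([(0, 1), (10, 12), (9, 11)]))  # prints 10.5
```

Note the loop always terminates via the `return`: the final suffix is $I_K$ alone, which is nonempty by assumption.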

63 Proof sketch $I_1, \dots, I_K$ random nonempty closed intervals; $\hat{K} := \min\{k \le K : \bigcap_{j=k}^K I_j \ne \emptyset\}$; $\hat{E}$ := midpoint of $\bigcap_{j=\hat{K}}^K I_j$. Assume $\mu \in I_j$ for all $j \ge k$. Then $\bigcap_{j=k}^K I_j \ne \emptyset$, so $\hat{K} \le k$ and $\bigcap_{j=\hat{K}}^K I_j \subseteq I_k$. Hence $\hat{E}, \mu \in I_k$ under the assumption, and therefore $\mathbb{P}\left( |\hat{E} - \mu| > |I_k| \right) \le \sum_{j \ge k} \mathbb{P}(\mu \notin I_j)$.

64 Other uses Example: $\mathcal{P}_{\eta,\alpha}$ := all distributions with $\mathbb{E}_P |X - \mu_P|^\alpha \le (\eta\, \sigma_P)^\alpha$ (here $\alpha \in (2,3)$ is fixed; $\eta \ge 0$ may depend on $n$). To prove the higher-moments theorem (with $\delta_{\min} = e^{-cn/k_{\eta,\alpha}}$), use quantiles of means, instead of medians of means, to build the confidence intervals. Berry-Esseen-type bounds show that the empirical means of the blocks are nearly symmetric.

66 Different ideas - kurtosis Under bounded kurtosis, one can use the empirical mean of truncated random variables. The truncation is data-driven and uses preliminary estimates of the mean and variance. Use empirical-process theory to show this is similar to truncating at the exact mean and variance. Sharp bounds!

67 Open problems

68 Open problems Sharp constants are essential for statisticians. Are sub-gaussian confidence intervals somehow equivalent to sub-gaussian estimators? Efficient extensions to vector-valued data and to risk minimization problems. Optimal deviation bounds for Poissons, Bernoullis, etc.

69 Obrigado! (Thank you!) References in the next slides.

70 Our preprint Should be posted to the arXiv in a few weeks. Available upon request from roboliv AT gmail.com.

71 Catoni's work Catoni's estimation paper and its companion paper on least squares (with Audibert): J.-Y. Audibert & O. Catoni. "Robust linear least squares regression." Ann. Stat. 39 no. 5 (2011). O. Catoni. "Challenging the empirical mean and empirical variance: A deviation study." Ann. Inst. H. Poincaré Probab. Statist. 48 no. 4 (2012).

72 Median of means D. Hsu, robust-statistics.html (see also Levin, L. "Notes for Miscellaneous Lectures." arXiv:cs/). N. Alon, Y. Matias & M. Szegedy. "The Space Complexity of Approximating the Frequency Moments." J. Comput. Syst. Sci. 58 no. 1 (1999). A. Nemirovski & D. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley (1983).

73 Some applications C. Brownlees, E. Joly & G. Lugosi. "Empirical risk minimization for heavy-tailed losses." To appear in Ann. Stat. S. Bubeck, N. Cesa-Bianchi & G. Lugosi. "Bandits with heavy tail." IEEE Transactions on Information Theory 59 no. 11 (2013). D. Hsu & S. Sabato. "Loss minimization and parameter estimation with heavy tails." arXiv: . Abstract in ICML proceedings (2014).


More information

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

More information

Sharpness of second moment criteria for branching and tree-indexed processes

Sharpness of second moment criteria for branching and tree-indexed processes Sharpness of second moment criteria for branching and tree-indexed processes Robin Pemantle 1, 2 ABSTRACT: A class of branching processes in varying environments is exhibited which become extinct almost

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018 15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Fluctuations from the Semicircle Law Lecture 4

Fluctuations from the Semicircle Law Lecture 4 Fluctuations from the Semicircle Law Lecture 4 Ioana Dumitriu University of Washington Women and Math, IAS 2014 May 23, 2014 Ioana Dumitriu (UW) Fluctuations from the Semicircle Law Lecture 4 May 23, 2014

More information

The main results about probability measures are the following two facts:

The main results about probability measures are the following two facts: Chapter 2 Probability measures The main results about probability measures are the following two facts: Theorem 2.1 (extension). If P is a (continuous) probability measure on a field F 0 then it has a

More information

STAT 430/510: Lecture 15

STAT 430/510: Lecture 15 STAT 430/510: Lecture 15 James Piette June 23, 2010 Updates HW4 is up on my website. It is due next Mon. (June 28th). Starting today back at section 6.4... Conditional Distribution: Discrete Def: The conditional

More information

Lecture 3. The Population Variance. The population variance, denoted σ 2, is the sum. of the squared deviations about the population

Lecture 3. The Population Variance. The population variance, denoted σ 2, is the sum. of the squared deviations about the population Lecture 5 1 Lecture 3 The Population Variance The population variance, denoted σ 2, is the sum of the squared deviations about the population mean divided by the number of observations in the population,

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

arxiv: v1 [cs.lg] 16 Jan 2017

arxiv: v1 [cs.lg] 16 Jan 2017 Achieving Privacy in the Adversarial Multi-Armed Bandit Aristide C. Y. Tossou Chalmers University of Technology Gothenburg, Sweden aristide@chalmers.se Christos Dimitrakakis University of Lille, France

More information

Concentration inequalities and tail bounds

Concentration inequalities and tail bounds Concentration inequalities and tail bounds John Duchi Outline I Basics and motivation 1 Law of large numbers 2 Markov inequality 3 Cherno bounds II Sub-Gaussian random variables 1 Definitions 2 Examples

More information

Fundamental Tools - Probability Theory IV

Fundamental Tools - Probability Theory IV Fundamental Tools - Probability Theory IV MSc Financial Mathematics The University of Warwick October 1, 2015 MSc Financial Mathematics Fundamental Tools - Probability Theory IV 1 / 14 Model-independent

More information

COMPLETE QTH MOMENT CONVERGENCE OF WEIGHTED SUMS FOR ARRAYS OF ROW-WISE EXTENDED NEGATIVELY DEPENDENT RANDOM VARIABLES

COMPLETE QTH MOMENT CONVERGENCE OF WEIGHTED SUMS FOR ARRAYS OF ROW-WISE EXTENDED NEGATIVELY DEPENDENT RANDOM VARIABLES Hacettepe Journal of Mathematics and Statistics Volume 43 2 204, 245 87 COMPLETE QTH MOMENT CONVERGENCE OF WEIGHTED SUMS FOR ARRAYS OF ROW-WISE EXTENDED NEGATIVELY DEPENDENT RANDOM VARIABLES M. L. Guo

More information

1 Exercises for lecture 1

1 Exercises for lecture 1 1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )

More information

Optimal global rates of convergence for interpolation problems with random design

Optimal global rates of convergence for interpolation problems with random design Optimal global rates of convergence for interpolation problems with random design Michael Kohler 1 and Adam Krzyżak 2, 1 Fachbereich Mathematik, Technische Universität Darmstadt, Schlossgartenstr. 7, 64289

More information

On the singular values of random matrices

On the singular values of random matrices On the singular values of random matrices Shahar Mendelson Grigoris Paouris Abstract We present an approach that allows one to bound the largest and smallest singular values of an N n random matrix with

More information

LARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS*

LARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS* LARGE EVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILE EPENENT RANOM VECTORS* Adam Jakubowski Alexander V. Nagaev Alexander Zaigraev Nicholas Copernicus University Faculty of Mathematics and Computer Science

More information

Consistency of Nearest Neighbor Methods

Consistency of Nearest Neighbor Methods E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study

More information

Quantile Regression for Extraordinarily Large Data

Quantile Regression for Extraordinarily Large Data Quantile Regression for Extraordinarily Large Data Shih-Kang Chao Department of Statistics Purdue University November, 2016 A joint work with Stanislav Volgushev and Guang Cheng Quantile regression Two-step

More information

Information Measure Estimation and Applications: Boosting the Effective Sample Size from n to n ln n

Information Measure Estimation and Applications: Boosting the Effective Sample Size from n to n ln n Information Measure Estimation and Applications: Boosting the Effective Sample Size from n to n ln n Jiantao Jiao (Stanford EE) Joint work with: Kartik Venkat Yanjun Han Tsachy Weissman Stanford EE Tsinghua

More information

Bahadur representations for bootstrap quantiles 1

Bahadur representations for bootstrap quantiles 1 Bahadur representations for bootstrap quantiles 1 Yijun Zuo Department of Statistics and Probability, Michigan State University East Lansing, MI 48824, USA zuo@msu.edu 1 Research partially supported by

More information

Understanding Generalization Error: Bounds and Decompositions

Understanding Generalization Error: Bounds and Decompositions CIS 520: Machine Learning Spring 2018: Lecture 11 Understanding Generalization Error: Bounds and Decompositions Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the

More information

Does Unlabeled Data Help?

Does Unlabeled Data Help? Does Unlabeled Data Help? Worst-case Analysis of the Sample Complexity of Semi-supervised Learning. Ben-David, Lu and Pal; COLT, 2008. Presentation by Ashish Rastogi Courant Machine Learning Seminar. Outline

More information

arxiv: v2 [math.st] 30 Nov 2017

arxiv: v2 [math.st] 30 Nov 2017 Robust machine learning by median-of-means : theory and practice G. Lecué and M. Lerasle December 4, 2017 arxiv:1711.10306v2 [math.st] 30 Nov 2017 Abstract We introduce new estimators for robust machine

More information

TUM 2016 Class 1 Statistical learning theory

TUM 2016 Class 1 Statistical learning theory TUM 2016 Class 1 Statistical learning theory Lorenzo Rosasco UNIGE-MIT-IIT July 25, 2016 Machine learning applications Texts Images Data: (x 1, y 1 ),..., (x n, y n ) Note: x i s huge dimensional! All

More information

Limiting Distributions

Limiting Distributions Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Basic concepts of probability theory

Basic concepts of probability theory Basic concepts of probability theory Random variable discrete/continuous random variable Transform Z transform, Laplace transform Distribution Geometric, mixed-geometric, Binomial, Poisson, exponential,

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Prof. Tapio Elomaa tapio.elomaa@tut.fi Course Basics A new 4 credit unit course Part of Theoretical Computer Science courses at the Department of Mathematics There will be 4 hours

More information

On rate of convergence in distribution of asymptotically normal statistics based on samples of random size

On rate of convergence in distribution of asymptotically normal statistics based on samples of random size Annales Mathematicae et Informaticae 39 212 pp. 17 28 Proceedings of the Conference on Stochastic Models and their Applications Faculty of Informatics, University of Debrecen, Debrecen, Hungary, August

More information

Asymptotic distribution of the sample average value-at-risk in the case of heavy-tailed returns

Asymptotic distribution of the sample average value-at-risk in the case of heavy-tailed returns Asymptotic distribution of the sample average value-at-risk in the case of heavy-tailed returns Stoyan V. Stoyanov Chief Financial Researcher, FinAnalytica Inc., Seattle, USA e-mail: stoyan.stoyanov@finanalytica.com

More information

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,

More information

Part 2: One-parameter models

Part 2: One-parameter models Part 2: One-parameter models 1 Bernoulli/binomial models Return to iid Y 1,...,Y n Bin(1, ). The sampling model/likelihood is p(y 1,...,y n ) = P y i (1 ) n P y i When combined with a prior p( ), Bayes

More information

A Note on Interference in Random Networks

A Note on Interference in Random Networks CCCG 2012, Charlottetown, P.E.I., August 8 10, 2012 A Note on Interference in Random Networks Luc Devroye Pat Morin Abstract The (maximum receiver-centric) interference of a geometric graph (von Rickenbach

More information

Model Fitting. Jean Yves Le Boudec

Model Fitting. Jean Yves Le Boudec Model Fitting Jean Yves Le Boudec 0 Contents 1. What is model fitting? 2. Linear Regression 3. Linear regression with norm minimization 4. Choosing a distribution 5. Heavy Tail 1 Virus Infection Data We

More information

arxiv: v2 [math.pr] 8 Feb 2016

arxiv: v2 [math.pr] 8 Feb 2016 Noname manuscript No will be inserted by the editor Bounds on Tail Probabilities in Exponential families Peter Harremoës arxiv:600579v [mathpr] 8 Feb 06 Received: date / Accepted: date Abstract In this

More information

Limiting Distributions

Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the two fundamental results

More information

The information complexity of sequential resource allocation

The information complexity of sequential resource allocation The information complexity of sequential resource allocation Emilie Kaufmann, joint work with Olivier Cappé, Aurélien Garivier and Shivaram Kalyanakrishan SMILE Seminar, ENS, June 8th, 205 Sequential allocation

More information

Learning Theory. Machine Learning CSE546 Carlos Guestrin University of Washington. November 25, Carlos Guestrin

Learning Theory. Machine Learning CSE546 Carlos Guestrin University of Washington. November 25, Carlos Guestrin Learning Theory Machine Learning CSE546 Carlos Guestrin University of Washington November 25, 2013 Carlos Guestrin 2005-2013 1 What now n We have explored many ways of learning from data n But How good

More information

R. Lachieze-Rey Recent Berry-Esseen bounds obtained with Stein s method andgeorgia PoincareTech. inequalities, 1 / with 29 G.

R. Lachieze-Rey Recent Berry-Esseen bounds obtained with Stein s method andgeorgia PoincareTech. inequalities, 1 / with 29 G. Recent Berry-Esseen bounds obtained with Stein s method and Poincare inequalities, with Geometric applications Raphaël Lachièze-Rey, Univ. Paris 5 René Descartes, Georgia Tech. R. Lachieze-Rey Recent Berry-Esseen

More information

Homework 4 Solutions

Homework 4 Solutions CS 174: Combinatorics and Discrete Probability Fall 01 Homework 4 Solutions Problem 1. (Exercise 3.4 from MU 5 points) Recall the randomized algorithm discussed in class for finding the median of a set

More information

Worst-Case Bounds for Gaussian Process Models

Worst-Case Bounds for Gaussian Process Models Worst-Case Bounds for Gaussian Process Models Sham M. Kakade University of Pennsylvania Matthias W. Seeger UC Berkeley Abstract Dean P. Foster University of Pennsylvania We present a competitive analysis

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with P. Di Biasi and G. Peccati Workshop on Limit Theorems and Applications Paris, 16th January

More information

MOMENT CONVERGENCE RATES OF LIL FOR NEGATIVELY ASSOCIATED SEQUENCES

MOMENT CONVERGENCE RATES OF LIL FOR NEGATIVELY ASSOCIATED SEQUENCES J. Korean Math. Soc. 47 1, No., pp. 63 75 DOI 1.4134/JKMS.1.47..63 MOMENT CONVERGENCE RATES OF LIL FOR NEGATIVELY ASSOCIATED SEQUENCES Ke-Ang Fu Li-Hua Hu Abstract. Let X n ; n 1 be a strictly stationary

More information