Limit theorems. Sayan Mukherjee



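We will learn various law of large numbers results. These results will also be used to motivate issues such as integration and switching limits. These are usually taught before limit theorems, but we have reversed the order. We will also look at examples of how to use laws of large numbers. We will discuss the strong law and the weak law of large numbers in the notes on convergence.

Law of Large Numbers

In this lecture, we will look at concentration inequalities, or laws of large numbers, for a fixed function.

Let $(\Omega, \mathcal{L}, \mu)$ be a probability space. Let $X_1, \ldots, X_n$ be real random variables on $\Omega$.

A sequence of random variables $Y_n$ converges almost surely to a random variable $Y$ iff $P(Y_n \to Y) = 1$.

A sequence of random variables $Y_n$ converges in probability to a random variable $Y$ iff for every $\epsilon > 0$,
$$\lim_{n \to \infty} P(|Y_n - Y| > \epsilon) = 0.$$

Let $\hat{\mu}_n := \frac{1}{n} \sum_{i=1}^n X_i$. The sequence $X_1, \ldots, X_n$ satisfies the strong law of large numbers iff for some constant $c$, $\hat{\mu}_n$ converges to $c$ almost surely. The sequence $X_1, \ldots, X_n$ satisfies the weak law of large numbers iff for some constant $c$, $\hat{\mu}_n$ converges to $c$ in probability. In general the constant $c$ will be the expectation of the random variable, $EX$.

A given function $f(X)$ of random variables $X$ concentrates if the deviation between its empirical average, $\frac{1}{n} \sum_{i=1}^n f(x_i)$, and its expectation, $E f(X)$, goes to zero as $n$ goes to infinity. That is, $f(X)$ satisfies the law of large numbers.

Polynomial inequalities

Theorem 0.0.1 (Jensen) If $\phi$ is a convex function then $\phi(EX) \leq E\phi(X)$.

Theorem 0.0.2 (Bienaymé-Chebyshev) For any random variable $X$ and $\epsilon > 0$,
$$P(|X| \geq \epsilon) \leq \frac{EX^2}{\epsilon^2}.$$

Proof.
$$EX^2 \geq E\left(X^2 I_{\{|X| \geq \epsilon\}}\right) \geq \epsilon^2 P(|X| \geq \epsilon).$$

Theorem 0.0.3 (Markov) For any random variable $X$, $\epsilon > 0$ and $\lambda > 0$,
$$P(X \geq \epsilon) \leq \frac{E e^{\lambda X}}{e^{\lambda \epsilon}}$$
and
$$P(X \geq \epsilon) \leq \inf_{\lambda > 0} e^{-\lambda \epsilon} E e^{\lambda X}.$$

Proof.
$$P(X > \epsilon) = P(e^{\lambda X} > e^{\lambda \epsilon}) \leq \frac{E e^{\lambda X}}{e^{\lambda \epsilon}}.$$

Exponential inequalities

For sums or averages of independent random variables the above bounds can be improved from polynomial in $1/\epsilon$ to exponential in $\epsilon$.

Theorem 0.0.4 (Bennett) Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$, $EX_i^2 = \sigma^2$, and $|X_i| \leq M$. For $\epsilon > 0$,
$$P\left(\left|\sum_{i=1}^n X_i\right| > \epsilon\right) \leq 2 \exp\left(-\frac{n\sigma^2}{M^2}\, \phi\!\left(\frac{\epsilon M}{n\sigma^2}\right)\right),$$
where
$$\phi(z) = (1 + z)\log(1 + z) - z.$$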
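To get a feel for the gap between polynomial and exponential bounds, the following sketch compares Chebyshev's bound with the two-sided Bennett bound of Theorem 0.0.4 and checks both against an exact tail probability. It assumes, purely for illustration, that the $X_i$ are independent $\pm 1$ signs, so that $\sigma^2 = 1$ and $M = 1$ and the exact tail follows from the binomial distribution. The function names are hypothetical and not part of the notes.

```python
import math


def phi(z):
    """phi(z) = (1 + z) log(1 + z) - z, the function in Bennett's bound."""
    return (1.0 + z) * math.log1p(z) - z


def chebyshev_bound(n, sigma2, eps):
    """Chebyshev applied to the sum S_n: E[S_n^2] = n * sigma2."""
    return min(1.0, n * sigma2 / eps**2)


def bennett_bound(n, sigma2, M, eps):
    """Two-sided bound 2 exp(-(n sigma^2 / M^2) phi(eps M / (n sigma^2)))."""
    exponent = (n * sigma2 / M**2) * phi(eps * M / (n * sigma2))
    return min(1.0, 2.0 * math.exp(-exponent))


def exact_tail_rademacher(n, eps):
    """Exact P(|S_n| > eps) for S_n a sum of n independent +/-1 signs.

    Assumption for this example only: S_n = 2K - n with K ~ Binomial(n, 1/2).
    """
    favourable = sum(
        math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) > eps
    )
    return favourable / 2**n


if __name__ == "__main__":
    n = 100
    for eps in (10, 20, 30, 40):
        print(
            eps,
            exact_tail_rademacher(n, eps),
            chebyshev_bound(n, 1.0, eps),
            bennett_bound(n, 1.0, 1.0, eps),
        )
```

For small deviations the two bounds are comparable, and Chebyshev can even be the smaller of the two; the exponential bound takes over as $\epsilon$ grows relative to $\sqrt{n\sigma^2}$.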

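Proof. We will prove a bound on one side of the above theorem,
$$P\left(\sum_{i=1}^n X_i > \epsilon\right).$$
For $\lambda > 0$,
$$P\left(\sum_{i=1}^n X_i > \epsilon\right) \leq e^{-\lambda\epsilon}\, E e^{\lambda \sum_{i=1}^n X_i} = e^{-\lambda\epsilon} \prod_{i=1}^n E e^{\lambda X_i} = e^{-\lambda\epsilon} \left(E e^{\lambda X}\right)^n.$$
Expanding the exponential (the $k = 1$ term vanishes because $EX = 0$),
$$E e^{\lambda X} = E \sum_{k=0}^\infty \frac{(\lambda X)^k}{k!} = \sum_{k=0}^\infty \frac{\lambda^k\, E X^k}{k!} = 1 + \sum_{k=2}^\infty \frac{\lambda^k}{k!}\, E\left[X^2 X^{k-2}\right]$$
$$\leq 1 + \sum_{k=2}^\infty \frac{\lambda^k}{k!}\, M^{k-2} \sigma^2 = 1 + \frac{\sigma^2}{M^2} \sum_{k=2}^\infty \frac{\lambda^k M^k}{k!} = 1 + \frac{\sigma^2}{M^2}\left(e^{\lambda M} - 1 - \lambda M\right) \leq e^{\frac{\sigma^2}{M^2}\left(e^{\lambda M} - \lambda M - 1\right)}.$$
The last line holds since $1 + x \leq e^x$. Therefore,
$$P\left(\sum_{i=1}^n X_i > \epsilon\right) \leq e^{-\lambda\epsilon}\, e^{\frac{n\sigma^2}{M^2}\left(e^{\lambda M} - \lambda M - 1\right)}. \quad (1)$$
We now optimize with respect to $\lambda$ by taking the derivative of the exponent with respect to $\lambda$:
$$0 = -\epsilon + \frac{n\sigma^2}{M^2}\left(M e^{\lambda M} - M\right),$$
$$e^{\lambda M} = \frac{\epsilon M}{n\sigma^2} + 1,$$
$$\lambda = \frac{1}{M} \log\left(1 + \frac{\epsilon M}{n\sigma^2}\right).$$
The theorem is proven by substituting $\lambda$ into equation (1).

The problem with Bennett's inequality is that it is hard to get a simple expression for $\epsilon$ as a function of the probability of the sum exceeding $\epsilon$.

Theorem 0.0.5 (Bernstein) Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$, $EX_i^2 = \sigma^2$, and $|X_i| \leq M$. For $\epsilon > 0$,
$$P\left(\left|\sum_{i=1}^n X_i\right| > \epsilon\right) \leq 2 \exp\left(-\frac{\epsilon^2}{2n\sigma^2 + \frac{2}{3}\epsilon M}\right).$$

Proof. Take the proof of Bennett's inequality and notice that
$$\phi(z) \geq \frac{z^2}{2 + \frac{2}{3} z}.$$

Remark. With Bernstein's inequality a simple expression for $\epsilon$ as a function of the probability of the sum exceeding $\epsilon$ can be computed: with probability at least $1 - e^{-u}$,
$$\sum_{i=1}^n X_i \leq \frac{2}{3} uM + \sqrt{2n\sigma^2 u}.$$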
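The step from Bennett to Bernstein and the closed form in the remark can both be checked numerically. The sketch below evaluates $\phi(z)$ against the lower bound $z^2/(2 + \frac{2}{3}z)$, and compares the exact positive root of $\epsilon^2/(2n\sigma^2 + \frac{2}{3}\epsilon M) = u$ with the simpler expression $\frac{2}{3}uM + \sqrt{2n\sigma^2 u}$. The function names and the parameter values in the demo are illustrative assumptions.

```python
import math


def phi(z):
    """phi(z) = (1 + z) log(1 + z) - z from Bennett's inequality."""
    return (1.0 + z) * math.log1p(z) - z


def phi_lower(z):
    """Lower bound z^2 / (2 + 2z/3) used to pass from Bennett to Bernstein."""
    return z * z / (2.0 + 2.0 * z / 3.0)


def bernstein_exponent(n, sigma2, M, eps):
    """Exponent eps^2 / (2 n sigma^2 + (2/3) eps M) in Bernstein's bound."""
    return eps**2 / (2.0 * n * sigma2 + 2.0 * eps * M / 3.0)


def bernstein_eps_exact(n, sigma2, M, u):
    """Positive root of eps^2 - (2/3) u M eps - 2 n sigma^2 u = 0."""
    return u * M / 3.0 + math.sqrt(u * u * M * M / 9.0 + 2.0 * n * sigma2 * u)


def bernstein_eps_simple(n, sigma2, M, u):
    """Upper bound on the root obtained from sqrt(a + b) <= sqrt(a) + sqrt(b)."""
    return 2.0 * u * M / 3.0 + math.sqrt(2.0 * n * sigma2 * u)


if __name__ == "__main__":
    for z in (0.01, 0.1, 1.0, 10.0):
        print(z, phi(z), phi_lower(z))
    n, sigma2, M = 1000, 0.25, 1.0  # hypothetical values for the demo
    for u in (1.0, 3.0, 10.0):
        print(
            u,
            bernstein_eps_exact(n, sigma2, M, u),
            bernstein_eps_simple(n, sigma2, M, u),
        )
```

The simple expression is never smaller than the exact root, so using it as the threshold can only decrease the tail probability, which is why the bound still holds with probability at least $1 - e^{-u}$.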

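Outline.
$$P\left(\sum_{i=1}^n X_i > \epsilon\right) \leq \exp\left(-\frac{\epsilon^2}{2n\sigma^2 + \frac{2}{3}\epsilon M}\right) = e^{-u}, \quad \text{where} \quad u = \frac{\epsilon^2}{2n\sigma^2 + \frac{2}{3}\epsilon M}.$$
We now solve for $\epsilon$:
$$\epsilon^2 - \frac{2}{3} u \epsilon M - 2n\sigma^2 u = 0$$
and
$$\epsilon = \frac{1}{3} uM + \sqrt{\frac{u^2 M^2}{9} + 2n\sigma^2 u}.$$
Since $\sqrt{a + b} \leq \sqrt{a} + \sqrt{b}$,
$$\epsilon \leq \frac{2}{3} uM + \sqrt{2n\sigma^2 u}.$$
So with large probability (at least $1 - e^{-u}$),
$$\sum_{i=1}^n X_i \leq \frac{2}{3} uM + \sqrt{2n\sigma^2 u}.$$

If we want to bound
$$\left|\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X)\right|$$
we consider
$$|f(x_i) - E f(X)| \leq 2M.$$
Therefore
$$\sum_{i=1}^n \left(f(x_i) - E f(X)\right) \leq \frac{4}{3} uM + \sqrt{2n\sigma^2 u}$$
and
$$\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X) \leq \frac{4}{3} \frac{uM}{n} + \sqrt{\frac{2\sigma^2 u}{n}}.$$
Similarly,
$$E f(X) - \frac{1}{n} \sum_{i=1}^n f(x_i) \leq \frac{4}{3} \frac{uM}{n} + \sqrt{\frac{2\sigma^2 u}{n}}.$$

In the above bound, the condition
$$\sqrt{\frac{2\sigma^2 u}{n}} \geq \frac{4uM}{n}$$
is equivalent to
$$u \leq \frac{n\sigma^2}{8M^2},$$
and therefore
$$\left|\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X)\right| \lesssim \sqrt{\frac{2\sigma^2 u}{n}} \quad \text{for } u \lesssim n\sigma^2,$$
which corresponds to the tail probability for a Gaussian random variable and is predicted by the Central Limit Theorem (CLT) condition that $\lim_{n \to \infty} n\sigma^2 = \infty$. If $\lim_{n \to \infty} n\sigma^2 = C$, where $C$ is a fixed constant, then
$$\left|\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X)\right| \lesssim \frac{C}{n},$$
which corresponds to the tail probability for a Poisson random variable.

We now look at an even simpler exponential inequality where we do not need information on the variance.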

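Theorem 0.0.6 (Hoeffding) Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$ and $|X_i| \leq M_i$. For $\epsilon > 0$,
$$P\left(\left|\sum_{i=1}^n X_i\right| > \epsilon\right) \leq 2 \exp\left(-\frac{\epsilon^2}{2\sum_{i=1}^n M_i^2}\right).$$

Proof. For $\lambda > 0$,
$$P\left(\sum_{i=1}^n X_i > \epsilon\right) \leq e^{-\lambda\epsilon}\, E e^{\lambda \sum_{i=1}^n X_i} = e^{-\lambda\epsilon} \prod_{i=1}^n E e^{\lambda X_i}.$$
It can be shown (homework problem) that
$$E\left(e^{\lambda X_i}\right) \leq e^{\lambda^2 M_i^2 / 2}.$$
This is the bound $E e^{\lambda X} \leq e^{\lambda^2 (b - a)^2 / 8}$ for a mean-zero variable with values in $[a, b]$, applied with $b - a = 2M_i$. The theorem is proven by optimizing the following with respect to $\lambda$:
$$e^{-\lambda\epsilon} \prod_{i=1}^n e^{\lambda^2 M_i^2 / 2}.$$

Applying Hoeffding's inequality to
$$\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X),$$
with $|f(x_i) - E f(X)| \leq 2M$ as before, we can state that with probability $1 - e^{-u}$,
$$\frac{1}{n} \sum_{i=1}^n f(x_i) - E f(X) \leq \sqrt{\frac{8M^2 u}{n}},$$
which is sub-Gaussian as in the CLT, but without the variance information we can never achieve the $1/n$ rate we achieved when the random variable has a Poisson tail distribution.

Theorem 0.0.7 (Hoeffding) Let $X_1, \ldots, X_n$ be independent random variables with $P(X_i = M_i) = 1/2$ and $P(X_i = -M_i) = 1/2$. For $\epsilon > 0$,
$$P\left(\left|\sum_{i=1}^n X_i\right| > \epsilon\right) \leq 2 \exp\left(-\frac{\epsilon^2}{2\sum_{i=1}^n M_i^2}\right).$$

Proof. For $\lambda > 0$,
$$P\left(\sum_{i=1}^n X_i > \epsilon\right) \leq e^{-\lambda\epsilon}\, E e^{\lambda \sum_{i=1}^n X_i} = e^{-\lambda\epsilon} \prod_{i=1}^n E e^{\lambda X_i}.$$
$$E\left(e^{\lambda X_i}\right) = \frac{1}{2} e^{\lambda M_i} + \frac{1}{2} e^{-\lambda M_i},$$
$$\frac{1}{2} e^{\lambda M_i} + \frac{1}{2} e^{-\lambda M_i} = \sum_{k=0}^\infty \frac{(M_i \lambda)^{2k}}{(2k)!} \leq e^{\lambda^2 M_i^2 / 2},$$
where the inequality uses $(2k)! \geq 2^k k!$. Optimize the following with respect to $\lambda$:
$$e^{-\lambda\epsilon} \prod_{i=1}^n e^{\lambda^2 M_i^2 / 2}.$$
The minimum is attained at $\lambda = \epsilon / \sum_{i=1}^n M_i^2$, which gives the stated bound.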
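For symmetric variables taking the values $\pm M_i$ with probability $1/2$ each, the tail of the sum can be computed exactly by enumerating sign patterns when $n$ is small, which gives a direct check of the bound $2\exp(-\epsilon^2 / (2\sum_i M_i^2))$ and of the moment generating function estimate $\cosh(\lambda M_i) \leq e^{\lambda^2 M_i^2/2}$ used in the proof. This is a minimal sketch; the function names and the particular values of $M_i$ are assumptions made for the example.

```python
import itertools
import math


def hoeffding_bound(M, eps):
    """Two-sided bound 2 exp(-eps^2 / (2 sum_i M_i^2)), capped at 1."""
    return min(1.0, 2.0 * math.exp(-eps**2 / (2.0 * sum(m * m for m in M))))


def rademacher_mgf(lam, m):
    """E exp(lam * X) for X = +/- m with probability 1/2 each, i.e. cosh(lam * m)."""
    return 0.5 * math.exp(lam * m) + 0.5 * math.exp(-lam * m)


def exact_tail(M, eps):
    """Exact P(|sum_i X_i| > eps) by enumerating all 2^n sign patterns.

    Only practical for small n; used here purely as a sanity check.
    """
    count = 0
    for signs in itertools.product((-1.0, 1.0), repeat=len(M)):
        if abs(sum(s * m for s, m in zip(signs, M))) > eps:
            count += 1
    return count / 2 ** len(M)


if __name__ == "__main__":
    # Hypothetical scales M_i for the demo.
    M = [1.0, 0.5, 2.0, 1.5, 1.0, 0.5, 2.0, 1.0, 1.5, 1.0]
    for eps in (1.0, 3.0, 5.0, 8.0):
        print(eps, exact_tail(M, eps), hoeffding_bound(M, eps))
    for lam in (0.1, 1.0, 2.0):
        print(lam, rademacher_mgf(lam, 1.5), math.exp(lam**2 * 1.5**2 / 2.0))
```

Enumeration costs $2^n$ evaluations, so for larger $n$ one would fall back on simulation; the bound itself needs only $\sum_i M_i^2$.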

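Martingale inequalities

In the previous section we stated some concentration inequalities for sums of independent random variables. We now look at more complicated functions of independent random variables and introduce a particular martingale inequality to prove concentration.

Let $(\Omega, \mathcal{L}, \mu)$ be a probability space. Let $X_1, \ldots, X_n$ be real random variables on $\Omega$. Let the function $Z(X_1, \ldots, X_n) : \Omega^n \to \mathbb{R}$ be a map from the random variables to a real number.

The function $Z$ concentrates if the deviation between $Z(X_1, \ldots, X_n)$ and $E_{X_1, \ldots, X_n} Z(X_1, \ldots, X_n)$ goes to zero as $n$ goes to infinity.

Theorem 0.0.8 (McDiarmid) Let $X_1, \ldots, X_n$ be independent random variables and let $Z(X_1, \ldots, X_n) : \Omega^n \to \mathbb{R}$ be such that for all $x_1, \ldots, x_n$ and $x_1', \ldots, x_n'$,
$$|Z(x_1, \ldots, x_n) - Z(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n)| \leq c_i.$$
Then
$$P(Z - EZ > \epsilon) \leq \exp\left(-\frac{\epsilon^2}{2\sum_{i=1}^n c_i^2}\right).$$

Proof. For $\lambda > 0$,
$$P(Z - EZ > \epsilon) = P\left(e^{\lambda(Z - EZ)} > e^{\lambda\epsilon}\right) \leq e^{-\lambda\epsilon}\, E e^{\lambda(Z - EZ)}.$$
We will use the following very useful decomposition:
$$Z(X_1, \ldots, X_n) - E_{X_1, \ldots, X_n} Z(X_1, \ldots, X_n) = \left[Z(X_1, \ldots, X_n) - E_{X_1} Z(X_1, X_2, \ldots, X_n)\right]$$
$$+ \left[E_{X_1} Z(X_1, X_2, \ldots, X_n) - E_{X_1, X_2} Z(X_1, X_2, X_3, \ldots, X_n)\right]$$
$$+ \cdots + \left[E_{X_1, \ldots, X_{n-1}} Z(X_1, X_2, \ldots, X_{n-1}, X_n) - E_{X_1, \ldots, X_n} Z(X_1, \ldots, X_n)\right].$$
We denote the random variable
$$Z_i(X_i, \ldots, X_n) := E_{X_1, \ldots, X_{i-1}} Z(X_1, \ldots, X_{i-1}, X_i, \ldots, X_n) - E_{X_1, \ldots, X_i} Z(X_1, \ldots, X_i, X_{i+1}, \ldots, X_n),$$
and
$$Z(X_1, \ldots, X_n) - E_{X_1, \ldots, X_n} Z(X_1, \ldots, X_n) = Z_1 + \cdots + Z_n.$$
The following inequality is true (see the following Lemma for a proof):
$$E_{X_i} e^{\lambda Z_i} \leq e^{\lambda^2 c_i^2 / 2} \quad \forall \lambda \in \mathbb{R}.$$
By induction, since $Z_2, \ldots, Z_n$ do not depend on $X_1$,
$$E e^{\lambda(Z - EZ)} = E e^{\lambda(Z_1 + \cdots + Z_n)} = E\, E_{X_1} e^{\lambda(Z_1 + \cdots + Z_n)} = E\left[e^{\lambda(Z_2 + \cdots + Z_n)}\, E_{X_1} e^{\lambda Z_1}\right] \leq E e^{\lambda(Z_2 + \cdots + Z_n)}\, e^{\lambda^2 c_1^2 / 2},$$
$$E e^{\lambda(Z - EZ)} \leq e^{\lambda^2 \sum_{i=1}^n c_i^2 / 2}.$$
To derive the bound we optimize with respect to $\lambda$:
$$e^{-\lambda\epsilon + \lambda^2 \sum_{i=1}^n c_i^2 / 2}.$$

Lemma 0.0.1 For all $\lambda \in \mathbb{R}$,
$$E_{X_i} e^{\lambda Z_i} \leq e^{\lambda^2 c_i^2 / 2}.$$

Proof. For any $t \in [-1, 1]$, write $\lambda t$ as a convex combination of $\lambda$ and $-\lambda$ and use the convexity of the exponential function:
$$e^{\lambda t} = e^{\frac{1+t}{2}\lambda + \frac{1-t}{2}(-\lambda)} \leq \frac{1+t}{2} e^{\lambda} + \frac{1-t}{2} e^{-\lambda} = \frac{e^{\lambda} + e^{-\lambda}}{2} + t\, \frac{e^{\lambda} - e^{-\lambda}}{2} \leq e^{\lambda^2/2} + t \sinh(\lambda).$$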

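Set $t = Z_i / c_i$ and notice that $Z_i / c_i \in [-1, 1]$, so
$$e^{\lambda Z_i} = e^{\lambda c_i \frac{Z_i}{c_i}} \leq e^{\lambda^2 c_i^2 / 2} + \frac{Z_i}{c_i} \sinh(\lambda c_i),$$
and, since $E_{X_i} Z_i = 0$,
$$E_{X_i} e^{\lambda Z_i} \leq e^{\lambda^2 c_i^2 / 2}.$$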
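As an illustration of a function that is not a sum but still has bounded differences, consider throwing $n$ balls independently and uniformly into $m$ bins and letting $Z$ be the number of empty bins. Moving a single ball changes $Z$ by at most 1, so $c_i = 1$ for every $i$, and $EZ = m(1 - 1/m)^n$. The sketch below estimates $P(Z - EZ > \epsilon)$ by simulation and compares it with the bound $\exp(-\epsilon^2 / (2\sum_i c_i^2))$. The balls-in-bins example, the function names, and the parameter values are assumptions chosen for the demonstration, not taken from the notes.

```python
import math
import random


def empty_bins(assignment, m):
    """Number of bins among 0..m-1 that receive no ball."""
    return m - len(set(assignment))


def mcdiarmid_bound(c, eps):
    """One-sided bound exp(-eps^2 / (2 sum_i c_i^2))."""
    return math.exp(-eps**2 / (2.0 * sum(ci * ci for ci in c)))


def empirical_tail(n, m, eps, trials, seed=0):
    """Monte Carlo estimate of P(Z - EZ > eps) for Z = number of empty bins."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mean_z = m * (1.0 - 1.0 / m) ** n  # exact expectation of Z
    hits = 0
    for _ in range(trials):
        balls = [rng.randrange(m) for _ in range(n)]
        if empty_bins(balls, m) - mean_z > eps:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, m = 50, 20  # hypothetical sizes for the demo
    c = [1.0] * n  # each ball can change Z by at most 1
    for eps in (1.0, 2.0, 3.0, 5.0):
        print(eps, empirical_tail(n, m, eps, trials=2000), mcdiarmid_bound(c, eps))
```

In this example the simulated tail is typically far below the bound: the bounded-difference condition only uses the worst-case effect of each coordinate, so the resulting inequality can be quite conservative.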
