Lecture 01: the Central Limit Theorem

CSCI-B609: A Theorist's Toolkit, Fall 2016, Aug 3
Lecturer: Yuan Zhou
Scribe: Yuan Xie & Yuan Zhou

1 Central Limit Theorem for i.i.d. random variables

Let us say that we want to analyze the total sum of a certain kind of result in a series of repeated independent random experiments, each of which has a well-defined expected value and finite variance. In other words, a certain kind of result (e.g. whether the experiment is a "success") has some probability of being produced in each experiment. We would like to repeat the experiment many times independently and understand the total sum of the results.

1.1 Bernoulli variables

We first consider the sum of a bunch of Bernoulli variables. Specifically, let $X_1, X_2, \dots, X_n$ be i.i.d. random variables with $\Pr[X_i = 1] = p$ and $\Pr[X_i = 0] = 1 - p$. Let $S = S_n = X_1 + X_2 + \dots + X_n$; we want to understand $S$.

By linearity of expectation, we have

$$\mathrm{E}[S] = \mathrm{E}[X_1] + \mathrm{E}[X_2] + \dots + \mathrm{E}[X_n] = np.$$

Since $X_1, X_2, \dots, X_n$ are independent, we have

$$\mathrm{Var}[S] = \mathrm{Var}[X_1] + \mathrm{Var}[X_2] + \dots + \mathrm{Var}[X_n] = np(1-p).$$

Now let us use a linear transformation to make $S$ have mean 0 and variance 1. That is, let us introduce $Z_n$, a linear function of $S_n$, defined by

$$Z_n = \frac{S_n - np}{\sqrt{np(1-p)}}.$$

Using $\mu = np$ and $\sigma = \sqrt{np(1-p)}$, we have

$$Z_n = \frac{S_n - \mu}{\sigma}.$$
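As a quick sanity check of the identities above, the following Python sketch computes the exact distribution of $S_n$ and confirms that $S_n$ has mean $np$ and variance $np(1-p)$, while the rescaled variable $Z_n$ has mean 0 and variance 1. The values `n = 30`, `p = 0.3` and the helper names `binomial_pmf` and `mean_and_variance` are illustrative assumptions, not part of the lecture.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Exact distribution of S_n = X_1 + ... + X_n for i.i.d. Bernoulli(p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_and_variance(values, probs):
    """Mean and variance of a finite discrete distribution."""
    mean = sum(v * q for v, q in zip(values, probs))
    var = sum((v - mean) ** 2 * q for v, q in zip(values, probs))
    return mean, var

n, p = 30, 0.3  # illustrative values (an assumption, not from the notes)
pmf = binomial_pmf(n, p)
mu, sigma = n * p, sqrt(n * p * (1 - p))

s_values = list(range(n + 1))                    # possible values of S_n
z_values = [(s - mu) / sigma for s in s_values]  # corresponding values of Z_n

s_mean, s_var = mean_and_variance(s_values, pmf)
z_mean, z_var = mean_and_variance(z_values, pmf)

print(f"E[S] = {s_mean:.6f}, np = {mu:.6f}")
print(f"Var[S] = {s_var:.6f}, np(1-p) = {sigma**2:.6f}")
print(f"E[Z] = {z_mean:.6f}, Var[Z] = {z_var:.6f}")
```

The check is exact up to floating-point rounding because it sums over the full binomial distribution rather than sampling.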

Via this transformation, we do not lose any information about $S = S_n$. Specifically, for any $u$, we have

$$\Pr[S_n \le u] = \Pr[\sigma Z_n + \mu \le u] = \Pr\left[Z_n \le \frac{u - \mu}{\sigma}\right].$$

Therefore, we proceed to study the distribution of $Z_n$.

As a special instance, let us temporarily set $p = 1/2$ so that the $X_i$'s become unbiased coin flips. In this case, we have

$$Z_n = \frac{X_1 + X_2 + \dots + X_n - n/2}{\sqrt{n}/2} = \frac{1}{\sqrt{n}}\big((2X_1 - 1) + (2X_2 - 1) + \dots + (2X_n - 1)\big).$$

For each integer $a \in [0, n]$, we have

$$\Pr\left[Z_n = \frac{2a - n}{\sqrt{n}}\right] = \frac{\binom{n}{a}}{2^n}.$$

Therefore, we can easily plot the probability density curve of $Z_n$. In Figure 1, we plot the density curve for a few values of $n$.

[Figure 1: Probability density curves of $Z_n$ for a few values of $n$. Panels: (a) $n = 5$, (b) $n = 10$, (c) $n = 20$, (d) $n = 40$.]
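The points behind such a plot can be tabulated directly from the formula $\Pr[Z_n = (2a-n)/\sqrt{n}] = \binom{n}{a}/2^n$. The sketch below is one possible way to do this in Python; the function name `fair_coin_z_pmf` is hypothetical. Since $Z_n$ is discrete, the sketch assumes that the "density" height at a point is the probability mass divided by the spacing $2/\sqrt{n}$ between neighboring points, which is one common convention and may differ from how the original figure was drawn.

```python
from fractions import Fraction
from math import comb, sqrt

def fair_coin_z_pmf(n):
    """List of (z, prob) pairs with z = (2a - n)/sqrt(n), prob = C(n, a)/2^n."""
    return [((2 * a - n) / sqrt(n), Fraction(comb(n, a), 2**n))
            for a in range(n + 1)]

for n in (5, 10, 20, 40):
    pmf = fair_coin_z_pmf(n)
    total = sum(prob for _, prob in pmf)   # exact rational arithmetic
    spacing = 2 / sqrt(n)                  # gap between adjacent values of Z_n
    peak_z, peak_prob = max(pmf, key=lambda pair: pair[1])
    # Density-style height: probability mass per unit length on the z-axis.
    peak_height = float(peak_prob) / spacing
    print(f"n={n:2d}  total mass={total}  peak at z={peak_z:+.3f}  "
          f"peak height={peak_height:.4f}")
```

Using `Fraction` keeps the probabilities exact, so the total mass is exactly 1 for every $n$; only the positions $z$ and the plotted heights are floating-point values.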

We can see that as $n \to \infty$, the probability density curve converges to a fixed continuous curve, as illustrated in Figure 2.

[Figure 2: The famous "Bell curve", the probability density function of a standard Gaussian variable.]

Indeed, even when $p = \Pr[X_i = 1]$ is a constant in $(0, 1)$ other than $1/2$, the probability density curve of $Z_n$ still converges to the same curve as $n \to \infty$. We call the probability distribution that has this curve as its pdf the Gaussian distribution (or Normal distribution).

1.2 The Central Limit Theorem

The Central Limit Theorem (CLT) for i.i.d. random variables can be stated as follows.

Theorem 1 (the Central Limit Theorem). Let $Z$ be a standard Gaussian. Let $X_1, X_2, \dots, X_n$ be any i.i.d. random variables (not necessarily binary valued) with well-defined expectation and finite, nonzero variance, let $S_n = X_1 + X_2 + \dots + X_n$, and let $Z_n = (S_n - \mathrm{E}[S_n]) / \sqrt{\mathrm{Var}[S_n]}$. Then, as $n \to \infty$, we have $Z_n \to Z$ in the sense that

$$\forall u \in \mathbb{R}, \quad \Pr[Z_n \le u] \to \Pr[Z \le u].$$

More specifically, for each $\epsilon > 0$, there exists $N \in \mathbb{N}$ so that for every $n > N$ and every $u \in \mathbb{R}$, we have

$$\big|\Pr[Z_n \le u] - \Pr[Z \le u]\big| < \epsilon.$$

Definition 2. We use $Z \sim \mathcal{N}(0, 1)$ to denote that $Z$ is a standard Gaussian variable. More specifically, $Z$ is a continuous random variable with probability density function

$$\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}.$$

We also use $Y \sim \mathcal{N}(\mu, \sigma^2)$ to denote that $Y$ is a Gaussian variable with mean $\mu$ and variance $\sigma^2$, i.e. $Y = \sigma Z + \mu$ where $Z$ is a standard Gaussian.

Now we introduce a few facts about Gaussian variables.
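Before turning to those facts, the convergence claimed by the Central Limit Theorem can be observed numerically for a biased coin. The sketch below computes $\sup_u |\Pr[Z_n \le u] - \Pr[Z \le u]|$ exactly (up to floating-point rounding) for Bernoulli($p$) summands. The choice $p = 0.3$, the values of $n$, and the function names are illustrative assumptions. Because the distribution function of $Z_n$ is a step function and the Gaussian distribution function is continuous and increasing, the supremum is attained at a jump point, so it suffices to compare the values just before and just after each jump.

```python
from math import erf, exp, lgamma, log, sqrt

def std_normal_cdf(u):
    """Pr[Z <= u] for a standard Gaussian Z, via the error function."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def binomial_prob(n, k, p):
    """Pr[S_n = k] computed in log space to avoid overflow for large n."""
    log_prob = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(p) + (n - k) * log(1 - p))
    return exp(log_prob)

def kolmogorov_distance(n, p):
    """sup_u |Pr[Z_n <= u] - Pr[Z <= u]| with Z_n = (S_n - np)/sqrt(np(1-p))."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    cdf_before, worst = 0.0, 0.0
    for k in range(n + 1):
        u = (k - mu) / sigma                      # jump location of Z_n
        cdf_after = cdf_before + binomial_prob(n, k, p)
        target = std_normal_cdf(u)
        # Compare the Gaussian CDF with both sides of the jump at u.
        worst = max(worst, abs(cdf_before - target), abs(cdf_after - target))
        cdf_before = cdf_after
    return worst

for n in (10, 100, 1000):
    print(f"n={n:4d}  sup-distance={kolmogorov_distance(n, 0.3):.5f}")
```

The printed distances shrink as $n$ grows, in line with the theorem, although this experiment alone says nothing about the rate for general distributions.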

Theorem 3. Let $Z = (Z_1, Z_2, \dots, Z_d) \in \mathbb{R}^d$, where $Z_1, Z_2, \dots, Z_d$ are i.i.d. standard Gaussians. Then the distribution of $Z$ is rotationally symmetric. That is, the probability density is the same at $z$ and $z'$ whenever $\|z\| = \|z'\|$.

Proof. The probability density function of $Z$ at $z = (z_1, z_2, \dots, z_d)$ is

$$\varphi(z_1)\varphi(z_2)\cdots\varphi(z_d) = \left(\frac{1}{\sqrt{2\pi}}\right)^d e^{-(z_1^2 + z_2^2 + \dots + z_d^2)/2} = \left(\frac{1}{\sqrt{2\pi}}\right)^d e^{-\|z\|^2/2},$$

which only depends on $\|z\|$.

The following corollary says that the function $\varphi(\cdot)$ is indeed a probability density function.

Corollary 4. $\displaystyle\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz = 1.$

Corollary 5. A linear combination of independent Gaussians is still Gaussian.

2 The Berry-Esseen Theorem (CLT with error bounds)

When designing and analyzing algorithms, we usually need to know the convergence rate in order to derive a guarantee on the performance (e.g. time/space complexity) of the algorithm. In this sense, the Central Limit Theorem (Theorem 1) may not be practically useful, since it does not say how large $n$ must be for a given accuracy. The following Berry-Esseen theorem strengthens the CLT with concrete error bounds.

Theorem 6 (the Berry-Esseen Theorem). Let $X_1, X_2, \dots, X_n$ be independent. Assume w.l.o.g. that $\mathrm{E}[X_i] = 0$, $\mathrm{Var}[X_i] = \sigma_i^2$, and $\sum_{i=1}^n \sigma_i^2 = 1$. Let $S = X_1 + X_2 + \dots + X_n$. (Note that $\mathrm{E}[S] = 0$ and $\mathrm{Var}[S] = 1$.) Then for all $u \in \mathbb{R}$, we have

$$\Big|\Pr[S \le u] - \Pr_{Z \sim \mathcal{N}(0,1)}[Z \le u]\Big| \le O(1) \cdot \beta, \quad \text{where } \beta = \sum_{i=1}^n \mathrm{E}|X_i|^3.$$

Remark 1. The hidden constant in the upper bound of the theorem can be taken to be less than $0.56$ by [She13].

Remark 2. The Berry-Esseen theorem does not need the $X_i$'s to be identically distributed. Independence among the variables is still essential.

We again use the unbiased coin flips example to see how this bound works.

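Let $X_1, X_2, \dots, X_n$ be independent random variables with

$$X_i = \begin{cases} +\frac{1}{\sqrt{n}}, & \text{w.p. } \frac{1}{2}, \\ -\frac{1}{\sqrt{n}}, & \text{w.p. } \frac{1}{2}. \end{cases}$$

We can check that $\mathrm{E}[X_i] = 0$, $\mathrm{Var}[X_i] = \sigma_i^2 = \frac{1}{n}$, and $\sum_{i=1}^n \sigma_i^2 = 1$, which satisfy the requirements of the Berry-Esseen theorem. We can also compute that $\mathrm{E}|X_i|^3 = \frac{1}{n^{3/2}}$, and therefore $\beta = n \cdot \frac{1}{n^{3/2}} = \frac{1}{\sqrt{n}}$. According to the Berry-Esseen theorem, with $S = X_1 + X_2 + \dots + X_n$, we have

$$\forall u \in \mathbb{R}, \quad \Big|\Pr[S \le u] - \Pr_{Z \sim \mathcal{N}(0,1)}[Z \le u]\Big| \le \frac{.56}{\sqrt{n}}. \qquad (1)$$

The right-hand side $\frac{.56}{\sqrt{n}}$ gives a concrete convergence rate.

Now let us investigate whether the $O\big(\frac{1}{\sqrt{n}}\big)$ upper bound can be improved. Say $n$ is even; then $S = \frac{\#\text{Heads} - \#\text{Tails}}{\sqrt{n}}$, and $S = 0 \Leftrightarrow \#H = \#T = \frac{n}{2}$. Let us estimate this probability using (1). Since the possible values of $S$ are spaced $\frac{2}{\sqrt{n}}$ apart, for every $0 < \epsilon < \frac{2}{\sqrt{n}}$ we have

$$\begin{aligned}
\Pr[\#H = \#T] = \Pr[S = 0] &= \Pr[S \le 0] - \Pr[S \le -\epsilon] \\
&= \big(\Pr[S \le 0] - \Pr[Z \le 0]\big) - \big(\Pr[S \le -\epsilon] - \Pr[Z \le -\epsilon]\big) + \big(\Pr[Z \le 0] - \Pr[Z \le -\epsilon]\big) \\
&\le \big|\Pr[S \le 0] - \Pr[Z \le 0]\big| + \big|\Pr[S \le -\epsilon] - \Pr[Z \le -\epsilon]\big| + \Pr[-\epsilon < Z \le 0].
\end{aligned}$$

By (1), each of the first two terms is at most $\frac{.56}{\sqrt{n}}$ for every such $\epsilon$, while $\Pr[-\epsilon < Z \le 0] \to 0$ as $\epsilon \to 0^+$. Taking $\epsilon \to 0^+$, we therefore have

$$\Pr[\#H = \#T] \le \frac{.56}{\sqrt{n}} + \frac{.56}{\sqrt{n}} = \frac{1.12}{\sqrt{n}}. \qquad (2)$$

On the other hand, it is easy to see that

$$\Pr[\#H = \#T] = \frac{\binom{n}{n/2}}{2^n}.$$

Using Stirling's approximation $m! \approx \sqrt{2\pi m}\,(m/e)^m$, when $n \to \infty$, we have

$$\Pr[\#H = \#T] \approx \frac{\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^n}{\pi n \left(\frac{n}{2e}\right)^n \cdot 2^n} = \sqrt{\frac{2}{\pi}} \cdot \frac{1}{\sqrt{n}} \approx \frac{.798}{\sqrt{n}}. \qquad (3)$$

If we had an essentially better upper bound (say $o\big(\frac{1}{\sqrt{n}}\big)$) in (1), we would get an upper bound of $o\big(\frac{1}{\sqrt{n}}\big)$ in (2). This would contradict (3). Therefore the upper bound in (1) given by the Berry-Esseen theorem is asymptotically tight.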
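The argument above can also be checked numerically. The following Python sketch, written for even $n$, computes the exact value of $\sup_u |\Pr[S \le u] - \Pr[Z \le u]|$ for the scaled coin flips together with $\Pr[\#H = \#T]$, and compares them with the bounds $.56/\sqrt{n}$ and $1.12/\sqrt{n}$ and with the Stirling estimate $\sqrt{2/\pi}/\sqrt{n}$. The function names and the sample values of $n$ are illustrative assumptions; the constant $.56$ is the one used in the text.

```python
from math import comb, erf, pi, sqrt

def std_normal_cdf(u):
    """Pr[Z <= u] for a standard Gaussian Z, via the error function."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def coin_flip_report(n):
    """For even n, return (sup-distance between S and Z, Pr[#H = #T]).

    S = (#Heads - #Tails)/sqrt(n) equals (2a - n)/sqrt(n) w.p. C(n, a)/2^n.
    """
    scale = 2**n
    cdf_before, distance = 0.0, 0.0
    for a in range(n + 1):
        u = (2 * a - n) / sqrt(n)                 # jump location of S
        cdf_after = cdf_before + comb(n, a) / scale
        target = std_normal_cdf(u)
        # The supremum over u is attained on one side of some jump.
        distance = max(distance,
                       abs(cdf_before - target), abs(cdf_after - target))
        cdf_before = cdf_after
    tie = comb(n, n // 2) / scale                 # Pr[S = 0]
    return distance, tie

for n in (10, 100, 1000):
    distance, tie = coin_flip_report(n)
    print(f"n={n:4d}  sup-distance*sqrt(n)={distance * sqrt(n):.4f}  "
          f"Pr[tie]*sqrt(n)={tie * sqrt(n):.4f}  "
          f"sqrt(2/pi)={sqrt(2 / pi):.4f}")
```

Multiplying by $\sqrt{n}$ makes the comparison scale-free: the rescaled distance should stay below $.56$, and the rescaled tie probability should stay below $1.12$ while approaching $\sqrt{2/\pi}$, which is why no bound of order $o(1/\sqrt{n})$ is possible.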

References

[She13] I. G. Shevtsova. On the absolute constants in the Berry-Esseen inequality and its structural and nonuniform improvements. Inform. Primen., 7(1):124-125, 2013.