Central Limit Theorems and Proofs

Math/Stat 394, Winter 0 F.W. Scholz Central Limit Theorems and Proos The ollowin ives a sel-contained treatment o the central limit theorem (CLT). It is based on Lindeber s (9) method. To state the CLT which we shall prove, we introduce the ollowin notation. We assume that X n,..., X nn are independent random variables with means 0 and respective variances σ n,..., σ nn with σ n +... + σ nn = > 0 or all n. Denote the sum X n +... + X nn by S n and observe that S n has mean zero and variance. Lindeber s Central Limit Theorem: I the Lindeber condition is satisied, i.e., i or every ɛ > 0 we have that L n (ɛ) = then or every a R we have that ( XniI { Xni ɛ}) 0 as n, P (S n / a) Φ(a) 0 as n. Proo: Step (converence o expectations o smooth unctions): We will show in Appendix that or certain unctions we have that (S n / )] (Z)] 0 as n, () where Z denotes a standard normal random variable. I this converence would hold or any unction and i we then applied it to then a (x) = I (,a] (x) = i x a and = 0 i x > a, a (S n / )] a (Z)] = P (S n / a) Φ(a) and the statement o the CLT would ollow. Unortunately, we cannot directly demonstrate the above converence () or all, but only or smooth. Here smooth means that is bounded and has three bounded, continuous derivatives as stipulated in Lemma o Appendix. Step (sandwichin a step unction between smooth unctions): We will approximate a (x) = I (,a] (x), which is a step unction with step at x = a, by sandwichin it between two smooth unctions. In act, or δ > 0 one easily inds (see Appendix or an explicit example) smooth unctions (x) with (x) = or x a, (x) monotone decreasin rom to 0 on a, a + δ] and (x) = 0 or x a. Hence we would have a (x) (x) a+δ (x) or all x R.

...... a (x) (x) a+δ (x) a a + δ Since a+δ (x + δ) = a (x), we also et rom the second previous a (x) = a+δ (x + δ) (x + δ). Combinin these, we can bracket a (x) or all x R by (x + δ) a (x) (x)............. (x + δ) a (x) (x) a δ a a + δ Step 3 (the approximation arument): The ollowin diaram will clariy the approximation stratey. The inequalities result rom the bracketin in the previous step. ( S n + δ )] Sn a ( S n )] (Z + δ)] a (Z)] (Z)] means that or any ixed δ > 0 and thus or ixed the terms above and below become arbitrarily close, say within ɛ/3 o each other, as n (see Step ). Further, since (Z + δ) (Z) = 0 or Z a δ, a + δ] and (Z + δ) (Z) or a δ Z a + δ we have (Z + δ)] (Z)] = ((Z + δ) (Z)) Ia δ Z a+δ] ] Ia δ Z a+δ] ] = P (a δ Z a + δ) = Φ(a + δ) Φ(a δ) Φ(δ) Φ( δ) δφ(0) = δ/ π. The latter bound can be made as small as ɛ/3, no matter what is, by takin δ = ɛ π/6. With this choice o δ, and n suiciently lare, the above diaram entails that Sn a a (Z)] τ ɛ n 3 + ɛ 3 + ɛ 3 = ɛ

Since this can be done or any ɛ > 0 we have shown that Sn a a (Z)] τ 0 as n n From Lindeber s CLT other special versions ollow at once. The irst, the Lindeber-Levy CLT, considers independent identically distributed random variables with inite variance σ. The second, the Liapunov CLT, considers independent, but not necessarily identically distributed random variables with inite third moments. Lindeber-Levy CLT: Let Y,..., Y n be independent and identically distributed random variables with common mean µ and inite positive variance σ and let T n = Y +... + Y n. Then or all a R P ( Tn nµ nσ a ) Φ(a) 0 as n. Proo: Let X i = Y i µ, then X i has mean zero and variance σ. Note that S n = X +...+X n = T n nµ has variance τ = nσ so that T n nµ nσ = S n. The Lindeber unction L n (ɛ) becomes L n (ɛ) = nσ ] Xi I Xi ɛ nσ] = nσ n ] XI X ɛ nσ] 0 as n. For example, or a continuous r.v. with mean zero and inite variance σ this last converence ollows by notin σ = ( a,a) x (x)dx + x (x)dx, ( a,a) c where the irst summand on the riht converes to σ as a. Thus the second term must o to zero. But the second term is just x (x)dx = ] X I X a] ( a,a) c Liapunov CLT: Let Y,..., Y n be independent r.v. with means µ,..., µ n, variances σ,..., σ n and inite absolute central moments, i.e., ( Y i µ i 3 ) <. I Liapunov s condition is satisied, i.e., l n = τ 3 n ( Y i µ i 3) 0 as n, then P Yi µ i σ i a Φ(a) 0 as n We express the ollowin in terms o densities, but it holds enerally, i.e., or discrete r.v. s as well. 3

Proo: We will simply show that Liapunov s condition implies Lindeber s condition. Note that X i = Y i µ i has mean zero and variance σ i. l n = ( Xi 3) ( Xi 3 ɛ ( ) I 3 3 Xi ɛ]) Xi I 3 Xi ɛ] = ɛln (ɛ) Appendix Demonstration o Step : A crucial part o the proo is the ollowin orm o Taylor s ormula. You may want to skip the technical proo which is provided only or completeness. Lemma (Taylor s Formula): Let be a bounded unction deined on R with three bounded continuous derivatives (0) =, (), () and (3), i.e., with sup (i) (x) = M (i) < or i = 0,,, 3. x R Then or all h R (h) = sup (x + h) (x) () (x)h () (x)h K min ( h, h 3). x R Proo: From the usual orm o Taylor s theorem we have or any h (x + h) (x) () (x)h () (x)h = h3 6 (3) ( x) where x lies between x and x + h. Hence or all h R (x + h) (x) () (x)h () (x)h M (3) 6 h 3. () On the other hand by simple application o the trianle inequality ( a + b +... a + b +...) we also have (x + h) (x) () (x)h () (x)h M (0) + M (0) + M () h + M () (0) M = h + M () + h h M () which or lare h, say h b, and lare K, say K = M () +, can be bounded by K h. For h < b we have rom the inequality () h (x + h) (x) () (x)h () (x)h M (3) M (3) 6 h 3 b 6 h. 4

Takin M (3) K = max b 6, K, M (3) 6 we have or h R (x + h) (x) () (x)h () (x)h K h and K h 3, hence min(k h, K h 3 ). From Lemma we obtain the ollowin Corollary. Corollary : Under the conditions o Lemma we have (x + h ) (x + h ) = () (x)(h h ) + () (x)(h h )] + r with r (h ) + (h ) K ( min(h, h 3 ) + min(h, h 3 ) ). Proo: r = = (x + h ) (x + h ) () (x)(h h ) + () (x)(h h )] (x + h ) (x) () (x)h () (x)h (x + h ) (x) () (x)h () (x)h] (h ) + (h ) by the trianle inequality. To demonstrate the converence in () consider independent normal random variables V n,..., V nn with mean zero and respective variances σ n,..., σ nn, so that Z = V n +... + V nn is normal with mean zero and variance one, i.e., standard normal. These V ni s are also taken to be independent o the X nj s. First rewrite the approximation dierence in the ollowin telescoped ashion (to simpliy notation we write X i, V i and σ i or X ni, V ni and σ ni, respectively) Sn (Z)] = + X +... + X n X +... + X n + V n +......... X +... + X n + V n +......... X + V +... + V n V +... + V n +. 5 X +... + X n + V n + V n

Now let Y i = X +... + X i + V i+ +... + V n or i =,..., n and Y 0 = V +... + V n and Y n = X +... + X n. With this notation and employin the two term Taylor expansion o Corollary one can express a typical dierence term in the above telescoped sum as where ( Yi + X i+ R n,i+ )] ( ) Xi+ (Xi+ ) Yi + V i+ V i+ = + ( Vi+ ) K min ( X i+ () ( Yi, X i+ 3 ) + min 3 ) + X i+ V ( V i+ i+, V i+ 3 )] 3 Notin the independence o Y i and (X i+, V i+ ) and (X i+ ) = (V i+ ) = 0 we ind ( ) ] Xi+ V i+ () Yi Xi+ V i+ = () Yi = 0. Similarly, X i+ V Thus i+ ( ) ] () Yi X = i+ V ] i+ () Yi Yi + X i+ = σ i+ σi+ Yi + V i+ R n,i+. Usin the trianle inequality on the above telescoped sum we have Sn n n Xi+ (Z)] R n,i+ + = i=0 Xi i=0 + Vi, n i=0 () ( Yi Vi+ ( ) ] () Yi + R n,i+ (3) )] = 0 chanin the runnin index rom i = 0,..., n to i =,..., n in the last step. We will show that both sums on the riht can be made arbitrarily small by lettin n. First consider ( ) ] ( ) ] Xi Xi Xi = I { Xi ɛτ } + I { Xi >ɛτ n } n and invoke rom (3) the bound K X i / 3 or the irst term and the bound K X i / or the second term to et Xi Xi 3 ] Xi ] K + K ɛ K Summin these we et τ 3 n Xi I { Xi ɛ} I { Xi ɛ} ] + K Xi ] Xi ɛ K + K Xi Xi I { Xi >ɛ} I { Xi >ɛ} I { Xi >ɛ} ] ] = ɛ K σ i n K ɛ σ i + K ] X i I { Xi >ɛ} = K ɛ + K ] X i I { Xi >ɛ} = ɛ K + K L n (ɛ) ɛ K 6 + K X i I { Xi >ɛ}].

as n by the assumption o the Lindeber condition. Similarly one ets Vi ɛ K + K V i I { Vi >ɛ}] ɛ K, provided we can show that the Lindeber condition also holds or the V i s. These two converences to K ɛ show that the approximation error can be made arbitrarily small as n, since ɛ > 0 can be any small number, as lon as and hence K is kept ixed. To show the Lindeber condition or the V i s, we irst note that the Lindeber condition or the X i s entails that max{σ /,..., σ n / } 0 as n. (4) This ollows rom σ i = ] X i I { Xi ɛ} + ] X i I { Xi >ɛ} τ τ nɛ ] I n { Xi ɛ} + ] X i I { Xi >ɛ} ɛ + X i I { Xi >ɛ}] ɛ + L n (ɛ). From the Lindeber condition it ollows that the second term vanishes as n and hence σ i / ɛ as n or any ɛ > 0. This shows (4). The Lindeber condition or the V i s is seen as ollows, usin V i /( ɛ) > in the irst : as n. ] Vi I { Vi >ɛ} V ɛ 3 i 3] = σ ɛ 3 i 3 Z 3] n ɛ max{σ σi /,..., σ n / } Z 3] = ɛ max{σ /,..., σ n / } Z 3] 0 Appendix Here we will exhibit an explicit unction (x) which has three continuous and bounded derivatives and which is or x 0, decreases rom to 0 on the interval 0, ] and is 0 or x. The orm o this unction taken rom Thomasian (969). By takin ( ) x a h(x) = δ we immediately et a correspondin smooth unction which descends rom to 0 over the interval a, a + δ] instead. The unction (x) is iven as ollows or x 0, ]: ( (x) = 40 4 x4 3 5 x5 + x6 ) 7 x7. Then (x) = 40x 3 ( x) 3 or x 0, ], 7

hence is monotone decreasin on 0, ]. Further, () = 0 and (0) =. To check smoothness we have to veriy that the irst three derivatives join continuously at 0 and, i.e., are 0. Clearly, (0) = () = 0. Next, and (x) = 40x ( x) ( x) with (0) = () = 0 (x) = 40 ( x( x) ( x) ( x)x ( x) x ( x) ) with (0) = () = 0. Reerences Lindeber, J.W. (9). ine neue Herleitun des xponentialesetzes in der Wahrscheinlichkeitsrechnun. Mathemat. Z., 5, -5. Thomasian, A. (969). The Structure o Probability Theory with Applications. Mc Graw Hill, New York. (pp. 483-493) 8