THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION


JAINUL VAGHASIA

Contents
1. Introduction
2. Notations
3. Background in Probability Theory
3.1. Expectation and Variance
3.2. Convergence
4. Zero Bias Transformation
5. Setting up the conditions
6. The Lindeberg-Feller Central Limit Theorem
7. The Feller-Lévy Partial Converse
8. Cycles in a Random Permutation: Application of Lindeberg-Feller CLT
References

1. Introduction

In 1738, Abraham de Moivre published his book The Doctrine of Chances, in which he provided techniques for solving gambling problems using a version of the Central Limit Theorem for Bernoulli trials with p = 1/2. Roughly a century later, Pierre-Simon Laplace published an extension of this result, although not a rigorous one, to p ≠ 1/2 in his Théorie Analytique des Probabilités. A more general statement of the Central Limit Theorem appeared only in 1922, when Lindeberg showed that the random variables in the sequence need not be identically distributed, provided they satisfy a certain condition now known as the Lindeberg condition. Feller explained that the Lindeberg condition requires the individual variances to be due mainly to masses in an interval whose length is small in comparison to the overall variance. In his paper [3], Goldstein offers an equivalent, seemingly simpler condition through Stein's zero bias transformation, introduced in [4]. As a result of this equivalence, the new condition can be used to prove the Lindeberg-Feller Central Limit Theorem and its partial converse (due independently to Feller and Lévy).

This paper will outline the properties of the zero bias transformation and describe its role in the proofs of the Lindeberg-Feller Central Limit Theorem and its Feller-Lévy converse. For completeness, we shall also offer an application of the Central Limit Theorem, via the small zero bias condition, to the number of cycles in a random permutation drawn uniformly from all permutations of the set {1, 2, ..., n}, showing that this number is asymptotically normal in n.

Date: June 3, 2018.

2. Notations

This paper uses the following notations:

X ∼ P means that X has distribution P.
Y_n →_d Y means convergence in distribution.
Y_n →_p Y means convergence in probability.
E denotes expected value and Var denotes variance, while P denotes probability.
1(·) denotes the indicator function.
X* denotes a random variable with the X-zero biased distribution.

3. Background in Probability Theory

We begin by highlighting some elementary facts and definitions that will be used later in this paper. Let Ω be a sample space with a probability distribution (also called a probability measure) P. A random variable is a map X : Ω → R. We write

P(X ∈ A) = P({ω ∈ Ω : X(ω) ∈ A}).

3.1. Expectation and Variance.

Definition 3.1 (Expected value). The expected value of g(X) is

E(g(X)) = ∫ g(x) dP(x) = ∫ g(x) p(x) dx if X is continuous, or Σ_j g(x_j) p(x_j) if X is discrete.

It has the following properties:
(1) Linearity: E(Σ_{j=1}^k c_j g_j(X)) = Σ_{j=1}^k c_j E(g_j(X)).
(2) If X_1, ..., X_n are independent, then E(∏_{i=1}^n X_i) = ∏_{i=1}^n E(X_i).

Definition 3.2 (Variance). The variance of X with mean µ = E(X) is

σ² = Var(X) = E((X − µ)²).

It has the following properties:
(1) Var(X) = E(X²) − µ².
(2) If X_1, ..., X_n are independent, then Var(Σ_{j=1}^n a_j X_j) = Σ_{j=1}^n a_j² Var(X_j).
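The two independence rules above are easy to sanity-check by simulation. The following short Python sketch (the particular distributions and coefficients are illustrative choices, not from the paper) estimates both sides of the product rule for expectations and of the variance formula by Monte Carlo.

import numpy as np

rng = np.random.default_rng(0)
n_samples = 10**6

# Two independent random variables with finite variance; any such pair works.
x1 = rng.exponential(scale=2.0, size=n_samples)
x2 = rng.uniform(-1.0, 3.0, size=n_samples)

# Property (2) of expectation: E(X_1 X_2) = E(X_1) E(X_2) for independent variables.
print(np.mean(x1 * x2), np.mean(x1) * np.mean(x2))

# Property (2) of variance: Var(a_1 X_1 + a_2 X_2) = a_1^2 Var(X_1) + a_2^2 Var(X_2).
a1, a2 = 0.5, -2.0
print(np.var(a1 * x1 + a2 * x2), a1**2 * np.var(x1) + a2**2 * np.var(x2))

Both printed pairs should agree to within Monte Carlo error.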

3.2. Convergence.

Definition 3.3 (Convergence in Distribution). A sequence of random variables Y_n is said to converge in distribution to Y, written Y_n →_d Y, if

lim_{n→∞} P(Y_n ≤ x) = P(Y ≤ x)

for all continuity points x of the map x ↦ P(Y ≤ x).

Definition 3.4 (Convergence in Probability). A sequence of random variables Y_n converges in probability to Y, written Y_n →_p Y, if

lim_{n→∞} P(|Y_n − Y| ≥ ε) = 0 for all ε > 0.

We will also need the following equivalence theorems between convergence of expectations and convergence in distribution.

Theorem 3.5. Y_n →_d Y implies lim_{n→∞} E(h(Y_n)) = E(h(Y)) for all h ∈ C_b, where C_b is the collection of all bounded, continuous functions.

Since the set C_{c,0}^∞ of all infinitely differentiable functions with compact support which integrate to zero satisfies C_{c,0}^∞ ⊂ C_b, Theorem 3.5 holds in particular for C_{c,0}^∞. The following converse holds for C_{c,0}^∞ as well.

Theorem 3.6. If

lim_{n→∞} E(h(Y_n)) = E(h(Y)) for all h ∈ C_{c,0}^∞,

then Y_n →_d Y.

The proofs of Theorems 3.5 and 3.6 are too involved to be worth the digression, and therefore shall not be presented here.

4. Zero Bias Transformation

Definition 4.1. Let X be a random variable with mean zero and finite, nonzero variance σ². We say that X* has the X-zero biased distribution if, for all differentiable f for which these expectations exist,

σ² E(f'(X*)) = E(X f(X)).

In [3], the proof of existence of the zero biased distribution for any such X is omitted for simplicity. We sketch the idea of the argument here. For any given continuous function g with compact support, let G(x) = ∫_0^x g(u) du, and set T g = σ^{−2} E(X G(X)), which exists since the variance E(X²) < ∞. Note that T is linear, as inherited from the linearity of expectation. Now take g ≥ 0, so that G is increasing, and therefore X and G(X) are positively correlated. Hence

σ² T g = E(X G(X)) ≥ E(X) E(G(X)) = 0.

This shows that T is a positive linear functional. The existence of a unique probability measure ν with T g = ∫ g dν then follows from the Riesz-Markov-Kakutani representation theorem (see e.g. [2]; [5] provides a category-theoretic proof), and ν is the X-zero biased distribution.

Next, we show that the zero bias transformation enjoys the following continuity property. It will be instrumental later in proving the Feller-Lévy converse.

Theorem 4.2. Let Y and Y_n, n = 1, 2, ..., be mean zero random variables with finite, nonzero variances σ² = Var(Y) and σ_n² = Var(Y_n), respectively. If Y_n →_d Y and lim_{n→∞} σ_n² = σ², then Y_n* →_d Y*.

Proof. We use the alternative characterization of convergence in distribution provided by Theorems 3.5 and 3.6. Let f ∈ C_{c,0}^∞ and F(y) = ∫_{−∞}^y f(t) dt. Since Y and Y_n have mean zero and finite variances, their zero bias distributions exist. In particular,

σ_n² E(f(Y_n*)) = E[Y_n F(Y_n)] for all n.

By Theorem 3.5, since y F(y) is in C_b, we obtain

σ² lim_{n→∞} E(f(Y_n*)) = lim_{n→∞} E[Y_n F(Y_n)] = E[Y F(Y)] = σ² E(f(Y*)).

Hence E(f(Y_n*)) → E(f(Y*)) for all f ∈ C_{c,0}^∞, so Y_n* →_d Y* by Theorem 3.6.

The main utility of the zero bias transformation is that the normal distribution is its unique fixed point. This is stated formally below.

Theorem 4.3. Let X be a zero-mean random variable with nonzero, finite variance σ². Then X has distribution N(0, σ²) if and only if

σ² E(f'(X)) = E(X f(X))

for all absolutely continuous functions f for which these expectations exist.

Proof. We give the computation for σ² = 1; the general case follows by scaling. Recall the probability density function of the standard normal distribution:

ϕ(x) = (1/√(2π)) e^{−x²/2}.

We evaluate E(f'(X)) using integration by parts:

E(f'(X)) = (1/√(2π)) ∫ f'(x) e^{−x²/2} dx
         = (1/√(2π)) [f(x) e^{−x²/2}]_{−∞}^{∞} + (1/√(2π)) ∫ x f(x) e^{−x²/2} dx
         = (1/√(2π)) ∫ x f(x) e^{−x²/2} dx
         = E(X f(X)),

as claimed, the boundary term vanishing. Conversely, viewing the relation σ² E(f'(X)) = E(X f(X)) as a differential equation for the density, its solution is unique up to a constant factor fixed by normalization, so the relation holds precisely for the normal family.

This property can be exploited to reason that a distribution which gets mapped to one nearby is close to being a fixed point of the transformation. Hence, normal approximation can be applied whenever the distribution of a random variable is close to that of its zero bias transformation. With this in hand, consider the distribution of a sum W_n of comparably sized independent random variables. We will show that W_n can be zero biased by randomly selecting a single summand, chosen with probability proportional to its variance, and replacing it with a random variable of comparable size. This implies that W_n and W_n* are close, and therefore that W_n is approximately normal. This is the gist of the proof of the Lindeberg-Feller Central Limit Theorem under the condition that Goldstein calls the small zero bias condition. We shall now formalize this argument.
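The fixed point property of Theorem 4.3 is easy to check empirically: if Z ∼ N(0, σ²), then E(Z f(Z)) and σ² E(f'(Z)) should agree for any smooth test function. A minimal Monte Carlo sketch in Python (the test function, σ, and sample size are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(1)
sigma = 1.5
z = rng.normal(0.0, sigma, size=10**6)

# A smooth test function f and its derivative f'.
f = lambda x: np.sin(x) + x**3
fprime = lambda x: np.cos(x) + 3 * x**2

# Zero bias fixed point property: E(Z f(Z)) should be close to sigma^2 * E(f'(Z)).
print(np.mean(z * f(z)), sigma**2 * np.mean(fprime(z)))

The two printed numbers should agree up to sampling error, reflecting that the N(0, σ²) distribution is its own zero bias transform.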

5. Setting up the conditions

This section lists the conditions that go, in some combination, into the statement of the Lindeberg-Feller Central Limit Theorem.

Condition 5.1. For every n = 1, 2, ..., the random variables making up the collection X_n = {X_{i,n} : 1 ≤ i ≤ n} are independent with mean zero and finite variances σ_{i,n}² = Var(X_{i,n}), standardized so that

W_n = Σ_{i=1}^n X_{i,n} has variance Var(W_n) = Σ_{i=1}^n σ_{i,n}² = 1.

This condition is non-negotiable, because it is assumed alongside both the Lindeberg condition and the small zero bias condition.

Condition 5.2 (Lindeberg condition). For every ε > 0,

lim_{n→∞} L_{n,ε} = 0, where L_{n,ε} = Σ_{i=1}^n E[X_{i,n}² 1(|X_{i,n}| ≥ ε)].

As described in the introduction, this condition forces the random variables X_{i,n} to have individual variances small compared to their sum.

Condition 5.3 (Small zero bias condition). Given X_n satisfying Condition 5.1, let X_n* = {X_{i,n}* : 1 ≤ i ≤ n} be a collection of random variables such that X_{i,n}* has the X_{i,n}-zero biased distribution and is independent of X_n. Further, let I_n be a random index, independent of X_n and X_n*, with distribution

(5.4)    P(I_n = i) = σ_{i,n}²,

and define

(5.5)    X_{I_n,n} = Σ_{i=1}^n 1(I_n = i) X_{i,n}   and   X_{I_n,n}* = Σ_{i=1}^n 1(I_n = i) X_{i,n}*.

The condition requires that X_{I_n,n}* →_p 0.
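To make the Lindeberg condition concrete, here is a small Python helper (the function name and the example array are ours, for illustration only) that computes L_{n,ε} exactly for a triangular array whose row variables have finite support.

import numpy as np

def lindeberg_L(supports, probs, eps):
    # L_{n,eps} = sum_i E[X_{i,n}^2 * 1(|X_{i,n}| >= eps)], where supports[i] and
    # probs[i] list the support points and probabilities of X_{i,n}.
    total = 0.0
    for x, p in zip(supports, probs):
        x, p = np.asarray(x, dtype=float), np.asarray(p, dtype=float)
        total += np.sum(x**2 * p * (np.abs(x) >= eps))
    return total

# Illustrative array: X_{i,n} = xi_i / sqrt(n) with xi_i = +-1 equally likely, so the
# variances sum to 1 (Condition 5.1) and |X_{i,n}| = 1/sqrt(n).
for n in (4, 100, 10000):
    supports = [(-1 / np.sqrt(n), 1 / np.sqrt(n))] * n
    probs = [(0.5, 0.5)] * n
    print(n, lindeberg_L(supports, probs, eps=0.1))

Here L_{n,0.1} equals 1 as long as 1/sqrt(n) >= 0.1 and drops to 0 once n exceeds 100, consistent with the Lindeberg condition holding in the limit.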

Theorem 5.6. Condition 5.2 and Condition 5.3 are equivalent.

Proof. Since the random index I_n is independent of X_n and X_n*, we may use (5.4) and (5.5) to obtain, for any function f for which the expectations exist,

(5.7)    E(f(X_{I_n,n})) = Σ_{i=1}^n σ_{i,n}² E(f(X_{i,n}))   and   E(f(X_{I_n,n}*)) = Σ_{i=1}^n σ_{i,n}² E(f(X_{i,n}*)).

Showing that 5.2 implies 5.3 relies on the function

f(x) = x + ε if x ≤ −ε,   0 if −ε < x < ε,   x − ε if x ≥ ε.

Notice that f'(x) = 1(|x| ≥ ε) for almost every x. Then, using E(1(A)) = P(A) for the first equality and the zero bias relation for the second, we obtain

(5.8)    σ_{i,n}² P(|X_{i,n}*| ≥ ε) = σ_{i,n}² E(f'(X_{i,n}*)) = E[(X_{i,n}² − ε|X_{i,n}|) 1(|X_{i,n}| ≥ ε)].

Bounding this quantity from above,

σ_{i,n}² P(|X_{i,n}*| ≥ ε) ≤ E[X_{i,n}² 1(|X_{i,n}| ≥ ε)].

Applying (5.7) to the indicator function 1(|x| ≥ ε) and using E(1(A)) = P(A) once more, we obtain

P(|X_{I_n,n}*| ≥ ε) = Σ_{i=1}^n σ_{i,n}² P(|X_{i,n}*| ≥ ε).

Using the bound derived above,

P(|X_{I_n,n}*| ≥ ε) ≤ L_{n,ε}.

Hence the small zero bias condition is satisfied given the Lindeberg condition.

For the reverse implication, observe that

x² 1(|x| ≥ ε) ≤ 2 (x² − (ε/2)|x|) 1(|x| ≥ ε/2),

since (ε/2)|x| ≤ x²/2 on the event |x| ≥ ε, while the right-hand side is nonnegative on |x| ≥ ε/2. Then

L_{n,ε} = Σ_{i=1}^n E(X_{i,n}² 1(|X_{i,n}| ≥ ε)) ≤ 2 Σ_{i=1}^n E[(X_{i,n}² − (ε/2)|X_{i,n}|) 1(|X_{i,n}| ≥ ε/2)] = 2 Σ_{i=1}^n σ_{i,n}² P(|X_{i,n}*| ≥ ε/2) = 2 P(|X_{I_n,n}*| ≥ ε/2),

where we used (5.8), with ε/2 in place of ε, and (5.7) for the second and third equalities, respectively. The last quantity tends to zero under the small zero bias condition, so L_{n,ε} → 0. This finishes the equivalence argument.

Now that we know the two conditions are equivalent, it is worthwhile to see how the small zero bias condition can be used to prove the Central Limit Theorem.

6. The Lindeberg-Feller Central Limit Theorem

Lemma 6.1. Let X_n, n = 1, 2, ..., satisfy Condition 5.1 and set m_n = max_{1≤i≤n} σ_{i,n}². Then X_{I_n,n} →_p 0 whenever lim_{n→∞} m_n = 0.

Proof. Using (5.7) with f(x) = x, we obtain E(X_{I_n,n}) = 0, and hence Var(X_{I_n,n}) = E(X_{I_n,n}²). Using (5.7) again, now with f(x) = x²,

Var(X_{I_n,n}) = Σ_{i=1}^n σ_{i,n}⁴.

Since σ_{i,n}⁴ ≤ σ_{i,n}² max_{1≤j≤n} σ_{j,n}² = σ_{i,n}² m_n, we have for all ε > 0

P(|X_{I_n,n}| ≥ ε) ≤ Var(X_{I_n,n}) / ε² ≤ (m_n / ε²) Σ_{i=1}^n σ_{i,n}² = m_n / ε²,

where we used Chebyshev's inequality.

Lemma 6.2. If X_n, n = 1, 2, ..., satisfies Condition 5.1 and the small zero bias condition 5.3, then X_{I_n,n} →_p 0.

Proof. For all n, 1 ≤ i ≤ n, and ε > 0,

σ_{i,n}² = E(X_{i,n}² 1(|X_{i,n}| < ε)) + E(X_{i,n}² 1(|X_{i,n}| ≥ ε)) ≤ ε² + L_{n,ε}.

It follows that m_n ≤ ε² + L_{n,ε}, and therefore lim sup_{n→∞} m_n ≤ ε², where we used that L_{n,ε} → 0 by Theorem 5.6. Since ε > 0 was arbitrary, m_n → 0, and the claim now follows from Lemma 6.1.

Theorem 6.3. If X_n, n = 1, 2, ..., satisfies Condition 5.1 and the small zero bias condition 5.3, then W_n →_d Z, a standard normal random variable.

Proof (idea). Let h ∈ C_{c,0}^∞. Stein has shown that there exists a twice differentiable solution f, with bounded derivatives, to the Stein equation

f'(w) − w f(w) = h(w) − E(h(Z)).

Then, using the zero bias relation and Var(W_n) = 1,

E[h(W_n)] − E(h(Z)) = E[f'(W_n) − W_n f(W_n)] = E[f'(W_n) − f'(W_n*)],

where W_n* = W_n − X_{I_n,n} + X_{I_n,n}*. Since f' and f'' are bounded, η(δ) = sup_{|x−y|≤δ} |f'(x) − f'(y)| → 0 as δ → 0. By the small zero bias condition and Lemma 6.2, W_n* − W_n = X_{I_n,n}* − X_{I_n,n} →_p 0, and hence η(|W_n* − W_n|) →_p 0. Therefore

|E[h(W_n)] − E(h(Z))| ≤ E(η(|W_n* − W_n|)) → 0,

the convergence following by bounded convergence, since η is bounded. This gives lim_{n→∞} E(h(W_n)) = E(h(Z)), and the result follows by Theorem 3.6.
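As a sanity check on Theorem 6.3, the following Python sketch (all distributional choices are illustrative, not taken from the paper) simulates a triangular array of independent but non-identically distributed summands satisfying Condition 5.1 and compares the distribution of W_n with the standard normal.

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)
n, reps = 200, 20000

# Summands X_{i,n} = c_i * U_i with U_i uniform on [-sqrt(3), sqrt(3)] (mean 0,
# variance 1) and weights with sum_i c_i^2 = 1, so Var(W_n) = 1 (Condition 5.1),
# while max_i c_i^2 -> 0, so the Lindeberg condition holds trivially.
c = np.sqrt(2 * np.arange(1, n + 1) / (n * (n + 1)))
u = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(reps, n))
w = (u * c).sum(axis=1)                        # one sample of W_n per repetition

phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))   # standard normal CDF
for x in (-1.5, 0.0, 1.0):
    print(x, np.mean(w <= x), phi(x))          # empirical vs N(0,1) probabilities

The empirical probabilities should match the normal values to about two decimal places, illustrating the conclusion W_n →_d Z.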

7. The Feller-Lévy Partial Converse

Lemma 7.1 (Slutsky's Lemma). Let U_n and V_n, n = 1, 2, ..., be two sequences of random variables. Then U_n →_d U and V_n →_p 0 imply U_n + V_n →_d U.

The proof of Slutsky's Lemma is an exercise in applying the definitions and is left to the reader. When independence holds, we have the following converse.

Lemma 7.2. Let U_n and V_n, n = 1, 2, ..., be two sequences of random variables such that U_n and V_n are independent for every n. Then U_n →_d U and U_n + V_n →_d U imply V_n →_p 0.

Proof (idea). Consider first the special case that U_n =_d U for all n, with U absolutely continuous. Assume for contradiction that V_n does not tend to zero in probability; then there exist ε > 0, p > 0, and an infinite index set K such that P(|V_n| ≥ ε) ≥ p for all n ∈ K. Use the density of U to obtain the continuous function s(x) = P(x ≤ U ≤ x + 1), which can be shown to attain its maximum value on a bounded region, say at a point y. Using the independence of U_n and V_n, we have P(y ≤ U_n + V_n ≤ y + 1 | V_n) = s(y − V_n) for all n. On one hand, on the event |V_n| ≥ ε this conditional probability is at most sup_{|x−y|≥ε} s(x) for all n ∈ K; on the other hand, conditioning on |V_n| ≥ ε and its complement, and using the fact that U is absolutely continuous, one obtains the contradiction

lim inf_{n∈K} P(y ≤ U_n + V_n ≤ y + 1) < lim_{n→∞} P(y ≤ U_n + V_n ≤ y + 1).

Handling the general situation requires more sophisticated constructions, for which we refer to the Appendix of [3].

Theorem 7.3. If X_n, n = 1, 2, ..., satisfies Condition 5.1 and lim_{n→∞} m_n = 0, where m_n = max_{1≤i≤n} σ_{i,n}², then the small zero bias condition is necessary for W_n →_d Z.

Proof. Since W_n →_d Z and Var(W_n) → Var(Z), the limit being identically one, Theorem 4.2 implies W_n* →_d Z*. Note that the limit is technically Z*, but Z* =_d Z since Z is the fixed point of the zero bias transformation; hence W_n* →_d Z. Since m_n → 0, Lemma 6.1 yields X_{I_n,n} →_p 0. It then follows from Slutsky's Lemma 7.1 that

W_n + X_{I_n,n}* = W_n* + X_{I_n,n} →_d Z.

Hence

W_n →_d Z and W_n + X_{I_n,n}* →_d Z.

Since W_n is a function of X_n, which is independent of I_n and X_n*, and therefore of X_{I_n,n}*, Lemma 7.2 yields X_{I_n,n}* →_p 0.

8. Cycles in a Random Permutation: Application of Lindeberg-Feller CLT

Random permutations play an important role in many areas of probability and statistics. Here we provide one application involving the cycles of a random permutation. Consider a party with n people, where everyone writes their name on a piece of paper and puts it into a bag. The bag is mixed up, and each person draws one piece of paper. If you draw the name of someone else, you are considered to be in that person's group. We want to show that the distribution of the total number of groups is asymptotically normal. This construction extends the classical hat-check problem, in which each group is allowed at most two people and the goal is to find the distribution of the number of singleton groups.

Let us now formalize this problem. Define the symmetric group S_n to be the set of all permutations π of the set {1, 2, ..., n}. We can represent a permutation using cycle notation. For example, a permutation π ∈ S_7 may be represented as π = (1, 2, 6)(4, 3, 7)(5), with the meaning that π maps 1 to 2, 2 to 6, 6 to 1, and so forth. Notice that this π has two cycles of length 3 and one of length 1, for a total of three cycles. Hence, in the problem described above, each cycle represents a single group of people. We denote by C_{r,n}(π) the number of cycles of length r in the permutation π ∈ S_n, and we define K_n(π) to be the total number of cycles in π.

To begin, we show that C_{1,n} converges in distribution to a random variable having the Poisson distribution with mean 1. This solves the simplest of the hat-check problems.

Proposition 8.1. For k = 0, 1, ..., n,

P(C_{1,n} = k) → e^{−1} / k!   as n → ∞.

Proof (idea). Using an inclusion-exclusion argument, one can show the elementary equality

P(C_{1,n} = k) = (1/k!) Σ_{l=0}^{n−k} (−1)^l / l!.

The proposition follows by passing to the limit n → ∞.

Now we handle the extension of the problem; that is, we show that K_n(π) is asymptotically normal. We employ the Feller coupling [1], which constructs a random permutation π uniformly from S_n with the help of n independent Bernoulli variables X_1, ..., X_n with distributions

(8.2)    P(X_i = 0) = 1 − 1/i   and   P(X_i = 1) = 1/i,   i = 1, ..., n.

Begin the first cycle at step 1 with the element 1. At step i, i = 1, ..., n, if X_{n−i+1} = 1, close the current cycle and begin a new one starting with the smallest number not yet in any cycle; otherwise, choose the next element of the current cycle uniformly from those not yet used. In this way, at step i, we complete a cycle with probability 1/(n − i + 1). As the total number K_n(π) of cycles of π is precisely the number of times an element closes the loop upon completing its cycle,

K_n(π) = Σ_{i=1}^n X_i.
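The identity K_n(π) = Σ X_i can be probed by simulation: counting the cycles of uniformly random permutations should reproduce the mean Σ_{i=1}^n 1/i and variance Σ_{i=1}^n (1/i)(1 − 1/i) implied by (8.2), and the standardized count should look roughly normal. A short Python sketch (the helper name and parameters are ours, not the paper's):

import numpy as np

def cycle_count(perm):
    # Number of cycles of a permutation given as an array with perm[i] = pi(i).
    seen = np.zeros(len(perm), dtype=bool)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return cycles

rng = np.random.default_rng(3)
n, reps = 500, 5000
k = np.array([cycle_count(rng.permutation(n)) for _ in range(reps)])

i = np.arange(1, n + 1)
h_n = np.sum(1.0 / i)                      # expected number of cycles
var_n = np.sum((1.0 / i) * (1 - 1.0 / i))  # variance of the number of cycles
print(k.mean(), h_n)
print(k.var(), var_n)
w = (k - h_n) / np.sqrt(var_n)             # the standardized variable defined below
print(np.mean(np.abs(w) <= 1))             # compare with P(|Z| <= 1) ~ 0.683 for Z ~ N(0, 1)

The normal approximation improves only slowly here, since the variance grows like log n (σ_n² is only about 5.1 even for n = 500), but the agreement is already visible.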

[Figure omitted: an example of the Feller coupling producing π = (1, 2, 6)(4, 3, 7)(5).]

An important consequence of this construction is that the random variables X_i, i = 1, ..., n, although independent, are not identically distributed. As a result, the classical Central Limit Theorem does not apply to their sum. However, we show that the small zero bias condition holds, which gives the required asymptotic normality of K_n(π).

First, we standardize K_n(π) in order to satisfy Condition 5.1. Since a Bernoulli variable with success probability p has expected value p and variance p(1 − p), the linearity of expectation and the additivity of variance for independent variables give

(8.3)    E(K_n(π)) = Σ_{i=1}^n E(X_i) = Σ_{i=1}^n 1/i = h_n   and   Var(K_n(π)) = σ_n² = Σ_{i=1}^n (1/i)(1 − 1/i).

The standardized variable is then defined as

(8.4)    W_n = (K_n(π) − h_n)/σ_n = Σ_{i=1}^n X_{i,n},   where X_{i,n} = (X_i − 1/i)/σ_n.

It is trivial to verify that W_n satisfies Condition 5.1. We also record that

(8.5)    lim_{n→∞} h_n / log n = 1   and   lim_{n→∞} σ_n² / log n = 1.

Now we verify the small zero bias condition. As a consequence of (8.2) and (8.4), we have W_n = Σ_{i=2}^n X_{i,n}, since X_1 = 1 identically, which makes X_{1,n} = 0 for all n.

We claim that the zero bias transform of each centered summand is uniformly distributed. Let B be a Bernoulli random variable with success probability p and let X be the centered Bernoulli variable B − p, having variance σ² = p(1 − p). Then

E(X f(X)) = E((B − p) f(B − p))
          = p(1 − p) f(1 − p) − (1 − p) p f(−p)
          = σ² [f(1 − p) − f(−p)]
          = σ² ∫_{−p}^{1−p} f'(u) du
          = σ² E(f'(U)),

for U having the uniform density over [−p, 1 − p].
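The claim that the zero bias distribution of a centered Bernoulli variable is uniform on [−p, 1 − p] can also be checked numerically against Definition 4.1. A short Python sketch (the value of p and the test functions are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(4)
p = 0.3
sigma2 = p * (1 - p)

b = rng.binomial(1, p, size=10**6)       # Bernoulli(p) samples
x = b - p                                # centered Bernoulli X = B - p
u = rng.uniform(-p, 1 - p, size=10**6)   # candidate zero bias law U[-p, 1-p]

# Definition 4.1 requires E(X f(X)) = sigma^2 * E(f'(X*)) for differentiable f.
tests = [(np.tanh, lambda t: 1 / np.cosh(t) ** 2),
         (lambda t: t**3, lambda t: 3 * t**2)]
for f, fprime in tests:
    print(np.mean(x * f(x)), sigma2 * np.mean(fprime(u)))

Each printed pair should agree up to Monte Carlo error, supporting the identification X* =_d U.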

Thus, since (αX)* =_d αX* for all α ≠ 0, the computation above together with (8.4) gives

X_{i,n}* =_d U_i / σ_n,   where U_i has distribution U[−1/i, 1 − 1/i],   i = 2, ..., n.

It follows that |U_i| ≤ 1 for all i = 2, ..., n, and therefore, for the random index I_n,

|X_{I_n,n}*| ≤ 1/σ_n → 0

by (8.5). Hence the small zero bias condition is satisfied, and the Lindeberg-Feller Central Limit Theorem gives the desired asymptotic normality of W_n.

References

[1] W. Feller. The fundamental limit theorems in probability. Bull. Amer. Math. Soc. 51 (1945), 800-832.
[2] G. Folland. Real Analysis: Modern Techniques and Their Applications. John Wiley and Sons (1984).
[3] L. Goldstein. A probabilistic proof of the Lindeberg-Feller central limit theorem. The American Mathematical Monthly, Vol. 116, No. 1 (Jan. 2009), 45-60.
[4] L. Goldstein and G. Reinert. Stein's method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 7 (1997), 935-952.
[5] D. G. Hartig. The Riesz representation theorem revisited. The American Mathematical Monthly, Vol. 90, No. 4 (Apr. 1983), 277-280.