14:17 11/16/2

TOPIC. Convergence in distribution and related notions.

This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence that takes place in the central limit theorem, which will be developed in a later section. The basic idea is that the distributions of the random variables involved "settle down." The following definition focuses on the probability contents of intervals.

Definition 1. Let X_1, X_2, ... and X be real random variables with respective distribution functions F_1, F_2, ... and F. One says X_n converges to X in distribution as n → ∞, and writes X_n →D X, if

    lim_n F_n(x) = F(x) for all x ∈ C_F.    (1)

Here C_F := { x ∈ R : F is continuous at x } = { x ∈ R : P[X = x] = 0 } is the set of continuity points of F.

Example 1. Suppose that the distribution of X_n is a unit mass at the point 1/n, and the distribution of X is a unit mass at 0. If there is any justice to the definition, we should have X_n →D X. [In the original notes the distribution functions F_n and F, step functions jumping from 0 to 1 at 1/n and at 0 respectively, are graphed here; the graph of F_n is shifted up slightly to distinguish it from the graph of F.] Note that C_F = { x ∈ R : x ≠ 0 }. It is easy to see that

    lim_n F_n(x) = { 0, if x < 0; 1, if x > 0 } = F(x)

for all x ∈ C_F, so we do indeed have X_n →D X. Note, however, that lim_n F_n(0) = 0 ≠ 1 = F(0); this illustrates why (1) doesn't require lim_n F_n(x) = F(x) for all x.

Recall that for a distribution function F, the left-continuous inverse to F is the function F⁻ mapping (0,1) to R defined by

    F⁻(u) = inf{ x : u ≤ F(x) };

F⁻(u) is the smallest u-th quantile of F. The following characterization of convergence in distribution in terms of quantiles was part of the first homework (see Exercise 1.1).

Theorem 1. Let X_1, X_2, ..., and X be real random variables with left-continuous inverse distribution functions F_1⁻, F_2⁻, ..., and F⁻ respectively. Then X_n →D X if and only if

    lim_n F_n⁻(u) = F⁻(u) for all u ∈ C_{F⁻}.    (2)
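A quick numerical sketch of Definition 1 and Example 1 (the point-mass distribution functions are written out directly; the evaluation points are arbitrary choices):

```python
# Distribution functions for Example 1: F_n is the d.f. of a unit mass at 1/n,
# F is the d.f. of a unit mass at 0.

def F_n(x, n):
    return 1.0 if x >= 1.0 / n else 0.0

def F(x):
    return 1.0 if x >= 0 else 0.0

# At continuity points of F (any x != 0), F_n(x) -> F(x):
for x in (-0.5, 0.25, 1.0):
    assert F_n(x, 10**6) == F(x)

# At the discontinuity point x = 0, convergence fails: F_n(0) = 0 for all n
# while F(0) = 1 -- exactly why (1) only asks for convergence on C_F.
assert F_n(0.0, 10**6) == 0.0 and F(0.0) == 1.0
```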
Here C_{F⁻} := { u ∈ (0,1) : F⁻ is continuous at u }.

Suppose the X_n's and X are random variables for which (2) holds. Let U be a random variable uniformly distributed over (0,1). According to the Inverse Probability Theorem (Theorem 1.5), for each n the random variable X*_n := F_n⁻(U) has the same distribution as X_n; we express this as X*_n ~ X_n. Similarly, X* := F⁻(U) ~ X. I claim that X*_n converges to X* as n → ∞ in a very strong sense, namely,

    P[ ω : X*_n(ω) → X*(ω) ] = 1.    (3)

To see this, note that by Exercise 1.9, the set J of jumps of F⁻ is at most countable; consequently the jumps can be written down in a finite or infinite sequence u_1, u_2, .... Since X*_n(ω) = F_n⁻(U(ω)) and X*(ω) = F⁻(U(ω)), (2) implies that

    P[ ω : X*_n(ω) ↛ X*(ω) ] ≤ P[ ω : U(ω) ∉ C_{F⁻} ] = P[ ω : U(ω) ∈ J ]
        = P[ ∪_k { ω : U(ω) = u_k } ] ≤ Σ_k P[ ω : U(ω) = u_k ] = Σ_k 0 = 0.

This proves (3).
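The construction just given can be sketched concretely; the exponential family below is an assumed example (not from the notes), chosen because its quantile function is explicit:

```python
import math, random

# One common Uniform(0,1) draw U is pushed through the inverse distribution
# functions: X*_n = F_n^-(U) with X_n ~ Exponential(rate 1 + 1/n), and
# X* = F^-(U) with X ~ Exponential(rate 1).  Pointwise in omega, X*_n -> X*.

def quantile_exp(u, rate):
    # left-continuous inverse of the Exponential(rate) distribution function
    return -math.log(1.0 - u) / rate

random.seed(0)
u = random.random()                    # a single draw of U, shared by all n
x_star = quantile_exp(u, 1.0)          # X* = F^-(U)
for n in (10, 100, 1000, 10**6):
    x_star_n = quantile_exp(u, 1.0 + 1.0 / n)    # X*_n = F_n^-(U)
    assert abs(x_star_n - x_star) <= x_star / n  # X*_n(omega) -> X*(omega)
```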

Convergence almost everywhere and the representation theorem. The following notion arose in the preceding discussion.

Definition 2. Let Y_1, Y_2, ..., and Y be real random variables, all defined on a common probability space Ω. One says that Y_n converges almost everywhere to Y, and writes Y_n →a.e. Y, if

    P[ ω ∈ Ω : lim_n Y_n(ω) = Y(ω) ] = 1.    (4)

Using this terminology, the result derived on the preceding page can be stated as follows.

Theorem 2 (The Skorokhod representation theorem). If X_n →D X, then there exist on some probability space random variables X*_1, X*_2, ... and X* such that

    X*_n ~ X_n for each n,  X* ~ X,  and  X*_n →a.e. X*.    (5)

This is a special case of a result due to the famous Russian probabilist A. V. Skorokhod. It has a number of useful corollaries, which we will turn to after developing some properties of almost everywhere convergence.

The following terminology is used in the next theorem. Let h be a mapping from R to R and let X be a real random variable. The continuity set of h is C_h := { x : h is continuous at x }; its complement D_h is called the discontinuity set of h. h is said to be X-continuous if P[X ∈ C_h] = 1, or, equivalently, P[X ∈ D_h] = 0.

Example 2. Suppose h = I_{(−∞,x]}, the indicator of the interval (−∞, x]. [Its graph, a step function dropping from 1 to 0 at x, is sketched in the original notes.] Then D_h = {x}, so

    h is X-continuous ⟺ P[X ∈ D_h] = P[X = x] = 0 ⟺ x ∈ C_F,

F being the distribution function of X.

Recall: X_n →a.e. X means P[ ω : X_n(ω) → X(ω) ] = 1, and X_n →D X means F_n(x) → F(x) for each x ∈ C_F.

Theorem 3 (The mapping theorem for almost everywhere convergence). Let X_1, X_2, ... and X be real random variables, all defined on a common probability space Ω. If X_n →a.e. X as n → ∞ and h: R → R is X-continuous, then h(X_n) →a.e. h(X).

Proof. Put Y_n = h(X_n) and Y = h(X). If ω is a sample point such that X_n(ω) → X(ω) and X(ω) ∈ C_h, then Y_n(ω) = h(X_n(ω)) → h(X(ω)) = Y(ω). Hence

    { ω : Y_n(ω) ↛ Y(ω) } ⊂ { ω : X_n(ω) ↛ X(ω) } ∪ { ω : X(ω) ∉ C_h }.

Since the two events in the union on the right each have probability 0, so does the event on the left. ∎

Theorem 4. X_n →a.e. X ⟹ X_n →D X.
Proof. Let F_n and F be the distribution functions of X_n and X respectively, and let x ∈ C_F. Since h := I_{(−∞,x]} is X-continuous (see Example 2), we have Y_n := h(X_n) →a.e. Y := h(X). Since |Y_n| ≤ 1 and E(1) = 1 < ∞, the DCT implies that E(Y_n) → E(Y). But E(Y) = E( I_{(−∞,x]}(X) ) = P[X ≤ x] = F(x), and similarly E(Y_n) = F_n(x). Thus lim_n F_n(x) = F(x), as required. ∎

The converse doesn't hold: in general, X_n →D X does not imply X_n →a.e. X. For example, suppose Y is a random variable taking just the two values ±1, with probability 1/2 each. Set X_n = (−1)^n Y for n = 1, 2, .... Then X_n ~ Y for each n, so X_n →D Y. But X_n does not converge almost everywhere to Y, since for each sample point ω, X_n(ω) alternates between +1 and −1, and hence doesn't converge at all, let alone to Y(ω).
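The counterexample above is easy to watch in simulation (the seed and path length are arbitrary choices):

```python
import random

# X_n = (-1)^n * Y has the same two-point distribution as Y for every n, so
# X_n ->D Y trivially; but each sample path alternates between +1 and -1 and
# therefore converges nowhere.

random.seed(1)
y = random.choice([-1, 1])                      # Y at one fixed sample point
path = [(-1) ** n * y for n in range(1, 11)]    # X_1(w), ..., X_10(w)
assert set(path) == {-1, 1}                     # the path oscillates forever

# Distributional equality: (-1)^n * Y is again uniform on {-1, +1}.
draws = [(-1) ** 5 * random.choice([-1, 1]) for _ in range(10000)]
assert abs(draws.count(1) / 10000 - 0.5) < 0.05
```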

Theorem 5 (The mapping theorem for convergence in distribution). Suppose X_n →D X and h: R → R is X-continuous. Then h(X_n) →D h(X).

Proof. Using the Skorokhod representation, construct random variables X*_1, X*_2, ... and X* such that X*_n ~ X_n for each n, X* ~ X, and X*_n →a.e. X*. By Theorem 3, Y*_n := h(X*_n) →a.e. Y* := h(X*), and hence Y*_n →D Y* by Theorem 4. The claim follows, since Y*_n ~ h(X_n) for each n and Y* ~ h(X). ∎

Example 3. Suppose X_n →D Z ~ N(0,1). Then X_n² →D Z² ~ χ²_1. This follows from Theorem 5, by taking h(x) = x²; h is X-continuous since it is continuous (D_h = ∅, so P[X ∈ D_h] = 0).

Theorem 6 (The Δ-method). Suppose of random variables X_1, X_2, ... that

    c_n (X_n − x₀) →D Y    (6)

as n → ∞, for some random variable Y, some number x₀ ∈ R, and some sequence of positive numbers c_n tending to ∞. Suppose also that g: R → R is differentiable at x₀. Then

    c_n ( g(X_n) − g(x₀) ) →D g′(x₀) Y.    (7)

I'll give the formal argument in a moment; here is a heuristic one. Since g is differentiable at x₀, one has

    g(x₀ + Δ) ≐ g(x₀) + g′(x₀) Δ    (8)

for all small numbers Δ. Put Δ_n = X_n − x₀ and Y_n = c_n Δ_n. Since Y_n →D Y and c_n → ∞, for large n it is highly likely that Δ_n = Y_n / c_n will be close to 0, and hence by (8) that

    c_n [ g(X_n) − g(x₀) ] = c_n [ g(x₀ + Δ_n) − g(x₀) ]

will be close to c_n Δ_n g′(x₀) = g′(x₀) Y_n, which will be distributed almost like g′(x₀) Y. This method of argument is called the Δ-method because of the symbol Δ in the seminal approximation (8).

Proof of Theorem 6. By assumption Y_n := c_n (X_n − x₀) →D Y. Using the Skorokhod representation theorem, concoct random variables Y*_1, Y*_2, ... and Y* having the same marginal distributions as Y_1, Y_2, ... and Y respectively, but with Y*_n →a.e. Y*. Since X_n = x₀ + Y_n / c_n, it is natural to introduce X*_n := x₀ + Y*_n / c_n; note that X*_n ~ X_n for each n and X*_n →a.e. x₀ as n → ∞. Now write

    G*_n := c_n ( g(X*_n) − g(x₀) ) = c_n (X*_n − x₀) · ( g(X*_n) − g(x₀) ) / ( X*_n − x₀ ),

with the difference quotient taken to be g′(x₀) if X*_n = x₀. Let n → ∞ to get

    G*_n → G* := Y* g′(x₀)

almost everywhere, and so also in distribution.
This implies

    G_n := c_n ( g(X_n) − g(x₀) ) →D G := Y g′(x₀),

since G_n ~ G*_n and G ~ G*. ∎

The expectation characterization of convergence in distribution. The following terminology is used in the next theorem. A function f: R → R is said to be Lipschitz continuous if

    ‖f‖_L := sup{ |f(y) − f(x)| / (y − x) : −∞ < x < y < ∞ } < ∞;    (9)

the quantity ‖f‖_L is called the Lipschitz norm of f.

Example 4. The absolute value function f(x) = |x| is Lipschitz continuous with norm 1, since

    | |y| − |x| | ≤ |y − x|

by the triangle inequality. In contrast the function g(x) = |x|^{1/2} is not Lipschitz continuous, since

    lim_{y ↓ 0} ( g(y) − g(0) ) / y = ∞.

The following result states that X_n →D X iff Ef(X_n) → Ef(X) for all sufficiently smooth bounded functions f. To save space, we write "bd" for "bounded" in the statement of the theorem.
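Returning to Theorem 6, here is a Monte Carlo sketch of the Δ-method under assumed ingredients (Exponential(1) sample means, g(x) = x², and loose numerical tolerances; none of these choices come from the notes): with X_n the mean of n iid Exponential(1) variables, x₀ = 1 and c_n = √n, the CLT gives c_n(X_n − x₀) →D Y ~ N(0,1), so (7) predicts √n(X_n² − 1) →D 2Y ~ N(0,4).

```python
import math, random

# Monte Carlo sketch of the Delta-method: sqrt(n)*(mean^2 - 1) for Exponential(1)
# sample means should be approximately N(0, 4), since g(x) = x^2 has g'(1) = 2.

random.seed(2)
n, reps = 500, 2000
samples = []
for _ in range(reps):
    x_bar = sum(random.expovariate(1.0) for _ in range(n)) / n
    samples.append(math.sqrt(n) * (x_bar ** 2 - 1.0))

mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
assert abs(mean) < 0.3        # limit mean is 0
assert abs(var - 4.0) < 0.8   # limit variance is g'(1)^2 = 4
```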

Theorem 7. Let X_1, X_2, ..., and X be real random variables. The following are equivalent.

a) X_n →D X.
b) Ef(X_n) → Ef(X) for all bd X-continuous f: R → R.
c) Ef(X_n) → Ef(X) for all bd continuous f: R → R.
d) Ef(X_n) → Ef(X) for all bd Lipschitz continuous f: R → R.

Proof. a) ⟹ b): This is a rehash of some old ideas. Using the Skorokhod representation, construct random variables X*_1, X*_2, ... and X* such that X*_n ~ X_n for each n, X* ~ X, and X*_n →a.e. X*. Since f is X-continuous, Y*_n := f(X*_n) →a.e. Y* := f(X*). Since f is bounded, the DCT implies that EY*_n → EY*. Now use Ef(X_n) = Ef(X*_n) = EY*_n and Ef(X) = Ef(X*) = EY*.

b) ⟹ c) ⟹ d): This is obvious.

d) ⟹ a): Let F_n and F be the distribution functions of X_n and X, respectively. We need to show lim_n F_n(x) = F(x) for each x ∈ C_F. Fix such an x. For positive integers k, let f_k: R → R be the function that equals 1 on (−∞, x], equals 0 on [x + 1/k, ∞), and decreases linearly from 1 to 0 on [x, x + 1/k]. Since I_{(−∞,x]} ≤ f_k ≤ 1, we have

    F_n(x) = E( I_{(−∞,x]}(X_n) ) ≤ E( f_k(X_n) )

for each n, and since f_k is bounded and Lipschitz continuous, d) implies

    lim sup_n F_n(x) ≤ lim sup_n E( f_k(X_n) ) = E( f_k(X) ).

Since f_k(X) ↓ I_{(−∞,x]}(X) as k → ∞, with the convergence being dominated by 1, the DCT implies lim_k E( f_k(X) ) = E( I_{(−∞,x]}(X) ) = F(x). The upshot is lim sup_n F_n(x) ≤ F(x). A similar argument shows that

    F(x−) = E( I_{(−∞,x)}(X) ) ≤ lim inf_n F_n(x).

Since x ∈ C_F implies F(x) = F(x−), we have lim_n F_n(x) = F(x), as required. ∎

Example 5. A) Let X_n be a random variable taking the two values 0 and n, with probabilities 1 − 1/n and 1/n respectively. Then X_n →D X, where X ≡ 0. Let f: R → R be the identity function: f(x) = x for all x. f is Lipschitz continuous, but

    Ef(X_n) = E(X_n) = 0·(1 − 1/n) + n·(1/n) = 1 ↛ 0 = E(X) = Ef(X).

This doesn't contradict Theorem 7, since f isn't bounded.

B) Let t ∈ R and consider the functions f_t: R → R and g_t: R → R defined by f_t(x) = cos(tx) and g_t(x) = sin(tx). f_t and g_t are continuous and bounded, so X_n →D X implies

    E(e^{itX_n}) = E(cos(tX_n)) + iE(sin(tX_n)) → E(cos(tX)) + iE(sin(tX)) = E(e^{itX}).
That is, the characteristic function of X_n converges pointwise to the characteristic function of X. We will see later on that the converse is true too.

Convergence of densities. Suppose X_n →D X and that X_n and X have densities f_n and f respectively. Does f_n converge to f in any nice way? In general, the answer is no.
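Example 5B can be seen numerically for a concrete sequence (an assumed choice, echoing Exercise 1 below): the characteristic function of X_n uniform on the grid {0, 1/n, ..., n/n} should converge pointwise to that of X ~ Uniform(0,1).

```python
import cmath

# phi_n(t) = (1/(n+1)) sum_k e^{i t k/n}   (uniform on the grid {k/n}),
# phi(t)   = (e^{it} - 1)/(it)             (Uniform(0,1)).

def phi_n(t, n):
    return sum(cmath.exp(1j * t * k / n) for k in range(n + 1)) / (n + 1)

def phi(t):
    return (cmath.exp(1j * t) - 1) / (1j * t)

for t in (0.5, 1.0, 3.0, -2.0):
    assert abs(phi_n(t, 20000) - phi(t)) < 1e-3
```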

Example 6. Let X_n and X be random variables taking values in [0,1], with densities

    f_n(x) = 1 + sin(2πnx)  and  f(x) = 1,  for 0 ≤ x ≤ 1.

[The original notes graph f_2 against f here.] An easy calculation shows that the corresponding distribution functions satisfy F_n(x) → F(x) for all x ∈ R, so X_n →D X. However, as n → ∞, f_n oscillates ever more wildly, and doesn't converge to f in any nice way; see Exercise 22.

There is, however, a positive result in the opposite direction: convergence of densities implies convergence in distribution. This fact is a corollary of the next theorem, wherein B denotes the collection of Borel subsets of R.

Theorem 8. Let X_1, X_2, ... and X be real random variables having corresponding densities f_1, f_2, ..., and f with respect to some measure µ. If lim_n f_n(x) = f(x) for µ-almost all x, then

    lim_n ∫ |f_n(x) − f(x)| µ(dx) = 0.    (10)

If (10) holds, then

    lim_n sup{ |P[X_n ∈ B] − P[X ∈ B]| : B ∈ B } = 0.    (11)

(11) is a much stronger assertion than X_n →D X; when it holds, one says that the distribution of X_n converges strongly to the distribution of X, and writes X_n →s X.

Proof of Theorem 8. (10) holds: Since

    |f_n − f| → 0 µ-a.e.,  |f_n − f| ≤ U_n := f_n + f → U := 2f µ-a.e.,  and  ∫ U_n dµ = 2 → 2 = ∫ U dµ,

we have ∫ |f_n − f| dµ → 0 by the Sandwich Theorem (Theorem 1.5).

(10) ⟹ (11): For each Borel subset B of R, one has

    | P[X_n ∈ B] − P[X ∈ B] | = | ∫_B f_n(x) µ(dx) − ∫_B f(x) µ(dx) |
        ≤ ∫_B |f_n(x) − f(x)| µ(dx) ≤ ∫ |f_n(x) − f(x)| µ(dx). ∎

Example 7. Suppose Y_r is a Gamma random variable with shape parameter r and scale parameter 1. Since Y_r has mean r and variance r, the standardized variable

    X_r := (Y_r − r)/√r

has mean 0 and variance 1. We are going to show that X_r converges strongly to a standard normal random variable as r → ∞. By the change of variable formula, X_r has density

    f_r(x) = f_{Y_r}(r + √r x) √r = (√r / Γ(r)) (r + √r x)^{r−1} e^{−(r + √r x)} I_{(0,∞)}(r + √r x).

f_r(x) is graphed versus x for r = 1, 2, 4, and 8 on the next page; the plot also shows the standard normal density for comparison.
By Stirling's formula,

    Γ(r) = Γ(r+1)/r = (1 + o(1)) √(2πr) r^r e^{−r} / r = (1 + o(1)) √(2πr) r^{r−1} e^{−r},    (12)

where o(1) denotes a quantity that tends to 0 as r → ∞. Fix x. For all large r we have r + √r x > 0, so

    f_r(x) = (1/√(2π)) (1 + x/√r)^{r−1} e^{−√r x} (1 + o(1)).

Since log(1 + u) = u − u²/2 + u³/3 − ⋯ for |u| < 1,

    log(f_r(x)) = −C + (r−1) log(1 + x/√r) − √r x + o(1)
        = −C + (r−1) [ x/√r − x²/(2r) + O(1/r^{3/2}) ] − √r x + o(1)
        = −C − x²/2 + o(1),

where C = log(√(2π)) and O(1/r^{3/2}) denotes a quantity q_r such that r^{3/2} q_r remains bounded as r → ∞. Consequently lim_r f_r(x) = exp(−x²/2)/√(2π). By Theorem 8, X_r →s Z ~ N(0,1). Exercise 25 rearranges this argument into a proof of (12).
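The limit just computed can be checked numerically (a sketch; the grid of x values and the tolerance are arbitrary choices):

```python
import math

# Density of X_r = (Y_r - r)/sqrt(r), Y_r ~ Gamma(r, 1), evaluated via
# log-gamma for numerical stability, compared with the N(0,1) density.

def f_r(x, r):
    y = r + math.sqrt(r) * x
    if y <= 0:
        return 0.0
    log_f = (0.5 * math.log(r) - math.lgamma(r)
             + (r - 1) * math.log(y) - y)
    return math.exp(log_f)

def normal_density(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in (-1.5, 0.0, 2.0):
    assert abs(f_r(x, 10**6) - normal_density(x)) < 1e-2
```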

[Figure 1: density of (Gamma(r) − r)/√r plotted against x for r = 1, 2, 4, 8, together with the standard normal density (r = ∞).]

Characteristic functions. We are now going to give some criteria for convergence in terms of characteristic functions.

Theorem 9 (The continuity theorem for densities). Let X_1, X_2, ..., and X be real random variables with corresponding integrable characteristic functions φ_1, φ_2, ..., and φ. Suppose that φ_n converges to φ in L¹, in the sense that

    ‖φ_n − φ‖_1 := ∫ |φ_n(t) − φ(t)| dt → 0    (13)

as n → ∞. Then X_1, X_2, ..., and X have bounded continuous densities f_1, f_2, ..., and f with respect to Lebesgue measure on R. As n → ∞, f_n converges to f both uniformly and in L¹:

    ‖f_n − f‖_∞ := sup{ |f_n(x) − f(x)| : x ∈ R } → 0,    (14)
    ‖f_n − f‖_1 := ∫ |f_n(x) − f(x)| dx → 0.    (15)

In particular, X_n →s X.

Proof. By the inversion theorem, X_n and X have the bounded continuous densities

    f_n(x) = (1/2π) ∫ e^{−itx} φ_n(t) dt  and  f(x) = (1/2π) ∫ e^{−itx} φ(t) dt

with respect to Lebesgue measure. (14) follows from

    2π |f_n(x) − f(x)| = | ∫ e^{−itx} φ_n(t) dt − ∫ e^{−itx} φ(t) dt |
        = | ∫ e^{−itx} ( φ_n(t) − φ(t) ) dt | ≤ ∫ |φ_n(t) − φ(t)| dt.

(15) follows from (14) and Theorem 8 with µ(dx) = dx. ∎

Now we come to the crucially important

Theorem 10 (The continuity theorem for distribution functions). Let X_1, X_2, ... and X be real random variables with corresponding characteristic functions φ_1, φ_2, ... and φ. Then

    X_n →D X  ⟺  φ_n(t) → φ(t) for all t ∈ R.    (16)

Proof. (⟹) This was done previously; see Example 5.

(⟸) Let Z ~ N(0,1), independent of the X_n's and X. Let k be a (large) positive integer and set σ_k = 1/k, Y_{n,k} = X_n + σ_k Z, and Y_k = X + σ_k Z. I claim that Y_{n,k} →D Y_k as n → ∞. To see this, note that Y_{n,k} and Y_k have characteristic functions

    ψ_{n,k}(t) = E(e^{itY_{n,k}}) = φ_n(t) e^{−σ_k² t²/2}  and  ψ_k(t) = E(e^{itY_k}) = φ(t) e^{−σ_k² t²/2}.

ψ_{n,k} and ψ_k are integrable, and

    ∫ |ψ_{n,k}(t) − ψ_k(t)| dt ≤ ∫ |φ_n(t) − φ(t)| e^{−σ_k² t²/2} dt

tends to 0 as n → ∞, by virtue of the DCT. Theorem 9 implies that Y_{n,k} converges strongly to Y_k, and hence also in distribution.

We have just shown that for large n, the smoothing Y_{n,k} of X_n has almost the same distribution as the smoothing Y_k of X. The idea now is to show that when k is sufficiently large (and hence the smoothing parameter σ_k is sufficiently small), Y_{n,k} has nearly the same distribution as X_n, and Y_k nearly the same distribution as X; the upshot is that X_n will have nearly the same distribution as X. To make this precise, let g: R → R be bounded and Lipschitz continuous. I claim that

    E( g(X_n) ) → E( g(X) )    (17)

as n → ∞; by criterion d) of Theorem 7, this is enough to show that X_n →D X.

To prove (17), note that by the triangle inequality

    | Eg(X_n) − Eg(X) | ≤ I_{n,k} + II_{n,k} + III_k,

where

    I_{n,k} = | Eg(X_n) − Eg(Y_{n,k}) |,
    II_{n,k} = | Eg(Y_{n,k}) − Eg(Y_k) |,
    III_k = | Eg(Y_k) − Eg(X) |.

Letting L denote the Lipschitz norm of g, we have

    | g(X_n) − g(Y_{n,k}) | ≤ L |X_n − Y_{n,k}| = L σ_k |Z|,

since Y_{n,k} = X_n + σ_k Z. Thus

    I_{n,k} ≤ E( | g(X_n) − g(Y_{n,k}) | ) ≤ L σ_k E(|Z|).

The same bound applies to III_k, for similar reasons. Thus

    | Eg(X_n) − Eg(X) | ≤ 2 L σ_k E(|Z|) + II_{n,k}.

This holds for all n and all k.
For each k, lim_n II_{n,k} = 0, since g is continuous and bounded and Y_{n,k} →D Y_k. (17) follows by taking limits, first as n → ∞ and then as k → ∞. ∎

Remark. With more work, an even better result can be established. Suppose X_1, X_2, ... are random variables with corresponding characteristic functions φ_1, φ_2, .... If φ(t) := lim_n φ_n(t) exists for each t ∈ R and φ is continuous at 0, then φ is the characteristic function of some random variable X, and X_n →D X.
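As a numerical sketch of the criterion in Theorem 10 (with an assumed example, anticipating the central limit theorem): the standardized sum S_n of n iid Uniform(−1,1) variables (each with variance 1/3) has characteristic function (sin s / s)^n with s = t√(3/n), which converges pointwise to e^{−t²/2}, the characteristic function of N(0,1).

```python
import math

# Characteristic function of S_n = (U_1 + ... + U_n)/sqrt(n/3), U_i ~ Uniform(-1,1):
# phi_U(s) = sin(s)/s, so phi_{S_n}(t) = (sin(s)/s)^n with s = t*sqrt(3/n).

def phi_S_n(t, n):
    s = t * math.sqrt(3.0 / n)
    return (math.sin(s) / s) ** n

for t in (0.5, 1.0, 2.0):
    assert abs(phi_S_n(t, 10**6) - math.exp(-t * t / 2)) < 1e-3
```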

Example 8. Let φ_n be the characteristic function of a random variable X_n having a binomial distribution with parameters n and p_n. If np_n → λ ∈ [0, ∞) as n → ∞, then for each t ∈ R

    φ_n(t) = ( p_n e^{it} + (1 − p_n) )^n = ( 1 + np_n(e^{it} − 1)/n )^n → φ(t) := exp( λ(e^{it} − 1) ),

since for complex numbers z_n → z,

    (1 + z_n/n)^n → e^z    (18)

(see below). Since φ is the characteristic function of a random variable X with a Poisson distribution with parameter λ, this proves that X_n →D X.

We need to establish (18).

Theorem 11. (18) holds.

Proof. For complex numbers ζ satisfying |ζ| < 1, the so-called principal value of log(1 + ζ) has the power series expansion

    log(1 + ζ) = Σ_{n=1}^∞ (−1)^{n−1} ζ^n / n = ζ − ζ²/2 + ζ³/3 − ζ⁴/4 + ⋯.    (19)

Consequently for |ζ| ≤ 1/2

    | log(1 + ζ) − ζ | ≤ |ζ|²/2 + |ζ|³/3 + |ζ|⁴/4 + ⋯ ≤ (|ζ|²/2)( 1 + |ζ| + |ζ|² + ⋯ ) = (|ζ|²/2) · 1/(1 − |ζ|) ≤ |ζ|².    (20)

Now suppose z_n → z in C. Since z_n/n → 0, (20) implies that

    n log(1 + z_n/n) = n ( z_n/n + O(1/n²) ) = z_n + O(1/n) → z,

so (1 + z_n/n)^n = exp( n log(1 + z_n/n) ) → exp(z). ∎

Generalization to random vectors. All of the results of this section carry over to random vectors taking values in R^k for some integer k. The proofs are straightforward generalizations of the ones for k = 1, except for the Skorokhod Representation Theorem, which can no longer be proved using inverse distribution functions.

Exercise 1. For n = 1, 2, ..., let X_n be a random variable uniformly distributed over the n + 1 points k/n for k = 0, 1, ..., n. Show that as n → ∞, X_n →D X, where X ~ Uniform(0, 1).

Exercise 2. For n = 1, 2, ..., let G_n be a random variable with a Geometric distribution. Show that if µ_n := E(G_n) → ∞, then X_n := G_n/µ_n →D X, where X is a standard exponential random variable.

Exercise 3. Let Y_1, Y_2, ... be iid standard exponential random variables. For each n, set X_n = max(Y_1, Y_2, ..., Y_n). Show that as n → ∞, X_n − log(n) →D X, where X is a random variable with distribution function F(x) = exp(−exp(−x)) for −∞ < x < ∞.
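The limit in Example 8 (and hence Theorem 11) can be checked numerically; λ = 3 and the tolerance are arbitrary choices:

```python
import cmath

# Binomial(n, lambda/n) characteristic function versus the Poisson(lambda) one.

lam = 3.0

def phi_binomial(t, n):
    p = lam / n
    return (p * cmath.exp(1j * t) + 1 - p) ** n

def phi_poisson(t):
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

for t in (0.5, 1.0, 2.5):
    assert abs(phi_binomial(t, 10**6) - phi_poisson(t)) < 1e-4
```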
In the next 4 exercises, X_1, X_2, ..., and X are real random variables with corresponding distribution functions F_1, F_2, ..., and F.

Exercise 4. Show that X_n →D X if and only if there exists a dense subset D of R such that F_n(x) → F(x) for all x ∈ D.

Exercise 5. Suppose X_n →D X. Show that lim_{c→∞} sup_n P[ |X_n| ≥ c ] = 0.

Exercise 6. Show that X_n →D X if and only if

    lim_n ( F_n(b) − F_n(a) ) = F(b) − F(a)    (21)

for all a ∈ C_F and b ∈ C_F with a < b. [Hint: for "⟹", first show that (21) holds.]

Exercise 7. Suppose that F is continuous. Show that X_n →D X implies that lim_n sup{ |F_n(x) − F(x)| : x ∈ R } = 0.

Let X_1, X_2, ..., and X be real random variables. One says X_n converges to X in probability, and writes X_n →P X, if

    lim_n P[ |X_n − X| ≥ ε ] = 0 for all ε > 0.    (22)

Exercise 8. Suppose X_n →P 0 and there exists a finite constant c such that |X_n| ≤ c for all n. Show that lim_n E|X_n| = 0.

Exercise 9. Suppose X_n →D X and Y_n − X_n →P 0. Show that Y_n →D X. [Hint: use criterion d) of Theorem 7 in conjunction with the previous exercise.]

Exercise 10. a) Show that X_n →P X ⟹ X_n →D X. [Hint: use the preceding exercise.] b) Show by example that, in general, X_n →D X does not imply X_n →P X.

Exercise 11. Show that if X is constant (with probability one), then X_n →D X ⟹ X_n →P X.

Exercise 12. Suppose X_1(ω) ≥ X_2(ω) ≥ ⋯ ≥ X_n(ω) ≥ ⋯ for each sample point ω. Show that X_n →a.e. X ⟹ X_n →P X. [Hint: if A_1, A_2, ..., and A are events such that A_1 ⊃ A_2 ⊃ ⋯ ⊃ A_n ⊃ ⋯ and A = ∩_{n=1}^∞ A_n, then P[A] = lim_n P[A_n]; you do not have to prove this fact from measure theory.]

Exercise 13. Show that X_n →a.e. X ⟺ sup_{p ≥ n} |X_p − X| →P 0 as n → ∞. [Hint: use the preceding exercise.]

Exercise 14. Suppose S_1, S_2, ... are nonnegative random variables such that (S_n − nµ)/(σ√n) →D Z, for some positive finite numbers µ and σ and some random variable Z. Show that

    √(S_n) − √(nµ) →D σZ/(2√µ).    (23)

Exercise 15 (The k-th order Δ-method). As in Theorem 6, suppose X_1, X_2, ... are random variables such that c_n(X_n − x₀) →D Y as n → ∞, for some random variable Y, some number x₀, and some positive numbers c_n tending to ∞. Let k be a positive integer and let g: R → R be a function such that

    g(x₀ + Δ) − g(x₀) = γΔ^k + o(Δ^k)    (24)

as Δ → 0. Show that as n → ∞,

    c_n^k ( g(X_n) − g(x₀) ) →D γY^k.    (25)

Exercise 16. Let f: [0,1] → R be a bounded U-continuous function, where U ~ Uniform[0,1]. Show that (1/n) Σ_{k=0}^n f(k/n) → Ef(U) as n → ∞. What does this say about Riemann integration? [Hint: use Theorem 7 and the result of Exercise 1.]
The boundary ∂B of a subset B of R is the set of points x for which there exists an infinite sequence y_1, y_2, ... of points in B converging to x, and also an infinite sequence z_1, z_2, ... of points not in B converging to x; y_n = x or z_n = x for some n's is allowed. For example, the boundary of the interval B = (a, b] is ∂B = {a, b}.

Exercise 17. Suppose B is a (Borel) subset of R. Show that the discontinuity set of the indicator function I_B equals ∂B. Use Theorem 7 to show that if X_n →D X and P[X ∈ ∂B] = 0, then P[X_n ∈ B] → P[X ∈ B]. How does this relate to (1)?

Exercise 18. Suppose that X_n →D X and there exists a strictly positive number ε such that sup_n E( |X_n|^{1+ε} ) < ∞. Show that: a) E( |X|^{1+ε} ) < ∞; b) X_n and X are integrable; and c) E(X_n) → E(X) as n → ∞. [Hint: if P[X = ±c] = 0, then the function f_c: R → R defined by f_c(x) = x I_{[−c,c]}(x) is X-continuous (why?). Moreover, for any random variable Y, one has E( |Y| I_{{|Y| ≥ c}} ) ≤ E( |Y|^{1+ε} )/c^ε (why?).]

Exercise 19. Suppose X_n →D X. Let M_n and M be the moment generating functions of X_n and X respectively. Show that if sup_n M_n(u) < ∞ for some number u > 0, then lim_n M_n(t) = M(t) < ∞ for all 0 ≤ t < u. What happens if u < 0? [Hint: use the result of the preceding exercise.]

Exercise 20. Let X_n and X be as in Exercise 1. Show that X_n does not converge strongly to X; specifically, show that for B = { x ∈ [0,1] : x is irrational }, one has P[X_n ∈ B] = 0 for all n, whereas P[X ∈ B] = 1.

Exercise 21. Suppose ξ is an irrational number. For each positive integer n, let x_n = nξ mod 1. Show that the x_n's are dense in [0, 1). [Hint: first show that for each positive integer k, there exist integers i and j with 1 ≤ i ≤ k, 1 ≤ j ≤ k, and (j − i)ξ mod 1 ≤ 1/k.]

Exercise 22. Let f_n and f be as in Example 6. a) Show that lim sup_n |f_n(x) − f(x)| = 1 for each irrational number x ∈ [0, 1]. [Hint: use the preceding exercise.] b) Show that ∫₀¹ |f_n(x) − f(x)| dx = 2/π for all n.

Exercise 23. Suppose that P and Q are two probability distributions on R having densities p and q with respect to a measure µ. Show that

    sup{ |Q[B] − P[B]| : B ∈ B } = (1/2) ∫ |q(x) − p(x)| µ(dx);    (26)

here B denotes the collection of Borel subsets B of R. Deduce that (11) implies (10). [Hint: first show that ∫_{q>p} (q − p) dµ = ∫_{q<p} (p − q) dµ = (1/2) ∫ |q − p| dµ.]

Exercise 24. Suppose that X_n →D X. Suppose further that X_n and X have densities f_n and f with respect to a measure µ on R and that there exists a µ-integrable function g such that f ≤ g and f_n ≤ g for all n. Show that P[X_n ∈ B] → P[X ∈ B] for each Borel set B (even if f_n does not converge µ-almost everywhere to f). What bearing does this have on Example 6? [Hint: Suppose B is a Borel subset of R and ε > 0. There exist finitely many disjoint subintervals (a_i, b_i] of R such that ∫_{A △ B} g(x) µ(dx) ≤ ε; here A = ∪_i (a_i, b_i]. You do not have to prove this fact from measure theory.]

Exercise 25 (Stirling's formula for the Gamma function). Recall that Γ(r + 1) = ∫₀^∞ y^r e^{−y} dy for r ∈ (0, ∞).
a) Use the change of variables x = (y − r)/√r to show that

    ρ_r := Γ(r + 1) / ( r^{r + 1/2} e^{−r} ) = ∫ e^{−ψ_r(x)} dx,

where ψ_r(x) = r( φ(1 + x/√r) − φ(1) ) with φ(u) = u − log(u).

b) Show that lim_r ψ_r(x) = x²/2 for each x ∈ R and that for all large r, ψ_r(x) ≥ |x| I_{[1,∞)}(|x|)/4 for all x.

c) Deduce lim_r ρ_r = ∫ exp(−x²/2) dx = √(2π). [Hints: φ(1 + u) − φ(1) = (1 + o(1)) u²/2 as u → 0, and φ is convex.]

Exercise 26. Let X_{α,β} be a random variable having a Beta distribution with parameters α and β. Show that if α stays fixed and β → ∞, then βX_{α,β} converges strongly to X ~ Gamma(α, 1).

Exercise 27. Let X_{α,β} be a random variable having a Beta distribution with parameters α + 1 and β + 1. Set µ_{α,β} = α/(α + β) and σ²_{α,β} = µ_{α,β}(1 − µ_{α,β})/(α + β). Show that if α and β tend to infinity in such a way that µ_{α,β} remains bounded away from 0 and 1, then (X_{α,β} − µ_{α,β})/σ_{α,β} converges strongly to Z ~ N(0,1).

Exercise 28. Let T_n be a random variable having a t distribution with n degrees of freedom. Show that as n → ∞, T_n →s Z ~ N(0,1).

Exercise 29. Use Theorem 10 to solve Exercise 2.

Exercise 30. Let X take the values 1 and 0 with probabilities p and q := 1 − p respectively. Show that the characteristic function of Y := X − E(X) = X − p satisfies

    E(e^{itY}) = ( 1 + p(e^{it} − 1) ) e^{−ipt} = 1 − pqt²/2 + O(t³)    (27)

as t → 0.

Exercise 31. Suppose that X_n ~ Binomial(n, p) for n = 1, 2, ..., where 0 < p < 1. Use Theorem 10 to show that (X_n − np)/√(npq) →D Z as n → ∞. [Hint: use (27).]

Exercise 32. Show that for complex numbers ζ satisfying |ζ| ≤ c < 1, the principal value of log(1 + ζ) satisfies

    | log(1 + ζ) − Σ_{j=1}^k (−1)^{j−1} ζ^j / j | ≤ ( |ζ|^{k+1} / (k + 1) ) · 1/(1 − c)    (28)

for each positive integer k.