14:17 11/16/2

TOPIC. Convergence in distribution and related notions.

This section studies the notion of so-called convergence in distribution of real random variables. This is the kind of convergence that takes place in the central limit theorem, which will be developed in a later section. The basic idea is that the distributions of the random variables involved settle down. The following definition focuses on the probability contents of intervals.

Definition 1. Let X_1, X_2, ... and X be real random variables with respective distribution functions F_1, F_2, ... and F. One says X_n converges to X in distribution as n → ∞, and writes X_n →_D X, if

  lim_n F_n(x) = F(x) for all x ∈ C_F.   (1)

Here C_F := { x ∈ R : F is continuous at x } = { x ∈ R : P[X = x] = 0 } is the set of continuity points of F.

Example 1. Suppose that the distribution of X_n is a unit mass at the point 1/n, and the distribution of X is a unit mass at 0. If there is any justice to the definition, we should have X_n →_D X. [The original shows the graphs of F_n and F: step functions jumping from 0 to 1 at 1/n and at 0 respectively, the graph of F_n shifted up slightly to distinguish it from the graph of F.]

Note that C_F = { x ∈ R : x ≠ 0 }. It is easy to see that

  lim_n F_n(x) = { 0, if x < 0; 1, if x > 0 } = F(x)

for all x ∈ C_F, so we do indeed have X_n →_D X. Note, however, that lim_n F_n(0) = 0 ≠ 1 = F(0); this illustrates why (1) doesn't require lim_n F_n(x) = F(x) for all x.

Recall that for a distribution function F, the left-continuous inverse to F is the function F⁻ mapping (0,1) to R defined by

  F⁻(u) = inf{ x : u ≤ F(x) };

F⁻(u) is the smallest u-th quantile of F. The following characterization of convergence in distribution in terms of quantiles was part of the first homework (see Exercise 1.1).

Theorem 1. Let X_1, X_2, ..., and X be real random variables with left-continuous inverse distribution functions F⁻_1, F⁻_2, ..., and F⁻ respectively. Then X_n →_D X if and only if

  lim_n F⁻_n(u) = F⁻(u) for all u ∈ C_{F⁻}.
(2) Here C_{F⁻} := { u ∈ (0,1) : F⁻ is continuous at u }.

Suppose the X_n's and X are random variables for which (2) holds. Let U be a random variable uniformly distributed over (0,1). According to the Inverse Probability Theorem (Theorem 1.5), for each n the random variable X*_n := F⁻_n(U) has the same distribution as X_n; we express this as X*_n ~ X_n. Similarly, X* := F⁻(U) ~ X. I claim that X*_n converges to X* as n → ∞ in a very strong sense, namely

  P[ ω : X*_n(ω) → X*(ω) ] = 1.   (3)

To see this, note that by Exercise 1.9 the set J of jumps of F⁻ is at most countable; consequently the jumps can be written down in a finite or infinite sequence u_1, u_2, .... Since X*_n(ω) = F⁻_n(U(ω)) and X*(ω) = F⁻(U(ω)), (2) implies that

  P[ ω : X*_n(ω) ↛ X*(ω) ] ≤ P[ ω : U(ω) ∉ C_{F⁻} ] = P[ ω : U(ω) ∈ J ]
    = P[ ∪_k { ω : U(ω) = u_k } ] ≤ Σ_k P[ ω : U(ω) = u_k ] = Σ_k 0 = 0.

This proves (3).
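As a concrete illustration (not from the text), the sketch below takes F_n and F to be Exponential distribution functions with rates 1 + 1/n and 1, builds X*_n = F⁻_n(U) and X* = F⁻(U) from one shared uniform sample, and checks that the paths agree to high accuracy for large n:

```python
import math
import random

def quantile_exp(u, rate):
    """Left-continuous inverse of the Exponential(rate) distribution function."""
    return -math.log(1.0 - u) / rate

random.seed(0)
us = [random.random() for _ in range(10_000)]   # one shared U per sample point

n = 10**6
xs_n = [quantile_exp(u, 1 + 1 / n) for u in us]   # X*_n = F_n^-(U)
xs = [quantile_exp(u, 1.0) for u in us]           # X*  = F^-(U)

# The same U drives every X*_n, so X*_n(w) -> X*(w) pathwise.
gap = max(abs(a - b) for a, b in zip(xs_n, xs))
assert gap < 1e-4
```

The point of the coupling is visible in the code: the randomness (us) is drawn once, and only the quantile functions change with n.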
Convergence almost everywhere and the representation theorem. The following notion arose in the preceding discussion.

Definition 2. Let Y_1, Y_2, ..., and Y be real random variables, all defined on a common probability space Ω. One says that Y_n converges almost everywhere to Y, and writes Y_n →_a.e. Y, if

  P[ ω ∈ Ω : lim_n Y_n(ω) = Y(ω) ] = 1.   (4)

Using this terminology, the result derived on the preceding page can be stated as follows.

Theorem 2 (The Skorokhod representation theorem). If X_n →_D X, then there exist on some probability space random variables X*_1, X*_2, ... and X* such that

  X*_n ~ X_n for each n,  X* ~ X,  and  X*_n →_a.e. X*.   (5)

This is a special case of a result due to the famous Russian probabilist A. V. Skorokhod. It has a number of useful corollaries, which we will turn to after developing some properties of almost everywhere convergence.

The following terminology is used in the next theorem. Let h be a mapping from R to R and let X be a real random variable. The continuity set of h is C_h := { x : h is continuous at x }; its complement D_h is called the discontinuity set of h. h is said to be X-continuous if P[X ∈ C_h] = 1, or, equivalently, P[X ∈ D_h] = 0.

Example 2. Suppose h = I_{(−∞,x]}, the indicator of the interval (−∞,x] [the original shows its graph: a step dropping from 1 to 0 just to the right of x]. Then D_h = {x}, so

  h is X-continuous ⟺ 0 = P[X ∈ D_h] = P[X = x] ⟺ x ∈ C_F,

F being the distribution function of X.

Theorem 3 (The mapping theorem for almost everywhere convergence). Let X_1, X_2, ... and X be real random variables, all defined on a common probability space Ω. If X_n →_a.e. X as n → ∞ and h: R → R is X-continuous, then h(X_n) →_a.e. h(X).

Proof. Put Y_n = h(X_n) and Y = h(X). If ω is a sample point such that X_n(ω) → X(ω) and X(ω) ∈ C_h, then Y_n(ω) = h(X_n(ω)) → h(X(ω)) = Y(ω). Hence

  { ω : Y_n(ω) ↛ Y(ω) } ⊂ { ω : X_n(ω) ↛ X(ω) } ∪ { ω : X(ω) ∉ C_h }.

Since the two events in the union on the right each have probability 0, so does the event on the left. ∎

Theorem 4. X_n →_a.e. X ⟹ X_n →_D X.
Proof. Let F_n and F be the distribution functions of X_n and X respectively, and let x ∈ C_F. Since h := I_{(−∞,x]} is X-continuous (see Example 2), we have Y_n := h(X_n) →_a.e. Y := h(X). Since |Y_n| ≤ 1 and E(1) = 1 < ∞, the DCT implies that E(Y_n) → E(Y). But E(Y) = E(I_{(−∞,x]}(X)) = P[X ≤ x] = F(x), and similarly E(Y_n) = F_n(x). Thus lim_n F_n(x) = F(x), as required. ∎

The converse doesn't hold: in general, X_n →_D X does not imply X_n →_a.e. X. For example, suppose Y is a random variable taking just the two values ±1, with probability 1/2 each. Set X_n = (−1)^n Y for n = 1, 2, .... Then X_n ~ Y for each n, so X_n →_D Y. But X_n ↛_a.e. Y, since for each sample point ω, X_n(ω) alternates between +1 and −1, and hence doesn't converge at all, let alone to Y(ω).
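A quick simulation of this counterexample (illustrative code; the sample sizes are arbitrary):

```python
import random

random.seed(1)
ys = [random.choice([-1.0, 1.0]) for _ in range(100_000)]  # Y = +-1, prob 1/2 each

xs_even = ys                      # n even: X_n(w) = Y(w)
xs_odd = [-y for y in ys]         # n odd:  X_n(w) = -Y(w)

# Same two-point distribution for every n, so X_n ->_D Y trivially ...
frac_even = sum(1 for x in xs_even if x == 1.0) / len(xs_even)
frac_odd = sum(1 for x in xs_odd if x == 1.0) / len(xs_odd)
assert abs(frac_even - 0.5) < 0.01 and abs(frac_odd - 0.5) < 0.01

# ... but every sample path alternates between +1 and -1, so none converges.
assert all(a == -b for a, b in zip(xs_even, xs_odd))
```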
Theorem 5 (The mapping theorem for convergence in distribution). Suppose X_n →_D X and h: R → R is X-continuous. Then h(X_n) →_D h(X).

Proof. Using the Skorokhod representation, construct random variables X*_1, X*_2, ... and X* such that X*_n ~ X_n for each n, X* ~ X, and X*_n →_a.e. X*. By Theorem 3, Y*_n := h(X*_n) →_a.e. Y* := h(X*), and hence Y*_n →_D Y* by Theorem 4. The claim follows, since Y*_n ~ h(X_n) for each n and Y* ~ h(X). ∎

Example 3. Suppose X_n →_D Z ~ N(0,1). Then X_n² →_D Z² ~ χ²_1. This follows from Theorem 5, by taking h(x) = x²; h is X-continuous since it is continuous (D_h = ∅ ⟹ P[X ∈ D_h] = 0).

Theorem 6 (The Δ-method). Suppose of random variables X_1, X_2, ... that

  c_n (X_n − x₀) →_D Y   (6)

as n → ∞, for some random variable Y, some number x₀ ∈ R, and some sequence of positive numbers c_n tending to ∞. Suppose also that g: R → R is differentiable at x₀. Then

  c_n ( g(X_n) − g(x₀) ) →_D g′(x₀) Y.   (7)

I'll give the formal argument in a moment; here is a heuristic one. Since g is differentiable at x₀, one has

  g(x₀ + Δ) ≈ g(x₀) + g′(x₀) Δ   (8)

for all small numbers Δ. Put Δ_n = X_n − x₀ and Y_n = c_n Δ_n. Since Y_n →_D Y and c_n → ∞, for large n it is highly likely that Δ_n = Y_n / c_n will be close to 0, and hence by (8) that

  c_n [ g(X_n) − g(x₀) ] = c_n [ g(x₀ + Δ_n) − g(x₀) ]

will be close to c_n Δ_n g′(x₀) = g′(x₀) Y_n, which will be distributed almost like g′(x₀) Y. This method of argument is called the Δ-method because of the symbol Δ in the seminal approximation (8).

Proof of Theorem 6. By assumption Y_n := c_n (X_n − x₀) →_D Y. Using the Skorokhod representation theorem, concoct random variables Y*_1, Y*_2, ... and Y* having the same marginal distributions as Y_1, Y_2, ... and Y respectively, but with Y*_n →_a.e. Y*. Since X_n = x₀ + Y_n / c_n, it is natural to introduce X*_n := x₀ + Y*_n / c_n; note that X*_n ~ X_n for each n and X*_n →_a.e. x₀ as n → ∞. Now write

  G*_n := c_n ( g(X*_n) − g(x₀) ) = c_n ( X*_n − x₀ ) · ( g(X*_n) − g(x₀) ) / ( X*_n − x₀ ),

with the difference quotient taken to be g′(x₀) if X*_n = x₀. Let n → ∞ to get

  G*_n → G* := Y* g′(x₀)

almost everywhere, and so also in distribution.
This implies

  G_n := c_n ( g(X_n) − g(x₀) ) →_D G := Y g′(x₀),

since G_n ~ G*_n and G ~ G*. ∎

The expectation characterization of convergence in distribution. The following terminology is used in the next theorem. A function f: R → R is said to be Lipschitz continuous if

  ‖f‖_L := sup{ |f(y) − f(x)| / |y − x| : −∞ < x < y < ∞ } < ∞;   (9)

the quantity ‖f‖_L is called the Lipschitz norm of f.

Example 4. The absolute value function f(x) = |x| is Lipschitz continuous with norm 1, since | |y| − |x| | ≤ |y − x| by the triangle inequality. In contrast, the function g(x) = |x|^{1/2} is not Lipschitz continuous, since

  lim_{y↓0} |g(y) − g(0)| / |y − 0| = lim_{y↓0} y^{−1/2} = ∞.

[The original shows the graphs of f(x) = |x| and g(x) = |x|^{1/2}.]

The following result states that X_n →_D X iff Ef(X_n) → Ef(X) for all sufficiently smooth bounded functions f. To save space, we write "bd" for "bounded" in the statement of the theorem.
Theorem 7. Let X_1, X_2, ..., X be real random variables. The following are equivalent.

(a) X_n →_D X.
(b) Ef(X_n) → Ef(X) for all bd X-continuous f: R → R.
(c) Ef(X_n) → Ef(X) for all bd continuous f: R → R.
(d) Ef(X_n) → Ef(X) for all bd Lipschitz continuous f: R → R.

Proof. (a) ⟹ (b): This is a rehash of some old ideas. Using the Skorokhod representation, construct random variables X*_1, X*_2, ... and X* such that X*_n ~ X_n for each n, X* ~ X, and X*_n →_a.e. X*. Since f is X-continuous, Y*_n := f(X*_n) →_a.e. Y* := f(X*). Since f is bounded, the DCT implies that EY*_n → EY*. Now use Ef(X_n) = Ef(X*_n) = EY*_n and Ef(X) = Ef(X*) = EY*.

(b) ⟹ (c) ⟹ (d): This is obvious.

(d) ⟹ (a): Let F_n and F be the distribution functions of X_n and X, respectively. We need to show lim_n F_n(x) = F(x) for each x ∈ C_F. Fix such an x. For positive integers k, let f_k: R → R be the function which equals 1 on (−∞, x], equals 0 on [x + 1/k, ∞), and decreases linearly from 1 to 0 over [x, x + 1/k]. Since I_{(−∞,x]} ≤ f_k, we have

  F_n(x) = E(I_{(−∞,x]}(X_n)) ≤ E(f_k(X_n))

for each n, and since f_k is bounded and Lipschitz continuous, (d) implies

  lim sup_n F_n(x) ≤ lim sup_n E(f_k(X_n)) = E(f_k(X)).

Since f_k(X) → I_{(−∞,x]}(X) as k → ∞, with the convergence being dominated by 1, the DCT implies lim_k E(f_k(X)) = E(I_{(−∞,x]}(X)) = F(x). The upshot is lim sup_n F_n(x) ≤ F(x). A similar argument shows that

  F(x−) = E(I_{(−∞,x)}(X)) ≤ lim inf_n F_n(x).

Since x ∈ C_F ⟹ F(x) = F(x−), we have lim_n F_n(x) = F(x), as required. ∎

Example 5. (A) Let X_n be a random variable taking the two values 0 and n, with probabilities 1 − 1/n and 1/n respectively. Then X_n →_D X, where X = 0. Let f: R → R be the identity function: f(x) = x for all x. f is Lipschitz continuous, but

  Ef(X_n) = E(X_n) = 0·(1 − 1/n) + n·(1/n) = 1 ↛ 0 = E(X) = Ef(X).

This doesn't contradict Theorem 7, since f isn't bounded.

(B) Let t ∈ R and consider the functions f_t: R → R and g_t: R → R defined by f_t(x) = cos(tx) and g_t(x) = sin(tx). f_t and g_t are continuous and bounded, so X_n →_D X implies

  E(e^{itX_n}) = E(cos(tX_n)) + iE(sin(tX_n)) → E(cos(tX)) + iE(sin(tX)) = E(e^{itX}).
That is, the characteristic function of X_n converges pointwise to the characteristic function of X. We will see later on that the converse is true too.

Convergence of densities. Suppose X_n →_D X and that X_n and X have densities f_n and f respectively. Does f_n converge to f in any nice way? In general, the answer is no.
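As a concrete taste of what goes wrong, take the oscillating densities f_n(x) = 1 + sin(2πnx) on [0,1] against the uniform density f = 1 (this is the example treated next): the L¹ distance between f_n and f stays at 2/π for every n. A small numerical sketch:

```python
import math

def l1_dist(n, grid=200_000):
    """Midpoint-rule value of the integral of |f_n - f| = |sin(2 pi n x)| over [0, 1]."""
    h = 1.0 / grid
    return h * sum(abs(math.sin(2 * math.pi * n * (i + 0.5) * h)) for i in range(grid))

# The L1 distance never shrinks: it equals 2/pi for every n (cf. Exercise 22(b)).
for n in (1, 5, 50):
    assert abs(l1_dist(n) - 2 / math.pi) < 1e-3
```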
Example 6. Let X_n and X be random variables taking values in [0,1], with densities

  f_n(x) = 1 + sin(2πnx)  and  f(x) = 1  for 0 ≤ x ≤ 1.

[The original graphs f_n and f over [0,1].] An easy calculation shows that the corresponding distribution functions satisfy F_n(x) → F(x) for all x ∈ R, so X_n →_D X. However, as n → ∞, f_n oscillates ever more wildly, and doesn't converge to f in any nice way; see Exercise 22.

There is, however, a positive result in the opposite direction: convergence of densities implies convergence in distribution. This fact is a corollary of the next theorem, wherein B denotes the collection of Borel subsets of R.

Theorem 8. Let X_1, X_2, ... and X be real random variables having corresponding densities f_1, f_2, ..., and f with respect to some measure µ. If lim_n f_n(x) = f(x) for µ-almost all x, then

  lim_n ∫ |f_n(x) − f(x)| µ(dx) = 0.   (10)

If (10) holds, then

  lim_n sup{ |P[X_n ∈ B] − P[X ∈ B]| : B ∈ B } = 0.   (11)

(11) is a much stronger assertion than X_n →_D X; when it holds, one says that the distribution of X_n converges strongly to the distribution of X, and writes X_n →_s X.

Proof of Theorem 8. (10) holds: Since |f_n − f| ≤ U_n := f_n + f → U := 2f and ∫ U_n dµ = ∫ f_n dµ + ∫ f dµ = 2 → ∫ U dµ, we have ∫ |f_n − f| dµ → 0 by the Sandwich Theorem (Theorem 1.5).

(10) ⟹ (11): For each Borel subset B of R, one has

  |P[X_n ∈ B] − P[X ∈ B]| = | ∫_B f_n(x) µ(dx) − ∫_B f(x) µ(dx) |
    ≤ ∫_B |f_n(x) − f(x)| µ(dx) ≤ ∫ |f_n(x) − f(x)| µ(dx). ∎

Example 7. Suppose Y_r is a Gamma random variable with shape parameter r and scale parameter 1. Since Y_r has mean r and variance r, the standardized variable

  X_r := (Y_r − r)/√r

has mean 0 and variance 1. We are going to show that X_r converges strongly to a standard normal random variable as r → ∞. By the change of variable formula, X_r has density

  f_r(x) = f_{Y_r}(r + √r x) √r = (√r / Γ(r)) (r + √r x)^{r−1} e^{−(r + √r x)} I_{(0,∞)}(r + √r x).

f_r(x) is graphed versus x for r = 1, 2, 4, and 8 on the next page; the plot also shows the standard normal density for comparison.
By Stirling's formula,

  Γ(r) = Γ(r+1)/r = (1 + o(1)) √(2πr) r^r e^{−r} / r = (1 + o(1)) √(2πr) r^{r−1} e^{−r},   (12)

where o(1) denotes a quantity that tends to 0 as r → ∞. Fix x. For all large r we have r + √r x > 0, so

  f_r(x) = (1/√(2π)) (1 + x/√r)^{r−1} e^{−√r x} (1 + o(1)).

Since log(1 + u) = u − u²/2 + u³/3 − ⋯ for |u| < 1,

  log(f_r(x)) = C + (r−1) log(1 + x/√r) − √r x + o(1)
    = C + (r−1) [ x/√r − x²/(2r) + O(1/r^{3/2}) ] − √r x + o(1)
    = C − x²/2 + o(1),

where C = −log(√(2π)) and O(1/r^{3/2}) denotes a quantity q_r such that r^{3/2} q_r remains bounded as r → ∞. Consequently lim_{r→∞} f_r(x) = exp(−x²/2)/√(2π). By Theorem 8, X_r →_s Z ~ N(0,1). Exercise 25 rearranges this argument into a proof of (12).
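The pointwise convergence just derived can be checked numerically; the sketch below evaluates the exact density f_r via the log-gamma function (the choice of test points and tolerance is arbitrary):

```python
import math

def f_r(x, r):
    """Density of X_r = (Y_r - r)/sqrt(r), Y_r ~ Gamma(r, 1), by the change of variables formula."""
    y = r + math.sqrt(r) * x
    if y <= 0:
        return 0.0
    # log of sqrt(r) * y^(r-1) * exp(-y) / Gamma(r), computed stably via lgamma
    log_f = 0.5 * math.log(r) + (r - 1) * math.log(y) - y - math.lgamma(r)
    return math.exp(log_f)

def std_normal(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Pointwise convergence of f_r to the standard normal density as r grows.
for x in (-1.0, 0.0, 1.5):
    assert abs(f_r(x, 10**6) - std_normal(x)) < 1e-3
```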
[Figure 1. Density of (Gamma(r) − r)/√r plotted for −3 ≤ x ≤ 3 and r = 1, 2, 4, 8, ∞; the r = ∞ curve is the standard normal density.]

Characteristic functions. We are now going to give some criteria for convergence in terms of characteristic functions.

Theorem 9 (The continuity theorem for densities). Let X_1, X_2, ..., and X be real random variables with corresponding integrable characteristic functions φ_1, φ_2, ..., and φ. Suppose that φ_n converges to φ in L¹, in the sense that

  ‖φ_n − φ‖_1 := ∫ |φ_n(t) − φ(t)| dt → 0   (13)

as n → ∞. Then X_1, X_2, ..., and X have bounded continuous densities f_1, f_2, ..., and f with respect to Lebesgue measure on R. As n → ∞, f_n converges to f both uniformly and in L¹:

  ‖f_n − f‖_∞ := sup{ |f_n(x) − f(x)| : x ∈ R } → 0,   (14)
  ‖f_n − f‖_1 := ∫ |f_n(x) − f(x)| dx → 0.   (15)

In particular, X_n →_s X.

Proof. By the inversion theorem, X_n and X have the bounded continuous densities

  f_n(x) = (1/(2π)) ∫ e^{−itx} φ_n(t) dt  and  f(x) = (1/(2π)) ∫ e^{−itx} φ(t) dt

with respect to Lebesgue measure. (14) follows from

  2π |f_n(x) − f(x)| = | ∫ e^{−itx} φ_n(t) dt − ∫ e^{−itx} φ(t) dt |
    = | ∫ e^{−itx} ( φ_n(t) − φ(t) ) dt | ≤ ∫ |φ_n(t) − φ(t)| dt.

(15) follows from (14) and Theorem 8 with µ(dx) = dx. ∎

Now we come to the crucially important
Theorem 10 (The continuity theorem for distribution functions). Let X_1, X_2, ... and X be real random variables with corresponding characteristic functions φ_1, φ_2, ... and φ. Then

  X_n →_D X ⟺ φ_n(t) → φ(t) for all t ∈ R.   (16)

Proof. (⟹) This was done previously; see Example 5(B).

(⟸) Let Z ~ N(0,1), independently of the X_n's and X. Let k be a (large) positive integer and set σ_k = 1/k, Y_{n,k} = X_n + σ_k Z, and Y_k = X + σ_k Z. I claim that Y_{n,k} →_D Y_k as n → ∞. To see this, note that Y_{n,k} and Y_k have characteristic functions

  ψ_{n,k}(t) = E(e^{itY_{n,k}}) = φ_n(t) e^{−σ_k² t²/2}  and  ψ_k(t) = E(e^{itY_k}) = φ(t) e^{−σ_k² t²/2}.

ψ_{n,k} and ψ_k are integrable, and

  ∫ |ψ_{n,k}(t) − ψ_k(t)| dt ≤ ∫ |φ_n(t) − φ(t)| e^{−σ_k² t²/2} dt

tends to 0 as n → ∞, by virtue of the DCT. Theorem 9 implies that Y_{n,k} converges strongly to Y_k, and hence also in distribution.

We have just shown that for large n, the smoothing Y_{n,k} of X_n has almost the same distribution as the smoothing Y_k of X. The idea now is to show that when k is sufficiently large (and hence the smoothing parameter σ_k is sufficiently small), Y_{n,k} has nearly the same distribution as X_n, and Y_k nearly the same distribution as X; the upshot is that X_n will have nearly the same distribution as X.

To make this precise, let g: R → R be bounded and Lipschitz continuous. I claim that

  E(g(X_n)) → E(g(X))   (17)

as n → ∞; by criterion (d) of Theorem 7, this is enough to show that X_n →_D X.

To prove (17), note that by the triangle inequality

  |Eg(X_n) − Eg(X)| ≤ I_{n,k} + II_{n,k} + III_k,

where

  I_{n,k} = |Eg(X_n) − Eg(Y_{n,k})|,
  II_{n,k} = |Eg(Y_{n,k}) − Eg(Y_k)|,
  III_k = |Eg(Y_k) − Eg(X)|.

Letting L denote the Lipschitz norm of g, we have |g(X_n) − g(Y_{n,k})| ≤ L |X_n − Y_{n,k}| = L σ_k |Z|, since Y_{n,k} = X_n + σ_k Z. Thus

  I_{n,k} ≤ E( |g(X_n) − g(Y_{n,k})| ) ≤ L σ_k E(|Z|).

The same bound applies to III_k, for similar reasons. Thus

  |Eg(X_n) − Eg(X)| ≤ 2 L σ_k E(|Z|) + II_{n,k}.

This holds for all n and all k.
For each k, lim_n II_{n,k} = 0, since g is continuous and bounded and Y_{n,k} →_D Y_k. (17) follows by taking limits, first as n → ∞ and then as k → ∞. ∎

Remark. With more work, an even better result can be established. Suppose X_1, X_2, ... are random variables with corresponding characteristic functions φ_1, φ_2, .... If φ(t) := lim_n φ_n(t) exists for each t ∈ R and φ is continuous at 0, then φ is the characteristic function of some random variable X, and X_n →_D X.
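The quantitative step in the proof, namely that a σ-smoothing changes Eg by at most Lσ E|Z| for Lipschitz g, can be checked by simulation; the choices of g and of the law of X below are arbitrary:

```python
import math
import random

random.seed(3)
g = math.tanh                      # Lipschitz with norm L = 1
sigma = 0.05                       # smoothing parameter sigma_k = 1/k with k = 20

xs = [random.expovariate(1.0) for _ in range(50_000)]    # X (arbitrary law)
zs = [random.gauss(0.0, 1.0) for _ in range(50_000)]     # independent Z
ys = [x + sigma * z for x, z in zip(xs, zs)]             # Y = X + sigma * Z

diff = abs(sum(map(g, xs)) - sum(map(g, ys))) / len(xs)
bound = sigma * sum(abs(z) for z in zs) / len(zs)        # sigma * (empirical E|Z|)

# |E g(X) - E g(X + sigma Z)| <= L * sigma * E|Z| holds pathwise, hence on average.
assert diff <= bound
```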
Example 8. Let φ_n be the characteristic function of a random variable X_n having a binomial distribution with parameters n and p_n. If np_n → λ ∈ [0, ∞) as n → ∞, then for each t ∈ R

  φ_n(t) = ( p_n e^{it} + (1 − p_n) )^n = ( 1 + np_n(e^{it} − 1)/n )^n → φ(t) := exp( λ(e^{it} − 1) ),

since for complex numbers z_n → z

  (1 + z_n/n)^n → e^z   (18)

(see below). Since φ is the characteristic function of a random variable X with a Poisson distribution with parameter λ, this proves that X_n →_D X. We need to establish (18).

Theorem 11. (18) holds.

Proof. For complex numbers ζ satisfying |ζ| < 1, the so-called principal value of log(1 + ζ) has the power series expansion

  log(1 + ζ) = Σ_{n=1}^∞ (−1)^{n−1} ζ^n / n = ζ − ζ²/2 + ζ³/3 − ζ⁴/4 + ⋯.   (19)

Consequently, for |ζ| ≤ 1/2,

  |log(1 + ζ) − ζ| ≤ |ζ|²/2 + |ζ|³/3 + |ζ|⁴/4 + ⋯
    ≤ (|ζ|²/2)(1 + |ζ| + |ζ|² + ⋯) = (|ζ|²/2) · 1/(1 − |ζ|) ≤ |ζ|².   (20)

Now suppose z_n → z in C. Since z_n/n → 0, (20) implies that

  n log(1 + z_n/n) = n ( z_n/n + O(1/n²) ) = z_n + O(1/n) → z,

so (1 + z_n/n)^n = exp( n log(1 + z_n/n) ) → exp(z). ∎

Generalization to random vectors. All of the results of this section carry over to random vectors taking values in R^k for some integer k. The proofs are straightforward generalizations of the ones for k = 1, except for the Skorokhod Representation Theorem, which can no longer be proved using inverse distribution functions.

Exercise 1. For n = 1, 2, ..., let X_n be a random variable uniformly distributed over the n + 1 points k/n for k = 0, 1, ..., n. Show that as n → ∞, X_n →_D X, where X ~ Uniform(0, 1).

Exercise 2. For n = 1, 2, ..., let G_n be a random variable with a Geometric distribution. Show that if µ_n := E(G_n) → ∞, then X_n := G_n/µ_n →_D X, where X is a standard exponential random variable.

Exercise 3. Let Y_1, Y_2, ... be iid standard exponential random variables. For each n, set X_n = max(Y_1, Y_2, ..., Y_n). Show that as n → ∞, X_n − log(n) →_D X, where X is a random variable with distribution function F(x) = exp(−exp(−x)) for −∞ < x < ∞.
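Example 8 can be verified numerically, since both characteristic functions are available in closed form (λ, n, and the tolerance below are arbitrary choices):

```python
import cmath

def phi_binomial(t, n, p):
    """Characteristic function of Binomial(n, p)."""
    return (p * cmath.exp(1j * t) + (1 - p)) ** n

def phi_poisson(t, lam):
    """Characteristic function of Poisson(lam)."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

lam, n = 3.0, 10**5
# With p_n = lam/n, the binomial characteristic function approaches the Poisson one.
for t in (0.5, 1.0, 2.0):
    assert abs(phi_binomial(t, n, lam / n) - phi_poisson(t, lam)) < 1e-3
```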
In the next 4 exercises, X_1, X_2, ..., and X are real random variables with corresponding distribution functions F_1, F_2, ..., and F.

Exercise 4. Show that X_n →_D X if and only if there exists a dense subset D of R such that F_n(x) → F(x) for all x ∈ D.

Exercise 5. Suppose X_n →_D X. Show that lim_{c→∞} sup_n P[ |X_n| ≥ c ] = 0.

Exercise 6. Show that X_n →_D X if and only if

  lim_n ( F_n(b) − F_n(a) ) = F(b) − F(a)   (21)

for all a ∈ C_F and b ∈ C_F with a < b. [Hint: for the ⟸ direction, first show that (21) holds with a = −∞.]
Exercise 7. Suppose that F is continuous. Show that X_n →_D X implies that lim_n sup{ |F_n(x) − F(x)| : x ∈ R } = 0.

Let X_1, X_2, ..., and X be real random variables. One says X_n converges to X in probability, and writes X_n →_P X, if

  lim_n P[ |X_n − X| ≥ ε ] = 0 for all ε > 0.   (22)

Exercise 8. Suppose X_n →_P 0 and there exists a finite constant c such that |X_n| ≤ c for all n. Show that lim_n E|X_n| = 0.

Exercise 9. Suppose X_n →_D X and Y_n − X_n →_P 0. Show that Y_n →_D X. [Hint: use criterion (d) of Theorem 7 in conjunction with the previous exercise.]

Exercise 10. (a) Show that X_n →_P X ⟹ X_n →_D X. [Hint: use the preceding exercise.] (b) Show by example that, in general, X_n →_D X does not imply X_n →_P X.

Exercise 11. Show that if X is constant (with probability one), then X_n →_D X ⟹ X_n →_P X.

Exercise 12. Suppose X_1(ω) ≥ X_2(ω) ≥ ⋯ ≥ 0 for each sample point ω. Show that X_n →_a.e. 0 ⟺ X_n →_P 0. [Hint: if A_1, A_2, ..., and A are events such that A_1 ⊂ A_2 ⊂ ⋯ ⊂ A_n ⊂ ⋯ and A = ∪_{n=1}^∞ A_n, then P[A] = lim_n P[A_n]; you do not have to prove this fact from measure theory.]

Exercise 13. Show that X_n →_a.e. X ⟺ sup_{p≥n} |X_p − X| →_P 0 as n → ∞. [Hint: use the preceding exercise.]

Exercise 14. Suppose S_1, S_2, ... are nonnegative random variables such that (S_n − nµ)/(σ√n) →_D Z, for some positive finite numbers µ and σ and some random variable Z. Show that

  √(S_n) − √(nµ) →_D σZ/(2√µ).   (23)
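The conclusion of Exercise 14 is easy to probe by simulation. Below S_n is a sum of n standard exponentials, so µ = σ = 1, Z is standard normal by the CLT, and √S_n − √n should be approximately N(0, 1/4); the sample sizes are arbitrary:

```python
import math
import random
import statistics

random.seed(4)

def sqrt_centered(n):
    s_n = sum(random.expovariate(1.0) for _ in range(n))   # S_n: mean n, variance n
    return math.sqrt(s_n) - math.sqrt(n)

vals = [sqrt_centered(2_000) for _ in range(2_000)]
# Limit is sigma*Z/(2*sqrt(mu)) = Z/2: mean ~ 0, standard deviation ~ 1/2.
assert abs(statistics.mean(vals)) < 0.05
assert abs(statistics.stdev(vals) - 0.5) < 0.05
```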
Exercise 15 (The k-th order Δ-method). As in Theorem 6, suppose X_1, X_2, ... are random variables such that c_n(X_n − x₀) →_D Y as n → ∞, for some random variable Y, some number x₀, and some positive numbers c_n tending to ∞. Let k be a positive integer and let g: R → R be a function such that

  g(x₀ + Δ) − g(x₀) = γΔ^k + o(Δ^k)   (24)

as Δ → 0. Show that as n → ∞,

  c_n^k ( g(X_n) − g(x₀) ) →_D γ Y^k.   (25)

Exercise 16. Let f: [0,1] → R be a bounded U-continuous function, where U ~ Uniform[0,1]. Show that (1/n) Σ_{k=0}^n f(k/n) → Ef(U) as n → ∞. What does this say about Riemann integration? [Hint: use Theorem 7 and the result of Exercise 1.]

The boundary ∂B of a subset B of R is the set of points x for which there exists an infinite sequence y_1, y_2, ... of points in B converging to x, and also an infinite sequence z_1, z_2, ... of points not in B converging to x; y_n = x or z_n = x for some n's is allowed. For example, the boundary of the interval B = (a, b] is ∂B = {a, b}.

Exercise 17. Suppose B is a (Borel) subset of R. Show that the discontinuity set of the indicator function I_B equals ∂B. Use Theorem 7 to show that if X_n →_D X and P[X ∈ ∂B] = 0, then P[X_n ∈ B] → P[X ∈ B]. How does this relate to (11)?

Exercise 18. Suppose that X_n →_D X and there exists a strictly positive number ε such that sup_n E(|X_n|^{1+ε}) < ∞. Show that: (a) E(|X|^{1+ε}) < ∞; (b) the X_n's and X are integrable; and (c) E(X_n) → E(X) as n → ∞. [Hint: if P[X = ±c] = 0, then the function f_c: R → R defined by f_c(x) = x I_{[−c,c]}(x) is X-continuous (why?). Moreover, for any random variable Y, one has E(|Y| I_{{|Y| ≥ c}}) ≤ E(|Y|^{1+ε})/c^ε (why?).]
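Exercise 16 is easy to try out for a specific f; here f(x) = x², an arbitrary bounded continuous choice on [0,1], for which Ef(U) = 1/3:

```python
def riemann_mean(f, n):
    """(1/n) * sum of f(k/n) over k = 0..n, as in Exercise 16."""
    return sum(f(k / n) for k in range(n + 1)) / n

f = lambda x: x * x
# E f(U) = integral of x^2 over [0, 1] = 1/3.
assert abs(riemann_mean(f, 10**5) - 1 / 3) < 1e-4
```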
Exercise 19. Suppose X_n →_D X. Let M_n and M be the moment generating functions of X_n and X respectively. Show that if sup_n M_n(u) < ∞ for some number u > 0, then lim_n M_n(t) = M(t) < ∞ for all 0 ≤ t < u. What happens if u < 0? [Hint: use the result of the preceding exercise.]

Exercise 20. Let X_n and X be as in Exercise 1. Show that X_n does not converge strongly to X; specifically, show that for B = { x ∈ [0,1] : x is irrational }, one has P[X_n ∈ B] = 0 for all n, whereas P[X ∈ B] = 1.

Exercise 21. Suppose ξ is an irrational number. For each positive integer n, let x_n = nξ mod 1. Show that the x_n's are dense in [0, 1). [Hint: first show that for each positive integer k, there exist integers i and j with 1 ≤ i ≤ k, 1 ≤ j ≤ k, i ≠ j, and (j − i)ξ mod 1 ≤ 1/k.]

Exercise 22. Let f_n and f be as in Example 6. (a) Show that lim sup_n |f_n(x) − f(x)| = 1 for each irrational number x ∈ [0,1]. [Hint: use the preceding exercise.] (b) Show that ∫₀¹ |f_n(x) − f(x)| dx = 2/π for all n.

Exercise 23. Suppose that P and Q are two probability distributions on R having densities p and q with respect to a measure µ. Show that

  sup{ |Q[B] − P[B]| : B ∈ B } = (1/2) ∫ |q(x) − p(x)| µ(dx);   (26)

here B denotes the collection of Borel subsets B of R. Deduce that (11) implies (10). [Hint: first show that ∫_{q>p} (q − p) dµ = ∫_{p>q} (p − q) dµ = (1/2) ∫ |q − p| dµ.]

Exercise 24. Suppose that X_n →_D X. Suppose further that X_n and X have densities f_n and f with respect to a measure µ on R and that there exists a µ-integrable function g such that f ≤ g and f_n ≤ g for all n. Show that P[X_n ∈ B] → P[X ∈ B] for each Borel set B (even if f_n does not converge µ-almost everywhere to f). What bearing does this have on Example 6? [Hint: Suppose B is a Borel subset of R and ε > 0. There exist finitely many disjoint subintervals (a_i, b_i] of R such that ∫_{A △ B} g(x) µ(dx) ≤ ε, where A = ∪_i (a_i, b_i]. You do not have to prove this fact from measure theory.]

Exercise 25 (Stirling's formula for the Gamma function). Recall that Γ(r + 1) = ∫₀^∞ y^r e^{−y} dy for r ∈ (0, ∞).
(a) Use the change of variables x = (y − r)/√r to show that

  ρ_r := Γ(r + 1) / ( r^{r+1/2} e^{−r} ) = ∫ e^{−ψ_r(x)} dx,

where ψ_r(x) = r ( φ(1 + x/√r) − φ(1) ) with φ(u) = u − log(u).

(b) Show that lim_{r→∞} ψ_r(x) = x²/2 for each x ∈ R, and that for all large r, ψ_r(x) ≥ |x| I_{[1,∞)}(|x|)/4 for all x.

(c) Deduce lim_{r→∞} ρ_r = ∫ exp(−x²/2) dx = √(2π). [Hints: φ(1 + u) − φ(1) = (1 + o(1)) u²/2 as u → 0, and φ is convex.]

Exercise 26. Let X_{α,β} be a random variable having a Beta distribution with parameters α and β. Show that if α stays fixed and β → ∞, then βX_{α,β} converges strongly to X ~ Gamma(α, 1).

Exercise 27. Let X_{α,β} be a random variable having a Beta distribution with parameters α + 1 and β + 1. Set µ_{α,β} = α/(α + β) and σ²_{α,β} = µ_{α,β}(1 − µ_{α,β})/(α + β). Show that if α and β tend to infinity in such a way that µ_{α,β} remains bounded away from 0 and 1, then (X_{α,β} − µ_{α,β})/σ_{α,β} converges strongly to Z ~ N(0,1).

Exercise 28. Let T_n be a random variable having a t distribution with n degrees of freedom. Show that as n → ∞, T_n →_s Z ~ N(0,1).

Exercise 29. Use Theorem 10 to solve Exercise 2.

Exercise 30. Let X take the values 1 and 0 with probabilities p and q := 1 − p respectively. Show that the characteristic function of Y := X − E(X) = X − p satisfies

  E(e^{itY}) = ( 1 + p(e^{it} − 1) ) e^{−ipt} = 1 − pqt²/2 + O(t³)   (27)

as t → 0.
Exercise 31. Suppose that X_n ~ Binomial(n, p) for n = 1, 2, ..., where 0 < p < 1. Use Theorem 10 to show that (X_n − np)/√(npq) →_D Z as n → ∞. [Hint: use (27).]

Exercise 32. Show that for complex numbers ζ satisfying |ζ| ≤ c < 1, the principal value of log(1 + ζ) satisfies

  | log(1 + ζ) − Σ_{j=1}^k (−1)^{j−1} ζ^j / j | ≤ ( |ζ|^{k+1} / (k + 1) ) · 1/(1 − c)   (28)

for each positive integer k.
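In the spirit of Exercise 31, the characteristic function of the standardized binomial can be compared with e^{−t²/2} directly, using (27) in closed form (the values of n, p, and the tolerance are arbitrary):

```python
import cmath
import math

def phi_std_binomial(t, n, p):
    """Characteristic function of (X_n - n*p)/sqrt(n*p*q), X_n ~ Binomial(n, p)."""
    q = 1 - p
    s = t / math.sqrt(n * p * q)
    # E exp(i s (X_n - n p)) = ((1 + p(e^{is} - 1)) e^{-i p s})^n, cf. (27)
    return ((1 + p * (cmath.exp(1j * s) - 1)) * cmath.exp(-1j * p * s)) ** n

# Pointwise convergence to the standard normal characteristic function.
for t in (0.5, 1.0, 2.0):
    assert abs(phi_std_binomial(t, 10**6, 0.3) - cmath.exp(-t * t / 2)) < 1e-2
```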