CHAPTER 3: LARGE SAMPLE THEORY



Introduction

Why large sample theory
- Studying small-sample properties is usually difficult and complicated.
- Large sample theory studies the limit behavior of a sequence of random variables, say $X_n$.
- Example: $\bar X_n \to \mu$ and $\sqrt{n}(\bar X_n - \mu) \to N(0, \sigma^2)$.

Modes of Convergence

Convergence almost surely
Definition 3.1 $X_n$ is said to converge almost surely to $X$, denoted $X_n \to_{a.s.} X$, if there exists a set $A \subset \Omega$ such that $P(A^c) = 0$ and, for each $\omega \in A$, $X_n(\omega) \to X(\omega)$ in the real space.

Equivalent condition
$\{\omega : X_n(\omega) \to X(\omega)\}^c = \bigcup_{\epsilon > 0} \bigcap_{n} \{\omega : \sup_{m \ge n} |X_m(\omega) - X(\omega)| > \epsilon\}$.
Hence $X_n \to_{a.s.} X$ iff $P(\sup_{m \ge n} |X_m - X| > \epsilon) \to 0$ for every $\epsilon > 0$.

Convergence in probability
Definition 3.2 $X_n$ is said to converge in probability to $X$, denoted $X_n \to_p X$, if for every $\epsilon > 0$, $P(|X_n - X| > \epsilon) \to 0$.

Convergence in moments/means
Definition 3.3 $X_n$ is said to converge in the $r$th mean to $X$, denoted $X_n \to_r X$, if $E[|X_n - X|^r] \to 0$ as $n \to \infty$ for $X_n, X \in L_r(P)$, where $X \in L_r(P)$ means $\int |X|^r dP < \infty$.

Convergence in distribution
Definition 3.4 $X_n$ is said to converge in distribution to $X$, denoted $X_n \to_d X$ or $F_n \to_d F$ (or $\mathcal{L}(X_n) \to \mathcal{L}(X)$, with $\mathcal{L}$ referring to the law or distribution), if the distribution functions $F_n$ and $F$ of $X_n$ and $X$ satisfy $F_n(x) \to F(x)$ as $n \to \infty$ for each continuity point $x$ of $F$.

Uniform integrability
Definition 3.5 A sequence of random variables $\{X_n\}$ is uniformly integrable if
$\lim_{\lambda \to \infty} \limsup_{n} E[|X_n| I(|X_n| \ge \lambda)] = 0$.

A note
Convergence almost surely and convergence in probability are the same as defined in measure theory. The two new definitions are
- convergence in the $r$th mean
- convergence in distribution

Convergence in distribution is very different from the others.
Example: consider the sequence $X, Y, X, Y, \ldots$, where $X$ and $Y$ are independent $N(0, 1)$ random variables; the sequence converges in distribution to $N(0, 1)$, but none of the other modes of convergence holds.
Convergence in distribution is important for asymptotic statistical inference.

Relationship among different modes
Theorem 3.1
A. If $X_n \to_{a.s.} X$, then $X_n \to_p X$.
B. If $X_n \to_p X$, then $X_{n_k} \to_{a.s.} X$ for some subsequence $\{X_{n_k}\}$.
C. If $X_n \to_r X$, then $X_n \to_p X$.
D. If $X_n \to_p X$ and $\{|X_n|^r\}$ is uniformly integrable, then $X_n \to_r X$.
E. If $X_n \to_p X$ and $\limsup_n E[|X_n|^r] \le E[|X|^r]$, then $X_n \to_r X$.

F. If $X_n \to_r X$, then $X_n \to_{r'} X$ for any $0 < r' \le r$.
G. If $X_n \to_p X$, then $X_n \to_d X$.
H. $X_n \to_p X$ if and only if for every subsequence $\{X_{n_k}\}$ there exists a further subsequence $\{X_{n_{k_l}}\}$ such that $X_{n_{k_l}} \to_{a.s.} X$.
I. If $X_n \to_d c$ for a constant $c$, then $X_n \to_p c$.

(insert figure here)

Proof
A and B follow from the results in measure theory.
Prove C. Markov's inequality: for any increasing function $g(\cdot)$ and random variable $Y$,
$P(|Y| > \epsilon) \le E[g(|Y|)]/g(\epsilon)$.
Hence $P(|X_n - X| > \epsilon) \le E[|X_n - X|^r]/\epsilon^r \to 0$.

Prove D. It is sufficient to show that for any subsequence of $\{X_n\}$, there exists a further subsequence $\{X_{n_k}\}$ such that $E[|X_{n_k} - X|^r] \to 0$. For any subsequence of $\{X_n\}$, from B, there exists a further subsequence $\{X_{n_k}\}$ such that $X_{n_k} \to_{a.s.} X$. For any $\epsilon > 0$, there exists $\lambda$ such that $\limsup_{n_k} E[|X_{n_k}|^r I(|X_{n_k}|^r \ge \lambda)] < \epsilon$. In particular, choose $\lambda$ such that $P(|X|^r = \lambda) = 0$; then
$|X_{n_k}|^r I(|X_{n_k}|^r \ge \lambda) \to_{a.s.} |X|^r I(|X|^r \ge \lambda)$.
By Fatou's lemma,
$E[|X|^r I(|X|^r \ge \lambda)] \le \limsup_{n_k} E[|X_{n_k}|^r I(|X_{n_k}|^r \ge \lambda)] < \epsilon$.

$E[|X_{n_k} - X|^r] \le E[|X_{n_k} - X|^r I(|X_{n_k}|^r < 2\lambda, |X|^r < 2\lambda)] + E[|X_{n_k} - X|^r I(|X_{n_k}|^r \ge 2\lambda \text{ or } |X|^r \ge 2\lambda)]$
$\le E[|X_{n_k} - X|^r I(|X_{n_k}|^r < 2\lambda, |X|^r < 2\lambda)] + 2^r E[(|X_{n_k}|^r + |X|^r) I(|X_{n_k}|^r \ge 2\lambda \text{ or } |X|^r \ge 2\lambda)]$,
where the last inequality follows from the inequality $(x + y)^r \le 2^r (\max(x, y))^r \le 2^r (x^r + y^r)$ for $x \ge 0$, $y \ge 0$. The first term converges to zero by the dominated convergence theorem. When $n_k$ is large, the second term is bounded by
$2^r \{E[|X_{n_k}|^r I(|X_{n_k}|^r \ge \lambda)] + E[|X|^r I(|X|^r \ge \lambda)]\} \le 2^{r+1} \epsilon$.
Hence $\limsup_{n_k} E[|X_{n_k} - X|^r] \le 2^{r+1} \epsilon$.

Prove E. It is sufficient to show that for any subsequence of $\{X_n\}$, there exists a further subsequence $\{X_{n_k}\}$ such that $E[|X_{n_k} - X|^r] \to 0$. For any subsequence of $\{X_n\}$, there exists a further subsequence $\{X_{n_k}\}$ such that $X_{n_k} \to_{a.s.} X$. Define
$Y_{n_k} = 2^r (|X_{n_k}|^r + |X|^r) - |X_{n_k} - X|^r \ge 0$.
By Fatou's lemma,
$\int \liminf_{n_k} Y_{n_k} \, dP \le \liminf_{n_k} \int Y_{n_k} \, dP$.
This is equivalent to
$2^{r+1} E[|X|^r] \le \liminf_{n_k} \{2^r E[|X_{n_k}|^r] + 2^r E[|X|^r] - E[|X_{n_k} - X|^r]\}$.
Combined with $\limsup_{n_k} E[|X_{n_k}|^r] \le E[|X|^r]$, this gives $\limsup_{n_k} E[|X_{n_k} - X|^r] \le 0$, i.e., $E[|X_{n_k} - X|^r] \to 0$.

Prove F. The Hölder inequality:
$\int |f(x) g(x)| \, d\mu(x) \le \left\{\int |f(x)|^p \, d\mu(x)\right\}^{1/p} \left\{\int |g(x)|^q \, d\mu(x)\right\}^{1/q}$, $\frac{1}{p} + \frac{1}{q} = 1$.
Choose $\mu = P$, $f = |X_n - X|^{r'}$, $g \equiv 1$, $p = r/r'$ and $q = r/(r - r')$ in the Hölder inequality:
$E[|X_n - X|^{r'}] \le E[|X_n - X|^r]^{r'/r} \to 0$.

Prove G. Suppose $X_n \to_p X$. If $P(X = x) = 0$, then for any $\epsilon > 0$ and $\delta > 0$,
$P(|I(X_n \le x) - I(X \le x)| > \epsilon)$
$= P(|I(X_n \le x) - I(X \le x)| > \epsilon, |X - x| > \delta) + P(|I(X_n \le x) - I(X \le x)| > \epsilon, |X - x| \le \delta)$
$\le P(X_n \le x, X > x + \delta) + P(X_n > x, X < x - \delta) + P(|X - x| \le \delta)$
$\le P(|X_n - X| > \delta) + P(|X - x| \le \delta)$.
The first term converges to zero since $X_n \to_p X$. The second term can be made arbitrarily small by taking $\delta$ small, since $\lim_{\delta \to 0} P(|X - x| \le \delta) = P(X = x) = 0$. Hence $I(X_n \le x) \to_p I(X \le x)$, and
$F_n(x) = E[I(X_n \le x)] \to E[I(X \le x)] = F(x)$.

Prove H. One direction follows from B. To prove the other direction, argue by contradiction. Suppose there exists $\epsilon > 0$ such that $P(|X_n - X| > \epsilon)$ does not converge to zero. Then we can find a subsequence $\{X_{n'}\}$ such that $P(|X_{n'} - X| > \epsilon) > \delta$ for some $\delta > 0$. However, by the condition, there exists a further subsequence $\{X_{n''}\}$ such that $X_{n''} \to_{a.s.} X$; then $X_{n''} \to_p X$ from A. Contradiction!

Prove I. Let $X \equiv c$. Then
$P(|X_n - c| > \epsilon) \le 1 - F_n(c + \epsilon) + F_n(c - \epsilon) \to 1 - F_X(c + \epsilon) + F_X(c - \epsilon) = 0$.

Some counter-examples
(Example 1) Suppose that $X_n$ is degenerate at the point $1/n$; i.e., $P(X_n = 1/n) = 1$. Then $X_n$ converges in distribution to zero. Indeed, $X_n$ also converges almost surely.

(Example 2) $X_1, X_2, \ldots$ are i.i.d. with the standard normal distribution. Then $X_n \to_d X_1$, but $X_n$ does not converge in probability to $X_1$.

(Example 3) Let $Z$ be a random variable with a uniform distribution on $[0, 1]$. Let $X_n = I(m 2^{-k} \le Z < (m + 1) 2^{-k})$ when $n = 2^k + m$, where $0 \le m < 2^k$. Then $X_n$ converges in probability to zero but not almost surely. This example was already given in the second chapter.

(Example 4) Let $Z$ be Uniform(0, 1) and let $X_n = 2^n I(0 \le Z < 1/n)$. Then $E[|X_n|^r] \to \infty$ for every $r > 0$, but $X_n$ converges to zero almost surely.
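A minimal numerical sketch of Example 4 (not part of the original slides; it assumes Python with NumPy, and the sample size and seed are arbitrary choices): the sample mean of $X_n$ tracks $E[X_n] = 2^n/n \to \infty$, while the fraction of sample points with $X_n \ne 0$ is $1/n \to 0$, consistent with $X_n \to_{a.s.} 0$.

import numpy as np

rng = np.random.default_rng(0)
Z = rng.uniform(size=200_000)
for n in [5, 10, 20]:
    X_n = np.where(Z < 1.0 / n, 2.0 ** n, 0.0)
    # E[X_n] = 2^n / n blows up although X_n -> 0 almost surely
    print(n, X_n.mean(), 2.0 ** n / n, (X_n != 0).mean())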

Result for convergence in rth mean
Theorem 3.2 (Vitali's theorem) Suppose that $X_n \in L_r(P)$, i.e., $E[|X_n|^r] < \infty$, where $0 < r < \infty$, and $X_n \to_p X$. Then the following are equivalent:
A. $\{|X_n|^r\}$ is uniformly integrable.
B. $X_n \to_r X$.
C. $E[|X_n|^r] \to E[|X|^r]$.

One sufficient condition for uniform integrability
Liapunov condition: there exists a positive constant $\epsilon_0$ such that $\limsup_n E[|X_n|^{r + \epsilon_0}] < \infty$. Then
$E[|X_n|^r I(|X_n| \ge \lambda)] \le E[|X_n|^{r + \epsilon_0}]/\lambda^{\epsilon_0}$.

Integral inequalities

Young's inequality
$ab \le \frac{a^p}{p} + \frac{b^q}{q}$, $a, b > 0$, $\frac{1}{p} + \frac{1}{q} = 1$,
where the equality holds if and only if $a^p = b^q$. Since $\log x$ is concave,
$\log\left(\frac{1}{p} a^p + \frac{1}{q} b^q\right) \ge \frac{1}{p} \log a^p + \frac{1}{q} \log b^q = \log(ab)$.
Geometric interpretation (insert figure here):

Hölder inequality
$\int |f(x) g(x)| \, d\mu(x) \le \left\{\int |f(x)|^p \, d\mu(x)\right\}^{1/p} \left\{\int |g(x)|^q \, d\mu(x)\right\}^{1/q}$.
In Young's inequality, let
$a = |f(x)| / \{\int |f|^p \, d\mu\}^{1/p}$, $b = |g(x)| / \{\int |g|^q \, d\mu\}^{1/q}$,
then integrate both sides with respect to $\mu$.
When $\mu = P$ and $f = X(\omega)$, $g \equiv 1$, we obtain the moment inequality $\mu_s^{r-t} \le \mu_t^{r-s} \mu_r^{s-t}$, where $\mu_r = E[|X|^r]$ and $r \ge s \ge t \ge 0$.
When $p = q = 2$, we obtain the Cauchy-Schwarz inequality:
$\int |f(x) g(x)| \, d\mu(x) \le \left\{\int f(x)^2 \, d\mu(x)\right\}^{1/2} \left\{\int g(x)^2 \, d\mu(x)\right\}^{1/2}$.

Minkowski's inequality
For $r \ge 1$, $\|X + Y\|_r \le \|X\|_r + \|Y\|_r$.
Derivation (for $r > 1$, using the Hölder inequality with $q = r/(r-1)$):
$E[|X + Y|^r] \le E[(|X| + |Y|) |X + Y|^{r-1}] \le E[|X|^r]^{1/r} E[|X + Y|^r]^{1 - 1/r} + E[|Y|^r]^{1/r} E[|X + Y|^r]^{1 - 1/r}$.
$\|\cdot\|_r$ is in fact a norm on the linear space $\{X : \|X\|_r < \infty\}$. Such a normed space is denoted $L_r(P)$.

Markov's inequality
$P(|X| \ge \epsilon) \le \frac{E[g(|X|)]}{g(\epsilon)}$,
where $g \ge 0$ is an increasing function on $[0, \infty)$.
Derivation:
$P(|X| \ge \epsilon) \le P(g(|X|) \ge g(\epsilon)) = E[I(g(|X|) \ge g(\epsilon))] \le E\left[\frac{g(|X|)}{g(\epsilon)}\right]$.
When $g(x) = x^2$ and $X$ is replaced by $X - \mu$, we obtain Chebyshev's inequality:
$P(|X - \mu| \ge \epsilon) \le \frac{Var(X)}{\epsilon^2}$.

Application of Vitali's theorem
$Y_1, Y_2, \ldots$ are i.i.d. with mean $\mu$ and variance $\sigma^2$. Let $X_n = \bar{Y}_n$. By Chebyshev's inequality, $X_n \to_p \mu$:
$P(|X_n - \mu| > \epsilon) \le \frac{Var(X_n)}{\epsilon^2} = \frac{\sigma^2}{n \epsilon^2} \to 0$.
From the Liapunov condition with $r = 1$ and $\epsilon_0 = 1$ (note $E[|X_n - \mu|^2] = \sigma^2/n$ is bounded), $X_n - \mu$ satisfies the uniform integrability condition; hence $E[|X_n - \mu|] \to 0$.
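A small Monte Carlo sketch of this application (not in the original notes; it assumes NumPy, and the exponential distribution, replication count, and seed are arbitrary illustrations): the estimate of $E[|\bar Y_n - \mu|]$ shrinks as $n$ grows, consistent with $E[|X_n - \mu|] \to 0$.

import numpy as np

rng = np.random.default_rng(1)
mu, n_rep = 2.0, 20_000
for n in [10, 100, 1000]:
    Y = rng.exponential(scale=mu, size=(n_rep, n))  # i.i.d. with mean mu
    Xbar = Y.mean(axis=1)                           # X_n = sample mean
    # Monte Carlo estimate of E|X_n - mu|, which should decrease with n
    print(n, np.abs(Xbar - mu).mean())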

Convergence in Distribution

Convergence in distribution is the most important mode of convergence in statistical inference.

Equivalent conditions
Theorem 3.3 (Portmanteau Theorem) The following conditions are equivalent:
(a) $X_n$ converges in distribution to $X$.
(b) For any bounded continuous function $g(\cdot)$, $E[g(X_n)] \to E[g(X)]$.
(c) For any open set $G$ in $R$, $\liminf_n P(X_n \in G) \ge P(X \in G)$.
(d) For any closed set $F$ in $R$, $\limsup_n P(X_n \in F) \le P(X \in F)$.
(e) For any Borel set $O$ in $R$ with $P(X \in \partial O) = 0$, where $\partial O$ is the boundary of $O$, $P(X_n \in O) \to P(X \in O)$.

Proof
(a) $\Rightarrow$ (b). Without loss of generality, assume $|g(x)| \le 1$. Choose $[-M, M]$ such that $P(|X| = M) = 0$. Since $g$ is continuous on $[-M, M]$, $g$ is uniformly continuous on $[-M, M]$. Partition $[-M, M]$ into finitely many intervals $I_1, \ldots, I_m$ such that within each interval $I_k$, $\max_{I_k} g(x) - \min_{I_k} g(x) \le \epsilon$ and $X$ has no mass at any of the endpoints of $I_k$ (why?).

Therefore, choosing any point $x_k \in I_k$, $k = 1, \ldots, m$,
$|E[g(X_n)] - E[g(X)]|$
$\le E[|g(X_n)| I(|X_n| > M)] + E[|g(X)| I(|X| > M)]$
$+ \left|E[g(X_n) I(|X_n| \le M)] - \sum_{k=1}^m g(x_k) P(X_n \in I_k)\right|$
$+ \left|\sum_{k=1}^m g(x_k) P(X_n \in I_k) - \sum_{k=1}^m g(x_k) P(X \in I_k)\right|$
$+ \left|\sum_{k=1}^m g(x_k) P(X \in I_k) - E[g(X) I(|X| \le M)]\right|$
$\le P(|X_n| > M) + P(|X| > M) + 2\epsilon + \sum_{k=1}^m |P(X_n \in I_k) - P(X \in I_k)|$.
Hence $\limsup_n |E[g(X_n)] - E[g(X)]| \le 2 P(|X| > M) + 2\epsilon$. Let $M \to \infty$ and $\epsilon \to 0$.

(b) $\Rightarrow$ (c). For any open set $G$, define
$g(x) = 1 - \frac{\epsilon}{\epsilon + d(x, G^c)}$,
where $d(x, G^c) = \inf_{y \in G^c} |x - y|$ is the distance between $x$ and $G^c$. For any $y \in G^c$,
$d(x_1, G^c) \le |x_1 - y| \le |x_2 - y| + |x_1 - x_2|$,
so $d(x_1, G^c) - d(x_2, G^c) \le |x_1 - x_2|$, and
$|g(x_1) - g(x_2)| \le \epsilon^{-1} |d(x_1, G^c) - d(x_2, G^c)| \le \epsilon^{-1} |x_1 - x_2|$.
Thus $g(x)$ is continuous and bounded, so $E[g(X_n)] \to E[g(X)]$. Note $0 \le g \le I_G$; hence
$\liminf_n P(X_n \in G) \ge \liminf_n E[g(X_n)] = E[g(X)]$.
Let $\epsilon \to 0$; $E[g(X)]$ converges to $E[I(X \in G)] = P(X \in G)$.
(c) $\Rightarrow$ (d). This is clear by taking the complement of $F$.

(d) $\Rightarrow$ (e). For any $O$ with $P(X \in \partial O) = 0$, writing $\bar{O}$ and $O^o$ for the closure and the interior of $O$,
$\limsup_n P(X_n \in O) \le \limsup_n P(X_n \in \bar{O}) \le P(X \in \bar{O}) = P(X \in O)$,
$\liminf_n P(X_n \in O) \ge \liminf_n P(X_n \in O^o) \ge P(X \in O^o) = P(X \in O)$.
(e) $\Rightarrow$ (a). Choose $O = (-\infty, x]$ with $P(X \in \partial O) = P(X = x) = 0$.

Counter-examples
Let $g(x) = x$, a continuous but unbounded function. Let $X_n$ be a random variable taking value $n$ with probability $1/n$ and value 0 with probability $(1 - 1/n)$. Then $X_n \to_d 0$. However, $E[g(X_n)] = 1$ does not converge to $E[g(0)] = 0$.
The continuity at the boundary in (e) is also necessary: let $X_n$ be degenerate at $1/n$ and consider $O = \{x : x > 0\}$. Then $P(X_n \in O) = 1$, but $X_n \to_d 0$ and $P(0 \in O) = 0$.

Weak Convergence and Characteristic Functions

Theorem 3.4 (Continuity Theorem) Let $\phi_n$ and $\phi$ denote the characteristic functions of $X_n$ and $X$, respectively. Then $X_n \to_d X$ is equivalent to $\phi_n(t) \to \phi(t)$ for each $t$.

Proof
To prove the $\Rightarrow$ direction, note that $e^{itx}$ is bounded and continuous in $x$; from (b) of Theorem 3.3,
$\phi_n(t) = E[e^{itX_n}] \to E[e^{itX}] = \phi(t)$.
The proof of the $\Leftarrow$ direction consists of a few tricky constructions (skipped).

One simple example
$X_1, \ldots, X_n \sim$ Bernoulli($p$):
$\phi_{\bar{X}_n}(t) = E[e^{it(X_1 + \cdots + X_n)/n}] = (1 - p + p e^{it/n})^n = (1 - p + p(1 + it/n + o(1/n)))^n \to e^{itp}$.
Note that the limit is the c.f. of $X \equiv p$. Thus $\bar{X}_n \to_d p$, so $\bar{X}_n$ converges in probability to $p$.
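Since the characteristic function here is available in closed form, the convergence $\phi_{\bar X_n}(t) \to e^{itp}$ can be checked directly; a sketch (not in the original slides; it assumes NumPy, and the values of $p$ and $t$ are arbitrary):

import numpy as np

p, t = 0.3, 1.5
for n in [5, 50, 500]:
    # exact c.f. of the Bernoulli(p) sample mean: (1 - p + p e^{it/n})^n
    phi_n = (1 - p + p * np.exp(1j * t / n)) ** n
    print(n, phi_n)
print("limit exp(itp):", np.exp(1j * t * p))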

Generalization to multivariate random vectors
$X_n \to_d X$ if and only if $E[\exp\{i t' X_n\}] \to E[\exp\{i t' X\}]$, where $t$ is any $k$-dimensional constant vector.
Equivalently, $t' X_n \to_d t' X$ for any $t$.
Thus, to study the weak convergence of random vectors, we can reduce to studying the weak convergence of one-dimensional linear combinations of the random vectors.
This is the well-known Cramér-Wold device.

Theorem 3.5 (The Cramér-Wold device) Random vectors $X_n$ in $R^k$ satisfy $X_n \to_d X$ if and only if $t' X_n \to_d t' X$ in $R$ for all $t \in R^k$.

Properties of Weak Convergence

Theorem 3.6 (Continuous mapping theorem) Suppose $X_n \to_{a.s.} X$, or $X_n \to_p X$, or $X_n \to_d X$. Then for any continuous function $g(\cdot)$, $g(X_n)$ converges to $g(X)$ almost surely, or in probability, or in distribution, respectively.

Proof
If $X_n \to_{a.s.} X$, then $g(X_n) \to_{a.s.} g(X)$.
If $X_n \to_p X$, then for any subsequence there exists a further subsequence $X_{n_k} \to_{a.s.} X$. Thus $g(X_{n_k}) \to_{a.s.} g(X)$. Then $g(X_n) \to_p g(X)$ from (H) of Theorem 3.1.
To prove that $g(X_n) \to_d g(X)$ when $X_n \to_d X$, use (b) of Theorem 3.3: for any bounded continuous $f$, $f \circ g$ is bounded and continuous, so $E[f(g(X_n))] \to E[f(g(X))]$.

One remark
Theorem 3.6 concludes that $g(X_n) \to_d g(X)$ if $X_n \to_d X$ and $g$ is continuous. In fact, this result still holds if $P(X \in C(g)) = 1$, where $C(g)$ is the set of all continuity points of $g$. That is, if the discontinuity points of $g$ have zero probability under the distribution of $X$, the continuous mapping theorem holds.

Theorem 3.7 (Slutsky theorem) Suppose $X_n \to_d X$, $Y_n \to_p y$ and $Z_n \to_p z$ for some constants $y$ and $z$. Then
$Z_n X_n + Y_n \to_d z X + y$.

Proof
First show that $X_n + Y_n \to_d X + y$. For any $\epsilon > 0$,
$P(X_n + Y_n \le x) \le P(X_n + Y_n \le x, |Y_n - y| \le \epsilon) + P(|Y_n - y| > \epsilon) \le P(X_n \le x - y + \epsilon) + P(|Y_n - y| > \epsilon)$.
Hence $\limsup_n F_{X_n + Y_n}(x) \le \limsup_n F_{X_n}(x - y + \epsilon) \le F_X(x - y + \epsilon)$.

On the other hand,
$P(X_n + Y_n > x) \le P(X_n + Y_n > x, |Y_n - y| \le \epsilon) + P(|Y_n - y| > \epsilon) \le P(X_n > x - y - \epsilon) + P(|Y_n - y| > \epsilon)$,
so
$\limsup_n (1 - F_{X_n + Y_n}(x)) \le \limsup_n P(X_n > x - y - \epsilon) \le \limsup_n P(X_n \ge x - y - 2\epsilon) \le 1 - F_X(x - y - 2\epsilon)$.
Combining the two directions,
$F_X(x - y - 2\epsilon) \le \liminf_n F_{X_n + Y_n}(x) \le \limsup_n F_{X_n + Y_n}(x) \le F_X(x - y + \epsilon)$.
Letting $\epsilon \to 0$,
$F_{X + y}(x-) \le \liminf_n F_{X_n + Y_n}(x) \le \limsup_n F_{X_n + Y_n}(x) \le F_{X + y}(x)$,
so $F_{X_n + Y_n}(x) \to F_{X + y}(x)$ at every continuity point $x$ of $F_{X + y}$.

To complete the proof, note
$P(|(Z_n - z) X_n| > \epsilon) \le P(|Z_n - z| > \epsilon^2) + P(|Z_n - z| \le \epsilon^2, |X_n| > 1/\epsilon)$,
so
$\limsup_n P(|(Z_n - z) X_n| > \epsilon) \le \limsup_n P(|Z_n - z| > \epsilon^2) + \limsup_n P(|X_n| \ge 1/(2\epsilon)) \le P(|X| \ge 1/(2\epsilon))$.
Since $P(|X| \ge 1/(2\epsilon)) \to 0$ as $\epsilon \to 0$, it follows that $(Z_n - z) X_n \to_p 0$.
Clearly $z X_n \to_d z X$, so $Z_n X_n = (Z_n - z) X_n + z X_n \to_d z X$ from the proof of the first half. Again using the first half's proof, $Z_n X_n + Y_n \to_d z X + y$.

Examples
Suppose $X_n \to_d N(0, 1)$. Then by the continuous mapping theorem, $X_n^2 \to_d \chi_1^2$.
This example shows that $g$ can be discontinuous in Theorem 3.6. Let $X_n \to_d X$ with $X \sim N(0, 1)$ and $g(x) = 1/x$. Although $g(x)$ is discontinuous at the origin, we can still show that $1/X_n \to_d 1/X$, the reciprocal of the normal distribution. This is because $P(X = 0) = 0$.
However, the earlier counter-example with $g(x) = I(x > 0)$ and $X_n$ degenerate at $1/n$ shows that Theorem 3.6 may not be true if $P(X \in C(g)) < 1$.

The condition $Y_n \to_p y$, where $y$ is a constant, is necessary. For example, let $X_n = X \sim$ Uniform(0, 1). Let $Y_n = -X$, so $Y_n \to_d -\tilde{X}$, where $\tilde{X}$ is an independent random variable with the same distribution as $X$. However, $X_n + Y_n = 0$ does not converge in distribution to $X - \tilde{X}$.

Let $X_1, X_2, \ldots$ be a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2 > 0$. Then
$\sqrt{n}(\bar{X}_n - \mu) \to_d N(0, \sigma^2)$, $s_n^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X}_n)^2 \to_{a.s.} \sigma^2$.
By Slutsky's theorem,
$\frac{\sqrt{n}(\bar{X}_n - \mu)}{s_n} \to_d \frac{1}{\sigma} N(0, \sigma^2) = N(0, 1)$.
Hence in large samples the $t_{n-1}$ distribution can be approximated by a standard normal distribution.
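A quick simulation sketch of this normal approximation (not in the original notes; it assumes NumPy, and $n$, the replication count, and the parameters are arbitrary): the tail frequency of the t-statistic is close to the standard normal value.

import numpy as np

rng = np.random.default_rng(2)
n, n_rep, mu, sigma = 30, 50_000, 1.0, 2.0
X = rng.normal(mu, sigma, size=(n_rep, n))
t_stat = np.sqrt(n) * (X.mean(axis=1) - mu) / X.std(axis=1, ddof=1)
# P(t > 1.96) should be close to the N(0,1) tail probability 0.025
print((t_stat > 1.96).mean())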

Representation of Weak Convergence

Theorem 3.8 (Skorohod's Representation Theorem) Let $\{X_n\}$ and $X$ be random variables on a probability space $(\Omega, \mathcal{A}, P)$ with $X_n \to_d X$. Then there exist another probability space $(\tilde{\Omega}, \tilde{\mathcal{A}}, \tilde{P})$ and a sequence of random variables $\tilde{X}_n$ and $\tilde{X}$ defined on this space such that $\tilde{X}_n$ and $X_n$ have the same distributions, $\tilde{X}$ and $X$ have the same distributions, and moreover, $\tilde{X}_n \to_{a.s.} \tilde{X}$.

Quantile function
$F^{-1}(p) = \inf\{x : F(x) \ge p\}$.
Proposition 3.1
(a) $F^{-1}$ is left-continuous.
(b) If $X$ has continuous distribution function $F$, then $F(X) \sim$ Uniform(0, 1).
(c) Let $\xi \sim$ Uniform(0, 1) and let $X = F^{-1}(\xi)$. Then for all $x$, $\{X \le x\} = \{\xi \le F(x)\}$. Thus, $X$ has distribution function $F$.

Proof
(a) Clearly, $F^{-1}$ is nondecreasing. Suppose $p_n$ increases to $p$; then $F^{-1}(p_n)$ increases to some $y \le F^{-1}(p)$. Since $F(y) \ge F(F^{-1}(p_n)) \ge p_n$ for every $n$, $F(y) \ge p$, so $F^{-1}(p) \le y$. Hence $y = F^{-1}(p)$.
(b) $\{X \le x\} \subset \{F(X) \le F(x)\}$, so $F(x) \le P(F(X) \le F(x))$. Also $\{F(X) \le F(x) - \epsilon\} \subset \{X \le x\}$, so $P(F(X) \le F(x) - \epsilon) \le F(x)$; letting $\epsilon \downarrow 0$, $P(F(X) < F(x)) \le F(x)$. Then, if $X$ has continuous $F$, $P(F(X) \le F(x)) = F(x)$, and since every $t \in (0, 1)$ equals $F(x)$ for some $x$, $F(X) \sim$ Uniform(0, 1).
(c) $P(X \le x) = P(F^{-1}(\xi) \le x) = P(\xi \le F(x)) = F(x)$.
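Proposition 3.1(c) is the basis of inverse-CDF sampling; a minimal sketch (not part of the original slides; it assumes NumPy, with Exp(1) chosen for illustration since its quantile function $F^{-1}(p) = -\log(1-p)$ is explicit):

import numpy as np

rng = np.random.default_rng(3)
xi = rng.uniform(size=100_000)   # xi ~ Uniform(0,1)
X = -np.log(1.0 - xi)            # F^{-1}(xi) for F(x) = 1 - e^{-x}
# the empirical CDF of X agrees with F, as Proposition 3.1(c) predicts
for x in [0.5, 1.0, 2.0]:
    print(x, (X <= x).mean(), 1 - np.exp(-x))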

Proof (of Theorem 3.8)
Let $(\tilde{\Omega}, \tilde{\mathcal{A}}, \tilde{P})$ be $([0, 1], \mathcal{B} \cap [0, 1], \lambda)$. Define $\tilde{X}_n = F_n^{-1}(\xi)$ and $\tilde{X} = F^{-1}(\xi)$, where $\xi \sim$ Uniform(0, 1). Then $\tilde{X}_n$ has distribution $F_n$, the same as $X_n$.
Consider any $t \in (0, 1)$ such that there is at most one value $x$ with $F(x) = t$ (it is easy to see such $t$ is a continuity point of $F^{-1}$), and let $x = F^{-1}(t)$. For any $z < x$ that is a continuity point of $F$, $F(z) < t$; when $n$ is large, $F_n(z) < t$, so $F_n^{-1}(t) \ge z$. Hence $\liminf_n F_n^{-1}(t) \ge z$, and letting $z \uparrow x$, $\liminf_n F_n^{-1}(t) \ge x = F^{-1}(t)$.
From $F(x + \epsilon) > t$ (taking $\epsilon > 0$ with $x + \epsilon$ a continuity point of $F$), $F_n(x + \epsilon) > t$ for large $n$, so $F_n^{-1}(t) \le x + \epsilon$; hence $\limsup_n F_n^{-1}(t) \le x + \epsilon$, and letting $\epsilon \downarrow 0$, $\limsup_n F_n^{-1}(t) \le x$.
Thus $F_n^{-1}(t) \to F^{-1}(t)$ for all but countably many $t \in (0, 1)$, and therefore $\tilde{X}_n \to_{a.s.} \tilde{X}$.

Usefulness of the representation theorem
For example, if $X_n \to_d X$ and one wishes to show that some function of $X_n$, denote it $g(X_n)$, converges in distribution to $g(X)$: see the diagram in Figure 2.

(insert Figure 2 here)

Alternative Proof of the Slutsky Theorem
First, show $(X_n, Y_n) \to_d (X, y)$:
$|\phi_{(X_n, Y_n)}(t_1, t_2) - \phi_{(X, y)}(t_1, t_2)| = |E[e^{i t_1 X_n} e^{i t_2 Y_n}] - E[e^{i t_1 X} e^{i t_2 y}]|$
$\le |E[e^{i t_1 X_n}(e^{i t_2 Y_n} - e^{i t_2 y})]| + |e^{i t_2 y}| \, |E[e^{i t_1 X_n}] - E[e^{i t_1 X}]|$
$\le E[|e^{i t_2 Y_n} - e^{i t_2 y}|] + |E[e^{i t_1 X_n}] - E[e^{i t_1 X}]| \to 0$.
Similarly, $(Z_n, X_n) \to_d (z, X)$. Since $g(z, x) = z x$ is continuous, $Z_n X_n \to_d z X$. Since $(Z_n X_n, Y_n) \to_d (z X, y)$ and $g(x, y) = x + y$ is continuous, $Z_n X_n + Y_n \to_d z X + y$.

Summation of Independent R.V.s

Some preliminary lemmas
Proposition 3.2 (Borel-Cantelli Lemma) For any events $A_n$, $\sum_{n=1}^\infty P(A_n) < \infty$ implies $P(A_n, \text{i.o.}) = P(\{A_n\} \text{ occurs infinitely often}) = 0$; or equivalently, $P(\cap_{n=1}^\infty \cup_{m \ge n} A_m) = 0$.
Proof
$P(A_n, \text{i.o.}) \le P(\cup_{m \ge n} A_m) \le \sum_{m \ge n} P(A_m) \to 0$, as $n \to \infty$.

One consequence of the first Borel-Cantelli lemma
If, for a sequence of random variables $\{Z_n\}$ and for any $\epsilon > 0$, $\sum_n P(|Z_n| > \epsilon) < \infty$, then $|Z_n| > \epsilon$ occurs only finitely many times almost surely; hence $Z_n \to_{a.s.} 0$.

Proposition 3.3 (Second Borel-Cantelli Lemma) For a sequence of independent events $A_1, A_2, \ldots$, $\sum_{n=1}^\infty P(A_n) = \infty$ implies $P(A_n, \text{i.o.}) = 1$.
Proof
Consider the complement of $\{A_n, \text{i.o.}\}$:
$P(\cup_{n=1}^\infty \cap_{m \ge n} A_m^c) = \lim_n P(\cap_{m \ge n} A_m^c) = \lim_n \prod_{m \ge n} (1 - P(A_m)) \le \limsup_n \exp\{-\sum_{m \ge n} P(A_m)\} = 0$.

Equivalence lemma
Proposition 3.4 $X_1, \ldots, X_n, \ldots$ are i.i.d. with finite mean. Define $Y_n = X_n I(|X_n| \le n)$. Then $\sum_{n=1}^\infty P(X_n \ne Y_n) < \infty$.

Proof
Since $E[|X_1|] < \infty$,
$\sum_{n=1}^\infty P(|X| \ge n) = \sum_{n=1}^\infty n P(n \le |X| < n + 1) \le E[|X|] < \infty$,
and $P(X_n \ne Y_n) = P(|X_n| > n) \le P(|X| \ge n)$.
From the Borel-Cantelli Lemma, $P(X_n \ne Y_n, \text{i.o.}) = 0$: for almost every $\omega \in \Omega$, $X_n(\omega) = Y_n(\omega)$ when $n$ is large enough.

Weak Law of Large Numbers

Theorem 3.9 (Weak Law of Large Numbers) If $X, X_1, \ldots, X_n$ are i.i.d. with finite mean $\mu$ (so $E[|X|] < \infty$ and $\mu = E[X]$), then $\bar{X}_n \to_p \mu$.

Proof
Define $Y_n = X_n I(|X_n| \le n)$ and let $\mu_n = \sum_{k=1}^n E[Y_k]/n$. By Chebyshev's inequality,
$P(|\bar{Y}_n - \mu_n| \ge \epsilon) \le \frac{Var(\bar{Y}_n)}{\epsilon^2} \le \frac{\sum_{k=1}^n Var(X_k I(|X_k| \le k))}{n^2 \epsilon^2}$.
Moreover,
$Var(X_k I(|X_k| \le k)) \le E[X_k^2 I(|X_k| \le k)] = E[X_k^2 I(|X_k| \le k, |X_k| > \sqrt{k}\epsilon^2)] + E[X_k^2 I(|X_k| \le k, |X_k| \le \sqrt{k}\epsilon^2)] \le k E[|X_k| I(|X_k| > \sqrt{k}\epsilon^2)] + k\epsilon^4$,
so
$P(|\bar{Y}_n - \mu_n| \ge \epsilon) \le \frac{\sum_{k=1}^n E[|X| I(|X| > \sqrt{k}\epsilon^2)]}{n \epsilon^2} + \epsilon^2 \frac{n(n+1)}{2 n^2}$.
Since $E[|X| I(|X| > \sqrt{k}\epsilon^2)] \to 0$ as $k \to \infty$, the first term vanishes, and $\limsup_n P(|\bar{Y}_n - \mu_n| \ge \epsilon) \le \epsilon^2$. For any fixed $\delta \ge \epsilon$ this bounds $\limsup_n P(|\bar{Y}_n - \mu_n| \ge \delta)$; letting $\epsilon \to 0$ gives $\bar{Y}_n - \mu_n \to_p 0$. Since $E[Y_k] = E[X I(|X| \le k)] \to \mu$, $\mu_n \to \mu$; hence $\bar{Y}_n \to_p \mu$. From Proposition 3.4, almost surely $X_k = Y_k$ for all large $k$, so $\bar{X}_n - \bar{Y}_n \to_{a.s.} 0$ and therefore $\bar{X}_n \to_p \mu$.
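A simulation sketch of the weak law (not in the original notes; it assumes NumPy; the Pareto example is chosen because it has a finite mean but infinite variance, so the truncation argument above, rather than Chebyshev's inequality alone, is what justifies the convergence):

import numpy as np

rng = np.random.default_rng(4)
a = 1.5                              # Pareto index in (1,2): finite mean, infinite variance
mu = a / (a - 1)                     # mean of the classical Pareto on [1, infinity)
for n in [100, 10_000, 1_000_000]:
    X = rng.pareto(a, size=n) + 1.0  # classical Pareto with minimum 1
    print(n, X.mean(), "target:", mu)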

Strong Law of Large Numbers

Theorem 3.10 (Strong Law of Large Numbers) If $X_1, \ldots, X_n, \ldots$ are i.i.d. with finite mean $\mu$, then $\bar{X}_n \to_{a.s.} \mu$.

Proof
Without loss of generality, assume $X_n \ge 0$: if the result holds for nonnegative variables, it holds for general $X_n$ by writing $X_n = X_n^+ - X_n^-$. Similar to Theorem 3.9, it is sufficient to show $\bar{Y}_n \to_{a.s.} \mu$, where $Y_n = X_n I(X_n \le n)$. Note $E[Y_n] = E[X_1 I(X_1 \le n)] \to \mu$, so $\sum_{k=1}^n E[Y_k]/n \to \mu$. Thus, if we denote $S_n = \sum_{k=1}^n (Y_k - E[Y_k])$ and can show $S_n/n \to_{a.s.} 0$, then the result holds.

$Var(S_n) = \sum_{k=1}^n Var(Y_k) \le \sum_{k=1}^n E[Y_k^2] \le n E[X_1^2 I(X_1 \le n)]$.
By Chebyshev's inequality,
$P(|S_n/n| > \epsilon) \le \frac{1}{n^2 \epsilon^2} Var(S_n) \le \frac{E[X_1^2 I(X_1 \le n)]}{n \epsilon^2}$.
For any $\alpha > 1$, let $u_n = [\alpha^n]$. Then
$\sum_{n=1}^\infty P(|S_{u_n}/u_n| > \epsilon) \le \sum_{n=1}^\infty \frac{1}{u_n \epsilon^2} E[X_1^2 I(X_1 \le u_n)] = \frac{1}{\epsilon^2} E\left[X_1^2 \sum_{n : u_n \ge X_1} \frac{1}{u_n}\right]$.
Since for any $x > 0$, $\sum_{n : u_n \ge x} u_n^{-1} \le 2 \sum_{n \ge \log x / \log \alpha} \alpha^{-n} \le K x^{-1}$ for some constant $K$,
$\sum_{n=1}^\infty P(|S_{u_n}/u_n| > \epsilon) \le \frac{K}{\epsilon^2} E[X_1] < \infty$,
so $S_{u_n}/u_n \to_{a.s.} 0$ by the Borel-Cantelli lemma.

For any $k$, we can find $u_n < k \le u_{n+1}$. Write $T_k = \sum_{j=1}^k Y_j$; from $S_{u_n}/u_n \to_{a.s.} 0$ and $\sum_{j \le u_n} E[Y_j]/u_n \to \mu$, we have $T_{u_n}/u_n \to_{a.s.} \mu$. Since $Y_1, Y_2, \ldots \ge 0$,
$\frac{T_{u_n}}{u_n} \cdot \frac{u_n}{u_{n+1}} \le \frac{T_k}{k} \le \frac{T_{u_{n+1}}}{u_{n+1}} \cdot \frac{u_{n+1}}{u_n}$.
Since $u_{n+1}/u_n \to \alpha$, almost surely
$\mu/\alpha \le \liminf_k \frac{T_k}{k} \le \limsup_k \frac{T_k}{k} \le \mu \alpha$.
Since $\alpha$ is an arbitrary number larger than 1, let $\alpha \downarrow 1$ and obtain $\lim_k T_k/k = \mu$ almost surely, i.e., $\bar{Y}_n \to_{a.s.} \mu$.

Central Limit Theorems

Preliminary result on c.f.'s
Proposition 3.5 Suppose $E[|X|^m] < \infty$ for some integer $m \ge 0$. Then
$\left|\phi_X(t) - \sum_{k=0}^m \frac{(it)^k E[X^k]}{k!}\right| / |t|^m \to 0$, as $t \to 0$.

Proof
$e^{itx} = \sum_{k=0}^m \frac{(itx)^k}{k!} + \frac{(itx)^m}{m!} [e^{it\theta x} - 1]$, where $\theta \in [0, 1]$. Hence
$\left|\phi_X(t) - \sum_{k=0}^m \frac{(it)^k E[X^k]}{k!}\right| / |t|^m \le E[|X|^m |e^{it\theta X} - 1|]/m! \to 0$
as $t \to 0$, by the dominated convergence theorem.

Simple versions of the CLT
Theorem 3.11 (Central Limit Theorem) If $X_1, \ldots, X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, then $\sqrt{n}(\bar{X}_n - \mu) \to_d N(0, \sigma^2)$.

Proof
Denote $Y_n = \sqrt{n}(\bar{X}_n - \mu)$. Then
$\phi_{Y_n}(t) = \{\phi_{X_1 - \mu}(t/\sqrt{n})\}^n$, $\phi_{X_1 - \mu}(t/\sqrt{n}) = 1 - \frac{\sigma^2 t^2}{2n} + o(1/n)$,
so $\phi_{Y_n}(t) \to \exp\{-\sigma^2 t^2/2\}$.
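A simulation sketch of Theorem 3.11 (not part of the original slides; it assumes NumPy; the Uniform(0,1) distribution, sizes, and seed are arbitrary): the standardized means have variance close to $\sigma^2 = 1/12$ and approximately normal probabilities.

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(5)
n, n_rep = 200, 100_000
U = rng.uniform(size=(n_rep, n))           # mean 1/2, variance 1/12
Z = np.sqrt(n) * (U.mean(axis=1) - 0.5)    # approximately N(0, 1/12)
print(Z.var(), 1 / 12)
x = 0.3
Phi = 0.5 * (1 + erf(x / sqrt(1 / 12) / sqrt(2)))  # N(0, 1/12) CDF at x
print((Z <= x).mean(), Phi)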

Theorem 3.12 (Multivariate Central Limit Theorem) If $X_1, \ldots, X_n$ are i.i.d. random vectors in $R^k$ with mean $\mu$ and covariance $\Sigma = E[(X - \mu)(X - \mu)']$, then $\sqrt{n}(\bar{X}_n - \mu) \to_d N(0, \Sigma)$.
Proof Use the Cramér-Wold device.

Liapunov CLT
Theorem 3.13 (Liapunov Central Limit Theorem) Let $X_{n1}, \ldots, X_{nn}$ be independent random variables with $\mu_{ni} = E[X_{ni}]$ and $\sigma_{ni}^2 = Var(X_{ni})$. Let $\mu_n = \sum_{i=1}^n \mu_{ni}$ and $\sigma_n^2 = \sum_{i=1}^n \sigma_{ni}^2$. If
$\frac{\sum_{i=1}^n E[|X_{ni} - \mu_{ni}|^3]}{\sigma_n^3} \to 0$,
then $\sum_{i=1}^n (X_{ni} - \mu_{ni})/\sigma_n \to_d N(0, 1)$.

Lindeberg-Feller CLT
Theorem 3.14 (Lindeberg-Feller Central Limit Theorem) Let $X_{n1}, \ldots, X_{nn}$ be independent random variables with $\mu_{ni} = E[X_{ni}]$ and $\sigma_{ni}^2 = Var(X_{ni})$. Let $\sigma_n^2 = \sum_{i=1}^n \sigma_{ni}^2$. Then both
$\sum_{i=1}^n (X_{ni} - \mu_{ni})/\sigma_n \to_d N(0, 1)$ and $\max\{\sigma_{ni}^2/\sigma_n^2 : 1 \le i \le n\} \to 0$
hold if and only if the Lindeberg condition
$\frac{1}{\sigma_n^2} \sum_{i=1}^n E[|X_{ni} - \mu_{ni}|^2 I(|X_{ni} - \mu_{ni}| \ge \epsilon \sigma_n)] \to 0$, for all $\epsilon > 0$,
holds.

Proof of the Liapunov CLT using Theorem 3.14
$\frac{1}{\sigma_n^2} \sum_{k=1}^n E[|X_{nk} - \mu_{nk}|^2 I(|X_{nk} - \mu_{nk}| > \epsilon \sigma_n)] \le \frac{1}{\epsilon \sigma_n^3} \sum_{k=1}^n E[|X_{nk} - \mu_{nk}|^3] \to 0$.

Examples
This is an example from simple linear regression: $X_j = \alpha + \beta z_j + \epsilon_j$ for $j = 1, 2, \ldots$, where the $z_j$ are known numbers not all equal and the $\epsilon_j$ are i.i.d. with mean zero and variance $\sigma^2$.
$\hat{\beta}_n = \frac{\sum_{j=1}^n X_j (z_j - \bar{z}_n)}{\sum_{j=1}^n (z_j - \bar{z}_n)^2} = \beta + \frac{\sum_{j=1}^n \epsilon_j (z_j - \bar{z}_n)}{\sum_{j=1}^n (z_j - \bar{z}_n)^2}$.
Assume $\max_{j \le n} (z_j - \bar{z}_n)^2 / \sum_{j=1}^n (z_j - \bar{z}_n)^2 \to 0$. Then, by the Lindeberg-Feller CLT,
$\sqrt{\sum_{j=1}^n (z_j - \bar{z}_n)^2} \, (\hat{\beta}_n - \beta) \to_d N(0, \sigma^2)$.
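A simulation sketch of this regression CLT (not in the original notes; it assumes NumPy; the covariates $z_j$, parameters, and replication count are arbitrary illustrations):

import numpy as np

rng = np.random.default_rng(6)
n, n_rep = 200, 20_000
alpha, beta, sigma = 0.5, 2.0, 1.0
z = np.linspace(0.0, 1.0, n)          # known covariates, not all equal
zc = z - z.mean()
sz2 = (zc ** 2).sum()
eps = rng.normal(0.0, sigma, size=(n_rep, n))
X = alpha + beta * z + eps
beta_hat = (X * zc).sum(axis=1) / sz2  # least-squares slope
W = np.sqrt(sz2) * (beta_hat - beta)
print(W.mean(), W.var(), "target variance:", sigma ** 2)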

This example is taken from the randomization test for paired comparison. Let $(X_j, Y_j)$ denote the values of the $j$th pair, with $X_j$ being the result of the treatment, and let $Z_j = X_j - Y_j$. Conditional on $|Z_j| = z_j$, $Z_j = |Z_j| \, \mathrm{sgn}(Z_j)$ takes values $\pm z_j$ with probability $1/2$ each, independently across $j$, when treatment and control have no difference. Conditional on $z_1, z_2, \ldots$, the randomization t-test is the t-statistic $\sqrt{n-1} \, \bar{Z}_n / s_z$, where $s_z^2 = (1/n) \sum_{j=1}^n (Z_j - \bar{Z}_n)^2$. When $\max_{j \le n} z_j^2 / \sum_{j=1}^n z_j^2 \to 0$, this statistic has an asymptotic normal distribution $N(0, 1)$.

Delta Method

Theorem 3.15 (Delta method) For random vectors $X$ and $X_n$ in $R^k$, if there exist constants $a_n$ and $\mu$ such that $a_n(X_n - \mu) \to_d X$ and $a_n \to \infty$, then for any function $g : R^k \to R^l$ that has a derivative at $\mu$, denoted by $\nabla g(\mu)$,
$a_n(g(X_n) - g(\mu)) \to_d \nabla g(\mu) X$.

Proof
By the Skorohod representation, we can construct $\tilde{X}_n$ and $\tilde{X}$ such that $\tilde{X}_n =_d X_n$ and $\tilde{X} =_d X$ ($=_d$ means having the same distribution) and $a_n(\tilde{X}_n - \mu) \to_{a.s.} \tilde{X}$. By the differentiability of $g$ at $\mu$,
$a_n(g(\tilde{X}_n) - g(\mu)) \to_{a.s.} \nabla g(\mu) \tilde{X}$,
hence $a_n(g(X_n) - g(\mu)) \to_d \nabla g(\mu) X$.

Examples
Let $X_1, X_2, \ldots$ be i.i.d. with finite fourth moment and $s_n^2 = (1/n) \sum_{i=1}^n (X_i - \bar{X}_n)^2$. Denote by $m_k$ the $k$th moment of $X_1$, $k \le 4$. Note that $s_n^2 = (1/n) \sum_{i=1}^n X_i^2 - (\sum_{i=1}^n X_i/n)^2$ and
$\sqrt{n} \left[\left(\bar{X}_n, \ (1/n) \sum_{i=1}^n X_i^2\right) - (m_1, m_2)\right] \to_d N\left(0, \begin{pmatrix} m_2 - m_1^2 & m_3 - m_1 m_2 \\ m_3 - m_1 m_2 & m_4 - m_2^2 \end{pmatrix}\right)$.
Applying the Delta method with $g(x, y) = y - x^2$ (since $s_n^2$ is location invariant, one may take $m_1 = 0$),
$\sqrt{n}(s_n^2 - Var(X_1)) \to_d N(0, m_4 - (m_2 - m_1^2)^2)$.
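A simulation sketch of this Delta-method result (not part of the original slides; it assumes NumPy; Exp(1) is used because its moments are explicit: $\sigma^2 = 1$ and fourth central moment 9, so the asymptotic variance is $9 - 1 = 8$):

import numpy as np

rng = np.random.default_rng(7)
n, n_rep = 500, 50_000
X = rng.exponential(size=(n_rep, n))  # Exp(1): variance 1, 4th central moment 9
s2 = X.var(axis=1)                    # (1/n) sum (X_i - Xbar)^2
W = np.sqrt(n) * (s2 - 1.0)
print(W.var(), "target:", 8.0)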

Let $(X_1, Y_1), (X_2, Y_2), \ldots$ be i.i.d. bivariate samples with finite fourth moments. One estimate of the correlation between $X$ and $Y$ is
$\hat{\rho}_n = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}}$,
where $s_{xy} = (1/n) \sum_{i=1}^n (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)$, $s_x^2 = (1/n) \sum_{i=1}^n (X_i - \bar{X}_n)^2$ and $s_y^2 = (1/n) \sum_{i=1}^n (Y_i - \bar{Y}_n)^2$. To derive the large sample distribution of $\hat{\rho}_n$, first obtain the joint large sample distribution of $(s_{xy}, s_x^2, s_y^2)$ using the Delta method, then further apply the Delta method with $g(x, y, z) = x/\sqrt{yz}$.

This example is taken from Pearson's chi-square statistic. Suppose that a subject falls into one of $K$ categories with probabilities $p_1, \ldots, p_K$, where $p_1 + \cdots + p_K = 1$. Pearson's statistic is defined as
$\chi^2 = n \sum_{k=1}^K (n_k/n - p_k)^2 / p_k$,
which can be read as $\sum (\text{observed count} - \text{expected count})^2 / \text{expected count}$. Note that $\sqrt{n}(n_1/n - p_1, \ldots, n_K/n - p_K)$ has an asymptotic multivariate normal distribution. Then we can apply the continuous mapping theorem with $g(x_1, \ldots, x_K) = \sum_{k=1}^K x_k^2 / p_k$.
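A simulation sketch of the Pearson statistic's limit (not in the original notes; it assumes NumPy; with $K = 3$ cells the limit is the chi-square distribution with $K - 1 = 2$ degrees of freedom, which has mean 2 and variance 4):

import numpy as np

rng = np.random.default_rng(8)
p = np.array([0.2, 0.3, 0.5])
n, n_rep = 1000, 20_000
counts = rng.multinomial(n, p, size=n_rep)          # (n_rep, 3) cell counts
chi2 = (n * (counts / n - p) ** 2 / p).sum(axis=1)  # Pearson statistic
print(chi2.mean(), chi2.var())                      # approx 2 and 4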

U-statistics

Definition
Definition 3.6 A U-statistic associated with $h(x_1, \ldots, x_r)$ is defined as
$U_n = \frac{1}{r! \binom{n}{r}} \sum_\beta h(X_{\beta_1}, \ldots, X_{\beta_r})$,
where the sum is taken over all $r$-tuples $\beta = (\beta_1, \ldots, \beta_r)$ of distinct integers chosen from $\{1, \ldots, n\}$.

Examples
One simple example is $h(x, y) = xy$. Then $U_n = (n(n-1))^{-1} \sum_{i \ne j} X_i X_j$.
$U_n = E[h(X_1, \ldots, X_r) \mid X_{(1)}, \ldots, X_{(n)}]$.
$U_n$ is a summation of non-independent random variables.
If we define $\tilde{h}(x_1, \ldots, x_r)$ as $(r!)^{-1} \sum_{(\tilde{x}_1, \ldots, \tilde{x}_r)} h(\tilde{x}_1, \ldots, \tilde{x}_r)$, where the sum is over all permutations $(\tilde{x}_1, \ldots, \tilde{x}_r)$ of $(x_1, \ldots, x_r)$, then $\tilde{h}$ is permutation-symmetric and
$U_n = \frac{1}{\binom{n}{r}} \sum_{\beta_1 < \cdots < \beta_r} \tilde{h}(X_{\beta_1}, \ldots, X_{\beta_r})$.
$h$ is called the kernel of the U-statistic $U_n$.

CLT for U-statistics
Theorem 3.16 Let $\mu = E[h(X_1, \ldots, X_r)]$. If $E[h(X_1, \ldots, X_r)^2] < \infty$, then
$\sqrt{n}(U_n - \mu) - \sqrt{n} \sum_{i=1}^n E[U_n - \mu \mid X_i] \to_p 0$.
Consequently, $\sqrt{n}(U_n - \mu)$ is asymptotically normal with mean zero and variance $r^2 \sigma^2$, where, with $X_1, \ldots, X_r, \tilde{X}_1, \ldots, \tilde{X}_r$ i.i.d. variables,
$\sigma^2 = Cov(h(X_1, X_2, \ldots, X_r), h(X_1, \tilde{X}_2, \ldots, \tilde{X}_r))$.

Some preparation
Linear space of random variables: let $\mathcal{S}$ be a linear space of random variables with finite second moments that contains the constants; i.e., $1 \in \mathcal{S}$, and for any $X, Y \in \mathcal{S}$, $aX + bY \in \mathcal{S}$, where $a$ and $b$ are constants.
Projection: for a random variable $T$, a random variable $S \in \mathcal{S}$ is called the projection of $T$ onto $\mathcal{S}$ if it minimizes $E[(T - \tilde{S})^2]$ over $\tilde{S} \in \mathcal{S}$.

Proposition 3.7 Let $\mathcal{S}$ be a linear space of random variables with finite second moments. Then $S$ is the projection of $T$ onto $\mathcal{S}$ if and only if $S \in \mathcal{S}$ and for any $\tilde{S} \in \mathcal{S}$, $E[(T - S)\tilde{S}] = 0$. Every two projections of $T$ onto $\mathcal{S}$ are almost surely equal. If the linear space $\mathcal{S}$ contains the constant variables, then $E[T] = E[S]$ and $Cov(T - S, \tilde{S}) = 0$ for every $\tilde{S} \in \mathcal{S}$.

Proof
For any $\tilde{S}$ and $S$ in $\mathcal{S}$,
$E[(T - \tilde{S})^2] = E[(T - S)^2] + 2 E[(T - S)(S - \tilde{S})] + E[(S - \tilde{S})^2]$.
If $S$ satisfies $E[(T - S)\tilde{S}'] = 0$ for all $\tilde{S}' \in \mathcal{S}$, then, since $S - \tilde{S} \in \mathcal{S}$, $E[(T - \tilde{S})^2] \ge E[(T - S)^2]$; hence $S$ is the projection of $T$ onto $\mathcal{S}$.
Conversely, if $S$ is the projection, then for any $\tilde{S} \in \mathcal{S}$ and any constant $\alpha$, $E[(T - S - \alpha \tilde{S})^2]$ is minimized at $\alpha = 0$. Calculating the derivative at $\alpha = 0$ gives $E[(T - S)\tilde{S}] = 0$.
If $T$ has two projections $S_1$ and $S_2$, the same expansion gives $E[(S_1 - S_2)^2] = 0$; thus $S_1 = S_2$, a.s.
If $\mathcal{S}$ contains the constant variables, choose $\tilde{S} = 1$: $0 = E[(T - S) \cdot 1] = E[T] - E[S]$. Clearly, $Cov(T - S, \tilde{S}) = E[(T - S)\tilde{S}] - E[T - S] E[\tilde{S}] = 0$.

Equivalence with the projection
Proposition 3.8 Let $\mathcal{S}_n$ be linear spaces of random variables with finite second moments that contain the constants. Let $T_n$ be random variables with projections $S_n$ onto $\mathcal{S}_n$. If $Var(T_n)/Var(S_n) \to 1$, then
$Z_n \equiv \frac{T_n - E[T_n]}{\sqrt{Var(T_n)}} - \frac{S_n - E[S_n]}{\sqrt{Var(S_n)}} \to_p 0$.

Proof
$E[Z_n] = 0$. Note that
$Var(Z_n) = 2 - 2 \frac{Cov(T_n, S_n)}{\sqrt{Var(T_n) Var(S_n)}}$.
Since $S_n$ is the projection of $T_n$,
$Cov(T_n, S_n) = Cov(T_n - S_n, S_n) + Var(S_n) = Var(S_n)$.
We have
$Var(Z_n) = 2\left(1 - \sqrt{\frac{Var(S_n)}{Var(T_n)}}\right) \to 0$.
By Markov's inequality, we conclude that $Z_n \to_p 0$.

Conclusion
If $S_n$ is a summation of i.i.d. random variables such that $(S_n - E[S_n])/\sqrt{Var(S_n)} \to_d N(0, 1)$, then so does $(T_n - E[T_n])/\sqrt{Var(T_n)}$.
The limit distribution of U-statistics is derived using this lemma.

Proof of the CLT for U-statistics
Proof Let $\tilde{X}_1, \ldots, \tilde{X}_r$ be random variables with the same distribution as $X_1$ and independent of $X_1, \ldots, X_n$. Denote $\tilde{U}_n = \sum_{i=1}^n E[U_n - \mu \mid X_i]$. We show that $\tilde{U}_n$ is the projection of $U_n - \mu$ onto the linear space
$\mathcal{S}_n = \{g_1(X_1) + \cdots + g_n(X_n) : E[g_k(X_k)^2] < \infty, \ k = 1, \ldots, n\}$,
which contains the constant variables. Clearly, $\tilde{U}_n \in \mathcal{S}_n$. For any $g_k(X_k) \in \mathcal{S}_n$,
$E[(U_n - \mu - \tilde{U}_n) g_k(X_k)] = E[E[U_n - \mu - \tilde{U}_n \mid X_k] g_k(X_k)] = 0$.

$\tilde{U}_n = \sum_{i=1}^n \frac{\binom{n-1}{r-1}}{\binom{n}{r}} E[h(\tilde{X}_1, \ldots, \tilde{X}_{r-1}, X_i) - \mu \mid X_i] = \frac{r}{n} \sum_{i=1}^n E[h(\tilde{X}_1, \ldots, \tilde{X}_{r-1}, X_i) - \mu \mid X_i]$.
$Var(\tilde{U}_n) = \frac{r^2}{n^2} \sum_{i=1}^n E[(E[h(\tilde{X}_1, \ldots, \tilde{X}_{r-1}, X_i) - \mu \mid X_i])^2]$
$= \frac{r^2}{n} Cov(E[h(\tilde{X}_1, \ldots, \tilde{X}_{r-1}, X_1) \mid X_1], E[h(\tilde{X}_1, \ldots, \tilde{X}_{r-1}, X_1) \mid X_1])$
$= \frac{r^2}{n} Cov(h(X_1, X_2, \ldots, X_r), h(X_1, \tilde{X}_2, \ldots, \tilde{X}_r)) = \frac{r^2 \sigma^2}{n}$.

Furthermore,
$Var(U_n) = \binom{n}{r}^{-2} \sum_\beta \sum_{\beta'} Cov(h(X_{\beta_1}, \ldots, X_{\beta_r}), h(X_{\beta_1'}, \ldots, X_{\beta_r'}))$
$= \binom{n}{r}^{-2} \sum_{k=1}^r \sum_{\beta, \beta' \text{ sharing } k \text{ components}} Cov(h(X_1, \ldots, X_k, X_{k+1}, \ldots, X_r), h(X_1, \ldots, X_k, \tilde{X}_{k+1}, \ldots, \tilde{X}_r))$.
Counting the pairs $(\beta, \beta')$ that share exactly $k$ components gives
$Var(U_n) = \sum_{k=1}^r \frac{r!}{k!(r-k)!} \frac{(n-r)(n-r-1) \cdots (n-2r+k+1)}{n(n-1) \cdots (n-r+1)} c_k$,
where $c_k$ denotes the covariance above. Hence
$Var(U_n) = \frac{r^2}{n} Cov(h(X_1, X_2, \ldots, X_r), h(X_1, \tilde{X}_2, \ldots, \tilde{X}_r)) + O(1/n^2)$,
so $Var(U_n)/Var(\tilde{U}_n) \to 1$ and, by Proposition 3.8,
$\frac{U_n - \mu}{\sqrt{Var(U_n)}} - \frac{\tilde{U}_n}{\sqrt{Var(\tilde{U}_n)}} \to_p 0$.

Example
In a bivariate i.i.d. sample $(X_1, Y_1), (X_2, Y_2), \ldots$, one statistic measuring the agreement is Kendall's $\tau$-statistic:
$\hat{\tau} = \frac{4}{n(n-1)} \sum_{i < j} I\{(Y_j - Y_i)(X_j - X_i) > 0\} - 1$.
$\hat{\tau} + 1$ is a U-statistic of order 2 with the kernel $2 I\{(y_2 - y_1)(x_2 - x_1) > 0\}$. Hence $\sqrt{n}(\hat{\tau}_n + 1 - 2 P((Y_2 - Y_1)(X_2 - X_1) > 0))$ has an asymptotic normal distribution with mean zero.
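A direct implementation sketch of $\hat\tau$ as an order-2 U-statistic (not part of the original slides; it assumes NumPy, and the bivariate normal sample is an arbitrary illustration):

import numpy as np

def kendall_tau(x, y):
    # average the kernel 2*I{(y_j - y_i)(x_j - x_i) > 0} over all pairs i < j, minus 1
    n = len(x)
    concordant = sum((y[j] - y[i]) * (x[j] - x[i]) > 0
                     for i in range(n) for j in range(i + 1, n))
    return 4.0 * concordant / (n * (n - 1)) - 1.0

rng = np.random.default_rng(9)
x = rng.normal(size=200)
y = x + rng.normal(size=200)   # positively associated pairs
print(kendall_tau(x, y))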

Rank Statistics

Some definitions
$X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ are called the order statistics.
The rank statistics, denoted by $R_1, \ldots, R_n$, are the ranks of the $X_i$ among $X_1, \ldots, X_n$. Thus, if all the $X$'s are different, $X_i = X_{(R_i)}$.
When there are ties, $R_i$ is defined as the average of all indices $j$ such that $X_i = X_{(j)}$ (sometimes called the midrank).
We only consider the case where the $X$'s have continuous densities.

More definitions
A rank statistic is any function of the ranks.
A linear rank statistic is a rank statistic of the special form $\sum_{i=1}^n a(i, R_i)$ for a given matrix $(a(i, j))_{n \times n}$.
If $a(i, j) = c_i a_j$, then a statistic of the form $\sum_{i=1}^n c_i a_{R_i}$ is called a simple linear rank statistic: the $c$'s and $a$'s are called the coefficients and scores.

Examples
For two independent samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$, the Wilcoxon statistic is defined as the summation of all the ranks of the second sample in the pooled data $X_1, \ldots, X_n, Y_1, \ldots, Y_m$, i.e., $W_n = \sum_{i=n+1}^{n+m} R_i$.
Other choices of rank statistics exist: for instance, the van der Waerden statistic $\sum_{i=n+1}^{n+m} \Phi^{-1}(R_i/(n+m+1))$.

Properties of rank statistics
Proposition 3.9 Let $X_1, \ldots, X_n$ be a random sample from a continuous distribution function $F$ with density $f$. Then
1. the vectors $(X_{(1)}, \ldots, X_{(n)})$ and $(R_1, \ldots, R_n)$ are independent;
2. the vector $(X_{(1)}, \ldots, X_{(n)})$ has density $n! \prod_{i=1}^n f(x_i)$ on the set $x_1 < \cdots < x_n$;
3. the variable $X_{(i)}$ has density $n \binom{n-1}{i-1} F(x)^{i-1} (1 - F(x))^{n-i} f(x)$; for $F$ the uniform distribution on $[0, 1]$, it has mean $i/(n+1)$ and variance $i(n-i+1)/[(n+1)^2(n+2)]$;

4. the vector $(R_1, \ldots, R_n)$ is uniformly distributed on the set of all $n!$ permutations of $(1, 2, \ldots, n)$;
5. for any statistic $T$ and permutation $r = (r_1, \ldots, r_n)$ of $(1, 2, \ldots, n)$,
$E[T(X_1, \ldots, X_n) \mid (R_1, \ldots, R_n) = r] = E[T(X_{(r_1)}, \ldots, X_{(r_n)})]$;
6. for any simple linear rank statistic $T = \sum_{i=1}^n c_i a_{R_i}$,
$E[T] = n \bar{c}_n \bar{a}_n$, $Var(T) = \frac{1}{n-1} \sum_{i=1}^n (c_i - \bar{c}_n)^2 \sum_{i=1}^n (a_i - \bar{a}_n)^2$.
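Property 6 can be checked by Monte Carlo over random permutations; a minimal sketch (not in the original notes; it assumes NumPy, and the coefficients and scores are arbitrary):

import numpy as np

rng = np.random.default_rng(10)
n = 8
c = np.arange(1.0, n + 1)                                  # coefficients c_i
a = np.array([1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0])   # scores a_i
# T = sum_i c_i a_{R_i} with (R_1,...,R_n) a uniform random permutation
T = np.array([(c * a[rng.permutation(n)]).sum() for _ in range(200_000)])
ET = n * c.mean() * a.mean()
VarT = ((c - c.mean()) ** 2).sum() * ((a - a.mean()) ** 2).sum() / (n - 1)
print(T.mean(), ET)   # simulated vs exact mean
print(T.var(), VarT)  # simulated vs exact variance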

CLT for rank statistics
Theorem 3.17 Let $T_n = \sum_{i=1}^n c_i a_{R_i}$ be such that
$\max_{i \le n} |a_i - \bar{a}_n| / \sqrt{\sum_{i=1}^n (a_i - \bar{a}_n)^2} \to 0$, $\max_{i \le n} |c_i - \bar{c}_n| / \sqrt{\sum_{i=1}^n (c_i - \bar{c}_n)^2} \to 0$.
Then $(T_n - E[T_n])/\sqrt{Var(T_n)} \to_d N(0, 1)$ if and only if, for every $\epsilon > 0$,
$\sum_{(i,j)} \frac{(a_i - \bar{a}_n)^2 (c_j - \bar{c}_n)^2}{\sum_{i=1}^n (a_i - \bar{a}_n)^2 \sum_{i=1}^n (c_i - \bar{c}_n)^2} I\left\{\frac{\sqrt{n} |a_i - \bar{a}_n| |c_j - \bar{c}_n|}{\sqrt{\sum_{i=1}^n (a_i - \bar{a}_n)^2 \sum_{i=1}^n (c_i - \bar{c}_n)^2}} > \epsilon\right\} \to 0$.

More on rank statistics
A simple linear signed rank statistic has the form $\sum_{i=1}^n a_{R_i^+} \, \mathrm{sign}(X_i)$, where $R_1^+, \ldots, R_n^+$, the absolute ranks, are the ranks of $|X_1|, \ldots, |X_n|$.
In a bivariate sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, one may use $\sum_{i=1}^n a_{R_i} b_{S_i}$, where $(R_1, \ldots, R_n)$ and $(S_1, \ldots, S_n)$ are the respective ranks of $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$.

Martingales

Definition 3.7 Let $\{Y_n\}$ be a sequence of random variables and $\{\mathcal{F}_n\}$ a sequence of $\sigma$-fields such that $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots$. Suppose $E[|Y_n|] < \infty$. Then the pairs $\{(Y_n, \mathcal{F}_n)\}$ are called
- a martingale if $E[Y_n \mid \mathcal{F}_{n-1}] = Y_{n-1}$, a.s.;
- a submartingale if $E[Y_n \mid \mathcal{F}_{n-1}] \ge Y_{n-1}$, a.s.;
- a supermartingale if $E[Y_n \mid \mathcal{F}_{n-1}] \le Y_{n-1}$, a.s.

Some notes on the definition
$Y_1, \ldots, Y_n$ are measurable with respect to $\mathcal{F}_n$. Sometimes, we say $Y_n$ is adapted to $\mathcal{F}_n$.
One simple example: $Y_n = X_1 + \cdots + X_n$, where $X_1, X_2, \ldots$ are i.i.d. with mean zero and $\mathcal{F}_n$ is the $\sigma$-field generated by $X_1, \ldots, X_n$.

Convex functions of martingales
Proposition 3.9 Let $\{(Y_n, \mathcal{F}_n)\}$ be a martingale. For any measurable and convex function $\phi$ with $E[|\phi(Y_n)|] < \infty$, $\{(\phi(Y_n), \mathcal{F}_n)\}$ is a submartingale.

Proof
Clearly, $\phi(Y_n)$ is adapted to $\mathcal{F}_n$. It is sufficient to show $E[\phi(Y_n) \mid \mathcal{F}_{n-1}] \ge \phi(Y_{n-1})$. This follows from the well-known Jensen's inequality: for any convex function $\phi$,
$E[\phi(Y_n) \mid \mathcal{F}_{n-1}] \ge \phi(E[Y_n \mid \mathcal{F}_{n-1}]) = \phi(Y_{n-1})$.

Jensen's inequality
Proposition 3.10 For any random variable $X$ and any convex measurable function $\phi$,
$E[\phi(X)] \ge \phi(E[X])$.

Proof
We claim that for any $x_0$, there exists a constant $k_0$ such that for all $x$,
$\phi(x) \ge \phi(x_0) + k_0(x - x_0)$.
By convexity, for any $x' < y' < x_0 < y < x$,
$\frac{\phi(x_0) - \phi(x')}{x_0 - x'} \le \frac{\phi(y) - \phi(x_0)}{y - x_0} \le \frac{\phi(x) - \phi(x_0)}{x - x_0}$.
Thus $\frac{\phi(x) - \phi(x_0)}{x - x_0}$ is bounded below and decreasing as $x$ decreases to $x_0$. Let the limit be $k_0^+$; then
$\frac{\phi(x) - \phi(x_0)}{x - x_0} \ge k_0^+$, i.e., $\phi(x) \ge k_0^+(x - x_0) + \phi(x_0)$ for $x > x_0$.

Similarly,
$\frac{\phi(x') - \phi(x_0)}{x' - x_0} \le \frac{\phi(y') - \phi(x_0)}{y' - x_0} \le \frac{\phi(x) - \phi(x_0)}{x - x_0}$.
Then $\frac{\phi(x') - \phi(x_0)}{x' - x_0}$ is increasing and bounded above as $x'$ increases to $x_0$. Let the limit be $k_0^-$; then
$\phi(x') \ge k_0^-(x' - x_0) + \phi(x_0)$ for $x' < x_0$.
Clearly, $k_0^+ \ge k_0^-$. Combining the two inequalities, $\phi(x) \ge \phi(x_0) + k_0(x - x_0)$ for $k_0 = (k_0^+ + k_0^-)/2$.
Choose $x_0 = E[X]$; then $\phi(X) \ge \phi(E[X]) + k_0(X - E[X])$, and taking expectations gives the result.

Decomposition of a submartingale
$Y_n = (Y_n - E[Y_n \mid \mathcal{F}_{n-1}]) + E[Y_n \mid \mathcal{F}_{n-1}]$:
any submartingale can be written as the summation of a martingale and a random variable predictable with respect to $\mathcal{F}_{n-1}$.

Convergence of martingales
Theorem 3.18 (Martingale Convergence Theorem) Let $\{(X_n, \mathcal{F}_n)\}$ be a submartingale. If $K = \sup_n E[|X_n|] < \infty$, then $X_n \to_{a.s.} X$, where $X$ is a random variable satisfying $E[|X|] \le K$.

Corollary 3.1 If $\mathcal{F}_n$ is an increasing sequence of $\sigma$-fields and $\mathcal{F}_\infty$ denotes the $\sigma$-field generated by $\cup_{n=1}^\infty \mathcal{F}_n$, then for any random variable $Z$ with $E[|Z|] < \infty$,
$E[Z \mid \mathcal{F}_n] \to_{a.s.} E[Z \mid \mathcal{F}_\infty]$.

CLT for martingales
Theorem 3.19 (Martingale Central Limit Theorem) Let $(Y_{n1}, \mathcal{F}_{n1}), (Y_{n2}, \mathcal{F}_{n2}), \ldots$ be a martingale. Define $X_{nk} = Y_{nk} - Y_{n,k-1}$, with $Y_{n0} = 0$, so that $Y_{nk} = X_{n1} + \cdots + X_{nk}$. Suppose that
$\sum_k E[X_{nk}^2 \mid \mathcal{F}_{n,k-1}] \to_p \sigma^2$,
where $\sigma$ is a positive constant, and that
$\sum_k E[X_{nk}^2 I(|X_{nk}| \ge \epsilon) \mid \mathcal{F}_{n,k-1}] \to_p 0$
for each $\epsilon > 0$. Then
$\sum_k X_{nk} \to_d N(0, \sigma^2)$.

Some Notation

$o_p(1)$ and $O_p(1)$
$X_n = o_p(1)$ denotes that $X_n$ converges in probability to zero; $X_n = O_p(1)$ denotes that $X_n$ is bounded in probability, i.e.,
$\lim_{M \to \infty} \limsup_n P(|X_n| \ge M) = 0$.
For a sequence of random variables $\{r_n\}$, $X_n = o_p(r_n)$ means that $X_n/r_n \to_p 0$, and $X_n = O_p(r_n)$ means that $X_n/r_n$ is bounded in probability.

Algebra with $o_p(1)$ and $O_p(1)$
$o_p(1) + o_p(1) = o_p(1)$, $O_p(1) + O_p(1) = O_p(1)$, $O_p(1) o_p(1) = o_p(1)$, $(1 + o_p(1))^{-1} = 1 + o_p(1)$,
$o_p(R_n) = R_n o_p(1)$, $O_p(R_n) = R_n O_p(1)$, $o_p(O_p(1)) = o_p(1)$.
If a real function $R(\cdot)$ satisfies $R(h) = o(|h|^p)$ as $h \to 0$, then $R(X_n) = o_p(|X_n|^p)$ whenever $X_n \to_p 0$. If $R(h) = O(|h|^p)$ as $h \to 0$, then $R(X_n) = O_p(|X_n|^p)$ whenever $X_n \to_p 0$.

READING MATERIALS: Lehmann and Casella, Section 1.8; Ferguson, Part 1, Part 2, Part


More information

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:

More information

Elementary Probability. Exam Number 38119

Elementary Probability. Exam Number 38119 Elementary Probability Exam Number 38119 2 1. Introduction Consider any experiment whose result is unknown, for example throwing a coin, the daily number of customers in a supermarket or the duration of

More information

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION JAINUL VAGHASIA Contents. Introduction. Notations 3. Background in Probability Theory 3.. Expectation and Variance 3.. Convergence

More information

Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 2015. Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

More information

Multivariate Random Variable

Multivariate Random Variable Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Lecture 21: Expectation of CRVs, Fatou s Lemma and DCT Integration of Continuous Random Variables

Lecture 21: Expectation of CRVs, Fatou s Lemma and DCT Integration of Continuous Random Variables EE50: Probability Foundations for Electrical Engineers July-November 205 Lecture 2: Expectation of CRVs, Fatou s Lemma and DCT Lecturer: Krishna Jagannathan Scribe: Jainam Doshi In the present lecture,

More information

(U) =, if 0 U, 1 U, (U) = X, if 0 U, and 1 U. (U) = E, if 0 U, but 1 U. (U) = X \ E if 0 U, but 1 U. n=1 A n, then A M.

(U) =, if 0 U, 1 U, (U) = X, if 0 U, and 1 U. (U) = E, if 0 U, but 1 U. (U) = X \ E if 0 U, but 1 U. n=1 A n, then A M. 1. Abstract Integration The main reference for this section is Rudin s Real and Complex Analysis. The purpose of developing an abstract theory of integration is to emphasize the difference between the

More information

Probability and Measure

Probability and Measure Chapter 4 Probability and Measure 4.1 Introduction In this chapter we will examine probability theory from the measure theoretic perspective. The realisation that measure theory is the foundation of probability

More information

1. Stochastic Processes and filtrations

1. Stochastic Processes and filtrations 1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

4 Sums of Independent Random Variables

4 Sums of Independent Random Variables 4 Sums of Independent Random Variables Standing Assumptions: Assume throughout this section that (,F,P) is a fixed probability space and that X 1, X 2, X 3,... are independent real-valued random variables

More information

Math 832 Fall University of Wisconsin at Madison. Instructor: David F. Anderson

Math 832 Fall University of Wisconsin at Madison. Instructor: David F. Anderson Math 832 Fall 2013 University of Wisconsin at Madison Instructor: David F. Anderson Pertinent information Instructor: David Anderson Office: Van Vleck 617 email: anderson@math.wisc.edu Office hours: Mondays

More information

Brownian Motion and Conditional Probability

Brownian Motion and Conditional Probability Math 561: Theory of Probability (Spring 2018) Week 10 Brownian Motion and Conditional Probability 10.1 Standard Brownian Motion (SBM) Brownian motion is a stochastic process with both practical and theoretical

More information

the convolution of f and g) given by

the convolution of f and g) given by 09:53 /5/2000 TOPIC Characteristic functions, cont d This lecture develops an inversion formula for recovering the density of a smooth random variable X from its characteristic function, and uses that

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Theoretical Statistics. Lecture 1.

Theoretical Statistics. Lecture 1. 1. Organizational issues. 2. Overview. 3. Stochastic convergence. Theoretical Statistics. Lecture 1. eter Bartlett 1 Organizational Issues Lectures: Tue/Thu 11am 12:30pm, 332 Evans. eter Bartlett. bartlett@stat.

More information

Section 27. The Central Limit Theorem. Po-Ning Chen, Professor. Institute of Communications Engineering. National Chiao Tung University

Section 27. The Central Limit Theorem. Po-Ning Chen, Professor. Institute of Communications Engineering. National Chiao Tung University Section 27 The Central Limit Theorem Po-Ning Chen, Professor Institute of Communications Engineering National Chiao Tung University Hsin Chu, Taiwan 3000, R.O.C. Identically distributed summands 27- Central

More information

Advanced Probability

Advanced Probability Advanced Probability Perla Sousi October 10, 2011 Contents 1 Conditional expectation 1 1.1 Discrete case.................................. 3 1.2 Existence and uniqueness............................ 3 1

More information

Chapter 5. Measurable Functions

Chapter 5. Measurable Functions Chapter 5. Measurable Functions 1. Measurable Functions Let X be a nonempty set, and let S be a σ-algebra of subsets of X. Then (X, S) is a measurable space. A subset E of X is said to be measurable if

More information

µ X (A) = P ( X 1 (A) )

µ X (A) = P ( X 1 (A) ) 1 STOCHASTIC PROCESSES This appendix provides a very basic introduction to the language of probability theory and stochastic processes. We assume the reader is familiar with the general measure and integration

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 1: Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 59 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d

More information

Measure-theoretic probability

Measure-theoretic probability Measure-theoretic probability Koltay L. VEGTMAM144B November 28, 2012 (VEGTMAM144B) Measure-theoretic probability November 28, 2012 1 / 27 The probability space De nition The (Ω, A, P) measure space is

More information

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS SOME CONVERSE LIMIT THEOREMS OR EXCHANGEABLE BOOTSTRAPS Jon A. Wellner University of Washington The bootstrap Glivenko-Cantelli and bootstrap Donsker theorems of Giné and Zinn (990) contain both necessary

More information

Advanced Probability Theory (Math541)

Advanced Probability Theory (Math541) Advanced Probability Theory (Math541) Instructor: Kani Chen (Classic)/Modern Probability Theory (1900-1960) Instructor: Kani Chen (HKUST) Advanced Probability Theory (Math541) 1 / 17 Primitive/Classic

More information

Actuarial Science Exam 1/P

Actuarial Science Exam 1/P Actuarial Science Exam /P Ville A. Satopää December 5, 2009 Contents Review of Algebra and Calculus 2 2 Basic Probability Concepts 3 3 Conditional Probability and Independence 4 4 Combinatorial Principles,

More information

Real Analysis Problems

Real Analysis Problems Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

5 Operations on Multiple Random Variables

5 Operations on Multiple Random Variables EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y

More information

(A n + B n + 1) A n + B n

(A n + B n + 1) A n + B n 344 Problem Hints and Solutions Solution for Problem 2.10. To calculate E(M n+1 F n ), first note that M n+1 is equal to (A n +1)/(A n +B n +1) with probability M n = A n /(A n +B n ) and M n+1 equals

More information

L p Functions. Given a measure space (X, µ) and a real number p [1, ), recall that the L p -norm of a measurable function f : X R is defined by

L p Functions. Given a measure space (X, µ) and a real number p [1, ), recall that the L p -norm of a measurable function f : X R is defined by L p Functions Given a measure space (, µ) and a real number p [, ), recall that the L p -norm of a measurable function f : R is defined by f p = ( ) /p f p dµ Note that the L p -norm of a function f may

More information

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria Weak Leiden University Amsterdam, 13 November 2013 Outline 1 2 3 4 5 6 7 Definition Definition Let µ, µ 1, µ 2,... be probability measures on (R, B). It is said that µ n converges weakly to µ, and we then

More information

Multiple Random Variables

Multiple Random Variables Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An

More information

Elements of Probability Theory

Elements of Probability Theory Elements of Probability Theory CHUNG-MING KUAN Department of Finance National Taiwan University December 5, 2009 C.-M. Kuan (National Taiwan Univ.) Elements of Probability Theory December 5, 2009 1 / 58

More information

MATHS 730 FC Lecture Notes March 5, Introduction

MATHS 730 FC Lecture Notes March 5, Introduction 1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists

More information

Exercises in Extreme value theory

Exercises in Extreme value theory Exercises in Extreme value theory 2016 spring semester 1. Show that L(t) = logt is a slowly varying function but t ǫ is not if ǫ 0. 2. If the random variable X has distribution F with finite variance,

More information

Dalal-Schmutz (2002) and Diaconis-Freedman (1999) 1. Random Compositions. Evangelos Kranakis School of Computer Science Carleton University

Dalal-Schmutz (2002) and Diaconis-Freedman (1999) 1. Random Compositions. Evangelos Kranakis School of Computer Science Carleton University Dalal-Schmutz (2002) and Diaconis-Freedman (1999) 1 Random Compositions Evangelos Kranakis School of Computer Science Carleton University Dalal-Schmutz (2002) and Diaconis-Freedman (1999) 2 Outline 1.

More information