Stochastic Convergence, Delta Method & Moment Estimators


1 Stochastic Convergence, Delta Method & Moment Estimators. Seminar on Asymptotic Statistics. Daniel Hoffmann, University of Kaiserslautern, Department of Mathematics. February 13, 2015. [Footer on every slide: Daniel Hoffmann (TU KL), Seminar: Asymptotic Statistics, February 13, 2015, slide x / 54]

2 Overview
1 Stochastic Convergence: concepts of convergence and basic results; theoretical examples: LLN and CLT; tools for weak convergence; more on weak convergence: tightness and Prohorov's theorem; stochastic Landau notation
2 Delta Method: basic result; Application I: testing variance; Application II: asymptotic confidence intervals and variance-stabilizing transformations
3 Moment Estimators: Method of Moments: definition; existence and asymptotic normality
4 List of literature

3 Chapter 1: Stochastic Convergence


5 Scope and general assumptions
We recall the basic notions of stochastic convergence from probability theory and take a closer look at weak convergence, culminating in Prohorov's theorem. Throughout this talk we fix a probability space (Ω, A, P) on which all appearing random variables are defined unless stated otherwise. Furthermore, let (S, d) be a separable¹ metric space, which serves as codomain; later on we restrict ourselves to the case S = R^k. Let L(P, S) := {X : Ω → S | X is A-B(S)-measurable} denote the space of all random variables of interest.
¹ I.e. there is some countable, dense subset of S. This is just a technical assumption to guarantee the measurability of events like {d(X, Y) > η} for random variables X, Y and a threshold η.

6 Concepts of convergence: definitions and properties
Definition. Let (X_n)_{n∈N} ⊆ L(P, S) and X ∈ L(P, S). The sequence (X_n)_{n∈N} is said to ...
- converge almost surely to X (notation: X_n →^{a.s.} X) if there is some P-null set N ∈ A such that X_n(ω) → X(ω) as n → ∞ for each ω ∈ Ω \ N;
- converge in probability to X (notation: X_n →^P X) if for all ε > 0: lim_{n→∞} P(d(X_n, X) > ε) = 0;
- converge weakly to X (notation: X_n ⇝ X) if for each f ∈ C_b(S): lim_{n→∞} ∫_S f dP(X_n) = ∫_S f dP(X);
- converge in L^p-sense, p ∈ [1, ∞], to X if S = R (notation: X_n →^{L^p} X) if lim_{n→∞} ‖X_n − X‖_{L^p(P)} = 0.
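Convergence in probability can be checked in closed form in simple cases. A minimal sketch of my own (not from the talk): for U uniform on (0, 1), the sequence X_n := U^n converges to 0 in probability, and P(X_n > ε) = P(U > ε^{1/n}) = 1 − ε^{1/n} can be evaluated exactly:

```python
# Convergence in probability, evaluated exactly: X_n = U^n with U ~ Unif(0,1).
# P(X_n > eps) = P(U > eps^(1/n)) = 1 - eps^(1/n) -> 0 as n -> infinity.
eps = 0.01

def prob_exceed(n: int, eps: float) -> float:
    """Exact value of P(U^n > eps) for U ~ Unif(0,1)."""
    return 1.0 - eps ** (1.0 / n)

probs = [prob_exceed(n, eps) for n in (1, 10, 100, 1000, 10000)]
print(probs)  # strictly decreasing towards 0
```

Since ε^{1/n} → 1, the exceedance probabilities shrink to 0 for every fixed ε, exactly as the definition requires.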

7 From probability theory one is familiar with the following relations between these different concepts of convergence:
Proposition (relations). Let (X_n)_{n∈N} ⊆ L(P, S) and X ∈ L(P, S). Then it holds:
(a) X_n →^{a.s.} X ⟹ X_n →^P X ⟹ X_n ⇝ X.
(b) Subsequence principle: X_n →^P X ⟺ for every subsequence (n_k)_{k∈N} there is a further subsequence (n_{k_l})_{l∈N} such that X_{n_{k_l}} →^{a.s.} X.
(c) Slutsky's lemma: let S = R^k and let A_n ∈ L(P, R^k), B_n ∈ L(P, R), n ∈ N, be such that A_n →^P a ∈ R^k and B_n →^P b ∈ R. If X_n ⇝ X, then A_n + B_n X_n ⇝ a + bX.
(d) Let S = R and p ∈ [1, ∞). Then X_n →^{L^p} X ⟹ X_n →^P X.

8 Moreover, the above notions of convergence are compatible with continuity, i.e. a convergent sequence of random variables can be transported to another space by a continuous function while preserving the convergence:
Proposition (continuous mapping principle). Let X_n, X ∈ L(P, S), n ∈ N, and let Φ : S → S' be a Borel-measurable, P(X)-a.e. continuous mapping, where (S', d') is another metric space. Then one has X_n →^* X ⟹ Φ(X_n) →^* Φ(X), where * ∈ {⇝, P, a.s.}.


10 Theoretical examples: where do these notions occur?
The most important examples include:
Theorem (weak law of large numbers). Let (X_n)_{n∈N} be a sequence of uncorrelated R-valued random variables satisfying sup_{n∈N} Var[X_n] < ∞. Then (1/n) Σ_{i=1}^n (X_i − E[X_i]) → 0 in probability and in L².
Theorem (strong law of large numbers). Let (X_n)_{n∈N} ⊆ L¹(P) be a sequence of i.i.d. random variables. Then (1/n) Σ_{i=1}^n X_i → E[X_1] almost surely and in L¹.
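A quick Monte Carlo illustration of the weak law (my own sketch, not part of the talk): for i.i.d. Unif(0, 1) observations, the probability P(|X̄_n − 1/2| > ε) can be estimated by repeated simulation and visibly shrinks as n grows:

```python
import random
import statistics

def exceed_prob(n: int, eps: float, reps: int = 2000, seed: int = 1) -> float:
    """Monte Carlo estimate of P(|mean of n Unif(0,1) draws - 1/2| > eps)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        m = statistics.fmean(rng.random() for _ in range(n))
        if abs(m - 0.5) > eps:
            hits += 1
    return hits / reps

for n in (10, 100, 1000):
    print(n, exceed_prob(n, eps=0.05))
```

The estimated probabilities drop from roughly one half towards zero, matching the L²-bound Var(X̄_n) = 1/(12n) → 0 behind the weak law.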


12 Theorem (central limit theorem). Let (X_n)_{n∈N} be a sequence of i.i.d. R^k-valued random variables satisfying E[‖X_1‖₂²] < ∞. Then we have P( (1/√n) Σ_{i=1}^n (X_i − E[X_i]) ) ⇝ N_k(0, Cov[X_1]).
Theorem (weak law of small numbers). Let {X_{n,m} : m = 1, ..., n, n ∈ N} be a triangular array of independent random variables with P(X_{n,m}) = Bin(1, p_{n,m}), m = 1, ..., n, n ∈ N. Suppose that Σ_{m=1}^n p_{n,m} → λ > 0 and max_{m=1,...,n} p_{n,m} → 0 as n → ∞. Then we have P( Σ_{m=1}^n X_{n,m} ) ⇝ Poi(λ).
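In the homogeneous case p_{n,m} = λ/n the row sum is Bin(n, λ/n)-distributed, so the law of small numbers can be checked exactly. A small sketch of my own, measuring the total-variation distance to Poi(λ) with the sum truncated at a large k (the tail beyond is negligible here):

```python
import math

def tv_bin_poisson(n: int, lam: float, kmax: int = 200) -> float:
    """Total variation distance between Bin(n, lam/n) and Poi(lam),
    summing |pmf differences| up to kmax and halving."""
    p = lam / n
    tv = 0.0
    pois = math.exp(-lam)  # Poi(lam) pmf at k = 0, updated iteratively
    for k in range(kmax + 1):
        if k > 0:
            pois *= lam / k
        binom = math.comb(n, k) * p**k * (1 - p) ** (n - k) if k <= n else 0.0
        tv += abs(binom - pois)
    return tv / 2.0

for n in (5, 50, 500):
    print(n, tv_bin_poisson(n, lam=2.0))  # decreases towards 0
```

The distances shrink roughly like 1/n, consistent with the classical Le Cam bound TV ≤ Σ_m p_{n,m}² = λ²/n for this array.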


14 Tools for weak convergence
Definition (weak convergence, general approach). Let μ_n, μ, n ∈ N, be probability measures on B(S). Then the sequence (μ_n)_{n∈N} converges weakly to μ iff lim_{n→∞} ∫_S f dμ_n = ∫_S f dμ for all f ∈ C_b(S).
Remark (a slight generalization). Hence weak convergence of random variables only depends on their distributions: X_n ⇝ X ⟺ P(X_n) ⇝ P(X). Due to this equivalence, it is possible to define weak convergence for random variables defined on different probability spaces: X_n on (Ω_n, A_n, P_n), n ∈ N, and X on (Ω, A, P). For the sake of simplicity, we will not consider this slight generalization here.

15 From probability theory one is familiar with the following characterization of weak convergence:
Theorem (portmanteau theorem). Let X_n, X, n ∈ N, be S-valued random variables. Then the following are equivalent:
(a) X_n ⇝ X, i.e. E[f(X_n)] → E[f(X)] as n → ∞ for all f ∈ C_b(S).
(b) E[f(X_n)] → E[f(X)] for all Lipschitz-continuous f ∈ C_b(S).
(c) P(X ∈ O) ≤ liminf_{n→∞} P(X_n ∈ O) for all open O ⊆ S.
(d) P(X ∈ F) ≥ limsup_{n→∞} P(X_n ∈ F) for all closed F ⊆ S.
(e) P(X ∈ B) = lim_{n→∞} P(X_n ∈ B) for all B ∈ B(S) with P(X)-negligible boundary, i.e. P(X ∈ ∂B) = 0.
(f) E[f(X_n)] → E[f(X)] for all bounded, B(S)-measurable functions f : S → R that are P(X)-a.e. continuous.


17 In Euclidean k-space the distribution function is an appropriate tool to characterize weak convergence:
Definition. Let X ∈ L(P, R^k). Then its (cumulative) distribution function, for short cdf, is given by F_X : R^k → [0, 1], x ↦ P(X ≤ x) = P(X)( ×_{i=1}^k (−∞, x_i] ).
Remark. Note that, as a consequence of the uniqueness theorem for finite measures, the cdf determines the distribution of X uniquely, since E := { ×_{i=1}^k (−∞, x_i] : x_1, ..., x_k ∈ R } is a π-system (i.e. it is ∩-stable) that generates B(R^k).


19 Proposition (weak convergence on R^k via cdf). Let X_n, X, n ∈ N, be R^k-valued random variables. Then X_n ⇝ X iff F_{X_n}(x) → F_X(x) as n → ∞ for all x ∈ R^k at which F_X is continuous.
Example (N₁(0, 1/n) ⇝ δ₀).
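The example can be checked numerically (my own sketch): F_{X_n}(x) = Φ(√n · x) converges to 1_{[0,∞)}(x) = F_{δ₀}(x) for every x ≠ 0, while at the discontinuity point x = 0 of the limit cdf we only get Φ(0) = 1/2 ≠ 1 = F_{δ₀}(0), which is exactly the point the proposition excludes:

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

def cdf_n(n: int, x: float) -> float:
    """cdf of N(0, 1/n) at x, i.e. Phi(sqrt(n) * x)."""
    return std_normal.cdf(math.sqrt(n) * x)

for x in (-0.1, 0.1):
    print(x, [cdf_n(n, x) for n in (1, 100, 10000)])
print(0.0, cdf_n(10**6, 0.0))  # stays at 0.5: x = 0 is a discontinuity point of F_delta0
```

So pointwise convergence of the cdfs holds at every continuity point of F_{δ₀}, and the failure at x = 0 does not contradict weak convergence.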


22 Some more theory on weak convergence
We have already observed that weak convergence is "weak" in the sense that it is implied by all other concepts of convergence that we have introduced. Let us have a closer look at weak convergence and recall from calculus:
Proposition.
(a) Every convergent sequence in R^k is bounded.
(b) Every bounded sequence in R^k has a convergent subsequence. (Bolzano-Weierstrass theorem)
Is there an analogue involving weak convergence and probabilistic boundedness?

23 Prohorov's theorem
Yes, indeed: Prohorov's theorem answers this question.
Theorem (Prohorov). Let (X_n)_{n∈N} be a sequence of R^k-valued random variables. Then it holds:
(a) If X_n ⇝ X for some R^k-valued random variable X, then (X_n)_{n∈N} is uniformly tight².
(b) If (X_n)_{n∈N} is uniformly tight², then there exists a subsequence (X_{n_j})_{j∈N} with X_{n_j} ⇝ X for some R^k-valued random variable X.
For proving this theorem we need some additional concepts and results.
² This will be made precise shortly.


25 Probabilistic boundedness
Definition (uniform tightness). Let I be an index set and F := {X_i}_{i∈I} a family of R^k-valued random variables. Then F is called uniformly tight or bounded in probability if for every ε > 0 there is a constant M_ε > 0 such that sup_{i∈I} P(‖X_i‖₂ > M_ε) < ε.
Remark. Uniform tightness of a sequence of random vectors in R^k (i.e. I = N) is exactly the definition of the stochastic Landau notation O_P: (X_n)_{n∈N} is uniformly tight iff X_n = O_P(1). We will scrutinize this notation later.


27 Helly's lemma
Definition. A function F : R^k → [0, 1] is called a defective distribution function if there is some finite measure μ on B(R^k) taking values in [0, 1] and a constant c_F ∈ [0, 1] such that F(x) − c_F = μ((−∞, x]) = μ( ×_{i=1}^k (−∞, x_i] ) for all x ∈ R^k.
Remark.
1 By continuity of measures from above we have c_F = lim_{x_i → −∞, i=1,...,k} F(x).
2 A defective distribution function is a cdf iff the underlying finite measure is a probability measure and c_F = 0.


29 Helly's lemma
Lemma (Helly's lemma / Helly's selection theorem). Let (F_n)_{n∈N} be a sequence of distribution functions with domain R^k. Then this sequence possesses a subsequence (F_{n_j})_{j∈N} with the property lim_{j→∞} F_{n_j}(x) = F(x) for each continuity point x ∈ R^k of some defective distribution function F.
Rough idea of the proof. The proof is quite technical, hence we only present the idea of the construction of F. For details, please refer to [Dur10] and [Van98, Lemma 2.5]. (BOARD)


31 Is it really possible that Helly's lemma fails to provide us with an honest cdf? Unfortunately, yes!
Example. Consider a sequence (X_n)_{n∈N} of real-valued random variables satisfying X_n ~ δ_n, n ∈ N. Then the corresponding sequence of distribution functions is given by F_n : R → {0, 1}, x ↦ 1_{[n,∞)}(x). Obviously lim_{j→∞} F_{n_j}(x) = 0 for each x ∈ R and each subsequence (n_j)_{j∈N}, as the mass escapes to infinity. Hence Helly's lemma cannot yield an honest cdf!

32 Prohorov's theorem
Now we are in a position to prove Prohorov's theorem:
Theorem (Prohorov). Let (X_n)_{n∈N} be a sequence of R^k-valued random variables. Then it holds:
(a) If X_n ⇝ X for some R^k-valued random variable X, then (X_n)_{n∈N} is uniformly tight.
(b) If (X_n)_{n∈N} is uniformly tight, then there exists a subsequence (X_{n_j})_{j∈N} with X_{n_j} ⇝ X for some R^k-valued random variable X.
Proof. (BOARD)


34 Stochastic Landau notation
Similar to the well-known O-notation from calculus, one can introduce a stochastic version of the Landau symbols in order to express the speed of convergence (in probability):
Definition. Let (X_n)_{n∈N} and (R_n)_{n∈N} be sequences of R^k-valued and R-valued random variables, respectively. We write:
(a) X_n = O_P(1) :⟺ {X_n : n ∈ N} is uniformly tight.
(b) X_n = O_P(R_n) :⟺ X_n = R_n Y_n for a sequence (Y_n)_{n∈N} of R^k-valued random variables satisfying Y_n = O_P(1).
(c) X_n = o_P(1) :⟺ X_n →^P 0.
(d) X_n = o_P(R_n) :⟺ X_n = R_n Y_n for a sequence (Y_n)_{n∈N} of R^k-valued random variables satisfying Y_n = o_P(1).
Commonly, (R_n)_{n∈N} is called the rate (of convergence).
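A simulation sketch of my own (not from the talk) of the rate statement X̄_n − E[X_1] = O_P(n^{-1/2}): the rescaled deviations √n (X̄_n − E[X_1]) stay bounded in probability, i.e. a single constant M captures them with high probability uniformly over n:

```python
import random
import statistics

def rescaled_dev(n: int, rng: random.Random) -> float:
    """|sqrt(n) * (sample mean of n Unif(0,1) draws - 1/2)|."""
    m = statistics.fmean(rng.random() for _ in range(n))
    return abs(n ** 0.5 * (m - 0.5))

rng = random.Random(7)
M = 1.0  # candidate tightness bound; the sd of the limit is sqrt(1/12) ~ 0.289
for n in (10, 100, 1000):
    reps = 2000
    frac = sum(rescaled_dev(n, rng) > M for _ in range(reps)) / reps
    print(n, frac)  # small for every n: sqrt(n)*(mean - 1/2) = O_P(1)
```

Without the √n rescaling the deviations would collapse to 0 (an o_P(1) statement); with it they neither collapse nor escape, which is precisely uniform tightness.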


37 In our next chapter we will use differential calculus and therefore need the following lemma. Think of R as the remainder term in some Taylor expansion.
Lemma. Let D ⊆ R^k be open with 0 ∈ D and let R : D → R^m be a function with R(0) = 0. Furthermore, let (X_n)_{n∈N} be a sequence of random variables taking values in D with X_n = o_P(1). Then for every p > 0 we have:
(a) R(h) = o(‖h‖₂^p) (h → 0) ⟹ R(X_n) = o_P(‖X_n‖₂^p);
(b) R(h) = O(‖h‖₂^p) (h → 0) ⟹ R(X_n) = O_P(‖X_n‖₂^p).
Proof. Let p > 0 and define g(h) := R(h)/‖h‖₂^p if h ≠ 0, and g(h) := 0 else. Then for each n ∈ N we have R(X_n) = ‖X_n‖₂^p g(X_n), i.e. R_n := ‖X_n‖₂^p and Y_n := g(X_n) in the sense of the preceding definition.


41 Proof (continued).
(a) By assumption lim_{h→0} g(h) = lim_{h→0} R(h)/‖h‖₂^p = 0, i.e. g is continuous at 0. Since X_n →^P 0 (by assumption), the continuous mapping principle yields g(X_n) →^P 0, i.e. g(X_n) = o_P(1). Thus R(X_n) = o_P(‖X_n‖₂^p).
(b) By assumption there are some δ > 0 and some M > 0 such that ‖g(h)‖₂ = ‖R(h)‖₂ / ‖h‖₂^p ≤ M for all h ∈ B_δ(0) ∩ D. Since X_n →^P 0 (by assumption), we obtain: P(‖g(X_n)‖₂ > M) ≤ P(‖X_n‖₂ > δ) → 0 as n → ∞.


44 Proof (continued). Hence for given ε > 0 we can choose n_ε ∈ N such that P(‖g(X_n)‖₂ > M) < ε/2 for all n > n_ε. For n ∈ {1, ..., n_ε} we choose M_ε ≥ M suitably large such that P(‖g(X_n)‖₂ > M_ε) < ε/2 for these n as well. Obviously, this yields sup_{n∈N} P(‖g(X_n)‖₂ > M_ε) ≤ ε/2 < ε, i.e. {g(X_n)}_{n∈N} is uniformly tight. Thus g(X_n) = O_P(1) and R(X_n) = O_P(‖X_n‖₂^p).
Example (LLN). Let (X_n)_{n∈N} ⊆ L¹(P) be a sequence of i.i.d. random variables and define S_n := Σ_{i=1}^n X_i. Then we know that S_n/n = X̄_n →^P E[X_1], i.e. S_n − nE[X_1] = o_P(n).

45 Chapter 2: Delta Method


47 Motivation/idea
Given a limit law of √n (T_n − θ) (often derived from the CLT), how can one deduce a limit law of √n (φ(T_n) − φ(θ)), where φ is some differentiable mapping? Use a Taylor expansion!
Remark. In applications, T_n often is an estimator for some parameter θ. Note that the question appeals to the limit distribution; hence φ(T_n) may inherit a property like asymptotic efficiency from T_n.

48 Let us recall some definitions concerning estimators:
Definition. Let {P_ϑ}_{ϑ∈Θ} be a family of probability measures on B(R^m), consider an i.i.d. sample X_1, ..., X_n with P(X_1) ∈ {P_ϑ}_{ϑ∈Θ}, and assume T_n = T(X_1, ..., X_n) : Ω → Θ to be an estimator for ϑ.
(a) T_n is called consistent iff T_n →^P ϑ under X_1 ~ P_ϑ, for all ϑ ∈ Θ.
(b) T_n is called unbiased iff bias_ϑ(T_n) := E_ϑ[T_n] − ϑ = 0 for all ϑ ∈ Θ. (Existence of the involved integral is required.)
(c) T_n is called asymptotically efficient iff (provided that T_n is R^k-valued) for all ϑ ∈ Θ we have P_ϑ( √n (T_n − ϑ) ) ⇝ N_k(0, I(P_ϑ)^{-1}). Here I(P_ϑ) := ( E_ϑ[ ∂/∂ϑ_i log f_ϑ(X) · ∂/∂ϑ_j log f_ϑ(X) ] )_{i,j=1,...,k} denotes the Fisher information matrix of P_ϑ, where Θ ⊆ R^k is assumed. Moreover, X ~ P_ϑ is an R^m-valued random variable and f_ϑ = dP_ϑ/dλ^m if P_ϑ ≪ λ^m, and f_ϑ = dP_ϑ/d#^m if P_ϑ ≪ #^m (in either case for all ϑ ∈ Θ).

49 Delta method: basic result
Theorem (delta method). Let D ⊆ R^k be an open subset and suppose that φ : D → R^m is a mapping that is differentiable at θ ∈ D. Furthermore, let {T_n}_{n∈N} be a family of D-valued random variables and let (r_n)_{n∈N} be a sequence of real numbers satisfying 0 < r_n ↑ ∞. If r_n (T_n − θ) ⇝ T for some R^k-valued random variable T, then r_n (φ(T_n) − φ(θ)) ⇝ Dφ(θ) T, where Dφ(θ) ∈ L(R^k, R^m) ≅ R^{m×k} denotes the Fréchet derivative (represented by the Jacobian matrix) of φ at θ. Moreover, we have ‖ r_n (φ(T_n) − φ(θ)) − Dφ(θ)( r_n (T_n − θ) ) ‖₂ →^P 0.
Proof. (BOARD)
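A Monte Carlo sketch of my own illustrating the theorem with k = m = 1: for i.i.d. Exp(1) data, √n (X̄_n − 1) ⇝ N(0, 1), and for φ(x) = x² the delta method predicts √n (X̄_n² − 1) ⇝ N(0, φ'(1)² · 1) = N(0, 4):

```python
import random
import statistics

def delta_stat(n: int, rng: random.Random) -> float:
    """sqrt(n) * (phi(sample mean) - phi(1)) for phi(x) = x^2, Exp(1) samples."""
    m = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    return n ** 0.5 * (m * m - 1.0)

rng = random.Random(42)
sample = [delta_stat(2000, rng) for _ in range(3000)]
print(statistics.fmean(sample))      # close to 0
print(statistics.pvariance(sample))  # close to phi'(1)^2 * Var[X_1] = 4
```

The empirical variance of the transformed, rescaled statistic lands near 4 rather than near Var[X_1] = 1, which is exactly the Jacobian factor Dφ(θ)² = 4 at work.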

50 There is also a slightly more general result if we assume φ to be of class C¹ around θ. We state it without proof:
Theorem (uniform delta method). Let D ⊆ R^k be an open subset and suppose that φ : D → R^m is continuously differentiable in an open neighborhood of θ ∈ D. Furthermore, let {T_n}_{n∈N} be a family of D-valued random variables and let (r_n)_{n∈N} be a sequence of real numbers satisfying 0 < r_n ↑ ∞. If r_n (T_n − θ_n) ⇝ T for some R^k-valued random variable T and some sequence θ_n → θ in D, then r_n (φ(T_n) − φ(θ_n)) ⇝ Dφ(θ) T, where Dφ(θ) ∈ L(R^k, R^m) ≅ R^{m×k} denotes the Fréchet derivative (represented by the Jacobian matrix) of φ at θ. Moreover, we have ‖ r_n (φ(T_n) − φ(θ_n)) − Dφ(θ)( r_n (T_n − θ_n) ) ‖₂ →^P 0.


52 Application I: testing variance, CLT revisited
Example (sample variance). Given a data set consisting of i.i.d. observations X_1, ..., X_n ∈ L⁴(P), n ∈ N, we want to estimate its variance. To this end we consider the biased estimator S̃²_n := (1/n) Σ_{i=1}^n (X_i − X̄_n)² = Q_n − (X̄_n)² = φ(X̄_n, Q_n), where Q_n := (1/n) Σ_{i=1}^n X_i² denotes the second empirical moment and φ(x, y) := y − x², (x, y)ᵀ ∈ R². Define μ_k := E[X_1^k], k = 1, ..., 4. Then for the vectors (X_i, X_i²)ᵀ, i = 1, ..., n, it holds by the CLT: √n( (X̄_n, Q_n)ᵀ − (μ₁, μ₂)ᵀ ) = (1/√n) Σ_{i=1}^n ( (X_i, X_i²)ᵀ − E[(X_i, X_i²)ᵀ] ) ⇝ Z.


54 Example (sample variance, continued). √n( (X̄_n, Q_n)ᵀ − (μ₁, μ₂)ᵀ ) ⇝ Z, where Z ~ N₂(0, Σ) with covariance matrix
Σ = ( μ₂ − μ₁²    μ₃ − μ₁μ₂
      μ₃ − μ₁μ₂   μ₄ − μ₂² ).
Hence, the delta method implies (since S̃²_n = φ(X̄_n, Q_n), where φ(x, y) = y − x²): √n( φ(X̄_n, Q_n) − φ(μ₁, μ₂) ) ⇝ Dφ(μ₁, μ₂) Z, i.e. (since Dφ(x, y) z = (−2x, 1) z, z ∈ R²): √n( S̃²_n − (μ₂ − μ₁²) ) ⇝ Dφ(μ₁, μ₂) Z = (−2μ₁, 1) Z.
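The scalar limit variance follows from the quadratic form v Σ vᵀ with v = Dφ(μ₁, μ₂) = (−2μ₁, 1); written out (a step the talk does on the board):

```latex
\begin{aligned}
\operatorname{Var}\bigl(D\varphi(\mu_1,\mu_2)\,Z\bigr)
  &= \begin{pmatrix}-2\mu_1 & 1\end{pmatrix}
     \begin{pmatrix}\mu_2-\mu_1^2 & \mu_3-\mu_1\mu_2\\[2pt] \mu_3-\mu_1\mu_2 & \mu_4-\mu_2^2\end{pmatrix}
     \begin{pmatrix}-2\mu_1\\ 1\end{pmatrix}\\
  &= 4\mu_1^2(\mu_2-\mu_1^2) - 4\mu_1(\mu_3-\mu_1\mu_2) + (\mu_4-\mu_2^2)\\
  &= \mu_4 - \mu_2^2 - 4\mu_1^4 - 4\mu_1\mu_3 + 8\mu_1^2\mu_2 .
\end{aligned}
```

For μ₁ = 0 this collapses to μ₄ − μ₂², the centered form used later in the talk.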


56 Example (sample variance, continued). √n( S̃²_n − (μ₂ − μ₁²) ) ⇝ Dφ(μ₁, μ₂) Z, where Dφ(μ₁, μ₂) Z = (−2μ₁, 1) Z ~ N₁( 0, (−2μ₁, 1) Σ (−2μ₁, 1)ᵀ ) = N₁( 0, μ₄ − μ₂² − 4μ₁⁴ − 4μ₁μ₃ + 8μ₁²μ₂ ). Since √n( Ŝ²_n − S̃²_n ) →^P 0 for Ŝ²_n := n/(n−1) · S̃²_n, Slutsky's lemma implies that this result also holds for the corresponding unbiased estimator Ŝ²_n. We will apply this result to construct a test for the variance of a data set.
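The limit variance equals the central-moment expression E[(X − μ₁)⁴] − (Var X)², the classical asymptotic variance of the sample variance. A quick check of my own, for a concrete three-point distribution, confirming that the raw-moment formula above matches:

```python
# Check: mu4 - mu2^2 - 4*mu1^4 - 4*mu1*mu3 + 8*mu1^2*mu2
# equals the central-moment form E[(X - mu1)^4] - (Var X)^2.
values = [0.0, 1.0, 3.0]   # a concrete three-point distribution
probs = [0.5, 0.3, 0.2]

mu = [sum(p * v**k for v, p in zip(values, probs)) for k in range(5)]  # mu[k] = E[X^k]
raw = mu[4] - mu[2]**2 - 4 * mu[1]**4 - 4 * mu[1] * mu[3] + 8 * mu[1]**2 * mu[2]

var = mu[2] - mu[1]**2
m4c = sum(p * (v - mu[1])**4 for v, p in zip(values, probs))  # central 4th moment
central = m4c - var**2

print(raw, central)  # identical up to rounding
```

Agreement of the two expressions (and positivity of the result) is a useful sanity check whenever the raw-moment form is transcribed by hand.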


59 First, we recall some notions and results from mathematical statistics:
Definition (kurtosis). Let X ∈ L⁴(P) be a random variable. Then the kurtosis of X is defined by κ_X := E[(X − E[X])⁴] / ( E[(X − E[X])²] )² = E[(X − E[X])⁴] / (Var(X))².
Remark. If X ~ N₁(μ, σ²), then κ_X = 3.
Definition (chi-square distribution). Let X_1, ..., X_n ~ N₁(0, 1) be i.i.d. random variables. Then the probability measure χ²_n := P( Σ_{i=1}^n X_i² ) on B(R) is called the chi-square distribution with n degrees of freedom.


61 From mathematical statistics one is familiar with the following
Proposition. Let X_1, ..., X_n ~ N₁(μ, σ²) be i.i.d. random variables and Ŝ²_n = 1/(n−1) Σ_{i=1}^n (X_i − X̄_n)² the unbiased estimator of σ² from above. Then P( (n−1) Ŝ²_n / σ² ) = χ²_{n−1}.
This result gives rise to the following test for normal data:
Example (one-sided test for σ²). In the situation of the preceding proposition and for given σ₀² > 0, we want to test H₀ : σ² = σ₀² vs. H₁ : σ² > σ₀² at level α ∈ (0, 1).
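The resulting test is easy to run in practice. A sketch of my own (the helper name `var_test` is mine, not from the talk): since the Python standard library has no χ² quantile function, the critical value is approximated via the CLT for chi-square distributions, q ≈ (n−1) + √(2(n−1)) · Φ⁻¹(1−α), so the stated level only holds approximately for large n:

```python
import math
import random
import statistics
from statistics import NormalDist

def var_test(xs: list[float], sigma0_sq: float, alpha: float = 0.05) -> bool:
    """One-sided chi-square variance test: reject H0 iff T_n = (n-1)*S^2/sigma0^2
    exceeds the (1-alpha)-quantile of chi^2_{n-1}, here approximated via the
    normal approximation q ~ (n-1) + sqrt(2(n-1)) * z_{1-alpha}."""
    n = len(xs)
    s2 = statistics.variance(xs)        # unbiased estimator of sigma^2
    t = (n - 1) * s2 / sigma0_sq        # test statistic T_n
    z = NormalDist().inv_cdf(1.0 - alpha)
    q_approx = (n - 1) + math.sqrt(2 * (n - 1)) * z
    return t > q_approx

# Estimated rejection rate under H0 (sigma^2 = 1) should be close to alpha.
rng = random.Random(3)
reps = 2000
rate = sum(var_test([rng.gauss(0, 1) for _ in range(200)], 1.0) for _ in range(reps)) / reps
print(rate)
```

For normal data of sample size 200 the simulated type-I error sits near the nominal 5%, slightly above it because the normal approximation underestimates the right-skewed χ² quantile.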


65 Example (one-sided test for σ², continued). As a test statistic we use T_n := (n−1) Ŝ²_n / σ₀² and decide to reject H₀ if T_n > q^{(1−α)}_{χ²_{n−1}} (the (1−α)-quantile of the χ²_{n−1}-distribution). Hence we obtain: P_{σ₀²}( T_n > q^{(1−α)}_{χ²_{n−1}} ) = 1 − χ²_{n−1}( (−∞, q^{(1−α)}_{χ²_{n−1}}] ) = α, i.e. the test has exactly level α.
What if we know that the given data are not normally distributed? We use the approximation √n( S̃²_n − (μ₂ − μ₁²) ) ⇝ Z, where Z ~ N₁( 0, μ₄ − μ₂² − 4μ₁⁴ − 4μ₁μ₃ + 8μ₁²μ₂ ), from above to derive a test of asymptotic level α for certain data sets.


67 Example (one-sided test for σ² without normality assumption). Let X_1, ..., X_n ∈ L⁴(P) be i.i.d. random variables. As above, let μ_k := E[X_1^k]. For the sake of simplicity we assume μ₁ = 0.³ We obtain: σ² := Var(X_1) = μ₂ and κ := κ_{X_1} = μ₄/μ₂². Hence, our approximation reduces to √n( S̃²_n/μ₂ − 1 ) ⇝ Z ~ N₁(0, κ − 1). Again, for given σ₀² > 0, we want to test H₀ : σ² = σ₀² vs. H₁ : σ² > σ₀² at level α ∈ (0, 1). We use T_n := √n( Ŝ²_n/σ₀² − 1 ) as test statistic.
³ This is not a restriction: centering the observations neither affects their dispersion nor Ŝ²_n = 1/(n−1) Σ_{i=1}^n (X_i − X̄_n)². (After centering one has to use centered moments instead of our μ_k.)

68/69 [Simulation slides: histograms of the standardized statistic T_n/√(κ−1) = √n/√(κ−1) · (Ŝ²_n/σ₀² − 1) against the N₁(0, 1) density, 1000 repetitions each.]


71 A remark on quantiles of χ²_{n−1}
Remark
Let C_{n−1} ∼ χ²_{n−1}, n ∈ N_{>1}. Then the CLT implies

    ( C_{n−1} − (n−1) ) / √(2n−2) → N₁(0, 1)  weakly.

Hence, for α ∈ (0, 1), the (1−α)-quantile of the latter probability distribution converges to that of a standard normal distribution, as quantiles of this distribution are uniquely determined⁴:

    lim_{n→∞} ( q^{(1−α)}_{χ²_{n−1}} − (n−1) ) / √(2n−2) = Φ⁻¹(1−α),

where the fraction on the left equals q_n / √2 for q_n := ( q^{(1−α)}_{χ²_{n−1}} − (n−1) ) / √(n−1); hence lim_{n→∞} q_n = √2 Φ⁻¹(1−α).
⁴ This is due to the fact that Φ increases strictly. (→ Probability Theory)
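This normal approximation of χ² quantiles can be checked directly. A sketch assuming SciPy (the sample sizes are arbitrary choices of mine):

```python
import math
from scipy.stats import chi2, norm

alpha = 0.05
z = norm.ppf(1 - alpha)                  # Phi^{-1}(1 - alpha)
target = math.sqrt(2) * z                # claimed limit of q_n
for n in (10, 100, 1000, 100000):
    q = chi2.ppf(1 - alpha, df=n - 1)    # (1-alpha)-quantile of chi^2_{n-1}
    q_n = (q - (n - 1)) / math.sqrt(n - 1)
    print(n, round(q_n, 4), round(target, 4))
```

For growing n the rescaled quantile q_n approaches √2 Φ⁻¹(1−α) ≈ 2.33 for α = 0.05.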




75 Example (One-sided test for σ² (continued))
Recall: T_n := √n ( Ŝ²_n / σ²_0 − 1 ) → Z ∼ N₁(0, κ − 1) under P_{σ²_0}, and lim_{n→∞} q_n = √2 Φ⁻¹(1−α).
Thus Slutsky's lemma implies: under P_{σ²_0},

    T_n − q_n → N₁( −√2 Φ⁻¹(1−α), κ − 1 ).

We implement the following decision rule: Reject H₀ if T_n > ( q^{(1−α)}_{χ²_{n−1}} − (n−1) ) / √(n−1) = q_n.
For the error of type I, we obtain:

    P_{σ²_0}(T_n > q_n) = P_{σ²_0}(T_n − q_n > 0) → 1 − N₁( −√2 Φ⁻¹(1−α), κ − 1 )( (−∞, 0] )  as n → ∞  (portmanteau (e)).

76 Example (One-sided test for σ² (continued))

    P_{σ²_0}(T_n > q_n) → 1 − Φ( √2 Φ⁻¹(1−α) / √(κ − 1) )  as n → ∞,

which equals α iff κ = 3, and is ≤ α iff κ ≤ 3.
Hence our decision rule establishes an (asymptotic) one-sided test (of level α) for σ² iff the distribution of the observations is platykurtic or mesokurtic, i.e. κ < 3 or κ = 3, respectively.
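The limiting type-I error can be tabulated as a function of the kurtosis. A sketch using only the standard library (the function name and example kurtosis values are my own):

```python
from statistics import NormalDist

nd = NormalDist()

def asymptotic_level(kappa, alpha=0.05):
    # Limit of P(T_n > q_n): 1 - Phi( sqrt(2) Phi^{-1}(1 - alpha) / sqrt(kappa - 1) )
    z = nd.inv_cdf(1 - alpha)
    return 1 - nd.cdf(2 ** 0.5 * z / (kappa - 1) ** 0.5)

for kappa in (2.0, 3.0, 9.0):  # platykurtic, mesokurtic (normal), leptokurtic
    print(kappa, round(asymptotic_level(kappa), 4))
```

For κ = 3 the limit is exactly α; for heavier tails (κ > 3) the χ²-based rule is anti-conservative.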

77 Recall:
H₀: σ² = σ²_0.
T̃_n := (n−1)Ŝ²_n / σ²_0 — used with normal data. Reject H₀ if T̃_n > q^{(1−α)}_{χ²_{n−1}}.
T_n := √n ( Ŝ²_n / σ²_0 − 1 ) — used with possibly non-normal data. Reject H₀ if T_n > q_n = ( q^{(1−α)}_{χ²_{n−1}} − (n−1) ) / √(n−1).
(The first statistic is written T̃_n here to distinguish the two.)
Remark
The presented testing procedures are closely related: They are based on the same (asymptotic) decision rule (if µ₁ = 0), as one can prove:

    T_n > q_n  ⇔  T̃_n − (n−1) > √((n−1)/n) · ( q^{(1−α)}_{χ²_{n−1}} − (n−1) ),

where √((n−1)/n) ≈ 1 for large n.



80 Application II: Asymptotic confidence intervals and variance-stabilizing transformations (VST)
We are given a parametric model {P_ϑ}_{ϑ∈Θ} of probability measures on B(R^k) and assume Θ := (ϑ₋, ϑ₊) ⊆ R to be an open interval. Furthermore assume the existence of an estimator T_n = T(X₁, ..., X_n) of ϑ ∈ Θ (where {X_l}_{l∈N} is a family of i.i.d. R^k-valued random variables whose distribution P^{X₁} belongs to {P_ϑ}_{ϑ∈Θ}⁵) that satisfies

    √n (T_n − ϑ) → T ∼ N₁(0, σ²(ϑ))  under P_ϑ, for all ϑ ∈ Θ.

We assume that σ²(·) is known as a function of ϑ.
Task: For fixed γ ∈ (0, 1), find an asymptotic confidence interval for ϑ.
First idea: Consider the asymptotic γ-confidence interval

    CI_{ϑ;n}(γ) := [ T_n − Φ⁻¹((1+γ)/2) σ(ϑ)/√n , T_n + Φ⁻¹((1+γ)/2) σ(ϑ)/√n ].

⁵ Formally, for l ∈ N, X_l is defined on (Ω, A, P) as fixed above. If ϑ ∈ Θ is the true (but unknown) parameter, then the image measure of P under X₁ (i.e. the distribution of X₁) is given by P_ϑ.
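As a sanity check of the first idea, the following sketch (my own, not from the slides; NumPy assumed) uses the Poisson model with T_n = X̄_n and σ²(ϑ) = ϑ, and plugs the true ϑ into σ(ϑ) — exactly the oracle knowledge that is not available in practice:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
gamma, n, reps = 0.9, 200, 5000
theta = 4.0                                # true Poisson mean; sigma^2(theta) = theta
c = NormalDist().inv_cdf((1 + gamma) / 2)  # Phi^{-1}((1 + gamma)/2)

x = rng.poisson(theta, size=(reps, n))
t = x.mean(axis=1)                         # T_n = sample mean
half = c * np.sqrt(theta) / np.sqrt(n)     # oracle half-width: uses the true sigma(theta)
coverage = np.mean((t - half <= theta) & (theta <= t + half))
print(round(coverage, 3))                  # approximately gamma
```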



83 First idea: Consider the asymptotic γ-confidence interval

    CI_{ϑ;n}(γ) = [ T_n − Φ⁻¹((1+γ)/2) σ(ϑ)/√n , T_n + Φ⁻¹((1+γ)/2) σ(ϑ)/√n ].

Problem: ϑ, and thus σ(ϑ), is unknown in general. Hence these confidence intervals are useless in practice.
Solution 1: Estimate σ²(ϑ) using a consistent estimator. This approach is discussed in Mathematical Statistics.
Solution 2: Use a variance-stabilizing transformation of the given data set.

84 Variance-stabilizing transformations
Assumption
Let ϑ₀ ∈ Θ be fixed. We assume that the mapping

    Θ = (ϑ₋, ϑ₊) ∋ ϑ ↦ ∫_{ϑ₀}^{ϑ} 1/σ(θ) dθ ∈ R

is well-defined and differentiable (with derivative 1/σ(·)).
Definition (VST)
In the stated situation, under the latter assumption and for some fixed η > 0, the differentiable mapping

    φ : Θ = (ϑ₋, ϑ₊) → R,  ϑ ↦ ∫_{ϑ₀}^{ϑ} η/σ(θ) dθ

is called a variance-stabilizing transformation.
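For a concrete instance (my own example, not from the slides): in the Poisson model σ(θ) = √θ, and with η = 1 and ϑ₀ = 1 the definition gives φ(ϑ) = 2(√ϑ − √ϑ₀). A sketch comparing the defining integral (midpoint rule) with this closed form:

```python
from math import sqrt

def phi(theta, theta0=1.0, eta=1.0, steps=100000):
    # VST phi(theta) = integral from theta0 to theta of eta / sigma(s) ds,
    # evaluated by the midpoint rule, with sigma(s) = sqrt(s) (Poisson model).
    h = (theta - theta0) / steps
    return eta * h * sum(1.0 / sqrt(theta0 + (i + 0.5) * h) for i in range(steps))

print(phi(4.0), 2 * (sqrt(4.0) - sqrt(1.0)))  # both approximately 2.0
```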



87 Recall: φ(ϑ) = ∫_{ϑ₀}^{ϑ} η/σ(θ) dθ, ϑ ∈ (ϑ₋, ϑ₊)
Remark (Basic properties)
1. φ is continuous and, due to η > 0 and σ > 0 on Θ, also strictly increasing, as its derivative equals φ′ = η/σ. Hence φ is invertible.
2. φ exhibits the variance-stabilizing property: φ′ · σ ≡ η.
Remark (What is the origin of this name?)
Recall: √n (T_n − ϑ) → T ∼ N₁(0, σ²(ϑ)) under P_ϑ for all ϑ ∈ Θ, and φ is differentiable on Θ. Hence, the delta method implies:

    √n ( φ(T_n) − φ(ϑ) ) → φ′(ϑ) T ∼ N₁(0, (φ′(ϑ))² σ²(ϑ)) = N₁(0, η²)  under P_ϑ, for all ϑ ∈ Θ,

i.e. the asymptotic variance is stabilized to η² (which is usually chosen to be 1 in practice).
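The stabilization is visible in simulation. A sketch (my own; NumPy assumed) for the Poisson model with T_n = X̄_n, σ²(ϑ) = ϑ and φ(ϑ) = 2√ϑ, so that η = 1:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 400, 4000

for theta in (0.5, 2.0, 8.0):
    x = rng.poisson(theta, size=(reps, n))
    t = x.mean(axis=1)                                        # T_n = X bar
    raw = np.sqrt(n) * (t - theta)                            # variance approx. theta
    vst = np.sqrt(n) * (2 * np.sqrt(t) - 2 * np.sqrt(theta))  # variance approx. eta^2 = 1
    print(theta, round(raw.var(ddof=1), 2), round(vst.var(ddof=1), 2))
```

The raw variance tracks ϑ, while the transformed one stays near 1 for every ϑ.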


89 Recall: T_n is an estimator of ϑ with √n (T_n − ϑ) → T ∼ N₁(0, σ²(ϑ)) under P_ϑ for all ϑ ∈ Θ.
Goal: Find an asymptotic γ-confidence interval for ϑ.
Derived so far: √n ( φ(T_n) − φ(ϑ) ) → φ′(ϑ) T ∼ N₁(0, η²) under P_ϑ for all ϑ ∈ Θ.
Example (Asymptotic CI via VST)
In the above situation and using a variance-stabilizing transformation φ, our First Idea implies that

    CI_{φ(ϑ);n}(γ) := [ φ(T_n) − Φ⁻¹((1+γ)/2) η/√n , φ(T_n) + Φ⁻¹((1+γ)/2) η/√n ]

is an asymptotic γ-confidence interval for φ(ϑ).



92 Example (Asymptotic CI via VST (continued))

    CI_{φ(ϑ);n}(γ) := [ φ(T_n) − Φ⁻¹((1+γ)/2) η/√n , φ(T_n) + Φ⁻¹((1+γ)/2) η/√n ]

is an asymptotic γ-confidence interval for φ(ϑ).
Idea: Transform this interval using φ⁻¹ to obtain an asymptotic CI for ϑ.
We know for ϑ ∈ Θ:

    γ ≤ liminf_{n→∞} P_ϑ( φ(ϑ) ∈ CI_{φ(ϑ);n}(γ) ) = liminf_{n→∞} P_ϑ( ϑ ∈ φ⁻¹{ CI_{φ(ϑ);n}(γ) } ).

Now, since φ is continuous and strictly increasing (in particular one-to-one), φ⁻¹{ CI_{φ(ϑ);n}(γ) } ⊆ R is really an interval. Thus, depending on a specific φ, we obtain an asymptotic γ-confidence interval for ϑ. Note that this interval is easy to compute in practice: Just apply φ⁻¹ to the boundary values of CI_{φ(ϑ);n}(γ).
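As an illustration (my own choice of model and parameters, not from the slides): in the Poisson model with T_n = X̄_n and σ(ϑ) = √ϑ, one VST is φ(ϑ) = 2√ϑ with η = 1, and φ⁻¹(y) = (y/2)². A sketch (NumPy assumed) checking the coverage of the back-transformed interval:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(4)
gamma, n, reps, theta = 0.9, 200, 5000, 4.0
c = NormalDist().inv_cdf((1 + gamma) / 2)

x = rng.poisson(theta, size=(reps, n))
t = x.mean(axis=1)                        # T_n = X bar
lo = 2 * np.sqrt(t) - c / np.sqrt(n)      # CI for phi(theta) = 2 sqrt(theta), eta = 1
hi = 2 * np.sqrt(t) + c / np.sqrt(n)
lo = np.maximum(lo, 0.0)                  # phi^{-1}(y) = (y/2)^2 is only valid for y >= 0
cover = np.mean(((lo / 2) ** 2 <= theta) & (theta <= (hi / 2) ** 2))
print(round(cover, 3))                    # approximately gamma
```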

93 Chapter 3 Moment Estimators


95 Method of Moments
Let Θ ⊆ R^k and {P_ϑ}_{ϑ∈Θ} be a family of probability measures on B(R). As above we assume that {X_l}_{l∈N} is a family of i.i.d. R-valued random variables with P^{X₁} ∈ {P_ϑ}_{ϑ∈Θ}, i.e. the distribution of X₁ is known up to the parameter vector ϑ ∈ Θ.
Given some functions f₁, ..., f_k : R → R, the method of moments pursues the following ansatz⁶: Find ϑ ∈ Θ s.t.

    (1/n) Σ_{i=1}^n f_j(X_i) = E_ϑ[ f_j(X₁) ],  j = 1, ..., k.⁷

It is obvious that the LLN motivates this approach.
⁶ Of course, this requires certain integrability conditions on these functions and X₁ s.t. all involved expectations are well-defined.
⁷ Note that it is not clear a priori whether such a ϑ exists.
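A classical instance of this ansatz (my own worked example, not from the slides; NumPy assumed): for the Gamma(k, s) model, matching the first two moments E[X] = ks and Var(X) = E[X²] − E[X]² = ks² yields closed-form estimators:

```python
import numpy as np

rng = np.random.default_rng(5)
shape, scale, n = 3.0, 2.0, 200000       # true Gamma parameters (my choice)

x = rng.gamma(shape, scale, size=n)
m1, m2 = x.mean(), (x ** 2).mean()       # empirical moments for f_1(x) = x, f_2(x) = x^2
v = m2 - m1 ** 2                         # implied variance k s^2
shape_hat = m1 ** 2 / v                  # solves E[X] = k s, Var(X) = k s^2
scale_hat = v / m1
print(round(shape_hat, 2), round(scale_hat, 2))
```

With a large sample both estimators land close to the true values (3, 2).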



98 Ansatz: Find ϑ ∈ Θ s.t. (1/n) Σ_{i=1}^n f_j(X_i) = E_ϑ[ f_j(X₁) ], j = 1, ..., k.
Remark (Recourse to Mathematical Statistics)
Consider f_j(x) := x^j, j = 1, ..., k. Then the method of moments reduces to finding ϑ ∈ Θ s.t.

    (1/n) Σ_{i=1}^n X_i^j = E_ϑ[ X₁^j ],  j = 1, ..., k.

Now we want to scrutinize conditions for existence and asymptotic normality of this type of estimator (to be introduced shortly). Therefore, we use the following
Notation: f := (f₁, ..., f_k)^T,  e : Θ → R^k, ϑ ↦ E_ϑ[ f(X₁) ].

99 Moment estimators
Therefore, the equation of interest is given by

    ( (1/n) Σ_{i=1}^n f_j(X_i) )_{j=1,...,k} = e(ϑ).  (*)

Definition (moment estimators)
An estimator ϑ̂_n solving equation (*) is called a moment estimator.

100 Existence and asymptotic normality
Theorem
We consider the situation stated above. Let Θ ⊆ R^k be an open set and suppose e(ϑ) = E_ϑ[ f(X₁) ], ϑ ∈ Θ, is continuously differentiable in an open neighborhood of some point ϑ₀ ∈ Θ with det De(ϑ₀) ≠ 0. Moreover, assume that E_{ϑ₀}[ ‖f(X₁)‖₂² ] < ∞. Then e is C¹-invertible in an open neighborhood of ϑ₀, and moment estimators ϑ̂_n exist with probability tending to 1 as n → ∞.⁸ Furthermore they obey⁹

    √n ( ϑ̂_n − ϑ₀ ) → N_k( 0, (De(ϑ₀))⁻¹ Cov_{ϑ₀}[ f(X₁) ] ((De(ϑ₀))⁻¹)^T )  under P_{ϑ₀}.

Proof: BOARD
⁸ I.e., informally, the set of ω's where ϑ̂_n can be defined gains P-mass as n → ∞.
⁹ If ϑ₀ is the true parameter.
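The sandwich covariance can be illustrated in the simplest possible case (my own example; NumPy assumed): for the Exp(λ) model with f(x) = x we have e(λ) = 1/λ, De(λ) = −1/λ² and Cov_λ[f(X₁)] = 1/λ², so the theorem predicts asymptotic variance λ². The moment estimator is λ̂_n = 1/X̄_n:

```python
import numpy as np

rng = np.random.default_rng(6)
lam, n, reps = 2.0, 500, 4000

# Exp(lam): e(lam) = 1/lam, De(lam) = -1/lam^2, Cov[f(X_1)] = 1/lam^2,
# so (De)^{-1} Cov ((De)^{-1})^T = lam^4 * (1/lam^2) = lam^2.
x = rng.exponential(1.0 / lam, size=(reps, n))
lam_hat = 1.0 / x.mean(axis=1)            # moment estimator solving X bar = 1/lam
z = np.sqrt(n) * (lam_hat - lam)
print(round(z.mean(), 3), round(z.var(ddof=1), 3))  # variance approximately lam^2 = 4
```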

101 List of literature
Durrett, R.: Probability: Theory and Examples. Cambridge University Press.
Klenke, A.: Wahrscheinlichkeitstheorie. Berlin: Springer.
Redenbach, C.: Mathematical Statistics. Lecture Notes, TU Kaiserslautern.
Seifried, F. T.: Maß und Integration. Lecture Notes, TU Kaiserslautern.
Seifried, F. T.: Probability Theory. Lecture Notes, TU Kaiserslautern.
van der Vaart, A. W.: Asymptotic Statistics. Cambridge University Press.


University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales.

Lecture 2. We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. Lecture 2 1 Martingales We now introduce some fundamental tools in martingale theory, which are useful in controlling the fluctuation of martingales. 1.1 Doob s inequality We have the following maximal

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 59 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d

More information

Statistics. Statistics

Statistics. Statistics The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,

More information

CHAPTER 3: LARGE SAMPLE THEORY

CHAPTER 3: LARGE SAMPLE THEORY CHAPTER 3 LARGE SAMPLE THEORY 1 CHAPTER 3: LARGE SAMPLE THEORY CHAPTER 3 LARGE SAMPLE THEORY 2 Introduction CHAPTER 3 LARGE SAMPLE THEORY 3 Why large sample theory studying small sample property is usually

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago

More information

Convergence in Distribution

Convergence in Distribution Convergence in Distribution Undergraduate version of central limit theorem: if X 1,..., X n are iid from a population with mean µ and standard deviation σ then n 1/2 ( X µ)/σ has approximately a normal

More information

Chapter 5. Weak convergence

Chapter 5. Weak convergence Chapter 5 Weak convergence We will see later that if the X i are i.i.d. with mean zero and variance one, then S n / p n converges in the sense P(S n / p n 2 [a, b])! P(Z 2 [a, b]), where Z is a standard

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:

More information

Limiting Distributions

Limiting Distributions Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the

More information

STA 732: Inference. Notes 2. Neyman-Pearsonian Classical Hypothesis Testing B&D 4

STA 732: Inference. Notes 2. Neyman-Pearsonian Classical Hypothesis Testing B&D 4 STA 73: Inference Notes. Neyman-Pearsonian Classical Hypothesis Testing B&D 4 1 Testing as a rule Fisher s quantification of extremeness of observed evidence clearly lacked rigorous mathematical interpretation.

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

7 Convergence in R d and in Metric Spaces

7 Convergence in R d and in Metric Spaces STA 711: Probability & Measure Theory Robert L. Wolpert 7 Convergence in R d and in Metric Spaces A sequence of elements a n of R d converges to a limit a if and only if, for each ǫ > 0, the sequence a

More information

Chapter 3. Point Estimation. 3.1 Introduction

Chapter 3. Point Estimation. 3.1 Introduction Chapter 3 Point Estimation Let (Ω, A, P θ ), P θ P = {P θ θ Θ}be probability space, X 1, X 2,..., X n : (Ω, A) (IR k, B k ) random variables (X, B X ) sample space γ : Θ IR k measurable function, i.e.

More information

Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic

Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic Unbiased estimation Unbiased or asymptotically unbiased estimation plays an important role in

More information

Hochdimensionale Integration

Hochdimensionale Integration Oliver Ernst Institut für Numerische Mathematik und Optimierung Hochdimensionale Integration 14-tägige Vorlesung im Wintersemester 2010/11 im Rahmen des Moduls Ausgewählte Kapitel der Numerik Contents

More information

Overview of normed linear spaces

Overview of normed linear spaces 20 Chapter 2 Overview of normed linear spaces Starting from this chapter, we begin examining linear spaces with at least one extra structure (topology or geometry). We assume linearity; this is a natural

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

MA651 Topology. Lecture 9. Compactness 2.

MA651 Topology. Lecture 9. Compactness 2. MA651 Topology. Lecture 9. Compactness 2. This text is based on the following books: Topology by James Dugundgji Fundamental concepts of topology by Peter O Neil Elements of Mathematics: General Topology

More information

Metric Spaces Lecture 17

Metric Spaces Lecture 17 Metric Spaces Lecture 17 Homeomorphisms At the end of last lecture an example was given of a bijective continuous function f such that f 1 is not continuous. For another example, consider the sets T =

More information

Stochastic Processes

Stochastic Processes Stochastic Processes A very simple introduction Péter Medvegyev 2009, January Medvegyev (CEU) Stochastic Processes 2009, January 1 / 54 Summary from measure theory De nition (X, A) is a measurable space

More information

STAT 7032 Probability Spring Wlodek Bryc

STAT 7032 Probability Spring Wlodek Bryc STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,

More information

Notes 18 : Optional Sampling Theorem

Notes 18 : Optional Sampling Theorem Notes 18 : Optional Sampling Theorem Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Chapter 14], [Dur10, Section 5.7]. Recall: DEF 18.1 (Uniform Integrability) A collection

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Building Infinite Processes from Finite-Dimensional Distributions

Building Infinite Processes from Finite-Dimensional Distributions Chapter 2 Building Infinite Processes from Finite-Dimensional Distributions Section 2.1 introduces the finite-dimensional distributions of a stochastic process, and shows how they determine its infinite-dimensional

More information

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy

Notions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy Banach Spaces These notes provide an introduction to Banach spaces, which are complete normed vector spaces. For the purposes of these notes, all vector spaces are assumed to be over the real numbers.

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Real Analysis. July 10, These notes are intended for use in the warm-up camp for incoming Berkeley Statistics

Real Analysis. July 10, These notes are intended for use in the warm-up camp for incoming Berkeley Statistics Real Analysis July 10, 2006 1 Introduction These notes are intended for use in the warm-up camp for incoming Berkeley Statistics graduate students. Welcome to Cal! The real analysis review presented here

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

Elementary Probability. Exam Number 38119

Elementary Probability. Exam Number 38119 Elementary Probability Exam Number 38119 2 1. Introduction Consider any experiment whose result is unknown, for example throwing a coin, the daily number of customers in a supermarket or the duration of

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

1 Fourier Integrals of finite measures.

1 Fourier Integrals of finite measures. 18.103 Fall 2013 1 Fourier Integrals of finite measures. Denote the space of finite, positive, measures on by M + () = {µ : µ is a positive measure on ; µ() < } Proposition 1 For µ M + (), we define the

More information

Lecture 21: Convergence of transformations and generating a random variable

Lecture 21: Convergence of transformations and generating a random variable Lecture 21: Convergence of transformations and generating a random variable If Z n converges to Z in some sense, we often need to check whether h(z n ) converges to h(z ) in the same sense. Continuous

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

Lecture 28: Asymptotic confidence sets

Lecture 28: Asymptotic confidence sets Lecture 28: Asymptotic confidence sets 1 α asymptotic confidence sets Similar to testing hypotheses, in many situations it is difficult to find a confidence set with a given confidence coefficient or level

More information