Stochastic Convergence, Delta Method & Moment Estimators
Seminar on Asymptotic Statistics
Daniel Hoffmann, University of Kaiserslautern (TU KL), Department of Mathematics
February 13, 2015
Overview

1 Stochastic Convergence
  - Concepts of convergence and basic results
  - Theoretical examples: LLN and CLT
  - Tools for weak convergence
  - More on weak convergence: tightness and Prohorov's theorem
  - Stochastic Landau notation
2 Delta Method
  - Basic result
  - Application I: testing variance
  - Application II: asymptotic confidence intervals and variance-stabilizing transformations
3 Moment Estimators
  - Method of moments: definition
  - Existence and asymptotic normality
4 List of literature
Chapter 1: Stochastic Convergence
Scope and general assumptions

We recall the basic notions of stochastic convergence from probability theory and take a closer look at weak convergence, culminating in Prohorov's theorem.

Throughout this talk we fix a probability space (Ω, A, P) on which all appearing random variables are defined, unless stated otherwise. Furthermore, let (S, d) be a separable¹ metric space, which will serve as the codomain. Later on we restrict ourselves to the case S = R^k. Let

    L(P, S) := {X : Ω → S | X is A-B(S)-measurable}

denote the space of all random variables of interest.

¹ I.e. there is some countable, dense subset of S. This is just a technical assumption that guarantees the measurability of events like {d(X, Y) > η} for random variables X, Y and a threshold η.
Concepts of convergence: definitions and properties

Definition. Let (X_n)_{n∈N} ⊂ L(P, S) and X ∈ L(P, S). The sequence (X_n)_{n∈N} is said to
- converge almost surely to X (notation: X_n →a.s. X) if there is some P-null set N ∈ A such that X_n(ω) → X(ω) for each ω ∈ Ω \ N;
- converge in probability to X (notation: X_n →P X) if for all ε > 0: lim_{n→∞} P(d(X_n, X) > ε) = 0;
- converge weakly to X (notation: X_n ⇝ X) if for each f ∈ C_b(S): lim_{n→∞} ∫_S f dP(X_n) = ∫_S f dP(X);
- converge in L^p-sense, p ∈ [1, ∞], to X if S = R (notation: X_n →L^p X) if lim_{n→∞} ‖X_n − X‖_{L^p(P)} = 0.
From probability theory one is familiar with the following relations between these concepts of convergence.

Proposition (relations). Let (X_n)_{n∈N} ⊂ L(P, S) and X ∈ L(P, S). Then:
(a) X_n →a.s. X  ⟹  X_n →P X  ⟹  X_n ⇝ X.
(b) Subsequence principle: X_n →P X  ⟺  every subsequence (n_k)_{k∈N} has a further subsequence (n_{k_l})_{l∈N} with X_{n_{k_l}} →a.s. X.
(c) Slutsky's lemma: let S = R^k and let A_n, B_n, n ∈ N, be random variables with A_n →P a ∈ R^k and B_n →P b ∈ R. If X_n ⇝ X, then A_n + B_n X_n ⇝ a + bX.
(d) Let S = R and p ∈ [1, ∞). Then X_n →L^p X  ⟹  X_n →P X.
Moreover, the above notions of convergence are compatible with continuity, i.e. a convergent sequence of random variables can be transported to another space by a continuous function while preserving the convergence:

Proposition (continuous mapping principle). Let X_n, X ∈ L(P, S), n ∈ N, and let Φ : S → S' be a Borel-measurable, P(X)-a.e. continuous mapping, where (S', d') is another metric space. Then

    X_n →* X  ⟹  Φ(X_n) →* Φ(X),  where →* ∈ {⇝, →P, →a.s.}.
Theoretical examples: where do these notions occur?

The most important examples include:

Theorem (weak law of large numbers). Let (X_n)_{n∈N} be a sequence of uncorrelated R-valued random variables satisfying sup_{n∈N} Var[X_n] < ∞. Then

    (1/n) Σ_{i=1}^n (X_i − E[X_i]) → 0  in probability and in L².

Theorem (strong law of large numbers). Let (X_n)_{n∈N} ⊂ L¹(P) be a sequence of i.i.d. random variables. Then

    (1/n) Σ_{i=1}^n X_i → E[X_1]  almost surely and in L¹.
Theorem (central limit theorem). Let (X_n)_{n∈N} be a sequence of i.i.d. R^k-valued random variables satisfying E[‖X_1‖₂²] < ∞. Then

    P((1/√n) Σ_{i=1}^n (X_i − E[X_i])) ⇝ N_k(0, Cov[X_1]).

Theorem (weak law of small numbers). Let {X_{n,m}}_{n∈N, m=1,…,n} be a triangular array of independent random variables with P(X_{n,m}) = Bin(1, p_{n,m}), m = 1, …, n, n ∈ N. Suppose that Σ_{m=1}^n p_{n,m} → λ > 0 and max_{m=1,…,n} p_{n,m} → 0 as n → ∞. Then

    P(Σ_{m=1}^n X_{n,m}) ⇝ Poi(λ).
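The CLT can be illustrated numerically. The following sketch (a simulation added for illustration, not part of the original slides; the sample size, repetition count and the Uniform(0, 1) distribution are arbitrary choices) standardizes sums of i.i.d. uniform variables and checks that roughly 95% of them fall into [−1.96, 1.96], as the N(0, 1) limit predicts.

```python
import math
import random

random.seed(0)

n, reps = 500, 2000
mu, var = 0.5, 1.0 / 12.0          # mean and variance of Uniform(0, 1)

# Standardized sums (S_n - n*mu) / sqrt(n*var), one per repetition
zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n * mu) / math.sqrt(n * var))

# Empirical coverage of the central 95% interval of the N(0, 1) limit
coverage = sum(abs(z) <= 1.96 for z in zs) / reps
print(f"empirical coverage of [-1.96, 1.96]: {coverage:.3f}")
```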
Tools for weak convergence

Definition (weak convergence: general approach). Let μ_n, μ, n ∈ N, be probability measures on B(S). Then the sequence (μ_n)_{n∈N} converges weakly to μ iff

    lim_{n→∞} ∫_S f dμ_n = ∫_S f dμ  for all f ∈ C_b(S).

Remark (a slight generalization). Hence weak convergence of random variables depends only on their distributions: X_n ⇝ X ⟺ P(X_n) ⇝ P(X). Due to this equivalence, it is possible to define weak convergence for random variables defined on different probability spaces: X_n on (Ω_n, A_n, P_n), n ∈ N, and X on (Ω, A, P). For the sake of simplicity, we will not consider this slight generalization here.
From probability theory one is familiar with the following characterization of weak convergence:

Theorem (portmanteau theorem). Let X_n, X, n ∈ N, be S-valued random variables. Then the following are equivalent:
(a) X_n ⇝ X, i.e. E[f(X_n)] → E[f(X)] for all f ∈ C_b(S).
(b) E[f(X_n)] → E[f(X)] for all Lipschitz-continuous f ∈ C_b(S).
(c) P(X ∈ O) ≤ lim inf_{n→∞} P(X_n ∈ O) for all open O ⊂ S.
(d) P(X ∈ F) ≥ lim sup_{n→∞} P(X_n ∈ F) for all closed F ⊂ S.
(e) P(X ∈ B) = lim_{n→∞} P(X_n ∈ B) for all B ∈ B(S) with P(X)-negligible boundary, i.e. P(X ∈ ∂B) = 0.
(f) E[f(X_n)] → E[f(X)] for all bounded B(S)-measurable functions f : S → R that are P(X)-a.e. continuous.
In Euclidean k-space the distribution function is an appropriate tool to characterize weak convergence:

Definition. Let X ∈ L(P, R^k). Then its (cumulative) distribution function, for short cdf, is given by

    F_X : R^k → [0, 1],  x ↦ P(X ≤ x) = P(X)(⨯_{i=1}^k (−∞, x_i]).

Remark. Note that, as a consequence of the uniqueness theorem for finite measures, the cdf characterizes the distribution of X uniquely, since

    E := { ⨯_{i=1}^k (−∞, x_i] | x_1, …, x_k ∈ R }

is a π-system (i.e. it is ∩-stable) that generates B(R^k).
Proposition (weak convergence on R^k via cdf). Let X_n, X, n ∈ N, be R^k-valued random variables. Then X_n ⇝ X iff F_{X_n}(x) → F_X(x) for all x ∈ R^k at which F_X is continuous.

Example: N_1(0, 1/n) ⇝ δ_0.
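The example can be checked directly on the level of distribution functions. A small sketch (added for illustration; the evaluation points are arbitrary): F_{X_n}(x) = Φ(√n · x) tends to 0 for x < 0 and to 1 for x > 0, while at x = 0, the discontinuity point of F_{δ_0}, it stays at 1/2 — which is exactly why that point is excluded in the proposition.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf

def F(n: int, x: float) -> float:
    """cdf of N(0, 1/n) at x, i.e. Phi(sqrt(n) * x)."""
    return Phi(math.sqrt(n) * x)

for n in (1, 100, 10_000):
    print(n, F(n, -0.1), F(n, 0.0), F(n, 0.1))
```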
Some more theory on weak convergence

We have already observed that weak convergence is "weak" in the sense that it is implied by all other concepts of convergence that we have introduced. Let us have a closer look at weak convergence and recall from calculus:

Proposition.
(a) Every convergent sequence in R^k is bounded.
(b) Every bounded sequence in R^k has a convergent subsequence. (Bolzano-Weierstrass theorem)

Is there an analogue involving weak convergence and a probabilistic notion of boundedness?
Prohorov's theorem

Yes, indeed: Prohorov's theorem answers this question.

Theorem (Prohorov). Let (X_n)_{n∈N} be a sequence of R^k-valued random variables. Then:
(a) If X_n ⇝ X for some R^k-valued random variable X, then (X_n)_{n∈N} is uniformly tight².
(b) If (X_n)_{n∈N} is uniformly tight², then there exists a subsequence (X_{n_j})_{j∈N} with X_{n_j} ⇝ X for some R^k-valued random variable X.

For proving this theorem we need some additional concepts and results.

² This will be made precise shortly.
Probabilistic boundedness

Definition (uniform tightness). Let I be an index set and F := {X_i}_{i∈I} a family of R^k-valued random variables. Then F is called uniformly tight or bounded in probability if for every ε > 0 there is a constant M_ε > 0 such that

    sup_{i∈I} P(‖X_i‖₂ > M_ε) < ε.

Remark. Uniform tightness of a sequence of random vectors in R^k (i.e. I = N) is exactly the definition of the stochastic Landau notation O_P: (X_n)_{n∈N} is uniformly tight iff X_n = O_P(1). We will scrutinize this notation later.
Helly's lemma

Definition. A function F : R^k → [0, 1] is called a defective distribution function if there is some finite measure μ on B(R^k) taking values in [0, 1] and a constant c_F ∈ [0, 1] such that

    F(x) − c_F = μ((−∞, x]) := μ(⨯_{i=1}^k (−∞, x_i]),  x ∈ R^k.

Remark.
1. By continuity of measures from above we have c_F = lim F(x) as x_i → −∞ for all i = 1, …, k.
2. A defective distribution function is a cdf iff the underlying finite measure is a probability measure and c_F = 0.
Lemma (Helly's lemma / Helly's selection theorem). Let (F_n)_{n∈N} be a sequence of distribution functions with domain R^k. Then this sequence possesses a subsequence (F_{n_j})_{j∈N} with the property

    lim_{j→∞} F_{n_j}(x) = F(x)

for each continuity point x ∈ R^k of some defective distribution function F.

Rough idea of the proof. The proof is quite technical, hence we only present the idea of the construction of F. For details, please refer to [Dur10] and [Van98, Lemma 2.5]. → BOARD
Is it really possible that Helly's lemma fails to provide us with an honest cdf? Unfortunately, yes!

Example. Consider a sequence (X_n)_{n∈N} of real-valued random variables with X_n ~ δ_n, n ∈ N. Then the corresponding sequence of distribution functions is given by F_n : R → {0, 1}, x ↦ 1_{[n,∞)}(x). Obviously lim_{j→∞} F_{n_j}(x) = 0 for each x ∈ R and each subsequence (n_j)_{j∈N}: the mass escapes to +∞. Hence Helly's lemma cannot yield an honest cdf here — the only possible limit is the defective distribution function F ≡ 0.
Prohorov's theorem

Now we are in a position to prove Prohorov's theorem.

Theorem (Prohorov). Let (X_n)_{n∈N} be a sequence of R^k-valued random variables. Then:
(a) If X_n ⇝ X for some R^k-valued random variable X, then (X_n)_{n∈N} is uniformly tight.
(b) If (X_n)_{n∈N} is uniformly tight, then there exists a subsequence (X_{n_j})_{j∈N} with X_{n_j} ⇝ X for some R^k-valued random variable X.

Proof. BOARD
Stochastic Landau notation

Similar to the well-known O-notation from calculus, one can introduce a stochastic version of the Landau symbols in order to express the speed of convergence (in probability).

Definition. Let (X_n)_{n∈N} and (R_n)_{n∈N} be sequences of R^k- and R-valued random variables, respectively. We write:
(a) X_n = O_P(1)  :⟺  {X_n : n ∈ N} is uniformly tight.
(b) X_n = O_P(R_n)  :⟺  X_n = R_n Y_n for a sequence (Y_n)_{n∈N} of R^k-valued random variables satisfying Y_n = O_P(1).
(c) X_n = o_P(1)  :⟺  X_n →P 0.
(d) X_n = o_P(R_n)  :⟺  X_n = R_n Y_n for a sequence (Y_n)_{n∈N} of R^k-valued random variables satisfying Y_n = o_P(1).

Commonly, (R_n)_{n∈N} is called the rate (of convergence).
In our next chapter we will use differential calculus and therefore need the following lemma. Think of R as the remainder term in some Taylor expansion.

Lemma. Let D ⊂ R^k be open with 0 ∈ D and let R : D → R^m be a function with R(0) = 0. Furthermore, let (X_n)_{n∈N} be a sequence of random variables taking values in D with X_n = o_P(1). Then for every p > 0 we have:
(a) R(h) = o(‖h‖₂^p) (h → 0)  ⟹  R(X_n) = o_P(‖X_n‖₂^p);
(b) R(h) = O(‖h‖₂^p) (h → 0)  ⟹  R(X_n) = O_P(‖X_n‖₂^p).

Proof. Let p > 0 and define

    g(h) := R(h)/‖h‖₂^p  if h ≠ 0,    g(h) := 0  else.

Then for each n ∈ N we have R(X_n) = ‖X_n‖₂^p · g(X_n), with ‖X_n‖₂^p playing the role of the rate R_n and g(X_n) that of Y_n in the definition above.
Proof (continued).
(a) By assumption,

    lim_{h→0} g(h) = lim_{h→0} R(h)/‖h‖₂^p = 0,

i.e. g is continuous at 0. Since X_n →P 0 (by assumption), the continuous mapping principle yields g(X_n) →P 0, i.e. g(X_n) = o_P(1). Thus R(X_n) = o_P(‖X_n‖₂^p).

(b) By assumption there are some δ > 0 and some M > 0 such that

    ‖g(h)‖₂ = ‖R(h)‖₂/‖h‖₂^p ≤ M  for all h ∈ B_δ(0) ∩ D.

Since X_n →P 0 (by assumption), we obtain

    P(‖g(X_n)‖₂ > M) ≤ P(‖X_n‖₂ > δ) → 0  (n → ∞).
Proof (continued). Hence for given ε > 0 we can choose n_ε ∈ N such that P(‖g(X_n)‖₂ > M) < ε/2 for all n > n_ε. For n ∈ {1, …, n_ε} we choose M_ε ≥ M suitably large such that P(‖g(X_n)‖₂ > M_ε) < ε/2 for these n as well. Obviously, this yields

    sup_{n∈N} P(‖g(X_n)‖₂ > M_ε) ≤ ε/2 < ε,

i.e. {g(X_n)}_{n∈N} is uniformly tight. Thus g(X_n) = O_P(1) and R(X_n) = O_P(‖X_n‖₂^p). ∎

Example (LLN). Let (X_n)_{n∈N} ⊂ L¹(P) be a sequence of i.i.d. random variables and define S_n := Σ_{i=1}^n X_i. Then we know that S_n/n = X̄_n →P E[X_1], i.e. S_n − nE[X_1] = o_P(n).
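The LLN example can be made concrete. In this sketch (an illustration with Uniform(0, 1) data; the sample sizes and seed are arbitrary choices), |S_n − nE[X_1]|/n shrinks towards 0, consistent with S_n − nE[X_1] = o_P(n), while the CLT suggests the sharper rate S_n − nE[X_1] = O_P(√n): the √n-rescaled deviation stays of moderate size instead of vanishing.

```python
import math
import random

random.seed(1)

def deviations(n: int):
    """Return (|S_n - n*mu| / n, |S_n - n*mu| / sqrt(n)) for Uniform(0, 1) data."""
    mu = 0.5
    s = sum(random.random() for _ in range(n))
    d = abs(s - n * mu)
    return d / n, d / math.sqrt(n)

for n in (100, 10_000, 1_000_000):
    o_rate, O_rate = deviations(n)
    print(f"n={n:>7}: |S_n - n*mu|/n = {o_rate:.5f},  /sqrt(n) = {O_rate:.3f}")
```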
Chapter 2: Delta Method
Motivation / idea

Given a limit law of √n(T_n − θ) (often derived from the CLT), how do we deduce one for √n(φ(T_n) − φ(θ)), where φ is some differentiable mapping? Use a Taylor expansion!

Remark. In applications, T_n is often an estimator for some parameter θ. Note that the question concerns the limit distribution; hence φ(T_n) may inherit a property like asymptotic efficiency from T_n.
Let us recall some definitions concerning estimators.

Definition. Let {P_ϑ}_{ϑ∈Θ} be a family of probability measures on B(R^m), consider an i.i.d. sample X_1, …, X_n with P(X_1) ∈ {P_ϑ}_{ϑ∈Θ}, and let T_n = T(X_1, …, X_n) : Ω → Θ be an estimator for ϑ.
(a) T_n is called consistent iff T_n →P ϑ whenever X_1 ~ P_ϑ, for all ϑ ∈ Θ.
(b) T_n is called unbiased iff bias_ϑ(T_n) := E_ϑ[T_n] − ϑ = 0 for all ϑ ∈ Θ. (Existence of the involved integral is required.)
(c) T_n is called asymptotically efficient iff (provided that T_n is R^k-valued) for all ϑ ∈ Θ we have

    P_ϑ(√n(T_n − ϑ)) ⇝ N_k(0, I(P_ϑ)⁻¹).

Here I(P_ϑ) := (E_ϑ[∂/∂ϑ_i log f_ϑ(X) · ∂/∂ϑ_j log f_ϑ(X)])_{i,j=1,…,k} denotes the Fisher information matrix of P_ϑ, where Θ ⊂ R^k is assumed. Moreover, X ~ P_ϑ is an R^m-valued random variable and f_ϑ = dP_ϑ/dλ^m if P_ϑ ≪ λ^m, and f_ϑ = dP_ϑ/d#^m if P_ϑ ≪ #^m (in either case for all ϑ ∈ Θ).
Delta method: basic result

Theorem (delta method). Let D ⊂ R^k be an open subset and suppose that φ : D → R^m is a mapping that is differentiable at θ ∈ D. Furthermore, let (T_n)_{n∈N} be a family of D-valued random variables and let (r_n)_{n∈N} be a sequence of real numbers satisfying 0 < r_n ↑ ∞. If r_n(T_n − θ) ⇝ T for some R^k-valued random variable T, then

    r_n(φ(T_n) − φ(θ)) ⇝ Dφ(θ)T,

where Dφ(θ) ∈ L(R^k, R^m) ≅ R^{m×k} denotes the Fréchet derivative (represented by the Jacobian matrix) of φ at θ. Moreover, we have

    ‖r_n(φ(T_n) − φ(θ)) − Dφ(θ)(r_n(T_n − θ))‖₂ →P 0.

Proof. BOARD
There is also a slightly more general result if we assume φ to be of class C¹ around θ. We state it without proof:

Theorem (uniform delta method). Let D ⊂ R^k be an open subset and suppose that φ : D → R^m is continuously differentiable in an open neighborhood of θ ∈ D. Furthermore, let (T_n)_{n∈N} be a family of D-valued random variables and let (r_n)_{n∈N} be a sequence of real numbers satisfying 0 < r_n ↑ ∞. If r_n(T_n − θ_n) ⇝ T for some R^k-valued random variable T and some sequence θ_n → θ in D, then

    r_n(φ(T_n) − φ(θ_n)) ⇝ Dφ(θ)T,

where Dφ(θ) ∈ L(R^k, R^m) ≅ R^{m×k} denotes the Fréchet derivative (represented by the Jacobian matrix) of φ at θ. Moreover, we have

    ‖r_n(φ(T_n) − φ(θ_n)) − Dφ(θ)(r_n(T_n − θ_n))‖₂ →P 0.
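The basic result can be sanity-checked by simulation. A sketch (added for illustration; the choice φ(x) = x², the Uniform(0, 1) data, and all sizes are arbitrary): with θ = μ = 1/2, σ² = 1/12 and Dφ(μ) = 2μ, the delta method predicts √n(X̄_n² − μ²) ⇝ N(0, (2μ)²σ²) = N(0, 1/12).

```python
import math
import random
import statistics

random.seed(2)

n, reps = 400, 2000
mu, var = 0.5, 1.0 / 12.0  # mean and variance of Uniform(0, 1)

# sqrt(n) * (phi(sample mean) - phi(mu)) with phi(x) = x**2
vals = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    vals.append(math.sqrt(n) * (xbar**2 - mu**2))

# Delta method prediction: limit variance (2*mu)**2 * var = 1/12
emp_var = statistics.pvariance(vals)
print(f"empirical variance {emp_var:.4f} vs predicted {(2 * mu) ** 2 * var:.4f}")
```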
Application I: testing variance — CLT revisited

Example (sample variance). Given a data set consisting of i.i.d. observations X_1, …, X_n ∈ L⁴(P), n ∈ N, we want to estimate its variance. To this end we consider the biased estimator

    ŝ²_n := (1/n) Σ_{i=1}^n (X_i − X̄_n)² = V_n − X̄_n² = φ(X̄_n, V_n),

where X̄_n := (1/n) Σ_{i=1}^n X_i, V_n := (1/n) Σ_{i=1}^n X_i² and φ(x, y) := y − x², (x, y)ᵀ ∈ R².

Define μ_k := E[X_1^k], k = 1, …, 4. Then for the vectors (X_i, X_i²)ᵀ, i = 1, …, n, the CLT yields

    √n((X̄_n, V_n)ᵀ − (μ_1, μ_2)ᵀ) = (1/√n) Σ_{i=1}^n ((X_i, X_i²)ᵀ − E[(X_i, X_i²)ᵀ]) ⇝ Z.
Example (sample variance, continued). Here

    Z ~ N_2( (0, 0)ᵀ, [ μ_2 − μ_1²   μ_3 − μ_1μ_2 ;  μ_3 − μ_1μ_2   μ_4 − μ_2² ] ).

Hence the delta method implies (since ŝ²_n = φ(X̄_n, V_n) with φ(x, y) = y − x²):

    √n(φ(X̄_n, V_n) − φ(μ_1, μ_2)) ⇝ Dφ(μ_1, μ_2)Z,

i.e. (since Dφ(x, y)z = (−2x, 1)z, z ∈ R²)

    √n(ŝ²_n − (μ_2 − μ_1²)) ⇝ Z̃ := Dφ(μ_1, μ_2)Z.
Example (sample variance, continued). Computing the limit variance,

    Z̃ ~ N_1(0, (−2μ_1, 1) [ μ_2 − μ_1²   μ_3 − μ_1μ_2 ;  μ_3 − μ_1μ_2   μ_4 − μ_2² ] (−2μ_1, 1)ᵀ)
       = N_1(0, μ_4 − μ_2² − 4μ_1μ_3 + 8μ_1²μ_2 − 4μ_1⁴).

Slutsky's lemma implies that this result also holds for the corresponding unbiased estimator Ŝ²_n := n/(n−1) · ŝ²_n. We will apply this result to construct a test for the variance of a data set.
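The asymptotic variance μ_4 − μ_2² − 4μ_1μ_3 + 8μ_1²μ_2 − 4μ_1⁴ in raw moments can be cross-checked against the equivalent centered form E[(X_1 − μ_1)⁴] − (Var X_1)². A sketch doing this exactly with rational arithmetic for an arbitrary three-point example distribution (the support points and uniform weights are just an illustrative choice):

```python
from fractions import Fraction

# Example distribution: X uniform on {1, 2, 4}
support = [Fraction(1), Fraction(2), Fraction(4)]
p = Fraction(1, 3)

def raw_moment(k: int) -> Fraction:
    return sum(p * x**k for x in support)

m1, m2, m3, m4 = (raw_moment(k) for k in (1, 2, 3, 4))

# Asymptotic variance of the sample variance, in raw moments
raw_form = m4 - m2**2 - 4 * m1 * m3 + 8 * m1**2 * m2 - 4 * m1**4

# Same quantity in centered moments: E[(X - m1)^4] - (Var X)^2
c4 = sum(p * (x - m1) ** 4 for x in support)
var = m2 - m1**2
centered_form = c4 - var**2

print(raw_form, centered_form)  # both equal 98/81 for this example
```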
First, we recall some notions and results from mathematical statistics.

Definition (kurtosis). Let X ∈ L⁴(P) be a random variable. Then the kurtosis of X is defined by

    κ_X := E[(X − E[X])⁴] / (E[(X − E[X])²])² = E[(X − E[X])⁴] / (Var(X))².

Remark. If X ~ N_1(μ, σ²), then κ_X = 3.

Definition (chi-square distribution). Let X_1, …, X_n ~ N_1(0, 1) be i.i.d. random variables. Then the probability measure χ²_n := P(Σ_{i=1}^n X_i²) on B(R) is called the chi-square distribution with n degrees of freedom.
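The remark (κ = 3 for the normal distribution) can be illustrated with empirical fourth moments; for contrast, the sketch also estimates the kurtosis of the platykurtic uniform distribution, which is 9/5. (A simulation added for illustration; sample size and seed are arbitrary.)

```python
import random
import statistics

random.seed(3)

def kurtosis(xs):
    """Empirical kurtosis: mean((x - m)^4) / mean((x - m)^2)^2."""
    m = statistics.fmean(xs)
    c2 = statistics.fmean((x - m) ** 2 for x in xs)
    c4 = statistics.fmean((x - m) ** 4 for x in xs)
    return c4 / c2**2

n = 200_000
k_normal = kurtosis([random.gauss(0.0, 1.0) for _ in range(n)])
k_unif = kurtosis([random.random() for _ in range(n)])
print(f"normal: {k_normal:.2f} (theory 3), uniform: {k_unif:.2f} (theory 1.8)")
```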
From mathematical statistics one is familiar with the following:

Proposition. Let X_1, …, X_n ~ N_1(μ, σ²) be i.i.d. random variables and Ŝ²_n = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)² the unbiased estimator of σ² from above. Then

    P((n−1)Ŝ²_n / σ²) = χ²_{n−1}.

This result gives rise to the following test for normal data.

Example (one-sided test for σ²). In the situation of the preceding proposition and for given σ₀² > 0, we want to test H₀: σ² = σ₀² vs. H₁: σ² > σ₀² at level α ∈ (0, 1).
Example (one-sided test for σ², continued). As a test statistic we use T_n := (n−1)Ŝ²_n / σ₀² and decide to reject H₀ if T_n > q^{(1−α)}_{χ²_{n−1}}, the (1−α)-quantile of the χ²_{n−1}-distribution. Hence we obtain

    P_{σ₀²}(T_n > q^{(1−α)}_{χ²_{n−1}}) = 1 − χ²_{n−1}((−∞, q^{(1−α)}_{χ²_{n−1}}]) = α,

i.e. the test has exactly level α.

What if we know that the given data are not normally distributed? We use the approximation

    √n(Ŝ²_n − (μ_2 − μ_1²)) ⇝ Z̃,  Z̃ ~ N_1(0, μ_4 − μ_2² − 4μ_1μ_3 + 8μ_1²μ_2 − 4μ_1⁴),

from above to derive a test of asymptotic level α for certain data sets.
Example (one-sided test for σ² without normality assumption). Let X_1, …, X_n ∈ L⁴(P) be i.i.d. random variables. As above, let μ_k := E[X_1^k]. For the sake of simplicity we assume μ_1 = 0.³ We obtain σ² := Var(X_1) = μ_2 and κ := κ_{X_1} = μ_4/μ_2². Hence, our approximation reduces to

    √n(Ŝ²_n/μ_2 − 1) ⇝ Z̃ ~ N_1(0, κ − 1).

Again, for given σ₀² > 0, we want to test H₀: σ² = σ₀² vs. H₁: σ² > σ₀² at level α ∈ (0, 1). We use T_n := √n(Ŝ²_n/σ₀² − 1) as test statistic.

³ This is not a restriction: centering the observations affects neither their dispersion nor Ŝ²_n = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)². (After centering one has to use centered moments instead of our μ_k.)
[Simulation slides: histograms of T_n/√(κ − 1) = √n(Ŝ²_n/σ₀² − 1)/√(κ − 1) over 1000 repetitions, compared with the N_1(0, 1) density.]
70/71 A remark on quantiles of $\chi^2_{n-1}$

Remark. Let $C_{n-1} \sim \chi^2_{n-1}$, $n \in \mathbb{N}_{>1}$. Then the CLT implies

$\frac{C_{n-1} - (n-1)}{\sqrt{2n-2}} \rightsquigarrow N_1(0,1).$

Hence, for $\alpha \in (0,1)$, the $(1-\alpha)$-quantile of the latter probability distribution converges to that of a standard normal distribution, as quantiles of this distribution are uniquely determined⁴:

$\lim_{n\to\infty} \underbrace{\frac{q^{(1-\alpha)}_{\chi^2_{n-1}} - (n-1)}{\sqrt{2n-2}}}_{=\, \frac{1}{\sqrt{2}}\, q_n} = \Phi^{-1}(1-\alpha), \quad\text{i.e.}\quad \lim_{n\to\infty} q_n = \sqrt{2}\,\Phi^{-1}(1-\alpha), \quad\text{where } q_n := \frac{q^{(1-\alpha)}_{\chi^2_{n-1}} - (n-1)}{\sqrt{n-1}}.$

⁴ This is due to the fact that $\Phi$ increases strictly. (→ Probability Theory)
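The limit $\lim_n q_n = \sqrt{2}\,\Phi^{-1}(1-\alpha)$ can be checked numerically. The sketch below is not from the talk; it approximates the $\chi^2_{n-1}$-quantile by Monte Carlo, using the representation of $\chi^2_k$ as a Gamma law with shape $k/2$ and scale $2$ (an implementation choice, and the values of $n$ and the number of draws are arbitrary):

```python
import math
import random
from statistics import NormalDist

random.seed(2)

alpha = 0.05
n = 4000  # degrees of freedom n - 1 = 3999

# Empirical (1-alpha)-quantile of chi^2_{n-1} via the Gamma(k/2, scale 2) representation.
draws = sorted(random.gammavariate((n - 1) / 2.0, 2.0) for _ in range(60000))
q_chi = draws[int((1 - alpha) * len(draws))]

q_n = (q_chi - (n - 1)) / math.sqrt(n - 1)              # the slides' q_n
limit = math.sqrt(2) * NormalDist().inv_cdf(1 - alpha)  # sqrt(2) * Phi^{-1}(0.95)
print(q_n, limit)  # the two values should be close for large n
```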
72-75 Example (One-sided test for $\sigma^2$ (continued))

Recall: $\tilde T_n := \sqrt{n}\,\big(\hat S_n^2/\sigma_0^2 - 1\big) \rightsquigarrow Z \sim N_1(0, \kappa-1)$ under $\sigma_0^2$, and $\lim_{n\to\infty} q_n = \sqrt{2}\,\Phi^{-1}(1-\alpha)$.

Thus Slutsky's lemma implies:

$P_{\sigma_0^2} \circ \big(\tilde T_n - q_n\big)^{-1} \rightsquigarrow N_1\big(-\sqrt{2}\,\Phi^{-1}(1-\alpha),\; \kappa-1\big).$

We implement the following decision rule: Reject $H_0$ if $\tilde T_n > \dfrac{q^{(1-\alpha)}_{\chi^2_{n-1}} - (n-1)}{\sqrt{n-1}} = q_n$.

For the error of type I, we obtain:

$P_{\sigma_0^2}\big(\tilde T_n > q_n\big) = P_{\sigma_0^2}\big(\tilde T_n - q_n > 0\big) \xrightarrow[n\to\infty]{\text{portmanteau (e)}} 1 - N_1\big(-\sqrt{2}\,\Phi^{-1}(1-\alpha),\, \kappa-1\big)\big((-\infty, 0]\big).$
76 Example (One-sided test for $\sigma^2$ (continued))

$P_{\sigma_0^2}\big(\tilde T_n > q_n\big) \xrightarrow{n\to\infty} 1 - \Phi\!\left(\frac{\sqrt{2}\,\Phi^{-1}(1-\alpha)}{\sqrt{\kappa-1}}\right) \stackrel{!}{=} \alpha \iff \kappa = 3, \qquad \stackrel{!}{\le} \alpha \iff \kappa \le 3.$

Hence our decision rule establishes an (asymptotic) one-sided test (of level $\alpha$) for $\sigma^2$ iff the distribution of the observations is platykurtic or mesokurtic, i.e. $\kappa < 3$ and $\kappa = 3$, respectively.
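To illustrate this conclusion, one can simulate the rejection rate under $H_0$ for a leptokurtic distribution, where the rule is no longer of asymptotic level $\alpha$. The example below is mine, not the speaker's: it uses a standardized Laplace law ($\kappa = 6$), approximates $q_n$ by its limit $\sqrt{2}\,\Phi^{-1}(0.95) \approx 2.3262$, and picks arbitrary values of $n$ and the number of repetitions.

```python
import math
import random

random.seed(3)

Q_LIMIT = 2.3262  # ~ sqrt(2) * Phi^{-1}(0.95), the limit of q_n for alpha = 0.05

def laplace(b=1 / math.sqrt(2)):
    """Standardized Laplace draw (variance 1, kurtosis kappa = 6) via inverse CDF."""
    u = random.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def rejects(sample, sigma0_sq=1.0):
    """Decision rule: reject H0 iff sqrt(n)*(S^2_n/sigma0^2 - 1) exceeds the threshold."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return math.sqrt(n) * (s2 / sigma0_sq - 1.0) > Q_LIMIT

n, reps = 300, 2000
rate_normal = sum(rejects([random.gauss(0, 1) for _ in range(n)]) for _ in range(reps)) / reps
rate_laplace = sum(rejects([laplace() for _ in range(n)]) for _ in range(reps)) / reps
print(rate_normal, rate_laplace)  # near alpha = 0.05 vs. clearly above alpha
```

For the Laplace data the asymptotic rejection rate under $H_0$ is $1 - \Phi\big(2.3262/\sqrt{5}\big) \approx 0.15$, i.e. about three times the nominal level.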
77 Recall: $H_0: \sigma^2 = \sigma_0^2$.

$T_n := (n-1)\hat S_n^2/\sigma_0^2$, used with normal data. Reject $H_0$ if $T_n > q^{(1-\alpha)}_{\chi^2_{n-1}}$.

$\tilde T_n := \sqrt{n}\,\big(\hat S_n^2/\sigma_0^2 - 1\big)$, used with possibly non-normal data. Reject $H_0$ if $\tilde T_n > q_n = \dfrac{q^{(1-\alpha)}_{\chi^2_{n-1}} - (n-1)}{\sqrt{n-1}}$.

Remark. The presented testing procedures are closely related: they are based on the same (asymptotic) decision rule (if $\mu_1 = 0$), as one can prove:

$\tilde T_n > q_n \iff T_n - (n-1) > \underbrace{\sqrt{\tfrac{n-1}{n}}}_{\approx\, 1 \text{ for large } n}\,\big(q^{(1-\alpha)}_{\chi^2_{n-1}} - (n-1)\big).$
78-80 Application II: Asymptotic confidence intervals and variance-stabilizing transformations (VST)

We are given a parametric model $\{P_\vartheta\}_{\vartheta\in\Theta}$ of probability measures on $\mathcal{B}(\mathbb{R}^k)$ and assume $\Theta := (\vartheta_-, \vartheta_+) \subseteq \mathbb{R}$ to be an open interval. Furthermore assume the existence of an estimator $T_n = T(X_1,\dots,X_n)$ of $\vartheta \in \Theta$ (where $\{X_l\}_{l\in\mathbb{N}}$ is a family of i.i.d. $\mathbb{R}^k$-valued random variables with $P \circ X_1^{-1} \in \{P_\vartheta\}_{\vartheta\in\Theta}$⁵) that satisfies

$\sqrt{n}\,(T_n - \vartheta) \rightsquigarrow T \sim N_1\big(0, \sigma^2(\vartheta)\big)$ under $P_\vartheta$, for all $\vartheta \in \Theta$.

We assume that $\sigma^2(\cdot)$ is known as a function of $\vartheta$.

Task: For fixed $\gamma \in (0,1)$, find an asymptotic confidence interval for $\vartheta$.

First idea: Consider the asymptotic $\gamma$-confidence interval

$CI_{\vartheta;n}(\gamma) := \Big[\,T_n - \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\sigma(\vartheta)}{\sqrt{n}},\;\; T_n + \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\sigma(\vartheta)}{\sqrt{n}}\,\Big].$

⁵ Formally, for $l \in \mathbb{N}$, $X_l$ is defined on $(\Omega, \mathcal{A}, P)$ as fixed above. If $\vartheta \in \Theta$ is the true (but unknown) parameter, then the image measure of $P$ under $X_1$ (i.e. the distribution of $X_1$) is given by $P_\vartheta$.
81-83 First idea: Consider the asymptotic $\gamma$-confidence interval

$CI_{\vartheta;n}(\gamma) = \Big[\,T_n - \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\sigma(\vartheta)}{\sqrt{n}},\;\; T_n + \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\sigma(\vartheta)}{\sqrt{n}}\,\Big].$

Problem: $\vartheta$ and thus $\sigma(\vartheta)$ are unknown in general. Hence, these confidence intervals are useless in practice.

Solution 1: Estimate $\sigma^2(\vartheta)$ using a consistent estimator. This approach is discussed in Mathematical Statistics.

Solution 2: Use a variance-stabilizing transformation of the given data set.
84 Variance-stabilizing transformations

Assumption. Let $\vartheta_0 \in \Theta$ be fixed. We assume that the mapping

$\Theta = (\vartheta_-, \vartheta_+) \ni \vartheta \;\mapsto\; \int_{\vartheta_0}^{\vartheta} \frac{1}{\sigma(\theta)}\, d\theta \;\in \mathbb{R}$

is well-defined and differentiable (with derivative $1/\sigma(\cdot)$).

Definition (VST). In the stated situation, under the latter assumption and for some fixed $\eta > 0$, the differentiable mapping

$\varphi : \Theta = (\vartheta_-, \vartheta_+) \to \mathbb{R}, \quad \vartheta \mapsto \int_{\vartheta_0}^{\vartheta} \frac{\eta}{\sigma(\theta)}\, d\theta$

is called a variance-stabilizing transformation.
85-87 Recall: $\varphi(\vartheta) = \int_{\vartheta_0}^{\vartheta} \frac{\eta}{\sigma(\theta)}\, d\theta$, $\vartheta \in (\vartheta_-, \vartheta_+)$.

Remark (Basic properties)

1. $\varphi$ is continuous and, due to $\eta > 0$ and $\sigma > 0$ on $\Theta$, also strictly increasing, as its derivative equals $\varphi' = \eta/\sigma$. Hence $\varphi$ is invertible.
2. $\varphi$ exhibits the variance-stabilizing property: $\varphi' \cdot \sigma \equiv \eta$.

Remark (What is the origin of this name?)

Recall: $\sqrt{n}\,(T_n - \vartheta) \rightsquigarrow T \sim N_1(0, \sigma^2(\vartheta))$ under $P_\vartheta$ for all $\vartheta \in \Theta$, and $\varphi$ is differentiable on $\Theta$. Hence, the delta method implies:

$\sqrt{n}\,\big(\varphi(T_n) - \varphi(\vartheta)\big) \rightsquigarrow \varphi'(\vartheta)\,T \sim N_1\big(0, (\varphi'(\vartheta))^2 \sigma^2(\vartheta)\big) = N_1(0, \eta^2)$ for all $\vartheta \in \Theta$,

i.e. the asymptotic variance is stabilized to $\eta^2$ (which is usually chosen to be 1 in practice).
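A classical instance of a variance-stabilizing transformation, added here as my own illustration (it is not on the slides): in the Bernoulli model with parameter $p$, one has $\sigma(p) = \sqrt{p(1-p)}$, and with $\eta = 1$, $\vartheta_0 = 0$ the definition gives $\varphi(p) = 2\arcsin\sqrt{p}$. The sketch below checks by simulation that the asymptotic variance is stabilized to $\eta^2 = 1$; the values of $p$, $n$ and the number of repetitions are arbitrary choices.

```python
import math
import random

random.seed(4)

def phi(p):
    """VST for the Bernoulli model (eta = 1): integral of 1/sqrt(theta(1-theta))."""
    return 2.0 * math.asin(math.sqrt(p))

p_true, n, reps = 0.3, 500, 2000
vals = []
for _ in range(reps):
    p_hat = sum(random.random() < p_true for _ in range(n)) / n
    vals.append(math.sqrt(n) * (phi(p_hat) - phi(p_true)))

mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / (reps - 1)
print(var)  # close to eta^2 = 1, whatever p_true is
```

Without the transformation, the corresponding variance would be $p(1-p)$ and hence depend on the unknown parameter.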
88/89 Recall: $T_n$ is an estimator of $\vartheta$ with $\sqrt{n}\,(T_n - \vartheta) \rightsquigarrow T \sim N_1(0, \sigma^2(\vartheta))$ under $P_\vartheta$ for all $\vartheta \in \Theta$.

Goal: Find an asymptotic $\gamma$-confidence interval for $\vartheta$.

Derived so far: $\sqrt{n}\,\big(\varphi(T_n) - \varphi(\vartheta)\big) \rightsquigarrow \varphi'(\vartheta)\,T \sim N_1(0, \eta^2)$ under $P_\vartheta$ for all $\vartheta \in \Theta$.

Example (Asymptotic CI via VST). In the above situation and using a variance-stabilizing transformation $\varphi$, our First Idea implies that

$CI_{\varphi(\vartheta),n}(\gamma) := \Big[\,\varphi(T_n) - \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\eta}{\sqrt{n}},\;\; \varphi(T_n) + \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\eta}{\sqrt{n}}\,\Big]$

is an asymptotic $\gamma$-confidence interval for $\varphi(\vartheta)$.
90-92 Example (Asymptotic CI via VST (continued))

$CI_{\varphi(\vartheta),n}(\gamma) := \Big[\,\varphi(T_n) - \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\eta}{\sqrt{n}},\;\; \varphi(T_n) + \Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)\,\tfrac{\eta}{\sqrt{n}}\,\Big]$

is an asymptotic $\gamma$-confidence interval for $\varphi(\vartheta)$.

Idea: Transform this interval using $\varphi^{-1}$ to obtain an asymptotic CI for $\vartheta$.

We know for $\vartheta \in \Theta$:

$\gamma \le \liminf_{n\to\infty} P_\vartheta\big(\varphi(\vartheta) \in CI_{\varphi(\vartheta),n}(\gamma)\big) = \liminf_{n\to\infty} P_\vartheta\big(\vartheta \in \varphi^{-1}\big(CI_{\varphi(\vartheta),n}(\gamma)\big)\big).$

Now, since $\varphi$ is continuous and strictly increasing (in particular one-to-one), $\varphi^{-1}\big(CI_{\varphi(\vartheta),n}(\gamma)\big) \subseteq \mathbb{R}$ is really an interval. Thus, depending on a specific $\varphi$, we obtain an asymptotic $\gamma$-confidence interval for $\vartheta$. Note that this interval is easy to compute in practice: just apply $\varphi^{-1}$ to the boundary values of $CI_{\varphi(\vartheta),n}(\gamma)$.
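This recipe can be sketched end to end for the Bernoulli model with the arcsine transformation $\varphi(p) = 2\arcsin\sqrt{p}$, whose inverse is $\varphi^{-1}(y) = \sin^2(y/2)$. The code is my hedged illustration, not part of the talk; $p$, $n$, $\gamma$ and the number of repetitions are arbitrary, and the simulation checks the empirical coverage of the back-transformed interval.

```python
import math
import random
from statistics import NormalDist

random.seed(5)

def phi(p):      # VST for the Bernoulli model (eta = 1)
    return 2.0 * math.asin(math.sqrt(p))

def phi_inv(y):  # inverse of phi on [0, pi]
    return math.sin(y / 2.0) ** 2

def ci_vst(p_hat, n, gamma=0.9):
    """Build the normal CI for phi(p), then map its endpoints back with phi_inv."""
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    lo = max(0.0, phi(p_hat) - z / math.sqrt(n))
    hi = min(math.pi, phi(p_hat) + z / math.sqrt(n))
    return phi_inv(lo), phi_inv(hi)

p_true, n, reps = 0.3, 400, 2000
hits = 0
for _ in range(reps):
    p_hat = sum(random.random() < p_true for _ in range(n)) / n
    lo, hi = ci_vst(p_hat, n)
    hits += lo <= p_true <= hi
coverage = hits / reps
print(coverage)  # empirical coverage, close to gamma = 0.9
```

Note that no estimate of $\sigma(p) = \sqrt{p(1-p)}$ is needed: the interval on the transformed scale has the known half-width $\Phi^{-1}\big(\tfrac{1+\gamma}{2}\big)/\sqrt{n}$.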
93 Chapter 3 Moment Estimators
94/95 Method of Moments

Let $\Theta \subseteq \mathbb{R}^k$ and $\{P_\vartheta\}_{\vartheta\in\Theta}$ be a family of probability measures on $\mathcal{B}(\mathbb{R})$. As above we assume that $\{X_l\}_{l\in\mathbb{N}}$ is a family of i.i.d. $\mathbb{R}$-valued random variables with $P \circ X_1^{-1} \in \{P_\vartheta\}_{\vartheta\in\Theta}$, i.e. the distribution of $X_1$ is known up to the parameter vector $\vartheta \in \Theta$.

Given some functions $f_1,\dots,f_k : \mathbb{R} \to \mathbb{R}$, the method of moments pursues the following ansatz⁶: Find $\vartheta \in \Theta$ s.t.

$\overline{f_j(X)}_n := \frac{1}{n}\sum_{i=1}^n f_j(X_i) \stackrel{!}{=} E_\vartheta[f_j(X_1)], \quad j = 1,\dots,k.$⁷

It is obvious that the LLN motivates this approach.

⁶ Of course, this requires certain integrability conditions on these functions and $X_1$ s.t. all involved expectations are well-defined.
⁷ Note that it is not clear a priori whether such a $\vartheta$ exists.
96-98 Ansatz: Find $\vartheta \in \Theta$ s.t. $\frac{1}{n}\sum_{i=1}^n f_j(X_i) \stackrel{!}{=} E_\vartheta[f_j(X_1)]$, $j = 1,\dots,k$.

Remark (Recourse to Mathematical Statistics)

Consider $f_j(x) := x^j$, $j = 1,\dots,k$. Then the method of moments reduces to finding $\vartheta \in \Theta$ s.t.

$\overline{X^j}_n = \frac{1}{n}\sum_{i=1}^n X_i^j = \frac{1}{n}\sum_{i=1}^n f_j(X_i) \stackrel{!}{=} E_\vartheta[f_j(X_1)] = E_\vartheta\big[X_1^j\big], \quad j = 1,\dots,k.$

Now we want to scrutinize conditions for existence and asymptotic normality of this type of estimator (to be introduced shortly). Therefore, we use the following

Notation: $f := (f_1,\dots,f_k)^T$, $\quad e : \Theta \to \mathbb{R}^k, \ \vartheta \mapsto E_\vartheta[f(X_1)]$.
99 Moment estimators

Therefore, the equation of interest is given by

$\big(\overline{f_j(X)}_n\big)_{j=1,\dots,k} \stackrel{!}{=} e(\vartheta). \qquad (*)$

Definition (moment estimators). An estimator $\hat\vartheta_n$ solving equation $(*)$ is called a moment estimator.
100 Existence and asymptotic normality

Theorem. We consider the situation stated above. Let $\Theta \subseteq \mathbb{R}^k$ be an open set and suppose $e(\vartheta) = E_\vartheta[f(X_1)]$, $\vartheta \in \Theta$, is continuously differentiable in an open neighborhood of some point $\vartheta_0 \in \Theta$ with $\det De(\vartheta_0) \ne 0$. Moreover, assume that $E_{\vartheta_0}\big[\|f(X_1)\|_2^2\big] < \infty$. Then $e$ is $C^1$-invertible in an open neighborhood of $\vartheta_0$, and moment estimators $\hat\vartheta_n$ exist with probability tending to 1 as $n \to \infty$⁸. Furthermore they obey⁹

$\sqrt{n}\,\big(\hat\vartheta_n - \vartheta_0\big) \rightsquigarrow N_k\Big(0,\; (De(\vartheta_0))^{-1}\, \mathrm{Cov}_{\vartheta_0}[f(X_1)]\, \big((De(\vartheta_0))^{-1}\big)^T\Big)$ under $P_{\vartheta_0}$.

Proof. → BOARD

⁸ I.e., informally, the set of $\omega$'s where $\hat\vartheta_n$ can be defined gains $P$-mass as $n \to \infty$.
⁹ If $\vartheta_0$ is the true parameter.
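As an illustration of the theorem (the exponential model is my example, not the speaker's): for $X_1 \sim \mathrm{Exp}(\vartheta)$ and $f(x) = x$, one has $e(\vartheta) = 1/\vartheta$, $De(\vartheta) = -1/\vartheta^2$ and $\mathrm{Cov}_\vartheta[f(X_1)] = 1/\vartheta^2$, so the moment estimator is $\hat\vartheta_n = 1/\bar X_n$ and the asymptotic covariance is $(-\vartheta^2)\cdot \vartheta^{-2} \cdot (-\vartheta^2) = \vartheta^2$, i.e. $\sqrt{n}\,(\hat\vartheta_n - \vartheta)/\vartheta \rightsquigarrow N_1(0,1)$. A quick simulation (with arbitrary $\vartheta$, $n$ and repetition count) confirms this:

```python
import math
import random

random.seed(6)

theta_true, n, reps = 2.0, 400, 2000
vals = []
for _ in range(reps):
    xbar = sum(random.expovariate(theta_true) for _ in range(n)) / n
    theta_hat = 1.0 / xbar  # moment estimator: solves xbar = e(theta) = 1/theta
    # standardize by sqrt of the asymptotic variance theta^2
    vals.append(math.sqrt(n) * (theta_hat - theta_true) / theta_true)

mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / (reps - 1)
print(mean, var)  # approximately 0 and 1, matching the N(0,1) limit
```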
101 List of literature

- Durrett, R.: Probability: Theory and Examples. Cambridge University Press.
- Klenke, A.: Wahrscheinlichkeitstheorie. Berlin: Springer.
- Redenbach, C.: Mathematical Statistics. Lecture Notes, TU Kaiserslautern.
- Seifried, F. T.: Maß und Integration. Lecture Notes, TU Kaiserslautern.
- Seifried, F. T.: Probability Theory. Lecture Notes, TU Kaiserslautern.
- Van der Vaart, A. W.: Asymptotic Statistics. Cambridge University Press.
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More informationChapter 2: Fundamentals of Statistics Lecture 15: Models and statistics
Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:
More informationLimiting Distributions
Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the
More informationSTA 732: Inference. Notes 2. Neyman-Pearsonian Classical Hypothesis Testing B&D 4
STA 73: Inference Notes. Neyman-Pearsonian Classical Hypothesis Testing B&D 4 1 Testing as a rule Fisher s quantification of extremeness of observed evidence clearly lacked rigorous mathematical interpretation.
More informationMultivariate Analysis and Likelihood Inference
Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density
More information7 Convergence in R d and in Metric Spaces
STA 711: Probability & Measure Theory Robert L. Wolpert 7 Convergence in R d and in Metric Spaces A sequence of elements a n of R d converges to a limit a if and only if, for each ǫ > 0, the sequence a
More informationChapter 3. Point Estimation. 3.1 Introduction
Chapter 3 Point Estimation Let (Ω, A, P θ ), P θ P = {P θ θ Θ}be probability space, X 1, X 2,..., X n : (Ω, A) (IR k, B k ) random variables (X, B X ) sample space γ : Θ IR k measurable function, i.e.
More informationChapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic
Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic Unbiased estimation Unbiased or asymptotically unbiased estimation plays an important role in
More informationHochdimensionale Integration
Oliver Ernst Institut für Numerische Mathematik und Optimierung Hochdimensionale Integration 14-tägige Vorlesung im Wintersemester 2010/11 im Rahmen des Moduls Ausgewählte Kapitel der Numerik Contents
More informationOverview of normed linear spaces
20 Chapter 2 Overview of normed linear spaces Starting from this chapter, we begin examining linear spaces with at least one extra structure (topology or geometry). We assume linearity; this is a natural
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationMA651 Topology. Lecture 9. Compactness 2.
MA651 Topology. Lecture 9. Compactness 2. This text is based on the following books: Topology by James Dugundgji Fundamental concepts of topology by Peter O Neil Elements of Mathematics: General Topology
More informationMetric Spaces Lecture 17
Metric Spaces Lecture 17 Homeomorphisms At the end of last lecture an example was given of a bijective continuous function f such that f 1 is not continuous. For another example, consider the sets T =
More informationStochastic Processes
Stochastic Processes A very simple introduction Péter Medvegyev 2009, January Medvegyev (CEU) Stochastic Processes 2009, January 1 / 54 Summary from measure theory De nition (X, A) is a measurable space
More informationSTAT 7032 Probability Spring Wlodek Bryc
STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,
More informationNotes 18 : Optional Sampling Theorem
Notes 18 : Optional Sampling Theorem Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Chapter 14], [Dur10, Section 5.7]. Recall: DEF 18.1 (Uniform Integrability) A collection
More informationMetric Spaces and Topology
Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies
More informationBuilding Infinite Processes from Finite-Dimensional Distributions
Chapter 2 Building Infinite Processes from Finite-Dimensional Distributions Section 2.1 introduces the finite-dimensional distributions of a stochastic process, and shows how they determine its infinite-dimensional
More informationNotions such as convergent sequence and Cauchy sequence make sense for any metric space. Convergent Sequences are Cauchy
Banach Spaces These notes provide an introduction to Banach spaces, which are complete normed vector spaces. For the purposes of these notes, all vector spaces are assumed to be over the real numbers.
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationReal Analysis. July 10, These notes are intended for use in the warm-up camp for incoming Berkeley Statistics
Real Analysis July 10, 2006 1 Introduction These notes are intended for use in the warm-up camp for incoming Berkeley Statistics graduate students. Welcome to Cal! The real analysis review presented here
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2
MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is
More informationCherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants
18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009
More informationElementary Probability. Exam Number 38119
Elementary Probability Exam Number 38119 2 1. Introduction Consider any experiment whose result is unknown, for example throwing a coin, the daily number of customers in a supermarket or the duration of
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationStatistics for scientists and engineers
Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3
More information1 Fourier Integrals of finite measures.
18.103 Fall 2013 1 Fourier Integrals of finite measures. Denote the space of finite, positive, measures on by M + () = {µ : µ is a positive measure on ; µ() < } Proposition 1 For µ M + (), we define the
More informationLecture 21: Convergence of transformations and generating a random variable
Lecture 21: Convergence of transformations and generating a random variable If Z n converges to Z in some sense, we often need to check whether h(z n ) converges to h(z ) in the same sense. Continuous
More informationRecall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n
Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible
More informationLecture 28: Asymptotic confidence sets
Lecture 28: Asymptotic confidence sets 1 α asymptotic confidence sets Similar to testing hypotheses, in many situations it is difficult to find a confidence set with a given confidence coefficient or level
More information