Asymptotic Statistics-III. Changliang Zou


The multivariate central limit theorem

Theorem (Multivariate CLT for the iid case). Let $X_i$ be iid random $p$-vectors with mean $\mu$ and covariance matrix $\Sigma$. Then
$$\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N_p(0, \Sigma).$$
By the Cramer-Wold device, this can be proved by finding the limit distribution of the sequence of real variables
$$c^T\left(\frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i - \mu)\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^n (c^T X_i - c^T \mu).$$

The multivariate central limit theorem

Because the random variables $c^T X_i - c^T\mu$ are iid with zero mean and variance $c^T\Sigma c$, this sequence is $AN(0, c^T\Sigma c)$ by Theorem ??. This is exactly the distribution of $c^T X$ if $X$ possesses an $N_p(0, \Sigma)$ distribution.

The multivariate central limit theorem

Example. Suppose that $X_1, \ldots, X_n$ is a random sample from the Poisson distribution with mean $\theta$. Let $Z_n$ be the proportion of zeros observed, i.e., $Z_n = n^{-1}\sum_{i=1}^n I\{X_i = 0\}$. Let us find the joint asymptotic distribution of $(\bar{X}_n, Z_n)$.

Note that $E(X_1) = \theta$, $EI\{X_1 = 0\} = e^{-\theta}$, $\mathrm{var}(X_1) = \theta$, $\mathrm{var}(I\{X_1 = 0\}) = e^{-\theta}(1 - e^{-\theta})$, and $EX_1 I\{X_1 = 0\} = 0$. So $\mathrm{cov}(X_1, I\{X_1 = 0\}) = -\theta e^{-\theta}$. Hence
$$\sqrt{n}\left((\bar{X}_n, Z_n) - (\theta, e^{-\theta})\right) \xrightarrow{d} N_2(0, \Sigma), \quad \text{where } \Sigma = \begin{pmatrix} \theta & -\theta e^{-\theta} \\ -\theta e^{-\theta} & e^{-\theta}(1 - e^{-\theta}) \end{pmatrix}.$$
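A minimal Monte Carlo sketch of this joint limit; the value of $\theta$ and the sample sizes below are illustrative choices, not values from the slides.

```python
import numpy as np

# Check that sqrt(n)*((Xbar_n, Z_n) - (theta, e^{-theta})) has covariance
# close to the Sigma derived above. theta, n, reps are illustrative.
rng = np.random.default_rng(0)
theta, n, reps = 2.0, 400, 4000

X = rng.poisson(theta, size=(reps, n))
xbar = X.mean(axis=1)                  # Xbar_n, one per replicate
zn = (X == 0).mean(axis=1)             # Z_n, proportion of zeros

W = np.sqrt(n) * np.column_stack([xbar - theta, zn - np.exp(-theta)])
emp_cov = np.cov(W.T)

Sigma = np.array([[theta,                   -theta * np.exp(-theta)],
                  [-theta * np.exp(-theta),  np.exp(-theta) * (1 - np.exp(-theta))]])
print(np.round(emp_cov, 3))
print(np.round(Sigma, 3))
```

The empirical covariance of the standardized pair should match $\Sigma$ up to Monte Carlo error; note in particular the negative off-diagonal entry.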

Exercises

Consider two sequences of random variables $X_n$ and $Y_n$. Show that if $(X_n - EX_n)/\sqrt{\mathrm{var}\,X_n} \xrightarrow{d} X$ and $\mathrm{corr}(X_n, Y_n) \to 1$, then $(Y_n - EY_n)/\sqrt{\mathrm{var}\,Y_n} \xrightarrow{d} X$.

Let $X_1, X_2, \ldots$ be iid double exponential (Laplace) random variables with density $f(x) = (2\tau)^{-1}\exp\{-|x|/\tau\}$, where $\tau$ is a positive parameter that represents the mean deviation, i.e., $\tau = E|X|$. Let $\bar{X}_n = n^{-1}\sum_{i=1}^n X_i$ and $\bar{Y}_n = n^{-1}\sum_{i=1}^n |X_i|$. (a) Find the joint asymptotic distribution of $\bar{X}_n$ and $\bar{Y}_n$. (b) Find the asymptotic distribution of $(\bar{Y}_n - \tau)/\bar{X}_n$.

(a) Let $Y_i = |X_i|$. Then $(X_i, Y_i)$ are iid with $E(X_i, Y_i) = (0, \tau)$. Since $EX_i^2 = 2\tau^2$, we have $\mathrm{var}\,X_i = 2\tau^2$ and $\mathrm{var}\,Y_i = 2\tau^2 - \tau^2 = \tau^2$. By symmetry, $\mathrm{cov}(X_i, Y_i) = EX_i|X_i| = 0$. Therefore, from the multivariate CLT,
$$\sqrt{n}\left(\bar{X}_n, \bar{Y}_n - \tau\right) \xrightarrow{d} N_2\left(0, \begin{pmatrix} 2\tau^2 & 0 \\ 0 & \tau^2 \end{pmatrix}\right).$$
(b) From the CMT with $g(x, y) = y/x$, continuous except on the line $x = 0$, we have $(\bar{Y}_n - \tau)/\bar{X}_n = \sqrt{n}(\bar{Y}_n - \tau)/(\sqrt{n}\bar{X}_n) \xrightarrow{d} V/U$, where $U$ and $V$ are independent normal random variables with zero means and variances $2\tau^2$ and $\tau^2$, respectively. This ratio has a Cauchy distribution with median zero and scale parameter $1/\sqrt{2}$, independent of $\tau$. Of course, $(\bar{Y}_n - \tau)/(\bar{X}_n/\sqrt{2})$ has a standard Cauchy distribution.
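A simulation sketch for part (b). Since a Cauchy variable has no mean, the check below uses the median and interquartile range rather than moments; $\tau$, $n$ and the replicate count are illustrative choices.

```python
import numpy as np

# (Ybar_n - tau)/(Xbar_n/sqrt(2)) should be approximately standard Cauchy:
# median 0 and interquartile range 2 (quartiles at -1 and +1).
rng = np.random.default_rng(1)
tau, n, reps = 1.5, 1000, 5000

X = rng.laplace(loc=0.0, scale=tau, size=(reps, n))  # density (2 tau)^{-1} exp(-|x|/tau)
xbar = X.mean(axis=1)
ybar = np.abs(X).mean(axis=1)

T = (ybar - tau) / (xbar / np.sqrt(2))
q1, med, q3 = np.quantile(T, [0.25, 0.5, 0.75])
print(round(med, 3), round(q3 - q1, 3))  # standard Cauchy: median 0, IQR 2
```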

CLT: existence of a variance is not necessary

Definition. A function $g: \mathbb{R} \to \mathbb{R}$ is called slowly varying at $\infty$ if, for every $t > 0$, $\lim_{x\to\infty} g(tx)/g(x) = 1$.

Examples: $\log x$, $x/(1+x)$, and indeed any function with a finite nonzero limit as $x \to \infty$; $x$ and $e^x$ are not slowly varying.

Theorem. Let $X_1, X_2, \ldots$ be iid from a CDF $F$ on $\mathbb{R}$. Let $v(x) = \int_{-x}^{x} y^2\, dF(y)$. Then there exist constants $\{a_n\}$, $\{b_n\}$ such that
$$\frac{\sum_{i=1}^n X_i - a_n}{b_n} \xrightarrow{d} N(0, 1)$$
if and only if $v(x)$ is slowly varying at $\infty$. If $F$ has a finite second moment, then $v(x)$ is slowly varying at $\infty$.

CLT: existence of a variance is not necessary

Example. Suppose $X_1, X_2, \ldots$ are iid from a $t$-distribution with 2 degrees of freedom ($t(2)$), which has a finite mean but not a finite variance. The density is $f(y) = c(2 + y^2)^{-3/2}$ for some positive $c$. By a direct integration, for some other constant $k$,
$$v(x) = k\left[\mathrm{arcsinh}(x/\sqrt{2}) - \frac{x}{\sqrt{2 + x^2}}\right].$$
On using the fact that $\mathrm{arcsinh}(x) = \log(2x) + O(x^{-2})$ as $x \to \infty$, we get, for any $t > 0$, $v(tx)/v(x) \to 1$. Hence the partial sums $\sum_{i=1}^n X_i$ converge to a normal distribution. The centering can be taken to be zero for the centered $t$-distribution; it can be shown that the normalizing constant required is $b_n = \sqrt{n \log n}$.
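The slow variation of the truncated second moment can be checked numerically. The sketch below assumes the closed form $v(x) = k[\mathrm{arcsinh}(x/\sqrt{2}) - x/\sqrt{2+x^2}]$ (with $k$ set to 2 for concreteness) and evaluates $v(2x)/v(x)$, which should drift toward 1, slowly, like a ratio of logarithms.

```python
import math

# Truncated second moment of the t(2) density, up to the constant k:
# v(x) = 2*[arcsinh(x/sqrt(2)) - x/sqrt(2 + x^2)]  (k = 2 is illustrative;
# slow variation does not depend on k).
def v(x):
    return 2.0 * (math.asinh(x / math.sqrt(2)) - x / math.sqrt(2 + x * x))

# v(2x)/v(x) should approach 1 as x grows, but only logarithmically fast.
for x in (1e2, 1e4, 1e6):
    print(x, round(v(2 * x) / v(x), 4))
```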

CLT for the independent not necessarily iid case

Theorem (Lindeberg-Feller). Suppose $\{X_n\}$ is a sequence of independent variables with means $\{\mu_n\}$ and variances $\{\sigma_n^2\}$, $\sigma_n^2 < \infty$. Let $s_n^2 = \sum_{i=1}^n \sigma_i^2$. If for any $\epsilon > 0$
$$\frac{1}{s_n^2} \sum_{j=1}^n \int_{|x - \mu_j| > \epsilon s_n} (x - \mu_j)^2\, dF_j(x) \to 0, \qquad (1)$$
where $F_i$ is the CDF of $X_i$, then
$$\frac{\sum_{i=1}^n (X_i - \mu_i)}{s_n} \xrightarrow{d} N(0, 1).$$
The condition (1) is called the Lindeberg-Feller condition.

CLT for the independent not necessarily iid case

Example. Let $X_1, X_2, \ldots$ be independent variables such that $X_j$ has the uniform distribution on $[-j, j]$, $j = 1, 2, \ldots$. Let us verify that the conditions of the theorem are satisfied. Note that $EX_j = 0$ and $\sigma_j^2 = \frac{1}{2j}\int_{-j}^{j} x^2\, dx = j^2/3$ for all $j$, so
$$s_n^2 = \sum_{j=1}^n \sigma_j^2 = \frac{1}{3}\sum_{j=1}^n j^2 = \frac{n(n+1)(2n+1)}{18}.$$
For any $\epsilon > 0$, $n < \epsilon s_n$ for sufficiently large $n$, since $\lim_{n\to\infty} n/s_n = 0$. Because $|X_j| \le j \le n$, when $n$ is sufficiently large, $E(X_j^2 I\{|X_j| > \epsilon s_n\}) = 0$ for every $j \le n$. Consequently, $\sum_{j=1}^n E(X_j^2 I\{|X_j| > \epsilon s_n\}) = 0$ for all large $n$, and Lindeberg's condition holds.
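A quick numerical check of the bookkeeping in this example: the closed form for $s_n^2$ and the fact that $n/s_n \to 0$ (since $s_n$ grows like $n^{3/2}$), which is what makes every truncated term vanish.

```python
import numpy as np

# sigma_j^2 = j^2/3 for X_j ~ Uniform[-j, j]; compare the direct sum with
# the closed form n(n+1)(2n+1)/18 and watch n/s_n shrink.
def s2(n):
    j = np.arange(1, n + 1)
    return np.sum(j * j / 3.0)

for n in (10, 100, 1000):
    closed = n * (n + 1) * (2 * n + 1) / 18.0
    print(n, np.isclose(s2(n), closed), round(n / np.sqrt(closed), 4))
```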

CLT for the independent not necessarily iid case

It is hard to verify the Lindeberg-Feller condition. A simpler theorem:

Theorem (Liapounov). Suppose $\{X_n\}$ is a sequence of independent variables with means $\{\mu_n\}$ and variances $\{\sigma_n^2\}$, $\sigma_n^2 < \infty$. Let $s_n^2 = \sum_{i=1}^n \sigma_i^2$. If for some $\delta > 0$
$$\frac{1}{s_n^{2+\delta}} \sum_{j=1}^n E|X_j - \mu_j|^{2+\delta} \to 0 \qquad (2)$$
as $n \to \infty$, then
$$\frac{\sum_{i=1}^n (X_i - \mu_i)}{s_n} \xrightarrow{d} N(0, 1).$$

Liapounov CLT

A convenient set of sufficient conditions for (2): $s_n \to \infty$, $\sup_{j\ge 1} E|X_j - \mu_j|^{2+\delta} < \infty$, and $n^{-1}s_n^2$ bounded away from zero. In practice, one works with $\delta = 1$ or $2$. If the $X_i$ are uniformly bounded and $s_n \to \infty$, the condition is immediately satisfied with $\delta = 1$.

Liapounov CLT

Example. Let $X_1, X_2, \ldots$ be independent random variables, where $X_i$ has the binomial distribution $BIN(p_i, 1)$, $i = 1, 2, \ldots$. For each $i$, $EX_i = p_i$ and
$$E|X_i - EX_i|^3 = (1 - p_i)^3 p_i + p_i^3(1 - p_i) \le 2p_i(1 - p_i).$$
Hence $\sum_{i=1}^n E|X_i - EX_i|^3 \le 2\sum_{i=1}^n p_i(1 - p_i) = 2s_n^2$, so Liapounov's condition (2) holds with $\delta = 1$ whenever $s_n \to \infty$, since the ratio in (2) is bounded by $2/s_n$. For example, if $p_i = 1/i$, or if $M_1 \le p_i \le M_2$ for two constants $M_1, M_2 \in (0, 1)$, then $s_n \to \infty$. Accordingly, by Liapounov's theorem,
$$\frac{\sum_{i=1}^n (X_i - p_i)}{s_n} \xrightarrow{d} N(0, 1).$$
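A simulation sketch of this example with the choice $p_i = 1/i$; the sample sizes are illustrative. The standardized sum has exact mean 0 and variance 1 by construction, and Liapounov's theorem says it is approximately standard normal.

```python
import numpy as np

# Independent Bernoulli(p_i) with p_i = 1/i; s_n^2 = sum p_i(1-p_i) grows
# like log n, so s_n -> infinity (slowly) and Liapounov applies.
rng = np.random.default_rng(2)
n, reps = 2000, 2000
p = 1.0 / np.arange(1, n + 1)
s_n = np.sqrt(np.sum(p * (1 - p)))

X = rng.random((reps, n)) < p                 # each row: one realization of X_1..X_n
T = (X.sum(axis=1) - p.sum()) / s_n           # standardized sum, one per replicate
print(round(T.mean(), 3), round(T.std(), 3))  # should be near 0 and 1
```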

CLT for double array and triangular array

Double array:
$X_{11}$ with distribution $F_1$;
$X_{21}, X_{22}$ independent, each with distribution $F_2$;
$\ldots$
$X_{n1}, \ldots, X_{nn}$ independent, each with distribution $F_n$.

Triangular array:
$X_{11}$ with distribution $F_{11}$;
$X_{21}, X_{22}$ independent, with distributions $F_{21}, F_{22}$;
$\ldots$
$X_{n1}, \ldots, X_{nn}$ independent, with distributions $F_{n1}, \ldots, F_{nn}$.

CLT for double array

Theorem. Let the $X_{ij}$ be distributed as a double array. Then
$$P\left(\frac{\sqrt{n}(\bar{X}_n - \mu_n)}{\sigma_n} \le x\right) \to \Phi(x) \quad \text{as } n \to \infty$$
for any sequence $\{F_n\}$ with mean $\mu_n$ and variance $\sigma_n^2$ for which $E_n|X_{n1} - \mu_n|^3/\sigma_n^3 = o(\sqrt{n})$. Here $E_n$ denotes the expectation under $F_n$. E.g., $Bin(p_n, n)$, where the success probability depends on $n$.

CLT for triangular array

Theorem. Let the $X_{ij}$ be distributed as a triangular array and let $E(X_{ij}) = \mu_{ij}$, $\mathrm{var}(X_{ij}) = \sigma_{ij}^2 < \infty$, and $s_n^2 = \sum_{j=1}^n \sigma_{nj}^2$. Then
$$\frac{\sum_{j=1}^n (X_{nj} - \mu_{nj})}{s_n} \xrightarrow{d} N(0, 1),$$
provided that
$$\frac{1}{s_n^{2+\delta}} \sum_{j=1}^n E|X_{nj} - \mu_{nj}|^{2+\delta} \to 0.$$

Hajek-Sidak CLT

Theorem (Hajek-Sidak). Suppose $X_1, X_2, \ldots$ are iid random variables with mean $\mu$ and variance $\sigma^2 < \infty$. Let $c_n = (c_{n1}, c_{n2}, \ldots, c_{nn})$ be a vector of constants such that
$$\frac{\max_{1\le i\le n} c_{ni}^2}{\sum_{j=1}^n c_{nj}^2} \to 0 \qquad (3)$$
as $n \to \infty$. Then
$$\frac{\sum_{i=1}^n c_{ni}(X_i - \mu)}{\sigma\sqrt{\sum_{j=1}^n c_{nj}^2}} \xrightarrow{d} N(0, 1).$$

Hajek-Sidak CLT

The condition (3) ensures that no single coefficient dominates the vector $c_n$, and is referred to as the Hajek-Sidak condition. For example, if $c_n = (1, 0, \ldots, 0)$, then the condition fails, and so does the theorem.

Hajek-Sidak CLT

Example (simple linear regression). Consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where the $\varepsilon_i$ are iid with mean 0 and variance $\sigma^2$ but are not necessarily normally distributed. The least squares estimate of $\beta_1$ based on $n$ observations is
$$\hat{\beta}_1 = \frac{\sum_{i=1}^n (y_i - \bar{y}_n)(x_i - \bar{x}_n)}{\sum_{i=1}^n (x_i - \bar{x}_n)^2} = \beta_1 + \frac{\sum_{i=1}^n \varepsilon_i (x_i - \bar{x}_n)}{\sum_{i=1}^n (x_i - \bar{x}_n)^2}.$$

Hajek-Sidak CLT

Thus $\hat{\beta}_1 = \beta_1 + \sum_{i=1}^n \varepsilon_i c_{ni} / \sum_{j=1}^n c_{nj}^2$, where $c_{ni} = x_i - \bar{x}_n$. By Hajek-Sidak's theorem,
$$\frac{\sqrt{\sum_{j=1}^n c_{nj}^2}\,(\hat{\beta}_1 - \beta_1)}{\sigma} = \frac{\sum_{i=1}^n \varepsilon_i c_{ni}}{\sigma\sqrt{\sum_{j=1}^n c_{nj}^2}} \xrightarrow{d} N(0, 1),$$
provided that
$$\frac{\max_{1\le i\le n} (x_i - \bar{x}_n)^2}{\sum_{j=1}^n (x_j - \bar{x}_n)^2} \to 0$$
as $n \to \infty$, which holds under mild conditions on the design variables.
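A simulation sketch of this result with a fixed uniform design and centered exponential (deliberately non-normal) errors; all numerical choices below are illustrative assumptions.

```python
import numpy as np

# Standardized least squares slope under non-normal errors: by Hajek-Sidak,
# sqrt(sum c^2) * (b1 - beta1) / sigma should be approximately N(0, 1).
rng = np.random.default_rng(3)
n, reps = 200, 3000
beta0, beta1, sigma = 1.0, 2.0, 1.0

x = rng.uniform(0, 1, size=n)           # fixed design, reused in every replicate
c = x - x.mean()                        # c_ni = x_i - xbar_n
ssx = np.sum(c**2)

eps = rng.exponential(sigma, size=(reps, n)) - sigma  # mean 0, variance sigma^2
y = beta0 + beta1 * x + eps
b1 = (y * c).sum(axis=1) / ssx          # least squares slope (sum c = 0), per replicate
T = np.sqrt(ssx) * (b1 - beta1) / sigma
print(round(T.mean(), 3), round(T.std(), 3))  # should be near 0 and 1
```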

Lindeberg-Feller multivariate CLT

Theorem (Lindeberg-Feller, multivariate). Suppose $\{X_i\}$ is a sequence of independent vectors with means $\mu_i$, covariances $\Sigma_i$ and distribution functions $F_i$. Suppose that $\frac{1}{n}\sum_{i=1}^n \Sigma_i \to \Sigma$ as $n \to \infty$, and that for any $\epsilon > 0$
$$\frac{1}{n} \sum_{j=1}^n \int_{\|x - \mu_j\| > \epsilon\sqrt{n}} \|x - \mu_j\|^2\, dF_j(x) \to 0.$$
Then
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - \mu_i) \xrightarrow{d} N(0, \Sigma).$$

Lindeberg-Feller multivariate CLT

Example (multiple regression). In the linear regression problem, we observe a vector $y = X\beta + \varepsilon$ for a fixed or random matrix $X$ of full rank, and an error vector $\varepsilon$ with iid components with mean zero and variance $\sigma^2$. The least squares estimator of $\beta$ is $\hat{\beta} = (X^T X)^{-1} X^T y$. This estimator is unbiased and has covariance matrix $\sigma^2 (X^T X)^{-1}$. If the error vector $\varepsilon$ is normally distributed, then $\hat{\beta}$ is exactly normally distributed. Under reasonable conditions on the design matrix, $\hat{\beta}$ is asymptotically normally distributed for a large range of error distributions.

Lindeberg-Feller multivariate CLT

Here we fix $p$ and let $n$ tend to infinity. The result follows from the representation
$$(X^T X)^{1/2}(\hat{\beta} - \beta) = (X^T X)^{-1/2} X^T \varepsilon = \sum_{i=1}^n a_{ni}\varepsilon_i,$$
where $a_{n1}, \ldots, a_{nn}$ are the columns of the $(p \times n)$ matrix $(X^T X)^{-1/2} X^T =: A$. This sequence is asymptotically normal if the vectors $a_{n1}\varepsilon_1, \ldots, a_{nn}\varepsilon_n$ satisfy the Lindeberg conditions. The norming matrix $(X^T X)^{1/2}$ has been chosen to ensure that the vectors in the display have covariance matrix $\sigma^2 I_p$ for every $n$. The remaining condition is
$$\sum_{i=1}^n \|a_{ni}\|^2\, E\varepsilon_i^2 I\{\|a_{ni}\|\,|\varepsilon_i| > \epsilon\} \to 0.$$

Lindeberg-Feller multivariate CLT

Because $\sum_{i=1}^n \|a_{ni}\|^2 = \mathrm{tr}(AA^T) = p$, it suffices that
$$\max_{1\le i\le n} E\varepsilon_i^2 I\{\|a_{ni}\|\,|\varepsilon_i| > \epsilon\} \to 0.$$
The expectation $E\varepsilon_i^2 I\{\|a_{ni}\|\,|\varepsilon_i| > \epsilon\}$ can be bounded by $\epsilon^{-k} E|\varepsilon_i|^{k+2} \|a_{ni}\|^k$, so a set of sufficient conditions is
$$\sum_{i=1}^n \|a_{ni}\|^k \to 0 \quad \text{and} \quad E|\varepsilon_1|^k < \infty, \quad \text{for some } k > 2.$$
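The covariance identity that motivates the norming matrix can be checked by simulation: $(X^T X)^{1/2}(\hat{\beta} - \beta)$ has covariance $\sigma^2 I_p$ for every $n$, whatever the mean-zero, finite-variance error law. The design matrix, $\beta$, and the uniform errors below are illustrative assumptions.

```python
import numpy as np

# Monte Carlo check that (X^T X)^{1/2} (betahat - beta) has covariance I_p
# (sigma = 1 here) even with non-normal errors.
rng = np.random.default_rng(4)
n, p, reps = 300, 3, 2000
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n), rng.normal(0, 1, n)])
beta = np.array([1.0, -2.0, 0.5])

XtX = X.T @ X
root = np.linalg.cholesky(XtX).T              # upper-triangular R with R^T R = X^T X

eps = rng.uniform(-np.sqrt(3), np.sqrt(3), (reps, n))   # mean 0, variance 1
Y = X @ beta + eps                                       # one regression per row
betahat = np.linalg.solve(XtX, X.T @ Y.T)                # shape (p, reps)
W = (root @ (betahat - beta[:, None])).T                 # standardized estimators
print(np.round(np.cov(W.T), 2))               # should be close to I_3
```

Any square root of $X^T X$ works here; the Cholesky factor is just a convenient choice, since $R(X^TX)^{-1}R^T = I_p$ when $R^T R = X^T X$.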

CLT for a random number of summands

Sometimes the number of terms present in a partial sum is itself a random variable. Precisely, $\{N(t)\}$, $t \ge 0$, is a family of (nonnegative) integer-valued random variables, and we want to approximate the distribution of $T_{N(t)}$, where $T_k = \sum_{i=1}^k X_i$.

Theorem (Anscombe-Renyi). Let $\{X_i\}$ be iid with mean $\mu$ and a finite variance $\sigma^2$, and let $\{N_n\}$ be a sequence of (nonnegative) integer-valued random variables and $\{a_n\}$ a sequence of positive constants tending to $\infty$ such that $N_n/a_n \xrightarrow{p} c$, $0 < c < \infty$, as $n \to \infty$. Then
$$\frac{T_{N_n} - N_n\mu}{\sigma\sqrt{N_n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty.$$

CLT for a random number of summands

Example (coupon collection problem). Consider a problem in which a person keeps purchasing boxes of cereal until she obtains a full set of some $n$ coupons. The assumption is that each box, mutually independently of the others, is equally likely to contain any of the $n$ coupons. Suppose that the costs of buying the cereal boxes are iid with some mean $\mu$ and some variance $\sigma^2$. If it takes $N_n$ boxes to obtain the complete set of all $n$ coupons, then $N_n/(n \ln n) \xrightarrow{p} 1$ as $n \to \infty$. The total cost to the customer to obtain the complete set of coupons is $T_{N_n} = X_1 + \cdots + X_{N_n}$, and
$$\frac{T_{N_n} - N_n\mu}{\sigma\sqrt{n \ln n}} \xrightarrow{d} N(0, 1).$$

CLT for a random number of summands

[On the asymptotic behavior of $N_n$.] Let $t_i$ be the number of boxes needed to collect the $i$-th new coupon after $i - 1$ coupons have been collected. The probability of drawing a new coupon when $i - 1$ have been collected is $p_i = (n - i + 1)/n$, so $t_i$ has a geometric distribution with expectation $1/p_i$, and $N_n = \sum_{i=1}^n t_i$. By the WLLN,
$$\frac{1}{n \ln n}\left(N_n - \sum_{i=1}^n p_i^{-1}\right) \xrightarrow{p} 0.$$
Moreover,
$$\frac{1}{n \ln n}\sum_{i=1}^n p_i^{-1} = \frac{1}{n \ln n}\sum_{i=1}^n \frac{n}{n - i + 1} = \frac{1}{\ln n}\sum_{i=1}^n \frac{1}{i} =: \frac{1}{\ln n} H_n \to 1,$$
since $H_n = \ln n + \gamma + o(1)$, where $\gamma$ is Euler's constant. Hence $N_n/(n \ln n) \xrightarrow{p} 1$.
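The geometric decomposition $N_n = \sum_i t_i$ makes the collector's total easy to simulate; the ratio $N_n/(n \ln n)$ should concentrate near $H_n/\ln n$, which tends to 1. The value of $n$ and the replicate count below are illustrative choices.

```python
import numpy as np

# Simulate the coupon collector's N_n as a sum of independent geometrics
# with success probabilities p_i = (n - i + 1)/n, as in the decomposition.
rng = np.random.default_rng(5)
n, reps = 2000, 200
p = (n - np.arange(1, n + 1) + 1) / n   # p_1 = 1, ..., p_n = 1/n

N = np.array([rng.geometric(p).sum() for _ in range(reps)])
ratio = N / (n * np.log(n))
H_n = np.sum(1.0 / np.arange(1, n + 1))
print(round(ratio.mean(), 3), round(H_n / np.log(n), 3))  # these should be close
```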

Exercises Suppose (X i, Y i ), i = 1,..., n are iid bivariate normal samples with E(X 1 ) = µ 1, E(Y 1 ) = µ 2, var(x 1 ) = σ 2 1, var(y 1) = σ 2 2, and corr(x 1, Y 1 ) = ρ. The standard test of the hypothesis H 0 : ρ = 0, or equivalently, H 0 : X, Y are independent, rejects H 0 when the sample correlation r n is sufficiently large in absolute value. Please find the asymptotic critical value. indep Suppose X i (µ, σi 2), where σ2 i = iδ. Find the asymptotic distribution of the best linear unbiased estimate of µ.

Homework

1. Prove Theorem 1.3.9 or Theorem 1.3.10 (choose one of them).
2. Suppose $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \mu^2)$, $\mu > 0$, so that $\bar{X}_n$ and $S_n$ are both reasonable estimates of $\mu$. Find the limit of $P(|S_n - \mu| < |\bar{X}_n - \mu|)$.
3. Consider $n$ observations $\{(x_i, y_i)\}_{i=1}^n$ from the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where the $\varepsilon_i$ are iid with mean 0 and unknown variance $\sigma^2 < \infty$ (but are not necessarily normally distributed). Assume the $x_i$ are equally spaced in the design interval $[0, 1]$, say $x_i = i/n$. We are interested in testing the null hypothesis $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \neq 0$. Please provide a proper test statistic based on the least squares estimate and find the critical value such that the asymptotic level of the test is $\alpha$.