An Introduction to Asymptotic Theory


Ping Yu
School of Economics and Finance
The University of Hong Kong

Five Weapons in Asymptotic Theory

Five Weapons

- The weak law of large numbers (WLLN, or LLN)
- The central limit theorem (CLT)
- The continuous mapping theorem (CMT)
- Slutsky's theorem
- The Delta method

Notations:
- In nonlinear (in parameter) models, the capital letters such as $X$ denote random variables or random vectors, and the corresponding lower-case letters such as $x$ denote the potential values they may take.
- The generic notation for a parameter in nonlinear environments (e.g., nonlinear models or nonlinear constraints) is $\theta$, while in linear environments it is $\beta$.

The WLLN

Definition. A random vector $Z_n$ converges in probability to $Z$ as $n \to \infty$, denoted as $Z_n \xrightarrow{p} Z$, if for any $\delta > 0$,
$$\lim_{n\to\infty} P\left(\|Z_n - Z\| > \delta\right) = 0.$$

Although the limit $Z$ can be random, it is usually constant. [intuition]

The probability limit of $Z_n$ is often denoted as $\mathrm{plim}(Z_n)$. If $Z_n \xrightarrow{p} 0$, we denote $Z_n = o_p(1)$.

When an estimator converges in probability to the true value as the sample size diverges, we say that the estimator is consistent. Consistency is an important preliminary step in establishing other important asymptotic approximations.

Theorem (WLLN). Suppose $X_1, \ldots, X_n, \ldots$ are i.i.d. random vectors, and $E[\|X\|] < \infty$; then as $n \to \infty$,
$$\bar{X}_n \equiv \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{p} E[X].$$
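As a quick numerical illustration (a minimal sketch, not part of the original slides), the following Python/NumPy snippet uses i.i.d. Exponential(1) draws, an illustrative choice with $E[X] = 1$, and shows the sample mean settling toward 1 as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# WLLN demo: the sample mean of i.i.d. Exponential(1) draws
# (population mean E[X] = 1) converges in probability to 1.
for n in [10, 100, 10_000, 1_000_000]:
    x = rng.exponential(scale=1.0, size=n)
    print(f"n = {n:>9,}: sample mean = {x.mean():.4f}")
```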

The CLT

Definition. A random $k$-vector $Z_n$ converges in distribution to $Z$ as $n \to \infty$, denoted as $Z_n \xrightarrow{d} Z$, if
$$\lim_{n\to\infty} F_n(z) = F(z)$$
at all $z$ where $F(\cdot)$ is continuous, where $F_n$ is the cdf of $Z_n$ and $F$ is the cdf of $Z$.

Usually, $Z$ is normally distributed, so all $z \in \mathbb{R}^k$ are continuity points of $F$.

If $Z_n$ converges in distribution to $Z$, then $Z_n$ is stochastically bounded and we denote $Z_n = O_p(1)$. Rigorously, $Z_n = O_p(1)$ if $\forall \varepsilon > 0$, $\exists M_\varepsilon < \infty$ such that $P(\|Z_n\| > M_\varepsilon) < \varepsilon$ for any $n$. If $Z_n = o_p(1)$, then $Z_n = O_p(1)$.

We can show that $o_p(1) + o_p(1) = o_p(1)$, $o_p(1) + O_p(1) = O_p(1)$, $O_p(1) + O_p(1) = O_p(1)$, $o_p(1)o_p(1) = o_p(1)$, $o_p(1)O_p(1) = o_p(1)$, and $O_p(1)O_p(1) = O_p(1)$.

Theorem (CLT). Suppose $X_1, \ldots, X_n, \ldots$ are i.i.d. random $k$-vectors, $E[X] = \mu$, and $\mathrm{Var}(X) = \Sigma$; then
$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \Sigma).$$
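A simulation sketch of the CLT (again with illustrative Exponential(1) data, for which $\mu = 1$ and $\sigma^2 = 1$; the distribution and sample sizes are my choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# CLT demo: sqrt(n) * (sample mean - mu) for Exponential(1) data
# (mu = 1, sigma^2 = 1) should be approximately N(0, 1).
n, reps = 500, 20_000
x = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - 1.0)

print("mean of z:", z.mean().round(3))               # ~ 0
print("var  of z:", z.var().round(3))                # ~ 1
print("P(z <= 1.96):", (z <= 1.96).mean().round(3))  # ~ 0.975
```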

Comparison Between the WLLN and the CLT

The CLT tells us more than the WLLN: $\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0,\Sigma)$ implies $\bar{X}_n \xrightarrow{p} \mu$, so the CLT is stronger than the WLLN.

$\bar{X}_n \xrightarrow{p} \mu$ means $\bar{X}_n - \mu = o_p(1)$, but provides no information about $\sqrt{n}\left(\bar{X}_n - \mu\right)$. The CLT tells us that $\sqrt{n}\left(\bar{X}_n - \mu\right) = O_p(1)$, or $\bar{X}_n - \mu = O_p\left(n^{-1/2}\right)$.

But the WLLN does not require the second moment to be finite; that is, the stronger result is not free.
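To see the $O_p\left(n^{-1/2}\right)$ rate numerically, here is a small sketch (my illustration, with Exponential(1) data again): the typical size of $\bar{X}_n - \mu$ shrinks with $n$, while $\sqrt{n}\,|\bar{X}_n - \mu|$ stays roughly constant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rate demo: the error of the sample mean is O_p(n^{-1/2}), so
# sqrt(n) * |error| has roughly the same typical size for every n.
for n in [100, 1_000, 10_000]:
    means = rng.exponential(1.0, size=(1_000, n)).mean(axis=1)
    med = np.median(np.abs(means - 1.0))
    print(f"n = {n:>6}: median |error| = {med:.5f}, sqrt(n)*median = {np.sqrt(n)*med:.3f}")
```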

The CMT

Theorem (CMT). Suppose $X_1, \ldots, X_n, \ldots$ are random $k$-vectors, and $g$ is a function (to $\mathbb{R}^l$) continuous on the support of $X$ a.s. $P_X$; then
$$X_n \xrightarrow{p} X \implies g(X_n) \xrightarrow{p} g(X); \qquad X_n \xrightarrow{d} X \implies g(X_n) \xrightarrow{d} g(X).$$

The CMT allows the function $g$ to be discontinuous, provided the probability of being at a discontinuity point is zero. For example, the function $g(u) = u^{-1}$ is discontinuous at $u = 0$, but if $X_n \xrightarrow{d} X \sim N(0,1)$ then $P(X = 0) = 0$, so $X_n^{-1} \xrightarrow{d} X^{-1}$.
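A minimal sketch of the CMT in action (my illustration): since $\bar{X}_n \xrightarrow{p} \mu = 1$ for Exponential(1) data and $g(u) = 1/u$ is continuous at 1, $g(\bar{X}_n) \xrightarrow{p} g(1) = 1$.

```python
import numpy as np

rng = np.random.default_rng(0)

# CMT demo: xbar ->p mu = 1, and g(u) = 1/u is continuous at 1,
# so g(xbar) ->p g(1) = 1.
for n in [100, 10_000, 1_000_000]:
    xbar = rng.exponential(1.0, size=n).mean()
    print(f"n = {n:>9,}: 1 / xbar = {1.0 / xbar:.4f}")
```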

Slutsky's Theorem

In the CMT, $X_n$ converges to $X$ jointly in various modes of convergence. For convergence in probability ($\xrightarrow{p}$), marginal convergence implies joint convergence, so there is no problem if we substitute joint convergence by marginal convergence. But for convergence in distribution ($\xrightarrow{d}$), $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{d} Y$ does not imply $\binom{X_n}{Y_n} \xrightarrow{d} \binom{X}{Y}$. Nevertheless, there is a special case where this result holds, which is Slutsky's theorem.

Theorem (Slutsky's Theorem). If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$ ($\iff Y_n \xrightarrow{p} c$), where $c$ is a constant, then
$$\binom{X_n}{Y_n} \xrightarrow{d} \binom{X}{c}.$$

This implies $X_n + Y_n \xrightarrow{d} X + c$, $Y_n X_n \xrightarrow{d} cX$, and $Y_n^{-1} X_n \xrightarrow{d} c^{-1} X$ when $c \neq 0$. Here $X_n, Y_n, X, c$ can be understood as vectors or matrices as long as the operations are compatible.
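A classic use of Slutsky's theorem is the t-statistic with an estimated standard deviation. A simulation sketch (my illustration, again with Exponential(1) data): $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ and the sample standard deviation $s_n \xrightarrow{p} \sigma$, so the ratio is asymptotically $N(0,1)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Slutsky demo: X_n = sqrt(n)*(xbar - mu) ->d N(0, sigma^2) and
# Y_n = s_n (sample sd) ->p sigma, so X_n / Y_n ->d N(0, 1).
n, reps = 500, 20_000
x = rng.exponential(1.0, size=(reps, n))          # mu = 1, sigma = 1
t = np.sqrt(n) * (x.mean(axis=1) - 1.0) / x.std(axis=1, ddof=1)

print("var of t:", t.var().round(3))                           # ~ 1
print("P(|t| <= 1.96):", (np.abs(t) <= 1.96).mean().round(3))  # ~ 0.95
```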

Applications of the CMT and Slutsky's Theorem

Example. Suppose $X_n \xrightarrow{d} N(0,\Sigma)$ and $Y_n \xrightarrow{p} \Sigma$; then $Y_n^{-1/2} X_n \xrightarrow{d} \Sigma^{-1/2} N(0,\Sigma) = N(0,I)$, where $I$ is the identity matrix. (why?)

Example. Suppose $X_n \xrightarrow{d} N(0,\Sigma)$ and $Y_n \xrightarrow{p} \Sigma$; then $X_n' Y_n^{-1} X_n \xrightarrow{d} \chi^2_k$, where $k$ is the dimension of $X_n$. (why?)

Another important application of Slutsky's theorem is the Delta method.
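A simulation sketch of the second example (my illustration; $k = 3$, $\Sigma = \mathrm{diag}(1, 4, 0.25)$ are arbitrary choices): the quadratic form $X_n' Y_n^{-1} X_n$ with an estimated covariance behaves like a $\chi^2_3$ variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Quadratic-form demo: with X_n = sqrt(n)*(xbar - mu) ->d N(0, Sigma)
# and Y_n = sample covariance ->p Sigma, X_n' Y_n^{-1} X_n ->d chi^2_k.
k, n, reps = 3, 500, 10_000
stats = np.empty(reps)
for r in range(reps):
    x = rng.standard_normal((n, k)) @ np.diag([1.0, 2.0, 0.5])  # Sigma = diag(1, 4, .25)
    z = np.sqrt(n) * x.mean(axis=0)                             # mu = 0 here
    stats[r] = z @ np.linalg.solve(np.cov(x, rowvar=False), z)

print("mean of stat:", stats.mean().round(2))               # ~ k = 3
print("P(stat > 7.815):", (stats > 7.815).mean().round(3))  # ~ 0.05 (chi2_3 95% cv)
```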

The Delta Method

Theorem. Suppose $\sqrt{n}\left(Z_n - c\right) \xrightarrow{d} Z \sim N(0,\Sigma)$, $c \in \mathbb{R}^k$, and $g(z): \mathbb{R}^k \to \mathbb{R}$. If $\frac{dg(z)}{dz'}$ is continuous at $c$, then
$$\sqrt{n}\left(g(Z_n) - g(c)\right) \xrightarrow{d} \frac{dg(c)}{dz'} Z.$$

Proof. $\sqrt{n}\left(g(Z_n) - g(c)\right) = \frac{dg(c^*)}{dz'}\sqrt{n}\left(Z_n - c\right)$, where $c^*$ is between $Z_n$ and $c$. $\sqrt{n}(Z_n - c) \xrightarrow{d} Z$ implies that $Z_n \xrightarrow{p} c$, so $c^* \xrightarrow{p} c$ and, by the CMT, $\frac{dg(c^*)}{dz'} \xrightarrow{p} \frac{dg(c)}{dz'}$. By Slutsky's theorem, $\sqrt{n}\left(g(Z_n) - g(c)\right)$ has the asymptotic distribution $\frac{dg(c)}{dz'} Z$.

The Delta method implies that asymptotically, the randomness in a transformation of $Z_n$ is completely controlled by that in $Z_n$.
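A simulation sketch of the Delta method (my illustration): with Exponential(1) data ($\mu = 1$, $\sigma^2 = 1$) and $g(z) = z^2$, the theorem gives asymptotic variance $g'(\mu)^2 \sigma^2 = 4$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Delta-method demo: with mu = 1, sigma^2 = 1 and g(z) = z^2,
# sqrt(n)*(g(xbar) - g(mu)) ->d N(0, g'(mu)^2 * sigma^2) = N(0, 4).
n, reps = 1_000, 10_000
xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (xbar**2 - 1.0**2)

print("var of z:", z.var().round(2))  # ~ 4
```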

Asymptotics for the MoM Estimator

The MoM Estimator

Recall that the MoM estimator is defined as the solution to
$$\frac{1}{n}\sum_{i=1}^n m(x_i|\theta) = 0.$$

We can prove that the MoM estimator is consistent and asymptotically normal (CAN) under some regularity conditions. Specifically, the asymptotic distribution of the MoM estimator is
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} N\left(0, M^{-1}\Omega M'^{-1}\right),$$
where $M = \frac{dE[m(X|\theta_0)]}{d\theta'}$ and $\Omega = E\left[m(X|\theta_0)m(X|\theta_0)'\right]$. The asymptotic variance takes a sandwich form and can be estimated by its sample analog.
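A minimal sketch of the sample-analog sandwich estimator for a scalar problem. The moment function $m(x|\theta) = x - e^\theta$ (so $\theta_0 = \log E[X]$) and the Exponential data are my illustrative assumptions, not from the slides; here $M = -e^{\theta_0}$, $\Omega = \mathrm{Var}(X)$, and each is replaced by its sample analog.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sandwich-variance sketch for an illustrative scalar MoM problem:
# m(x|theta) = x - exp(theta), so theta_0 = log E[X],
# M = -exp(theta_0), Omega = Var(X).
n = 5_000
x = rng.exponential(2.0, size=n)   # E[X] = 2, so theta_0 = log 2
theta_hat = np.log(x.mean())       # solves (1/n) sum m(x_i|theta) = 0

M_hat = -np.exp(theta_hat)                         # sample analog of M
Omega_hat = np.mean((x - np.exp(theta_hat)) ** 2)  # sample analog of Omega
avar_hat = Omega_hat / M_hat**2                    # M^{-1} Omega M'^{-1}

print("theta_hat:", round(theta_hat, 3), " (theta_0 =", round(np.log(2), 3), ")")
print("estimated asym. var:", round(avar_hat, 3), " (true value = 1.0)")
```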

Derivation of the Asymptotic Distribution of the MoM Estimator

$$\frac{1}{n}\sum_{i=1}^n m(x_i|\hat{\theta}) = 0 \implies \frac{1}{n}\sum_{i=1}^n m(x_i|\theta_0) + \left(\frac{1}{n}\sum_{i=1}^n \frac{dm(x_i|\bar{\theta})}{d\theta'}\right)\left(\hat{\theta} - \theta_0\right) = 0,$$
where $\bar{\theta}$ lies between $\hat{\theta}$ and $\theta_0$ (a mean-value expansion), so
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) = -\left(\frac{1}{n}\sum_{i=1}^n \frac{dm(x_i|\bar{\theta})}{d\theta'}\right)^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^n m(x_i|\theta_0) \xrightarrow{d} -M^{-1}N(0,\Omega).$$
Thus $\sqrt{n}\left(\hat{\theta} - \theta_0\right) \approx -\frac{1}{\sqrt{n}}\sum_{i=1}^n M^{-1}m(x_i|\theta_0)$, and $-M^{-1}m(x_i|\theta_0)$ is called the influence function.

We use $\frac{dE[m(X|\theta_0)]}{d\theta'}$ instead of $E\left[\frac{dm(X|\theta_0)}{d\theta'}\right]$ because $E[m(X|\theta)]$ is smoother than $m(X|\theta)$ and can be applied to such situations as quantile estimation, where $m(X|\theta)$ is not differentiable at $\theta_0$. In this course, we will not meet such cases.

Intuition for the Asymptotic Distribution of the MoM Estimator

Suppose $E[X] = g(\theta_0)$ with $g \in C^{(1)}$ in a neighborhood of $\theta_0$; then $\theta_0 = g^{-1}(E[X]) \equiv h(E[X])$. (what are $m$, $M$ and $\Omega$ here?)

The MoM estimator of $\theta$ is to set $\bar{X}_n = g(\theta)$, so $\hat{\theta} = h(\bar{X}_n)$. By the WLLN, $\bar{X}_n \xrightarrow{p} E[X]$; then by the CMT, $\hat{\theta} \xrightarrow{p} h(E[X]) = \theta_0$ since $h(\cdot)$ is continuous.

Now,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) = \sqrt{n}\left(h(\bar{X}_n) - h(E[X])\right) = h'\left(\bar{X}_n^*\right)\sqrt{n}\left(\bar{X}_n - E[X]\right),$$
where the second equality is from the mean value theorem (MVT) and $\bar{X}_n^*$ is between $\bar{X}_n$ and $E[X]$.

Because $\bar{X}_n^*$ is between $\bar{X}_n$ and $E[X]$ and $\bar{X}_n \xrightarrow{p} E[X]$, also $\bar{X}_n^* \xrightarrow{p} E[X]$. By the CMT, $h'(\bar{X}_n^*) \xrightarrow{p} h'(E[X])$. By the CLT, $\sqrt{n}\left(\bar{X}_n - E[X]\right) \xrightarrow{d} N(0, \mathrm{Var}(X))$. Then by Slutsky's theorem,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} h'(E[X]) \cdot N(0, \mathrm{Var}(X)) = N\left(0, h'(E[X])^2\,\mathrm{Var}(X)\right) = N\left(0, \frac{\mathrm{Var}(X)}{g'(\theta_0)^2}\right),$$
where the last equality holds because $h = g^{-1}$, so $h'(E[X]) = 1/g'(\theta_0)$ by the inverse function theorem.

continued...

The larger $g'(\theta_0)$ is, the smaller the asymptotic variance of $\hat{\theta}$ is.

Consider a more specific example. Suppose the density of $X$ is $\frac{2x}{\theta}\exp\left(-\frac{x^2}{\theta}\right)$, $\theta > 0$, $x > 0$; that is, $X$ follows the Weibull$(2, \theta)$ distribution. We can show
$$E[X] = g(\theta) = \frac{\sqrt{\pi}}{2}\theta^{1/2}, \qquad \mathrm{Var}(X) = \theta\left(1 - \frac{\pi}{4}\right).$$
So
$$\sqrt{n}\left(\hat{\theta} - \theta\right) \xrightarrow{d} N\left(0, \frac{\theta\left(1 - \frac{\pi}{4}\right)}{\left(\frac{\sqrt{\pi}}{4}\theta^{-1/2}\right)^2}\right) = N\left(0, 16\theta^2\left(\frac{1}{\pi} - \frac{1}{4}\right)\right).$$

Figure 1 shows $E[X]$ and the asymptotic variance of $\sqrt{n}\left(\hat{\theta} - \theta\right)$ as a function of $\theta$.

Intuitively, the larger the derivative of $E[X]$ with respect to $\theta$, the easier it is to identify $\theta$ from $X$, so the smaller the asymptotic variance.
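A Monte Carlo check of this example (a sketch; $\theta_0 = 2$ and the simulation sizes are my choices). Inverting $E[X] = \frac{\sqrt{\pi}}{2}\theta^{1/2}$ gives $\hat{\theta} = 4\bar{X}_n^2/\pi$, and the sampling variance of $\sqrt{n}(\hat{\theta} - \theta_0)$ should match $16\theta_0^2\left(\frac{1}{\pi} - \frac{1}{4}\right)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo check: theta_hat = 4*xbar^2/pi inverts
# E[X] = (sqrt(pi)/2)*theta^{1/2} for the density (2x/theta)exp(-x^2/theta).
theta0, n, reps = 2.0, 1_000, 10_000
x = np.sqrt(theta0) * rng.weibull(2.0, size=(reps, n))  # shape-2 Weibull, rescaled
theta_hat = 4.0 * x.mean(axis=1) ** 2 / np.pi

print("simulated var of sqrt(n)*(theta_hat - theta0):",
      (n * (theta_hat - theta0) ** 2).mean().round(3))
print("theoretical 16*theta0^2*(1/pi - 1/4):        ",
      (16 * theta0**2 * (1/np.pi - 0.25)).round(3))
```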

[Figure 1: $E[X]$ and Asymptotic Variance as a Function of $\theta$]

An Example

Suppose the moment conditions are
$$E\begin{pmatrix} X - \mu \\ (X - \mu)^2 - \sigma^2 \end{pmatrix} = 0.$$
Then the sample analog is
$$\frac{1}{n}\sum_{i=1}^n \begin{pmatrix} X_i - \mu \\ (X_i - \mu)^2 - \sigma^2 \end{pmatrix} = 0,$$
so the solution is
$$\hat{\mu} = \bar{X}_n, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \left(X_i - \bar{X}_n\right)^2 = \overline{X^2}_n - \bar{X}_n^2.$$
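A quick sketch of these closed-form solutions (the $N(3, 4)$ data are an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# MoM solution for (mu, sigma^2): mu_hat = xbar and
# sigma2_hat = mean(x^2) - xbar^2 (the 1/n variance, no ddof correction).
x = rng.normal(loc=3.0, scale=2.0, size=100_000)
mu_hat = x.mean()
sigma2_hat = (x**2).mean() - mu_hat**2

print("mu_hat:", mu_hat.round(3), " sigma2_hat:", sigma2_hat.round(3))  # ~ 3, ~ 4
```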

continued...

Consistency: $\hat{\mu} = \bar{X}_n \xrightarrow{p} \mu$, and $\hat{\sigma}^2 = \overline{X^2}_n - \bar{X}_n^2 \xrightarrow{p} \left(\mu^2 + \sigma^2\right) - \mu^2 = \sigma^2$.

Asymptotic normality:
$$M = E\begin{pmatrix} -1 & 0 \\ -2(X-\mu) & -1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix},$$
$$\Omega = E\begin{pmatrix} (X-\mu)^2 & (X-\mu)^3 - \sigma^2(X-\mu) \\ (X-\mu)^3 - \sigma^2(X-\mu) & (X-\mu)^4 - 2\sigma^2(X-\mu)^2 + \sigma^4 \end{pmatrix} = \begin{pmatrix} \sigma^2 & E\left[(X-\mu)^3\right] \\ E\left[(X-\mu)^3\right] & E\left[(X-\mu)^4\right] - \sigma^4 \end{pmatrix},$$
so
$$\sqrt{n}\begin{pmatrix} \hat{\mu} - \mu \\ \hat{\sigma}^2 - \sigma^2 \end{pmatrix} \xrightarrow{d} N(0, \Omega).$$

If $X \sim N\left(\mu, \sigma^2\right)$, then what is $\Omega$?
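A simulation sketch (my addition) to check your answer to the question above: simulate $\sqrt{n}\left(\hat{\mu} - \mu, \hat{\sigma}^2 - \sigma^2\right)$ for $N(\mu, \sigma^2)$ data and inspect its sampling covariance, which estimates $\Omega$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Check Omega numerically for normal data (mu = 0, sigma = 1).
mu, sigma, n, reps = 0.0, 1.0, 1_000, 10_000
x = rng.normal(mu, sigma, size=(reps, n))
mu_hat = x.mean(axis=1)
sigma2_hat = (x**2).mean(axis=1) - mu_hat**2
dev = np.sqrt(n) * np.column_stack([mu_hat - mu, sigma2_hat - sigma**2])

print(np.cov(dev, rowvar=False).round(3))  # compare with your Omega
```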

Another Example: Empirical Distribution Function

Suppose we want to estimate $\theta = F(x)$ for a fixed $x$, where $F(\cdot)$ is the cdf of a random variable $X$. An intuitive estimator is the fraction of sample points at or below $x$, $\frac{1}{n}\sum_{i=1}^n 1(X_i \leq x)$, which is called the empirical distribution function (EDF); it is also a MoM estimator. Why? Note that the moment condition for this problem is
$$E\left[1(X \leq x) - F(x)\right] = 0.$$
Its sample analog is
$$\frac{1}{n}\sum_{i=1}^n \left(1(X_i \leq x) - F(x)\right) = 0,$$
so
$$\hat{F}(x) = \frac{1}{n}\sum_{i=1}^n 1(X_i \leq x).$$

By the WLLN, it is consistent. By the CLT,
$$\sqrt{n}\left(\hat{F}(x) - F(x)\right) \xrightarrow{d} N\left(0, F(x)\left(1 - F(x)\right)\right). \text{ (why?)}$$

An interesting phenomenon is that the asymptotic variance reaches its maximum at the median of the distribution of $X$.
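A simulation sketch of this variance formula (my illustration, with standard normal data): $\hat{F}(x)$ is a sample mean of Bernoulli$(F(x))$ indicators, so its asymptotic variance $F(x)(1 - F(x))$ peaks at the median, where $F(x) = 1/2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# EDF demo: sqrt(n)*(F_hat(x) - F(x)) has variance ~ F(x)*(1 - F(x)),
# largest at the median. For N(0,1): F(0) = 0.5, F(1) ~ 0.8413.
n, reps = 1_000, 10_000
data = rng.standard_normal((reps, n))
for x, Fx in [(0.0, 0.5), (1.0, 0.8413)]:
    F_hat = (data <= x).mean(axis=1)
    print(f"x = {x}: simulated var = {(n * (F_hat - Fx)**2).mean():.4f}, "
          f"F(x)(1-F(x)) = {Fx * (1 - Fx):.4f}")
```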

[Figure: Empirical Distribution Functions: 10 samples from $N(0, 1)$ with sample size $n = 50$]