An Introduction to Asymptotic Theory

Ping Yu

School of Economics and Finance
The University of Hong Kong

Ping Yu (HKU) Asymptotic Theory 1 / 20
Five Weapons in Asymptotic Theory
Five Weapons

- The weak law of large numbers (WLLN, or LLN)
- The central limit theorem (CLT)
- The continuous mapping theorem (CMT)
- Slutsky's theorem
- The Delta method

Notation:
- In nonlinear (in parameter) models, capital letters such as $X$ denote random variables or random vectors, and the corresponding lower-case letters such as $x$ denote the potential values they may take.
- The generic notation for a parameter in nonlinear environments (e.g., nonlinear models or nonlinear constraints) is $\theta$, while in linear environments it is $\beta$.
The WLLN

Definition. A random vector $Z_n$ converges in probability to $Z$ as $n \to \infty$, denoted as $Z_n \xrightarrow{p} Z$, if for any $\delta > 0$, $\lim_{n\to\infty} P(\|Z_n - Z\| > \delta) = 0$.

- Although the limit $Z$ can be random, it is usually constant. [intuition]
- The probability limit of $Z_n$ is often denoted as $\mathrm{plim}(Z_n)$. If $Z_n \xrightarrow{p} 0$, we denote $Z_n = o_p(1)$.
- When an estimator converges in probability to the true value as the sample size diverges, we say that the estimator is consistent. Consistency is an important preliminary step in establishing other important asymptotic approximations.

Theorem (WLLN). Suppose $X_1, \ldots, X_n, \ldots$ are i.i.d. random vectors, and $E[\|X\|] < \infty$; then as $n \to \infty$,
$$\bar{X}_n \equiv \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{p} E[X].$$
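The WLLN can be illustrated numerically (a simulation sketch, not part of the original slides; the exponential distribution and the sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean(n):
    # mean of n i.i.d. draws from an exponential distribution with E[X] = 2
    return rng.exponential(scale=2.0, size=n).mean()

# |X̄_n - E[X]| should shrink toward 0 as n grows
deviation_small = abs(sample_mean(100) - 2.0)
deviation_large = abs(sample_mean(1_000_000) - 2.0)
```

With a million draws the sample mean is within a few hundredths of $E[X] = 2$ with overwhelming probability, which is exactly the $\bar{X}_n \xrightarrow{p} E[X]$ statement in finite-sample form.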
The CLT

Definition. A random $k$-vector $Z_n$ converges in distribution to $Z$ as $n \to \infty$, denoted as $Z_n \xrightarrow{d} Z$, if $\lim_{n\to\infty} F_n(z) = F(z)$ at all $z$ where $F(\cdot)$ is continuous, where $F_n$ is the cdf of $Z_n$ and $F$ is the cdf of $Z$.

- Usually, $Z$ is normally distributed, so all $z \in \mathbb{R}^k$ are continuity points of $F$.
- If $Z_n$ converges in distribution to $Z$, then $Z_n$ is stochastically bounded and we denote $Z_n = O_p(1)$. Rigorously, $Z_n = O_p(1)$ if $\forall \varepsilon > 0$, $\exists M_\varepsilon < \infty$ such that $P(\|Z_n\| > M_\varepsilon) < \varepsilon$ for any $n$. If $Z_n = o_p(1)$, then $Z_n = O_p(1)$.
- We can show that $o_p(1) + o_p(1) = o_p(1)$, $o_p(1) + O_p(1) = O_p(1)$, $O_p(1) + O_p(1) = O_p(1)$, $o_p(1)o_p(1) = o_p(1)$, $o_p(1)O_p(1) = o_p(1)$, and $O_p(1)O_p(1) = O_p(1)$.

Theorem (CLT). Suppose $X_1, \ldots, X_n, \ldots$ are i.i.d. random $k$-vectors, $E[X] = \mu$, and $\mathrm{Var}(X) = \Sigma$; then
$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \Sigma).$$
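A simulation sketch of the CLT (not part of the original slides; the uniform distribution and the values of $n$ and the number of replications are arbitrary): the statistic $\sqrt{n}(\bar{X}_n - \mu)$ should have mean near $0$ and variance near $\mathrm{Var}(X)$.

```python
import numpy as np

rng = np.random.default_rng(1)

# i.i.d. Uniform(0,1): mu = 0.5, Var(X) = 1/12
mu, n, reps = 0.5, 400, 10_000

# reps independent realizations of the statistic sqrt(n) * (X̄_n - mu)
draws = rng.uniform(size=(reps, n))
stats = np.sqrt(n) * (draws.mean(axis=1) - mu)

# by the CLT, stats should behave like N(0, 1/12)
sim_mean = stats.mean()
sim_var = stats.var()
```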
Comparison Between the WLLN and CLT

- The CLT tells more than the WLLN.
- $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0,\Sigma)$ implies $\bar{X}_n \xrightarrow{p} \mu$, so the CLT is stronger than the WLLN.
- $\bar{X}_n \xrightarrow{p} \mu$ means $\bar{X}_n - \mu = o_p(1)$, but does not provide any information about $\sqrt{n}(\bar{X}_n - \mu)$. The CLT tells that $\sqrt{n}(\bar{X}_n - \mu) = O_p(1)$, or $\bar{X}_n - \mu = O_p(n^{-1/2})$.
- But the WLLN does not require the second moment to be finite; that is, the stronger result is not free.
The CMT

Theorem (CMT). Suppose $X_1, \ldots, X_n, \ldots$ are random $k$-vectors, and $g$ is a function (to $\mathbb{R}^l$) continuous on the support of $X$ a.s. $P_X$; then
$$X_n \xrightarrow{p} X \implies g(X_n) \xrightarrow{p} g(X); \qquad X_n \xrightarrow{d} X \implies g(X_n) \xrightarrow{d} g(X).$$

- The CMT allows the function $g$ to be discontinuous, provided the probability of being at a discontinuity point is zero. For example, the function $g(u) = u^{-1}$ is discontinuous at $u = 0$, but if $X_n \xrightarrow{d} X \sim N(0,1)$ then $P(X = 0) = 0$, so $X_n^{-1} \xrightarrow{d} X^{-1}$.
Slutsky's Theorem

- In the CMT, $X_n$ converges to $X$ jointly in various modes of convergence. For convergence in probability ($\xrightarrow{p}$), marginal convergence implies joint convergence, so there is no problem if we substitute joint convergence by marginal convergence. But for convergence in distribution ($\xrightarrow{d}$), $X_n \xrightarrow{d} X$, $Y_n \xrightarrow{d} Y$ does not imply $(X_n, Y_n)' \xrightarrow{d} (X, Y)'$.
- Nevertheless, there is a special case where this result holds, which is Slutsky's theorem.

Theorem (Slutsky's Theorem). If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$ ($\iff Y_n \xrightarrow{p} c$), where $c$ is a constant, then $(X_n, Y_n)' \xrightarrow{d} (X, c)'$.

- This implies $X_n + Y_n \xrightarrow{d} X + c$, $Y_n X_n \xrightarrow{d} cX$, and $Y_n^{-1} X_n \xrightarrow{d} c^{-1} X$ when $c \neq 0$. Here $X_n, Y_n, X, c$ can be understood as vectors or matrices as long as the operations are compatible.
Applications of the CMT and Slutsky's Theorem

Example. Suppose $X_n \xrightarrow{d} N(0,\Sigma)$ and $Y_n \xrightarrow{p} \Sigma$; then $Y_n^{-1/2} X_n \xrightarrow{d} \Sigma^{-1/2} N(0,\Sigma) = N(0, I)$, where $I$ is the identity matrix. (why?)

Example. Suppose $X_n \xrightarrow{d} N(0,\Sigma)$ and $Y_n \xrightarrow{p} \Sigma$; then $X_n' Y_n^{-1} X_n \xrightarrow{d} \chi_k^2$, where $k$ is the dimension of $X_n$. (why?)

Another important application of Slutsky's theorem is the Delta method.
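The second example can be checked by simulation (a sketch, not part of the original slides; the covariance matrix, sample size, and replication count are arbitrary): with $X_n = \sqrt{n}\,\bar{X}$ and $Y_n$ the sample covariance, the quadratic form should be approximately $\chi_k^2$, which has mean $k$.

```python
import numpy as np

rng = np.random.default_rng(2)
k, n, reps = 3, 400, 3000
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

stats = np.empty(reps)
for r in range(reps):
    x = rng.multivariate_normal(np.zeros(k), Sigma, size=n)
    z = np.sqrt(n) * x.mean(axis=0)       # X_n: approximately N(0, Sigma) by the CLT
    S = np.cov(x, rowvar=False)           # Y_n: consistent for Sigma by the WLLN
    stats[r] = z @ np.linalg.solve(S, z)  # X_n' Y_n^{-1} X_n

mean_stat = stats.mean()  # chi^2_k has mean k = 3
```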
The Delta Method

Theorem. Suppose $\sqrt{n}(Z_n - c) \xrightarrow{d} Z \sim N(0,\Sigma)$, $c \in \mathbb{R}^k$, and $g(z): \mathbb{R}^k \to \mathbb{R}$. If $\frac{dg(z)}{dz'}$ is continuous at $c$, then
$$\sqrt{n}\left(g(Z_n) - g(c)\right) \xrightarrow{d} \frac{dg(c)}{dz'} Z.$$

Proof. By the mean value theorem, $\sqrt{n}(g(Z_n) - g(c)) = \frac{dg(\bar{c}_n)}{dz'}\sqrt{n}(Z_n - c)$, where $\bar{c}_n$ is between $Z_n$ and $c$. $\sqrt{n}(Z_n - c) \xrightarrow{d} Z$ implies that $Z_n \xrightarrow{p} c$, so $\bar{c}_n \xrightarrow{p} c$ and, by the CMT, $\frac{dg(\bar{c}_n)}{dz'} \xrightarrow{p} \frac{dg(c)}{dz'}$. By Slutsky's theorem, $\sqrt{n}(g(Z_n) - g(c))$ has the asymptotic distribution $\frac{dg(c)}{dz'} Z$.

- The Delta method implies that, asymptotically, the randomness in a transformation of $Z_n$ is completely controlled by that in $Z_n$.
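A numerical sketch of the Delta method (not part of the original slides; the choice $g(z) = z^2$ and the exponential distribution are illustrative assumptions): with $Z_n = \bar{X}_n$, $c = \mu$, and $\Sigma = \mathrm{Var}(X)$, the limit variance is $g'(\mu)^2 \mathrm{Var}(X)$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Z_n = X̄_n with X ~ Exponential(mean 2): c = mu = 2, Sigma = Var(X) = 4
# g(z) = z^2, so dg(c)/dz = 2*mu = 4 and the limit variance is 4^2 * 4 = 64
mu, n, reps = 2.0, 1000, 5000
x = rng.exponential(scale=mu, size=(reps, n))
stats = np.sqrt(n) * (x.mean(axis=1) ** 2 - mu ** 2)

sim_var = stats.var()  # should be near 64
```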
Asymptotics for the MoM Estimator
The MoM Estimator

- Recall that the MoM estimator is defined as the solution to $\frac{1}{n}\sum_{i=1}^n m(X_i|\theta) = 0$.
- We can prove the MoM estimator is consistent and asymptotically normal (CAN) under some regularity conditions.
- Specifically, the asymptotic distribution of the MoM estimator is
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} N\left(0, M^{-1}\Omega M^{-1\prime}\right),$$
where $M = \frac{dE[m(X|\theta_0)]}{d\theta'}$ and $\Omega = E\left[m(X|\theta_0) m(X|\theta_0)'\right]$.
- The asymptotic variance takes a sandwich form and can be estimated by its sample analog.
Derivation of the Asymptotic Distribution of the MoM Estimator

$$\frac{1}{n}\sum_{i=1}^n m(X_i|\hat{\theta}) = 0$$
$$\implies \frac{1}{n}\sum_{i=1}^n m(X_i|\theta_0) + \left[\frac{1}{n}\sum_{i=1}^n \frac{dm(X_i|\bar{\theta})}{d\theta'}\right]\left(\hat{\theta} - \theta_0\right) = 0$$
$$\implies \sqrt{n}\left(\hat{\theta} - \theta_0\right) = -\left[\frac{1}{n}\sum_{i=1}^n \frac{dm(X_i|\bar{\theta})}{d\theta'}\right]^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^n m(X_i|\theta_0) \xrightarrow{d} -M^{-1} N(0,\Omega),$$
where $\bar{\theta}$ is between $\hat{\theta}$ and $\theta_0$.

- $\sqrt{n}\left(\hat{\theta} - \theta_0\right) \approx \frac{1}{\sqrt{n}}\sum_{i=1}^n \left(-M^{-1} m(X_i|\theta_0)\right)$, so $-M^{-1} m(X_i|\theta_0)$ is called the influence function.
- We use $\frac{dE[m(X|\theta_0)]}{d\theta'}$ instead of $E\left[\frac{dm(X|\theta_0)}{d\theta'}\right]$ because $E[m(X|\theta)]$ is smoother than $m(X|\theta)$, so the former can be applied in situations such as quantile estimation where $m(X|\theta)$ is not differentiable at $\theta_0$. In this course, we will not meet such cases.
Intuition for the Asymptotic Distribution of the MoM Estimator

- Suppose $E[X] = g(\theta_0)$ with $g \in C^{(1)}$ in a neighborhood of $\theta_0$; then $\theta_0 = g^{-1}(E[X]) \equiv h(E[X])$. (what are $m$, $M$ and $\Omega$ here?)
- The MoM estimator of $\theta$ is the solution to $\bar{X}_n = g(\theta)$, so $\hat{\theta} = h(\bar{X}_n)$.
- By the WLLN, $\bar{X}_n \xrightarrow{p} E[X]$; then by the CMT, $\hat{\theta} \xrightarrow{p} h(E[X]) = \theta_0$ since $h(\cdot)$ is continuous.
- Now,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) = \sqrt{n}\left(h(\bar{X}_n) - h(E[X])\right) = h'\left(\bar{X}_n^*\right)\sqrt{n}\left(\bar{X}_n - E[X]\right),$$
where the second equality is from the mean value theorem (MVT) and $\bar{X}_n^*$ is between $\bar{X}_n$ and $E[X]$.
- Because $\bar{X}_n^*$ is between $\bar{X}_n$ and $E[X]$ and $\bar{X}_n \xrightarrow{p} E[X]$, $\bar{X}_n^* \xrightarrow{p} E[X]$. By the CMT, $h'(\bar{X}_n^*) \xrightarrow{p} h'(E[X])$.
- By the CLT, $\sqrt{n}(\bar{X}_n - E[X]) \xrightarrow{d} N(0, \mathrm{Var}(X))$. Then by Slutsky's theorem,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} h'(E[X])\, N(0, \mathrm{Var}(X)) = N\left(0, h'(E[X])^2\, \mathrm{Var}(X)\right) = N\left(0, \frac{\mathrm{Var}(X)}{g'(\theta_0)^2}\right).$$
continue...

- The larger $|g'(\theta_0)|$ is, the smaller the asymptotic variance of $\hat{\theta}$ is.
- Consider a more specific example. Suppose the density of $X$ is $\frac{2x}{\theta}\exp\left(-\frac{x^2}{\theta}\right)$, $\theta > 0$, $x > 0$; that is, $X$ follows the Weibull$(2,\theta)$ distribution.
- We can show $E[X] = g(\theta) = \frac{\sqrt{\pi}}{2}\theta^{1/2}$ and $\mathrm{Var}(X) = \theta\left(1 - \frac{\pi}{4}\right)$. So
$$\sqrt{n}\left(\hat{\theta} - \theta\right) \xrightarrow{d} N\left(0, \frac{\theta\left(1 - \frac{\pi}{4}\right)}{\left(\frac{\sqrt{\pi}}{2}\cdot\frac{1}{2}\theta^{-1/2}\right)^2}\right) = N\left(0, 16\theta^2\left(\frac{1}{\pi} - \frac{1}{4}\right)\right).$$
- Figure 1 shows $E[X]$ and the asymptotic variance of $\sqrt{n}(\hat{\theta} - \theta)$ as functions of $\theta$.
- Intuitively, the larger the derivative of $E[X]$ with respect to $\theta$, the easier it is to identify $\theta$ from $X$, so the smaller the asymptotic variance.
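The Weibull example can be checked by simulation (a sketch, not part of the original slides; the true value $\theta = 2$ and the simulation sizes are arbitrary). Inverting $E[X] = \frac{\sqrt{\pi}}{2}\theta^{1/2}$ gives the MoM estimator $\hat{\theta} = (2\bar{X}_n/\sqrt{\pi})^2$, whose scaled variance should match $16\theta^2(1/\pi - 1/4)$.

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 2.0, 1000, 4000

# Weibull(2, theta) of the slides: density (2x/theta) exp(-x^2/theta),
# which is numpy's standard shape-2 Weibull scaled by sqrt(theta)
x = np.sqrt(theta) * rng.weibull(2.0, size=(reps, n))

# invert E[X] = (sqrt(pi)/2) theta^{1/2}: theta_hat = (2 X̄_n / sqrt(pi))^2
theta_hat = (2.0 * x.mean(axis=1) / np.sqrt(np.pi)) ** 2

sim_var = (np.sqrt(n) * (theta_hat - theta)).var()
asy_var = 16.0 * theta**2 * (1.0 / np.pi - 0.25)  # formula from the slide
```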
Figure: $E[X]$ and Asymptotic Variance as a Function of $\theta$
An Example

- Suppose the moment conditions are
$$m(X|\mu,\sigma^2) = \begin{pmatrix} X - \mu \\ (X-\mu)^2 - \sigma^2 \end{pmatrix}.$$
- Then the sample analog is
$$\frac{1}{n}\sum_{i=1}^n \begin{pmatrix} X_i - \mu \\ (X_i-\mu)^2 - \sigma^2 \end{pmatrix} = 0,$$
so the solution is
$$\hat{\mu} = \bar{X}_n, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \left(X_i - \bar{X}_n\right)^2 = \overline{X^2}_n - \bar{X}_n^2.$$
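A minimal sketch of these sample analogs (not part of the original slides; the normal distribution and its parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(loc=3.0, scale=2.0, size=10_000)  # mu = 3, sigma^2 = 4

# solve the two sample moment conditions
mu_hat = x.mean()                        # (1/n) sum (X_i - mu) = 0
sigma2_hat = ((x - mu_hat) ** 2).mean()  # (1/n) sum ((X_i - mu)^2 - sigma^2) = 0

# equivalent closed form from the slide: mean(X^2) - mean(X)^2
sigma2_alt = (x ** 2).mean() - x.mean() ** 2
```

Note that the MoM variance estimator divides by $n$, not $n-1$; it equals `np.var(x)` with the default `ddof=0`.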
continue...

- Consistency: $\hat{\mu} = \bar{X}_n \xrightarrow{p} \mu$, and $\hat{\sigma}^2 = \overline{X^2}_n - \bar{X}_n^2 \xrightarrow{p} \left(\mu^2 + \sigma^2\right) - \mu^2 = \sigma^2$.
- Asymptotic Normality:
$$M = E\begin{pmatrix} -1 & 0 \\ -2(X-\mu) & -1 \end{pmatrix} = -\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
$$\Omega = E\begin{pmatrix} (X-\mu)^2 & (X-\mu)^3 - \sigma^2(X-\mu) \\ (X-\mu)^3 - \sigma^2(X-\mu) & (X-\mu)^4 - 2\sigma^2(X-\mu)^2 + \sigma^4 \end{pmatrix} = \begin{pmatrix} \sigma^2 & E\left[(X-\mu)^3\right] \\ E\left[(X-\mu)^3\right] & E\left[(X-\mu)^4\right] - \sigma^4 \end{pmatrix},$$
so
$$\sqrt{n}\begin{pmatrix} \hat{\mu} - \mu \\ \hat{\sigma}^2 - \sigma^2 \end{pmatrix} \xrightarrow{d} N(0, \Omega).$$
- If $X \sim N\left(\mu,\sigma^2\right)$, then what is $\Omega$?
Another Example: Empirical Distribution Function

- Suppose we want to estimate $\theta = F(x)$ for a fixed $x$, where $F(\cdot)$ is the cdf of a random variable $X$.
- An intuitive estimator is the proportion of the sample below $x$, $\frac{1}{n}\sum_{i=1}^n 1(X_i \leq x)$, which is called the empirical distribution function (EDF); it is also a MoM estimator. Why?
- Note that the moment condition for this problem is $E[1(X \leq x) - F(x)] = 0$. Its sample analog is $\frac{1}{n}\sum_{i=1}^n \left(1(X_i \leq x) - F(x)\right) = 0$, so
$$\hat{F}(x) = \frac{1}{n}\sum_{i=1}^n 1(X_i \leq x).$$
- By the WLLN, it is consistent. By the CLT,
$$\sqrt{n}\left(\hat{F}(x) - F(x)\right) \xrightarrow{d} N\left(0, F(x)(1 - F(x))\right). \text{ (why?)}$$
- An interesting phenomenon is that the asymptotic variance reaches its maximum at the median of the distribution of $X$.
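The EDF result can be verified by simulation (a sketch, not part of the original slides; evaluating at the median $x = 0$ of $N(0,1)$, where the asymptotic variance $F(1-F) = 0.25$ is maximal):

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps, x0 = 50, 20_000, 0.0  # X ~ N(0,1), so F(x0) = 0.5 at x0 = 0
samples = rng.normal(size=(reps, n))

# EDF at x0: the proportion of observations <= x0
F_hat = (samples <= x0).mean(axis=1)

# CLT: sqrt(n) (F_hat - F) should behave like N(0, F(1-F)) = N(0, 0.25)
sim_var = (np.sqrt(n) * (F_hat - 0.5)).var()
```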
Figure: Empirical Distribution Functions: 10 samples from $N(0,1)$ with sample size $n = 50$