Lecture 2: Consistency of M-estimators

Lecture 2: Instructor: Department of Economics, Stanford University. Prepared by Wenbo Zhou, Renmin University

References

Takeshi Amemiya, 1985, Advanced Econometrics, Harvard University Press.
Newey and McFadden, 1994, Chapter 36, Volume 4, The Handbook of Econometrics.

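Consistency

Distinction between global and local consistency.

Global condition: If $\Theta$ is compact, $\sup_{\theta \in \Theta} |Q_n(\theta) - Q(\theta)| \xrightarrow{p} 0$, and $Q(\theta) < Q(\theta_0)$ for $\theta \neq \theta_0$, then $\hat\theta \xrightarrow{p} \theta_0$, where $\hat\theta = \arg\max_{\theta \in \Theta} Q_n(\theta)$.

Local condition: If $N$ is a neighborhood around $\theta_0$, $\sup_{\theta \in N} \left\| \frac{\partial Q_n(\theta)}{\partial \theta} - \frac{\partial Q(\theta)}{\partial \theta} \right\| \xrightarrow{p} 0$, and $Q(\theta) < Q(\theta_0)$ for $\theta \neq \theta_0$ and $\theta \in N$, then $\inf_{\theta \in \hat\Theta} \|\theta - \theta_0\| \xrightarrow{p} 0$, where $\hat\Theta$ denotes the set of $\theta$ for which $\frac{\partial Q_n(\theta)}{\partial \theta} = 0$.

For the local consistency condition, check 1) $\frac{\partial Q(\theta_0)}{\partial \theta} = 0$ and 2) $\frac{\partial^2 Q(\theta_0)}{\partial \theta \, \partial \theta'}$ negative definite.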

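Consistency for MLE

Let $L(y_1, \ldots, y_n, \theta)$ be the JOINT density for i.i.d. data $y_1, \ldots, y_n$. Then

$$Q_n(\theta) \equiv \frac{1}{n} \log L(y_1, \ldots, y_n, \theta) = \frac{1}{n} \sum_{t=1}^{n} \log f(y_t, \theta).$$

Change the assumptions to:

1) $\theta_0$ is identified, i.e. $\theta \neq \theta_0 \Rightarrow f(y_t, \theta) \neq f(y_t, \theta_0)$;
2) $E \sup_{\theta \in \Theta} |\log f(y; \theta)| < \infty$.

Identification implies $Q(\theta) < Q(\theta_0)$, since by Jensen's inequality (with the expectation taken under $f(\cdot\,; \theta_0)$)

$$E \log \frac{f(y; \theta)}{f(y; \theta_0)} < \log E \frac{f(y; \theta)}{f(y; \theta_0)} = \log \int f(y; \theta) \, dy = \log 1 = 0.$$

Condition 2 is a dominance condition for stochastic equicontinuity. MLE consistency holds even if the support of the data depends on the parameter.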
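As a rough numerical illustration of the MLE consistency argument, the sketch below maximizes the average log-likelihood over a grid for an i.i.d. $N(\theta_0, 1)$ sample. The model, the true value `theta0 = 1.0`, the grid standing in for the compact set $\Theta = [-3, 3]$, and the helper names are all assumptions made for this example; they are not taken from the lecture.

```python
import math
import random


def avg_loglik(data, theta):
    """Q_n(theta): average log-density of N(theta, 1) over the sample."""
    const = -0.5 * math.log(2.0 * math.pi)
    return sum(const - 0.5 * (y - theta) ** 2 for y in data) / len(data)


def grid_mle(data, grid):
    """Maximize Q_n over a finite grid standing in for the compact set Theta."""
    return max(grid, key=lambda theta: avg_loglik(data, theta))


rng = random.Random(0)                          # fixed seed, illustrative only
theta0 = 1.0                                    # hypothetical true parameter
grid = [k / 50.0 for k in range(-150, 151)]     # Theta = [-3, 3], step 0.02

estimates = {}
for n in (50, 500, 5000):
    sample = [rng.gauss(theta0, 1.0) for _ in range(n)]
    estimates[n] = grid_mle(sample, grid)
    print(n, estimates[n])
```

With larger samples the grid maximizer should settle near `theta0`, which is what the global consistency condition predicts for this identified, dominated model.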

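In the general case when $y_t$ is not i.i.d.,

$$E \log \frac{L(y_1, \ldots, y_n; \theta)}{L(y_1, \ldots, y_n; \theta_0)} \leq \log E \frac{L(y_1, \ldots, y_n; \theta)}{L(y_1, \ldots, y_n; \theta_0)}$$

still holds, but the strict inequality $<$ is harder to justify.

When the global condition fails or $\Theta$ is not compact, the local condition may still hold.

Example: Mixture of normal distributions. $y_t \sim \lambda N(\mu_1, \sigma_1^2) + (1 - \lambda) N(\mu_2, \sigma_2^2)$,

$$L = \prod_{t=1}^{n} \left[ \frac{\lambda}{\sqrt{2\pi}\,\sigma_1} \exp\left( -\frac{(y_t - \mu_1)^2}{2\sigma_1^2} \right) + \frac{1 - \lambda}{\sqrt{2\pi}\,\sigma_2} \exp\left( -\frac{(y_t - \mu_2)^2}{2\sigma_2^2} \right) \right].$$

Set $\mu_1 = y_1$ and let $\sigma_1 \to 0$; then $L$ increases to $\infty$. Hence the global MLE cannot be consistent, but the local MLE is.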
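The unbounded likelihood in the mixture example can be seen numerically. The following is a minimal sketch with a small made-up data set (the values in `data`, the mixing weight 0.5, and the second component $N(0.5, 1)$ are illustrative assumptions): it centers the first component on the first observation and shrinks its standard deviation.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def mixture_loglik(data, lam, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of the two-component normal mixture."""
    return sum(
        math.log(lam * normal_pdf(y, mu1, sigma1) + (1.0 - lam) * normal_pdf(y, mu2, sigma2))
        for y in data
    )


data = [-1.3, -0.4, 0.1, 0.6, 1.7, 2.2]    # hypothetical observations

sigmas = (1e-1, 1e-2, 1e-3, 1e-4)
logliks = []
for sigma1 in sigmas:
    # mu1 is set to the first observation, sigma1 is pushed toward zero
    ll = mixture_loglik(data, 0.5, data[0], sigma1, 0.5, 1.0)
    logliks.append(ll)
    print(sigma1, ll)
```

Each tenfold reduction in `sigma1` raises the log-likelihood by roughly $\log 10$, because only the term for the first observation changes once the spike is narrow; the sequence has no finite upper bound.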

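Consistency for GMM

$Q_n(\theta) = -g_n(\theta)' W g_n(\theta)$, for $g_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} g(z_t, \theta)$, where $W$ is the positive definite weighting matrix. If $\sup_{\theta \in \Theta} \| g_n(\theta) - E g(z_t, \theta) \| \xrightarrow{p} 0$, and $E g(z_t, \theta) = 0$ iff $\theta = \theta_0$, then $\hat\theta \equiv \arg\max_{\theta} Q_n(\theta) \xrightarrow{p} \theta_0$.

Global identification in nonlinear GMM models is usually difficult to verify and is assumed. Identification in linear models, however, usually reduces to the condition that the sample var-cov matrix of the regressors is full rank, i.e. $E x_t x_t'$ for iid models, or $\lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} x_t x_t'$ for fixed regressors.

For least squares, $\frac{1}{n} \sum_{t=1}^{n} (y_t - x_t'\beta)^2 \xrightarrow{p} E(y - x'\beta)^2$, and $E x_t x_t'$ is full rank iff

$$E(y - x'\beta)^2 - E(y - x'\beta_0)^2 = E[x'(\beta - \beta_0)]^2 = (\beta - \beta_0)' E x_t x_t' (\beta - \beta_0) > 0 \quad \text{if } \beta \neq \beta_0.$$

The first equality uses the orthogonality condition $E[x (y - x'\beta_0)] = 0$, which removes the cross term.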
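To make the rank condition concrete, here is a small sketch with two regressors. The regressor values, the coefficient vectors, and the function names are hypothetical. When the second regressor is an exact multiple of the first, the sample second-moment matrix is singular and two different coefficient vectors give the same least squares objective, so $\beta_0$ is not identified.

```python
def second_moment_det(xs):
    """Determinant of the 2x2 sample second-moment matrix (1/n) sum x_t x_t'."""
    n = len(xs)
    a = sum(x[0] * x[0] for x in xs) / n
    b = sum(x[0] * x[1] for x in xs) / n
    d = sum(x[1] * x[1] for x in xs) / n
    return a * d - b * b


def ls_objective(xs, ys, beta):
    """(1/n) sum (y_t - x_t' beta)^2."""
    return sum((y - x[0] * beta[0] - x[1] * beta[1]) ** 2 for x, y in zip(xs, ys)) / len(xs)


base = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0]        # hypothetical regressor values
x_full = [(1.0, v) for v in base]              # intercept plus a varying regressor
x_collinear = [(v, 2.0 * v) for v in base]     # second column is twice the first

beta0 = (1.0, 1.0)
y_collinear = [x[0] * beta0[0] + x[1] * beta0[1] for x in x_collinear]

det_full = second_moment_det(x_full)
det_collinear = second_moment_det(x_collinear)
q_true = ls_objective(x_collinear, y_collinear, beta0)
q_other = ls_objective(x_collinear, y_collinear, (3.0, 0.0))

print(det_full, det_collinear)   # positive in the full-rank design, zero when collinear
print(q_true, q_other)           # identical objective values at two different betas
```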

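Quantile Regression

The conditional $\tau$th quantile of $y_t$ given $x_t$ is a linear regression function $x_t'\beta_0$, i.e.

$$\Pr(y_t \leq x_t'\beta_0 \mid x_t) \equiv F_y(x_t'\beta_0 \mid x_t) = \tau.$$

The $\tau = \frac{1}{2}$th quantile is the median.

Population moment condition:

$$E\left[ \left( \tau - 1(y_t \leq x_t'\beta_0) \right) x_t \right] = E\left[ \left( \tau - \Pr(y_t \leq x_t'\beta_0 \mid x_t) \right) x_t \right] = 0.$$

Sample moment condition:

$$0 \approx \frac{1}{n} \sum_{t=1}^{n} x_t \left( \tau - 1(y_t \leq x_t'\hat\beta) \right) = \frac{1}{n} \sum_{t=1}^{n} x_t \left[ \tau \, 1(y_t > x_t'\hat\beta) - (1 - \tau) \, 1(y_t \leq x_t'\hat\beta) \right].$$

Integrate the condition back to obtain the convex objective function $Q_n(\beta)$.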

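Objective function for QR:

$$Q_n(\beta) = \frac{1}{n} \sum_{t=1}^{n} \left[ \tau - 1(y_t \leq x_t'\beta) \right] (y_t - x_t'\beta) = \frac{1}{n} \sum_{t=1}^{n} \left[ \tau (y_t - x_t'\beta)^{+} + (1 - \tau)(y_t - x_t'\beta)^{-} \right],$$

where $u^{+} = \max(u, 0)$ and $u^{-} = \max(-u, 0)$.

When $\tau = \frac{1}{2}$, $Q_n(\beta) = \frac{1}{2n} \sum_{t=1}^{n} |y_t - x_t'\beta|$, which is (up to the factor $\frac{1}{2}$) the objective of the Least Absolute Deviation (LAD) regression, which looks for the conditional median.

Also, $E x_t x_t'$ being full rank implies global consistency for the linear quantile regression model.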
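The link between the QR objective and sample quantiles can be checked in the simplest case. The sketch below assumes an intercept-only model ($x_t = 1$), a simulated standard normal sample of size 201, and $\tau = 0.25$; all of these are illustrative choices. It minimizes the check-function objective over the data points and reports where the minimizer sits in the ordered sample.

```python
import random


def check_loss(u, tau):
    """rho_tau(u) = [tau - 1(u <= 0)] * u, equal to tau*u^+ + (1 - tau)*u^-."""
    return (tau - (1.0 if u <= 0 else 0.0)) * u


def qr_objective(ys, b, tau):
    """Q_n(b) for an intercept-only model (x_t = 1 is an illustrative choice)."""
    return sum(check_loss(y - b, tau) for y in ys) / len(ys)


rng = random.Random(2)                     # fixed seed, illustrative only
ys = [rng.gauss(0.0, 1.0) for _ in range(201)]
tau = 0.25

# The objective is piecewise linear and convex, so a minimizer is attained at a data point.
b_hat = min(ys, key=lambda b: qr_objective(ys, b, tau))
rank = sorted(ys).index(b_hat)             # zero-based position in the ordered sample
print(b_hat, rank)
```

Since $\tau n = 50.25$ is not an integer, the minimizer is the 51st order statistic (zero-based position 50), i.e. the sample 0.25 quantile. With `tau = 0.5` the same code returns the sample median, which is the LAD case.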

$Q_n(\beta)$ for QR has two features:

1) $Q_n(\beta)$ is convex, so pointwise convergence is sufficient for uniform convergence over compact $\Theta$, and the parameter space does not have to be compact.
2) No moment conditions on $y_t$ are needed to obtain pointwise convergence. This is done by subtracting $Q_n(\beta_0)$, so that $Q_n(\beta) - Q_n(\beta_0) \xrightarrow{p} Q(\beta) - Q(\beta_0)$ by applying the triangle inequality.

Concavity and noncompact parameter set: when $Q_n(\theta)$ is concave (for maximization, or convex for minimization), pointwise convergence $\Rightarrow$ uniform convergence. Because a local maximum of a concave $Q(\theta)$ is also a global maximum, local maximization of $Q(\theta)$ $\Rightarrow$ global consistency.

Uniform Convergence (in probability)

Definition: $\hat Q(\theta)$ converges in probability to $Q(\theta)$ uniformly over the compact set $\theta \in \Theta$ if, $\forall \epsilon > 0$,

$$\lim_{T \to \infty} P\left( \sup_{\theta \in \Theta} \left| \hat Q(\theta) - Q(\theta) \right| > \epsilon \right) = 0.$$

Consistency of M-Estimators: If $Q_T(\theta)$ converges in probability to $Q(\theta)$ uniformly, $Q(\theta)$ is continuous and uniquely maximized at $\theta_0$, $\hat\theta = \arg\max Q_T(\theta)$ over the compact parameter set $\Theta$, plus continuity and measurability for $Q_T(\theta)$, then $\hat\theta \xrightarrow{p} \theta_0$.

Consistency of estimated var-cov matrix: Note that it is sufficient for uniform convergence to hold over a shrinking neighborhood of $\theta_0$.

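Conditions for Uniform Convergence: Equicontinuity

First think about a sequence of deterministic functions $f_n(\theta)$.

Uniform equicontinuity for $f_n(\theta)$:

$$\lim_{\delta \to 0} \sup_{n} \sup_{|\theta' - \theta| < \delta} \left| f_n(\theta') - f_n(\theta) \right| = 0.$$

What if $f_n(\theta)$ may be discontinuous but the size of the jump goes to 0?

Asymptotic uniform equicontinuity for $f_n(\theta)$:

$$\lim_{\delta \to 0} \limsup_{n \to \infty} \sup_{|\theta' - \theta| < \delta} \left| f_n(\theta') - f_n(\theta) \right| = 0.$$

Uniform convergence of $f_n(\theta)$: For $\Theta$ compact, $\sup_{\theta \in \Theta} |f_n(\theta)| \to 0$ if and only if $f_n(\theta) \to 0$ for each $\theta$ and $f_n$ is asymptotically uniformly equicontinuous.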
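A standard counterexample helps show why pointwise convergence alone is not enough. The sketch below uses $f_n(\theta) = n\theta e^{-n\theta}$ on $\Theta = [0, 1]$, a textbook choice that is not part of the lecture; the grid is an illustrative stand-in for the supremum over $\Theta$.

```python
import math


def f(n, theta):
    """f_n(theta) = n * theta * exp(-n * theta), a textbook counterexample."""
    return n * theta * math.exp(-n * theta)


grid = [k / 10000.0 for k in range(10001)]   # grid on Theta = [0, 1]

pointwise_at_half = {}
sup_by_n = {}
for n in (10, 100, 1000):
    pointwise_at_half[n] = abs(f(n, 0.5))             # goes to 0 as n grows
    sup_by_n[n] = max(abs(f(n, th)) for th in grid)   # stays near exp(-1)
    print(n, pointwise_at_half[n], sup_by_n[n])
```

For every fixed $\theta$ the sequence converges to 0, yet the supremum stays at $e^{-1}$, attained at $\theta = 1/n$. Asymptotic uniform equicontinuity fails here because $|f_n(1/n) - f_n(0)| = e^{-1}$ while $|1/n - 0| \to 0$.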

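Then the stochastic case $Q_n(\theta)$.

Definition: A sequence of random functions $Q_n(\theta)$ is stochastically uniformly equicontinuous if, $\forall \epsilon > 0$,

$$\lim_{\delta \to 0} \limsup_{n \to \infty} P\left( \sup_{|\theta - \theta'| < \delta} \left| Q_n(\theta) - Q_n(\theta') \right| > \epsilon \right) = 0.$$

Uniform convergence in probability: If $Q_n(\theta) \xrightarrow{p} 0$ for each $\theta$, and $Q_n(\theta)$ is stochastically equicontinuous on $\theta \in \Theta$ compact, then $\sup_{\theta \in \Theta} |Q_n(\theta)| \xrightarrow{p} 0$.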

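Lipschitz Condition for Stochastic Equicontinuity

A simple sufficient condition for stochastic equicontinuity, applicable where the objective function is smooth, differentiable, etc.

Lipschitz condition: For $\theta, \theta' \in \Theta$, if $|Q_n(\theta) - Q_n(\theta')| \leq B_n \, d(\theta, \theta')$, where $\lim_{\delta \to 0} \sup_{|\theta - \theta'| < \delta} d(\theta, \theta') = 0$ and $B_n = O_p(1)$, then $Q_n(\theta)$ is stochastically equicontinuous.

Example: Suppose $Q_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} f(z_t, \theta)$, $z_t$ iid, and $f(z_t, \theta)$ is differentiable in $\theta$ with derivative $f_\theta(z_t, \theta)$. Then by a Taylor (mean value) expansion, for some $\bar\theta_t$ between $\theta$ and $\theta'$,

$$|Q_n(\theta) - Q_n(\theta')| \leq \frac{1}{n} \sum_{t=1}^{n} \left| f_\theta(z_t, \bar\theta_t) \right| \, |\theta - \theta'|.$$

If $b(z_t) = \sup_{\theta \in \Theta} |f_\theta(z_t, \theta)|$ is such that $E\, b(z_t) < \infty$, then the Lipschitz condition holds with $B_n = \frac{1}{n} \sum_{t=1}^{n} b(z_t)$.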
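The Lipschitz bound in the example can be checked numerically for a specific smooth function. In the sketch below, $f(z, \theta) = \cos(z\theta)$ is an assumed illustration (not from the lecture), for which $f_\theta(z, \theta) = -z \sin(z\theta)$ and the envelope is $b(z) = |z|$, so $B_n$ is the sample mean of $|z_t|$.

```python
import math
import random

rng = random.Random(3)                       # fixed seed, illustrative only
z = [rng.gauss(0.0, 1.0) for _ in range(1000)]


def q_n(theta):
    """Q_n(theta) = (1/n) sum cos(z_t * theta); f is an assumed example function."""
    return sum(math.cos(zt * theta) for zt in z) / len(z)


# b(z) = sup_theta |f_theta(z, theta)| = |z|, so B_n is the sample mean of |z_t|.
b_n = sum(abs(zt) for zt in z) / len(z)

pairs = [(rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)) for _ in range(200)]
# Small slack (1e-12) guards against floating-point rounding only.
holds = all(abs(q_n(a) - q_n(b)) <= b_n * abs(a - b) + 1e-12 for a, b in pairs)
print(b_n, holds)
```

Because $E|z_t| < \infty$, $B_n$ converges by the law of large numbers and is therefore $O_p(1)$, which is the requirement in the Lipschitz condition.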

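Uniform WLLN

But what to do when the Lipschitz condition is not applicable?

Uniform WLLN: If $\Theta$ is compact, $y_t$ is iid, $g(y_t, \theta)$ is continuous in $\theta$ for each $y_t$ a.s., $E g(y_t, \theta) = 0$, and $E \sup_{\theta \in \Theta} |g(y_t, \theta)| < \infty$, then $\forall \epsilon > 0$,

$$\lim_{n \to \infty} P\left( \sup_{\theta \in \Theta} \left| \frac{1}{n} \sum_{t=1}^{n} g(y_t, \theta) \right| > \epsilon \right) = 0.$$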

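Proof: Use pointwise convergence + stochastic equicontinuity.

1. $E \sup_{\theta \in \Theta} |g(y_t, \theta)| < \infty \Rightarrow E |g(y_t, \theta)| < \infty$ for each $\theta$, so use the SLLN to conclude $\frac{1}{n} \sum_{t=1}^{n} g(y_t, \theta) \xrightarrow{a.s.} 0$ for each $\theta$.

2. Verify stochastic equicontinuity for $\frac{1}{n} \sum_{t=1}^{n} g(y_t, \theta)$:

$$\sup_{|\theta - \theta'| < \delta} \left| \frac{1}{n} \sum_{t=1}^{n} \left( g(y_t, \theta) - g(y_t, \theta') \right) \right| \leq \sup_{|\theta - \theta'| < \delta} \frac{1}{n} \sum_{t=1}^{n} \left| g(y_t, \theta) - g(y_t, \theta') \right| \leq \frac{1}{n} \sum_{t=1}^{n} \sup_{|\theta - \theta'| < \delta} \left| g(y_t, \theta) - g(y_t, \theta') \right|.$$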

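Therefore, by Markov's inequality,

$$\lim_{\delta \to 0} \limsup_{n \to \infty} P\left( \sup_{|\theta - \theta'| < \delta} \left| \frac{1}{n} \sum_{t=1}^{n} \left( g(y_t, \theta) - g(y_t, \theta') \right) \right| > \epsilon \right) \leq \lim_{\delta \to 0} \limsup_{n \to \infty} P\left( \frac{1}{n} \sum_{t=1}^{n} \sup_{|\theta - \theta'| < \delta} \left| g(y_t, \theta) - g(y_t, \theta') \right| > \epsilon \right)$$

$$\leq \lim_{\delta \to 0} \limsup_{n \to \infty} \frac{E \sum_{t=1}^{n} \sup_{|\theta - \theta'| < \delta} \left| g(y_t, \theta) - g(y_t, \theta') \right|}{n \epsilon} = \frac{1}{\epsilon} \lim_{\delta \to 0} E \sup_{|\theta - \theta'| < \delta} \left| g(y_t, \theta) - g(y_t, \theta') \right|.$$

Finally, use the uniform continuity of $g(y_t, \theta)$ in $\theta$ (uniform because $\Theta$ is compact) and the dominated convergence theorem (DOM) to conclude that this limit is 0, since $\lim_{\delta \to 0} \sup_{|\theta - \theta'| < \delta} |g(y_t, \theta) - g(y_t, \theta')| = 0$ almost surely, and

$$E \sup_{\delta} \sup_{|\theta - \theta'| < \delta} \left| g(y_t, \theta) - g(y_t, \theta') \right| \leq 2 E \sup_{\theta \in \Theta} |g(y_t, \theta)| < \infty.$$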
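The uniform WLLN can be illustrated by simulation. The sketch below is only a rough check under assumed ingredients: $y_t \sim N(0, 1)$, $g(y, \theta) = \cos(\theta y) - e^{-\theta^2/2}$ (which has mean zero because $E\cos(\theta y) = e^{-\theta^2/2}$ for a standard normal), and a grid on $\Theta = [0, 2]$ in place of the exact supremum. None of these choices come from the lecture.

```python
import math
import random


def g(y, theta):
    """Mean-zero moment function: cos(theta*y) minus its expectation under N(0, 1)."""
    return math.cos(theta * y) - math.exp(-0.5 * theta * theta)


def sup_abs_mean(sample, grid):
    """Grid approximation to sup over Theta of |(1/n) sum g(y_t, theta)|."""
    n = len(sample)
    return max(abs(sum(g(y, th) for y in sample) / n) for th in grid)


rng = random.Random(4)                       # fixed seed, illustrative only
grid = [k / 20.0 for k in range(41)]         # Theta = [0, 2], step 0.05

sup_by_n = {}
for n in (100, 1000, 10000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    sup_by_n[n] = sup_abs_mean(sample, grid)
    print(n, sup_by_n[n])
```

Here $|g| \leq 2$, so the dominance condition $E \sup_{\theta} |g(y_t, \theta)| < \infty$ holds trivially, and the grid supremum should shrink toward 0 as $n$ grows.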