Ultra High Dimensional Variable Selection with Endogenous Variables

Similar documents
Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Dealing With Endogeneity

MLE and GMM. Li Zhao, SJTU. Spring, Li Zhao MLE and GMM 1 / 22

High Dimensional Sparse Econometric Models: An Introduction

Paper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001)

Smoothly Clipped Absolute Deviation (SCAD) for Correlated Variables

A UNIFIED APPROACH TO MODEL SELECTION AND SPARS. REGULARIZED LEAST SQUARES by Jinchi Lv and Yingying Fan The annals of Statistics (2009)

Mostly Dangerous Econometrics: How to do Model Selection with Inference in Mind

Program Evaluation with High-Dimensional Data

Lecture 14: Variable Selection - Beyond LASSO

Motivation Non-linear Rational Expectations The Permanent Income Hypothesis The Log of Gravity Non-linear IV Estimation Summary.

PhD/MA Econometrics Examination January 2012 PART A

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models

The risk of machine learning

High-Dimensional Sparse Econometrics. Models, an Introduction

Single Index Quantile Regression for Heteroscedastic Data

Flexible Estimation of Treatment Effect Parameters

Applied Health Economics (for B.Sc.)

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade

ESSAYS ON INSTRUMENTAL VARIABLES. Enrique Pinzón García. A dissertation submitted in partial fulfillment of. the requirements for the degree of

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Uniform Inference for Conditional Factor Models with Instrumental and Idiosyncratic Betas

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Binary Models with Endogenous Explanatory Variables

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

The outline for Unit 3

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

Econ 583 Final Exam Fall 2008

Generalized Method of Moment

Moment and IV Selection Approaches: A Comparative Simulation Study

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Econometrics of Panel Data

A Goodness-of-fit Test for Copulas

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

Interquantile Shrinkage in Regression Models

Econometrics of High-Dimensional Sparse Models

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Lecture 3: Statistical Decision Theory (Part II)

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Single-Equation GMM: Endogeneity Bias

13 Endogeneity and Nonparametric IV

Introduction to Econometrics

Panel Threshold Regression Models with Endogenous Threshold Variables

Variable selection and estimation in high-dimensional models

Non-linear Supervised High Frequency Trading Strategies with Applications in US Equity Markets

Regression, Ridge Regression, Lasso

Truncation and Censoring

ECONOMETRICS HONOR S EXAM REVIEW SESSION

Machine learning, shrinkage estimation, and economic theory

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Introduction to Estimation Methods for Time Series models. Lecture 1

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Lecture 9: Quantile Methods 2

arxiv: v1 [stat.me] 30 Dec 2017

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

Inference in Nonparametric Series Estimation with Data-Dependent Number of Series Terms

Habilitationsvortrag: Machine learning, shrinkage estimation, and economic theory

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS

Penalized Empirical Likelihood and Growing Dimensional General Estimating Equations

ECON 3150/4150, Spring term Lecture 6

High Dimensional Propensity Score Estimation via Covariate Balancing

DISCUSSION PAPER. The Bias from Misspecification of Control Variables as Linear. L e o n a r d G o f f. November 2014 RFF DP 14-41

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson

Linear Models and Estimation by Least Squares

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

SPARSE MODELS AND METHODS FOR OPTIMAL INSTRUMENTS WITH AN APPLICATION TO EMINENT DOMAIN

Maximum Likelihood Estimation for Factor Analysis. Yuan Liao

Econometrics of Big Data: Large p Case

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

Birkbeck Working Papers in Economics & Finance

Problem Set 7. Ideally, these would be the same observations left out when you

Variable Screening in High-dimensional Feature Space

Instrumental Variables and the Problem of Endogeneity

Economics 582 Random Effects Estimation

Models of Qualitative Binary Response

More on Specification and Data Issues

Regression #3: Properties of OLS Estimator

Linear Regression. Junhui Qian. October 27, 2014

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Non-linear panel data modeling

Advanced Econometrics

Consistent high-dimensional Bayesian variable selection via penalized credible regions

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

LASSO METHODS FOR GAUSSIAN INSTRUMENTAL VARIABLES MODELS. 1. Introduction

1 Motivation for Instrumental Variable (IV) Regression

08 Endogenous Right-Hand-Side Variables. Andrius Buteikis,

Instrumental Variables, Simultaneous and Systems of Equations

Ramsey Cass Koopmans Model (1): Setup of the Model and Competitive Equilibrium Path

Fall, 2007 Nonlinear Econometrics. Theory: Consistency for Extremum Estimators. Modeling: Probit, Logit, and Other Links.

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Quantile Processes for Semi and Nonparametric Regression

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

ECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II

Nonparametric Econometrics

Transcription:

1 / 39 Ultra High Dimensional Variable Selection with Endogenous Variables Yuan Liao Princeton University Joint work with Jianqing Fan Job Market Talk January, 2012

2 / 39 Outline 1 Examples of Ultra High Dimensional Econometric Model 2 Problem I: Ultra-high dim. covariates selection with endogeneity Inconsistency of Penalized OLS Penalized GMM and Penalized empirical likelihood 3 Problem II: Ultra-high dimensional instrumental selection Linear model: 2SLS with many weak instruments Nonlinear model: ultra-high dimensional sieve approximation 4 Implementation and Simulation

Examples of Ultra High Dimensional Econometric Model 3 / 39 Examples of Ultra High Dimensional Econometric Model

4 / 39 Examples of Ultra High Dimensional Econometric Model Cross-country Growth Regression Estimating the effect of an initial GDP per capita on the growth rates of GDP per capita. Solow-Swan-Ramsey model: poorer countries should grow faster, and catch up with richer countries. effect of initial GDP on growth rate should be negative Rejected using a simple bivariate regression ( Barro and Sala-i-Martin 1995) Conditional effects: For countries with similar characteristics, the effect of initial GDP on growth rate is negative.

Examples of Ultra High Dimensional Econometric Model 5 / 39 Cross-country Growth Regression y i = a 0 + a 1 log G i + x T i β + ɛ i y: growth rate; G: initial GDP x: country s char.: measures of edu, policies, trade openness, saving rates, investment rate, etc. H 0 : a 1 < 0. Barro and Lee (1994): p = 62, n = 90. Severe criticism of literature for relying on ad hoc covariate selection (Levine and Renelt 92) Development of a data-driven procedure for covariates selection is essential.

Examples of Ultra High Dimensional Econometric Model 6 / 39 Home price prediction Housing market based on state-level panel data can capture state-specific dynamics and variations If focus on local levels, including only macroeconomic variable cannot capture the cross-sectional correlation among local levels. y i,t+s = p y kt β ik + x T t θ i + ɛ i,t+s. k=1 x: macroeconomic variables p 1000; n < 200 for monthly sales data in ten years. Only a few county-level info. should be useful conditioning on national factors.

7 / 39 Examples of Ultra High Dimensional Econometric Model Home price prediction Fan, Lv and Qi (2011): monthly repeated sales of 352 largest counties in US from January 2000 to December 2009(n = 120) Testing periods: 2006.1-2009.12 black: historical data blue: OLS with national house-price appreciation only red: penalized variable selection

Examples of Ultra High Dimensional Econometric Model 8 / 39 Labor Economics: Wage regression Effect of education on future income Response: log-wage y = E(y x) + ɛ. Nonparametric sieve approx. E(y x) = p i=1 β ip i (x) + r, P 1 (x),..., P p (x) are either polynomials or spline transformations. No guarantee that r is small using low-order polynomials. Possible oscillatory behavior associated with advanced degrees higher order

9 / 39 Examples of Ultra High Dimensional Econometric Model Labor Economics: Wage regression Belloni and Chernozhukov 11

Examples of Ultra High Dimensional Econometric Model 10 / 39 Instrumental Selection y = θ 0 + θ 1 z + w T γ + u 1 z = x T β + w T δ + u 2 y : wage; z : education. Angrist and Krueger: 180 IV s Two approaches in classical literature: 1 use 3 leading IV s: large variance 2 use 180 IV s: large bias 37 Lasso selected IV s. (Belloni and Chernozhukov 11)

Examples of Ultra High Dimensional Econometric Model 11 / 39 Model setting Consider y i = x T i β 0 + ɛ i, i = 1,..., n. dim(x) = p >> n. Allow p = exp(n α ), for some α (0, 1). Assume β 0 to be sparse. β 0 = (β 0S, 0) where dim(β 0S ) = s << n. Accordingly, x = (x S, x N ): important and unimportant

Examples of Ultra High Dimensional Econometric Model 12 / 39 Two Problems in This Talk

13 / 39 Examples of Ultra High Dimensional Econometric Model Problem I: Ultra-high dim. covariates selection y = x T β 0 + ɛ, β 0 = (β 0S, 0) x may contain many endogenous components. How to achieve oracle property? 1 ˆβ S β 0S = O p ( s/n log s). 2 P( ˆβ N = 0) 1. 3 ˆβ S has asymptotic normality. Solution: penalized GMM and penalized EL.

14 / 39 Examples of Ultra High Dimensional Econometric Model Problem II: Ultra high dim. instrumental selection dim(w) can be ultra high. y = x T S β 0S + ɛ x S = Θ 0 w + v dim(w) = O(exp(n α )), α (0, 1). Many instruments are weak. Solution: Penalized LS in 2SLS.

Problem I: Ultra-high dim. covariates selection with endogeneity 15 / 39 Problem I: Ultra-high dimensional covariates selection

16 / 39 Problem I: Ultra-high dim. covariates selection with endogeneity Penalized OLS Inconsistency of Penalized OLS Find ˆβ as: 1 ˆβ = arg min β n n (y i x i β)2 + i=1 P n : penalty function. Lasso: P n ( β j ) = λ n β j, where λ n 0. SCAD (Fan and Li 2001), etc. Key assumption: E(ɛ x S, x N ) = 0 p P n ( β j ). Unimportant predictors are artificially added; more desirable to assume only E(ɛ x S ) = 0 In many cases, important covariates are also endogenous. Instead, E(ɛ w) = 0. j=1

17 / 39 Problem I: Ultra-high dim. covariates selection with endogeneity A simulation example Inconsistency of Penalized OLS β 0 = (2.5, 4, 7, 1.5, 0,..., 0), p = 10, n = 100. (x 1,..., x 4 ) = (z 1,..., z 4 ), x j = (z j + 2)(ɛ + 1) where z N p (0, Σ), Σ ij = 0.5 i j. Penalized OLS +SCAD λ = 0.2 λ = 0.7 λ = 1.2 λ = 1.7 TP-Mean 4 4 4 4 FP-Mean 5.25 5.34 5.24 5.14 FP-Median 5 6 5 5 (0.901) (0.799) (0.912) (0.83)

18 / 39 Problem I: Ultra-high dim. covariates selection with endogeneity Inconsistency of POLS Inconsistency of Penalized OLS Theorem 1 Suppose Ex l ɛ >> 0 for some x l. If β = ( β T S, β T N )T is POLS estimator, then either β S β 0S p 0, or lim sup P( β N = 0) < 1. n The inconsistency of POLS comes from the fact that, when x l is endogenous, E(y x T β 0 )x l = 0 is misspecified.

Problem I: Ultra-high dim. covariates selection with endogeneity Penalized GMM and Penalized empirical likelihood 19 / 39 Ultra-high dim. covariates selection with endogeneity Consider more general E[g(y, x T β 0 ) w] = 0, β 0 = (β 0S, 0). linear model: g = y x T β 0 logit model: g = y exp(x T β 0 )/(1 + exp(x T β 0 )) probit model: g = y Φ(x T β 0 ) Both important and unimportant covariates are possibly endogenous w: a set of valid instrumental variables.

Problem I: Ultra-high dim. covariates selection with endogeneity Penalized GMM and Penalized empirical likelihood 20 / 39 Penalized GMM Let v be p-dim. technical instruments. If dim(w) dim(x), v w. v = (f 1 (w),..., f p (w)). For fixed β R p, let v(β) contain only components {v l : β l 0} e.g., p = 3, β = (1, 0, 2), then v(β) = (v 1, v 3 ). Define L GMM (β) = [ 1 n n g(y i, x T i β)v i (β)] T W [ 1 n i=1 n g(y i, x T i β)v i (β)] i=1 p Q GMM (β) = L GMM (β) + P n ( β i ). i=1

Problem I: Ultra-high dim. covariates selection with endogeneity Penalized GMM and Penalized empirical likelihood 21 / 39 P n : penalty function. 1 P n is concave, increasing on [0, ), differentiable 2 P n(0 + ) > n 1/2 ; P n(t) = o(1) when t > c > 0. 3 max β0j 0 P n (β 0j ) = o(1). Examples 1 Lasso (Tibshirani 1986): P n(t) = λ n t 2 SCAD (Fan and Li 2001): P n(t) = λ n[λ n + λ n (aλ n t) + (a 1)λ n dt] 3 MCP (Zhang 2009): P n(t) = 1 (aλn t)+dt. a 4 Hard thresholding (Antoniadis 1996): a = 1.

Problem I: Ultra-high dim. covariates selection with endogeneity Penalized GMM and Penalized empirical likelihood 22 / 39 x 10-3 SCAD (Fan and Li 2001) Lasso (Tibshirani 1986) 0-0.1 0 0.1

23 / 39 Problem I: Ultra-high dim. covariates selection with endogeneity Oracle properties of PGMM Penalized GMM and Penalized empirical likelihood Theorem 1 Either E(g(y, x T β 0 ) x S ) = 0 or E(g(y, x T β 0 ) w) = 0 0 < c < λ min (Ex S v(β 0S ) T ) λ max (Ex S v(β 0S ) T ) < M. s 3 log s = o(n). Under regularity conditions, there exists a strictly local minimizer of Q GMM : 1 2 P( ˆβ N = 0) 1. s log s ˆβ S β 0S = O p ( + sp n n(min( β 0S ))). 3 Asymptotic normality of ˆβ S.

24 / 39 Problem I: Ultra-high dim. covariates selection with endogeneity Penalized empirical likelihood Penalized GMM and Penalized empirical likelihood Theorem 2 L EL (β) = 1 max λ R k β 0 n n log{1 + λ T [(y i x T i β)v i (β)]}. (2.1) i=1 Q EL (β) = L EL (β) + p P n ( β j ). (2.2) s 4 log s = o(n), there exists a strictly local minimizer of Q EL : 1 2 P( ˆβ N = 0) 1. j=1 s log s ˆβ S β 0S = O p ( + sp n n(min( β 0S ))). 3 Asymptotic normality of ˆβ S.

Problem II: Ultra-high dimensional instrumental selection Linear model: 2SLS with many weak instruments 25 / 39 Problem II: Ultra-high dimensional instrumental selection

26 / 39 Problem II: Ultra-high dimensional instrumental selection Linear model: 2SLS with many weak instruments Ultra high dim. instrumental selection Suppose oracle property is achieved, w.p.a.1, we identity: E[g(y, x T S β 0S) w] = 0. Optimal IV: A(w) = D(w) T Ω(w) 1, (Newey 01) D(w) = E( g(β 0S) β S w), Ω(w) = E(g(y, x T β 0S )g(y, x T β 0S ) T w). Ω(w): homoskedasticity. dim(w) can be ultra high. y = x T S β 0S + ɛ x S = Θ 0 w + v D(w) = Θ 0 w. But many instruments are weak. Including many weak IV s in 2SLS is severely biased.

27 / 39 Problem II: Ultra-high dimensional instrumental selection Linear model Linear model: 2SLS with many weak instruments Method based on MSE: (Donald&Newey 01, Kuersteiner&Okui 10) dim(w) n. requires natural ordering of IV s. In general, computationally infeasible: NP-hard. Proposed method: on the first stage, 1 ˆθ l = arg min θ n n (x Sl w T i θ) 2 + i=1 p P n ( θ j ). j=1 x = ˆΘv, ˆΘ = (ˆθ1,..., ˆθ s ) T. We allow dim(w) = o(exp( T )). s LASSO: ˆθ l θ 0l = O p ( 1 log s 1 n + s 1 λ n ), s SCAD: ˆθ l θ 0l = O p ( 1 log s 1 n ).

Problem II: Ultra-high dimensional instrumental selection Linear model: 2SLS with many weak instruments 28 / 39 Recent literature proposed methods based on l 1 penalty: (Belloni et al. 10, Garcia 11, Can&Fan 11) computationally efficient Lasso: choice of λ n is very restrictive. λ n large miss many important IVs. λ n small include too many weak IVs, complicated model Adaptive lasso: P n ( β j ) = β j 1 λ n β j. requires initial estimator, which is hard to obtain when w is ultra high dimensional. iterative algorithm may permanently remove important IVs. Proposed method allows more adaptive penalties.

Problem II: Ultra-high dimensional instrumental selection Linear model: 2SLS with many weak instruments 29 / 39 Nonlinear model Optimal IV: D(w) = E( g(β 0S) w) β S Estimate based on sieve approx. (Newey 01) p 1 D(w) = θ i f i (w) + r, p 1 n. i=1 No guarantee r is small if p 1 is small. Goal: allow for higher order polynomials

30 / 39 Problem II: Ultra-high dimensional instrumental selection Ultra-high dim. sieve approximation Nonlinear model: ultra-high dimensional sieve approximation Assumption: 1 There is a large set of technical IV s v = (f 1 (w),..., f p1 (w)) T (possibly p n): D(w) = Θ 0 v + a(w), max l s ( 1 n n a l (w i ) 2 ) = O p (cn) 2 i=1 2 max l s i / T l θ 0l,i < n α 1, minl s,i Tl θ 0l,i = h n > n α 2 max l s #{i : i T l } = s 1 = o(n). Penalized estimator: 1 ˆθ l = arg min θ n n ( g(y i, ˆx T i ˆβ S ) v T i θ) 2 + β Sl i=1 ˆD(w) = ˆΘv, ˆΘ = (ˆθ 1,..., ˆθ s ) T. p P n ( θ j ). j=1

Problem II: Ultra-high dimensional instrumental selection Nonlinear model: ultra-high dimensional sieve approximation 31 / 39 Theorem 2 There exists a strictly local minimizer ˆθ l = (ˆθ ls, ˆθ ls ), s.t. s log s s1 log s 1 ˆθ ls θ 0l,S = O p ( + + s 1 n α 1 + s 1 c n n n 1 n n i=1 + s 1 P n(h n )). lim P(ˆθ ln = 0) = 1. n ˆD l (W i ) D l (W i ) 2 = O p ( s 1s log s 1 n +s 2 1 n 2α 1 + s 2 1 c2 n + s 2 1 P n(h n ) 2 ). + s2 1 log s 1 n

Implementation and Simulation 32 / 39 Implementation

33 / 39 Smoothing Implementation and Simulation L GMM is not continuous. = p j=1 L GMM (β) = w j [ 1 n n i=1 [ 1 n n i=1 (y i x T i β)x i (β)] T W (y i x T i β)x ij I(β j 0) ] T [ 1 n n i=1 [ 1 n n i=1 (y i x T i β)x i (β) (y i x T i β)x ij I(β j 0) ] ] L EL (β) = max λ 1 n n i=1 1 = max λ j R k,j=1,...,p n log(1 + λ T (y i x T i β)x i (β) n log(1 + i=1 p j=1 λ T j (y i x T i β)x ij I(β j 0)),

Implementation and Simulation Replace I(β j 0) with K (β 2 j /σ n ), σ n 0 K (0) = 0, K (+ ) = 1, lim t K (t)t = 0, lim t K (t)t <. K (.) < M. Kernel K is similar to a cdf, as in smoothed maximum score. Horowitz (1992) Example: K (t) = 0.5(Φ(t) 0.5). Theorem 3 Under regularity conditions of P n, K n, and Theorems 1-4, smoothed PGMM and PEL achieve oracle properties. 34 / 39

35 / 39 Simulation Implementation and Simulation E(ɛ x S ) = 0, without knowing x S. y = x T β 0 + ɛ β 0 = (2.5, 4, 7, 1.5, 0,..., 0), ɛ N(0, 1). z N p (0, Σ), Σ ij = 0.5 i j. (x 1,..., x 4 ) = (z 1,..., z 4 ), x j = (z j + 2)(ɛ + 1)

Implementation and Simulation 36 / 39 Table: POLS and PGMM when p = 50, n = 200 POLS PGMM λ = 0.05 λ = 0.1 λ = 0.5 λ = 1 λ = 0.05 λ = 0.1 λ = 0.2 λ = 0.4 MSE S 0.145 0.133 0.629 1.417 0.261 0.184 0.194 0.979 (0.053) (0.043) (0.301) (0.329) (0.094) (0.069) (0.076) (0.245) MSE N 0.126 0.068 0.072 0.095 0.001 0 0.001 0.003 (0.035) (0.016) (0.016) (0.019) (0.010) (0) (0.009) (0.014) TP-Mean 5 5 4.82 3.63 5 5 5 4.5 Median 5 5 5 4 5 5 5 4.5 (0) (0) (0.385) (0.504) (0) (0) (0) (0.503) FP-Mean 37.68 35.36 8.84 2.58 0.08 0 0.02 0.14 Median 38 35 8 2 0 0 0 0 (2.902) (3.045) (3.334) (1.557) (0.337) (0) (0.141) (0.569)

37 / 39 Implementation and Simulation Sensitivity to minimal signal β 4 = 1.5 β 4 = 0.5 Table: Penalized GMM when p = 20, β 4 = 0.5 λ 0.001 0.005 0.01 0.05 0.1 0.5 MSE S 0.112 0.136 0.137 0.156 0.142 0.433 (0.090) (0.117) (0.102) (0.117) (0.083) (0.158) TP-Mean 4.96 4.92 4.94 4.91 4.96 4.25 Median 5 5 5 5 5 4 (0.197) (0.273) (0.239) (0.288) (0.197) (0.435) FP-Mean 11.28 3.88 1.135 0.020 0 0 Median 11 3 1 1 0 0 (1.545) (2.447) (2.139) (0.141) (0) (0)

Implementation and Simulation 38 / 39 Conclusion Many applications in economics contains ultra. high dim. regressors Careful about POLS for variable selection PGMM/ PEL allow endogeneity in ultra high dim. estimation and selection. Allow ultra high dim. instruments for 2SLS Allow ultra high dim. sieve approx. for optimal IV.

39 / 39 Some references Implementation and Simulation HANSEN, L. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50 1029-1054 OWEN, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237-249. ANDREWS, D. (1999). Consistent moment selection procedures for generalized method of moments estimation. Econometrica, 67 543-564 CANER, M. (2009). Lasso-type GMM estimator. Econometric Theory, 25 270-290 BELLONI, A. and CHERNOZHUKOV, V. (2011). High-Dimensional Sparse Econometric Models, an Introduction. In Ill-Posed Inverse Problems and High-Dimensional Estimation, Springer Lecture Notes in Statistics. SALIA-I-MARTIN, X. (1997). I Just Ran Two Million Regressions. The American Economic Review 87, 178-183.