Ultra High Dimensional Variable Selection with Endogenous Variables
Slide 1. Ultra High Dimensional Variable Selection with Endogenous Variables
Yuan Liao, Princeton University
Joint work with Jianqing Fan
Job Market Talk, January 2012

Slide 2. Outline
1. Examples of Ultra High Dimensional Econometric Model
2. Problem I: Ultra-high dim. covariates selection with endogeneity
   - Inconsistency of Penalized OLS
   - Penalized GMM and Penalized empirical likelihood
3. Problem II: Ultra-high dimensional instrumental selection
   - Linear model: 2SLS with many weak instruments
   - Nonlinear model: ultra-high dimensional sieve approximation
4. Implementation and Simulation

Slide 3. Examples of Ultra High Dimensional Econometric Model

Slide 4. Cross-country Growth Regression
- Estimating the effect of initial GDP per capita on the growth rate of GDP per capita.
- Solow-Swan-Ramsey model: poorer countries should grow faster and catch up with richer countries, so the effect of initial GDP on the growth rate should be negative.
- Rejected using a simple bivariate regression (Barro and Sala-i-Martin 1995).
- Conditional effects: for countries with similar characteristics, the effect of initial GDP on the growth rate is negative.

Slide 5. Cross-country Growth Regression
$y_i = a_0 + a_1 \log G_i + x_i^T \beta + \epsilon_i$
- y: growth rate; G: initial GDP.
- x: country's characteristics: measures of education, policies, trade openness, saving rates, investment rate, etc.
- $H_0: a_1 < 0$.
- Barro and Lee (1994): p = 62, n = 90.
- Severe criticism of the literature for relying on ad hoc covariate selection (Levine and Renelt 1992).
- Development of a data-driven procedure for covariate selection is essential.

Slide 6. Home price prediction
- Housing market analysis based on state-level panel data can capture state-specific dynamics and variations.
- If the focus is on local levels, including only macroeconomic variables cannot capture the cross-sectional correlation among local levels.
$y_{i,t+s} = \sum_{k=1}^{p} y_{kt} \beta_{ik} + x_t^T \theta_i + \epsilon_{i,t+s}$
- x: macroeconomic variables.
- $p \approx 1000$; n < 200 for monthly sales data over ten years.
- Only a few counties' information should be useful, conditional on national factors.

Slide 7. Home price prediction
- Fan, Lv and Qi (2011): monthly repeated sales of the 352 largest counties in the US from January 2000 to December 2009 (n = 120).
[Figure: testing periods. Black: historical data; blue: OLS with national house-price appreciation only; red: penalized variable selection.]

Slide 8. Labor Economics: Wage regression
- Effect of education on future income.
- Response: log-wage, $y = E(y \mid x) + \epsilon$.
- Nonparametric sieve approximation: $E(y \mid x) = \sum_{i=1}^{p} \beta_i P_i(x) + r$, where $P_1(x), \ldots, P_p(x)$ are either polynomials or spline transformations.
- No guarantee that r is small using low-order polynomials.
- Possible oscillatory behavior associated with advanced degrees, which calls for higher-order terms.

Slide 9. Labor Economics: Wage regression
[Figure: Belloni and Chernozhukov (2011).]

Slide 10. Instrumental Selection
$y = \theta_0 + \theta_1 z + w^T \gamma + u_1$
$z = x^T \beta + w^T \delta + u_2$
- y: wage; z: education.
- Angrist and Krueger: 180 IVs.
- Two approaches in the classical literature:
  1. use the 3 leading IVs: large variance;
  2. use all 180 IVs: large bias.
- 37 Lasso-selected IVs (Belloni and Chernozhukov 2011).

Slide 11. Model setting
- Consider $y_i = x_i^T \beta_0 + \epsilon_i$, $i = 1, \ldots, n$.
- $\dim(x) = p \gg n$. Allow $p = \exp(n^{\alpha})$ for some $\alpha \in (0, 1)$.
- Assume $\beta_0$ is sparse: $\beta_0 = (\beta_{0S}, 0)$, where $\dim(\beta_{0S}) = s \ll n$.
- Accordingly, $x = (x_S, x_N)$: important and unimportant covariates.

Slide 12. Two Problems in This Talk

Slide 13. Problem I: Ultra-high dim. covariates selection
- $y = x^T \beta_0 + \epsilon$, $\beta_0 = (\beta_{0S}, 0)$.
- x may contain many endogenous components.
- How to achieve the oracle property?
  1. $\|\hat\beta_S - \beta_{0S}\| = O_p(\sqrt{s \log s / n})$.
  2. $P(\hat\beta_N = 0) \to 1$.
  3. $\hat\beta_S$ is asymptotically normal.
- Solution: penalized GMM and penalized EL.

Slide 14. Problem II: Ultra high dim. instrumental selection
- $y = x_S^T \beta_{0S} + \epsilon$
- $x_S = \Theta_0 w + v$
- $\dim(w)$ can be ultra high: $\dim(w) = O(\exp(n^{\alpha}))$, $\alpha \in (0, 1)$.
- Many instruments are weak.
- Solution: penalized LS in 2SLS.

Slide 15. Problem I: Ultra-high dimensional covariates selection with endogeneity

Slide 16. Penalized OLS
- Find $\hat\beta$ as
  $\hat\beta = \arg\min_{\beta} \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T \beta)^2 + \sum_{j=1}^{p} P_n(|\beta_j|)$.
- $P_n$: penalty function. Lasso: $P_n(|\beta_j|) = \lambda_n |\beta_j|$, where $\lambda_n \to 0$. SCAD (Fan and Li 2001), etc.
- Key assumption: $E(\epsilon \mid x_S, x_N) = 0$.
- Unimportant predictors are artificially added; it is more desirable to assume only $E(\epsilon \mid x_S) = 0$.
- In many cases, important covariates are also endogenous. Instead, $E(\epsilon \mid w) = 0$.

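The penalized OLS objective with the Lasso penalty can be minimized by cyclic coordinate descent. The following is a minimal sketch under stated assumptions, not the talk's implementation: the function names `soft_threshold` and `lasso_cd`, the iteration count, and the simulated exogenous design are all illustrative choices. Because the loss is $(1/n)\sum_i (y_i - x_i^T\beta)^2$ without a factor 1/2, the soft-threshold level is $\lambda_n / 2$.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/n) * ||y - X b||^2 + lam * sum_j |b_j| by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ms = (X ** 2).sum(axis=0) / n      # (1/n) * x_j' x_j for each column
    resid = y.astype(float).copy()         # current residual y - X beta
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]     # partial residual without coordinate j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam / 2.0) / col_ms[j]
            resid -= X[:, j] * beta[j]     # restore full residual
    return beta

# Illustrative exogenous design (an assumption, not the talk's simulation).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta0 = np.zeros(p)
beta0[:4] = [2.5, 4.0, 7.0, 1.5]
y = X @ beta0 + rng.standard_normal(n)
beta_hat = lasso_cd(X, y, lam=0.2)
print(np.round(beta_hat, 2))
```

With exogenous regressors the large coefficients are recovered up to the usual Lasso shrinkage; the inconsistency discussed next arises only once some regressors are correlated with the error.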
Slide 17. A simulation example
- $\beta_0 = (2.5, 4, 7, 1.5, 0, \ldots, 0)$, p = 10, n = 100.
- $(x_1, \ldots, x_4) = (z_1, \ldots, z_4)$; $x_j = (z_j + 2)(\epsilon + 1)$ for $j = 5, \ldots, p$, where $z \sim N_p(0, \Sigma)$, $\Sigma_{ij} = 0.5^{|i-j|}$.
Table: Penalized OLS + SCAD
- Columns: λ = 0.2, λ = 0.7, λ = 1.2, λ = 1.7
- Rows: TP-Mean, FP-Mean, FP-Median
- Parenthesized entries: (0.901) (0.799) (0.912) (0.83)

Slide 18. Inconsistency of POLS
Theorem 1. Suppose $|E(x_l \epsilon)| \gg 0$ for some $x_l$. If $\tilde\beta = (\tilde\beta_S^T, \tilde\beta_N^T)^T$ is the POLS estimator, then either
$\|\tilde\beta_S - \beta_{0S}\| \not\to_p 0$, or $\limsup_{n} P(\tilde\beta_N = 0) < 1$.
- The inconsistency of POLS comes from the fact that, when $x_l$ is endogenous, the moment condition $E[(y - x^T \beta_0) x_l] = 0$ is misspecified.

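The misspecified moment can be checked numerically in the simulation design above. The sketch below assumes that $\epsilon \sim N(0, 1)$ is independent of z and that the endogenous covariates are $x_5, \ldots, x_p$; under those assumptions $E(x_j \epsilon) = E(z_j + 2)\,E(\epsilon^2 + \epsilon) = 2$ for the endogenous covariates and 0 for the exogenous ones. The large sample size is chosen only so that the sample moments sit close to their population values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100_000, 10

# z ~ N_p(0, Sigma) with Sigma_ij = 0.5^{|i-j|}
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
eps = rng.standard_normal(n)               # assumed N(0, 1), independent of z

# First four covariates exogenous, the rest endogenous (assumed j = 5, ..., p).
x = z.copy()
x[:, 4:] = (z[:, 4:] + 2.0) * (eps[:, None] + 1.0)

beta0 = np.zeros(p)
beta0[:4] = [2.5, 4.0, 7.0, 1.5]
y = x @ beta0 + eps

# Sample analogue of E[(y - x' beta0) x_l] for each l.
moments = x.T @ (y - x @ beta0) / n
print(np.round(moments, 2))
```

The first four sample moments are close to zero while the remaining ones are close to 2, which is why least-squares moment conditions built on all of x cannot hold at $\beta_0$.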
Slide 19. Ultra-high dim. covariates selection with endogeneity
- Consider the more general model $E[g(y, x^T \beta_0) \mid w] = 0$, $\beta_0 = (\beta_{0S}, 0)$.
  - linear model: $g = y - x^T \beta_0$
  - logit model: $g = y - \exp(x^T \beta_0) / (1 + \exp(x^T \beta_0))$
  - probit model: $g = y - \Phi(x^T \beta_0)$
- Both important and unimportant covariates are possibly endogenous.
- w: a set of valid instrumental variables.

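The three residual functions g listed above can be written directly as functions of the response y and the linear index $x^T \beta$. This is a small illustrative sketch; the function names are hypothetical, and the normal cdf is built from `math.erf` to keep the example free of extra dependencies.

```python
import math
import numpy as np

def g_linear(y, index):
    """Linear model: g = y - x'beta."""
    return y - index

def g_logit(y, index):
    """Logit model: g = y - exp(x'beta) / (1 + exp(x'beta))."""
    return y - 1.0 / (1.0 + np.exp(-np.asarray(index, dtype=float)))

def g_probit(y, index):
    """Probit model: g = y - Phi(x'beta), with Phi the standard normal cdf."""
    t = np.asarray(index, dtype=float) / math.sqrt(2.0)
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(t))
    return y - Phi

# At a zero index both binary-response models predict probability 1/2.
print(g_linear(3.0, 1.0), g_logit(1.0, 0.0), g_probit(1.0, 0.0))
```

Each function returns the quantity whose conditional mean given the instruments w is zero at $\beta_0$, so sample averages of `g * w` are the natural building blocks for the moment-based criteria that follow.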
Slide 20. Penalized GMM
- Let v be p-dimensional technical instruments. If $\dim(w) \ge \dim(x)$, $v \subset w$.
- $v = (f_1(w), \ldots, f_p(w))$.
- For fixed $\beta \in \mathbb{R}^p$, let $v(\beta)$ contain only the components $\{v_l : \beta_l \ne 0\}$; e.g., p = 3, $\beta = (1, 0, 2)$, then $v(\beta) = (v_1, v_3)$.
- Define
  $L_{GMM}(\beta) = \left[\frac{1}{n} \sum_{i=1}^{n} g(y_i, x_i^T \beta) v_i(\beta)\right]^T W \left[\frac{1}{n} \sum_{i=1}^{n} g(y_i, x_i^T \beta) v_i(\beta)\right]$,
  $Q_{GMM}(\beta) = L_{GMM}(\beta) + \sum_{i=1}^{p} P_n(|\beta_i|)$.

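To make the role of $v(\beta)$ concrete, the sketch below evaluates $L_{GMM}$ for the linear model $g = y - x^T \beta$. It is an illustration under assumptions that the slides do not specify: the weight matrix defaults to the identity and is restricted to the rows and columns of the active instruments, and an all-zero $\beta$ (empty instrument set) is given the value 0. The name `l_gmm` is hypothetical.

```python
import numpy as np

def l_gmm(beta, y, X, V, W=None):
    """GMM loss using only the instruments v_l with beta_l != 0 (linear g)."""
    active = np.flatnonzero(beta)          # indices l with beta_l != 0
    if active.size == 0:
        return 0.0                         # empty moment vector (assumed convention)
    g = y - X @ beta                       # residual function for the linear model
    m = V[:, active].T @ g / len(y)        # (1/n) * sum_i g_i * v_i(beta)
    if W is None:
        W_active = np.eye(active.size)     # identity weights (assumption)
    else:
        W_active = W[np.ix_(active, active)]
    return float(m @ W_active @ m)

# Example from the slide: p = 3, beta = (1, 0, 2) selects (v_1, v_3).
beta = np.array([1.0, 0.0, 2.0])
print(np.flatnonzero(beta) + 1)            # 1-based indices of the active instruments

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ beta                               # noiseless data, instruments V = X
print(l_gmm(beta, y, X, X))
```

Because the instrument set changes with the support of $\beta$, the loss jumps whenever a coordinate crosses zero, which is the discontinuity that the smoothing step later in the talk addresses.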
Slide 21. $P_n$: penalty function
1. $P_n$ is concave, increasing on $[0, \infty)$, and differentiable.
2. $P_n'(0^+) > n^{-1/2}$; $P_n'(t) = o(1)$ when $t > c > 0$.
3. $\max_{\beta_{0j} \ne 0} P_n(|\beta_{0j}|) = o(1)$.
Examples:
1. Lasso (Tibshirani 1996): $P_n(t) = \lambda_n t$.
2. SCAD (Fan and Li 2001): $P_n(t) = \int_0^t \lambda_n \left[ I(u \le \lambda_n) + \frac{(a\lambda_n - u)_+}{(a-1)\lambda_n} I(u > \lambda_n) \right] du$.
3. MCP (Zhang 2009): $P_n(t) = \frac{1}{a} \int_0^t (a\lambda_n - u)_+ \, du$.
4. Hard thresholding (Antoniadis 1996): MCP with a = 1.

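The SCAD and MCP penalties are easiest to work with through their derivatives. The sketch below implements the standard forms, assuming $a > 1$ for SCAD (the default a = 3.7 is the value suggested by Fan and Li 2001) and $a > 0$ for MCP; the function names and the choice of a = 3 for MCP are illustrative.

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """SCAD derivative: lam for t <= lam, (a*lam - t)_+ / (a - 1) for t > lam."""
    t = np.asarray(t, dtype=float)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def scad_penalty(t, lam, a=3.7):
    """Closed form of the integral of scad_deriv from 0 to t (t >= 0)."""
    t = np.asarray(t, dtype=float)
    linear = lam * t
    quad = (2.0 * a * lam * t - t ** 2 - lam ** 2) / (2.0 * (a - 1.0))
    const = (a + 1.0) * lam ** 2 / 2.0
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

def mcp_deriv(t, lam, a=3.0):
    """MCP derivative: (a*lam - t)_+ / a."""
    t = np.asarray(t, dtype=float)
    return np.maximum(a * lam - t, 0.0) / a

lam = 0.5
for t in (0.1, 1.0, 5.0):
    print(t, float(scad_deriv(t, lam)), float(scad_penalty(t, lam)), float(mcp_deriv(t, lam)))
```

Both derivatives equal $\lambda_n$ near zero, like the Lasso, but vanish for large arguments, so large coefficients are not shrunk. This is the property behind the condition $P_n'(t) = o(1)$ for $t > c > 0$.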
Slide 22. [Figure: SCAD (Fan and Li 2001) and Lasso (Tibshirani 1996).]

Slide 23. Oracle properties of PGMM
Theorem 1. Suppose:
- either $E(g(y, x^T \beta_0) \mid x_S) = 0$ or $E(g(y, x^T \beta_0) \mid w) = 0$;
- $0 < c < \lambda_{\min}(E x_S v(\beta_{0S})^T) \le \lambda_{\max}(E x_S v(\beta_{0S})^T) < M$;
- $s^3 \log s = o(n)$.
Under regularity conditions, there exists a strictly local minimizer of $Q_{GMM}$ such that:
1. $\|\hat\beta_S - \beta_{0S}\| = O_p\left(\sqrt{s \log s / n} + \sqrt{s}\, P_n'(\min |\beta_{0S}|)\right)$.
2. $P(\hat\beta_N = 0) \to 1$.
3. $\hat\beta_S$ is asymptotically normal.

Slide 24. Penalized empirical likelihood
$L_{EL}(\beta) = \max_{\lambda \in \mathbb{R}^{\|\beta\|_0}} \frac{1}{n} \sum_{i=1}^{n} \log\{1 + \lambda^T [(y_i - x_i^T \beta) v_i(\beta)]\}$. (2.1)
$Q_{EL}(\beta) = L_{EL}(\beta) + \sum_{j=1}^{p} P_n(|\beta_j|)$. (2.2)
Theorem 2. If $s^4 \log s = o(n)$, there exists a strictly local minimizer of $Q_{EL}$ such that:
1. $\|\hat\beta_S - \beta_{0S}\| = O_p\left(\sqrt{s \log s / n} + \sqrt{s}\, P_n'(\min |\beta_{0S}|)\right)$.
2. $P(\hat\beta_N = 0) \to 1$.
3. $\hat\beta_S$ is asymptotically normal.

Slide 25. Problem II: Ultra-high dimensional instrumental selection

Slide 26. Ultra high dim. instrumental selection
- Suppose the oracle property is achieved; then, w.p.a.1, we identify $E[g(y, x_S^T \beta_{0S}) \mid w] = 0$.
- Optimal IV: $A(w) = D(w)^T \Omega(w)^{-1}$ (Newey 01), where
  $D(w) = E\left(\frac{\partial g(\beta_{0S})}{\partial \beta_S} \mid w\right)$, $\Omega(w) = E(g(y, x^T \beta_{0S}) g(y, x^T \beta_{0S})^T \mid w)$.
- $\Omega(w)$: homoskedasticity.
- $\dim(w)$ can be ultra high.
- Linear model: $y = x_S^T \beta_{0S} + \epsilon$, $x_S = \Theta_0 w + v$, so $D(w) = \Theta_0 w$.
- But many instruments are weak. 2SLS with many weak IVs is severely biased.

Slide 27. Linear model
- Methods based on MSE (Donald & Newey 01, Kuersteiner & Okui 10):
  - require $\dim(w)$ to be at most n;
  - require a natural ordering of the IVs;
  - in general, computationally infeasible: NP-hard.
- Proposed method: in the first stage,
  $\hat\theta_l = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} (x_{Sl,i} - w_i^T \theta)^2 + \sum_{j=1}^{p} P_n(|\theta_j|)$,
  $\hat x_S = \hat\Theta w$, $\hat\Theta = (\hat\theta_1, \ldots, \hat\theta_s)^T$.
- We allow $\dim(w) = o(\exp(T))$.
- LASSO: $\|\hat\theta_l - \theta_{0l}\| = O_p\left(\sqrt{s_1 \log s_1 / n} + \sqrt{s_1}\, \lambda_n\right)$.
- SCAD: $\|\hat\theta_l - \theta_{0l}\| = O_p\left(\sqrt{s_1 \log s_1 / n}\right)$.

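The two-stage idea can be sketched with a single endogenous regressor: select instruments by a penalized first-stage regression, then use the fitted values as the instrument in the second stage. Everything below is an illustrative assumption rather than the talk's procedure: the Lasso penalty stands in for the more general $P_n$, the tuning value `lam=0.4` is picked by hand, and the simulated design has 3 strong instruments out of 50.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/n) * ||y - X b||^2 + lam * sum_j |b_j| by coordinate descent."""
    n, p = X.shape
    theta = np.zeros(p)
    col_ms = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * theta[j]
            rho = X[:, j] @ resid / n
            theta[j] = soft_threshold(rho, lam / 2.0) / col_ms[j]
            resid -= X[:, j] * theta[j]
    return theta

rng = np.random.default_rng(2)
n, q = 500, 50
w = rng.standard_normal((n, q))            # candidate instruments
theta0 = np.zeros(q)
theta0[:3] = 1.0                           # only the first three are relevant
v = rng.standard_normal(n)
u = rng.standard_normal(n)
x = w @ theta0 + v                         # first-stage equation
eps = 0.8 * v + 0.6 * u                    # error correlated with x (endogeneity)
y = 2.0 * x + eps                          # structural equation, true slope 2

theta_hat = lasso_cd(w, x, lam=0.4)        # penalized first stage
x_hat = w @ theta_hat                      # fitted values used as the instrument
beta_2sls = (x_hat @ y) / (x_hat @ x)      # IV estimate, invariant to scaling of x_hat
beta_ols = (x @ y) / (x @ x)               # OLS for comparison (biased upward here)
print(np.flatnonzero(theta_hat), round(beta_2sls, 3), round(beta_ols, 3))
```

Using the ratio form `(x_hat @ y) / (x_hat @ x)` means the Lasso shrinkage of the first-stage coefficients does not rescale the second-stage estimate; in this design the population OLS slope is 2.2 because $\mathrm{Cov}(x, \epsilon) / \mathrm{Var}(x) = 0.8 / 4$.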
Slide 28. Recent literature
- Recent literature proposed methods based on the $\ell_1$ penalty (Belloni et al. 10, Garcia 11, Caner & Fan 11): computationally efficient.
- Lasso: the choice of $\lambda_n$ is very restrictive.
  - $\lambda_n$ large: misses many important IVs.
  - $\lambda_n$ small: includes too many weak IVs and yields a complicated model.
- Adaptive lasso: $P_n(|\beta_j|) = |\tilde\beta_j|^{-1} \lambda_n |\beta_j|$.
  - Requires an initial estimator $\tilde\beta$, which is hard to obtain when w is ultra high dimensional.
  - The iterative algorithm may permanently remove important IVs.
- The proposed method allows more adaptive penalties.

Slide 29. Nonlinear model
- Optimal IV: $D(w) = E\left(\frac{\partial g(\beta_{0S})}{\partial \beta_S} \mid w\right)$.
- Estimate based on sieve approximation (Newey 01):
  $D(w) = \sum_{i=1}^{p_1} \theta_i f_i(w) + r$, with $p_1$ at most n.
- No guarantee that r is small if $p_1$ is small.
- Goal: allow for higher-order polynomials.

Slide 30. Ultra-high dim. sieve approximation
Assumption:
1. There is a large set of technical IVs $v = (f_1(w), \ldots, f_{p_1}(w))^T$ (possibly with p larger than n) such that
   $D(w) = \Theta_0 v + a(w)$, $\max_{l \le s} \frac{1}{n} \sum_{i=1}^{n} a_l(w_i)^2 = O_p(c_n^2)$.
2. $\max_{l \le s,\, i \notin T_l} |\theta_{0l,i}| < n^{-\alpha_1}$, $\min_{l \le s,\, i \in T_l} |\theta_{0l,i}| = h_n > n^{-\alpha_2}$, and $\max_{l \le s} \#\{i : i \in T_l\} = s_1 = o(n)$.
Penalized estimator:
$\hat\theta_l = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \left(\frac{\partial g(y_i, x_{Si}^T \hat\beta_S)}{\partial \beta_{Sl}} - v_i^T \theta\right)^2 + \sum_{j=1}^{p} P_n(|\theta_j|)$,
$\hat D(w) = \hat\Theta v$, $\hat\Theta = (\hat\theta_1, \ldots, \hat\theta_s)^T$.

Slide 31. Theorem 2
There exists a strictly local minimizer $\hat\theta_l = (\hat\theta_{lS}, \hat\theta_{lN})$ such that
$\|\hat\theta_{lS} - \theta_{0l,S}\| = O_p\left(\sqrt{\frac{s \log s}{n}} + \sqrt{\frac{s_1 \log s_1}{n}} + \sqrt{s_1}\, n^{-\alpha_1} + \sqrt{s_1}\, c_n + \sqrt{s_1}\, P_n'(h_n)\right)$,
$\lim_{n \to \infty} P(\hat\theta_{lN} = 0) = 1$,
$\frac{1}{n} \sum_{i=1}^{n} \|\hat D_l(W_i) - D_l(W_i)\|^2 = O_p\left(\frac{s_1 s \log s}{n} + \frac{s_1^2 \log s_1}{n} + s_1^2 n^{-2\alpha_1} + s_1^2 c_n^2 + s_1^2 P_n'(h_n)^2\right)$.

Slide 32. Implementation and Simulation

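Slide 33. Smoothing
- $L_{GMM}$ is not continuous:
  $L_{GMM}(\beta) = \left[\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T \beta) x_i(\beta)\right]^T W \left[\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T \beta) x_i(\beta)\right]$
  $= \sum_{j=1}^{p} w_j \left[\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T \beta) x_{ij} I(\beta_j \ne 0)\right]^T \left[\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T \beta) x_{ij} I(\beta_j \ne 0)\right]$.
- Likewise,
  $L_{EL}(\beta) = \max_{\lambda} \frac{1}{n} \sum_{i=1}^{n} \log\left(1 + \lambda^T (y_i - x_i^T \beta) x_i(\beta)\right)$
  $= \max_{\lambda_j \in \mathbb{R}^k,\, j = 1, \ldots, p} \frac{1}{n} \sum_{i=1}^{n} \log\left(1 + \sum_{j=1}^{p} \lambda_j^T (y_i - x_i^T \beta) x_{ij} I(\beta_j \ne 0)\right)$.
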
Slide 34. Smoothing
- Replace $I(\beta_j \ne 0)$ with $K(\beta_j^2 / \sigma_n)$, $\sigma_n \to 0$.
- $K(0) = 0$, $K(+\infty) = 1$, $\lim_{t \to \infty} K'(t) t = 0$, $\lim_{t \to \infty} K''(t) t < \infty$, $K''(\cdot) < M$.
- The kernel K is similar to a cdf, as in smoothed maximum score (Horowitz 1992).
- Example: $K(t) = 2(\Phi(t) - 0.5)$, which satisfies $K(0) = 0$ and $K(+\infty) = 1$.
Theorem 3. Under regularity conditions on $P_n$ and $K$, and those of Theorems 1-4, smoothed PGMM and PEL achieve the oracle properties.

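The smoothing step can be illustrated with a few lines of code. The sketch takes $K(t) = 2(\Phi(t) - 0.5)$, the scaling of the normal cdf that satisfies $K(0) = 0$ and $K(+\infty) = 1$; this specific scaling, the function names, and the bandwidth values are assumptions made for illustration.

```python
import math

def K(t):
    """Smoothing kernel K(t) = 2 * (Phi(t) - 0.5), equal to erf(t / sqrt(2))."""
    Phi = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return 2.0 * (Phi - 0.5)

def smooth_indicator(beta_j, sigma_n):
    """Continuous surrogate for the indicator I(beta_j != 0)."""
    return K(beta_j ** 2 / sigma_n)

# As sigma_n shrinks, the surrogate approaches the indicator for any fixed nonzero beta_j.
for sigma_n in (1.0, 0.1, 0.01):
    print(sigma_n, [round(smooth_indicator(b, sigma_n), 4) for b in (0.0, 0.05, 0.5, 1.5)])
```

Because the surrogate is differentiable in $\beta_j$, the smoothed criterion can be handed to gradient-based optimizers, whereas the original loss jumps whenever a coefficient enters or leaves the active set.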
Slide 35. Simulation
- $E(\epsilon \mid x_S) = 0$, without knowing $x_S$.
- $y = x^T \beta_0 + \epsilon$, $\beta_0 = (2.5, 4, 7, 1.5, 0, \ldots, 0)$, $\epsilon \sim N(0, 1)$.
- $z \sim N_p(0, \Sigma)$, $\Sigma_{ij} = 0.5^{|i-j|}$.
- $(x_1, \ldots, x_4) = (z_1, \ldots, z_4)$; $x_j = (z_j + 2)(\epsilon + 1)$ for $j = 5, \ldots, p$.

Slide 36. Table: POLS and PGMM when p = 50, n = 200 (parenthesized entries)

                   POLS                                   PGMM
λ                  0.05     0.1      0.5      1           0.05     0.1      0.2      0.4
MSE_S              (0.053)  (0.043)  (0.301)  (0.329)     (0.094)  (0.069)  (0.076)  (0.245)
MSE_N              (0.035)  (0.016)  (0.016)  (0.019)     (0.010)  (0)      (0.009)  (0.014)
TP-Mean / Median   (0)      (0)      (0.385)  (0.504)     (0)      (0)      (0)      (0.503)
FP-Mean / Median   (2.902)  (3.045)  (3.334)  (1.557)     (0.337)  (0)      (0.141)  (0.569)

Slide 37. Sensitivity to minimal signal
- $\beta_4 = 1.5$; $\beta_4 = 0.5$.
Table: Penalized GMM when p = 20, $\beta_4 = 0.5$ (parenthesized entries, one column per value of λ)

MSE_S              (0.090)  (0.117)  (0.102)  (0.117)  (0.083)  (0.158)
TP-Mean / Median   (0.197)  (0.273)  (0.239)  (0.288)  (0.197)  (0.435)
FP-Mean / Median   (1.545)  (2.447)  (2.139)  (0.141)  (0)      (0)

Slide 38. Conclusion
- Many applications in economics contain ultra-high dimensional regressors.
- Be careful with POLS for variable selection.
- PGMM/PEL allow endogeneity in ultra-high dimensional estimation and selection.
- Allow ultra-high dimensional instruments for 2SLS.
- Allow ultra-high dimensional sieve approximation for the optimal IV.

Slide 39. Some references
- HANSEN, L. (1982). Large sample properties of generalized method of moments estimators. Econometrica.
- OWEN, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75.
- ANDREWS, D. (1999). Consistent moment selection procedures for generalized method of moments estimation. Econometrica.
- CANER, M. (2009). Lasso-type GMM estimator. Econometric Theory.
- BELLONI, A. and CHERNOZHUKOV, V. (2011). High-Dimensional Sparse Econometric Models, an Introduction. In Ill-Posed Inverse Problems and High-Dimensional Estimation, Springer Lecture Notes in Statistics.
- SALA-I-MARTIN, X. (1997). I Just Ran Two Million Regressions. The American Economic Review, 87.
More informationLinear Models and Estimation by Least Squares
Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:
More informationSpring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle
More informationSPARSE MODELS AND METHODS FOR OPTIMAL INSTRUMENTS WITH AN APPLICATION TO EMINENT DOMAIN
SPARSE MODELS AND METHODS FOR OPTIMAL INSTRUMENTS WITH AN APPLICATION TO EMINENT DOMAIN A. BELLONI, D. CHEN, V. CHERNOZHUKOV, AND C. HANSEN Abstract. We develop results for the use of LASSO and Post-LASSO
More informationMaximum Likelihood Estimation for Factor Analysis. Yuan Liao
Maximum Likelihood Estimation for Factor Analysis Yuan Liao University of Maryland Joint worth Jushan Bai June 15, 2013 High Dim. Factor Model y it = λ i f t + u it i N,t f t : common factor λ i : loading
More informationEconometrics of Big Data: Large p Case
Econometrics of Big Data: Large p Case (p much larger than n) Victor Chernozhukov MIT, October 2014 VC Econometrics of Big Data: Large p Case 1 Outline Plan 1. High-Dimensional Sparse Framework The Framework
More informationHigh Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data
High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More informationBirkbeck Working Papers in Economics & Finance
ISSN 1745-8587 Birkbeck Working Papers in Economics & Finance Department of Economics, Mathematics and Statistics BWPEF 1809 A Note on Specification Testing in Some Structural Regression Models Walter
More informationProblem Set 7. Ideally, these would be the same observations left out when you
Business 4903 Instructor: Christian Hansen Problem Set 7. Use the data in MROZ.raw to answer this question. The data consist of 753 observations. Before answering any of parts a.-b., remove 253 observations
More informationVariable Screening in High-dimensional Feature Space
ICCM 2007 Vol. II 735 747 Variable Screening in High-dimensional Feature Space Jianqing Fan Abstract Variable selection in high-dimensional space characterizes many contemporary problems in scientific
More informationInstrumental Variables and the Problem of Endogeneity
Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =
More informationEconomics 582 Random Effects Estimation
Economics 582 Random Effects Estimation Eric Zivot May 29, 2013 Random Effects Model Hence, the model can be re-written as = x 0 β + + [x ] = 0 (no endogeneity) [ x ] = = + x 0 β + + [x ] = 0 [ x ] = 0
More informationModels of Qualitative Binary Response
Models of Qualitative Binary Response Probit and Logit Models October 6, 2015 Dependent Variable as a Binary Outcome Suppose we observe an economic choice that is a binary signal. The focus on the course
More informationMore on Specification and Data Issues
More on Specification and Data Issues Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Specification and Data Issues 1 / 35 Functional Form Misspecification Functional
More informationRegression #3: Properties of OLS Estimator
Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with
More informationLinear Regression. Junhui Qian. October 27, 2014
Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency
More informationEconomics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models
University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe
More informationNon-linear panel data modeling
Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1
More informationAdvanced Econometrics
Advanced Econometrics Dr. Andrea Beccarini Center for Quantitative Economics Winter 2013/2014 Andrea Beccarini (CQE) Econometrics Winter 2013/2014 1 / 156 General information Aims and prerequisites Objective:
More informationConsistent high-dimensional Bayesian variable selection via penalized credible regions
Consistent high-dimensional Bayesian variable selection via penalized credible regions Howard Bondell bondell@stat.ncsu.edu Joint work with Brian Reich Howard Bondell p. 1 Outline High-Dimensional Variable
More informationEconometrics Honor s Exam Review Session. Spring 2012 Eunice Han
Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity
More informationLASSO METHODS FOR GAUSSIAN INSTRUMENTAL VARIABLES MODELS. 1. Introduction
LASSO METHODS FOR GAUSSIAN INSTRUMENTAL VARIABLES MODELS A. BELLONI, V. CHERNOZHUKOV, AND C. HANSEN Abstract. In this note, we propose to use sparse methods (e.g. LASSO, Post-LASSO, LASSO, and Post- LASSO)
More information1 Motivation for Instrumental Variable (IV) Regression
ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data
More information08 Endogenous Right-Hand-Side Variables. Andrius Buteikis,
08 Endogenous Right-Hand-Side Variables Andrius Buteikis, andrius.buteikis@mif.vu.lt http://web.vu.lt/mif/a.buteikis/ Introduction Consider a simple regression model: Y t = α + βx t + u t Under the classical
More informationInstrumental Variables, Simultaneous and Systems of Equations
Chapter 6 Instrumental Variables, Simultaneous and Systems of Equations 61 Instrumental variables In the linear regression model y i = x iβ + ε i (61) we have been assuming that bf x i and ε i are uncorrelated
More informationRamsey Cass Koopmans Model (1): Setup of the Model and Competitive Equilibrium Path
Ramsey Cass Koopmans Model (1): Setup of the Model and Competitive Equilibrium Path Ryoji Ohdoi Dept. of Industrial Engineering and Economics, Tokyo Tech This lecture note is mainly based on Ch. 8 of Acemoglu
More informationFall, 2007 Nonlinear Econometrics. Theory: Consistency for Extremum Estimators. Modeling: Probit, Logit, and Other Links.
14.385 Fall, 2007 Nonlinear Econometrics Lecture 2. Theory: Consistency for Extremum Estimators Modeling: Probit, Logit, and Other Links. 1 Example: Binary Choice Models. The latent outcome is defined
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationQuantile Processes for Semi and Nonparametric Regression
Quantile Processes for Semi and Nonparametric Regression Shih-Kang Chao Department of Statistics Purdue University IMS-APRM 2016 A joint work with Stanislav Volgushev and Guang Cheng Quantile Response
More informationEconometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018
Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate
More informationECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II
ECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II Ragnar Nymoen Department of Economics University of Oslo 9 October 2018 The reference to this lecture is:
More informationNonparametric Econometrics
Applied Microeconometrics with Stata Nonparametric Econometrics Spring Term 2011 1 / 37 Contents Introduction The histogram estimator The kernel density estimator Nonparametric regression estimators Semi-
More information