Economics 620, Lecture 18: Nonlinear Models


Nicholas M. Kiefer, Cornell University

The basic point is that smooth nonlinear models look like linear models locally. Models linear in parameters are no problem even if they are nonlinear in variables. For example,
$$\varphi(y) = \beta_0 + \beta_1 \varphi_1(x_1) + \beta_2 \varphi_2(x_2) + \dots,$$
with $\varphi$ and the $\varphi_j$ known functions of observable regressors, is still a linear regression model. However,
$$y = \beta_1 + \beta_2 e^{\beta_3 x} + \varepsilon$$
is nonlinear (arising, for example, as a solution to a differential equation).

Notation:
$$y_i = f(X_i, \theta) + \varepsilon_i = f_i(\theta) + \varepsilon_i, \qquad i = 1, 2, \dots, N.$$
Stacking up yields $y = f(\theta) + \varepsilon$, where $y$ is $N \times 1$, $f(\theta)$ is $N \times 1$, $\theta$ is $K \times 1$ and $\varepsilon$ is $N \times 1$. For $X_i$ fixed, $f : \mathbb{R}^K \to \mathbb{R}^N$.
$$\frac{\partial f}{\partial \theta'} = F(\theta) = \begin{bmatrix} \dfrac{\partial f_1}{\partial \theta_1} & \cdots & \dfrac{\partial f_1}{\partial \theta_K} \\ \vdots & & \vdots \\ \dfrac{\partial f_N}{\partial \theta_1} & \cdots & \dfrac{\partial f_N}{\partial \theta_K} \end{bmatrix}$$
Obviously $F(\theta)$ is $N \times K$. Assume $E\varepsilon = 0$ and $V\varepsilon = \sigma^2 I$.
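A minimal numerical sketch of this setup, using the exponential example above; the names `f` and `F` and the analytic derivatives are illustrative, not part of the lecture:

```python
import numpy as np

# Running example: f_i(theta) = b1 + b2 * exp(b3 * x_i).
def f(theta, x):
    b1, b2, b3 = theta
    return b1 + b2 * np.exp(b3 * x)

# The N x K Jacobian F(theta) = df/dtheta', computed analytically:
# columns are df/db1 = 1, df/db2 = exp(b3 x), df/db3 = b2 x exp(b3 x).
def F(theta, x):
    b1, b2, b3 = theta
    e = np.exp(b3 * x)
    return np.column_stack([np.ones_like(x), e, b2 * x * e])
```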

The nonlinear least squares estimator minimizes
$$S(\theta) = \sum_{i=1}^{N} (y_i - f_i(\theta))^2 = (y - f(\theta))'(y - f(\theta)).$$
Differentiation yields
$$\frac{\partial}{\partial \theta'} S(\theta) = \frac{\partial}{\partial \theta'} (y - f(\theta))'(y - f(\theta)) = -2(y - f(\theta))' F(\theta).$$
Thus, the nonlinear least squares (NLS) estimator $\hat\theta$ satisfies (are these equations familiar?)
$$F(\hat\theta)'(y - f(\hat\theta)) = 0.$$
This is like the property $X'e = 0$ in the LS method.

Computing $\hat\theta$: The Gauss-Newton method is to use a first-order expansion of $f(\theta)$ around a trial value $\theta_T$ in $S(\theta)$, giving
$$S_T(\theta) = (y - f(\theta_T) - F(\theta_T)(\theta - \theta_T))'(y - f(\theta_T) - F(\theta_T)(\theta - \theta_T)).$$
Minimizing $S_T$ in $\theta$ gives
$$\theta_M = \theta_T + [F(\theta_T)'F(\theta_T)]^{-1} F(\theta_T)'(y - f(\theta_T)).$$
(Exercise: show $\theta_M$ is the minimizer.) We know $S_T(\theta_M) \le S_T(\theta_T)$. Is it true that $S(\theta_M) \le S(\theta_T)$?
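In code, one Gauss-Newton update is a single linear solve (a sketch continuing the example above):

```python
def gauss_newton_step(theta_T, x, y):
    """One Gauss-Newton update: theta_M = theta_T + (F'F)^{-1} F'(y - f(theta_T))."""
    r = y - f(theta_T, x)              # residuals at the trial value
    FT = F(theta_T, x)
    return theta_T + np.linalg.solve(FT.T @ FT, FT.T @ r)
```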


The method is to use $\theta_M$ as the new trial value, expand again, minimize, and iterate. But there can be problems. For example, it is possible that $S(\theta_M) > S(\theta_T)$ but that there is a $\theta^*$ between $\theta_T$ and $\theta_M$, $\theta^* = \theta_T + \lambda(\theta_M - \theta_T)$ for some $\lambda < 1$, with $S(\theta^*) \le S(\theta_T)$. This suggests trying a decreasing sequence of $\lambda$s in $[0, 1]$, leading to the modified Gauss-Newton method:
(1) Start with $\theta_T$ and compute $D_T = [F(\theta_T)'F(\theta_T)]^{-1} F(\theta_T)'(y - f(\theta_T))$.
(2) Find $\lambda$ such that $S(\theta_T + \lambda D_T) < S(\theta_T)$.
(3) Set $\theta^* = \theta_T + \lambda D_T$ and go to (1), using $\theta^*$ as the new value for $\theta_T$.
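A sketch of this iteration, continuing the example above; the step-halving rule for $\lambda$, the tolerances, and the iteration cap are illustrative choices:

```python
def S(theta, x, y):
    """Sum of squares S(theta) = (y - f(theta))'(y - f(theta))."""
    r = y - f(theta, x)
    return r @ r

def modified_gauss_newton(theta, x, y, tol=1e-8, max_iter=200):
    """Modified Gauss-Newton: shrink the step by halving lambda until S falls."""
    for _ in range(max_iter):
        r = y - f(theta, x)
        FT = F(theta, x)
        D = np.linalg.solve(FT.T @ FT, FT.T @ r)   # full Gauss-Newton direction
        lam = 1.0
        while S(theta + lam * D, x, y) >= S(theta, x, y) and lam > 1e-12:
            lam *= 0.5                             # decreasing sequence of lambdas
        theta_new = theta + lam * D
        if np.max(np.abs(theta_new - theta)) < tol:
            return theta_new
        theta = theta_new
    return theta
```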

Stop when the changes in parameter values and in $S$ between iterations are small. Good practice is to try several different starting values.

Estimation of $\sigma^2 = V(\varepsilon)$: why are there two possibilities?
$$s^2 = \frac{(y - f(\hat\theta))'(y - f(\hat\theta))}{N - K}, \qquad \hat\sigma^2 = \frac{(y - f(\hat\theta))'(y - f(\hat\theta))}{N}.$$
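A usage sketch with simulated data, purely for illustration (the "true" values, noise level, and starting point are made up), showing both variance estimators:

```python
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)
theta0 = np.array([1.0, 2.0, -0.5])            # illustrative "true" values
y = f(theta0, x) + 0.1 * rng.standard_normal(x.size)

theta_hat = modified_gauss_newton(np.array([0.5, 1.0, -0.1]), x, y)
ssr = S(theta_hat, x, y)
s2 = ssr / (len(y) - len(theta_hat))           # degrees-of-freedom correction
sigma2_hat = ssr / len(y)                      # uncorrected (ML-style) estimator
```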

Note the simplification possible when the model is linear in some of the parameters, as in the example
$$y = \beta_1 + \beta_2 \exp(\beta_3 x) + \varepsilon.$$
Here, given $\beta_3$, the other parameters can be estimated by OLS. Thus the sum-of-squares function $S$ can be concentrated, that is, written as a function of one parameter alone. The nonlinear optimization problem is then 1-dimensional, not 3-dimensional. This is an important trick. We used the same device to estimate models with autocorrelation: the differenced model could be estimated by OLS conditional on $\rho$.
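A sketch of the concentration trick, continuing the example above and using scipy's scalar minimizer; the search bounds $(-2, 2)$ are an illustrative choice, not part of the lecture:

```python
from scipy.optimize import minimize_scalar

def concentrated_S(b3, x, y):
    """Concentrated sum of squares: given b3, (b1, b2) enter linearly and are
    recovered by OLS, leaving a one-dimensional problem in b3."""
    X = np.column_stack([np.ones_like(x), np.exp(b3 * x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

res = minimize_scalar(concentrated_S, bounds=(-2.0, 2.0), args=(x, y), method="bounded")
b3_hat = res.x
```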

Inference: We can show that
$$\hat\theta = \theta_0 + (F(\theta_0)'F(\theta_0))^{-1} F(\theta_0)'\varepsilon + r,$$
where $\theta_0$ is the true parameter value and $\operatorname{plim} N^{1/2} r = 0$ (show). So $r$ can be ignored in calculating the asymptotic distribution of $\hat\theta$. This is just like the expression in the linear model, decomposing $\hat\theta$ into the true value plus sampling error. Thus the asymptotic distribution of $N^{1/2}(\hat\theta - \theta_0)$ is
$$N\left(0,\ \sigma^2 \left(\frac{F(\theta_0)'F(\theta_0)}{N}\right)^{-1}\right),$$
so the approximate distribution of $\hat\theta$ becomes $N(\theta_0,\ \sigma^2 (F(\theta_0)'F(\theta_0))^{-1})$.
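In the running sketch, the approximate covariance matrix and standard errors follow directly, using the $s^2$ computed above (illustrative code, anticipating the estimation note below):

```python
FT = F(theta_hat, x)
cov = s2 * np.linalg.inv(FT.T @ FT)   # estimate of sigma^2 (F(theta0)'F(theta0))^{-1}
se = np.sqrt(np.diag(cov))            # approximate standard errors for theta_hat
```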

In practice $\sigma^2$ is estimated by $s^2$ or $\hat\sigma^2$, and $F(\theta_0)'F(\theta_0)$ is estimated by $F(\hat\theta)'F(\hat\theta)$. Check that this is OK.

Applications:
1. The overidentified SEM is a nonlinear regression model (linear in variables, nonlinear in parameters); consider the reduced-form equations in terms of structural parameters.
2. Many other models; more to come.

Efficiency: Consider another consistent estimator $\theta^*$ with sampling error of the form
$$N^{1/2}(\theta^* - \theta_0) = \left[(F(\theta_0)'F(\theta_0))^{-1} F(\theta_0)' + C'\right]\varepsilon + r.$$
It can be shown, in a proof like that of the Gauss-Markov Theorem, that the minimum-variance estimator in this class (locally linear estimators) has $C = 0$. A better property holds when the errors are iid normal: then the NLS estimator is the MLE, and we have the Cramér-Rao efficiency result.

Nonlinear SEM: Let
$$y_i = f(Z_i, \theta) + \varepsilon_i = f_i(\theta) + \varepsilon_i,$$
where now $Z$ is used to indicate included endogenous variables. NLS will not be consistent (why not?). The trick is to find instruments $W$ and look at
$$W'y = W'f(\theta) + W'\varepsilon.$$
If the model is just-identified, we can just set $W'\varepsilon = 0$ (its expected value) and solve. Otherwise, we can do nonlinear GLS, minimizing the variance-weighted sum of squares
$$(y - f(\theta))' W (W'W)^{-1} W' (y - f(\theta)).$$
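A sketch of this criterion in code; here `fZ` (the structural regression function) and the instrument matrix `W` are hypothetical placeholders to be supplied by the model at hand:

```python
from scipy.optimize import minimize

def nl2sls_objective(theta, y, fZ, W):
    """Criterion (y - f(theta))' W (W'W)^{-1} W' (y - f(theta)),
    where fZ(theta) returns the N-vector of f(Z_i, theta)."""
    r = y - fZ(theta)
    Wr = W.T @ r
    return Wr @ np.linalg.solve(W.T @ W, Wr)

# Hypothetical usage, given fZ, W, y, and a starting value theta_start:
# theta_hat = minimize(nl2sls_objective, theta_start, args=(y, fZ, W)).x
```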

This $\hat\theta$ is called the nonlinear 2SLS estimator (by Amemiya, who studied its properties in 1974). Note that there are not two separate stages. Specifically, it might be tempting to obtain a first stage $\hat Z = (I - M)Z = [\hat Y_2\ X_1]$ and do nonlinear regression of $y$ on $f(\hat Z, \theta)$. This does not necessarily work: $\hat Z$ is orthogonal to $\varepsilon$, but $f(\hat Z, \theta)$ may not be. In the linear regression case, we want $\hat Z'\varepsilon = 0$; the corresponding condition here is $\hat F'\varepsilon = 0$. Since $\hat F$ generally depends on $\theta$, there is no real way to do this in stages.

Asymptotic Distribution Theory
$$N^{1/2}(\hat\theta - \theta_0) \to N\left(0,\ \sigma^2 \left(\frac{F(\theta_0)'W(W'W)^{-1}W'F(\theta_0)}{N}\right)^{-1}\right).$$
In practice $\sigma^2$ is estimated by $(y - f(Z, \hat\theta))'(y - f(Z, \hat\theta))/N$ (or with $N - K$ in the denominator), and $F(\theta_0)$ is estimated by $F(\hat\theta)$. Don't forget to remove the factor $N^{-1}$ in the variance when approximating the variance of your estimator. The calculation is exactly like the usual calculation of the variance of the GLS estimator.

MISCELLANEOUS:

Proposition: $\sum_{i=1}^{\infty} (f_i(\theta) - f_i(\theta_0))^2 = \infty$ for $\theta \ne \theta_0$ is necessary for the existence of a consistent NLS estimator. (Compare this with OLS.)

Funny example: Consider $y_i = e^{-\theta i} + \varepsilon_i$ where $\theta \in (0, 2)$ and $V(\varepsilon_i) = \sigma^2$. Is there a consistent estimator?

Consistency can be shown (generally, not just in this example) using regularity conditions, not requiring derivatives. Normality can be shown using first but not second derivatives.

Many models can be set up as nonlinear regressions, for example qualitative dependent variable models. Sometimes it is hard to interpret the parameters; we might be interested in functions of parameters, for example elasticities.

Tests on functions of parameters: remember the delta method. Let $\hat\theta \sim N(\theta_0, \Sigma)$ be a consistent estimator of $\theta$, whose true value is $\theta_0$. Suppose $g(\theta)$ is the quantity of interest. Expanding $g(\hat\theta)$ around $\theta_0$ yields
$$g(\hat\theta) = g(\theta_0) + \left.\frac{\partial g}{\partial \theta'}\right|_{\theta = \theta_0} (\hat\theta - \theta_0) + \text{more}.$$

This implies that
$$V(g(\hat\theta)) = E(g(\hat\theta) - g(\theta_0))^2 \approx \frac{\partial g}{\partial \theta'} \Sigma \frac{\partial g}{\partial \theta}.$$
Asymptotically, $g(\hat\theta) \sim N(g(\theta_0), V(g(\hat\theta)))$. This is a very useful method. It also shows that the normal approximation may be poor. (Why?) How can the normal approximation be improved? This is a deep question. One trick is to choose a parametrization for which $\partial^2 \ell / \partial\theta\, \partial\theta'$ is constant or nearly so. Why does this work? Consider the Taylor expansion of the score...
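A delta-method sketch continuing the running example, with `cov` as the estimate of $\Sigma$ from above; the quantity of interest $g(\theta) = \beta_2 \beta_3$ is a hypothetical illustration:

```python
def delta_method_var(grad_g, theta_hat, cov):
    """Delta-method variance (dg/dtheta)' Sigma (dg/dtheta), evaluated at theta_hat."""
    g = grad_g(theta_hat)
    return g @ cov @ g

# Illustration: g(theta) = b2 * b3, whose gradient is (0, b3, b2).
var_g = delta_method_var(lambda t: np.array([0.0, t[2], t[1]]), theta_hat, cov)
```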