Lecture 14 More on structural estimation


Lecture 14: More on structural estimation
Economics 8379, George Washington University
Instructor: Prof. Ben Williams

Traditional MLE and GMM

MLE requires a full specification of a model for the distribution of y | x. Typically,

    f(y_1, ..., y_n | x_1, ..., x_n) = ∏_{i=1}^n f_θ(y_i | x_i),

where f_θ(y_i | x_i) is a known function once θ ∈ R^K is known. GMM is sometimes preferred because it is based only on moment conditions E(w_i m(y_i, x_i, θ)) = 0. Both are difficult when f_θ(y_i | x_i) or m is hard to compute.

MLE vs. GMM

Two reasons f_θ(y_i | x_i) or m can be difficult to compute:
- latent variables: f_θ(y_i | x_i) = ∫ f_θ(y_i | x_i, u) f_u(u) du
- y_i is determined, conditional on x_i and unobserved shock(s), via an economic model, which may involve dynamic optimization, solution of a Nash equilibrium, etc.

Early examples

Multinomial probit:

    l(β) = ∑_{i=1}^n ∑_{j=1}^J 1(y_i = j) log(Pr(y_i = j | X_i)),

where Pr(y_i = j | X_i) = Pr(X_{ij}β + ε_{ij} ≥ max_{l≠j} (X_{il}β + ε_{il})) and ε_i = (ε_{i1}, ..., ε_{iJ}) ~ N(0, Σ).

Early examples

Random coefficients logit: same form for the likelihood, with

    Pr(y_i = j | X_i) = ∫ [ exp(X_{ij}β_i) / ∑_{l=1}^J exp(X_{il}β_i) ] f(β_i) dβ_i

and β_i ~ N(β̄, Σ_β).
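As an illustration, this choice probability can be approximated by Monte Carlo: average the logit formula over draws of β_i. The sketch below is hypothetical (made-up dimensions, parameter values, and the helper name `rc_logit_prob`), not code from the lecture.

```python
import numpy as np

def rc_logit_prob(X_i, beta_bar, chol_Sigma, S=1000, rng=None):
    """Simulate Pr(y_i = j | X_i) for a random coefficients logit.

    X_i:        (J, K) attribute matrix for one individual
    beta_bar:   (K,) mean of the random coefficients
    chol_Sigma: (K, K) Cholesky factor of Sigma_beta
    Returns a (J,) vector of simulated choice probabilities.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    S_draws, K = S, X_i.shape[1]
    # draw beta_i ~ N(beta_bar, Sigma_beta), S draws at once
    draws = beta_bar + rng.standard_normal((S_draws, K)) @ chol_Sigma.T
    util = X_i @ draws.T                # (J, S) systematic utilities
    util -= util.max(axis=0)            # subtract per-draw max for stability
    probs = np.exp(util) / np.exp(util).sum(axis=0)  # logit prob per draw
    return probs.mean(axis=1)           # average over the S draws

# hypothetical example: 3 alternatives, 2 attributes
X_i = np.array([[1.0, 0.5], [0.2, 1.0], [0.0, 0.0]])
p = rc_logit_prob(X_i, beta_bar=np.array([0.5, -0.3]),
                  chol_Sigma=0.5 * np.eye(2))
```

Drawing all S coefficient vectors at once and vectorizing over draws is much faster than a loop, and the per-draw max subtraction guards against overflow in the exponentials.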

Maximum Simulated Likelihood

Suppose that f(y_i | X_i, θ) = ∫ g(y_i | X_i, u, θ) ψ(u) du.
- Simulate u_{i1}, ..., u_{iS} i.i.d. from ψ(·) for each i and replace l_i(θ) = log(f(y_i | X_i, θ)) with
    l̂_i(θ) = log( S^{-1} ∑_{s=1}^S g(y_i | X_i, u_{is}, θ) ).
- Then θ̂_MSL = arg max_θ ∑_{i=1}^n l̂_i(θ).

Maximum Simulated Likelihood

- Only consistent and asymptotically normal (free of simulation bias) if √n/S → 0. Take S to be a multiple of the sample size if feasible.
- Do not draw a new simulation sample in each iteration of the optimization routine!
- Sometimes this naive simulator can be improved by importance sampling and other variance-reduction techniques. See Section 12.7 in CT.

Method of Simulated Moments

Suppose we want to estimate θ based on the moment condition

    E(w_i m(y_i, x_i, θ_0)) = 0,

where m(y_i, x_i, θ) = ∫ h(y_i, x_i, u, θ) ψ(u) du requires simulation. Then draw u_{is}, s = 1, ..., S, for each i and compute

    m̂(y_i, x_i, θ) = S^{-1} ∑_{s=1}^S h(y_i, x_i, u_{is}, θ).

Method of Simulated Moments

- If E(m̂(y_i, x_i, θ) | y_i, x_i) = m(y_i, x_i, θ) (unbiased simulation) and the usual GMM conditions are satisfied, then minimizing a quadratic form in

    ∑_{i=1}^n w_i m̂(y_i, x_i, θ)

  provides a consistent, asymptotically normal estimator.
- If, in addition, S → ∞, then the estimator is asymptotically equivalent to the GMM estimator.
- For finite S, the asymptotic variance is inflated by a factor of 1 + S^{-1}, though this can be improved, e.g., by importance sampling.
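A minimal MSM sketch, again on a hypothetical toy model: y_i = θ + u_i, so that h(y_i, u, θ) = y_i − θ − u is an unbiased simulator of the moment m(y_i, θ) = y_i − θ. The draws are fixed across the optimization and the quadratic form is minimized by grid search.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical toy model: y_i = theta + u_i with u_i ~ N(0, 1), theta_0 = 2
theta_true = 2.0
n, S = 500, 50
y = theta_true + rng.standard_normal(n)

# fixed simulation draws u_{is}; E[h(y_i, u, theta) | y_i] = y_i - theta,
# so the simulator is unbiased for the moment function
u = rng.standard_normal((n, S))

def msm_objective(theta):
    m_hat = (y[:, None] - theta - u).mean(axis=1)  # S^-1 sum_s h(y_i, u_is, theta)
    return m_hat.sum() ** 2                        # squared sum_i m_hat (w_i = 1)

grid = np.linspace(1.0, 3.0, 401)
theta_msm = grid[np.argmin([msm_objective(t) for t in grid])]
```

With one parameter and one moment the quadratic form is just the squared sample moment; the weighting matrix only matters in the overidentified case.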

Method of Simulated Moments

- Variance estimation requires either simulation or the bootstrap.
- Gourieroux and Monfort (1991) provide more general conditions under which S → ∞ is not necessary.
- Pakes and Pollard (1989) provide some examples.

Indirect inference

- an auxiliary model, e.g., a likelihood: l_n(θ) = ∑_{i=1}^n log(f(y_i | X_i, θ))
- an auxiliary estimate, e.g., θ̂ = arg max_θ l_n(θ)
- an economic model, e.g., y_i = G(X_i, u_i; β) for i = 1, ..., n, with u_i i.i.d. F_u
- simulate {y_i^m(β)} and obtain θ̃(β) by maximizing ∑_{m=1}^M ∑_{i=1}^n log(f(y_i^m(β) | X_i, θ))

Indirect inference

- β̂ = arg min_β D(θ̂, θ̃(β))
- D is a metric; Smith (2008) suggests Wald, LR, and LM metrics
- consistent and asymptotically normal for M fixed, n → ∞
- variance inflated by a factor of 1 + M^{-1}
- very easy to implement, despite the lack of efficiency
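The whole procedure fits in a few lines. Below is a hypothetical toy example: the "economic model" is y_i = βx_i + u_i, the auxiliary estimator is the OLS slope, M = 5 simulated datasets share one fixed set of shocks, and D is the squared distance, minimized by grid search.

```python
import numpy as np

rng = np.random.default_rng(2)

# hypothetical economic model: y_i = G(x_i, u_i; beta) = beta * x_i + u_i
beta_true = 1.5
n, M = 400, 5
x = rng.standard_normal(n)
y = beta_true * x + rng.standard_normal(n)

def aux_estimate(y_, x_):
    # auxiliary estimator: OLS slope (the theta of the auxiliary model)
    return (x_ @ y_) / (x_ @ x_)

theta_hat = aux_estimate(y, x)            # auxiliary estimate on real data

# simulation shocks drawn ONCE and reused for every candidate beta
u_sim = rng.standard_normal((M, n))

def ii_objective(beta):
    # theta_tilde(beta): average auxiliary estimate over M simulated datasets
    theta_sim = np.mean([aux_estimate(beta * x + u_sim[m], x) for m in range(M)])
    return (theta_hat - theta_sim) ** 2   # D(theta_hat, theta_tilde(beta))

grid = np.linspace(0.5, 2.5, 401)
beta_ii = grid[np.argmin([ii_objective(b) for b in grid])]
```

Here the auxiliary model happens to be correctly specified, so indirect inference nearly reproduces OLS; the method's value shows up when the auxiliary model is a tractable misspecified stand-in for an intractable likelihood.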

Indirect inference

The following are typical conditions required for indirect inference:
- the economic model is correctly specified and well-behaved
- the auxiliary likelihood function is well-behaved in the limit, despite the fact that it is misspecified:
    - l_n(θ) →_p l(θ; β, F_u) when the data are generated by the economic model with parameters β and distribution F_u
    - define θ(β) = arg max_θ l(θ; β, F_u)
    - θ_0 = θ(β_0) is the pseudo-true value; θ(·) is the binding function

Indirect inference

- under appropriate regularity conditions, θ̂ →_p θ_0 and θ̃(β) →_p θ(β)
- thus the identification condition: is β_0 the unique solution to θ_0 = θ(β)? This requires dim(θ) ≥ dim(β).
- simulation avoids needing to know the binding function

Rust (1987)

Agents solve a dynamic control problem where d_t is a discrete control variable and the state variables are (x_t, ε_t). We observe (d_t, x_t), t = 1, ..., T. The agent solves

    max_{d_t, d_{t+1}, ...} E( ∑_{s=t}^T β^{s−t} U(d_s, x_s, ε_s) | d_t, x_t, ε_t ).

Suppose that
1. U(d_s, x_s, ε_s) = u(d_s, x_s) + ε_s(d_s)
2. f(x_{t+1}, ε_{t+1} | d_t, x_t, ε_t) = f(x_{t+1} | d_t, x_t) f(ε_{t+1})

Then the solution is characterized by the integrated value function.

Rust (1987)

In particular, the integrated value function solves the Bellman equation

    V(x_t) = ∫ max_d { u(d, x_t) + ε_t(d) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | d, x_t) } f(ε_t) dε_t

and the decision rule is

    d_t = arg max_d { u(d, x_t) + ε_t(d) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | d, x_t) }.

Rust (1987)

Then Pr(d_t = d | x_t) = Pr( v(d, x_t) + ε_t(d) > v(d', x_t) + ε_t(d') for all d' ≠ d ), where

    v(d, x_t) = u(d, x_t) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | d, x_t).

This can in principle be implemented through maximum likelihood, since the log-likelihood can be written as

    ∑_{i=1}^n ∑_{t=1}^T [ log(f(x_{i,t+1} | d_{i,t}, x_{i,t})) + log(Pr(d_{i,t} | x_{i,t})) ].

Rust (1987) This can be adapted to a finite horizon problem too, as discussed in Arcidiacono and Ellickson (2011). Start with the last period and use backwards recursion to solve for period-specific value functions. The model can also include a choice-dependent outcome variable, y t, by adding log(f (y it d it, x it )) to the log-likelihood.

Rust (1987)

Rust (1987) assumes that the ε_t(d) are independent type 1 extreme value, which leads to a logit expression for the choice probabilities,

    Pr(d_t = d | x_t) = exp( u(d, x_t) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | x_t, d) ) / ∑_{d'} exp( u(d', x_t) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | x_t, d') ),

and a closed form for the integrated value function,

    V(x_t) = log( ∑_d exp( u(d, x_t) + β ∑_{x_{t+1}} V(x_{t+1}) f(x_{t+1} | d, x_t) ) ).
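These two expressions can be computed by iterating the closed-form Bellman map to its fixed point. The sketch below uses an entirely made-up renewal problem in the spirit of Rust's bus-engine example (a 10-point state grid, two choices "keep"/"replace", hypothetical utilities and transitions), not Rust's actual specification.

```python
import numpy as np

# hypothetical setup: X discretized states, D = 2 choices, discount beta
X, D, beta = 10, 2, 0.95
x_grid = np.arange(X)

# made-up flow utilities u(d, x) and transitions f(x' | d, x)
u = np.stack([-0.2 * x_grid,            # d = 0: keep (cost rises with x)
              -3.0 * np.ones(X)])       # d = 1: replace (fixed cost)
F = np.zeros((D, X, X))
for x in range(X):
    F[0, x, min(x + 1, X - 1)] = 1.0    # keep: state depreciates by one step
    F[1, x, 0] = 1.0                    # replace: state resets to 0

def solve_V(u, F, beta, tol=1e-10):
    """Iterate V(x) = log sum_d exp( u(d,x) + beta * sum_x' V(x') f(x'|d,x) )."""
    V = np.zeros(X)
    while True:
        v = u + beta * F @ V            # (D, X) choice-specific values
        V_new = np.log(np.exp(v).sum(axis=0))
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = solve_V(u, F, beta)
v = u + beta * F @ V
P = np.exp(v) / np.exp(v).sum(axis=0)   # logit choice probabilities Pr(d | x)
```

Because β < 1 the Bellman map is a contraction, so the successive-approximation loop converges from any starting point; in this example the replacement probability rises with the state, since keeping becomes ever more costly.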

Rust (1987)

Rust's (1987) nested fixed point algorithm:
- Estimate f(x_{t+1} | d_t, x_t) separately.
- Start with an initial value for the other parameters, θ̂_0.
- Using this parameter value, solve for the value function through successive iterations of the Bellman equation. Because x_t is discretized, this is a vector, V(θ̂_0).
- Given θ̂_0 and V(θ̂_0), compute the ingredients needed to update:
    θ̂_1 = θ̂_0 + H(θ̂_0)^{-1} J(θ̂_0).
- Iterate until convergence. Check other initial values.

Keane and Wolpin (1997)

Keane and Wolpin (1997) extend this in several ways:
- permanent unobserved heterogeneity
- payoff variables that are choice-censored
- nonseparable unobservables
- unobservables correlated across choice alternatives

Keane and Wolpin (1997)

The log-likelihood for the model is a mixture:

    log( ∑_{l=1}^L L_i(θ, ω_l) π_{l|x_{i1}} ),

where

    L_i(θ, ω_l) = ∏_{t=1}^{T_i} Pr(d_{i,t} | x_{i,t}, θ, ω_l) f(y_{it} | d_{it}, x_{it}, θ, ω_l) ∏_{t=1}^{T_i−1} f(x_{i,t+1} | d_{i,t}, x_{i,t}, θ, ω_l).

There is a type-specific integrated value function:

    V_l(x_t) = ∫ max_d { U(d, x_t, ω_l, ε_t) + β ∑_{x_{t+1}} V_l(x_{t+1}) f(x_{t+1} | d, x_t, ω_l) } f(ε_t) dε_t.
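The mixture step itself is mechanically simple; the hard part, computing each type-specific contribution L_i(θ, ω_l) by solving the dynamic program, is taken as given here. A hypothetical illustration, simplified to unconditional type probabilities (Keane and Wolpin let π depend on initial conditions x_{i1}):

```python
import numpy as np

def mixture_loglik(like_by_type, type_probs):
    """Sample log-likelihood sum_i log( sum_l L_i(theta, omega_l) pi_l ).

    like_by_type: (n, L) array of per-person, per-type contributions
                  L_i(theta, omega_l), assumed already computed by solving
                  each type's dynamic program
    type_probs:   (L,) type probabilities pi_l
    """
    return np.log(like_by_type @ type_probs).sum()

# made-up per-type likelihood contributions: 3 people, 2 latent types
L_il = np.array([[0.2, 0.05],
                 [0.1, 0.30],
                 [0.4, 0.10]])
pi = np.array([0.6, 0.4])
ll = mixture_loglik(L_il, pi)
```

Each person's likelihood is a probability-weighted average across types, so a person whose choice history fits type 2 best still contributes through both types, which is what makes the heterogeneity "permanent but unobserved."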

Keane and Wolpin (1997)

Keane and Wolpin (1997) develop a new method:
- simulate the integrals, which are not in closed form even if the errors are type I extreme value
- interpolate the value function, computing it on only part of the state space

This is called the simulation and interpolation method.