Estimation of Static Discrete Choice Models Using Market Level Data

Similar documents
Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation

A New Computational Algorithm for Random Coe cients Model with Aggregate-level Data

The BLP Method of Demand Curve Estimation in Industrial Organization

Nevo on Random-Coefficient Logit

Lecture 10 Demand for Autos (BLP) Bronwyn H. Hall Economics 220C, UC Berkeley Spring 2005

The BLP Method of Demand Curve Estimation in Industrial Organization

Mini Course on Structural Estimation of Static and Dynamic Games

A Note on Bayesian analysis of the random coefficient model using aggregate data, an alternative approach

Berry et al. (2004), Differentiated products demand systems from a combination of micro and macro data: The new car market

Control Function Corrections for Unobserved Factors in Di erentiated Product Models

Demand in Differentiated-Product Markets (part 2)

NBER WORKING PAPER SERIES IMPROVING THE NUMERICAL PERFORMANCE OF BLP STATIC AND DYNAMIC DISCRETE CHOICE RANDOM COEFFICIENTS DEMAND ESTIMATION

Estimating Single-Agent Dynamic Models

Chapter 2. Dynamic panel data models

IDENTIFICATION IN DIFFERENTIATED PRODUCTS MARKETS USING MARKET LEVEL DATA. Steven T. Berry and Philip A. Haile. January 2010 Updated February 2010

Introduction: structural econometrics. Jean-Marc Robin

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

A Note on Demand Estimation with Supply Information. in Non-Linear Models

Asymptotic theory for differentiated products demand models with many markets

IDENTIFICATION IN DIFFERENTIATED PRODUCTS MARKETS USING MARKET LEVEL DATA. Steven T. Berry and Philip A. Haile. January 2010 Revised May 2012

Economics 620, Lecture 18: Nonlinear Models

The Empirical Likelihood MPEC Approach to Demand Estimation

Chapter 1. GMM: Basic Concepts

Improving the Numerical Performance of BLP Static and Dynamic Discrete Choice Random Coefficients Demand Estimation

Specification Test on Mixed Logit Models

Estimation of demand for differentiated durable goods

Consumer heterogeneity, demand for durable goods and the dynamics of quality

Industrial Organization II (ECO 2901) Winter Victor Aguirregabiria. Problem Set #1 Due of Friday, March 22, 2013

Identification and Estimation of Differentiated Products Models using Market Size and Cost Data

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

High Performance Quadrature Rules: How Numerical Integration Affects a Popular Model of Product Differentiation

Estimation of a Local-Aggregate Network Model with. Sampled Networks

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Do preferences for private labels respond to supermarket loyalty programs?

Lecture Notes: Estimation of dynamic discrete choice models

Are Marijuana and Cocaine Complements or Substitutes? Evidence from the Silk Road

13 Endogeneity and Nonparametric IV

Aggregate Supply. Econ 208. April 3, Lecture 16. Econ 208 (Lecture 16) Aggregate Supply April 3, / 12

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

What Accounts for the Growing Fluctuations in FamilyOECD Income March in the US? / 32

Demand Systems in Industrial Organization: Part II

Economics 241B Estimation with Instruments

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

Lecture 3 Unobserved choice sets

Demand and supply of differentiated. products. Paul Schrimpf. products. Paul Schrimpf. UBC Economics 565. September 22, 2015

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

Estimating Static Discrete Games

Dynamic Discrete Choice Structural Models in Empirical IO

Lecture 6: Discrete Choice: Qualitative Response

Notes on Time Series Modeling

Introduction to the Simultaneous Equations Model

Instrument-free Identification and Estimation of Differentiated Products Models using Cost Data

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

1 Bewley Economies with Aggregate Uncertainty

Estimating Single-Agent Dynamic Models

Chapter 6. Maximum Likelihood Analysis of Dynamic Stochastic General Equilibrium (DSGE) Models

Competition Between Networks: A Study in the Market for Yellow Pages Mark Rysman

Estimating the Pure Characteristics Demand Model: A Computational Note

Constrained Optimization Approaches to Estimation of Structural Models: Comment

Chapter 1: A Brief Review of Maximum Likelihood, GMM, and Numerical Tools. Joan Llull. Microeconometrics IDEA PhD Program

ECON 594: Lecture #6

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

Economics 620, Lecture 15: Introduction to the Simultaneous Equations Model

Econometric Analysis of Games 1

Lecture Notes on Measurement Error

Internation1al Trade

Lecture Notes: Brand Loyalty and Demand for Experience Goods

Estimation with Aggregate Shocks

Semiparametric Estimation of CES Demand System with Observed and Unobserved Product Characteristics

Problem set 1 - Solutions

MVE165/MMG631 Linear and integer optimization with applications Lecture 5 Linear programming duality and sensitivity analysis

Exercises Chapter 4 Statistical Hypothesis Testing

Stochastic Demand and Revealed Preference

Identification of Consumer Switching Behavior with Market Level Data

Econ Review Set 2 - Answers

Demand Estimation Under Incomplete Product Availability

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

VALUATION USING HOUSEHOLD PRODUCTION LECTURE PLAN 15: APRIL 14, 2011 Hunt Allcott

Measuring Substitution Patterns in Differentiated Products Industries

Economics 472. Lecture 10. where we will refer to y t as a m-vector of endogenous variables, x t as a q-vector of exogenous variables,

GMM estimation of spatial panels

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

Lecture 4: Linear panel models

Nonparametric identification of endogenous and heterogeneous aggregate demand models: complements, bundles and the market level

Estimating Deep Parameters: GMM and SMM

Next, we discuss econometric methods that can be used to estimate panel data models.

Lecture 14 More on structural estimation

Observed and Unobserved Product Characteristics

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

NBER WORKING PAPER SERIES NONPARAMETRIC IDENTIFICATION OF MULTINOMIAL CHOICE DEMAND MODELS WITH HETEROGENEOUS CONSUMERS

ECON0702: Mathematical Methods in Economics

Advanced Economic Growth: Lecture 8, Technology Di usion, Trade and Interdependencies: Di usion of Technology

4- Current Method of Explaining Business Cycles: DSGE Models. Basic Economic Models

1 A Non-technical Introduction to Regression

1. The Multivariate Classical Linear Regression Model

Simultaneous Equation Models

1 Correlation between an independent variable and the error

Parameterized Expectations Algorithm

Transcription:

Estimation of Static Discrete Choice Models Using Market Level Data NBER Methods Lectures Aviv Nevo Northwestern University and NBER July 2012

Data Structures Market-level data cross section/time series/panel of markets Consumer level data cross section of consumers sometimes: panel (i.e., repeated choices) sometimes: second choice data Combination sample of consumers plus market-level data quantity/share by demographic groups average demographics of purchasers of good j

Market-level Data We see product-level quantity/market shares by "market" Data include: aggregate (market-level) quantity prices/characteristics/advertising de nition of market size distribution of demographics Advantages: sample of actual consumers data to estimate a parametric distribution easier to get sample selection less of an issue Disadvantages estimation often harder and identi cation less clear

Consumer-level Data See match between consumers and their choices Data include: consumer choices (including choice of outside good) prices/characteristics/advertising of all options consumer demographics Advantages: impact of demographics identi cation and estimation dynamics (especially if we have panel) Disadvantages harder/more costly to get sample selection and reporting error

Review of the Model and Notation Indirect utility function for the J inside goods U(x jt, ξ jt, I i p jt, D it, ν it ; θ), where x jt observed product characteristics ξ jt unobserved (by us) product characteristic D it "observed" consumer demographics (e.g., income) ν it unobserved consumer attributes ξ jt will play an important role realistic will act as a "residual" (why don t predicted shares t exactly over tting) potentially implies endogeneity

Linear RC (Mixed) Logit Model A common model is the linear Mixed Logit model u ijt = x jt β i + α i p jt + ξ jt + ε ijt where αi β i = α β + ΠD i + Σv i It will be coinvent to write u ijt δ(x jt, p jt, ξ jt ; θ 1 ) + µ(x jt, p jt, D i, ν i ; θ 2 ) + ε ijt where δ jt = x jt β + αp jt + ξ jt, and µ ijt = (p jt x jt ) (ΠD i + Σv i )

Linear RC (Mixed) Logit Model A common model is the linear Mixed Logit model and u ijt = x jt β i + α i p jt + ξ jt + ε ijt u ijt δ(x jt, p jt, ξ jt ; θ 1 ) + µ(x jt, p jt, D i, ν i ; θ 2 ) + ε ijt Note: (1) the mean utility will play a key role in what follows (2) the interplay between µ ijt and ε ijt (3) the "linear" and "non-linear" parameters (4) de nition of a market

Key Challenges for Estimation Recovering the non-linear parameters, which govern heterogeneity, without observing consumer level data The unobserved characteristic, ξ jt a main di erence from early DC model (earlier models often had option speci c constant in consumer level models) generates a potential for correlation with price (or other x s) when constructing a counterfactual we will have to deal with what happens to ξ jt Consumer-level vs Market-level data with consumer data, the rst issue is less of a problem the "endogeneity" problem can exist with both consumer and market level data: a point often missed

What would we do if we had micro data? Estimate in two steps. First step, estimate (δ, θ 2 ) say by MLE Pr(y it = jjd it, x t, p t, ξ t, θ) = Pr(y it = jjd it, δ(x t, p t, ξ t, θ 1 ), x t, p t, θ e.g., assume ε ijt is iid double exponential (Logit), and Σ = 0 = expfδ jt + (p jt x jt )ΠD i g J k=0 expfδ kt + (p kt x kt )ΠD i g Second step, recover θ 1 bδ jt = x jt β + αp jt + ξ jt ξ jt is the residual. If it is correlated with price (or x s) need IVs (or an assumption about the panel structure)

Intuition from estimation with consumer data Estimation in 2 steps: rst recover δ and θ 2 (parameters of heterogeneity) and then recover θ 1 Di erent variation identfying the di erent parameters θ 2 is identi ed from variation in demographics holding the level (i.e., δ) constant If Σ 6= 0 then it is identi ed from within market share variation in choice probabilities θ 1 is identi ed from cross market variation (and appropriate exclusion restrictions) With market-level data will in some sense try to follow a similar logic however, we do not have within market variation to identify θ 2 will rely on cross market variation (in choice sets and demographics) for both steps a key issue is that ξ jt is not held constant

Estimation with market data: preliminaries In principle, we could consider estimating θ by min the distance between observed and predicted shares: min ks t θ s j (x t, p t, θ)k Issues: computation (all parameters enter non-linearly) more importantly, prices might be correlated with the ξ jt ( structural error) standard IV methods do not work

Inversion Instead, follow estimation method proposed by Berry (1994) and BLP (1995) Key insight: with ξ jt predicted shares can equal observed shares Z σ j (δ t, x t, p t ; θ 2 ) = 1 u ijt u ikt 8k 6= j df (ε it, D it, ν it ) = S jt under weak conditions this mapping can be inverted δ t = σ 1 (S t, x t, p t ; θ 2 ) the mean utility is linear in ξ jt ; thus, we can form linear moment conditions estimate parameters via GMM

Important (and often missed) point IVs play dual role (recall 2 steps with consumer level data) generate moment conditions to identify θ 2 deal with the correlation of prices and error Even if prices are exogenous still need IVs This last point is often missed Why di erent than consumer-level data? with aggregate data we only know the mean choice probability, i.e., the market share with consumer level data we know more moments of the distribution of choice probabilities (holding ξ jt constant) : these moments help identify the heterogeneity parameters

I will now go over the steps of the estimation For now I assume that we have valid IVs later we will discuss where these come from I will follow the original BLP algorithm I will discuss recently proposed alternatives later I will also discuss results on the performance of the algorithm

The BLP Estimation Algorithm 1. Compute predicted shares: given δ t and θ 2 compute σ j (δ t, x t, p t ; θ 2 ) 2. Inversion: given θ 2 search for δ t that equates σ j (δ t, x t, p t ; θ 2 ) and the observed market shares the search for δ t will call the function computed in (1) 3. Use the computed δ t to compute ξ jt and form the GMM objective function (as a function of θ) 4. Search for the value of θ that minimizes the objective function

Example: Estimation of the Logit Model Data: aggregate quantity, price characteristics. Market share s jt = q jt /M t Note: need for data on market size Computing market share s jt = expfδ jt g J k=0 expfδ kt g Inversion ln(s jt ) ln(s 0t ) = δ jt δ 0t = x jt β + αp jt + ξ jt Estimate using linear methods (e.g., 2SLS) with ln(s jt ) ln(s 0t ) as the "dependent variable".

Step 1: Compute the market shares predicted by the model Given δ t and θ 2 (and the data) compute Z σ j (δ t, x t, p t ; θ 2 ) = 1 [u ijt u ikt 8k 6= j] df (ε it, D it, ν it ) For some models this can be done analytically (e.g., Logit, Nested Logit and a few others) Generally the integral is computed numerically A common way to do this is via simulation eσ j (δ t, x t, p t, F ns ; θ 2 ) = 1 ns ns i=1 expfδ jt + (p jt x jt )(ΠD i + Σv i )g 1 + J k=1 expfδ kt + (p kt x kt )(ΠD i + Σ where v i and D i, i = 1,..., ns are draws from F v (v) and F D (D), Note: the ε s are integrated analytically other simulators (importance sampling, Halton seq) integral can be approximated in other ways (e.g., quadrature)

Step 2: Invert the shares to get mean utilities Given θ 2, for each market compute mean utility, δ t, that equates the market shares computed in Step 1 to observed shares by solving eσ(δ t, x t, p t, F ns ; θ 2 ) = S t For some model (e.g., Logit and Nested Logit) this inversion can be computed analytically. Generally solved using a contraction mapping for each market δ h+1 t = δ h t + ln(s t ) ln(eσ(δ h t, x t, p t, F ns ; θ 2 ) h = 0,..., H, where H is the smallest integer such that δ H t δ H 1 < ρ δ H t is the approximation to δ t Choosing a high tolerance level, ρ, is crucial (at least 10 12 ) t

Step 3: Compute the GMM objective Once the inversion has been computed the error term is de ned as ξ jt (θ) = eσ 1 (S t, x t, p t ; θ 2 ) x jt β αp jt Note: θ 1 enters this term, and the GMM objective, in a linear fashion, while θ 2 enters non-linearly. This error is interacted with the IV to form ξ(θ) 0 ZWZ 0 ξ(θ) where W is the GMM weight matrix

Step 4: Search for the parameters that maximize the objective In general, the search is non-linear It can be simpli ed in two ways. concentrate out the linear parameters and limit search to θ 2 use the Implicit Function Theorem to compute the analytic gradient and use it to aid the search Still highly non-linear so much care should be taken: start search from di erent starting points use di erent optimizers

Identi cation Ideal experiment: randomly vary prices, characteristics and availability of products, and see where consumers switch (i.e., shares of which products respond) In practice we will use IVs that try to mimic this ideal experiment Next lecture we will see examples Is there "enough" variation to identify substitution? Solutions: supply information (BLP) many markets (Nevo) add micro information (Petrin, MicroBLP) For further discussion and proofs (in NP case) see Haile and Berry

The Limit Distribution for the Parameter Estimates Can be obtained in a similar way to any GMM estimator With one cross section of observations is where J 1 (Γ 0 Γ) 1 Γ 0 V 0 Γ(Γ 0 Γ) 1 Γ derivative of the expectation of the moments wrt parameters V 0 variance-covariance of those moments evaluated V 0 has (at least) two orthogonal sources of randomness randomness generated by random draws on ξ variance generated by simulation draws. in samples based on a small set of consumers: randomness in sample Berry Linton and Pakes, (2004 RESTUD): last 2 components likely to be very large if market shares are small. A separate issue: limit in J or in T in large part depends on the data

Challenges Knittel and Metaxoglou found that di erent optimizers give di erent results and are sensitive to starting values Some have used these results to argue against the use of the model Note, that its unlikely that a researcher will mimic the KM exercise start from one starting point and not check others some of the algorithms they use are not very good and rarely used It ends up that much (but not all!) of their results go away if use tight tolerance in inversion (10 12 ) proper code reasonable optimizers This is an important warning about the challenges of NL estimation

MPEC Judd and Su, and Dube, Fox and Su, advocate the use of MPEC algorithm instead of the Nested xed point The basic idea: maximize the GMM objective function subject to the "equilibrium" constraints (i.e., that the predicted shares equal the observed shares) Avoids the need to perform the inversion at each and every iteration of the search performing the inversion for values of the parameters far from the truth can be quite costly The problem to solve can be quite large, but e cient optimzers (e.g., Knitro) can solve it e ectively. DFS report signi cant speed improvements

MPEC (cont) Formally min θ,ξ subject to ξ 0 ZWZ 0 ξ eσ(ξ; x, p, F ns, θ) = S Note the min is over both θ and ξ: a much higher dimension search ξ is a vector of parameters, and unlike before it is not a function of θ avoid the need for an inversion: equilibrium constraint only holds at the optimum in principle, should yield the same solution as the nested xed point Many bells and whistles that I will skip

ABLP Lee (2011) builds on some of the ideas proposed in dynamic choice to propose what he calls Approximate BLP The basic idea Start with a guess to ξ, denoted ξ 0, and use it to compute a rst order Taylor approximation to σ(ξ t, x t, p t ; θ) given by ln s(ξ t ; θ) ln s A (ξ t ; θ) ln s(ξ 0 t ; θ) + ln s(ξ0 t ; θ) ln ξt 0 (ξ t ξ 0 t ) From ln S t = ln s A (ξ t ; θ) we get ξ t = Φ t (ξ 0 t, θ) ξ0 t + " Use this approximation for estimation ln s(ξ 0 t ; θ) # 1 ln ξt 0 (ln S t ln s(ξ 0 t ; θ)) min θ Φ(ξ 0, θ) 0 ZWZ 0 Φ(ξ 0, θ)

ABLP (cont) Nest this idea into K-step procedure Step 1: Obtain new GMM estimate Step 2: Update ξ θ K = arg min θ Φ(ξ K 1, θ) 0 ZWZ 0 Φ(ξ K 1, θ) Repeat until convergence ξ K = Φ(ξ K 1 t, θ K ) Like MPEC avoids inversion at each stage, but has low dimensional search Lee reports signi cant improvements over MPEC Disclaimer: still a WP and has not been signi cantly tested

Comparing the methods Patel (2012, chapter 2 NWU thesis) compared the 3 methods

Comparing the methods Word of caution: MPEC results can probably be signi cantly improved with better implementation This is just an example of what one might expect if asking a (good) RA to program these methods