IDENTIFICATION AND ESTIMATION OF PARTIALLY IDENTIFIED MODELS WITH APPLICATIONS TO INDUSTRIAL ORGANIZATION

The Pennsylvania State University
The Graduate School
Department of Economics

IDENTIFICATION AND ESTIMATION OF PARTIALLY IDENTIFIED MODELS WITH APPLICATIONS TO INDUSTRIAL ORGANIZATION

A Dissertation in Economics by Xian Li

© 2018 Xian Li

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August 2018

The dissertation of Xian Li was reviewed and approved by the following:

Andrés Aradillas-López, Associate Professor of Economics, Dissertation Advisor, Chair of Committee
Patrik Guggenberger, Professor of Economics
Joris Pinkse, Professor of Economics
John R. Howell, Assistant Professor of Marketing
Barry Ickes, Professor of Economics, Head of the Department of Economics

Signatures are on file in the Graduate School.

Abstract

This dissertation consists of three chapters on identification and estimation of partially identified models and their application to industrial organization.

Chapter 1: Identification and Estimation of Partially Identified Models Defined by Moment Equalities with Latent Variables

This chapter studies identification and inference for a class of partially identified models defined by moment equalities with latent variables. In such models, because the latent variables are not observed by researchers, each possible conditional distribution of the latent variables leads to a set of possible parameter values, so the models are partially identified. Since the space of distributions is infinite dimensional, the estimation of such models always requires dimension reduction. This chapter provides a way to reduce the dimension of the original problem, targeted specifically at situations in which economic models impose specific restrictions on the family of distributions (e.g., shape restrictions). The method is proven to be consistent when the support of the latent variables is convex and compact and the moment function maps a convex set into a convex set.

Chapter 2: Robust Inference for Differentiated Product Demand Systems

This chapter provides robust inference for differentiated product demand with measurement error in market shares. Market shares have been used as estimators for choice probabilities generated from a (random coefficient) discrete choice model. However, there are situations in which market shares are inaccurately measured, such as unobserved choice set variation (e.g., stock-out events), sampling error, and measurement error in market sizes. The existing point identification approaches that address measurement error introduced by stock-out events do not allow for endogenous prices.
The partial identification approach based on moment inequalities does not, in general, characterize a sharp identified set, and the demand function can only be estimated from market-level variation. This chapter gives a sharp characterization of the identified set using moment equalities with latent variables. A feasible estimation of such sets requires reducing the dimension of an optimization problem. The existing duality approach does not generalize naturally to the demand estimation environment because of the special dependence structure within a market. This chapter therefore adopts the techniques developed in Chapter 1 to conduct valid inference on the identified set. The method is proven to be robust to measurement error in market shares, and this is also verified by simulations and empirical studies.

Chapter 3: Identification and Estimation of Price Competition with Capacity Constraints

Traditional models of price competition assume that the sales of firms are the exact realizations of market demand, and that firms compete by setting prices accordingly. However, when firms have capacity constraints, sales may equal the capacity cap of the firms rather than the realized market demand (e.g., parking decks, hotel rooms). This chapter presents an oligopoly model with a homogeneous product in which firms first set capacities and then compete in prices. The demand function is shown to be partially identified in general and point identified under stronger assumptions. If the capacity level is observed, then the cost function is point identified. The performance of the method is also verified by a simulation study.

Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1 Identification and Estimation of Partially Identified Models Defined by Moment Equalities with Latent Variables
1.1 Introduction
1.2 Identification Result
1.3 Estimation
1.4 Inference
1.5 Conclusion

Chapter 2 Robust Inference for Differentiated Product Demand Systems
2.1 Introduction
2.2 Motivating Example
2.3 Model
2.4 Identification
2.4.1 Main Result
2.4.2 Sharp Characterization of the Identified Set
2.5 Estimation and Inference
2.5.1 Constructing Sample Objective Function
2.5.2 Estimation
2.5.3 Inference
2.6 Monte-Carlo Simulation
2.6.1 One Product in Each Market
2.6.2 Twenty Products in Each Market
2.7 Empirical Study
2.8 Conclusion

Chapter 3 Identification and Estimation of Price Competition with Capacity Constraints
3.1 Introduction
3.2 Model
3.2.1 Consumers' Problem
3.2.2 Firms' Problem
3.3 Equilibrium Analysis
3.3.1 Price Setting Subgame
3.3.2 Setting Capacities
3.4 Identification Result
3.4.1 Partial Identification of the Demand Function
3.4.2 Point Identification of the Demand Function
3.4.3 Identification of Marginal Cost and the Marginal Cost Function
3.5 Estimation
3.5.1 Estimation of the Demand Function
3.5.2 Estimation of the Cost Function
3.5.3 Monte Carlo Simulation
3.6 Conclusion

Appendix A Omitted Proofs for Chapter 1
A.1 Preliminaries
A.2 Extreme Points
A.3 Proofs

Appendix B Omitted Proofs for Chapter 2
B.1 Proofs
B.2 Verification of Regularity Conditions

Appendix C Omitted Proofs for Chapter 3

Bibliography

List of Figures

1.1 Image of $g(W_i, \cdot; \theta)$
1.2 Image of $E_\mu[g(W_i, \cdot; \theta)]$
2.1 Graphical illustration of bounds
2.2 Weekly sales of OOS products three weeks before and after
3.1 Contingent Demand 1
3.2 Contingent Demand 2
3.3 Contingent Demand 3
3.4 Conditions to support a pure strategy equilibrium

List of Tables

2.1 Coverage probability of 95% confidence region
2.2 Coverage probability of 95% confidence interval
2.3 Standard Estimators
2.4 95% Confidence Interval
3.1 Simulation Result

Acknowledgments

I would like to express my deep gratitude to my adviser, Andrés Aradillas-López. His invaluable guidance and advice have been crucial to my research and academic career. I am also deeply indebted to Patrik Guggenberger and Joris Pinkse for their constant help throughout my time at Penn State. I am very grateful to Ronald Gallant, Daniel Grodzicki, Marc Henry, John Howell, Sung Jae Jun, Keisuke Hirano, Vijay Krishna, Paul Grieco, Peter Newberry, Hari Sridhar, and all the seminar participants who provided valuable comments and feedback on my work, which constitute an important part of my thesis. I would like to thank the rest of the faculty members as well as my classmates and other fellow students in the Department of Economics, who have helped me over the past several years. Last but not least, I would like to thank my family for their support during my doctoral study at Penn State. Without them I would never have been able to overcome the many difficulties in my life.

Chapter 1
Identification and Estimation of Partially Identified Models Defined by Moment Equalities with Latent Variables

1.1 Introduction

This chapter studies a class of partially identified models defined by moment equalities with latent variables. Suppose the researcher observes a sequence of random vectors $W_i \in \mathbb{R}^{d_W}$, $i = 1, 2, \ldots, n$, and the economic model implies the following restriction:

$$E\, g(W_i, U_i; \theta_0) = 0,$$

where $U_i \in \mathbb{R}^{d_U}$ is a vector of unobserved random variables, $\theta_0$ is the true value of the parameters, and $g = (g_1, \ldots, g_J)$ is a vector of moment functions with known functional form up to a vector of parameters $\theta \in \Theta \subseteq \mathbb{R}^{d_\theta}$. Since $U_i$ is unobserved, for each possible conditional distribution of $U_i$ given $W_i$ there is a (possibly different) set of parameter values that satisfy the moment restrictions. The union of the parameter sets corresponding to the different conditional distributions of $U_i$ is usually not a singleton, so the moment restrictions usually leave the parameters only partially identified.

Moment restrictions of this type can be generated by many semiparametric models in which a key economic variable (other than the error term) is unobserved. For example, when we only observe an interval in which the key variable falls, the latent variable $U_i$ can represent the difference between the true value and the upper bound of the interval. Another example arises when a game has multiple equilibria: the equilibrium selection mechanism is always unobserved. Lastly, any moment inequality can be rewritten as a moment equality by adding an unobserved term that represents the slackness of the inequality.

In contrast to identified sets defined by moment inequalities, very few methods aim to estimate the identified set defined by moment equalities of the kind above. The major problem is that the optimization required in the model is performed over the space of conditional distributions of the latent variables, which is infinite dimensional, so the optimization is infeasible to compute directly. Ekeland et al. (2010), Galichon and Henry (2013), and Schennach (2014) have developed methods that transform the infinite dimensional optimization problem into a finite dimensional dual problem.

This chapter considers models that place further restrictions on the marginal distribution of the latent variables (the literature has focused on the situation in which all distributions that satisfy the moment restrictions are possible candidates). The motivation is that, in many economic models, although the latent variables are not observed, certain requirements can be imposed on them to reduce the size of the family of possible conditional distributions, and thus the size of the identified set. For example, an economic model may imply that some of the latent variables are positively correlated with each other. In addition, in many market environments, the marginal distribution of the latent variables is exchangeable, in the sense that the labeling of the agents or products can be changed at will (Chapter 2 of the thesis considers one such example).
As far as I am aware, no existing work considers the identified set defined with a smaller family of marginal distributions of the latent variables (and hence a smaller family of conditional distributions of the latent variables), especially when the restriction cannot be expressed as moment restrictions, as in the case of monotonicity. This chapter proposes a new method to reduce the dimension of the original problem (and hence transform it into a feasible problem) when further restrictions are imposed on the family of marginal distributions of the latent variables. The method builds on the assumption that the support of the latent variables is a compact and convex set and that the moment function maps a convex set into a convex set. The technique hinges on the fact that the space of probability measures over a compact support is convex and compact in the weak topology; therefore, the optimum can be attained by searching only over the set of extreme points of this space, which is finite dimensional. After the objective function is turned into a feasible finite dimensional optimization problem, the construction of a set estimator and confidence region can be performed using the methods suggested by Chernozhukov et al. (2007), Romano and Shaikh (2008), and Romano and Shaikh (2010).

The remainder of this chapter is organized as follows. In Section 1.2, I formally introduce the moment equality model discussed throughout the chapter and the dimension reduction method used in identification, which is the key contribution of the chapter. In Sections 1.3 and 1.4, I provide a method of conducting estimation and inference on the identified set and the true parameter value. Section 1.5 concludes.

1.2 Identification Result

Suppose the economic model implies the following restrictions:

$$E_{\pi_0 \mu_0}\, g(W_i, U_i; \theta_0) = 0,$$

where $W_i \in \mathbb{R}^{d_W}$ ($i = 1, 2, \ldots, n$) is a sequence of observed random vectors and $U_i \in \mathbb{R}^{d_U}$ ($i = 1, 2, \ldots, n$) is a sequence of unobserved random vectors. The vector of measurable functions $g = (g_1, \ldots, g_J)$ is such that each $g_j$ ($j = 1, \ldots, J$) maps from $\mathbb{R}^{d_W} \times \mathbb{R}^{d_U}$ to $\mathbb{R}$ with a known functional form up to a vector of parameters $\theta \in \Theta \subseteq \mathbb{R}^{d_\theta}$. The expectation is taken with respect to the joint distribution $\pi_0 \mu_0$, where $\pi_0$ is the true marginal distribution of $W_i$, which is identified from data, and $\mu_0$ is the true conditional distribution of $U_i$ given $W_i$. The true value $\theta_0$ is assumed to be the unique value that makes the above equation hold when the expectation is evaluated under the joint distribution $\pi_0 \mu_0$. Denote the conditional distribution given a specific value $W_i$ by $\mu_{W_i}$, and its support by $\operatorname{supp}(\mu_{W_i})$.
I also assume that the economic model imposes a restriction on $\operatorname{supp}(\mu_{W_i})$ such that $\operatorname{supp}(\mu_{W_i}) \subseteq S_{W_i}$ when the realized value is $W_i$ (this allows the support to differ across values of $W_i$). When no restriction is imposed on the support, $S_{W_i} = \mathbb{R}^{d_U}$ for any $W_i$. Let $\mathcal{C}$ be the family of all conditional distributions of $U_i$ given $W_i$. For each $\mu \in \mathcal{C}$, let $\rho(\mu)$ be the marginal distribution of $U_i$ associated with the joint distribution $\pi_0 \mu$, and let $\rho(\mu_0)$ be the marginal distribution associated with $\pi_0 \mu_0$. Assume the economic model implies some properties of the true marginal distribution of $U_i$ (for example, $E_{\rho(\mu_0)} U_i = 0$), so that $\rho(\mu_0) \in \mathcal{U}$, where $\mathcal{U}$ is the family of marginal distributions for which the implied properties hold. Let $\mathcal{R} \subseteq \mathcal{C}$ be such that, for any $\mu \in \mathcal{R}$, we have $\rho(\mu) \in \mathcal{U}$.

Since we do not observe $U_i$, the conditional distribution $\mu_0$ is not identified, and we cannot conduct inference on $\theta_0$ based on the moment restrictions directly. However, for each $\mu \in \mathcal{R}$, we know that certain values of $\theta$ satisfy the moment restriction. Denote the union of all such $\theta$ by $\Theta_0$; then $\theta_0 \in \Theta_0$ because $\rho(\mu_0) \in \mathcal{U}$. Since $\mathcal{U}$ is known, we can identify $\Theta_0$, and $\theta_0$ is partially identified without any further restrictions. The previous discussion can be formalized in the following definition:

Definition 1.2.1. The identified set $\Theta_0$ is

$$\Theta_0 = \{\theta \in \Theta : \text{there exists } \mu \in \mathcal{R} \text{ such that } E_{\pi_0 \mu}\, g(W_i, U_i; \theta) = 0\}.$$

The rest of the chapter mostly focuses on identification and inference for $\Theta_0$ rather than the true parameter value $\theta_0$. In order to propose an estimator for the identified set, notice that the previous definition has the following equivalent representation in terms of an optimization problem.

Lemma 1.2.1. The identified set $\Theta_0$ can be represented as

$$\Theta_0 = \arg\min_{\theta \in \Theta} \min_{\mu \in \mathcal{R}} \left\| E_{\pi_0 \mu}\, g(W_i, U_i; \theta) \right\|. \quad (1.1)$$

Proof. See Appendix.

Lemma 1.2.1 implies that the problem of searching for the identified set contains two optimization steps, one of which requires searching over the infinite dimensional space $\mathcal{R}$. Suppose a researcher wants to conduct inference on $\Theta_0$ based on the sample analogue:

$$\min_{\theta \in \Theta} \min_{\mu \in \mathcal{R}} \left\| \sum_{i=1}^{n} E_{\mu}[g(W_i, U_i; \theta) \mid W_i] \right\|.$$

Because the space $\mathcal{R}$ is infinite dimensional, the inner optimization is infeasible in practice. Hence, a way to reduce the dimension is required.
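To make Definition 1.2.1 concrete, the following is a minimal sketch of a toy model, not part of the method developed in this chapter. All specifics are assumptions chosen for illustration: $W_i$ and $U_i$ are binary, $\pi_0$ puts mass 1/2 on each value of $W_i$, the moment function is $g(W_i, U_i; \theta) = W_i + U_i - \theta$, and each conditional probability $P(U_i = 1 \mid W_i = w)$ is confined to a coarse grid so that the family of conditional distributions can be enumerated. The function name `identified_set` and the restriction $E_{\rho(\mu)} U_i \le 1/4$ are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Assumed marginal distribution of the observed binary variable W.
PI0 = {0: F(1, 2), 1: F(1, 2)}
# Assumed grid of candidate values for P(U = 1 | W = w).
PROB_GRID = [F(k, 4) for k in range(5)]


def moment(theta, p):
    """E[g(W, U; theta)] with g = W + U - theta and p[w] = P(U = 1 | W = w)."""
    return sum(PI0[w] * (w + p[w]) for w in PI0) - theta


def marginal_mean_u(p):
    """Mean of U under the marginal distribution implied by pi_0 and p."""
    return sum(PI0[w] * p[w] for w in PI0)


def identified_set(thetas, restriction):
    """Keep theta if some admissible conditional distribution sets the moment to zero."""
    kept = []
    for theta in thetas:
        for p0, p1 in product(PROB_GRID, repeat=2):
            p = {0: p0, 1: p1}
            if restriction(p) and moment(theta, p) == 0:
                kept.append(theta)
                break
    return kept


thetas = [F(k, 8) for k in range(17)]  # candidate parameter values in [0, 2]
unrestricted = identified_set(thetas, lambda p: True)
restricted = identified_set(thetas, lambda p: marginal_mean_u(p) <= F(1, 4))
```

In this toy setup the unrestricted set consists of the grid points in $[1/2, 3/2]$, while the hypothetical restriction on the marginal distribution shrinks it to the grid points in $[1/2, 3/4]$. The brute-force search is only possible because the latent variable is binary; with a continuous latent variable the family of conditional distributions cannot be enumerated this way.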
Before introducing the dimension reduction technique developed in this chapter, I discuss several approaches in the literature that try to solve this problem.

The first method to deal with distributions of latent variables was introduced by Pakes and Pollard (1989), who assume a finite dimensional parametric family of conditional distributions $\tilde{\mathcal{R}} \subset \mathcal{R}$ and estimate via a simulated method of moments objective function:

$$\min_{\theta \in \Theta} \min_{\mu \in \tilde{\mathcal{R}}} \left\| E_{\pi_0 \mu}[g(W_i, U_i; \theta)] \right\|.$$

This method considers a much smaller class than the original problem, and thus will usually lead to an inconsistent estimator for the identified set when $\mu_0 \notin \tilde{\mathcal{R}}$.

Another approach exploits the fact that the dual of the above infinite dimensional optimization problem can be finite dimensional. Two papers in the existing literature, Ekeland et al. (2010) and Schennach (2014), consider this method. They propose two different dual formulations of the original problem, both of which are finite dimensional. Using their methods, the identified set in (1.1) can be shown to have the following equivalent representation:

$$\Theta_0 = \arg\min_{\theta \in \Theta} \inf_{\gamma \in \mathbb{R}^J} \left\| E_{\pi_0}[\tilde{g}(W_i; \theta, \gamma)] \right\|,$$

where $\tilde{g}(\cdot)$ is a known function parameterized by $\theta$ and $\gamma$. This transforms the original infinite dimensional problem into a finite dimensional one. We can therefore construct a sample objective function based on the population objective function:

$$\min_{\theta \in \Theta} \inf_{\gamma \in \mathbb{R}^J} \left\| \sum_{i=1}^{n} \tilde{g}(W_i; \theta, \gamma) \right\|.$$

Both papers mentioned above introduce a way to transform the original objective function into one that is feasible to handle. However, the method is designed for the situation in which no restriction is imposed on the conditional distribution of the latent variables, i.e., $\mathcal{R} = \mathcal{C}$, and there is no trivial generalization to situations in which $\mathcal{R} \subsetneq \mathcal{C}$. Another situation that such methods cannot handle is when the latent variables are dependent across observations. For example, in the demand estimation problem considered in Chapter 2 of the thesis, the observations in one market are always dependent.

In this chapter, I propose a novel method to reduce the dimension of the above problem by taking advantage of the special property that the support of the latent variables is compact and convex. [Footnote 1: Ekeland et al. (2010) and Schennach (2014) consider a more general support, which is allowed to be nonconvex or noncompact.] The reduction consists of two steps. The first step uses the fact that the support is compact and reduces the dimension from infinite to finite. The second step uses the fact that the support is convex and that its image under $g(W_i, \cdot; \theta)$ is a convex set, and reduces the dimension further. The following assumption is important throughout the chapter.

Assumption 1.2.1. $S_{W_i}$ is compact and convex.

Now consider the space $\mathcal{M}(S_{W_i})$ of all signed measures on the Borel σ-algebra $\mathcal{B}(S_{W_i})$ of the set $S_{W_i}$. For any $\mu_1, \mu_2 \in \mathcal{M}(S_{W_i})$ and $a \in \mathbb{R}$, I use the following rule:

Definition 1.2.2. For any $B \in \mathcal{B}(S_{W_i})$, $(\mu_1 + \mu_2)(B) = \mu_1(B) + \mu_2(B)$ and $(a \mu_1)(B) = a\, \mu_1(B)$.

Under Definition 1.2.2, $\mathcal{M}(S_{W_i})$ is a linear space and the set of all probability measures $\mathcal{P}(S_{W_i})$ is a convex subset. Since $S_{W_i}$ is also compact, I have the following result:

Lemma 1.2.2. The set $\mathcal{P}(S_{W_i})$ is compact in its weak topology and convex.

Proof. See Appendix.

To apply my dimension reduction method, I transform the population objective function using the law of iterated expectations:

$$\min_{\mu \in \mathcal{R}} \left\| E_{\pi_0 \mu}[g(W_i, U_i; \theta)] \right\| = \min_{\mu \in \mathcal{R}} \left\| E_{\pi_0}\big[ E_{\mu}[g(W_i, U_i; \theta) \mid W_i] \big] \right\|.$$

It now suffices to characterize the set of all values of $E_{\nu}[g(W_i, U_i; \theta)]$ for $\nu \in \mathcal{P}(S_{W_i})$ (here $W_i$ is given and the expectation is taken with respect to $U_i$). Define a continuous linear operator $L : \mathcal{P}(S_{W_i}) \to \mathbb{R}^J$ such that for any $\nu \in \mathcal{P}(S_{W_i})$, $L\nu = E_{\nu}[g(W_i, U_i; \theta)]$. Notice that the set I want to characterize is exactly $L(\mathcal{P}(S_{W_i}))$. By Lemma 1.2.2 we have the following result about the image of $\mathcal{P}(S_{W_i})$ under $L$.

Lemma 1.2.3. The set $L(\mathcal{P}(S_{W_i}))$ is compact and convex.

Proof. See Appendix.

Now we have the main result of the first step of the dimension reduction:

Proposition 1.2.1. The population objective function in (1.1) has the following equivalent representation:

$$\min_{\mu \in \mathcal{E} \cap \mathcal{R}} \left\| E_{\pi_0 \mu}[g(W_i, U_i; \theta)] \right\| = \min_{\mu \in \mathcal{E} \cap \mathcal{R}} \left\| E_{\pi_0}\big[ E_{\mu}[g(W_i, U_i; \theta) \mid W_i] \big] \right\|,$$

where $\mathcal{E} \subseteq \mathcal{C}$ is such that for any $\mu \in \mathcal{E}$ and $W_i$ we have $\operatorname{supp}(\mu_{W_i}) \subseteq S(W_i)$ and the number of points in $\operatorname{supp}(\mu_{W_i})$ is at most $J + 1$.

Proof. See Appendix.

The idea is to apply the Krein-Milman theorem, which states that a compact convex subset of a Hausdorff locally convex topological vector space is the closed convex hull of its extreme points. Here the extreme points of the set can be shown to be the points that correspond to probability measures with only a single point in their support (Dirac measures). Then, according to Carathéodory's theorem, each convex combination needs to contain at most $J + 1$ points, which completes the proof. The details of the proof can be found in the appendix.

Notice that Proposition 1.2.1 already makes the original problem feasible: searching for the conditional distribution given a specific value $W_i$ is a problem over at most $J + 1$ support points. In the following I show that we can further reduce the dimension of the problem to one when the image of $g(W_i, \cdot; \theta) : S(W_i) \to \mathbb{R}^J$, denoted $g(S(W_i))$, is a convex set.

Assumption 1.2.2. $g(S(W_i))$ is a convex set.

Notice that, for any $\mu \in \mathcal{E}$ and each $\mu_{W_i}$, we can write $\operatorname{supp}(\mu_{W_i}) = \{r_{i0}, r_{i1}, \ldots, r_{iJ}\}$ with assigned probabilities $\{p_{i0}, p_{i1}, \ldots, p_{iJ}\}$. Then

$$E_{\mu}[g(W_i, U_i; \theta) \mid W_i] = \sum_{k=0}^{J} p_{ik}\, g(W_i, r_{ik}; \theta),$$

which is a convex combination of the values of $g$ at the points in $\operatorname{supp}(\mu_{W_i})$. If $g(S(W_i))$ is itself a convex set, one can always find a single point $r$ such that $g(W_i, r; \theta) = \sum_{k=0}^{J} p_{ik}\, g(W_i, r_{ik}; \theta)$. Now I introduce a new class of functions:

Definition 1.2.3. Let $r(\cdot)$ be a function that takes $W_i$ as its argument and maps to a point in $S(W_i)$. Define the probability measure $\mu_r$ over $S(W_i)$ as follows: for any Borel set $B$, $\mu_r(B) = \pi_0(r^{-1}(B))$. Let $\mathcal{F}$ be the collection of such functions for which $\mu_r \in \mathcal{R}$.
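The Carathéodory step behind Proposition 1.2.1 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the procedure used in the proofs: it takes a discrete distribution whose atoms are points of $\mathbb{R}^J$ (standing in for the values $g(W_i, r; \theta)$ at a fixed $W_i$ and $\theta$) and removes atoms one at a time while preserving the mean, until at most $J + 1$ remain. The helper names `null_vector` and `caratheodory` are hypothetical, and exact rational arithmetic is used only so that the mean is preserved exactly.

```python
from fractions import Fraction


def null_vector(A):
    """Return a nonzero c with A c = 0, for a matrix with more columns than rows."""
    A = [[Fraction(v) for v in row] for row in A]
    n_rows, n_cols = len(A), len(A[0])
    pivots = []  # pivot column of each reduced row
    row = 0
    for col in range(n_cols):
        if row == n_rows:
            break
        pivot = next((r for r in range(row, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[row], A[pivot] = A[pivot], A[row]
        scale = A[row][col]
        A[row] = [v / scale for v in A[row]]
        for r in range(n_rows):
            if r != row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[row])]
        pivots.append(col)
        row += 1
    # At most n_rows pivots exist, so some column is free.
    free = next(c for c in range(n_cols) if c not in pivots)
    c_vec = [Fraction(0)] * n_cols
    c_vec[free] = Fraction(1)
    for r, pivot_col in enumerate(pivots):
        c_vec[pivot_col] = -A[r][free]
    return c_vec


def caratheodory(points, weights):
    """Reduce a discrete distribution on R^J to at most J + 1 atoms with the same mean."""
    pts = [tuple(Fraction(v) for v in p) for p in points]
    w = [Fraction(x) for x in weights]  # weights must be positive and sum to one
    J = len(pts[0])
    while len(pts) > J + 1:
        # One row per coordinate plus a row of ones, so that c keeps the
        # mean and the total mass unchanged.
        A = [[p[j] for p in pts] for j in range(J)] + [[1] * len(pts)]
        c = null_vector(A)
        # c is nonzero and sums to zero, so it has a positive entry.
        step = min(w[k] / c[k] for k in range(len(pts)) if c[k] > 0)
        w = [w[k] - step * c[k] for k in range(len(pts))]
        keep = [k for k in range(len(pts)) if w[k] > 0]
        pts = [pts[k] for k in keep]
        w = [w[k] for k in keep]
    return pts, w


# Six equally weighted atoms in R^2 (J = 2), reduced to at most three.
atoms = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
probs = [Fraction(1, 6)] * 6
reduced_atoms, reduced_probs = caratheodory(atoms, probs)
```

Each pass moves probability mass along a direction that leaves both the total mass and the mean unchanged, and stops as soon as one weight reaches zero. This is why the expected moment under any conditional distribution can be reproduced by one with at most $J + 1$ support points.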

Then I have the main result of the second step of the dimension reduction:

Proposition 1.2.2. The population objective function in (1.1) has the following equivalent representation:

$$Q(\theta) = \min_{r(\cdot) \in \mathcal{F}} \left\| E_{\pi_0}[g(W_i, r(W_i); \theta)] \right\|. \quad (1.2)$$

Proof. See Appendix.

The above dimension reduction technique fully exploits the fact that the image of $g(W_i, \cdot; \theta)$ is convex and compact under the setup considered in this chapter. It does not have a general extension to other environments. For example, suppose the image of $g(W_i, \cdot; \theta)$ is not convex, as shown in Figure 1.1. Then the image of the linear operator $E_{\mu}\, g(W_i, U_i; \theta)$, taking $\mu$ as input, is as shown in Figure 1.2, and clearly considering only the set in Figure 1.1 is not enough.

[Figure 1.1: Image of $g(W_i, \cdot; \theta)$]

[Figure 1.2: Image of $E_{\mu}[g(W_i, \cdot; \theta)]$]

1.3 Estimation

For any $r = (r_1, r_2, \ldots, r_n)$, let $\hat{\rho}(r)$ denote the empirical measure associated with it. Given the result from the previous section, consider the following sample objective function, which will later be used for estimating $\Theta_0$. Define

$$\hat{Q}_n(\theta) = \min_{r_i \in S(W_i),\ \hat{\rho}(r) \in \mathcal{U}} \left\| \sum_{i=1}^{n} g(W_i, r_i; \theta) \right\| \quad (1.3)$$

and

$$\hat{\Theta} = \{\theta \in \Theta : \hat{Q}_n(\theta) \le C\},$$

where $C$ is a constant.
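As a hedged illustration of the sample objective function (1.3), consider the interval-data example from the introduction with no restriction on the marginal distribution of the latent variable. The sketch assumes a scalar moment $g(W_i, U_i; \theta) = H_i + U_i - \theta$ with $W_i = (L_i, H_i)$ and $S(W_i) = [L_i - H_i, 0]$, and it averages the moments over $i$ (a normalization chosen here for readability, not taken from (1.3)). Under these assumptions the inner minimization has a closed form, because the attainable average moments form an interval. The names `q_hat` and `set_estimate`, the data, and the cutoff value are all hypothetical.

```python
def q_hat(theta, lower, upper):
    """Smallest attainable |average moment| over r_i in [lower_i - upper_i, 0]."""
    n = len(lower)
    smallest = sum(lower) / n - theta  # every r_i at its lower bound
    largest = sum(upper) / n - theta   # every r_i equal to zero
    # Distance from zero to the interval [smallest, largest].
    return max(0.0, smallest, -largest)


def set_estimate(grid, lower, upper, cutoff):
    """Grid points whose sample objective does not exceed the cutoff."""
    return [theta for theta in grid if q_hat(theta, lower, upper) <= cutoff]


# Hypothetical interval data: the unobserved value lies in [lower_i, upper_i].
lower = [1, 2, 3]
upper = [2, 4, 6]
grid = [k / 10 for k in range(61)]  # candidate values of theta in [0, 6]
estimate = set_estimate(grid, lower, upper, cutoff=0.05)
```

With these made-up numbers the estimate is the set of grid points between the average lower bound and the average upper bound. The cutoff plays the role of the constant $C$, and a larger cutoff yields a larger set.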

In this section, we will prove given a choice of C, d H ( ˆΘ, Θ 0 ) p 0, where d H is the Hausdorff distance defined as: d H ( ˆΘ, Θ 0 ) = max { sup θ ˆΘ inf θ θ, sup inf θ θ } θ Θ 0 θ Θ 0 θ ˆΘ The analysis of this chapter assumes that W i is i.i.d. across i, the result can also be generalized to the case when W i has some form of weak dependence (e.g., mixing). First we have the following result in point convergence of sample objective function (1.3) to the population objective function (1.2). Lemma 1.3.1. Suppose either one of the following holds: U i, (i) R = C, i.e., there is no restriction placed over the marginal distribution of (ii) g(w i, ; θ) is a continuous function, then for any θ Θ, when n, we have min r i S(W i ) ˆρ(r) U n i=1 Proof. See Appendix. g(w i, r i ; θ) p min r( ) F E π 0 [g(w i, r(w i ); θ)] Condition (i) and (ii) can be abandoned at a cost of introducing tuning parameters. The choice of tuning parameters depends on what kind of restrictions are imposed on the marginal distribution of U i and is different from case to case, therefore, I am not going to formalize this approach in this chapter. As an example, if the model requires EU i 0, then one can require that in equation (1.2), the solution r has to satisfy n ri b n i=1 where b n is a sequence of negative numbers that converges to 0 at a rate slower than 1 n. To obtain uniform convergence we need the following stochastic equicontinuity condition. Assumption 1.3.1. { ˆQ n (θ) : n 1} is stochastic equicontinuous on Θ, i.e., 9

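$\forall \varepsilon > 0$, $\exists \delta > 0$ such that

$$\limsup_{n \to \infty} P\left( \sup_{\theta \in \Theta}\ \sup_{\theta' \in B(\theta, \delta)} \left| \hat Q_n(\theta) - \hat Q_n(\theta') \right| > \varepsilon \right) < \varepsilon.$$

Now I apply the result from Chernozhukov et al. (2007) and construct the set estimator as

$$\hat\Theta = \{\theta \in \Theta : Q_n(\theta) \le \hat c\},$$

where $\hat c$ is a data-dependent number such that $\hat c \ge \sup_{\theta \in \Theta_0} Q_n(\theta)$ with probability approaching 1 and $\hat c / n \to_p 0$. We can then apply Theorem 3.1 from Chernozhukov et al. (2007) to establish the following proposition:

Proposition 1.3.1. Assume (1) the parameter space $\Theta$ is a nonempty compact subset of $\mathbb{R}^{d_\Theta}$, (2) $g(\cdot, \cdot\,; \theta)$ is a measurable function, and (3) Assumptions 1.2.1 and 1.3.1 hold. Then $\Theta_0 \subseteq \hat\Theta$ with probability approaching 1 as $n \to \infty$, and $d_H(\hat\Theta, \Theta_0) = o_p(1)$, where $d_H$ is the Hausdorff metric.

Proof. See Appendix.

In the actual application, I use the following sample objective function for efficiency:

$$\hat Q_n(\theta) = \left\| \Omega_n(\theta)^{-1/2}\, \frac{1}{n} \sum_{i=1}^{n} g(w_i, r_i^*; \theta) \right\| - \inf_{\theta' \in \Theta} \left\| \Omega_n(\theta')^{-1/2}\, \frac{1}{n} \sum_{i=1}^{n} g(w_i, r_i^*; \theta') \right\|,$$

where

$$r_i^* \in \arg\min_{r_i \in S(W_i),\ \hat\rho(r) \in \mathcal{U}} \left\| \frac{1}{n} \sum_{i=1}^{n} g(w_i, r_i; \theta) \right\|$$

and

$$\Omega_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( g(w_i, r_i^*; \theta) \right) \left( g(w_i, r_i^*; \theta) \right)'.$$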
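The Hausdorff distance that appears in Proposition 1.3.1 can be made concrete for finite sets. The sketch below is only an illustration under the assumption that both sets are approximated by finite grids of points; the function name and the example grids are hypothetical.

```python
import math


def hausdorff(A, B):
    """Hausdorff distance between two finite, nonempty sets of points (tuples)."""
    def directed(X, Y):
        # sup over x in X of the distance from x to its nearest point in Y
        return max(min(math.dist(x, y) for y in Y) for x in X)

    return max(directed(A, B), directed(B, A))


# Hypothetical one-dimensional grids standing in for the estimate and the identified set.
theta_hat_grid = [(0.0,), (1.0,)]
theta_0_grid = [(0.0,), (0.5,), (2.0,)]
d = hausdorff(theta_hat_grid, theta_0_grid)
print(d)
```

The two directed terms correspond to the two arguments of the max in the definition of $d_H$: the first penalizes points of the estimate that are far from the identified set, the second penalizes points of the identified set that the estimate misses.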

1.4 Inference

Now we consider confidence sets for the identified set $\Theta_0$ and the true value $\theta_0$. Here I use the QLR test statistic

$$QLR_n(\theta) = \min_{r_i \in S(W_i),\ \hat\rho(r) \in \mathcal{U}} \left\| \Omega_n^{-1/2}(\theta)\, \frac{1}{\sqrt{n}} \sum_{i=1}^{n} g(w_i, r_i; \theta) \right\|^2.$$

The confidence set for $\Theta_0$ is constructed as

$$CR(\Theta_0) = \{\theta \in \Theta : QLR_n(\theta) \le \hat C(1 - \alpha)\}.$$

We seek $\hat C(1 - \alpha)$ such that

$$\liminf_{n \to \infty} P\left( \Theta_0 \subseteq CR(\Theta_0) \right) \ge 1 - \alpha, \tag{1.4}$$

where $1 - \alpha$ is the desired coverage probability. The idea is to apply the procedures developed by Chernozhukov et al. (2007) and Romano and Shaikh (2010) to choose $\hat C(1 - \alpha)$ via subsampling. The following convergence result follows directly from the estimation part.

Lemma 1.4.1. $\sup_{\theta \in \Theta_0} QLR_n(\theta)$ converges in distribution to a nondegenerate and continuous distribution as $n \to \infty$.

Now consider the following subsampling procedure, similar to Romano and Shaikh (2008):

1. Let $S = \Theta$. If $\sup_{\theta \in S} QLR_n(\theta) \le \hat C(S, 1 - \alpha)$, then accept all hypotheses and stop. Here $\hat C(S, 1 - \alpha)$ is an estimator of the $1 - \alpha$ quantile of $\sup_{\theta \in S} QLR_n(\theta)$, which is discussed later.
2. If not, then set $S = \{\theta \in \Theta : QLR_n(\theta) \le \hat C(S, 1 - \alpha)\}$ and repeat step 1.

The above procedure may be computationally infeasible in practice. In that case, one can use the $1 - \alpha$ quantile of $\sup_{\theta \in \hat\Theta} QLR_n(\theta)$ as $\hat C$, similar to Chernozhukov et al. (2007), where $\hat\Theta$ is the set estimator from Section 1.3. Applying Theorem 2.1 from Romano and Shaikh (2010), we have that (1.4) holds for the above procedure. Now I introduce a subsampling procedure to estimate $\hat C(S, 1 - \alpha)$. To do this, we need some more notation:

For each $\theta$, let

$$r_i^*(\theta) \in \arg\min_{r_i \in S(W_i),\ \hat\rho(r) \in \mathcal{U}} \left\| \frac{1}{n} \sum_{i=1}^{n} g(w_i, r_i; \theta) \right\|$$

and let the QLR statistic evaluated at this minimizer be

$$QLR_n(\theta) = \left\| \Omega_n^{-1/2}(\theta)\, \frac{1}{\sqrt{n}} \sum_{i=1}^{n} g(w_i, r_i^*(\theta); \theta) \right\|^2.$$

Let $b_n < n$ be a sequence of positive integers tending to infinity with $b_n / n \to 0$ as $n \to \infty$. Let $N_n = \binom{n}{b_n}$ and let

$$QLR_{n, b_n, k}(\theta) = \left\| \Omega_n^{-1/2}(\theta)\, \frac{1}{\sqrt{b_n}} \sum_{i=1}^{b_n} g(w_i, r_i^*(\theta); \theta) \right\|^2$$

be the QLR statistic evaluated at the $k$th subset of data of size $b_n$, with the sum running over the observations in that subset. For any set $S \subseteq \Theta$, the estimator is defined as

$$\hat C(S, 1 - \alpha) = \inf\left\{ x : \frac{1}{N_n} \sum_{k=1}^{N_n} 1\left\{ \sup_{\theta \in S} QLR_{n, b_n, k}(\theta) \le x \right\} \ge 1 - \alpha \right\}.$$

We can also construct a confidence set for the true parameter value $\theta_0$ following Romano and Shaikh (2008); the idea is to exploit the duality between hypothesis testing and confidence sets. I again use the above QLR statistic, and the confidence region is defined as

$$CR(\theta_0) = \{\theta \in \Theta : H_0 : \theta_0 = \theta \text{ cannot be rejected}\} = \{\theta \in \Theta : QLR_n(\theta) \le \hat C(1 - \alpha)\}.$$

The estimator of the $1 - \alpha$ quantile is obtained, as above, via a subsampling procedure. Let $b_n < n$ be a sequence of positive integers tending to infinity with $b_n / n \to 0$ as $n \to \infty$, let $N_n = \binom{n}{b_n}$, and let $QLR_{n, b_n, k}(\theta)$, defined as above, be the QLR statistic evaluated at the $k$th subset of data of size $b_n$. Notice that

such subsampling draws are taken at the market level. For any $\theta \in \Theta$, the estimator is defined as

$$\hat C(1 - \alpha) = \inf\left\{ x : \frac{1}{N_n} \sum_{k=1}^{N_n} 1\left\{ QLR_{n, b_n, k}(\theta) \le x \right\} \ge 1 - \alpha \right\}.$$

In practice, researchers are often interested in particular parameters rather than the full parameter vector. In such cases, I project the full confidence region onto the parameters of interest to obtain the confidence interval for each of them.

1.5 Conclusion

This chapter studies a class of models defined by moment equalities with latent variables in the moment function. Such moment restrictions can be generated by any economic model in which some key variables are not observed. Because of the latent variables, the parameters in such models are usually not point identified, as each possible distribution of the latent variables can lead to a set of parameter values. The difficulty in estimating such models is that the family of distributions of the latent variables is usually infinite dimensional, so searching over it in an estimation procedure is not feasible. Previous work in the literature has proposed different methods to reduce the dimension of the problem when no restriction is imposed on the family of distributions. This chapter contributes to the literature by introducing a novel method that reduces the dimension of the problem while allowing restrictions to be imposed on the family of distributions. An estimator and an inference procedure for the identified set and the true parameter value are proposed based on the dimension reduction method. The asymptotic properties of the estimator and confidence region are justified theoretically. An application of the method to demand estimation can be found in Chapter 2 of the thesis.
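As a closing illustration of the inference step in Section 1.4, the subsampling critical value and the step-down iteration can be sketched on a finite parameter grid. The sketch makes strong simplifying assumptions that are not part of the chapter: the moment function is the scalar g(w; θ) = w − θ with no latent variable (so the inner minimization over r disappears), the weighting matrix is fixed at 1, all subsets of size b are enumerated rather than sampled, and the function names and data are hypothetical.

```python
import itertools


def qlr(sample, theta):
    """Toy QLR statistic: (m^{-1/2} * sum(w_i - theta))^2 with the weight fixed at 1."""
    m = len(sample)
    return sum(w - theta for w in sample) ** 2 / m


def critical_value(data, b, level, S):
    """inf{x : share of size-b subsets with sup_{theta in S} QLR <= x is >= level}."""
    sups = sorted(
        max(qlr(subset, theta) for theta in S)
        for subset in itertools.combinations(data, b)
    )
    n_subsets = len(sups)
    for k, x in enumerate(sups, start=1):
        if k / n_subsets >= level:
            return x
    return sups[-1]


def step_down(data, b, level, grid):
    """Iterate steps 1 and 2 of the procedure on a finite grid standing in for Theta."""
    S = list(grid)
    while True:
        c = critical_value(data, b, level, S)
        if max(qlr(data, theta) for theta in S) <= c:
            return S, c  # step 1: nothing left to reject
        S = [theta for theta in grid if qlr(data, theta) <= c]  # step 2
        if not S:
            return S, c


data = [0.1, 0.4, 0.5, 0.6, 0.9, 0.5]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
S_hat, c_hat = step_down(data, b=3, level=0.9, grid=grid)
print(S_hat, c_hat)
```

Because the critical value is a quantile of a supremum over S, it can only fall as S shrinks, so the retained sets are nested and the loop stops after finitely many passes on a finite grid. In an actual application the subsets would typically be drawn at random, at the market level, rather than enumerated.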

Chapter 2

Robust Inference for Differentiated Product Demand Systems

2.1 Introduction

Many recent works in economics and marketing have estimated differentiated product demand systems using the method developed by Berry (1994) and Berry et al. (1995) (henceforth BLP). BLP generates choice probabilities from a (random coefficient) discrete choice model and then matches them with market shares. A key assumption in this approach is that the observed market share of a product equals the choice probability generated at the true parameter value. However, this assumption does not always hold. First, consumers sometimes face a more restricted choice set than the full set of products considered by the econometrician. For example, if a consumer's preferred product stocks out and the exact timing of the stock-out is not observed, the market share of the stocked-out product is capped at its capacity and is therefore lower than its actual choice probability, while the market shares of its available substitutes are higher than their choice probabilities. Such choice set variation cannot be fixed by changing the discrete choice model, because variation in the choice set is typically unobserved. In addition, when the market size is small, the market share is not an accurate representation of the corresponding choice probability because of sampling error [1]. In some cases, even zero market shares are observed. Lastly, it is often the case that market size is not properly measured and researchers choose a set of numbers based on experience as a robustness check. If we ignore these issues and proceed to estimation, the estimators can be inconsistent and the confidence sets can undercover the true parameter values, thereby invalidating counterfactual analysis based on the demand estimates.

[1] Berry et al. (2004) are aware of the sampling error in the BLP framework and impose regularity assumptions that let the market size grow at a much higher rate than the sample size to make the BLP estimator consistent.

This paper seeks to solve the problems that result from any type of measurement error found in actual applications. It adopts a partial identification approach that introduces a vector of latent variables to represent the measurement error. For example, in the stock-out situation, I introduce a vector of latent variables that represent the unfulfilled demand for stocked-out products and the spillover demand for their substitutes. Instead of replacing the choice probabilities with the market shares, as suggested by BLP, this paper proposes to replace the choice probabilities with the market shares plus the latent variables and then to generate moment conditions from this replacement. The model is partially identified because each conditional distribution of the latent variables corresponds to a vector of parameter values. The parameter values in the identified set are those for which one can find a conditional distribution of the latent variables that satisfies the moment restrictions. The key idea for bounding the identified set is to bound the support of these latent variables under different measurement error situations. This results in moment equalities with latent variables in the moment functions, which differs from the traditional moment inequality method. The identified set is sharp by construction.

This paper also proposes a novel method to estimate this type of model, which accounts for the special dependence structure within a market [2], as the existing methods in the literature are designed to handle only independent or weakly dependent data and cannot be easily generalized to the context of demand estimation. Hence, this method makes it possible to identify and estimate the demand function with either market- or product-level variation. As far as I know, this is the first paper in the literature to attempt to solve the stock-out problem in demand estimation under the BLP framework, which is known to handle the price endogeneity issue while allowing for rich substitution patterns.

[2] For example, if the firms compete in price, then the price of each product depends on the prices of all the products in that market.

The industrial organization and marketing literatures are aware that consumers sometimes face a limited choice set due to stock-out events. A natural response to remedy this issue would be to consider a different model other

than BLP that can account for stock-out events. The problem is that the timing of a stock-out event can never be pinned down to an exact time point; therefore, the variation in some consumers' choice sets is unobserved. Various approaches have been developed to address this problem. Conlon and Mortimer (2013) propose a model for an experimental data set on vending machines with observed inventory levels, so that the timing of stock-out events can be partially pinned down. Musalem et al. (2010) develop a method to simulate consumers' choice sets using a scanner data set acquired from grocery stores. Other papers of interest in this literature include Bruno and Vilcassim (2008), Che et al. (2012), Campo et al. (2004), Ching et al. (2015), and Matsa (2011), which address different out-of-stock environments and their implications. However, to point identify the demand function, these methods either fail to account for price endogeneity, which plays a critical role in demand estimation, or hinge on strong assumptions about the stock-out process (e.g., a uniform consumer arrival process, which assumes that consumers with different tastes arrive at the store with equal probability). Moreover, some of the proposed methods rely on simulating consumers' arrival processes to approximate the timing of an out-of-stock event. Such simulations always assume a specific parametric family of distributions and thus restrict consumers' choice and substitution patterns.

This paper deals with the unobserved choice set variation by introducing the latent variables discussed above. I only require the econometrician to observe a binary variable that indicates whether a product has experienced a stock-out during a given observation period [3]. In addition, the latent variables are modeled in a way that is agnostic about their dependence on the observed and unobserved variables and about any possible consumer arrival process. Hence, the approach allows for very rich consumer choice and substitution patterns. Moreover, the treatment stays within the BLP framework, which makes it possible to deal with endogenous prices by using a set of instrumental variables.

[3] It does not have to be an exact time. For example, if one has weekly data, then I require researchers to know whether a product has ever experienced a stock-out during a week.

The measurement error in the market shares falls under the general situation of measurement error in dependent variables, such as censoring. However, the traditional approach to point identification, which uses a quantile independence assumption [4], cannot be applied here because the unobserved shocks are multi-dimensional and buried inside a nonlinear system.

[4] See, e.g., Powell (1984) and the literature thereafter.

Another common treatment is to simply drop

the contaminated observations; however, this leads to sample selection bias.

When the market size is small and the sampling error is large, Gandhi et al. (2013) [5] (henceforth GLS) propose a moment inequality method to characterize the identified set. Their method, however, does not deliver a sharp identified set, because the bounds of the moment inequalities can be attained by parameter values outside the sharp identified set due to dependence within a market. The moment inequality approach can also only be used to estimate the demand function based on market-level variation, by aggregating the product-level moments. This is because the random variables within the same market are usually dependent in a complicated way, which is neither independence nor any form of weak dependence, due to the strategic competition within a market, and none of the current estimation procedures in the moment inequality literature can deal with such a dependence structure. Under the same set of assumptions as in GLS, this paper provides a sharp characterization of the identified set and is able to handle within-market dependence.

[5] In their latest draft, Gandhi et al. (2017), they abandon the partial identification approach and adopt a new assumption to point identify the demand function. They still claim that the old draft is applicable when the point identification assumption does not hold.

The price to pay for using the method provided in this paper is that the identified set is defined not by moment inequalities but by moment equalities with latent variables. Unlike sets defined by moment inequalities, there are very few methods that aim to estimate sets defined by moment equalities. A major problem is that the optimization required in the model is performed over the space of conditional distributions of the latent variables, which is an infinite dimensional space and hence infeasible to compute. Ekeland et al. (2010), Galichon and Henry (2013), and Schennach (2014) develop methods to transform the infinite dimensional optimization problem into a finite dimensional dual problem. However, their methods are designed for independent or weakly dependent data and do not fit the problem considered here because of the special dependence structure within a market. This paper proposes a new method to reduce the dimension of the original problem by exploiting the fact that the support of the conditional distribution of the latent variables is always a compact and convex set. The technique hinges on the fact that the space of probability measures over a compact support is convex and compact in the weak topology; therefore, the optimum can be attained by searching only over the set of extreme points in this space,

which is finite dimensional. After the objective function is turned into a feasible finite dimensional optimization problem, the construction of a set estimator and confidence region can be performed using the methods suggested by Chernozhukov et al. (2007), Romano and Shaikh (2008), and Romano and Shaikh (2010).

I conduct two simulation studies of the proposed method, both of which focus on the out-of-stock situation. The first is a simple logit setup in which there is only one product in each market and I can obtain an analytical form of the identified set. The simulation suggests that, even with a very low stock-out rate across markets, the BLP confidence interval severely undercovers the true parameter value. For example, if 3 out of 100 markets experience a stock-out, then a nominal 95% confidence interval covers the true parameter value only 47.1% of the time. In contrast, the method provided in this paper always has the correct coverage probability for both the identified set and the true parameter value. In the second simulation, I generate data similar to the situation in the empirical application considered in this paper. The confidence region for the true parameter value proposed in this paper still provides correct coverage, while the BLP estimator is inconsistent.

I apply the method developed in this paper to a scanner data set consisting of weekly sales data for the shampoo product category, obtained from multiple grocery stores in different regions. The scanner data set itself does not contain information about stock-out events. However, it is very clear from the data that, at some points, a product experiences an abnormally low volume of sales before returning to standard levels in the following week. I apply a method described in the existing marketing literature (Gruen et al. (2002), Gruen and Corsten (2007), and Grubor and Milicevic (2015)) to determine stock-out events. The empirical results suggest that the procedure developed in this paper results in a significant correction to the price elasticity in comparison with the BLP estimators, exhibiting a pattern similar to that identified in the simulation study. This provides a more robust counterfactual analysis from the demand side.

The remainder of this paper is organized as follows. In Section 2.2, I introduce a simple two-product example to highlight the primary contribution and ideas of this paper. In Section 2.3, I formally introduce the discrete choice model used throughout the paper. In Section 2.4, I discuss the new identification strategy and the identified set introduced in this paper. In Section 2.5, I provide a method for conducting estimation and inference on the identified set and the true parameter value. In Section 2.6, I describe the finite sample performance of the new

estimator and confidence set via Monte Carlo simulations. In Section 2.7, I apply the method to a scanner data set acquired from the retail industry. Section 2.8 concludes the paper.

2.2 Motivating Example

Suppose there are two products $j = 1, 2$ and an outside option $j = 0$ in each market $t \in \{1, 2, \ldots, T\}$, and the consumers in each market choose to buy one and only one product or the outside option. The utility of consumer $i$ choosing product $j$ in market $t$ is represented as

$$u_{ijt} = \theta + \xi_{jt} + \varepsilon_{ijt},$$

where $\xi_{jt} \in \mathbb{R}$ is product $j$'s unobserved characteristic in market $t$ with mean zero, $\varepsilon_{ijt} \in \mathbb{R}$ is the consumer's idiosyncratic shock, and $\theta \in \Theta \subseteq \mathbb{R}$ is the parameter of interest. The utility of the outside option is normalized to zero: $u_{i0t} = 0$. Assume $\varepsilon_{ijt}$ is independent across $i$, $j$, and $t$ and follows a Type I extreme value distribution. Consumers choose the option that gives them the highest utility. Since we do not observe a consumer's idiosyncratic shock, a canonical result says that we can integrate it out to obtain the choice probability of product $j$ in market $t$ in the following closed form:

$$\sigma_{jt} = \frac{\exp(\theta + \xi_{jt})}{1 + \exp(\theta + \xi_{1t}) + \exp(\theta + \xi_{2t})}, \quad j = 1, 2 \tag{2.1}$$

and

$$1 - \sigma_{1t} - \sigma_{2t} = \sigma_{0t} = \frac{1}{1 + \exp(\theta + \xi_{1t}) + \exp(\theta + \xi_{2t})}. \tag{2.2}$$

We can then solve the system of equations (2.1) and (2.2) for $\xi_{jt}$ and get

$$\xi_{jt} = \log\left( \frac{\sigma_{jt}}{1 - \sigma_{1t} - \sigma_{2t}} \right) - \theta.$$

Since $E[\xi_{jt}] = 0$, letting the true value of $\theta$ be $\theta_0$, we have

$$E[\xi_{jt}] = E\left[ \log\left( \frac{\sigma_{jt}}{1 - \sigma_{1t} - \sigma_{2t}} \right) - \theta \right] = 0 \tag{2.3}$$

when $\theta = \theta_0$.

Assume we observe the market share of product $j$ in market $t$, $s_{jt}$, and there is

no measurement error in the market share ($\sigma_{jt} = s_{jt}$); then we can replace $\sigma_{jt}$ with $s_{jt}$ in equation (2.3) and get

$$E\left[ \log\left( \frac{s_{jt}}{1 - s_{1t} - s_{2t}} \right) - \theta \right] = 0 \tag{2.4}$$

when $\theta = \theta_0$. Equation (2.4) can be used to identify and estimate $\theta_0$.

However, equating market shares to choice probabilities can introduce measurement error. For example, if at some point in market $t$ product 1 is out of stock, then the market share of product 1 is never larger than the choice probability evaluated at the true parameter values, i.e., $s_{1t} \le \sigma_{1t}$. On the other hand, because of the forced substitution caused by the stock-out of product 1, the market share of product 2 is never smaller than the choice probability evaluated at the true parameter values, i.e., $s_{2t} \ge \sigma_{2t}$. There can be other types of measurement error in the absence of stock-out events. Notice that the market share is the sample mean, while the choice probability is the population mean, of a multinomial distribution. There is always sampling error, which may not be negligible when the market size is small. Lastly, if researchers are uncertain about the market size when computing the market shares, the market shares are always measured with error. Hence, the traditional identification strategy discussed above no longer works when there is measurement error. If a researcher estimates the demand function with the strategy outlined above, the estimators will generally be inconsistent.

Let $r_{jt}$ be the measurement error such that $s_{jt} + r_{jt} = \sigma_{jt}$, let $\pi_0$ be the true joint distribution of $(s_{1t}, s_{2t})$, and let $\mu$ be a conditional distribution of $(r_{1t}, r_{2t})$ given $(s_{1t}, s_{2t})$. Notice that since the products are modeled in a symmetric way [6], $\pi_0$ is symmetric in its arguments $(s_{1t}, s_{2t})$. Similarly, the marginal distribution of $\sigma_{1t}$ should be the same as the marginal distribution of $\sigma_{2t}$, which implies that the distribution of $(s_{1t} + r_{1t}, s_{2t} + r_{2t})$ under $\pi_0 \times \mu$ should also be symmetric.
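The inversion in (2.1) to (2.4), and the way a stock-out distorts it, can be checked numerically. The sketch below is only illustrative: the parameter value, the three markets, and the assumption that a stock-out shifts a fixed 20% of product 1's demand to product 2 (leaving the outside share unchanged) are all hypothetical choices, as are the function names.

```python
import math


def choice_probs(theta, xi1, xi2):
    """Logit choice probabilities (2.1) for products 1 and 2."""
    e1, e2 = math.exp(theta + xi1), math.exp(theta + xi2)
    denom = 1.0 + e1 + e2
    return e1 / denom, e2 / denom


def inverted_utility(s1, s2, j):
    """log(s_j / (1 - s_1 - s_2)), the share inversion used in (2.4)."""
    s_j = s1 if j == 1 else s2
    return math.log(s_j / (1.0 - s1 - s2))


def theta_hat(shares, j):
    """Sample analog of (2.4) for product j: average inverted utility across markets."""
    return sum(inverted_utility(s1, s2, j) for s1, s2 in shares) / len(shares)


theta0 = 0.5
# Unobserved characteristics, chosen to average to zero across markets for each product.
xi = [(-0.3, 0.3), (0.3, -0.3), (0.0, 0.0)]
sigma = [choice_probs(theta0, a, b) for a, b in xi]

clean = theta_hat(sigma, 1)  # shares equal choice probabilities

# Hypothetical stock-out: 20% of product 1's demand spills over to product 2.
observed = [(0.8 * p1, p2 + 0.2 * p1) for p1, p2 in sigma]
biased_1 = theta_hat(observed, 1)
biased_2 = theta_hat(observed, 2)
print(clean, biased_1, biased_2)
```

With error-free shares the moment condition recovers the true value up to rounding, while the stock-out pushes the estimate based on product 1 below it and the estimate based on product 2 above it, in line with $s_{1t} \le \sigma_{1t}$ and $s_{2t} \ge \sigma_{2t}$.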
Let $\mathcal{C}_t(s_{1t}, s_{2t})$ be the collection of $\mu$ that satisfy the above symmetry restriction. The sharp identified set is characterized as

$$\left\{ \theta \in \Theta : \exists\, \mu \in \mathcal{C}_t(s_{1t}, s_{2t}),\ E_{\pi_0 \times \mu}\left[ \log\left( \frac{s_{jt} + r_{jt}}{1 - s_{1t} - s_{2t} - r_{1t} - r_{2t}} \right) - \theta \right] = 0 \right\}. \tag{2.5}$$

[6] Switching the subscript $j$ does not affect anything.

Notice that since we do not observe the true value of $r_{jt}$, and for each possible

$\mu$, there is one $\theta$ that satisfies the above moment constraints, we obtain a set of $\theta$ values that satisfy the moment constraints in addition to $\theta_0$ [7]; the parameters are therefore generally not point identified. We could further restrict $\mathcal{C}_t(s_{1t}, s_{2t})$ based on different measurement error environments, as discussed in later sections, but for this simple example I focus on the identified set characterized in (2.5).

[7] Let the true conditional distribution be $\mu_0$; then $\theta_0$ is the unique parameter value that satisfies

$$E_{\pi_0 \times \mu_0}\left[ \log\left( \frac{s_{jt} + r_{jt}}{1 - s_{1t} - s_{2t} - r_{1t} - r_{2t}} \right) - \theta \right] = 0.$$

Providing a set estimator for the identified set defined in (2.5) is not an easy task, as $\mathcal{C}_t(s_{1t}, s_{2t})$ is infinite dimensional. Before I explain my method for estimating it, let us look at a moment inequality approach to characterizing an identified set, which is easier to estimate. In the case where the market size is small and there is a large sampling error, GLS propose the following moment inequality:

$$E\left[ \log\left( \frac{\tilde s_{1t} + \eta}{1 - \tilde s_{1t} - \tilde s_{2t} - \eta} \right) - \theta \right] \ge 0, \tag{2.6}$$

where $\eta = \max_{\sigma_{1t}, \sigma_{2t}} \eta_{1t}(\sigma_{1t}, \sigma_{2t})$ is a constant and $\eta_{1t}(\sigma_{1t}, \sigma_{2t})$ is the solution to

$$E\left[ \log\left( \frac{\tilde s_{1t} + \eta_{1t}}{1 - \tilde s_{1t} - \tilde s_{2t} - \eta_{1t}} \right) \,\Big|\, \sigma_{1t}, \sigma_{2t} \right] = \log\left( \frac{\sigma_{1t}}{1 - \sigma_{1t} - \sigma_{2t}} \right).$$

Here $\tilde s_t$ is introduced to solve the issue that the logarithm is not defined at zero and has the following representation:

$$\tilde s_t = \frac{n_t s_t + 1}{n_t + 3}, \tag{2.7}$$

where $n_t$ is the market size of market $t$. Let $\bar\theta$ be such that

$$E\left[ \log\left( \frac{\tilde s_{1t} + \eta}{1 - \tilde s_{1t} - \tilde s_{2t} - \eta} \right) - \bar\theta \right] = 0, \tag{2.8}$$

which satisfies the inequality in (2.6).
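The share adjustment in (2.7) and its effect on the log ratio in (2.8) can be checked in exact rational arithmetic. The sketch below is a small numerical check under hypothetical inputs (the market size, shares, and value of η are made up, and the function names are not from the paper); reading the constant 3 in (2.7) as the number of options in this example, two products plus the outside option, is an interpretation rather than something the text states. The second function compares the adjusted ratio with the form obtained by rearranging it in terms of the raw shares, with additive terms $1/n_t + \frac{n_t + 3}{n_t}\eta$ on product 1 and $-2/n_t$ on product 2.

```python
from fractions import Fraction


def adjust(share, n):
    """Adjusted share (n * s + 1) / (n + 3), as in (2.7)."""
    return (n * share + 1) / (n + 3)


def ratio_adjusted(s1, s2, n, eta):
    """(s~_1 + eta) / (1 - s~_1 - s~_2 - eta), the ratio inside the log in (2.8)."""
    a1, a2 = adjust(s1, n), adjust(s2, n)
    return (a1 + eta) / (1 - a1 - a2 - eta)


def ratio_raw(s1, s2, n, eta):
    """Same ratio written as (s_1 + r_1) / (1 - s_1 - s_2 - r_1 - r_2)."""
    r1 = Fraction(1, n) + Fraction(n + 3, n) * eta
    r2 = Fraction(-2, n)
    return (s1 + r1) / (1 - s1 - s2 - r1 - r2)


n = 10
# Hypothetical market with a zero share for product 2: (outside, product 1, product 2).
raw = (Fraction(7, 10), Fraction(3, 10), Fraction(0))
adjusted = tuple(adjust(s, n) for s in raw)
print(adjusted)

s1, s2, eta = Fraction(3, 10), Fraction(5, 10), Fraction(1, 100)
print(ratio_adjusted(s1, s2, n, eta), ratio_raw(s1, s2, n, eta))
```

The adjusted shares stay strictly positive and still sum to one across the three options, which is what makes the logarithm well defined when a raw share is zero.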

Substituting (2.7) into (2.8) and rearranging, we get

$$E\left[ \log\left( \frac{s_{1t} + \frac{1}{n_t} + \frac{n_t + 3}{n_t}\eta}{1 - s_{1t} - s_{2t} + \frac{1}{n_t} - \frac{n_t + 3}{n_t}\eta} \right) - \bar\theta \right] = 0.$$

To match this with (2.5), $\bar\theta$ is such that $r_{1t} = \frac{1}{n_t} + \frac{n_t + 3}{n_t}\eta$ and $r_{2t} = -\frac{2}{n_t}$ regardless of the value of $(s_{1t}, s_{2t})$, which violates the symmetry requirement on $\pi_0 \times \mu$. This implies that $\bar\theta$, which is in the identified set characterized by the moment inequality, does not belong to the sharp identified set.

GLS provide a way to characterize an identified set via moment inequalities, and the above discussion shows that it is not the sharp set. One can propose other moment inequality characterizations; however, none of them would lead to a sharp identified set, because they always fail the symmetry requirement in a way similar to GLS (unless there is only one product in each market). Although the identified set characterized by moment inequalities is easier to estimate, for the above reason this paper estimates the set in (2.5) directly without using moment inequalities. To solve the infinite dimensional problem, this paper proves that it is sufficient to consider the set of Dirac measures [8] in $\mathcal{C}_t(s_{1t}, s_{2t})$, which is finite dimensional. The later sections formally discuss this idea in a more general setup.

[8] A probability measure that assigns probability one to a single point.

2.3 Model

Following the discrete choice literature, I assume consumer $i$ can choose one of $J$ products in market $t \in \{1, 2, \ldots, T\}$. Let the utility of consumer $i$ from consuming product $j \in \{1, 2, \ldots, J\}$ be $u_{ijt} = u(x_{jt}, \xi_{jt}, \varepsilon_{ijt}; \beta)$ with known parametric form [9], where $x_{jt} \in \mathbb{R}^{d_x}$ and $\xi_{jt} \in \mathbb{R}$ are observed and unobserved product and market characteristics, $\varepsilon_{ijt} \in \mathbb{R}$ is the unobserved consumer idiosyncratic shock, and $\beta \in \mathbb{R}^{d_x}$ is a finite dimensional parameter vector. The unobserved consumer idiosyncratic shock $\varepsilon_{ijt}$ is assumed to follow a distribution $F(\cdot\,; \eta)$, which is known up to a finite dimensional parameter vector $\eta \in \mathbb{R}^{d_\eta}$.

[9] The most common choice is the linear form.

The