What Do Consumers Consider Before They Choose? Evidence from Asymmetric Demand Responses


Jason Abaluck and Abi Adams

March 31, 2017

Abstract

Discrete choice models which relax the assumption that consumers consider all available options are known as consideration set models. The applied literature on consideration sets relies for identification either on auxiliary data on what options were considered or on instruments assumed to impact consideration probabilities or utility but not both. We show that a broad class of consideration set models, including all we are aware of in the applied literature, is identified without these assumptions from asymmetries in how choice probabilities respond to the characteristics of rival goods (the discrete choice analogue of Slutsky symmetry). Our identification proof constructively recovers how consideration probabilities vary with observables from these asymmetries. We show in hotel choice data from Expedia that the model determines that the randomly assigned ordering of hotels in search impacts attention and not utility, and that bounds implied by the model predict the benefits of informative advertising. We replicate an earlier finding that health plan choices are more sensitive to characteristics of the plan chosen last year than to those of rival goods, and show that this implies that observed inertia is largely driven by inattention.

1 Introduction

Discrete choice models typically assume that consumers are aware of all available options. This prevents researchers from asking many questions of interest. For example, would some goods be demanded more if they were noticed despite current low sales? Would inertial consumers wake up in response to a premium hike but remain unresponsive if rivals lower premiums? Measures of cognitive ability impact choices, but do they impact preferences or impact which options are considered?
Normatively, if you choose an apple over an orange because you are not aware of the orange, you may still prefer oranges. Whether people eat the same foods and go to the same stores year after year because they like those options or because they do not know what else exists has first-order consequences for welfare.

The literature on consideration sets relaxes the assumption that individuals consider all goods (hereafter the full information assumption), allowing for the possibility that consumers consider only a subset of possible options when making a choice. Empirical models in this literature typically rely either on additional data or on exclusion restrictions to separate the impact of observables on utility and consideration set probabilities. For example, Conlon and Mortimer (2013) assume that consideration sets are known in some periods, Honka (2014) and Honka, Hortaçsu, and Vitorino (2015) use auxiliary information detailing which options consumers are aware of, and Goeree (2008) and Gaynor, Propper, and Seiler (2016) assume that some observables impact either attention or utility but not both. We show that utility and consideration set probabilities can be separately recovered in a broad class of discrete choice models (including all those we are aware of that have been estimated in the literature to date) without excluding variables from attention or utility, even if consideration sets are not observed. Identification comes from asymmetries in the responsiveness of choice probabilities to characteristics of rival goods. In addition to constructively recovering consideration set probabilities from these asymmetries, we show in several applications that these models imply different substitution patterns and normative conclusions from the full information models that they nest, and that they can be used to predict which products will experience large increases in sales when consumers are made more aware of them. Our results suggest that consideration set models could be used in a wide variety of settings where full-information models are currently estimated. In cross-sectional data, one can identify whether goods are demanded because they are high-utility or simply more likely to be considered, and in panel data one can evaluate whether inertia reflects utility or inattention.

Thanks to Leila Bengali and Mauricio Caceres for excellent research assistance and to Dan Ackerberg, Joe Altonji, Steve Berry, Judy Chevalier, Jeremy Fox, Jonathan Gruber, Phil Haile, Erzo Luttmer, Costas Meghir, Olivia Mitchell, Barry Nalebuff, Fiona Scott Morton, Joe Shapiro, K. Sudhir and participants in the Heterogeneity in Supply and Demand Conference, the Roybal Annual Meeting, and the Yale IO, labor economics and econometrics workshops for helpful discussions. Also special thanks to Arthur Lewbel for retrieving Jason's keys when he left them in the seminar computer, and thanks to Raluca Ursu for help in replicating her Expedia analysis.
Consideration sets may be relevant either because consumers pay attention only to a subset of products, or because some products which the econometrician sees as in the choice set are simply not available to consumers (they are not on the shelves). In this paper, we use "attentive" as synonymous with "a good is in the consumer's consideration set" in order to subsume both the case when consumers are not paying attention and the case where some product is unavailable. [1]

[1] Honka, Hortaçsu, and Vitorino (2015) further distinguish between awareness and consideration. Our model can be thought of as a reduced form version of their model in which awareness and consideration jointly determine the set of goods from which consumers choose; we discuss this interpretation and the assumptions required in Section 3.

To provide some intuition for our identification result, consider the following stylized example: suppose you must choose between an apple and an orange and that the apple is the default (e.g. because you chose it last time); assume you will be inattentive and choose the apple unless it becomes so unsuitable that you have to look elsewhere. There is no outside option and there are no income effects. Full information models imply that you should be equally responsive whether the price of the apple changes by $1 in either direction or the price of the orange changes by $1 in the opposite direction. With no income effects, only relative prices matter. If we see that you respond more when the price of the apple changes than when the price of the orange changes in the opposite direction, this suggests that the price of the apple must be perturbing attention, creating the asymmetry. Since the asymmetries give the marginal impact of characteristics on attention, one can (with sufficient price variation) recover the attentive probability by integrating over these asymmetries without making any assumptions about utility.

The assumptions made in this example are mostly unnecessary for identification. Our model can be applied in panel data or in cross-sectional data with no clear default; we can allow for J goods, income effects, and outside options; and we can permit more general assumptions about how attention probabilities are correlated across goods and how they depend on the characteristics of own and rival goods. One set of broadly applicable sufficient conditions for identification of attention probabilities is quasilinear utility plus restrictions on how the characteristics of good j impact attention probabilities for rival goods.

Several papers theoretically investigate the revealed preference identification of consideration sets (Manzini and Mariotti (2014), Masatlioglu, Nakajima, and Ozbay (2016)). However, the identification results in these papers require that each individual is observed choosing from a large number of different choice sets, something which is very rarely observed in the field. The models we consider can in principle be identified even if only a single choice set is observed, given sufficient price variation. The relationship between inattention and asymmetric price responses has been previously noted several times (e.g. Chen, Levy, Ray, and Bergen (2008), Cabral and Fishman (2012) and Gabaix (2011)). Our contribution is to link this insight to the applied econometrics literature and show that asymmetric demand responses can be used to constructively estimate consideration set probabilities in discrete choice models. Perhaps the closest paper to ours is Crawford, Griffith, and Iaria (2016), which presents identification results in discrete choice models with consideration sets in panel data.
Their main result is that, given logit errors, preferences can be recovered provided choice sets do not change over time. Their model makes no assumptions about how consideration set probabilities vary with observable characteristics, but it requires panel data, the assumption that consideration sets do not change is strong, and the model does not recover consideration set probabilities.

A consideration set model is a model in which the observed probability that individual $i$ chooses good $j$ is given by

$$ s_{ij} = \sum_{c \in C} \pi_c \, s^*_{ij}(c) $$

where $C$ is the set of all possible subsets of the observed choices, $\pi_c$ is the probability that subset $c$ is considered, and $s^*_{ij}(c)$ is the likelihood that individual $i$ chooses option $j$ from choice set $c$. Our analysis starts with the (trivial) negative result that, absent instruments, identification is impossible in a fully general consideration set model in which all consideration sets are allowed to depend arbitrarily on the characteristics of all goods. Most models in the literature impose further theoretical assumptions: in the example above, consumers wake up only in response to changes in the default good (as in Ho, Hogan, and Scott-Morton (2015)), and in other models attention probabilities depend only on the characteristics of the good in question (Goeree (2008) and Gaynor, Propper, and Seiler (2016)). In these models, identification is possible, and we show that consideration set probabilities can be constructively recovered from asymmetries in demand given quasilinear utility (or functional form restrictions if one wants to permit income effects).

We use data on hotel choices from Expedia and health plan choices from Medicare Part D to illustrate the range of applications of our model. In the hotel data, the order in which hotels were displayed in search was randomized, and we show that we can correctly recover that the randomized ordering impacts attention and not utility.
Additionally, when the model is estimated on hotels shown in the 3rd-10th search positions, we can predict out of sample which hotels will experience the largest increase in demand when they are put in the 1st and 2nd search positions, conditional on their current demand. In other words, we can decompose whether the current demand is due to high utility (meaning a hotel would be more popular if more people noticed it) or high attention (meaning that additional advertising is unlikely to be effective).

Using data from Medicare Part D, we replicate the finding in Ho, Hogan, and Scott-Morton (2015) that switching decisions are far more sensitive to characteristics of the default plan than to characteristics of rival plans, and we show that this implies that the observed degree of inertia is largely due to inattention. Nonetheless, the remaining adjustment costs are sufficiently large that they offset the cost savings from assigning beneficiaries to the lowest cost plans. Finally, we describe an in-progress lab experiment which validates that our model can be used to recover the (known) probability that consumers are aware of a given product given their choices from a superset of products that they may or may not be aware of.

In Section 2, we work through a simple example to illustrate our identification argument. Section 3 lays out the general argument and proof. Section 4 considers the intuition for identification and estimation in a few special cases of interest: we show how, in those cases, our consideration set models can be written as random utility models in which utility depends directly on the characteristics of rival goods. Section 5 presents estimation results from the Expedia data, Section 6 presents estimation results from the Medicare Part D data, Section 7 describes our in-progress lab experiment, and Section 8 concludes.

2 Motivating Example

To illustrate our identification argument, we first outline a stylised example that will highlight the main features of our approach. Note that nearly all of the assumptions we make here are for expository purposes and will be relaxed in our more general model in Section 3.
Consider a consumer $i$ selecting from two possible products, $j \in \{0, 1\}$, for example insurance plans. Each plan is defined as a bundle of characteristics, $x \in \mathbb{R}^K$ (annual premiums, the size of the deductible, drug coverage, and so on), and utility is an additively separable function of these characteristics. One product, plan 0, is a default good that is always considered. The consumer may or may not pay attention to the other product depending on how good (or bad) the characteristics of the default good are. If someone only pays attention to the default good, then they simply pick that good. However, if the consumer also pays attention to the non-default good, then they pick the good that gives them the highest utility of those considered. [2] Let $\phi_i(x_{i0})$ give the probability that a consumer pays attention to both products. The probability that consumer $i$ picks plan $j$, $s_{ij}$, in this model can then be expressed as:

$$
\begin{aligned}
s_{i0}(x_{i0}, x_{i1}) &= (1 - \phi_i(x_{i0})) + \phi_i(x_{i0}) \, s^*_{i0}(x_{i0}, x_{i1}) \\
s_{i1}(x_{i0}, x_{i1}) &= \phi_i(x_{i0}) \, s^*_{i1}(x_{i0}, x_{i1})
\end{aligned}
\tag{1}
$$

where $s^*_{ij}$ gives choice probabilities conditional on paying attention. We will show that in this model one can separately identify $\phi_i$, $s^*_{i0}$ and $s^*_{i1}$, and that conventional discrete choice models which assume that $s_{ij}$ arises from choice from the full set of products will typically misspecify the relationship between $s_{ij}$ and $x$. Even if we could non-parametrically estimate $s_{ij}$ as a flexible function of $x$, we may still be interested in recovering $\phi_i$ and $s^*_{ij}$ for normative purposes or to consider counterfactuals where we perturb consideration.

[2] This is a special case of the model in Ho, Hogan, and Scott-Morton (2015), which we consider more generally below.

The key to our identification argument is that, with full attention, the derivative of the choice probability of product $j$ with respect to a characteristic of product $j'$ will equal the derivative of the choice probability of $j'$ with respect to the same characteristic of product $j$ in a large class of discrete choice models (quasilinearity is a sufficient condition). This result is analogous to Slutsky symmetry in the continuous good case. In the two-good case, symmetry given full information follows directly from the fact that with no income effects and no outside option, only relative prices matter. Since only relative prices matter,

$$ \frac{\partial s^*_{ij}}{\partial x_{ij'k}} = -\frac{\partial s^*_{ij}}{\partial x_{ijk}}. $$

Since market shares sum to 1, we further have

$$ \frac{\partial s^*_{ij}}{\partial x_{ijk}} = -\frac{\partial s^*_{ij'}}{\partial x_{ijk}}, $$

and putting these together implies

$$ \frac{\partial s^*_{ij}}{\partial x_{ij'k}} = \frac{\partial s^*_{ij'}}{\partial x_{ijk}}, $$

the desired symmetry property. [3]

[3] In the more general proof where we relax separability and allow utility to be a general nonlinear function of observables, these derivatives must be taken conditional on observables being equal across the two goods.

This symmetry result breaks down once we allow for individuals to fail to consider all available products. Differentiating Equation 1 and using the fact that the conditional market shares satisfy symmetry, we obtain:

$$
\begin{aligned}
\frac{\partial s_{i1}}{\partial x_{i0k}} - \frac{\partial s_{i0}}{\partial x_{i1k}} &= \frac{\partial \phi_i(x_{i0})}{\partial x_{i0k}} \, s^*_{i1} \\
&= \frac{\partial \phi_i(x_{i0})}{\partial x_{i0k}} \, \frac{s_{i1}}{\phi_i(x_{i0})} \\
&= \frac{\partial \log(\phi_i)}{\partial x_{i0k}} \, s_{i1}
\end{aligned}
\tag{2}
$$

where the second line follows from the fact that $s_{i1} = \phi_i(x_{i0}) \, s^*_{i1}$.

First, consider the probability of choosing a good conditional on considering both plans (the $s^*$). An increase in the price of the default plan will increase the share choosing the non-default plan, and an increase in the price of the non-default plan will increase the share choosing the default plan. This effect will be symmetric by the argument above. Now consider the unconditional market shares given in the above equation. An increase in the price of the default plan will also increase the probability of paying attention to the non-default plan if individuals only consider the non-default plan when the default is particularly unattractive. Thus, the responsiveness of the non-default plan to a change in the price of the default plan will in total be larger than the responsiveness of the default plan to a change in the price of the non-default plan. Your choice is sensitive to the price of the plan you chose last time, but insensitive to the price of rival plans to the degree that you are inattentive to those plans.

Rearranging the above equation and integrating, the consideration probability function, $\phi_i$, can be fully recovered by integrating over the support of $x_{i0k}$, with the constant of integration given by the coincidence of symmetric cross derivatives at $\phi_i = 1$:

$$ \phi_i(x_{i0}) = \exp\left( \int \frac{1}{s_{i1}} \left[ \frac{\partial s_{i1}}{\partial x_{i0k}} - \frac{\partial s_{i0}}{\partial x_{i1k}} \right] dx_{i0k} \right) \tag{3} $$
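As a numerical illustration of the recovery formula in Equation 3, the following Python sketch simulates the two-good example and backs out the attention probability from observed shares alone. It is a minimal sketch under assumptions that are not in the paper: logit errors with a single price characteristic, a price coefficient `BETA`, and a hypothetical attention function `phi_true` that equals one at a default price `X_BAR`. All function and parameter names are illustrative.

```python
import numpy as np

BETA = 1.0    # assumed price coefficient in utility v_j = -BETA * x_j
X_BAR = 4.0   # assumed default price at which attention is complete (phi = 1)
A = 0.15      # assumed curvature of the hypothetical attention function


def phi_true(x0):
    """Hypothetical attention probability, increasing in the default price up to X_BAR."""
    return np.exp(-A * (X_BAR - x0) ** 2)


def observed_shares(x0, x1):
    """Observed shares (s_0, s_1) from Equation 1 with logit conditional shares."""
    s1_star = 1.0 / (1.0 + np.exp(-BETA * (x0 - x1)))  # P(choose 1 | attentive)
    s1 = phi_true(x0) * s1_star
    return 1.0 - s1, s1


def log_phi_slope(x0, x1, h=1e-5):
    """(1/s_1) * [d s_1/d x_0 - d s_0/d x_1], by central differences on observed shares."""
    ds1_dx0 = (observed_shares(x0 + h, x1)[1] - observed_shares(x0 - h, x1)[1]) / (2 * h)
    ds0_dx1 = (observed_shares(x0, x1 + h)[0] - observed_shares(x0, x1 - h)[0]) / (2 * h)
    return (ds1_dx0 - ds0_dx1) / observed_shares(x0, x1)[1]


def recover_phi(x0, x1, n=2001):
    """Integrate the asymmetry from x0 up to X_BAR, where log(phi) is pinned at zero."""
    grid = np.linspace(x0, X_BAR, n)
    slopes = np.array([log_phi_slope(t, x1) for t in grid])
    integral = np.sum((slopes[1:] + slopes[:-1]) * np.diff(grid)) / 2.0  # trapezoid rule
    return np.exp(-integral)


if __name__ == "__main__":
    for x0 in (1.0, 2.0, 3.0):
        print(x0, phi_true(x0), recover_phi(x0, x1=2.0))
```

The recovery step never calls `phi_true` directly or uses the utility parameters; it only differentiates and integrates observed shares, which is the sense in which the asymmetry identifies attention. In an application the finite differences would be replaced by estimated derivatives of choice probabilities.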

Our broad approach, then, is to argue that economic theory imposes restrictions on choice behavior given the assumption that individuals consider every good. Observed deviations from the predictions of the full consideration model are informative about the structure of the underlying consideration probabilities. Of course, there are other reasons one might detect asymmetries in any given case, such as misspecification of the parametric model or some other behavioral phenomenon. In the two-good case considered above, attention probabilities are exactly identified. However, more generally, the J-good model predicts a particular pattern of asymmetries across the different goods; in Section 6.3, we provide graphical and statistical tests to check whether the asymmetries observed in the data follow the distinctive pattern implied by inattention.

3 Model

In this section, we describe our analytic framework and identification results formally, drawing the connections between our work and the prior literature. We begin by outlining the assumptions which underpin our approach at the most general level before adapting the framework to apply to commonly estimated consideration set models in the applied literature.

We consider an individual $i$ who makes a discrete choice among $J + 1$ products, $\{0, 1, \ldots, J\}$, with $J \geq 1$. Each product $j$ is characterised as a bundle of $K \geq 1$ characteristics, $x_{ij}$, with support $\chi \subseteq \mathbb{R}^K$. Let $x_i = [x_{i0}, \ldots, x_{iJ}]$. We allow for individuals to consider an (unobserved) subset of available goods when making their choice. The set of goods that a consumer considers is called the consideration set. At this point, we place no restrictions on consideration set formation.
Let $\mathcal{P}(\{0, \ldots, J\})$ represent the power set of goods, with any given element of $\mathcal{P}(\{0, \ldots, J\})$ indexed by $c$, and let the set of consideration sets containing good $j$ be given as:

$$ \mathcal{P}(j) = \{ c : c \in \mathcal{P}(\{0, \ldots, J\}) \ \& \ j \in c \} \tag{4} $$

We will develop identification results for a set of choice models that imply choice probabilities of the following form:

$$ s_{ij}(x_i) = \sum_{c \in \mathcal{P}(j)} \pi_{ic}(x_i) \, s^*_{ij}(x_i \mid c) \tag{5} $$

where $s_{ij}$ is the observed probability of $i$ selecting $j$ (the market share of good $j$), $\pi_{ic}$ gives the probability that the set of goods $c$ is considered, and $s^*_{ij}(x_i \mid c)$ gives the probability that $i$ selects good $j$ from the set $c$. For the most part, we suppress the dependence of these quantities on $x$. As $\pi_{ic}$ and $s^*_{ij}(x_i \mid c)$ represent proper probabilities, we have:

$$ \sum_{c \in \mathcal{P}(\{0, \ldots, J\})} \pi_{ic} = 1 \tag{6} $$

$$ \sum_{j \in c} s^*_{ij}(c) = 1 \tag{7} $$

The immediate objects of interest are the consideration set probabilities, $\pi_{ic}$, and the unobserved probabilities, $s^*_{ij}(x_i \mid c)$. If one places no restrictions on how the $\pi_{ic}$ are allowed to vary with the characteristics of all of the included goods, then identification is not possible. Consider for example the case with two choice sets, $\{0, A\}$ and $\{0, A, B\}$, where:

$$ s_{ij}(x_i) = \pi_{i0A} \, s^*_{ij}(0, A) + \pi_{i0AB} \, s^*_{ij}(0, A, B) \tag{8} $$

If we substitute for $\pi_{i0A}$ and $\pi_{i0AB}$ with, respectively, $\hat{\pi}_{i0A} = \pi_{i0AB} \, s^*_{ij}(0, A, B) \, (s^*_{ij}(0, A))^{-1}$ and $\hat{\pi}_{i0AB} = \pi_{i0A} \, s^*_{ij}(0, A) \, (s^*_{ij}(0, A, B))^{-1}$, then we will obtain a model with exactly the same observed market shares, exactly the same latent utilities, but different consideration set probabilities. In order to uniquely pin down the consideration set probabilities, one will need to restrict how these probabilities are allowed to vary with the underlying characteristics of each available good. Such assumptions are typically already made in the applied literature, as we review below. The applied literature also relies for identification on further exclusion restrictions, for example that some variables impact consideration probabilities and not utility. Our proof will show that these further restrictions are in fact unnecessary and that one can instead exploit restrictions imposed on substitution patterns by utility maximization.

Individual Utility

Throughout, we will assume that individuals make choices from any given consideration set in order to maximise their utility. We take a random utility approach, decomposing the overall utility of good $j$, $u_{ij}$, into a deterministic component that depends on the characteristics of good $j$ and a random error term:

$$ u_{ij} = v_{ij}(x_{ij}) + \epsilon_{ij} \tag{9} $$

where we make the following assumptions:

Assumption 1. Quasi-linearity: There exists a characteristic $x^1_{ij}$ that enters the indirect utility function linearly and with homogeneous coefficients across goods:

$$ u_{ij} = v_{ij}(x_{ij}) + \epsilon_{ij} = \beta x^1_{ij} + w_j(x^2_{ij}) + \epsilon_{ij} \tag{10} $$

Let $v_{ij}(x_{ij}) \in V \subset \mathbb{R}$ for all $x_{ij} \in \chi$, where $V$ is compact. [4] [Note: relaxing all results to allow for random coefficient on $x^1_{ij}$.]

Assumption 2. Exogenous Characteristics: $\epsilon_{ij} \perp x_{ij}$ for all $i$.

We focus on the question of identification without the additional complications arising from endogeneity in this paper. This assumption will be relaxed in future work.

Assumption 3. One Continuous Characteristic: $x^1_{ij}$ is continuously distributed and the distribution of $x^1_{ij} \mid x^2_{ij}$ has a positive density on $\chi$.

[4] Quasi-linearity can be relaxed with additional parametric restrictions on the utility function. We will discuss this later in the Section.

Assumption 4. $F(\epsilon_{i0}, \ldots, \epsilon_{iJ})$ is absolutely continuous with respect to the Lebesgue measure and gives rise to a density function that is everywhere positive on $\mathbb{R}$.

With $[\,\cdot\,]$ denoting exclusion, the probability that individual $i$ chooses option $j$ having considered the set of options $c$, with $j \in c$, is given by:

$$
\begin{aligned}
s^*_{ij}(c) &= \Pr\left( v_{ij} + \epsilon_{ij} = \max_{j' \in c} \, v_{ij'} + \epsilon_{ij'} \right) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{v_{ij} + e - v_{il}} \cdots \left[ \int_{-\infty}^{v_{ij} + e - v_{ij}} \right] \cdots \int_{-\infty}^{v_{ij} + e - v_{il'}} f_c(z_l, \ldots, e, \ldots, z_{l'}) \, dz_l \cdots [dz_j] \cdots dz_{l'} \, de
\end{aligned}
\tag{11}
$$

Maximisation given the structure that we have placed on the utility function implies some specific restrictions on choice probabilities given a particular consideration set, in addition to those imposed by probability theory. Note that all proofs can be found in Appendix A.

Corollary 1. Symmetry of Cross Derivatives with respect to the quasi-linear characteristic:

$$ \frac{\partial s^*_{ij}}{\partial x^1_{ij'}} = \frac{\partial s^*_{ij'}}{\partial x^1_{ij}} \tag{12} $$

Corollary 2. Absence of Nominal Illusion: level shifts in the quasi-linear characteristic do not alter choice probabilities:

$$ s^*_{ij}(x^1_i, x^2_i) = s^*_{ij}(x^1_i + \delta, x^2_i) \tag{13} $$

Consideration

The identification results in this paper pertain to models in which consideration set probabilities vary with product characteristics. We are not able to say anything about the identification of consideration set probabilities and market shares conditional on paying attention in situations where consideration set probabilities are independent of characteristics.

Assumption 5. $\pi_{ic}$ is continuously differentiable with:

$$ \frac{\partial \pi_{ic}}{\partial x^1_{ij}} \neq 0 \tag{14} $$

for $\pi_{ic} < 1$.

Given Assumption 5, observed choice shares satisfy neither Corollary 1 nor Corollary 2, because consideration set probabilities vary with product characteristics. This results in a violation of symmetry and a violation of the absence of nominal illusion.
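To make Corollaries 1 and 2 and their failure under limited consideration concrete, the following Python sketch compares full-consideration logit shares with observed shares built as the mixture over consideration sets in Equation 5. It is a minimal sketch under assumptions that are not in the paper: logit errors, three goods, a unit price coefficient `BETA`, hypothetical non-price utilities `W`, and a hypothetical `attention` function in which each non-default good is considered independently with a probability that falls in its own price while good 0 is always considered.

```python
import itertools
import numpy as np

BETA = 1.0                        # assumed coefficient on the quasi-linear characteristic
W = np.array([0.0, 0.5, -0.3])    # hypothetical non-price utility components w_j


def logit_shares(x, members):
    """Conditional shares s*_j(c) for the goods in `members`; zero for goods outside c."""
    v = -BETA * x + W
    idx = list(members)
    expv = np.zeros_like(v)
    expv[idx] = np.exp(v[idx])
    return expv / expv.sum()


def full_shares(x):
    """Shares when every good is considered."""
    return logit_shares(x, range(len(x)))


def attention(x):
    """Hypothetical consideration probabilities; good 0 is always considered."""
    phi = 1.0 / (1.0 + np.exp(x - 2.0))
    phi[0] = 1.0
    return phi


def observed_shares(x):
    """Mixture of conditional shares over all consideration sets containing good 0."""
    phi, n = attention(x), len(x)
    shares = np.zeros(n)
    for r in range(n):
        for extra in itertools.combinations(range(1, n), r):
            c = (0,) + extra
            prob = np.prod([phi[j] if j in c else 1.0 - phi[j] for j in range(n)])
            shares += prob * logit_shares(x, c)
    return shares


def cross_derivative_gap(share_fn, x, j, k, h=1e-6):
    """d s_j / d x_k minus d s_k / d x_j, by central differences."""
    e = np.eye(len(x))
    ds_j_dxk = (share_fn(x + h * e[k])[j] - share_fn(x - h * e[k])[j]) / (2 * h)
    ds_k_dxj = (share_fn(x + h * e[j])[k] - share_fn(x - h * e[j])[k]) / (2 * h)
    return ds_j_dxk - ds_k_dxj


if __name__ == "__main__":
    x = np.array([1.0, 1.5, 2.5])
    print("symmetry gap, full consideration:", cross_derivative_gap(full_shares, x, 0, 1))
    print("symmetry gap, observed shares:   ", cross_derivative_gap(observed_shares, x, 0, 1))
    print("level-shift change, full:    ", np.abs(full_shares(x + 0.5) - full_shares(x)).max())
    print("level-shift change, observed:", np.abs(observed_shares(x + 0.5) - observed_shares(x)).max())
```

Under these assumptions the full-consideration gap and level-shift change are zero up to numerical error, while both are clearly nonzero for the observed shares, because prices move the consideration probabilities as well as utility.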

Lemma 1. Given Assumptions 1-5, if

$$ \frac{\partial s_{ij}}{\partial x^1_{ij'}} \neq \frac{\partial s_{ij'}}{\partial x^1_{ij}} \tag{15} $$

$$ s_{ij}(x^1_i, x^2_i) \neq s_{ij}(x^1_i + \delta, x^2_i) \tag{16} $$

for $\delta \neq 0$, then $\pi_i(0, \ldots, J) < 1$, where $\pi_i(0, \ldots, J)$ is the probability that an individual considers all goods $\{0, \ldots, J\}$.

Proof of Lemma 1. With a slight abuse of notation, let the set of consideration sets containing goods $j$ and $j'$ be given as:

$$ \mathcal{P}(j, j') = \{ c : c \in \mathcal{P}(\{0, \ldots, J\}) \ \& \ j \in c \ \& \ j' \in c \} \tag{17} $$

Given symmetry, the differences in cross derivatives depend on how market shares change with the variation in consideration set probabilities generated by variation in characteristics:

$$ \frac{\partial s_{ij}}{\partial x^1_{ij'}} - \frac{\partial s_{ij'}}{\partial x^1_{ij}} = \sum_{c \in \mathcal{P}(j)} \frac{\partial \pi_{ic}}{\partial x^1_{ij'}} s^*_{ij}(c) - \sum_{c' \in \mathcal{P}(j')} \frac{\partial \pi_{ic'}}{\partial x^1_{ij}} s^*_{ij'}(c') + \sum_{c'' \in \mathcal{P}(j, j')} \pi_{ic''} \left( \frac{\partial s^*_{ij}(c'')}{\partial x^1_{ij'}} - \frac{\partial s^*_{ij'}(c'')}{\partial x^1_{ij}} \right) \tag{18} $$

$$ = \sum_{c \in \mathcal{P}(j)} \frac{\partial \pi_{ic}}{\partial x^1_{ij'}} s^*_{ij}(c) - \sum_{c' \in \mathcal{P}(j')} \frac{\partial \pi_{ic'}}{\partial x^1_{ij}} s^*_{ij'}(c') \tag{19} $$

$$ \neq 0 \quad \text{for } \pi_i(0, \ldots, J) < 1 \tag{20} $$

Similarly, while level shifts in the quasi-linear characteristic do not cause choice probabilities conditional on a given consideration set to change, they do alter consideration set probabilities. Thus, absence of nominal illusion is violated. Without loss of generality, let $\delta > 0$. Then,

$$ s_{ij}(x^1_i, x^2_i) = \sum_{c \in \mathcal{P}(j)} \pi_{ic}(x^1_i, x^2_i) \Pr\left( v_{ij} + \epsilon_{ij} = \max_{j' \in c} \, v_{ij'} + \epsilon_{ij'} \right) \tag{21} $$

$$ \neq \sum_{c \in \mathcal{P}(j)} \pi_{ic}(x^1_i + \delta, x^2_i) \Pr\left( v_{ij} + \epsilon_{ij} = \max_{j' \in c} \, v_{ij'} + \epsilon_{ij'} \right) \tag{22} $$

$$ = s_{ij}(x^1_i + \delta, x^2_i) \quad \text{for } \pi_i(0, \ldots, J) < 1 \tag{23} $$

While the assumptions made thus far enable us to make statements about the consistency of observed choice behaviour with a full consideration model, they are not generally sufficient for point identification of all the structures of interest. Observing a violation of symmetry and of the absence of nominal illusion enables one to conclude that there is a positive probability that an individual does not consider all potential goods when making their choice, but still does not leave one in a position to pin down exactly what these probabilities are.

We here consider two popular models in the applied literature, which we shall call the pure default model and the independent probability model, and show how they are identified without the need for additional exclusion restrictions given our symmetry and nominal illusion results. Roughly, we can think of the first model, from Ho, Hogan, and Scott-Morton (2015), as more appropriate in settings where inertia or choice of defaults explains a large fraction of choices, while the second model, from Goeree (2008), might be more appropriate in cross-sectional data without a clear default. We note below that one or the other of these cases either exactly or approximately subsumes every consideration set model we are aware of in the applied literature.

3.1 Pure Default Model

A number of papers in the literature assume the existence of a default good amongst inside goods, $d \in \{0, \ldots, J\}$, and allow the probability of considering all other options to vary as a function of the characteristics of that default good. This model can be used to identify whether inertia observed in panel data arises because consumers do not consider other options (and thus might be better off if they switched) or because consumers are actively choosing not to switch due to adjustment costs or persistent unobserved heterogeneity. This specification has been popular in the literature on health insurance choices. For example, in Ho, Hogan, and Scott-Morton (2015) consumers choose in two stages. First, they decide whether to be attentive or not as a function of unobservables and the characteristics of the default good; second, if they are attentive, they make an active choice among all goods. To adapt our general framework above to this type of model, we modify our assumptions as follows:

Assumption 5a. $\phi_i(x_{id})$ is a continuously differentiable function, either strictly increasing or strictly decreasing in $x^1_{id}$ for $\phi_i < 1$.

To adapt our earlier notation to this setting, let:

$$ \pi_i(0, \ldots, J) = \phi_i(x_{id}) \tag{24} $$

$$ \pi_i(d) = 1 - \phi_i(x_{id}) \tag{25} $$

$$ \pi_i = 0 \quad \text{for any other consideration set.} \tag{26} $$

The probability of selecting option $j$ then becomes:

$$ s_{ij} = (1 - \phi_i) \, \mathbf{1}(j = d) + \phi_i \, s^*_{ij} \tag{27} $$

where $s^*_{ij}$ denotes the probability of choosing $j$ conditional on considering all available goods. We now show how exploiting symmetry of shares conditional on paying attention enables us to identify $\phi_i(x_{id})$. First note that:

$$
\begin{aligned}
\frac{\partial s_{id}}{\partial x^1_{ij}} &= \phi_i(x_{id}) \frac{\partial s^*_{id}}{\partial x^1_{ij}} \\
\frac{\partial s_{ij}}{\partial x^1_{id}} &= \phi_i(x_{id}) \frac{\partial s^*_{ij}}{\partial x^1_{id}} + \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} \, s^*_{ij}
\end{aligned}
\tag{28}
$$

for $j \neq d$. Given quasi-linearity (Assumption 1), the following holds at all points in the support of characteristics:

$$ \frac{\partial s_{ij}}{\partial x^1_{id}} - \frac{\partial s_{id}}{\partial x^1_{ij}} = \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} \, s^*_{ij} = \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} \, \frac{s_{ij}}{\phi_i(x_{id})} \tag{29} $$

Thus, the derivative of $\log(\phi_i)$ is directly identified from the difference in the cross derivatives:

$$ \frac{1}{\phi_i(x_{id})} \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} = \frac{\partial \log(\phi_i(x_{id}))}{\partial x^1_{id}} = \frac{1}{s_{ij}} \left[ \frac{\partial s_{ij}}{\partial x^1_{id}} - \frac{\partial s_{id}}{\partial x^1_{ij}} \right] \tag{30} $$

Given Assumption 3, one identifies $\phi_i$ up to a constant of integration $C$ by integrating over the support of $x^1_{id}$:

$$ \log(\phi_i(x_{id})) + C = \int \frac{1}{s_{ij}} \left[ \frac{\partial s_{ij}}{\partial x^1_{id}} - \frac{\partial s_{id}}{\partial x^1_{ij}} \right] dx^1_{id} \tag{31} $$

Completing the identification proof requires us to pin down the constant of integration.

Corollary 3. Given Assumptions 1-4 and 5a, cross derivative symmetry is equivalent to full consideration:

$$ \frac{1}{s_{ij}} \left[ \frac{\partial s_{ij}}{\partial x^1_{id}} - \frac{\partial s_{id}}{\partial x^1_{ij}} \right] = 0 \quad \text{iff} \quad \phi_i(x_{id}) = 1 \tag{32} $$

Proof: Necessity follows from Corollary 1. Sufficiency is proven by contradiction. Assume that symmetry is achieved at some $x_{id}$ with $\phi_i(x_{id}) < 1$. We would have,

$$ \frac{1}{\phi_i(x_{id})} \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} = \frac{1}{s_{ij}} \left[ \frac{\partial s_{ij}}{\partial x^1_{id}} - \frac{\partial s_{id}}{\partial x^1_{ij}} \right] = 0 \tag{33} $$

Given that $0 \leq \phi_i \leq 1$, symmetry requires:

$$ \frac{\partial \phi_i(x_{id})}{\partial x^1_{id}} = 0 \tag{34} $$

However, by Assumption 5a, this holds only if $\phi_i(x_{id}) = 1$, that is, if an individual considers all potential goods.

Assumption 6a. There exists an $\bar{x}_{id} \in \chi$ at which $\phi_i(\bar{x}_{id}) = 1$.
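Equation 30 holds for every non-default good $j$, so with more than two goods the same attention slope is recovered several times over. The following Python sketch checks this in a simulated pure default model. It is a minimal sketch under assumptions that are not in the paper: logit errors, four goods with good 0 as the default, a unit price coefficient `BETA`, hypothetical non-price utilities `W`, and a hypothetical attention function `phi` that equals one at `X_BAR`.

```python
import numpy as np

BETA = 1.0                              # assumed coefficient on the quasi-linear characteristic
A, X_BAR = 0.1, 5.0                     # assumed shape of the hypothetical attention function
W = np.array([0.0, 0.4, -0.2, 0.1])     # hypothetical non-price utility components
D = 0                                   # index of the default good


def phi(x_d):
    """Hypothetical probability of considering all goods, equal to one at X_BAR."""
    return np.exp(-A * (X_BAR - x_d) ** 2)


def observed_shares(x):
    """Equation 27: s_j = (1 - phi) * 1(j = d) + phi * s*_j with logit s*."""
    v = -BETA * x + W
    s_star = np.exp(v - v.max())
    s_star /= s_star.sum()
    p = phi(x[D])
    shares = p * s_star
    shares[D] += 1.0 - p
    return shares


def log_phi_slope(x, j, h=1e-6):
    """Right-hand side of Equation 30 for rival good j, by central differences."""
    e = np.eye(len(x))
    ds_j_dxd = (observed_shares(x + h * e[D])[j] - observed_shares(x - h * e[D])[j]) / (2 * h)
    ds_d_dxj = (observed_shares(x + h * e[j])[D] - observed_shares(x - h * e[j])[D]) / (2 * h)
    return (ds_j_dxd - ds_d_dxj) / observed_shares(x)[j]


if __name__ == "__main__":
    x = np.array([2.0, 1.0, 1.5, 3.0])
    print("implied slopes by rival good:", [log_phi_slope(x, j) for j in (1, 2, 3)])
    print("true d log(phi) / d x_d:     ", 2 * A * (X_BAR - x[D]))
```

Each rival good delivers the same estimate of the log-derivative of attention in this simulation, which is the kind of overidentifying restriction that can in principle separate inattention from other sources of asymmetry.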

Given Assumption 6a, the constant of integration can be identified at $\bar{x}_{id}$, the point in the support where $\phi_i(\bar{x}_{id}) = 1$:

$$ \log(\phi_i(\bar{x}_{id})) + C = 0 \implies C = 0 \tag{35} $$

This argument then reaches its conclusion in Theorem 1.

Theorem 1. (Pure Default Model) Given Assumptions 1-4, 5a, and 6a, the probability of paying attention to the full set of potential goods, $\phi_i(x_{id})$, and the market shares of goods conditional on paying attention, $s^*_{ij}$ for $j \neq d$, are identified.

3.2 Independent Consideration Probabilities

While the pure default model is appropriate in settings where consumers either pay attention or choose a default, it is less appropriate in settings where consumers may consider a subset of goods and ignore others. Another popular set of models starts with the assumption that the probability of paying attention to good $j$ is independent of the probability of paying attention to good $j'$ conditional on observables for any goods $j$, $j'$, and that there exists a default good, good 0, that is always considered. In online applications, for example, the ranking of a product in search will depend on attributes of that product. In bricks and mortar retail, the shelf a product is on or the location in the store is likewise chosen based on observable attributes of that product. With a large number of products, the impact of the characteristics of any single rival product may be second order. Further, the model developed in Goeree (2008), where each good has a probability of being considered which depends on characteristics of that good, is a member of this set of models. We will again show that the impact of characteristics on utility and attention probabilities is separately identified and that the exclusion restrictions Goeree (2008) uses in estimation are unnecessary. The Goeree (2008) model is applied directly in Gaynor, Propper, and Seiler (2016).
Other models in the literature, such as Honka (2014) and Honka, Hortaçsu, and Vitorino (2015) assume that consumers rank alternatives based on expected utility and then consider the top k such alternatives. In these models, as the number of goods becomes large, whether a good is considered is again a function only of the characteristics of that good and so the Goeree (2008) model can serve as a reduced form representation. Honka, Hortaçsu, and Vitorino (2015) further distinguishes between awareness and consideration and while the Goeree (2008) model can capture the net effect of these factors together, we do not attempt to separately identify these factors. To adapt our general framework to this setting, we thus make the following modifications to our assumptions: Assumption 5b. φ ij (x ij ) is continuously differentiable function, strictly increasing or decreasing in x 1 ij for φ j < 1, and 0 φ j (x ij ) 1. Let the probability of paying attention to some consideration set c be given by: π ic (x i ) = φ ij (x ij ) ( 1 φij (x ij ) ) (36) j c j / c 12

with \(\phi_{i0} = 1\) for all \(x_{i0}\).

Assumption 6b. There exists an \(\bar{x}_{ij} \in \chi\) at which \(\phi_{ij}(\bar{x}_{ij}) = 1\).

Note that Assumption 6b is much less restrictive than it might first appear. For example, it does not require us to observe a single market in which all goods are paid attention to with probability one. Without a parametric restriction on \(\phi_{ij}(x_{ij})\), it only requires us to observe one market for each good \(j\) in which good \(j\) is paid attention to. With parametric restrictions on \(\phi_{ij}(x_{ij})\), \(\bar{x}_{ij}\) need not be in the support of the data.

Given the structure imposed by the Independent Consideration Probabilities model, choice probabilities take the form:

\[ s_{ij} = \sum_{c \in \mathcal{P}(j)} \prod_{l \in c} \phi_{il} \prod_{l \notin c} (1 - \phi_{il}) \, s^*_{ij}(c) \tag{37} \]

where \(s^*_{ij}(c) = \Pr\left(v_{ij} + \epsilon_{ij} = \max_{j' \in c} v_{ij'} + \epsilon_{ij'}\right)\) and \(\sum_{j \in c} s^*_{ij}(c) = 1\).

This model again generates observed choice shares that satisfy neither Corollary 1 nor Corollary 2. For example, given symmetry, the differences in cross derivatives depend on how market shares change with the variation in consideration probabilities. With a slight abuse of notation, let the set of consideration sets containing good \(j\) and not containing \(j'\) be given as:

\[ \mathcal{P}(j/j') = \{c : c \in \mathcal{P}(\{0, \ldots, J\}) \;\&\; j \in c \;\&\; j' \notin c \;\&\; 0 \in c\}, \tag{38} \]

and let the set of consideration sets containing good \(j\) and \(j'\) be given as:

\[ \mathcal{P}(j, j') = \{c : c \in \mathcal{P}(\{0, \ldots, J\}) \;\&\; j \in c \;\&\; j' \in c \;\&\; 0 \in c\}. \tag{39} \]

The difference in cross derivatives is then given by:

\[ \frac{\partial s_{ij}}{\partial x^1_{ij'}} - \frac{\partial s_{ij'}}{\partial x^1_{ij}} = \frac{\partial \phi_{ij'}}{\partial x^1_{ij'}} \sum_{c \in \mathcal{P}(j/j')} \prod_{l \in c} \phi_{il} \prod_{l \notin \{c, j'\}} (1 - \phi_{il}) \left( s^*_{ij}(c \cup j') - s^*_{ij}(c) \right) \tag{40} \]

\[ \qquad - \frac{\partial \phi_{ij}}{\partial x^1_{ij}} \sum_{c' \in \mathcal{P}(j'/j)} \prod_{l \in c'} \phi_{il} \prod_{l \notin \{c', j\}} (1 - \phi_{il}) \left( s^*_{ij'}(c' \cup j) - s^*_{ij'}(c') \right) \neq 0 \quad \text{for } \phi_{ij} \phi_{ij'} < 1 \tag{41} \]

Note: Fully nonparametric proof to come also using power of violations of absence of nominal

illusion.

We begin with a semiparametric identification proof, which proves that models of this type that are typically found in the applied literature are identified without the need for exclusion restrictions. We here make an assumption first on the coefficient on our special regressor, \(x^1_{ij}\), and independence assumptions on the utility errors. These will be relaxed in later drafts.

Assumption 1b. There exists a characteristic \(x^1_{ij}\) that enters the indirect utility function linearly and with homogeneous coefficients across goods:

\[ u_{ij} = v_{ij}(x_{ij}) + \epsilon_{ij} = x^1_{ij} + w_{ij}(x^2_{ij}) + \epsilon_{ij} \tag{42} \]

Let \(w_{ij}(x^2_{ij}) \in W \subset \mathbb{R}\) for all \(x^2_{ij} \in \chi\), where \(W\) is compact. Normalise \(v_{i0} = 0\) given that only differences in utility matter.

Assumption 7. (Semiparametric Restrictions) Let \(\epsilon_{ij}\) be distributed independently across goods with a known distribution, with mean normalised to zero and variance normalised to one. Then, we have:

\[ s^*_{ij}(c) = \int \prod_{j' \in c/j} F_{j'}\left(v_{ij} + e - v_{ij'}\right) de \tag{43} \]

\[ \qquad\;\; = \int \prod_{j' \in c/j} F_{j'}\left(x^1_{ij} + w_{ij} + e - x^1_{ij'} - w_{ij'}\right) de \tag{44} \]

where \(F_{j'}\) are known cumulative distribution functions. Further,

\[ 0 < \frac{\partial s^*_{ij}(c)}{\partial w_{ij}} < 1 \tag{45} \]

This condition is met for many common functional form assumptions on the error term, e.g. logit errors.

Assumption 8. (Boundary Conditions) On the support of characteristics, the default good cannot fully dominate or be fully dominated by any non-default good.

\[ \int F_j\left(e - x^1_{ij} - w_{ij}\right) de = \delta \tag{46} \]

\[ \int F_j\left(e - x^1_{ij} - w_{ij}\right) de = \delta \tag{47} \]

where \(\delta + \delta < 1\).

Theorem 2. Under Assumptions 1-4, 5b, 6b and 7-8, \(\phi_{ij}(x_{ij})\) is identified in \([0, 1]\) and \(w_{ij}(x^2_{ij})\) is

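identified in \(W\) for \(j = 1, \ldots, J\).

Our identification proof shows that we can form a system of equations from the market shares of non-default goods and expressions for \(\phi_{ij}\), which corresponds to a contraction mapping in a compact metric space. Thus there exists a unique fixed point to the system, proving that \(\phi_{ij}(x_{ij})\) and \(w_{ij}\) are identified.

Taking the differences of cross derivatives between default and non-default goods with respect to the quasi-linear characteristic gives:

\[ \frac{\partial s_{i0}}{\partial x^1_{ij}} - \frac{\partial s_{ij}}{\partial x^1_{i0}} = \frac{\partial \phi_{ij}}{\partial x^1_{ij}} \sum_{c \in \mathcal{P}(0/j)} \prod_{l \in c} \phi_{il} \prod_{l \notin \{c, j\}} (1 - \phi_{il}) \left( s^*_{i0}(c \cup j) - s^*_{i0}(c) \right) \tag{48} \]

for \(j = 1, \ldots, J\). Rearranging Equation 48, we have:

\[ \frac{\partial \phi_{ij}}{\partial x^1_{ij}} = \left( \frac{\partial s_{i0}}{\partial x^1_{ij}} - \frac{\partial s_{ij}}{\partial x^1_{i0}} \right) \left[ \sum_{c \in \mathcal{P}(0/j)} \prod_{l \in c} \phi_{il} \prod_{l \notin \{c, j\}} (1 - \phi_{il}) \left( s^*_{i0}(c \cup j) - s^*_{i0}(c) \right) \right]^{-1} \tag{49} \]

for \(j = 1, \ldots, J\). Imagine for a moment that the remaining \(\phi_{il}(x_{il})\) and \(s^*_{i0}(c)\) were known. Then, in an analogous manner to proving identification in the Pure Default case, one could integrate over the support of \(x^1_{ij}\) to identify \(\phi_{ij}(x_{ij})\), with the constant of integration determined at \(\bar{x}_{ij}\) (Assumption 6b):

\[ \phi_{ij}(x_{ij}) = 1 - \int_{x^1_{ij}}^{\bar{x}^1_{ij}} \left( \frac{\partial s_{i0}}{\partial x^1_{ij}} - \frac{\partial s_{ij}}{\partial x^1_{i0}} \right) \left[ \sum_{c \in \mathcal{P}(0/j)} \prod_{l \in c} \phi_{il} \prod_{l \notin \{c, j\}} (1 - \phi_{il}) \left( s^*_{i0}(c \cup j) - s^*_{i0}(c) \right) \right]^{-1} dx^1_{ij} \tag{50} \]

Given the \(J\) independent market share equations for non-default goods, we have the system of \(2J\) equations for \(j = 1, \ldots, J\):

\[ s_{ij} = \sum_{c \in \mathcal{P}(j)} \prod_{l \in c} \phi_{il} \prod_{l \notin c} (1 - \phi_{il}) \, s^*_{ij}(c) \tag{51} \]

\[ \phi_{ij} = 1 - \int_{x^1_{ij}}^{\bar{x}^1_{ij}} \left( \frac{\partial s_{i0}}{\partial x^1_{ij}} - \frac{\partial s_{ij}}{\partial x^1_{i0}} \right) \left[ \sum_{c \in \mathcal{P}(0/j)} \prod_{l \in c} \phi_{il} \prod_{l \notin \{c, j\}} (1 - \phi_{il}) \left( s^*_{i0}(c \cup j) - s^*_{i0}(c) \right) \right]^{-1} dx^1_{ij} \tag{52} \]

In the Appendix, we show how we can manipulate these expressions to form a system of equations with a Jacobian that has all elements strictly less than one in absolute value, proving that there exists a unique \(\theta = [w_{i1}, \ldots, w_{iJ}, \phi_{i1}, \ldots, \phi_{iJ}]\) that satisfies the system.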
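As a rough numerical illustration of the asymmetry that drives identification here, the sketch below evaluates the share formula in Equation 51 by brute-force enumeration of consideration sets and compares cross derivatives by finite differences. It rests on assumptions that are not taken from the paper: logit errors, a single characteristic with \(w_{ij} = 0\), a logistic attention function with hypothetical parameters `a` and `g`, and the helper names `observed_shares`, `cross_derivative` and `asymmetry`.

```python
import itertools
import numpy as np

def observed_shares(x, a, g):
    """Observed shares for goods 0..J with logit errors (illustrative only).

    Good 0 is the default and is always considered. Each other good j is
    considered independently with probability sigmoid(a + g * x[j]), and
    utility is v_j = x_j (w_j = 0 is a simplifying assumption).
    """
    x = np.asarray(x, dtype=float)
    n_inside = len(x) - 1
    phi = 1.0 / (1.0 + np.exp(-(a + g * x[1:])))
    exp_v = np.exp(x)
    shares = np.zeros(len(x))
    # Enumerate every consideration set that contains the default good.
    for mask in itertools.product([False, True], repeat=n_inside):
        mask = np.array(mask)
        set_prob = np.prod(np.where(mask, phi, 1.0 - phi))
        weights = exp_v * np.concatenate(([True], mask))
        shares += set_prob * weights / weights.sum()
    return shares

def cross_derivative(x, a, g, j, k, h=1e-5):
    """Central finite difference of the share of good j w.r.t. x of good k."""
    up = np.array(x, dtype=float)
    dn = np.array(x, dtype=float)
    up[k] += h
    dn[k] -= h
    return (observed_shares(up, a, g)[j] - observed_shares(dn, a, g)[j]) / (2 * h)

def asymmetry(x, a, g, j=1, k=2):
    """Difference in cross derivatives between goods j and k."""
    return cross_derivative(x, a, g, j, k) - cross_derivative(x, a, g, k, j)

x_example = [0.0, 1.0, 0.0]                       # good 0 is the default
sym_gap = asymmetry(x_example, a=-1.0, g=0.0)     # attention does not depend on x
asym_gap = asymmetry(x_example, a=-1.0, g=2.0)    # attention depends on x
```

When `g = 0`, attention is below one but unresponsive to characteristics, so the model is a fixed mixture of logits and the cross derivatives coincide up to numerical error. When `g` is nonzero, the gap reflects the \(\partial \phi / \partial x\) terms in Equations 40 and 41.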

4 Special Cases

In this section, we consider the two special cases of the model highlighted in our proof. Roughly, we can think of the first model, from Ho, Hogan, and Scott-Morton (2015), as more appropriate in settings where inertia or choice of defaults explains a large fraction of choices, while the second model, from Goeree (2008), is more appropriate in cross-sectional data without a clear default. In both cases, we show that the model with consideration sets is equivalent to a random utility model in which consideration sets are absent but the utility of each good depends in a particular way on the characteristics of rival goods. We use this representation to clarify the patterns in the data that identify attentiveness in our general proof and then we discuss estimation.

4.1 Inattention and Adjustment Costs

The first special case we consider is where consumers have correlated inattention probabilities across all goods and where there is a clear default amongst the inside goods. This model can be used to identify whether inertia observed in panel data arises because consumers do not consider other options (and thus might be better off if they switched) or because consumers are actively choosing not to switch due to adjustment costs or persistent unobserved heterogeneity. One such model is considered in Ho, Hogan, and Scott-Morton (2015). In their interpretation, consumers choose in two stages. First, they decide whether to be attentive or not as a function of unobservables and the characteristics of the default good. Second, if they are attentive, they make an active choice among all goods. Following their notation, we write the choice probabilities as a function of the probability of inattention, defined as one minus the attentive probability.
This implies that choice probabilities are given by:

\[ P(Y_{id} = 1) = P(I_i \mid x_{id}, z_i) + \left(1 - P(I_i \mid x_{id}, z_i)\right) P_a(Y_{id} = 1) \]
\[ P(Y_{ij} = 1) = \left(1 - P(I_i \mid x_{id}, z_i)\right) P_a(Y_{ij} = 1) \quad \text{for } j \neq d \tag{53} \]

where \(P_a(Y_{ij} = 1)\) is the probability of choosing plan \(j\) conditional on paying attention. We assume that consumers have utility given by:

\[ u_{ij} = x_{ij} \beta_i + \xi_{i,j=d}(x_{id}) + \epsilon_{ij} \tag{54} \]

where \(\xi_{i,j=d}\) is the traditional adjustment cost term, which takes the value \(\xi_i(x_{id})\) for plan \(d\) and is 0 otherwise, and \(\epsilon_{ij}\) is i.i.d. extreme value. Note that \(\xi\) is allowed to vary with the characteristics of the default plan but cannot generally vary with the characteristics of other alternatives. This is a crucial identifying assumption (and is in our view quite natural in most cases).

4.1.1 Identification

As in the alternative-specific model, the above model is equivalent to a standard logit model with an additional inertial term:

\[ u_{ij} = x_{ij} \beta + \xi_{i,j=d}(x_{id}) + \psi_{i,j=d} + \epsilon_{ij} \tag{55} \]

where \(\psi_{i,j=d}\) takes the value \(\psi_i\) for plan \(d\) and is 0 otherwise. We show in Appendix C that \(\psi_i\) is given by:

\[ \psi_i = \ln \left( \frac{1 + P(I_i \mid x_{id}, z_i) \sum_{k \neq d} \exp\left((x_{ik} - x_{id}) \beta_i - \xi_i(x_{id})\right)}{1 - P(I_i \mid x_{id}, z_i)} \right) \tag{56} \]

The term \(\xi_{i,j=d}(x_{id})\) in this model can be thought of as all of the reasons why an attentive consumer might nonetheless prefer to choose the same plan, for example because there are adjustment costs to switching or persistent unobserved heterogeneity.⁵ The \(\psi_{i,j=d}\) term by contrast captures the possibility that the consumer chose the default plan not because it had higher utility, but simply because they were inattentive to the available options.⁶

We observe that beneficiaries are inertial and we want to know: is this because of adjustment costs or inattention? If it is because of adjustment costs (\(\psi_i = 0\)), then consumers will readily switch to alternative options if those options become more desirable. If it is because of inattention, consumers will be insensitive to characteristics of alternative plans. They may however still switch in response to changes in the characteristics of the default plan because these changes are allowed to impact the degree of inattention. Thus, in the health plan context, we may see the characteristic pattern documented in Ho, Hogan, and Scott-Morton (2015), wherein consumers readily switch when the premiums of their prior year plan increase but do not switch when the premiums of alternative plans fall. Provided we observe enough determinants of attentive behavior (which, again, need not include any characteristics beyond those which enter utility), we can separately identify \(\xi\) and \(\psi\) based on the asymmetry in how the share of rival plans responds to changes in the characteristics of the default good relative to how the share of the default good responds to changes in the characteristics of rival plans.

4.1.2 Estimation

To estimate this model by the usual methods, one must assume a functional form for \(P(I_i \mid x_{id})\).
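Before turning to functional forms, the equivalence between the two-stage choice probabilities in Equation 53 and the logit with an inertial term in Equations 55 and 56 can be checked numerically. The following is a minimal sketch under assumed inputs: the utility vector `v` (which already includes the adjustment cost for the default), the inattention probability `p_inattentive`, and both function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def two_stage_probs(v, d, p_inattentive):
    """Equation 53: with probability p_inattentive the default d is chosen
    outright; otherwise the consumer makes a logit choice among all goods."""
    attentive = np.exp(v) / np.exp(v).sum()
    probs = (1.0 - p_inattentive) * attentive
    probs[d] += p_inattentive
    return probs

def inertial_logit_probs(v, d, p_inattentive):
    """Equations 55-56: a plain logit in which the default's utility is
    shifted by psi. Here v[d] = x_d * beta + xi, so v[k] - v[d] equals
    (x_k - x_d) * beta - xi as in Equation 56."""
    others = np.delete(v, d)
    psi = np.log((1.0 + p_inattentive * np.exp(others - v[d]).sum())
                 / (1.0 - p_inattentive))
    shifted = v.copy()
    shifted[d] += psi
    return np.exp(shifted) / np.exp(shifted).sum()

v = np.array([1.0, 0.2, -0.4])   # illustrative utilities; index 0 is the default
p_two_stage = two_stage_probs(v, d=0, p_inattentive=0.6)
p_inertial = inertial_logit_probs(v, d=0, p_inattentive=0.6)
```

The two probability vectors agree to numerical precision, which is the sense in which inattention can be folded into a full-information logit through the \(\psi\) term.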
If we assume that \(P(I_i \mid x_{id})\) is itself given by a logit, we obtain a particularly simple expression. This is the same assumption used in Ho, Hogan, and Scott-Morton (2015). Suppose consumers are inattentive whenever:

\[ x_{id} \beta + \epsilon_{id} > f(z_i) + v_i \tag{57} \]

⁵ The inertia term \(\xi\) is not formally identical to a model where error terms are correlated over time. Abaluck and Gruber (2016) gives one example of a model which allows for both possibilities in the empirical setting of health plan choice we consider below. Nonetheless, in a model that does not allow for such a correlation, the \(\xi\) term may proxy for it.

⁶ Note that if we observed some subset of consumers that we knew were paying attention and we knew had exactly the same preferences and choice set as inattentive consumers, then we could estimate \(s^*_{ij}\) and directly compute \(\psi_{ij}\). In practice, however, this condition is unlikely to be met. Consider the context of health insurance plan choice. One might consider using the choices of new enrollees making a de novo choice to estimate \(P_{id}\) and then compare those choices to \(P_{id}\) estimated among returning enrollees. This method would incorrectly assume returning enrollees have no true adjustment costs or persistent unobserved preferences. The proof in section ?? shows that this model is identified without observing any such consumers.

where \(x_{id}\) are characteristics of the default good, \(z_i\) is a vector of other individual characteristics, and \(\epsilon_{id}\) and \(v_i\) are both type 1 extreme value. Then the probability of being inattentive is:

\[ P(I \mid x_{id}) = \frac{\exp(x_{id} \beta)}{\exp(f(z_i)) + \exp(x_{id} \beta)} \tag{58} \]

and we can simplify the inattention term to:

\[ \psi_{i,j=d} = \ln \left[ 1 + \frac{\sum_j \exp(x_{ij} \beta)}{\exp(f(z_i))} \right] \tag{59} \]

Note first that we do not need to observe any additional individual characteristics in order to estimate this model. We can assume that \(f(z_i) = 0\) and the model is still identified. Including individual characteristics just produces a more flexible model of inattention and thus reduces the likelihood that the error term is misspecified due to heteroscedasticity.

4.2 Alternative Specific Consideration

A second special case of our general framework is the case where there is a probability of considering each good that is a function of the characteristics of that good (and perhaps, but not necessarily, individual level characteristics). Let \(A_{ij}\) be an indicator for individual \(i\) paying attention to product \(j\). Then for each good we have a probability of attention:

\[ \phi_{ij} = P(A_{ij} = 1 \mid x_{ij}) \tag{60} \]

which we model as a binary choice model. In other words:

\[ P(A_{ij} = 1 \mid x_{ij}) = P(A^*_{ij} > 0) \quad \text{and} \quad A^*_{ij} = x_{ij} \gamma + \eta_{ij} \tag{61} \]

One interpretation of this model is that consumers choose whether or not to pay attention to each good as a function of that good's characteristics. But this need not be the case. Consideration probabilities less than 1 could arise due to several factors: perhaps consumers happen to be more likely to consider goods which advertise more but they are not consciously doing so, or alternatively, perhaps products with certain characteristics are more likely to appear on shelves.
The substantive point is that consumers only consider a subset of the available choices and their probability of considering each subset of choices depends on the characteristics of that good and potentially other individual-level characteristics.

Following Goeree (2008), we further assume that \(\eta_{ij}\) are i.i.d. extreme value and independent across goods, and that utility is given by a linear random coefficients logit specification. The assumption about \(\eta_{ij}\) implies that we can compute the probability of a choice set \(c\) as:

\[ \pi_c = \prod_{l \in c} \phi_{il} \prod_{k \notin c} (1 - \phi_{ik}) \tag{62} \]

4.2.1 Random Utility Representation

We show in Appendix C that this consideration set model can be rewritten as a full information model where the utility of each good depends directly on the characteristics of rival goods:

\[ u_{ij} = x_{ij} \beta_i + \psi_{ij} + \epsilon_{ij} \tag{63} \]

where:

\[ \psi_{ij} = \ln \left( \frac{s_{ij}}{1 - s_{ij}} \right) - \ln \left( \frac{s^*_{ij}}{1 - s^*_{ij}} \right) = \ln \left( \frac{P(A_j \mid x_{ij}) \sum_{k \neq j} \exp(x_{ik} \beta_i + \psi_{ik})}{\left(1 - P(A_j \mid x_{ij})\right) \exp(x_{ij} \beta_i) + \sum_{k \neq j} \exp(x_{ik} \beta_i + \psi_{ik})} \right) \tag{64} \]

where \(s_{ij}\) is the probability that option \(j\) is chosen and \(s^*_{ij}\) denotes the probability of choosing option \(j\) conditional on paying attention to that option. In other words, the inattention term \(\psi_{ij}\) is the difference between (a monotonic function of) the observed probability of choosing each option and what that probability would be if consumers were paying attention.

Unlike a traditional logit model, the utility of good \(j\) in this representation depends directly on the characteristics of other goods via the \(\psi_{ij}\) term if the consumer has some probability of being inattentive to good \(j\). This dependence effectively undoes the substitution that would occur in response to changes in the characteristics of rival goods in a full information model. In other words, the price of good \(k\) increasing would normally increase the market share of good \(j\) (and likewise, an increase in the price of good \(j\) would increase the market share of good \(k\)). In the above representation, if one is partly inattentive to good \(j\), then this price increase would decrement the utility of good \(j\) enough that less substitution occurs, breaking the symmetry between \(j\) and \(k\).

We can be more specific in this model about the sign of asymmetries. If the consumer is inattentive to good \(j\), changes in the characteristics of good \(j\) impact good \(k\) through changes in both \(x_{ij} \beta_i\) and \(\psi_{ij}\). The impact via the first channel is symmetric with the impact of characteristics of good \(k\) on the choice of \(j\), but the impact via the second channel does not exist with full attention.
Thus, if consumers are inattentive to good \(j\) and if characteristics impact attention with the same sign as their impact on utility, then we expect the impact of a change in the characteristics of good \(j\) on the choice of good \(k\) to exceed in magnitude the impact of a change in the characteristics of good \(k\) on the choice of good \(j\).

This pattern makes sense in some contexts but not others. Suppose that price reduces both attention and utility, so that the Goeree (2008) model predicts that the market share of an attentive good responds more to a change in the characteristics of an inattentive good than vice-versa. For a consumer choosing from an ordered list of items, we might observe that when the price of one of the top 3 items increases it has little effect on an item ranked 20th, but when the price of an item ranked 20th decreases we do see an effect on the top 3 items because the 20th ranked item moves up in the list and is brought to consumers' attention. This would be consistent with the model. Alternatively, we might observe that a consumer choosing amongst health plans is sensitive to changes in the plan they chose last year (the plan they are paying attention to) but insensitive to changes in rival

plans. This would contradict the pattern implied by Goeree (2008) and motivates our investigation of the model in Section ??.

4.2.2 Estimation

Goeree (2008) provides details of the estimation process. We sketch the main ideas here. With a small number of available alternatives, estimation in the alternative-specific inattention model is straightforward. The probability of choosing any specific alternative as a function of the parameters \(\theta = (\beta, \gamma)\) is given by:

\[ P(Y_{ij} = 1 \mid \theta) = P(c = \emptyset \mid \theta) \, 1_{j=d} + \sum_{c \in C} \prod_{l \in c} \phi_{il}(\theta) \prod_{k \notin c} \left(1 - \phi_{ik}(\theta)\right) P(Y_{ij} = 1 \mid c, \theta) \tag{65} \]

We can use this to construct the likelihood function and then estimate the parameters \(\beta\) and \(\gamma\) by maximum likelihood. In larger choice sets, a major computational issue arises: there are \(2^J\) possible consideration sets to sum over. To deal with this problem, we follow Goeree (2008) in using a simulated likelihood approach. The basic idea is to estimate the term \(\sum_{c \in C} \prod_{l \in c} \phi_{il}(\theta) \prod_{k \notin c} (1 - \phi_{ik}(\theta)) P(Y_{ij} = 1 \mid c, \theta)\) by simulating \(R\) consideration sets per individual where, for each \(r\), each option is added to the consideration set with probability \(\phi_{ij}\), so that the probability a given consideration set is simulated is given by \(\prod_{l \in c} \phi_{il} \prod_{k \notin c} (1 - \phi_{ik})\). We then compute:

\[ \hat{P}_{ij} = \frac{1}{R} \sum_r P(Y_{ij} = 1 \mid c_r, \theta) \tag{66} \]

Since each \(c_r\) is chosen with probability \(\prod_{l \in c} \phi_{il} \prod_{k \notin c} (1 - \phi_{ik})\), we have that:

\[ \hat{P}_{ij} \rightarrow_p \frac{1}{R} \sum_r \sum_{c \in C} \prod_{l \in c} \phi_{il}(\theta) \prod_{k \notin c} \left(1 - \phi_{ik}(\theta)\right) P(Y_{ij} = 1 \mid c, \theta) = \sum_{c \in C} \prod_{l \in c} \phi_{il}(\theta) \prod_{k \notin c} \left(1 - \phi_{ik}(\theta)\right) P(Y_{ij} = 1 \mid c, \theta) \tag{67} \]

This procedure would still be computationally burdensome because it would require computing \(P(Y_{ij} = 1 \mid c_r)\) for every simulation \(r\) for all individuals at each candidate set of parameter values (since as the underlying parameters shift, the \(\phi\), and thus the choice sets, would shift). Following Goeree (2008), two additional tricks are used so that the choice probabilities need to be evaluated only once per person for each simulation \(r\). First, we use the same uniform draws to simulate choice sets at each set of parameter values.
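The frequency simulator in Equation 66, together with the reuse of a fixed set of uniform draws, can be sketched as follows. This is only an illustration under assumed values: logit choice within each simulated set, a four-good example, and hypothetical function names. It is not Goeree's (2008) implementation and it omits the importance-sampling step.

```python
import itertools
import numpy as np

def prob_given_set(v, members, d):
    """Logit choice probabilities within a consideration set; if the set is
    empty the default d is chosen, matching the first term of Equation 65."""
    if not members.any():
        out = np.zeros(len(v))
        out[d] = 1.0
        return out
    weights = np.exp(v) * members
    return weights / weights.sum()

def exact_probs(v, phi, d):
    """Equation 65 by brute force over all 2^J consideration sets."""
    total = np.zeros(len(v))
    for mask in itertools.product([False, True], repeat=len(v)):
        members = np.array(mask)
        set_prob = np.prod(np.where(members, phi, 1.0 - phi))
        total += set_prob * prob_given_set(v, members, d)
    return total

def simulated_probs(v, phi, d, uniforms):
    """Equation 66: average over R simulated sets. Good j enters set r when
    uniforms[r, j] < phi[j]; reusing `uniforms` across parameter values keeps
    the simulated sets comparable as phi changes."""
    sets = uniforms < phi
    return np.mean([prob_given_set(v, s, d) for s in sets], axis=0)

rng = np.random.default_rng(0)
uniforms = rng.random((20000, 4))        # drawn once, then held fixed
v = np.array([0.5, 0.0, -0.3, 0.8])      # illustrative utilities
phi = np.array([0.9, 0.6, 0.4, 0.7])     # illustrative attention probabilities
p_exact = exact_probs(v, phi, d=0)
p_sim = simulated_probs(v, phi, d=0, uniforms=uniforms)
```

With a small choice set the exact sum is available as a benchmark, which makes it easy to see the simulation error shrink as the number of draws grows.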
Second, we use an importance sampler so that the choice probabilities need only be evaluated at the consideration sets implied by the parameters at their initial values. Specifically, we can compute equation 66 using:

\[ \hat{P}_{ij} = \frac{1}{R} \sum_r \frac{\prod_{l \in c} \phi_{il} \prod_{k \notin c} (1 - \phi_{ik})}{\phi^0_{ir}(\theta_0)} P(Y_{ij} = 1 \mid c^0, \theta) \tag{68} \]

where \(\phi^0_{ir}(\theta_0) = \prod_{l \in S^0} \phi_{il} \prod_{k \notin S^0} (1 - \phi_{ik})\) and each consideration set is sampled with probability

\(\phi^0_{ir}(\theta_0)\).⁷

5 Informative Advertising and Hotel Choice

We will now apply these results to data from online hotel choices made via Expedia.com. A subset of the data randomizes the search position in which hotels are displayed to consumers. We use this variation to test whether our model identifies that search position impacts attention but not utility, as well as to validate the model as a tool for generating out of sample predictions of the efficacy of advertising that increases consumer awareness of products.

We will estimate the Goeree (2008) model discussed in section 4.2. That model fits well here because a product's ranking in search results depends on the observable attributes of that product. Before estimating the model, we demean the attributes at the individual level. This means that the attention probabilities depend on the relative value of the observed attributes compared to other options in each consumer's choice set, as they should if these probabilities arise from each product's placement in search results.⁸

5.1 Data

The full dataset contains results from 166,036 consumer queries, including the hotels consumers were shown, attributes of those hotels, as well as whether the consumer clicked on the hotel and whether they ultimately purchased the hotel. The main attributes we consider are price, star rating, review score, a location desirability score, whether there is an on-going promotion and the position of the hotel in the search results. The data span 54,877 hotels in 788 destinations. Ursu (2015) contains a detailed discussion and describes several sample selection restrictions designed primarily to clean the data (e.g. dropping all hotels with prices of less than $10 per night or more than $1,000 per night). We impose the exact same sample selection restrictions as Ursu (2015) with two exceptions: we restrict to the top 10 choices and we do not restrict to the 4 largest hotel destinations, which results in a much larger sample.
In the data, we observe both clicks and final transactions, and our main results are reported for the subset of consumers for which we observe a final transaction. After restricting also to the sample with a randomized hotel ordering in search results, we end up with 2,441 total queries which span 9,851 hotels.

Summary statistics from our data after all sample selection restrictions are imposed are reported in Table 1. The average hotel costs about $160 a night, is rated 3.2 out of 5 stars, receives an average review score of 3.9 out of 5 from Expedia users, has a 74% chance of being from a popular brand and a 20% chance of currently undergoing a promotion (meaning that the sale price was noted as being lower than is typical). We can see that hotels which were actually chosen tend to be lower priced, more likely to be undergoing a promotion, and ranked higher in search.

⁷ Recall that an importance sampler estimates a density \(f(x)\) by drawing from a density \(g(x)\), labeling the resulting value as \(x_1\) and then weighting each draw by \(f(x_1)/g(x_1)\). The resulting density is equivalent to drawing directly from \(f(x)\).

⁸ This demeaning would make no difference in a conventional logit model, but it can make a difference in an attentive logit model.

Table 1: Expedia Data: Summary Statistics

                                 A. All Hotels    B. Chosen Hotels
Price (dollars)                  (97.2)           (67.9)
Hotel Stars (1-5)                (0.88)           (0.80)
Hotel Review Score (1-5)         (0.72)           (0.61)
Popular Brand Indicator          (0.44)           (0.44)
Location Score (normalized)      (0.87)           (0.86)
Ongoing Promotion Indicator      (0.40)           (0.45)
Position in Search               (2.87)           (2.89)
Number of Hotels                 24,410           2,441

Notes: Table reports means and standard deviations (in parentheses) for the sample of consumers who received a randomized hotel ordering in search and recorded a final transaction. Price is dollars per night, the popular brand indicator indicates the hotel is part of a "major hotel chain" (as defined by Expedia), and the ongoing promotion indicator indicates that the hotel is highlighted because the listed price is lower than is typical for that hotel.

5.2 Attentive Logit Estimation

Given that the order of the hotels was randomized, we might expect the position of the hotels in the search results to impact only attention and not utility. This need not be the case: Expedia did not inform consumers that the order was randomized, so they may believe that higher ranked hotels are better in some unobservable respect.

The estimation results with a conditional logit model and the Goeree (2008) model, referred to as the attentive logit model, are shown in Table 2. First, note that in both models, all the coefficients have the expected sign: consumers dislike high prices and they like hotels with more stars, higher review scores, better locations and higher positions in search. The conditional logit model implies that their responsiveness to a hotel moving from search position 10 to search position 1 is about the same as an $80 or 50% increase in the price per day. The attentive logit model shows that the impact of search position on choices comes entirely through the impact on attention rather than utility. The model also implies that consumers are much more likely to consider hotels which have a desirable location score.
This makes intuitive sense and is consistent with a world in which consumers make a query, find the hotels located near their destination, and then compare prices and other attributes to pick among them.

Table 3 shows how choice probabilities and attentive probabilities vary with the ranking. The model suggests that the attentive probability ranges from 0.3 for a hotel in the 10th position to 0.6 for the highest ranked hotel (the choice probability increases by a factor of 3, which differs from the ratio of average attention probabilities due to Jensen's inequality). We also compare the price elasticities estimated in the conditional logit model with the attentive logit model. The logit model seems to modestly attenuate own-price elasticities, with an average error of about 10%. This arises because consumers are insensitive to price variation for goods to which they are inattentive.

Table 2: Expedia Data: β and γ

Columns: Conditional Logit, Attentive Logit

Utility:
Price (dollars)                  ***  ***  (0.001) (0.003)
Hotel Stars (1-5)                0.566***  0.805***  (0.044) (0.138)
Hotel Review Score (1-5)         0.410***  0.768***  (0.049) (0.183)
Popular Brand Indicator          **  (0.058) (0.161)
Location Score (normalized)      0.695***  0.249**  (0.047) (0.109)
Ongoing Promotion Indicator      0.191***  (0.057) (0.156)
Position in Search               ***  (0.008) (0.027)

Attention:
Price (dollars)                  (0.001)
Hotel Stars (1-5)                (0.106)
Hotel Review Score (1-5)         (0.115)
Popular Brand Indicator          (0.179)
Location Score (normalized)      0.813***  (0.129)
Ongoing Promotion Indicator      (0.170)
Position in Search               ***  (0.022)
Constant                         (0.532)

Notes: Table reports coefficient estimates from the Goeree (2008) model. Estimates are the coefficients in the utility and attention equations (not marginal effects). Standard errors are in parentheses. *** denotes significance at the 1% level, ** denotes significance at the 5% level and * denotes significance at the 10% level. The model also includes a default, which is a randomly chosen alternative for each consumer. Given the estimated attention probabilities, this default is chosen less than 1% of the time.

More generally, the direction of the bias in own-price elasticities is ambiguous and depends on the correlation between prices and attention probabilities (which is empirically close to zero in this case).

Figure 1 shows the estimated asymmetries in the responsiveness of choice probabilities to the position variable. In practice, one can only perturb the position of one characteristic by also changing

the position of other characteristics, but given the estimated coefficients in the model, we can ask how choice probabilities respond to changes in the characteristics of rival goods holding fixed the position of all other variables. The x-axis shows the value of this asymmetry computed for each pair of goods as a percentage of the marginal effect for the higher position good. The main takeaway is that the estimated impact of position on attention probabilities leads to large asymmetries in demand responses.

Table 3: Expedia Data: Choice Probabilities and Elasticities

Columns: Search Position, Market Share, Attentive Probability, Conditional Logit Elasticity, Attentive Logit Elasticity

5.3 Out of Sample Validation

Because we recover the probability of attention for each good, we can ask: for which goods is it the case that demand is substantially higher if the probability goes to 1? This exercise provides a bound on the potential effectiveness of informative advertising. The random assignment in the Expedia data provides a natural experiment we can use to test that bound. To do so, we estimate the model using only the hotels in search positions 3 through 10, compute the bound using that data, and then ask how well the bound does in accounting for the observed behavior in positions 1 and 2. While we cannot know ex ante how the attention probability will change if a hotel is placed in positions 1 or 2, we know that demand in those positions should be less than the bound given by perfect attention. Thus, we ask first whether the bound implied by the Goeree (2008) model is indeed a bound on choice probabilities for hotels in positions 1 and 2 and second, whether this bound has predictive power in accounting for the choice probabilities conditional on observed demand.
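To make the notion of a perfect-attention bound concrete, the sketch below computes one good's share under independent consideration and then recomputes it with that good's attention probability set to 1. It is a stylized illustration under assumptions of our own: logit choice within each set, no default alternative, made-up utilities and attention probabilities, and hypothetical function names. It is not the estimation code behind the results reported here.

```python
import itertools
import numpy as np

def share_of(j, v, phi):
    """Observed share of good j: sum over all consideration sets containing j
    of the set probability times the logit probability of j within the set."""
    v = np.asarray(v, dtype=float)
    phi = np.asarray(phi, dtype=float)
    total = 0.0
    for mask in itertools.product([False, True], repeat=len(v)):
        members = np.array(mask)
        if not members[j]:
            continue  # good j cannot be chosen if it is not considered
        set_prob = np.prod(np.where(members, phi, 1.0 - phi))
        weights = np.exp(v) * members
        total += set_prob * weights[j] / weights.sum()
    return total

def attention_bound(j, v, phi):
    """Share of good j if it were considered with probability 1, holding the
    attention probabilities of all other goods fixed."""
    full = np.array(phi, dtype=float)
    full[j] = 1.0
    return share_of(j, v, full)

v = [0.4, 0.0, -0.2, 0.9]        # illustrative utilities
phi = [0.3, 0.6, 0.5, 0.45]      # illustrative attention probabilities
current = share_of(0, v, phi)
bound = attention_bound(0, v, phi)
```

In this stylized setting the share of good \(j\) is proportional to its own attention probability, so the bound equals the current share divided by \(\phi_j\). Goods with low attention but high conditional demand therefore have the most to gain.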
Figure 2 shows how we compute the bound for three example types of hotels: the average hotel in the data, the maximum utility hotel in each choice set and the minimum utility hotel in each choice set. The attentive logit model implies a linear relationship between observed attentive probabilities and choice probabilities, which is illustrated in the figure. The squares denote the bound for each type of hotel: this is what the transaction probability would be if you had probability 1 of paying attention to that type of hotel. In practice, we compute this bound separately for each hotel in the data, but we collapse down to categories of hotels for expository purposes. Figure 3 shows how this bound compares to the observed demand for a variety of different types of hotels in each search position. The thick horizontal

line shows the bound, the 10 colored dots show demand in each search position (with higher dots corresponding to lower search positions) and the x indicates the average demand observed in the data for hotels of each type. The main takeaways from this figure are first that demand is always less than the bound implied by perfect attention and second that the bound is non-trivial. For example, average demand for the max utility hotels in positions 1-3 exceeds the bound on demand for the average hotel in the data.

Finally, we ask whether the bound has predictive power: if we see two hotels with the same level of demand in positions 3-10, will the hotel with the larger bound experience a larger increase in demand if it is randomly assigned to search position 1 or 2? Table ZZZ shows that the answer is yes. Specification (1) shows that across hotels, the bound constructed from the model estimated on positions 3-10 predicts demand in positions 1 and 2. Specifications (2)-(4) show that it continues to have predictive power even after we condition on the observed choice probability for that hotel in positions 3-10 as well as the choice probability implied by a logit model given the choice set and the characteristics of the hotel in question. Thus, the attentive logit model can be used to forecast which products will benefit from informative advertising given their current level of utility.

Figure 1: Estimated Asymmetries in Position Responses

Figure 2: Expedia: Transaction Probability vs. Attention

Table 4: Expedia Data: Regression of Choice Probability in Positions 1-2 on Hotel Characteristics

Columns: (1), (2), (3), (4)

Bound                    0.576***  0.588***  0.225***  0.261***  (0.035) (0.040) (0.074) (0.086)
Hotel Prob (pos < 2)     (0.030) (0.030)
Logit (pos < 2)          0.712***  0.660***  (0.143) (0.166)
Number of Hotels

Notes: Table reports coefficients from a regression at the hotel level of the transaction probability of that hotel in positions 1 and 2 on hotel-level covariates. 4,882 hotels appeared in the data in positions 1 and 2 and 3,722 of these also appeared in positions 3-10 (the bound can be constructed for hotels based on their characteristics and the estimated model coefficients even if they did not appear in positions 3-10). "Bound" indicates the attentive logit forecast of demand for a hotel with those characteristics with attention probability 1. Hotel Prob (pos < 2) is the empirical choice probability in positions 3-10 if available. Logit (pos < 2) is the logit choice probability given the observed characteristics of the hotel and the coefficients estimated on hotels in positions 3-10.

6 Adjustment Costs and Inattention in Medicare Part D

We also apply the model to evaluate whether the observed inertia in Medicare Part D plans is due to inattention, adjustment costs or both. Medicare Part D plans provide prescription drug insurance

Discrete Choice Models with Consideration Sets: Identification from Asymmetric Cross-Derivatives

Discrete Choice Models with Consideration Sets: Identification from Asymmetric Cross-Derivatives Discrete Choice Models with Consideration Sets: Identification from Asymmetric Cross-Derivatives Jason Abaluck and Abi Adams October 21, 2016 Abstract The applied literature on consideration sets relies

More information

Lecture 3 Unobserved choice sets

Lecture 3 Unobserved choice sets Lecture 3 Unobserved choice sets Rachel Griffith CES Lectures, April 2017 Introduction Unobserved choice sets Goeree (2008) GPS (2016) Final comments 1 / 32 1. The general problem of unobserved choice

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Estimating Single-Agent Dynamic Models

Estimating Single-Agent Dynamic Models Estimating Single-Agent Dynamic Models Paul T. Scott Empirical IO Fall, 2013 1 / 49 Why are dynamics important? The motivation for using dynamics is usually external validity: we want to simulate counterfactuals

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Deceptive Advertising with Rational Buyers

Deceptive Advertising with Rational Buyers Deceptive Advertising with Rational Buyers September 6, 016 ONLINE APPENDIX In this Appendix we present in full additional results and extensions which are only mentioned in the paper. In the exposition

More information

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States (email: carls405@msu.edu)

More information

Transparent Structural Estimation. Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro)

Transparent Structural Estimation. Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro) Transparent Structural Estimation Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro) 1 A hallmark of contemporary applied microeconomics is a conceptual framework

More information

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853,

More information

Heterogeneity. Krishna Pendakur. May 24, Krishna Pendakur () Heterogeneity May 24, / 21

Heterogeneity. Krishna Pendakur. May 24, Krishna Pendakur () Heterogeneity May 24, / 21 Heterogeneity Krishna Pendakur May 24, 2015 Krishna Pendakur () Heterogeneity May 24, 2015 1 / 21 Introduction People are heterogeneous. Some heterogeneity is observed, some is not observed. Some heterogeneity

More information

Econometric Analysis of Games 1

Econometric Analysis of Games 1 Econometric Analysis of Games 1 HT 2017 Recap Aim: provide an introduction to incomplete models and partial identification in the context of discrete games 1. Coherence & Completeness 2. Basic Framework

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Lecture 1. Behavioral Models Multinomial Logit: Power and limitations. Cinzia Cirillo

Lecture 1. Behavioral Models Multinomial Logit: Power and limitations. Cinzia Cirillo Lecture 1 Behavioral Models Multinomial Logit: Power and limitations Cinzia Cirillo 1 Overview 1. Choice Probabilities 2. Power and Limitations of Logit 1. Taste variation 2. Substitution patterns 3. Repeated

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

A Note on Demand Estimation with Supply Information. in Non-Linear Models

A Note on Demand Estimation with Supply Information. in Non-Linear Models A Note on Demand Estimation with Supply Information in Non-Linear Models Tongil TI Kim Emory University J. Miguel Villas-Boas University of California, Berkeley May, 2018 Keywords: demand estimation, limited

More information

Parametric Identification of Multiplicative Exponential Heteroskedasticity

Parametric Identification of Multiplicative Exponential Heteroskedasticity Parametric Identification of Multiplicative Exponential Heteroskedasticity Alyssa Carlson Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States Dated: October 5,

More information

Econometrics Lecture 10: Applied Demand Analysis

Econometrics Lecture 10: Applied Demand Analysis Econometrics Lecture 10: Applied Demand Analysis R. G. Pierse 1 Introduction In this lecture we look at the estimation of systems of demand equations. Demand equations were some of the earliest economic

More information

Estimation of Static Discrete Choice Models Using Market Level Data

Estimation of Static Discrete Choice Models Using Market Level Data Estimation of Static Discrete Choice Models Using Market Level Data NBER Methods Lectures Aviv Nevo Northwestern University and NBER July 2012 Data Structures Market-level data cross section/time series/panel

More information

1 Differentiated Products: Motivation

1 Differentiated Products: Motivation 1 Differentiated Products: Motivation Let us generalise the problem of differentiated products. Let there now be N firms producing one differentiated product each. If we start with the usual demand function

More information

Estimating Single-Agent Dynamic Models

Estimating Single-Agent Dynamic Models Estimating Single-Agent Dynamic Models Paul T. Scott New York University Empirical IO Course Fall 2016 1 / 34 Introduction Why dynamic estimation? External validity Famous example: Hendel and Nevo s (2006)

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)

More information

Consistency and Asymptotic Normality for Equilibrium Models with Partially Observed Outcome Variables

Consistency and Asymptotic Normality for Equilibrium Models with Partially Observed Outcome Variables Consistency and Asymptotic Normality for Equilibrium Models with Partially Observed Outcome Variables Nathan H. Miller Georgetown University Matthew Osborne University of Toronto November 25, 2013 Abstract

More information

Applied Health Economics (for B.Sc.)

Applied Health Economics (for B.Sc.) Applied Health Economics (for B.Sc.) Helmut Farbmacher Department of Economics University of Mannheim Autumn Semester 2017 Outlook 1 Linear models (OLS, Omitted variables, 2SLS) 2 Limited and qualitative

More information

Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War

Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War Spring 009 Main question: In 1955 quantities of autos sold were higher while prices were lower, relative

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation

Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation Revisiting the Nested Fixed-Point Algorithm in BLP Random Coeffi cients Demand Estimation Jinhyuk Lee Kyoungwon Seo September 9, 016 Abstract This paper examines the numerical properties of the nested

More information

Structure learning in human causal induction

Structure learning in human causal induction Structure learning in human causal induction Joshua B. Tenenbaum & Thomas L. Griffiths Department of Psychology Stanford University, Stanford, CA 94305 jbt,gruffydd @psych.stanford.edu Abstract We use

More information

Introduction to linear programming using LEGO.

Introduction to linear programming using LEGO. Introduction to linear programming using LEGO. 1 The manufacturing problem. A manufacturer produces two pieces of furniture, tables and chairs. The production of the furniture requires the use of two different

More information

Nonparametric Identication of a Binary Random Factor in Cross Section Data and

Nonparametric Identication of a Binary Random Factor in Cross Section Data and . Nonparametric Identication of a Binary Random Factor in Cross Section Data and Returns to Lying? Identifying the Effects of Misreporting When the Truth is Unobserved Arthur Lewbel Boston College This

More information

Demand in Differentiated-Product Markets (part 2)

Demand in Differentiated-Product Markets (part 2) Demand in Differentiated-Product Markets (part 2) Spring 2009 1 Berry (1994): Estimating discrete-choice models of product differentiation Methodology for estimating differentiated-product discrete-choice

More information

Estimating the Pure Characteristics Demand Model: A Computational Note

Estimating the Pure Characteristics Demand Model: A Computational Note Estimating the Pure Characteristics Demand Model: A Computational Note Minjae Song School of Economics, Georgia Institute of Technology April, 2006 Abstract This paper provides details of the computational

More information

Graduate Econometrics I: What is econometrics?

Graduate Econometrics I: What is econometrics? Graduate Econometrics I: What is econometrics? Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: What is econometrics?

More information

On the Tightness of an LP Relaxation for Rational Optimization and its Applications

On the Tightness of an LP Relaxation for Rational Optimization and its Applications OPERATIONS RESEARCH Vol. 00, No. 0, Xxxxx 0000, pp. 000 000 issn 0030-364X eissn 526-5463 00 0000 000 INFORMS doi 0.287/xxxx.0000.0000 c 0000 INFORMS Authors are encouraged to submit new papers to INFORMS

More information

An empirical model of firm entry with endogenous product-type choices

An empirical model of firm entry with endogenous product-type choices and An empirical model of firm entry with endogenous product-type choices, RAND Journal of Economics 31 Jan 2013 Introduction and Before : entry model, identical products In this paper : entry with simultaneous

More information

The Impact of Organizer Market Structure on Participant Entry Behavior in a Multi-Tournament Environment

The Impact of Organizer Market Structure on Participant Entry Behavior in a Multi-Tournament Environment The Impact of Organizer Market Structure on Participant Entry Behavior in a Multi-Tournament Environment Timothy Mathews and Soiliou Daw Namoro Abstract. A model of two tournaments, each with a field of

More information

Introduction to General Equilibrium

Introduction to General Equilibrium Introduction to General Equilibrium Juan Manuel Puerta November 6, 2009 Introduction So far we discussed markets in isolation. We studied the quantities and welfare that results under different assumptions

More information

Choice, Consideration Sets and Attribute Filters

Choice, Consideration Sets and Attribute Filters Choice, Consideration Sets and Attribute Filters Mert Kimya Brown University, Department of Economics, 64 Waterman St, Providence RI 02912 USA. October 6, 2015 Abstract It is well known that decision makers

More information

UNIVERSITY OF NOTTINGHAM. Discussion Papers in Economics CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY

UNIVERSITY OF NOTTINGHAM. Discussion Papers in Economics CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY UNIVERSITY OF NOTTINGHAM Discussion Papers in Economics Discussion Paper No. 0/06 CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY by Indraneel Dasgupta July 00 DP 0/06 ISSN 1360-438 UNIVERSITY OF NOTTINGHAM

More information

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous Econometrics of causal inference Throughout, we consider the simplest case of a linear outcome equation, and homogeneous effects: y = βx + ɛ (1) where y is some outcome, x is an explanatory variable, and

More information

16/018. Efficiency Gains in Rank-ordered Multinomial Logit Models. June 13, 2016

16/018. Efficiency Gains in Rank-ordered Multinomial Logit Models. June 13, 2016 16/018 Efficiency Gains in Rank-ordered Multinomial Logit Models Arie Beresteanu and Federico Zincenko June 13, 2016 Efficiency Gains in Rank-ordered Multinomial Logit Models Arie Beresteanu and Federico

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Can everyone benefit from innovation?

Can everyone benefit from innovation? Can everyone benefit from innovation? Christopher P. Chambers and Takashi Hayashi June 16, 2017 Abstract We study a resource allocation problem with variable technologies, and ask if there is an allocation

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago December 2015 Abstract Rothschild and Stiglitz (1970) represent random

More information

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets

Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Technical Note: Capacitated Assortment Optimization under the Multinomial Logit Model with Nested Consideration Sets Jacob Feldman Olin Business School, Washington University, St. Louis, MO 63130, USA

More information

4.8 Instrumental Variables

4.8 Instrumental Variables 4.8. INSTRUMENTAL VARIABLES 35 4.8 Instrumental Variables A major complication that is emphasized in microeconometrics is the possibility of inconsistent parameter estimation due to endogenous regressors.

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

Mini Course on Structural Estimation of Static and Dynamic Games

Mini Course on Structural Estimation of Static and Dynamic Games Mini Course on Structural Estimation of Static and Dynamic Games Junichi Suzuki University of Toronto June 1st, 2009 1 Part : Estimation of Dynamic Games 2 ntroduction Firms often compete each other overtime

More information

WELFARE: THE SOCIAL- WELFARE FUNCTION

WELFARE: THE SOCIAL- WELFARE FUNCTION Prerequisites Almost essential Welfare: Basics Welfare: Efficiency WELFARE: THE SOCIAL- WELFARE FUNCTION MICROECONOMICS Principles and Analysis Frank Cowell July 2017 1 Social Welfare Function Limitations

More information

Identification of Nonparametric Simultaneous Equations Models with a Residual Index Structure

Identification of Nonparametric Simultaneous Equations Models with a Residual Index Structure Identification of Nonparametric Simultaneous Equations Models with a Residual Index Structure Steven T. Berry Yale University Department of Economics Cowles Foundation and NBER Philip A. Haile Yale University

More information

Cowles Foundation for Research in Economics at Yale University

Cowles Foundation for Research in Economics at Yale University Cowles Foundation for Research in Economics at Yale University Cowles Foundation Discussion Paper No. 1904 Afriat from MaxMin John D. Geanakoplos August 2013 An author index to the working papers in the

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

The random coefficients logit model is identified

The random coefficients logit model is identified The random coefficients logit model is identified The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher

More information

A Rothschild-Stiglitz approach to Bayesian persuasion

A Rothschild-Stiglitz approach to Bayesian persuasion A Rothschild-Stiglitz approach to Bayesian persuasion Matthew Gentzkow and Emir Kamenica Stanford University and University of Chicago January 2016 Consider a situation where one person, call him Sender,

More information

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

More information

Lecture 4. 1 Examples of Mechanism Design Problems

Lecture 4. 1 Examples of Mechanism Design Problems CSCI699: Topics in Learning and Game Theory Lecture 4 Lecturer: Shaddin Dughmi Scribes: Haifeng Xu,Reem Alfayez 1 Examples of Mechanism Design Problems Example 1: Single Item Auctions. There is a single

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Identification of Discrete Choice Models for Bundles and Binary Games

Identification of Discrete Choice Models for Bundles and Binary Games Identification of Discrete Choice Models for Bundles and Binary Games Jeremy T. Fox University of Michigan and NBER Natalia Lazzati University of Michigan February 2013 Abstract We study nonparametric

More information

1. Basic Model of Labor Supply

1. Basic Model of Labor Supply Static Labor Supply. Basic Model of Labor Supply.. Basic Model In this model, the economic unit is a family. Each faimily maximizes U (L, L 2,.., L m, C, C 2,.., C n ) s.t. V + w i ( L i ) p j C j, C j

More information

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION Arthur Lewbel Boston College December 19, 2000 Abstract MisclassiÞcation in binary choice (binomial response) models occurs when the dependent

More information

II. Analysis of Linear Programming Solutions

II. Analysis of Linear Programming Solutions Optimization Methods Draft of August 26, 2005 II. Analysis of Linear Programming Solutions Robert Fourer Department of Industrial Engineering and Management Sciences Northwestern University Evanston, Illinois

More information

arxiv: v1 [math.oc] 28 Jun 2016

arxiv: v1 [math.oc] 28 Jun 2016 On the Inefficiency of Forward Markets in Leader-Follower Competition Desmond Cai, Anish Agarwal, Adam Wierman arxiv:66.864v [math.oc] 8 Jun 6 June 9, 6 Abstract Motivated by electricity markets, this

More information

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL BRENDAN KLINE AND ELIE TAMER Abstract. Randomized trials (RTs) are used to learn about treatment effects. This paper

More information

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix)

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix) Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix Flavio Cunha The University of Pennsylvania James Heckman The University

More information

Chapter 6 Stochastic Regressors

Chapter 6 Stochastic Regressors Chapter 6 Stochastic Regressors 6. Stochastic regressors in non-longitudinal settings 6.2 Stochastic regressors in longitudinal settings 6.3 Longitudinal data models with heterogeneity terms and sequentially

More information

1 Bewley Economies with Aggregate Uncertainty

1 Bewley Economies with Aggregate Uncertainty 1 Bewley Economies with Aggregate Uncertainty Sofarwehaveassumedawayaggregatefluctuations (i.e., business cycles) in our description of the incomplete-markets economies with uninsurable idiosyncratic risk

More information

Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules. Supplement

Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules. Supplement Do Shareholders Vote Strategically? Voting Behavior, Proposal Screening, and Majority Rules Supplement Ernst Maug Kristian Rydqvist September 2008 1 Additional Results on the Theory of Strategic Voting

More information

Empirical approaches in public economics

Empirical approaches in public economics Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental

More information

On IV estimation of the dynamic binary panel data model with fixed effects

On IV estimation of the dynamic binary panel data model with fixed effects On IV estimation of the dynamic binary panel data model with fixed effects Andrew Adrian Yu Pua March 30, 2015 Abstract A big part of applied research still uses IV to estimate a dynamic linear probability

More information

CEMMAP Masterclass: Empirical Models of Comparative Advantage and the Gains from Trade 1 Lecture 3: Gravity Models

CEMMAP Masterclass: Empirical Models of Comparative Advantage and the Gains from Trade 1 Lecture 3: Gravity Models CEMMAP Masterclass: Empirical Models of Comparative Advantage and the Gains from Trade 1 Lecture 3: Gravity Models Dave Donaldson (MIT) CEMMAP MC July 2018 1 All material based on earlier courses taught

More information

Identification and Estimation of Differentiated Products Models using Market Size and Cost Data

Identification and Estimation of Differentiated Products Models using Market Size and Cost Data Department of Economics Identification and Estimation of Differentiated Products Models using Market Size and Cost Data David P. Byrne University of Melbourne Susumu Imai 1 University of Technology Sydney

More information

ECO 2901 EMPIRICAL INDUSTRIAL ORGANIZATION

ECO 2901 EMPIRICAL INDUSTRIAL ORGANIZATION ECO 2901 EMPIRICAL INDUSTRIAL ORGANIZATION Lecture 7 & 8: Models of Competition in Prices & Quantities Victor Aguirregabiria (University of Toronto) Toronto. Winter 2018 Victor Aguirregabiria () Empirical

More information

Mathematical Appendix. Ramsey Pricing

Mathematical Appendix. Ramsey Pricing Mathematical Appendix Ramsey Pricing PROOF OF THEOREM : I maximize social welfare V subject to π > K. The Lagrangian is V + κπ K the associated first-order conditions are that for each I + κ P I C I cn

More information

Individual decision-making under certainty

Individual decision-making under certainty Individual decision-making under certainty Objects of inquiry Our study begins with individual decision-making under certainty Items of interest include: Feasible set Objective function (Feasible set R)

More information

Truncation and Censoring

Truncation and Censoring Truncation and Censoring Laura Magazzini laura.magazzini@univr.it Laura Magazzini (@univr.it) Truncation and Censoring 1 / 35 Truncation and censoring Truncation: sample data are drawn from a subset of

More information

Specification Test on Mixed Logit Models

Specification Test on Mixed Logit Models Specification est on Mixed Logit Models Jinyong Hahn UCLA Jerry Hausman MI December 1, 217 Josh Lustig CRA Abstract his paper proposes a specification test of the mixed logit models, by generalizing Hausman

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games

Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games Gabriel Y. Weintraub, Lanier Benkard, and Benjamin Van Roy Stanford University {gweintra,lanierb,bvr}@stanford.edu Abstract

More information

A Random Attention Model

A Random Attention Model A Random Attention Model Matias D. Cattaneo Xinwei Ma U. Michigan U. Michigan Yusufcan Masatlioglu Elchin Suleymanov U. Maryland U. Michigan Stochastic Choice Monotonic Attention RAM Identification Inference

More information

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances Discussion Paper: 2006/07 Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances J.S. Cramer www.fee.uva.nl/ke/uva-econometrics Amsterdam School of Economics Department of

More information

How Revealing is Revealed Preference?

How Revealing is Revealed Preference? How Revealing is Revealed Preference? Richard Blundell UCL and IFS April 2016 Lecture II, Boston University Richard Blundell () How Revealing is Revealed Preference? Lecture II, Boston University 1 / 55

More information

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation 1 Outline. 1. Motivation 2. SUR model 3. Simultaneous equations 4. Estimation 2 Motivation. In this chapter, we will study simultaneous systems of econometric equations. Systems of simultaneous equations

Online Supplement to

Online Supplement to Pricing Decisions in a Strategic Single Retailer/Dual Suppliers Setting under Order Size Constraints. Ali Ekici, Department of Industrial Engineering, Ozyegin University, Istanbul, Turkey,

KIER DISCUSSION PAPER SERIES

KYOTO INSTITUTE OF ECONOMIC RESEARCH. Discussion Paper No. 992. Intertemporal efficiency does not imply a common price forecast: a leading example. Shurojit Chatterji, Atsushi

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Outline: 1. Micro-empirical methods; 2. Rubin causal model; 3. More on Instrumental Variables (IV). Estimating causal

A Simplified Test for Preference Rationality of Two-Commodity Choice

Samiran Banerjee and James H. Murphy. December 9, 2004. Abstract: We provide a simplified test to determine if choice data from a two-commodity

ESTIMATION OF NONPARAMETRIC MODELS WITH SIMULTANEITY

Rosa L. Matzkin, Department of Economics, University of California, Los Angeles. First version: May 200. This version: August 204. Abstract: We introduce

Fixed Point Theorems

Definition: Let X be a set and let f : X → X be a function that maps X into itself. (Such a function is often called an operator, a transformation, or a transform on X, and the notation

The properties of Lp-GMM estimators

Robert de Jong and Chirok Han, Michigan State University. February 2000. Abstract: This paper considers Generalized Method of Moment-type estimators for which a criterion

Instrumental Variables and the Problem of Endogeneity

September 15, 2015. Exogeneity: Important Assumption of OLS. In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

This paper provides an impressive, yet compact and easily accessible review of the econometric literature

ECOM 009 Macroeconomics B. Lecture 3

Giulio Fella. © Giulio Fella, 2014. Predictions of the PICH: 1. Marginal propensity to consume out of wealth windfalls 0.03.

Uncertainty. Michael Peters December 27, 2013

Lotteries. In many problems in economics, people are forced to make decisions without knowing exactly what the consequences will be. For example, when you buy
