Learning and Valuation with Costly Attention

Jacob LaRiviere^a and William Neilson^b

June 2015

Abstract

This paper develops a theoretical framework of consumer learning and product valuation when attending to new information is costly. The key feature of the model is that agents are unsure which product characteristics are present in a good. In the model, a stronger belief that a good contains valuable attributes increases learning and, possibly, willingness to pay for the good. We derive two implications from the model that can readily be tested using lab or field data.

JEL Codes: D01; D83; Q41

Keywords: Information, Updating, Preferences, Uncertainty

^a University of Tennessee & Baker Center for Public Policy, Department of Economics, 525 Stokely Management Center, Knoxville, TN 37996-0550. Email: jlarivi1@utk.edu.

^b University of Tennessee, Department of Economics, 508 Stokely Management Center, Knoxville, TN 37996-0550.

1 Introduction

There is mounting evidence that learning is a complicated process. Recent research notes that if attention is scarce or learning is costly, consumers may be left with inefficient levels of information (DellaVigna (2009), Bordalo, Gennaioli, and Shleifer (2013), Schwartzstein (2014), and LaRiviere, Czajkowski, Hanley, and Simpson (2015)). Yet little is known, either theoretically or empirically, about how costly learning affects the way consumers form valuations for goods, even though it has long been recognized that learning resources are scarce (Gabaix, Laibson, Moloche, and Weinberg (2006)). It is also unclear what causal impact increasing the retention of information has on economic decision-making.

Hanna, Mullainathan, and Schwartzstein (2014) attempt to tackle some of these issues in the context of firms. That paper develops a theoretical model of costly attention in which firms must pay a cost to attend to information about how effective an input is at increasing productivity. The authors then test the model and find supporting evidence using data from farmers in Indonesia.

Our paper takes aspects of their model and applies them to consumers. We model both uncertainty about whether an attribute is present in a good and uncertainty about the level of the attribute conditional on it being present. The theoretical model leads to two key propositions in addition to other testable predictions. Intuitively, if the probability that a desirable attribute is present in a good increases, learning about the good increases. If the attribute is indeed present, then valuations subsequently increase. Both propositions can be tested in the lab or the field. Consistent with the empirical findings of LaRiviere, Czajkowski, Hanley, and Simpson (2015), our key modelling assumption is that learning is probabilistic.

2 Theoretical framework

The approach is to construct the simplest possible model that captures the relevant aspects of the consumer's problem of learning and valuation with costly effort, in order to guide the design of experiments and formulate testable hypotheses. In particular, the model considers a good with multiple features which might or might not be embodied in the good (e.g., a bottle of wine which may or may not have heavy citrus but is definitely a Sauvignon Blanc from Marlborough, New Zealand), and allows for learning about how much a consumer values uncertain or new features. This framework applies to new private goods, new public goods, or new mixed goods with both private and public attributes, like green energy blocks. The model bears qualitative similarity to other recent models of costly learning (Hanna, Mullainathan, and Schwartzstein (2014) and Schwartzstein (2014)).

A consumer faces two types of goods, a composite good $y$ and another good $z$. A standard framework for such a setting would treat utility as quasilinear:

$$u(y, z) = y + v(z), \tag{1}$$

but the model used here has more structure behind $v(z)$. A unit of $z$ has two attributes: a familiar attribute in amount $a$, and an unfamiliar attribute $b$ which may or may not be valued by the consumer. The utility generated by a single unit of $z$ is

$$v(1) = a + \pi \int \phi(b)\, dF(b), \tag{2}$$

where $\pi \in [0, 1]$ is the probability that the good includes the new feature and the distribution function $F(b)$ governs the amount of the unfamiliar attribute the individual uses conditional on it being included. The function $\phi$ is nondecreasing and concave with $\phi(0) = 0$. The additive characterization of $v(1)$ has each attribute generate utility separately, and the individual displays diminishing sensitivity over the unfamiliar feature. We refer to this latter property as concavity in attributes and note that it bears similarity to the Lancaster (1966) utility formulation.

The individual can choose an amount of good $z$, and consuming more $z$ entails scaling up the two characteristics proportionately. This leads to the representation

$$v(z) = \Phi(z) \left[ a + \pi \int \phi(b)\, dF(b) \right], \tag{3}$$

where $\Phi$ is increasing and strictly concave to reflect diminishing marginal utility of the good as a whole. We refer to this property as concavity in levels. The two forms of concavity, in attributes and in levels, replace the concavity of $v(z)$ in the standard representation of (1).

The individual is endowed with the noisy random variable $\tilde{b} + \tilde{\varepsilon}$, where $\tilde{\varepsilon}$ is a noise variable with $E[\tilde{\varepsilon} \mid b] = 0$ for all $b$. Learning is costly and probabilistic. If learning is successful, she learns whether the unfamiliar attribute is, in fact, included in the good and, if it is, the noise is removed from the random variable governing its amount. Learning whether the attribute is included resolves the binary random variable captured by the prior probability $\pi$, and we assume that this prior is unbiased. If the attribute is included and learning is successful, the per-unit amount of the unfamiliar attribute is governed by the random variable $\tilde{b}$ instead of the noisy $\tilde{b} + \tilde{\varepsilon}$. If the attribute is not included and learning is successful, she learns that $b = 0$ with certainty. On the other hand, when learning is unsuccessful she learns nothing about either the inclusion probability $\pi$ or the amount of the attribute, so she still faces the random variable $\tilde{b} + \tilde{\varepsilon}$.

Learning is measured by its success probability $t$, and requires energy which is expended at cost $c(t)$, with $c$ nondecreasing and convex and $c(0) = c'(0) = 0$. This representation of learning cost is consistent with either a fatigue interpretation (digging deeper requires increasingly more energy) or one based on ease of finding information (digging deeper requires more effort to find and digest additional relevant information).

Let $F_{\tilde{b}+\tilde{\varepsilon}}(b)$ be the distribution function for the noisy random variable $\tilde{b} + \tilde{\varepsilon}$, and let $F_{\tilde{b}}(b)$ denote the distribution function for the noiseless, but still random, variable $\tilde{b}$. Combining all of these notions into (1) and (3) yields

$$u(y, z, t) = y + \Phi(z) \left[ a + t\pi \int \phi(b)\, dF_{\tilde{b}}(b) + (1 - t)\pi \int \phi(b)\, dF_{\tilde{b}+\tilde{\varepsilon}}(b) \right] - c(t). \tag{4}$$

This representation makes learning cost a utility cost, not a monetary expenditure. The individual chooses $y$, $z$, and $t$ to maximize $u$ subject to the budget constraint $y + z = m$, where $m$ is income.

The timing of the decision process is as follows. First, the individual observes $\pi$, the prior probability that the attribute is included in the good. The individual then chooses learning effort $t$ and the resulting information is revealed. Finally, the individual allocates the budget $m$ between the composite good $y$ and the good $z$.

Theorem 1. Suppose that $\tilde{b}$ and $\tilde{b} + \tilde{\varepsilon}$ are both nonnegative random variables. An increase in the inclusion probability $\pi$ leads to an increase in learning $t$ and an increase in expected utility.
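As a rough illustration of the timing described above, the following Python sketch solves the consumer's problem by backward induction. The functional forms $\Phi(z) = 2\sqrt{z}$ and $c(t) = kt^2/2$ are assumptions of this example and do not come from the paper, and the names `demand`, `indirect_utility`, and `solve_model` as well as all parameter values are hypothetical. Here `w` and `w_eps` stand for the expected values of $\phi$ under the noiseless and noisy distributions, respectively.

```python
import math


def demand(valuation):
    """Optimal z when utility is m - z + Phi(z) * valuation.

    Assumes Phi(z) = 2 * sqrt(z), so Phi'(z) = 1 / sqrt(z) and the
    first-order condition Phi'(z) = 1 / valuation gives z = valuation ** 2.
    """
    return valuation ** 2


def indirect_utility(m, valuation):
    """Utility m - z + Phi(z) * valuation evaluated at the optimal z."""
    z = demand(valuation)
    return m - z + 2.0 * math.sqrt(z) * valuation


def solve_model(m, a, w, w_eps, pi, k):
    """Return post-learning demands, utilities, and learning effort t.

    The quadratic cost c(t) = k * t**2 / 2 is an assumption of this sketch.
    """
    # Per-unit valuation of the good in each post-learning state.
    valuations = {
        "absent": a,                  # learning succeeds, attribute absent
        "included": a + w,            # learning succeeds, attribute included
        "unlearned": a + pi * w_eps,  # learning fails, prior still applies
    }
    z = {state: demand(x) for state, x in valuations.items()}
    u = {state: indirect_utility(m, x) for state, x in valuations.items()}

    # Expected gain from successful learning relative to acting on the prior.
    gain = (1 - pi) * u["absent"] + pi * u["included"] - u["unlearned"]

    # Marginal cost k * t equals the gain at an interior optimum;
    # the result is clamped so that t stays a probability.
    t = min(1.0, max(0.0, gain / k))
    return z, u, t


if __name__ == "__main__":
    z, u, t = solve_model(m=10.0, a=1.0, w=0.8, w_eps=0.6, pi=0.5, k=2.0)
    print("demands:", z)
    print("utilities:", u)
    print("learning effort:", t)
```

The closed-form demand keeps the sketch short; with a different concave $\Phi$, the first-order condition would generally have to be solved numerically.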

The proof of the theorem relies on the existence of three states of the world that could follow the learning stage. In state 1 the individual learns that the feature is absent from the good, while in state 2 she learns that it is, in fact, included. In state 3 learning is ineffective and she does not know whether the feature is included. In practice the feature is either included or not. If it is included, no one can learn that it is absent, so no one can enter state 1. If it is absent, no one can learn that it is included, so no one can enter state 2. Individuals can still enter state 3 regardless of whether the good contains the feature, because learning may be ineffective.

Let $\lambda$ be a binary variable with $\lambda = 1$ when the attribute is included and $\lambda = 0$ when it is absent. The post-learning expected demand for the good depends on whether the attribute is included, and it is given by

$$E[z^*] = \begin{cases} t^* z_1^* + (1 - t^*) z_3^* & \text{if } \lambda = 0 \\ t^* z_2^* + (1 - t^*) z_3^* & \text{if } \lambda = 1 \end{cases}$$

where $t^*$ is the chosen learning effort and $z_1^*$, $z_2^*$, and $z_3^*$ are the individual's post-learning demands in each of the states described above.

Theorem 2. An increase in the inclusion probability $\pi$ has a positive impact on demand for good $z$ if the attribute is included in the good; that is, $dE[z^* \mid \lambda = 1]/d\pi > 0$.

When the attribute is not included, an increase in $\pi$ has an ambiguous impact on expected demand. There are two countervailing effects. First, demand increases among those for whom learning is ineffective, because they place higher probability on the higher-valued (but incorrect) state of the world in which the good contains the attribute. Second, demand falls among those for whom learning is effective, because they find out that the attribute is absent from the good and therefore value it less.

These two results lead to our two main hypotheses, which could be tested in either the field or the lab:

Hypothesis 1. When attention is drawn to an included and desirable feature, subjects learn more about the good.

Hypothesis 2. When attention is drawn to an included and desirable feature, subjects are willing to pay more for the good.

3 Discussion

In this paper, we developed a theoretical model that yields comparative statics for how agents value goods when they have incomplete information and effort devoted to learning is costly. The model finds that a consumer will spend more effort learning about the attributes embedded in a good if her belief that the good has a desirable attribute increases. The model also finds that her valuation for the good will increase if the desirable attribute is actually embedded in the good.

Testing this model of consumer valuation with costly attention, a consumer analog of Hanna, Mullainathan, and Schwartzstein (2014), would be straightforward in an experimental setting, whether in a lab or in the field. The key feature of any such experiment would be creating exogenous variation in $\pi$, the prior that subjects place on the attribute being included in the good. If the good does, in reality, contain the feature in question then Theorem 1 applies.

There are several feasible ways that an experimental design could create exogenous variation in $\pi$. For example, a quiz over good characteristics could alter the probability subjects place on attributes being embedded in the good. Information on the attributes of similar goods could also create variation in $\pi$. Lastly, recent evidence suggests that creating variation in subjects' beliefs about the accuracy of their dataset affects valuations for public goods (LaRiviere, Czajkowski, Hanley, Aanesen, Falk-Petersen, and Tinch (2014)). A similar type of treatment could also create variation in $\pi$.

There are also secondary hypotheses which could be tested in an experiment. Conditional on a particular level of $\pi$, for example, the model implies that the spread of the distribution of possible valuations decreases with more information. Further, the experimental design could provide information about the mechanism behind the learning that results from increases in $\pi$.

References

Bordalo, P., N. Gennaioli, and A. Shleifer (2013): "Salience and Consumer Choice," Journal of Political Economy, 121(5), 803-843.

DellaVigna, S. (2009): "Psychology and Economics: Evidence from the Field," Journal of Economic Literature, 47(2), 315-375.

Gabaix, X., D. Laibson, G. Moloche, and S. Weinberg (2006): "Costly Information Acquisition: Experimental Analysis of a Boundedly Rational Model," American Economic Review, 96(4), 1043-1068.

Hanna, R., S. Mullainathan, and J. Schwartzstein (2014): "Learning Through Noticing: Theory and Experimental Evidence in Farming," The Quarterly Journal of Economics, 129(3), 1311-1353.

Lancaster, K. (1966): "A New Approach to Consumer Theory," Journal of Political Economy, 74(2), 132-157.

LaRiviere, J., M. Czajkowski, N. Hanley, M. Aanesen, J. Falk-Petersen, and D. Tinch (2014): "The Value of Familiarity: Effects of Knowledge and Objective Signals on Willingness to Pay for a Public Good," Journal of Environmental Economics and Management, 68(2), 376-389.

LaRiviere, J., M. Czajkowski, N. Hanley, and K. Simpson (2015): "What is the Causal Effect of Knowledge on Decision Making," University of Tennessee Working Paper.

Schwartzstein, J. (2014): "Selective Attention and Learning," Journal of the European Economic Association, 12(6), 1423-1452.

Proofs

Theorem 1. Suppose that $\tilde{b}$ and $\tilde{b} + \tilde{\varepsilon}$ are both nonnegative random variables. An increase in the inclusion probability $\pi$ leads to an increase in learning $t$ and an increase in expected utility.

Proof. For notational ease define $w = \int \phi(b)\, dF_{\tilde{b}}(b)$ and $w_\varepsilon = \int \phi(b)\, dF_{\tilde{b}+\tilde{\varepsilon}}(b)$. Because $\phi$ is concave and $\tilde{b} + \tilde{\varepsilon}$ differs from $\tilde{b}$ by a Rothschild-Stiglitz mean-preserving increase in risk, $w \geq w_\varepsilon$. By hypothesis $w_\varepsilon > 0$. Let $\Delta w = w - w_\varepsilon$.

Solve the problem by backward induction. The end of the game has three states of the world: one in which learning is successful but the attribute is not included ($b = 0$), one in which learning is successful and the attribute is included ($b$ drawn from $\tilde{b}$), and one in which learning is unsuccessful ($b$ drawn from $\tilde{b} + \tilde{\varepsilon}$). We take these one at a time.

Suppose that learning is successful but the attribute is not included. The individual then chooses $z$ to maximize $y + a\Phi(z)$ subject to the constraint $y + z = m$. Substituting from the constraint yields the unconstrained objective function $m - z + a\Phi(z)$, and the first-order condition is $\Phi'(z) = 1/a$. Let $z_1^*$ denote the value of $z$ that solves this problem. This case occurs with probability $t(1 - \pi)$.

Now suppose that the individual learns that the attribute is included in the good. She chooses $z$ to maximize $m - z + \Phi(z)(a + w)$ and the first-order condition is $\Phi'(z) = 1/(a + w)$. Let $z_2^*$ denote the solution to this equation. This case occurs with probability $t\pi$.

Finally, suppose that learning is unsuccessful. She chooses $z$ to maximize $m - z + \Phi(z)(a + \pi w_\varepsilon)$ and the first-order condition is $\Phi'(z) = 1/(a + \pi w_\varepsilon)$. Let $z_3^*$ denote the solution to this equation. This case occurs with probability $1 - t$.
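The ranking $w \geq w_\varepsilon$ used at the start of the proof follows from Jensen's inequality applied to a concave $\phi$. A small numerical check is sketched below under purely illustrative assumptions that are not taken from the paper: $\phi(b) = \sqrt{b}$, a two-point distribution for $b$, and independent two-point noise with mean zero. The helper names `add_noise` and `expected_phi` are hypothetical.

```python
import math
from itertools import product


def expected_phi(outcomes, phi=math.sqrt):
    """Expected value of phi over a list of (value, probability) pairs."""
    return sum(prob * phi(value) for value, prob in outcomes)


def add_noise(outcomes, noise):
    """Distribution of b + eps when eps is drawn independently of b."""
    return [
        (b + eps, prob_b * prob_eps)
        for (b, prob_b), (eps, prob_eps) in product(outcomes, noise)
    ]


# Assumed example: b is 1 or 4 with equal probability; eps is -1 or +1 with
# equal probability, so E[eps | b] = 0 and b + eps stays nonnegative.
b_dist = [(1.0, 0.5), (4.0, 0.5)]
eps_dist = [(-1.0, 0.5), (1.0, 0.5)]

w = expected_phi(b_dist)                           # noiseless expectation
w_eps = expected_phi(add_noise(b_dist, eps_dist))  # noisy expectation

if __name__ == "__main__":
    print("w =", w, "w_eps =", w_eps, "difference =", w - w_eps)
```

Any other concave, nondecreasing $\phi$ with $\phi(0) = 0$ could be passed through the `phi` argument to repeat the comparison.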

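Now turn to the problem of choosing $t$. The objective function is

$$h(t) = t(1 - \pi)\left[m - z_1^* + a\Phi(z_1^*)\right] + t\pi\left[m - z_2^* + \Phi(z_2^*)(a + w)\right] + (1 - t)\left[m - z_3^* + \Phi(z_3^*)(a + \pi w_\varepsilon)\right] - c(t).$$

The first-order condition is

$$0 = (1 - \pi)\left[m - z_1^* + a\Phi(z_1^*)\right] + \pi\left[m - z_2^* + \Phi(z_2^*)(a + w)\right] - \left[m - z_3^* + \Phi(z_3^*)(a + \pi w_\varepsilon)\right] - c'(t). \tag{5}$$

Let $t^*$ denote the solution to this problem. We want to know how $t^*$ and $z^*$ respond to a change in $\pi$. Note that $t$ does not appear in the first-order conditions defining any of the $z_i^*$, and so $dz_1^*/dt = dz_2^*/dt = dz_3^*/dt = 0$. Also, $\pi$ does not appear in the first-order conditions defining $z_1^*$ and $z_2^*$, and so $dz_1^*/d\pi = dz_2^*/d\pi = 0$. Implicitly differentiating the first-order condition identifying $z_3^*$, $\Phi'(z) = 1/(a + \pi w_\varepsilon)$, yields

$$\Phi''(z_3^*)\frac{dz_3^*}{d\pi} = -\frac{w_\varepsilon}{(a + \pi w_\varepsilon)^2},$$

so that, because $\Phi'' < 0$,

$$\frac{dz_3^*}{d\pi} = -\frac{w_\varepsilon}{\Phi''(z_3^*)(a + \pi w_\varepsilon)^2} > 0.$$

Now implicitly differentiate (5) with respect to $\pi$ to get

$$0 = -\left[m - z_1^* + a\Phi(z_1^*)\right] + \left[m - z_2^* + \Phi(z_2^*)(a + w)\right] + \frac{dz_3^*}{d\pi} - \Phi'(z_3^*)\frac{dz_3^*}{d\pi}(a + \pi w_\varepsilon) - c''(t)\frac{dt}{d\pi}.$$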

The first-order condition for $z_3^*$ has $\Phi'(z_3^*)(a + \pi w_\varepsilon) = 1$, and so the two terms involving $dz_3^*/d\pi$ cancel out. We are left with

$$\frac{dt}{d\pi} = \frac{1}{c''(t)}\left(\left[m - z_2^* + \Phi(z_2^*)(a + w)\right] - \left[m - z_1^* + a\Phi(z_1^*)\right]\right).$$

The term in parentheses on the right-hand side is strictly positive. To see why, note that $m - z_2^* + \Phi(z_2^*)(a + w)$ is the optimized utility when the individual learns that the attribute matters, while $m - z_1^* + a\Phi(z_1^*)$ is the optimized utility when she learns that the attribute does not matter. In the former case she could have chosen $z_1^*$ instead of $z_2^*$, generating utility $m - z_1^* + \Phi(z_1^*)(a + w)$, but because she chose $z_2^*$ it must be the case that

$$m - z_1^* + a\Phi(z_1^*) < m - z_1^* + \Phi(z_1^*)(a + w) \leq m - z_2^* + \Phi(z_2^*)(a + w).$$

Convexity of the learning cost means that $c''(t) > 0$, and consequently $dt^*/d\pi > 0$. The intuition is that increasing learning increases the probability that the individual gets to make an ex post optimal choice of $z$. An increase in $\pi$ increases the probability that learning will reveal that the feature is included, and the individual learns more in order to reach a better optimum for this case.

Expected utility is given by

$$EU = t^*(1 - \pi)\left[m - z_1^* + a\Phi(z_1^*)\right] + t^*\pi\left[m - z_2^* + \Phi(z_2^*)(a + w)\right] + (1 - t^*)\left[m - z_3^* + \Phi(z_3^*)(a + \pi w_\varepsilon)\right] - c(t^*).$$

By the envelope theorem,

$$\frac{dEU}{d\pi} = \frac{\partial EU}{\partial \pi} = t^*\left(\left[m - z_2^* + \Phi(z_2^*)(a + w)\right] - \left[m - z_1^* + a\Phi(z_1^*)\right]\right) + (1 - t^*)\Phi(z_3^*)w_\varepsilon.$$

As argued above, the term in parentheses is positive, and the final term is also positive, so expected utility increases when $\pi$ increases.

Theorem 2. An increase in the inclusion probability $\pi$ has a positive impact on demand for good $z$ if the attribute is included in the good.

Proof. When the attribute is included, the expected value of $z^*$ is

$$E[z^* \mid \lambda = 1] = t^* z_2^* + (1 - t^*) z_3^*.$$

It follows that

$$\frac{d}{d\pi}E[z^* \mid \lambda = 1] = t^*\frac{dz_2^*}{d\pi} + (1 - t^*)\frac{dz_3^*}{d\pi} + (z_2^* - z_3^*)\frac{dt^*}{d\pi}.$$

From the proof of Theorem 1, $dz_2^*/d\pi = 0$ and $dz_3^*/d\pi > 0$. Concavity of $\Phi$ implies that $z_3^* < z_2^*$. Consequently $dE[z^* \mid \lambda = 1]/d\pi > 0$.
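Two of the derivatives used in these proofs can be checked numerically. The sketch below assumes $\Phi(z) = 2\sqrt{z}$ and $c(t) = kt^2/2$, which are illustrative choices rather than the paper's; under them $z_3^* = (a + \pi w_\varepsilon)^2$ and the optimized utility in each state equals $m$ plus the squared per-unit valuation. It compares, against central finite differences, (i) the expression for $dz_3^*/d\pi$ obtained by implicitly differentiating $\Phi'(z_3^*)(a + \pi w_\varepsilon) = 1$, and (ii) the envelope-theorem expression for $dEU/d\pi$. All function names and parameter values are hypothetical.

```python
def z3(pi, a, w_eps):
    """Demand when learning fails, assuming Phi(z) = 2 * sqrt(z)."""
    return (a + pi * w_eps) ** 2


def dz3_dpi_implicit(pi, a, w_eps):
    """-w_eps / (Phi''(z3) * (a + pi * w_eps)**2), with Phi''(z) = -0.5 * z**-1.5."""
    x = a + pi * w_eps
    phi_second = -0.5 * z3(pi, a, w_eps) ** -1.5
    return -w_eps / (phi_second * x ** 2)


def optimal_t(pi, a, w, w_eps, k):
    """Learning effort from k * t = expected gain, clamped to [0, 1]."""
    gain = (1 - pi) * a ** 2 + pi * (a + w) ** 2 - (a + pi * w_eps) ** 2
    return min(1.0, max(0.0, gain / k))


def expected_utility(pi, m, a, w, w_eps, k):
    """Expected utility at the optimal t, with c(t) = k * t**2 / 2 (assumed)."""
    t = optimal_t(pi, a, w, w_eps, k)
    u1 = m + a ** 2                   # attribute learned to be absent
    u2 = m + (a + w) ** 2             # attribute learned to be included
    u3 = m + (a + pi * w_eps) ** 2    # learning unsuccessful
    return t * (1 - pi) * u1 + t * pi * u2 + (1 - t) * u3 - 0.5 * k * t ** 2


def deu_dpi_envelope(pi, m, a, w, w_eps, k):
    """t * (U2 - U1) + (1 - t) * Phi(z3) * w_eps, holding t fixed at its optimum."""
    t = optimal_t(pi, a, w, w_eps, k)
    phi_z3 = 2.0 * (a + pi * w_eps)   # Phi(z3) = 2 * sqrt(z3)
    return t * ((a + w) ** 2 - a ** 2) + (1 - t) * phi_z3 * w_eps


def central_difference(f, x, h=1e-5):
    """Two-sided finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    m, a, w, w_eps, k, pi = 10.0, 1.0, 0.8, 0.6, 2.0, 0.5
    print(dz3_dpi_implicit(pi, a, w_eps),
          central_difference(lambda p: z3(p, a, w_eps), pi))
    print(deu_dpi_envelope(pi, m, a, w, w_eps, k),
          central_difference(lambda p: expected_utility(p, m, a, w, w_eps, k), pi))
```

The comparison is only meaningful where the learning effort is interior, since the envelope argument relies on the first-order condition for $t$ holding with equality.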