UNIVERSITY OF CALGARY. Three Essays in Structural Estimation: Models of Matching and Asymmetric Information. Liang Chen A THESIS


UNIVERSITY OF CALGARY

Three Essays in Structural Estimation: Models of Matching and Asymmetric Information

by

Liang Chen

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF ECONOMICS

CALGARY, ALBERTA

April, 2014

© Liang Chen 2014

Abstract

There is growing interest in using structural estimation methods to address economic questions. These methods have two main advantages. First, they can address the endogeneity problem confronting much reduced-form regression work. Second, they estimate the deep parameters of the models, which allows researchers to analyze many interesting economic questions through counterfactual analysis. In this dissertation, I study three different economic questions by structurally estimating models of matching and asymmetric information.

In the first chapter, I develop an estimable model which illustrates that the presence of moral hazard not only leads to inefficiency caused by risk sharing between firms and CEOs, but also creates inefficiency due to talent misallocation. A new empirical method is proposed to identify the separate surplus of both firms and CEOs in a matching market with moral hazard. An application of this method to the U.S. market for CEOs shows that the aggregate efficiency loss due to talent misallocation is $12.64 billion. This is more than four times as large as the loss stemming from risk sharing between firms and CEOs.

In the second chapter, my coauthor and I propose a new approach to identify models with network effects by invoking the other side of the market. We show that the other side of the market provides additional information for identification. Our running application investigates the importance of asymmetric information and network effects in the yellow pages advertising industry.

In the third chapter, I study estimation and non-parametric identification of a dynamic matching model with a broader class of generalized unobserved heterogeneities. I first provide the identification results on the match surplus. I then show that the match equilibrium exists and is globally unique. Finally, I provide a new estimation method for the dynamic matching model, which provides more precise estimates than previous methods.

Acknowledgements

I am very grateful to my committee members, Eugene Choo (supervisor), Alexander David, Arvind Magesan, and Joanne Roberts, for their invaluable support throughout my Ph.D. In particular, I would like to give special thanks to my supervisor for his encouragement and patient guidance. I am also indebted to Victor Aguirregabiria, Curtis Eaton, Yao Luo, Atsuko Tanaka, Trevor Tombe, Kunio Tsuyuhara and Matthew Webb for extensive discussions. I thank John Boyce, Maksim Isakin, Robert Miller, Victor Yang Song, and Marko Tervio for helpful comments. I also thank my co-author Yao Luo (University of Toronto) for granting permission to use our joint paper as a part of my dissertation. All remaining errors are my own.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Managerial Talent Misallocation and the Cost of Moral Hazard
  Introduction
  The Model
  Managers
  Firms
  Timing
  Optimal Contracting
  Equilibrium Allocation of Managers to Firms
  Measuring the Losses due to Moral Hazard
  A Numerical Illustration
  Identification and Estimation
  Identification
  Estimation
  Data
  CEOs Compensation
  Abnormal Returns
  Firm Characteristics
  Estimation Results
  Estimated Model Primitives
  Counterfactuals
  Conclusion
  Appendices
  Appendix A: Proofs
  Appendix B: Tables and Graphs

2 Identification of Network Effects Using All The Economics (with Yao Luo)
  Introduction
  Yellow Pages Advertising Data in Toronto
  The Model
  Identification and Estimation
  Identification
  Estimation
  Empirical Results
  Estimation Results
  Counterfactuals
  Conclusion
  Appendix: Tables and Graphs

3 Identification and Estimation of Dynamic Matching Model
  3.1 Introduction
  A Dynamic Matching Model
  Model Setup and Assumptions
  Dynamic Discrete Choice Identification
  Examples: Generalized Extreme Value Distributions (GEV)
  Dynamic Matching Equilibrium
  Existence and Uniqueness of Matching Equilibrium
  Estimation of Match Surplus
  Non-parametric Estimation
  Parametric Estimation
  Conclusion
  Appendix
  The expressions of \(F_a(s_i)\) and \(R_a(s_i)\)

List of Tables

1.1 Numerical Example 2: Matching Pattern
Cross-sectional Information on Components of Compensation by Sectors
Summary Description on Abnormal Returns in
Cross-sectional Information on Firm Characteristics by Sectors
Sample Correlations
Parameter Estimates of the Returns Distributions
Nonpecuniary Benefits Relative to Diligence
Losses of Talent Misallocation Related to Moral Hazard (Billions)
Revenue Ranking by Industry Headings in Core SE Directories
Revenue Ranking by Industry Headings in Core NE Directories
Summary Statistics on SE Yellow Page Directory
Summary Statistics on NE Yellow Page Directory
Estimation of Industry Level Network Effects
Results of Counterfactual Experiments

List of Figures and Illustrations

1.1 The Numerical Example: Risk Premium \(\chi_n\)
The Numerical Example 2: Loss from Mismatch
Relation of CEO Compensation and Firm Rank by Market Value in
Estimated Distribution Density of Firms' Effective Size \(S(r)/S(0)\)
Estimated Distribution Density of Managerial Talent \(T(r)/T(0)\)
Relation of Ranks by Firm Actual Size and Managerial Talent
Relation of Ranks by Firm Actual Size and the Risk Premium \(e^{\chi_n}\)
Distribution Areas of the 7 Directories
Estimated Intrinsic and Network Utility Functions \(\hat{V}_0(\cdot)\), \(\hat{W}_{NE}(\cdot)\)
Estimated Type Densities \(\hat{f}_{SE}(\cdot)\), \(\hat{f}^{new}_{NE}(\cdot)\), \(\hat{f}_{NE}(\cdot)\)
\(T_{full}(\cdot)\), \(T_{ind}(\cdot)\), \(T_0(\cdot)\)
\(q_{full}(\cdot)\), \(q_{ind}(\cdot)\), \(q_0(\cdot)\)

Chapter 1

Managerial Talent Misallocation and the Cost of Moral Hazard

1.1 Introduction

The inefficient allocation of CEOs (also referred to as managers hereafter) to firms can cause large economic losses. Evidence shows that managers and firms differ in their productive ability, and that they are complementary in production.¹ In these circumstances, efficiency requires positive assortative sorting by the types of managers and firms. In the absence of information asymmetries, the competitive equilibrium of the market for managers exhibits this sorting.² However, if the output of firms is stochastic and depends on unobserved managerial action, a hidden-action moral hazard problem arises and efficient sorting may not be achieved in equilibrium. While this moral hazard problem has been well studied, with few exceptions the literature has focused on how to motivate managers. The way in which moral hazard affects the equilibrium allocation remains unclear. More importantly, little is known about the magnitude of inefficiency in the equilibrium allocation of managers to firms in the presence of moral hazard.

This paper makes two main contributions. First, it develops a new method for recovering the preferences (model primitives) of agents who are matched only once in a two-sided matching market with moral hazard. These primitives are important for evaluating the efficiency of the equilibrium allocation and for assessing the role of policy designed to affect equilibrium allocation outcomes. The existing literature focuses on estimating the aggregate surplus of matched partners in matching markets without asymmetric information. My method has two advantages. First, I provide a method to estimate the separate surplus of matched partners, while previous papers focus on estimating the aggregate surplus. Second, I estimate the primitives of a matching model with asymmetric information, which has not been attempted before. Therefore, the method developed here may be applicable to the study of other matching markets with asymmetric information.³

Second, this method is applied to estimate preferences in the U.S. market for CEOs. Using the estimates, I empirically quantify the magnitude of inefficiency in the allocation of CEOs to firms caused by moral hazard. Moral hazard brings inefficiency due to the need for risk sharing between shareholders and CEOs, which is well understood in the literature. However, the magnitude of the inefficiency in the allocation of CEOs to firms, which also stems from moral hazard, has not been empirically quantified.⁴ I estimate a model in which moral hazard can lead to an inefficient allocation of CEOs to firms in equilibrium. I then quantify the magnitude of this inefficiency using the estimates of the model primitives. I find that the efficiency loss from CEO misallocation is large: it is more than four times as large as the loss induced by risk-sharing inefficiency. The findings suggest that studies focusing solely on risk sharing can severely underestimate efficiency losses, and that the impact of moral hazard on allocation efficiency should be considered in future work.

Why would moral hazard matter for the allocation of CEOs to firms? To explain this, consider first a competitive market without moral hazard. Suppose the production of firms and CEOs is complementary in their types. Since higher-type CEOs generate more value at higher-type firms, the equilibrium allocation of CEOs to firms exhibits positive assortative sorting. This allocation is efficient in the sense of maximizing the total expected production before compensating CEOs. Now introduce moral hazard into this market. The efficient allocation may be distorted. If the output of firms is stochastic and depends on unobserved managerial action, firms use performance-based wages to align the incentives of CEOs with those of shareholders. Firms now need to pay a risk premium to risk-averse CEOs, so riskier firms have to pay higher expected wages. As a result, if the largest firm is very risky, it needs to pay a very high premium to get the best CEO. It may not be worthwhile for it to do so, and it may end up with a relatively worse CEO. There will be misallocation.⁵

I develop an estimable model to study the above CEO misallocation problem. The model extends the matching model in Tervio (2008) and incorporates moral hazard following Margiotta and Miller (2000).⁶ In a competitive market, risk-averse CEOs differing in type (talent) are hired by firms varying in type (size), risk, and costs of managerial effort. To focus on moral hazard, I assume that CEOs have private information about their effort, while all characteristics are known to firms and CEOs but not to the econometrician.⁷ Specifically, firms know managerial talent, and CEOs know every firm's size, risk, and cost of managerial effort. The allocation of CEOs to firms in equilibrium is driven by the level of CEO compensation, which is determined by optimal wage contracting in the presence of moral hazard. Since firms differ in multidimensional characteristics, it is challenging to solve for the equilibrium allocation of CEOs to firms. As in Edmans and Gabaix (2011a), I make the following assumptions: managers have constant relative risk aversion preferences with two possible effort levels (shirking and working) to choose from, and the production function is complementary in firm size and managerial talent. Under these assumptions, I can collapse the multidimensional characteristics of firms into a single index, namely effective size. I then show that the equilibrium allocation of CEOs to firms exhibits positive assortative sorting by firms' effective size and managerial talent. This may or may not be efficient, because the total expected production is maximized when CEOs are allocated to firms by positive assortative sorting on firms' actual size and managerial talent.

To identify the model, I must confront two facts that differ from the data usually studied in other matching markets: (i) the types of firms and CEOs are not observed by the econometrician; (ii) firms and CEOs with different types are matched only once. Instead of a standard revealed preference approach, I identify the model using observed monetary transfers (CEO compensation), profits of firms, and information in the environment with moral hazard. Tervio (2008) shows that the type distributions of firms and CEOs can be inferred from observed CEO compensation and the profits of firms. Following this idea, I recover the type distributions of CEOs and firms from their observed income. With their type distributions in hand, the production function is then identified under the complementarity assumption. While the presence of moral hazard brings challenges in solving the model, it provides additional information for identifying managerial preferences from the model structure. Using the technique for identifying principal agent models of moral hazard in Gayle and Miller (2009) and Gayle and Miller (2013), I am able to identify managerial preferences by exploiting optimal compensation contracts and managers' participation and incentive compatibility constraints.

Building on the identification strategy, I propose a technique for estimating the primitives of the model. The technique is applied to estimate the model using data on the 1000 largest firms in terms of market value from the S&P Compustat databases. The estimation results show that the ordering of firms' effective size differs from that of their actual size, which implies that moral hazard does affect the efficient allocation of CEOs to firms.

¹ Pan (2012) shows three productive complementarities between executive and firm characteristics: firm size and managerial talent, the degree of diversification of the firm and the cross-industry experience of the manager, and the R&D intensity of the firm and the innovation propensity of the manager.
² Becker (1973) shows that in a perfect market, when the production technology is complementary, positive assortative matching is achieved in all equilibria.
³ Examples include other high-skilled labor markets, venture capital investment, and matching between landowners and tenants in agriculture.
⁴ Both the traditional risk-sharing inefficiency and the allocation inefficiency are caused by the need for risk sharing between firms and CEOs in the presence of moral hazard. While the existing literature focuses only on the traditional risk-sharing inefficiency, I study both risk-sharing and allocation inefficiency.
⁵ This story of CEO misallocation is similar to the one provided by Edmans and Gabaix (2011a).
⁶ Gayle and Miller (2009) estimate a moral hazard model based on Margiotta and Miller (2000) to study the importance of moral hazard in CEO compensation.
⁷ Motivated by Tervio (2008), the (actual) size of firms should be thought of as the combination of all firm factors that contribute to production, including reputation, market capitalization, growth potential, and so on. Similar logic applies to managerial talent. Therefore, it is reasonable to assume that they are not observed by the econometrician.
While the efficient allocation requires positive assortative sorting by firms' actual size and managerial talent, the equilibrium allocation exhibits positive assortative sorting by firms' effective size and managerial talent. I then use the estimates to investigate the importance of moral hazard in the allocation of CEOs to firms by quantifying four measures of losses related to moral hazard. The counterfactual results show that (i) the aggregate loss of the 1000 largest firms due to talent misallocation is $12.64 billion, which is more than four times as large as the loss induced by the risk-sharing inefficiency; (ii) the maximum aggregate loss from talent misallocation is about 15.00% of the total market value, while the aggregate loss of switching from working to shirking is about 19.96% of the total market value.

This paper contributes to an extensive literature on moral hazard and matching in the market for CEOs.⁸ The empirical literature analyzes moral hazard and CEO matching separately. Seminal papers by Murphy and Jensen (1990) and Hall and Liebman (1998) study the importance of moral hazard by estimating CEO pay-performance sensitivity. Margiotta and Miller (2000) and Gayle and Miller (2009) analyze principal agent models to estimate the economic costs of moral hazard. Gayle and Miller (2013) are the first to develop a technique for identifying and testing principal agent models of moral hazard. I apply their identification strategy in this paper.⁹ Gabaix and Landier (2008) and Tervio (2008) employ CEO matching models to investigate the cross-sectional differences and time trend of CEO compensation.¹⁰ In these papers the cost of CEO misallocation caused by moral hazard cannot be quantified, because moral hazard and CEO allocation are investigated separately and combining them is not an obvious extension.

The theory in this paper is closely related to Edmans et al. (2009) and Edmans and Gabaix (2011a). Edmans and Gabaix (2011a) provide a similar theory of CEO misallocation, which also incorporates moral hazard into a matching model. My model is distinguished from theirs by its empirical feature that all primitives of the model can be recovered from observables.

The empirical method in this paper contributes to the recent literature on estimating matching models with transferable utility. The majority of papers focus on estimating a single aggregate surplus in markets without asymmetric information. The empirical analysis of matching models with transferable utility was initiated by Choo and Siow (2006), and subsequently extended by Galichon and Salanié (2010), Chiappori et al. (2011), and others. Fox (2008) proposes a different approach to estimating matching models with transferable utility, which is applied by Bajari and Fox (2005), among others. In these papers, the possibility of recovering two separate utility functions is limited because data on monetary transfers between matched partners are often not observed.¹¹ Moreover, the models in these papers do not feature asymmetric information. To the best of my knowledge, this is the first paper that estimates a matching model with asymmetric information.

The rest of the paper is organized as follows. Section 2 presents the theoretical model on which my empirical analysis is based. The model shows that the presence of moral hazard can not only lead to inefficiency caused by risk sharing, but also create an inefficient allocation of CEOs to firms. Section 3 proposes a new empirical method to recover the preferences of CEOs and firms in a matching framework with moral hazard. Section 4 introduces data on the U.S. market for CEOs. Section 5 presents the estimates of the model primitives and counterfactual results about the importance of moral hazard in the allocation of CEOs to firms. Section 6 concludes.

1.2 The Model

This section lays out a static matching model of managers and firms featuring moral hazard in the market for CEOs, which serves as the theoretical framework of my empirical analysis. The model extends the matching model in Tervio (2008) by incorporating a principal agent model of moral hazard developed by Margiotta and Miller (2000).¹² The interactions between managers and firms in my static framework are modeled in two stages: (i) firms and managers match one to one to produce output in a competitive labor market; (ii) firms offer compensation contracts to their hired managers. As is standard, I solve the model by backward induction. I start with the second stage by formulating firms' optimal contracting problem. I then derive firms' optimal compensation contracts in the presence of moral hazard. Taking the optimal contracts into account, I derive the equilibrium allocation of managers to firms. Finally, I show how moral hazard can affect the efficient allocation of CEOs and provide measures of the loss to firms generated by moral hazard.

Managers

In my static model there is a continuum of managers in a competitive market. They differ in their talent, whose distribution is captured by a quantile function \(T(m)\) as in Tervio (2008). \(T(m)\) is the talent of an \(m\)-quantile manager.¹³ This quantile approach is more intuitive and tractable than traditional density functions when considering empirical applications. A typical manager's preference is captured by a constant relative risk aversion (CRRA) utility function. All managers have the same coefficient of relative risk aversion, denoted by \(\rho\). For a manager with talent \(T(m)\) serving firm \(n\), the cost of his effort is captured by the coefficient \(\alpha_{ne}\) corresponding to two levels of managerial effort \(e \in \{1, 2\}\), where \(e = 1\) represents shirking and \(e = 2\) represents working.¹⁴ Here \(\alpha_{ne}\) is firm specific, as in Gayle and Miller (2009) and Edmans and Gabaix (2011a). The manager's utility can then be written as

\[ U_n(w(x)) = \frac{(\alpha_{ne}\, w(x))^{1-\rho}}{1-\rho} \quad \text{for } e = 1 \text{ or } 2, \tag{1.1} \]

where \(w(x)\) is the wage that the manager receives from serving the firm. I assume that a typical manager prefers shirking to working, that is, \(\alpha_{n1} > \alpha_{n2} > 0\). The manager's reservation utility from outside options is assumed to be \(\alpha_{m0}^{1-\rho}/(1-\rho)\),¹⁵ which is positively correlated with the manager's talent \(T(m)\) and endogenously determined in the matching process. It will be derived when solving the matching problem.

Here I follow Edmans and Gabaix (2011a) in representing managers' preferences using the CRRA utility function. The existing literature mainly uses the constant absolute risk aversion (CARA) utility function (see, e.g., Margiotta and Miller (2000), Gayle and Miller (2009), and Gayle and Miller (2013)). With the CRRA utility function, effort has a multiplicative effect on a manager's utility. This preference captures the idea that a manager's utility from shirking increases with his wealth. This specification is plausible if effort is interpreted as forgoing leisure. A day of vacation is more valuable to a richer manager, as he has more wealth to enjoy it. As noted by Edmans and Gabaix (2011a), CRRA preferences are also commonly used in macroeconomics, as they lead to realistic income effects. Moreover, there is an important technical reason for employing the CRRA utility function. It is necessary to achieve the multiplicative form of optimal compensation in equilibrium, which is needed to solve for the equilibrium allocation of CEOs in my model.

Firms

There is also a continuum of firms in the competitive market, as in Tervio (2008). They are characterized by three characteristics: size, disutility of effort, and risk, similar to Edmans and Gabaix (2011a). Firm size should be thought of as the combination of all firm characteristics that contribute to production, such as assets, reputation, capitalization ability, and so on. Its distribution is also captured by a quantile function \(S(n)\). \(S(n)\) is the size of an \(n\)-quantile firm.¹⁶ The disutility of managerial effort is captured by \(\alpha_{ne}\).

⁸ There is also an extensive theoretical literature on moral hazard and matching. The theoretical papers on moral hazard mainly focus on deriving the optimal incentive wage contract by solving principal agent models of moral hazard. See, e.g., Mirrlees (1976), Hölmstrom (1979, 1982), Grossman and Hart (1983), Hölmstrom and Milgrom (1987) and Edmans and Gabaix (2011b). The theoretical papers on matching try to understand the distribution of labor income. See, e.g., Tinbergen (1951, 1956), Koopmans and Beckmann (1957), Sattinger (1979, 1993) and Rosen (1982).
⁹ The principal agent model of moral hazard in this paper differs from Gayle and Miller (2013) in that I employ a constant relative risk aversion utility function, while Gayle and Miller (2013) use a constant absolute risk aversion utility function. In Section 2.1, I discuss the advantage of using a constant relative risk aversion utility function.
¹⁰ See, e.g., Chiappori and Salanié (2003) for a survey of the contracting literature.
¹¹ Agarwal (2012) provides an approach to identify the separate surplus of matched partners in the medical match.
¹² Gayle and Miller (2009) estimate a moral hazard model based on Margiotta and Miller (2000) to study the importance of moral hazard in CEO compensation.
¹³ Denoting the distribution function of managerial talent as \(F_t(\cdot)\), \(T(m)\) is defined by \(T(m) = T\) s.t. \(F_t(T) = m\).
¹⁴ Shirking and working should be thought of as managers pursuing their own interests and those of their firms, respectively.
¹⁵ This form of the outside option is assumed for mathematical simplicity. It gives a more tractable optimal contract than assuming that the outside option is \(\alpha_{m0}\).
¹⁶ Denoting the distribution function of firm size as \(F_s(\cdot)\), \(S(n)\) is defined by \(S(n) = S\) s.t. \(F_s(S) = n\).
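The multiplicative role of effort in the CRRA specification (1.1) can be illustrated with a short sketch. It checks that a manager is indifferent between shirking at wage w and working at wage w·α_n1/α_n2, so the wage-equivalent value of shirking grows in proportion to the wage. The function names and the parameter values (rho = 2, alpha_shirk = 1.25, alpha_work = 1.0) are hypothetical and chosen only for illustration; they are not estimates from this thesis.

```python
def crra_utility(wage, alpha, rho):
    """CRRA utility (alpha * wage)**(1 - rho) / (1 - rho), as in (1.1).

    Assumes rho != 1; the log-utility limit is not covered by this form.
    """
    if rho == 1.0:
        raise ValueError("rho = 1 is not covered by this functional form")
    return (alpha * wage) ** (1.0 - rho) / (1.0 - rho)


def wage_equivalent_when_working(wage, alpha_shirk, alpha_work):
    """Wage under working that yields the same utility as shirking at `wage`.

    Follows from equating (alpha_work * w')**(1 - rho) with
    (alpha_shirk * wage)**(1 - rho), which gives w' = wage * alpha_shirk / alpha_work.
    """
    return wage * alpha_shirk / alpha_work


# Hypothetical illustrative values (alpha_shirk > alpha_work > 0).
RHO, ALPHA_SHIRK, ALPHA_WORK = 2.0, 1.25, 1.0

premium_low = wage_equivalent_when_working(1.0, ALPHA_SHIRK, ALPHA_WORK) - 1.0
premium_high = wage_equivalent_when_working(10.0, ALPHA_SHIRK, ALPHA_WORK) - 10.0
```

Under these assumed values, the extra pay needed to compensate a manager for working is ten times larger at a wage of 10 than at a wage of 1, which is the sense in which the value of shirking rises with wealth.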

16 Following Edmans and Gabaix (2011a), it is assumed to be firm specific. For example, a firm in a regulated industry or headquartered in an unattractive location is unpleasant to work for regardless of the effort exerted by the CEO. 17 The risk of firms is reflected by the volatility of their output, which is produced after a manager is allocated to a firm. The production output is specified as a function of firm s size, managerial talent and effort. Specifically, the production function is assumed to be multiplicatively separable between the firm s and the manager s contribution to output. Consider a firm with size S(n) has hired a manager with talent T (m). Its production function is written as 18 Y [S(n), T (m), e] = S(n) T (m) [1 + x(e)]. (1.2) The term S( ) T ( ) is the simpliest form that exhibits complementarity as discussed by Tervio (2008). I refer the readers to Tervio (2008) for the microfoundations of this term. As in Margiotta and Miller (2000), x(e) is the idiosyncratic output signal attributed to managerial effort, whose probability density function is conditional on managerial effort e. For firm n, conditioning on manager shirking (e = 1), probability density function of x is denoted by f n1 (x). Conditioning on manager working (e = 2), probability density function of x is denoted by f n2 (x). Since x is the only stochastic variable in the output function, the variances of f n1 (x) and f n2 (x) reflect the risk of firms. Hereafter throughout this paper, the symbol E n [ ] is employed to represent the expectation taken over f n2 (x), i.e., fn2 (x)dx. For the purpose of solving the model, I follow the literature on moral hazard to define likelihood ratio function by g n (x) f n1 (x)/f n2 (x). 19 It is nonnegative for all x and E[g n (x)] = xf n1 (x)dx = 1. In order to identify and estimate the model, I make three important assumptions on 17 This is the argument made in Edmans and Gabaix (2011a). 
18 The production function is motivated by Tervio (2008). 19 Following Gayle and Miller (2009), g n (x) can be interpreted as the signal firm n receives about managerial effort choice. g n (x) = 0 implies working while g n (x) = implies shirking. When g n (x) = 1, the signal is useless such that compensation does not depend on x. 9

17 probability density functions of x following Gayle and Miller (2013). These three assumptions are crucial to my empirical analysis. First, I normalize the expected value of x conditional on manager working to be zero, which is formalized in the next assumption. Assumption 1. E n (x) is normalized to be 0 for any firm n. Second, I assume each firm perfers manager working to shirking. That is the expectation of x is larger under manager working than it under manager shirking. Assumption 2. E n [x] = xf n2 (x)dx > xf n1 dx = E n [xg n (x)] for any firm n. Finally, I assume that a very large x is extremely unlikely to be obtained if the manager shirks, which is formalized in next assumption. Assumption 3. lim x [g n (x)] = 0 for any firm n. In other words, this assumption tells that an extraordinary high level x can only be achieved when the manager works Timing The timing of interactions between managers and firms in my static framework is as follows. First, at the beginning of the period, heterogeneous firms and managers match one to one in a competitive labor market. In order to focus on moral hazard problem, I assume there is no search friction in this market. All characteristics of firms and managers are observed by firms and managers but not by the econometrician. The continuous distributions of firms and managers rule out match-specific rents and therefore, any need to model bargaining. Second, after managers are matched with firms, each firm proposes a performance-based compensation contract w(x) to its hired manager. Recall x is the output signal whose distribution is conditional on the level of managerial effort. Given the firm s compensation offer, each manager makes his choice of whether to take the offer or not. If the manager rejects the offer, he obtains his reservation wage from outside options. If the manager accepts the 10

18 offer, he then chooses his effort level. The effort choice is not observed by the shareholders of the firm. This hidden action moral hazard problem is the only friction in my model. Finally, the output signal x is realized and each manager gets paid according to his compensation contract w(x) Optimal Contracting As standard in the literature on principal agent models, I solve the optimal compensation in two steps. 20 First, I solve the optimal compensation contracts corresponding to both effort levels: shirking and working. Second, the optimal effort level is determined by the firm s profit maximization problem. Consider the problem of a firm with size S(n) which has hired a manager with talent T (m). I assume firms are risk neutral. Their utility is then measured by their profits, a benefit net of a cost. Let s first consider the optimal compensation contract problem of a firm which wants to induce its manager to work. The firm s cost is then the expected compensation paid to the manager. The firm s benefit is the expected output when its manager works, which is a constant because firm s size S(n) and managerial talent T (m) are fixed and the expected value of x when the manager works, E n (x), is normalized to be zero from assumption (4). The firm s problem is now to design an optimal compensation contract to minimizing its cost, the expected compensation w n (x)f n2 (x)dx, subject to (i) the manager accepts the offer instead of choosing his outside options (participation constraint); (ii) and the manager pursues the interests of the firm instead of his own (incentive compatibility constraint). They can be formally written as [αn2 w n (x)] (1 ρ) (1 ρ) f n2 (x)dx α(1 ρ) m0 (1 ρ), (1.3) 20 The optimal compensation contracts are first solved under a principal-agent framework by Grossman and Hart (1983). Here I follow their procedure. 11

19 and [αn2 w n (x)] (1 ρ) [αn1 w n (x)] (1 ρ) f n2 (x)dx f n1 (x)dx. (1.4) (1 ρ) (1 ρ) Intuitively, (1.3) requires the expected managerial utility from working is not less than the utility that the manager obtains from outside options. Note that talent only enters into the participation constraint through the outside option utility that comes from the matching problem. (1.4) requires the expected utility that the manager obtains from working is not less than the utility from shirking, which gives the manager incentive to work. Solving the firm s cost minimization problem, I derive the optimal compensation contract that induces the manager to work. The next lemma formalizes the result. Lemma 1. The firm n s optimal contract inducing its manager to work is given by w n (x) = (α m0 /α n2 ){θ 0 + θ 1 [1 (α n1 /α n2 ) 1 ρ g n (x)]} 1/ρ, (1.5) where θ 0 and θ 1 is the unique positive solutions to the following system of equations, E{[θ 0 + θ 1 (1 (α n1 /α n2 ) 1 ρ g n (x))] (1 ρ)/ρ } = 1, (1.6) E{[θ 0 + θ 1 (1 (α n1 /α n2 ) 1 ρ g n (x))] (1 ρ)/ρ g n (x)}(α n1 /α n2 ) 1 ρ = 1. (1.7) All proofs are in the appendix. Optimal compensation for working is the multiplication of a constant term and a risk primium term. If moral hazard is not a problem because managerial action could be perfectly monitored, the manager would be paid a flat compensation, α m0 /α n2. The second term in the optimal compensation determines how it varies with output signal x through the likelihood ratio function g n (x). If a firm is very risky or generate high costs for managers making effort, variance of x or α m0 /α n2 is large, the expected value of this second term will be large. Thus this firm need to pay a higher expected wage. If the firm wants to induce the manager to shirk, its optimal compensation is a flat wage α m0 /α n1, which ensures the following participation constraint for shirking to hold with 12

equality,

$$\int \frac{[\alpha_{n1} w_n(x)]^{1-\rho}}{1-\rho}\, f_{n1}(x)\,dx \ge \frac{\alpha_{m0}^{1-\rho}}{1-\rho}.$$

Profit maximization determines whether firms should offer the optimal contracts that induce managers to work or to shirk. Hereafter, as in Gayle and Miller (2009), I assume that inducing managers to work gives firms higher profits, and therefore focus only on the case where all firms offer optimal compensation contracts that induce their managers to work. [21]

[21] In the data, all firms tie compensation to performance and no firm pays a flat wage.

Equilibrium Allocation of Managers to Firms

I now incorporate the above moral hazard problem into a matching model of managers and firms. The matching setup closely follows the model of Tervio (2008). Under complementary production, efficiency requires positive assortative sorting of managers and firms, which maximizes the expected total production of firms. In the absence of moral hazard, the competitive equilibrium exhibits this sorting. [22] In the presence of moral hazard, positive assortative sorting may not be achieved in the competitive equilibrium. In the following I show how moral hazard can affect this sorting. Edmans and Gabaix (2011a) provide a similar theory of CEO misallocation using a different model setup. Here I apply their approach to the allocation problem in my model setup.

[22] See, e.g., Sattinger (1993), Gabaix and Landier (2008) and Tervio (2008).

Following Edmans and Gabaix (2011a), I begin with the expected utility of managers. Using the optimal compensation derived in Lemma 1, the expected utility of firm $n$'s manager can be written as [23]

$$EU_n = \frac{\left\{\alpha_{n2}\, E[w_n(x)]\, e^{-\chi_n}\right\}^{1-\rho}}{1-\rho}, \tag{1.8}$$

where $E[w_n(x)]$ is the expected compensation and $\chi_n$ is defined by

$$\chi_n \equiv \ln E\left\{\left[\theta_0 + \theta_1\left(1 - \left(\frac{\alpha_{n1}}{\alpha_{n2}}\right)^{1-\rho} g_n(x)\right)\right]^{1/\rho}\right\}. \tag{1.9}$$

[23] The derivation of the expected utility is in the appendix.

$\chi_n$ is the risk premium that firm $n$ pays to its manager, in the sense that $\chi_n = \ln E[w_n(x)] - \ln U^{-1}(EU_n)$. Riskier firms and firms with higher disutility need to pay a higher risk premium $\chi_n$. From the perspective of the manager, $\chi_n$ is the total loss he or she suffers from working and sharing risk with the firm. After adjusting for this loss, the certainty-equivalent wage (the effective wage) of firm $n$'s manager is defined by

$$v_n = E[w_n(x)]\, e^{-\chi_n}, \tag{1.10}$$

which is fixed and gives the manager the same expected utility as the optimal compensation contract. From the proof of Lemma 1, the participation constraint (1.3) holds with equality and hence $v_n = \alpha_{m0}/\alpha_{n2}$, the nonpecuniary benefit the manager obtains from outside options relative to working. The outside options of the manager should be thought of as the opportunities to work for firms whose sizes are adjacent to that of firm $n$. In this sense the effective wages of managers are determined endogenously by the competition for managers among firms, so that more talented managers obtain higher effective wages in equilibrium.

In the competitive equilibrium, a manager with ability $T(m)$ receives the effective wage $v(m)$, whose quantile is $m$. If firm $n$ wishes to hire him, it has to pay him the effective wage $v(m)$ and thus an expected dollar wage of $E[w_n(x)] = v(m)e^{\chi_n}$. The expected profit of the firm is the expected output net of the expected wage of the manager,

$$E[S(n)T(m)(1+x)] - v(m)e^{\chi_n} = S(n)T(m) - v(m)e^{\chi_n} = e^{\chi_n}\left[S(n)e^{-\chi_n}T(m) - v(m)\right],$$

where the first equality follows from the normalization $E_n(x) = 0$. The firm chooses a manager to maximize this expected surplus, which shows that firm $n$ with actual size $S(n)$ now acts as a firm with size $S(n)e^{-\chi_n}$, the effective size as defined in Edmans and Gabaix (2011a).

Riskier firms and firms with higher disutility have a lower value of $e^{-\chi_n}$ and thus a smaller effective size for a given actual size. Hereafter I order firms by effective size. Suppose firm $n$'s effective size has quantile $r$, and denote it by $\tilde{S}(r) \equiv S(n)e^{-\chi_n}$, where the tilde distinguishes effective size from actual size. I then need to solve a standard one-dimensional matching problem instead of a multidimensional one. The way firms' multidimensional characteristics are collapsed into one dimension is similar to Edmans and Gabaix (2011a).

In the absence of moral hazard, firms act according to their actual sizes. The competitive equilibrium is positive assortative sorting by firm actual size and managerial talent, which is efficient. In the presence of moral hazard, efficiency still requires positive assortative sorting by actual size and talent. However, firms now act according to their effective sizes, so the competitive equilibrium involves positive assortative sorting by firm effective size and managerial talent. [24] This is a standard result in the matching literature; a similar argument is made by Edmans and Gabaix (2011a). The equilibrium may or may not be efficient, because efficiency involves positive assortative sorting by firm actual size and managerial talent. If the ranking of firms by effective size differs from the ranking by actual size, the equilibrium outcome is not efficient in the presence of moral hazard. This is the CEO misallocation problem that I study empirically in this paper.

Although Edmans and Gabaix (2011a) provide a similar theory of CEO misallocation, the model setup and focus of my paper differ from theirs. The goal of my paper is to show the importance of moral hazard by empirically quantifying the production losses arising from the CEO misallocation that it induces. To achieve this goal, my model is built so that all model primitives are identifiable and estimable, and I provide identification and estimation strategies for the primitives. The focus of Edmans and Gabaix (2011a), in contrast, is to explain how risk affects CEO compensation. To do so, they build a theoretical model in which they can derive closed-form solutions for CEO assignment, compensation and incentives.

[24] The interpretation is given in the appendix, following Edmans and Gabaix (2011a).

As discussed, I only need to solve a standard one-dimensional matching problem, because firms' multidimensional characteristics are aggregated into a single index, the effective size $\tilde{S}(r)$. I now extend Tervio (2008) to solve for the equilibrium allocation of CEOs across firms in the context of moral hazard. In the presence of moral hazard, the profiles of managerial effective wages and firm effective profits must support the allocation that involves perfect sorting by firm effective size and managerial talent. The following two types of conditions must be satisfied:

$$\tilde{S}(r)T(r) - v(r) \ge \tilde{S}(r)T(m) - v(m) \quad \forall\, r, m \in [0,1], \tag{1.11}$$

$$\tilde{S}(r)T(r) - v(r) \ge \pi_0 \quad \forall\, r \in [0,1], \tag{1.12}$$

$$v(r) \ge v_0 \quad \forall\, r \in [0,1]. \tag{1.13}$$

$\tilde{S}(r)T(r)$ is effective output, which is divided between the manager and the firm. $(\pi_0, v_0)$ are the effective profits and the effective wage that firms and managers could obtain from their opportunities outside the market. The lowest active firm-manager pair ($r = 0$) just breaks even with the alternative opportunities outside the market, $\tilde{S}(0)T(0) = v_0 + \pi_0$. The first type of condition, the sorting constraints (1.11), guarantees that each firm prefers hiring its own manager to hiring any other manager at the equilibrium effective wages. The second type, the participation constraints (1.12) and (1.13), guarantees that all firms and managers are active in the market. Replacing $m$ with $r - \epsilon$ in the sorting constraints (1.11) and dividing both sides by $\epsilon$ gives

$$\frac{\tilde{S}(r)T(r) - \tilde{S}(r)T(r-\epsilon)}{\epsilon} \ge \frac{v(r) - v(r-\epsilon)}{\epsilon},$$

which becomes an equality as $\epsilon \to 0$. The slope of the managerial effective wage profile is therefore given by [25]

$$v'(r) = \tilde{S}(r)T'(r). \tag{1.15}$$

[25] This equation can also be obtained from the firm's profit maximization problem. Firm $n$ chooses a CEO to maximize its expected profits and thus solves

$$\max_m\; \tilde{S}(r)T(m) - v(m). \tag{1.14}$$

Taking the first derivative with respect to $m$ and using positive assortative matching in equilibrium ($m = r$) gives (1.15).

The effective wage profile can then be obtained by integrating the slope and using $v(0) = v_0$:

$$v(r) = v_0 + \int_0^r \tilde{S}(j)T'(j)\,dj. \tag{1.16}$$

Analogously, since $\pi(r) = \tilde{S}(r)T(r) - v(r)$, the profile of firm effective profits satisfies

$$\pi'(r) = \tilde{S}'(r)T(r), \tag{1.17}$$

which gives

$$\pi(r) = \pi_0 + \int_0^r \tilde{S}'(j)T(j)\,dj. \tag{1.18}$$

The expressions for the managerial effective wage and firm effective profits in (1.16) and (1.18) are similar to the expressions for managerial wages and firm profits in equations (6) and (8) of Tervio (2008). While Tervio (2008) solves a pure assignment model, incorporating moral hazard requires me to work with managerial effective wages and firm effective profits. From (1.16) and (1.18), all inframarginal manager-firm pairs produce an effective output that exceeds the sum of their opportunities outside the market, and the division of this effective output depends on the distributions of firm effective size and managerial talent. The effective profits of firms and the effective wages of managers increase with their quantiles. The following definition summarizes the conditions that a competitive assignment equilibrium must satisfy when moral hazard is taken into account.

Definition 1. When moral hazard is present, a competitive allocation equilibrium is defined by an allocation of managers to firms and a profile of firm effective profits and managerial effective wages that satisfy the following conditions: (i) the assignment satisfies conditions (1.11)-(1.13); (ii) the effective wages and effective profits are given by the distribution functions of firm effective size and managerial talent as in (1.16) and (1.18).
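To make (1.16) and (1.18) concrete, the following is a minimal numerical sketch, not the procedure used in this thesis. It assumes hypothetical linear quantile functions for effective size and talent and hypothetical outside values `v0` and `pi0`; the function names are illustrative only. It integrates the two slope conditions on a grid of quantiles and lets one confirm that the effective wage and effective profit add up to effective output.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Cumulative integral of y over x by the trapezoid rule, starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def equilibrium_profiles(r, eff_size, talent, v0, pi0):
    """Effective wage v(r) and effective profit pi(r) from (1.16) and (1.18)."""
    d_talent = np.gradient(talent, r)    # numerical T'(r)
    d_size = np.gradient(eff_size, r)    # numerical derivative of effective size
    v = v0 + cumulative_trapezoid(eff_size * d_talent, r)
    pi = pi0 + cumulative_trapezoid(d_size * talent, r)
    return v, pi

# Hypothetical primitives, chosen only for illustration.
r = np.linspace(0.0, 1.0, 2001)
eff_size = 100.0 * (1.0 + r)     # assumed effective size quantile function
talent = 10.0 * (1.0 + r)        # assumed talent quantile function
v0, pi0 = 400.0, 600.0           # assumed split; v0 + pi0 must equal eff_size[0] * talent[0]
v, pi = equilibrium_profiles(r, eff_size, talent, v0, pi0)
```

The split of the lowest pair's output between `v0` and `pi0` is arbitrary here; only their sum is pinned down by the break-even condition for the pair at $r = 0$.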

1.2.6 Measuring the Losses due to Moral Hazard

The importance of moral hazard for the allocation of CEOs to firms is assessed by quantifying four measures of losses related to moral hazard. First, in the presence of moral hazard, firms offer performance-based compensation contracts that lead to risk sharing between firms and CEOs. Consequently firms incur losses by paying a risk premium to risk-averse CEOs. This is widely recognized in the literature as the agency cost of moral hazard. Various papers have quantified its magnitude; among them, Margiotta and Miller (2000), Gayle and Miller (2009), Edmans and Gabaix (2011a) and Gayle and Miller (2013) quantify it in the market for executives. Second, firms incur production losses arising from talent misallocation. This is the focus of this paper. Edmans and Gabaix (2011a) provide a similar measure for this type of loss in the context of their model. The third measure is the maximum loss that firms could incur from talent misallocation caused by moral hazard. The final measure is the loss that firms would incur from ignoring the moral hazard problem. This loss is also quantified in Margiotta and Miller (2000), Gayle and Miller (2009), and Gayle and Miller (2013). In the following, I provide formal definitions of these measures in the context of my model.

First, I provide a measure of the loss from risk-sharing inefficiency, that is, the cost that firms pay to motivate CEOs when moral hazard is present. If moral hazard is not a problem because of perfect monitoring, the firm with quantile $n$ pays a fixed certainty-equivalent wage $\alpha_{n0}/\alpha_{n2} = v_n$. In the presence of moral hazard, firm $n$ offers a wage contract $w_n(x)$ with expected wage $E[w_n(x)]$, which is worth the same to the manager as the fixed wage $v_n$. The loss of firm $n$ from risk-sharing inefficiency is the difference between the expected wage $E[w_n(x)]$ and the fixed wage $v_n$. Denoting it by $L_{n1}$, it is given by

$$L_{n1} = E[w_n(x)] - v_n = E[w_n(x)]\left[1 - e^{-\chi_n}\right], \tag{1.19}$$

where the second equality comes from the definition $v_n = E[w_n(x)]e^{-\chi_n}$. The sum of $L_{n1}$ over all firms gives the gross loss firms incur from risk-sharing inefficiency.

Second, I provide a measure of the loss from talent misallocation associated with moral hazard. It is measured by the difference between firm $n$'s output under the efficient talent allocation and its output under the actual talent allocation. Let $\tilde{T}(n)$ denote the talent of the manager assigned to firm $n$ in equilibrium. [26] Denoting the loss by $L_{n2}$, it is given by

$$L_{n2} = E[S(n)T(n)(1+x)] - E[S(n)\tilde{T}(n)(1+x)] = S(n)\left[T(n) - \tilde{T}(n)\right], \tag{1.20}$$

where the second equality uses the normalization $E(x) = 0$, that is, the expected value of $x$ is zero when the manager pursues the interests of the firm. The sum of $L_{n2}$ over all firms gives the gross loss firms incur from matching inefficiency. The sum of $L_{n1}$ and $L_{n2}$ over all firms is the total loss that firms incur in solving the moral hazard problem by optimal contracting.

[26] Since firm $n$ has effective size $\tilde{S}(r)$, the firm ends up with the manager of talent $T(r)$. Thus $\tilde{T}(n) = T(r)$.

Third, I provide a measure of the maximum loss that firms could incur from talent misallocation. This loss is the difference between the output of the efficient talent allocation and that of the most inefficient talent allocation. The most inefficient talent allocation is negative assortative sorting, in which the least talented manager is allocated to the largest firm.

Finally, I provide a measure of the loss that firms would incur from ignoring moral hazard. This is the total benefit to firms from motivating their CEOs. If a firm ignores the moral hazard problem, its manager pursues his own interests. Let $L_{n3}$ denote the loss of firm $n$ from ignoring moral hazard. The loss is measured by the difference between the expected output when the manager pursues the firm's interests and the expected output when he pursues his own, which is given by

$$L_{n3} = \int S(n)\tilde{T}(n)(1+x)\, f_{n2}(x)\,dx - \int S(n)T(n)(1+x)\, f_{n1}(x)\,dx. \tag{1.21}$$

The talent allocation is efficient if all firms ignore moral hazard, since each firm then pays a fixed wage to its manager. Thus the second term on the right-hand side of (1.21) gives the expected output when the firm ignores moral hazard.
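The first two loss measures can be illustrated with a short sketch. It is a simplified illustration under stated assumptions rather than the computation behind the reported results: the grid of 100 firms, the random risk premia `chi`, and the function names are all hypothetical. Firms are ranked by effective size, the manager of the same rank is assigned to each firm, and the per-firm losses follow (1.19) and (1.20).

```python
import numpy as np

def risk_sharing_loss(expected_wage, chi):
    """Per-firm loss (1.19): E[w_n(x)] * (1 - exp(-chi_n))."""
    return expected_wage * (1.0 - np.exp(-chi))

def misallocation_loss(size, talent, chi):
    """Per-firm loss (1.20) when firms are ranked by effective size.

    size   : actual sizes S(n), sorted in ascending order
    talent : talents T(m), sorted in ascending order, so talent[n] is the
             efficient partner of firm n
    chi    : risk premium chi_n of each firm
    """
    eff_size = size * np.exp(-chi)               # effective size S(n) * exp(-chi_n)
    rank = np.argsort(np.argsort(eff_size))      # rank of each firm by effective size
    talent_assigned = talent[rank]               # equilibrium partner of firm n
    return size * (talent - talent_assigned)

# Hypothetical setup: 100 firms on the grid n = 1/100, ..., 1.
n = np.arange(1, 101) / 100.0
size = 100.0 * n
talent = 10.0 * n
rng = np.random.default_rng(0)
chi = rng.uniform(0.0, 0.5, size=n.size)         # made-up risk premia
loss = misallocation_loss(size, talent, chi)
share = loss.sum() / (size * talent).sum()       # loss relative to efficient output
```

If every firm has the same `chi`, the ranking by effective size coincides with the ranking by actual size and the misallocation loss is zero; a `chi` profile that fully reverses the ranking reproduces the negative assortative benchmark of the third measure.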

1.2.7 A Numerical Illustration

I now provide a numerical example to illustrate the intuition of the model. In this example, I focus on how the efficient allocation is distorted and on the losses that arise from this distortion. Consider a market for CEOs with a continuum of firms and managers. The distributions of firm actual size and managerial talent are represented by quantile functions, which are assumed to be $S(n) = 100n$ and $T(m) = 10m$ for $n, m \in [0, 1]$. For a typical firm $n$, the probability density functions of $x$ when the manager works and when he shirks are assumed to be normal,

$$x_n \sim \begin{cases} N(0, \sigma_n^2) & \text{if working}, \\ N(-1, \sigma_n^2) & \text{if shirking}. \end{cases}$$

Here $\sigma_n^2$ is the variance of the probability density functions, which reflects the firm's risk. I assume that managerial effort does not affect the risk of firms. Disutility is assumed to be the same across firms, because I focus on the effect of firms' risk; $\alpha_{n1}/\alpha_{n2}$ is therefore set to 4 for all firms. Finally, I set the risk aversion parameter to $\rho = 1/2$. Under this setup, I first derive $\theta_0$ and $\theta_1$ for each firm following Lemma 1. The solutions are given by

$$\theta_{n0} = 1 + \frac{1}{2\left(\exp(1/\sigma_n^2) - 1\right)}, \qquad \theta_{n1} = \frac{1}{2\left(\exp(1/\sigma_n^2) - 1\right)}.$$

With $\theta_{n0}$ and $\theta_{n1}$ in hand, the risk premium $\chi_n$ can be calculated using (1.9). Figure (1.1) displays how it changes with firms' risk. It shows that riskier firms need to pay a higher risk premium, which is consistent with the intuition of the model. Using the risk premium, I can derive each firm's effective size from its definition $\tilde{S}(r) \equiv S(n)e^{-\chi_n}$. The equilibrium allocation involves perfect sorting by firm effective size and managerial talent. I first randomly assign variances to firms from $\sigma_n^2 \in [1, 11]$. Table (1.1) shows the equilibrium allocation. The matching is distorted in the sense that some small firms hire relatively

highly talented managers and some large firms hire relatively less talented managers. The loss from allocation inefficiency due to moral hazard is measured using (1.20). Figure (1.2) shows the efficiency loss. The total value of the loss is 2575, which is about 7.61% of total output under the efficient allocation. When the variances of $x$ are assigned to be positively correlated with firms' actual sizes, the total efficiency loss is at its maximum level, which is about 31.1% of total output under the efficient allocation.

1.3 Identification and Estimation

Identification

With the model defined and solved, I turn to the identification and estimation of the model primitives. I first define the structural model primitives and the observables. The model primitives consist of the probability density functions of firm abnormal returns $x$ under shirking and working, $f_{n1}(x)$ and $f_{n2}(x)$; the nonpecuniary benefits of managers from outside options versus working, $\alpha_{m0}/\alpha_{n2}$; the nonpecuniary benefits of managers from shirking versus working, $\alpha_{n1}/\alpha_{n2}$; the managers' risk aversion parameter $\rho$; and the quantile functions of firm size and managerial talent, $S(n)$ and $T(n)$. $f_{n1}(x)$, $f_{n2}(x)$, $\alpha_{m0}/\alpha_{n2}$, $\alpha_{n1}/\alpha_{n2}$, and $\rho$ are primitives of the principal-agent model, similar to Gayle and Miller (2013). Among these primitives, $f_{n1}(x)$, $f_{n2}(x)$, $S(n)$ and $T(n)$ characterize the production function of firms, while $\alpha_{m0}/\alpha_{n2}$, $\alpha_{n1}/\alpha_{n2}$, and $\rho$ characterize the utility function of managers.

These primitives are identified from the observed monetary transfers between CEOs and firms, the profits and performance of firms, and the final matches. The monetary transfer between a CEO and a firm is CEO compensation. The profit of a firm is measured by its market value after compensating the CEO. The performance of a firm is measured by its abnormal returns. Thus the observables comprise firm abnormal returns $x$, firm market value $V$, and managerial compensation $w$. The data on these observables are assumed to be available for a sample of $N$ observations generated in equilibrium. The observables for the sample are

then denoted by $\{x_i, V_i, w_i\}_{i=1}^{N}$.

Heterogeneity in firms' risk and disutility, reflected by the variances of the abnormal return distributions and the nonpecuniary benefits from shirking and working, is key to this analysis. The heterogeneity is introduced through the observable covariates of firms. In other words, conditional on their observable covariates, firms have the same distributions of abnormal returns under shirking and working, and the same nonpecuniary benefits from shirking and working. To simplify the exposition, I discuss the identification of the model primitives after controlling for the observed covariates of firms. After suppressing the firm index $n$ and the manager index $m$, the model primitives to be identified are $f_1(x)$, $f_2(x)$, $\alpha_0/\alpha_2$, $\alpha_1/\alpha_2$, $\rho$, $S(n)$ and $T(n)$.

The model primitives are identified in a four-step procedure. The first two steps of the identification strategy are borrowed from Gayle and Miller (2013) and the third step is from Gayle and Miller (2009). In the first step, I identify the distribution of firms' abnormal returns under working, $f_2(x)$, and the compensation function $w(x)$. As noted by Gayle and Miller (2013), since all firms in the data tie managerial compensation to performance, the identification focuses on the case in which firms induce working. Thus, in principle, the probability density function of abnormal returns under working, $f_2(x)$, is nonparametrically identified from the abnormal returns $x_n$, and the compensation function $w(x)$ is nonparametrically identified from the observed $x_n$ and $w_n$.

As in Gayle and Miller (2013), the second step identifies the distribution of firms' abnormal returns under shirking, $f_1(x)$, and the two nonpecuniary benefits of managers, $\alpha_0/\alpha_2$ and $\alpha_1/\alpha_2$, for any given value of the risk aversion parameter, using the optimal wage equation and the participation and incentive compatibility constraints holding with equality. Following Theorem (2.1) in Gayle and Miller (2013), under assumption (3) and the fact that

$E[g(x)] = 1$, I derive $g(x)$, $\alpha_0/\alpha_2$, and $\alpha_1/\alpha_2$ as functions of $f_2(x)$, $w(x)$, and $\rho$ as follows:

$$g(x) = \frac{\bar{w}^{\rho} - w(x)^{\rho}}{\bar{w}^{\rho} - E[w(x)^{\rho}]}, \tag{1.22}$$

$$\alpha_0/\alpha_2 = \left\{E[w(x)^{1-\rho}]\right\}^{1/(1-\rho)}, \tag{1.23}$$

$$\alpha_1/\alpha_2 = \left\{\frac{\bar{w}^{\rho} - E[w(x)^{\rho}]}{\bar{w}^{\rho} - E[w(x)]/E[w(x)^{1-\rho}]}\right\}^{1/(1-\rho)}, \tag{1.24}$$

where $\bar{w}$ is the maximum compensation managers can receive, which can be identified from the maximum compensation observed in the data. [27] $f_2(x)$ and $w(x)$ are identified in step one, and $\rho$ is taken as known in this step. Once the likelihood ratio function $g(x)$ is known, $f_1(x)$ is identified from the definition $g(x) \equiv f_1(x)/f_2(x)$. Readers requiring more details and intuition should refer to the discussion in Gayle and Miller (2013).

[27] The three expressions (1.22)-(1.24) are similar to equations (14)-(16) in Gayle and Miller (2013). Since I use a different utility function, the expressions differ slightly from theirs, and I include their derivation in the appendix.

Third, I identify the risk aversion parameter $\rho$. [28] In this study, I follow Gayle and Miller (2009) and identify the risk aversion parameter by assuming that the data contain two states in which the nonpecuniary benefits of CEOs from outside options versus working are the same. The risk aversion parameter $\rho$ can then be identified from the participation constraints in the two states. Formally, let $w_\kappa(x)$ and $w_\tau(x)$ denote the managerial compensation schedules in states $\kappa$ and $\tau$, and let $f_{\kappa 2}(x)$ and $f_{\tau 2}(x)$ denote the probability density functions of abnormal returns under working in the two states. Because the participation constraints hold in both states, we can in principle solve the equation

$$\int w_\kappa(x)^{1-\rho} f_{\kappa 2}(x)\,dx = \int w_\tau(x)^{1-\rho} f_{\tau 2}(x)\,dx \tag{1.25}$$

for $\rho$. If there is at least one solution to this equation, $\rho$ is identified.

In the fourth step, I identify the quantile functions of firm effective size and managerial talent, $\tilde{S}(r)$ and $T(n)$, in a matching framework with moral hazard. I show that they are identified
[28] Gayle and Miller (2013) exploit the profit-maximization condition to partially identify the risk aversion parameter up to a subset. Gayle and Miller (2009) provide point identification of the risk aversion parameter by exploiting the participation constraint.

by using the equilibrium matching conditions together with the observables and the variables identified in the previous steps. The actual size is then recovered from the identified effective size $\tilde{S}(r)$ by exploiting the definition of effective size. This step is my methodological contribution. The basic idea comes from the labor literature that studies income distributions using pure assignment models. That literature shows that the income distributions of employees and firms can be written as functions of the productive type distributions of employees and firms; see, e.g., Sattinger (1993) and Tervio (2008). In particular, Tervio (2008) shows that the distributions of firm size and managerial talent can be written as functions of firm profits and managerial wages, and thus can be identified directly from observed profits and wages. In this paper, I incorporate the moral hazard problem into a pure assignment model. This makes it more challenging to identify the distributions of firm size and managerial talent, as I discuss in the following.

Using the two differential equations for the effective managerial wage (1.15) and the firm's effective profit (1.17), the relative quantile functions of firm effective size and managerial talent can be expressed as functions of the effective wage and effective profit as follows:

$$\frac{\tilde{S}(r)}{\tilde{S}(0)} = \exp\left(\int_0^r \frac{\pi'(i)}{v(i) + \pi(i)}\,di\right), \tag{1.26}$$

$$\frac{T(r)}{T(0)} = \exp\left(\int_0^r \frac{v'(i)}{v(i) + \pi(i)}\,di\right). \tag{1.27}$$

These results are similar to equations (24) and (25) in Tervio (2008). The difference is that incorporating moral hazard requires working with the effective versions of firm size and managerial compensation. As in Tervio (2008), the lowest talent $T(0)$ and the smallest effective size $\tilde{S}(0)$ are indeterminate because there is no information from which to infer the relative contributions of managerial talent and firm size to the effective output created at and below the smallest firm in the sample.
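The step from the slope conditions (1.15) and (1.17) to the expressions (1.26) and (1.27) can be spelled out as follows. The derivation uses only the accounting identity $\pi(r) = \tilde{S}(r)T(r) - v(r)$ and the two slopes; the tilde marks effective size.

```latex
\begin{align*}
v(r) + \pi(r) &= \tilde{S}(r)\,T(r), \\
\frac{\pi'(r)}{v(r) + \pi(r)}
  &= \frac{\tilde{S}'(r)\,T(r)}{\tilde{S}(r)\,T(r)}
   = \frac{d}{dr}\ln \tilde{S}(r), \\
\frac{v'(r)}{v(r) + \pi(r)}
  &= \frac{\tilde{S}(r)\,T'(r)}{\tilde{S}(r)\,T(r)}
   = \frac{d}{dr}\ln T(r).
\end{align*}
```

Integrating the last two lines from $0$ to $r$ and exponentiating gives the ratios $\tilde{S}(r)/\tilde{S}(0)$ and $T(r)/T(0)$, which is why only relative quantile functions are pinned down.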
From equations (1.26) and (1.27), the relative quantile functions of firm effective size and managerial talent are functions of the effective wage and effective profits. Thus, if the effective wage and effective profits of each firm are known, the relative quantile functions of

firm effective size and managerial talent can be identified directly. However, these effective terms cannot be observed directly in the data because they involve risk premium terms. To identify the quantile functions of firm effective size and managerial talent, I first need to recover the effective wage and effective profits from the observed managerial wages and firm profits and from the identified model primitives. They are recovered using the following two conditions. First, each firm's total expected output equals the sum of the expected managerial wage $E[w(x)]$ and the expected firm profit, which is measured by the firm's market value $V_n$. Since a firm's market value also reflects expectations about the firm's future, the economic value of the current CEO's talent is the aggregate value that the CEO generates for the firm in current and future periods. Second, each firm's effective output equals the sum of the effective managerial wage and the effective firm profit. Using these two conditions, we have $\pi(r) = e^{-\chi_n}V_n$ and $v(r) = e^{-\chi_n}E[w_n(x)]$, where $\chi_n$ is given by (1.9) and can be estimated from the model primitives identified in the previous steps.

To see this, note that since total expected output equals the sum of the expected managerial wage and the expected firm profit, I have $S(n)T(r) = E[w_n(x)] + V_n$. Multiplying both sides by $e^{-\chi_n}$ yields $\tilde{S}(r)T(r) = e^{-\chi_n}E[w_n(x)] + e^{-\chi_n}V_n$. On the other hand, the fact that effective output consists of the effective wage and effective profits gives $\tilde{S}(r)T(r) = v(r) + \pi(r)$. Comparing these two equations gives $\pi(r) = e^{-\chi_n}V_n$ and $v(r) = e^{-\chi_n}E[w_n(x)]$. With the recovered effective wage $v(r)$ and effective profits $\pi(r)$ in hand, the relative quantile functions $\tilde{S}(r)/\tilde{S}(0)$ and $T(r)/T(0)$ are identified using equations (1.26) and (1.27).
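The fourth identification step can be sketched numerically. The code below is an illustrative sketch under assumptions, not the estimation code of this thesis: the effective wage and profit profiles are made-up smooth functions of the quantile `r`, and the helper names are hypothetical. It scales observed terms by $e^{-\chi_n}$ and then applies (1.26) and (1.27) with numerical derivatives and a trapezoid rule.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Cumulative integral of y over x by the trapezoid rule, starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def effective_terms(expected_wage, market_value, chi):
    """Effective wage and profit: v = exp(-chi) * E[w], pi = exp(-chi) * V."""
    scale = np.exp(-chi)
    return scale * expected_wage, scale * market_value

def relative_quantiles(r, v, pi):
    """Relative effective size and talent from (1.26) and (1.27)."""
    total = v + pi                                    # effective output
    size_ratio = np.exp(cumulative_trapezoid(np.gradient(pi, r) / total, r))
    talent_ratio = np.exp(cumulative_trapezoid(np.gradient(v, r) / total, r))
    return size_ratio, talent_ratio

# Made-up profiles, used only to check the inversion.
r = np.linspace(0.0, 1.0, 2001)
v = 400.0 + 1000.0 * (r + 0.5 * r**2)
pi = 600.0 + 1000.0 * (r + 0.5 * r**2)
size_ratio, talent_ratio = relative_quantiles(r, v, pi)
```

For these particular profiles both ratios equal $1 + r$ up to discretization error, which gives a simple check of the inversion. With real data the profiles would first have to be made monotone in rank, as discussed in the estimation section.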
Using the identified relative quantile function of firm effective size, the relative quantile function of firm actual size, $S(n)/S(0)$, can then be identified from the definition of effective size, $\tilde{S}(r) \equiv S(n)e^{-\chi_n}$.
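Looking back at steps two and three of the identification argument, the closed forms (1.22)-(1.24) and the two-state condition (1.25) translate directly into sample analogues. The sketch below is a hedged illustration: it replaces expectations by sample means, assumes $0 < \rho < 1$, and uses hypothetical function names and toy wage data that do not come from the thesis.

```python
import numpy as np

def identify_step_two(w, rho):
    """Sample analogues of (1.22)-(1.24) for a given rho in (0, 1).

    w   : compensation observed for firms that share the same covariates
    rho : risk aversion parameter, treated as known in this step
    """
    w = np.asarray(w, dtype=float)
    w_bar_rho = w.max() ** rho                   # maximum observed pay, to the power rho
    mean_w_rho = np.mean(w ** rho)               # E[w^rho]
    mean_w_1mr = np.mean(w ** (1.0 - rho))       # E[w^(1 - rho)]
    g = (w_bar_rho - w ** rho) / (w_bar_rho - mean_w_rho)              # (1.22)
    a0 = mean_w_1mr ** (1.0 / (1.0 - rho))                             # (1.23)
    ratio = (w_bar_rho - mean_w_rho) / (w_bar_rho - np.mean(w) / mean_w_1mr)
    a1 = ratio ** (1.0 / (1.0 - rho))                                  # (1.24)
    return g, a0, a1

def solve_rho(w_kappa, w_tau, lo=0.0, hi=0.9, tol=1e-10):
    """Bisection on (1.25): E[w_kappa^(1-rho)] = E[w_tau^(1-rho)].

    The bracket stays below 1 because at rho = 1 both sides equal one
    for any wage data, so that root carries no information.
    """
    w_kappa = np.asarray(w_kappa, dtype=float)
    w_tau = np.asarray(w_tau, dtype=float)

    def gap(rho):
        return np.mean(w_kappa ** (1.0 - rho)) - np.mean(w_tau ** (1.0 - rho))

    if gap(lo) * gap(hi) > 0.0:
        raise ValueError("no sign change on [lo, hi]; rho is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Toy data, purely illustrative.
g, a0, a1 = identify_step_two([1.0, 4.0, 9.0, 16.0], rho=0.5)
rho_hat = solve_rho([1.0, 9.0], [4.0, 4.0])
```

In the toy example the two states have the same certainty-equivalent wage exactly when $\rho = 1/2$, so the bisection returns a value close to 0.5. The bracket `[lo, hi]` is an assumption of the sketch; a sign change must exist on it for the search to start.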

1.3.2 Estimation

In view of the identification results, I propose a three-step procedure to estimate the model primitives using the data on managerial compensation $w$, firm abnormal returns $x$, firm market value $V$, and observed firm characteristics. The observed firm characteristics are used to introduce firm heterogeneity into the model primitives that capture firms' risk and disutility. In the first step, I estimate the probability density function under working, $f_{n2}(x)$, from the observed abnormal returns $x$. In the second step, I estimate the probability density function of abnormal returns under shirking, $f_{n1}(x)$, the risk aversion parameter $\rho$, the nonpecuniary benefits of outside options versus working, $\alpha_{n0}/\alpha_{n2}$, and the nonpecuniary benefits of shirking versus working, $\alpha_{n1}/\alpha_{n2}$. They are estimated with a Minimum Distance Estimator (MDE) from the observed managerial compensation $w$ and the observed firm characteristics. Finally, the quantile functions of firm size and managerial talent, $S(n)$ and $T(n)$, are estimated by directly following the identification strategy.

Estimation of $f_{n2}(x)$

As in Margiotta and Miller (2000) and Gayle and Miller (2009), the probability density function of abnormal returns under working, $f_{n2}(x)$, could be estimated nonparametrically using the data on $x$. However, firms' risk and disutility are heterogeneous. Recall that a firm's risk is reflected by the variance of the probability density function of its abnormal returns. To capture heterogeneity in firms' risk, I use the observed firm characteristics as control covariates in the estimation of $f_{n2}(x)$. Since nonparametric estimation with many covariates is intractable, I adopt a parametric method to estimate $f_{n2}(x)$. Specifically, as in Gayle and Miller (2009), I assume that the probability density functions of abnormal returns under shirking ($e = 1$) and working ($e = 2$) are both truncated normal

with support bounded below by $\psi$,

$$f_{ne}(x) = \left[\Phi\left(\frac{\mu_{ne} - \psi}{\sigma_n}\right)\sigma_n\sqrt{2\pi}\right]^{-1} \exp\left[-\frac{(x - \mu_{ne})^2}{2\sigma_n^2}\right], \tag{1.28}$$

where $\Phi$ is the standard normal distribution function and $(\mu_{ne}, \sigma_n^2)$ denotes the mean and variance of the corresponding parent normal distribution for firm $n$. In this specification, the probability density functions of abnormal returns have the same functional form for all firms but different values of the mean and variance. The density functions under shirking and working have different means but share the same variance. This implies that managerial effort does not affect the riskiness of firms in my one-period static model. My model imposes the restriction that expected abnormal returns conditional on the manager working are zero. Moreover, in the data I fail to reject that the mean of abnormal returns is zero. Thus I use this restriction in the estimation of $f_{n2}(x)$. Under the truncated normal specification of $f_{n2}(x)$, $\mu_{n2}$ is defined implicitly by

$$0 = E(x \mid e = 2) = \mu_{n2} + \frac{\sigma_n\,\varphi[(\mu_{n2} - \psi)/\sigma_n]}{\Phi[(\mu_{n2} - \psi)/\sigma_n]}, \tag{1.29}$$

where $\varphi$ denotes the standard normal probability density function. This restriction is also used in the estimation of Gayle and Miller (2009).

To introduce firm heterogeneity, the mean and variance $(\mu_{n2}, \sigma_n^2)$ are specified as functions of observed firm characteristics, including the number of employees, the debt-to-equity ratio and sector dummies. Denoting these observed firm covariates by $z_{n1}$, I follow Gayle and Miller (2009) and specify the variance $\sigma_n^2$ as an exponential function, $\sigma_n^2 = \exp(\beta' z_{n1})$, where $\beta$ is a parameter vector for the observed firm covariates. $\mu_{n2}$ is then also a function of $\beta$, defined by (1.29). The estimation of $f_{n2}(x)$ is completed by estimating $(\psi, \beta)$. $\psi$ is consistently estimated by the lowest value of the abnormal return $x$ in the data. $\beta$ can be estimated with a Maximum Likelihood Estimator (MLE) through the probability

density function of $x$ under working. The maximum likelihood estimator $\hat{\beta}$ is found by choosing $\beta$ to minimize the negative sum of the log-likelihood functions,

$$L_N(\beta) = \sum_{n=1}^{N}\left\{\ln \sigma_n(\beta) + \ln \Phi\left[\frac{\mu_{n2} - \psi}{\sigma_n(\beta)}\right] + \frac{[x_n - \mu_{n2}]^2}{2\sigma_n(\beta)^2}\right\}, \tag{1.30}$$

subject to the restriction (1.29) that the expected value of abnormal returns is zero when managers work.

Estimation of $f_{n1}(x)$, $\alpha_{m0}/\alpha_{n2}$, $\alpha_{n1}/\alpha_{n2}$, $\rho$

Under the truncated normal specification, the parameters that characterize the probability density function of abnormal returns under shirking, $f_{n1}(x)$, are the mean $\mu_{n1}$ and variance $\sigma_n^2$ of its parent distribution. Recall that $f_{n1}(x)$ is assumed to share the same variance as $f_{n2}(x)$. The estimation of $f_{n1}(x)$ is therefore complete once the mean $\mu_{n1}$ is estimated. To capture firm heterogeneity, again following Gayle and Miller (2009), I specify the mean $\mu_{n1}$ as a linear function of the observed firm covariates, $\mu_{n1} = u_1' z_{n1}$. I use the same observed firm covariates as in the specification of the variance $\sigma_n^2$ because $\mu_{n2}$ is also an implicit function of these covariates.

The nonpecuniary benefits from outside options versus working, $\alpha_{m0}/\alpha_{n2}$, are determined by the demand for management services and by managerial satisfaction with working for the firm. Ideally, $\alpha_{m0}/\alpha_{n2}$ would therefore be specified as a function of both firm and managerial characteristics. Since I do not have information on managerial characteristics, in this study I specify $\alpha_{m0}/\alpha_{n2}$ and $\alpha_{n1}/\alpha_{n2}$ only as functions of firm characteristics. Hereafter I rewrite $\alpha_{m0}/\alpha_{n2}$ as $\alpha_{n0}/\alpha_{n2}$. This is plausible in the sense that heterogeneous firms are matched endogenously with heterogeneous managers in equilibrium, so the characteristics of firms can reflect the outside options of managers. In particular, I specify it as a linear function of firm characteristics, $\alpha_{n0}/\alpha_{n2} = a_0' z_{n2}$,

where $z_{n2}$ is a vector of firm characteristics, including the firm's assets and number of employees. The nonpecuniary benefits from shirking versus working, $\alpha_{n1}/\alpha_{n2}$, reflect the disutility that the firm imposes on its working manager. Thus, I also specify it as a linear function of firm characteristics, $\alpha_{n1}/\alpha_{n2} = a_1' z_{n2}$. The linear specification of the nonpecuniary benefits is also used in Gayle and Miller (2009).

The estimation task in this step is to estimate the parameter set $\Omega \equiv (u_1, a_0, a_1, \rho)$. It is estimated with a Minimum Distance Estimator (MDE) that exploits the optimal wage equation (1.5), the participation constraint (1.3) and the incentive compatibility constraint (1.4). Denote the true value of $\Omega$ by $\Omega_0$. Let $w_n$ denote the observed wage of firm $n$'s manager, $w_n = w_n^*(\Omega_0, x)$, $n = 1, 2, \ldots, N$. The parameter set $\Omega$ can then be estimated by choosing $\Omega$ to minimize the distance between the observed wage and the model-generated wage. Equivalently, I estimate $\Omega$ by minimizing the distance between the log observed wage and the log model-generated wage, which is given by

$$\sum_{n=1}^{N}\left[\ln(w_n) - \ln w_n^*(\Omega, x)\right]^2.$$

Using the optimal wage equation and the specifications of the parameters, $\Omega$ is estimated by

$$\hat{\Omega} = \arg\min_{\Omega} \sum_{n=1}^{N}\left[\ln(w_n) - \ln(a_0' z_{n2}) - \frac{1}{\rho}\ln\left\{\theta_0 + \theta_1\left(1 - \frac{f_{n1}(x, u_1)}{f_{n2}(x)}\,(a_1' z_{n2})^{1-\rho}\right)\right\}\right]^2$$

subject to the system of equations (1.6) and (1.7) determined by the participation constraint (1.3) and the incentive compatibility constraint (1.4). Note that $\theta_0$ and $\theta_1$ are the solution of a fixed-point problem, which must be solved for each value of the parameter vector $\Omega$ in order to evaluate the econometric criterion function. [29]

[29] This is an example of a nested fixed-point algorithm, first proposed by Rust (1987) in the empirical industrial organization literature. In the literature on managerial compensation, Ferrall and Shearer (1999), Margiotta and Miller (2000) and Gayle and Miller (2009) also use a nested fixed-point algorithm to obtain their estimates.
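Returning to the first estimation step, the zero-mean restriction (1.29) and the criterion (1.30) can be sketched in a few lines. This is a minimal sketch under assumptions, not the estimation code of this thesis: the function names, the bisection solver, and the toy numbers are all hypothetical, and no safeguards are included for extreme parameter values (for example a truncation point $\psi$ very close to zero relative to $\sigma_n$).

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    # erfc keeps precision in the lower tail
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def truncated_mean(mu, sigma, psi):
    """Mean of N(mu, sigma^2) truncated below at psi, the right side of (1.29)."""
    z = (mu - psi) / sigma
    return mu + sigma * std_normal_pdf(z) / std_normal_cdf(z)

def solve_mu2(sigma, psi, tol=1e-12):
    """Parent mean mu_n2 that makes the truncated mean zero (needs psi < 0)."""
    if psi >= 0.0:
        raise ValueError("a zero truncated mean requires psi < 0")
    hi = 0.0                      # the truncated mean is positive at mu = 0
    lo = -sigma
    while truncated_mean(lo, sigma, psi) > 0.0:
        lo -= sigma               # step down until the truncated mean turns negative
    while hi - lo > tol:          # bisection; the truncated mean increases in mu
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, sigma, psi) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def neg_log_likelihood(beta, x, covariates, psi):
    """Criterion (1.30) with sigma_n^2 = exp(beta' z_n1) and mu_n2 from (1.29)."""
    total = 0.0
    for x_n, z_n in zip(x, covariates):
        sigma = math.sqrt(math.exp(sum(b * c for b, c in zip(beta, z_n))))
        mu2 = solve_mu2(sigma, psi)
        total += (math.log(sigma)
                  + math.log(std_normal_cdf((mu2 - psi) / sigma))
                  + (x_n - mu2) ** 2 / (2.0 * sigma ** 2))
    return total

# Toy inputs, purely illustrative.
mu2 = solve_mu2(sigma=0.3, psi=-0.5)
value = neg_log_likelihood([-2.0, 0.1], [0.05, -0.1, 0.2],
                           [[1.0, 0.5], [1.0, 1.0], [1.0, 0.0]], psi=-0.5)
```

Solving (1.29) inside the likelihood mirrors the constraint in the text: for each trial $\beta$, the implied $\sigma_n$ determines $\mu_{n2}$, so $\beta$ is the only free parameter passed to an optimizer.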

estimation of $S(n)$ and $T(n)$

The estimation of the quantile functions of firm size $S(n)$ and managerial talent $T(n)$ follows directly from the identification argument. As noted in the identification section, I first need to estimate the firm's effective profit and the manager's effective wage. The effective profit $\pi(r)$ can be estimated from

$$\hat{\pi}(r) = V_n e^{-\hat{\chi}_n},$$

where $V_n$ is the observed firm market value and $\hat{\chi}_n$ is the value of $\chi_n$ computed from the estimated parameters,

$$\hat{\chi}_n = \ln E\left\{ \left[ \hat{\theta}_0 + \hat{\theta}_1 \left( 1 - (\hat{a}_1' z_{n2})^{1-\hat{\rho}} \hat{g}_n(x) \right) \right]^{1/\hat{\rho}} \right\}.$$

The effective wage $v(r)$ is estimated directly from $\hat{v}(r) = \hat{a}_0' z_{n2}$, where the equality follows from the definition of the effective wage, $v(r) \equiv E[w_n(x)] e^{-\chi_n} = \alpha_{n0}/\alpha_{n2}$.

As noted by Tervio (2008), assignment models make sense only if the incomes that firms and managers obtain exhibit perfect positive rank correlation. In this study, firms' effective profit and managers' effective wage need to exhibit perfect positive rank correlation. However, since in practice the managerial wage is affected by many factors, including some stochastic ones, we cannot expect the effective wage and the effective profit to be perfectly rank correlated. Thus the noisy relation between managerial effective wage and effective profit needs to be smoothed into a strictly monotonic relation. The smoothing can be done in many ways. Here I follow Tervio (2008) and perform a Lowess smoothing of the relation between the levels of managerial effective wage and firm effective profit. Lowess smoothing was first proposed by Cleveland (1979). The basic idea of the method is to take a weighted moving average of the effective wage along the rank by firms' effective profit, using higher weights for nearby observations [30]. Hereafter I use the smoothed effective wage to refer to the actual effective wage. Since the rank of effective profit is used to order the observations, there is no need to smooth it. I only need a simple connect-the-dots interpolation to create a continuous distribution for it.

[30] In principle, I could also smooth firms' effective profit according to the rank of effective wage. However, since managerial wage is more volatile, ranking by firms' market value tends to be better.
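The smoothing step can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the procedure actually used: it fits a tricube-weighted local linear regression of the series on its rank, omits the robustness iterations of Cleveland's full Lowess, and uses a hypothetical function name and a hypothetical bandwidth parameter `frac`.

```python
import math

def lowess_by_rank(values, frac=0.3):
    """Smooth `values` (ordered by rank, e.g. effective wage ordered by the
    rank of effective profit) with a tricube-weighted local linear fit.

    frac : share of observations used in each local window (assumed value)
    """
    n = len(values)
    k = min(n, max(2, math.ceil(frac * n)))  # neighbours per window
    fitted = []
    for i in range(n):
        # Bandwidth: rank distance to the k-th nearest observation
        dists = sorted(abs(i - j) for j in range(n))
        h = max(dists[k - 1], 1)
        sw = swx = swy = swxx = swxy = 0.0
        for j in range(n):
            u = abs(i - j) / h
            if u >= 1.0:
                continue  # outside the window: zero weight
            w = (1.0 - u ** 3) ** 3  # tricube weight, larger for nearby ranks
            x = float(j - i)         # rank centred at the target point
            sw += w
            swx += w * x
            swy += w * values[j]
            swxx += w * x * x
            swxy += w * x * values[j]
        denom = sw * swxx - swx * swx
        if denom <= 1e-12:
            fitted.append(swy / sw)  # degenerate window: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw)  # fitted value at x = 0
    return fitted
```

A local fit of this kind does not by itself guarantee a strictly monotonic result, so the smoothed series would still need to be checked for monotonicity in rank before it is used in the assignment equations.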

Using the effective profit and the smoothed effective wage, the relative quantile functions of firm effective size, $\tilde{S}(r)/\tilde{S}(0)$, and managerial talent, $T(r)/T(0)$, can be estimated by exploiting equations (1.26) and (1.27). More explicitly, they are estimated by

$$\frac{\tilde{S}(r)}{\tilde{S}(0)} = \exp\left( \int_0^r \frac{\hat{\pi}'(i)}{\hat{v}(i) + \hat{\pi}(i)} \, di \right), \qquad (1.31)$$

$$\frac{T(r)}{T(0)} = \exp\left( \int_0^r \frac{\hat{v}'(i)}{\hat{v}(i) + \hat{\pi}(i)} \, di \right). \qquad (1.32)$$

Finally, using the estimated relative quantile function of firm effective size and the estimated $\hat{\chi}_n$, the quantile function of firm actual size $S(n)$ can be estimated from the definition $\tilde{S}(r) \equiv S(n) e^{-\chi_n}$.

1.4 Data

My sample comprises the 1000 largest publicly traded firms by market value, and their CEOs, in the S&P Compustat databases. Data on executive compensation are collected from the S&P Compustat Execucomp database. I extract only the information on the compensation of chief executive officers for this study. The compensation data are supplemented by firm information from the S&P Compustat North America database and monthly stock price data from the Center for Research in Security Prices (CRSP) database. The firm characteristics data are exploited to introduce heterogeneity in firms' risk and disutility. The monthly stock price data are used to construct the abnormal returns of firms.

Industry-level factors may affect the risk of firms and thus managerial compensation. To account for industry-level effects without complicating the analysis too much, I follow Gayle and Miller (2013) and group the firms in the sample into three industrial sectors according to their GICS codes. The first, called the primary sector, includes firms in energy (GICS: 1010), materials (1510), industrials (2010, 2020, 2030), and utilities (5510). Sector 2, called consumer goods, comprises firms from consumer discretionary (2510, 2520, 2530, 2540, 2550) and consumer staples (3010, 3020, 3030). Finally, firms in health care (3510, 3520), financial services (4010,

4020, 4030, 4040), information technology and telecommunication services (4510, 4520, 4530, 5010) comprise Sector 3, called services.

1.4.1 CEOs Compensation

I measure a CEO's compensation as the sum of his salary and bonus, the value of restricted stocks and options granted, and the value of retirement and long-term compensation schemes. This is both the cost to shareholders of employing a CEO and the total compensation a CEO obtains from employment. I use this approach to measure CEO compensation for two reasons. First, I consider a static model. When firms are matching with managers, firms care about the cost of employing a manager and managers care about the total compensation associated with employment. Second, CEO compensation measured in this way is always positive, which satisfies the model restriction.

Table (1.2) summarizes the cross-sectional information on the components of CEO compensation by sector in my data. Total compensation is broken into four components: salary and bonus, the value of options granted, the value of restricted stock granted, and other compensation, where other compensation includes the value of retirement and nonequity incentive compensation. Salary and bonus together with other compensation account for about 44.4% of total compensation, while the other two components, options and restricted stock granted, collectively account for about 55.6%. This shows that a large fraction of managerial compensation is linked to firm performance. Moreover, managerial income from holding granted financial securities has a very large standard deviation, which suggests that this income, whose value is affected by the firm's performance, accounts for most of the variability of total compensation.

1.4.2 Abnormal Returns

I follow Gayle and Miller (2009) in defining the abnormal returns $x$ of a typical firm as the residual component of returns that its manager is able to control. In the optimal contract,

the compensation should depend on this residual in order to provide the manager with appropriate incentives, but it should not depend on changes in stochastic factors that originate outside the firm and cannot be controlled by the manager. More specifically, following Gayle and Miller (2009), I impute $x$, the abnormal returns to the firm, from the monthly stock price data on the 1000 largest companies from 1998 to 2011 in two steps. First, I calculate the difference between the financial return on the individual firm's stock and the return on the market portfolio. Second, I regress this difference on a sector-specific constant and time-varying factors, including GDP. Table (1.3) displays summary statistics on the residual for the sample in 2011. All the estimated coefficients in the regression used to measure abnormal returns are statistically significant. The table shows that the means of abnormal returns are negative in all three sectors. The mean is highest in the services sector and lowest in the primary sector. The dispersion of abnormal returns is highest in the consumer sector and lowest in the primary sector.

1.4.3 Firm Characteristics

Firm characteristics affect firm risk, the nature of the manager's responsibilities and the satisfaction he derives from managing the firm. These characteristics are also relevant to the nonpecuniary benefits that managers obtain from pursuing their own interests within firms. Table (1.4) summarizes the cross-sectional information on firm characteristics by sector. It gives summary statistics on assets, market value, sales and employees. These characteristics give some idea of the scope of managerial responsibilities. It also shows summary statistics on the debt equity ratio, which reflects firms' risk to some extent. The firms in the consumer sector are the most highly leveraged, while those in the primary sector are the least leveraged. This suggests that, on average, firms in the consumer sector may be riskier, which is consistent with the dispersion of abnormal returns shown in Table (1.3).

The study makes sense only if compensation increases with firm size and abnormal returns. Figure (1.3) displays the relation of CEO compensation and firm

rank by market value. The sample correlations between CEO compensation, firm characteristics and abnormal returns are displayed in Table (1.5). The correlation between market value and CEO compensation is the largest, which suggests that market value has the most explanatory power for CEO compensation. The correlation between CEO compensation and abnormal returns is also positive.

1.5 Estimation Results

The estimation approach is applied to the above data on CEO compensation, firms' abnormal returns, market value and other observed characteristics. I first estimate the means and variances of the probability density functions under manager working, $f_{n2}(x)$. The estimates for the variables in the specification of the variance capture how firms' heterogeneous risk depends on the observed firm covariates. I then estimate the means of the probability density functions under manager shirking, $f_{n1}(x)$, the nonpecuniary benefits from outside options versus working, $\alpha_{n0}/\alpha_{n2}$, the nonpecuniary benefits from shirking versus working, $\alpha_{n1}/\alpha_{n2}$, and the relative risk aversion parameter, $\rho$. Finally, I estimate the quantile functions of firm size $S(n)$ and managerial talent $T(n)$.

With the estimates of the model primitives in hand, I then conduct counterfactuals to assess how important managerial talent misallocation, as a result of moral hazard, is for the aggregate production of firms. More specifically, I conduct counterfactuals to quantify the four measures of loss generated by moral hazard presented in Section (2): the loss from risk-sharing inefficiency, the actual loss from talent misallocation, the maximal loss from talent misallocation, and the loss from ignoring the moral hazard problem. Quantifying each of the losses requires its own counterfactual. In this section I first report the estimates of the model primitives. I then report the counterfactual results on the losses related to moral hazard.

1.5.1 Estimated Model Primitives

The probability density function under working, $f_{n2}(x)$, is estimated in the first step. Recall that $f_{n2}(x)$ is parameterized as a truncated normal distribution. The estimation is completed by estimating its lower bound and the mean and variance of its parent distribution. At the top of Table (1.6), I report the estimates for the variables in the specification of the variance of the parent normal distribution. The estimates convey information on the risk of firms with different observed characteristics. The estimate on the debt equity ratio is positive, which suggests that more leveraged firms are riskier, holding other factors constant. The number of employees has a negative effect on firms' risk, which means that firms with more employees are less risky, holding other factors constant. The firms in the services sector tend to have the highest risk, while those in the primary sector have the lowest. The mean of the parent distribution of $f_{n2}(x)$ is estimated by using the restriction $E(x \mid e = 2) = 0$, which implies that it is not greater than zero for any firm. The bottom of Table (1.6) shows that the consistent estimate of the truncation lower bound is $\psi = -0.744$, which is the lowest abnormal return in the data.

The probability density function under shirking, $f_{n1}(x)$, shares the same variance as $f_{n2}(x)$. This leaves its mean $\mu_1$ to be estimated. The parameter estimates for the variables in the specification of $\mu_1$ are reported in the middle of Table (1.6). The abnormal returns of firms with a higher debt equity ratio and more employees tend to have lower means if their managers pursue their own interests. This implies that managers have more impact on these firms. The estimates on the dummy variables indicating the consumer and services sectors are negative, so managers have more impact on firms in those sectors. This might be because firms in those sectors face more competition.
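The first-step likelihood behind these estimates can be sketched in Python. This is a minimal illustration assuming a single common parent mean `mu` and standard deviation `sigma` for all observations (the thesis lets both vary with firm covariates); the function names are hypothetical. The second helper shows why a truncated mean of zero requires a parent mean below zero.

```python
import math

SQRT2 = math.sqrt(2.0)

def upper_tail(z):
    """P(Z > z) for a standard normal Z, computed stably with erfc."""
    return 0.5 * math.erfc(z / SQRT2)

def neg_loglik(xs, mu, sigma, psi):
    """Negative log-likelihood (up to an additive constant) of returns xs
    from N(mu, sigma^2) truncated from below at psi.

    Uses Phi[(mu - psi) / sigma] = P(Z > (psi - mu) / sigma).
    """
    log_mass = math.log(upper_tail((psi - mu) / sigma))
    return sum(
        math.log(sigma) + log_mass + (x - mu) ** 2 / (2.0 * sigma ** 2)
        for x in xs
    )

def truncated_mean(mu, sigma, psi):
    """Mean of N(mu, sigma^2) truncated from below at psi."""
    alpha = (psi - mu) / sigma
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    return mu + sigma * pdf / upper_tail(alpha)
```

Because truncation from below always raises the mean above that of the parent distribution, imposing a truncated mean of zero forces the parent mean to be negative, in line with the remark above.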
The top of Table (1.7) reports the estimation results for the nonpecuniary benefits from outside options versus working, $\alpha_{n0}/\alpha_{n2}$. The results show that managers serving firms with more assets and employees obtain more nonpecuniary benefits from their outside options

versus working. The reason is that those firms endogenously end up with more talented managers, whose reservation utility is higher. The middle of Table (1.7) reports the estimation results for the nonpecuniary benefits from shirking versus working, $\alpha_{n1}/\alpha_{n2}$. These results show that managers serving firms with more assets or employees obtain more nonpecuniary benefits from shirking relative to working. This might be because managing these firms carries more responsibilities and gives managers less satisfaction. Moreover, the estimates of $\alpha_{n1}/\alpha_{n2}$ for all firms are greater than 1, which is consistent with the model restriction. The estimate of the risk aversion parameter $\rho$, common to all managers, is reported at the bottom of Table (1.7). With such a risk aversion level, a manager with $2 million is willing to pay $0.382 million to avoid a gamble in which he loses $1 million or wins $1 million with equal probability.

The remaining model primitives are the distributions of firm size and managerial talent. The relative quantile functions of firms' effective size and managerial talent, $\tilde{S}(r)/\tilde{S}(0)$ and $T(r)/T(0)$, are estimated according to equations (1.31) and (1.32). Using the estimated $\tilde{S}(r)/\tilde{S}(0)$, the relative quantile function of firms' actual size is estimated by exploiting the definition of effective size, $\tilde{S}(r) \equiv S(n) e^{-\chi_n}$. Figures (1.4) and (1.5) display the estimated distributions of firms' effective size and managerial talent, respectively. The distribution of firms' effective size is highly skewed to the right, and there is not much difference in managerial talent. This is consistent with the finding of Tervio (2008) and Jung and Subramanian (2013) that most of the variation in CEO compensation is explained by differences in firm size.

Figure (1.6) displays the actual, efficient, and least efficient allocations of CEOs to firms. The grey dots along the 45-degree line represent the efficient allocation of CEOs to firms. The blue dots represent the actual allocation of CEOs to firms. Comparing the efficient and actual allocations, we can see that many firms end up with much less talented CEOs in equilibrium than they would in the efficient allocation. The reason is that these firms are very risky and pay a higher risk premium to CEOs. Figure (1.7) displays the risk premium that

firms need to pay by their actual size. This graph is consistent with Figure (1.6). In Figure (1.6), the blue-green line represents the worst allocation of CEOs to firms. The counterfactuals quantify the losses of all firms under the actual and worst allocations of CEOs to firms by comparing output under these allocations with output under the efficient allocation.

1.5.2 Counterfactuals

I conduct four counterfactuals to assess the importance of managerial talent misallocation as a result of moral hazard, using the estimated model primitives. The importance is evaluated by quantifying the four measures of loss presented in Section (2.6). Quantifying each of the four measures requires a counterfactual. The remainder of this section presents the details of the counterfactuals and discusses the results.

loss from risk-sharing inefficiency

The first counterfactual quantifies the loss of firms from risk-sharing inefficiency. The loss is measured by the difference between the expected wage the firm pays and the flat wage it would pay if moral hazard were not a problem, defined by $L_{n1}$ in (1.19). The first row of Table (1.8) gives the summary statistics on this loss. The total estimated aggregate loss over all firms is 2.9 billion dollars. This value is plausible when compared with the literature on this measure. Edmans and Gabaix (2011a), calibrating a different theoretical model, find this loss to be $2 billion over the top 500 firms in the 2005 Execucomp database. Assuming absolute risk aversion preferences of managers, Gayle and Miller (2009) estimate the aggregate costs from risk-sharing inefficiency, in year-2000 US dollars, for 3026 firms over a sample period beginning in 1992.

actual loss from matching inefficiency

In the second counterfactual, I quantify the loss to firms from talent misallocation. The loss is measured by the difference between the expected output of firms under efficient matching and that under actual matching, which is defined by $L_{n2}$ in (1.20).
I first calculate the expected output from a counterfactual in which the matching between managers and firms is efficient, which is given by

$$E[S(n)T(n)(1 + x)] = S(n)T(n).$$

From the estimated $\tilde{S}(r)/\tilde{S}(0)$ and $\chi_n$, I can recover $S(n)/\tilde{S}(0) = \tilde{S}(r) e^{\chi_n} / \tilde{S}(0)$ by exploiting the definition of effective size. I can then derive

$$\Pi_n \equiv \frac{S(n)T(n)}{\tilde{S}(0)T(0)},$$

which gives

$$S(n)T(n) = \Pi_n [\tilde{S}(0)T(0)] = \Pi_n [\hat{v}(0) + \hat{\pi}(0)],$$

where $\hat{v}(0) + \hat{\pi}(0)$ is the estimated total effective output of the lowest-ranked firm-CEO pair. The total production of firm $n$ under the actual, inefficient matching is the firm's total output in the data, which is given by

$$E[S(n)T(r)(1 + x)] = S(n)T(r) = E[w_n(x)] + V_n,$$

where $E[w_n(x)]$ is the expected wage the firm's manager obtains and $V_n$ is the firm's market value. The difference between the total expected output under efficient matching and that under the actual inefficient matching is the production loss from talent misallocation. The second row of Table (1.8) displays the summary statistics on the production loss for all firms. The total production loss from talent misallocation is estimated to be about 12.64 billion dollars. To the best of my knowledge, there is only one other paper that calibrates the loss from matching inefficiency due to moral hazard. Edmans and Gabaix (2011a) provide a calibrated upper bound ($7.7 billion) for this loss for the top 500 firms in the 2005 Execucomp database.

maximum loss from matching inefficiency

Third, I conduct a counterfactual to quantify the maximal loss that firms in the market would incur from managerial talent misallocation due to moral hazard. This loss is measured by the difference between the total expected output under the efficient matching and that under the worst matching, that is, the matching in which the best firm is paired with the least talented manager. The summary statistics on this loss are displayed in the third row of Table (1.8). The total maximal loss that firms could incur from talent misallocation could exceed one trillion dollars, which is approximately 15.00% of the market capitalization of the 1000 largest firms.
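The logic of the second and third counterfactuals can be illustrated with a small Python sketch. It assumes that expected output of a match is the product of actual firm size and managerial talent, as in the expressions above, and uses made-up toy numbers; the function names are hypothetical.

```python
def total_output(sizes, talents):
    """Expected aggregate output, the sum of S_n * T_n over matched pairs
    (position k of each list describes one firm-CEO match)."""
    return sum(s * t for s, t in zip(sizes, talents))

def matching_losses(sizes, talents):
    """Return (actual_loss, maximal_loss) relative to the efficient matching.

    sizes, talents : actual firm sizes and the talent of each firm's
                     current CEO, listed in the observed pairing
    """
    by_size = sorted(sizes)
    # Efficient matching: largest firm with the most talented manager
    efficient = total_output(by_size, sorted(talents))
    # Worst matching: largest firm with the least talented manager
    worst = total_output(by_size, sorted(talents, reverse=True))
    actual = total_output(sizes, talents)
    return efficient - actual, efficient - worst

# Toy example with three firms (illustrative values only)
actual_loss, maximal_loss = matching_losses([3.0, 1.0, 2.0], [1.1, 1.0, 1.2])
```

Because output is multiplicative in size and talent, positive sorting maximizes and negative sorting minimizes aggregate output, so the actual loss always lies between zero and the maximal loss.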

loss from ignoring moral hazard

Finally, I conduct a counterfactual to quantify the loss that all firms would incur from ignoring moral hazard. It is measured by the difference between the expected output when managers pursue the firms' interests and that when managers pursue their own interests. The summary statistics on this loss are displayed at the bottom of Table (1.8). The total loss that firms would incur from ignoring moral hazard exceeds two trillion dollars, which is about 19.69% of the market capitalization of the 1000 largest firms. This is the benefit to all firms from motivating their CEOs. It is consistent with the finding in Gayle and Miller (2009). In their paper, the estimated total loss of firms from ignoring moral hazard, for 3026 firms over a sample period beginning in 1992, accounts for 19.32% of the total market capitalization of those firms.

discussion

The above counterfactuals show that the aggregate loss firms incur as a result of talent misallocation is more than four times as large as the loss due to the standard risk-sharing inefficiency when moral hazard is present. Most previous studies on moral hazard focus only on the loss of firms from risk-sharing inefficiency, which is commonly considered to be the agency cost of moral hazard for firms. The results suggest that studies of the agency costs of moral hazard may severely underestimate the efficiency loss if they focus only on risk-sharing inefficiency and ignore matching inefficiency. The sum of the losses from risk-sharing and matching inefficiency is the aggregate cost that firms incur by using optimal contracting to solve the moral hazard problem. Corporate governance, such as board monitoring, is considered a substitute mechanism for reducing the costs of moral hazard. Understanding the aggregate costs associated with moral hazard therefore gives some guidance on implementing corporate governance. On the other hand, the total cost of both risk-sharing and matching inefficiencies associated

with moral hazard is very small compared with the substantial benefits from motivating managers to pursue the interests of shareholders. This suggests that aligning managers with the objective of shareholders, instead of their own, is extremely beneficial to firms.

1.6 Conclusion

This paper quantified the magnitude of inefficiency in the equilibrium allocation of CEOs to firms. I developed an estimable model which illustrated that the presence of moral hazard not only leads to a risk-sharing inefficiency, but also creates an inefficient allocation of CEOs to firms in equilibrium. A new empirical method was proposed to estimate the model primitives in a matching framework with asymmetric information. The method was applied to estimate the model using data on the U.S. market for CEOs. Using the estimates, I quantified the magnitude of inefficiency in the equilibrium allocation of CEOs to firms caused by moral hazard. I found that this inefficiency is more than four times as large as the efficiency loss from risk-sharing due to moral hazard. The findings suggest that studies focusing solely on risk-sharing can severely underestimate the efficiency loss, and that future work should consider the allocation inefficiency caused by moral hazard.

The methodology developed in this paper has several potential extensions. The first would be to incorporate a more realistic, dynamic contracting problem into the matching model of CEOs and firms, where the dynamic contracts offered to CEOs consist of a sequence of wages and effort levels. Second, the methodology could also be applied to other markets with moral hazard, including labor markets in which employers hire employees whose actions are not perfectly supervised, and capital markets in which venture capitalists invest in entrepreneurial companies whose management cannot be monitored.

1.7 Appendices

1.7.1 Appendix A: Proofs

Proof of Lemma 1:

Proof. First define $\nu_n(x) \equiv \left[ \frac{\alpha_{n2}}{\alpha_{m0}} w_n(x) \right]^{1-\rho}$; then the participation constraint (1.3) can be rewritten as

$$E[\nu_n(x)] \le 1. \qquad (1.33)$$

Similarly, the incentive compatibility constraint (1.4) for working can be rewritten as

$$E[\nu_n(x)] \le (\alpha_{n1}/\alpha_{n2})^{1-\rho} E[\nu_n(x) g_n(x)]. \qquad (1.34)$$

Consequently, minimizing expected compensation subject to (1.3) and (1.4) is equivalent to minimizing $E[\nu_n(x)^{1/(1-\rho)}]$ subject to (1.33) and (1.34). To solve this problem, I choose $\nu_n(x)$ to optimize the following Lagrangian,

$$E[\nu_n(x)^{1/(1-\rho)}] + \theta_0 E[\nu_n(x) - 1] + \theta_1 E[\nu_n(x) - (\alpha_{n1}/\alpha_{n2})^{1-\rho} \nu_n(x) g_n(x)],$$

where $\theta_0$ and $\theta_1$ are the Lagrange multipliers for the participation and incentive compatibility constraints. The first-order condition is then given by

$$\nu_n(x)^{\rho/(1-\rho)} = \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)]. \qquad (1.35)$$

From the definition of $\nu_n(x)$, we know $w_n(x) = (\alpha_{m0}/\alpha_{n2}) \nu_n(x)^{1/(1-\rho)}$. Substituting the first-order condition into it, we obtain

$$w_n^*(x) = (\alpha_{m0}/\alpha_{n2}) \{ \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)] \}^{1/\rho},$$

which is the optimal compensation equation (1.5).

The two equations determining the Lagrange multipliers $\theta_0$ and $\theta_1$ in Lemma 1 are derived by exploiting the participation and incentive compatibility constraints. First, multiplying both sides of (1.35) by $\nu_n(x)$ and taking expectations yields

$$E[\nu_n(x)^{1/(1-\rho)}] = \theta_0 E[\nu_n(x)], \qquad (1.36)$$

which implies $\theta_0 > 0$, so the participation constraint (1.33) holds with equality. Solving (1.35) for $\nu_n(x)$ and substituting it into the binding (1.33) gives (1.6). The incentive compatibility constraint also holds with equality, which is shown by contradiction. If we set $\theta_1 = 0$ in (1.35), then $\nu_n(x)^{\rho/(1-\rho)}$ is constant and the optimal wage is fixed. This contradicts the requirement that the optimal compensation contract be tied to $x$. Similarly, solving (1.35) for $\nu_n(x)$ and substituting it into the binding (1.34) gives (1.7).

Derivation of the Expected Utility:

From Lemma 1, the optimal compensation of firm $n$'s manager is given by

$$w_n^*(x) = (\alpha_{m0}/\alpha_{n2}) \{ \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)] \}^{1/\rho}. \qquad (1.37)$$

Taking logs on both sides of (1.37) yields

$$\ln[w_n^*(x)] = \ln(\alpha_{m0}/\alpha_{n2}) + \frac{1}{\rho} \ln\{ \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)] \}. \qquad (1.38)$$

Taking expectations on both sides of (1.37) gives

$$E[w_n^*(x)] = (\alpha_{m0}/\alpha_{n2}) E\{ [\theta_0 + \theta_1 (1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x))]^{1/\rho} \}. \qquad (1.39)$$

The expected utility of firm $n$'s manager is written as

$$EU_n = \frac{\alpha_{n2}^{1-\rho}}{1-\rho} E\{ [w_n^*(x)]^{1-\rho} \}$$
$$= \frac{\alpha_{n2}^{1-\rho}}{1-\rho} E\{ e^{(1-\rho) \ln[w_n^*(x)]} \}$$
$$= \frac{\alpha_{n2}^{1-\rho}}{1-\rho} E\{ e^{(1-\rho) [\ln(\alpha_{m0}/\alpha_{n2}) + \frac{1}{\rho} \ln\{ \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)] \}]} \}$$
$$= \frac{\alpha_{n2}^{1-\rho}}{1-\rho} e^{(1-\rho) \ln(\alpha_{m0}/\alpha_{n2})} E\{ e^{\frac{1-\rho}{\rho} \ln\{ \theta_0 + \theta_1 [1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x)] \}} \}.$$

Letting $\chi_n = \ln\{ E\{ [\theta_0 + \theta_1 (1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x))]^{1/\rho} \} \}$, and using equation (1.6), the

expected utility can be rewritten as

$$EU_n = \frac{\alpha_{n2}^{1-\rho}}{1-\rho} e^{(1-\rho) [\ln(\alpha_{m0}/\alpha_{n2}) + \chi_n - \chi_n]}$$
$$= \frac{\alpha_{n2}^{1-\rho}}{1-\rho} e^{(1-\rho) \{ \ln[(\alpha_{m0}/\alpha_{n2}) E\{ [\theta_0 + \theta_1 (1 - (\alpha_{n1}/\alpha_{n2})^{1-\rho} g_n(x))]^{1/\rho} \}] - \chi_n \}}$$
$$= \frac{\alpha_{n2}^{1-\rho}}{1-\rho} e^{(1-\rho) ( \ln\{ E[w_n^*(x)] \} - \chi_n )}$$
$$= \frac{\{ \alpha_{n2} E[w_n^*(x)] e^{-\chi_n} \}^{1-\rho}}{1-\rho}.$$

The idea of the above derivation follows Edmans and Gabaix (2011a).

Interpretation of the Sorting Conditions:

Consider a market for CEOs with two firms and two managers. Assume that firm 1 has a larger actual size than firm 2, $S_1 > S_2$, and that manager 1 is more talented than manager 2, $T_1 > T_2$. If we observe that firm 1 matches with manager 1 and firm 2 matches with manager 2, it must be that the profits firm 1 obtains from hiring manager 1 are not less than those from hiring manager 2, and that the analogous condition holds for firm 2. Formally, these conditions are written as

$$S_1 T_1 - E(w_1^1) \ge S_1 T_2 - E(w_1^2),$$
$$S_2 T_2 - E(w_2^2) \ge S_2 T_1 - E(w_2^1),$$

where $E(w_n^m)$ is the expected wage firm $n$ would pay to manager $m$. Using the definition of the effective wage, we have $E(w_n^m) = v_m e^{\chi_n}$. Thus the above two inequalities can be rewritten as follows,

$$S_1 T_1 - v_1 e^{\chi_1} \ge S_1 T_2 - v_2 e^{\chi_1},$$
$$S_2 T_2 - v_2 e^{\chi_2} \ge S_2 T_1 - v_1 e^{\chi_2}.$$

Rearranging them yields

$$S_1 e^{-\chi_1} (T_1 - T_2) \ge v_1 - v_2,$$
$$v_1 - v_2 \ge S_2 e^{-\chi_2} (T_1 - T_2).$$

Since $T_1 > T_2$, we have

$$S_1 e^{-\chi_1} (T_1 - T_2) \ge v_1 - v_2 \ge S_2 e^{-\chi_2} (T_1 - T_2),$$

which implies that firm 1's effective size $S_1 e^{-\chi_1}$ must be no less than firm 2's effective size $S_2 e^{-\chi_2}$. We also know that in equilibrium firm 1 matches with manager 1 and firm 2 matches with manager 2. Thus the firm with the larger effective size matches with the more talented manager in equilibrium. The matching is perfect sorting by firms' effective size and managerial talent. These conditions are also derived in Edmans and Gabaix (2011a).

Derivation of equations (1.22)-(1.23):

Proof. Equation (1.22), which determines $g(x)$, is derived by using the optimal wage equation (1.5), assumption (A3), and the fact that $E[g(x)] = 1$. Suppressing the firm index $n$, the optimal wage equation (1.5) can be rewritten as

$$w(x) = (\alpha_0/\alpha_2) \{ \theta_0 + \theta_1 [1 - (\alpha_1/\alpha_2)^{1-\rho} g(x)] \}^{1/\rho}.$$

Defining $\lambda(x) \equiv \left[ \frac{w(x)}{\alpha_0/\alpha_2} \right]^{\rho}$, the above equation is written as

$$\lambda(x) = \theta_0 + \theta_1 [1 - (\alpha_1/\alpha_2)^{1-\rho} g(x)]. \qquad (1.40)$$

From assumption (A3), we know that $\lim_{x \to \infty} g(x) = 0$. Taking limits on both sides of (1.40) yields

$$\bar{\lambda} \equiv \lim_{x \to \infty} \lambda(x) = \theta_0 + \theta_1. \qquad (1.41)$$

Note that $E[g(x)] = 1$. Taking expectations on both sides of (1.40) gives

$$E[\lambda(x)] = \theta_0 + \theta_1 [1 - (\alpha_1/\alpha_2)^{1-\rho}]. \qquad (1.42)$$

Subtracting equation (1.40) from equation (1.41) gives

$$\bar{\lambda} - \lambda(x) = \theta_1 (\alpha_1/\alpha_2)^{1-\rho} g(x). \qquad (1.43)$$

Similarly, subtracting equation (1.42) from equation (1.41) gives

$$\bar{\lambda} - E[\lambda(x)] = \theta_1 (\alpha_1/\alpha_2)^{1-\rho}. \qquad (1.44)$$

Dividing equation (1.43) by equation (1.44) yields

$$g(x) = \frac{\bar{\lambda} - \lambda(x)}{\bar{\lambda} - E[\lambda(x)]}. \qquad (1.45)$$

On the other hand, from the definition $\lambda(x) \equiv \left[ \frac{w(x)}{\alpha_0/\alpha_2} \right]^{\rho}$, we know that $\bar{\lambda} = \left[ \frac{\bar{w}}{\alpha_0/\alpha_2} \right]^{\rho}$ and $E[\lambda(x)] = E\left\{ \left[ \frac{w(x)}{\alpha_0/\alpha_2} \right]^{\rho} \right\}$, where $\bar{w}$ is the maximum wage. Substituting them into equation (1.45) gives equation (1.22).

Equation (1.23), which determines the nonpecuniary benefit of outside options versus working, $\alpha_0/\alpha_2$, can be derived directly from the participation constraint (1.3) holding with equality. Rearranging it gives

$$\alpha_0/\alpha_2 = \{ E[w(x)^{1-\rho}] \}^{\frac{1}{1-\rho}},$$

which is exactly equation (1.23).

To derive (1.24), rearranging equation (1.44) and using equation (1.41) gives

$$\alpha_1/\alpha_2 = \left\{ \frac{\bar{\lambda} - E[\lambda(x)]}{\theta_1} \right\}^{\frac{1}{1-\rho}} = \left\{ \frac{\bar{\lambda} - E[\lambda(x)]}{\bar{\lambda} - \theta_0} \right\}^{\frac{1}{1-\rho}}. \qquad (1.46)$$

From equation (1.36) in the proof of Lemma 1, we have

$$\theta_0 = \frac{E[w(x)]}{E[w(x)^{1-\rho}] (\alpha_0/\alpha_2)^{\rho}}.$$

Plugging it into (1.46) gives equation (1.24).

Derivation of (1.26) and (1.27):

Proof. Since each firm's effective output is divided between the manager and the firm, the firm's effective output equals the sum of the manager's effective wage and the firm's effective profit. I also know that the firm whose effective size has quantile $r$ matches with the manager whose talent has quantile $r$, which gives

$$v(r) + \pi(r) = \tilde{S}(r) T(r). \qquad (1.47)$$

Recall that the slope of the managerial effective wage is

$$v'(r) = \tilde{S}(r) T'(r), \qquad (1.48)$$

and the slope of the firm's effective profit is

$$\pi'(r) = \tilde{S}'(r) T(r). \qquad (1.49)$$

Dividing (1.48) by (1.47), I have

$$\frac{T'(r)}{T(r)} = \frac{v'(r)}{v(r) + \pi(r)}. \qquad (1.50)$$

Integrating it from quantile 0 to any quantile $r$ yields (1.27). Similarly, dividing (1.49) by (1.47), I have

$$\frac{\tilde{S}'(r)}{\tilde{S}(r)} = \frac{\pi'(r)}{v(r) + \pi(r)}. \qquad (1.51)$$

Integrating it from quantile 0 to any quantile $r$ yields (1.26).

The idea of this proof is based on Tervio (2008).
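The two integrals in (1.26) and (1.27) can be evaluated numerically once the effective wage and effective profit are available on a grid of quantiles. The Python sketch below is one possible discretization under stated assumptions, not the thesis's own code: it approximates each segment's integral by the change in the numerator divided by the average total effective output on that segment, and the function name is hypothetical.

```python
import math

def relative_quantile_functions(v, pi):
    """Discretized relative quantile functions of effective size and talent.

    v, pi : effective wage and effective profit on an increasing grid of
            quantiles, ordered from the lowest rank to the highest
    Returns (s_rel, t_rel), approximating S(r)/S(0) and T(r)/T(0) for
    effective size S and talent T at each grid point.
    """
    log_s, log_t = [0.0], [0.0]
    for k in range(1, len(v)):
        # Average total effective output v + pi on the segment
        avg_total = 0.5 * ((v[k] + pi[k]) + (v[k - 1] + pi[k - 1]))
        # Integral of pi'/(v + pi) and v'/(v + pi) over the segment
        log_s.append(log_s[-1] + (pi[k] - pi[k - 1]) / avg_total)
        log_t.append(log_t[-1] + (v[k] - v[k - 1]) / avg_total)
    return [math.exp(a) for a in log_s], [math.exp(b) for b in log_t]
```

A useful consistency check follows from (1.47): the product of the two relative quantile functions at any quantile should equal total effective output at that quantile divided by total effective output at quantile 0, up to discretization error.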

1.7.2 Appendix B: Tables and Graphs

Table 1.1: Numerical Example 2: Matching Pattern
            0%-20%    20%-40%   40%-60%   60%-80%   80%-100%
0%-20%
20%-40%
40%-60%
60%-80%
80%-100%

Table 1.2: Cross-sectional Information on Components of Compensation by Sectors
(In millions of US $ (2011); standard deviations in parentheses)
Variable                           Primary    Consumer   Services   All
Observations
Salary and bonus                   (0.92)     (1.49)     (0.70)     (1.01)
Value of options granted           (1.77)     (3.43)     (1.85)     (2.28)
Value of restricted stock granted  (2.50)     (3.90)     (18.56)    (12.22)
Other compensation                 (3.27)     (4.16)     (2.08)     (2.73)
Total compensation                 (5.17)     (7.70)     (18.86)    (13.10)

Table 1.3: Summary Description on Abnormal Returns in 2011
Variable           Sector     Mean   Std   Min   Max
Abnormal returns   All
                   Primary
                   Consumer
                   Services

Table 1.4: Cross-sectional Information on Firm Characteristics by Sectors
(Assets, market value and sales in billions of US $ (2011); employees in thousands; standard deviations in parentheses)
Variable            Primary    Consumer   Services   All
Assets              (45.27)    (27.11)    (171.04)   (114.76)
Market value        (28.25)    (26.70)    (32.65)    (29.83)
Sales               (21.38)    (26.86)    (14.69)    (17.56)
Employee            (39.33)    (161.89)   (44.20)    (86.67)
Debt equity ratio   (3.92)     (39.72)    (7.66)     (19.63)

Table 1.5: Sample Correlations
Variables: CEO compensation, Market value, Assets, Sales, Employee, Abnormal returns

Table 1.6: Parameter Estimates of the Returns Distributions
Parameter   Description                         Variable            Estimate   Standard Error
σ²          Percent variance of return          Constant
                                                Debt equity ratio
                                                Log Employees
                                                Consumer dummy
                                                Services dummy
µ_1         Percent mean return from shirking   Constant
                                                Debt equity ratio
                                                Log Employees
                                                Consumer dummy
                                                Services dummy
ψ           Lower bound of return

57 Table 1.7: Nonpecuniary Benefits Relative to Diligence

| Parameter | Description | Variable | Estimate | Standard Error |
|---|---|---|---|---|
| $\alpha_{n0}/\alpha_{n2}$ | Nonpecuniary benefits from outside option | Constant | | |
| | | Log Assets | | |
| | | Log Employees | | |
| $\alpha_{n1}/\alpha_{n2}$ | Nonpecuniary benefits from shirking | Constant | | |
| | | Log Assets | | |
| | | Log Employees | | |
| $\rho$ | Risk aversion parameter | | | |

Table 1.8: Losses of Talent Misallocation Related to Moral Hazard (Billions)

| Losses | Mean (Million) | Standard deviation (Million) | Total (Billion) |
|---|---|---|---|
| Actual loss from risk-sharing inefficiency | | | |
| Actual loss from matching inefficiency | | | |
| Maximal loss from matching inefficiency | | | |
| Loss from ignoring moral hazard | | | |

58 Figure 1.1: The Numerical Example: Risk Premium $\chi_n$
Figure 1.2: The Numerical Example 2: Loss from Mismatch
Figure 1.3: Relation of CEO Compensation and Firm Rank by Market Value in

59 Figure 1.4: Estimated Distribution Density of Firms' Effective Size $S(r)/S(0)$
Figure 1.5: Estimated Distribution Density of Managerial Talent $T(r)/T(0)$
Figure 1.6: Relation of Ranks by Firm Actual Size and Managerial Talent

60 Figure 1.7: Relation of Ranks by Firm Actual Size and the Risk Premium e χn 53

61 Chapter 2
Identification of Network Effects Using All The Economics (with Yao Luo)
2.1 Introduction
There is a growing literature on network effects. The majority of papers consider the linear-in-means model. For such a model, the seminal paper by Manski (1993) shows that it is difficult to distinguish between exogenous effects (i.e., the influence of exogenous peer characteristics), endogenous effects (i.e., the influence of peer behavior), and correlated effects (i.e., individuals in the same group tend to behave similarly). A number of studies have addressed this problem by introducing known nonlinearity (Brock and Durlauf (2001)), using a dynamic analog (Brock and Durlauf (2001)), or exploiting social network structures (Bramoullé, Djebbari, and Fortin (2009)).
In this paper, we propose a new approach to identify models with network effects by invoking another side of the market. While the previous literature focuses on interactions between group members, it is frequently the case that group members also interact with another side of the market. For instance, consumers are affected not only by other consumers in their group but also by the seller's price schedule. Firms are affected not only by other firms in their industry but also by the regulator's decisions. The additional side's decisions, i.e., the seller's price schedule or the regulator's decisions, naturally endogenize interactions between group members. Thus, using all the economics of a general equilibrium model provides a new source of identifying information.¹ When another side of the market is invoked, the analyst may identify the interaction between group members.
¹ A similar idea was used by Ekeland, Heckman, and Nesheim (2004) in hedonic models.

62 We construct a nonlinear pricing model with network effects to illustrate our idea, which should be generally applicable to other models with network effects. In a simplified version, the seller designs a tariff schedule $T(\cdot)$ to maximize its profit, while buyers choose consumption quantities to maximize their payoffs,
\[ \max_q \; \theta V(q) - T(q), \]
where $\theta \equiv \rho(E(q|x), x, z, \epsilon)$ represents the buyer's private information, an aggregation of all the information that the seller does not know or cannot use to screen buyers. Here $x$ is a vector of observable group-specific characteristics, $z$ is a vector of observable individual-specific characteristics, and $\epsilon$ represents all the information that even the seller does not observe. $E(q|x)$ is the average choice in group $x$. In sum, $\theta$ aggregates the following effects: exogenous effects $z$, endogenous effects $E(q|x)$ and correlated effects $\epsilon$. $V(q)$ is the base utility function. Using standard techniques from the nonlinear pricing literature, we derive the first-order conditions for both the seller and the buyers.
Identification is achieved in two steps. First, we show that the one-to-one mapping between private information $\theta$ and quantity purchased $q$ can be written in terms of observables by using both the seller's and the buyers' first-order conditions. In other words, we learn about $\theta$ by using all the economics of the nonlinear pricing model. Thus, our problem reduces to identifying the function $\rho(\cdot)$. Second, $\rho(\cdot)$ is identified nonparametrically under certain normalization assumptions, following the arguments of Matzkin (2003). $V(\cdot)$ is also identified in this step.
Our structural approach has several advantages. First, our two-step identification strategy avoids invoking the reduced form and thus avoids the reflection problem. While invoking the seller's optimality condition allows us to recover buyers' tastes without specifying their network structure, the second step is a standard nonlinear regression problem. As a result, it is relatively straightforward to consider multiple channels for social interactions, which is rarely attempted in the literature. Another advantage of our approach is that we identify

63 network effects nonparametrically. In contrast, it is still unknown how to identify the partially linear-in-means model when the nonlinearity is unknown (Brock and Durlauf (2001)).
We use a running empirical application to illustrate our idea. While the literature has focused on how growing network effects may boost a market, we structurally estimate our model to study the consequences of shrinking network effects in a declining industry, the print yellow pages advertising industry. The publisher designs a price schedule for each directory to maximize its profit, taking into account that businesses have heterogeneous tastes for advertising in yellow pages. Heterogeneity arises both from exogenous factors and from endogenous factors capturing industry level network effects, such as purchases of businesses in the neighborhood or in the same industry. A typical business's payoff depends not only on its own purchase and taste but also on the total advertising in the directory. The latter represents the market level network effects. Using data from multiple markets, we exploit both the publisher's and the businesses' first-order conditions to identify both levels of network effects.
We estimate our model using data on purchases and price schedules from 7 directories in Toronto. Our estimation results are consistent with our theoretical specifications. We find positive network effects at both levels, which strengthens the findings in the seminal paper by Rysman (2004). Businesses obtain a large amount of informational rent as the publisher tries to screen heterogeneous businesses through nonlinear pricing. Informational rents account for approximately 90% of total revenues. With the estimated model primitives at hand, we re-solve the model after shutting down market level network effects and all network effects, respectively. Our results show that shutting down market level network effects leads to a 26.36% decrease in the publisher's revenue, a 26.93% decrease in businesses' surplus and a 9.13% decrease in total purchase. Shutting down all network effects leads to a 46.42% decrease in the publisher's revenue, a 43.94% decrease in businesses' surplus and a 17.24% decrease in total purchase. The

64 publisher charges the lowest price when there are no network effects and the highest price when both levels of network effects are present. In sum, network effects have important effects on the publisher's pricing strategy, businesses' purchasing behavior, and market outcomes in equilibrium.
Literature
This paper is closely related to Rysman (2004) but distinguishes itself in several important ways. First, we hand-collected individual level data on advertisement purchases, which allows us to model businesses as heterogeneous instead of homogeneous as in Rysman (2004). Second, instead of a quantity-setting model, we employ a nonlinear pricing model in which the publisher designs a price schedule to maximize its profit and businesses choose the quantity of advertising to maximize their payoffs. Third, we show that our model is identified by simultaneously exploiting the publisher's and the businesses' first-order conditions. Specifically, we show that the one-to-one mapping between the unobserved taste and the quantity purchased can be written in terms of observables by using both first-order conditions. The previous literature mostly uses discrete choice models and relies on instrumental variables to identify the parameters. For instance, Rysman (2004) derives the consumers', businesses' and publisher's first-order conditions and finds instrumental variables for each equation.
Our paper also contributes to the literature on the structural analysis of nonlinear pricing. See, for instance, Miravete (2002), Miravete and Röller (2003) and Crawford and Shum (2007). The two most closely related papers are Perrigne and Vuong (2013) and Luo (2011). While Perrigne and Vuong (2013) provide a methodology for the analysis of nonlinear pricing data with a known price schedule, Luo (2011) proposes an alternative method when the price schedule needs to be estimated. Our paper generalizes their results to incorporate important network effects. While they use data from one market, we exploit the variation in price schedules and purchases across markets.
The rest of this paper is organized as follows. Section 2 presents the data on yellow pages

65 advertising. Section 3 specifies the model, while Section 4 establishes its nonparametric identification and develops a semiparametric estimation procedure. Section 5 presents our estimation results and counterfactuals. Section 6 concludes.
2.2 Yellow Pages Advertising Data in Toronto
We collect data on print yellow pages advertising in 2011 for the city of Toronto.² In Toronto there is only one publisher of print yellow pages directories, the Yellow Pages Group. The company publishes a print yellow pages directory for each of the 7 non-overlapping districts of Toronto and distributes it free of charge to the households living in the corresponding district. The map in Figure (2.1) shows these 7 districts, which are Core Center (C), Core West (W), Core Northeast (NE), Core Southeast (SE), Etobicoke (E), North (N) and Scarborough (S). The publisher collects revenue by selling advertising in these yellow pages directories to businesses. The price businesses pay depends on the sizes and categories they use. We collect our data from two sources: first, we manually read off businesses' advertisement purchases from the directories; second, the price schedule for each directory is collected from the Local Search Association.³
Advertisements in yellow pages directories differ by size, color and other special features, which gives businesses numerous possible combinations of advertising options to choose from.⁴ For instance, we observe 212 and 225 advertising options chosen by businesses, which lead to corresponding numbers of different prices paid by businesses in the SE and NE yellow pages directories, respectively. Following Perrigne and Vuong (2013) and Busse and Rysman (2005), we estimate a nonlinear tariff function for each
² A growing competitor to print yellow pages advertising is the internet, through search engines and internet yellow pages advertising. Print yellow pages are predicted to disappear in the future. However, the industry still remains very strong. As of the 2007 Canadian Business Usage Study, 71% of Canadians consulted their print yellow pages directories every month. 30 million print directories were distributed annually in Canada as of 2007 (comScore Media Metrix, Nov. 2007). Moreover, according to the Newsletter by Local Search Association (2011), print yellow pages remain people's first choice for searching for local businesses.
³ The association was formerly named the Yellow Pages Association.
⁴ See Perrigne and Vuong (2013) for details.

66 directory and construct quality-adjusted quantities for purchases.
Regarding price schedules, two features are worth noting. The first is that the schedules display a nonlinear pattern, as reported in earlier papers (e.g., Perrigne and Vuong (2013) and Busse and Rysman (2005)). The more businesses buy, the larger the discounts they get. For instance, for multi-color display advertisements in the NE yellow pages directory, the price per square pica is $11.1, $9.22 and $4.48 for the smallest size, half page and double page advertisements, respectively. The same pattern is also observed in other categories. This feature motivates us to employ a nonlinear pricing model as in Perrigne and Vuong (2013). The second feature is that prices and discounts vary across markets, which provides important variation for identifying our model primitives, as we discuss later. For example, the prices are $1935.6, $ and $ for the smallest size, half page and double page multi-color display advertisements in the SE yellow pages directory, while the corresponding prices are $2304, $ and $ in the NE yellow pages directory. Advertisements are more expensive in the NE yellow pages directory. On the other hand, more residents live in the northeast area than in the southeast area. Moreover, as we show later, a larger quantity of advertising is purchased in the NE directory than in the SE directory. These two facts suggest that network effects may exist in the yellow pages advertising markets of Toronto. The publisher exploits the network effects to charge a higher price for advertising in the NE directory.
Because manually collecting businesses' purchases from the directories is laborious, we read off purchase data from two directories, NE and SE, to illustrate our empirical methodology.
We collect a total of 6903 advertisements over 1457 industry headings and 7204 advertisements over 1454 industry headings from the SE and NE yellow pages directories, respectively. The publisher collected revenue of $5.38 million and $6.98 million from selling advertisements in the SE and NE directories, respectively. Tables (2.1) and (2.2) display the top 10

67 industry headings, which represent about 20% of the total revenue from both directories. These industries include professionals and household services. Within each of these industry headings, businesses in the industries with a larger total size of advertisements tend to buy larger advertisements themselves. This feature suggests that there might be interactions between businesses within industries, which motivates us to analyze industry level network effects.
2.3 The Model
Our model is based on the nonlinear pricing model developed by Maskin and Riley (1984). The principal is the yellow pages directory publisher and the agents are businesses. Consumers receive yellow pages directories for free, while display advertising is sold to businesses.⁵ Businesses are heterogeneous and characterized by a scalar taste type for advertising $\theta \in [\underline{\theta}, \bar{\theta}]$ with $0 < \underline{\theta} < \bar{\theta} < \infty$. The scalar business type is $\theta \equiv \rho(X, \epsilon)$, which aggregates all of the business's observable ($X$) and unobservable ($\epsilon$) taste information that the publisher cannot discriminate on. The vector $X$ includes exogenous observable group-specific characteristics $x$ such as industry, exogenous individual-specific characteristics $z$ such as the business's size and age, and endogenous variables $E(q|x)$ describing various channels of network effects. Network effects arise from businesses' locations (network of neighborhood), industries (network of competitors), suppliers (network of upstream firms) and so on. Each business knows its own type $\theta$, but the publisher knows only the distribution of $\theta$, $F(\cdot)$. This may arise because the publisher does not know how profitable yellow pages advertising is to businesses, or because the publisher knows but cannot use this information to screen businesses. $F(\cdot)$ is assumed to be strictly increasing and absolutely continuous, and therefore the corresponding density function $f(\cdot)$ exists and is strictly positive for all $\theta \in [\underline{\theta}, \bar{\theta}]$. In addition, $[1 - F(\theta)]/f(\theta)$, the reciprocal of the hazard rate, is assumed to be
⁵ As shown in Halaburda and Yehezkel (2011), this so-called divide-and-conquer strategy is a consequence of asymmetric information: the publisher finds it optimal to attract the side with the lower information problem, the consumer.

68 non-increasing for all $\theta$. The total number of businesses in the market is normalized to one. The use of the divide-and-conquer strategy implies that we can focus on the interaction between the publisher and the businesses. Following Shearer (2004), we specify the timing of the interaction between the publisher and businesses as follows: (i) the publisher proposes a menu of quantity and payment pairs $[q(\cdot), T(\cdot)]$; (ii) businesses observe the menu, form their expectations about the average purchases in their neighborhoods $E(q|x)$ and the total quantity $Q$, and form their tastes based on their information and expectations; (iii) each business decides which quantity to purchase based on its type and expectations, and then payoffs are realized. Hereafter, we omit market level covariates for ease of exposition.
Businesses
The payoff of a type-$\theta$ business is represented by the separable function $U(q, \theta, Q) - T(q)$, where $q$ is the quantity of advertising purchased by the business, $Q$ is the total quantity of advertising purchased by all businesses in the market, and $T(q)$ is the price paid by a business purchasing $q$. By including $Q$ in the utility function, we adopt a macro approach (Economides (1996)) to the network effects arising from the consumer side, which we label market level network effects: consumers use the yellow pages more (or less) often when they provide more information, i.e., when $Q$ is larger. Recall that $\theta$ is an aggregation of exogenous characteristics, endogenous characteristics and unobservable taste information. We label the endogenous component of $\theta$ industry level network effects.
We can pick a market with total purchase $Q^*$ as our benchmark market and denote $V(q, \theta) \equiv U(q, \theta, Q^*)$. We call $V(q, \theta)$ the intrinsic utility from consuming $q$ units of advertising for a type-$\theta$ business. For any given $Q$, the difference between the total utility and the intrinsic utility, $U(q, \theta, Q) - V(q, \theta)$, is referred to as the network utility from consuming $q$ units of advertising for a type-$\theta$ business. The reservation utility of a type-$\theta$ business is

69 assumed to be 0. The utility function $U(q, \theta, Q)$ is assumed to have the following standard properties.
Assumption 4. The utility function $U(q, \theta, Q)$ is continuously differentiable for any $q \in [0, +\infty)$, $\theta \in [\underline{\theta}, \bar{\theta}]$ and $Q \in [0, +\infty)$ and satisfies (i) $U_q(q, \theta, Q) \ge 0$, $U_{qq}(q, \theta, Q) \le 0$; (ii) $U_\theta(q, \theta, Q) > 0$, $U_{\theta\theta}(q, \theta, Q) \le 0$; (iii) $U_{q\theta}(q, \theta, Q) > 0$.
Assumption 4-(i) says that the marginal utility is nonnegative and decreasing. (ii) says that a business with a higher taste $\theta$ gets a larger utility and that this increase diminishes as $\theta$ increases. (iii) is the standard Spence-Mirrlees single-crossing condition, which requires that a business with a higher taste $\theta$ also enjoys a larger marginal payoff at every $q$.
Publisher
The publisher designs a price schedule $T(\cdot)$ to maximize its profit
\[ \Pi = \int_{\underline{\theta}}^{\bar{\theta}} T\big(q(\theta)\big) f(\theta)\,d\theta - C\Big(\int_{\underline{\theta}}^{\bar{\theta}} q(\theta) f(\theta)\,d\theta\Big), \]
where the first term is the revenue collected from all the businesses buying advertising and the second term is the cost of producing the total advertising quantity. To publish the yellow pages, the publisher incurs a cost, which is a function of the total quantity purchased $Q = \int_{\underline{\theta}}^{\bar{\theta}} q(\theta) f(\theta)\,d\theta$. We assume that the cost function is increasing, i.e., $C'(\cdot) \ge 0$.
For simplicity, we assume that all businesses in the market purchase advertising in the yellow pages, i.e., full coverage. The theoretical literature on nonlinear pricing usually makes this assumption in order to simplify the analysis or to make sharp predictions when the model is hard to solve in general (see, e.g., Rochet and Stole (2002) and Armstrong and Vickers (2001), who study nonlinear pricing models with competition). Since both the standard listing and the minimum payment are very small (for instance, the minimum payment is only 0.3% of

70 the maximum one), full coverage is a reasonable approximation in our case.⁶
Solving the model
Since the publisher and all businesses share the same relevant information, it is reasonable to assume that they form the same expectations. As is standard practice in the social economics literature, we solve the model for self-fulfilling equilibria in two steps. First, we find the publisher's optimal price schedule for any expected total quantity $Q$. Second, we find the self-fulfilling total quantity $Q^*$ and the corresponding optimal price schedule in equilibrium. Standard techniques give not only the necessary conditions that the publisher's optimal price schedule should satisfy in the self-fulfilling equilibrium, but also the existence and uniqueness conditions of such an equilibrium. These conditions are formalized in the following proposition.
Proposition 1. (a) Under assumptions (1) and (2), the publisher's optimal price schedule $[q^*(\cdot), T^*(\cdot)]$ in the self-fulfilling equilibrium satisfies the following conditions
\[ U_1\big(q^*(\theta), \theta, Q^*\big) = C'(Q^*) + \frac{1 - F(\theta)}{f(\theta)}\, U_{12}\big(q^*(\theta), \theta, Q^*\big), \tag{2.1} \]
\[ T^{*\prime}\big(q^*(\theta)\big) = U_1\big(q^*(\theta), \theta, Q^*\big), \tag{2.2} \]
where $q^*(\theta) = q^*(\theta, Q^*)$ and $Q^* = \int_{\underline{\theta}}^{\bar{\theta}} q^*(\theta) f(\theta)\,d\theta$. (b) If $U_1(q, \theta, Q)$ is bounded for all $q, \theta, Q$, then an optimal self-fulfilling price schedule always exists. Moreover, if $U_{13}(q, \theta, Q) < -U_{11}(q, \theta, Q)$ for all $q, \theta, Q$, then (2.1) and (2.2) specify the unique optimal self-fulfilling price schedule for the publisher.
Equations (2.1) and (2.2) characterize the optimal selling mechanism $[q^*(\cdot), T^*(\cdot)]$. Equation (2.1) says that the total marginal payoff for each type equals the marginal cost plus a nonnegative distortion term arising from incomplete information. The Spence-Mirrlees single-crossing condition implies that all businesses, except for the highest type businesses, buy less
⁶ Perrigne and Vuong (2013) consider a nonlinear pricing model with optimal exclusion. The publisher optimally chooses a threshold type $\theta_0$ below which $q_0 > 0$ (the standard listing) is provided at zero price.

71 than the efficient quantity of advertising. For the highest type businesses, the marginal payoff equals $C'(Q^*)$. Equation (2.2) says that the marginal payment for each type equals the marginal payoff for that type. Maskin and Riley (1984) show that the tariff function $T^*(\cdot)$ is concave and increasing while $q^*(\cdot)$ is increasing.
2.4 Identification and Estimation
We first define the model structure and the observables. In view of Section 3, the model primitives are $[U(\cdot,\cdot,\cdot), \rho(\cdot,\cdot), F(\cdot), C(\cdot)]$, which are the businesses' utility function, the type aggregation function, the type distribution, and the publisher's cost function. The utility function incorporates market level network effects, while the type aggregation function incorporates industry level network effects. With data from a single market, we observe only one realization of total purchase $Q$, so the model primitives are not identified. With data from multiple markets, we are able to identify the model primitives by exploiting the variation in price schedules, purchases and payments across markets (as $Q$ varies). To do so, we assume that only the taste distribution varies across markets. The observed variation in price schedules, purchases and payments across markets is attributed to this variation in the primitives. Therefore, the model primitives are $[U(\cdot,\cdot,\cdot), \rho(\cdot,\cdot), F_m(\cdot), C(\cdot)]$, where $m$ indexes markets. For each market, the data provide information on the price schedule, the distribution of advertising purchases and the total purchase. Using our notation from Section 3, the observables are $[T_m(\cdot), G_m(\cdot), Q_m]$ for market $m$.
Identification
Our identification strategy is motivated by Heckman, Matzkin, and Nesheim (2010), Manski (1993), Brock and Durlauf (2001) and Sweeting (2009) and extends Perrigne and Vuong (2013). Heckman, Matzkin, and Nesheim (2010) study the nonparametric identification of

72 hedonic models with nonadditive utility functions. They first show that the marginal payoff function is not identified in a single market without further restrictions. They then show that the marginal payoff function is identified under several alternative assumptions or by using multimarket data. Manski (1993), Brock and Durlauf (2001) and Sweeting (2009) show that variations in the realized equilibria across markets provide the leverage required to identify models with interactions. Perrigne and Vuong (2013) study nonparametric identification of nonlinear pricing without network effects in a single market. We also identify our model by using first-order conditions from both sides of the market. However, to allow for network effects, we need to exploit the variation in price schedules and purchases across markets.
To achieve identification, we assume that the payoff function $U(q, \theta, Q)$ is additively separable in two terms: the intrinsic payoff $V(q, \theta)$ and the network payoff $W(q, \theta, Q)$. Identification is achieved in three steps. The first step follows Perrigne and Vuong (2013). The intrinsic payoff function $V(\cdot,\cdot)$ and the type distribution $F^*(\cdot)$ are identified from the data in a chosen benchmark market by normalizing the network payoff $W(\cdot,\cdot)$ to be zero in this market. The second step identifies the aggregation function $\rho(\cdot,\cdot)$, which incorporates industry level network effects. The third step identifies the market level network effects. For any other market $m$ with total purchase $Q_m \in [\underline{Q}, \bar{Q}]$, we show that the network payoff function $W(\cdot, Q_m)$ and the type distribution $F_m(\cdot)$ are identified by exploiting the differences in market outcomes between market $m$ and the benchmark market.
Identifying assumptions
We make the following identifying assumptions on the model primitives.
Assumption 5. (i) The utility function takes the form
\[ U(q, \theta, Q) = V(q, \theta) + W(q, \theta, Q) = \theta\big[V_0(q) + W(q, Q)\big], \tag{2.3} \]

73 and $V_0(0) = 0$, $W(0, Q) = 0$ for any $Q$. (ii) The network payoff is normalized to be zero in a benchmark market, i.e., $W(\cdot, Q^*) = 0$.
Following Perrigne and Vuong (2013), we assume that the utility function $U(\cdot,\cdot,\cdot)$ is multiplicatively separable in the type $\theta$. Thus, $V_0(\cdot)$ can be interpreted as the base intrinsic payoff. It can be easily seen that Assumption 4 is satisfied under similar assumptions on $V_0(\cdot)$ and $W(\cdot,\cdot)$. $V_0(0) = 0$ and $W(0, Q) = 0$ for any $Q$ imply that businesses receive no intrinsic or network payoff if they do not purchase any advertising. The necessary conditions (2.1) and (2.2) for any market $m$ can then be rewritten as
\[ \theta\big[V_0'(q) + W'(q, Q_m)\big] = C'(Q_m) + \frac{1 - F_m(\theta)}{f_m(\theta)}\big[V_0'(q) + W'(q, Q_m)\big], \tag{2.4} \]
\[ T_m'(q) = \theta\big[V_0'(q) + W'(q, Q_m)\big], \tag{2.5} \]
where $q = q_m(\theta)$ and $W'$ denotes the derivative of $W$ with respect to $q$.
Assumption 5-(ii) is a normalization. This normalization is necessary because the utility is ordinal. We can only identify relative network utility across markets with different total purchases. In fact, we can normalize the network payoff function for a benchmark market to be any known function. The network payoff functions identified for other markets are then relative to the network payoff function in this benchmark market. Hereafter, $*$ indicates variables and functions related to the benchmark market.
Identification of $C(\cdot)$
Our first identification result concerns the cost function and uses data on price schedules. Multimarket data provide variation in the total quantity purchased $Q$. Since we observe both the price schedule $T(\cdot)$ and the maximum purchase $\bar{q}$ in multiple markets, we can recover the marginal cost function $C'(Q)$ by exploiting the variation in $Q$ and $T(\cdot)$. Proposition 2 shows that only the marginal cost function is identified from observables.
Proposition 2. The marginal cost function $C'(\cdot)$ is identified on $[\underline{Q}, \bar{Q}]$, where $\underline{Q}$ and $\bar{Q}$ denote the minimum and maximum total quantity purchased among all the markets, respectively.
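Proposition 2 rests on the absence of distortion at the top: at the highest type the hazard term in (2.4) vanishes, so the marginal price at the largest purchase in a market equals the marginal cost at that market's total quantity. A minimal sketch of how this could be turned into a set of marginal cost points is given below. The function names, the dictionary layout of a market, the one-sided finite difference and the linear interpolation between markets are all assumptions made for illustration, not the procedure used in the thesis.

```python
import numpy as np

def marginal_cost_points(markets):
    """Return (Q_m, C'(Q_m)) pairs, one per market, sorted by Q_m.

    Each market is a dict with hypothetical keys:
      'Q' : total quantity purchased in the market
      'q' : increasing grid of quantities up to the largest purchase
      'T' : tariff evaluated on that grid
    """
    Q, mc = [], []
    for m in markets:
        q = np.asarray(m["q"], dtype=float)
        T = np.asarray(m["T"], dtype=float)
        # One-sided slope of the tariff at the largest purchase: T_m'(q_bar_m).
        mc.append((T[-1] - T[-2]) / (q[-1] - q[-2]))
        Q.append(float(m["Q"]))
    order = np.argsort(Q)
    return np.asarray(Q)[order], np.asarray(mc)[order]

def marginal_cost(Q_new, Q_pts, mc_pts):
    """Linear interpolation of C'(Q) between observed markets (an assumption)."""
    return np.interp(Q_new, Q_pts, mc_pts)
```

With only two markets, as in the SE and NE illustration, the interpolation is simply a straight line between two points, so any curvature of the marginal cost function cannot be learned from such data.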

74 This result is not surprising since the model involves the cost function only through the marginal cost. To see this, substitute (2.5) into (2.4) and evaluate (2.4) at $\theta = \bar{\theta}$, where $1 - F_m(\bar{\theta}) = 0$, to obtain $T_m'(\bar{q}_m) = C'(Q_m)$ for any $Q_m \in [\underline{Q}, \bar{Q}]$. Multimarket data provide variation in both $T(\cdot)$ and $Q$, which allows us to identify the marginal cost function $C'(\cdot)$.
Note that we identify only the marginal cost $C'(\cdot)$. It is conceivable that there is a fixed cost of producing a yellow pages directory. To obtain an estimate of the fixed cost, we could use data from the market with the minimum total purchase $\underline{Q}$. Since a firm operates only if its net profit is nonnegative, it is reasonable to assume that in this market the publisher's profit gross of the fixed cost just equals the fixed cost. With the estimated model primitives, we can simulate the profit level and the variable cost in this market. We can then learn about the fixed cost, thereby identifying the cost function $C(\cdot)$.
Identification of $V_0(\cdot)$ and $F^*(\cdot)$
To identify the intrinsic payoff function, we exploit the necessary conditions (2.4) and (2.5) for the benchmark market. Denote $\theta^*(\cdot) \equiv q^{*-1}(\cdot)$. The next lemma shows that the business's marginal base intrinsic payoff $v_0(\cdot) \equiv V_0'(\cdot)$ and the unobserved type $\theta^*(q)$ for the benchmark market can be expressed as functions of the quantity purchased $q$ and the observables $[T^*(\cdot), G^*(\cdot)]$.
Lemma 2. The necessary conditions (2.4) and (2.5) are equivalent to
\[ v_0(q) = \frac{T^{*\prime}(q)\,\xi^*(q)}{\bar{\theta}} \quad\text{and}\quad \theta^*(q) = \frac{\bar{\theta}}{\xi^*(q)} \tag{2.6} \]
for all $q \in (\underline{q}, \bar{q}]$, where
\[ \xi^*(q) = \big[1 - G^*(q)\big]^{1 - \frac{C'(Q^*)}{T^{*\prime}(q)}} \exp\Big\{ C'(Q^*) \int_q^{\bar{q}} \frac{T^{*\prime\prime}(x)}{T^{*\prime}(x)^2} \log\big[1 - G^*(x)\big]\,dx \Big\}, \tag{2.7} \]
with $\xi^*(\underline{q}) = \lim_{q \downarrow \underline{q}} \xi^*(q) = \bar{\theta}/\theta_0$ and $\xi^*(\bar{q}) = 1$.
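To make the mapping in Lemma 2 concrete, the sketch below evaluates $\xi(q)$ numerically and then recovers the type and the marginal base payoff. It reads (2.7) as $\xi(q) = [1 - G(q)]^{1 - C'(Q)/T'(q)} \exp\{C'(Q) \int_q^{\bar{q}} T''(x)/T'(x)^2 \log[1 - G(x)]\,dx\}$ and uses the normalization $\bar{\theta} = 1$; this reading, the function name `xi_on_grid` and the midpoint integration rule are assumptions made for illustration. Midpoints are used so that $\log[1 - G]$ is never evaluated at the largest purchase, where it diverges.

```python
import numpy as np

def xi_on_grid(q_edges, dT, d2T, G, mc):
    """Evaluate xi(q) at the midpoints of the cells defined by q_edges.

    q_edges : increasing grid from the smallest to the largest purchase
    dT, d2T : callables for the marginal tariff T'(q) and its derivative T''(q)
    G       : callable for the distribution of purchases
    mc      : marginal cost C'(Q) in this market
    """
    q_edges = np.asarray(q_edges, dtype=float)
    mid = 0.5 * (q_edges[1:] + q_edges[:-1])
    integrand = d2T(mid) / dT(mid) ** 2 * np.log1p(-G(mid))
    cell = integrand * np.diff(q_edges)                 # midpoint rule per cell
    # Integral from each midpoint to the top: all cells above plus half of own cell.
    tail = np.cumsum(cell[::-1])[::-1] - 0.5 * cell
    power = (1.0 - G(mid)) ** (1.0 - mc / dT(mid))
    return mid, power * np.exp(mc * tail)
```

A hypothetical closed-form check: with types uniform on $[0.75, 1]$, $v_0(q) = q^{-1/2}$, no network payoff and constant marginal cost 1, the first-order conditions give $\theta(q) = (1 + \sqrt{q})/2$, $T'(q) = 0.5/\sqrt{q} + 0.5$ and $G(q) = 2\sqrt{q} - 1$ on $[0.25, 1]$. Feeding these into `xi_on_grid` and computing `theta = 1 / xi` and `v0 = dT(mid) * xi` should reproduce $\theta(q)$ and $q^{-1/2}$ up to discretization error.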

75 The proof of Lemma (2) follows directly from the proof of Lemma (5) of Perrigne and Vuong (2013). By Lemma 2, a natural normalization is $\bar{\theta} = 1$. The base marginal intrinsic payoff function $v_0(\cdot)$ can then be uniquely recovered on $[0, \bar{q}]$ from the benchmark observables. The type distribution in the benchmark market, $F^*(\cdot)$, can also be uniquely recovered on $[\underline{\theta}, \bar{\theta}]$ from the same observables. The following assumption and proposition formalize this result.
Assumption 6. $\bar{\theta} = 1$.
Under this normalization, $V_0(\cdot)$ is the intrinsic payoff function for the highest type in all markets. Recalling that $V_0(0) = 0$, we identify the intrinsic utility function $V_0(\cdot)$ as $V_0(q) = \int_0^q T^{*\prime}(x)\,\xi^*(x)\,dx$.
Proposition 3. Under Assumption 6, the base intrinsic payoff function $V_0(\cdot)$ and the business type distribution $F^*(\cdot)$ in the benchmark market are identified on $[0, \bar{q}]$ and $[\underline{\theta}, \bar{\theta}]$, respectively.
Identification of industry level network effects $\rho(\cdot,\cdot)$
We now turn to the identification of the aggregation function $\rho(X, \epsilon)$, where $X$ includes both exogenous and endogenous characteristics and $\epsilon$ denotes unobserved characteristics. While it is difficult to generalize linear-in-means models to allow for unknown nonlinear network effects, using all the economics of the model makes this possible. With our aggregation assumption on $\theta$ and the identified one-to-one mapping $\theta(q)$, we do not need to invoke the reduced form as in a linear-in-means model. Under the self-fulfilling assumption, identifying how the three kinds of characteristics affect businesses' decisions is a nonlinear regression of $\theta$ on $X$. Identification is achieved under fairly general conditions. For instance, we could consider a nonlinear model with an additively separable error, $\theta = \rho(X) + \epsilon$. The function $\rho(\cdot)$ is identified under the mean independence assumption $E[\epsilon|X] = 0$. See Matzkin (2003) for the identification of nonseparable models.
Identification of market level network effects $W(\cdot,\cdot)$ and $F_m(\cdot)$

76 We are now able to show the identification of the network payoff function $W(\cdot,\cdot)$ and the business type distributions $F_m(\cdot)$ in other markets. As in the benchmark market, we can identify the sum of the marginal intrinsic utility function and the marginal network utility function. Applying Lemma 2 to other markets gives $V_0'(\cdot) + W'(\cdot, Q_m) = T_m'(\cdot)\,\xi_m(\cdot)$, where $\xi_m(\cdot)$ is obtained by replacing $*$ with $m$ in Equation (2.7). Moreover, the taste distribution in this market is also identified, because the mapping from purchase to taste can be written in terms of observables, namely $\theta_m(\cdot) = 1/\xi_m(\cdot)$. Since $V_0(\cdot)$ is identified, the marginal network utility function is identified as $W'(\cdot, Q_m) = T_m'(\cdot)\,\xi_m(\cdot) - T^{*\prime}(\cdot)\,\xi^*(\cdot)$. Recalling that $W(0, Q) = 0$, we identify the network utility function $W(\cdot, Q_m)$ as
\[ W(q, Q_m) = \int_0^q \big[T_m'(x)\,\xi_m(x) - T^{*\prime}(x)\,\xi^*(x)\big]\,dx. \tag{2.8} \]
The following proposition formalizes these results.
Proposition 4. The network payoff $W(\cdot,\cdot)$ and the business type distribution of market $m$, $F_m(\cdot)$, are identified on $[0, \bar{q}] \times [\underline{Q}, \bar{Q}]$ and $[\underline{\theta}, \bar{\theta}]$, respectively.
The network payoff $W(\cdot, Q_m)$ and the business type $\theta_m$ for any market $m$ can be uniquely recovered on $[\underline{q}, \bar{q}]$ from the observables $G_m(\cdot)$, $T_m'(\cdot)$ and the identified $C'(Q_m)$ and $V_0'(\cdot)$. Intuitively, the identification of the network payoff function is obtained by using the variation in the price schedule $T(\cdot)$ and the advertising purchase distribution $G(\cdot)$ across markets.
An alternative identification strategy
An alternative is to take the approach proposed in Luo (2012). In particular, we can replace Assumption 5-(i) with the following one.
Assumption 7. (i') The utility function has the following form: $U(q, \theta, Q) = \theta V_0(q) + W(q, Q)$.
Again, our first step identifies the intrinsic payoff function $V_0(\cdot)$, which is the same across all markets. Our second step is to express the unknown network payoff function $W(\cdot,\cdot)$ and the unobserved type $\theta_m$ for any other market $m$ in terms of $V_0(\cdot)$ and observables.
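The integration in (2.8) can be sketched numerically as follows, reading the integrand as the difference between the total marginal payoff in market $m$, $T_m'(x)\xi_m(x)$, and its benchmark counterpart. The function name `network_payoff`, the common grid starting at zero and the trapezoid rule are assumptions made for this illustration.

```python
import numpy as np

def network_payoff(q, marg_total_m, marg_total_bench):
    """Cumulative integral of eq. (2.8) on a grid q with q[0] = 0.

    marg_total_m     : T_m'(q) * xi_m(q), total marginal payoff in market m
    marg_total_bench : the same object in the benchmark market
    """
    q = np.asarray(q, dtype=float)
    diff = np.asarray(marg_total_m, dtype=float) - np.asarray(marg_total_bench, dtype=float)
    steps = 0.5 * (diff[1:] + diff[:-1]) * np.diff(q)   # trapezoid rule
    return np.concatenate(([0.0], np.cumsum(steps)))    # imposes W(0, Q_m) = 0
```

By construction the function returns zero everywhere when market $m$ is the benchmark market itself, which mirrors the normalization in Assumption 5-(ii).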

In particular, we note that $[1 - F_m(\theta)]/f_m(\theta) = \theta_m'(q)[1 - G_m(q)]/g_m(q)$. Substituting it into (2.4), we obtain

$$\theta_m'(q) = \frac{1}{V_0'(q)} \frac{g_m(q)}{1 - G_m(q)} \left( T_m'(q) - C'(Q_m) \right).$$

Integrating the above differential equation over $[q, \bar q]$ and rearranging, we obtain

$$\theta_m(q) = 1 - \int_q^{\bar q} \frac{1}{V_0'(x)} \frac{g_m(x)}{1 - G_m(x)} \left( T_m'(x) - C'(Q_m) \right) dx.$$

From the above equation, we can see that $\theta_m$ for market $m$ is expressed in terms of $G_m(\cdot)$, $g_m(\cdot)$, $T_m'(\cdot)$, $C'(\cdot)$ and $v_0(\cdot)$, which is identified in the first step. Knowing the unobservable type $\theta_m$ for market $m$, we can then express the marginal network payoff for any market with $Q_m$ in terms of observables and identified primitives as $W'(q, Q_m) = T_m'(q) - \theta_m(q)V_0'(q)$. Since $W(0, Q) = 0$ for any $Q$, we identify the network utility function $W(\cdot, Q_m)$ as

$$W(q, Q_m) = \int_0^q \left[ T_m'(x) - \theta_m(x)V_0'(x) \right] dx.$$

Estimation

From Proposition 2, the marginal cost can be estimated as $T'(\cdot)$ evaluated at $\bar q$. From Lemma 2, condition (8) provides the expressions for the marginal intrinsic payoff function $v_0(\cdot)$ and the business type in the benchmark market $\theta^*(\cdot)$ as functions of $T^{*\prime}(\cdot)$ and $\xi^*(\cdot)$, where $\xi^*(\cdot)$ depends on $T^{*\prime}(\cdot)$, $T^{*\prime\prime}(\cdot)$, $C'(Q^*)$ and the quantity distribution $G^*(\cdot)$ in the benchmark market. The functions $T^{*\prime}(\cdot)$ and $T^{*\prime\prime}(\cdot)$ can be constructed from the price schedule, while $G^*(\cdot)$ needs to be estimated using purchase data from the benchmark market. From (2.8), the network payoff function for any other market $m$ is expressed as a function of $T_m'(\cdot)$, $\xi_m(\cdot)$, $T^{*\prime}(\cdot)$ and $\xi^*(\cdot)$. Similarly, $\xi_m(\cdot)$ depends on $T_m'(\cdot)$, $T_m''(\cdot)$ and the quantity distribution $G_m(\cdot)$ in market $m$, which can likewise be constructed or estimated from the price schedule and purchase information in market $m$. Finally, the business type in any other market $m$ can be expressed as a function of $\xi_m(\cdot)$ from Lemma 2.
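As a rough illustration of how (2.8) could be evaluated once the two marginal payoff terms are available on a grid, the sketch below integrates their difference with a cumulative trapezoid rule. The function names, the grid and the linear marginal payoffs are hypothetical and chosen only so that the result can be checked by hand; they are not estimates from the data.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Cumulative trapezoidal integral of y over x, equal to 0 at x[0]."""
    increments = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(increments)))

def network_payoff(q_grid, marginal_m, marginal_benchmark):
    """Evaluate W(q, Q_m) on q_grid following the form of (2.8).

    marginal_m         : values of T_m'(x) * xi_m(x) on q_grid (market m)
    marginal_benchmark : values of T*'(x) * xi*(x) on q_grid (benchmark)
    q_grid must start at 0 so that W(0, Q_m) = 0.
    """
    return cumulative_trapezoid(marginal_m - marginal_benchmark, q_grid)

# Hypothetical inputs: V_0'(x) = 1 - x and W'(x, Q_m) = 0.5 * (1 - x),
# so the market-m term is 1.5 * (1 - x).
q_grid = np.linspace(0.0, 1.0, 201)
benchmark_term = 1.0 - q_grid
market_term = 1.5 * (1.0 - q_grid)
W = network_payoff(q_grid, market_term, benchmark_term)
```

With these made-up inputs the exact answer is $W(q) = 0.5(q - q^2/2)$, which the trapezoid rule reproduces because the integrand is linear. In practice the two input arrays would come from the estimated tariff derivatives and $\hat\xi$ functions.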

In view of the identification results, we propose a four-step procedure to estimate the model primitives using data from multiple markets. To illustrate our estimation procedure, we exploit data from two markets, the SE and NE districts. Without loss of generality, we set the SE district as our benchmark market, that is, $W(\cdot, Q_{se}) = 0$.

In the first step, we use the price schedule data from all markets to construct the tariff $T(q, Q)$ as a function of both individual and total consumption. In the second step, we first estimate the quantity distribution function $G_{se}(\cdot)$ for the SE district directory using the purchases of every business in the market. We then use the tariff function $T(q, Q_{se})$ and the quantity distribution function $G_{se}(\cdot)$ to estimate $\xi_{se}(\cdot)$. With $T(q, Q_{se})$ and $\xi_{se}(\cdot)$ in hand, the marginal intrinsic payoff function $v_0(\cdot)$ and the business type can be estimated using condition (8) in Lemma 2. Finally, the intrinsic payoff function $V_0(\cdot)$ is obtained by integrating $v_0(\cdot)$ with initial condition $V_0(0) = 0$. In the third step, we first estimate $\xi_{ne}(\cdot)$ in the same way as $\xi_{se}(\cdot)$. We then estimate the network payoff using equation (2.8) and the business type in the NE district using $\theta_{ne}(\cdot) = 1/\xi_{ne}(\cdot)$. In the fourth step, we estimate the business type density functions in both the SE and NE districts nonparametrically using the estimated pseudo sample of business types. In the following we provide the details of these steps.

First, we construct the tariff function using price schedule data from all 7 yellow page directories in Toronto. We also use the constructed tariff function to adjust advertisement quantity. As described in the data section, yellow page advertisements differ in size and quality (color and other features). However, the evidence on the price schedule shows that the publisher in Toronto does not use the quality dimension to discriminate.
For this reason, we follow Perrigne and Vuong (2013) and Busse and Rysman (2005) to capture the shape of the price schedules and construct a quality-adjusted quantity index that aggregates the size, color and other features of an advertisement. 7 To do so, we consider the price schedule for multicolor displays and adjust the advertising sizes for other colors accordingly.

7 It may be useful to integrate quality into the nonlinear pricing model. This might require a complex multidimensional screening model, which is known to be difficult to solve. See Armstrong (1996) and Luo (2012).
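The construction just described is formalized in equations (2.9) to (2.11). As a hedged sketch of that first step, the code below fits the log-linear tariff regression by ordinary least squares and then inverts the fitted tariff to obtain a quality-adjusted quantity. All numbers (the coefficients, sizes and market totals) are synthetic and hypothetical; the helper names are ours, and the thesis does not state which software was used.

```python
import numpy as np

def fit_log_tariff(T, S, Q):
    """OLS fit of log T = b0 + b1 * log S + b2 * log Q, as in (2.9)."""
    X = np.column_stack([np.ones_like(S), np.log(S), np.log(Q)])
    beta, *_ = np.linalg.lstsq(X, np.log(T), rcond=None)
    return beta

def tariff(q, Q, beta):
    """Fitted tariff T(q, Q) = exp(b0) * q**b1 * Q**b2, as in (2.11)."""
    b0, b1, b2 = beta
    return np.exp(b0) * q**b1 * Q**b2

def quality_adjusted_quantity(t, Q, beta):
    """Invert the fitted tariff for q given a price t, as in (2.10)."""
    b0, b1, b2 = beta
    return np.exp(-b0 / b1) * t**(1.0 / b1) * Q**(-b2 / b1)

# Synthetic, noise-free price schedule for 7 hypothetical markets.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, 0.8, 0.3])            # hypothetical coefficients
Q = np.repeat(np.linspace(100.0, 400.0, 7), 10)  # hypothetical market totals
S = rng.uniform(10.0, 500.0, size=Q.size)        # hypothetical ad sizes
T = tariff(S, Q, true_beta)

beta_hat = fit_log_tariff(T, S, Q)
q_adj = quality_adjusted_quantity(T, Q, beta_hat)
```

Because the synthetic prices contain no noise, `beta_hat` recovers the hypothetical coefficients and `q_adj` reproduces the sizes `S`; with real price schedules the fit would of course be approximate.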

In particular, using the price schedule for multicolor displays for all 7 yellow page directories in Toronto, obtained from the Local Search Association, we estimate the following equation:

$$\log T_{mj} = \beta_0 + \beta_1 \log S_{mj} + \beta_2 \log Q_m + \epsilon_{mj}, \qquad (2.9)$$

where $T_{mj}$ is the price in dollars for a multicolor display in market $m$, $S_{mj}$ is the advertising size measured in square picas and $Q_m$ is the total purchase in market $m$ measured in square picas. We then use the regression result to construct the tariff functions and the quality-adjusted quantities for advertisements in other colors. More specifically, the quality-adjusted quantity for any observed purchase in the 7 yellow page directories is constructed by

$$q = e^{-\hat\beta_0/\hat\beta_1}\, t^{1/\hat\beta_1}\, Q_m^{-\hat\beta_2/\hat\beta_1}, \qquad (2.10)$$

where $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$ are the estimated coefficients from (2.9), $q$ is the quality-adjusted quantity and $t$ is the observed price for any type of advertisement. The tariff function for any market with total quantity $Q$ is constructed by

$$T(q, Q) = e^{\hat\beta_0}\, q^{\hat\beta_1}\, Q^{\hat\beta_2}, \qquad (2.11)$$

where $T(q, Q)$ is the tariff function for the market with total quantity $Q$ and $q$ is the quality-adjusted quantity. Using (2.11), we obtain the tariff functions for the SE and NE districts as $\hat T_{se}(q) = T(q, Q_{se})$ and $\hat T_{ne}(q) = T(q, Q_{ne})$. The marginal cost for the two markets can then be estimated by $\hat C'(Q_{se}) = \hat T_{se}'(\bar q)$ and $\hat C'(Q_{ne}) = \hat T_{ne}'(\bar q)$ from Proposition 2.

Second, we estimate the intrinsic utility function $V_0(\cdot)$ and the business type in the SE district using the advertisement purchase data and the constructed tariff function for the SE district. Let $N_{se}$ denote the number of businesses purchasing advertising in the SE district yellow page directory and $q_{se}^i$, $i = 1, 2, \ldots, N_{se}$, denote the quantity purchased by each of those businesses. Following (2.7), we first estimate $\xi_{se}(\cdot)$ by

$$\hat\xi_{se}(q) = \left[ 1 - \hat G_{se}(q) \right]^{1 - \frac{\hat C'(Q_{se})}{\hat T_{se}'(q)}} \exp\left\{ \hat C'(Q_{se}) \int_q^{\bar q} \frac{\hat T_{se}''(x)}{\hat T_{se}'(x)^2} \log\left[ 1 - \hat G_{se}(x) \right] dx \right\}, \qquad (2.12)$$

where $G_{se}(\cdot)$ is estimated by integrating its density function $g_{se}(\cdot)$, which is estimated using the kernel estimator

$$\hat g_{se}(q) = \frac{1}{N_{se} h} \sum_{i=1}^{N_{se}} K\left( \frac{q - q_{se}^i}{h} \right) \qquad (2.13)$$

for $q \in (0, \bar q)$, where $K(\cdot)$ is a symmetric kernel function with compact support and $h$ is a bandwidth. Following Lemma 2, the marginal intrinsic utility function $v_0(\cdot)$ is estimated by $\hat v_0(q) = \hat T_{se}'(q)\hat\xi_{se}(q)$, and the intrinsic utility function is then obtained from $\hat V_0(q) = \int_0^q \hat v_0(x)\,dx$ with boundary condition $V_0(0) = 0$. Finally, the business type function in the SE district is estimated by $\hat\theta_{se}(q) = 1/\hat\xi_{se}(q)$.

Third, we estimate the network utility function $W(\cdot, Q_{ne})$ and the business type distribution for the NE district by exploiting the differences in the equilibrium price schedule and purchases between the NE and SE districts. Using the same approach as for $\hat\xi_{se}(q)$, we can analogously estimate $\hat\xi_{ne}(q)$ by substituting $\hat G_{se}(q)$, $\hat C'(Q_{se})$, $\hat T_{se}'(\cdot)$ and $\hat T_{se}''(\cdot)$ with $\hat G_{ne}(q)$, $\hat C'(Q_{ne})$, $\hat T_{ne}'(\cdot)$ and $\hat T_{ne}''(\cdot)$ in (2.12). Following Proposition 4, we estimate the network utility function $W(\cdot, Q_{ne})$ for the NE district by

$$\hat W(q, Q_{ne}) = \int_0^q \left[ \hat T_{ne}'(x)\hat\xi_{ne}(x) - \hat T_{se}'(x)\hat\xi_{se}(x) \right] dx. \qquad (2.14)$$

The business type function for each business in the NE district is estimated by $\hat\theta_{ne}(q) = 1/\hat\xi_{ne}(q)$ from Lemma 2.

Finally, we nonparametrically estimate the business type density functions $f_{se}(\cdot)$ and $f_{ne}(\cdot)$ for the SE and NE districts using the estimated business type functions $\hat\theta_{se}(q)$ and $\hat\theta_{ne}(q)$. Denote the number of businesses purchasing advertising in the NE district yellow page directory by $N_{ne}$, and the quantity purchased by each of those businesses by $q_{ne}^i$, $i = 1, 2, \ldots, N_{ne}$. Using the estimated business type functions, we construct the pseudo type samples for both markets by $\{\hat\theta_{se}^i = \hat\theta_{se}(q_{se}^i)\}_{i=1,2,\ldots,N_{se}}$ and $\{\hat\theta_{ne}^i = \hat\theta_{ne}(q_{ne}^i)\}_{i=1,2,\ldots,N_{ne}}$, respectively.
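The kernel estimator in (2.13), and the step of integrating the density to obtain the distribution function, can be sketched as follows. The Epanechnikov kernel is our choice of a symmetric kernel with compact support; the text does not say which kernel or bandwidth was used, and the sample below is simulated rather than taken from the purchase data.

```python
import numpy as np

def epanechnikov(u):
    """Symmetric kernel with compact support on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_density(grid, sample, h):
    """g_hat(q) = 1 / (N h) * sum_i K((q - q_i) / h), evaluated on grid."""
    u = (grid[:, None] - sample[None, :]) / h
    return epanechnikov(u).sum(axis=1) / (sample.size * h)

def cdf_from_density(grid, density):
    """Integrate the density over grid with a cumulative trapezoid rule."""
    increments = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    return np.concatenate(([0.0], np.cumsum(increments)))

# Simulated (hypothetical) quality-adjusted purchases and bandwidth.
rng = np.random.default_rng(0)
sample = rng.uniform(1.0, 2.0, size=500)
h = 0.1
grid = np.linspace(0.8, 2.2, 561)   # wide enough to cover sample +/- h

g_hat = kernel_density(grid, sample, h)
G_hat = cdf_from_density(grid, g_hat)
```

The same two functions could be applied to the pseudo type samples to obtain the type densities. Boundary bias near the ends of the support is ignored in this sketch.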
We estimate $f_{se}(\cdot)$ and $f_{ne}(\cdot)$ using the kernel estimators

$$\hat f_m(\theta) = \frac{1}{N_m h} \sum_{i=1}^{N_m} K\left( \frac{\theta - \hat\theta_m^i}{h} \right), \qquad (2.15)$$

where $m = se, ne$, $\theta \in (\underline\theta, \bar\theta)$, $K(\cdot)$ is a symmetric kernel function with compact support and $h$ is a bandwidth.

2.5 Empirical Results

Estimation Results

We apply our estimation approach using advertising data from the SE and NE directories. In view of our identification results, we need to estimate the tariff function and construct the purchase distribution in the first step. We then use data from SE to estimate the base intrinsic utility function and data from NE to estimate the network utility function. Finally, we estimate the density functions using the constructed pseudo samples of business types.

Estimated model primitives

Using the price schedule data for multicolor display advertisements from all 7 district yellow page directories, we estimate equation (2.9). The coefficient estimates are $\hat\beta_0$ = (2.0269), $\hat\beta_1$ = ( ) and $\hat\beta_2$ = (0.1530), where standard errors are in parentheses. Since all the estimates are significant and the adjusted $R^2$ of the regression is 98%, we use these estimates to construct the quality-adjusted quantities and the tariff functions from (2.10) and (2.11). The first two rows of Table 2.3 and Table 2.4 show the summary statistics of the quality-adjusted quantities and the tariffs paid for the SE and NE yellow page directories, respectively. The publisher charges a higher price for advertising in the NE directory than in the SE directory. The total quality-adjusted quantities purchased are and thousand square picas in the NE and SE yellow page directories, respectively. Using (2.11), we obtain an estimate of the marginal cost function $\hat C'(Q) = \hat T_q(q_{\max}, Q)$ = Q , where $q_{\max}$ is the maximum size, a double page. The marginal cost function is increasing in the total quantity produced, which is consistent with Assumption 2 of our model. This might

be because the publisher incurs higher costs in designing the layout of a yellow page directory when it needs to accommodate more advertisements.

We now report the estimates of the intrinsic utility function $V_0(\cdot)$ and the network utility function in the NE yellow page directory, $W_{ne}(\cdot) \equiv W(\cdot, Q_{ne})$. These functions are displayed in Figure 2.2. The estimated intrinsic utility function is increasing and concave, thereby satisfying Assumption 1. The estimated network utility function is positive, increasing and concave, thereby also satisfying Assumption 1. As the graph shows, the network utility function is sizable: its scale is comparable to a median consumer's utility function. The positive network effects we find are consistent with the findings of Rysman (2004). A larger yellow page directory contains more information that is valuable for consumers, thereby leading to more usage, which in turn increases the value of advertising for businesses.

We finally report the pseudo sample estimates of the business types in the SE and NE districts, which are used to estimate their probability density functions. Summary statistics on the estimated pseudo business types are displayed in the middle of Tables 2.3 and 2.4. Figure 2.3 displays the estimated density functions $f_{se}(\cdot)$ and $f_{ne}(\cdot)$. Businesses in the NE district tend to have higher tastes for yellow page advertisements. One interpretation is that the NE district may have more residential consumers, or residents with higher incomes, so that businesses expect advertising to be more valuable in the NE directory.

Informational rent and network effects

Using these primitive estimates, we assess empirically the publisher's revenue as well as the businesses' informational rents and market level network payoff. The publisher's revenue is the total payment it collects from businesses. The publisher collects 5.38 and 6.98 million dollars from selling advertising in the SE and NE directories, respectively.
The informational rent is defined as the difference between utility and payment. For instance, a type $\theta$ NE business purchases a quantity $q$ of advertising by paying $T_{ne}(q)$ and obtains utility $U(q, \theta, Q_{ne})$. The informational rent for this business is $U(q, \theta, Q_{ne}) - T_{ne}(q)$. The total informational rents

are 4.78 and 6.35 million dollars for the SE and NE directories, respectively. The ratio of total informational rent to total payment across all consumers measures the cost of asymmetric information, which is 88.91% in SE and 91.00% in NE. Although the publisher collects substantial revenue, businesses also obtain a large informational rent because of their private information. Compared with the results reported in Perrigne and Vuong (2013), businesses in Toronto obtain more informational rent. This comes from the fact that businesses display a higher degree of heterogeneity in Toronto than in State College, so the publisher leaves more informational rent in order to screen businesses with different tastes. Tables 2.3 and 2.4 show the summary statistics of taste, informational rent and its ratio to payment. The ratio of the standard deviation to the mean is 94% in SE and 96% in NE, compared with 68% in Perrigne and Vuong (2013).

Regarding the market level network payoff, businesses advertising in the NE directory obtain a total payoff of 3.06 million dollars from the market level network relative to businesses advertising in the SE directory. This amount accounts for about 22.96% of the total payoff. It is also about 43.84% of the total payment and 48.19% of the total informational rent that all businesses obtain. Both the publisher and businesses benefit from market level network effects. Table 2.4 displays the summary statistics of the network payoff obtained by each business.

Recall that the business type $\theta$ aggregates all the information that the publisher cannot discriminate on, including the business's competition status in its industry. Along with their payments and purchases, we observe some business characteristics. These variables are likely to be correlated with types. For instance, more total purchases under the same heading may lead to an increased preference for advertising due to competition between firms. To examine these industry level network effects, we regress business type on observed characteristics.
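A minimal sketch of such a regression is given below. It assumes the logit transform of $\theta$ that the text introduces next, and it uses simulated data with a single hypothetical competition variable and made-up coefficients; the clipping constant is our own safeguard, since the normalization $\bar\theta = 1$ would make the transform infinite for the highest type.

```python
import numpy as np

def logit(theta, eps=1e-6):
    """Map theta in (0, 1) to the real line; clip to keep the log finite."""
    clipped = np.clip(theta, eps, 1.0 - eps)
    return np.log(clipped / (1.0 - clipped))

def inverse_logit(v):
    """Map a real number back to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def ols(y, X):
    """Least-squares coefficients of y on a constant and the columns of X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Simulated, noise-free example with hypothetical coefficients.
rng = np.random.default_rng(0)
mean_purchase = rng.uniform(0.0, 4.0, size=300)   # hypothetical regressor
alpha0, alpha1 = -1.0, 0.5                         # hypothetical values
theta = inverse_logit(alpha0 + alpha1 * mean_purchase)

coef = ols(logit(theta), mean_purchase.reshape(-1, 1))
```

In this noise-free example `coef` returns the two hypothetical coefficients. With estimated pseudo types, a positive slope on the competition variable would correspond to the positive industry level effect discussed in the text.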
Note that $\theta \in [0, 1]$ by our normalization assumption. We first transform $\theta$ into an unbounded variable $\vartheta = \log \frac{\theta}{1 - \theta}$, which is a strictly increasing one-to-one mapping. It implies

that $\theta = e^{\vartheta}/(1 + e^{\vartheta})$. We then estimate the following equation:

$$\vartheta_i = \alpha_0 + Competition_i'\alpha_1 + X_i'\alpha_2 + \nu_i,$$

where $Competition_i$ is a vector of variables describing competition in the same industry, $X_i$ is a vector of other exogenous variables, and $\nu_i$ is mean independent of $Competition_i$ and $X_i$. In particular, we use the mean purchase and the number of firms for $Competition$ and control for 10 division fixed effects. 8 Table 2.5 displays the results of our regression with four specifications. Note that the adjusted $R^2$ values are very similar, while the squared terms and the number of firms do not significantly affect $\vartheta$. Thus, Specification I is our preferred one, as the other variables are not significant and do not add much explanatory power. The results show that the more advertising a business's competitors purchase from the yellow page directory, the higher the taste the business tends to have. Therefore, there are positive network effects at the industry level. To make its advertisement stand out, a business tends to buy a larger advertisement than its competitors do.

8 There are more than a thousand different industry headings, and it is unrealistic to have this many fixed effects in our regression. Thus we define 10 division dummies using the SIC code, such as Agriculture, forestry, and fishing; Mining; Construction; Manufacturing; and so on.

Counterfactuals

With structural estimates at hand, we conduct counterfactuals to evaluate the importance of network effects. Recall that we consider two types of network effects in our model. The business type $\theta$ aggregates all the information that the publisher cannot discriminate on, including the industry level network effects. Moreover, market level network effects are represented by the network utility function $W(\cdot,\cdot)$, which is a function of the business's own purchase and of the total purchase in its market. As we normalize the market level network effects to be 0 in the SE directory, our counterfactual experiments focus on the NE directory.

To separate the two types of network effects, we first simulate the case of shutting down

the market level network effects. That is, businesses do not obtain more utility by buying advertising from a larger yellow page directory. Second, we shut down both types of network effects. We show the importance of the two types of network effects by comparing the publisher's revenue, consumer surplus and social welfare in the two counterfactuals with those estimated in the original model defined by $(\hat C(\cdot), \hat U(\cdot,\cdot,\cdot), \hat f_{ne}(\cdot))$ using the real data. The original model, with both industry level and market level network effects, is referred to as the full model hereafter. The optimal purchase and tariff functions in this model are denoted by $q_{full}(\cdot)$ and $T_{full}(\cdot)$, respectively.

To conduct the counterfactual experiments, we need to make some assumptions about objects we do not identify. Since we identify and estimate $W(\cdot, Q)$ only for $Q = Q_{ne}$, we assume that for any $Q \in [\underline Q, \bar Q]$,

$$\hat U(q, \theta, Q) = \theta \left[ \hat V_0(q) + (Q - Q_{se})\, \hat W_{ne}(q) \right],$$

which equals the estimated intrinsic payoff when $Q = Q_{se}$, and equals the sum of the estimated intrinsic and network payoffs for the NE directory when $Q = Q_{ne}$. Regarding the cost function, note that we identify only the marginal cost $C'(\cdot)$. However, if $Q$ does not change much, the publisher still decides on the margin using $C'(\cdot)$. Assuming zero fixed cost would not change the results on consumer surplus or the publisher's revenue.

Shut down market level network effects

Our first counterfactual experiment investigates market level network effects by shutting them down. To do so, we assume that businesses do not enjoy market level network effects. Formally, their utility is now written as

$$U_{new}(q, \theta, Q) = \theta \hat V_0(q).$$

We then use $(\hat C(\cdot), U_{new}(\cdot,\cdot,\cdot), \hat f_{ne}(\cdot))$ to simulate the market outcomes without market level

network effects. The new equilibrium is defined by

$$\theta \hat V_0'(q) = \hat C'(Q) + \frac{1 - \hat F_{ne}(\theta)}{\hat f_{ne}(\theta)} \hat V_0'(q),$$

$$T'(q) = \theta \hat V_0'(q),$$

along with two boundary conditions, (i) $q(\underline\theta; Q) = 0$ and (ii) $T(0) = 0$, and a self-fulfilling expectation condition, $\int_{\underline\theta}^{\bar\theta} q(\theta; Q)\,d\theta = Q$. We remark that $q(\cdot; Q)$ is defined as the inverse mapping of $\theta(\cdot; Q) \equiv \hat H^{-1}[\hat C'(Q)/\hat V_0'(\cdot)]$, where $\hat H(\cdot) \equiv \cdot - \frac{1 - \hat F(\cdot)}{\hat f(\cdot)}$ is increasing and $\hat V_0'(\cdot)$ is decreasing. Thus, the implied $\theta(\cdot; Q)$ is increasing. To find the new equilibrium total purchase $Q$, we use the bisection method. In particular, for any value of $Q$, we can solve the above two equations and find the optimal assignment schedule $q(\cdot; Q)$ and tariff function $T(\cdot; Q)$. We then use the bisection method to search for the $Q$ such that $\int_{\underline\theta}^{\bar\theta} q(\theta; Q)\,d\theta = Q$. Since only industry level network effects remain in this model, we denote the optimal purchases and tariff function in equilibrium by $q_{ind}(\cdot)$ and $T_{ind}(\cdot)$, respectively.

Shut down all network effects

Recall that $\theta$ aggregates all the information that the publisher cannot discriminate on, including the competition status in the industry. The industry level network effects are reflected in $\theta$. To shut down industry level network effects, we calculate a new $\vartheta_i$ as $\vartheta_i^{new} = \vartheta_i - \hat\alpha_1' Competition_i$. We then calculate a new taste using $\theta_i^{new} = \exp(\vartheta_i^{new})/(1 + \exp(\vartheta_i^{new}))$, which represents the business's taste for yellow page advertising without pressure from its competitors. We estimate a new density function $f^{new}(\cdot)$ using the new type sample $\{\theta_i^{new}\}_{i=1,2,\ldots,N}$. Table 2.4 gives summary statistics on $\theta^{new}$ and Figure 2.3 gives a kernel estimate of the new density function $f_{ne}^{new}(\cdot)$. After shutting down industry level network effects, firms tend to have a smaller willingness to pay for advertising in the yellow page directory. Thus we strengthen the findings in Rysman (2004) by finding positive network effects at the industry level.
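The bisection search for the self-fulfilling total purchase described above can be sketched on a toy specification. Everything specific in the code is an assumption made for illustration and is not taken from the estimates: types are uniform on [0, 1] so that $H(\theta) = 2\theta - 1$, the marginal intrinsic payoff is $V_0'(q) = 1 - q$, and the marginal cost is $C'(Q) = 0.2 + 0.5Q$.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def marginal_cost(Q):
    """Hypothetical increasing marginal cost C'(Q)."""
    return 0.2 + 0.5 * Q

def purchase(theta, Q):
    """q(theta; Q) solving H(theta) * V0'(q) = C'(Q) in the toy model.

    With H(theta) = 2 * theta - 1 and V0'(q) = 1 - q, served types buy
    q = 1 - C'(Q) / H(theta); types with H(theta) <= C'(Q) buy nothing.
    """
    H = 2.0 * theta - 1.0
    c = marginal_cost(Q)
    q = np.zeros_like(theta)
    served = H > c
    q[served] = 1.0 - c / H[served]
    return q

def aggregate_gap(Q, theta_grid):
    """Implied total purchase minus the conjectured total Q."""
    return trapezoid(purchase(theta_grid, Q), theta_grid) - Q

def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f on [lo, hi], assuming f changes sign there."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if abs(f_mid) < tol or hi - lo < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

theta_grid = np.linspace(0.0, 1.0, 2001)
Q_star = bisect(lambda Q: aggregate_gap(Q, theta_grid), 0.0, 1.0)
```

Because a higher conjectured $Q$ raises marginal cost and lowers every type's purchase in this toy model, the gap is strictly decreasing in $Q$ and the bisection has a unique root. In the actual counterfactual, `purchase` would be replaced by the schedule implied by the estimated primitives.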
To calculate the effects of shutting down all network effects, we simulate the market

outcome using $(\hat C(\cdot), U_{new}(\cdot,\cdot,\cdot), f^{new}(\cdot))$. Since there are no network effects of any type in this model, we denote the optimal purchases and tariff function by $q_0(\cdot)$ and $T_0(\cdot)$, respectively.

Discussion and the Counterfactual Results

As shown in Shearer (2004), if network value is increasing in individual and gross consumption but constant in $\theta$, then individual consumption increases for all but the highest type, i.e., $q_{full}(\cdot) \geq q_{ind}(\cdot)$. Profits and consumer surplus increase as well. Our first counterfactual experiment confirms this result and quantifies market level network effects in the NE market.

In theory, it is ambiguous how publisher profit and business surplus should change without industry level network effects. If $\hat\alpha_1 < 0$, industry level network effects are negative; in other words, there are congestion effects. If $\hat\alpha_1 > 0$, there is an arms race among competitors in the same industry. In either case, it is unclear how consumer surplus changes if we shut down industry level network effects. For instance, if $\hat\alpha_1 < 0$, businesses have larger tastes without industry level network effects. On the other hand, they may become more or less heterogeneous after this transformation. Suppose businesses' tastes become more similar without competition. Then the publisher can leave less informational rent and achieve a new second-best profit. In this case, consumer surplus decreases.

We compare the total purchase, the publisher's revenue, business surplus and social welfare under three scenarios: (a) the full model (the one with the estimated model primitives), (b) a model with only industry level network effects (no market level network effects), and (c) a model with no network effects. Shutting down market level network effects would lead to a 26.36% decrease in the publisher's revenue and a 26.93% decrease in advertiser surplus, together with a 9.13% decrease in total purchase.
Shutting down all network effects would lead to a 46.42% decrease in the publisher's revenue and a 43.94% decrease in advertiser surplus, together with a 17.24% decrease in total purchase.

Figure 2.4 displays the resulting tariff functions, denoted by $T_{full}(\cdot)$, $T_{ind}(\cdot)$ and $T_0(\cdot)$, for the three scenarios. The publisher can exploit network effects by charging a more expensive

price schedule when there are network effects. Figure 2.5 displays the resulting assignment schedules, denoted by $q_{full}(\cdot)$, $q_{ind}(\cdot)$ and $q_0(\cdot)$, for the three scenarios. Relative to Case (b), firms buy more advertising with both kinds of network effects. This is consistent with the results in Shearer (2004). In Case (c), as the tariff schedule is much less expensive, a business with a given taste $\theta$ tends to buy more advertising. However, as businesses tend to have smaller tastes, the total purchase is smaller relative to the cases with network effects.

2.6 Conclusion

This paper proposes a new approach to identify models with network effects by invoking another side of the market. It allows multiple channels as well as nonlinearity for social interactions. Our running empirical application investigates the role of network effects in local businesses' decisions on advertising in yellow pages directories. With hand-collected data on advertisers' purchases and nonlinear price schedules from 7 districts in Toronto, we estimate a model that allows for heterogeneous business demand and endogenizes the publisher's optimal nonlinear pricing decision. We find that network effects are positive at both the market level, i.e., an increase in the total quantity of advertising leads to an increase in consumer usage, and the industry level, i.e., a higher average purchase in the industry leads to higher taste. Our counterfactuals evaluate the effects of shrinking network effects in this declining industry.

The methodology we develop in this paper is also applicable to other industries, e.g., public transportation with endogenous congestion and telecommunications with endogenous network externalities. It is also useful for studying technology adoption with network externalities. Yellow pages are the simplest platforms where two sides, consumers and businesses, interact.
While the divide-and-conquer strategy simplifies our setting, incorporating both sides and the platform may become necessary in other settings. Endogenizing all three players' decisions is challenging but potentially applicable to studying platforms, such as dating

websites and so on.

2.7 Appendix: Tables and Graphs

Table 2.1: Revenue Ranking by Industry Headings in Core SE Directories
Industry heading | Revenue | Percentage
Lawyers | $ | %
Plumbing Contracts | $ | %
Dentists | $ | %
Personnel consultants | $ | %
Drafting services | $ | %
Roofing Contractors | $ | %
Heating contractors | $ | %
Storage-self service | $ | %
Electric contractors | $ | %
Funeral homes | $ | %
Total | $ | %

Table 2.2: Revenue Ranking by Industry Headings in Core NE Directories
Industry heading | Revenue | Percentage
Lawyers | $ | %
Plumbing Contracts | $ | %
Bankruptcies-trustees | $ | %
Dentists | $ | %
Pest Control Services | $ | %
Waterproof Contracts | $ | %
Appliances-major-sales&service | $ | %
Roofing Contractors | $ | %
Electric Contractors | $ | %
Rubbish Removal | $ | %
Total | $ | %

Table 2.3: Summary Statistics on SE Yellow Page Directory
Variable | Mean | Std. Dev. | Min | Max | Total
t | | | | | e+6
q adj | | | | | e+5
$\hat\theta$ | | | | |
rent | | | | | e+6
rentratio | | | | |

Table 2.4: Summary Statistics on NE Yellow Page Directory
Variable | Mean | Std. Dev. | Min | Max | Total
t | | | | | e+6
q adj | | | | | e+5
$\hat\theta$ | | | | |
rent | | | | | e+6
rentratio | | | | |
netutility | | | | | e+6
netratio | | | | |
$\theta^{new}$ | | | | |

Figure 2.1: Distribution Areas of the 7 Directories

EC476 Contracts and Organizations, Part III: Lecture 2

EC476 Contracts and Organizations, Part III: Lecture 2 EC476 Contracts and Organizations, Part III: Lecture 2 Leonardo Felli 32L.G.06 19 January 2015 Moral Hazard: Consider the contractual relationship between two agents (a principal and an agent) The principal

More information

1. Linear Incentive Schemes

1. Linear Incentive Schemes ECO 317 Economics of Uncertainty Fall Term 2009 Slides to accompany 20. Incentives for Effort - One-Dimensional Cases 1. Linear Incentive Schemes Agent s effort x, principal s outcome y. Agent paid w.

More information

Moral Hazard: Part 1. April 9, 2018

Moral Hazard: Part 1. April 9, 2018 Moral Hazard: Part 1 April 9, 2018 Introduction In a standard moral hazard problem, the agent A is characterized by only one type. As with adverse selection, the principal P wants to engage in an economic

More information

Optimal Incentive Contract with Costly and Flexible Monitoring

Optimal Incentive Contract with Costly and Flexible Monitoring Optimal Incentive Contract with Costly and Flexible Monitoring Anqi Li 1 Ming Yang 2 1 Department of Economics, Washington University in St. Louis 2 Fuqua School of Business, Duke University May 2016 Motivation

More information

This is designed for one 75-minute lecture using Games and Information. October 3, 2006

This is designed for one 75-minute lecture using Games and Information. October 3, 2006 This is designed for one 75-minute lecture using Games and Information. October 3, 2006 1 7 Moral Hazard: Hidden Actions PRINCIPAL-AGENT MODELS The principal (or uninformed player) is the player who has

More information

General idea. Firms can use competition between agents for. We mainly focus on incentives. 1 incentive and. 2 selection purposes 3 / 101

General idea. Firms can use competition between agents for. We mainly focus on incentives. 1 incentive and. 2 selection purposes 3 / 101 3 Tournaments 3.1 Motivation General idea Firms can use competition between agents for 1 incentive and 2 selection purposes We mainly focus on incentives 3 / 101 Main characteristics Agents fulll similar

More information

Deceptive Advertising with Rational Buyers

Deceptive Advertising with Rational Buyers Deceptive Advertising with Rational Buyers September 6, 016 ONLINE APPENDIX In this Appendix we present in full additional results and extensions which are only mentioned in the paper. In the exposition

More information

Optimal Insurance of Search Risk

Optimal Insurance of Search Risk Optimal Insurance of Search Risk Mikhail Golosov Yale University and NBER Pricila Maziero University of Pennsylvania Guido Menzio University of Pennsylvania and NBER November 2011 Introduction Search and

More information

Game Theory and Economics of Contracts Lecture 5 Static Single-agent Moral Hazard Model

Game Theory and Economics of Contracts Lecture 5 Static Single-agent Moral Hazard Model Game Theory and Economics of Contracts Lecture 5 Static Single-agent Moral Hazard Model Yu (Larry) Chen School of Economics, Nanjing University Fall 2015 Principal-Agent Relationship Principal-agent relationship

More information

Game Theory, Information, Incentives

Game Theory, Information, Incentives Game Theory, Information, Incentives Ronald Wendner Department of Economics Graz University, Austria Course # 320.501: Analytical Methods (part 6) The Moral Hazard Problem Moral hazard as a problem of

More information

Moral Hazard: Hidden Action

Moral Hazard: Hidden Action Moral Hazard: Hidden Action Part of these Notes were taken (almost literally) from Rasmusen, 2007 UIB Course 2013-14 (UIB) MH-Hidden Actions Course 2013-14 1 / 29 A Principal-agent Model. The Production

More information

The Firm-Growth Imperative: A Theory of Production and Personnel Management

The Firm-Growth Imperative: A Theory of Production and Personnel Management The Firm-Growth Imperative: A Theory of Production and Personnel Management Rongzhu Ke Hong Kong Baptist University Jin Li London School of Economics Michael Powell Kellogg School of Management Management

More information

1 The Basic RBC Model

1 The Basic RBC Model IHS 2016, Macroeconomics III Michael Reiter Ch. 1: Notes on RBC Model 1 1 The Basic RBC Model 1.1 Description of Model Variables y z k L c I w r output level of technology (exogenous) capital at end of

More information
