Returns to Tenure. Christopher Taber. March 31, Department of Economics University of Wisconsin-Madison

Returns to Tenure Christopher Taber Department of Economics University of Wisconsin-Madison March 31, 2008

Outline
1 Basic Framework
2 Abraham and Farber
3 Altonji and Shakotko
4 Topel

Basic Framework

Let's start with a basic framework for thinking about turnover and job-specific human capital. The first important concept is the returns to seniority. If there is firm-specific human capital and the worker has some bargaining power, then the longer a worker has been at a firm, the higher the wage will be. This is called the seniority effect, or the returns to tenure.

Another reason for the seniority effect comes from principal-agent considerations. If you are worried about the worker shirking, you might want to backload pay: a worker who gets caught shirking is fired and loses the deferred pay. This only works at the firm level, since a new employer has no obligation to honor the arrangement if the worker switches jobs.

Matching component

The second issue is the matching component. People are better at some jobs than others. For example, I am not so bad at economics, but I would be pretty bad working in an auto shop. There are multiple reasons why this is important, although we need some friction or lack of information so that people don't just move to the best job immediately.

Putting these together we get an expression like this:

$$w_{ijt} = \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}$$

(I don't like this notation, but everyone else used it so I will), where

Variable                  Definition
$w_{ijt}$                 Wage
$T_{ijt}$                 Tenure
$E_{it}$                  Experience
$\theta_i$                Permanent individual component
$\eta_{ij}$               Firm/worker match
$\varepsilon_{ijt}$       Transitory error term

So how will this work? If a worker is sitting at a firm and gets an outside offer, what do they worry about?

Why can't we just run OLS to estimate the model? The biggest problem is that $T_{ijt}$ is likely to be positively correlated with $\eta_{ij}$. That is, people who are particularly well matched to a job are likely to stay at that job for a long time.

To deal with this we need an instrument: a variable that is correlated with $T_{ijt}$ but uncorrelated with $\eta_{ij}$. A number of different papers have examined this question.


The first paper we look at is Abraham and Farber. There is another issue here, which is that once a job has started, $T_{ijt}$ and $E_{it}$ are perfectly collinear within the job.

Abraham and Farber choose a slightly different parameterization. They define (my notation) $E^0_{ij}$ as the level of experience at the beginning of the job. This means that

$$E_{it} = E^0_{ij} + T_{ijt}.$$

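Put this into the parameterization:

$$
\begin{aligned}
w_{ijt} &= \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt} \\
        &= \alpha T_{ijt} + \beta \left(E^0_{ij} + T_{ijt}\right) + \theta_i + \eta_{ij} + \varepsilon_{ijt} \\
        &= (\alpha + \beta)\, T_{ijt} + \beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt} \\
        &\equiv \tilde{\alpha}\, T_{ijt} + \beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt},
\end{aligned}
$$

where $\tilde{\alpha} \equiv \alpha + \beta$.

They begin by worrying about the relationship between $E^0_{ij}$ and $\eta_{ij}$. People who have been in the labor market for a long period of time will tend to have found better matches. Thus these variables will tend to be positively related.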

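Abraham and Farber recognize this by writing the regression of $\eta_{ij}$ on $E^0_{ij}$ as

$$\eta_{ij} = \delta E^0_{ij} + \phi_{ij}.$$

Substituting in, with $\tilde{\alpha} = \alpha + \beta$,

$$
\begin{aligned}
w_{ijt} &= \tilde{\alpha}\, T_{ijt} + \beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt} \\
        &= \tilde{\alpha}\, T_{ijt} + (\beta + \delta)\, E^0_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt}.
\end{aligned}
$$

Note that $\delta$ is really part of the causal return to experience, so $\beta + \delta$ represents the full returns to experience.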

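We have still not solved the problem that $T_{ijt}$ is likely to be related to $\phi_{ij}$.

Let $D_{ij}$ be the completed duration of the job (not typically observed). They define

$$D_{ij} = \gamma \eta_{ij} + \epsilon_{ij} = \delta\gamma E^0_{ij} + \gamma \phi_{ij} + \epsilon_{ij},$$

where $\epsilon_{ij}$ is a duration error, distinct from the transitory wage error $\varepsilon_{ijt}$.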

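They argue that in a given cross section

$$E\left(T_{ijt}\right) = \tfrac{1}{2} E\left(D_{ij}\right).$$

Thus

$$T_{ijt} = \tfrac{1}{2} D_{ij} + \xi_{ijt} = \frac{\delta\gamma}{2} E^0_{ij} + \frac{\gamma}{2} \phi_{ij} + \tfrac{1}{2} \epsilon_{ij} + \xi_{ijt}.$$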

They argue that this suggests $\xi_{ijt}$ as a natural instrument for $T_{ijt}$:

It is correlated with $T_{ijt}$ by design.
It is uncorrelated with everything else by design.

Thus if you observe completed duration, you can construct the instrument as $T_{ijt} - D_{ij}/2$. More generally, if spells aren't 1/2 of completed duration on average, you can just run the regression

$$T_{ijt} = \omega_0 + \omega_1 D_{ij} + \xi_{ijt}$$

and use the residual.

There is also a simpler but similar approach. We can just include $D_{ij}$ in the regression model:

$$w_{ijt} = \tilde{\alpha}\, T_{ijt} + (\beta + \delta)\, E^0_{ij} + \rho D_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt},$$

where $\tilde{\alpha} = \alpha + \beta$. The idea is quite simply that, conditional on $D_{ij}$, there is no reason for $T_{ijt}$ to be correlated with $\phi_{ij}$. Let's see what this converges to.

Let's remind ourselves of a couple of tricks for figuring this stuff out. First think about a regression where we partition our model into

$$Y = X_1 \beta_1 + X_2 \beta_2 + U.$$

Using the partitioned inverse formula, it is straightforward to show that $\hat{\beta}_1$ can be obtained by:

1 Regress $X_1$ on $X_2$.
2 Take the residuals from these regressions.
3 Regress $Y$ on the residuals.

The other thing is that if we know $\hat{\beta}_1$, we can obtain $\hat{\beta}_2$ by regressing $Y - X_1 \hat{\beta}_1$ on $X_2$.
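The two partitioned-regression tricks can be checked numerically. The following is a minimal sketch on simulated data; the variable names, sample size, and coefficient values are all made up for illustration, and `ols` is a hypothetical helper, not something from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: X1 is a single regressor, X2 holds a constant and two controls.
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
X1 = X2 @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)
Y = 2.0 * X1 + X2 @ np.array([1.0, 0.3, -0.7]) + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Benchmark: full regression of Y on (X1, X2).
beta_full = ols(np.column_stack([X1, X2]), Y)

# Trick 1: residualize X1 on X2, then regress Y on the residual.
X1_resid = X1 - X2 @ ols(X2, X1)
beta1_hat = ols(X1_resid.reshape(-1, 1), Y)[0]

# Trick 2: given beta1_hat, regress Y - X1 * beta1_hat on X2.
beta2_hat = ols(X2, Y - X1 * beta1_hat)

print(beta_full[0], beta1_hat)   # the two estimates of beta_1 coincide
print(beta_full[1:], beta2_hat)  # and so do the estimates of beta_2
```

Both equalities are exact algebraic identities in the sample, so they hold up to floating-point error regardless of the simulated values.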

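This is straightforward to see, as we can write the moment conditions from this as

$$X_2' \left(Y - X_1 \hat{\beta}_1 - X_2 \hat{\beta}_2\right) = 0.$$

Now let's use this to look at the problem above:

$$w_{ijt} = \tilde{\alpha}\, T_{ijt} + (\beta + \delta)\, E^0_{ij} + \rho D_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt},$$

where $\tilde{\alpha} = \alpha + \beta$. First think of estimating the plim of the coefficient on $T_{ijt}$.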

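Since $T_{ijt}$ will be uncorrelated with $E^0_{ij}$ conditional on $D_{ij}$, regressing $T_{ijt}$ on $D_{ij}$ and $E^0_{ij}$ yields

$$T_{ijt} = \omega_0 + \omega_1 D_{ij} + \xi_{ijt},$$

so, writing $\tilde{\alpha} = \alpha + \beta$ for the coefficient on tenure,

$$
\begin{aligned}
\operatorname{plim} \hat{\tilde{\alpha}}
 &= \frac{\operatorname{cov}\left(\xi_{ijt}, w_{ijt}\right)}{\operatorname{var}\left(\xi_{ijt}\right)} \\
 &= \frac{\operatorname{cov}\left(\xi_{ijt},\ \tilde{\alpha}\, T_{ijt} + (\beta + \delta) E^0_{ij} + \rho D_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt}\right)}{\operatorname{var}\left(\xi_{ijt}\right)} \\
 &= \tilde{\alpha}\, \frac{\operatorname{cov}\left(\xi_{ijt}, T_{ijt}\right)}{\operatorname{var}\left(\xi_{ijt}\right)} \\
 &= \tilde{\alpha}\, \frac{\operatorname{cov}\left(\xi_{ijt},\ \omega_0 + \omega_1 D_{ij} + \xi_{ijt}\right)}{\operatorname{var}\left(\xi_{ijt}\right)} \\
 &= \tilde{\alpha}.
\end{aligned}
$$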

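But now what about the coefficient on experience? Let's think about the moment conditions from the regression, where $\tilde{\alpha} = \alpha + \beta$ is the coefficient on tenure and $b_1$, $b_2$ are the coefficients on $E^0_{ij}$ and $D_{ij}$:

$$
\begin{aligned}
E\left(E^0_{ij}\left[w_{ijt} - \tilde{\alpha} T_{ijt} - b_1 E^0_{ij} - b_2 D_{ij}\right]\right) &= 0 \\
E\left(D_{ij}\left[w_{ijt} - \tilde{\alpha} T_{ijt} - b_1 E^0_{ij} - b_2 D_{ij}\right]\right) &= 0
\end{aligned}
$$

so that

$$
\begin{aligned}
E\left(E^0_{ij}\left[\beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt}\right]\right) &= b_1 \operatorname{var}\left(E^0_{ij}\right) + b_2 \operatorname{cov}\left(E^0_{ij}, D_{ij}\right) \\
E\left(D_{ij}\left[\beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt}\right]\right) &= b_1 \operatorname{cov}\left(E^0_{ij}, D_{ij}\right) + b_2 \operatorname{var}\left(D_{ij}\right).
\end{aligned}
$$

To economize on space, let $V$ denote variance and $C$ covariance:

$$
\begin{aligned}
\beta V\left(E^0_{ij}\right) + \delta V\left(E^0_{ij}\right) &= b_1 V\left(E^0_{ij}\right) + b_2 C\left(E^0_{ij}, D_{ij}\right) \\
\beta C\left(E^0_{ij}, D_{ij}\right) + \gamma V\left(\eta_{ij}\right) &= b_1 C\left(E^0_{ij}, D_{ij}\right) + b_2 V\left(D_{ij}\right).
\end{aligned}
$$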

( ) βv Eij 0 V ( ) ( ) D ij + δv Eij 0 V ( ) D ij b 1 = ( ) V Eij 0 V ( ) ( ) ( ) D ij C Eij 0, D ij C Eij 0, D ij ( ) ( ) βc Eij 0, D ij C Eij 0, D ij + γv ( ) ( ) η ij C Eij 0, D ij ( ) V Eij 0 V ( ) ( ) ( ) D ij C Eij 0, D ij C Eij 0, D ij ( ) δv Eij 0 V ( ) ( ) ( ) D ij γv ηij C Eij 0, D ij =β + ( ) V Eij 0 V ( ) ( ) ( ) D ij C Eij 0, D ij C Eij 0, D ij ( ) δv Eij 0 V ( ) ( ) ( [ ] ) γη ij + ɛ ij γv ηij C Eij 0, γ δeij 0 + φ ij + ɛ ij =β + ( ) V Eij 0 V ( ) ( ( [ ] )) 2 γη ij + ɛ ij C Eij 0, γ δeij 0 + φ ij + ɛ ij

= β + δ ( V E 0 ij ( ) V Eij 0 V ( ) ɛ ij ) V ( ) [ ( γη ij + ɛ ij γδv E 0 ij )] 2 This extra term must be less than 1 so the coefficient is biased downward Thus we understate the returns to experience This means that we will overstate the net returns to tenure estimated as : α b 1 Thus at least we get some idea of how big this effect could be
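A small simulation can illustrate this result. The sketch below assumes normal, mutually independent shocks and made-up parameter values (none of these come from Abraham and Farber), and it treats all variables as demeaned, so tenure is allowed to be negative. Using $V(\eta_{ij}) = \delta^2 V(E^0_{ij}) + V(\phi_{ij})$, the fraction multiplying $\delta$ simplifies to $V(\epsilon_{ij}) / \left[\gamma^2 V(\phi_{ij}) + V(\epsilon_{ij})\right]$, which is what the code compares against.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical parameter values, chosen only to make the bias visible.
alpha_tilde, beta, delta, gamma = 0.5, 0.3, 0.4, 1.5
v_phi, v_eps = 1.0, 2.0   # variances of phi_ij and of the duration error

E0 = rng.normal(size=n)                                      # initial experience
phi = rng.normal(scale=np.sqrt(v_phi), size=n)
eta = delta * E0 + phi                                       # match quality
D = gamma * eta + rng.normal(scale=np.sqrt(v_eps), size=n)   # completed duration
T = 0.5 * D + rng.normal(size=n)                             # tenure = D/2 + xi
theta = rng.normal(size=n)
w = alpha_tilde * T + beta * E0 + theta + eta + rng.normal(size=n)

# Regress wages on a constant, tenure, initial experience, and completed duration.
X = np.column_stack([np.ones(n), T, E0, D])
coef = np.linalg.lstsq(X, w, rcond=None)[0]
a_hat, b1_hat = coef[1], coef[2]

# Probability limit of the experience coefficient implied by the formula.
b1_theory = beta + delta * v_eps / (gamma**2 * v_phi + v_eps)

print(a_hat, alpha_tilde)               # tenure coefficient is close to alpha_tilde
print(b1_hat, b1_theory, beta + delta)  # experience coefficient falls short of beta + delta
```

Under these assumptions the tenure coefficient should be close to $\tilde{\alpha}$, while the experience coefficient should sit near the formula value and below $\beta + \delta$.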

Obviously there is one remaining big problem: in many cases we do not know $D_{ij}$ exactly. They come up with a way of simulating it. They use a proportional hazard Weibull model

$$\Pr\left(D \geq T\right) = \exp\left(-\lambda T^{\tau}\right), \qquad \lambda = e^{Z'\Gamma}.$$

You can estimate $\Gamma$ and $\tau$ by maximum likelihood.

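Then, assuming that all jobs end at age 65, it is straightforward to show that

$$E\left(D \mid D > S_f, Z\right) = \frac{1}{\exp\left(-\lambda S_f^{\tau}\right)} \int_{S_f}^{S_{65}} \lambda \tau t^{\tau} e^{-\lambda t^{\tau}}\, dt + \frac{\exp\left(-\lambda S_{65}^{\tau}\right)}{\exp\left(-\lambda S_f^{\tau}\right)}\, S_{65}.$$

They use this as a proxy for $D_{ij}$ when they don't have it.

Let's look at the results.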
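The conditional expectation above has no closed form for general $\tau$, but it is easy to evaluate numerically. The following is a minimal sketch, not Abraham and Farber's procedure: the function name `expected_duration`, the grid size, and the parameter values are hypothetical, and the integral is approximated with a simple trapezoid rule. The arguments `s_f` and `s_65` stand for $S_f$ and $S_{65}$ in the formula.

```python
import numpy as np

def expected_duration(s_f, s_65, lam, tau, n_grid=20_001):
    """E(D | D > s_f) for Weibull durations that are cut off at s_65."""
    t = np.linspace(s_f, s_65, n_grid)
    # t times the Weibull density lam * tau * t**(tau - 1) * exp(-lam * t**tau)
    integrand = lam * tau * t**tau * np.exp(-lam * t**tau)
    # Trapezoid rule for the integral from s_f to s_65.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    surv_f = np.exp(-lam * s_f**tau)     # Pr(D >= s_f)
    surv_65 = np.exp(-lam * s_65**tau)   # Pr(D >= s_65)
    return (integral + surv_65 * s_65) / surv_f

# With tau = 1 the model is exponential and the answer has a closed form:
# s_f + (1 / lam) * (1 - exp(-lam * (s_65 - s_f))).
value = expected_duration(s_f=3.0, s_65=30.0, lam=0.1, tau=1.0)
closed_form = 3.0 + 10.0 * (1.0 - np.exp(-0.1 * 27.0))
print(value, closed_form)
```

The exponential case is used only as a sanity check on the numerical integration; in an application $\lambda$ and $\tau$ would come from the maximum likelihood estimates.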

There are a number of problems with this approach that we might be worried about. What are the main ones?


The next paper we look at that examines this question is Altonji and Shakotko (RES, 1987). They take the same basic setup. Let's focus on the specification

$$w_{ijt} = \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}.$$

They actually have higher order terms and some other stuff, but that is not important for the main idea.

We are worried that $T_{ijt}$ is correlated with $\theta_i$ and $\eta_{ij}$.

Let $\tau_{ij}$ be the set of $t$ for which we can observe individual $i$ on job $j$, and $N_{ij}$ the number of such observations. They then use as their instrument

$$\tilde{T}_{ijt} \equiv T_{ijt} - \bar{T}_{ij}, \qquad \text{where} \qquad \bar{T}_{ij} \equiv \frac{1}{N_{ij}} \sum_{t \in \tau_{ij}} T_{ijt}.$$

This has the really cool feature of being uncorrelated with $\theta_i$ and $\eta_{ij}$ by construction.
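To see why the orthogonality holds by construction, here is a minimal sketch that builds the deviation-from-job-mean instrument on a tiny made-up panel. The arrays `job_id`, `tenure`, and `eta` are hypothetical illustrations, with one row per observed year on a job.

```python
import numpy as np

# Hypothetical panel: job_id labels each (i, j) pair, one row per observed year.
job_id = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
tenure = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0])
eta = np.array([0.4, -1.1, 0.7])[job_id]   # any component that is constant within a job

# Within-job mean of tenure, broadcast back to each row.
n_obs = np.bincount(job_id)
t_bar = (np.bincount(job_id, weights=tenure) / n_obs)[job_id]

# Instrument: deviation of tenure from its within-job mean.
t_tilde = tenure - t_bar

# The deviations sum to zero within each job, so the sample cross moment
# with any job-constant variable is zero.
cross_moment = np.sum(t_tilde * eta)
print(t_tilde)
print(cross_moment)
```

The same argument applies to $\theta_i$, since anything fixed for the individual is also fixed within each of that individual's jobs.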

There is one major problem though: we still have that $E_{it}$ is likely to be positively correlated with $\eta_{ij}$. In general we think that this means that the estimate of $\beta$ is likely to be biased upward. Since $T_{ijt}$ and $E_{it}$ are going to tend to be positively related, this means that the estimate of $\alpha$ will tend to be biased downward.

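Let's try to work this out formally in a way similar to before. With $\tilde{T}_{ijt}$ the deviation of tenure from its within-job mean, we get the moment conditions

$$
\begin{aligned}
E\left[\tilde{T}_{ijt}\left(w_{ijt} - b_1 T_{ijt} - b_2 E_{it}\right)\right] &= 0 \\
E\left[E_{it}\left(w_{ijt} - b_1 T_{ijt} - b_2 E_{it}\right)\right] &= 0
\end{aligned}
$$

which gives

$$
\begin{aligned}
\operatorname{cov}\left(\tilde{T}_{ijt},\ \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}\right) &= \operatorname{cov}\left(\tilde{T}_{ijt},\ b_1 T_{ijt} + b_2 E_{it}\right) \\
\operatorname{cov}\left(E_{it},\ \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}\right) &= \operatorname{cov}\left(E_{it},\ b_1 T_{ijt} + b_2 E_{it}\right)
\end{aligned}
$$

or, writing $V$ for variance and $C$ for covariance,

$$
\begin{aligned}
\alpha C\left(\tilde{T}_{ijt}, T_{ijt}\right) + \beta C\left(\tilde{T}_{ijt}, E_{it}\right) &= b_1 C\left(\tilde{T}_{ijt}, T_{ijt}\right) + b_2 C\left(\tilde{T}_{ijt}, E_{it}\right) \\
\alpha C\left(E_{it}, T_{ijt}\right) + \beta V\left(E_{it}\right) + \operatorname{cov}\left(E_{it}, \eta_{ij}\right) &= b_1 C\left(E_{it}, T_{ijt}\right) + b_2 V\left(E_{it}\right).
\end{aligned}
$$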

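Solving the equations gives

$$
\begin{aligned}
b_1 &= \frac{\left[\alpha C\left(\tilde{T}_{ijt}, T_{ijt}\right) + \beta C\left(\tilde{T}_{ijt}, E_{it}\right)\right] V\left(E_{it}\right) - \left[\alpha C\left(E_{it}, T_{ijt}\right) + \beta V\left(E_{it}\right) + C\left(E_{it}, \eta_{ij}\right)\right] C\left(\tilde{T}_{ijt}, E_{it}\right)}{C\left(\tilde{T}_{ijt}, T_{ijt}\right) V\left(E_{it}\right) - C\left(E_{it}, T_{ijt}\right) C\left(\tilde{T}_{ijt}, E_{it}\right)} \\[4pt]
    &= \alpha - \frac{\operatorname{cov}\left(E_{it}, \eta_{ij}\right) \operatorname{cov}\left(\tilde{T}_{ijt}, E_{it}\right)}{\operatorname{cov}\left(\tilde{T}_{ijt}, T_{ijt}\right) \operatorname{var}\left(E_{it}\right) - \operatorname{cov}\left(E_{it}, T_{ijt}\right) \operatorname{cov}\left(\tilde{T}_{ijt}, E_{it}\right)}.
\end{aligned}
$$

So the estimator is biased.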

They come back and deal with this later, but let's forget about this for now. Let's just focus on the IV1 estimator; the other estimators try to get more efficient estimates in various ways.

They also use $T^2_{ijt}$ and $OLDJOB_{ijt} \equiv 1\left(T_{ijt} > 0\right)$. The point of the OLDJOB variable is to allow more flexibility in the relationship.

They find small effects. Next they try to worry about the possible bias. One thing to compare it to is fixed effects estimation of

$$w_{ijt} = \alpha_1 T_{ijt} + \alpha_2 T^2_{ijt} + \alpha_3 OLDJOB_{ijt} + \beta_1 E_{it} + \beta_2 E^2_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}.$$

Note that in this case $\alpha_1$ and $\beta_1$ cannot be separately estimated, as $T_{ijt}$ and $E_{it}$ are perfectly collinear within a job. However, we can estimate their sum and all of the other coefficients, and we can compare that to the IV estimates. They are quite similar, and you cannot reject that they are the same.

Another thing they tried was to instrument for experience using $\tilde{E}_{ijt}$. This led to implausible (and I think imprecise) estimates.

Finally, they try making a bunch of assumptions about $\operatorname{cov}\left(E_{it}, \eta_{ij}\right)$ to get an idea of what the bias might look like. These results are presented in the following table.

Altogether, Altonji and Shakotko would claim that the returns to tenure are small.


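Topel takes a different approach and gets a different answer. Let's focus on the same basic model

$$w_{ijt} = \alpha T_{ijt} + \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}.$$

He notices that for people who do not change jobs,

$$
\begin{aligned}
w_{ijt} - w_{ijt-1} &= \alpha \left(T_{ijt} - T_{ijt-1}\right) + \beta \left(E_{it} - E_{it-1}\right) + \varepsilon_{ijt} - \varepsilon_{ijt-1} \\
                    &= \alpha + \beta + \varepsilon_{ijt} - \varepsilon_{ijt-1}.
\end{aligned}
$$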

Thus one can get a consistent estimate of $\alpha + \beta$ just by differencing and running fixed effects. If we knew $\beta$ we would be done.

How might we estimate $\beta$? Think of using only people at the time of hire. For them, $T_{ijt} = 0$. Thus for new hires the wage equation is

$$w_{ijt} = \beta E_{it} + \theta_i + \eta_{ij} + \varepsilon_{ijt}.$$

Will OLS give us consistent estimates of $\beta$? No, it will be upward biased.

We then estimate $\alpha$ as

$$\hat{\alpha} = \widehat{\alpha + \beta} - \hat{\beta}.$$

Thus if $\hat{\beta}$ is biased upward, $\hat{\alpha}$ will be biased downward. Thus he interprets his estimate as a lower bound of the effect.
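The two-step logic can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in Topel's paper: a balanced panel of jobs observed for four years each, normal shocks, made-up parameter values, and match quality related to experience at hire through a hypothetical coefficient `delta`, so that the new-hire regression is upward biased.

```python
import numpy as np

rng = np.random.default_rng(2)
n_jobs, n_years = 50_000, 4
alpha, beta, delta = 0.3, 0.2, 0.25   # hypothetical values

E0 = rng.normal(size=n_jobs)                   # experience at hire (demeaned)
theta = rng.normal(size=n_jobs)                # permanent individual component
eta = delta * E0 + rng.normal(size=n_jobs)     # match quality, correlated with E0
T = np.arange(n_years)[None, :]                # tenure 0, 1, 2, 3 within each job
E = E0[:, None] + T                            # experience = E0 + tenure
eps = rng.normal(size=(n_jobs, n_years))
w = alpha * T + beta * E + (theta + eta)[:, None] + eps

# Step 1: within-job first differences identify alpha + beta.
sum_hat = np.diff(w, axis=1).mean()

# Step 2: OLS slope of wages on experience among new hires (T = 0).
w_new, e_new = w[:, 0], E[:, 0]
beta_hat = np.cov(e_new, w_new)[0, 1] / np.var(e_new, ddof=1)

# Implied tenure effect: a lower bound when beta_hat is biased upward.
alpha_hat = sum_hat - beta_hat

print(sum_hat, alpha + beta)   # step 1 recovers alpha + beta
print(beta_hat, beta + delta)  # step 2 overstates beta by roughly delta here
print(alpha_hat, alpha)        # so alpha_hat falls below the true alpha
```

In this design the second-step slope converges to $\beta + \delta$, so the implied tenure effect converges to $\alpha - \delta$, which is the sense in which the estimate is a lower bound.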

He actually does something related, but better. We showed before that

$$w_{ijt} = (\alpha + \beta)\, T_{ijt} + \beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt},$$

so

$$w_{ijt} - \widehat{\alpha + \beta}\; T_{ijt} \approx \beta E^0_{ij} + \theta_i + \eta_{ij} + \varepsilon_{ijt}.$$

He implements the second stage by using this approach.

Let's look at the results.

Why are these results so different from Abraham and Farber and from Altonji and Shakotko?

Topel claims that the difference from Abraham and Farber comes primarily from the assumption they make. Take their model as

$$w_{ijt} = (\alpha + \beta)\, T_{ijt} + (\beta + \delta)\, E^0_{ij} + \rho D_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt}.$$

Their key idea is that controlling for $D_{ij}$ will allow us to get consistent estimates (or close).

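Topel points out that one can rewrite this model as

$$w_{ijt} = (\alpha + \beta)\, \bar{T}_{ij} + (\alpha + \beta) \left(T_{ijt} - \bar{T}_{ij}\right) + (\beta + \delta)\, E^0_{ij} + \rho D_{ij} + \theta_i + \phi_{ij} + \varepsilon_{ijt},$$

where $\bar{T}_{ij}$ is the within-job mean of tenure. Since $\left(T_{ijt} - \bar{T}_{ij}\right)$ is orthogonal to everything else in the model, the coefficient in front of it is essentially the fixed effects estimator.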

He interprets what AF do as restricting these two coefficients to be the same. However, this restriction is testable. When he separates things out in this way, this is what he finds.

Note that the AS idea is similar to Topel's:

Both take advantage of the fact that $\left(T_{ijt} - \bar{T}_{ij}\right)$ is exogenous.
Both have a coefficient on $E$ which is biased upward.

Topel claims that the difference between his results and AS's comes from the facts that:

Altonji's estimate is biased downward.
Measurement error is a big deal.
Altonji and Shakotko do not control for time effects in as good a way.

Here is his evidence on the subject.

Altonji did not concede. He came back with a response to Topel (jointly written with Williams). They claim that after dealing with all of these issues more closely, you get effects that are smaller than Topel's but bigger than AS's. In their preferred estimates, the effect of ten years of seniority on log wages is approximately 0.11.

I do not want to get into all of these details, but if you are interested in this subject you should read those papers.