Lecture Notes in Empirical Finance (PhD): Linear Factor Models

Similar documents
Multivariate Tests of the CAPM under Normality

Financial Econometrics Lecture 6: Testing the CAPM model

ASSET PRICING MODELS

R = µ + Bf Arbitrage Pricing Model, APM

Ross (1976) introduced the Arbitrage Pricing Theory (APT) as an alternative to the CAPM.

Linear Factor Models and the Estimation of Expected Returns

Factor Models for Asset Returns. Prof. Daniel P. Palomar

Specification Test Based on the Hansen-Jagannathan Distance with Good Small Sample Properties

ZHAW Zurich University of Applied Sciences. Bachelor s Thesis Estimating Multi-Beta Pricing Models With or Without an Intercept:

GMM - Generalized method of moments

A new test on the conditional capital asset pricing model

Linear Factor Models and the Estimation of Expected Returns

Linear Factor Models and the Estimation of Expected Returns

Notes on empirical methods

Econ 583 Final Exam Fall 2008

Multivariate GARCH models.

The Slow Convergence of OLS Estimators of α, β and Portfolio. β and Portfolio Weights under Long Memory Stochastic Volatility

Using Bootstrap to Test Portfolio Efficiency *

14 Week 5 Empirical methods notes

Inference on Risk Premia in the Presence of Omitted Factors

Regression: Ordinary Least Squares

MFE Financial Econometrics 2018 Final Exam Model Solutions

Asset Pricing with Consumption and Robust Inference

Financial Econometrics Return Predictability

Econometrics of Panel Data

Chapter 11 GMM: General Formulas and Application

Predicting bond returns using the output gap in expansions and recessions

Model Mis-specification

In modern portfolio theory, which started with the seminal work of Markowitz (1952),

The regression model with one fixed regressor cont d

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Zellner s Seemingly Unrelated Regressions Model. James L. Powell Department of Economics University of California, Berkeley

Empirical Methods in Finance Section 2 - The CAPM Model

Bootstrap tests of mean-variance efficiency with multiple portfolio groupings

Chapter 15 Panel Data Models. Pooling Time-Series and Cross-Section Data

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

Economic modelling and forecasting

Lecture 4: Heteroskedasticity

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p

Econ671 Factor Models: Principal Components

Instrumental Variables, Simultaneous and Systems of Equations

Financial Econometrics

A Bootstrap Test for Causality with Endogenous Lag Length Choice. - theory and application in finance

Uniform Inference for Conditional Factor Models with Instrumental and Idiosyncratic Betas

Econometrics of Panel Data

Chapter 1. GMM: Basic Concepts

An Introduction to Generalized Method of Moments. Chen,Rong aronge.net

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Generalized Method Of Moments (GMM)

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Errata for Campbell, Financial Decisions and Markets, 01/02/2019.

A Skeptical Appraisal of Asset-Pricing Tests

Economics 583: Econometric Theory I A Primer on Asymptotics

Unexplained factors and their effects on second pass R-squared s

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Unexplained factors and their effects on second pass R-squared's and t-tests Kleibergen, F.R.; Zhan, Z.

Volume 30, Issue 1. Measuring the Intertemporal Elasticity of Substitution for Consumption: Some Evidence from Japan

ECON4515 Finance theory 1 Diderik Lund, 5 May Perold: The CAPM

ECON 4160, Spring term 2015 Lecture 7

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Answers to Problem Set #4

Final Exam November 24, Problem-1: Consider random walk with drift plus a linear time trend: ( t

ECON 4551 Econometrics II Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova s notes

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econometrics II - EXAM Answer each question in separate sheets in three hours

Large Sample Properties of Estimators in the Classical Linear Regression Model

II.2. Empirical Testing for Asset Pricing Models. 2.1 The Capital Asset Pricing Model, CAPM

Network Connectivity and Systematic Risk

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

Reliability of inference (1 of 2 lectures)

Drawing Inferences from Statistics Based on Multiyear Asset Returns

(θ θ ), θ θ = 2 L(θ ) θ θ θ θ θ (θ )= H θθ (θ ) 1 d θ (θ )

MS&E 226: Small Data

Improvement in Finite Sample Properties of the. Hansen-Jagannathan Distance Test

Introductory Econometrics

Regression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T,

Beta Is Alive, Well and Healthy

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Advanced Econometrics

Arma-Arch Modeling Of The Returns Of First Bank Of Nigeria

Econometrics for PhDs

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Heteroscedasticity and Autocorrelation

Estimating Deep Parameters: GMM and SMM

APT looking for the factors

Tests of the Present-Value Model of the Current Account: A Note

An overview of applied econometrics

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Section 3: Simple Linear Regression

Probabilities & Statistics Revision

TESTING FOR CO-INTEGRATION

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Equity risk factors and the Intertemporal CAPM

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

Ch 2: Simple Linear Regression

Analysis of Cross-Sectional Data

Increasing the Power of Specification Tests. November 18, 2018

Specification Test for Instrumental Variables Regression with Many Instruments

The outline for Unit 3

Transcription:

Contents Lecture Notes in Empirical Finance (PhD): Linear Factor Models Paul Söderlind 8 June 26 University of St Gallen Address: s/bf-hsg, Rosenbergstrasse 52, CH-9 St Gallen, Switzerland E-mail: PaulSoderlind@unisgch Document name: EmpFinPhDLinFTeX 3 Linear Factor Models 2 3 CAPM Tests: Overview 2 32 Testing CAPM: Traditional LS Approach 3 32 CAPM with One Asset: Traditional LS Approach 3 322 CAPM with Several Assets: Traditional LS Approach 7 33 Testing CAPM: GMM 8 33 CAPM with Several Assets: GMM and a Wald Test 8 332 CAPM and Several Assets: GMM and an LM Test 2 333 Size and Power ohe CAPM Tests 3 334 Choice of Portfolios 4 335 Empirical Evidence 4 34 Testing Multi-Factor Models (Factors are Excess Returns) 5 34 A Multi-Factor Model 5 342 Multi-Factor Model: GMM 6 343 Empirical Evidence 7 344 Coding ohe GMM Problem 8 35 Testing Multi-Factor Models (General Factors) 2 35 GMM Estimation with General Factors 2 352 Traditional Cross-Sectional Regressions as Special Cases 22 353 Alternative Formulation of Moment Conditions using α = β(λ E ) 24 354 What ihe Factor is a Portfolio? 24 355 Empirical Evidence 25 36 Fama-MacBeth 25

MV frontiers before and after (α=) MV frontiers before and after (α=5) Solid curves: 2 assets, Dashed curves: 3 assets Chapter 3 µ 5 µ 5 Linear Factor Models 3 CAPM Tests: Overview Reference: Cochrane (2) 2; Campbell, Lo, and MacKinlay (997) 5 Let Rit e = R it R be the excess return on asset i in excess over the riskfree asset, and let = R mt R be the excess return on the market portfolio CAPM with a riskfree return says that α i = in Rit e = α + β + ε it, where E ε it = and Cov(, ε it ) = (3) µ 5 σ 5 MV frontiers before and after (α= 4) 5 σ 5 σ The new asset has the abnormal return α compared to the market (of 2 assets) Means 8 5 α + β(er m R f ) Cov matrix Tang portf 256 44 44 N = 2 47 53 NaN α= 47 53 α=5 3 34 34 α= 4 82 9 73 The economic importance of a non-zero intercept (α) is that the tangency portfolio changes ihe test asset is added to the investment opportunity set See Figure 3 for an illustration The basic test of CAPM is to estimate (3) on a single asset and then test ihe intercept is zero This can easily be extended to several assets, where we test if all the intercepts are zero Notice that the test of CAPM can be given two interpretations If we assume that R mt is the correct benchmark, then it is a test of whether asset R it is correctly priced (this is the approach in mutual fund evaluations) Alternatively, if we assume that R it is correctly priced, then it is a test ohe mean-variance efficiency of R mt (compare the Roll critique) Figure 3: MV frontiers with 2 and 3 assets 32 Testing CAPM: Traditional LS Approach 32 CAPM with One Asset: Traditional LS Approach Ihe residuals in the CAPM regression are iid, then the traditional LS approach is just fine: estimate (3) and form a t-test ohe null hypothesis that the intercept is zero Ihe disturbance is iid normally distributed, then this approach is the ML approach To understand the properties ohe LS approch, we use the results in the following remark Remark (Covariance matrix of LS estimator) Consider the regression equation y t = x t b +u t With iid errors that are independent of all regressors (also across observations), 2 3

the LS estimator, ˆb Ls, is asymptotically distributed as T ( ˆb Ls b ) d N(, σ 2 xx ), where σ 2 = E u 2 t and xx = E T t= x t x t /T When the regressors are just a constant (equal to one) and one variable regressor,, so x t =,, then we have xx = E T t= x t x t /T = E Tt= T σ 2 xx = σ 2 E ft 2 (E ) 2 E ft 2 E (In the last line we use Var( ) = E f 2 t (E ) 2 ) E ft 2 = E E ft 2, so E = σ 2 Var( ) + (E ) 2 Var( ) E E This remark fits well with the CAPM regression (3) The variance ohe intercept estimator is therefore Var( ˆα α ) = { + (E ) 2 } Var(ε it )/T (32) Var ( ) = ( + S R 2 f ) Var(ε it)/t, (33) where S R 2 f is the squared Sharpe ratio ohe market portfolio (recall: is the excess return on market portfolio) We see that the uncertainty about the intercept is high when the disturbance is volatile and when the sample is short, but also when the Sharpe ratio of the market is high Note that a large market Sharpe ratio means that the market asks for a high compensation for taking on risk A bit uncertainty about how risky asset i is then gives a large uncertainty about what the risk-adjusted return should be The t-test ohe hypothesis that α = is then ˆα Std( ˆα) = ˆα ( + S R 2 f ) Var(ε it)/t d N(, ) under H : α = (34) Note that this is the distribution under the null hypothesis that the true value ohe intercept is zero, that is, that CAPM is correct (in this respect, at least) Remark 2 (Quadratic forms of normally distributed random variables) Ihe n vector X N(, ), then Y = X X χn 2 Therefore, ihe n scalar random variables 4 X i, i =,, n, are uncorrelated and have the distributions N(, σi 2 ), i =,, n, then Y = i= n X i 2/σ i 2 χn 2 Instead of a t-test, we can use the equivalent chi-square test ˆα 2 Var( ˆα) = ˆα 2 ( + S R 2 f ) Var(ε it)/t d χ 2 under H : α = (35) The chi-square test is equivalent to the t-test when we are testing only one restriction, but it has the advantage that it also allows us to test several restrictions at the same time Both the t-test and the chi square tests are Wald tests (estimate unrestricted model and then test the restrictions) It is quite straightforward to use the properties of minimum-variance frontiers (see Gibbons, Ross, and Shanken (989), and MacKinlay (995)) to show that the test statistic in (35) can be written ˆα i 2 Var( ˆα i ) = (Ŝ R c ) 2 (Ŝ R f ) 2 + (Ŝ R f ) 2 /T, (36) where S R f is the Sharpe ratio ohe market portfolio and S R c is the Sharpe ratio of the tangency portfolio when investment in both the market return and asset i is possible (Recall that the tangency portfolio is the portfolio with the highest possible Sharpe ratio) Ihe market portfolio has the same (squared) Sharpe ratio as the tangency portfolio ohe mean-variance frontier of R it and R mt (so the market portfolio is mean-variance efficient also when we take R it into account) then the test statistic, ˆα 2 i / Var( ˆα i), is zero and CAPM is not rejected Proof (of (36)) From the CAPM regression (3) we have Rit e βi 2 Cov Rmt e = σ m 2 + Var(ε it) β i σm 2 µ i e β i σm 2 σm 2, and = µ e m α i + β i µ e m µ e m Suppose we use this information to construct a mean-variance frontier for both R it and R mt, and we find the tangency portfolio, with excess return Rct e It is straightforward to show that the square ohe Sharpe ratio ohe tangency portfolio is µ e µ e, where µ e is the vector of expected excess returns and is the covariance matrix By using the covariance matrix and mean vector above, we get that the squared Sharpe ratio for the 5

tangency portfolio, µ e µ e, (using both R it and R mt ) is which we can write as ( µ e ) 2 ( c = α2 i µ e ) 2 Var(ε it ) + m, σ c σ m (S R c ) 2 = α2 i Var(ε it ) + (S R m) 2 Use the notation = R mt R and combine this with (33) and to get (36) It is also possible to construct small sample test (that do not rely on any asymptotic results), which may be a better approximation ohe correct distribution in real-life samples provided the strong assumptions are (almost) satisfied The most straightforward modification is to transform (35) into an F,T -test This is the same as using a t-test in (34) since it is only one restriction that is tested (recall that if Z t n, then Z 2 F(, n)) An alternative testing approach is to use an LR or LM approach: restrict the intercept in the CAPM regression to be zero and estimate the model with ML (assuming that the errors are normally distributed) For instanc, for an LR test, the likelihood value (when α = ) is then compared to the likelihood value without restrictions A common finding is that these tests tend to reject a true null hypothesis too often when the critical values from the asymptotic distribution are used: the actual small sample size ohe test is thus larger than the asymptotic (or nominal ) size (see Campbell, Lo, and MacKinlay (997) Table 5) To study the power ohe test (the frequency of rejections of a false null hypothesis) we have to specify an alternative data generating process (for instance, how much extra return in excess ohat motivated by CAPM) and the size ohe test (the critical value to use) Once that is done, it is typically found that these tests require a substantial deviation from CAPM and/or a long sample to get good power 322 CAPM with Several Assets: Traditional LS Approach Suppose we have n test assets Stack (3) expressions (3) for i =,, n as Let = R mt R and stack the expressions (3) for i =,, n as R e t R e nt = α + α n β + β n ε t ε nt, E ε it = and Cov(, ε it ) = (37) This is a system of seemingly unrelated regressions (SUR) with the same regressor (see, for instance, Greene (23) 4) In this case, the efficient estimator (GLS) is LS on each equation separately Moreover, the covariance matrix ohe coefficients is particularly simple To see what the covariances ohe coefficients are, write each ohe the regression equations in (37) on a traditional form Rit e = x t θ i + ε t, where x t = (38) If we define xx = plim T t= x t x t /T, and σ i j = plim T t= ε itε jt /T, (39) then the asymptotic covariance matrix ohe vectors ˆθ i and ˆθ j (assets i and j) is σ i j xx /T, that is, ACov( σ σ n T ˆθ) = xx (3) σ n ˆσ nn In a large sample, the estimator is normally distributed Therefore, under the null hypothesis (that ther intecepts are zero)) we have the following result From Remark we know that the upper left element of xx equals + S R2, so T ˆα d N n, ( + S R 2 ), where = σ σ n (3) σ n ˆσ nn 6 7

In practice we use the sample moments for the covariance matrix To test the null hypothesis that all intercepts are zero, we use the test statistic T ˆα ( + S R 2 ) ˆα χ 2 n (32) As for the case of a single asset, it is straightforward to do an LR or LM test instead Assuming the errors are normally distributed, a restricted model (α = ) is estimated by ML (LS actually), and the properties ohe likelihood function are used for testing 33 Testing CAPM: GMM 33 CAPM with Several Assets: GMM and a Wald Test To test n assets at the same time when the errors are non-iid we make use ohe GMM framework A special case is when the residuals are iid The results in this section will then coincide with those in Section 32 Let = R mt R and stack the expressions (3) for i =,, n as R e t R e nt = α + α n or more compactly β + β n ε t ε nt, E ε it = and Cov(, ε it ) =, (33) R e t = α + β + ε t, E ε t = n and Cov(, ε t ) = n, (34) where α and β are n vectors Clearly, setting n = gives the case of a single test asset The 2n GMM moment conditions are that, at the true values of α and β, E g t (α, β) = 2n, where (35) ε t Rt e α β g t (α, β) = = ( ) (36) ε t R e t α β There are as many parameters as moment conditions, so the GMM estimator picks values of α and β such that the sample analogues of (35) are satisfied exactly ḡ( ˆα, ˆβ) = T g t ( ˆα, ˆβ) = T t= t= Rt e ˆα ˆβ (Rt e = 2n, (37) ˆα ˆβ ) which gives the LS estimator For the inference, we allow for the possibility of non-iid errors (With iid errors we get the same results as in Section 32, at least asymptotically) With point estimates and their sampling distribution it is straightforward to set up a Wald test for the hypothesis that all elements in α are zero ˆα Var( ˆα) ˆα d χ 2 n (38) Remark 3 (Easy coding ohe GMM Problem (37)) Estimate by LS, equation by equation Then, plug in the fitted residuals in (36) to generate time series ohe moments (will be important for the tests) Remark 4 (Distribution of GMM) Let the parameter vector in the moment condition have the true value b Define S = ACov T ḡ (b ) and D = plim ḡ(b ) b When the estimator solves min ḡ (b) S ḡ (b) or when the model is exactly identified, the distribution ohe GMM estimator is T ( ˆb b ) d ( ) N ( k, V ), where V = D S D = D S (D ) Details on the Wald Test To be concrete, consider the case with two assets ( and 2) so the parameter vector is b = α, α 2, β, β 2 Write out (35) as ḡ (α, β) Rt e ḡ 2 (α, β) ḡ 3 (α, β) = α β T R2t e α 2 β 2 T t= (Rt e α β ) = 4, (39) ḡ 4 (α, β) (R2t e α 2 β 2 ) where ḡ (α, β) denotes the sample average ohe first moment condition 8 9

The Jacobian is ḡ / α ḡ / α 2 ḡ / β ḡ / β 2 ḡ(α, β) α, α 2, β, β 2 = ḡ 2 / α ḡ 2 / α 2 ḡ 2 / β ḡ 2 / β 2 ḡ 3 / α ḡ 3 / α 2 ḡ 3 / β ḡ 3 / β 2 ḡ 4 / α ḡ 4 / α 2 ḡ 4 / β ḡ 4 / β 2 = T T t= ft 2 (32) ft 2 Note that, in this case with a linear model, the Jacobian does not involve the parameters that we want to estimate This means that we do not have to worry about evaluating the Jacobian at the true parameter values The probability limit of (32) is simply the expected value, which can written as ( ) D = E ft 2 I 2 = E I 2, (32) where is the Kronecker product For n assets, change I 2 to I n (The last expression applies also to the case of several factors) Remark 5 (Kronecker product) If A and B are matrices, then a B a n B A B = a m B a mn B From Remark 4, we can write the covariance matrix ohe 2n vector of parameters (n parameters in α and another n in β) as ( ) T ˆα ACov ˆβ = D S (D ) (322) The asymptotic covariance matrix of T times the sample moment conditions, eval- uated at the true parameter values, that is at the true disturbances, is defined as S = ACov ( T T ) g t (α, β) = t= s= R(s), where (323) R(s) = E g t (α, β)g t s (α, β) (324) With n assets, we can write (324) in terms ohe n vector ε t as R(s) = E g t (α, β)g t s (α, β) = E ε t ε t ε t s s ε t s = E ( ε t ) ( (The last expression applies also to the case of several factors) s ε t s ) (325) The Newey-West estimator is often a good estimator of S, but the performance ohe test improved, by imposing (correct, of course) restrictions on the R(s) matrices Example 6 (Special case : is independent of ε t s, errors are iid, and n = ) With E these assumptions R(s) = 2 2 if s =, and S = E E ft 2 Var(ε it ) Combining with (32) gives ACov ( T ) ˆα E = ˆβ E E ft 2 Var(ε it), which is the same expression as σ 2 xx in Remark, which assumed iid errors Example 7 (Special case 2: as in Special case, but n ) With these assumptions E R(s) = 2n 2n if s =, and S = E E ft 2 E ε t ε t Combining with (32) gives ACov ( T ) ˆα E = ˆβ E E ft 2 ( E ε t ε t) This follows from the facts that (A B) = A B and (A B)(C D) = AC B D (if conformable)

332 CAPM and Several Assets: GMM and an LM Test We could also construct an LM test instead by imposing α = in the moment conditions (35) and (37) The moment conditions are then Rt e β E g(β) = E (Rt e = 2n (326) β ) Since there are q = 2n moment conditions, but only n parameters (the β vector), this model is overidentified We could either use a weighting matrix in the GMM loss function or combine the moment conditions so the model becomes exactly identified With a weighting matrix, the estimator solves min b ḡ(b) W ḡ(b), (327) where ḡ(b) is the sample average ohe moments (evaluated at some parameter vector b), and W is a positive definite (and symmetric) weighting matrix Once we have estimated the model, we can test the n overidentifying restrictions that all q = 2n moment conditions are satisfied at the estimated n parameters ˆβ If not, the restriction (null hypothesis) that α = n is rejected To combine the moment conditions so the model becomes exactly identified, premultiply by a matrix A to get A n 2n E g(β) = n (328) The model is then tested by testing if all 2n moment conditions in (326) are satisfied at this vector of estimates ohe betas This is the GMM analogue to a classical LM test For instance, to effectively use only the last n moment conditions in the estimation, we specify A E g(β) = n n I n E This clearly gives the classical LS estimator without an intercept Rt e β (Rt e = n (329) β ) Tt= Rt ˆβ e = /T Tt= ft 2 /T (33) can combine the four moment conditions into only two by Rt e β A E g t (β, β 2 ) = E R2t e β 2 (Rt e β ) = 2 (R2t e β 2 ) Remark 9 (Test of overidentifying assumption in GMM) When the GMM estimator solves the quadratic loss function ḡ(β) S ḡ(β) (or is exactly identified), then the J test statistic is T ḡ( ˆβ) S ḡ( ˆβ) d χq k 2, where q is the number of moment conditions and k is the number of parameters Remark (Distribution of GMM, more general results) When GMM solves min b ḡ(b) W ḡ(b) or Aḡ( ˆβ) = k, the distribution ohe GMM estimator and the test of overidentifying assumptions are different than in Remarks 4 and 9 333 Size and Power ohe CAPM Tests The size (using asymptotic critical values) and power in small samples is often found to be disappointing Typically, these tests tend to reject a true null hypothesis too often (see Campbell, Lo, and MacKinlay (997) Table 5) and the power to reject a false null hypothesis is often fairly low These features are especially pronounced when the sample is small and the number of assets, n, is low One useful rule ohumb is that a saturation ratio (the number of observations per parameter) below (or so) is likely to give poor performance ohe test In the test here we have nt observations, 2n parameters in α and β, and n(n + )/2 unique parameters in S, so the saturation ratio is T/(2 + (n + )/2) For instance, with T = 6 and n = or at T = and n = 2, we have a saturation ratio of 8, which is very low (compare Table 5 in CLM) One possible way of dealing with the wrong size ohe test is to use critical values from simulations ohe small sample distributions (Monte Carlo simulations or bootstrap simulations) Example 8 (Combining moment conditions, CAPM on two assets) With two assets we 2 3

334 Choice of Portfolios This type oest is typically done on portfolios of assets, rather than on the individual assets themselves There are several econometric and economic reasons for this The econometric techniques we apply need the returns to be (reasonably) stationary in the sense that they have approximately the same means and covariance (with other returns) throughout the sample (individual assets, especially stocks, can change character as the company moves into another business) It might be more plausible that size or industry portfolios are stationary in this sense Individual portfolios are typically very volatile, which makes it hard to obtain precise estimate and to be able to reject anything It sometimes makes economic sense to sort the assets according to a characteristic (size or perhaps book/market) and then test ihe model is true for these portfolios Rejection ohe CAPM for such portfolios may have an interest in itself 335 Empirical Evidence See Campbell, Lo, and MacKinlay (997) 65 (Table 6 in particular) and Cochrane (2) 22 One ohe more interesting studies is Fama and French (993) (see also Fama and French (996)) They construct 25 stock portfolios according to two characteristics ohe firm: the size and the book value to market value ratio (BE/ME) In June each year, they sort the stocks according to size and BE/ME They then form a 5 5 matrix of portfolios, where portfolio i j belongs to the ith size quantile and the jth BE/ME quantile They run a traditional CAPM regression on each ohe 25 portfolios (monthly data 963 99) and then study ihe expected excess returns are related to the betas as they should according to CAPM (recall that CAPM implies E Rit e = β i E Rmt e ) However, there is little relation between E Rit e and β i (see Cochrane (2) 22, Figure 29) This lack of relation (a cloud in the β i E R e it space) is due to the combination owo features ohe data First, within a BE/ME quantile, there is a positive relation (across size quantiles) between E R e it and β i as predicted by CAPM (see Cochrane (2) 22, Figure 2) Second, within a size quantile there is a negative relation (across BE/ME quantiles) between E R e it and β i in stark contrast to CAPM (see Cochrane (2) 22, Figure 2) Mean excess return US industry portfolios, 947: 25:5 5 5 I D H A GB C J F 5 5 β all A (NoDur) B (Durbl) C (Manuf) D (Enrgy) E (HiTec) F (Telcm) G (Shops) H (Hlth ) I (Utils) J (Other) alpha NaN 99 36 25 358 2 45 32 233 233 pval 2 82 77 4 9 73 8 9 E StdErr NaN 8 82 68 288 23 978 859 95 35 64 Mean excess return US industry portfolios, 947: 25:5 5 5 I D H A GB C J F Figure 32: CAPM, US industry portfolios Excess market return: 74% 5 5 Predicted mean excess return (with α=) CAPM Factor: US market test statistics use Newey West std test statistics ~ χ 2 (n), n = no oest assets α and StdErr are in annualized % 34 Testing Multi-Factor Models (Factors are Excess Returns) Reference: Cochrane (2) 2; Campbell, Lo, and MacKinlay (997) 62 34 A Multi-Factor Model When the K factors,, are excess returns, the null hypothesis typically says that α i = in R e it = α i + β i + ε it, where E ε it = and Cov(, ε it ) = K (33) and β i is now an K vector The CAPM regression is a special case when the market excess return is the only factor In other models like ICAPM (see Cochrane (2) 92), E 4 5

Mean excess return US industry portfolios, 947: 25:5 5 5 H E D AG CJ B F I 5 5 Predicted mean excess return all A (NoDur) B (Durbl) C (Manuf) D (Enrgy) E (HiTec) F (Telcm) G (Shops) H (Hlth ) I (Utils) J (Other) alpha NaN 95 254 6 46 46 57 3 438 25 69 Fama French model Factors: US market, SMB (size), and HML (book to market) test statistics use Newey West std test statistics ~ χ 2 (n), n = no oest assets α and StdErr are in annualized % pval 44 4 38 68 98 85 3 StdErr NaN 798 2 579 2 3 963 849 4 946 549 Note that this expression looks similar to (35) the only difference is that may now be a vector (and we therefore need to use the Kronecker product) It is then intuitively clear that the expressions for the asymptotic covariance matrix of ˆα and ˆβ will look very similar too When the system is exactly identified, the GMM estimator solves ḡ(α, β) = n(+k ), (334) which is the same as LS equation by equation Instead, when we restrict α = n (overidentified system), then we either specify a weighting matrix W and solve min β ḡ(β) W ḡ(β), (335) or we specify a matrix A to combine the moment conditions and solve Figure 33: Three-factor model, US industry portfolios we typically have several factors We stack the returns for n assets to get Rt e α β β K ε t = + +, or α n β n β nk f K t ε nt R e nt R e t = α + β + ε t, where E ε t = n and Cov(, ε t ) = K n, (332) where α is n and β is n K Notice that β i j shows how the ith asset depends on the jth factor This is, of course, very similar to the CAPM (one-factor) model and both the LS and GMM approaches are straightforward to extend I will elaborate on the GMM approach 342 Multi-Factor Model: GMM The moment conditions are ( ) ( ) E g t (α, β) = E ε t = E (Rt e α β ) = n(+k ) 6 (333) A nk n(+k ) ḡ(β) = nk (336) For instance, to get the classical LS estimator without intercepts we specify ( ) A = nk n I nk E (Rt e β ) (337) Example (Moment condition with two assets and two factors) The moment conditions for n = 2 and K = 2 are Rt e α β β 2 f 2t R2t e α 2 β 2 β 22 f 2t (R E g t (α, β) = E t e α β β 2 f 2t ) (R2t e α = 2 β 2 β 22 f 2t ) 6 f 2t (Rt e α β β 2 f 2t ) f 2t (R2t e α 2 β 2 β 22 f 2t ) Restricting α = α 2 = gives the moment conditions for the overidentified case 343 Empirical Evidence Fama and French (993) also try a multi-factor model They find that a three-factor model fits the 25 stock portfolios fairly well (two more factors are needed to also fit the seven 7

bond portfolios that they use) The three factors are: the market return, the return on a portfolio of small stocks minus the return on a portfolio of big stocks (SMB), and the return on a portfolio with high BE/ME minus the return on portfolio with low BE/ME (HML) This three-factor model is rejected at traditional significance levels (see Campbell, Lo, and MacKinlay (997) Table 6 or Fama and French (993) Table 9c), but it can still capture a fair amount ohe variation of expected returns (see Cochrane (2) 22, Figures 22 3) 344 Coding ohe GMM Problem This section describes how the GMM problem can be programmed We treat the case with n assets and K Factors (which are all excess returns) The moments are ohe form ( ) g t = (Rt e α β ) (338) g t = ( (R e t β ) for the exactly identified and overidentified case respectively We want to write the moments on the form ) (339) g t = z t ( yt x t b), (34) to make it easy to use matrix algrebra in the calcuation ohe estimate In that case we could let zy = T z t y t and zx = T t= z t x t, so T t= In the exactly identified case, we then have g t = zy zx b (34) t= ḡ t = zy zx b =, so ˆb = zx zy (342) (It is straightforward to show that this can also be calculated equation by equation) In the 8 overidentified case with a weighting matrix, the loss function can be written ḡ W ḡ = ( zy zx b) W ( zy zx b), so zx W zy zx W zx ˆb = and ˆb = ( zx W zx) zx W zy (343) In the overidentified case when we premultiply the moment conditions by A, we get Aḡ = A zy A zx b =, so b = (A zx ) A zy (344) In practice, we never perform an explicit inversion it is typically much better (in terms of both speed and precision) to let the software solve the system of linear equations instead It is straightforward to show that this works nice if we write the moment conditions as ( ) ( ) g t = I n f t Re t I n b f, with b = vec(α, β) t }{{}}{{} (345) z t x t ( ) g t = I n Rt e ( f ) t I n b, with b = vec(β) }{{}}{{} x t z t (346) for the exactly identified and overidentified case respectively Clearly, z t and x t are matrices, not vectors (z t is n( + K ) n and either ohe same dimension or has one row less) Example 2 (Rewriting the moment conditions) For the moment conditions in Example we have g t (α, β) = f 2t f 2t R e t R e 2t f 2t f 2t α α 2 β β 2 β 2 β 22 Proof (of 345))From the properties of Kronecker products, we know that (i) vec( ABC) = (C A)vec(B); and (ii) if a is m and c is n, then a c = (a I n )c The first 9

rule allows to write ( ) α + β = I n α β as I n vec( α β ) }{{}}{{} x t b The second rule allows us two write ( ) (Rt e α β ) as I n (Rt e α β ) } {{ } z t (For the exactly identified case, we could also use the fact (A B) = A B to notice that z t = x t ) Remark 3 ( Quick matrix calculations of zx and zy ) Although a loop wouldn t take too long time to calculate zx and zy, there is a quicker way Put in row t ohe matrix Z T (+K ) and Rt e in row t ohe matrix R T n For the exactly identified case, let X = Z For the overidentified case, put in row t ohe matrix X T K Then, calculate zx = (Z X/T ) I n and vec(r Z/T ) = zy 35 Testing Multi-Factor Models (General Factors) Reference: Cochrane (2) 22; Campbell, Lo, and MacKinlay (997) 623 and 63 35 GMM Estimation with General Factors Linear factor models imply that all expected excess returns are linear functions ohe same vector of factor risk premia E where β is n K E Rit e = β iλ, where λ is K, for i =, n, or (347) β β K λ =, or β n β nk R e t R e nt λ K E R e t = βλ, (348) The old way oesting this is to do a two-step estimation: first, estimate the β i vectors in a time series model like (332) (equation by equation); second, use ˆβ i as regressors in a regression equation ohe type (347) with a residual added T t= Re it /T = ˆβ i λ + u i (349) It is then tested if u i = for all assets i =,, n This approach is often called a crosssectional regression while the previous tests are called time series regression The main problem ohe cross-sectional approach is that we have to account for the fact that the regressors in the second step, ˆβ i, are just estimates and therefore contain estimation errors This errors-in-variables problem is likely to have two effects (i) it gives a downwards bias ohe estimates of λ and an upward bias ohe mean ohe fitted residuals; and (ii) invalidates the standard expression ohe test of λ A way to handle these problems is to combine the moment conditions for the regression function (333) (to estimate β) with (348) (to estimate λ) to get a joint system (Rt e α β ) E g t (α, β, λ) = E = n(+k +) (35) Rt e βλ We can then test the overidentifying restrictions ohe model There are n( + K + ) moment condition (for each asset we have one moment condition for the constant, K moment conditions for the K factors, and one moment condition corresponding to the 2 2

restriction on the linear factor model) There are only n( + K ) + K parameters (n in α, nk in β and K in λ) We therefore have n K overidentifying restrictions which can be tested with a chi-square test Notice that this is a non-linear estimation problem, since the parameters in β multiply the parameters in λ From the GMM estimation using (35) we get estimates ohe factor risk premia and also the variance-covariance ohem This allows us to characterize the risk factors and to test ihey are priced (each ohem or perhaps all jointly) by using a Wald test Example 4 (Two assets and one factor) we have the moment conditions Rt e α β R2t e α 2 β 2 (R E g t (α, α 2, β, β 2, λ) = E t e α β ) (R2t e α = 6 2 β 2 ) Rt e β λ R2t e β 2λ There are then 6 moment conditions and 5 parameters, so there is one overidentifying restriction to test Note that with one factor, then we need at least two assets for this testing approach to work (n K = 2 ) In general, we need at least one more asset than factors 352 Traditional Cross-Sectional Regressions as Special Cases Instead of specifying a weighting matrix, we could combine the moment equations so they become equal to the number of parameters, for instance by specifying a matrix A and combine as A E g t = This does not generate any overidentifying restrictions, but it still allows us to test hypothesis about λ One possibility is to let the upper left block of A be an identity matrix and just combine the last n moment conditions, R e t βλ, to just K moment conditions I n(+k ) n(+k ) n K n(+k ) θ K n (Rt e α β ) E = n(+k ) (35) Rt e βλ (Rt e α β ) E = (352) θ(rt e βλ) In this case, we can estiumate α and β with LS equation by equation as a standard timeseries regression of a factor model To estimate the K vector λ, notice that we can solve the 2nd set of moment conditions as θ E(R e t βλ) = K or λ = (θβ) θ E R e t, (353) which is just like a cross-sectional instrumental variables regression of E R e t = βλ (with β being the regressors, θ the instruments, and E R e t the dependent variable) With θ = β, we get the traditional cross-sectional approach (347) The only difference is we here take the uncertainty about the generated betas into account (in the testing) With Alternatively, let be the covariance matrix ohe residuals from the time-series estimation ohe factor model Then, using θ = β gives a traditional GLS cross-sectional approach Example 5 (LS cross-sectional regression) With the moment conditions in Example (4) and the weighting vector θ = β, β 2 we get Rt e α β R 2t e α 2 β 2 E g t (α, α 2, β, β 2, λ) = E (Rt e α β ) = 5, (R2t e α 2 β 2 ) β (Rt e β λ) + β 2 (R2t e β 2λ) which has as many parameters as moment conditions 22 23

353 Alternative Formulation of Moment Conditions using α = β(λ E ) The test ohe general multi-factor models is sometimes written on a slightly different form (see, for instance, Campbell, Lo, and MacKinlay (997) 623, but adjust for the fact that they look at returns rather than excess returns) To illustrate this, note that the regression equations (332) imply that E R e t = α + β E (354) Equate the expected returns in the two formulations (354) and (347) to get α = β(λ E ), (355) which is another way of summarizing the restrictions that the linear factor model gives We can then rewrite the moment conditions (35) as (substitute for α and skip the last set of moments) E g t (β, λ) = E (R e t β(λ E ) β ) = n(+k ) (356) Note that there are n( + K ) moment conditions and nk + K parameters (nk in β and K in λ), so there are n K overidentifying restrictions (as before) Example 6 Two assets and one factor) Use the restrictions (355) in the moment conditions for that case (compare with (39)) to get Rt e β (λ E ) β E g t (β, β 2, λ) = E R2t e β 2(λ E ) β 2 Rt e β (λ E ) β = 4 R2t e β 2(λ E ) β 2 This gives 4 moment conditions, but only three parameters, so there is one overidentifying restriction to test just as with (35) 354 What ihe Factor is a Portfolio? It would (perhaps) be natural ihe tests discussed in this section coincided with those in Section 34 when the factors are in fact excess returns That is almost so The difference is 24 that we here estimate the K vector λ (factor risk premia) as a vector of free parameters, while the tests in Section 34 impose λ = E If we were to put this restriction on (356), then we are back to the LM test ohe multifactor model where (356) specifies n( + K ) moment conditions, but includes only nk parameters (in β) we gain one degree of freedom for every element in λ that we avoid to estimate If we do not impose the restriction λ = E, then the tests are not identical and can be expected to be a bit different (in small samples, in particular) 355 Empirical Evidence Chen, Roll, and Ross (986) use a number of macro variables as factors along with traditional market indices They find that industrial production and inflation surprises are priced factors, while the market index might not be Breeden, Gibbons, and Litzenberger (989) and Lettau and Ludvigson (2) estimate models where consumption growth is the factor with very mixed results 36 Fama-MacBeth Reference: Cochrane (2) 23; Campbell, Lo, and MacKinlay (997) 58; Fama and MacBeth (973) The Fama and MacBeth (973) approach is a bit different from the regression approaches discussed so far although is seems most related to what we discussed in Section 35 The method has three steps, described below First, estimate the betas β i (i =,, n) from (3) (this is a time-series regression) This is often done on the whole sample assuming the betas are constant Sometimes, the betas are estimated separately for different sub samples (so we could let ˆβ i carry a time subscript in the equations below) Second, run a cross sectional regression for every t That is, for period t, estimate λ t from the cross section (across the assets i =,, n) regression Rit e = λ t ˆβ i + ε it, (357) where ˆβ i are the regressors (Note the difference to the traditional cross-sectional 25

approach discussed in (34), where the second stage regression regressed E R e it on ˆβ i, while the Fama-French approach runs one regression for every time period) Third, estimate the time averages ˆε i = T ˆλ = T ˆε it for i =,, n, (for every asset) (358) t= ˆλ t (359) t= The second step, using ˆβ i as regressors, creates an errors-in-variables problem since ˆβ i are estimated, that is, measured with an error The effect ohis is typically to bias the estimator of λ t towards zero (and any intercept, or mean ohe residual, is biased upward) One way to minimize this problem, used by Fama and MacBeth (973), is to let the assets be portfolios of assets, for which we can expect that some ohe individual noise in the first-step regressions to average out and thereby make the measurement error in ˆβ smaller If CAPM is true, then the return of an asset is a linear function ohe market return and an error which should be uncorrelated with the errors of other assets otherwise some factor is missing Ihe portfolio consists of 2 assets with equal error variance in a CAPM regression, then we should expect the portfolio to have an error variance which is /2th as large We clearly want portfolios which have different betas, or else the second step regression (357) does not work Fama and MacBeth (973) choose to construct portfolios according to some initial estimate of asset specific betas Another way to deal with the errors-in-variables problem is adjust the tests Jagannathan and Wang (996) and Jagannathan and Wang (998) discuss the asymptotic distribution ohis estimator We can test the model by studying if ε i = (recall from (358) that ε i is the time average ohe residual for asset i, ε it ), by forming a t-test ˆε i / Std(ˆε i ) Fama and MacBeth (973) suggest that the standard deviation should be found by studying the time-variation in ˆε it In particular, they suggest that the variance of ˆε it (not ˆε i ) can be estimated by the (average) squared variation around its mean Since ˆε i is the sample average of ˆε it, the variance ohe former is the variance ohe latter divided by T (the sample size) provided ˆε it is iid That is, Var(ˆε i ) = T Var(ˆε it) = T 2 A similar argument leads to the variance of ˆλ Var(ˆλ) = T 2 ) 2 (ˆεit ˆε i (36) t= (ˆλ t ˆλ) 2 (362) t= Fama and MacBeth (973) found, among other things, that the squared beta is not significant in the second step regression, nor is a measure of non-systematic risk Var(ˆε it ) = T ) 2 (ˆεit ˆε i (36) t= 26 27

Lettau, M, and S Ludvigson, 2, Resurrecting the (C)CAPM: A Cross-Sectional Test When Risk Premia Are Time-Varying, Journal of Political Economy, 9, 238 287 Bibliography MacKinlay, C, 995, Multifactor Models Do Not Explain Deviations from the CAPM, Journal of Financial Economics, 38, 3 28 Breeden, D T, M R Gibbons, and R H Litzenberger, 989, Empirical Tests ohe Consumption-Oriented CAPM, Journal of Finance, 44, 23 262 Campbell, J Y, A W Lo, and A C MacKinlay, 997, The Econometrics of Financial Markets, Princeton University Press, Princeton, New Jersey Chen, N-F, R Roll, and S A Ross, 986, Economic Forces and the Stock Market, Journal of Business, 59, 383 43 Cochrane, J H, 2, Asset Pricing, Princeton University Press, Princeton, New Jersey Fama, E, and J MacBeth, 973, Risk, Return, and Equilibrium: Empirical Tests, Journal of Political Economy, 7, 67 636 Fama, E F, and K R French, 993, Common Risk Factors in the Returns on Stocks and Bonds, Journal of Financial Economics, 33, 3 56 Fama, E F, and K R French, 996, Multifactor Explanations of Asset Pricing Anomalies, Journal of Finance, 5, 55 84 Gibbons, M, S Ross, and J Shanken, 989, A Test ohe Efficiency of a Given Portfolio, Econometrica, 57, 2 52 Greene, W H, 23, Econometric Analysis, Prentice-Hall, Upper Saddle River, New Jersey, 5th edn Jagannathan, R, and Z Wang, 996, The Conditional CAPM and the Cross-Section of Expectd Returns, Journal of Finance, 5, 3 53 Jagannathan, R, and Z Wang, 998, A Note on the Asymptotic Covariance in Fama- MacBeth Regression, Journal of Finance, 53, 799 8 28 29