Factor models with many assets: strong factors, weak factors, and the two-pass procedure

Stanislav Anatolyev (CERGE-EI and NES) and Anna Mikusheva (MIT)
December 2017

Introduction - Linear factor-pricing models

Factor-pricing model: $E r_t = \lambda'\beta$, where $\beta = \mathrm{var}(F_t)^{-1}\mathrm{cov}(F_t, r_t)$.
- $r_t$ is the excess return on a portfolio in period $t$, $F_t$ are risk factors, $\beta$ are risk exposures, and $\lambda$ are risk premia.

The classical estimation approach is the two-pass procedure (Fama and MacBeth, 1973) with a standard-error correction (Shanken, 1992):
1. Estimate $\beta$ for each portfolio from a time-series regression;
2. Estimate $\lambda$ from a cross-sectional regression of average returns on the estimated betas.

Quality control:
- Is the price of risk non-zero? Test $H_0: \lambda = 0$.
- Do these risks price the market? Specification test $H_0: E r_t = \lambda'\beta$.
- How much of the variation in average returns does risk exposure explain? Second-pass $R^2$.
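The two passes can be sketched in a few lines. This is a minimal illustration, assuming a single factor and a second pass without an intercept (in line with $E r_t = \lambda'\beta$ for excess returns). The function names and the noise-free toy data are hypothetical, and the Shanken standard-error correction is not included.

```python
import random

def first_pass(returns, factor):
    """Pass 1: time-series OLS slope of each asset's returns on the demeaned factor."""
    T = len(factor)
    f_bar = sum(factor) / T
    f_dm = [f - f_bar for f in factor]
    s_ff = sum(f * f for f in f_dm)
    return [sum(f * r for f, r in zip(f_dm, r_i)) / s_ff for r_i in returns]

def second_pass(returns, betas):
    """Pass 2: cross-sectional OLS (no intercept) of average returns on betas."""
    r_bar = [sum(r_i) / len(r_i) for r_i in returns]
    return sum(b * r for b, r in zip(betas, r_bar)) / sum(b * b for b in betas)

def two_pass(returns, factor):
    return second_pass(returns, first_pass(returns, factor))

# Toy data (assumption): r_it = lam*beta_i + (F_t - mean F)*beta_i with no error
# term, so both passes should recover the true values up to rounding.
rng = random.Random(0)
T, lam = 50, 0.8
factor = [rng.gauss(0, 1) for _ in range(T)]
f_bar = sum(factor) / T
true_betas = [0.5, 1.0, 1.5, 2.0]
returns = [[lam * b + (f - f_bar) * b for f in factor] for b in true_betas]

beta_hat = first_pass(returns, factor)
lam_hat = two_pass(returns, factor)
```

Here `returns` is a list of per-asset time series. In real data the first-pass slopes carry estimation error, which is what the rest of the talk is about.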

Introduction - Linear factor-pricing models

- The first and best known: CAPM (Sharpe 1964, Lintner 1965).
- The second best known is Fama-French (1993): it includes the market portfolio, the size factor SMB (small-minus-big) and the book-to-market factor HML (high-minus-low).
- Some models have factors based on market behavior. Example: the momentum factor MOM (Jegadeesh and Titman, 1993).
- Some have macroeconomic factors. Example: the consumption-to-wealth ratio cay (Lettau and Ludvigson, 2001).
- Harvey, Liu and Zhu (2016) list hundreds of papers proposing, justifying and estimating various linear factor-pricing models.

Introduction - Problem 1: weak identification?

If some of the observed factors are only weakly correlated with returns, then the second-pass parameters may be weakly identified.
- Kan and Zhang (1999): useless factors lead to spurious inference.
- Kleibergen and Zhan (2015): weak factors may arise from poor measurement of true factors.
- Kleibergen (2009): weak factors distort consistency and asymptotic normality of risk-premia estimates.

Introduction - Problem 2: missing factors?

Empirical fact found in Kleibergen and Zhan (2015): many well-known linear factor-pricing models have a very strong remaining factor structure in the residuals.
- Example: for all Lettau and Ludvigson (2001) specifications, the first three principal components of the residuals explain 82%-96% of the remaining cross-sectional variation.
- The one exception found to this rule: Fama and French.

Introduction - Observation in our paper: large T and large N?

Traditionally (and in all the papers mentioned), asymptotic results are derived under the assumption that $N$ is fixed and $T \to \infty$.

However, the most commonly used datasets are:
- Jagannathan and Wang (1996): $N = 100$, $T = 330$;
- Fama-French: $N = 25$, $T = 141$;
- Gagliardini, Ossola and Scaillet (2016): $N = 44$ and $N = 9936$, $T = 546$.

$N$ and $T$ are comparable in size. More adequate asymptotic approximations may result from letting both $N, T \to \infty$.

Introduction - Our setup

Our setup simultaneously includes:
- Weak observed factors: some observed factors are only weakly correlated with returns. We model the corresponding risk-exposure coefficients $\beta$ as being of order $O(1/\sqrt{T})$, so the first-stage estimation error is of the same order of magnitude as the coefficients themselves.
- Missing factors: there is a strong factor structure in the error terms.
- Large-N, large-T asymptotics: many assets and a long time span, $N, T \to \infty$.

Introduction - Findings of our paper

- We prove that the classical two-pass procedure fails in our setting: it gives inconsistent estimates of the premia on weak factors, invalid inferences, and a significant finite-sample bias in the estimate of the risk premia on strong observed factors.
- We propose new procedures that provide consistent estimators of risk premia and guarantee asymptotically Gaussian inferences.

Introduction - Findings of our paper

We develop an estimation procedure for risk premia in an environment with many assets, weak included factors and strong excluded factors. It has the following features:
- it yields consistent estimates when the traditional two-pass procedure fails;
- it yields consistent estimates without knowledge of which factors are strong and which are weak;
- it does not lose efficiency if the traditional two-pass procedure works;
- it is a push-button procedure: easy to implement, using standard estimation techniques.

Outline

1. Introduction
2. Setup and main assumptions
3. Two-pass procedure fails: Why?
4. Our proposed solution
5. Some famous papers revisited

Setup and main assumptions - Setup

We observe excess returns on assets or portfolios $\{r_{it},\ i = 1,\dots,N,\ t = 1,\dots,T\}$ and $k_F \times 1$ risk factors $\{F_t,\ t = 1,\dots,T\}$ that follow the correctly specified linear factor-pricing model:

$$E r_{it} = \lambda'\beta_i, \quad \text{where } \beta_i = \mathrm{var}(F_t)^{-1}\mathrm{cov}(F_t, r_{it}).$$

This is equivalent to assuming that

$$r_{it} = \lambda'\beta_i + (F_t - EF_t)'\beta_i + \varepsilon_{it},$$

where the random error terms $\varepsilon_{it}$ have mean zero and are uncorrelated with $F_t$. We treat $\lambda$ and $\beta_i$ as non-random, while $r_{it}$, $F_t$, $\varepsilon_{it}$ are random.

Setup and main assumptions - Setup: weak observed factors

We divide the factors $F_t = (F_{t,1}, F_{t,2})$ and the exposures $\beta_i = (\beta_{i,1}, \beta_{i,2})$ into "strong" and "weak":

$$\beta_{i,2} = \frac{b_i}{\sqrt{T}},$$

where we make the same assumptions about the size of $\beta_{i,1}$ and the size of $b_i$ (both are $O(1)$).
- The estimation error for each $\beta_i$ is of order $O_p(1/\sqrt{T})$, similar to the size of $\beta_{i,2}$.
- In a setting with $N$ fixed and $T \to \infty$, this corresponds to weak identification.
- We do not assume that the econometrician knows which factors are weak or how many weak factors there are (our results hold under the more general assumption that some linear combination of the factors is weak).

Setup and main assumptions - Setup: missing factors

Model: $r_{it} = \lambda'\beta_i + (F_t - EF_t)'\beta_i + \varepsilon_{it}$.

We assume that the error terms are not autocorrelated (efficient market hypothesis) but have non-trivial cross-sectional dependence, in the form of an unobserved factor structure:

$$\varepsilon_{it} = v_t'\mu_i + e_{it},$$

where
- $v_t$ are unobserved random variables with mean zero and unit variance (normalization), uncorrelated with $e_{it}$;
- $\mu_i$ are unknown constant loadings of size $O(1)$;
- $e_{it}$ are weakly cross-sectionally correlated.


Two-pass procedure fails: Why? - Asymptotics of the two-pass procedure

- If all observed factors are strong: $\sqrt{T}(\hat\lambda_{TP} - \lambda) \Rightarrow N(0, V)$.
- If some observed factors are weak but there are no missing factors in the errors (errors-in-variables bias): $\hat\lambda_{TP,1}$ is consistent and Gaussian but biased (inferences are not valid); $\hat\lambda_{TP,2}$ is inconsistent.
- If some observed factors are weak and there are missing factors in the errors (errors-in-variables + omitted variable): $\hat\lambda_{TP,1}$ is consistent but biased, with a non-standard distribution; $\hat\lambda_{TP,2}$ is inconsistent.

Two-pass procedure fails: Why? - No missing factors case

Assume some observed factors are weak, but there is no factor structure in the errors:

$$r_{it} = \lambda'\beta_i + (F_t - EF_t)'\beta_i + e_{it}, \quad e_{it} \text{ weakly dependent.}$$

First-pass estimates, with $\tilde F_t = F_t - \bar F$ the demeaned factors:

$$\hat\beta_i = \Big(\sum_{t=1}^{T} \tilde F_t \tilde F_t'\Big)^{-1} \sum_{t=1}^{T} \tilde F_t r_{it} = (\beta_i + u_i)(1 + o_p(1)),$$

where $u_i = \frac{1}{T}\sum_{t=1}^{T} \Sigma_F^{-1} \tilde F_t e_{it}$ are asymptotically uncorrelated for different $i$ and unrelated to $\beta_i$.

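Two-pass procedure fails: Why? - No missing factors case

Ideal regression: if one regresses $\bar r_i = \frac{1}{T}\sum_{t=1}^{T} r_{it}$ on $\beta_i$, one obtains a consistent estimate of $\lambda$.

But we have only the estimates $\hat\beta_i$, with $u_i = O(1/\sqrt{T})$:

$$\begin{pmatrix} \hat\beta_{i,1} \\ \hat\beta_{i,2} \end{pmatrix} = \begin{pmatrix} \beta_{i,1} \\ \beta_{i,2} \end{pmatrix} + \begin{pmatrix} u_{i,1} \\ u_{i,2} \end{pmatrix} = \begin{pmatrix} \beta_{i,1}(1 + o(1)) \\ \beta_{i,2} + u_{i,2} \end{pmatrix}$$

- The mistake in $\hat\beta_{i,2}$ is of the same order of magnitude as the coefficient itself.
- It behaves like classical measurement error!
- The regression of $\bar r_i$ on $\hat\beta_i$ has an attenuation bias!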
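As a worked illustration of the attenuation claim, consider a simplified scalar case: one weak factor, no strong factors, and a second pass without an intercept. These simplifications are assumptions made here for exposition, not the general setting of the talk. Writing the first-pass estimate as $\beta_i + u_i$, with $u_i$ uncorrelated with $\beta_i$ and with the averaged errors $\bar e_i$, the textbook errors-in-variables calculation gives approximately:

```latex
\hat\lambda_{TP}
  = \frac{\sum_{i=1}^{N} \hat\beta_i \,\bar r_i}{\sum_{i=1}^{N} \hat\beta_i^{2}}
  \approx \lambda \cdot
    \frac{\sum_{i=1}^{N} \beta_i^{2}}
         {\sum_{i=1}^{N} \beta_i^{2} + \sum_{i=1}^{N} u_i^{2}}
```

Because $\beta_i$ and $u_i$ are both of order $1/\sqrt{T}$, the two sums in the denominator are of the same order, so the shrinkage factor stays strictly below one however large $N$ and $T$ become.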

Two-pass procedure fails: Why? - No missing factors case: Solution

Idea: split the sample in two, $\mathcal{T}_1 \cup \mathcal{T}_2 = \{1,\dots,T\}$.

Estimate $\beta_i$ twice, with demeaned factors $\tilde F_t$:

$$\hat\beta_i^{(j)} = \Big(\sum_{t \in \mathcal{T}_j} \tilde F_t \tilde F_t'\Big)^{-1} \sum_{t \in \mathcal{T}_j} \tilde F_t r_{it} = (\beta_i + u_i^{(j)})(1 + o_p(1)), \quad j = 1, 2.$$

- The estimation mistakes $u_i^{(1)}$ and $u_i^{(2)}$ are (asymptotically) uncorrelated.
- Use $\hat\beta_i^{(1)}$ as a regressor and $\hat\beta_i^{(2)}$ as an instrument (or vice versa, or both, averaging the final estimates).
- The idea of sample-splitting (and its extreme version, leave-one-out or jackknife) has been used in the many-weak-IV model (Hansen, Hausman and Newey, 2008).
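The split-sample idea can be tried on simulated data. The sketch below is only an illustration under strong assumptions: one weak factor, no missing factors, i.i.d. Gaussian errors, and the sample mean of the factor standing in for $EF_t$. The function names, sample sizes and seed are hypothetical choices, not taken from the talk.

```python
import random

def slopes(returns, factor, idx):
    """First pass on the periods in idx: OLS slope of each asset on the factor."""
    f_bar = sum(factor[t] for t in idx) / len(idx)
    f_dm = [factor[t] - f_bar for t in idx]          # demean within the subsample
    den = sum(f * f for f in f_dm)
    return [sum(f * r_i[t] for f, t in zip(f_dm, idx)) / den for r_i in returns]

def simulate(n_assets, n_periods, lam, seed):
    """One weak factor, no missing factors; all distributions are assumptions."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_periods)]
    f_bar = sum(factor) / n_periods                  # sample mean stands in for E[F_t]
    # Weak exposures: beta_i = b_i / sqrt(T) with b_i of order one.
    betas = [rng.gauss(0, 1) / n_periods ** 0.5 for _ in range(n_assets)]
    returns = [[lam * b + (f - f_bar) * b + rng.gauss(0, 1) for f in factor]
               for b in betas]
    return returns, factor

def two_pass(returns, factor):
    """Conventional two-pass estimate using full-sample betas."""
    T = len(factor)
    b_hat = slopes(returns, factor, range(T))
    r_bar = [sum(r_i) / T for r_i in returns]
    return sum(b * r for b, r in zip(b_hat, r_bar)) / sum(b * b for b in b_hat)

def split_sample_iv(returns, factor):
    """Betas from one half instrument betas from the other half; average both ways."""
    T = len(factor)
    b1 = slopes(returns, factor, range(0, T // 2))
    b2 = slopes(returns, factor, range(T // 2, T))
    r_bar = [sum(r_i) / T for r_i in returns]

    def iv(x, z):
        # Just-identified IV with regressor x and instrument z, no intercept.
        return (sum(zi * r for zi, r in zip(z, r_bar))
                / sum(zi * xi for zi, xi in zip(z, x)))

    return 0.5 * (iv(b1, b2) + iv(b2, b1))

returns, factor = simulate(n_assets=4000, n_periods=60, lam=1.0, seed=1)
lam_tp = two_pass(returns, factor)
lam_iv = split_sample_iv(returns, factor)
print(f"two-pass: {lam_tp:.3f}, split-sample IV: {lam_iv:.3f}")
```

In this design the two-pass estimate should be visibly shrunk toward zero relative to the true value of 1, while the split-sample IV estimate should land much closer to it. The exact numbers depend on the seed.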

Two-pass procedure fails: Why? - Factors in errors

Model with factor structure in errors:

$$r_{it} = \lambda'\beta_i + (F_t - EF_t)'\beta_i + v_t'\mu_i + e_{it},$$

where $v_t$ is unobserved, $\mu_i$ are unknown, and $e_{it}$ are weakly cross-correlated.

First step, with demeaned factors $\tilde F_t$:

$$\hat\beta_i = \Big(\sum_{t=1}^{T} \tilde F_t \tilde F_t'\Big)^{-1} \sum_{t=1}^{T} \tilde F_t r_{it} = \Big(\beta_i + \frac{\eta_T \mu_i}{\sqrt{T}} + u_i\Big)(1 + o_p(1)),$$

where

$$\eta_T = \frac{1}{\sqrt{T}} \Sigma_F^{-1} \sum_{t=1}^{T} \tilde F_t v_t'$$

comes from the unobserved factor structure.

Two-pass procedure fails: Why? - Factors in errors

$$\hat\beta_i = \Big(\beta_i + \frac{\eta_T \mu_i}{\sqrt{T}} + u_i\Big)(1 + o_p(1))$$

Now the estimation error $\frac{\eta_T}{\sqrt{T}}\mu_i + u_i$ is NOT classical measurement error:
- both terms, $\frac{\eta_T}{\sqrt{T}}\mu_i$ and $u_i$, are stochastically of order $O_p(1/\sqrt{T})$;
- the estimation errors are cross-correlated (for different $i$) due to the term $\frac{\eta_T}{\sqrt{T}}\mu_i$;
- the estimation error may be correlated with the regressor if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.

Two-pass procedure fails: Why? - Factors in errors

Model with factor structure in errors:

$$r_{it} = \lambda'\beta_i + (F_t - EF_t)'\beta_i + v_t'\mu_i + e_{it}$$

Ideal regression:

$$y_i = \sqrt{T}\,\bar r_i = \frac{1}{\sqrt{T}}\sum_{t=1}^{T} r_{it} = \lambda'(\sqrt{T}\beta_i) + \eta_v'\mu_i + \varepsilon_i$$

If $\mu_i$ is present but only $\beta_i$ is known, $\mu_i$ is an omitted variable. It causes omitted-variable bias if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.
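The ideal regression follows from averaging the model over time. The derivation below is a sketch. Reading $\eta_v$ as $\sqrt{T}\,\bar v$ and $\varepsilon_i$ as $\sqrt{T}\,\bar e_i$ is an assumption made here, since the slide does not define them.

```latex
\bar r_i = \frac{1}{T}\sum_{t=1}^{T} r_{it}
         = \lambda'\beta_i + (\bar F - EF_t)'\beta_i + \bar v'\mu_i + \bar e_i
\quad\Longrightarrow\quad
\sqrt{T}\,\bar r_i
         = (\lambda + \bar F - EF_t)'(\sqrt{T}\beta_i)
           + (\sqrt{T}\,\bar v)'\mu_i + \sqrt{T}\,\bar e_i
```

Under this reading, the coefficient on $\sqrt{T}\beta_i$ is $\lambda + \bar F - EF_t$, which differs from $\lambda$ by a term of order $O_p(1/\sqrt{T})$. The loading $\mu_i$ enters with an $O_p(1)$ random coefficient, which is why omitting it matters.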

Two-pass procedure fails: Why? - Summary

If there is no factor structure in the errors, we have the classical errors-in-variables problem and the associated attenuation bias.

If there is factor structure in the errors, we additionally have:
- non-classical errors-in-variables (mistakes in the regressor $\hat\beta_{i,2}$ are cross-correlated and correlated with $\beta_i$);
- even if we know $\beta_i$, omitted-variable bias in the ideal regression if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.


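Our proposed solution - Idea

We reconsider sample-splitting. For each sub-sample $\mathcal{T}_j$ of size $T_j$, we have an estimate of $\beta_i$ (with demeaned factors $\tilde F_t$):

$$\hat\beta_i^{(j)} = \Big(\sum_{t \in \mathcal{T}_j} \tilde F_t \tilde F_t'\Big)^{-1} \sum_{t \in \mathcal{T}_j} \tilde F_t r_{it} = \Big(\beta_i + \frac{\eta_j \mu_i}{\sqrt{T_j}} + u_i^{(j)}\Big)(1 + o_p(1)),$$

where

$$\eta_j = \frac{1}{\sqrt{T_j}} \Sigma_F^{-1} \sum_{t \in \mathcal{T}_j} \tilde F_t v_t' \Rightarrow N(0, \Omega_{Fv}).$$

The $\eta_j$ are independent for different $j$ and independent of the errors $u_i^{(j)}$.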

Our proposed solution - Idea

$$\hat\beta_i^{(j)} = \Big(\beta_i + \frac{\eta_j \mu_i}{\sqrt{T_j}} + u_i^{(j)}\Big)(1 + o_p(1))$$

We can construct a proxy for $\mu_i$ (!):

$$\hat\beta_i^{(1)} - \hat\beta_i^{(2)} = \Big(\frac{\eta_1}{\sqrt{T_1}} - \frac{\eta_2}{\sqrt{T_2}}\Big)\mu_i + (u_i^{(1)} - u_i^{(2)})$$

- If $T_j = T/4$, then the random coefficient $\big(\frac{\eta_1}{\sqrt{T_1}} - \frac{\eta_2}{\sqrt{T_2}}\big) = O(1/\sqrt{T})$ and the error $(u_i^{(1)} - u_i^{(2)}) = O(1/\sqrt{T})$.
- The proxy $\hat\beta_i^{(1)} - \hat\beta_i^{(2)}$ mis-measures $\mu_i$, but the measurement error is classical: not cross-correlated and not correlated with the regressors.
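The proxy property can be checked numerically. In the sketch below, the idiosyncratic errors $e_{it}$ are set to zero on purpose (an assumption for illustration), so the difference of the two sub-sample betas should be exactly proportional to $\mu_i$. The single factor, the loadings and all names are hypothetical.

```python
import random

def subsample_slopes(returns, factor, idx):
    """OLS slope of each asset on the factor, demeaned within the periods in idx."""
    f_bar = sum(factor[t] for t in idx) / len(idx)
    f_dm = [factor[t] - f_bar for t in idx]
    den = sum(f * f for f in f_dm)
    return [sum(f * r_i[t] for f, t in zip(f_dm, idx)) / den for r_i in returns]

rng = random.Random(0)
T, lam = 40, 1.0
factor = [rng.gauss(0, 1) for _ in range(T)]   # observed factor F_t
v = [rng.gauss(0, 1) for _ in range(T)]        # unobserved factor v_t
f_bar = sum(factor) / T                        # sample mean stands in for E[F_t]
betas = [0.3, -0.2, 0.5, 0.1, -0.4]            # illustrative exposures
mus = [1.0, 2.0, -1.0, 0.5, 1.5]               # illustrative loadings on v_t

# r_it = lam*beta_i + (F_t - mean F)*beta_i + v_t*mu_i, with e_it = 0 (assumption)
returns = [[lam * b + (f - f_bar) * b + vt * m for f, vt in zip(factor, v)]
           for b, m in zip(betas, mus)]

b1 = subsample_slopes(returns, factor, range(0, T // 2))
b2 = subsample_slopes(returns, factor, range(T // 2, T))

# Each ratio equals the same random coefficient multiplying mu_i.
ratios = [(x - y) / m for x, y, m in zip(b1, b2, mus)]
```

With non-zero $e_{it}$ the ratios would scatter around a common value, which is the classical measurement error described in the second bullet.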

Our proposed solution - Idea

- Split the sample into 4 equal sub-samples.
- Estimate $\hat\beta_i^{(j)}$ for $j = 1,\dots,4$.
- Run an IV regression of $\bar r_i$ on the regressors $\hat\beta_i^{(1)}$ and the proxy based on $\hat\beta_i^{(1)} - \hat\beta_i^{(2)}$, with instruments $\hat\beta_i^{(3)}$ and $\hat\beta_i^{(3)} - \hat\beta_i^{(4)}$.
- For efficiency, you may repeat this 4 times, rotating the indices 1-4, and average the estimates of $\lambda$ you obtain.
- We also provide a formula for calculating the covariance matrix of our estimate.
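The steps above can be sketched for the simplest case of one observed factor and one missing factor. This is an illustration under stated assumptions, not the authors' implementation: the IV is just-identified with no intercept, the rotation of indices is taken to be cyclic, the covariance-matrix formula is omitted, and all names and simulation settings are hypothetical.

```python
import random

def slopes(returns, factor, idx):
    """OLS slope of each asset on the factor, demeaned within the periods in idx."""
    f_bar = sum(factor[t] for t in idx) / len(idx)
    f_dm = [factor[t] - f_bar for t in idx]
    den = sum(f * f for f in f_dm)
    return [sum(f * r_i[t] for f, t in zip(f_dm, idx)) / den for r_i in returns]

def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def iv_lambda(r_bar, b, order):
    """IV of r_bar on (b[j1], b[j1]-b[j2]) with instruments (b[j3], b[j3]-b[j4])."""
    j1, j2, j3, j4 = order
    x = [(p, p - q) for p, q in zip(b[j1], b[j2])]
    z = [(p, p - q) for p, q in zip(b[j3], b[j4])]
    zx = [[sum(zi[row] * xi[col] for zi, xi in zip(z, x)) for col in range(2)]
          for row in range(2)]
    zr = [sum(zi[row] * y for zi, y in zip(z, r_bar)) for row in range(2)]
    return solve_2x2(zx, zr)[0]          # coefficient on the beta regressor

def four_split(returns, factor):
    T = len(factor)
    q = T // 4
    b = [slopes(returns, factor, range(j * q, (j + 1) * q)) for j in range(4)]
    r_bar = [sum(r_i) / T for r_i in returns]
    orders = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]  # cyclic rotation
    return sum(iv_lambda(r_bar, b, o) for o in orders) / len(orders)

def simulate(n_assets, n_periods, lam, sigma_e, seed):
    """Weak observed factor plus one missing factor; distributions are assumptions."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_periods)]
    v = [rng.gauss(0, 1) for _ in range(n_periods)]
    f_bar = sum(factor) / n_periods      # sample mean stands in for E[F_t]
    betas = [rng.gauss(0, 1) / n_periods ** 0.5 for _ in range(n_assets)]
    mus = [rng.gauss(1, 1) for _ in range(n_assets)]
    returns = [[lam * b + (f - f_bar) * b + vt * m + rng.gauss(0, sigma_e)
                for f, vt in zip(factor, v)] for b, m in zip(betas, mus)]
    return returns, factor

# Sanity check: without idiosyncratic noise the estimator recovers lam exactly.
lam_exact = four_split(*simulate(200, 80, lam=1.0, sigma_e=0.0, seed=0))
# With noise, the estimate is random; its accuracy depends on N, T and the seed.
lam_noisy = four_split(*simulate(500, 80, lam=1.0, sigma_e=1.0, seed=0))
```

The noise-free case works because average returns then lie exactly in the span of the two regressors, so any valid instrument set returns the true coefficient. This makes it a convenient unit test before moving to real data.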

Our proposed solution

- The exact asymptotic distribution of $\hat\lambda_{4S}$ is not Gaussian but rather mixed Gaussian.
- The estimated variance matrix is asymptotically random, though non-degenerate with probability 1.
- This is because the coefficient on the proxy for $\mu_i$ is random. As a result, the information contained in the second-stage IV is random, though NOT weak with probability 1.

Our 4-split estimator:
- yields consistent estimates when the traditional two-pass procedure fails;
- yields consistent estimates without knowledge of which factors are strong and which are weak;
- does not lose efficiency if the traditional two-pass procedure works;
- is a push-button procedure: easy to implement, using standard estimation techniques.


Some famous papers revisited - Empirical application (Fama-French portfolios)

no.  specification        5 main principal components in residuals
1    Market, SMB, HML     0.29  0.14  0.11  0.07  0.04
2    Market, HML          0.62  0.10  0.05  0.03  0.03
3    Market, HML, cay     0.62  0.10  0.05  0.03  0.03

no.  risk factor              Market        SMB           HML           cay
1    conventional two-pass    2.70  0.61    0.69  0.48    1.96  0.58
     average four-split       2.80  0.62    0.46  0.47    1.29  0.84
3    conventional two-pass    2.55  0.61                  1.92  0.62    0.027  0.019
     average four-split       2.06  0.63                  2.44  0.68    0.009  0.005


Some famous papers revisited - Empirical application (industry portfolios)

specification             5 main principal components in residuals
Market, SMB, HML, MOM     0.14  0.12  0.08  0.06  0.04

risk factor: Market, SMB, HML, MOM
0.27 0.00 1.05 0.19 0.15 0.35 conventional two-pass 1.05 0.20 average four-split 1.15 0.21 1.10 0.24 0.03 0.18 0.03 0.40

Conclusion

What we have done here:
- Showed that the conventional two-pass procedure gives unreliable estimates of risk premia in empirically relevant situations.
- Proposed an alternative push-button procedure, based on split-sample IV, that is robust to weak factors and strong missing factors.
- The alternative procedure yields consistent and asymptotically normal estimates under many-asset, weak-factor asymptotics.