Factor models with many assets: strong factors, weak factors, and the two-pass procedure

Stanislav Anatolyev (CERGE-EI and NES) and Anna Mikusheva (MIT)
December 2017

Introduction

Linear factor-pricing models
Factor-pricing model: $E r_{it} = \lambda' \beta_i$, where $\beta_i = \mathrm{var}(F_t)^{-1} \mathrm{cov}(F_t, r_{it})$.
$r_{it}$ is the excess return on portfolio $i$ in period $t$, $F_t$ are risk factors, $\beta_i$ are risk exposures, and $\lambda$ are risk premia.
The classical estimation approach is the two-pass procedure (Fama and MacBeth, 1973) with a standard error correction (Shanken, 1992):
1. Estimate $\beta_i$ for each portfolio from a time-series regression;
2. Estimate $\lambda$ from a cross-sectional regression of average returns on the estimated betas.
Quality control:
- Is the price of risk non-zero? Test: $H_0: \lambda = 0$.
- Do these risks price the market? Specification test: $H_0: E r_{it} = \lambda' \beta_i$.
- How much of the variation in average returns does risk exposure explain? Second-pass $R^2$.
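The two passes listed above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' code: the names `two_pass` and `simulate_strong`, the simulated design, and the choice to run the second pass without an intercept (to match $E r_{it} = \lambda' \beta_i$) are all assumptions, and the Shanken (1992) standard-error correction is left out.

```python
import numpy as np

def two_pass(R, F):
    """Two-pass point estimate of risk premia (hypothetical helper, no standard errors).

    R: (T, N) array of excess returns; F: (T, k) array of factors.
    Returns (beta_hat, lam_hat) with shapes (N, k) and (k,).
    """
    T = R.shape[0]
    # First pass: time-series OLS of each asset's returns on a constant and the factors.
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # shape (k + 1, N)
    beta_hat = coef[1:].T                          # drop the intercepts
    # Second pass: cross-sectional OLS of average returns on the estimated betas.
    rbar = R.mean(axis=0)
    lam_hat, *_ = np.linalg.lstsq(beta_hat, rbar, rcond=None)
    return beta_hat, lam_hat

def simulate_strong(N, T, lam, rng):
    """Simulated returns with strong factors only (illustrative design, an assumption)."""
    k = len(lam)
    F = rng.standard_normal((T, k))                # factors with E F_t = 0
    beta = 1.0 + rng.standard_normal((N, k))       # exposures of order O(1)
    e = rng.standard_normal((T, N))                # idiosyncratic errors
    R = beta @ lam + F @ beta.T + e                # r_it = lam'beta_i + F_t'beta_i + e_it
    return R, F

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lam = np.array([0.5, 0.3])
    R, F = simulate_strong(N=200, T=500, lam=lam, rng=rng)
    _, lam_hat = two_pass(R, F)
    print("true premia:", lam, "two-pass estimate:", lam_hat)
```

With strong factors the second-pass estimate should land close to the true premia, which is the benchmark case the rest of the talk departs from.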

Linear factor-pricing models
- The first and best known is the CAPM (Sharpe, 1964; Lintner, 1965).
- The second best known is Fama and French (1993), which includes the market portfolio, the size factor SMB (small-minus-big) and the book-to-market factor HML (high-minus-low).
- Some models have factors based on market behavior, for example the momentum factor MOM (Jegadeesh and Titman, 1993).
- Some have macroeconomic factors, for example the consumption-to-wealth ratio cay (Lettau and Ludvigson, 2001).
- Harvey, Liu and Zhu (2016) list hundreds of papers proposing, justifying and estimating various linear factor-pricing models.

Problem 1: weak identification?
If some of the observed factors are only weakly correlated with returns, then the second-pass parameters may be weakly identified.
- Kan and Zhang (1999): useless factors lead to spurious inference.
- Kleibergen and Zhan (2015): weak factors may arise from poor measurement of true factors.
- Kleibergen (2009): weak factors distort consistency and asymptotic normality of risk-premia estimates.

Problem 2: missing factors?
An empirical fact found in Kleibergen and Zhan (2015): many well-known linear factor-pricing models have a very strong remaining factor structure in the residuals.
Example: for all Lettau and Ludvigson (2001) specifications, the first three principal components of the residuals explain 82% - 96% of the remaining cross-sectional variation.
The one exception found to this rule: Fama and French.

Observation in our paper: large T and large N?
Traditionally (and in all the papers mentioned) the asymptotic results are derived under the assumption that $N$ is fixed and $T \to \infty$.
However, the most often used datasets are:
- Jagannathan and Wang (1996): N = 100, T = 330;
- Fama-French: N = 25, T = 141;
- Gagliardini, Ossola and Scaillet (2016): N = 44 and N = 9936, T = 546.
N and T are comparable in size.
More adequate asymptotic approximations may result from letting both $N, T \to \infty$.

Our setup includes simultaneously
- Weak observed factors: some observed factors are only weakly correlated with returns; we model the corresponding risk exposure coefficients $\beta_i$ as being of order $O(1/\sqrt{T})$. Thus, the first-stage estimation error is of the same order of magnitude as the coefficients themselves.
- Missing factors: there is a strong factor structure present in the error terms.
- Large-N, large-T asymptotics: many assets and a long time span, $N, T \to \infty$.

Findings of our paper
We prove that the classical two-pass procedure fails in our setting: it gives inconsistent estimates of the premia on weak factors, invalid inferences, and a significant finite-sample bias in the estimate of the risk premia on strong observed factors.
We propose new procedures that provide consistent estimators of risk premia and guarantee asymptotically Gaussian inferences.

Findings of our paper
We develop an estimation procedure for risk premia in an environment with many assets, weak included factors and strong excluded factors, with the following features:
- it yields consistent estimates when the traditional two-pass procedure fails;
- it yields consistent estimates without knowledge of which factors are strong and which are weak;
- it does not lose efficiency if the traditional two-pass procedure works;
- it is a procedure of the push-button type: easy to implement, uses standard estimation techniques.

Outline
1. Introduction
2. Setup and main assumptions
3. Two-pass procedure fails: why?
4. Our proposed solution
5. Some famous papers revisited

Setup and main assumptions

Setup
We observe excess returns on assets or portfolios $\{r_{it}, i = 1, \dots, N, t = 1, \dots, T\}$ and $k_F \times 1$ risk factors $\{F_t, t = 1, \dots, T\}$ that follow the correctly specified linear factor-pricing model:
$E r_{it} = \lambda' \beta_i$, where $\beta_i = \mathrm{var}(F_t)^{-1} \mathrm{cov}(F_t, r_{it})$.
This is equivalent to assuming that
$r_{it} = \lambda' \beta_i + (F_t - E F_t)' \beta_i + \varepsilon_{it}$,
where the random error terms $\varepsilon_{it}$ have mean zero and are uncorrelated with $F_t$.
We treat $\lambda$ and $\beta_i$ as non-random, while $r_{it}$, $F_t$ and $\varepsilon_{it}$ are random.

Setup: weak observed factors
We divide the factors $F_t = (F_{t,1}, F_{t,2})$ and exposures $\beta_i = (\beta_{i,1}, \beta_{i,2})$ into strong and weak:
$\beta_{i,2} = b_i / \sqrt{T}$,
where we make the same assumptions about the size of $\beta_{i,1}$ and the size of $b_i$ (they are $O(1)$).
The estimation error for each $\beta_i$ is of order $O_p(1/\sqrt{T})$, similar to the size of $\beta_{i,2}$.
In a setting with $N$ fixed and $T \to \infty$, this corresponds to weak identification.
We do not assume that the econometrician knows which factors are weak or the number of weak factors (our results hold under more general assumptions, namely that some linear combination of factors is weak).

Setup: missing factors
Model: $r_{it} = \lambda' \beta_i + (F_t - E F_t)' \beta_i + \varepsilon_{it}$.
We assume that the error terms are not autocorrelated (efficient market hypothesis) but have non-trivial cross-sectional dependence: they have an unobserved factor structure,
$\varepsilon_{it} = v_t' \mu_i + e_{it}$,
where
- $v_t$ are unobserved random variables with mean zero and unit variance (normalization), uncorrelated with $e_{it}$;
- $\mu_i$ are unknown constant loadings of size $O(1)$;
- $e_{it}$ are weakly cross-sectionally correlated.
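One way to see what a strong factor structure in the errors looks like is to simulate the model and check how much of the residual variation the leading principal components account for. The sketch below is illustrative only: the design (one observed factor, one missing factor, Gaussian draws, loadings with standard deviation 2) and the names `simulate_missing_factor` and `residual_pc_shares` are assumptions, and the share is computed here as squared singular values of the first-pass residual matrix divided by their sum, which may differ from the normalization used in the cited papers.

```python
import numpy as np

def simulate_missing_factor(N, T, lam, rng):
    """One observed factor plus one unobserved factor in the errors (assumed design)."""
    F = rng.standard_normal(T)                  # observed factor
    v = rng.standard_normal(T)                  # unobserved factor: mean 0, variance 1
    beta = 1.0 + rng.standard_normal(N)         # exposures to the observed factor
    mu = 2.0 * rng.standard_normal(N)           # O(1) loadings on the missing factor
    e = rng.standard_normal((T, N))             # idiosyncratic errors
    R = lam * beta + np.outer(F, beta) + np.outer(v, mu) + e
    return R, F

def residual_pc_shares(R, F, n_components=5):
    """Share of residual variation captured by each leading principal component."""
    T = R.shape[0]
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    resid = R - X @ coef                         # first-pass residuals, shape (T, N)
    s = np.linalg.svd(resid, compute_uv=False)   # singular values, descending
    shares = s**2 / np.sum(s**2)
    return shares[:n_components]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R, F = simulate_missing_factor(N=50, T=300, lam=0.5, rng=rng)
    print(residual_pc_shares(R, F))
```

A dominant first share signals that the residuals are far from weakly cross-correlated, which is the situation the missing-factor assumption is meant to capture.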


Two-pass procedure fails: why?

Asymptotics of the two-pass procedure
- If all observed factors are strong: $\sqrt{T}(\hat\lambda_{TP} - \lambda) \Rightarrow N(0, V)$.
- If some observed factors are weak, but there are no missing factors in the errors (errors-in-variables bias): $\hat\lambda_{TP,1}$ is consistent and Gaussian, but biased (inferences are not valid); $\hat\lambda_{TP,2}$ is inconsistent.
- If some observed factors are weak and there are missing factors in the errors (errors-in-variables + omitted variable): $\hat\lambda_{TP,1}$ is consistent, but biased and with a non-standard distribution; $\hat\lambda_{TP,2}$ is inconsistent.

Why two-pass fails? No missing factors case
Assume some observed factors are weak, but there is no factor structure in the errors:
$r_{it} = \lambda' \beta_i + (F_t - E F_t)' \beta_i + e_{it}$, where $e_{it}$ are weakly dependent.
First-pass estimates:
$\hat\beta_i = \left( \sum_{t=1}^T F_t F_t' \right)^{-1} \sum_{t=1}^T F_t r_{it} = (\beta_i + u_i)(1 + o_p(1))$,
where $u_i = \frac{1}{T} \sum_{t=1}^T \Sigma_F^{-1} F_t e_{it}$ are asymptotically uncorrelated for different $i$ and unrelated to $\beta_i$.

Why two-pass fails? No missing factors case
Ideal regression: if one regresses $\bar r_i = \frac{1}{T} \sum_{t=1}^T r_{it}$ on $\beta_i$, one obtains a consistent estimate of $\lambda$.
But we have only estimates instead, and $u_i = O_p(1/\sqrt{T})$:
$\begin{pmatrix} \hat\beta_{i,1} \\ \hat\beta_{i,2} \end{pmatrix} = \begin{pmatrix} \beta_{i,1} \\ \beta_{i,2} \end{pmatrix} + \begin{pmatrix} u_{i,1} \\ u_{i,2} \end{pmatrix} = \begin{pmatrix} \beta_{i,1}(1 + o(1)) \\ \beta_{i,2} + u_{i,2} \end{pmatrix}$
The mistake in $\hat\beta_{i,2}$ is of the same order of magnitude as the coefficient itself. It behaves like classical measurement error!
The regression of $\bar r_i$ on $\hat\beta_i$ has an attenuation bias!
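To make the attenuation concrete, here is a stylized scalar version of the argument. It is a sketch under simplifying assumptions that are not part of the talk: a single weak factor, a second pass without intercept, and errors $u_i$ that are uncorrelated across $i$ with $\beta_i$ and with the averaged noise $\bar e_i$. The symbol $\tilde\lambda = \lambda + \bar F - E F_t$ is notation introduced here for the coefficient that appears in average returns.

```latex
\[
\hat\lambda_{TP}
  = \frac{\sum_{i=1}^{N} \hat\beta_i \, \bar r_i}{\sum_{i=1}^{N} \hat\beta_i^{2}},
\qquad
\bar r_i = \tilde\lambda \, \beta_i + \bar e_i,
\qquad
\hat\beta_i = \beta_i + u_i .
\]
% For large N the cross terms sum(beta_i u_i), sum(beta_i e_i), sum(u_i e_i)
% are negligible relative to the sums of squares, so
\[
\hat\lambda_{TP}
  \approx \tilde\lambda \cdot
  \frac{\sum_{i} \beta_i^{2}}{\sum_{i} \beta_i^{2} + \sum_{i} u_i^{2}} .
\]
% With beta_i = b_i / sqrt(T) and u_i = O_p(1 / sqrt(T)), both sums are of
% order N / T, so the ratio stays strictly below one however large T is.
```

With strong factors $\sum_i \beta_i^2$ dominates $\sum_i u_i^2$ and the ratio tends to one; with weak factors the two sums are of the same order, so the shrinkage toward zero does not vanish.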

No missing factors case: solution
Idea: split the sample in two, $T_1 \cup T_2 = \{1, \dots, T\}$.
Estimate $\beta_i$ twice:
$\hat\beta_i^{(j)} = \left( \sum_{t \in T_j} F_t F_t' \right)^{-1} \sum_{t \in T_j} F_t r_{it} = (\beta_i + u_i^{(j)})(1 + o_p(1))$, $j = 1, 2$.
The estimation mistakes $u_i^{(1)}$ and $u_i^{(2)}$ are (asymptotically) uncorrelated.
Use $\hat\beta_i^{(1)}$ as a regressor and $\hat\beta_i^{(2)}$ as an instrument (or vice versa, or both, and average the final estimates).
The idea of sample splitting (and its extreme version, leave-one-out or jackknife) has been used in the many-weak-IV model (Hansen, Hausman and Newey, 2008).
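The split-sample idea can be tried on simulated data. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: one weak factor with $\beta_i = b_i/\sqrt{T}$, no missing factors, i.i.d. Gaussian errors, a split into first and second halves, and hypothetical helper names (`ols_slope`, `two_pass`, `split_sample_iv`, `simulate_weak`).

```python
import numpy as np

def ols_slope(R, F):
    """Per-asset OLS slope of returns on a single factor (regression with intercept)."""
    Fc = F - F.mean()
    return Fc @ R / (Fc @ Fc)                   # shape (N,)

def two_pass(R, F):
    """Conventional two-pass estimate, second pass without intercept."""
    beta_hat = ols_slope(R, F)
    rbar = R.mean(axis=0)
    return beta_hat @ rbar / (beta_hat @ beta_hat)

def split_sample_iv(R, F):
    """Betas from one half as regressor, betas from the other half as instrument."""
    half = len(F) // 2
    b1 = ols_slope(R[:half], F[:half])
    b2 = ols_slope(R[half:], F[half:])
    rbar = R.mean(axis=0)
    lam_12 = b2 @ rbar / (b2 @ b1)              # b1 regressor, b2 instrument
    lam_21 = b1 @ rbar / (b1 @ b2)              # roles swapped
    return 0.5 * (lam_12 + lam_21)              # average the two estimates

def simulate_weak(N, T, lam, rng):
    """Single weak factor, no factor structure in the errors (assumed design)."""
    F = rng.standard_normal(T)
    b = 1.0 + rng.standard_normal(N)
    beta = b / np.sqrt(T)                       # weak exposures of order 1/sqrt(T)
    e = rng.standard_normal((T, N))
    R = lam * beta + np.outer(F, beta) + e
    return R, F

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R, F = simulate_weak(N=5000, T=200, lam=2.0, rng=rng)
    print("two-pass:", two_pass(R, F), "split-sample IV:", split_sample_iv(R, F))
```

In this design the two-pass estimate should be visibly shrunk toward zero, while the split-sample IV estimate should sit near the true premium, because the estimation errors in the two halves are independent of each other.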

Factors in errors. Why two-pass fails?
Model with factor structure in the errors:
$r_{it} = \lambda' \beta_i + (F_t - E F_t)' \beta_i + v_t' \mu_i + e_{it}$,
where $v_t$ is unobserved, $\mu_i$ are unknown, and $e_{it}$ are weakly cross-correlated.
First step:
$\hat\beta_i = \left( \sum_{t=1}^T F_t F_t' \right)^{-1} \sum_{t=1}^T F_t r_{it} = \left( \beta_i + \frac{\eta_T \mu_i}{\sqrt{T}} + u_i \right)(1 + o_p(1))$,
where $\eta_T = \frac{1}{\sqrt{T}} \Sigma_F^{-1} \sum_{t=1}^T F_t v_t'$ comes from the unobserved factor structure.

Factors in errors. Why two-pass fails?
$\hat\beta_i = \left( \beta_i + \frac{\eta_T \mu_i}{\sqrt{T}} + u_i \right)(1 + o_p(1))$
Now the estimation error $\frac{\eta_T \mu_i}{\sqrt{T}} + u_i$ is NOT classical measurement error:
- both terms, $\frac{\eta_T \mu_i}{\sqrt{T}}$ and $u_i$, are stochastically of order $O_p(1/\sqrt{T})$;
- estimation errors are cross-correlated (for different $i$) due to the term $\frac{\eta_T \mu_i}{\sqrt{T}}$;
- the estimation error may be correlated with the regressor if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.

Factors in errors. Why two-pass fails?
Model with factor structure in the errors:
$r_{it} = \lambda' \beta_i + (F_t - E F_t)' \beta_i + v_t' \mu_i + e_{it}$.
Ideal regression:
$y_i = \sqrt{T} \, \bar r_i = \frac{1}{\sqrt{T}} \sum_{t=1}^T r_{it} = \lambda' (\sqrt{T} \beta_i) + \eta_v' \mu_i + \varepsilon_i$.
If $\mu_i$ is present but you know only $\beta_i$, we have an omitted variable; it causes omitted variable bias if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.

Factors in errors. Why two-pass fails?
Summary: if there is no factor structure in the errors, we have a classical errors-in-variables problem and the associated attenuation bias.
If we have a factor structure in the errors, we additionally have:
- non-classical errors-in-variables (mistakes in the regressor $\hat\beta_{i,2}$ are cross-correlated and correlated with $\beta_i$);
- even if we know $\beta_i$, there is omitted variable bias in the ideal regression if the sample correlation between $\beta_i$ and $\mu_i$ is non-zero.


Our proposed solution

Our proposed solution: idea
We reconsider sample splitting. We have an estimate of $\beta_i$ for each sub-sample:
$\hat\beta_i^{(j)} = \left( \sum_{t \in T_j} F_t F_t' \right)^{-1} \sum_{t \in T_j} F_t r_{it} = \left( \beta_i + \frac{\eta_j \mu_i}{\sqrt{|T_j|}} + u_i^{(j)} \right)(1 + o_p(1))$,
where $\eta_j = \frac{1}{\sqrt{|T_j|}} \Sigma_F^{-1} \sum_{t \in T_j} F_t v_t' \Rightarrow N(0, \Omega_{Fv})$.
The $\eta_j$ are independent for different $j$ and independent of the errors $u_i^{(j)}$.

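Our proposed solution: idea
$\hat\beta_i^{(j)} = \left( \beta_i + \frac{\eta_j \mu_i}{\sqrt{|T_j|}} + u_i^{(j)} \right)(1 + o_p(1))$
We can construct a proxy for $\mu_i$ (!!!):
$\hat\beta_i^{(1)} - \hat\beta_i^{(2)} = \left( \frac{\eta_1}{\sqrt{|T_1|}} - \frac{\eta_2}{\sqrt{|T_2|}} \right) \mu_i + (u_i^{(1)} - u_i^{(2)})$
If $|T_j| = T/4$, then the random coefficient $\left( \frac{\eta_1}{\sqrt{|T_1|}} - \frac{\eta_2}{\sqrt{|T_2|}} \right) = O_p(1/\sqrt{T})$ and the error $(u_i^{(1)} - u_i^{(2)}) = O_p(1/\sqrt{T})$.
The proxy $\hat\beta_i^{(1)} - \hat\beta_i^{(2)}$ mis-measures $\mu_i$, but the measurement error is classical: not cross-correlated and not correlated with the regressors.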

Our proposed solution: idea
- Split the sample into 4 equal sub-samples.
- Estimate $\hat\beta_i^{(j)}$ for $j = 1, \dots, 4$.
- Run an IV regression of $\bar r_i$ on the regressor $\hat\beta_i^{(1)}$ and the proxy based on $\hat\beta_i^{(1)} - \hat\beta_i^{(2)}$, with instruments $\hat\beta_i^{(3)}$ and $\hat\beta_i^{(3)} - \hat\beta_i^{(4)}$.
- For efficiency considerations you may repeat this 4 times, rotating the indices 1-4, and average the estimates you obtain for $\lambda$.
We also provide a formula for calculating the covariance matrix of our estimate.
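The steps above can be sketched for the simplest case of one observed weak factor and one missing factor, where the IV regression is just-identified. Everything specific in the block is an assumption made for illustration: the helper names (`subsample_betas`, `four_split`, `simulate`), the use of consecutive time blocks as sub-samples, the Gaussian design with exposures correlated with $\mu_i$, and the omission of the covariance formula. With several observed factors the dimension of the proxy needs more care than this sketch takes.

```python
import numpy as np

def subsample_betas(R, F, n_splits=4):
    """OLS slopes on a single factor (with intercept), one vector per sub-sample."""
    out = []
    for idx in np.array_split(np.arange(len(F)), n_splits):
        Fc = F[idx] - F[idx].mean()
        out.append(Fc @ R[idx] / (Fc @ Fc))            # shape (N,)
    return out

def four_split(R, F):
    """Average of four just-identified IV estimates with rotated sub-sample roles."""
    b = subsample_betas(R, F, 4)
    rbar = R.mean(axis=0)
    estimates = []
    for s in range(4):
        j1, j2, j3, j4 = [(s + m) % 4 for m in range(4)]
        X = np.column_stack([b[j1], b[j1] - b[j2]])    # regressor and proxy for mu_i
        Z = np.column_stack([b[j3], b[j3] - b[j4]])    # instruments from other blocks
        theta = np.linalg.solve(Z.T @ X, Z.T @ rbar)   # IV: (Z'X)^{-1} Z' rbar
        estimates.append(theta[0])                     # coefficient on beta-hat
    return float(np.mean(estimates))

def simulate(N, T, lam, rng, sigma_e=1.0):
    """Weak observed factor plus a missing factor in the errors (assumed design)."""
    F = rng.standard_normal(T)                         # observed factor
    v = rng.standard_normal(T)                         # missing factor
    mu = rng.standard_normal(N)                        # loadings on the missing factor
    b = 1.0 + 0.5 * mu + rng.standard_normal(N)        # exposures correlated with mu
    beta = b / np.sqrt(T)                              # weak: order 1/sqrt(T)
    e = sigma_e * rng.standard_normal((T, N))
    R = lam * beta + np.outer(F, beta) + np.outer(v, mu) + e
    return R, F

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R, F = simulate(N=5000, T=400, lam=2.0, rng=rng)
    print("four-split estimate:", four_split(R, F))
```

Because the coefficient on the proxy is random, the precision of a single run depends on the realized draws of the missing factor; rotating the indices and averaging is the remedy the slide suggests.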

Our proposed solution
The exact asymptotic distribution of $\hat\lambda_{4S}$ is not Gaussian but rather mixed Gaussian. The estimated variance matrix is asymptotically random, though non-degenerate with probability 1.
This is due to the fact that the coefficient on the proxy for $\mu_i$ is random. It leads to the information contained in the second-stage IV being random, though NOT weak with probability 1.
Our 4-split estimator:
- it yields consistent estimates when the traditional two-pass procedure fails;
- it yields consistent estimates without knowledge of which factors are strong and which are weak;
- it does not lose efficiency if the traditional two-pass procedure works;
- it is a procedure of the push-button type: easy to implement, uses standard estimation techniques.



Some famous papers revisited

Empirical application (Fama-French portfolios)

no.  specification       5 main principal components in residuals
1    Market, SMB, HML    0.29  0.14  0.11  0.07  0.04
2    Market, HML         0.62  0.10  0.05  0.03  0.03
3    Market, HML, cay    0.62  0.10  0.05  0.03  0.03

no.  risk factor              Market      SMB         HML         cay
1    conventional two-pass    2.70 0.61   0.69 0.48   1.96 0.58   -
1    average four-split       2.80 0.62   0.46 0.47   1.29 0.84   -
3    conventional two-pass    2.55 0.61   -           1.92 0.62   0.027 0.019
3    average four-split       2.06 0.63   -           2.44 0.68   0.009 0.005


Empirical application (industry portfolios)

specification             5 main principal components in residuals
Market, SMB, HML, MOM     0.14  0.12  0.08  0.06  0.04

risk factor: Market, SMB, HML, MOM
0.27 0.00 1.05 0.19 0.15 0.35
conventional two-pass 1.05 0.20
average four-split 1.15 0.21
1.10 0.24 0.03 0.18 0.03 0.40

Conclusion
What we have done here:
- Showed that the conventional two-pass procedure gives unreliable estimates of risk premia in empirically relevant situations.
- Proposed an alternative push-button procedure, robust to weak factors and strong missing factors, based on split-sample IV.
- The alternative procedure yields consistent and asymptotically normal estimates under many-asset, weak-factor asymptotics.