Regression: Ordinary Least Squares
Mark Hendricks
Autumn 2017, FINM Intro: Regression

Outline
- Regression
- OLS Mathematics
- Linear Projection
Regression analysis in finance

Regression applications in finance include...
- Risk management. Find how a portfolio return is impacted by some factor/instrument.
- Forecasting. Build forecasts of financial and macroeconomic variables (inflation, yields, etc.).
- Pricing. The fundamental asset pricing equation is a linear relation between risk and return. (We will see this more in Portfolio Theory 1.)

Beyond regression

Nonlinear analysis is also important.
- Options pricing. Differential equations requiring martingale methods, simulation, finite differences, etc.
- Value at Risk. Model the tail of the distribution of profits and losses.
- Volatility models. Need nonlinear time-series models such as GARCH.
Linear regression model

Consider a linear regression model involving two variables, y and x:

    y = α + βx + ɛ

- y is referred to as the regressand, or explained variable.
- x is referred to as the regressor, covariate, or explanatory variable.
- α and β are the (constant) parameters of the model.

Example: Portfolio factor sensitivity

Decompose the hedge fund return into a market-driven and a market-neutral return:

    R_p = α + β R_mkt + ɛ

- The random total portfolio return is denoted by R_p.
- The random return on the S&P 500 is denoted by R_mkt.

Interpret...
- β = 0, 1, 2
- α = -.01, 0, .01
Example: Portfolio decomposition

Continuing the example from above,

    R_p = α + β R_mkt + ɛ

We may want to know how much of R_p is explained by R_mkt.

R-squared (R²) is a metric of the variation explained in the regression model.
- Is the hedge fund driven by market returns if β = 1, R² = .10?
- How about β = .5, R² = .50?

(Notation: R² is standard notation in regression analysis; it has nothing to do with my choice of variable names R_p, R_mkt.)

Univariate regression

When there is only one regressor, x, we will see that the OLS estimator is simply

    β = cov(y, x) / var(x)

and that the R-squared statistic is simply

    R² = [corr(y, x)]²

So why bother with regression if we just need covariances and variances?
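The univariate identities above are easy to check numerically. The sketch below uses simulated data (not from the lecture) and NumPy; all variable names and parameter values are illustrative.

```python
# Verify beta = cov(y, x) / var(x) and R^2 = corr(y, x)^2 on simulated data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.01 + 1.2 * x + rng.normal(scale=0.5, size=500)

# Slope from the covariance formula
beta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)

# Same slope via least squares on [1, x]
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# R-squared two ways: squared correlation, and 1 - SSR/SST
r2 = np.corrcoef(y, x)[0, 1] ** 2
resid = y - X @ b
r2_ols = 1 - resid.var(ddof=0) / y.var(ddof=0)

print(np.isclose(beta, b[1]), np.isclose(r2, r2_ols))
```

With a constant included, both identities hold exactly (up to floating-point error), which is why the univariate case needs nothing beyond covariances and variances.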
Multiple regression

In the case of multiple regressors, the OLS statistics are not so easily formed.

Augment our hedge-fund regression with a second regressor: a US dollar index, R_$.

    R_p = α + β_1 R_mkt + β_2 R_$ + ɛ

The formulas for β_1 and β_2 do not follow as easily:

    β_1 ≠ cov(R_p, R_mkt) / var(R_mkt)

The R-squared stat captures the correlation between R_p and the combined space spanned by both R_mkt and R_$.

Caution!

Remember that the multivariate beta is not the same as the univariate beta!

    R_p = α + β_1 R_mkt + β_2 R_$ + ɛ

Perhaps R_p is positively correlated with R_$, and thus would have a positive beta if regressed on R_$ alone. But β_2 is not a measure of this pairwise comovement!

β_2 gives the impact of R_$ on R_p if we hold R_mkt constant!

Thus, when the regressors are correlated, multivariate betas can be quite different from their univariate counterparts.
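The caution about correlated regressors can be demonstrated directly. In this simulated sketch (all names and parameters are illustrative, not from the lecture), the portfolio loads only on the market, yet its univariate beta on a correlated dollar-index proxy is clearly positive while its multivariate beta is near zero.

```python
# Univariate vs. multivariate beta when regressors are correlated.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
r_mkt = rng.normal(size=n)
r_usd = 0.8 * r_mkt + rng.normal(scale=0.6, size=n)   # correlated with the market
r_p = 0.5 * r_mkt + rng.normal(scale=0.3, size=n)     # true loading on r_usd is zero

# Univariate beta on r_usd alone: picks up the market through the correlation
beta_uni = np.cov(r_p, r_usd, ddof=1)[0, 1] / np.var(r_usd, ddof=1)

# Multivariate betas: each holds the other regressor fixed
X = np.column_stack([np.ones(n), r_mkt, r_usd])
b = np.linalg.lstsq(X, r_p, rcond=None)[0]

print(beta_uni, b[2])   # beta_uni is clearly positive; b[2] is near zero
```

The pairwise comovement shows up in beta_uni, but once R_mkt is held constant the dollar coefficient collapses toward its true value of zero.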
Units

When interpreting the regression coefficients, be careful to remember the underlying units.

    R_p = α + β_1 R_mkt + β_2 R_$ + ɛ

Suppose the volatility of R_mkt is three times larger than the volatility of R_$. Then even if β_2 is larger than β_1, we need to remember that one-unit changes in R_$ happen less frequently.

In this situation it may be more helpful to report β_1 σ_1 and β_2 σ_2 to convey the one-standard-deviation impact of each factor.

Outline
- Regression
- OLS Mathematics
- Linear Projection
Multivariate linear regression

In a multivariate regression model with k regressors,

    y = α + β_1 x_1 + β_2 x_2 + ... + β_k x_k + ɛ
      = α + Σ_{j=1}^{k} β_j x_j + ɛ
      = x'β + ɛ

The last line defines x such that its first element is the constant 1, and the first element of β is α.

Including the regression constant in the vector notation will simplify the algebra, as we will always consider the case where the first regressor is a constant.

Data from the regression model

A sample of n observations is denoted as (y_i, x_i) for i = 1, 2, ..., n:

    y_i = x_i'β + ɛ_i

where

    x_i ≡ [1, x_{i,1}, x_{i,2}, ..., x_{i,k}]'
    β ≡ [α, β_1, β_2, ..., β_k]'
Regression estimate

Consider a sample estimate of β, denoted by b. Then

    y_i = x_i'b + e_i

where e_i denotes a sample residual,

    e_i = y_i - x_i'b

This is the estimated regression, as opposed to the population regression equation above.

Ordinary least squares

The ordinary least squares estimator of β minimizes the sum of squared sample errors:

    b ≡ argmin_{b_o} Σ_{i=1}^{n} (e_i)²
      = argmin_{b_o} Σ_{i=1}^{n} (y_i - x_i'b_o)²
OLS problem

Rewrite the OLS problem in matrix notation:

    b ≡ argmin_{b_o} e'e
      = argmin_{b_o} (Y - X b_o)'(Y - X b_o)

where X stacks the x_i' as rows, and Y and e stack the observations and residuals:

    X ≡ [x_1, x_2, ..., x_n]'
    Y ≡ [y_1, y_2, ..., y_n]'
    e ≡ [e_1, e_2, ..., e_n]'

Assumption: Full rank

Assumption 1: X'X is full rank.

Equivalently, assume that there is no exact linear relationship among the regressors.

Clearly, the existence of the OLS estimator requires that this assumption be satisfied.

Multicollinearity refers to the case where this assumption fails.
OLS estimate

Solving the minimization problem above gives the OLS estimate:

    b = (X'X)^{-1} X'Y

This estimate yields sample residuals of

    e = Y - X(X'X)^{-1} X'Y
      = (I - X(X'X)^{-1} X') Y

Thus e is orthogonal to X. Equivalently, the in-sample correlation between x_i and e_i is zero.

Alternative OLS derivation

Suppose the population correlation between x and ɛ is zero. Thus,

    0 = E[xɛ]
    0 = E[x(y - x'β)]
    β = (E[xx'])^{-1} E[xy]

If the regression includes a constant, then these terms are covariance matrices, and we can use sample estimators in place of the population moments to get the OLS estimator:

    b = (X'X)^{-1} X'Y
      = ( (1/n) Σ_{i=1}^{n} x_i x_i' )^{-1} ( (1/n) Σ_{i=1}^{n} x_i y_i )
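The closed-form estimate and the orthogonality of the residuals are easy to verify in a few lines. A sketch with simulated data (illustrative names and coefficients, not from the lecture):

```python
# Compute b = (X'X)^{-1} X'Y and check that the residuals are orthogonal to X.
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # constant in column 1
Y = X @ np.array([0.5, 1.0, -2.0, 0.3]) + rng.normal(size=n)

# Solve the normal equations (X'X) b = X'Y rather than forming an explicit inverse
b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b

print(np.allclose(X.T @ e, 0))   # residuals orthogonal to every column of X
```

Because the first column of X is the constant, X'e = 0 also forces the residuals to have zero sample mean, which is the point of the next slide.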
Regression with an intercept

The assumption that X includes a column of 1's is important.

Including a constant in the regression is equivalent to running a regression with demeaned data.

Running a regression on just a constant regressor, and nothing else, would simply pick up the mean of the data.

Including a constant in the regression means the regressors try to match the variation in the y data, not the overall level.

Example: Risk premia

A fundamental theorem of asset pricing says that there is a linear relation between the risk premium of asset i, π_i, and a certain risk measure, x_i:

    π_i = α + βx_i + ɛ_i

The Portfolio Theory class covers this theory in detail, but for now take it as given.

Test this theory with a linear regression. Try both including a constant, α, and without. Risk and return data are collected on various industry portfolios.
Example: Regression with and without an intercept

[Figure: Mean excess return (annual %) vs. market risk for industry portfolios (NonDur, Durable, Manuf, Energy, Tech, Telcom, Shop, Health, Util, Other), with fitted lines including and excluding an intercept. Data source: Ken French. Monthly, 1926-2011.]
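The earlier claim that including a constant is equivalent to regressing demeaned data can be checked numerically. A sketch with simulated data, not the industry-portfolio data shown in the figure:

```python
# Slope with an intercept equals the slope from demeaned data with no intercept.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, size=300)            # regressor with a nonzero mean
y = 1.0 + 0.7 * x + rng.normal(size=300)

# Slope when a constant is included in the regression
X = np.column_stack([np.ones_like(x), x])
b_const = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Slope from regressing demeaned y on demeaned x, with no intercept
xd, yd = x - x.mean(), y - y.mean()
b_demeaned = (xd @ yd) / (xd @ xd)

print(np.isclose(b_const, b_demeaned))
```

Dropping the constant instead forces the fitted line through the origin, which is why the two fitted lines in the figure differ.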
Residuals with zero mean

By assuming the model includes a constant,

    E[xɛ] = 0   implies   E[ɛ] = 0

By including a constant in the sample estimation,

    X'e = 0   implies   (1/n) Σ_{i=1}^{n} e_i = ē = 0

OLS as linear projection

Using the derivation of b and e, we can write the regression equation as a linear projection:

    Y = PY + MY
    P ≡ X(X'X)^{-1} X'
    M ≡ I_n - X(X'X)^{-1} X'

P is an n x n matrix which projects data into the space spanned by the column vectors of X.

M is an n x n matrix which projects data into the space orthogonal to X.
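The projection identities behind P and M are straightforward to confirm numerically. A sketch (simulated data, illustrative dimensions):

```python
# Build P and M and confirm: both are idempotent, and Y = PY + MY.
import numpy as np

rng = np.random.default_rng(4)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projects onto the column space of X
M = np.eye(n) - P                      # projects onto the orthogonal complement

# P and M are idempotent, and the two pieces recombine to Y
print(np.allclose(P @ P, P), np.allclose(M @ M, M), np.allclose(P @ Y + M @ Y, Y))
```

PY is the fitted value X b and MY is the residual vector e, so the decomposition Y = PY + MY is just fitted-plus-residual in matrix form.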
Replicating a stock using other equities

Suppose we want to replicate the return of a certain stock using the returns of 10 other stocks.

The linear projection selects a portfolio (linear combination) of the securities which has maximal correlation with the target security.

- Y is the sample of returns of the target stock.
- X is the sample of returns of the 10 stocks used to mimic the target.

Replicating a stock with linear projection

[Figure: Cumulative percentage return of the replicating portfolio vs. the target stock, 1920-2020. Correlation 85%.]
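The replication exercise can be sketched with simulated returns; the actual data behind the figure are not included here, so every series and parameter below is hypothetical.

```python
# Replicate a target return series by projecting it onto 10 other return series.
import numpy as np

rng = np.random.default_rng(5)
n, k = 1000, 10
factors = rng.normal(size=(n, 3))                     # hypothetical common drivers
loadings = rng.normal(size=(3, k))
stocks = factors @ loadings + rng.normal(scale=0.5, size=(n, k))
target = factors @ np.array([1.0, -0.5, 0.3]) + rng.normal(scale=0.8, size=n)

# OLS weights: the linear projection of the target onto the 10 stocks
X = np.column_stack([np.ones(n), stocks])
w = np.linalg.lstsq(X, target, rcond=None)[0]
replica = X @ w

corr = np.corrcoef(replica, target)[0, 1]
print(round(corr, 2))
```

No other linear combination of these 10 stocks can achieve a higher in-sample correlation with the target, which is exactly the sense in which the projection is "maximal."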
Outline
- Regression
- OLS Mathematics
- Linear Projection

Conditional expectation

It is generally true that given any variable y, along with a vector of variables x, we can write

    y = E[y | x] + ɛ

where ɛ is simply the prediction error,

    ɛ = y - E[y | x]

With a linear model, then under the exogeneity assumption,

    E[y | x] = x'β
Assumption: Exogeneity

Assumption 2: ɛ is exogenous to the regressors, x.

    E[ɛ | x] = 0

The exogeneity assumption
- implies that ɛ is uncorrelated with x.
- implies that ɛ is uncorrelated with any function of x.
- does NOT imply that ɛ is independent of x.

Example: S&P and VIX

[Figure: S&P 500 and VIX, 2000-2015. Example of highly correlated signals.]
Projection and conditional expectation

Assuming full rank and exogeneity, linear projection gives the best linear conditional forecast:

    b = argmin_γ E[(y - x'γ)²]

That is, the linear projection x'b minimizes the mean-squared error.

Further assuming that the process is linear, the conditional expectation is exactly equal to the linear projection:

    E[y | x] = x'b