OUTLINE
Basic Concepts: Multiple Regression
MULTICOLLINEARITY
AUTOCORRELATION
HETEROSCEDASTICITY
RESEARCH IN FINANCE
BASIC CONCEPTS: Multiple Regression
Y_i = β_1 + β_2 X_{1i} + β_3 X_{2i} + β_4 X_{3i} + u_i
BASIC CONCEPTS: Normality Assumption
The CLRM assumes that each disturbance u_i is normally distributed with mean zero and constant variance σ², i.e. u_i ~ N(0, σ²), in
Y_i = β_1 + β_2 X_{1i} + β_3 X_{2i} + β_4 X_{3i} + u_i
BASIC CONCEPTS: Why We Need the Normality Assumption of u_i
1. The influence of the omitted or neglected variables is small and at best random, so by the Central Limit Theorem (CLT) their sum tends toward a normal distribution.
2. Even if the number of such variables is not very large, or if they are not strictly independent, their sum may still be approximately normally distributed.
3. If u_i is normally distributed, the OLS estimators, being linear functions of u_i, are also normally distributed.
4. The normal distribution is a comparatively simple distribution involving only two parameters (mean and variance).
5. In small samples (say n < 100), the normality assumption plays a critical role; if the sample size is reasonably large, it can be relaxed.
6. In large samples, the t and F statistics have approximately the t and F distributions, so the usual tests remain appropriate.
These points underlie testing whether the BLUE conditions hold.
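As a quick check of this assumption, the normality of the OLS residuals can be tested, for example with the Jarque-Bera statistic. The sketch below assumes a hypothetical data file and column names (y, x1, x2, x3); it is illustrative, not the exact procedure used in the slides.

```python
# Minimal sketch: checking normality of OLS residuals with the Jarque-Bera test.
# The data file "data.csv" and the column names y, x1, x2, x3 are hypothetical.
import pandas as pd
import statsmodels.api as sm
from scipy import stats

df = pd.read_csv("data.csv")
X = sm.add_constant(df[["x1", "x2", "x3"]])   # regressors plus intercept
model = sm.OLS(df["y"], X).fit()

jb_stat, jb_pvalue = stats.jarque_bera(model.resid)
print(f"Jarque-Bera = {jb_stat:.3f}, p-value = {jb_pvalue:.4f}")
# p > 0.05: do not reject normality of the residuals
```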
Multiple Regression Analysis
DATA PREPARATION: Seasonal Adjustment
Seasonal adjustment is a statistical method of removing the seasonal component of a time series, used when analyzing non-seasonal trends. Many economic phenomena have seasonal cycles.
[Figure: Dubai crude oil price, monthly 2009-2012, original series (OIL) and seasonally adjusted series (OIL_SA), Census X12 method]
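The slides perform the adjustment with the Census X12 method (as implemented in EViews). As a rough stand-in, the sketch below uses a simple classical decomposition from statsmodels; the file name oil.csv and the series name OIL are hypothetical, and the result will not match X12 exactly.

```python
# Minimal sketch of removing a seasonal component from a monthly series.
# Classical multiplicative decomposition is used here as a simple stand-in
# for the Census X12 method mentioned in the slides.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

oil = pd.read_csv("oil.csv", index_col=0, parse_dates=True)["OIL"]
decomp = seasonal_decompose(oil, model="multiplicative", period=12)
oil_sa = oil / decomp.seasonal      # seasonally adjusted series (OIL_SA)
```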
ALTERNATIVE MODELS: Modeling and Diagnostic Workflow
Stationarity (unit root test: ADF), H0: nonstationary (unit root).
- Stationary, I(0): reject H0 (p <= 0.05).
- Nonstationary, I(1): fail to reject H0 (p > 0.05); make the data stationary by taking first differences, D(data).
- Depending on the order of integration, I(0) or I(1), proceed to Granger causality tests and VAR/VECM models.
ECONOMETRIC PROBLEMS
- Multicollinearity. Run the auxiliary regression X_i = f(X_1, X_2, ..., X_k); VIF(β_i) = 1/(1 − R²), where R² comes from that auxiliary regression. Rule of thumb: VIF < 10, no multicollinearity; if VIF > 10, drop the offending variable.
- Autocorrelation. Test: Durbin-Watson (D.W.) ≈ 2 indicates no autocorrelation; if D.W. is far from 2, add an AR(1) term.
- Heteroscedasticity. Test: White test, H0: homoscedasticity (p > 0.05). If heteroscedasticity is present (p <= 0.05), transform the regression (e.g. divide the equation through by X_i, X_i², or σ_i, as in weighted least squares) or move to ARCH/GARCH models.
Once the econometric problems are cleaned up, go ahead and run OLS.
Source: William H. Greene; Dr. Kulkunya Prayarach
DATA PREPARATION: Stationarity
A stationary series is a stochastic process whose joint probability distribution does not change when shifted in time or space, so its parameters (mean, variance) do not change over time or position. A series that is stationary at level is denoted I(0).
DATA PREPARATION: Random Walk (Unit Root Process)
- Random walk without drift: Y_t = Y_{t-1} + u_t
- Random walk with drift: Y_t = δ + Y_{t-1} + u_t
DATA PREPARATION: Unit Root Test
A test of stationarity (or nonstationarity). Consider Y_t = ρY_{t-1} + u_t, where u_t is a white noise error term, and apply the Augmented Dickey-Fuller (ADF) test. Under H0: ρ = 1, there is a UNIT ROOT (nonstationary), i.e. the series behaves like a random walk without drift, and one CANNOT simply regress Y_t on its lagged value Y_{t-1}.
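A minimal sketch of running the ADF test in Python, assuming the seasonally adjusted series oil_sa from the earlier sketch (any pandas Series would do):

```python
# Augmented Dickey-Fuller (ADF) unit root test.
# H0: the series has a unit root (nonstationary).
from statsmodels.tsa.stattools import adfuller

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(oil_sa, autolag="AIC")
print(f"ADF statistic = {adf_stat:.3f}, p-value = {pvalue:.4f}")
# p <= 0.05: reject H0, the series is stationary, I(0)
# p >  0.05: fail to reject H0, unit root, difference the series
```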
DATA PREPARATION: How to Solve the Unit Root Problem
STEP 1: Take the first difference of the series.
STEP 2: Test for a unit root again. If H0 (unit root) still cannot be rejected, the differenced series is not yet stationary.
STEP 3: Take the second difference and test again; if H0 is rejected, there is no unit root and the differenced series is stationary.
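These steps can be wrapped in a small helper that differences the series and re-tests until the unit root is rejected. This is a sketch under the assumption that a pandas Series is available; the function name is hypothetical.

```python
# Difference-and-retest loop for determining the order of integration I(d).
from statsmodels.tsa.stattools import adfuller

def order_of_integration(series, alpha=0.05, max_diff=2):
    """Return d such that the d-th difference of `series` is stationary."""
    for d in range(max_diff + 1):
        pvalue = adfuller(series.dropna(), autolag="AIC")[1]
        if pvalue <= alpha:
            return d            # stationary at this level of differencing: I(d)
        series = series.diff()  # take the next difference and test again
    return None                 # still nonstationary after max_diff differences
```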
[Figures: Oil Price (WTI) and Exchange Rate series, 2006-2012]
DATA PREPARATION: Gaussian, Standard, or Classical Linear Regression Model (CLRM)
Assumption 1:
[Figure: abnormal profit (%) plotted against the number of stocks]
Assumption 2:
Nonlinear regression: Taylor series expansion; Gauss-Newton iterative method; Newton-Raphson iterative method.
Assumption 3:
Assumption 4:
Assumption 5:
Assumption 6: There must be sufficient variability in the values taken by the regressors.
[Diagram: I. Conceptual Framework; II. Empirical Evidence; III. My Mapping; IV. Linkages: internal factors, external factors, shocks]
Assumption 7: The values of the X variables must vary.
MULTICOLLINEARITY: Is Multicollinearity a Serious Problem?
Assumption 8: There is no perfect multicollinearity among the regressors.
1. What is the nature of multicollinearity?
2. Is multicollinearity really a problem?
3. What are its practical consequences?
4. How does one detect it?
5. What remedial measures can be taken to alleviate the problem of multicollinearity?
MULTICOLLINEARITY: The Nature of Multicollinearity
Multicollinearity is the existence of a perfect (or exact) linear relationship among some or all of the explanatory variables of a regression model.
MULTICOLLINEARITY: Consequences of Multicollinearity
Best Linear Unbiased Estimator: collinearity does not destroy the BLUE property of the OLS estimators, but it inflates their variances and covariances, making precise estimation difficult.
MULTICOLLINEARITY: Detecting Multicollinearity
1. High R² but few significant t ratios. Example: R² = 0.8, yet individual t tests show that none, or only a few, of the partial slope coefficients are statistically different from zero.
2. High pair-wise correlations among regressors.
3. Examination of partial correlations.
MULTICOLLINEARITY: Detecting Multicollinearity (continued)
4. Auxiliary regressions.
5. Eigenvalues and condition index k: 100 < k < 1000 suggests moderate to strong multicollinearity; k > 1000 suggests severe multicollinearity.
6. Tolerance (TOL) and variance inflation factors (VIF): TOL close to 0, or VIF > 10, signals multicollinearity.
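A minimal sketch of the VIF check (item 6), assuming X is a pandas DataFrame holding the regressors; variable names are hypothetical.

```python
# VIF check: each VIF_i = 1/(1 - R_i^2) from the auxiliary regression of X_i on
# the other regressors; VIF > 10 is the usual rule-of-thumb warning signal.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

Xc = sm.add_constant(X)   # include the intercept in the auxiliary regressions
vif = pd.Series(
    [variance_inflation_factor(Xc.values, i) for i in range(1, Xc.shape[1])],
    index=Xc.columns[1:],
)
print(vif[vif > 10])      # candidate variables to drop or transform
```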
MULTICOLLINEARITY: Remedial Measures
1. Do nothing: multicollinearity is "God's will", not a problem with OLS or statistical technique in general (Blanchard).
2. Rule-of-thumb procedures:
(1) A priori information
(2) Combining cross-sectional and time series data
(3) Dropping variable(s), at the risk of specification bias
(4) Transformation of variables
(5) Additional or new data (increasing the sample size)
(6) Polynomial regression
(7) Factor analysis
Autocorrelation: Nature of Autocorrelation
Assumption 9: No autocorrelation between the disturbances: cov(u_i, u_j) = 0 for i ≠ j.
1. What is the nature of autocorrelation?
2. What are the theoretical and practical consequences of autocorrelation?
3. How does one remedy the problem of autocorrelation?
Autocorrelation: Nature of Autocorrelation
- Positive serial correlation
- Negative serial correlation
- Zero correlation
Autocorrelation: Types of Autocorrelation
1. Specification bias: excluded-variables case
2. Nonstationarity
3. Spurious regression problem
Autocorrelation: Consequences of Using OLS in the Presence of Autocorrelation
Autocorrelation destroys the BLUE property of the OLS estimators: they remain linear and unbiased but no longer have minimum variance.
The residual variance is likely to underestimate the true error variance.
The usual t and F tests of significance are no longer valid and, if applied, are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.
Autocorrelation: Detecting Autocorrelation
1. Graphical method: residual plots
2. Runs test
3. Durbin-Watson test
4. Breusch-Godfrey (BG) test, an LM test that accommodates nonstochastic regressors and higher-order autoregressive schemes: AR(1), AR(2), ...
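A minimal sketch of the Durbin-Watson and Breusch-Godfrey checks (items 3 and 4), assuming a fitted statsmodels OLS result named model as in the earlier sketches.

```python
# Durbin-Watson and Breusch-Godfrey tests on the residuals of a fitted OLS model.
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

dw = durbin_watson(model.resid)            # values near 2 suggest no AR(1)
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(model, nlags=2)
print(f"D.W. = {dw:.2f}, BG LM p-value = {lm_pvalue:.4f}")
# BG H0: no serial correlation up to lag 2; p <= 0.05 indicates autocorrelation
```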
Autocorrelation: Remedial Measures
1. Transform the original model using the generalized least squares (GLS) method or the feasible generalized least squares (FGLS) method.
2. First-difference method.
3. When ρ is not known, estimate it from the residuals of an AR(1) scheme.
4. Change the model to ARCH or GARCH models.
5. Change the model to ARMA or ARIMA.
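For remedies 1 and 3, one possible sketch is feasible GLS with an AR(1) error, where ρ is estimated iteratively from the residuals (a Cochrane-Orcutt-style procedure). It assumes y and X are the dependent variable and regressors used earlier; this is one illustration, not the only way to implement the transformation.

```python
# Feasible GLS with an AR(1) error structure; rho is re-estimated from the
# residuals at each iteration.
import statsmodels.api as sm

glsar_model = sm.GLSAR(y, sm.add_constant(X), rho=1)   # AR(1) errors
results = glsar_model.iterative_fit(maxiter=10)        # update rho and beta
print(results.params, glsar_model.rho)
```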
Heteroscedasticity: Nature of Heteroscedasticity
Assumption 10: Homoscedasticity, i.e. the disturbances u_i all have the same (constant) variance σ².
Heteroscedasticity: Nature of Heteroscedasticity
1. What is the nature of heteroscedasticity?
2. What are its consequences?
3. How does one detect it?
4. What are the remedial measures?
Heteroscedasticity: Nature of Heteroscedasticity
Why might the variance of u_i vary across observations?
1. Following error-learning models: as people learn, their errors of behavior become smaller over time.
2. Growth-oriented companies.
3. As data-collecting techniques improve, the error variance is likely to decrease.
4. The presence of outliers.
5. Skewness in the distribution of one or more regressors.
Heteroscedasticity: Consequences of Using OLS in the Presence of Heteroscedasticity
Heteroscedasticity destroys the BLUE property of the OLS estimators: they remain linear and unbiased but are no longer best (minimum variance). If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading.
Heteroscedasticity: Detecting Heteroscedasticity
1. Graphical method: plot the residuals against Ŷ and against the X variables
2. Park test
3. Glejser test
4. Spearman's rank correlation test
5. Goldfeld-Quandt test
6. Breusch-Pagan-Godfrey (BPG) test
7. White's general heteroscedasticity test
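A minimal sketch of White's general test (item 7), assuming a fitted statsmodels OLS result named model:

```python
# White's general heteroscedasticity test. H0: homoscedasticity.
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(model.resid, model.model.exog)
print(f"White LM p-value = {lm_pvalue:.4f}")
# p > 0.05: do not reject homoscedasticity; p <= 0.05: heteroscedasticity present
```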
Heteroscedasticity: Remedial Measures
1. Weighted least squares (WLS): weight the observations by Y, by 1/X, by other variables, or by the error term's standard deviation, depending on the assumed pattern of the heteroscedasticity.
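A sketch of WLS where, purely for illustration, the error variance is assumed proportional to x1², so the weights are 1/x1²; the DataFrame df and the column names are hypothetical, and the appropriate weight depends on the suspected form of the heteroscedasticity.

```python
# Weighted least squares with weights 1/x1^2 (assumed variance proportional to x1^2).
import statsmodels.api as sm

weights = 1.0 / (df["x1"] ** 2)
wls_results = sm.WLS(df["y"], sm.add_constant(df[["x1", "x2", "x3"]]),
                     weights=weights).fit()
print(wls_results.summary())
```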
Assumption 11: The regression model is correctly specified (no omitted variables).
Heteroscedasticity
Variable Definitions
WORKSHOP #2
WORK ORDERS: Multiple Regression
(1) Run the multiple regression; take care of seasonal effects and smooth the data (by taking logs).
(2) Test for multicollinearity and remedy it if present.
(3) Test for autocorrelation and remedy it if present.
(4) Test for heteroscedasticity and remedy it if present.
(5) Analyze your results.
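A sketch of these work orders end to end, under the same hypothetical file and column names used in the earlier sketches; each diagnostic would normally be followed by the remedial measures discussed in the sections above.

```python
# Workshop pipeline: log-transform, run OLS, then run the three diagnostic checks.
# File "workshop2.csv" and columns y, x1, x2, x3 are hypothetical (and assumed positive
# so that logs are defined).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey, het_white

df = pd.read_csv("workshop2.csv")
logdf = np.log(df[["y", "x1", "x2", "x3"]])          # (1) smooth by taking logs
X = sm.add_constant(logdf[["x1", "x2", "x3"]])
ols = sm.OLS(logdf["y"], X).fit()

vif = [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])]
dw = durbin_watson(ols.resid)
bg_p = acorr_breusch_godfrey(ols, nlags=2)[1]
white_p = het_white(ols.resid, ols.model.exog)[1]

print("VIF:", vif)                                   # (2) VIF > 10 -> multicollinearity
print("D.W.:", dw, "BG p-value:", bg_p)              # (3) autocorrelation checks
print("White p-value:", white_p)                     # (4) heteroscedasticity check
print(ols.summary())                                 # (5) analyze the results
```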