CHAPTER 4: Forecasting by Regression
Prof. Alan Wan
Table of contents

1. Revision of Linear Regression
2. Multicollinearity
3.1 First-order Autocorrelation and the Durbin-Watson Test
3.2 Correction for Autocorrelation
Revision of Linear Regression

One main purpose of regression is to forecast an outcome, also called the response variable or dependent variable, based on certain factors, also called explanatory variables or regressors. The outcome has to be quantitative, but the explanatory variables can be either quantitative or qualitative.

Linear regression postulates a linear association between the response and each of the explanatory variables; simple regression deals with situations with one explanatory variable, whereas multiple regression tackles cases with more than one regressor.
A multiple linear regression model may be expressed as:

Y_t = β_0 + β_1 X_1t + β_2 X_2t + β_3 X_3t + ... + β_k X_kt + ε_t,

where ε_t ~ N(0, σ²). Hence

E(Y_t) = β_0 + β_1 X_1t + β_2 X_2t + β_3 X_3t + ... + β_k X_kt.

The estimated sample multiple linear regression model is thus

Ŷ_t = b_0 + b_1 X_1t + b_2 X_2t + b_3 X_3t + ... + b_k X_kt,

where b_0, b_1, ..., b_k are the ordinary least squares (O.L.S.) estimators of β_0, β_1, ..., β_k respectively, obtained by the criterion

min Σ_{t=1}^n e_t² = min Σ_{t=1}^n (Y_t − Ŷ_t)².
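As an illustration of the least-squares criterion above, the following sketch (plain Python, with made-up data rather than any series from this chapter) fits a simple regression Ŷ_t = b_0 + b_1 X_t using the closed-form O.L.S. formulas:

```python
# Minimal O.L.S. sketch for a simple regression Y_t = b0 + b1*X_t + e_t.
# The data below are hypothetical, chosen only to show the mechanics.

def ols_simple(x, y):
    """Return (b0, b1) minimising the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b1 = sum (x_t - x_bar)(y_t - y_bar) / sum (x_t - x_bar)^2
    sxy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    sxx = sum((xt - x_bar) ** 2 for xt in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = ols_simple([1, 2, 3, 4], [3, 5, 7, 9])  # data lie exactly on Y = 1 + 2X
```

For a model with k regressors the same criterion leads to the normal equations, usually solved in matrix form; statistical packages such as SAS's PROC REG do this internally.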
A slope coefficient represents the marginal change of Y_t with respect to a one-unit change in the corresponding explanatory variable.

The linear regression model assumes
1. that there is a linear association between the response and each of the explanatory variables;
2. that E(ε_t) = 0 for all t, meaning that no relevant explanatory variable has been omitted;
3. that the disturbances are homoscedastic, i.e., var(ε_t) = σ² for all t;
4. that the disturbances are uncorrelated, i.e., cov(ε_t, ε_{t+s}) = 0 for all t and s ≠ 0;
5. the absence of perfect multicollinearity, i.e., no exact linear association exists among the explanatory variables;
6. normality of the ε_t's (this assumption is needed only when conducting inference).
The O.L.S. estimator b_j is a linear estimator of β_j for j = 0, ..., k, because each b_j can be written as a linear combination of the Y_t's, weighted by a mixture of the values of the X_t's.

When Assumptions 1-5 are fulfilled, b_j is the best linear unbiased estimator (B.L.U.E.) of β_j, meaning that the linear estimator b_j is unbiased (i.e., E(b_j) = β_j for j = 0, ..., k) and has the smallest variance (and hence the highest average precision) of all linear unbiased estimators of β_j. The theorem establishing this result is known as the Gauss-Markov Theorem.
Common model diagnostics include
1. t-tests of significance of individual coefficients;
2. the F test of model significance;
3. R² and adjusted R² for goodness of fit;
4. tests of autocorrelation (usually for time series data);
5. tests of homoscedasticity (usually for cross-section data);
6. tests of autoregressive conditional heteroscedasticity (usually for financial time series data);
7. detection of outliers;
8. tests of normality of errors;
9. tests of coefficient constancy (structural change);
and others.
The following example, with n = 34 annual observations, is taken from Griffiths, Hill and Judge (1993). We are concerned with the relationship between the area of sugarcane planted in a region of Bangladesh (A, in thousands of hectares) and the prices of sugarcane and jute. By using area planted instead of quantity produced as the dependent variable, we are eliminating yield uncertainty. It is thought that when farmers decide on an area for sugarcane production, their decision is largely determined by the price of sugarcane (PS, in taka/tonne) and that of its main substitute, jute (PJ, in taka/tonne). Assuming a log-linear functional form for constant elasticity, we specify the model as

lnA_t = β_0 + β_1 lnPS_t + β_2 lnPJ_t + ε_t.
PROC REG of SAS produces the following results:

[SAS PROC REG output: analysis-of-variance table and parameter estimates for MODEL1, dependent variable lnA, 34 observations read and used; the numeric values were not preserved in this transcription. lnPS is significant with Pr > |t| < .0001.]
The estimated regression equation is thus

lnÂ_t = b_0 + b_1 lnPS_t + b_2 lnPJ_t

(the slide reported the coefficient estimates with standard errors in parentheses; these numeric values were not preserved in the transcription).

A test of H_0: β_1 = 0 vs. H_1: β_1 ≠ 0 yields

t = (b_1 − 0) / s.e.(b_1) = 6.98,

with a p-value < 0.0001. Hence β_1 is significantly different from zero, and lnPS is therefore a significant explanatory variable. However, the same cannot be said about β_2, or lnPJ.
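The t-ratio computation is mechanical; the following sketch (plain Python, with a hypothetical estimate and standard error rather than the sugarcane regression's actual output) shows the test of H_0: β_j = 0:

```python
# t-test of an individual coefficient: t = (b_j - beta0) / s.e.(b_j).
# The estimate and standard error below are hypothetical.

def t_stat(b, se, beta0=0.0):
    """t-ratio for H0: beta_j = beta0."""
    return (b - beta0) / se

t = t_stat(b=1.40, se=0.20)  # hypothetical b_1 and s.e.(b_1)
# |t| would be compared with the t distribution with n - (k+1)
# degrees of freedom (31 for the sugarcane model, where n = 34, k = 2).
```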
A test of H_0: β_1 = β_2 = 0 vs. H_1: otherwise is conducted by the F test:

F = (RSS/k) / (ESS/(n − (k+1))),

which here has (2, 31) degrees of freedom; the computed F value (not preserved in this transcription) has a p-value < 0.0001, confirming the overall significance of the model.

Question: Why should we test the overall significance of the model in addition to testing the individual regressors' significance?

R² = 0.6206, meaning that the estimated regression can explain 62.06% of the variability of lnA in the sample; after adjusting for the model's d.o.f., the explanatory power of the model is 59.61%, as indicated by the adjusted R².
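These goodness-of-fit quantities can be sketched from the fitted values alone; the Python below uses tiny made-up data (not the sugarcane series) and the formulas R² = 1 − ESS/TSS, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), and F = (RSS/k)/(ESS/(n − k − 1)):

```python
# Goodness-of-fit and overall F statistic from observed y and fitted yhat.
# The data are illustrative only.

def fit_summary(y, yhat, k):
    """Return (R2, adj_R2, F) for a model with k slope coefficients."""
    n = len(y)
    y_bar = sum(y) / n
    tss = sum((yt - y_bar) ** 2 for yt in y)              # total SS
    ess = sum((yt - ft) ** 2 for yt, ft in zip(y, yhat))  # error SS
    rss = tss - ess  # regression SS (valid for an O.L.S. fit with intercept)
    r2 = 1 - ess / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f = (rss / k) / (ess / (n - k - 1))
    return r2, adj_r2, f

r2, adj_r2, f = fit_summary([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8], k=1)
```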
Removing the insignificant lnPJ and re-running the regression yields:

[SAS PROC REG output: analysis of variance and parameter estimates for the simple regression of lnA on lnPS, 34 observations; numeric values not preserved in this transcription. lnPS remains significant with Pr > |t| < .0001.]
Note that R² decreases by 4.38% from 0.6206, whereas adjusted R² decreases by only 2.58% from 0.5961. Recall that when explanatory variables are dropped (added), R² always falls (rises), but adjusted R² may rise or fall; when the model contains fewer (more) variables, adjusted R² will rise (drop) if the increase (decrease) in d.o.f. due to the omission (addition) of variables outweighs the fall (rise) in the explanatory power of the regression.

For the simple linear regression model, the t statistic for H_0: β_1 = 0 is 6.834, which is the square root of the model's F statistic (i.e., t² = F). This result does not hold for multiple regression.
Multicollinearity

There is another serious consequence of adding too many variables to a model besides depleting the model's d.o.f. If a model has several variables, it is likely that some of them will be strongly correlated. This problem, known as multicollinearity, can drastically alter the results from one model to another, making them harder to interpret.

The most extreme form of multicollinearity is perfect multicollinearity, which refers to the situation where an explanatory variable can be expressed as an exact linear combination of some of the others. Under perfect multicollinearity, O.L.S. fails to produce estimates of the coefficients. A classic example of perfect multicollinearity is the dummy variable trap.
(Imperfect) multicollinearity is also known as near collinearity: the explanatory variables are linearly correlated, but they do not obey an exact linear relationship.

Consider the following three models explaining the relationship between HOUSING (number of housing starts, in thousands, in the U.S.) and POP (U.S. population in millions), GDP (U.S. Gross Domestic Product in billions of dollars) and INTRATE (new home mortgage interest rate), from 1963 to 1985:

1) HOUSING_t = β_0 + β_1 POP_t + β_2 INTRATE_t + ε_t
2) HOUSING_t = β_0 + β_3 GDP_t + β_2 INTRATE_t + ε_t
3) HOUSING_t = β_0 + β_1 POP_t + β_2 INTRATE_t + β_3 GDP_t + ε_t
Results for the first model:

[SAS PROC REG output: dependent variable HOUSING, regressors POP and INTRATE, 23 observations; numeric values not preserved in this transcription.]
Results for the second model:

[SAS PROC REG output: dependent variable HOUSING, regressors GDP and INTRATE, 23 observations; numeric values not preserved in this transcription.]
Results from Models 1) and 2) both make sense: estimates of the coefficients are of the expected signs (β_1 > 0, β_2 < 0 and β_3 > 0) and the coefficients are all highly significant.

Consider the third model, which combines the regressors of the first and second models:

[SAS PROC REG output: dependent variable HOUSING, regressors POP, GDP and INTRATE, 23 observations; numeric values not preserved in this transcription.]
In the third model, POP and GDP become insignificant, although both are significant when entered separately in the first and second models. This is because the three explanatory variables are strongly correlated. The pairwise sample correlations are r_GDP,POP = 0.99 and r_GDP,INTRATE = 0.88, with r_POP,INTRATE also high (its value was not preserved in the transcription).
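A quick way to screen for this is to compute the pairwise sample correlations directly; the sketch below (plain Python, toy data rather than the housing series) implements r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²):

```python
import math

# Pairwise sample correlation coefficient; the data are toy numbers,
# not the actual POP/GDP/INTRATE series.
def corr(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Two smoothly trending series are almost perfectly correlated,
# much like POP and GDP over 1963-1985:
r = corr([100, 105, 111, 118, 126], [1.0, 1.4, 1.9, 2.5, 3.2])
```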
Consider another example that relates EXPENSES, the cumulative expenditure on the maintenance of an automobile, to MILES, the cumulative mileage in thousands of miles, and WEEKS, the automobile's age in weeks since first purchase, for 57 automobiles. The following three models are considered:

1) EXPENSES_t = β_0 + β_1 WEEKS_t + ε_t
2) EXPENSES_t = β_0 + β_2 MILES_t + ε_t
3) EXPENSES_t = β_0 + β_1 WEEKS_t + β_2 MILES_t + ε_t

A priori, we expect β_1 > 0 and β_2 > 0: a car that is driven more should have a greater maintenance expense; similarly, the older the car, the greater the cost of maintaining it.
Consider results for the three models:

[SAS PROC REG output for Model 1): dependent variable EXPENSES, regressor WEEKS, 57 observations; numeric values not preserved in this transcription. Both the intercept and WEEKS have Pr > |t| < .0001.]
[SAS PROC REG output for Model 2): dependent variable EXPENSES, regressor MILES, 57 observations; numeric values not preserved in this transcription. Both the intercept and MILES have Pr > |t| < .0001.]
[SAS PROC REG output for Model 3): dependent variable EXPENSES, regressors WEEKS and MILES, 57 observations; numeric values not preserved in this transcription. WEEKS and MILES have Pr > |t| < .0001.]
It is interesting to note that even though the coefficient estimate for MILES is positive in the second model, it is negative in the third model: there is a reversal in sign. The magnitude of the coefficient estimate for WEEKS also changes substantially, and the t-statistics for MILES and WEEKS are much lower in the third model, even though both variables remain significant. The problem is again high correlation, here between WEEKS and MILES.
To explain, consider the model

Y_t = β_0 + β_1 X_1t + β_2 X_2t + ε_t.

It can be shown that

var(b_1) = σ² / [Σ_{t=1}^n (X_1t − X̄_1)² (1 − r_12²)]

and

var(b_2) = σ² / [Σ_{t=1}^n (X_2t − X̄_2)² (1 − r_12²)],

where r_12 is the sample correlation between X_1t and X_2t.
The effect of increasing r_12 on var(b_2) follows directly from the formula: writing V = σ² / Σ_{t=1}^n (X_2t − X̄_2)² for the variance when r_12 = 0, we have var(b_2) = V / (1 − r_12²), which grows without bound as |r_12| → 1. [The slide's table of var(b_2) as a multiple of V for increasing values of r_12 was not preserved in the transcription.]

The sign reversal and the decrease in t values (in absolute terms) are caused by the inflated variances of the estimators.
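The lost table is easy to regenerate from var(b_2) = V/(1 − r_12²); the sketch below tabulates the multiplier for a few illustrative correlations (the specific r values are my choice, not necessarily the slide's):

```python
# Variance multiplier var(b2)/V = 1/(1 - r12^2) as collinearity grows.
# The r values below are illustrative.

def var_multiplier(r12):
    return 1.0 / (1.0 - r12 ** 2)

table = {r: var_multiplier(r) for r in (0.0, 0.5, 0.9, 0.99)}
# r12 = 0    -> 1.0   (no inflation)
# r12 = 0.9  -> about 5.3
# r12 = 0.99 -> about 50
```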
Common consequences of multicollinearity:
- Wider confidence intervals.
- Insignificant t statistics.
- High R², so that the F test can convincingly reject H_0: β_1 = β_2 = ... = β_k = 0, yet few significant t values.
- O.L.S. estimates and their standard errors are very sensitive to small changes in the model.

Multicollinearity is very much the norm in regression analysis involving non-experimental data, and it can never be eliminated. The question is not about the existence or non-existence of multicollinearity, but how serious the problem is.
Identifying multicollinearity

How to identify multicollinearity?
- High R² (and a significant F value) but low values of the t statistics.
- Coefficient estimates and standard errors that are sensitive to small changes in the model specification.
- High pairwise correlations between the explanatory variables; note, however, that the converse need not be true. Multicollinearity can still be a problem even though no pairwise correlation appears high: it is possible for three or more variables to be strongly correlated jointly while all pairwise correlations are low.
Another tool is the variance inflation factor (VIF). The VIF for the variable X_j is

VIF_j = 1 / (1 − R_j²),

where R_j² is the coefficient of determination of the regression of X_j on the remaining explanatory variables. The VIF is a measure of the strength of the relationship between each explanatory variable and all the other explanatory variables. As R_j² rises towards 1, VIF_j rises from 1 towards infinity. [The slide's table relating values of R_j² to VIF_j was not preserved in the transcription.]
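For two regressors, the auxiliary R_1² from regressing X_1 on X_2 equals the squared pairwise correlation, so the VIF can be sketched directly; the Python below (toy data, my own construction) computes VIF_1 this way:

```python
# VIF for X1 in a two-regressor model: the auxiliary R^2 of X1 on X2
# equals the squared sample correlation r12^2, so VIF_1 = 1/(1 - r12^2).
# The data are a toy example, not from the chapter.

def vif_two_regressors(x1, x2):
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    r2 = sxy ** 2 / (sxx * syy)  # auxiliary R^2
    return 1.0 / (1.0 - r2)

vif = vif_two_regressors([1, 2, 3, 4], [0, 1, 0, 1])
```

With more than two regressors the auxiliary regression is a multiple regression, which a package (e.g. SAS's VIF option on the MODEL statement) computes for each X_j in turn.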
Rule of thumb for using VIF:
- An individual VIF_j larger than 10 indicates that multicollinearity may be seriously influencing the least squares estimates of the regression coefficients.
- If the average of the VIF_j's of the model exceeds 5, then multicollinearity is considered to be serious.
For the HOUSING example:

[SAS PROC REG output: dependent variable HOUSING, regressors POP, GDP and INTRATE, 23 observations, with a variance-inflation column; numeric values not preserved in this transcription.]
Solutions to multicollinearity

Benign neglect: If an analyst is less interested in interpreting individual coefficients and more interested in forecasting, then multicollinearity may not be a serious concern. Even with high correlations among the independent variables, if the regression coefficients are significant and have meaningful signs and magnitudes, one need not be too concerned with multicollinearity.

Eliminating variables: Removing the variable most strongly correlated with the rest would generally improve the significance of the other variables. There is a danger, however, in removing too many variables from the model, because that would lead to bias in the estimates.
Respecify the model: For example, in the housing regression, we can express the variables in per capita terms rather than including population as an explanatory variable, leading to

HOUSING_t / POP_t = β_0 + β_1 GDP_t / POP_t + β_2 INTRATE_t + ε_t.

[SAS PROC REG output: dependent variable PHOUSING, regressors PGDP and INTRATE, 23 observations, with a variance-inflation column; numeric values not preserved in this transcription.]
Increase the sample size if additional information is available.

Use alternative estimation techniques such as ridge regression and principal components analysis (beyond the scope of this course).
3.1 First-order Autocorrelation and the Durbin-Watson Test

First-order autocorrelation

As described previously, the standard linear regression model assumes that ε_t and ε_{t+k} are uncorrelated for all k ≠ 0. When this assumption fails, the situation is known as autocorrelation or serial correlation.

The interpretation of such a situation is that the disturbance at time t influences not only the current value of the dependent variable but also values of the dependent variable at other times.

Many factors can cause autocorrelation, e.g., omitted explanatory variables, misspecification of the functional form, measurement errors, and patterns of business cycles, to name a few.
There are many possible specifications of correlation among disturbances. The simplest, and also the most common, type is first-order autocorrelation, by which the current disturbance depends linearly upon the immediately past disturbance plus another disturbance term that exhibits no autocorrelation over time, i.e.,

ε_t = ρ ε_{t−1} + ν_t,

where the ν_t's are uncorrelated and ρ is an autocorrelation coefficient. It is required that −1 < ρ < 1 to fulfill the assumption of stationarity (see Chapter 5).
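A first-order autocorrelated disturbance series is straightforward to generate recursively; the sketch below (plain Python, with an arbitrary innovation sequence chosen only to show the recursion) builds ε_t = ρε_{t−1} + ν_t:

```python
# Generate first-order autocorrelated disturbances eps_t = rho*eps_{t-1} + nu_t.
# The innovation sequence nu is arbitrary, not drawn from any real model.

def ar1_disturbances(nu, rho, eps0=0.0):
    eps, prev = [], eps0
    for v in nu:
        prev = rho * prev + v
        eps.append(prev)
    return eps

eps = ar1_disturbances([1.0, 0.0, 0.0, 0.0], rho=0.5)
# A single unit shock decays geometrically: 1.0, 0.5, 0.25, 0.125
```

The geometric decay of a single shock shows why |ρ| < 1 is needed: with |ρ| ≥ 1 the effect of a past disturbance never dies out.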
Durbin-Watson Test

The problem with O.L.S. under autocorrelation is that it leads to inefficient estimators of the coefficients and a biased estimator of the error variance. Alternative estimation strategies are therefore typically used when the disturbances are autocorrelated.

How do we test for first-order autocorrelation? The Durbin-Watson (DW) test is the most common test. The DW test statistic is given by

DW = Σ_{t=2}^n (e_t − e_{t−1})² / Σ_{t=1}^n e_t²,

where e_t = Y_t − Ŷ_t.
Note that

DW = Σ_{t=2}^n (e_t − e_{t−1})² / Σ_{t=1}^n e_t²
   = [Σ_{t=2}^n e_t² + Σ_{t=2}^n e_{t−1}² − 2 Σ_{t=2}^n e_t e_{t−1}] / Σ_{t=1}^n e_t²
   = [Σ_{t=1}^n e_t² − e_1² + Σ_{t=1}^n e_t² − e_n² − 2 Σ_{t=2}^n e_t e_{t−1}] / Σ_{t=1}^n e_t²
   = [2 Σ_{t=1}^n e_t² − 2 Σ_{t=2}^n e_t e_{t−1} − (e_1² + e_n²)] / Σ_{t=1}^n e_t²
   = 2(1 − r) − (e_1² + e_n²) / Σ_{t=1}^n e_t²,

where r = Σ_{t=2}^n e_t e_{t−1} / Σ_{t=1}^n e_t² is the sample autocorrelation coefficient.
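The identity above can be checked numerically; the Python sketch below computes DW both from its definition and from 2(1 − r) − (e_1² + e_n²)/Σe_t², using an arbitrary residual vector:

```python
# Check DW = 2*(1 - r) - (e_1^2 + e_n^2)/sum(e_t^2) on arbitrary residuals.

def dw_stat(e):
    """DW statistic computed directly from its definition."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(x * x for x in e)

def dw_identity(e):
    """DW statistic computed via the algebraic identity derived above."""
    s = sum(x * x for x in e)
    r = sum(e[t] * e[t - 1] for t in range(1, len(e))) / s
    return 2 * (1 - r) - (e[0] ** 2 + e[-1] ** 2) / s

e = [1.0, -1.0, 2.0, -2.0, 0.5]  # arbitrary residuals
# dw_stat(e) and dw_identity(e) agree to floating-point precision
```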
When the sample size is sufficiently large, DW ≈ 2(1 − r). If this were based on the true ε_t's, then DW would tend in the limit to 2(1 − ρ) as n increases. This means:
- if ρ ≈ 0, then DW ≈ 2;
- if ρ ≈ 1, then DW ≈ 0;
- if ρ ≈ −1, then DW ≈ 4.

Therefore, a test of H_0: ρ = 0 can be based on whether DW is close to 2 or not. Unfortunately, the critical values of DW depend on the values of the explanatory variables, and these vary from one data set to another.
78 3.1 First-order Autocorrelation and the Durbin-Watson Test 3.2 Correction for Autocorrelation Durbin-Watson Test To get around this problem, Durbin and Watson established the lower (d L ) and upper (d U ) bounds for the DW critical value. If DW > 4 d L or DW < d L, then we reject H 0. If the observed d U < DW < 4 d U, then we do not reject H 0. If DW lies in neither of these two regions then the test is inconclusive. See the DW table uploaded on the website. Note that d L and d U are tabulated in terms of n and k = k 1 = number of coefficients excluding the intercept. An intercept term must be present in order for d L s and d U s to be valid. 40 / 57
79 To be more specific, for testing H_0: ρ = 0 vs. H_1: ρ > 0, the decision rule is to reject H_0 if DW < d_L and not to reject H_0 if DW > d_U; the test is inconclusive if d_L < DW < d_U. For testing H_0: ρ = 0 vs. H_1: ρ < 0, the decision rule is to reject H_0 if DW > 4 − d_L and not to reject H_0 if DW < 4 − d_U; the test is inconclusive if 4 − d_U < DW < 4 − d_L. 41 / 57
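The two one-sided decision rules can be collected into a single helper. A Python sketch — note the bounds passed in below are illustrative placeholders, not values taken from the published DW table:

```python
def dw_decision(dw, d_l, d_u, alternative="positive"):
    """Bounds test of H0: rho = 0 against the stated alternative.

    alternative='positive' tests H1: rho > 0 (reject if DW < d_L);
    alternative='negative' tests H1: rho < 0 (reject if DW > 4 - d_L).
    Returns 'reject', 'do not reject', or 'inconclusive'.
    """
    if alternative == "positive":
        if dw < d_l:
            return "reject"
        if dw > d_u:
            return "do not reject"
        return "inconclusive"
    # Mirror the bounds around 2 for the negative-autocorrelation test.
    if dw > 4 - d_l:
        return "reject"
    if dw < 4 - d_u:
        return "do not reject"
    return "inconclusive"

# Illustrative bounds only -- look up d_L and d_U for your own n and k.
print(dw_decision(1.10, 1.33, 1.58))              # reject
print(dw_decision(1.45, 1.33, 1.58))              # inconclusive
print(dw_decision(2.90, 1.33, 1.58, "negative"))  # reject
```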
81 SAS calculates the DW statistic via the DW option in PROC REG. For example, for our previous sugarcane plant area example, one can calculate the DW statistic by

proc reg data=bangladesh;
model lna=lnps lnpj/dw;
run;

yielding the results

The REG Procedure
Model: MODEL1
Dependent Variable: lna

Durbin-Watson D
Number of Observations     34
1st Order Autocorrelation

42 / 57
83 As r = 0.412, we test H_0: ρ = 0 vs. H_1: ρ > 0. For n = 34 and k = 2, at the 5% significance level the table gives d_L = 1.33. Since DW < d_L, we reject H_0 and conclude that there is significant first-order autocorrelation in the disturbances. 43 / 57
86 3.2 Correction for Autocorrelation

Many alternative least squares procedures have been introduced for autocorrelation correction, e.g., the Cochrane-Orcutt procedure and the Prais-Winsten procedure.

SAS offers the AUTOREG procedure, which augments the original regression model with the autocorrelated disturbance function. For example, in the case of the sugarcane plant area regression example, AUTOREG considers the following model:

lnA_t = β_0 + β_1 lnPS_t + β_2 lnPJ_t + ɛ_t;  ɛ_t = ζɛ_{t−1} + ν_t,

where ζ = ρ. The procedure simultaneously estimates β_0, β_1, β_2 and ρ. 44 / 57
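For intuition, a single Cochrane-Orcutt iteration can be sketched in Python for the one-regressor case. This is an illustration only — PROC AUTOREG's default is Yule-Walker estimation, and the synthetic data below are made up:

```python
import random

def ols_line(x, y):
    """Simple OLS of y on x with an intercept, in closed form."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1

def cochrane_orcutt_step(x, y):
    """One Cochrane-Orcutt iteration: (1) OLS, (2) estimate rho from the
    residuals, (3) OLS of (y_t - rho*y_{t-1}) on (x_t - rho*x_{t-1})."""
    b0, b1 = ols_line(x, y)
    e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e)
    ystar = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    xstar = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    a0, a1 = ols_line(xstar, ystar)
    # The transformed intercept estimates beta0*(1 - rho); undo the scaling.
    return rho, a0 / (1 - rho), a1

# Synthetic data: y = 1 + 2x + e, with AR(1) errors e_t = 0.7 e_{t-1} + v_t
random.seed(42)
x = [0.1 * t for t in range(120)]
e, y = 0.0, []
for xi in x:
    e = 0.7 * e + random.gauss(0.0, 0.3)
    y.append(1.0 + 2.0 * xi + e)
rho_hat, b0_hat, b1_hat = cochrane_orcutt_step(x, y)
```

Here rho_hat should land near the true 0.7 and b1_hat near 2; the numbers will not match SAS output, since AUTOREG estimates the joint model rather than iterating in this two-step fashion.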
88 The SAS commands and outputs are as follows:

proc autoreg data=bangladesh;
model lna=lnps lnpj/nlag=1;
run;

The AUTOREG Procedure

Estimates of Autoregressive Parameters
                  Standard
Lag  Coefficient     Error   t Value

Yule-Walker Estimates
SSE              DFE               30
MSE              Root MSE
SBC              AIC
MAE              AICC
MAPE             HQC
Durbin-Watson    Regress R-Square
                 Total R-Square

Parameter Estimates
                         Standard             Approx
Variable   DF  Estimate     Error   t Value   Pr > |t|
Intercept
lnps                                           <.0001
lnpj

45 / 57
89 The DW value has increased, resulting in non-rejection of H_0: ρ = 0. The coefficient β_2 changes from being insignificant (under O.L.S.) to significant. The estimated equation is

lnÂ_t = b_0 + b_1 lnPS_t + b_2 lnPJ_t + ρ̂ e_{t−1}
      (2.3813)  (0.1902)    (0.3465)   (       )

The forecast of lnA_t thus depends on e_{t−1}, the error in the last period. For out-of-sample forecasts of more than one period ahead, e_{t−1} is unknown and is set to zero, since E(e_t) = 0. 46 / 57
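The forecast recursion implied by a fitted equation of this form can be sketched as follows, with purely hypothetical coefficient values. The conditional expectation ρ^h · e_n used here decays geometrically toward the zero value that the slides substitute beyond one step ahead:

```python
def ar1_forecasts(b0, b1, rho, x_future, e_last):
    """h-step forecasts from y_t = b0 + b1*x_t + e_t with
    e_t = rho*e_{t-1} + v_t.  The best guess of the future error is
    E(e_{t+h} | e_t) = rho**h * e_t, which shrinks toward zero as h grows."""
    preds, e = [], e_last
    for x in x_future:
        e = rho * e                  # advance the error forecast one period
        preds.append(b0 + b1 * x + e)
    return preds

# Hypothetical values, purely for illustration:
print(ar1_forecasts(1.0, 2.0, 0.5, [1.0, 1.0, 1.0], 0.4))
# -> [3.2, 3.1, 3.05]
```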
93 Seemingly Unrelated Regression Equations

Sometimes different regression equations may be connected not because they interact, but because their error terms are related. For example:
- In demand studies, a system of demand equations is specified to explain consumption of different commodities; potential correlations of the disturbances across the equations arise because a shock affecting the demand for one good may spill over and affect the demand for other goods.
- Firms in the same branch of industry are likely subject to similar disturbances.
47 / 57
95 The seemingly unrelated regression equations (S.U.R.E.) model pools the observations of the different regressions together and allows for contemporaneous correlations of the disturbances across the different equations. S.U.R.E. usually (but not always) leads to improved precision over O.L.S. applied to each equation separately. The equations are seemingly unrelated because they are related only through the disturbance terms. 48 / 57
98 A standard two-equation S.U.R.E. model may be expressed as:

Y_t = β_0 + β_1 X_1t + β_2 X_2t + β_3 X_3t + ... + β_k X_kt + ɛ_t
W_t = γ_0 + γ_1 Z_1t + γ_2 Z_2t + γ_3 Z_3t + ... + γ_k Z_kt + u_t,

where

E(ɛ_t) = E(u_t) = 0, var(ɛ_t) = σ_1^2, var(u_t) = σ_2^2,
cov(ɛ_t, ɛ_{t−j}) = cov(u_t, u_{t−j}) = 0 for j ≠ 0,
cov(ɛ_t, u_t) ≠ 0 and cov(ɛ_t, u_{t−j}) = 0 for j ≠ 0. 49 / 57
99 Thus, the standard S.U.R.E. model rules out serial correlation and heteroscedasticity within each equation, as well as serial correlation across equations, but permits contemporaneous correlations across the equations. The standard S.U.R.E. model has been extended to allow for these non-standard features, as well as different numbers of explanatory variables across the equations, but such extensions are beyond the scope of our discussion here. 50 / 57
101 To illustrate the S.U.R.E. technique, consider two firms, General Electric and Westinghouse, indexed by 1 and 2 respectively, and the following economic model describing the gross investment of the two firms:

I_1t = β_0 + β_1 V_1t + β_2 K_1t + ɛ_t
I_2t = γ_0 + γ_1 V_2t + γ_2 K_2t + u_t,   t = 1, ..., 20,

where I, V and K are, respectively, annual gross investment, the stock market value of the firm, and the capital stock of the firm at the beginning of the year. The data are taken from Griffiths, Hill and Judge (1993). 51 / 57
102 As General Electric and Westinghouse are in similar lines of business, the unexplained disturbances that affect the two firms' investment decisions may be contemporaneously correlated (i.e., the unexplained factor that affects General Electric's investment at time t may be correlated with a similar factor that affects Westinghouse's at the same time). O.L.S. estimation of the individual regressions cannot capture this correlation. We therefore pool the 40 observations and treat the model as a two-equation system. 52 / 57
103 The SAS commands for S.U.R.E. estimation of the above model are as follows. PROC SYSLIN first produces the O.L.S. results from estimating the equations separately, followed by the S.U.R.E. results from joint estimation:

data invest;
input i1 v1 k1 i2 v2 k2;
cards;
...;
proc syslin sur;
model i1=v1 k1;
model i2=v2 k2;
run;

53 / 57
104 The SAS System

SYSLIN Procedure
Ordinary Least Squares Estimation

Model: I1
Dependent variable: I1

Analysis of Variance
                     Sum of      Mean
Source      DF      Squares    Square   F Value   Prob>F
Model
Error
C Total

Root MSE         R-Square
Dep Mean         Adj R-Sq
C.V.

Parameter Estimates
                 Parameter   Standard   T for H0:
Variable    DF    Estimate      Error   Parameter=0   Prob > |T|
INTERCEP
V1
K1

SYSLIN Procedure
Ordinary Least Squares Estimation

Model: I2
Dependent variable: I2

Analysis of Variance
                     Sum of      Mean
Source      DF      Squares    Square   F Value   Prob>F
Model
Error
C Total

Root MSE         R-Square
Dep Mean         Adj R-Sq
C.V.

Parameter Estimates
                 Parameter   Standard   T for H0:
Variable    DF    Estimate      Error   Parameter=0   Prob > |T|
INTERCEP
V2
K2
105 SYSLIN Procedure
Seemingly Unrelated Regression Estimation

Cross Model Correlation
Corr        I1        I2
I1
I2

Model: I1
Dependent variable: I1

Parameter Estimates
                 Parameter   Standard   T for H0:
Variable    DF    Estimate      Error   Parameter=0   Prob > |T|
INTERCEP
V1
K1

Model: I2
Dependent variable: I2

Parameter Estimates
                 Parameter   Standard   T for H0:
Variable    DF    Estimate      Error   Parameter=0   Prob > |T|
INTERCEP
V2
K2
106 Hence O.L.S. estimation produces

Î_1t = b_0 + b_1 V_1t + b_2 K_1t
      (   )   (   )     (   )
Î_2t = g_0 + g_1 V_2t + g_2 K_2t
      (   )   (   )     (   )

whereas S.U.R.E. estimation yields

Î_1t = b_0* + b_1* V_1t + b_2* K_1t
      (   )    (   )      (   )
Î_2t = g_0* + g_1* V_2t + g_2* K_2t
      (   )    (   )      (   )

(standard errors in parentheses). 56 / 57
107 S.U.R.E. estimation results in smaller standard errors of the estimates and hence more precise estimates of the coefficients. S.U.R.E. estimation will result in no efficiency gain over O.L.S. if
1. cov(ɛ_t, u_t) = 0, or
2. the equations contain identical explanatory variables, e.g., V_1t = V_2t and K_1t = K_2t for all t.

In our example, the O.L.S. residuals from the two equations have a contemporaneous correlation reported in the cross-model correlation matrix of the SYSLIN output. It can be tested whether the disturbances are indeed correlated. 57 / 57
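That test can be sketched from first principles: run O.L.S. equation by equation, correlate the two residual series, and compare the Breusch-Pagan LM statistic λ = n·r² against a χ²(1) critical value (3.84 at the 5% level). The data below are made up for illustration and are not the Griffiths-Hill-Judge investment data:

```python
import random

def ols(X, y):
    """OLS via the normal equations (X'X) b = X'y, Gaussian elimination."""
    k = len(X[0])
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivot
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        b[r] = (M[r][k] - sum(M[r][j] * b[j] for j in range(r + 1, k))) / M[r][r]
    return b

def residuals(X, y):
    b = ols(X, y)
    return [yi - sum(bi * xi for bi, xi in zip(b, row)) for row, yi in zip(X, y)]

def corr(a, b):
    n = len(a)
    ca = [x - sum(a) / n for x in a]
    cb = [x - sum(b) / n for x in b]
    return (sum(x * y for x, y in zip(ca, cb))
            / (sum(x * x for x in ca) * sum(y * y for y in cb)) ** 0.5)

# Made-up two-equation system sharing a common shock in the disturbances.
random.seed(1)
n = 20
X1 = [[1.0, random.random(), random.random()] for _ in range(n)]
X2 = [[1.0, random.random(), random.random()] for _ in range(n)]
common = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [2 + 3 * r[1] + r[2] + c + random.gauss(0, 0.2) for r, c in zip(X1, common)]
y2 = [1 - 2 * r[1] + 4 * r[2] + c + random.gauss(0, 0.2) for r, c in zip(X2, common)]

r12 = corr(residuals(X1, y1), residuals(X2, y2))
lm = n * r12 ** 2   # Breusch-Pagan LM statistic; compare with 3.84
```

With a strong common shock built into both disturbances, λ should comfortably exceed 3.84, pointing toward a genuine efficiency gain from S.U.R.E. over equation-by-equation O.L.S.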
PhD/MA Econometrics Examination January, 2015 Total Time: 8 hours MA students are required to answer from A and B. PhD students are required to answer from A, B, and C. PART A (Answer any TWO from Part
More information7. Integrated Processes
7. Integrated Processes Up to now: Analysis of stationary processes (stationary ARMA(p, q) processes) Problem: Many economic time series exhibit non-stationary patterns over time 226 Example: We consider
More informationMultiple Regression Analysis
Chapter 4 Multiple Regression Analysis The simple linear regression covered in Chapter 2 can be generalized to include more than one variable. Multiple regression analysis is an extension of the simple
More informationINTRODUCTORY REGRESSION ANALYSIS
;»»>? INTRODUCTORY REGRESSION ANALYSIS With Computer Application for Business and Economics Allen Webster Routledge Taylor & Francis Croup NEW YORK AND LONDON TABLE OF CONTENT IN DETAIL INTRODUCTORY REGRESSION
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationEcon 510 B. Brown Spring 2014 Final Exam Answers
Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity
More informationStat 500 Midterm 2 12 November 2009 page 0 of 11
Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed
More informationLECTURE 10: MORE ON RANDOM PROCESSES
LECTURE 10: MORE ON RANDOM PROCESSES AND SERIAL CORRELATION 2 Classification of random processes (cont d) stationary vs. non-stationary processes stationary = distribution does not change over time more
More informationDiagnostics of Linear Regression
Diagnostics of Linear Regression Junhui Qian October 7, 14 The Objectives After estimating a model, we should always perform diagnostics on the model. In particular, we should check whether the assumptions
More information7. Integrated Processes
7. Integrated Processes Up to now: Analysis of stationary processes (stationary ARMA(p, q) processes) Problem: Many economic time series exhibit non-stationary patterns over time 226 Example: We consider
More informationEconometrics Homework 4 Solutions
Econometrics Homework 4 Solutions Question 1 (a) General sources of problem: measurement error in regressors, omitted variables that are correlated to the regressors, and simultaneous equation (reverse
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationRegression Analysis. BUS 735: Business Decision Making and Research
Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn
More informationPlease discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!
Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from
More informationRef.: Spring SOS3003 Applied data analysis for social science Lecture note
SOS3003 Applied data analysis for social science Lecture note 05-2010 Erling Berge Department of sociology and political science NTNU Spring 2010 Erling Berge 2010 1 Literature Regression criticism I Hamilton
More informationDr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines)
Dr. Maddah ENMG 617 EM Statistics 11/28/12 Multiple Regression (3) (Chapter 15, Hines) Problems in multiple regression: Multicollinearity This arises when the independent variables x 1, x 2,, x k, are
More informationUsing EViews Vox Principles of Econometrics, Third Edition
Using EViews Vox Principles of Econometrics, Third Edition WILLIAM E. GRIFFITHS University of Melbourne R. CARTER HILL Louisiana State University GUAY С LIM University of Melbourne JOHN WILEY & SONS, INC
More informationMultiple Regression Methods
Chapter 1: Multiple Regression Methods Hildebrand, Ott and Gray Basic Statistical Ideas for Managers Second Edition 1 Learning Objectives for Ch. 1 The Multiple Linear Regression Model How to interpret
More informationARDL Cointegration Tests for Beginner
ARDL Cointegration Tests for Beginner Tuck Cheong TANG Department of Economics, Faculty of Economics & Administration University of Malaya Email: tangtuckcheong@um.edu.my DURATION: 3 HOURS On completing
More informationLeast Squares Estimation-Finite-Sample Properties
Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions
More information1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11
Econ 495 - Econometric Review 1 Contents 1 Linear Regression Analysis 4 1.1 The Mincer Wage Equation................. 4 1.2 Data............................. 6 1.3 Econometric Model.....................
More informationChapter 16. Simple Linear Regression and Correlation
Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationPanel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63
1 / 63 Panel Data Models Chapter 5 Financial Econometrics Michael Hauser WS17/18 2 / 63 Content Data structures: Times series, cross sectional, panel data, pooled data Static linear panel data models:
More informationTypes of economic data
Types of economic data Time series data Cross-sectional data Panel data 1 1-2 1-3 1-4 1-5 The distinction between qualitative and quantitative data The previous data sets can be used to illustrate an important
More informationECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 48
ECON2228 Notes 10 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 10 2014 2015 1 / 48 Serial correlation and heteroskedasticity in time series regressions Chapter 12:
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 54
ECON2228 Notes 10 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 10 2014 2015 1 / 54 erial correlation and heteroskedasticity in time series regressions Chapter 12:
More information