
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS
Posc/Uapp 816, Class 21

MISCELLANEOUS REGRESSION TOPICS

I.   AGENDA:
     A. Example of correcting for autocorrelation.
     B. Regression with ordinary independent variables.
     C. Box-Jenkins models.
     D. Logistic regression.
     E. Polynomial regression.
     F. Reading:
        1. Agresti and Finlay, Statistical Methods for the Social Sciences, 3rd edition, pages 537 to 538; 543 to 550; Chapter 15.

II.  AN EXAMPLE OF SERIAL CORRELATION:
     A. Let's continue with the example of petroleum products import data.
     B. Summary of steps to detect and correct for autocorrelation.
     C. The steps are:
        1. Use OLS to obtain the residuals. Store them in a column.
           i.  Use a name such as Y-residuals.
        2. Lag the residuals one time period to obtain ε̂_{t-1}. MINITAB has a procedure:
           i.   Go to Statistics, Time series, then Lag.
           ii.  Pick the column containing the residuals from the model.
           iii. Pick another column to store the lagged residuals and press OK.
           iv.  You should name this column as well to keep the bookkeeping simple.
                1) Also, look at the data window to see what's going on.
        3. Use descriptive statistics to obtain the simple correlation between the residual column and the lagged residuals.
           i.  This is the estimate of ρ, the autocorrelation parameter.
        4. Lag the dependent and independent variables so they can be transformed.
           i.   Make sure you know where they are stored.
           ii.  Again, name them.
        5. Now transform the dependent variable:

               Y*_t = Y_t - ρ̂ Y_{t-1}

           i.  That is, create a new variable (with the calculator or mathematical expressions or the Let command) that is simply Y at time t minus ρ̂ times Y at time t - 1.
               1) Example:
                  a) Suppose Y at time t is stored in column 10, its lagged version in column 15, and the estimated autocorrelation is .568. Then the Let command will store the new Y in column 25:

                         mtb> let c25 = c10 - (.568)*c15

               2) Label it something such as New-Y.
        6. The first independent variable is transformed by

               X*_t = X_t - ρ̂ X_{t-1}

           i.  When using MINITAB just follow the procedure described above for Y.
               1) Example:
                  a) If X_1, the counter say, is stored in column 7 and its lag in column 17 (and once again the estimated ρ is .568), then the MINITAB command will be

                         mtb> let c26 = c7 - (.568)*c17

           ii. Since we are dealing with lagged variables, we will lose one observation (see the figure above).
               1) For now we can forget about it, or use the following to replace the missing first values:

                      Y*_1   = sqrt(1 - ρ̂²) Y_1
                      X*_1,1 = sqrt(1 - ρ̂²) X_1,1
                      X*_2,1 = sqrt(1 - ρ̂²) X_2,1
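The transformation in steps 1 through 6 can be sketched in a few lines of pure Python. This is only an illustration with made-up numbers (the residuals and Y values below are invented, not the petroleum data); in class the same work is done with MINITAB columns.

```python
# Illustrative sketch of the correction steps above, with made-up data.

def estimate_rho(residuals):
    """Steps 2-3: simple correlation between the residuals and their lag."""
    e, e_lag = residuals[1:], residuals[:-1]
    n = len(e)
    m1, m2 = sum(e) / n, sum(e_lag) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(e, e_lag))
    sd1 = sum((a - m1) ** 2 for a in e) ** 0.5
    sd2 = sum((b - m2) ** 2 for b in e_lag) ** 0.5
    return cov / (sd1 * sd2)

def quasi_difference(series, rho):
    """Steps 5-6: Y*_t = Y_t - rho * Y_(t-1); the first observation is lost."""
    return [x - rho * x_lag for x, x_lag in zip(series[1:], series[:-1])]

residuals = [0.5, 0.4, 0.6, 0.3, -0.2, -0.4, -0.1, 0.2, 0.5, 0.6]
rho_hat = estimate_rho(residuals)      # positive: the errors trend together
y = [1.0, 1.2, 1.5, 1.3, 1.8, 2.0, 2.4, 2.2, 2.6, 3.0]
y_star = quasi_difference(y, rho_hat)  # one case shorter than y
```

Regressing y_star on the similarly transformed X* columns then completes step 7.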

           iii. Y*_1 and X*_1,1 mean the first case for Y and X_1, and so forth.
                1) If there are more X's, proceed in the same way.
        7. Finally, regress the transformed variables (i.e., Y* on the X*'s) to obtain new estimates of the coefficients, residuals, and so forth.
           i.  Check the model's adequacy with the Durbin-Watson statistic, plots of errors, and the usual diagnostics.
           ii. If the model is not satisfactory, treat the Y* and X* as raw data and go through the process again.
     D. Here are the results for the petroleum data.
        1. First the original equation:

               Petroimp = - 0.167 + 0.0961 Counter + 2.47 Dummy - 0.106 Inter

               Predictor     Coef       StDev      T        P        VIF
               Constant     -0.1670     0.1057    -1.58     0.121
               Counter       0.096077   0.007109  13.51     0.000     7.1
               Dummy         2.4732     0.3208     7.71     0.000    18.8
               Inter        -0.10569    0.01075   -9.84     0.000    30.6

           i.  The Durbin-Watson statistic is .53.
           ii. The estimate of the autocorrelation coefficient ρ is .732.
     E. Evaluating the Durbin-Watson statistic.
        1. Attached to the notes is a table of the Durbin-Watson statistic for the .05 and .01 levels of significance.
        2. As noted last time, find N, the number of time periods (I've sometimes called it T), and the number of parameters K.
           i.  In the table this is denoted with p.
        3. Find the lower and upper bounds.
           i.  In this instance they are (using N = 50, since there isn't a row for 48, and p = 4): D_L = 1.42 and D_U = 1.67.
        4. Decision: since the observed DW falls below the lower bound, reject H_0 that there is no autocorrelation.
           i.  Inasmuch as the autocorrelation is positive, we proceed to estimate its value and transform the data.
     F. Transformation
        1. Use the formulas described last time to create new variables.
        2. After carrying out the transformation, the model and test statistics are:

               NewY = - 0.153 + 0.120 Newcount + 2.91 Newdumy - 0.130 Newinter

               47 cases used; 1 case contains missing values

               Predictor     Coef       StDev     T        P        VIF
               Constant     -0.15266    0.08706   -1.75    0.087
               Newcount      0.11969    0.01782    6.72    0.000     6.9
               Newdumy       2.9098     0.6808     4.27    0.000    25.2
               Newinter     -0.13019    0.02631   -4.95    0.000    43.7

               S = 0.1703    R-Sq = 54.1%    R-Sq(adj) = 50.9%

               Analysis of Variance

               Source           DF      SS        MS        F       P
               Regression        3    1.46723   0.48908   16.87   0.000
               Residual Error   43    1.24659   0.02899
               Total            46    2.71382

               Durbin-Watson statistic = 1.85

           i.  The estimated DW is 1.85, which falls above the upper bound. Hence, we can assume no serial correlation.
               1) Note that we have a smaller N, since I didn't bother estimating the first values of Y and X.
           ii. The estimated ρ is .035.
        3. Both it and the Durbin-Watson statistic suggest that autocorrelation is not a problem.
           i.  But the plot of residuals suggests that it might be.
     G. Multicollinearity
        1. As noted previously in the semester, one almost always encounters multicollinearity when using dummy variables.
        2. This situation can cause parameter estimates to be very unstable, especially since in time series analysis N = T is often not very large.
        3. What to do?
           i.  Try dropping a constant or level dummy variable if it (the level) is not of great theoretical importance or if it obviously changes with time, as it almost always will.

III. LAGGED REGRESSION:
     A. So far we have dealt with a special kind of time series data in which the independent variables were all dummy indicators or counters.
     B. But it is possible to use ordinary independent variables.
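An aside before continuing: the Durbin-Watson statistic used to make these decisions is easy to compute directly from the residuals. A minimal sketch (the critical bounds D_L and D_U would still come from the published table; the residual lists below are invented for illustration):

```python
# Durbin-Watson: DW = sum over t of (e_t - e_(t-1))^2, divided by sum of e_t^2.
# Values near 2 suggest no first-order autocorrelation; values near 0 suggest
# positive autocorrelation, and values near 4 negative autocorrelation.

def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Long runs of same-signed residuals push DW toward 0:
print(durbin_watson([1, 1, 1, 1, -1, -1, -1, -1]))   # 0.5
# Alternating signs push DW toward 4:
print(durbin_watson([1, -1, 1, -1, 1, -1, 1, -1]))   # 3.5
```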

        1. One will, however, have to look for and possibly adjust for serial correlation.
        2. An example follows.
     C. Example:
        1. Here is a very simple (but real) example. The data, which come from a study of crime in Detroit, present several difficult problems, not the least of which is the small sample size.*
           [* J. C. Fisher collected the data for his paper "Homicide in Detroit: The Role of Firearms," Criminology, 14 (1976): 387-400. I obtained them via the Internet from "statlib" at Carnegie Mellon University.]
           The variables are:
           i.   Y: number of homicides per 100,000 population.
           ii.  X_1: number of handgun licenses per 100,000 population.
           iii. X_2: average weekly earnings.
     D. The time period was 1961 to 1973. Although they are perhaps dated, these data still illustrate the potential and problems of time series regression.
     E. First suppose we simply estimate the model:

            E(Y_t) = β_0 + β_1 X_t,1 + β_2 X_t,2

        1. Here the X's are the number of handgun licenses and average weekly earnings at time t (for t = 1, 2, ..., 13) and Y is the homicide rate.
        2. Simple OLS estimates are:

            Ŷ_t = -34.05 + .0232 X_1 + .275 X_2
                  (-7.83)  (6.39)     (10.18)

           i.  Note that the observed t's are in parentheses. It is common practice to write either the t value or the standard deviation of the coefficient under or next to the estimate.
        3. Superficially, the data fit the model very well: F = 115.21; s = 3.661; R²_obs = .958.
        4. But we know that the errors will probably be a problem. The residuals plotted against the time counter seem to follow a distinct pattern: negative values tend to be followed by negative numbers, and similarly for positive ones.

            [Figure 1: Residuals Suggest Serial Correlation]

           But we have to be careful: because the sample is so small, we may find "runs" like these just by chance.
           i.  An aside: it's easy to see the pattern of pluses and minuses by applying the sign command to the data in a column; it produces a "1" for each positive number, a "-1" for each negative value, and a "0" for values equal to zero.
               1) For example, in MINITAB go to the session window and type: sign [column number] [column number].
                  a) Example: if the numbers -.2, -.3, .05, .5, .89, 0, and -.66 are stored in c1, the instruction sign c1 c2 will put -1, -1, 1, 1, 1, 0, -1 in column c2. (You can see the pattern by printing c2 on your screen.)
        5. In addition, the Durbin-Watson statistic is .79. At the .05 level, for N = 13, the lower bound is 1.08, so we should reject the null hypothesis of no serial correlation (i.e., H_0: ρ = 0), concluding instead that ρ > 0.
        6. Note: the Durbin-Watson test "works" only under a relatively narrow range of circumstances. Hence, it should be supplemented with other procedures (such as residual plots). Moreover, many analysts argue that if DW does not exceed the upper limit, the null hypothesis should be rejected, even if the test result falls in the indeterminate range. That is good advice, because correcting for serial correlation when there is none present will probably create fewer problems than assuming it is not a factor.
     F. Solutions:
        1. As we did in the previous class, we can transform the data by estimating ρ and transforming the data. That is, lag the residuals from the OLS regression, obtain the regression coefficient between the two sets of errors--losing one observation in the process--and create new X's and Y as in:

            X*_t = X_t - ρ̂ X_{t-1}

        2. Then, regress these variables to obtain new estimates of the coefficients, examine the pattern of errors, and so forth.

IV.  ALTERNATIVE TIME SERIES PROCESSES:
     A. The previous example involved a couple of independent variables measured at time t.
     B. It is also possible to include lagged exogenous or independent variables, as in:

            Y_t = β_0 + β_1 X_1,t + β_2 X_1,t-1 + ε_t

        1. The term exogenous has a particular meaning to many analysts, but I use it loosely as a synonym for independent.
        2. Here Y is a function of X at both t and t - 1.
           i.  That is, the effects of X last beyond the current time period.
        3. These models are approached in much the same way the other ones have been: we look for and attempt to correct serial or autocorrelation.
     C. Lagged endogenous or dependent variables.
        1. It is common to find models in which the dependent variable at time t is a function of its value at time t - 1 or even earlier.
        2. Example:

            Y_t = β_0 + β_1 Y_{t-1} + ε_t

           i.  I wonder if the budget of the United States government couldn't be modeled partly in this way:

               1) The budget process used to be called disjointed incrementalism, which meant in part that one year's budget was the same as the previous year's plus a small increment.
                  a) The expected value of ε_t would be positive, I suppose.
        3. Models with lagged endogenous (and possibly lagged or non-lagged exogenous) variables have to be treated in a somewhat different fashion than those previously discussed, because the autocorrelation parameter cannot be estimated as before.
     D. ARMA models.
        1. So far, I have dealt almost exclusively with first-order positive autocorrelation.
           i.  The error term is a partial function of its value at the immediately preceding time period.
        2. But it is possible that the error term reflects the effects of its values at several earlier times, as in:

               ε_t = ρ_1 ε_{t-1} + ρ_2 ε_{t-2} + ... + ν_t

           i.  A model incorporating this sort of error process is called an autoregressive process of order p, AR(p).
        3. A moving average error process involves a disturbance that is created by a weighted sum of current and lagged random disturbances:

               ε_t = δ_1 ν_t + δ_2 ν_{t-1} + ... + δ_q ν_{t-q}

           i.  This is called a moving average model of order q: MA(q).
        4. It's possible to have mixed models involving both types of error structures, such as:

               ε_t = ρ_1 ε_{t-1} + ν_t + δ_1 ν_{t-1}

     E. It's necessary to use alternative methods to identify and estimate these models, a subject we cannot get to this semester.
     F. Many social scientists feel that p and q almost never exceed 1 or possibly 2. I have, however, on occasion seen somewhat deeper models.
     G. Finally, there is still another version called an ARIMA model, for autoregressive integrated moving average.
        1. Again, we do not have time to explore them.
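The two error processes just described can be simulated in a few lines, which is a useful way to build intuition for what AR(1) and MA(1) disturbances look like. A hedged sketch with arbitrary toy parameters (ρ = .7 and δ = .5 are invented for illustration; nothing here comes from the notes):

```python
import random

# Simulated AR(1) and MA(1) error processes with arbitrary toy parameters.

def ar1_errors(n, rho, seed=0):
    """AR(1): epsilon_t = rho * epsilon_(t-1) + nu_t, with nu_t white noise."""
    rng = random.Random(seed)
    eps = [rng.gauss(0, 1)]
    for _ in range(1, n):
        eps.append(rho * eps[-1] + rng.gauss(0, 1))
    return eps

def ma1_errors(n, delta, seed=0):
    """MA(1): epsilon_t = nu_t + delta * nu_(t-1)."""
    rng = random.Random(seed)
    nu = [rng.gauss(0, 1) for _ in range(n)]
    return [nu[t] + (delta * nu[t - 1] if t > 0 else 0.0) for t in range(n)]

e_ar = ar1_errors(200, rho=0.7)    # neighboring errors tend to share a sign
e_ma = ma1_errors(200, delta=0.5)  # correlation dies out after one lag
```

Plotting either series against a time counter reproduces the "runs" pattern discussed earlier for positively autocorrelated residuals.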

V.   LOGISTIC REGRESSION:
     A. A binary dependent variable.
        1. Consider a variable, Y, that has just two possible values.
           i.  Examples: "voted Democratic or Republican"; "passed seatbelt law or did not pass seatbelt law"; "industrialized or non-industrialized."
           ii. The values of these categories can, following our previous practice, be coded 0 and 1.
        2. If such a variable is a function of X (or X's), the mean value of Y, denoted p(x), can be interpreted as a probability, namely, the conditional probability that Y = a specific category (i.e., Y = 1 or 0), given that X (or the X's) have certain values.
        3. It is tempting to use such a variable as a dependent or response variable in OLS, and in many circumstances we can obtain reasonable results from doing so. Such a model would have the usual form:

               Y_i = β_0 + β_1 X_1 + ... + ε_i

        4. Still, there are two major problems with this approach.
           i.  It is possible that the estimated equation will produce estimated or predicted values that are less than 0 or greater than 1.
           ii. Another problem is that for a particular X, the dependent variable is 1 with some probability, which we have denoted above as p(x), and is 0 with probability 1 - p(x).
        5. The variance of Y at this value of X can be shown to be p(x)(1 - p(x)).
           i.  Consequently, the variance of Y depends on the value of X. Normally, this variance will not be the same from one X to another, a situation that violates the assumption of constant variance.
           ii. Recall the discussion and examples earlier in the course that showed the distribution of Y's at various values of X. We said that the variance of these distributions is assumed to be the same up and down the line.
     B. For these and other reasons, social scientists and statisticians do not usually work with "linear probability models" such as the one above. Instead, they frequently use odds and odds ratios (and the log of odds ratios) as the dependent variable.
        1. Suppose π_i is the probability that a case or unit has Y = 1, given that X or the X's take specific values. Then (1 - π_i) is the corresponding conditional probability that Y is 0.
        2. The ratio of these two probabilities can be interpreted as the conditional odds of Y having value 1 as opposed to 0. The odds can be denoted:

               Ω_i = π_i / (1 - π_i)

        3. For various reasons it is easier to work with the logarithm of this quantity:

               L_i = log[ π_i / (1 - π_i) ]

        4. L_i is considered the dependent variable in a linear model.
        5. The approach is called logistic regression.
           i.  We'll discuss it next time.

VI.  NEXT TIME:
     A. Additional regression procedures:
        1. Resistant regression.
        2. Polynomial regression.
     B. Models and structural equations.
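As a small preview of next time, the odds and log-odds defined in section V are one-liners to compute. A minimal sketch (the example probabilities are invented for illustration):

```python
import math

# Odds and log-odds (logit) of a probability pi, per the definitions above.

def odds(pi):
    """Omega = pi / (1 - pi); requires 0 < pi < 1."""
    return pi / (1.0 - pi)

def logit(pi):
    """L = log(pi / (1 - pi)); 0 at pi = .5, negative below, positive above."""
    return math.log(odds(pi))

# A probability of .8 corresponds to odds of about 4 to 1:
print(round(odds(0.8), 3))   # 4.0
print(logit(0.5))            # 0.0
```

Because the logit runs over the whole real line, a linear model for L_i avoids the out-of-range predictions that plague the linear probability model.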

Posc/Uapp 816 Class 21 Miscellaneous Topics Page 10 S i ' B i (1 & B i ) 3. For various reasons it is easier to work with the logarithm of this quantity:? i ' log B i (1 & B i ) 4. Ο is considered the dependent variable in a linear model. i 5. The approach is called logistic regression. i. We ll discuss it next time. VI. NEXT TIME: A. Additional regression procedures 1. Resistant regression 2. Polynomial regression B. Models and structural equations Go to Notes page Go to Statistics page