Decision 411: Class 5. Where we've been so far

Decision 411: Class 5

HW#2 discussion
Introduction to regression forecasting
Example: rolling back the beer tax

Where we've been so far

Thus far we have looked at the most basic models for predicting future values of a time series Y from its own history: the mean model, the random walk model, and smoothing/averaging models, possibly with seasonal adjustment. These basic models assume that future values of Y are some sort of linear function of its past values, so we've also discussed the use of nonlinear data transformations (logging, deflating) to cover more possibilities. We've also studied basic principles and tools for testing the assumptions of models and comparing the forecasting accuracy of different models.

Where we're going next

Next we will consider models for predicting future values of Y as linear functions of already-known values of some other variable X, or possibly several other variables (X1, X2, etc.). These more general linear forecasting models are called regression models (for reasons to be explained). In some cases the X's could be lagged (previous) values of Y, but in general they are other variables whose movements are in some way predictive of movements in Y. Our same general tools for testing and comparing models will still apply, but now there will be more assumptions to test and more models to compare.

Game plan for this week

Today (videos #3-6):
Major concepts: correlation, R-squared & all that
Regression tools for time series data
Regression procedures available in Statgraphics
Example: rolling back the beer tax

Friday (videos #15-16):
Seasonality revisited: dummy variables
Selecting regressors: manual vs. stepwise
Modeling issues & diagnostic tests
More examples

Game plan for next week

Tuesday, September 25: Quiz
Nonlinear transformations
Not-so-simple regression
Multiplicative regression

Friday, September 25 (videos #17-18):
Advanced regression techniques: ANOVA, general linear models, logistic regression, etc.

Linear regression

Linear regression is the most widely used (and abused!) of all statistical techniques. It is about the fitting of straight lines to data.

General equation of a (simple) regression line: Y = constant + beta*X

Why assume linear relationships?

Linear relationships are the simplest non-trivial relationships (start simple!). "True" relationships between variables are often at least approximately linear over the range of interest.

Linearization of relationships

Alternatively, we may be able to transform the variables in such a way as to "linearize" the relationships. Nonlinear transformations (log, power, reciprocal, deflation, differences, percentage differences, ratios of variables, etc.) are therefore an important tool of regression modeling, but use them with care and with good motivation! (See the sketch after the examples below.)

Examples:

Sales $$ = constant + beta * Advertising $$
Δ Units sold = constant + beta * Δ Coupons distributed
% Return on stock = constant + beta * % Return on market
Log(Population) = constant + beta * Time
Temperature(t) = constant + beta * Temperature(t-1)
Δ WebHits(t) = constant + beta * Δ WebHits(t-1)

(Δ denotes "change in," i.e., delta, DIFF)
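To make the Log(Population) example concrete, here is a minimal sketch (Python, with synthetic data; the series and growth rate are hypothetical) of how logging turns exponential growth into a straight line that ordinary least squares can fit:

```python
# Sketch: exponential growth is linear on the log scale. Synthetic data.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(50)                                        # time index
pop = 1000 * np.exp(0.02 * t + rng.normal(0, 0.01, 50))  # noisy exponential growth

# Fitting pop = a + b*t directly would be misspecified; log(pop) = c + d*t is linear.
d, c = np.polyfit(t, np.log(pop), 1)                     # least-squares line on logged data
print(f"log(Population) = {c:.3f} + {d:.4f}*Time (growth rate about {d:.1%} per period)")
```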

History of regression

Regression was so named by Sir Francis Galton, a 19th-century scientist & adventurer. Galton initially gained fame for his African explorations and wrote best-selling books on wilderness exploration that introduced the sleeping bag & other wilderness gear to the Western world (still in print).

Galton (warts and all)

He was also a pioneer in the collection & analysis of biometric, anthropometric & psychometric data, inspired by the evolution theory of Darwin. He invented weather maps and pioneered the scientific study of tea-brewing. He was also wrong about some things (e.g., eugenics). His disciple, Karl Pearson, worked out the mathematics of correlation and regression. (Look him up in Google or Wikipedia, also Galton.org.)

Galton's observations

A taller-than-average parent tends to have a taller-than-average child, but the child is likely to be less tall than the parent relative to its own generation:

Parent's height = x standard deviations from the mean
Child's predicted height = rx standard deviations from the mean

...where r is a number less than 1 in magnitude: the coefficient of correlation between the heights of parents and children. This is a "regression toward mediocrity," or in modern terms a "regression to the mean."

[Figure: the first regression line (1877)]

Graphical interpretation of regression

If you standardize the X and Y variables by converting them to units of standard deviations from their own means, the prediction line passes through the origin and has a slope equal to the correlation coefficient, r. Thus, the line "regresses" back toward the X-axis, because this minimizes the squared errors in predicting Y from X.

[Scatterplot: standardize(Y) vs. standardize(X)]

On a standardized plot of Y vs. X, where the units are standard deviations from the mean, the data distribution is roughly symmetric around the 45-degree line, but the line for predicting Y from X regresses toward the X axis because this minimizes the squared error in the Y direction. The slope of the regression line on the standardized plot is the correlation r (= 0.46 in this case).
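As a quick check of the claim that the standardized slope equals r, here is a minimal sketch in Python with synthetic data (the slide's 0.46 correlation is not reproduced; any dataset illustrates the identity):

```python
# Sketch: on standardized variables the least-squares slope equals r
# and the intercept is zero. Synthetic data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)          # some linearly related pair

zx = (x - x.mean()) / x.std()               # standardize: mean 0, sd 1
zy = (y - y.mean()) / y.std()

slope, intercept = np.polyfit(zx, zy, 1)
r = np.corrcoef(x, y)[0, 1]
print(f"slope = {slope:.4f}, r = {r:.4f}")  # equal (up to rounding)
print(f"intercept = {intercept:.1e}")       # essentially zero: line through origin
```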

Graphical interpretation 3

[Scatterplot: standardize(Y) vs. standardize(X), with the line for predicting X from Y]

If we instead wanted to predict X from Y, the line would regress to the Y axis instead! (This line would minimize the squared error measured in the X direction.)

Graphical interpretation of regression with time series data

In a simple regression of two time series, the forecast plot of Y is just a shifted and rescaled copy of the time series plot of X. In a multiple regression of time series, the forecast plot of Y is a weighted sum* of the plots of the X's. In either case, the time pattern in Y should look like some of the time patterns in the X variables: trends and peaks and valleys and spikes in Y ideally should have their counterparts somewhere among the X's.

* Weights can be positive or negative.

Regression is inescapable

Your kids will probably be less exceptional than you and your spouse, for better or worse. Your performance on the final exam in a course will probably be less exceptional than your score on the midterm. A ballplayer's performance during the 2nd half of a season will probably be less exceptional than in the 1st half. The hottest mutual funds of the last 5 years will be less hot in the next 5 years.

Regression is inescapable, cont'd

Your forecasting models will always produce sequences of forecasts that are smoother (less variable) than the actual data. This doesn't mean the future is guaranteed to be more mediocre (less interesting) than the past, but that's the way to bet!

Why do predictions regress?

Is there a restoring force that pulls everything back to the mean? No! It's a purely statistical phenomenon. Every observation of a random process is part "signal" (a predictable or inheritable component) and part "noise" (a random, unpredictable, zero-mean component). Here's why: an observation that is exceptional (far above or below the mean) is likely to be composed of a signal and a noise term with the same sign (both positive or both negative). If the high- (or low-) achiever performs again (or has offspring), the expected signal will be just as strong and in the same direction, but the expected noise term will be zero. Hence the second observation is likely (not guaranteed, just likely) to be closer to the mean.
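The signal-plus-noise argument is easy to verify by simulation. This minimal sketch (synthetic data on an arbitrary scale) gives two "performances" the same signal but independent noise, and shows that extreme first scores are followed, on average, by scores closer to the mean:

```python
# Sketch: regression to the mean from signal + independent noise. Synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
signal = rng.normal(0, 1, n)           # stable, "inheritable" component
score1 = signal + rng.normal(0, 1, n)  # first observation = signal + fresh noise
score2 = signal + rng.normal(0, 1, n)  # second observation = same signal, new noise

top = score1 > 2.0                     # exceptional first performances
print(f"mean first score of top group:  {score1[top].mean():.2f}")
print(f"mean second score of top group: {score2[top].mean():.2f}")  # closer to 0
```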

Underlying assumptions of regression

Linear relationship between variables
Constant variance of errors (homoscedasticity)
Normally distributed errors
Independent errors (no autocorrelation)
Stationary process (stable correlations over time)

These need to be tested! Error statistics and confidence intervals for forecasts are not reliable if the assumptions are badly violated.

Sufficient statistics for regression

Regression analysis depends only on the following summary statistics of the data:

Means of all variables
Variances (or standard deviations) of all variables
Covariances (or correlations) between all pairs of variables

Given only these statistics, you can calculate all the coefficient estimates, standard errors, and forecasts for any regression model that might be fitted to any combination of the variables! (However, you still ought to look at residual plots, etc.)

Variance measures the tendency of a variable to vary (away from its mean)

Population variance: VARP(Y) = AVG((Y − AVG(Y))²)

...the population variance is the average squared deviation of Y from its own mean.

Sample variance: VAR(Y) = (n/(n−1)) × VARP(Y)

...an unbiased estimate of the true variance based on a finite sample of size n. The factor n/(n−1) adjusts for the degree of freedom for error that was used up in calculating the mean from the same sample.

Our forecasting task is to "explain" the variance in Y. Why does it vary in the way that it does, i.e., why isn't it always constant?

Covariance measures the tendency of two variables to vary together

Population covariance: COVP(X, Y) = AVG((X − AVG(X))(Y − AVG(Y)))

...the average product of the deviations of X and Y from their respective means. If Y and X tend to be on the same side of their respective means at the same time (both above or both below), the average product of deviations is positive. If they tend to be on opposite sides of their own means at any given time, the average product is negative. If their variations around their own means are unrelated, the average product is zero.

Sample covariance

Sample covariance: COV(X,Y) = (n/(n−1)) × COVP(X,Y)

...an unbiased estimate of the true covariance based on a sample of size n, analogous to the sample variance.

Correlation

The correlation coefficient is the covariance standardized by dividing by the product of standard deviations:

r = COV(X,Y)/(STDEV(X) × STDEV(Y)) = COVP(X,Y)/(STDEVP(X) × STDEVP(Y)) = CORREL(X,Y) in Excel

It measures the strength of the linear relationship between X & Y on a relative scale of −1 to +1. When the correlation is significantly different from zero, variations in X can help to predict variations in Y.
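Here is a minimal sketch (Python, synthetic data) of the definitions above, computed from first principles and checked against numpy's built-ins:

```python
# Sketch: population vs. sample variance/covariance, and correlation. Synthetic data.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(10, 2, 200)
y = 3 + 0.5 * x + rng.normal(0, 1, 200)
n = len(x)

varp_y = np.mean((y - y.mean()) ** 2)             # population variance: VARP(Y)
var_y = (n / (n - 1)) * varp_y                    # sample variance: VAR(Y)
covp = np.mean((x - x.mean()) * (y - y.mean()))   # population covariance: COVP(X,Y)
cov = (n / (n - 1)) * covp                        # sample covariance: COV(X,Y)

# Correlation: covariance standardized by the product of standard deviations.
# The n/(n-1) factors cancel, so population and sample versions give the same r.
r = covp / (x.std() * y.std())                    # np.std defaults to population (ddof=0)
print(f"r = {r:.4f}, numpy check = {np.corrcoef(x, y)[0, 1]:.4f}")
print(f"VAR(Y) = {var_y:.3f}, numpy check = {y.var(ddof=1):.3f}")
```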

Simple regression formulas

Model assumption: Y_t = β0 + β1·X_t + ε_t, where β0 is the intercept, β1 is the slope, and the ε_t are independent, identically normally distributed errors.

Prediction equation: Ŷ_t = β̂0 + β̂1·X_t

Least squares coefficient estimates:

β̂1 = COV(X,Y)/VAR(X) = r × (STDEV(Y)/STDEV(X))
β̂0 = AVG(Y) − β̂1 × AVG(X)

We have exact formulas for the coefficient estimates; we don't need to use Solver to minimize squared error. The slope coefficient is just the correlation multiplied by the ratio of standard deviations!

Multiple regression formulas

The formulas for coefficient estimates and forecast standard errors for the multiple regression model are merely matrix versions of the preceding formulas. If you're interested in the gory details, see the "Regression formulas" worksheet (SIMPREG.XLS) posted on the Course Outline web page (lecture 5 links).
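A minimal sketch (Python, synthetic data) of the exact least-squares formulas, checked against numpy's least-squares fit:

```python
# Sketch: slope = r * STDEV(Y)/STDEV(X); the fitted line passes through the means.
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(5, 1.5, 300)
y = 2 + 1.2 * x + rng.normal(0, 1, 300)

r = np.corrcoef(x, y)[0, 1]
b1 = r * y.std(ddof=1) / x.std(ddof=1)   # slope from summary statistics
b0 = y.mean() - b1 * x.mean()            # intercept: AVG(Y) - b1*AVG(X)

slope_np, intercept_np = np.polyfit(x, y, 1)
print(f"formula: b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"polyfit: b0 = {intercept_np:.4f}, b1 = {slope_np:.4f}")  # identical
```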

Standard error of the regression

The standard error of the regression, a.k.a. standard error of the estimate, is the RMSE adjusted for the # of coefficients estimated:

s = sqrt((n−1)/(n−2)) × STDEV(e) = sqrt( ((n−1)/(n−2)) × (1 − r²) × VAR(Y) )

Here sqrt((n−1)/(n−2)) is the adjustment for the # of coefficients estimated (2), STDEV(e) is the sample standard deviation of the residuals (errors), (1 − r²) is the fraction of variance unexplained, and VAR(Y) is the original sample variance.

s is the estimated standard deviation of the true error process (ε_t), and in general it is slightly larger than the sample standard deviation of the residuals, due to the adjustment for additional coefficients estimated besides the constant. All the other standard errors (for coefficients, means, forecasts, etc.) are proportional to this quantity.

Standard errors of the coefficient estimates

Standard error of the slope coefficient:

SE(β̂1) = s / (sqrt(n) × STDEVP(X))

t-statistic of the slope coefficient:

t = β̂1 / SE(β̂1)

The p-value of the t-stat is TDIST(t, n−2, 2) in Excel. The larger the sample size (n), the more precise the coefficient estimate.
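A minimal sketch (Python with scipy, synthetic data) of these formulas; scipy's t.sf plays the role of Excel's TDIST(t, n−2, 2):

```python
# Sketch: standard error of the regression, SE of the slope, t-stat, p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 60
x = rng.normal(0, 2, n)
y = 1 + 0.8 * x + rng.normal(0, 1.5, n)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)                   # residuals

s = np.sqrt(np.sum(e ** 2) / (n - 2))   # standard error of the regression
se_b1 = s / (np.sqrt(n) * x.std())      # SE of slope: s / (sqrt(n) * STDEVP(X))
t = b1 / se_b1                          # t-statistic of the slope
p = 2 * stats.t.sf(abs(t), df=n - 2)    # two-tailed p-value, like TDIST(t, n-2, 2)
print(f"s = {s:.3f}, SE(b1) = {se_b1:.4f}, t = {t:.2f}, p = {p:.2g}")
```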

Standard error of the mean

The standard error of the mean at X = X_t is the standard deviation of the error in estimating the true height of the regression line at that point:

SE_mean = s × sqrt( 1/n + (X_t − AVG(X))² / (n × VARP(X)) )

The 1/n term is the same as the standard error of the mean in the mean model; the (X_t − AVG(X))²/(n × VARP(X)) term is a correction factor for the distance of X_t from the mean.

Standard error of the forecast

The standard error of the forecast is:

SE_fcst = sqrt( s² + SE_mean² )

The s² term measures the noise (unexplained variation) in the data; the SE_mean² term measures the error in estimating the height of the true regression line at X = X_t. Note that this is almost the same formula we used for the mean model in class 1. The only difference is that calculating SE_mean is slightly more complicated here: it depends on the value of X_t.

Lower bounds on standard errors

s/sqrt(n) is a lower bound on the standard error of the mean (equalled only when X = AVG(X)).

s × sqrt(1 + 1/n) is the corresponding lower bound on the standard error of the forecast.

Key point: the standard errors of the forecasts for Y are larger for values of X that are farther from the mean, i.e., farther from the center of the data distribution.

Confidence limits

Confidence limits for a forecast are obtained by adding and subtracting the appropriate multiples of the forecast standard error (as usual). For large n (>20) a rough 95% confidence interval is plus or minus 2 standard errors. The exact number of standard errors for a 95% interval, for any n, is given by TINV(.05, n−2) in Excel. A 50% interval is roughly 1/3 as wide (plus or minus 2/3 of a standard error).
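To tie the last two slides together, here is a minimal sketch (Python with scipy, synthetic data, and a hypothetical X_t) that computes SE_mean and SE_fcst at a given X_t and forms exact t-based confidence limits; scipy's t.ppf plays the role of Excel's TINV:

```python
# Sketch: forecast standard error and exact 95% confidence limits at X = X_t.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 40
x = rng.normal(10, 3, n)
y = 5 + 2 * x + rng.normal(0, 2, n)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e ** 2) / (n - 2))       # standard error of the regression

x_t = 18.0                                  # a point far from the mean of X
se_mean = s * np.sqrt(1 / n + (x_t - x.mean()) ** 2 / (n * x.var()))
se_fcst = np.sqrt(s ** 2 + se_mean ** 2)    # noise + line-estimation error

t_mult = stats.t.ppf(0.975, df=n - 2)       # exact multiplier, like TINV(.05, n-2)
fcst = b0 + b1 * x_t
lo, hi = fcst - t_mult * se_fcst, fcst + t_mult * se_fcst
print(f"forecast = {fcst:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
print(f"SE_fcst = {se_fcst:.3f} vs. lower bound {s * np.sqrt(1 + 1/n):.3f}")
```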

[Plot: regression line with 95% confidence interval for the mean at X_t = 210]

Note that confidence intervals are wider when X is far from the center; this probably understates the danger of over-extrapolating a linear model!

[Plot: regression line with 95% confidence interval for the forecast at X_t = 210]

The confidence interval for the forecast reflects both the "parameter risk" concerning the slope & intercept of the regression line and the intrinsic risk of random variations around it.

Strange but true

For any regression model:

VAR(Y) = VAR(Ŷ) + VAR(e)

Total variance = explained variance + unexplained variance.

For a simple regression model:

VAR(Ŷ)/VAR(Y) = r²

i.e., the fraction of variance explained = r squared.

R-squared

The term "R squared" refers to the fraction of variance explained, i.e., the ratio VAR(Ŷ)/VAR(Y), regardless of the number of regressors. It measures the improvement of the regression model over the mean model for Y. A bigger R-squared is usually better, for the same Y, but R-squared should not be used to compare models that may have used different transformations of Y and/or different data samples. R-squared can be very misleading for regressions involving time series data: 90% or more is not necessarily good, and 10% or less is not necessarily bad.
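A minimal sketch (Python, synthetic data) of the "strange but true" variance decomposition and of R-squared as the fraction of variance explained:

```python
# Sketch: VAR(Y) = VAR(Yhat) + VAR(e), and R-squared = VAR(Yhat)/VAR(Y) = r^2.
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0, 1, 500)
y = 1 + 2 * x + rng.normal(0, 2, 500)

b1, b0 = np.polyfit(x, y, 1)
yhat = b0 + b1 * x
e = y - yhat

print(f"VAR(Y)            = {y.var():.3f}")
print(f"VAR(Yhat)+VAR(e)  = {yhat.var() + e.var():.3f}")  # equal: the decomposition
r2 = yhat.var() / y.var()                                 # fraction of variance explained
print(f"R-squared = {r2:.3f}, r^2 check = {np.corrcoef(x, y)[0, 1] ** 2:.3f}")
```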

Example: rolling back the beer tax

Suppose the 1991 beer tax had been rolled back in July 2007, resulting in an immediate 10-point drop in the beer price index. What would be the expected effect on per capita real consumption ("BeerPerCapita")? What would we predict for the consumption rate in July 2007? (The June 2007 rate is $268.65 per year, SAAR, in year-2000 beer dollars.)

In search of a linear model

Variables: BeerPerCapita, BeerRelPrice. What will happen to per capita consumption in July 2007?

[Scatterplot: BeerPerCapita vs. BeerRelPrice, with the post-tax-hike anomaly and the assumed relative price drop in June marked]

A scatterplot of BeerPerCapita vs. relative price (BeerPrice/CPI) reveals a strong negative correlation and a highly linear relationship (except for the mid-90's anomaly). The actual correlation is −0.94, which suggests that 88% of the variance in BeerPerCapita can be explained by BeerRelPrice.

Summary statistics of variables

Here are summary stats and correlations of the two variables, obtained with the Multiple-Variable Analysis procedure. Note that the standard deviation of BeerPerCapita is $39.15, which is essentially the forecast standard error we would get by using the mean model to predict it. How much better can we do with a regression model? Well, there is a very strong negative correlation of −0.94 with BeerRelPrice, and the square of the correlation is the fraction by which the error variance can be reduced by regressing BeerPerCapita on BeerRelPrice instead of using the mean model.

Variable definitions:
BeerPerCapita = constant × Beer/(BeerPrice × Population)
BeerRelPrice = BeerPrice/CPI

Fitting a simple regression model: Relate/Multiple Factors/Multiple Regression on the Statgraphics menu

Typical regression output:

Standard error of the regression, a.k.a. standard error of the estimate: the RMSE adjusted for the # of coefficients estimated. This is the bottom line, IF it is really representative of future accuracy.

R-squared & adjusted* R-squared: not the bottom line!

Coefficients & their standard errors, t-stats (= coeff./std. error) & p-values: used to test whether some variables are insignificant in the presence of the others.

Residual plots and diagnostic tests: used to test assumptions of linearity, normality, no autocorrelation, etc.

* Adjusted for the # of coefficients in the same way as the standard error of the regression, to be able to compare among models with different #'s of coefficients.

What to look for in regression output

Error measures: smaller is better.
t-statistics of coefficients: greater than 2 in magnitude (p-values < 0.05) means the variables appear "significant."*
Economic interpretations of coefficients.
Residual plots & diagnostic tests:
Residuals vs. predicted (nonlinearity?)
Normal probability plot (skew? fat tails? outliers?)
Residuals vs. time (for time series data)
Residual autocorrelation plot (for time series data)

* Not a hard and fast rule, but variables that don't pass this test can often be removed without being missed. If a variable's presence in the model is strongly supported by intuition or theory, then a low t-stat may be OK: its effect may just be hard to measure.

Basic regression output

The "Interval plot" option plots the regression line vs. the dependent variable or time index. R-squared = 88% as expected, and the slope coefficient (−280.8) is highly significant (t-stat = −64). The standard error of the regression is $13.80, much less than the original standard deviation, but still a lot of error in predicting next month's per capita consumption! The Durbin-Watson stat and lag-1 autocorrelation are also very bad! (DW should be close to 2, not zero; lag-1 autocorrelation should be close to zero, not 1!)

Deconstruction of R-squared

The variance of the dependent variable is (39.15)² ≈ 1533. This is the error variance you would get by using the mean model. The variance of the regression forecast errors is the square of the regression standard error, which is (13.8)² = 190. The fraction of the original variance that remains unexplained is 190/1533 ≈ 12%, hence the fraction explained is 88%. This is the reduction in error variance compared to using the mean model instead.

What's the Durbin-Watson statistic?

It's just an alternative statistic for measuring lag-1 autocorrelation in the residuals, which is popular in regression analysis for obscure historical reasons.

0 < DW < 4, and ideally DW ≈ 2.
DW ≈ 2(1 − r1), where r1 = lag-1 autocorrelation.

r1 is easier to interpret: a good value is close to 0, and r1² is roughly the percentage of further variance reduction that could be achieved by fine-tuning to reduce the autocorrelation.
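A minimal sketch (Python, synthetic autocorrelated residuals) of the Durbin-Watson statistic and its relation to the lag-1 autocorrelation:

```python
# Sketch: DW = sum of squared successive differences / sum of squares, and DW ~ 2*(1 - r1).
import numpy as np

rng = np.random.default_rng(8)
e = np.zeros(300)
for t in range(1, 300):                       # AR(1) residuals, positively autocorrelated
    e[t] = 0.7 * e[t - 1] + rng.normal()

dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)  # Durbin-Watson statistic
r1 = np.corrcoef(e[:-1], e[1:])[0, 1]          # lag-1 autocorrelation
print(f"DW = {dw:.2f}, 2*(1 - r1) = {2 * (1 - r1):.2f}")  # approximately equal
```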

Economic interpretation of model

The slope coefficient of −280.8 suggests that a 0.01 decrease in the relative price (X) should increase consumption (Y) by $2.81. The proposed tax rollback would decrease the relative price by 0.049 (from 0.569 to 0.520). Thus, predicted consumption in July '07 will increase by 0.049 × 280.8 = $13.76 from its predicted June '07 value. But the model's prediction for June '07 is already way off! Hence the prediction for July '07 is less than the actual June '07 value, despite the price drop.

Forecasting equation of model 1

Forecasting equation of this model: Y_t = constant − 280.8 × X_t

For July '07: Y_t = constant − 280.8 × (0.52)

The forecast for Y depends (only) on the current value of X, not on recent values of Y.

[Table: BeerRelPrice (X) and BeerPerCapita (Y) for May '07, June '07, and July '07, with the July value of Y to be forecast (??)]

The forecast ("Reports") report

The multiple regression procedure automatically shows forecasts (on the "Reports" report) if future values are provided for the independent variable(s). Here, a July 2007 value of 0.52 was plugged in for BeerRelPrice on the data spreadsheet, and the resulting forecast for BeerPerCapita is $240.20, which is $28.45 below the current value of $268.65. The upper 95% limit of $267.4 is even below the current value!

[Plot: regression line with 95% confidence limits; the last data point (268.65) lies above the 95% CI for the forecast at BeerRelPrice = 0.52]

Here's the plot of the regression line with 95% confidence limits for the forecasts. This is the "Interval plots" chart drawn with 95% intervals for predicted values (a right-mouse-button option). The interval for the July '07 prediction is at the upper left, where BeerRelPrice = 0.52.

The plot of residuals vs. row number (time) shows severe autocorrelation, i.e., long runs of values with the same sign, as foretold by the bad DW stat and lag-1 autocorrelation, and the most recent errors have been especially large. The plot of predicted values (red) vs. row number ("Interval plot") shows a poor fit to the data, and the predicted jump in July '07 falls well short of the June '07 value. The predicted values are actually just a shifted and rescaled version of BeerRelPrice.

Regression option in Forecasting procedure

Here model E is specified as a "mean + 1 regressor" model. You can also fit the same regression model in the Forecast/User-Specified Model procedure. Choose the "Mean" model type and hit the "Regression" button to add independent variables. This approach allows you to use the model-comparison features and additional residual diagnostics in the forecasting procedure. Caveat: no more than 4 independent variables are allowed here.

Same regression results and forecast, but the normal probability plot and autocorrelation plot of the residuals look terrible, and the comparison with simpler time series models is not flattering!

Conclusion (so far...)

Although this model provides a plausible estimate of the macroeconomic relationship between relative price and per capita consumption (assuming that the long-term upward trend in consumption is entirely caused by the long-term downward trend in relative price!), it does not do a very plausible job of forecasting the near future. Why not? It is a cross-sectional model that does not exploit the time dimension in the data: it predicts consumption for a randomly chosen relative price. Due to other, unmodeled factors, the data wanders away from the regression line and does not return very quickly; errors are strongly correlated.

How to incorporate the time dimension in a regression model?

Some possible approaches:

Predict changes instead of levels (i.e., use a first-difference transformation)
Use lagged variables (recent values of dependent and independent variables) as additional regressors, to serve as proxies for the effects of unmodeled variables*
Use an autocorrelation correction (e.g., Cochrane-Orcutt or an ARIMA error structure) as a proxy for unmodeled factors*

* We'll discuss these in later classes.

Let's look at monthly changes

Here's a plot of the original BeerPerCapita series, obtained in the Time Series/Descriptive Methods procedure. No transformations have been performed yet.

On the right-mouse-button Analysis Options panel, entering a 1 in the "Differencing" box performs a first-difference transformation. Now we are seeing the plot of month-to-month changes in BeerPerCapita.

Here are time series plots of both BeerPerCapita and BeerRelPrice, before and after a first-difference transformation. Note that the differenced series appear to be "stationary": no trend, constant variance, etc. The circled point in the lower right is the assumed price impact of the tax rollback in July. (This 4-chart arrangement was made by pasting the plots into the StatGallery.)
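Outside Statgraphics, the same first-difference transformation is one line of pandas. This minimal sketch uses synthetic data standing in for the BeerPerCapita series (the values are hypothetical):

```python
# Sketch: diff() turns a trended (nonstationary) level series into month-to-month changes.
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
idx = pd.date_range("2000-01", periods=90, freq="MS")
level = pd.Series(230 + 0.4 * np.arange(90) + rng.normal(0, 2.6, 90).cumsum(),
                  index=idx, name="BeerPerCapita")

change = level.diff()                     # first difference: Y_t - Y_{t-1}
print(level.tail(3))
print(change.tail(3))                     # stationary-looking changes, no trend
print(f"sd of levels: {level.std():.2f}, sd of changes: {change.std():.2f}")
```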

[Scatterplot: diff(BeerPerCapita) vs. diff(BeerRelPrice), drawn with the Plot/Scatterplot/X-Y Plot procedure]

The scatterplot of the differenced variables indicates a weaker but still significant negative correlation (−0.33). The two circled points in the lower right are the drops in Jan. and Feb. due to the beer tax increase. Even when these two points are de-selected, there is still a significant negative correlation.

Statistics of the differenced variables

The correlations and other summary stats of the differenced variables were computed using the Describe/Numeric Data/Multiple-Variable Analysis procedure. Note that the standard deviation of diff(BeerPerCapita) is only $2.64. This is roughly the forecast standard error you would get by using a random walk with drift model to predict BeerPerCapita, because the random walk model merely predicts that each change will equal the mean change. Hence we can already see that the forecast standard error of the RW model is smaller than that of the original cross-sectional regression by roughly a factor of 5. However, let's see if we can improve on the RW model by regressing diff(BeerPerCapita) on diff(BeerRelPrice).

Simple regression of diff(BeerPerCapita) on diff(BeerRelPrice)

In a simple regression of the differenced variables, the change in BeerPerCapita is predicted from the change in BeerRelPrice. This is a "micro" prediction rather than a "macro" prediction. Our predicted level of BeerPerCapita in the next period will be equal to the current level plus the predicted change.

The regression standard error is vastly superior! There is still some autocorrelation, but not nearly as bad. The estimated coefficient of diff(BeerRelPrice) is in the same ballpark as the coefficient of BeerRelPrice in the earlier model. Hence a similar change in consumption per unit change in relative price is predicted. However, this model is directly predicting the change, not the level. The predicted change in July '07 is positive (+$12.65), in line with intuition. But what happened to R-squared? It's fallen to around 11%! (Horrors)

What happened to R-squared??

The previous model explained 88% of the variance in the monthly level of BeerPerCapita. Because BeerPerCapita is a nonstationary, trended variable, it has a lot of variance to explain! This model directly predicts the change in BeerPerCapita, which is a stationary series with a much lower variance to begin with. Hence, less variance remains to be explained by this regression model, and an R-squared of only 11% is actually a much better performance.

Another way to look at it: when the dependent variable is undifferenced, R-squared measures the reduction in error variance compared to using the mean model. When the dependent variable is differenced, R-squared measures the reduction in error variance compared to using the random walk with drift model on the original variable. Here, a random walk model (or another simple time series model) would have been a much better reference point for predictions of monthly per capita beer consumption. The regression of differenced variables is a "walk" model in which the steps are not completely random: they depend on the change in price!

Deconstruction of R-squared

The variance of the differenced dependent variable is (2.637)² ≈ 6.95. This is the error variance you would get by using the random walk with drift model on the original undifferenced variable. The variance of the regression forecast errors is the square of the regression standard error, which is now (2.494)² = 6.22. The fraction of the original variance that remains unexplained is 6.22/6.95 ≈ 89%, hence the fraction explained is 11%. This is not a huge improvement over the random walk model in terms of forecast accuracy, but it does allow us to factor in the price sensitivity of consumers.

Forecasting equation for model 2

Forecasting equation for the change in Y:

(Y_t − Y_{t−1}) = constant + coefficient × (X_t − X_{t−1})

For July '07: (Y_t − Y_{t−1}) = constant + coefficient × (0.520 − 0.569) = +12.65

Undifferenced forecast for the new level of Y:

Y_t = Y_{t−1} + 12.65 = 268.65 + 12.65 = 281.30

The ultimate forecast from this model "steps off" from the last actual value of Y, as in the random walk model, but now the step size depends on the change in X.

[Table: recent values of BeerRelPrice (X) and BeerPerCapita (Y), with the new forecast stepping off the last actual value]
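A minimal sketch (Python, synthetic data) of "model 2": regress the change in Y on the change in X, then step the forecast off the last actual level, random-walk style. The coefficient values here are illustrative, not the ones from the slides:

```python
# Sketch: differenced regression, then an undifferenced ("stepped-off") forecast.
import numpy as np

rng = np.random.default_rng(10)
n = 120
x = 0.60 - 0.0005 * np.arange(n) + rng.normal(0, 0.004, n).cumsum()  # relative price
y = 400 - 280 * x + rng.normal(0, 2.5, n).cumsum()                   # per-capita level

dx, dy = np.diff(x), np.diff(y)
b1, b0 = np.polyfit(dx, dy, 1)     # (Y_t - Y_{t-1}) = b0 + b1*(X_t - X_{t-1})

dx_next = -0.049                   # assumed price drop, as in the tax-rollback story
dy_hat = b0 + b1 * dx_next         # predicted change
y_next = y[-1] + dy_hat            # undifferenced forecast: last level + change
print(f"predicted change = {dy_hat:+.2f}, forecast level = {y_next:.2f}")
```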

Same model in Forecasting procedure

There are several ways in which the differenced regression model can be fitted in the Forecasting procedure. The simplest way is to specify it as an ARIMA model with 1 order of nonseasonal differencing plus 1 regressor and a constant term. The first-difference transformation is applied to both variables prior to fitting the regression model.

Almost the same regression results and forecast (slightly different estimation procedure), and the normal probability plot and autocorrelation plot of the residuals are much better (not perfect, but acceptable). The differenced regression model (B) is best on all error measures, but not by a large margin.

More fine-tuning??

The differenced model still has a technically significant negative lag-1 autocorrelation. Because it is negative, it means the model is over-reacting rather than under-reacting to recent changes in the data. By the r-squared rule, the square of this autocorrelation is the fraction of the remaining variance that might be explained via more fine-tuning (e.g., adding lagged variables). This is not a large improvement: it corresponds to about a 2.5% further reduction in standard error, hence a 2.5% shrinkage in confidence intervals.

Class 5 recap

"Regression to mediocrity" is inescapable.
Correlations and scatterplots help to reveal the strengths of linear relationships.
How to interpret regression output & test residuals.
Much of the variance in the original data may be explainable merely by an appropriate transformation of the data, such as a first-difference transformation applied to nonstationary time series variables.
R-squared is not the bottom line!


More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Introduction to Regression

Introduction to Regression Introduction to Regression ιατµηµατικό Πρόγραµµα Μεταπτυχιακών Σπουδών Τεχνο-Οικονοµικά Συστήµατα ηµήτρης Φουσκάκης Introduction Basic idea: Use data to identify relationships among variables and use these

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Stat 500 Midterm 2 12 November 2009 page 0 of 11

Stat 500 Midterm 2 12 November 2009 page 0 of 11 Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed

More information

Introduction to Regression

Introduction to Regression Introduction to Regression Using Mult Lin Regression Derived variables Many alternative models Which model to choose? Model Criticism Modelling Objective Model Details Data and Residuals Assumptions 1

More information

Week 9: An Introduction to Time Series

Week 9: An Introduction to Time Series BUS41100 Applied Regression Analysis Week 9: An Introduction to Time Series Dependent data, autocorrelation, AR and periodic regression models Max H. Farrell The University of Chicago Booth School of Business

More information

Important note: Transcripts are not substitutes for textbook assignments. 1

Important note: Transcripts are not substitutes for textbook assignments. 1 In this lesson we will cover correlation and regression, two really common statistical analyses for quantitative (or continuous) data. Specially we will review how to organize the data, the importance

More information

Forecasting. Simon Shaw 2005/06 Semester II

Forecasting. Simon Shaw 2005/06 Semester II Forecasting Simon Shaw s.c.shaw@maths.bath.ac.uk 2005/06 Semester II 1 Introduction A critical aspect of managing any business is planning for the future. events is called forecasting. Predicting future

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally

More information