Introduction to Regression

1 Introduction to Regression
Using multiple linear regression
- Derived variables
- Many alternative models
- Which model to choose? Model criticism
Modelling
- Objective
- Model details
- Data and residuals
- Assumptions

2 Data Like This
- Values of coefficients
- Sampling distributions
- Standard errors
- 95% confidence intervals
- 95% prediction intervals
- ANOVA etc.

3 Derived variables
General
- Logs
- Proportions and ratios
- Too many (derived) variables: redundancy, many versions of the same model
- Indicator variables for categorical data
Time series applications
- Indicator variables, e.g. seasonal effects
- Lagged variables
- Differences
- Logs and rate of return

4 Gas Consumption vs Temp
Weekly gas consumption (in 1000 cubic feet) and the average outside temperature (in degrees Celsius) at one house in south-east England for two "heating seasons", one of 26 weeks before, and one of 30 weeks after, cavity-wall insulation was installed. The object of the exercise was to assess the effect of the insulation on gas consumption. The house thermostat was set at 20 °C throughout.
[Figure: Period 1 fitted line plot, Gas vs Temperature; R-Sq 94.4%, R-Sq(adj) 94.1%]
[Figure: Period 2 fitted line plot, Gas vs Temperature; R-Sq 81.3%, R-Sq(adj) 80.6%]
[Figure: comparative plot, Gas vs Temperature, both periods]

5 Objective
Nominal focus on prediction
- Predict gas consumption in future for this house
- Knowing temperature and whether or not insulated
Actual interest
- Does insulation make a difference?
- At all temperatures?
- How much? Slope? Intercept? SEs?
Data Like This

6 Using an Indicator variable
Two layouts of the same data:
- One stacked data set
- Two parallel data sets
[Data tables with columns Insulated (or Insulation), Week, Temperature, Gas]
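The stacking step can be sketched as follows. This is a minimal illustration, not the slide's own procedure: the numbers are hypothetical (they are not the house data), and the helper name `stack_with_indicator` is an assumption introduced here.

```python
import numpy as np

# Hypothetical values for illustration only; NOT the slide's data.
# Each row is (week, temperature in deg C, gas in 1000 cubic feet).
before = np.array([[1, 0.0, 7.0], [2, 2.5, 6.0], [3, 5.0, 5.0]])
after = np.array([[1, 1.0, 4.5], [2, 3.0, 4.0], [3, 6.0, 3.0]])


def stack_with_indicator(not_insulated, insulated):
    """Stack two parallel data sets and prepend a 0/1 Insulated column."""
    ind0 = np.zeros((len(not_insulated), 1))  # Insulated = 0 rows
    ind1 = np.ones((len(insulated), 1))       # Insulated = 1 rows
    return np.vstack([np.hstack([ind0, not_insulated]),
                      np.hstack([ind1, insulated])])


# Columns of the result: Insulated, Week, Temperature, Gas
stacked = stack_with_indicator(before, after)
print(stacked)
```

The stacked layout is the one a regression routine needs, because the indicator column can then be used as an ordinary predictor.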

7 Simple Regression & Indicator Variable
[Figure: fitted line plot, Gas vs Insulated; R-Sq 29.8%, R-Sq(adj) 28.5%]
[Figure: fitted line plot, Temperature vs Insulated; R-Sq 2.6%, R-Sq(adj) 0.8%]
Gas vs Insulated
- Insulated = 0: Avg Gas = [...]
- Insulated = 1: Avg Gas = 3.48
- Diff = [...]
- Coefficient: change per unit increase in Insulated
Temp vs Insulated
- Random error
- Design implications

8 SLR with indicator var & T-test
[Figure: fitted line plot, Gas vs Insulated; R-Sq 29.8%, R-Sq(adj) 28.5%]
Two-sample T for Gas, by Insulated
- Table columns: Insulated, N, Mean, StDev, SE Mean
- Difference = μ(0) - μ(1)
- T-Value = 4.79, DF = 54, using pooled StDev
Regression Analysis: Gas versus Insulated
- Model summary: R-sq(adj) 28.49%, R-sq(pred) 24.5%
- Coefficient table columns: Term, Coef, SE Coef, T-Value, P-Value; rows: Constant, Insulated
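The link between the two outputs can be checked numerically. The sketch below uses simulated data (group means 4.8 and 3.5 are assumptions, not the slide's values) with 26 and 30 observations, and hypothetical helper names `pooled_t` and `slope_t`. It shows that the pooled two-sample t statistic and the t statistic for the indicator's coefficient agree up to sign.

```python
import numpy as np


def pooled_t(y0, y1):
    """Two-sample t statistic for mean(y0) - mean(y1) with pooled StDev."""
    n0, n1 = len(y0), len(y1)
    sp2 = ((n0 - 1) * np.var(y0, ddof=1)
           + (n1 - 1) * np.var(y1, ddof=1)) / (n0 + n1 - 2)
    se = np.sqrt(sp2 * (1 / n0 + 1 / n1))
    return (np.mean(y0) - np.mean(y1)) / se


def slope_t(x, y):
    """t statistic for the slope in a simple linear regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)      # residual variance, n - 2 df
    cov = s2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    return beta[1] / np.sqrt(cov[1, 1])


rng = np.random.default_rng(0)
gas0 = 4.8 + rng.normal(0, 1, 26)  # simulated "not insulated" group
gas1 = 3.5 + rng.normal(0, 1, 30)  # simulated "insulated" group

y = np.concatenate([gas0, gas1])
x = np.concatenate([np.zeros(26), np.ones(30)])  # 0/1 indicator

t_two_sample = pooled_t(gas0, gas1)  # tests mu(0) - mu(1)
t_regression = slope_t(x, y)         # slope estimates mu(1) - mu(0)
print(t_two_sample, t_regression)
```

The signs differ only because the t-test is set up as μ(0) - μ(1), while the regression coefficient measures the change from Insulated = 0 to Insulated = 1. Both use 26 + 30 - 2 = 54 degrees of freedom.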

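9 Indicator Variables in Regression
Response variable: $Y$ = Gas
Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1), a binary indicator variable
Statistical model: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$; $\varepsilon \sim N(0, \sigma^2)$
When $x_2 = 0$: $Y = \beta_0 + \beta_1 x_1 + \varepsilon$
When $x_2 = 1$: $Y = (\beta_0 + \beta_2) + \beta_1 x_1 + \varepsilon$
- Common slopes
- $\beta_2$ = difference between intercepts
- No interaction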
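A minimal sketch of fitting this parallel-lines model by least squares is given below. The data are noise-free and generated from assumed coefficients (7.0, -0.4, -1.5), chosen only so that the recovered values are easy to read; they are not the estimates from the gas data.

```python
import numpy as np

# Hypothetical, noise-free data generated from assumed coefficients.
temp = np.array([0.0, 2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0, 9.0])
ins = np.array([0.0] * 5 + [1.0] * 5)   # binary indicator x2
gas = 7.0 - 0.4 * temp - 1.5 * ins      # Y = b0 + b1*x1 + b2*x2, no error term

# Design matrix: constant, x1 = Temp, x2 = Insulated
X = np.column_stack([np.ones_like(temp), temp, ins])
beta, *_ = np.linalg.lstsq(X, gas, rcond=None)

b0, b1, b2 = beta
print("line when x2 = 0: intercept", b0, "slope", b1)
print("line when x2 = 1: intercept", b0 + b2, "slope", b1)
```

Both lines share the slope `b1`; the indicator's coefficient `b2` is the vertical gap between them at every temperature.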

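10 Multiple Regression Output
Regression Analysis: Gas versus Temperature, Insulated
The regression equation has the form Gas = $\hat\beta_0 + \hat\beta_1$ Temperature $+ \hat\beta_2$ Insulated (parallel lines)
Coefficient table columns: Predictor, Coef, SE Coef; rows: Constant, Temperature, Insulated
Rough 95% CI for the Insulated coefficient: $\hat\beta \pm 2\,SE(\hat\beta)$, with SE = 0.097, giving (-1.76, -1.37)
Previously (Gas versus Insulated only): Mean Diff ± 2(0.274)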

11 Implementation: Categorical Variable

12 Regression Output: Categorical Var
Regression Analysis: Gas versus Temperature, Insulated
Categorical predictor coding: (1, 0)
Model Summary columns: S, R-sq
Coefficient table columns: Term, Coef, SE Coef, T-Value, P-Value; rows: Constant, Temperature, Insulated
Regression Equation, reported once per level of Insulated (0 and 1), each of the form Gas = intercept + slope × Temperature

13 Aside: Omitted predictors
Hidden/lurking variables
Subset of data (used in exam)
- Uninformed by insulation status: slope positive. On average, gas consumption increases with temperature!
- Knowing insulation status: slopes negative. On average, gas consumption decreases with temperature.
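The sign reversal can be reproduced with a small constructed example. The numbers below are invented for illustration (they are not the exam subset): the insulated weeks happen to be the colder ones, so ignoring insulation status mixes a between-group shift into the slope.

```python
import numpy as np

# Constructed data, NOT the slide's subset.
temp_ins = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # insulated, colder weeks
gas_ins = 3.0 - 0.3 * temp_ins
temp_no = np.array([6.0, 7.0, 8.0, 9.0, 10.0])   # not insulated, milder weeks
gas_no = 7.0 - 0.4 * temp_no

# Slopes within each group (np.polyfit returns [slope, intercept]).
slope_ins = np.polyfit(temp_ins, gas_ins, 1)[0]
slope_no = np.polyfit(temp_no, gas_no, 1)[0]

# Slope when insulation status is ignored.
slope_pooled = np.polyfit(np.concatenate([temp_ins, temp_no]),
                          np.concatenate([gas_ins, gas_no]), 1)[0]

print(slope_ins, slope_no, slope_pooled)
```

Within each group the slope is negative, yet the pooled slope is positive, because the omitted indicator is associated with both temperature and gas use in this constructed subset.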

14 Interaction?
Refine the question: different slopes as well?

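15 Indicator Variables in Regression
Response variable: $Y$ = Gas
Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1), $x_3$ = Temp × $x_2$
Combined statistical model: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$; $\varepsilon \sim N(0, \sigma^2)$
When $x_2 = 0$: $Y = \beta_0 + \beta_1 x_1 + \varepsilon$
When $x_2 = 1$: $Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_1 + \varepsilon$
$\beta_2$ = difference in intercepts; $\beta_3$ = difference in slopes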

16 New Derived Variable

17 Modelling two regression lines
Regression Analysis: Gas versus Temperature, Insulated, Ins X Temp
The regression equation has the form Gas = $\hat\beta_0 + \hat\beta_1$ Temperature $+ \hat\beta_2$ Insulated $+ \hat\beta_3$ Ins X Temp
Coefficient table columns: Predictor, Coef, SE Coef; rows: Constant, Temperature, Insulated, Ins X Temp
R-Sq = 92.8%, R-Sq(adj) = 92.4%
Which coefficient is most fundamental to the theory of heat loss?

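18 Alternative models of two regression lines
Nearly equivalent:
a) One model, with interaction
b) Two separate linear regressions, Gas vs Temp
Response variable: $Y$ = Gas
Predictors: $x_1$ = Temp, $x_2$ = Insulated (0/1)
Two statistical models:
$x_2 = 0$: $Y = \beta_{0;NoIns} + \beta_{1;NoIns} x_1 + \varepsilon$; $\varepsilon \sim N(0, \sigma^2_{NoIns})$
$x_2 = 1$: $Y = \beta_{0;Ins} + \beta_{1;Ins} x_1 + \varepsilon$; $\varepsilon \sim N(0, \sigma^2_{Ins})$
Exercise: compare coefficient estimates and 95% intervals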
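One way to see why the two formulations are only "nearly" equivalent is sketched below, on simulated data (the generating coefficients and noise level are assumptions, not the gas estimates). The fitted lines coincide exactly, but the interaction model pools one error variance across both groups, while the separate fits estimate one variance each, so standard errors and intervals can differ.

```python
import numpy as np


def fit(X, y):
    """Least-squares coefficients for design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def resid_var(X, y, beta):
    """Residual variance: RSS / (n - number of coefficients)."""
    r = y - X @ beta
    return r @ r / (len(y) - X.shape[1])


rng = np.random.default_rng(3)
temp0 = rng.uniform(0, 10, 26)                        # not insulated
temp1 = rng.uniform(0, 10, 30)                        # insulated
gas0 = 7.0 - 0.4 * temp0 + rng.normal(0, 0.3, 26)     # assumed line + noise
gas1 = 5.0 - 0.3 * temp1 + rng.normal(0, 0.3, 30)

# a) One model with interaction: columns 1, x1, x2, x3 = x1 * x2
temp = np.concatenate([temp0, temp1])
gas = np.concatenate([gas0, gas1])
ins = np.concatenate([np.zeros(26), np.ones(30)])
X_int = np.column_stack([np.ones(56), temp, ins, ins * temp])
b = fit(X_int, gas)

# b) Two separate simple regressions
X0 = np.column_stack([np.ones(26), temp0])
X1 = np.column_stack([np.ones(30), temp1])
b_no = fit(X0, gas0)
b_in = fit(X1, gas1)

s2_pooled = resid_var(X_int, gas, b)   # one variance, 52 df
s2_no = resid_var(X0, gas0, b_no)      # 24 df
s2_in = resid_var(X1, gas1, b_in)      # 28 df
print(b, b_no, b_in, s2_pooled, s2_no, s2_in)
```

Here `(b[0], b[1])` reproduces the "not insulated" line and `(b[0] + b[2], b[1] + b[3])` the "insulated" line, while `s2_pooled` is the degrees-of-freedom-weighted average of `s2_no` and `s2_in`.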

19 Multiple indicator variables
Will also meet
- Redundancy
- Multiple formulations of the same model

20 Housing Completions, quarterly, 1978 to 2000
[Data table: completions by year and quarter, Q1 to Q4]

21 Time Series Plot of Completions
[Figure 1.0: Housing Completions, quarterly, 1978 to 2000; time series plot by year with quarters Q1 to Q4 marked]
Objective: forecast one quarter ahead

22 Aside: Cubic/Quadratic Regression
Fitted line plot options: log, quadratic, cubic
[Figure: fitted line plot, Comps on time, time**2 and time**3, with regression line and 95% PI; R-Sq 88.3%, R-Sq(adj) 87.9%]

23 Modelling Options
Focus on the stable linear structure post 1993
- Assume this structure will continue
- Exploit structure: extension of indicator variables
- Disadvantage: smaller data set
One model for the entire data set
- Note: structure has changed; it might change again
- Exploit weaker structure
- Use lagged variables
- Advantage: uses all data

24 Comps, quarterly, 1993 to 2000
Option 1: work since 1993
Target is 2001 Q1
- Use Q1 data only?
- OR use all data? 4 parallel lines: more efficient. Why? In what sense?
[Figure: time series plot of Completions by year, 1993 onwards, with quarters Q1 to Q4 marked]

25 Completions, Q1 only
[Figure: fitted line plot, Completions vs year, Q1 only; R-Sq 98.5%, R-Sq(adj) 98.3%]
Other quarters: 4 separate lines
Later, use Time since 1978: changes intercept only
Pred = 10428 ± 2(316.5) = (9795, 11061)

26 Linear in Time plus Quarterly Ind Vars
Create a set of binary variables Q1, Q2, Q3, Q4
Comps = $\beta_1 Q_1 + \beta_2 Q_2 + \beta_3 Q_3 + \beta_4 Q_4 + \beta_t\,$Time $+\ \varepsilon$
[Data table columns: Year.Quarter, time, Time since 1978, Comps, Q1, Q2, Q3, Q4; rows from 1993 Q1]

27 Multiple Indicator Vars: Tech Issue
Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4
* Q4 is highly correlated with other X variables
* Q4 has been removed from the equation.
The regression equation is Comps = [...] + [...] Time since 1978 [...] Q1 - 119 Q2 - 758 Q3
Model with a constant: $Y = \beta_0 + \beta_1 Q_1 + \beta_2 Q_2 + \beta_3 Q_3 + \beta_4 Q_4 + \beta_t t + \varepsilon$
Interpretation of $\beta_t$ and of all the $\beta_i$
Redundancy
Alternatives
- $\beta_0 = 0$: no constant, use indicator variables only
- Equivalently, enter "Quarter" as a categorical variable

28 Multiple Indicator Vars: Tech Issue
Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4
* Q4 is highly correlated with other X variables
* Q4 has been removed from the equation.
Comps = [...] + [...] Time since 1978 [...] Q1 - 119 Q2 - 758 Q3
S = [...]
OR
Regression Analysis: Comps versus Time since 1978, Q1, Q2, Q3, Q4, with the "No constant" option
Comps = 986 Time since 1978 [...] Q1 [...] Q2 [...] Q3 [...] Q4
S = [...]
Note: [...] = [...] = [...] etc.
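The redundancy, and the equivalence of the two ways around it, can be sketched on simulated quarterly data. All numbers below (level, trend, seasonal offsets, noise) are assumptions for illustration, not the completions series.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32                                   # 8 simulated years of quarters
t = np.arange(n, dtype=float)            # time index
quarter = np.arange(n) % 4               # 0..3 stands for Q1..Q4
Q = np.eye(4)[quarter]                   # indicator columns Q1, Q2, Q3, Q4

season = np.array([0.0, 200.0, -300.0, 500.0])   # hypothetical seasonal offsets
y = 5000 + 100 * t + season[quarter] + rng.normal(0, 50, n)

const = np.ones(n)
X_full = np.column_stack([const, t, Q])          # constant + all 4 indicators
X_drop = np.column_stack([const, t, Q[:, :3]])   # constant, Q4 removed
X_nc = np.column_stack([t, Q])                   # no constant, all 4 indicators

# Q1 + Q2 + Q3 + Q4 equals the constant column, so X_full loses one rank.
rank_full = np.linalg.matrix_rank(X_full)

b_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)
b_nc, *_ = np.linalg.lstsq(X_nc, y, rcond=None)

print("columns:", X_full.shape[1], "rank:", rank_full)
print("Q4 dropped :", b_drop)   # constant, time, Q1, Q2, Q3
print("no constant:", b_nc)     # time, Q1, Q2, Q3, Q4
```

The two fits give the same fitted values. In the no-constant form each quarter's coefficient is that quarter's own intercept, which equals the constant plus the corresponding indicator coefficient in the other form (and the constant itself for Q4).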

30 Categorical Variable approach
Model Summary columns: S, R-sq
Regression Equations, one per quarter (Q1, Q2, Q3, Q4), each of the form Comps = intercept + slope × t
Coefficient table columns: Term, Coef, SE Coef; rows: Constant, time since 1978, Quarter
Consider Q2 vs Q1 at t = 0

31 Derived variables and Transforms in Time Series
- Lags
- Differences
- Rates of return
- Log scale

32 All Comps, quarterly, 1978 to 2000
Option 2: use all data, but a different model
[Figure: time series plot of Completions by year, 1978 to 2000, with quarters Q1 to Q4 marked]

33 Auto-Regression for Time Series
Basic idea: next value like last value (Lag1)
[Figure: fitted line plot, Comps vs Lag1Comp; R-Sq 76.1%, R-Sq(adj) 75.8%]

34 Auto-Regression for Time Series
Basic idea: next value like last value (Lag1)
Auto-regression:
$Y_t = \beta_0 + \beta_{lag1} Y_{t-1} + \varepsilon_t$
$Y_t = \beta_0 + \beta_{lag1} Y_{t-1} + \beta_{lag4} Y_{t-4} + \varepsilon_t$
[Data table columns: Year.Quarter, Comps, Lag1Comp, Lag4Comp; rows from 1978 Q1]
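Building the lagged columns and fitting the second equation can be sketched as below. The series is simulated (trend, seasonal swing and noise are assumptions), and `make_lags` is a hypothetical helper name, not a library function.

```python
import numpy as np


def make_lags(y, lags):
    """Align a series with its lagged copies.

    Returns the response y[t] and one column per lag k holding y[t - k],
    keeping only the rows for which every requested lag exists.
    """
    max_lag = max(lags)
    response = y[max_lag:]
    columns = [y[max_lag - k: len(y) - k] for k in lags]
    return response, np.column_stack(columns)


# Simulated quarterly series, NOT the completions data.
rng = np.random.default_rng(2)
n = 92                                            # 23 years of quarters
t = np.arange(n)
y = 3000 + 40 * t + 300 * np.sin(2 * np.pi * t / 4) + rng.normal(0, 100, n)

resp, lagged = make_lags(y, [1, 4])               # Lag1 and Lag4 columns
X = np.column_stack([np.ones(len(resp)), lagged])
beta, *_ = np.linalg.lstsq(X, resp, rcond=None)   # b0, b_lag1, b_lag4

# One-quarter-ahead forecast from the last value and the value 4 quarters back.
forecast = beta[0] + beta[1] * y[-1] + beta[2] * y[-4]
print(beta, forecast)
```

The first four quarters are lost when the lag-4 column is built, so the regression uses 88 of the 92 simulated rows.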

35 Using two lagged variables
Regression Analysis: Comps versus Lag1Comp, Lag4Comp
The regression equation has the form Comps = $\hat\beta_0 + \hat\beta_{lag1}$ Lag1Comp $+ \hat\beta_{lag4}$ Lag4Comp
S = [...]
Comp Q = 1287, Comp Q = [...]
95% Pred Int Comp Q = [...] ± 2(780.7) = (100, 145)

36 Using Lagged Variables
Basic idea: the current quarter is like
- the previous quarter
- the same quarter last year
[Figure: matrix plot of Completions, Lag1Comp, Lag4Comp]

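37 Comparison: Forecasting models for Comps
Modelling options
1. Parallel linear regressions (linear in time, quarter indicators):
$Y_t = \beta_1 Q_1 + \beta_2 Q_2 + \beta_3 Q_3 + \beta_4 Q_4 + \beta_t t + \varepsilon_t$
2. Seasonal auto-regression (Lag 1 and Lag 4):
$Y_t = \beta_0 + \beta_{lag1} Y_{t-1} + \beta_{lag4} Y_{t-4} + \varepsilon_t$
Points of comparison
- More efficient for prediction
- Fewer modelling assumptions
- Different modelling strategy
[Table: forecasts by quarter under "Lin in time + Q inds" and "Lag 1 and Lag 4"; entries shown as "?"]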

38 Model Criticism
Criticism
- Does it make sense?
- Are there outliers?
Choice amongst alternatives
- $R^2$
- SE

39 Extra: Logs, lags and differences
Financial data: IBM share price
- Natural language: %age change
- MINITAB language: logs

40 Financial Series: IBM Prices, daily
Simple regression on time

41 Log IBM Prices
[Figure: Log(Y_t) vs t; fitted line plot, Logprice vs t, with regression line and 95% PI; R-Sq 94.4%, R-Sq(adj) 94.4%]
[Figure: Log(Y_t) vs Log(Y_{t-1}); fitted line plot, Logprice vs lag1logprice, with regression line and 95% PI; R-Sq 99.8%, R-Sq(adj) 99.8%]

42 Modelled in Log Scale, presented in original units
[Figure: Log(Y_t) vs t; fit of log10(price) on t, plotted as price vs t, with regression line and 95% PI; R-Sq 94.4%, R-Sq(adj) 94.4%]
[Figure: Log(Y_t) vs Log(Y_{t-1}); fit of log10(price) on log10(lag1price), plotted as price vs lag1price, with regression line and 95% PI; R-Sq 99.8%, R-Sq(adj) 99.8%]

43 Differences / Ratios
First differences: Today - Yesterday
Seasonal differences: This Q - same Q last year
Ratio: Y(t) / Y(t-1)
Rate of return: 100 × (Y(t) - Y(t-1)) / Y(t-1) = 100 × (Ratio - 1)
Log(Ratio) = Log(Y(t)) - Log(Y(t-1))
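These derived variables can be computed directly from a series. The sketch below uses a short made-up price series (not the IBM data) and base-10 logs, which matches the log10 scale used on the plots; the choice of base only rescales the log ratio.

```python
import numpy as np

# Hypothetical prices for illustration only; NOT the IBM series.
price = np.array([100.0, 101.0, 99.0, 102.0, 103.0, 101.0])

first_diff = price[1:] - price[:-1]         # Y(t) - Y(t-1)
seasonal_diff = price[4:] - price[:-4]      # Y(t) - Y(t-4), e.g. same quarter last year
ratio = price[1:] / price[:-1]              # Y(t) / Y(t-1)
rate_of_return = 100 * (ratio - 1)          # percentage change
log_ratio = np.log10(price[1:]) - np.log10(price[:-1])  # first difference of logs

print(first_diff)
print(seasonal_diff)
print(rate_of_return)
print(log_ratio)
```

The first difference of the logged series is exactly the log of the ratio, which is why differencing logs and working with rates of return describe the same day-to-day change on different scales.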

44 Financial Series: IBM Prices, daily
Simple regression of daily differences vs time
[Figure: fitted line plot, Lag1diff vs t, with regression line and 95% PI; R-Sq 0.1%, R-Sq(adj) 0.0%]

45 Financial Series: IBM Prices, daily
Simple regression of first differences of LogPrice vs time
[Figure: fitted line plot, Lag1difflog vs t, with regression line and 95% PI; R-Sq 0.0%, R-Sq(adj) 0.0%]

46 Financial Series: IBM Prices, daily
[Figure: fitted line plot, Lag1difflog vs t, with regression line and 95% PI; R-Sq 0.0%, R-Sq(adj) 0.0%]
Interpretation
$\log P_t - \log P_{t-1} = \beta_0 + 0 \times \text{time} + \varepsilon_t$
$\log (P_t / P_{t-1})$ in ([...], [...]), or $P_t / P_{t-1}$ in $(10^{[...]}, 10^{[...]})$, or in (0.96, [...])
In summary: rate of return 0.1% per day ± 4%

47 Financial Series
Day-to-day changes are most naturally expressed as % change
price tomorrow = price today × small change
Log(price t+1) = Log(price t) + Log(small change)
Average drift per day (for logs) is [...], i.e. about 0.1% growth per day = 61% per annum
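The step from a daily drift to an annual figure is compounding, which becomes simple addition on the log scale. The sketch below shows the mechanics only. The function names are hypothetical, and the number of days per year is an assumption the caller must supply. Because the exact drift value is not available above, the example does not attempt to reproduce the 61% figure.

```python
import math


def annual_growth_from_daily_rate(daily_rate, days_per_year):
    """Compound a constant daily growth rate (0.001 means 0.1%) over a year."""
    return (1.0 + daily_rate) ** days_per_year - 1.0


def annual_growth_from_log10_drift(drift_per_day, days_per_year):
    """Same calculation starting from the average daily drift in log10(price)."""
    return 10.0 ** (drift_per_day * days_per_year) - 1.0


# Illustrative inputs: exactly 0.1% per day over an assumed 365 days.
rate = 0.001
drift = math.log10(1.0 + rate)   # the equivalent drift on the log10 scale

print(annual_growth_from_daily_rate(rate, 365))
print(annual_growth_from_log10_drift(drift, 365))
```

Both routes give the same answer, since adding the daily log drift 365 times is the same as multiplying the daily growth factor 365 times. The result is sensitive to small changes in the daily rate and to the day count used.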

48 Financial Series
Confidence in future prediction
[Table columns: Day, pt est, hi, lo, expressed as growth factors; e.g. initial capital 1000]
61% per annum??

49 Derived Variables
Why use derived variables?
- Adding extra variables gives more options
Challenge
- Is there a cost?
- Which is best?
Scientific insight can give powerful & simple analysis


Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Introduction to Time Series Regression and Forecasting (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Introduction to Time Series Regression

More information

STK4900/ Lecture 3. Program

STK4900/ Lecture 3. Program STK4900/9900 - Lecture 3 Program 1. Multiple regression: Data structure and basic questions 2. The multiple linear regression model 3. Categorical predictors 4. Planned experiments and observational studies

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Data Set 8: Laysan Finch Beak Widths

Data Set 8: Laysan Finch Beak Widths Data Set 8: Finch Beak Widths Statistical Setting This handout describes an analysis of covariance (ANCOVA) involving one categorical independent variable (with only two levels) and one quantitative covariate.

More information

Lecture 1 Linear Regression with One Predictor Variable.p2

Lecture 1 Linear Regression with One Predictor Variable.p2 Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of

More information

STA 101 Final Review

STA 101 Final Review STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem

More information

Multiple Regression an Introduction. Stat 511 Chap 9

Multiple Regression an Introduction. Stat 511 Chap 9 Multiple Regression an Introduction Stat 511 Chap 9 1 case studies meadowfoam flowers brain size of mammals 2 case study 1: meadowfoam flowering designed experiment carried out in a growth chamber general

More information

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

More information

Simple Linear Regression: One Qualitative IV

Simple Linear Regression: One Qualitative IV Simple Linear Regression: One Qualitative IV 1. Purpose As noted before regression is used both to explain and predict variation in DVs, and adding to the equation categorical variables extends regression

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Stat 231 Final Exam. Consider first only the measurements made on housing number 1.

Stat 231 Final Exam. Consider first only the measurements made on housing number 1. December 16, 1997 Stat 231 Final Exam Professor Vardeman 1. The first page of printout attached to this exam summarizes some data (collected by a student group) on the diameters of holes bored in certain

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

The ARIMA Procedure: The ARIMA Procedure

The ARIMA Procedure: The ARIMA Procedure Page 1 of 120 Overview: ARIMA Procedure Getting Started: ARIMA Procedure The Three Stages of ARIMA Modeling Identification Stage Estimation and Diagnostic Checking Stage Forecasting Stage Using ARIMA Procedure

More information

Orthogonal contrasts for a 2x2 factorial design Example p130

Orthogonal contrasts for a 2x2 factorial design Example p130 Week 9: Orthogonal comparisons for a 2x2 factorial design. The general two-factor factorial arrangement. Interaction and additivity. ANOVA summary table, tests, CIs. Planned/post-hoc comparisons for the

More information

FinQuiz Notes

FinQuiz Notes Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression

More information

Topic 14: Inference in Multiple Regression

Topic 14: Inference in Multiple Regression Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference

More information

Examination paper for TMA4255 Applied statistics

Examination paper for TMA4255 Applied statistics Department of Mathematical Sciences Examination paper for TMA4255 Applied statistics Academic contact during examination: Anna Marie Holand Phone: 951 38 038 Examination date: 16 May 2015 Examination time

More information

Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model

Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model A1: There is a linear relationship between X and Y. A2: The error terms (and

More information

LEARNING WITH MINITAB Chapter 12 SESSION FIVE: DESIGNING AN EXPERIMENT

LEARNING WITH MINITAB Chapter 12 SESSION FIVE: DESIGNING AN EXPERIMENT LEARNING WITH MINITAB Chapter 12 SESSION FIVE: DESIGNING AN EXPERIMENT Laura M Williams, RN, CLNC, MSN MOREHEAD STATE UNIVERSITY IET603: STATISTICAL QUALITY ASSURANCE IN SCIENCE AND TECHNOLOGY DR. AHMAD

More information

O2. The following printout concerns a best subsets regression. Questions follow.

O2. The following printout concerns a best subsets regression. Questions follow. STAT-UB.0103 Exam 01.APIL.11 OVAL Version Solutions O1. Frank Tanner is the lab manager at BioVigor, a firm that runs studies for agricultural food supplements. He has been asked to design a protocol for

More information

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression

More information

[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by

[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Final June 2004 3 hours 7 Instructors Course Examiner Marks Y.P. Chaubey

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Interpreting the coefficients

Interpreting the coefficients Lecture Week 5 Multiple Linear Regression Interpreting the coefficients Uses of Multiple Regression Predict for specified new x-vars Predict in time. Focus on one parameter Use regression to adjust variation

More information

MBA Statistics COURSE #4

MBA Statistics COURSE #4 MBA Statistics 51-651-00 COURSE #4 Simple and multiple linear regression What should be the sales of ice cream? Example: Before beginning building a movie theater, one must estimate the daily number of

More information

III. Inferential Tools

III. Inferential Tools III. Inferential Tools A. Introduction to Bat Echolocation Data (10.1.1) 1. Q: Do echolocating bats expend more enery than non-echolocating bats and birds, after accounting for mass? 2. Strategy: (i) Explore

More information

Chapter 7 Student Lecture Notes 7-1

Chapter 7 Student Lecture Notes 7-1 Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model

More information

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes Lecture 2.1 Basic Linear LDA 1 Outline Linear OLS Models vs: Linear Marginal Models Linear Conditional Models Random Intercepts Random Intercepts & Slopes Cond l & Marginal Connections Empirical Bayes

More information

STAT 360-Linear Models

STAT 360-Linear Models STAT 360-Linear Models Instructor: Yogendra P. Chaubey Sample Test Questions Fall 004 Note: The following questions are from previous tests and exams. The final exam will be for three hours and will contain

More information

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1 5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure

More information

A Second Course in Statistics: Regression Analysis

A Second Course in Statistics: Regression Analysis FIFTH E D I T I 0 N A Second Course in Statistics: Regression Analysis WILLIAM MENDENHALL University of Florida TERRY SINCICH University of South Florida PRENTICE HALL Upper Saddle River, New Jersey 07458

More information

10. Alternative case influence statistics

10. Alternative case influence statistics 10. Alternative case influence statistics a. Alternative to D i : dffits i (and others) b. Alternative to studres i : externally-studentized residual c. Suggestion: use whatever is convenient with the

More information