Unit 10: Simple Linear Regression and Correlation

1 Unit 10: Simple Linear Regression and Correlation. Statistics 571: Statistical Methods. Ramón V. León. 6/28/2004.

2 Introductory Remarks. Regression analysis is a method for studying the relationship between two or more numerical variables. In regression analysis one of the variables is regarded as a response, outcome, or dependent variable; the other variables are regarded as predictor, explanatory, or independent variables. An empirical model approximately captures the main features of the relationship between the response variable and the key predictor variables. Sometimes it is not clear which of two variables should be the response; in this case correlation analysis is used. In this unit we consider only two variables.

3 A Probabilistic Model for Simple Linear Regression. Note that the Y_i are independent N(µ_i = β0 + β1·x_i, σ²) r.v.'s, with constant variability about the regression line. Notice that the predictor variable is regarded as nonrandom because it is assumed to be set by the investigator.

4 A Probabilistic Model for Simple Linear Regression. Let x_1, x_2, ..., x_n be specific settings of the predictor variable, and let y_1, y_2, ..., y_n be the corresponding values of the response variable. Assume that y_i is the observed value of a random variable (r.v.) Y_i, which depends on x_i according to the model Y_i = β0 + β1·x_i + ε_i (i = 1, 2, ..., n). Here ε_i is the random error with E(ε_i) = 0 and Var(ε_i) = σ². Thus E(Y_i) = µ_i = β0 + β1·x_i (the true regression line). We assume that the ε_i are independent, identically distributed (i.i.d.) r.v.'s, and we also usually assume that the ε_i are normally distributed.
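
A short simulation can make this model concrete. The sketch below is only illustrative: the parameter values and x settings are hypothetical choices, not the tire data of Example 10.1.

```python
# Minimal sketch: simulate data from Y_i = beta0 + beta1*x_i + eps_i, eps_i ~ N(0, sigma^2).
# All numbers here are hypothetical, not the slide's data.
import numpy as np

rng = np.random.default_rng(571)
beta0, beta1, sigma = 360.0, -7.0, 20.0        # hypothetical true parameters
x = np.arange(4, 40, 4, dtype=float)           # fixed settings chosen by the investigator
eps = rng.normal(loc=0.0, scale=sigma, size=x.size)
y = beta0 + beta1 * x + eps                    # observed responses
print(np.column_stack([x, y]))
```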

5 Reflections on the Method of Tire Data Collection: Experimental Design. (Data table: Mileage in 1000 miles, Depth in mils.) Was one tire or nine used in the experiment? (Read the book!) What effect would this answer have on the experimental error? On our ability to generalize to all tires of this brand? Was the data collected on one or several cars? Does it matter? Was the data collected in random order or in the order given? Does it matter? Is there a possible confounding problem?

6 Example 10.1: Tread Wear vs. Mileage. Using the Fit Y by X platform in the Analyze menu.

7 Scatter Plot: Tread Wear vs. Mileage

8 Least Squares Line Fit: ŷ = β̂0 + β̂1·x

9 Illustration: Least Squares Line Fit. The least squares line fit is the line that minimizes the sum of the squares of the lengths of the vertical segments.

10 Mathematics of Least Squares. Find the line, i.e., the values of β0 and β1, that minimizes the sum of squared deviations Q = Σ_{i=1}^{n} [y_i − (β0 + β1·x_i)]². How? Solve for the values of β0 and β1 for which ∂Q/∂β0 = 0 and ∂Q/∂β1 = 0.
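
Setting the two partial derivatives to zero gives the familiar closed-form solution β̂1 = S_xy/S_xx (with S_xy = Σ(x_i − x̄)(y_i − ȳ)) and β̂0 = ȳ − β̂1·x̄. The sketch below computes these directly; the data are hypothetical, not the tire data.

```python
# Minimal sketch of the least-squares estimates from the normal equations (hypothetical data).
import numpy as np

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
y = np.array([339.0, 330.0, 298.0, 275.0, 240.0, 231.0, 200.0, 160.0, 130.0])

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))

beta1_hat = Sxy / Sxx                         # from dQ/dbeta1 = 0
beta0_hat = y.mean() - beta1_hat * x.mean()   # from dQ/dbeta0 = 0
print(beta0_hat, beta1_hat)
```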

11 Predict Mean Groove Depth for a Given Mileage. For x = 25 (25,000 miles) the predicted y is ŷ = β̂0 + β̂1·(25) mils. In JMP, add a row to the table and leave the y value blank.

12 Goodness of Fit of the LS Line. Fitted values of y: ŷ_i = β̂0 + β̂1·x_i, i = 1, 2, ..., n. Residuals: e_i = y_i − ŷ_i = y_i − (β̂0 + β̂1·x_i) = segment length.

13 Sums of Squares. Error Sum of Squares: SSE = Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (y_i − ŷ_i)². Total Sum of Squares: SST = Σ_{i=1}^{n} (y_i − ȳ)². Regression Sum of Squares: SSR = Σ_{i=1}^{n} (ŷ_i − ȳ)². These satisfy Σ_{i=1}^{n} (y_i − ȳ)² = Σ_{i=1}^{n} (ŷ_i − ȳ)² + Σ_{i=1}^{n} (y_i − ŷ_i)², i.e., SST = SSR + SSE. These definitions apply even if the model is more complex than linear in x, for example, if it is quadratic in x. SST is interpreted as the total variation in y. JMP calls SSR the Model Sum of Squares.
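
The decomposition, along with r² (slide 15) and s² (slide 16), is easy to verify numerically. A minimal sketch, reusing the hypothetical data from the earlier least-squares sketch:

```python
# Sketch verifying SST = SSR + SSE and computing r^2 and s^2 (hypothetical data).
import numpy as np

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
y = np.array([339.0, 330.0, 298.0, 275.0, 240.0, 231.0, 200.0, 160.0, 130.0])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x                       # fitted values
e = y - y_hat                             # residuals

SSE = np.sum(e ** 2)
SSR = np.sum((y_hat - y.mean()) ** 2)
SST = np.sum((y - y.mean()) ** 2)
print(np.isclose(SST, SSR + SSE))         # True: the decomposition holds
print(SSR / SST)                          # coefficient of determination r^2
print(SSE / (len(x) - 2))                 # s^2, the estimate of sigma^2 with n-2 df
```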

14 Illustration of Sums of Squares. (Figure: for each point, the deviation y_i − ȳ is split into ŷ_i − ȳ and y_i − ŷ_i.)

15 Coefficient of Determination. SST = SSR + SSE, and r² = SSR/SST = 1 − SSE/SST = proportion of the variation in y that is accounted for by regression on x.

16 Estimation of σ². s² = Σ_{i=1}^{n} e_i² / (n − 2) = Σ_{i=1}^{n} (y_i − ŷ_i)² / (n − 2) = Σ_{i=1}^{n} [y_i − (β̂0 + β̂1·x_i)]² / (n − 2). This estimate has n − 2 degrees of freedom because two unknown parameters are estimated.

17 Statistical Inference for β0 and β1. (β̂0 − β0)/SD(β̂0) ~ N(0, 1) and (β̂1 − β1)/SD(β̂1) ~ N(0, 1); replacing the standard deviations by their estimates gives (β̂0 − β0)/SE(β̂0) ~ t_{n−2} and (β̂1 − β1)/SE(β̂1) ~ t_{n−2}. 100(1 − α)% CIs: β̂1 ± t_{n−2,α/2}·SE(β̂1), which for the tire data is −7.28 ± t_{7,0.025}·SE(β̂1), and β̂0 ± t_{n−2,α/2}·SE(β̂0). In JMP: 1. Right-click on the Parameter Estimates area. 2. Select the additional columns (e.g., confidence limits) to display.
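
The quantities on slides 17 through 19 can be computed from the standard formulas SE(β̂1) = s/√S_xx and SE(β̂0) = s·√(1/n + x̄²/S_xx). A minimal sketch with hypothetical data (so the numbers will not match the tire example):

```python
# Sketch of CIs and the t test for the slope; t^2 equals the ANOVA F (slide 19).
import numpy as np
from scipy import stats

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
y = np.array([339.0, 330.0, 298.0, 275.0, 240.0, 231.0, 200.0, 160.0, 130.0])
n = len(x)

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e ** 2) / (n - 2))            # estimate of sigma

se_b1 = s / np.sqrt(Sxx)
se_b0 = s * np.sqrt(1.0 / n + x.mean() ** 2 / Sxx)

t_crit = stats.t.ppf(0.975, df=n - 2)            # t_{n-2, 0.025}
print((b1 - t_crit * se_b1, b1 + t_crit * se_b1))  # 95% CI for beta1

t_stat = b1 / se_b1                              # test of H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-sided p-value
print(t_stat, p_value, t_stat ** 2)              # t^2 is the F statistic
```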

18 Test of Hypothesis for β0 and β1. H0: β1 = 0 vs. H1: β1 ≠ 0. Reject H0 at level α if |t| = |β̂1|/SE(β̂1) > t_{n−2,α/2}. Tire example: with β̂1 = −7.28, |t| = |β̂1|/SE(β̂1) far exceeds t_{7,.025}, so there is strong evidence that mileage affects tread depth (two-sided p-value).

19 Analysis of Variance. The Mean Square (MS) is the Sum of Squares divided by its degrees of freedom, e.g. MSE = SSE/df = SSE/7 for the tire data. For H0: β1 = 0 vs. H1: β1 ≠ 0 the test statistic is F = MSR/MSE, and in simple linear regression F = t². The F = t² relationship holds only when the F numerator has one degree of freedom.

20 Example: Prediction of a Future y or Its Mean. For a fixed value of x*, are we trying to predict the average value of y, or the value of a future observation of y? Do I want to predict the average selling price of all 4,000-square-foot houses in my neighborhood, or do I want to predict the particular future selling price of my 4,000-square-foot house? Which prediction is subject to more error?

21 Prediction of a Future y or Its Mean: Prediction Interval or Confidence Interval. For a fixed value of x*, are we trying to predict the average value of y, or the value of a future observation of y?

22 JMP: Prediction of the Mean of y. (Plot: Groove Depth (in mils) vs. Mileage (in 1000 miles).)

23 JMP: Prediction of a Future Value of y. (Plot: Groove Depth (in mils) vs. Mileage (in 1000 miles).)

24 Formulas for Confidence and Prediction Intervals. With S_xx = Σ_{i=1}^{n} (x_i − x̄)², the 100(1 − α)% confidence interval for the mean of Y at x* is ŷ* ± t_{n−2,α/2}·s·√(1/n + (x* − x̄)²/S_xx), and the prediction interval for a future observation Y* is ŷ* ± t_{n−2,α/2}·s·√(1 + 1/n + (x* − x̄)²/S_xx). Compare with the prediction interval of Chapter 7.
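
A minimal sketch of both intervals at x* = 25, using the standard formulas above with hypothetical data (the endpoints will not match the slide's tire-data values):

```python
# Sketch: confidence interval for the mean of Y and prediction interval for a
# future Y at x* = 25 (hypothetical data).
import numpy as np
from scipy import stats

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
y = np.array([339.0, 330.0, 298.0, 275.0, 240.0, 231.0, 200.0, 160.0, 130.0])
n, x_star = len(x), 25.0

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
y_star = b0 + b1 * x_star                      # same point estimate for both intervals

t_crit = stats.t.ppf(0.975, df=n - 2)
half_ci = t_crit * s * np.sqrt(1.0 / n + (x_star - x.mean()) ** 2 / Sxx)
half_pi = t_crit * s * np.sqrt(1.0 + 1.0 / n + (x_star - x.mean()) ** 2 / Sxx)
print("CI for mean:", (y_star - half_ci, y_star + half_ci))
print("PI for future y:", (y_star - half_pi, y_star + half_pi))   # always wider
```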

25 Confidence and Prediction Intervals with JMP. Using the Fit Model platform, not the Fit Y by X platform.

26 CI for the Mean µ*

27 Prediction Interval for Y*

28 Prediction for the Mean of Y or a Future Observation of Y. The point estimate is the same in both cases, but the error bands are different: narrower for the mean of Y, [158.73, ], and wider for a future value of Y, [129.44, ].

29 Calibration (Inverse Regression)

30 Inverse Regression in JMP 5: Confidence Interval. In the Fit Model platform, with the indicated option unchecked.

31 Inverse Regression: Two Types of Confidence Intervals. Calibration is usually used in measurement: given that you got a certain measurement, what is the true value of the measured quantity? Which inverse prediction interval makes more sense?

32 Measurement Application. Suppose you get a measurement of 6.50 on an object. What is a calibration interval for its true value? The data were supposedly obtained by measuring standards of known true value with a calibrated instrument. (How the data were really generated.)
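
As a concrete illustration of inverse prediction, the point estimate simply inverts the fitted calibration line: x̂ = (y* − β̂0)/β̂1. The sketch below uses made-up standards and a made-up reading of 6.50; it is not the slide's data, and it gives only the point estimate, not the calibration interval.

```python
# Sketch of inverse prediction (calibration): invert the fitted line at an observed
# reading y*. Data and the reading are hypothetical.
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # known true values of the standards
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # corresponding instrument readings

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_star = 6.50                              # new reading on an object
x_hat = (y_star - b0) / b1                 # point estimate of its true value
print(x_hat)
```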

33 Residuals. The residuals are the differences between the observed values of y and the predicted values of y; for example, at x = 32 the predicted value is ŷ = β̂0 + β̂1·(32) mils.

34 Regression Diagnostics. The residuals e_i = y_i − ŷ_i can be viewed as what is "left over" after the model is fitted. If the model is correct, then the e_i can be viewed as "estimates" of the random errors ε_i. As such, the residuals are vital for checking the model assumptions. Residual plots: if the model is correct, the residuals should have no structure, that is, they should look random on their plots.

35 Mathematics of Residuals. If the assumed regression model is correct, then the e_i are normally distributed with E(e_i) = 0 and Var(e_i) = σ²·[1 − 1/n − (x_i − x̄)²/S_xx] ≈ σ² if n is large, where S_xx = Σ_{i=1}^{n} (x_i − x̄)². The e_i are not independent (even though the ε_i are independent) because they satisfy the constraints Σ_{i=1}^{n} e_i = 0 and Σ_{i=1}^{n} x_i·e_i = 0. However, the dependence introduced by these constraints is negligible if n is modestly large, say greater than 20 or so.

36 Basic Principle Underlying Residual Plots. If the assumed model is correct, then the residuals should be randomly scattered around 0 and should show no obvious systematic pattern.

37 Residuals in JMP. In the Fit Y by X platform, use the red triangle menu next to Linear Fit; residuals are also available in the Fit Model platform.

38 Checking for Linearity. Should E(Y) = β0 + β1·x + β2·x² be fitted rather than E(Y) = β0 + β1·x? (Plots: Groove Depth (in mils) and residuals vs. Mileage (in 1000 miles).)

39 Predicted by Residual Plot. (Plots: residual Groove Depth (in mils) vs. predicted Groove Depth (in mils), and residual Groove Depth vs. Mileage (in 1000 miles).) What are the advantages of this plot?

40 Tread Wear Quadratic Model

41 Residuals for Quadratic Model. The residuals have a random pattern. The Fit Y by X platform was used to get these plots; the left plot can also be obtained using the Plot Residuals option.

42 Tread Wear Quadratic Model. R² was .953 for the model linear in x.

43 Checking for Constant Variance: Plot of Predicted vs. Residual. If the assumption is incorrect, often Var(Y) is some function of E(Y) = µ.

44 Other Model Diagnostics Based on Residuals. Checking for normality of errors: do a normal plot of the residuals. Warning: don't plot the response y on a normal plot; this plot has no meaning when one has regressors. Don't transform the data on the basis of such a plot. Many students make this mistake in their project; don't be one of them.

45 Other Model Diagnostics Based on Residuals (Fit Model platform). Checking for independence of errors: plot the residuals in time order; the order of data collection should always be recorded. Use the Durbin-Watson statistic to test for autocorrelation.

46 Other Model Diagnostics Based on Residuals. Checking for outliers: see if any standardized (Studentized) residual exceeds 2 (standard deviations) in absolute value, where e*_i = e_i/SE(e_i) = e_i/(s·√(1 − 1/n − (x_i − x̄)²/S_xx)), i = 1, 2, ..., n. Also do a box plot of the residuals to check for outliers.
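
A minimal sketch of this outlier check with hypothetical data; the standardization uses the residual variance formula from slide 35.

```python
# Sketch: flag observations whose standardized (Studentized) residual exceeds 2
# in absolute value (hypothetical data).
import numpy as np

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
y = np.array([339.0, 330.0, 298.0, 275.0, 240.0, 231.0, 200.0, 160.0, 130.0])
n = len(x)

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e ** 2) / (n - 2))

h = 1.0 / n + (x - x.mean()) ** 2 / Sxx     # leverages h_ii in simple regression
e_std = e / (s * np.sqrt(1.0 - h))          # standardized (Studentized) residuals
print(np.where(np.abs(e_std) > 2)[0])       # indices of potential outliers
```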

47 Standardized (Studentized) Residuals in JMP 4. Using the Fit Model platform with the linear model. Should we omit the first observation and refit the model? It is a possible outlier. The model fitted here is the one linear in x, not the quadratic model.

48 Checking for Influential Observations. An observation can be influential because it has an extreme x-value, an extreme y-value, or both. Note that the influential observations above are not outliers, since their residuals are quite small (even zero).

49 Checking for Influential Observations. ŷ_i = Σ_{j=1}^{n} h_ij·y_j, where the h_ij are some functions of the x's. H = [h_ij] is called the hat matrix. We can think of h_ij as the leverage exerted by y_j on the fitted value ŷ_i. If ŷ_i is largely determined by y_i, with very small contribution from the other y_j, then we say that the i-th observation is influential (also called high leverage).

50 Checking for Influential Observations. Since h_ii determines how much ŷ_i depends on y_i, it is used as a measure of the leverage of the i-th observation. Now Σ_{i=1}^{n} h_ii = k + 1, so the average of the h_ii is (k + 1)/n, where k is the number of predictor variables. Rule of thumb: regard any h_ii > 2(k + 1)/n as high leverage. In simple regression k = 1, so h_ii > 4/n is regarded as high leverage. Since e*_i = e_i/SE(e_i) = e_i/(s·√(1 − h_ii)), standardized (Studentized) residuals will be large relative to the raw residuals for high-leverage observations.
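
The leverages can also be read off the diagonal of the hat matrix. The slide does not give H explicitly; the sketch below uses the standard matrix form H = X(XᵀX)⁻¹Xᵀ with an intercept column, which reduces to the simple-regression formula above (data hypothetical).

```python
# Sketch: leverages h_ii from the hat matrix and the 2(k+1)/n rule of thumb
# (here k = 1, so the cutoff is 4/n). Data are hypothetical.
import numpy as np

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0])
n = len(x)
X = np.column_stack([np.ones(n), x])        # design matrix: intercept column plus x

H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
h = np.diag(H)                              # leverages h_ii
print(np.isclose(h.sum(), 2.0))             # sum of h_ii equals k + 1 = 2
print(np.where(h > 4.0 / n)[0])             # indices flagged as high leverage
```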

51 Graph of Anscombe Data. Observation in row 8.

52 Checking for Influential Observations: Anscombe Data Illustration (Fit Model platform). High leverage: h_ii > 4/n = 4/11 = 0.36. Notice that observation No. 8 is not an outlier, since e_8 = y_8 − ŷ_8 = 0; nevertheless, observation 8 is influential.

53 How to Deal with Outliers and Influential Observations. They can give a misleading picture of the relationship between y and x. Are they erroneous observations? If yes, they should be discarded. If the observations are valid, then they should be included in the analysis. Least squares is very sensitive to these observations, so do the analysis with and without the outlier or influential observation. Or use a robust method: minimize the sum of the absolute deviations, Σ_{i=1}^{n} |y_i − (β0 + β1·x_i)|. These estimates are analogous to the sample median; LS estimates are analogous to the sample mean.
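
One way to carry out the robust (least absolute deviations) fit mentioned above is with a general-purpose optimizer. This is only a sketch of the idea, with made-up data containing a deliberate gross outlier; dedicated quantile-regression routines would normally be preferred.

```python
# Sketch: least absolute deviations (LAD) fit by direct minimization
# (hypothetical data with one gross outlier in y).
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 30.0, 16.1])

def sad(beta):
    # sum of absolute deviations |y_i - (beta0 + beta1*x_i)|
    return np.sum(np.abs(y - (beta[0] + beta[1] * x)))

res = minimize(sad, x0=np.array([0.0, 1.0]), method="Nelder-Mead")
print(res.x)                                # LAD estimates of (beta0, beta1)
```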

54 Data Transformations. If the functional relationship between x and y is known, it may be possible to find a linearizing transformation analytically. For y = α·x^β, take logs on both sides: log y = log α + β·log x. For y = α·e^(β·x), take logs on both sides: log y = log α + β·x. If we are fitting an empirical model, then a linearizing transformation can be found by trial and error using the scatter plot as a guide.
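
For the exponential case, the fit is just a least-squares regression of log y on x followed by back-transforming the intercept. A minimal sketch with hypothetical data:

```python
# Sketch: fit y = alpha * exp(beta * x) by regressing log(y) on x (hypothetical data).
import numpy as np

x = np.array([4.0, 8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0])
y = np.array([330.0, 290.0, 255.0, 225.0, 200.0, 175.0, 155.0, 135.0])

log_y = np.log(y)
b1 = np.sum((x - x.mean()) * (log_y - log_y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = log_y.mean() - b1 * x.mean()

alpha_hat, beta_hat = np.exp(b0), b1       # back-transform the intercept
print(alpha_hat, beta_hat)                 # fitted curve: alpha_hat * exp(beta_hat * x)
```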

55 Scatter Plots and Linearizing Transformations

56 Tire Tread Wear vs. Mileage: Exponential Model. y = α·e^(β·x), so log y = log α + β·x.

57 Tire Tread Wear vs. Mileage: Exponential Model. The fitted model is of the form WEAR = e^(β̂0)·e^(−.0298·MILES).

58 Tire Tread Wear vs. Mileage: Exponential Model. For the linear model, R² = .953. The residual plot is still curved, mainly because observation 1 is an outlier.

59 Weights and Gas Mileages of Model Year Cars

60 Weights and Gas Mileages of Model Year Cars. Using the Fit Y by X platform. Use the transformation 100/y; the new variable has units of gallons per 100 miles. This transformation has the advantage of being physically interpretable. A spline fits the data locally; our prior curve fits have been global.

61 Weights and Gas Mileages of Model Year Cars

62 Variance Stabilizing Transformation. Suppose σ_Y ∝ µ^α. We want to find a transformation of y that yields a constant variance. Suppose the transformation is a power of the original data, say y* = y^λ. For example, take the square root of Poisson count data, since for data so distributed the mean and the variance are the same.

63 Box-Cox Transformation: Empirical Determination of the Best Transformation (Fit Model platform). y^(λ) = (y^λ − 1)/(λ·ỹ^(λ−1)) for λ ≠ 0 and y^(λ) = ỹ·ln y for λ = 0, where ỹ = (Π_{i=1}^{n} y_i)^(1/n) is the geometric mean.
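
A λ can also be chosen in code. The sketch below uses scipy's Box-Cox routine on made-up skewed data; note that scipy works with the unscaled form (y^λ − 1)/λ and chooses λ by maximum likelihood, which is intended to pick essentially the same λ as the geometric-mean-scaled version shown on the slide.

```python
# Sketch: pick a Box-Cox lambda empirically for a positive, right-skewed response
# (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.lognormal(mean=3.0, sigma=0.4, size=50)   # hypothetical skewed responses

y_transformed, lam_hat = stats.boxcox(y)          # lambda chosen by maximum likelihood
print(lam_hat)                                    # a value near 0 suggests a log transform
```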

64 Transformation in Tire Data. A better fit than the polynomial, and with one less parameter (3 versus 4). (Plot: transformed response vs. Mileage (in 1000 miles).)

65 Weights and Gas Mileages of Model Year Cars: Box-Cox Transformation. The reciprocal transformation is among the suggested transformations.

66 Correlation Analysis. Used when it is not clear which is the predictor variable and which is the response variable, and when both variables are random. Try the bivariate normal distribution as a probability model for the joint distribution of the two r.v.'s. Parameters: µ_X, µ_Y, σ_X, σ_Y and ρ = Corr(X, Y) = Cov(X, Y)/√(Var(X)·Var(Y)). The correlation ρ is a measure of association between the two random variables X and Y: zero correlation corresponds to no association, and a correlation of 1 or −1 represents perfect association.

67 Bivariate Normal Density Function

68 Statistical Inference on the Correlation Coefficient. R = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / √(Σ_{i=1}^{n} (X_i − X̄)²·Σ_{i=1}^{n} (Y_i − Ȳ)²). To test H0: ρ = 0 vs. H1: ρ ≠ 0, the test statistic is T = R·√(n − 2)/√(1 − R²) ~ t_{n−2}; this is equivalent to testing H0: β1 = 0 vs. H1: β1 ≠ 0. Approximate test statistics are available to test nonzero null hypotheses and to obtain confidence intervals for ρ (see textbook).
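
A minimal sketch of this test with simulated data; the last line cross-checks against scipy's built-in Pearson correlation test, which is based on the same statistic.

```python
# Sketch: test H0: rho = 0 using T = R*sqrt(n-2)/sqrt(1-R^2) ~ t_{n-2}
# (hypothetical paired data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.6 * x + rng.normal(scale=0.8, size=30)
n = len(x)

r = np.corrcoef(x, y)[0, 1]                    # sample correlation R
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(r, t_stat, p_value)
print(stats.pearsonr(x, y))                    # should agree with (r, p_value)
```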

69 Exercise

70 Exercise

71 Exercise. By default, a 95% bivariate normal density ellipse is imposed on each scatterplot. If the variables are bivariate normally distributed, this ellipse encloses approximately 95% of the points. The correlation of the variables is seen in the collapsing of the ellipse along the diagonal axis. If the ellipse is fairly round and is not diagonally oriented, the variables are uncorrelated.
