MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

Size: px
Start display at page:

Download "MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik"

Transcription

1 MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik

2 Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40, 1.19, 1.15, 0.98, 1.01, 1.11, 1.20, 1.26, 1.32, 1.43, y: 90.01, 89.05, 91.43, 93.74, 96.73, 94.45, 87.59, 91.77, 99.42, 93.65, 93.54, 92.52, 90.56, 89.54, 89.85, 90.39, 93.25, 93.41, 94.98, Rafa l Kulik 1

3 Oxygen Hydrocarbon level Rafa l Kulik 2

4 We want to describe the relationship between these two variables. We will use regression analysis. We will assume that our model is given by Y = β 0 + β 1 x + ɛ, where ɛ is a random error and β 0, β 1 are regression coefficients. The variable x is called regressor (predictor) variable and Y is called a dependent or response variable. It is assumed that E(ɛ) = 0, Var(ɛ) = σ 2. In particular, E(Y x) = β 0 + β 1 x. Rafa l Kulik 3

5 Suppose now that we have observations (x i, y i ) from our model, so y i = β 0 + β 1 x i + ɛ i, i = 1,..., n. Our aim is to find ˆβ 0, ˆβ 1, estimator of the unknown parameters β 0, β 1. consequently, we will find the estimated (fitted) regression line or the line of the best fit: ŷ = ˆβ 0 + ˆβ 1 x. This line is obtained using the method of least squares. Having the observations y i, i = 1,..., n, their deviation e i from the line β 0 + β 1 x are e i = (y i β 0 β i x i ), i = 1,..., n. So, L(β 0, β 1 ) := n e 2 i = n (y i β 0 β i x i ) 2. i=1 i=1 Rafa l Kulik 4

6 Consequently, and dl dβ 0 = 2 dl dβ 1 = 2 n (y i β 0 β i x i ) i=1 n (y i β 0 β i x i )x i. i=1 Solving dl dβ 0 = 0 and dl dβ 1 = 0 we obtain the least squares estimators of β 0 and β 1 : Rafa l Kulik 5

7 where Equivalently, S xx = ˆβ 1 = S xy S xx, S xy = ˆβ0 = ȳ ˆβ 1 x, n (x i x)(y i ȳ) i=1 n (x i x) 2 S yy = i=1 n (y i ȳ) 2 i=1 S xy = n i=1 x iy i 1 n ( n i=1 x i)( n i=1 y i) S xx = n i=1 x2 i 1 n ( n i=1 x i) 2 S yy = n i=1 y2 i 1 n ( n i=1 y i) 2 Rafa l Kulik 6

8 Example For hydrocarbon data we find x i = 23.92, y i = , x 2 i = , y 2 i = , x i y i = , so that S xy = , S xx = , S yy = Therefore, ˆβ0 = 74.28, ˆβ1 = Consequently, the fitted regression line is ŷ = x. Oxygen Hydrocarbon level Rafa l Kulik 7

9 Sample Correlation Coefficient. For data (x i, y i ) we define the sample correlation coefficient as R = S xy Sxx S yy (1) (R is defined only if S xx and S yy are positive). Example: For the hydrocarbon data the sample correlation coefficient is R = Rafa l Kulik 8

10 Properties of R R is unaffected by change of scale or origin. Adding constants to x does not change x x and multiplying x and y by constants changes numerator and denominator equally. R is symmetric in x and y. 1 R 1. If R = 1 (R = 1) then the observations (x i, y i ) all lie on a straight line with a positive (negative) slope. The sign of r reflects the trend of the points. Remember a high correlation coefficient does not necessarily imply any causal relationship between the two variables. Rafa l Kulik 9

11 Note x and y can have a very strong non-linear relationship and R 0. The graph shows plots of a vector x against y = (x x) 2 and z = (x x) 3. y z x x Correlation for the above data is and 0.93, respectively. Rafa l Kulik 10

12 R= R= R= R= 0.78 Rafa l Kulik 11

13 Estimating σ 2 Recall that Var(ɛ) = σ 2. To estimate it we use sum of squares of residuals: n n SS E = e 2 i = (y i ŷ i ) 2. The estimator ˆσ 2 of σ 2 is given by i=1 i=1 ˆσ 2 = SS E n 2 = S yy ˆβ 1 S xy. (2) n 2 For the oxygen data we have ˆσ 2 = Rafa l Kulik 12

14 Properties of the LSE Recall that we consider the model Y = β 0 + β 1 x + ɛ, where E(ɛ) = 0, Var(ɛ) = σ 2. Thus, given x, Y is a random variable with mean β 0 + β 1 x and variance σ 2. Note that ˆβ 0 and ˆβ 1 depend on the observed y s, which are realizations of the random variable Y. Consequently, the estimators are random variables. We have E( ˆβ 1 ) = β 1, Var( ˆβ 1 ) = σ2 S xx, E( ˆβ 0 ) = β 0, Var( ˆβ 0 ) = σ 2 [ ] 1 n + x2 S xx ( ˆβ 0 and ˆβ 1 are the unbiased estimator of β 0 and β 1, respectively). Consequently, the estimated standard errors are [ ] se( ˆβ ˆσ 1 ) = 2, se( S ˆβ 1 0 ) = ˆσ 2 xx n + x2 S xx Rafa l Kulik 13

15 Hypothesis tests in linear regression We want to test that the slope equals β 1,0, i.e. H 0 : β 1 = β 1,0, H 1 : β 1 β 1,0. Since ˆβ 1 is approximately normal N (β 1, σ2 S xx ), we may use statistics: ˆβ 1 β 1,0 σ2 /S xx N (0, 1). However, σ 2 is not known, thus we plug-in its estimator ˆσ 2 given in (2): T 0 = ˆβ 1 β 1,0 ˆσ2 /S xx t n 2. If now t 0 is the observed value, we reject H 0 if t 0 > t α/2,n 2. Rafa l Kulik 14

16 Significance of regression For each bivariate data set we may fit a regression line, which aims to describe a linear relationship between x and Y. However, does it always make sense? x y Of course, we may fit the regression line, ŷ = x, but this line does not describe at all the bivariate data set. Rafa l Kulik 15

17 Having computed the regression line, we want to test whether this line is significant. The test for significance of regression is given by H 0 : β 1 = 0, H 1 : β 1 0. If we reject H 0, then there is a linear relationship between x and Y. Example: Hydrocarbon data - ˆβ 1 = 14.95, n = 20, S xx = 0.68, ˆσ 2 = Consequently, t 0 = ˆβ 1 0 ˆσ2 /S xx = > 2.88 = t 0.005,18. We reject H 0 - there is a linear relationship between x and Y. Rafa l Kulik 16

18 Confidence intervals - slope and intercept The 100(1 α)% confidence intervals for β 1 and β 0 are: ˆβ 0 t α/2,n 2 ˆσ ˆβ 1 t 2 α/2,n 2 β 1 S ˆβ 1 + t α/2,n 2 xx ˆσ 2 S xx [ ] [ ] 1 ˆσ 2 n + x2 β 0 S ˆβ t α/2,n 2 ˆσ 2 xx n + x2 S xx Example: Hydrocarbon data β Rafa l Kulik 17

19 Confidence intervals - mean response We want to estimate µ Y x0 = E(Y x 0 ) - the mean response at x 0 (typically at the observed x 0 ). Of course, it can be read exactly from the regression line ˆµ Y x0 = ˆβ 0 + ˆβ 1 x 0. The distance (at x 0 ) between the estimated and the true regression line is ˆµ Y x0 µ Y x0 = Now, E(ˆµ Y x0 ) = µ Y x0 and ( ˆβ0 β 0 ) + ( ) ˆβ1 β 1 x 0. [ 1 Var(ˆµ Y x0 ) = σ 2 n + (x 0 x) 2 ]. S xx Rafa l Kulik 18

20 Note that Var(ˆµ Y x0 ) = Var( ˆβ 0 + ˆβ 1 x 0 ) Var( ˆβ 0 ) + Var( ˆβ 1 x 0 ) since ˆβ 0 and ˆβ 1 are dependent. is The confidence interval for µ Y x0 (the mean response (regression line)) ˆµ Y x0 ± t α/2,n 2 [ 1 ˆσ 2 n + (x ] 0 x) 2. S xx Example: Hydrocarbon data - ˆµ Y x0 ± [ (x ] ) Rafa l Kulik 19

21 y x A lot of observations are outside the confidence interval (small sample (???)). Rafa l Kulik 20

22 Prediction of new observations If x 0 is the value of the regressor variable of interest, then the estimated value of the response variable Y is ŷ = Ŷ0 = ˆβ 0 + ˆβ 1 x 0. If Y 0 is the true future observation at x = x 0 (so, Y 0 = β 0 + β 1 x 0 + ɛ) and Ŷ 0 is the predicted value, given by the above equation, then the prediction error eˆp = Y 0 Ŷ0 = β 0 + β 1 x 0 + ɛ ( ˆβ 0 + ˆβ 1 x 0 ) = (β 0 ˆβ 0 ) + (β 1 ˆβ 1 )x 0 + ɛ Rafa l Kulik 21

23 is normally distributed N (0, σ [ n + (x 0 x) 2 ]). S xx Now, we plug-in the estimator of σ to get the following confidence interval for Y 0 : [ ŷ 0 ± t α/2,n 2 ˆσ n + (x ] 0 x) 2, S xx where ŷ 0 = ˆβ 0 + ˆβ 1 x 0 Rafa l Kulik 22

24 Residuals e i = y i ŷ i, where y i, i = 1,..., n are the observed value and ŷ i, i = 1,..., n are the values obtained from the regression line, i.e. ŷ i = ˆβ 0 + ˆβ 1 x i. lm(y ~ x)$res Index Rafa l Kulik 23

25 Analysis - summary 1. Draw scatterplot 2. Find the regression line 3. Check the appropriateness of a linear fit (correlation coefficient, significance of regression test) 4. Check goodness-of-fit (confidence interval for the regression line) 5. Check model assumptions (residuals) 6. Do prediction, if appropriate Rafa l Kulik 24

26 Set #1 This data set contains statistics, in arrests per 100,000 residents for assault, murder, and rape in each of the 50 US states in x-number of murders, y - number of assaults x y Rafa l Kulik 25

27 2. n n x i = 389.4, y i = 8538 i=1 n x 2 i = , i=1 Thus, ŷ = x i=1 n yi 2 = , i=1 i=1 n x i y i = y x Rafa l Kulik 26

28 3. Correlation: R = Significance of regression: H 0 : β 1 = 0, H 1 : β 1 0. Test statistics T 0 = ˆβ 1 0 t n 2.. We have ˆσ ˆσ 2 = , S xx = /S xx The observed value: t 0 = 9.30, t 0.05/2, reject H 0 - there is a linear relationship between x and y. Rafa l Kulik 27

29 4. Confidence interval for the regression line x y Rafa l Kulik 28

30 Index lm(y ~ x)$res Rafa l Kulik 29

31 6. Predict number of assaults if number of murders is x 0 = 20: ŷ 0 = = , Equivalent statement: Give a point estimate of number of assaults if number of murders is... Rafa l Kulik 30

32 Compute prediction interval ŷ 0 ± t α/2,n ± ± [ ˆσ n + (x ] 0 x) 2 S xx [ ] ( ) What change in mean number of assaults would be expected for a change of 1 in the number of assaults? - compute slope Rafa l Kulik 31

33 Set #2 The classic airline data. Monthly totals of international airline passengers, 1949 to x-time, x = (1, 2,..., 144). y x Rafa l Kulik 32

34 2. n n x i = 10440, y i = i=1 i=1 n n x 2 i = , yi 2 = , i=1 i=1 i=1 Thus, ŷ = x n x i y i = y x Rafa l Kulik 33

35 3. Correlation: R = Significance of regression: H 0 : β 1 = 0, H 1 : β 1 0. Test statistics T 0 = ˆβ 1 0 t n 2.. We have ˆσ ˆσ 2 = , S xx = /S xx The observed value: t 0 = , t 0.05/2, reject H 0 - there is a linear relationship between x and y. Rafa l Kulik 34

36 4. Confidence interval for the reg. line ˆµ Y x0 ± t α/2,n 2 ˆσ 2 [ 1 n + (x 0 x) 2 S xx ]. y x Rafa l Kulik 35

37 lm(y ~ x)$res Index Rafa l Kulik 36

38 y x Rafa l Kulik 37

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Simple and Multiple Linear Regression

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors

More information

where x and ȳ are the sample means of x 1,, x n

where x and ȳ are the sample means of x 1,, x n y y Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables say y = pressure in the pancreas x =

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

STAT 111 Recitation 7

STAT 111 Recitation 7 STAT 111 Recitation 7 Xin Lu Tan xtan@wharton.upenn.edu October 25, 2013 1 / 13 Miscellaneous Please turn in homework 6. Please pick up homework 7 and the graded homework 5. Please check your grade and

More information

Statistical View of Least Squares

Statistical View of Least Squares Basic Ideas Some Examples Least Squares May 22, 2007 Basic Ideas Simple Linear Regression Basic Ideas Some Examples Least Squares Suppose we have two variables x and y Basic Ideas Simple Linear Regression

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope Oct 2017 1 / 28 Minimum MSE Y is the response variable, X the predictor variable, E(X) = E(Y) = 0. BLUP of Y minimizes average discrepancy var (Y ux) = C YY 2u C XY + u 2 C XX This is minimized when u

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent

More information

Statistics 112 Simple Linear Regression Fuel Consumption Example March 1, 2004 E. Bura

Statistics 112 Simple Linear Regression Fuel Consumption Example March 1, 2004 E. Bura Statistics 112 Simple Linear Regression Fuel Consumption Example March 1, 2004 E. Bura Fuel Consumption Case: reducing natural gas transmission fines. In 1993, the natural gas industry was deregulated.

More information

Chapter 1. Linear Regression with One Predictor Variable

Chapter 1. Linear Regression with One Predictor Variable Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Correlation and the Analysis of Variance Approach to Simple Linear Regression Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

STAT 3A03 Applied Regression With SAS Fall 2017

STAT 3A03 Applied Regression With SAS Fall 2017 STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

Y i = η + ɛ i, i = 1,...,n.

Y i = η + ɛ i, i = 1,...,n. Nonparametric tests If data do not come from a normal population (and if the sample is not large), we cannot use a t-test. One useful approach to creating test statistics is through the use of rank statistics.

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information

Statistical View of Least Squares

Statistical View of Least Squares May 23, 2006 Purpose of Regression Some Examples Least Squares Purpose of Regression Purpose of Regression Some Examples Least Squares Suppose we have two variables x and y Purpose of Regression Some Examples

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

STAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511

STAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511 STAT 511 Lecture : Simple linear regression Devore: Section 12.1-12.4 Prof. Michael Levine December 3, 2018 A simple linear regression investigates the relationship between the two variables that is not

More information

Ecn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section:

Ecn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section: Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 You have until 10:20am to complete this exam. Please remember to put your name,

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist

regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist sales $ (y - dependent variable) advertising $ (x - independent variable)

More information

Math 3330: Solution to midterm Exam

Math 3330: Solution to midterm Exam Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

STAT 4385 Topic 03: Simple Linear Regression

STAT 4385 Topic 03: Simple Linear Regression STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis

More information

Key Algebraic Results in Linear Regression

Key Algebraic Results in Linear Regression Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in

More information

Lecture 15. Hypothesis testing in the linear model

Lecture 15. Hypothesis testing in the linear model 14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

REGRESSION ANALYSIS AND INDICATOR VARIABLES

REGRESSION ANALYSIS AND INDICATOR VARIABLES REGRESSION ANALYSIS AND INDICATOR VARIABLES Thesis Submitted in partial fulfillment of the requirements for the award of degree of Masters of Science in Mathematics and Computing Submitted by Sweety Arora

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

Introduction to Estimation Methods for Time Series models. Lecture 1

Introduction to Estimation Methods for Time Series models. Lecture 1 Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation

More information

Chapter 10. Simple Linear Regression and Correlation

Chapter 10. Simple Linear Regression and Correlation Chapter 10. Simple Linear Regression and Correlation In the two sample problems discussed in Ch. 9, we were interested in comparing values of parameters for two distributions. Regression analysis is the

More information

Chapter 14. Linear least squares

Chapter 14. Linear least squares Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given

More information

1 Least Squares Estimation - multiple regression.

1 Least Squares Estimation - multiple regression. Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1

More information

Review of probability and statistics 1 / 31

Review of probability and statistics 1 / 31 Review of probability and statistics 1 / 31 2 / 31 Why? This chapter follows Stock and Watson (all graphs are from Stock and Watson). You may as well refer to the appendix in Wooldridge or any other introduction

More information

Lecture 8 CORRELATION AND LINEAR REGRESSION

Lecture 8 CORRELATION AND LINEAR REGRESSION Announcements CBA5 open in exam mode - deadline midnight Friday! Question 2 on this week s exercises is a prize question. The first good attempt handed in to me by 12 midday this Friday will merit a prize...

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Statistics for Engineers Lecture 9 Linear Regression

Statistics for Engineers Lecture 9 Linear Regression Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April

More information

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 LAST NAME: SOLUTIONS FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 STA 1001 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator.

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

STAT2012 Statistical Tests 23 Regression analysis: method of least squares

STAT2012 Statistical Tests 23 Regression analysis: method of least squares 23 Regression analysis: method of least squares L23 Regression analysis The main purpose of regression is to explore the dependence of one variable (Y ) on another variable (X). 23.1 Introduction (P.532-555)

More information

Lecture 14 Simple Linear Regression

Lecture 14 Simple Linear Regression Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23 2.4. ASSESSING THE MODEL 23 2.4.3 Estimatingσ 2 Note that the sums of squares are functions of the conditional random variables Y i = (Y X = x i ). Hence, the sums of squares are random variables as well.

More information

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 17, 2010 Instructor: John Parman Final Exam - Solutions You have until 12:30pm to complete this exam. Please remember to put your

More information

Applied Regression Analysis

Applied Regression Analysis Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Chapter 1 Linear Regression with One Predictor

Chapter 1 Linear Regression with One Predictor STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the

More information

Properties of the least squares estimates

Properties of the least squares estimates Properties of the least squares estimates 2019-01-18 Warmup Let a and b be scalar constants, and X be a scalar random variable. Fill in the blanks E ax + b) = Var ax + b) = Goal Recall that the least squares

More information

4 Multiple Linear Regression

4 Multiple Linear Regression 4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ

More information

Two-Variable Regression Model: The Problem of Estimation

Two-Variable Regression Model: The Problem of Estimation Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable

More information

Chapter 8: Simple Linear Regression

Chapter 8: Simple Linear Regression Chapter 8: Simple Linear Regression Shiwen Shen University of South Carolina 2017 Summer 1 / 70 Introduction A problem that arises in engineering, economics, medicine, and other areas is that of investigating

More information

BNAD 276 Lecture 10 Simple Linear Regression Model

BNAD 276 Lecture 10 Simple Linear Regression Model 1 / 27 BNAD 276 Lecture 10 Simple Linear Regression Model Phuong Ho May 30, 2017 2 / 27 Outline 1 Introduction 2 3 / 27 Outline 1 Introduction 2 4 / 27 Simple Linear Regression Model Managerial decisions

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Matrices and vectors A matrix is a rectangular array of numbers. Here s an example: A =

Matrices and vectors A matrix is a rectangular array of numbers. Here s an example: A = Matrices and vectors A matrix is a rectangular array of numbers Here s an example: 23 14 17 A = 225 0 2 This matrix has dimensions 2 3 The number of rows is first, then the number of columns We can write

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

11 Regression. Introduction. The Correlation Coefficient. The Least-Squares Regression Line

11 Regression. Introduction. The Correlation Coefficient. The Least-Squares Regression Line 11 Regression The Correlation Coefficient The Least-Squares Regression Line The Correlation Coefficient Introduction A bivariate data set consists of n, (x 1, y 1 ),, (x n, y n ). A scatterplot is a of

More information

Linear Regression. 1 Introduction. 2 Least Squares

Linear Regression. 1 Introduction. 2 Least Squares Linear Regression 1 Introduction It is often interesting to study the effect of a variable on a response. In ANOVA, the response is a continuous variable and the variables are discrete / categorical. What

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

y ˆ i = ˆ  T u i ( i th fitted value or i th fit) 1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u

More information

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 24, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

Overview Scatter Plot Example

Overview Scatter Plot Example Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables

More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

Linear models. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. October 5, 2016

Linear models. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. October 5, 2016 Linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark October 5, 2016 1 / 16 Outline for today linear models least squares estimation orthogonal projections estimation

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information