STAT5044: Regression and ANOVA (Inyoung Kim)
Outline
1 Regression
2 Simple linear regression
3 Basic concepts in regression
4 How to estimate unknown parameters
5 Properties of least squares estimators: the Gauss-Markov theorem
Regression

A way to model the relationship between a dependent (response) variable Y and an independent (explanatory) variable X. The goal of regression is to understand how the values of Y change as X varies over its range of possible values, and also to predict Y using X. It is used to answer questions such as: Does changing class size affect the success of students? Can we predict the time of the next eruption of the Old Faithful geyser from the length of the most recent eruption? Do changes in diet result in changes in cholesterol level, and if so, do the results depend on other characteristics such as age, sex, and amount of exercise?
Regression

- Simple linear regression
- Polynomial regression
- Multiple linear regression

Let us start from simple linear regression.
Simple linear regression

We have one response variable (Y) and one explanatory variable (X). Regression analysis was first developed by Sir Francis Galton. Galton studied the relation between the heights of fathers and sons, and noted that the heights of sons of both tall and short fathers appeared to revert, or regress, to the mean of the group. Galton developed a mathematical description of this regression tendency, the precursor of today's regression models. The term regression persists to this day to describe statistical relations between variables.
Basic concepts in regression

A regression model is a formal means of expressing the two essential ingredients of a statistical relation:
- A tendency of the dependent variable Y to vary with the independent variable X in a systematic fashion: there is a probability distribution of Y for each level of X, and the means of these probability distributions vary in some systematic fashion with X.
- A scattering of points around the curve of statistical relationship.
What might be of interest in regression?

Regression is a statistical method to estimate the relationship between a response and an explanatory variable using a linear model.
- Is there a linear relationship?
- How to describe the relationship?
- How to predict a new value?
- How to predict the value of the explanatory variable that causes a specified response?
Simple linear regression

Model: Y_i = β_0 + β_1 X_i + ε_i, i = 1, ..., n

- Y: response variable / dependent variable
- X: explanatory variable / independent variable
- ε_i: random error with mean E(ε_i) = 0, variance Var(ε_i) = σ², and covariance Cov(ε_i, ε_j) = 0 for i ≠ j

There is one more random variable: Y itself.
E(Y_i) = β_0 + β_1 X_i and Var(Y_i) = Var(ε_i) = σ²
Simple linear regression

Model: Y_i = β_0 + β_1 X_i + ε_i, i = 1, ..., n
E(Y_i) = β_0 + β_1 X_i and Var(Y_i) = Var(ε_i) = σ²

β_0, β_1: regression coefficient parameters.
- β_1: the slope of the regression line, which indicates the change in the mean of the probability distribution of Y per unit increase in X.
- β_0: the intercept of the regression line. If the scope of the model includes X = 0, β_0 gives the mean of the probability distribution of Y at X = 0.

What are known values? What are unknown values?
Simple regression model

To estimate the linear relationship between Y and X, what do we need to do? How do we estimate the unknowns?
Simple regression model

Goal: fit a straight line to the points on a scatterplot; find an intercept and slope such that ŷ_i = b_0 + b_1 x_i fits the data as well as possible.
- Find b_0 and b_1 using the least squares estimation (LSE) method.
- Find b_0 and b_1 to minimize Σ_i e_i². Notation: the residual is e_i = y_i − ŷ_i.
Notation and definition

- Fitted value: Ŷ_i = b_0 + b_1 X_i, where b_0 = β̂_0 and b_1 = β̂_1
- Residual: e_i = Y_i − Ŷ_i
- S_xy = Σ_i (x_i − x̄)(y_i − ȳ)
- S_xx = Σ_i (x_i − x̄)²
- S_yy = Σ_i (y_i − ȳ)²

What is the difference between the residual (e_i) and the error (ε_i)?
Interpreting a regression line (least squares line):
1. The slope of the line estimates the average increase in y for each one-unit increase in x.
2. The intercept of the line is the value of y when x = 0, but interpreting the intercept in the context of the data only makes sense if 0 is included in the range of measured x-values.
3. The line estimates the average y for a specific value of x. It can also be used as a prediction of the value of y for an individual with a specific value of x.
Interpreting a regression line (least squares line):

Note: the regression line is created based on the least squares criterion. When we use a line to predict the values of y, the sum of squared differences between the observed values of y and the predicted values is smaller for the least squares line than it is for any other line.
How to estimate parameters

The method of least squares (or ordinary least squares): estimate β_0 and β_1 to minimize

Q = Σ_i (Y_i − β_0 − β_1 X_i)²

The values of β_0 and β_1 that minimize Q can be derived by differentiating Q with respect to β_0 and β_1:

∂Q/∂β_0 = −2 Σ_i (Y_i − β_0 − β_1 X_i)
∂Q/∂β_1 = −2 Σ_i X_i (Y_i − β_0 − β_1 X_i)

We then set these partial derivatives equal to zero, using b_0 and b_1 to denote the particular values of β_0 and β_1, respectively, that minimize Q.
How to estimate parameters

We then finally obtain the following equations, called the normal equations:

Σ Y_i = n b_0 + b_1 Σ X_i
Σ X_i Y_i = b_0 Σ X_i + b_1 Σ X_i²
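As a concrete check, the normal equations can be solved directly as a 2x2 linear system. A minimal sketch in Python/NumPy, where the x and y values are made-up example data (an assumption for illustration, not from the slides):

```python
import numpy as np

# Hypothetical data (assumed for illustration, not from the slides).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# The normal equations as a 2x2 linear system in (b0, b1):
#   [ n        sum(x)   ] [b0]   [ sum(y)  ]
#   [ sum(x)   sum(x^2) ] [b1] = [ sum(xy) ]
A = np.array([[n, x.sum()], [x.sum(), (x ** 2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
b0, b1 = np.linalg.solve(A, rhs)
print(b0, b1)
```

Solving the system gives the same b_0 and b_1 as the closed-form least squares formulas.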
Least squares estimators

β̂_0 = b_0 = Ȳ − b_1 X̄ = Ȳ − β̂_1 X̄
β̂_1 = b_1 = Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)² = S_xy / S_xx
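The closed-form estimators above can be computed in a few lines. A minimal sketch, using hypothetical data that lie exactly on the line y = 1 + 2x (an assumption chosen so the estimates are easy to verify by eye):

```python
import numpy as np

# Hypothetical data lying exactly on y = 1 + 2x (assumed for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

Sxy = ((x - x.mean()) * (y - y.mean())).sum()
Sxx = ((x - x.mean()) ** 2).sum()
b1 = Sxy / Sxx                  # slope estimate
b0 = y.mean() - b1 * x.mean()   # intercept estimate
print(b0, b1)  # → 1.0 2.0
```

Since the data are exactly linear, the fit recovers the intercept 1 and slope 2.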
Properties of least squares estimators

Gauss-Markov theorem: under the conditions of the regression model, the least squares estimators b_0 and b_1 are unbiased and have minimum variance among all unbiased linear estimators.

This theorem means that, among all linear estimators that are unbiased, b_0 and b_1 have the smallest variability in repeated samples in which the X levels remain unchanged.

NOTE: a statistic is an unbiased estimator of a parameter if its expectation equals the parameter, i.e., E(b_0) = β_0 and E(b_1) = β_1.
Properties of the fitted regression line

- E(Ŷ) = β_0 + β_1 X = E(Y)
- Σ_i e_i = 0
- Σ_i Y_i = Σ_i Ŷ_i
Estimation of σ²

Define SSE = Σ_i e_i² = Σ_i (y_i − ŷ_i)², which is the sum of squared errors (SSE) and has n − 2 degrees of freedom (why? two degrees of freedom are lost because b_0 and b_1 were estimated from the data).

SSE/(n − 2) is an unbiased estimator of σ².

Notation: MSE = SSE/(n − 2), which is called the mean squared error.
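Putting the pieces together, SSE and MSE can be computed directly from the residuals. A minimal sketch with hypothetical data (the values are assumptions for illustration):

```python
import numpy as np

# Hypothetical data (assumed for illustration, not from the slides).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1])
n = len(x)

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)

SSE = (e ** 2).sum()   # sum of squared residuals
MSE = SSE / (n - 2)    # 2 degrees of freedom lost estimating b0 and b1
print(SSE, MSE)
```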
Simple linear regression with normal error assumption

Model: Y_i = β_0 + β_1 X_i + ε_i
- Y_i: response variable / dependent variable
- X_i: explanatory variable / independent variable
- ε_i: iid N(0, σ²), i = 1, ..., n, with Cov(ε_i, ε_j) = 0 for i ≠ j
- β_0, β_1: regression coefficient parameters.
- β_1: the slope of the regression line, which indicates the change in the mean of the probability distribution of Y per unit increase in X.
- β_0: the Y intercept of the regression line. If the scope of the model includes X = 0, β_0 gives the mean of the probability distribution of Y at X = 0.

Y_i ~ N(β_0 + β_1 X_i, σ²)
Y_i − (β_0 + β_1 X_i) ~ N(0, σ²)

Question: how do we estimate β_0 and β_1?
Simple linear regression with normal error assumption

We estimate β_0 and β_1 using maximum likelihood estimation (MLE). How do we calculate the MLE?
- Calculate the likelihood function.
- Take the first derivative of the likelihood function with respect to β_0 and β_1 and set it to zero.
- Check whether the second derivative of the likelihood function is less than zero, so the critical point is a maximum.
Maximum likelihood estimation

The likelihood function is the function of the parameters given the data. The likelihood function L(β_0, β_1, σ²) given the sample observations Y_1, ..., Y_n is

L(β_0, β_1, σ²) = Π_{i=1}^{n} [1/√(2πσ²)] exp[−(Y_i − β_0 − β_1 X_i)² / (2σ²)]
Maximum likelihood estimation

The log-likelihood is

log L(β_0, β_1, σ²) = −(n/2) log(2πσ²) − (1/(2σ²)) Σ_{i=1}^{n} (Y_i − β_0 − β_1 X_i)²
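The log-likelihood can be evaluated directly, and one can check numerically that the least squares estimates (which, under normal errors, are also the MLEs of β_0 and β_1) attain a higher value than nearby parameter choices. A minimal sketch with hypothetical data (the values are assumptions for illustration):

```python
import numpy as np

def loglik(b0, b1, sigma2, x, y):
    """Log-likelihood of simple linear regression with normal errors."""
    n = len(x)
    resid = y - b0 - b1 * x
    return -(n / 2) * np.log(2 * np.pi * sigma2) - (resid ** 2).sum() / (2 * sigma2)

# Hypothetical data (assumed for illustration, not from the slides).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.2, 5.9, 8.1, 10.0])

# Least squares estimates; with normal errors these are also the MLEs.
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
s2 = ((y - b0 - b1 * x) ** 2).sum() / len(x)  # MLE of sigma^2 divides by n

# The estimates attain a higher log-likelihood than nearby parameter values.
print(loglik(b0, b1, s2, x, y) >= loglik(b0 + 0.1, b1, s2, x, y))  # True
```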
MLE and LSE

Parameter          | LSE               | MLE
Normal assumption? | No                | Yes
β_0                | b_0               | same as LSE
β_1                | b_1               | same as LSE
σ²                 | SSE/(n − 2), unbiased | SSE/n, biased
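The bias of the MLE of σ² is easy to see by simulation. A minimal sketch under assumed parameter values (the choices of n, β_0, β_1, σ², and the design points are illustrative, not from the slides):

```python
import numpy as np

# Simulation sketch; all numbers here are assumptions chosen for illustration.
rng = np.random.default_rng(0)
n, beta0, beta1, sigma2 = 10, 1.0, 2.0, 4.0
x = np.linspace(0.0, 9.0, n)

mse_vals, mle_vals = [], []
for _ in range(5000):
    y = beta0 + beta1 * x + rng.normal(0.0, np.sqrt(sigma2), n)
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    b0 = y.mean() - b1 * x.mean()
    sse = ((y - b0 - b1 * x) ** 2).sum()
    mse_vals.append(sse / (n - 2))  # unbiased estimator of sigma^2
    mle_vals.append(sse / n)        # MLE of sigma^2, biased downward

print(np.mean(mse_vals))  # ≈ 4.0 = sigma^2
print(np.mean(mle_vals))  # ≈ 3.2 = sigma^2 * (n - 2) / n
```

Averaged over many simulated samples, SSE/(n − 2) centers on the true σ², while SSE/n systematically underestimates it by the factor (n − 2)/n.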
What assumptions do we have in simple linear regression?
- Independence
- Equal variance
- Normality, needed to make inference, including tests and confidence intervals