Simple Linear Regression


1. Model and Parameter Estimation

(a) Suppose our data consist of a collection of pairs $(x_i, y_i)$, where $x_i$ is an observed value of variable $X$ and $y_i$ is the corresponding observation of the random variable $Y$. The simple linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

expresses the relationship between variables $X$ and $Y$. Here $\beta_0$ denotes the intercept and $\beta_1$ the slope of the regression line.

(b) Values for $\beta_0$ and $\beta_1$ are estimated from the data by the method of least squares.

(c) From the many straight lines that could be drawn through our data, we find the line that minimizes the sum of squared residuals, where a residual is the vertical distance between a point $(x_i, y_i)$ and the regression line.

(d) Values $\hat{\beta}_0$ and $\hat{\beta}_1$ denote the estimates for $\beta_0$ and $\beta_1$ that minimize the sum of squared residuals, or error sum of squares (SSE). The estimates are called least squares estimates.

$$SSE = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$

(e) SSE is minimized when the partial derivatives of the SSE with respect to the unknowns ($\beta_0$ and $\beta_1$) are set to zero: $\partial SSE/\partial \beta_0 = 0$ and $\partial SSE/\partial \beta_1 = 0$. (You need multivariable calculus [e.g. Math 2001] to understand the theoretical details, so we will just take this as a given.) These two conditions result in the two so-called normal equations:

$$n\beta_0 + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$\beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$

(f) The two normal equations are solved simultaneously to obtain estimates of $\beta_0$ and $\beta_1$. These estimates are:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)/n}{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2/n}$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Looking at the formula for $\hat{\beta}_1$, and recalling the formula for the correlation coefficient $r$, it is easy to see that $\hat{\beta}_1 = r s_y/s_x$.

(g) The error variance, $\sigma^2$, is estimated as

$$\hat{\sigma}^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}$$
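As a quick illustration of the formulas in (f), here is a minimal R sketch (an addition to these notes; the toy data are invented purely for illustration):

ls_fit = function(x, y) {
  # closed-form least squares estimates from item (f)
  b1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
  b0 = mean(y) - b1 * mean(x)                                     # intercept
  c(intercept = b0, slope = b1)
}
ls_fit(x = 1:5, y = c(2.1, 3.9, 6.2, 8.1, 9.8))   # roughly (0.14, 1.96)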

The following example shows the calculations as they would be carried out by hand, in gruesome detail.

eg: To study the effect of ozone pollution on soybean yield, data were collected at four ozone dose levels and the resulting soybean seed yield monitored. Ozone dose levels (in ppm) were reported as the average ozone concentration during the growing season. Soybean yield was reported in grams per plant.

X: Ozone (ppm)    Y: Yield (gm/plant)
.02               242
.07               237
.11               231
.15               201

Estimated values for $\beta_0$ and $\beta_1$ are now computed from the data.

X      Y      X^2     Y^2      XY
.02    242    .0004   58564     4.84
.07    237    .0049   56169    16.59
.11    231    .0121   53361    25.41
.15    201    .0225   40401    30.15

Column sums: $\sum x_i = .35$, $\sum y_i = 911$, $\sum x_i^2 = .0399$, $\sum y_i^2 = 208{,}495$, and $\sum x_i y_i = 76.99$

Means: $\bar{x} = .0875$ and $\bar{y} = 227.75$

Intermediate terms:

$$SS_{xx} = \sum_i (x_i - \bar{x})^2 = \sum_i x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = .0399 - \frac{(.35)^2}{4} = .009275$$

$$SS_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}) = \sum_i x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} = 76.99 - \frac{.35(911)}{4} = -2.7225$$

$$\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = -293.531, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 227.75 - (-293.531)(.0875) = 253.434$$

(h) The least squares regression equation which characterizes the linear relationship between soybean yield and ozone dose is

$$\hat{y}_i = 253.434 - 293.531 x_i$$

(i) The error variance, $\sigma^2$, is estimated as MSE.

(j) Residuals: $\hat{\epsilon}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$

x_i    y_i    yhat_i     residual = y_i - yhat_i
.02    242    247.563    -5.563
.07    237    232.887     4.113
.11    231    221.146     9.854
.15    201    209.404    -8.404

(k) Residual Sum of Squares (in regression problems, the error sum of squares is also known as the residual sum of squares):

$$SSE = \sum \hat{\epsilon}_i^2 = (-5.563)^2 + (4.113)^2 + (9.854)^2 + (-8.404)^2 = 215.61$$

(l) Mean Squared Error:

$$MSE = \frac{SSE}{n-2} = \frac{215.61}{4-2} = 107.81$$
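Since the estimates are defined as the minimizers of SSE, they can also be checked numerically. A sketch using R's general-purpose optimizer optim() (an addition to these notes, not the method used in them):

x = c(.02, .07, .11, .15)
y = c(242, 237, 231, 201)
sse = function(b) sum((y - b[1] - b[2]*x)^2)   # SSE as a function of (b0, b1)
fit = optim(c(250, -300), sse)                 # Nelder-Mead from a rough start
fit$par     # approximately (253.43, -293.53), matching the closed form
fit$value   # approximately 215.61, the SSE computed above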

Calculations by hand in R

x = c(.02, .07, .11, .15)
y = c(242, 237, 231, 201)
SXX = sum((x - mean(x))^2)
SXY = sum((x - mean(x)) * (y - mean(y)))
SYY = sum((y - mean(y))^2)
b1 = SXY/SXX
b0 = mean(y) - b1*mean(x)
yp = b0 + b1*x          # fitted values
resids = y - yp         # residuals
SSE = sum(resids^2)
SST = SYY
SSR = SST - SSE
SS = c(SSR, SSE, SST)
n = length(y)
df = c(1, n-2, n-1)
MS = SS/df
cbind(SS, df, MS)

            SS df        MS
[1,]  799.1381  1  799.1381
[2,]  215.6119  2  107.8059
[3,] 1014.7500  3  338.2500

Check calculations using the built-in lm, summary and anova commands in R

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
      253.4       -293.5

Call:
lm(formula = y ~ x)

Residuals:
     1      2      3      4
-5.563  4.113  9.854 -8.404

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   253.43      10.77  23.537   0.0018 **
x            -293.53     107.81  -2.723   0.1126
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.38 on 2 degrees of freedom
Multiple R-squared: 0.7875,  Adjusted R-squared: 0.6813
F-statistic: 7.413 on 1 and 2 DF,  p-value: 0.1126

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x          1 799.14  799.14  7.4127 0.1126
Residuals  2 215.61  107.81

       1        2        3        4
247.5633 232.8868 221.1456 209.4043

        1         2         3         4
-5.563342  4.113208  9.854447 -8.404313

[1] 215.6119

[1]  799.1381  215.6119 1014.7500
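The notes do not show the script that produced the output above; commands along the following lines would generate it (a sketch, using the x, y and SS objects from the by-hand block):

m = lm(y ~ x)    # fit the model
m                # coefficients only
summary(m)       # coefficient table, R-squared, F-statistic
anova(m)         # analysis of variance table
fitted(m)        # fitted values
resid(m)         # residuals
sum(resid(m)^2)  # SSE
SS               # c(SSR, SSE, SST) from the by-hand block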

Statistical inferences - CIs and tests for the β's

2. Standard Errors for Regression Coefficients

(a) Regression coefficient values, $\hat{\beta}_0$ and $\hat{\beta}_1$, are point estimates of the true intercept and slope, $\beta_0$ and $\beta_1$ respectively.

(b) To develop interval estimates (confidence intervals) for $\beta_0$ and $\beta_1$, we need to make assumptions about the errors in the regression model. In particular, we assume $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ i.i.d. $N(0, \sigma^2)$, in which case:

$$\hat{\beta}_1 \sim N\!\left(\beta_1, \frac{\sigma^2}{SS_{xx}}\right)$$

(c) The standard deviation of $\hat{\beta}_1$ is $\sqrt{\sigma^2/SS_{xx}}$.

(d) The value of $\sigma^2$ is unknown, so the estimator MSE is used in its place to produce the standard error of the estimate $\hat{\beta}_1$, as

$$SE_{\hat{\beta}_1} = \sqrt{MSE/SS_{xx}}$$

(e) The standard error for estimate $\hat{\beta}_0$ is given as:

$$SE_{\hat{\beta}_0} = \sqrt{MSE\left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)}$$

(f) Standard errors for the regression coefficients in the above example are estimated below. $SS_{xx} = .009275$ and $MSE = 107.81$.

$$SE_{\hat{\beta}_1} = \sqrt{MSE/SS_{xx}} = \sqrt{107.81/.009275} = 107.81$$

$$SE_{\hat{\beta}_0} = \sqrt{MSE\left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)} = \sqrt{107.81\left(\frac{1}{4} + \frac{(.0875)^2}{.009275}\right)} = 10.77$$
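These standard errors can be cross-checked in R against the Std. Error column of the summary() output shown earlier (a sketch using the objects n, MSE, SXX and x defined above):

SEb1 = sqrt(MSE/SXX)                       # 107.81
SEb0 = sqrt(MSE * (1/n + mean(x)^2/SXX))   # 10.77
c(SEb0, SEb1)
summary(lm(y ~ x))$coefficients[, "Std. Error"]   # should agree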

3. Confidence Intervals for Regression Coefficients

(a) Confidence intervals are constructed using the standard errors as follows:

$$\hat{\beta}_i \pm t_{\alpha/2,\, n-2}\, SE_{\hat{\beta}_i}$$

(b) In the example, 95% confidence intervals for $\beta_1$ and $\beta_0$ are computed as follows. $t_{\alpha/2,\, n-2} = t_{.025,\, 2} = 4.303$

For the slope, $\beta_1$: $-293.531 \pm 4.303(107.81) \Rightarrow (-757.4, 170.3)$

For the intercept, $\beta_0$: $253.434 \pm 4.303(10.77) \Rightarrow (207.1, 299.8)$

95% confidence intervals in R

MSE = SSE/(n-2)
t = qt(.975, n-2)   # upper .025'th percentile of t-dist with n-2 d.f.
t
[1] 4.302653

SEb1 = sqrt(MSE/SXX)          # standard error of beta_1
c(b1 - t*SEb1, b1 + t*SEb1)   # 95% confidence interval for beta_1
[1] -757.4057  170.3437
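R's built-in confint() reproduces both intervals at once; a one-line cross-check:

confint(lm(y ~ x), level = 0.95)
# rows: (Intercept), then x; should match the hand-computed
# intervals (207.1, 299.8) and (-757.4, 170.3)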

Why does the confidence interval have the correct coverage probability? Consider the example of the interval for $\beta_1$. We need the following facts:

(a) $\hat{\beta}_1$ has a normal distribution with mean $\beta_1$ and unknown variance $\sigma^2/SS_{xx}$. A consequence is that

$$Z = \frac{\hat{\beta}_1 - \beta_1}{\sigma/\sqrt{SS_{xx}}} \sim N(0, 1)$$

(Easy results to prove.)

(b) $W = \frac{(n-2)MSE}{\sigma^2} \sim \chi^2_{n-2}$, a chi-squared distribution with $n-2$ degrees of freedom. (A bit harder to prove.)

(c) $\hat{\beta}_1$ and SSE are independent, which implies that $Z = \frac{\hat{\beta}_1 - \beta_1}{\sigma/\sqrt{SS_{xx}}}$ and $W = \frac{(n-2)MSE}{\sigma^2}$ are independent. (Hard to prove. Details involve considerable matrix algebra, and are contained in appendix C3 of Montgomery et al.)

(d) Definition: If $Z$ is standard normal, independent of $W$ which is $\chi^2_\nu$, then $t = \frac{Z}{\sqrt{W/\nu}}$ is defined to have a $t$ distribution with $\nu$ degrees of freedom.

(e) Then see the general notes on constructing confidence intervals.
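The coverage claim can also be checked empirically. The following simulation sketch (an addition to these notes, with arbitrarily chosen true parameter values) generates many datasets from a known model and counts how often the 95% t-interval for the slope traps the true $\beta_1$; the proportion should be close to 0.95:

set.seed(1)
B = 5000                                # number of simulated datasets
nn = 10                                 # sample size per dataset (arbitrary)
xx = seq(0, 1, length.out = nn)
beta0 = 2; beta1 = 3; sigma = 1         # arbitrary true values
hits = 0
for (b in 1:B) {
  yy = beta0 + beta1*xx + rnorm(nn, 0, sigma)
  ci = confint(lm(yy ~ xx))["xx", ]     # 95% interval for the slope
  if (ci[1] <= beta1 && beta1 <= ci[2]) hits = hits + 1
}
hits/B   # should be close to 0.95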

4. The correlation between X and Y is estimated by:

$$r = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

An alternative expression is given by

$$r = \hat{\beta}_1 \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

or

$$r = \hat{\beta}_1 \sqrt{\frac{SS_{xx}}{SS_{yy}}}$$

where $SS_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2$ and $SS_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2$ are the sums of squares of the X's and Y's, respectively. Note that $SS_{yy} = SST$, the total sum of squares. Note also that $\sqrt{SS_{xx}/SS_{yy}} = s_x/s_y$, the ratio of the standard deviations of the X's and the Y's.

The correlation coefficient lies in the interval [-1, +1]. If the relationship between Y and X is perfectly linear and increasing, the correlation will be +1. If the relationship is perfectly linear and decreasing, the correlation will be -1. If there is no linear relationship between X and Y, the correlation is 0.

In the example,

$$r = \hat{\beta}_1 \sqrt{\frac{SS_{xx}}{SS_{yy}}} = -293.531\sqrt{\frac{.009275}{1014.75}} = -.887$$
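The identity $r = \hat{\beta}_1\sqrt{SS_{xx}/SS_{yy}}$ is easy to verify with the objects from the by-hand R block above (a sketch):

cor(x, y)            # -0.8874245
b1 * sqrt(SXX/SYY)   # same value, confirming the identity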

5. Goodness of fit of the regression line is measured by the coefficient of determination, $R^2$. For simple linear regression $R^2 = r^2$.

$$R^2 = \frac{SSR}{SST}$$

The Regression Sum of Squares (SSR) is similar to the Treatment Sum of Squares in an ANOVA problem. It is given by $SSR = \frac{SS_{xy}^2}{SS_{xx}}$. Alternative ways of calculating the residual sum of squares are to use the additivity relationship (SSR + SSE = SST), or to use one of the following formulas:

$$R^2 = SSR/SST$$
$$1 - R^2 = (SST - SSR)/SST = SSE/SST$$
$$SSE = (1 - R^2)\, SST$$

$R^2$ is the fraction of the total variability in y accounted for by the linear regression line, and ranges between 0 and 1. $R^2 = 1.00$ indicates a perfect linear fit, while $R^2 = 0.00$ is a complete linear non-fit.

In the example:

$$SSR = \frac{SS_{xy}^2}{SS_{xx}} = \frac{(-2.7225)^2}{.009275} = 799.14$$

$$SST = SSR + SSE = 799.14 + 215.61 = 1014.75$$

$$R^2 = SSR/SST = 0.7875$$

Note that $R^2 = r^2$, the square of the correlation coefficient. 78.8% of the variability in Y is accounted for by the regression model.

[1] 799.1381
[1] -0.8874245
[1] 0.7875222
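The three outputs printed above can be produced along these lines (a sketch; the original commands are not shown in the notes):

SSR = SXY^2/SXX   # regression sum of squares, 799.1381
SSR
cor(x, y)         # -0.8874245
SSR/SST           # R-squared, 0.7875222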

6. Estimating the mean of Y

(a) The estimated mean of Y when $x = x^*$ is $\hat{\mu}_{x^*} = \hat{\beta}_0 + \hat{\beta}_1 x^*$.

(b)

$$\hat{\mu}_{x^*} = \hat{\beta}_0 + \hat{\beta}_1 x^* \sim N\!\left(\beta_0 + \beta_1 x^*,\; \sigma^2\left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)\right)$$

(c) The standard error of $\hat{\mu}_{x^*}$ is

$$SE_{\hat{\mu}_{x^*}} = \sqrt{MSE\left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)}$$

(d) A confidence interval for the mean $\mu_{x^*} = \beta_0 + \beta_1 x^*$ when $x = x^*$ is given by

$$\hat{\mu}_{x^*} \pm t_{\alpha/2,\, n-2}\, SE_{\hat{\mu}_{x^*}}$$

(e) eg. A 95% confidence interval for the mean at $x^* = 0.10$ is:

When $x^* = 0.10$, the estimated mean is $\hat{\mu}_{.1} = 253.434 - 293.531(0.1) = 224.08$

$$SE_{\hat{\mu}_{.1}} = \sqrt{107.8\left(\frac{1}{4} + \frac{(0.1 - .0875)^2}{.009275}\right)} = 5.36$$

$t_{\alpha/2,\, n-2} = t_{.025,\, 2} = 4.303$

margin of error = 4.303(5.36) = 23.08

$$224.08 \pm 23.08 \Rightarrow (201.00, 247.16)$$

95% confidence interval for the mean at x0 = .10, in R

x0 = .10
muhat = b0 + b1*x0   # estimate of mean at x = x0
muhat
SEmu = sqrt(MSE) * sqrt(1/n + (x0 - mean(x))^2/SXX)   # SE of muhat
SEmu
c(muhat - t*SEmu, muhat + t*SEmu)

[1] 224.0809
[1] 5.363545
[1] 201.0034 247.1583
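The built-in predict() function gives the same interval; a cross-check sketch:

m = lm(y ~ x)
predict(m, newdata = data.frame(x = 0.10),
        interval = "confidence", level = 0.95)
# fit ~ 224.08, lwr ~ 201.00, upr ~ 247.16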

7. Predicting a New Response Value

We are now interested in predicting the value of y at a future value $x = x^*$. In making a prediction interval for a future observation $y^*$ when $x = x^*$, we need to incorporate two sources of variation, which account for the fact that we are replacing the unknown mean by the estimate $\hat{\beta}_0 + \hat{\beta}_1 x^*$, and we are replacing the unknown standard deviation $\sigma$ by the estimate $\sqrt{MSE}$.

$$y^* - (\hat{\beta}_0 + \hat{\beta}_1 x^*) = \left(y^* - (\beta_0 + \beta_1 x^*)\right) - \left(\hat{\beta}_0 + \hat{\beta}_1 x^* - (\beta_0 + \beta_1 x^*)\right)$$

The first term in brackets on the right hand side of this expression has a $N(0, \sigma^2)$ distribution. From 6(b) above, the distribution of the second term is

$$N\!\left(0,\; \sigma^2\left(\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)\right)$$

As $y^*$ represents a future observation, the distributions of the two terms are independent, and it follows that the distribution of $y^* - (\hat{\beta}_0 + \hat{\beta}_1 x^*)$ is

$$N\!\left(0,\; \sigma^2\left(1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)\right)$$

(a) The predicted value of $y^*$ is given by $\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$

(b) The variance of the above distribution is estimated by:

$$MSE\left(1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)$$

(c) and the prediction interval for $y^*$ is given by

$$\hat{\beta}_0 + \hat{\beta}_1 x^* \pm t_{\alpha/2,\, n-2}\sqrt{MSE\left(1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SS_{xx}}\right)}$$

(d) eg. A 95% prediction interval for $y^*$ when $x^* = 0.10$ is:

For $x^* = 0.10$, $\hat{y}^* = 253.434 - 293.531(0.1) = 224.08$

$$SE_{y^*} = \sqrt{107.8\left(1 + \frac{1}{4} + \frac{(0.1 - .0875)^2}{.009275}\right)} = 11.69$$

$t_{\alpha/2,\, n-2} = t_{.025,\, 2} = 4.303$

margin of error = 4.303(11.69) = 50.29

$$224.08 \pm 50.29 \Rightarrow (173.79, 274.37)$$

95% prediction interval for a new observation at x0 = .10, in R

SEpred = sqrt(MSE) * sqrt(1 + 1/n + (x0 - mean(x))^2/SXX)
c(muhat - t*SEpred, muhat + t*SEpred)

[1] 173.7980 274.3637
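As with the confidence interval, predict() reproduces this directly; a cross-check sketch:

predict(lm(y ~ x), newdata = data.frame(x = 0.10),
        interval = "prediction", level = 0.95)
# fit ~ 224.08, lwr ~ 173.80, upr ~ 274.36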