TMA4255 Applied Statistics V2016 (5)


Part 2: Regression
Simple linear regression [11.1-11.4]
Sum of squares [11.5]
Anna Marie Holand
To be lectured: January 26, 2016
wiki.math.ntnu.no/tma4255/2016v/start

2 Part 2: Regression analysis
Y: response, dependent variable.
x: independent variable, regressor, covariate, predictor, explanatory variable.
Goal: describe Y as a function of one or many x's. A statistical description based on a law of nature, some local approximation, the correlation between variables, trends over time, etc.
Linear regression: Y is then a linear function of one or many (possibly transformed) x's.
Simple linear regression: only one x. Multiple linear regression: many x's.

3 Wood quality
Wood density is a measure of wood quality. Within the wood industry there is a need to develop techniques to reduce the duration and the cost of wood analyses. Wood stiffness is generally evaluated by determining the modulus of elasticity in static bending, and lately sonic measurements have been investigated; this is expensive. We will look at a data set of simultaneous measurements of wood stiffness and wood density, to see if density can be used as a substitute for stiffness. Comment: the source is unknown. This data set has been used in this course for several years; the data file was taken from John Tyssedal.

4 Wood quality
x = wood density, and Y = log wood stiffness.

5 Simple linear regression [11.1-11.4]
Previously, we had one random variable Y = µ + ε, where ε was normally distributed with E(ε) = 0 and Var(ε) = σ², and we wrote Y ~ N(µ, σ²).
The simple linear regression model: Y_i = β_0 + β_1 x_i + ε_i, where the ε_i are independent and normally distributed with E(ε_i) = 0 and Var(ε_i) = σ², for i = 1, ..., n.
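The model above can be made concrete with a small simulation. This is a hedged sketch, not from the slides: the parameter values and the x-range are hypothetical, chosen only to mimic the order of magnitude of the wood example.

```python
# Simulate n observations from Y_i = beta0 + beta1*x_i + eps_i,
# with eps_i i.i.d. N(0, sigma^2).
import numpy as np

rng = np.random.default_rng(seed=1)
n = 30
beta0, beta1, sigma = 8.25, 0.125, 0.24   # assumed values, for illustration only
x = rng.uniform(10.0, 30.0, size=n)       # hypothetical covariate (density) values
eps = rng.normal(0.0, sigma, size=n)      # normal errors with mean 0, sd sigma
Y = beta0 + beta1 * x + eps               # responses generated by the model
```

Plotting Y against x for such simulated data shows the linear trend with random scatter of standard deviation σ around the line.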

6 Useful identities
(All sums run over i = 1, ..., n.)
Σ (x_i − x̄) = Σ x_i − n x̄ = n x̄ − n x̄ = 0
S_xy = Σ (x_i − x̄)(y_i − ȳ)
     = Σ (x_i − x̄) y_i − ȳ Σ (x_i − x̄)
     = Σ (x_i − x̄) y_i
     = Σ x_i y_i − x̄ Σ y_i
     = Σ x_i y_i − n x̄ ȳ

7 Useful identities
S_xx = Σ (x_i − x̄)² = Σ (x_i − x̄)(x_i − x̄)
     = Σ x_i (x_i − x̄) − x̄ Σ (x_i − x̄)
     = Σ x_i (x_i − x̄)
     = Σ x_i² − x̄ Σ x_i
     = Σ x_i² − n x̄²

8 Least squares estimators for β_0 and β_1
Given a data set {(x_i, y_i); i = 1, ..., n}, the least squares estimators B_0 and B_1 for the parameters β_0 and β_1 are given as:
B_1 = Σ (x_i − x̄) Y_i / Σ (x_i − x̄)² = Σ (x_i − x̄)(Y_i − Ȳ) / Σ (x_i − x̄)²
B_0 = Ȳ − B_1 x̄ = (Σ Y_i − B_1 Σ x_i) / n
These are also the maximum likelihood estimators, and the estimates are called b_0 and b_1.

9 Properties of the estimators B_0 and B_1

               B_0                                       B_1
Estimator      B_0 = Ȳ − B_1 x̄                           B_1 = Σ (x_i − x̄) Y_i / Σ (x_i − x̄)²
Distribution   Normal                                    Normal
Mean           E(B_0) = β_0                              E(B_1) = β_1
Variance       Var(B_0) = σ² Σ x_i² / (n Σ (x_i − x̄)²)   Var(B_1) = σ² / Σ (x_i − x̄)²

See exercise 3, problem 1.

10 MINITAB The regression equation is log(stiff) = 8,25 + 0,125 density Predictor Coef SE Coef T P Constant 8,2516 0,1281 64,39 0,000 density 0,125190 0,007767 16,12 0,000 S = 0,243964 R-Sq = 90,3% R-Sq(adj) = 89,9%

11 Interpretation?
We have wood density as covariate (x) and log wood stiffness as response (y), and have fitted a simple linear regression. What do b_0 = 8.25 and b_1 = 0.125 mean?
A: If the wood density increases by 1, the log wood stiffness increases by 8.25.
B: If the wood density increases by 1, the log wood stiffness increases by 0.125.
C: I don't know.
Vote at clicker.math.ntnu.no, class room TMA4255.

12 Correlation?
DEF 4.5: Let X and Y be two random variables with covariance σ_XY and variances σ_X² and σ_Y², respectively. The correlation coefficient for X and Y is
ρ_XY = Cov(X, Y) / √(Var(X) Var(Y)) = σ_XY / (σ_X σ_Y)
TEO 4.4: The covariance of two random variables X and Y with means µ_X = E(X) and µ_Y = E(Y) is given by
σ_XY = Cov(X, Y) = E(XY) − E(X) E(Y) = E(XY) − µ_X µ_Y

13 b_1 and r
An estimate of the correlation between two random variables is the Pearson correlation coefficient:
r = S_xy / √(S_xx S_yy)
The simple linear regression slope b_1 is given as
b_1 = S_xy / S_xx
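Since b_1 and r share the numerator S_xy, they are linked by b_1 = r √(S_yy / S_xx). A small numeric check with assumed illustrative data:

```python
# Verify the link between the least squares slope b1 and the Pearson
# correlation coefficient r: b1 = r * sqrt(Syy / Sxx).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([1.8, 3.1, 2.9, 4.4, 5.0])

xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)
Syy = np.sum((y - ybar) ** 2)
Sxy = np.sum((x - xbar) * (y - ybar))

r = Sxy / np.sqrt(Sxx * Syy)   # Pearson correlation coefficient
b1 = Sxy / Sxx                 # least squares slope
assert np.isclose(b1, r * np.sqrt(Syy / Sxx))
```

In particular, b_1 and r always have the same sign, and b_1 = 0 exactly when r = 0.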

14 Estimator for σ²
An unbiased estimator for σ² is
S² = SSE / (n − 2) = Σ (Y_i − Ŷ_i)² / (n − 2) = Σ (Y_i − B_0 − B_1 x_i)² / (n − 2)
This is not the maximum likelihood estimator. Further,
V = (n − 2) S² / σ²
is chi-squared distributed with n − 2 degrees of freedom.

15 MINITAB
The regression equation is log(stiff) = 8,25 + 0,125 density
Predictor Coef SE Coef T P
Constant 8,2516 0,1281 64,39 0,000
density 0,125190 0,007767 16,12 0,000
S = 0,243964 R-Sq = 90,3% R-Sq(adj) = 89,9%
Here S has been inserted for σ in the formulas:
Var(B_0) = σ² Σ x_i² / (n Σ (x_i − x̄)²) and Var(B_1) = σ² / Σ (x_i − x̄)²

16 Sum of squares [11.5, 11.8]
SST = Σ (y_i − ȳ)², total sum of squares.
SSE = Σ (y_i − ŷ_i)², error sum of squares.
SSR = Σ (ŷ_i − ȳ)², regression sum of squares.
Coefficient of determination:
R² = 1 − SSE/SST
R²_adj = 1 − (SSE/(n − 2)) / (SST/(n − 1))
Two random variables X and Y with linear correlation coefficient ρ have R² = ρ².
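For a least squares fit, the sums of squares decompose as SST = SSR + SSE, which is what makes R² interpretable as a proportion. A hedged sketch with illustrative numbers checking this:

```python
# Check the decomposition SST = SSR + SSE and compute R^2 and the
# unbiased variance estimate s^2 = SSE/(n-2) for a least squares fit.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.0, 2.7, 4.1, 4.6, 6.0])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x                        # fitted values

SST = np.sum((y - y.mean()) ** 2)         # total sum of squares
SSE = np.sum((y - yhat) ** 2)             # error sum of squares
SSR = np.sum((yhat - y.mean()) ** 2)      # regression sum of squares

R2 = 1.0 - SSE / SST                      # coefficient of determination
s2 = SSE / (n - 2)                        # unbiased estimate of sigma^2
```

The decomposition only holds for the least squares fit; for an arbitrary line through the data the cross term does not vanish.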

17 MINITAB The regression equation is log10(stiff) = 8,25 + 0,125 density S = 0,243964 R-Sq = 90,3% R-Sq(adj) = 89,9% Analysis of Variance Source DF SS MS F P Regression 1 15,464 15,464 259,81 0,000 Residual Error 28 1,667 0,060 Total 29 17,130

18 Coefficient of determination
The relative amount of the total variance that is explained by the simple linear regression model:
R² = 1 − SSE/SST