Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope



Minimum MSE
Y is the response variable, X the predictor variable, and E(X) = E(Y) = 0. The best linear unbiased predictor (BLUP) of Y minimizes the average discrepancy
$\mathrm{var}(Y - uX) = C_{YY} - 2u\,C_{XY} + u^2 C_{XX}$.
This is minimized when u takes the value $b_1 = C_{XY}/C_{XX}$. $\hat{Y} = b_1 X$ is the regression of Y on X, and $b_1$ is the regression coefficient. The prediction error variance is
$\mathrm{var}(Y - \hat{Y}) = C_{YY} - C_{XY}^2/C_{XX}$.
If we relax the assumption that E(X) = E(Y) = 0, then $\hat{Y} = m_Y + b_1 (X - m_X)$.
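The minimizing value of u follows from a single differentiation step. A short worked derivation, using only the symbols defined on this slide (the name f for the quadratic is introduced here for convenience):

```latex
\begin{align*}
f(u)   &= C_{YY} - 2u\,C_{XY} + u^2 C_{XX}, \\
f'(u)  &= -2\,C_{XY} + 2u\,C_{XX} = 0
          \;\Longrightarrow\; u = b_1 = \frac{C_{XY}}{C_{XX}}, \\
f''(u) &= 2\,C_{XX} > 0 \quad \text{(so the stationary point is a minimum)}, \\
f(b_1) &= C_{YY} - 2\,\frac{C_{XY}^2}{C_{XX}} + \frac{C_{XY}^2}{C_{XX}}
        = C_{YY} - \frac{C_{XY}^2}{C_{XX}}.
\end{align*}
```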

[Figure: the regression line has slope $b = C_{XY}/C_{XX}$; the prediction error variance is $C_{YY} - C_{XY}^2/C_{XX}$.]

Parent-offspring regression
A trait is measured on offspring (Y) and parents ($X_1$, $X_2$). The mid-parent value is $X = (X_1 + X_2)/2$. According to genetic theory, $C_{XY} = \tfrac{1}{2} V_A$ and $C_{XX} = \tfrac{1}{2}(V_A + V_E)$. The regression coefficient (offspring on mid-parent) is $b_1 = C_{XY}/C_{XX} = V_A/(V_A + V_E)$, the heritability of the trait.

Francis Galton
Tall fathers tend to have tall sons, but the average height of the sons of tall fathers is less than the average height of those fathers. There is regression (falling back) to the overall mean.

[Figure: Galton's data, child plotted against parent.]

Bivariate normal distribution
In general, the best possible predictor is E(Y | X). For the bivariate normal distribution, E(Y | X) and the BLUP $\hat{Y}$ coincide. When we are dealing with normally distributed random variables, the BLUP is not just best among all linear predictors, but best among all predictors.

Sampling
Usually the (co)variances are not known but are estimated from a sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ from the bivariate distribution. The sample variances obtained by dividing $S_{xx}$ and $S_{yy}$ by $n - 1$ provide unbiased estimates of $C_{XX}$ and $C_{YY}$. The sample covariance obtained by dividing $S_{xy}$ by $n - 1$ is an unbiased estimate of $C_{XY}$, where $S_{xy}$ is the corrected sum of products. The regression coefficient of Y on X is then estimated by $\hat{b}_1 = S_{xy}/S_{xx}$.
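A minimal sketch of these estimators in R, assuming the blood pressure data that the deck enters on its R slide. It checks that dividing the corrected sums by n - 1 reproduces R's built-in `var()` and `cov()`, and that the divisors cancel in the slope estimate.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)
n   <- length(age)

# Corrected sums of squares and products (deviations from the means)
Sxx <- sum((age - mean(age))^2)
Sxy <- sum((age - mean(age)) * (bp - mean(bp)))

# Dividing by n - 1 gives the sample variance and sample covariance
stopifnot(isTRUE(all.equal(Sxx / (n - 1), var(age))))
stopifnot(isTRUE(all.equal(Sxy / (n - 1), cov(age, bp))))

# The n - 1 divisors cancel, so the slope estimate is just Sxy / Sxx
b1_hat <- Sxy / Sxx
print(b1_hat)
```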

Simple example
Blood pressure was measured on a sample of women of different ages. Ages were grouped into 10-year classes, and the mean b.p. was calculated for each age class.

Age class (yrs)    35    45    55    65    75
b.p. (mmHg)       114   124   143   158   166

Here age is fixed by the experimental design, not random. The model for the dependence of Y (b.p.) on X (age) is
$Y_i = b_0 + b_1 X_i + e_i, \quad i = 1, \ldots, n$.
The errors (residuals) $e_1, \ldots, e_n$ are independently distributed with zero mean and constant variance $\sigma^2$. Residuals $e_i$ are errors, and $\sigma^2$ is the error (residual) variance.

[Figure: Blood pressure data, bp plotted against age.]

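Calculations
Method I (not recommended). Calculate $\bar{X} = 55$, $\bar{Y} = 141$, $d_x = X - \bar{X}$, $d_y = Y - \bar{Y}$.

    X      Y    d_x    d_y   d_x^2   d_x d_y   d_y^2
   35    114    -20    -27     400       540     729
   45    124    -10    -17     100       170     289
   55    143      0      2       0         0       4
   65    158     10     17     100       170     289
   75    166     20     25     400       500     625
Total                         1000      1380    1936

$S_{xx} = \sum d_x^2 = 1000$, $S_{xy} = \sum d_x d_y = 1380$, $S_{yy} = \sum d_y^2 = 1936$.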

Calculations
Method II (recommended). Calculate the uncorrected sum of squares or products, then subtract a correction factor. The result is the corrected sum of squares or products:

  N    ΣX    ΣY     ΣX^2     ΣY^2     ΣXY
  5   275   705    16125   101341   40155

$S_{xx} = 16125 - 275^2/5 = 1000$
$S_{xy} = 40155 - 275 \times 705/5 = 1380$
$S_{yy} = 101341 - 705^2/5 = 1936$
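Method II translates directly into R. The sketch below assumes the blood pressure data entered on the deck's R slide and subtracts the correction factors from the uncorrected sums; the variable names are illustrative.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)
n   <- length(age)

# Uncorrected sums minus the correction factors
Sxx <- sum(age^2)    - sum(age)^2 / n          # corrected sum of squares of X
Sxy <- sum(age * bp) - sum(age) * sum(bp) / n  # corrected sum of products
Syy <- sum(bp^2)     - sum(bp)^2 / n           # corrected sum of squares of Y

print(c(Sxx = Sxx, Sxy = Sxy, Syy = Syy))
```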

Calculations
The equation of the line is $Y - 141 = 1.38\,(X - 55)$, or $Y = 65.1 + 1.38\,X$. The slope of the line is $\hat{b}_1 = 1.38$ mmHg/year, an average increase of 13.8 mmHg per decade. The intercept ($\hat{b}_0 = 65.1$) is the predicted value of Y when X = 0 (an extrapolation far outside the range of the data).

[Figure: Fitted line, bp plotted against age.]

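Residuals, fitted values
Values of Y predicted by the equation at the data values $X_1, \ldots, X_n$ are called fitted values ($\hat{Y}$). Differences between observed and fitted values ($Y - \hat{Y}$) are called residuals. For the blood pressure data, the fitted values and residuals are:

    X      Y   Fitted   Residual
   35    114    113.4        0.6
   45    124    127.2       -3.2
   55    143    141.0        2.0
   65    158    154.8        3.2
   75    166    168.6       -2.6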
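The fitted values and residuals can be computed from the line quoted on the Calculations slide. A small sketch, assuming the data from the deck's R slide and the coefficients 65.1 and 1.38 given there:

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)

# Intercept and slope quoted on the Calculations slide
b0 <- 65.1
b1 <- 1.38

fitted_vals <- b0 + b1 * age      # values predicted by the line
resid_vals  <- bp - fitted_vals   # observed minus fitted

print(data.frame(X = age, Y = bp, Fitted = fitted_vals, Residual = resid_vals))

# With an intercept in the model, least-squares residuals sum to zero
print(sum(resid_vals))
```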

[Figure: Fitted values and residuals, bp plotted against age.]

Deviation from the mean can be split into two components:
$Y_i - \bar{Y} = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$
Correspondingly, the total sum of squares splits into two components:
$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$
Total = Regression + Residual
The regression sum of squares is the corrected sum of squares of the fitted values. The residual sum of squares is the sum of squared residuals.

Anova calculation
The total sum of squares is $S_{yy}$. The regression sum of squares is $S_{xy}^2/S_{xx}$. The residual sum of squares is obtained by subtraction.
$S_{yy} = S_{xy}^2/S_{xx} + (S_{yy} - S_{xy}^2/S_{xx})$
Total = Regression + Residual

Anova table

Source        Df   Sum Sq   Mean Sq   F ratio
Regression     1   1904.4   1904.40     180.8
Residual       3     31.6     10.53
Total          4   1936.0

The regression sum of squares $S_{xy}^2/S_{xx}$ has one degree of freedom. For the blood pressure data, the total sum of squares has 4 d.f., leaving the residual sum of squares with 3 d.f. In general, with a sample of size n (n pairs of X and Y values), the total sum of squares has $n - 1$ d.f., and the residual sum of squares has $n - 2$ d.f. Mean squares are obtained by dividing the corresponding sum of squares by its number of degrees of freedom. The residual mean square (10.53) estimates the residual variance $\sigma^2$.
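The entries of the table can be reproduced by hand in R. This sketch assumes the data from the deck's R slide; the variable names are illustrative, and `anova(lm(bp ~ age))` would give the same decomposition.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)
n   <- length(age)

Sxx <- sum((age - mean(age))^2)
Sxy <- sum((age - mean(age)) * (bp - mean(bp)))
Syy <- sum((bp - mean(bp))^2)

ss_total <- Syy                  # n - 1 d.f.
ss_reg   <- Sxy^2 / Sxx          # 1 d.f.
ss_res   <- ss_total - ss_reg    # n - 2 d.f., obtained by subtraction

ms_reg  <- ss_reg / 1
ms_res  <- ss_res / (n - 2)      # estimates the residual variance
F_ratio <- ms_reg / ms_res

print(c(ss_reg = ss_reg, ss_res = ss_res, ms_res = ms_res, F_ratio = F_ratio))
```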

Variance explained
The proportion of variance explained is $R^2$ = (regression sum of squares)/(total sum of squares). The sample correlation coefficient is the positive square root of $R^2$ multiplied by ±1 (the sign of the regression coefficient or of $S_{xy}$).
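As a quick check of this relationship, the sketch below (assuming the data from the deck's R slide) computes $R^2$ from the corrected sums, attaches the sign of $S_{xy}$, and compares the result with R's `cor()`.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)

Sxx <- sum((age - mean(age))^2)
Sxy <- sum((age - mean(age)) * (bp - mean(bp)))
Syy <- sum((bp - mean(bp))^2)

# Regression sum of squares over total sum of squares
r_squared <- (Sxy^2 / Sxx) / Syy

# Correlation: positive square root of R^2 with the sign of Sxy
r <- sign(Sxy) * sqrt(r_squared)

stopifnot(isTRUE(all.equal(r, cor(age, bp))))
print(c(r_squared = r_squared, r = r))
```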

Is the apparent increase in b.p. with age real, or due to chance? The null hypothesis is $H_0: b_1 = 0$ (no relationship between X and Y). Under $H_0$, the F statistic has an F distribution with 1 and $n - 2$ d.f. $H_0$ is rejected for large values of F (a one-sided test). For the b.p. data, F = 180.8 with 1 and 3 d.f. Tables of the F distribution show this to be highly significant (P < 0.001).

Alternatively, the null distribution of $\hat{b}_1/E$ is t with $n - 2$ d.f., where $E = \sqrt{S^2/S_{xx}}$ is the estimated s.e. of $\hat{b}_1$. For the b.p. data, the s.e. of $\hat{b}_1$ is $\sqrt{10.53/1000} = 0.1026$, and $t = 1.38/0.1026 = 13.45$ with 3 d.f. Tables of the t distribution give P < 0.001 (two-sided test). Note that the t statistic is the square root of the F statistic. (When F has 1 and ν d.f., $\pm\sqrt{F}$ is t with ν d.f.)
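The equivalence of the two tests can be verified numerically. The following sketch assumes the data from the deck's R slide and uses R's `pt()` and `pf()` distribution functions in place of printed tables; the variable names are illustrative.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)
n   <- length(age)

Sxx <- sum((age - mean(age))^2)
Sxy <- sum((age - mean(age)) * (bp - mean(bp)))
Syy <- sum((bp - mean(bp))^2)

b1_hat <- Sxy / Sxx
S2     <- (Syy - Sxy^2 / Sxx) / (n - 2)   # residual mean square
E      <- sqrt(S2 / Sxx)                  # estimated s.e. of the slope

t_stat <- b1_hat / E
F_stat <- (Sxy^2 / Sxx) / S2

# Two-sided P value for t, upper-tail P value for F
p_t <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)
p_F <- pf(F_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)

# t^2 equals F, and the two tests give the same P value
stopifnot(isTRUE(all.equal(t_stat^2, F_stat)))
stopifnot(isTRUE(all.equal(p_t, p_F)))
print(c(se = E, t = t_stat, F = F_stat, p = p_t))
```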

Inspect residuals for evidence that model assumptions do not hold. Plot the residuals against the predictor variable or the fitted values. Plots may show evidence of systematic discrepancy, due to inadequacies in the model, or an isolated discrepancy, due to an outlier. An outlier has an unusually large residual. If possible, a reason should be found. Outliers may sometimes be rejected, cautiously.

A correlation between X and Y does not necessarily imply that a change in X causes a change in Y. The link may be between X and Z, and between Z and Y, where Z is a third (unobserved) variable. For example, a correlation between birth rate and tractor sales may arise simply because both variables are increasing over time.

The prediction $\hat{Y} = \bar{Y} + b_1 (X - \bar{X})$ has variance
$\sigma^2 \left[ 1/n + (X - \bar{X})^2/S_{xx} \right]$.
In the example, $\hat{Y}$ predicts the mean b.p. of a large number of women whose age is X. The prediction for one particular woman of age X is $\hat{Y}$ plus an error term with zero mean and variance $\sigma^2$. The prediction is unchanged, but its variance is now
$\sigma^2 \left[ 1 + 1/n + (X - \bar{X})^2/S_{xx} \right]$.
Limits are $\hat{Y} \pm kE$, where E is the square root of the appropriate variance, and k comes from tables of the normal distribution ($\sigma^2$ known) or tables of the t distribution ($\sigma^2$ estimated by $S^2$).
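Both sets of limits are available from `predict()` on a fitted `lm` object. The sketch below assumes the data from the deck's R slide, a hypothetical age of 50 chosen only for illustration, and 95% limits; it checks the built-in intervals against the two variance formulas above.

```r
# Blood pressure data as entered on the R slide of this deck
age <- c(35, 45, 55, 65, 75)
bp  <- c(114, 124, 143, 158, 166)
fit <- lm(bp ~ age)

x0      <- 50                      # hypothetical age, for illustration only
new_age <- data.frame(age = x0)

# Limits for the mean b.p. at age x0, and for one particular woman of age x0
ci_mean <- predict(fit, newdata = new_age, interval = "confidence", level = 0.95)
pi_one  <- predict(fit, newdata = new_age, interval = "prediction", level = 0.95)

# The same limits from the variance formulas, with sigma^2 estimated by S^2
n   <- length(age)
Sxx <- sum((age - mean(age))^2)
S2  <- sum(resid(fit)^2) / (n - 2)
k   <- qt(0.975, df = n - 2)       # t multiplier, since sigma^2 is estimated

E_mean <- sqrt(S2 * (1/n + (x0 - mean(age))^2 / Sxx))
E_one  <- sqrt(S2 * (1 + 1/n + (x0 - mean(age))^2 / Sxx))
y_hat  <- unname(ci_mean[1, "fit"])

stopifnot(isTRUE(all.equal(unname(ci_mean[1, "upr"]), y_hat + k * E_mean)))
stopifnot(isTRUE(all.equal(unname(pi_one[1, "upr"]),  y_hat + k * E_one)))
print(rbind(mean = ci_mean[1, ], individual = pi_one[1, ]))
```

The interval for an individual is always the wider of the two, because its variance carries the extra $\sigma^2$ term.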

    age <- c(35, 45, 55, 65, 75)
    bp <- c(114, 124, 143, 158, 166)
    fit <- lm(bp ~ age)
    summary(fit)
    anova(fit)
    plot(fit)

plot(fit) produces diagnostic plots. To graph the points and the fitted line:

    plot(bp ~ age)
    abline(fit)

summary output
summary(fit) prints the residuals, a coefficient table with columns Estimate, Std. Error, t value and Pr(>|t|) for the rows (Intercept) (flagged **) and age (flagged ***), the multiple R-squared, and the F-statistic on 1 and 3 DF with its p-value.

anova output
anova(fit) prints the Analysis of Variance Table for the response bp, with columns Df, Sum Sq, Mean Sq, F value and Pr(>F) and rows age (flagged ***) and Residuals.

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises

LINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on

More information

Multiple regression. Partial regression coefficients

Multiple regression. Partial regression coefficients Multiple regression We now generalise the results of simple linear regression to the case where there is one response variable Y and two predictor variables, X and Z. Data consist of n triplets of values

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

Oct Analysis of variance models. One-way anova. Three sheep breeds. Finger ridges. Random and. Fixed effects model. The random effects model

Oct Analysis of variance models. One-way anova. Three sheep breeds. Finger ridges. Random and. Fixed effects model. The random effects model s s Oct 2017 1 / 34 s Consider N = n 0 + n 1 + + n k 1 observations, which form k groups, of sizes n 0, n 1,..., n k 1. The r-th group has sample mean Ȳ r The overall mean (for all groups combined) is

More information

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Math 1710 Class 20. V2u. Last Time. Graphs and Association. Correlation. Regression. Association, Correlation, Regression Dr. Back. Oct.

Math 1710 Class 20. V2u. Last Time. Graphs and Association. Correlation. Regression. Association, Correlation, Regression Dr. Back. Oct. ,, Dr. Back Oct. 14, 2009 Son s Heights from Their Fathers Galton s Original 1886 Data If you know a father s height, what can you say about his son s? Son s Heights from Their Fathers Galton s Original

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

MODELS WITHOUT AN INTERCEPT

MODELS WITHOUT AN INTERCEPT Consider the balanced two factor design MODELS WITHOUT AN INTERCEPT Factor A 3 levels, indexed j 0, 1, 2; Factor B 5 levels, indexed l 0, 1, 2, 3, 4; n jl 4 replicate observations for each factor level

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Chapter 1. Linear Regression with One Predictor Variable

Chapter 1. Linear Regression with One Predictor Variable Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical

More information

CAS MA575 Linear Models

CAS MA575 Linear Models CAS MA575 Linear Models Boston University, Fall 2013 Midterm Exam (Correction) Instructor: Cedric Ginestet Date: 22 Oct 2013. Maximal Score: 200pts. Please Note: You will only be graded on work and answers

More information

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

y ˆ i = ˆ  T u i ( i th fitted value or i th fit) 1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Linear Regression Linear Regression with One Independent Variable 2.1 Introduction In Chapter 1 we introduced the linear model as an alternative for making inferences on means of one or

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Lecture 15. Hypothesis testing in the linear model

Lecture 15. Hypothesis testing in the linear model 14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is

More information

27. SIMPLE LINEAR REGRESSION II

27. SIMPLE LINEAR REGRESSION II 27. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression Reading: Hoff Chapter 9 November 4, 2009 Problem Data: Observe pairs (Y i,x i ),i = 1,... n Response or dependent variable Y Predictor or independent variable X GOALS: Exploring

More information

Multiple Regression: Example

Multiple Regression: Example Multiple Regression: Example Cobb-Douglas Production Function The Cobb-Douglas production function for observed economic data i = 1,..., n may be expressed as where O i is output l i is labour input c

More information

Lecture 1 Linear Regression with One Predictor Variable.p2

Lecture 1 Linear Regression with One Predictor Variable.p2 Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of

More information

Introduction and Single Predictor Regression. Correlation

Introduction and Single Predictor Regression. Correlation Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.

More information

Simple and Multiple Linear Regression

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

More information

2 Regression Analysis

2 Regression Analysis FORK 1002 Preparatory Course in Statistics: 2 Regression Analysis Genaro Sucarrat (BI) http://www.sucarrat.net/ Contents: 1 Bivariate Correlation Analysis 2 Simple Regression 3 Estimation and Fit 4 T -Test:

More information

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement

More information

Statistics - Lecture Three. Linear Models. Charlotte Wickham 1.

Statistics - Lecture Three. Linear Models. Charlotte Wickham   1. Statistics - Lecture Three Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Linear Models 1. The Theory 2. Practical Use 3. How to do it in R 4. An example 5. Extensions

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Correlation and Regression Notes. Categorical / Categorical Relationship (Chi-Squared Independence Test)

Correlation and Regression Notes. Categorical / Categorical Relationship (Chi-Squared Independence Test) Relationship Hypothesis Tests Correlation and Regression Notes Categorical / Categorical Relationship (Chi-Squared Independence Test) Ho: Categorical Variables are independent (show distribution of conditional

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

Topic 14: Inference in Multiple Regression

Topic 14: Inference in Multiple Regression Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

Chapter 2. Continued. Proofs For ANOVA Proof of ANOVA Identity. the product term in the above equation can be simplified as n

Chapter 2. Continued. Proofs For ANOVA Proof of ANOVA Identity. the product term in the above equation can be simplified as n Chapter 2. Continued Proofs For ANOVA Proof of ANOVA Identity We are going to prove that Writing SST SSR + SSE. Y i Ȳ (Y i Ŷ i ) + (Ŷ i Ȳ ) Squaring both sides summing over all i 1,...n, we get (Y i Ȳ

More information

Statistics 512: Solution to Homework#11. Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat).

Statistics 512: Solution to Homework#11. Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat). Statistics 512: Solution to Homework#11 Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat). 1. Perform the two-way ANOVA without interaction for this model. Use the results

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Simple Linear Regression: A Model for the Mean. Chap 7

Simple Linear Regression: A Model for the Mean. Chap 7 Simple Linear Regression: A Model for the Mean Chap 7 An Intermediate Model (if the groups are defined by values of a numeric variable) Separate Means Model Means fall on a straight line function of the

More information

ANOVA Analysis of Variance

ANOVA Analysis of Variance ANOVA Analysis of Variance ANOVA Analysis of Variance Extends independent samples t test ANOVA Analysis of Variance Extends independent samples t test Compares the means of groups of independent observations

More information

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,

PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable, Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.

More information

6. Multiple Linear Regression

6. Multiple Linear Regression 6. Multiple Linear Regression SLR: 1 predictor X, MLR: more than 1 predictor Example data set: Y i = #points scored by UF football team in game i X i1 = #games won by opponent in their last 10 games X

More information

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow) STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

Chapter 8: Correlation & Regression

Chapter 8: Correlation & Regression Chapter 8: Correlation & Regression We can think of ANOVA and the two-sample t-test as applicable to situations where there is a response variable which is quantitative, and another variable that indicates

More information

Math 3330: Solution to midterm Exam

Math 3330: Solution to midterm Exam Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the

More information

Linear models and their mathematical foundations: Simple linear regression

Steffen Unkel, Department of Medical Statistics, University Medical Center Göttingen, Germany. Winter term 2018/19. Introduction

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Matrix algebra. Matrix representation of simple linear regression model. 1 Vectors and Matrices: Where it is necessary to consider a distribution

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

The Society provides these solutions to assist candidates preparing for the examinations

9 Correlation and Regression

SW, Chapter 12. Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then retakes the

2.4.3 Estimatingσ Coefficient of Determination 2.4. ASSESSING THE MODEL 23

2.4 ASSESSING THE MODEL. 2.4.3 Estimating $\sigma^2$. Note that the sums of squares are functions of the conditional random variables $Y_i = (Y \mid X = x_i)$. Hence, the sums of squares are random variables as well.

Inference for the Regression Coefficient

Recall, $b_0$ and $b_1$ are the estimates of the intercept $\beta_0$ and slope $\beta_1$ of the population regression line. We can show that $b_0$ and $b_1$ are the unbiased estimates

Simple linear regression

Prof. Giuseppe Verlato, Unit of Epidemiology & Medical Statistics, Dept. of Diagnostics & Public Health, University of Verona. Statistics with two variables: two nominal variables:

General Linear Model (Chapter 4)

Outcome variable is considered continuous; simple linear regression; scatterplots; OLS is BLUE under basic assumptions; MSE estimates residual variance; testing regression coefficients

Correlation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0

Example: Obesity index and blood pressure of n people randomly chosen from a population. Correlation?

Regression Models - Introduction

In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

Scatter plot of data from the study. Linear Regression

Consider a study to relate birthweight to the estriol level of pregnant women. The data is below.

Statistics for Engineers Lecture 9 Linear Regression

Chong Ma, Department of Statistics, University of South Carolina, chongm@email.sc.edu. STAT 509, Spring 2017, April 17, 2017.

14 Multiple Linear Regression

B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice. 14.1 The multiple linear regression model. In simple linear regression, the response variable y is expressed in

Chapter 12 - Lecture 2 Inferences about regression coefficient

April 19th, 2010. Facts about slope; Test Statistic; Confidence interval; Hypothesis testing; Test using ANOVA Table. Facts about slope: In previous

Lecture 3: Inference in SLR

STAT 51 Spring 2011. Background Reading KNNL:.1.6 3-1 Topic Overview. This topic will cover: Review of hypothesis testing; Inference about 1; Inference about 0; Confidence Intervals

Chapter 14 Simple Linear Regression (A)

1. Characteristics. Managerial decisions often are based on the relationship between two or more variables. Regression analysis can be used to develop an equation showing how the variables

Chapter 2 Inferences in Simple Linear Regression

STAT 525 SPRING 2018, Professor Min Zhang. Testing for Linear Relationship: Term $\beta_1 X_i$ defines linear relationship. Will then test $H_0: \beta_1 = 0$. Test requires

Correlation and simple linear regression S5

Basic medical statistics for clinical and experimental research. Katarzyna Jóźwiak, k.jozwiak@nki.nl. November 15, 2017. Introduction. Example: Brain size and

Lecture 18: Simple Linear Regression

BIOS 553, Department of Biostatistics, University of Michigan, Fall 2004. The Correlation Coefficient: r. The correlation coefficient (r) is a number that measures the strength

Concordia University (5+5)Q 1.

(5+5)Q 1. Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Mid Term Test May 26, 2004 Two Hours 3 Instructor Course Examiner

STAT2012 Statistical Tests 23 Regression analysis: method of least squares

L23 Regression analysis. The main purpose of regression is to explore the dependence of one variable (Y) on another variable (X). 23.1 Introduction (P.532-555)

Inferences for Regression

An Example: Body Fat and Waist Size. Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

In this lecture we will study a particular type of regression model: the linear regression model. We will first consider the case of the model with one predictor

Correlation and Regression

October 25, 2017. STAT 151 Class 9. Outline of Topics: 1 Associations, 2 Scatter plot, 3 Correlation, 4 Regression, 5 Testing and estimation, 6 Goodness-of-fit

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007

LAST NAME: SOLUTIONS. FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 / STA 1001. INSTRUCTIONS: Time: 90 minutes. Aids allowed: calculator.

STA 303H1F: Two-way Analysis of Variance Practice Problems

1. In the Pygmalion example from lecture, why are the average scores of the platoon used as the response variable, rather than the scores of the

Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections

Sections 2.1-2.3, by Iain Pardoe. 2.1 Probability model for X and Y. Simple linear regression model for X and Y