From last time... The equations

Size: px
Start display at page:

Download "From last time... The equations"

Transcription

1 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2006, The Johns Hopkins University and Karl W. Broman. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided AS IS ; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.

2 From last time corr = 0.78 Father s height (inches) Father s span (inches) The equations Regression of y on x (for predicting y from x) Slope = r SD(y) SD(x) Goes through the point ( x, ȳ) ŷ ȳ = r SD(y) SD(x) (x x) ŷ = ˆβ 0 + ˆβ 1 x where ˆβ 1 = r SD(y) SD(x) and ˆβ 0 = ȳ ˆβ 1 x Regression of x on y (for predicting x from y) Slope = r SD(x) SD(y) ˆx x = r SD(x) SD(y) (y ȳ) Goes through the point (ȳ, x) ˆx = ˆβ 0 + ˆβ 1 y where ˆβ 1 = r SD(x) SD(y) and ˆβ 0 = x ˆβ 1 ȳ

3 Histograms Spans mean = 68.7 SD = span (inches) mean = 67.7 SD = 2.7 Heights height (inches) Error in prediction Having no information about x, Predict y as ȳ Typical prediction error: SD(y) For predicting height, SD(y) 2.73 Having been told about x, Predict y using the regression line: ŷ = ˆβ 0 + ˆβ 1 x Typical prediction error: SD(y) 1 r 2 For predicting height from span, SD(y) 1 r

4 Back to David Sullivans data... pf3d Y = X 0.30 OD H2O2 concentration Model: y i = β 0 + β 1 x i + ǫ i where ǫ i iid Normal(0,σ 2 ) Estimates: ˆβ1 = i (x i x) (y i ȳ)/ i (x i x) 2 ˆβ 0 = ȳ ˆβ 1 x ˆσ = i (y i ŷ i ) 2 /(n 2) Parameter estimates We already know that and in particular (n 2) ˆσ2 σ 2 χ2 n 2 E(ˆσ 2 ) = σ 2 What about ˆβ 0 and ˆβ 1?

5 Parameter estimates (2) One can show that E( ˆβ 0 ) = β 0 E( ˆβ 1 ) = β 1 ( Var( ˆβ 0 ) = σ 2 1 n + x ) 2 Var( ˆβ 1 ) = σ2 Cov( ˆβ 0, ˆβ 1 ) = σ 2 x Cor( ˆβ 0, ˆβ 1 ) = x x 2 + /n Note: We re thinking of the x s as fixed. Parameter estimates (3) One can even show that the distribution of ˆβ 0 and ˆβ 1 is a bivariate normal distribution! ( ) ˆβ0 ˆβ 1 N(β, Σ) where β = ( β0 β 1 ) and Σ = σ 2 1 n + x 2 x x 1

6 OD H 2 O OD H 2 O 2

7 slope y intercept Confidence intervals We know that ˆβ 0 N ( (β 0, σ 2 1 n + x )) 2 ˆβ 1 N ( β 1, σ 2 ) We can use those distributions for hypothesis testing and to construct confidence intervals!

8 Statistical inference We want to test: H 0 : β 1 = β 1 versus H a : β 1 β 1 Generally, β 1 is 0. We use t = ˆβ 1 β 1 se( ˆβ 1 ) t n 2 where se( ˆβ 1 ) = ˆσ 2 Also, [ ˆβ1 t (1 α 2 ),n 2 se( ˆβ 1 ), ˆβ 1 + t (1 α 2 ),n 2 se( ˆβ ] 1 ) is a (1 α) 100% confidence interval for β 1. Results The calculations in the test H 0 : β 0 = β0 versus H a : β 0 β0 are analogous, except that we have to use ( ) se( ˆβ 1 0 ) = ˆσ 2 n + x2 For the pf3d7 data we get the 95% confidence intervals (0.342, 0.364) for the intercept ( , ) for the slope Testing whether the intercept (slope) is equal to zero, we obtain 70.7 ( 22.0) as test statistic. This corresponds to a p-value of ( ).

9 Now how about that Testing for the slope being equal to zero, we use t = ˆβ 1 se( ˆβ 1 ) For the squared test statistic we get t 2 = ( ˆβ1 se( ˆβ 1 ) ) 2 = ˆβ 2 1 ˆσ 2 / = ˆβ 2 1 ˆσ 2 = (SYY RSS)/1 RSS/n 2 = MS reg MSE = F The squared t statistic is the same as the F statistic from the ANOVA! Joint confidence region A 95% joint confidence region for the two parameters is the set of all values (β 0,β 1 ) that fulfill ( ) T ( β0 n i x ) ( ) i β 1 i x β0 i i x2 i β 1 2ˆσ 2 F (0.95),2,n-2 where β 0 = β 0 ˆβ 0 and β 1 = β 1 ˆβ 1.

10 β^1 β^0 Notation Assume we have n observations: (x 1, y 1 ),...,(x n, y n ). We previously defined = i (x i x) 2 = i x 2 i n( x) 2 SYY = i (y i ȳ) 2 = i y 2 i n(ȳ) 2 SXY = i (x i x)(y i ȳ) = i x i y i n xȳ We also define r XY = SXY SYY (called the sample correlation)

11 Coefficient of determination In the previous lecture we wrote Define SS reg = SYY RSS = (SXY)2 R 2 = SS reg SYY = 1 RSS SYY R 2 is often called the coefficient of determination. Notice that R 2 = SS reg SYY = (SXY) 2 SYY = r2 XY Back to the Sullivan data David Sullivan was actually interested in the slopes when one re-scales the y-axis so that the y-intercept is at 1. y = β 0 + β 1 x + ǫ becomes y/β 0 = 1 + (β 1 /β 0 )x + ǫ So we re really interested in β 1 /β 0. We d estimate that by ˆβ 1 / ˆβ 0, but what about its standard error?

12 First-order Taylor expansion Consider f(x, y) = x/y. A first-order Taylor expansion to approximate the function would be f(x, y) f(x 0, y 0 ) + (x x 0 ) f + (y y 0 ) f x y (x0,y 0 ) (x0,y 0 ) Since f/ x = 1/y and f/ y = x/y 2, we obtain the following: x/y x 0 /y 0 + (x x 0 )/y 0 (y y 0 )x 0 /y 2 0 = (x 0 /y 0 )[1 + (x x 0 )/x 0 + (y y 0 )/y 0 ] How do we use this? We use the first-order Taylor expansion of ˆβ 1 / ˆβ 0 around β 1 and β 0. Variance of a ratio Remember that β 1 and β 0 are fixed, while ˆβ 1 and ˆβ 0 are random. Add the fact that var(x+y) = var(x) + var(y) + 2 cov(x,y) var{ ˆβ 1 / ˆβ 0 } var{(β 1 /β 0 )[1 + ( ˆβ 1 β 1 )/β 1 + ( ˆβ 0 β 0 )/β 0 ]} = (β 1 /β 0 ) 2 {var( ˆβ 1 )/β var( ˆβ 0 )/β cov( ˆβ 1, ˆβ 0 )/(β 1 β 0 )} We then replace β 1 and β 0 in this formula with our estimates of them, ˆβ 1 and ˆβ 0. Further, we replace the variances and covariance with our estimates. var{ ˆ ˆβ 1 / ˆβ 0 } = ( ˆβ 1 / ˆβ 0 ) 2 { var( ˆ ˆβ 1 )/ ˆβ var( ˆ ˆβ 0 )/ ˆβ cov( ˆ ˆβ 1, ˆβ 0 )/( ˆβ 1 ˆβ0 )} The estimated SE is then SE{ ˆ ˆβ 1 / ˆβ 0 } = ˆβ 1 / ˆβ 0 [ SE( ˆ ˆβ 1 )/ ˆβ 1 ] 2 + [ SE( ˆ ˆβ 0 )/ ˆβ 0 ] cov( ˆ ˆβ 1, ˆβ 0 )/( ˆβ 1 ˆβ0 )

13 Results pf3d7: ˆβ 0 = 0.353(0.005) ˆβ1 = (0.0002) ˆ cov( ˆβ 1, ˆβ 0 ) = ˆβ 1 / ˆβ = 1.10 (SE = 0.07). estimate SE bhem pgalnoel pgal pyoelii pf3d pviv pknow pov pbr pfhz

Simple Linear Regression. John McGready Johns Hopkins University

Simple Linear Regression. John McGready Johns Hopkins University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Sampling Variability and Confidence Intervals. John McGready Johns Hopkins University

Sampling Variability and Confidence Intervals. John McGready Johns Hopkins University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Statistics for laboratory scientists II

Statistics for laboratory scientists II This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Lecture 27. December 13, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 27. December 13, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Section B. The Theoretical Sampling Distribution of the Sample Mean and Its Estimate Based on a Single Sample

Section B. The Theoretical Sampling Distribution of the Sample Mean and Its Estimate Based on a Single Sample This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

1 Hypothesis testing for a single mean

1 Hypothesis testing for a single mean This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Lecture 21. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 21. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Outline. 1. Define likelihood 2. Interpretations of likelihoods 3. Likelihood plots 4. Maximum likelihood 5. Likelihood ratio benchmarks

Outline. 1. Define likelihood 2. Interpretations of likelihoods 3. Likelihood plots 4. Maximum likelihood 5. Likelihood ratio benchmarks This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Example. Test for a proportion

Example. Test for a proportion This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

Intro to Linear Regression

Intro to Linear Regression Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor

More information

Linear Methods for Prediction

Linear Methods for Prediction This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx

f X, Y (x, y)dx (x), where f(x,y) is the joint pdf of X and Y. (x) dx INDEPENDENCE, COVARIANCE AND CORRELATION Independence: Intuitive idea of "Y is independent of X": The distribution of Y doesn't depend on the value of X. In terms of the conditional pdf's: "f(y x doesn't

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

Introduction to Structural Equations

Introduction to Structural Equations This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Advanced Structural Equations Models I

Advanced Structural Equations Models I This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Inference using structural equations with latent variables

Inference using structural equations with latent variables This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

MA 575, Linear Models : Homework 2

MA 575, Linear Models : Homework 2 MA 575, Linear Models : Homework 2 Question x i x 2 x 2 i 2x i x + x 2 x 2 i 2 x x i + n x 2 x 2 i 2 x x i + x x 2 i x i x x i xx i x i Question 2 RSSβ 0, β RSSβ 0, β β 0 RSSβ 0, β β y i β 0 + β x i 2

More information

Normalization. Example of Replicate Data. Biostatistics Rafael A. Irizarry

Normalization. Example of Replicate Data. Biostatistics Rafael A. Irizarry This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Lecture 23. November 15, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Lecture 23. November 15, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Correlation and the Analysis of Variance Approach to Simple Linear Regression Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression Reading: Hoff Chapter 9 November 4, 2009 Problem Data: Observe pairs (Y i,x i ),i = 1,... n Response or dependent variable Y Predictor or independent variable X GOALS: Exploring

More information

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 Shaobo Li University of Cincinnati 1 Partially based on Hastie, et al. (2009) ESL, and James, et al. (2013)

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Probability measures A probability measure, P, is a real valued function from the collection of possible events so that the following

Probability measures A probability measure, P, is a real valued function from the collection of possible events so that the following This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

III. Inferential Tools

III. Inferential Tools III. Inferential Tools A. Introduction to Bat Echolocation Data (10.1.1) 1. Q: Do echolocating bats expend more enery than non-echolocating bats and birds, after accounting for mass? 2. Strategy: (i) Explore

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Estadística II Chapter 4: Simple linear regression

Estadística II Chapter 4: Simple linear regression Estadística II Chapter 4: Simple linear regression Chapter 4. Simple linear regression Contents Objectives of the analysis. Model specification. Least Square Estimators (LSE): construction and properties

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University. Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall

More information

Toxicokinetics: Absorption, Distribution, and Excretion. Michael A. Trush, PhD Johns Hopkins University

Toxicokinetics: Absorption, Distribution, and Excretion. Michael A. Trush, PhD Johns Hopkins University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information. STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory

More information

F3: Classical normal linear rgression model distribution, interval estimation and hypothesis testing

F3: Classical normal linear rgression model distribution, interval estimation and hypothesis testing F3: Classical normal linear rgression model distribution, interval estimation and hypothesis testing Feng Li Department of Statistics, Stockholm University What we have learned last time... 1 Estimating

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

Regression. Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables X and Y.

Regression. Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables X and Y. Regression Bivariate i linear regression: Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables and. Generally describe as a

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

Simple Linear Regression for the MPG Data

Simple Linear Regression for the MPG Data Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory

More information

STAT 4385 Topic 03: Simple Linear Regression

STAT 4385 Topic 03: Simple Linear Regression STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis

More information

Math 3330: Solution to midterm Exam

Math 3330: Solution to midterm Exam Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

Probability Pr(A) 0, for any event A. 2. Pr(S) = 1, for the sample space S. 3. If A and B are mutually exclusive, Pr(A or B) = Pr(A) + Pr(B).

Probability Pr(A) 0, for any event A. 2. Pr(S) = 1, for the sample space S. 3. If A and B are mutually exclusive, Pr(A or B) = Pr(A) + Pr(B). This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

STAT2012 Statistical Tests 23 Regression analysis: method of least squares

STAT2012 Statistical Tests 23 Regression analysis: method of least squares 23 Regression analysis: method of least squares L23 Regression analysis The main purpose of regression is to explore the dependence of one variable (Y ) on another variable (X). 23.1 Introduction (P.532-555)

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is

More information

Lecture 15. Hypothesis testing in the linear model

Lecture 15. Hypothesis testing in the linear model 14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma

More information

Lecture 14 Simple Linear Regression

Lecture 14 Simple Linear Regression Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent

More information

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables Regression Analysis Regression: Methodology for studying the relationship among two or more variables Two major aims: Determine an appropriate model for the relationship between the variables Predict the

More information

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata' Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression EdPsych 580 C.J. Anderson Fall 2005 Simple Linear Regression p. 1/80 Outline 1. What it is and why it s useful 2. How 3. Statistical Inference 4. Examining assumptions (diagnostics)

More information

FENG CHIA UNIVERSITY ECONOMETRICS I: HOMEWORK 4. Prof. Mei-Yuan Chen Spring 2008

FENG CHIA UNIVERSITY ECONOMETRICS I: HOMEWORK 4. Prof. Mei-Yuan Chen Spring 2008 FENG CHIA UNIVERSITY ECONOMETRICS I: HOMEWORK 4 Prof. Mei-Yuan Chen Spring 008. Partition and rearrange the matrix X as [x i X i ]. That is, X i is the matrix X excluding the column x i. Let u i denote

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Regression and Covariance

Regression and Covariance Regression and Covariance James K. Peterson Department of Biological ciences and Department of Mathematical ciences Clemson University April 16, 2014 Outline A Review of Regression Regression and Covariance

More information

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally

More information

3 Multiple Linear Regression

3 Multiple Linear Regression 3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

Chapter 1. Linear Regression with One Predictor Variable

Chapter 1. Linear Regression with One Predictor Variable Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical

More information

STAT 350: Geometry of Least Squares

STAT 350: Geometry of Least Squares The Geometry of Least Squares Mathematical Basics Inner / dot product: a and b column vectors a b = a T b = a i b i a b a T b = 0 Matrix Product: A is r s B is s t (AB) rt = s A rs B st Partitioned Matrices

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Inference in Regression Analysis

Inference in Regression Analysis Inference in Regression Analysis Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 4, Slide 1 Today: Normal Error Regression Model Y i = β 0 + β 1 X i + ǫ i Y i value

More information

Simple Linear Regression for the Climate Data

Simple Linear Regression for the Climate Data Prediction Prediction Interval Temperature 0.2 0.0 0.2 0.4 0.6 0.8 320 340 360 380 CO 2 Simple Linear Regression for the Climate Data What do we do with the data? y i = Temperature of i th Year x i =CO

More information

Regression Estimation Least Squares and Maximum Likelihood

Regression Estimation Least Squares and Maximum Likelihood Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize

More information

1. Simple Linear Regression

1. Simple Linear Regression 1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

One-Way ANOVA Model. group 1 y 11, y 12 y 13 y 1n1. group 2 y 21, y 22 y 2n2. group g y g1, y g2, y gng. g is # of groups,

One-Way ANOVA Model. group 1 y 11, y 12 y 13 y 1n1. group 2 y 21, y 22 y 2n2. group g y g1, y g2, y gng. g is # of groups, One-Way ANOVA Model group 1 y 11, y 12 y 13 y 1n1 group 2 y 21, y 22 y 2n2 group g y g1, y g2, y gng g is # of groups, n i denotes # of obs in the i-th group, and the total sample size n = g i=1 n i. 1

More information

STAT 3A03 Applied Regression With SAS Fall 2017

STAT 3A03 Applied Regression With SAS Fall 2017 STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.

More information

using the beginning of all regression models

using the beginning of all regression models Estimating using the beginning of all regression models 3 examples Note about shorthand Cavendish's 29 measurements of the earth's density Heights (inches) of 14 11 year-old males from Alberta study Half-life

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

More information

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

STA442/2101: Assignment 5

STA442/2101: Assignment 5 STA442/2101: Assignment 5 Craig Burkett Quiz on: Oct 23 rd, 2015 The questions are practice for the quiz next week, and are not to be handed in. I would like you to bring in all of the code you used to

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Multivariate Regression Analysis

Multivariate Regression Analysis Matrices and vectors The model from the sample is: Y = Xβ +u with n individuals, l response variable, k regressors Y is a n 1 vector or a n l matrix with the notation Y T = (y 1,y 2,...,y n ) 1 x 11 x

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,

More information

Two-Variable Regression Model: The Problem of Estimation

Two-Variable Regression Model: The Problem of Estimation Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable

More information

Overview Scatter Plot Example

Overview Scatter Plot Example Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables

More information

Linear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77

Linear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77 Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information