2. Regression Review

Size: px
Start display at page:

Download "2. Regression Review"

Transcription

1 2. Regression Review 2.1 The Regression Model The general form of the regression model y t = f(x t, β) + ε t where x t = (x t1,, x tp ), β = (β 1,..., β m ). ε t is a random variable, Eε t = 0, Var(ε t ) = σ 2, and the error ε t are uncorrelated. Normal distribution assumption for error ε t implies independence among error 1

2 Examples: 1. y t = β 0 + ε t (constant mean model) 2. y t = β 0 + β 1 x t + ε t (simple linear regression model) 3. y t = β 0 exp(β 1 x t ) + ε t (exponential growth model) 4. y t = β 0 + β 1 x t + β 2 x 2 t + ε t (quadratic model) 5. y t = β 0 + β 1 x t1 + β 2 x t2 + ε t (linear model with two independent variables) 6. y t = β 0 + β 1 x t1 + β 2 x t2 + β 11 x 2 t1 + β 22 x 2 t2 + β 12 x t1 x t2 + ε t (quadratic model with two independent variables) Linear Models: see Example 1, 2, 4, 5, 6 2

3 3

4 2.2 Prediction from regression models with known coefficients prediction y pred k forecast error e k = y pred k for y k = f(x k, β) + ε k y k The expected value of the squared forecast error E(e 2 k) = E(y pred k y k ) 2 = σ 2 + E[f(x k, β) y pred k ]2 Minimum mean square error(mmse) y pred k = f(x k, β) 100(1 α) percent prediction interval for the future value y k [f(x k, β) µ α/2 σ; f(x k, β) + µ α/2 σ] where µ α/2 is the 100(1 α/2) percentage point of the N (0, 1) distribution. 4

5 y t = β 0 + β 1 x t + ε t Least squares estimates of unknown coefficients The parameter estimates that minimize the sum of the squared deviations S(β) = n [y t f(x t ; β)] 2 t=1 are called the least squares estimates and are denoted by ˆβ. Examples 1. Simple linear regression model through the origin y t = βx t + ε t. 2. Simple linear regression model

6 Example: Running Performance For illustration, consider the data list in Table. In this example we are interested in predicting the racing performance of trained female distance runners. The measured variables include the running performance in 10-km road race; body composition such as height, weight, skinfold sum, and relative body fat; and maximal aerobic power. The subjects for study were 14 trained and competition-experienced female runners who had placed among the top 20 women in a 10-km road race with more than 300 female entrants. The laboratory data were collected during the week following the competitive run. For the moment we are interested only in the relationship between the racing performance(running time) and maximal aerobic power(volume of maximal oxygen uptake; V O2.) 6

7 7

8 8

9 Estimation in the General Linear Regression Model Linear Regression models can be written as y t = β 0 + β 1 x t1 + β 2 x t2 + + β p x tp + ε t or y t = x tβ + ε t where x t = (1, x t1,..., x tp ) and β = (β 0, β 1,..., β p ) To simplify the calculation of the least squares estimates for the general linear regression model, we introduce matrix notation, y = Xβ + ε where E(ε) = 0, Var((ε)) = σ 2 I. Then E(y) = Xβ, Var(y) = E[y E(y)][y E(y)] = σ 2 I 9

10 In matrix notation the least squares criterion can be expressed as minimizing S(β) = n (y t x tβ) 2 = (y Xβ) (y Xβ) t=1 Then the solution is then given by ˆβ = (X X) 1 X y Two special cases, 1. Simple linear regression through the origin: y t = βx t + ε t, for 1 t n. 2. Simple linear regression model y t = β 0 + β 1 x t + ε t, for 1 t n. 10

11 2.4 Properties of Least Squares Estimation 1. E(ˆβ) = β, Var(ˆβ) = σ 2 (X X) If it is assumed that errors in linear model are normally distribution, then ˆβ N p+1 [β, σ 2 (X X) 1 ] 3. The difference between the observed and fitted values are called residuals and are given by e = y ŷ = y X(X X) 1 X y = [I X(X X) 1 X ]y. Then we have X e = X [I X(X X) 1 X ]y = 0. 11

12 4. The variation of the observations y t around their mean ȳ, total sum of squares (SSTO) SSTO = (y t ŷ) 2 = y 2 t nȳ 2 = y y nȳ 2. The variation of fitted value ŷ t around the mean ȳ, sum of squares due to regression (SSR) SSR = (ŷ t ȳ) 2 = ŷ 2 t nȳ 2 = ˆβ X y nȳ 2. sum of squares due to error (or residual) (SSE) SSE = (y t ŷ t ) 2 = e e. Then SSTO = SSR + SSE and define the coefficient of determination R 2 as the ratio of the sum of squares due to regression and the total sum of squares: R 2 = SSR SSTO = 1 SSE SSTO 12

13 5. The variance σ 2 is usually unknown. However it can be estimated by the mean square error s 2 = SSE n p 1. Hence the estimated covariance matrix of the least squares estimator ˆβ is Var(ˆβ) = s 2 (X X) 1 The estimated standard error of ˆβ i is given by s ˆβi = s c ii, where c ii is the corresponding diagonal element in (X X) The least square estimate ˆβ and s 2 are independent. 13

14 2.5 Confidence Intervals and Hypothesis Testing Consider the quantity ˆβ i β i s c ii, i = 0, 1,..., p where ˆβ i is the least square estimate of β i, and s ˆβi = s c ii is its standard error. This quantity has a t-distribution with n p 1 degrees of freedom. Confidence intervals A 100(1 α) confidence interval for the unknown parameter β i is given by [ ˆβ i t α/2 (n p 1)s c ii, ˆβ i + t α/2 (n p 1)s c ii ] where t α/2 (n p 1) is the 100(1 α/2) percentage point of a t distribution with n p 1 degree of freedom 14

15 Hypothesis Tests for individual Coefficients H 0 : β i = β i0 vs H 1 : β i β i0 The test statistic is given by t = ˆβ i β i0 s c ii If t > t α/2(n p 1), we reject H 0 in favor of H 1 at significance level α. If t t α/2(n p 1), there is not enough evidence for rejecting the null hypothesis H 0 in favor of H 1. Thus loosely speaking, we accept H 0 at significance level α. Special case when β i0 = 0. 15

16 2.6 Prediction from Regression Models with Estimated Coefficients The MMSE prediction of a future y k from regression model y t = β 0 + β 1 x t1 + + β p x tp + ε t is given by y pred k = β 0 + β 1 x k1 + + β p x kp = x kβ Replace β by their least estimate ˆβ = (X X) 1 X y ŷ pred k = ˆβ 0 + ˆβ 1 x k1 + + ˆβ p x kp = x k ˆβ 16

17 These predictions have the following properties 1. Unbiased E(y k ŷ pred ) = 0 2. ŷ pred is the minimum mean square error forecast among all linear unbias forecast 3. The variance of the forecast error y k ŷ pred k is given by Var(y k ŷ pred k ) = Var{ε k + x k(β ˆβ)} = σ 2 [1 + x k(x X) 1 x k ] 4. The estimated variance of the forecast error is Var(y k ŷ pred k ) = s2 [1 + x k(x X) 1 x k ] A 100(1 α) percent prediction interval for the future y k is then given by its upper and lower limits ŷ pred k ± t α/2 (n p 1)s[1 + x k(x X) 1 x k ]

18 Example: Running Performance The simple linear regression model y t = β 0 + β 1 x t + ε t where y t is the time to run the 10-km race, and x t is the maximal aerobic capacity V O2. The least square estimate of ˆβ 0 = and β 1 = The fitting values ŷ t = ˆβ 0 + ˆβ 1 x t, the residuals e t = y t ŷ t, and the entries in ANOVA table are easily calculated. Source SS df MS F Regression Error Total

19 The coefficient of determination is given by R 2 = / =.435. Var( ˆβ) = s 2 (X X) 1 = s 2 [ 1 n + x 2 P (xt x) 2 = [ x P (xt x) 2 P x 1 (xt x) 2 P (xt x) 2 ] ] The standard error s ˆβi and t statistics t ˆβi = ˆβ i /s ˆβi are given in the following table ˆβ i s ˆβi t ˆβi Constant V O

20 H 0 : β 1 = 0 against H 1 : β 1 < 0 The critical value for this one-sided test with significance level α = 0.05 is t(12) = Since the t statistic is smaller than this critical value, we reject the null hypothesis. In the other words, there is evidence for a significance inverse relationship between running time and aerobic capacity. Prediction intervals For example, let us predict the running time for a female athlete with maximal aerobic capacity x k = 55. Then ŷ pred = ˆβ 0 + ˆβ 1 x k = and its standard error by s (1 + 1n + (x k x) 2 ) 1/2 (xt x) 2 = 2.40 Thus a 95 percent prediction interval for the running time is given by ± t (12)2.40 = ± 5.23 or (37.52, 47.98) 20

21 To investigate whether other variable (height X 2, weight X 3, skinfold sum X 4, relative body fat X 5 help explain the variation in the data, fit the model Then we have y t = β 0 + β 1 x t1 + + β 5 x t5 + ε t and the ANOVA table ˆβ i s ˆβi t ˆβi Constant V O Height Weight Skinfold sum Relative body fat Source SS df MS F Regression Error Total

22 Case Study I: Gas Mileage Data Consider predicting the gas mileage of an automobile as a function of its size and other engine characteristics. In table 2.2. we have list data on 38 automobile ( models). These data, which were originally taken from Consumer Reports. The variables include gas mileage in miles per gallon (MPG), number of cylinders, cubic engine displacement, horse power, weight, acceleration, and engine type [straight(1), V(0)]. The gas consumption (in gallons) should be proportional to the effort(=force distance) it takes to move the car. Furthermore, since force is proportional to weight, we except that gas consumption per unit distance is proportional to force and thus proportional to weight. Thus a model of the form y t = β 0 + β 1 x t + ε t where y = MPG 1 = GP M and x =weight. To check whether a quadratic term (weight) 2 is need, then fit the model and make a hypothesis test y t = β 0 + β 1 x t + β 2 x 2 t + ε t H 0 : β 2 = 0 againsit H 1 : β

23 23

24 24

25 2.7 Model Selection Techniques (I) Adjusted coefficient of fitness R 2 a = 1 SSE/(n p 1) SSTO/(n 1) = 1 s 2 SSTO/(n 1). Compared with R 2, we have extra penalty here for introducing more independent variables. Intention: maximize R 2 a (equivalently minimizing s 2 ). (II) Consider C p = SSE p s 2 (n 2p). Since E(SSE p ) (n p)σ 2, (large n), C p p. So choose smallest p such that C p p. (III) Backward Elimination, Forward Selection, and Stepwise Regression. 25

26 Example 2 (Case study I continued) Results from Fitting All possible Regression p Variables Included Ra 2 1 X X 2 X X 1 X 3 X X 1 X 3 X 5 X X 1 X 2 X 3 X 4 X X 1 X 2 X 3 X 4 X 5 X Correlations Among the Predictor Variables X 1 X 2 X 3 X 4 X 5 X 6 X X X X X X

27 t Statistics from Forward Selection and Backward Elimination Procedures Step Constant X 1 X 2 X 3 X 4 X 5 X 6 Ra 2 (a) Forward Selection (b) Backward Elimination Selected Models or the even simpler model GPM t = β 0 + β 1 (weight t ) + β 2 (displacement t ) + ε t GPM t = β 0 + β 1 (weight t ) + ε t 27

28 2.8 Multi-colinearity in Predictor Variables This is referring to the situation when the columns of X are almost linearly dependent. This sort of situation is common for observational data rather than artificially designed matrix. We say, in this case, that X X is ill conditioned. This could cause large variance of ˆβ i. So techniques like Ridge Regression will be applied, i.e. with restriction β β r 2. Example y t = β 0 + β 1 x t1 + β 2 x t2 + ε t, t = 1, 2,..., n. The least square estimate of βi = s i s y β i, i = 1, 2, where s i n 1 = [ (xit x i ) 2 ], s y n 1 = [ (yt ȳ) 2 ], has such property Var(β1) = Var(β2) = σ r12 2 and Corr(β 1, β 2) = r 12 where r 12 = [ (x t1 x 1 )(x t2 x 2 )]/[ (x t1 x 1 ) 2 (x t2 x 2 ) 2 ]

29 2.9 General Principles for Modelling Principle of Parsimony (As Simples as Possible) and ŷ = Xˆβ = X(X X) 1 X y = Var(ŷ) = σ 2 X(X X) 1 X. 1 n n i=1 Var(ŷ t ) = σ2 n Trace[X(X X) 1 X ] = pσ2 n. Hence the average forecast error has variance σ 2 (1 + p n ). So extra independent variables would increase p and the forecast error. Plot Residuals Ideally, the residuals would be in the form of white noise, otherwise some modifications might be necessary, such as extra linear/quadratic terms. Particularly for the funnel Shape (like a trumpet), the logarithm transformation might be proper. 29

30 30

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA

Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA Ma 3/103: Lecture 25 Linear Regression II: Hypothesis Testing and ANOVA March 6, 2017 KC Border Linear Regression II March 6, 2017 1 / 44 1 OLS estimator 2 Restricted regression 3 Errors in variables 4

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Measuring the fit of the model - SSR

Measuring the fit of the model - SSR Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do

More information

Linear Regression Model. Badr Missaoui

Linear Regression Model. Badr Missaoui Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus

More information

Lecture 6: Linear Regression

Lecture 6: Linear Regression Lecture 6: Linear Regression Reading: Sections 3.1-3 STATS 202: Data mining and analysis Jonathan Taylor, 10/5 Slide credits: Sergio Bacallado 1 / 30 Simple linear regression Model: y i = β 0 + β 1 x i

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Chapter 3: Multiple Regression. August 14, 2018

Chapter 3: Multiple Regression. August 14, 2018 Chapter 3: Multiple Regression August 14, 2018 1 The multiple linear regression model The model y = β 0 +β 1 x 1 + +β k x k +ǫ (1) is called a multiple linear regression model with k regressors. The parametersβ

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

Lecture 6: Linear Regression (continued)

Lecture 6: Linear Regression (continued) Lecture 6: Linear Regression (continued) Reading: Sections 3.1-3.3 STATS 202: Data mining and analysis October 6, 2017 1 / 23 Multiple linear regression Y = β 0 + β 1 X 1 + + β p X p + ε Y ε N (0, σ) i.i.d.

More information

Chapter 2 Multiple Regression (Part 4)

Chapter 2 Multiple Regression (Part 4) Chapter 2 Multiple Regression (Part 4) 1 The effect of multi-collinearity Now, we know to find the estimator (X X) 1 must exist! Therefore, n must be great or at least equal to p + 1 (WHY?) However, even

More information

Simple and Multiple Linear Regression

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 25 Outline 1 Multiple Linear Regression 2 / 25 Basic Idea An extra sum of squares: the marginal reduction in the error sum of squares when one or several

More information

STAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing

STAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =

More information

Outline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model

Outline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model Outline 1 Multiple Linear Regression (Estimation, Inference, Diagnostics and Remedial Measures) 2 Special Topics for Multiple Regression Extra Sums of Squares Standardized Version of the Multiple Regression

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation. Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Applied Regression Analysis

Applied Regression Analysis Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of

More information

Chapter 12: Multiple Linear Regression

Chapter 12: Multiple Linear Regression Chapter 12: Multiple Linear Regression Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 55 Introduction A regression model can be expressed as

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

Advanced Quantitative Methods: ordinary least squares

Advanced Quantitative Methods: ordinary least squares Advanced Quantitative Methods: Ordinary Least Squares University College Dublin 31 January 2012 1 2 3 4 5 Terminology y is the dependent variable referred to also (by Greene) as a regressand X are the

More information

Ma 3/103: Lecture 24 Linear Regression I: Estimation

Ma 3/103: Lecture 24 Linear Regression I: Estimation Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

REGRESSION ANALYSIS AND INDICATOR VARIABLES

REGRESSION ANALYSIS AND INDICATOR VARIABLES REGRESSION ANALYSIS AND INDICATOR VARIABLES Thesis Submitted in partial fulfillment of the requirements for the award of degree of Masters of Science in Mathematics and Computing Submitted by Sweety Arora

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Chapter 5 Matrix Approach to Simple Linear Regression

Chapter 5 Matrix Approach to Simple Linear Regression STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:

More information

Apart from this page, you are not permitted to read the contents of this question paper until instructed to do so by an invigilator.

Apart from this page, you are not permitted to read the contents of this question paper until instructed to do so by an invigilator. B. Sc. Examination by course unit 2014 MTH5120 Statistical Modelling I Duration: 2 hours Date and time: 16 May 2014, 1000h 1200h Apart from this page, you are not permitted to read the contents of this

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

Introduction to Statistical modeling: handout for Math 489/583

Introduction to Statistical modeling: handout for Math 489/583 Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect

More information

Chapter 2 Multiple Regression I (Part 1)

Chapter 2 Multiple Regression I (Part 1) Chapter 2 Multiple Regression I (Part 1) 1 Regression several predictor variables The response Y depends on several predictor variables X 1,, X p response {}}{ Y predictor variables {}}{ X 1, X 2,, X p

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

x 21 x 22 x 23 f X 1 X 2 X 3 ε

x 21 x 22 x 23 f X 1 X 2 X 3 ε Chapter 2 Estimation 2.1 Example Let s start with an example. Suppose that Y is the fuel consumption of a particular model of car in m.p.g. Suppose that the predictors are 1. X 1 the weight of the car

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y

More information

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'

Business Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata' Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where

More information

Statistical View of Least Squares

Statistical View of Least Squares Basic Ideas Some Examples Least Squares May 22, 2007 Basic Ideas Simple Linear Regression Basic Ideas Some Examples Least Squares Suppose we have two variables x and y Basic Ideas Simple Linear Regression

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

4 Multiple Linear Regression

4 Multiple Linear Regression 4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ

More information

General Linear Model: Statistical Inference

General Linear Model: Statistical Inference Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least

More information

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w

More information

14 Multiple Linear Regression

14 Multiple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in

More information

Simple Linear Regression (Part 3)

Simple Linear Regression (Part 3) Chapter 1 Simple Linear Regression (Part 3) 1 Write an Estimated model Statisticians/Econometricians usually write an estimated model together with some inference statistics, the following are some formats

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Introduction to Estimation Methods for Time Series models. Lecture 1

Introduction to Estimation Methods for Time Series models. Lecture 1 Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

The general linear regression with k explanatory variables is just an extension of the simple regression as follows

The general linear regression with k explanatory variables is just an extension of the simple regression as follows 3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because

More information

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression Chapter 14 Student Lecture Notes 14-1 Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Multiple Regression QMIS 0 Dr. Mohammad Zainal Chapter Goals After completing

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

Final Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58

Final Review. Yang Feng.   Yang Feng (Columbia University) Final Review 1 / 58 Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple

More information

Linear Algebra Review

Linear Algebra Review Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

Lecture 14 Simple Linear Regression

Lecture 14 Simple Linear Regression Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent

More information

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

y ˆ i = ˆ  T u i ( i th fitted value or i th fit) 1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u

More information

Regression Models for Quantitative and Qualitative Predictors: An Overview

Regression Models for Quantitative and Qualitative Predictors: An Overview Regression Models for Quantitative and Qualitative Predictors: An Overview Polynomial regression models Interaction regression models Qualitative predictors Indicator variables Modeling interactions between

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

The Standard Linear Model: Hypothesis Testing

The Standard Linear Model: Hypothesis Testing Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 25: The Standard Linear Model: Hypothesis Testing Relevant textbook passages: Larsen Marx [4]:

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

Chapter 2 Inferences in Simple Linear Regression

Chapter 2 Inferences in Simple Linear Regression STAT 525 SPRING 2018 Chapter 2 Inferences in Simple Linear Regression Professor Min Zhang Testing for Linear Relationship Term β 1 X i defines linear relationship Will then test H 0 : β 1 = 0 Test requires

More information

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow) STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n

More information

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 7 pages long. There are 4 questions, first 3 worth 10

More information

2.1 Linear regression with matrices

2.1 Linear regression with matrices 21 Linear regression with matrices The values of the independent variables are united into the matrix X (design matrix), the values of the outcome and the coefficient are represented by the vectors Y and

More information

Inference in Normal Regression Model. Dr. Frank Wood

Inference in Normal Regression Model. Dr. Frank Wood Inference in Normal Regression Model Dr. Frank Wood Remember We know that the point estimator of b 1 is b 1 = (Xi X )(Y i Ȳ ) (Xi X ) 2 Last class we derived the sampling distribution of b 1, it being

More information

The Aitken Model. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 41

The Aitken Model. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 41 The Aitken Model Copyright c 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 41 The Aitken Model (AM): Suppose where y = Xβ + ε, E(ε) = 0 and Var(ε) = σ 2 V for some σ 2 > 0 and some known

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

BIOS 2083 Linear Models c Abdus S. Wahed

BIOS 2083 Linear Models c Abdus S. Wahed Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

How the mean changes depends on the other variable. Plots can show what s happening...

How the mean changes depends on the other variable. Plots can show what s happening... Chapter 8 (continued) Section 8.2: Interaction models An interaction model includes one or several cross-product terms. Example: two predictors Y i = β 0 + β 1 x i1 + β 2 x i2 + β 12 x i1 x i2 + ɛ i. How

More information

Sections 7.1, 7.2, 7.4, & 7.6

Sections 7.1, 7.2, 7.4, & 7.6 Sections 7.1, 7.2, 7.4, & 7.6 Adapted from Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1 / 25 Chapter 7 example: Body fat n = 20 healthy females 25 34

More information

Model Selection Procedures

Model Selection Procedures Model Selection Procedures Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Model Selection Procedures Consider a regression setting with K potential predictor variables and you wish to explore

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t = 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result

More information

STA 4210 Practise set 2a

STA 4210 Practise set 2a STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income

More information

Statistics for Engineers Lecture 9 Linear Regression

Statistics for Engineers Lecture 9 Linear Regression Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

More information