Simple Regression: Model Setup, Estimation, Inference, Prediction. Model Diagnostics. Multiple Regression: Model Setup and Estimation.


Statistical Computation, Math 475
Jimin Ding
Department of Mathematics, Washington University in St. Louis
jmding/math475/index.html
Part IV
October 10, 2013

Outline

1. Simple regression: model setup, estimation, inference, and prediction
2. Model diagnostics
3. Multiple regression: model setup and estimation
4. Collinearity and ridge regression

Examples

Classical regression models deal only with a continuous response variable. Let Y denote the response (dependent) variable and X the explanatory (independent) variable (predictor).

- Grade point average: X = entrance test score; Y = GPA by the end of freshman year.
- X = height; Y = weight.
- X = education level; Y = income.

The covariate X is usually continuous in regression; categorical covariates are commonly investigated in ANOVA (analysis of variance). But we can also use a regression model to study the relationship between a continuous response and a categorical covariate by creating dummy variables. This is equivalent to ANOVA/ANCOVA/MANOVA to some extent.

Simple Linear Regression

Linear regression with one covariate:

  Y_i = β_0 + β_1 X_i + e_i,   i = 1, ..., n,

where Y_i and X_i are the i-th observed response and covariate variables, e_i is the random error, and β_0 and β_1 are the unknown parameters.

Assumptions:
1. E(e_i) = 0 for all i.
2. Var(e_i) = σ². (Homogeneity)
3. e_i and e_j are independent for any i ≠ j. (Independence)
4. e_i ~ iid N(0, σ²). (Normality)

General Goals of Study

- Describe the relationship between the explanatory and response variables.
- Estimation: estimate β_0, β_1, σ².
- Prediction: predict/forecast the response variable for a new given predictor value; predict E(Y_new | X_new) = β_0 + β_1 X_new.
- Inference: test whether the relationship is statistically significant. Find a CI for β_1 to see whether it includes 0; test H_0: β_1 = 0.
- Prediction interval for the response variable.

Estimators

Estimator: a function of the data that provides a guess about the value of the parameters of interest. Example: X̄ can be used to estimate µ.

Criteria: how to choose a "good" estimator? The smaller the following quantities are, the better the estimator is:
- Bias: E(β̂) − β,
- Variance: Var(β̂),
- MSE: E{(β̂ − β)²}.

Question: the distribution of β̂ is usually unknown since the distribution of Y is unknown! Approximate the bias, variance, and MSE by their sample versions.

Least Squares Estimator

To predict Y well in a simple linear regression, it is natural to obtain the estimators by minimizing

  Q = Σ_{i=1}^n [Y_i − (β_0 + β_1 X_i)]²,

which is the so-called least squares criterion. The resulting estimators,

  β̂_1 = Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)² = ρ_{X,Y} S_Y / S_X,
  β̂_0 = Ȳ − β̂_1 X̄,

are the so-called least squares estimators (LSE).
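Since the course's examples use SAS, here is a minimal sketch of computing the LSE with PROC REG; the data set name and values below are made up purely for illustration.

/* Toy data (made-up values): x = entrance test score, y = freshman GPA */
data gpa;
   input x y @@;
   datalines;
21 2.9  24 3.2  28 3.7  18 2.4  30 3.9  25 3.3  22 3.0  27 3.5
;
run;

/* Least squares fit of y = b0 + b1*x; the Parameter Estimates table
   reports beta0-hat and beta1-hat, and Root MSE estimates sigma */
proc reg data=gpa;
   model y = x;
run;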

Regression Line and Sums of Squares in the ANOVA Table

[Figure: scatterplot of (X_i, Y_i) with the fitted regression line]

- Fitted regression line: y = β̂_0 + β̂_1 x.
- Fitted values: Ŷ_i = β̂_0 + β̂_1 X_i.
- Residuals: ê_i = Y_i − Ŷ_i.
- SSE: sum of squared errors, Σ_{i=1}^n ê_i².
- SSTO: Σ_{i=1}^n (Y_i − Ȳ)².
- SSR: Σ_{i=1}^n (Ŷ_i − Ȳ)².
- DF: the degrees of freedom. DF of the residuals = the number of observations − the number of parameters in the model.
- MSE: SSE/DF of the error term, i.e. (1/(n−2)) Σ_{i=1}^n ê_i² = σ̂² (the estimator for σ²).

For the LSE, we have Σ_{i=1}^n ê_i = 0 and Σ_{i=1}^n ê_i² is minimized (verified numerically in the sketch below).

Best Linear Unbiased Estimator (BLUE)

- The LSE are unbiased estimators: E(MSE) = E(σ̂²) = σ²; E(β̂_0) = β_0; E(β̂_1) = β_1.
- Both β̂_0 and β̂_1 are linear functions of the observations Y_1, ..., Y_n.
- Among all linear unbiased estimators, the LSE has the smallest variance. In other words, it is more precise than any other linear unbiased estimator.

Remark: the BLUE property still holds without the normality assumption (4).

Other Types of Estimators

- Least absolute deviation estimator (LAD): a robust estimator.
- Weighted least squares estimator (WLSE): an estimator that adjusts for heterogeneity.
- Maximum likelihood estimator (MLE): based on the likelihood function, and hence only valid under assumption (4).
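As a quick numerical check of the identity Σ ê_i = 0, one can save the fitted values and residuals to a data set; a sketch reusing the toy gpa data from above (the PROC REG listing already reports SSE, SSR, SSTO, and MSE in its ANOVA table):

proc reg data=gpa;
   model y = x;
   output out=fit p=yhat r=ehat;   /* fitted values and residuals */
run;

/* The residuals from a least squares fit sum to zero (up to rounding) */
proc means data=fit sum;
   var ehat;
run;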

Maximum Likelihood Estimator (MLE)

Recall the likelihood:

  L(β_0, β_1, σ²) = Π_{i=1}^n f(Y_i) = Π_{i=1}^n (1/√(2πσ²)) exp{ −(1/(2σ²)) (Y_i − β_0 − β_1 X_i)² }.

The parameters (β_0, β_1, σ²) that maximize the likelihood function L(β_0, β_1, σ²) are defined as the MLE of (β_0, β_1, σ²), denoted by (β̂_{0,ML}, β̂_{1,ML}, σ̂²_{ML}).

Properties of the MLE

In simple linear regression with the normality assumption on the errors, the LSE for the β's are the same as the MLE for the β's. (They are different for σ².) So the MLE for the β's are also BLUE.

In most models, MLE are
- Consistent: β̂ → β in probability or a.s.
- Sufficient: f(Y_1, ..., Y_n | θ̂_ML) does not depend on θ.
- MVUE: minimum variance unbiased estimators.
- Asymptotically efficient: Var(β̂) reaches the Cramér-Rao lower bound (minimum variance).

Confidence Intervals

Note: β̂_0, β̂_1, σ̂² are functions of the data and so are random variables. As point estimators, they only provide a guess about the true parameters and will change if the data change. We also want a range guess for β_0, β_1, σ², which guarantees that with a certain probability the true parameter will be in the range. For example, if we repeat the experiment 100 times and collect 100 sets of data, 95 out of the 100 guessed ranges will contain the true parameter. This range is called a confidence interval.

  CI for β_0: β̂_0 ± t_{1−α/2, n−2} se(β̂_0);
  CI for β_1: β̂_1 ± t_{1−α/2, n−2} se(β̂_1).

Standard Errors

  Var(β̂_0) = Var(Ȳ − β̂_1 X̄) = σ² ( 1/n + X̄² / Σ_{i=1}^n (X_i − X̄)² ),

  Var(β̂_1) = Var( Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)² ) = σ² / Σ_{i=1}^n (X_i − X̄)².

But σ² is still unknown, so we use the estimator σ̂² = MSE in place of σ² when estimating the standard errors.
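The standard errors and the t-based confidence intervals above are what PROC REG's CLB option prints; a minimal sketch, reusing the toy gpa data:

/* 95% confidence limits for beta0 and beta1 (CLB = confidence limits
   for the parameter estimates; change alpha= for other levels) */
proc reg data=gpa;
   model y = x / clb alpha=0.05;
run;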

Standard Errors (continued)

Hence

  se(β̂_0) = √( MSE ( 1/n + X̄² / Σ_{i=1}^n (X_i − X̄)² ) ),
  se(β̂_1) = √( MSE / Σ_{i=1}^n (X_i − X̄)² ).

Distribution of the Estimators

Under the normality assumption,

  (β̂_p − β_p) / se(β̂_p) ~ t_{n−2},

and without the normality assumption,

  (β̂_p − β_p) / se(β̂_p) ~ t_{n−2} approximately,

for p = 0, 1.

Hypothesis Test

H_0: β_1 = 0 vs. H_a: β_1 ≠ 0. (Whether Y_i depends on X_i or not.)

Test statistic: t* = (β̂_1 − 0) / se(β̂_1) ~ t_{n−2} under H_0.

p-value = P(|t_{n−2}| > |t*|) (two-sided).

[Figure: t_29 density curve with the two-sided p-value area shaded]

Confidence Interval for the Mean of an Observation

Let Ŷ_new = β̂_0 + β̂_1 X_new. Then

  Var(Ŷ_new) = Var(β̂_0 + β̂_1 X_new) = σ² ( 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)² ).

Hence a 100(1−α)% CI for the mean value of the observation, E(Y_new) = β_0 + β_1 X_new, is

  Ŷ_new ± t_{1−α/2, n−2} √( MSE ( 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ),

which is denoted as CLM in SAS output.
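To get the CLM limits at a new predictor value X_new, a common SAS idiom is to append an observation with a missing response: it is excluded from the fit but still scored. A sketch reusing the gpa data (X_new = 26 is an arbitrary made-up value):

/* Append x = 26 with missing y: not used in estimation, but PROC REG
   still computes its fitted value and confidence limits */
data gpa2;
   set gpa end=last;
   output;
   if last then do;
      x = 26; y = .; output;
   end;
run;

proc reg data=gpa2;
   model y = x / clm;   /* confidence limits for the mean response */
run;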

Prediction Interval (PI)

A 100(1−α)% prediction interval is an interval I such that P(Y_new ∈ I) = 1 − α. Note that Y_new = β_0 + β_1 X_new + e_new is random in the sense that Var(Y_new) = σ² > 0. We can derive

  Y_new − Ŷ_new ~ N( 0, σ² ( 1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ),

  1 − α = P( |Y_new − Ŷ_new| / √( MSE ( 1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ) ≤ t_{1−α/2, n−2} ).

Hence the 100(1−α)% PI for Y_new, requested with the CLI option in SAS, is

  Ŷ_new ± t_{1−α/2, n−2} √( MSE ( 1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ).

Simultaneous Confidence/Prediction Bands

Confidence band for E(Y): different from the prediction interval and the confidence interval for the mean, this is a simultaneous band for the entire regression line:

  Ŷ_i ± W √( MSE ( 1/n + (X_i − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ),

where W² = 2 F(1−α; 2, n−2) and i = 1, ..., n.

Prediction band for the Y_i's in the entire region:

  Ŷ_i ± S (or B) √( MSE ( 1 + 1/n + (X_i − X̄)² / Σ_{i=1}^n (X_i − X̄)² ) ),

where S² = g F(1−α; g, n−2) (Scheffé type) and B = t(1 − α/(2g); n−2) (Bonferroni type).

Coefficient of Determination

The coefficient of determination is defined as

  R² = SSR/SSTO = 1 − SSE/SSTO ∈ [0, 1].

Interpretation: the proportionate reduction of total variation associated with the use of the predictor variable X. The larger R² is, the more of the total variation of Y is explained by X. In simple linear regression, R² is the same as ρ̂². An R² close to 0 does not imply that X and Y are not related; it simply means that the linear correlation between X and Y is small.

F-test for Goodness of Fit

  H_0: reduced model Y_i = β_0 + e_i   vs.   H_1: full model Y_i = β_0 + β_1 X_i + e_i.

This is equivalent to testing H_0: β_1 = 0.

Test statistic: F* = MSR/MSE, rejecting H_0 when F* > F(1−α; 1, n−2), where MSR = SSR/(the number of model parameters − 1). (Note that F_{1,df} = t²_{df}.)

p-value: P(F_{1, n−2} > F*).
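Tying these slides to SAS output: the CLI option gives the prediction interval above, and the default PROC REG listing already contains R² and the goodness-of-fit F statistic. A one-line sketch reusing gpa2 from the earlier example; the CLI limits are visibly wider than CLM because of the extra "1 +" term:

proc reg data=gpa2;
   model y = x / clm cli;   /* CLM: mean response; CLI: new observation */
run;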

General F-test

If model 1 (the reduced model) is nested as a submodel within model 2 (the full model), the comparison between the two models can be done by an F-test:

  F* = [ (SSE(R) − SSE(F)) / (df(R) − df(F)) ] / [ SSE(F) / df(F) ],

which is compared to F(1−α; df(R) − df(F), df(F)). The general F-test is commonly used in model selection.

Residual Plots

Residuals against the index of observations should show:
1. symmetry around 0;
2. constant variability;
3. no serial correlation.

Also plot the residuals against the predicted values. QQ plot/normality plot: check the normality assumption; under the normality assumption, the residuals should be close to the reference line, i.e. look linear.

Prototype Residual Plots

[Figure: prototype residual plots illustrating possible problematic patterns]

Outliers and Influential Observations

Cook's distance is defined as

  D_i = Σ_{j=1}^n (Ŷ_j − Ŷ_{j(i)})² / (p · MSE) = [ ê_i² / (p · MSE) ] · [ h_ii / (1 − h_ii)² ].

The value of Cook's distance for each observation measures the degree to which the predicted values change if that observation is left out of the regression. If an observation has an unusually large value of Cook's distance, it might be worth further investigation. (Small influence: D < 0.1; huge influence: D > 0.5.)

Although the above definition requires fitting the regression n times, it can be simplified so that the model only needs to be fit once. Here h_ii is the i-th diagonal element of the hat matrix H = X(XᵀX)⁻¹Xᵀ.
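These diagnostics are available directly from PROC REG; as a sketch on the toy gpa data, the R option prints the residual analysis including Cook's D, and INFLUENCE adds the hat diagonals h_ii:

proc reg data=gpa;
   model y = x / r influence;   /* residuals, Cook's D, hat diagonal */
   plot r.*p.;                  /* residuals against predicted values */
run;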

Multiple Regression: Model Setup

Two predictors: Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + e_i.

More generally, consider p − 1 predictors X_i1, ..., X_i(p−1):

  Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ... + β_{p−1} X_i(p−1) + e_i,   for i = 1, ..., n.

We may write this in the matrix form

  Y = Xβ + e,

where Y = (Y_1, ..., Y_n)ᵀ, β = (β_0, ..., β_{p−1})ᵀ, e = (e_1, ..., e_n)ᵀ ~ N(0, σ²I), and X is the n × p design matrix.

Estimation

- The LSE: β̂_LS = (XᵀX)⁻¹XᵀY, with Cov(β̂_LS) = σ²(XᵀX)⁻¹. The standard errors are the square roots of the diagonal elements of MSE (XᵀX)⁻¹.
- Under the normality assumption, β̂_ML = β̂_LS.
- The fitted values: Ŷ = Xβ̂ = X(XᵀX)⁻¹XᵀY. Hat matrix: H = X(XᵀX)⁻¹Xᵀ.
- The residuals: ê = Y − Ŷ = (I − H)Y, with Cov(ê) = σ²(I − H).

ANOVA Table

- SSE: êᵀê = Yᵀ(I − H)Y.
- SSTO: Σ_{i=1}^n (Y_i − Ȳ)² = YᵀY − (1/n) YᵀJY, where J is the n × n matrix with all components equal to 1.
- SSR = SSTO − SSE.
- MSE = SSE/(n − p); MSTO = SSTO/(n − 1); MSR = SSR/(p − 1).
- Overall F-test: H_0: β_1 = β_2 = ... = β_{p−1} = 0 vs. H_a: not all β's are zero. Reject H_0 when F* = MSR/MSE > F(1−α; p−1, n−p).
- R² = SSR/SSTO = 1 − SSE/SSTO.
- Adjusted R²: adjusts for the number of predictors, via (1 − R²_A)/(n − 1) = (1 − R²)/(n − p).

Selecting Covariates

- Forward selection: start from no covariates.
- Backward selection: start from all covariates.
- Stepwise selection: backward + forward.
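A sketch of fitting a multiple regression and running the stepwise search in SAS; the data set name mlr and predictors x1-x4 are placeholders, and SLENTRY/SLSTAY set the entry and stay significance levels:

/* Stepwise selection: forward steps with backward elimination checks */
proc reg data=mlr;
   model y = x1 x2 x3 x4 / selection=stepwise slentry=0.15 slstay=0.15;
run;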

When Multicollinearity Happens

- Adding or deleting a predictor changes R² substantially.
- The Type III SSR heavily depends on the other variables in the model.
- se(β̂_k) is large.
- Predictors are not significant individually, but the simple regression on each covariate is significant.

Remedies for Multicollinearity

- Center and standardize the predictors, which might be helpful in polynomial regression when some of the predictors are badly scaled.
- Drop the correlated variables by model selection, when
  1. the predictor is not significant, and
  2. the reduced model after dropping the predictor fits the data nearly as well as the full model.
- Add new observations. (Economy, Business)
- Use an index of several variables (PCA).

Variance Inflation Factor (VIF)

Let R²_j be the coefficient of determination of X_j on all other predictors. (R²_j is the R² of the regression model X_ij = β_0 + β_1 X_i1 + ... + β_{j−1} X_i(j−1) + β_{j+1} X_i(j+1) + ... + e_ij.) Define

  VIF_j = 1 / (1 − R²_j).

Note VIF_j ≥ 1 since R²_j ≤ 1 for all j. If VIF_j > 10, we usually conclude that the variable carries enough shared variation to cause a collinearity problem. In the standardized regression, Var(β̂_j) = σ² VIF_j. In SAS, the VIF table is reported by PROC REG when the VIF option is added to the MODEL statement (see the sketch below).

Motivation for Ridge Regression

Recall that β̂_LS = (XᵀX)⁻¹XᵀY, with E(β̂_LS) = β and Var(β̂_LS) = σ²(XᵀX)⁻¹. When XᵀX is nearly singular (its determinant is close to 0), the LSE is unbiased but has large variance, which leads to a large mean squared error of the estimator. The idea of ridge regression is to reduce variance at the cost of increasing bias.
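The VIF diagnostic is a one-word addition to the MODEL statement; a sketch with the same placeholder names as above:

/* Variance inflation factors for each predictor; VIF_j > 10 flags
   a likely collinearity problem */
proc reg data=mlr;
   model y = x1 x2 x3 x4 / vif;
run;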

Ridge Regression

The ridge regression estimator is

  β̂_r(b) = (XᵀX + bI)⁻¹XᵀY,

where b is a constant chosen by the user, referred to as the tuning parameter. As b increases, the bias increases but the variance decreases, and β̂_r(b) → 0 (componentwise). One may choose the tuning parameter b such that the ridge trace (β̂_r(b) plotted against b) gets flat and VIF_j (plotted against b) drops to around 1. (A PROC REG sketch follows the reading assignment.)

SAS Programs

PROC GLM; PROC REG; PROC CATMOD; PROC GENMOD; PROC LOGISTIC; PROC NLIN; PROC PLS; PROC MIXED.

Reading Assignment

Textbook: Chapter 5 and Chapter 9.
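In SAS, ridge regression is available through the RIDGE= option of the PROC REG statement, which fits the model over a grid of tuning parameters; OUTVIF additionally writes the VIF trace to the OUTEST= data set. A sketch with placeholder names:

/* Ridge trace over b = 0, 0.01, ..., 0.5; coefficients and VIFs for
   each b are written to the rdg data set for inspection/plotting */
proc reg data=mlr outest=rdg outvif ridge=0 to 0.5 by 0.01;
   model y = x1 x2 x3 x4;
run;

proc print data=rdg;
run;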
