Simple Regression: Model Setup, Estimation, Inference, Prediction. Model Diagnostics. Multiple Regression: Model Setup and Estimation.
Statistical Computation, Math 475 (Part IV)
Jimin Ding
Department of Mathematics, Washington University in St. Louis
jmding/math475/index.html
October 10, 2013

Outline: Simple Regression (Model Setup, Estimation, Inference, Prediction), Model Diagnostics, Multiple Regression, Ridge Regression.

Examples

Classical regression models deal with a continuous response variable. Let Y denote the response (dependent) variable and X the explanatory (independent) variable, or predictor.
- X: entrance test scores; Y: grade point average (GPA) by the end of freshman year.
- X: height; Y: weight.
- X: education level; Y: income.

The covariate X is usually continuous in regression; categorical covariates are commonly investigated in ANOVA (analysis of variance). But we can also use a regression model to study the relationship between a continuous response and a categorical covariate by creating dummy variables. This is equivalent to ANOVA/ANCOVA/MANOVA to some extent.
Model Setup

Simple linear regression for one covariate:
Y_i = β_0 + β_1 X_i + e_i, i = 1, ..., n,
where Y_i and X_i are the i-th observed response and covariate, e_i is the random error, and β_0 and β_1 are the unknown parameters.

Assumptions:
1. E(e_i) = 0 for all i.
2. Var(e_i) = σ². (Homogeneity)
3. e_i and e_j are independent for any i ≠ j. (Independence)
4. e_i iid N(0, σ²). (Normality)

General goals of the study:
- Describe the relationship between the explanatory and response variables.
- Estimation: estimate β_0, β_1, σ².
- Prediction: predict/forecast the response variable for a new given predictor value, E(Y_new | X_new) = β_0 + β_1 X_new.
- Inference: test whether the relationship is statistically significant. Find a CI for β_1 to see whether it includes 0, or test H_0: β_1 = 0. Construct a prediction interval for the response variable.

Estimators

An estimator is a function of the data that provides a guess about the value of the parameters of interest. Example: X̄ can be used to estimate μ.

Criteria: how to choose a "good" estimator? The smaller the following quantities are, the better the estimator is:
- Bias: E(β̂) − β,
- Variance: Var(β̂),
- MSE: E{(β̂ − β)²}.

Question: the distribution of β̂ is usually unknown since the distribution of Y is unknown! Approximate the bias, variance, and MSE by their sample versions.

Least Squares Estimator

To predict Y well in simple linear regression, it is natural to obtain estimators by minimizing
Q = Σ_{i=1}^n [Y_i − (β_0 + β_1 X_i)]²,
the so-called least squares criterion. The resulting estimators,
β̂_1 = Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)² = ρ̂_{X,Y} S_Y / S_X,
β̂_0 = Ȳ − β̂_1 X̄,
are the so-called least squares estimators (LSE).
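The closed-form LSE above can be sketched in a few lines of pure Python. The data are hypothetical, chosen so the points lie exactly on the line Y = 2X, making the answer easy to check by hand:

```python
# Least squares estimates for simple linear regression from the
# closed-form formulas: b1 = Sxy / Sxx, b0 = Ybar - b1 * Xbar.
def lse(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx          # slope estimate
    b0 = ybar - b1 * xbar   # intercept estimate
    return b0, b1

x = [1, 2, 3, 4]
y = [2, 4, 6, 8]            # hypothetical data: exactly Y = 0 + 2 X
b0, b1 = lse(x, y)
print(b0, b1)               # 0.0 2.0
```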
Fitted Line and Sums of Squares in the ANOVA Table

Fitted line: y = β̂_0 + β̂_1 x. Fitted values: Ŷ_i = β̂_0 + β̂_1 X_i. Residuals: ê_i = Y_i − Ŷ_i.

- SSE: sum of squared errors, Σ_{i=1}^n ê_i².
- SSTO: total sum of squares, Σ_{i=1}^n (Y_i − Ȳ)².
- SSR: regression sum of squares, Σ_{i=1}^n (Ŷ_i − Ȳ)².
- DF: degrees of freedom. DF of the residuals = number of observations − number of parameters in the model.
- MSE: SSE / DF of the error term; (1/(n−2)) Σ_{i=1}^n ê_i² = σ̂² (estimator for σ²).

For the LSE, we have Σ_{i=1}^n ê_i = 0, and Σ_{i=1}^n ê_i² is minimized.

Best Linear Unbiased Estimator (BLUE)

Note that the LSE are unbiased estimators: E(MSE) = E(σ̂²) = σ², E(β̂_0) = β_0, and E(β̂_1) = β_1. Both β̂_0 and β̂_1 are linear functions of the observations Y_1, ..., Y_n. Among all linear unbiased estimators, the LSE has the smallest variance; in other words, it is more precise than any other linear unbiased estimator. Remark: the BLUE property still holds without the normality assumption in (4).

Other Types of Estimators
- Least absolute deviation estimator (LAD): a robust estimator.
- Weighted least squares estimator (WLSE): an estimator that adjusts for heterogeneity.
- Maximum likelihood estimator (MLE): based on the likelihood function, hence only valid under assumption (4).
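The ANOVA decomposition SSTO = SSR + SSE, and the fact that LSE residuals sum to zero, can be verified numerically. This is a minimal pure-Python sketch with made-up data:

```python
# Verify the ANOVA identity SSTO = SSR + SSE for a least squares fit.
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                              # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]             # fitted values
resid = [yi - fi for yi, fi in zip(y, yhat)]  # residuals e_i = Y_i - Yhat_i
sse = sum(e ** 2 for e in resid)              # error sum of squares
ssto = sum((yi - ybar) ** 2 for yi in y)      # total sum of squares
ssr = sum((fi - ybar) ** 2 for fi in yhat)    # regression sum of squares
mse = sse / (n - 2)                           # estimator of sigma^2
assert abs(ssto - (ssr + sse)) < 1e-9         # SSTO = SSR + SSE
assert abs(sum(resid)) < 1e-9                 # LSE residuals sum to 0
```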
MLE and Its Properties

Recall the likelihood:
L(β_0, β_1, σ²) = Π_{i=1}^n f(Y_i) = Π_{i=1}^n (2πσ²)^{−1/2} exp{ −(Y_i − β_0 − β_1 X_i)² / (2σ²) }.
The parameters (β_0, β_1, σ²) that maximize this likelihood function are defined as the MLE of (β_0, β_1, σ²), denoted (β̂_{0,ML}, β̂_{1,ML}, σ̂²_{ML}).

In simple linear regression with the normality assumption on the errors, the LSE for the β's is the same as the MLE for the β's. (They differ for σ².) So the MLE for the β's is also BLUE. In most models, MLEs are:
- Consistent: β̂ → β in probability or a.s.
- Sufficient: f(Y_1, ..., Y_n | θ̂_ML) does not depend on θ.
- MVUE: minimum variance unbiased estimator.
- Asymptotically efficient: Var(β̂) reaches the Cramér-Rao lower bound (minimum variance).

Confidence Intervals

Note: β̂_0, β̂_1, σ̂² are functions of the data and so are random variables. As point estimators, they only provide a guess about the true parameters and will change if the data change. We also want a range guess which guarantees that with a certain probability the true parameter is in the range. For example, if we repeat the experiment 100 times and collect 100 data sets, 95 out of the 100 guessed ranges will contain the true parameter. This range is called a confidence interval.
- CI for β_0: β̂_0 ± t_{1−α/2, n−2} se(β̂_0);
- CI for β_1: β̂_1 ± t_{1−α/2, n−2} se(β̂_1).

Standard Errors

Var(β̂_0) = Var(Ȳ − β̂_1 X̄) = σ² (1/n + X̄² / Σ_{i=1}^n (X_i − X̄)²),
Var(β̂_1) = Var( Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^n (X_i − X̄)² ) = σ² / Σ_{i=1}^n (X_i − X̄)².
But σ² is still unknown, so we use the estimator σ̂² = MSE in place of σ² when estimating standard errors.
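The variance formulas above translate directly into standard-error estimates once σ² is replaced by MSE. A minimal sketch with the same kind of hypothetical data:

```python
import math

# Standard errors of the LSE, with sigma^2 replaced by MSE:
#   se(b1) = sqrt(MSE / Sxx),  se(b0) = sqrt(MSE * (1/n + xbar^2 / Sxx)).
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                                     # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)                                  # sigma^2 hat
se_b1 = math.sqrt(mse / sxx)                         # se(beta1 hat)
se_b0 = math.sqrt(mse * (1 / n + xbar ** 2 / sxx))   # se(beta0 hat)
```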
Hence
se(β̂_0) = √[ MSE (1/n + X̄² / Σ_{i=1}^n (X_i − X̄)²) ],
se(β̂_1) = √[ MSE / Σ_{i=1}^n (X_i − X̄)² ].

Distribution of Estimators

Under the normality assumption, (β̂_p − β_p) / se(β̂_p) ~ t_{n−2}; without the normality assumption, (β̂_p − β_p) / se(β̂_p) ~ t_{n−2} approximately, for p = 0, 1.

Hypothesis Test

H_0: β_1 = 0 vs. H_a: β_1 ≠ 0. (Whether Y_i depends on X_i or not.)
Test statistic: t* = (β̂_1 − 0) / se(β̂_1) ~ t_{n−2} under H_0.
Two-sided p-value = P(|t_{n−2}| > |t*|).
(Figure: t_29 density with the two-sided p-value region shaded.)

Confidence Interval for the Mean of an Observation

Let Ŷ_new = β̂_0 + β̂_1 X_new. Then
Var(Ŷ_new) = Var(β̂_0 + β̂_1 X_new) = σ² (1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)²).
Hence a 100(1−α)% CI for the mean value of the observation, E(Y_new) = β_0 + β_1 X_new, is
Ŷ_new ± t_{1−α/2, n−2} √[ MSE (1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)²) ],
which is denoted as CLM in SAS output.
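The t test for H_0: β_1 = 0 can be sketched as follows. The critical value t_{0.975, 2} ≈ 4.303 is taken from a standard t table for n − 2 = 2 degrees of freedom; the data are hypothetical:

```python
import math

# Two-sided t test of H0: beta1 = 0 at alpha = 0.05.
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                    # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
t_stat = b1 / math.sqrt(mse / sxx)  # t* = (b1 - 0) / se(b1)
t_crit = 4.303                      # t_{0.975, n-2} for n-2 = 2, from a t table
reject = abs(t_stat) > t_crit       # reject H0 if |t*| exceeds the critical value
```

With only n = 4 observations the critical value is large, so even a visibly positive slope fails to reach significance here.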
Prediction Interval (PI)

A 100(1−α)% prediction interval is an interval I such that P(Y_new ∈ I) = 1 − α. Note that Y_new = β_0 + β_1 X_new + e_new is random in the sense that Var(Y_new) = σ² ≠ 0. We can derive
Y_new − Ŷ_new ~ N(0, σ² (1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)²)),
so
1 − α = P( |Y_new − Ŷ_new| / √[ MSE (1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)²) ] ≤ t_{1−α/2, n−2} ).
Hence the 100(1−α)% PI for Y_new, denoted as CLI in SAS, is
Ŷ_new ± t_{1−α/2, n−2} √[ MSE (1 + 1/n + (X_new − X̄)² / Σ_{i=1}^n (X_i − X̄)²) ].

Simultaneous Confidence/Prediction Bands

Confidence band for E(Y): different from the prediction interval and the confidence interval for the mean, it is a simultaneous band for the entire regression line:
Ŷ_i ± W √[ MSE (1/n + (X_i − X̄)² / Σ_{i=1}^n (X_i − X̄)²) ],
where W² = 2 F(1−α; 2, n−2) and i = 1, ..., n.

Prediction band for the Y_i's in the entire region:
Ŷ_i ± S (or B) √[ MSE (1 + 1/n + (X_i − X̄)² / Σ_{i=1}^n (X_i − X̄)²) ],
where S² = g F(1−α; g, n−2) (Scheffé type) and B = t(1 − α/(2g), n−2) (Bonferroni type).

Coefficient of Determination

The coefficient of determination is defined as
R² = SSR/SSTO = 1 − SSE/SSTO ∈ [0, 1].
Interpretation: the proportionate reduction of total variation associated with the use of the predictor variable X. The larger R² is, the more the total variation of Y is explained by X. In simple linear regression, R² is the same as ρ̂². An R² close to 0 does not imply that X and Y are unrelated; it simply means that the linear correlation between X and Y is small.

F-test for Goodness of Fit

H_0: reduced model Y_i = β_0 + e_i, vs. H_a: full model Y_i = β_0 + β_1 X_i + e_i. This is equivalent to testing H_0: β_1 = 0. Test statistic:
F = MSR/MSE ~ F(1, n−2) under H_0; reject H_0 if F > F(1−α; 1, n−2),
where MSR = SSR / (the number of model parameters − 1). (Note that F_{1,df} = t²_{df}.)
P-value = P(F_{1, n−2} > F*).
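The CI for the mean (CLM) and the prediction interval (CLI) differ only by the extra "1 +" inside the square root, so at the same X_new the PI is always wider. A sketch with hypothetical data, again using t_{0.975, 2} ≈ 4.303 from a t table:

```python
import math

# Half-widths of the CI for E(Y_new) (CLM) and the PI for Y_new (CLI).
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                                # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
t_crit = 4.303                                  # t_{0.975, n-2}, n-2 = 2
x_new = 3.0
lev = 1 / n + (x_new - xbar) ** 2 / sxx         # 1/n + (Xnew - Xbar)^2 / Sxx
clm_half = t_crit * math.sqrt(mse * lev)        # CI for the mean response
cli_half = t_crit * math.sqrt(mse * (1 + lev))  # PI adds Var(e_new) = sigma^2
assert cli_half > clm_half                      # the PI is always wider
```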
General F-test

If model 1 (the reduced model) is nested within model 2 (the full model), i.e., is a submodel, the comparison between the two models can be done by an F-test:
F = [ (SSE(R) − SSE(F)) / (df(R) − df(F)) ] / [ SSE(F) / df(F) ].
The general F-test is commonly used in model selection.

Residual Plots

Residuals against the index of observations should show:
1. symmetry around 0;
2. constant variability;
3. no serial correlation.

Residuals against the predicted values. (Add a link to possible problematic residual plots.)

QQ plot/normality plot: checks the normality assumption. Under the normality assumption, the residuals should be close to the reference line, i.e., look linear.

(Slide of prototype residual plots: figures not transcribed.)

Outliers and Influential Observations

Cook's distance is defined as
D_i = Σ_{j=1}^n (Ŷ_j − Ŷ_{j(i)})² / (p MSE) = ê_i² h_ii / [ p MSE (1 − h_ii)² ].
The value of Cook's distance for each observation measures the degree to which the predicted values change if that observation is left out of the regression. If an observation has an unusually large value of Cook's distance, it may be worth further investigation. (Small influence: D < 0.1; huge influence: D > 0.5.) Although the above definition appears to require fitting the regression n times, it can be simplified so that the model only needs to be fit once. Here h_ii is the i-th diagonal element of the hat matrix H = X(XᵀX)⁻¹Xᵀ.
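Cook's distance via the one-fit shortcut can be sketched directly: for simple linear regression the leverage has the closed form h_ii = 1/n + (x_i − x̄)²/Sxx, so no matrix algebra is needed. The data are hypothetical:

```python
# Cook's distance D_i = e_i^2 * h_ii / (p * MSE * (1 - h_ii)^2),
# using the closed-form leverages of simple linear regression.
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                     # hypothetical data
n, p = len(x), 2                     # p = number of regression parameters
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
mse = sum(e ** 2 for e in resid) / (n - p)
h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]  # hat matrix diagonal
d = [e ** 2 * hi / (p * mse * (1 - hi) ** 2) for e, hi in zip(resid, h)]
assert abs(sum(h) - p) < 1e-9        # leverages always sum to p
```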
Multiple Regression: Model Setup and Estimation

Two predictors: Y_i = β_0 + β_1 X_{i1} + β_2 X_{i2} + e_i.
More generally, consider p − 1 predictors X_{i1}, ..., X_{i(p−1)}:
Y_i = β_0 + β_1 X_{i1} + β_2 X_{i2} + ... + β_{p−1} X_{i(p−1)} + e_i, for i = 1, ..., n.
We may write this in matrix form as
Y = Xβ + e,
where Y = (Y_1, ..., Y_n)ᵀ, β = (β_0, ..., β_{p−1})ᵀ, e = (e_1, ..., e_n)ᵀ ~ N(0, σ²I), and X is the n × p design matrix.

The LSE: β̂_LS = (XᵀX)⁻¹XᵀY, with Cov(β̂_LS) = σ²(XᵀX)⁻¹ and se(β̂_LS) = [MSE (XᵀX)⁻¹]^{1/2}. Under the normality assumption, β̂_ML = β̂_LS.
Fitted values: Ŷ = Xβ̂ = X(XᵀX)⁻¹XᵀY. Hat matrix: H = X(XᵀX)⁻¹Xᵀ.
Residuals: ê = Y − Ŷ = (I − H)Y, with Cov(ê) = σ²(I − H).

ANOVA Table

- SSE: êᵀê = Yᵀ(I − H)Y.
- SSTO: Σ_{i=1}^n (Y_i − Ȳ)² = YᵀY − (1/n) YᵀJY, where J is the n × n matrix with all components equal to 1.
- SSR = SSTO − SSE.
- MSE = SSE/(n − p). MSTO = SSTO/(n − 1). MSR = SSR/(p − 1).
- Overall F-test: H_0: β_1 = β_2 = ... = β_{p−1} = 0 vs. H_a: not all β's are zero. F = MSR/MSE; reject H_0 if F > F_{1−α; p−1, n−p}.
- R² = SSR/SSTO = 1 − SSE/SSTO. Adjusted R², which adjusts for the number of predictors: 1 − R²_A = ((n − 1)/(n − p)) (1 − R²).

Model Selection: Covariates
- Forward selection: start from no covariates.
- Backward selection: start from all covariates.
- Stepwise selection: backward + forward.
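The matrix formula β̂_LS = (XᵀX)⁻¹XᵀY reproduces the simple-regression closed form when the design matrix has rows (1, x_i). A dependency-free sketch with hypothetical data, writing out the 2×2 inverse by hand:

```python
# Matrix-form LSE: solve the normal equations (X^T X) beta = X^T Y
# for a design matrix with rows (1, x_i), and recover the hat diagonal.
x = [1, 2, 3, 4]
y = [1, 3, 2, 5]                                # hypothetical data
n = len(x)
sx = sum(x)
sxx_raw = sum(xi ** 2 for xi in x)
xtx = [[n, sx], [sx, sxx_raw]]                  # X^T X (2x2)
xty = [sum(y), sum(xi * yi for xi, yi in zip(x, y))]  # X^T Y
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
inv = [[xtx[1][1] / det, -xtx[0][1] / det],
       [-xtx[1][0] / det, xtx[0][0] / det]]     # (X^T X)^{-1}
beta = [inv[0][0] * xty[0] + inv[0][1] * xty[1],
        inv[1][0] * xty[0] + inv[1][1] * xty[1]]
# Hat matrix diagonal: h_ii = x_i^T (X^T X)^{-1} x_i with x_i = (1, x_i).
h = [inv[0][0] + 2 * inv[0][1] * xi + inv[1][1] * xi ** 2 for xi in x]
```

Here beta matches the (β̂_0, β̂_1) from the simple-regression formulas, and the h_ii sum to p = 2.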
Multicollinearity: When It Happens
- Adding or deleting a predictor changes R² substantially.
- Type III SSR heavily depends on the other variables in the model.
- se(β̂_k) is large.
- Predictors are not significant individually, but simple regression on each covariate is significant.

Remedies for Multicollinearity
- Center and standardize the predictors, which might be helpful in polynomial regression when some of the predictors are badly scaled.
- Drop the correlated variables by model selection, when (1) the predictor is not significant, and (2) the reduced model after dropping the predictor fits the data nearly as well as the full model.
- Add new observations. (Economy, business.)
- Use an index of several variables (PCA).

Variance Inflation Factor (VIF)

Let R²_j be the coefficient of determination of X_j on all the other predictors, i.e., the R² of the regression model X_{ij} = β_0 + β_1 X_{i1} + ... + β_{j−1} X_{i(j−1)} + β_{j+1} X_{i(j+1)} + ... + e_{ij}. Define
VIF_j = 1 / (1 − R²_j).
VIF_j ≥ 1 since R²_j ≤ 1 for all j. If VIF > 10, we usually believe the variable has enough inflated variation to cause a collinearity problem. In standardized regression, Var(β̂_j) = σ² VIF_j. In SAS, the VIF table is reported in PROC REG by adding the VIF option to the MODEL statement.

Ridge Regression: Motivation

Recall that β̂_LS = (XᵀX)⁻¹XᵀY, with E(β̂_LS) = β and Var(β̂_LS) = σ²(XᵀX)⁻¹. When XᵀX is nearly singular (the determinant is close to 0), the LSE is unbiased but has large variance, which leads to a large mean squared error of the estimator. The idea of ridge regression is to reduce variance at the cost of increasing bias.
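With only two predictors, regressing X_1 on X_2 gives R²_1 = r², the squared sample correlation, so both VIFs equal 1/(1 − r²). A minimal sketch with hypothetical data:

```python
# VIF for a two-predictor model: VIF_1 = VIF_2 = 1 / (1 - r^2),
# where r is the sample correlation between the two predictors.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]               # hypothetical, moderately correlated with x1
n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
r2 = s12 ** 2 / (s11 * s22)        # R_j^2 of one predictor on the other
vif = 1 / (1 - r2)
assert vif >= 1                    # VIF_j >= 1 always
```

Here r² = 0.6, so VIF = 2.5: each coefficient's variance is inflated 2.5-fold relative to uncorrelated predictors, well below the usual trouble threshold of 10.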
Ridge Regression

The ridge regression estimator is
β̂_r(b) = (XᵀX + bI)⁻¹XᵀY,
where b is a constant chosen by the user, referred to as the tuning parameter. As b increases, the bias increases but the variance decreases, and β̂_r(b) → 0 (componentwise). One may choose the tuning parameter b such that the ridge trace (β̂_r(b) plotted against b) gets flat, and VIF_j (plotted against b) drops to around 1.

SAS Programs

PROC GLM; PROC REG; PROC CATMOD; PROC GENMOD; PROC LOGISTIC; PROC NLIN; PROC PLS; PROC MIXED.

Reading Assignment

Textbook: Chapter 5 and Chapter 9.
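The ridge estimator (XᵀX + bI)⁻¹XᵀY can be sketched for a two-predictor, no-intercept toy model, with the 2×2 inverse written out by hand. At b = 0 it recovers the LSE; as b grows the size of the estimate shrinks, eventually toward 0 componentwise. The data are hypothetical:

```python
# Ridge estimator beta_r(b) = (X^T X + b I)^{-1} X^T Y for two predictors.
def ridge(x1, x2, y, b):
    a11 = sum(v ** 2 for v in x1) + b           # (X^T X + bI)[0][0]
    a22 = sum(v ** 2 for v in x2) + b           # (X^T X + bI)[1][1]
    a12 = sum(u * v for u, v in zip(x1, x2))    # off-diagonal of X^T X
    g1 = sum(u * v for u, v in zip(x1, y))      # first entry of X^T Y
    g2 = sum(u * v for u, v in zip(x2, y))      # second entry of X^T Y
    det = a11 * a22 - a12 ** 2
    return [(a22 * g1 - a12 * g2) / det,        # 2x2 inverse applied to X^T Y
            (a11 * g2 - a12 * g1) / det]

x1, x2, y = [1, 2, 3], [1, 1, 2], [2, 3, 5]     # hypothetical data
beta_ls = ridge(x1, x2, y, 0.0)                 # b = 0 recovers the LSE
beta_r = ridge(x1, x2, y, 1.0)                  # shrunk ridge estimate
# The squared norm of the ridge estimate is smaller than that of the LSE:
assert sum(v ** 2 for v in beta_r) < sum(v ** 2 for v in beta_ls)
```

Recomputing beta_r over a grid of b values and plotting each component against b gives the ridge trace described above.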
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationRegression Estimation Least Squares and Maximum Likelihood
Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize
More informationHigh-dimensional regression modeling
High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationIntroduction to Simple Linear Regression
Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationCorrelation and the Analysis of Variance Approach to Simple Linear Regression
Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation
More informationApplied Regression Analysis
Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of
More informationChapter 4: Regression Models
Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,
More informationDiagnostics and Remedial Measures
Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression
More informationChapter 3: Multiple Regression. August 14, 2018
Chapter 3: Multiple Regression August 14, 2018 1 The multiple linear regression model The model y = β 0 +β 1 x 1 + +β k x k +ǫ (1) is called a multiple linear regression model with k regressors. The parametersβ
More informationBIOS 2083 Linear Models c Abdus S. Wahed
Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationSTAT5044: Regression and Anova
STAT5044: Regression and Anova Inyoung Kim 1 / 25 Outline 1 Multiple Linear Regression 2 / 25 Basic Idea An extra sum of squares: the marginal reduction in the error sum of squares when one or several
More informationModel Selection Procedures
Model Selection Procedures Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Model Selection Procedures Consider a regression setting with K potential predictor variables and you wish to explore
More informationStatistics and Econometrics I
Statistics and Econometrics I Point Estimation Shiu-Sheng Chen Department of Economics National Taiwan University September 13, 2016 Shiu-Sheng Chen (NTU Econ) Statistics and Econometrics I September 13,
More informationConcordia University (5+5)Q 1.
(5+5)Q 1. Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Mid Term Test May 26, 2004 Two Hours 3 Instructor Course Examiner
More informationK. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =
K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing
More informationPredictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore
What is Multiple Linear Regression Several independent variables may influence the change in response variable we are trying to study. When several independent variables are included in the equation, the
More informationCOMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION
COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION Answer all parts. Closed book, calculators allowed. It is important to show all working,
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationChapter 12: Multiple Linear Regression
Chapter 12: Multiple Linear Regression Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 55 Introduction A regression model can be expressed as
More informationLecture 1: Linear Models and Applications
Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation
More information