Final Review. Yang Feng (Columbia University). 1 / 58



2 Outline
1. Multiple Linear Regression (Estimation, Inference)
2. Special Topics for Multiple Regression
   - Extra Sums of Squares
   - Standardized Version of the Multiple Regression Model
3. Polynomial and Interaction Regression Models
4. Model Selection
5. Remedial Measures for Multiple Linear Regression Models


4 General Regression Model in Matrix Terms
Y = (Y_1, ..., Y_n)', the n x 1 response vector.
X = the n x p design matrix whose i-th row is (1, X_i1, X_i2, ..., X_i,p-1).
β = (β_0, β_1, ..., β_{p-1})', the p x 1 coefficient vector.
ε = (ε_1, ..., ε_n)', the n x 1 error vector.

5 General Linear Regression in Matrix Terms
Y = Xβ + ε, with E(ε) = 0 and σ²{ε} = σ²I.
We have E(Y) = Xβ and σ²{Y} = σ²I.

6 Least Squares Solution
The matrix normal equations can be derived directly from the minimization of
Q(β) = (Y - Xβ)'(Y - Xβ)
with respect to β:
b = (X'X)^(-1)X'Y,  Ŷ = Xb
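As a quick numeric sketch of these formulas (not part of the original slides; the simulated data and dimensions are made up for illustration), the normal equations can be solved directly in NumPy:

```python
import numpy as np

# Sketch: least-squares solution b = (X'X)^{-1} X'Y on simulated data.
rng = np.random.default_rng(0)
n, p = 50, 3                                   # n cases, p parameters (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
Y = X @ beta + rng.normal(scale=0.3, size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)          # solve the normal equations (X'X)b = X'Y
Y_hat = X @ b                                  # fitted values
```

Solving (X'X)b = X'Y with a linear solver is numerically preferable to forming the inverse explicitly.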

7 Hat Matrix: Puts the Hat on Y
We can also directly express the fitted values in terms of the X and Y matrices:
Ŷ = X(X'X)^(-1)X'Y
Defining H, the hat matrix, by H = X(X'X)^(-1)X', we have Ŷ = HY.
The hat matrix plays an important role in diagnostics for regression analysis.

8 Hat Matrix Properties
1. The hat matrix is symmetric.
2. The hat matrix is idempotent, i.e. HH = H.
Important idempotent matrix property: for a symmetric and idempotent matrix A, rank(A) = trace(A), the number of non-zero eigenvalues of A.
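These two properties, and the rank-equals-trace fact, are easy to verify numerically. A minimal sketch with simulated data (the dimensions here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(40), rng.normal(size=(40, 2))])
H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix H = X(X'X)^{-1}X'

sym = np.allclose(H, H.T)                      # symmetric
idem = np.allclose(H @ H, H)                   # idempotent: HH = H
rank_eq_trace = np.isclose(np.trace(H), np.linalg.matrix_rank(H))
```

Note that trace(H) = p, the number of parameters, which is why h_ii values average p/n.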

9 Residuals
The residuals, like the fitted values Ŷ, can be expressed as linear combinations of the response observations Y_i:
e = Y - Ŷ = Y - HY = (I - H)Y
Also, remember e = Y - Ŷ = Y - Xb; these are equivalent.

10 Covariance of Residuals
Starting with e = (I - H)Y, we see that
σ²{e} = (I - H)σ²{Y}(I - H)'
but σ²{Y} = σ²{ε} = σ²I, which means that
σ²{e} = σ²(I - H)I(I - H)' = σ²(I - H)(I - H)
and since I - H is symmetric and idempotent, we have σ²{e} = σ²(I - H).

11 Quadratic Forms
In general, a quadratic form is defined by Y'AY = Σ_i Σ_j a_ij Y_i Y_j, where a_ij = a_ji, with A the matrix of the quadratic form.
The ANOVA sums SSTO, SSE and SSR can all be arranged into quadratic forms:
SSTO = Y'(I - (1/n)J)Y
SSE = Y'(I - H)Y
SSR = Y'(H - (1/n)J)Y
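A sketch verifying that these three quadratic forms reproduce the ANOVA identity SSTO = SSR + SSE on simulated data (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix
J = np.ones((n, n))                            # matrix of all ones
I = np.eye(n)

SSTO = Y @ (I - J / n) @ Y                     # total sum of squares
SSE  = Y @ (I - H) @ Y                         # error sum of squares
SSR  = Y @ (H - J / n) @ Y                     # regression sum of squares
```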

12 Inference
Since σ²{Y} = σ²I, we can write
σ²{b} = (X'X)^(-1)X' σ²I X(X'X)^(-1) = σ²(X'X)^(-1)X'X(X'X)^(-1) = σ²(X'X)^(-1)
And
E(b) = E((X'X)^(-1)X'Y) = (X'X)^(-1)X'E(Y) = (X'X)^(-1)X'Xβ = β

13 Inference
The estimated variance-covariance matrix: s²{b} = MSE · (X'X)^(-1)
Then we have
(b_k - β_k) / s{b_k} ~ t(n - p),  k = 0, 1, ..., p - 1
1 - α confidence intervals: b_k ± t(1 - α/2; n - p) s{b_k}
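The estimated covariance matrix s²{b} and standard errors s{b_k} follow directly from the formulas above. A hedged sketch on simulated data; the 1 - α interval would then be b_k ± t(1 - α/2; n - p) s{b_k}, with the t quantile taken from a table or a stats library:

```python
import numpy as np

rng = np.random.default_rng(10)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)       # mean squared error
s2_b = MSE * np.linalg.inv(X.T @ X)            # estimated covariance matrix s^2{b}
se_b = np.sqrt(np.diag(s2_b))                  # standard errors s{b_k}
```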

14 t Test
Tests for β_k:
H_0: β_k = 0 vs. H_a: β_k ≠ 0
Test statistic: t* = b_k / s{b_k}
Decision rule: if |t*| ≤ t(1 - α/2; n - p), conclude H_0; otherwise, conclude H_a.

15 F-test for Regression
H_0: β_1 = β_2 = ... = β_{p-1} = 0
H_a: not all β_k (k = 1, ..., p - 1) equal zero
Test statistic: F* = MSR / MSE
Decision rule:
if F* ≤ F(1 - α; p - 1, n - p), conclude H_0
if F* > F(1 - α; p - 1, n - p), conclude H_a

16 R² and Adjusted R²
The coefficient of multiple determination R² is defined as
R² = SSR/SSTO = 1 - SSE/SSTO,  0 ≤ R² ≤ 1
R² always increases when more variables are added. Therefore, the adjusted R²:
R²_a = 1 - [(n - 1)/(n - p)] · (SSE/SSTO)
R²_a may decrease when p is large.
Coefficient of multiple correlation: R = √R², always the positive square root!
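A small simulated check of these definitions (data made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([2.0, 1.0, 0.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSE = np.sum((Y - X @ b) ** 2)
SSTO = np.sum((Y - Y.mean()) ** 2)

R2 = 1 - SSE / SSTO                            # coefficient of multiple determination
R2_adj = 1 - (n - 1) / (n - p) * SSE / SSTO    # adjusted R^2
R = np.sqrt(R2)                                # multiple correlation, positive root
```

Since (n - 1)/(n - p) ≥ 1, the adjusted value can never exceed R².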



19 Extra Sums of Squares
Definition: the marginal decrease in SSE when one or several predictor variables are added to the regression model, given that the other variables are already in the model.
Examples:
SSR(X_1 | X_2) = SSE(X_2) - SSE(X_1, X_2) = SSR(X_1, X_2) - SSR(X_2)
SSR(X_3 | X_1, X_2) = SSE(X_1, X_2) - SSE(X_1, X_2, X_3) = SSR(X_1, X_2, X_3) - SSR(X_1, X_2)

20 ANOVA Table
Various software packages can provide extra sums of squares for regression analysis. These are usually reported in the order in which the input variables are supplied to the system.

21 Summary of Tests Concerning Regression Coefficients
Test whether all β_k = 0
Test whether a single β_k = 0
Test whether some β_k = 0
Tests involving relationships among coefficients, for example:
H_0: β_1 = β_2 vs. H_a: β_1 ≠ β_2
H_0: β_1 = 3, β_2 = 5 vs. H_a: otherwise
Key point in all tests: form the full model and the reduced model.

22 Coefficients of Partial Determination
Recall the coefficient of determination: R² measures the proportionate reduction in the variation of Y achieved by introducing the entire set of X variables.
Partial determination: measures the marginal contribution of one X variable when all the others are already in the model.

23 Two Predictor Variables
Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ε_i
The coefficient of partial determination between Y and X_1, given X_2 in the model, is denoted R²_{Y1·2}:
R²_{Y1·2} = [SSE(X_2) - SSE(X_1, X_2)] / SSE(X_2) = SSR(X_1 | X_2) / SSE(X_2)
Likewise:
R²_{Y2·1} = [SSE(X_1) - SSE(X_1, X_2)] / SSE(X_1) = SSR(X_2 | X_1) / SSE(X_1)

24 General Case
R²_{Y1·23} = SSR(X_1 | X_2, X_3) / SSE(X_2, X_3)
R²_{Y4·123} = SSR(X_4 | X_1, X_2, X_3) / SSE(X_1, X_2, X_3)
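A sketch of R²_{Y1·2} computed from the SSE definitions above, with a small helper sse() and simulated two-predictor data (all names assumed for illustration):

```python
import numpy as np

def sse(X, Y):
    """Residual sum of squares from an OLS fit of Y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r = Y - X @ b
    return r @ r

rng = np.random.default_rng(4)
n = 60
ones = np.ones((n, 1))
X1 = rng.normal(size=(n, 1))
X2 = rng.normal(size=(n, 1))
Y = 1 + 2 * X1[:, 0] + 0.5 * X2[:, 0] + rng.normal(size=n)

sse_2  = sse(np.hstack([ones, X2]), Y)         # SSE(X2)
sse_12 = sse(np.hstack([ones, X1, X2]), Y)     # SSE(X1, X2)

R2_Y1_2 = (sse_2 - sse_12) / sse_2             # = SSR(X1 | X2) / SSE(X2)
```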

25 Coefficients of Partial Correlation
A coefficient of partial correlation is the square root of the corresponding coefficient of partial determination, carrying the same sign as the regression coefficient!


27 Standardized Multiple Regression
Transformed variables:
Y*_i = (1/√(n - 1)) · (Y_i - Ȳ)/s_y
X*_ik = (1/√(n - 1)) · (X_ik - X̄_k)/s_k,  k = 1, ..., p - 1

28 Standardized Regression Model
The regression model using the transformed variables:
Y*_i = β*_1 X*_i1 + ... + β*_{p-1} X*_{i,p-1} + ε*_i
Notice that there is no need for an intercept.
It reduces to the standard linear regression problem.

29 Standardized Regression Model
The solution b* = (b*_1, b*_2, ..., b*_{p-1})' can be related to the solution of the untransformed regression problem through
b_k = (s_y / s_k) b*_k,  k = 1, ..., p - 1
b_0 = Ȳ - b_1 X̄_1 - ... - b_{p-1} X̄_{p-1}
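A sketch of this back-transformation on simulated data: fit the correlation-transformed model without an intercept, then recover the untransformed coefficients (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
X = rng.normal(size=(n, 2)) * np.array([3.0, 0.5]) + np.array([10.0, -2.0])
Y = 4 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

s_y = Y.std(ddof=1)
s_k = X.std(axis=0, ddof=1)
Ys = (Y - Y.mean()) / (np.sqrt(n - 1) * s_y)        # correlation-transformed Y
Xs = (X - X.mean(axis=0)) / (np.sqrt(n - 1) * s_k)  # correlation-transformed X

b_star, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)    # no intercept needed
b_k = (s_y / s_k) * b_star                          # back-transform the slopes
b_0 = Y.mean() - b_k @ X.mean(axis=0)               # recover the intercept
```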

30 Multicollinearity
Usually we still have a good fit of the data, and, in addition, good prediction.
The estimated regression coefficients tend to have large sampling variability when the predictor variables are highly correlated.
Some of the regression coefficients may not be statistically significant even though a definite statistical relation exists.
The common interpretation of a regression coefficient is NOT fully applicable any more.
Regress Y on both X_1 and X_2: it is possible that when individual t-tests are performed, neither β_1 nor β_2 is significant, yet when the F-test is performed for β_1 and β_2 jointly, the result is still significant.


32 One Predictor Variable: Second Order
Y_i = β_0 + β_1 x_i + β_11 x_i² + ε_i, where x_i = X_i - X̄.
X is centered because of the possible high correlation between X and X².
Regression function: E{Y} = β_0 + β_1 x + β_11 x², a quadratic response function.
β_0 is the mean response when x = 0, i.e., X = X̄.
β_1 is called the linear effect. β_11 is called the quadratic effect.

33 One Predictor Variable: Third Order
Y_i = β_0 + β_1 x_i + β_11 x_i² + β_111 x_i³ + ε_i, where x_i = X_i - X̄.

34 One Predictor Variable: Higher Orders
Employed with special caution: tends to overfit, giving poor prediction.

35 Two Predictors: Second Order
Y_i = β_0 + β_1 x_i1 + β_2 x_i2 + β_11 x_i1² + β_22 x_i2² + β_12 x_i1 x_i2 + ε_i
where x_i1 = X_i1 - X̄_1 and x_i2 = X_i2 - X̄_2.
The coefficient β_12 is called the interaction effect coefficient. More on interaction later.
Three predictors, second order, is similar.

36 Implementation of Polynomial Regression Models
Fitting: very easy; just use least squares for multiple linear regression, since these can all be seen as multiple regression models.
Determining the order: a very important step!
Y_i = β_0 + β_1 x_i + β_11 x_i² + β_111 x_i³ + ε_i
Naturally, we want to test whether or not β_111 = 0, or whether or not both β_11 = 0 and β_111 = 0. How to do the test?

37 Extra Sum of Squares
Decompose SSR into SSR(x), SSR(x² | x) and SSR(x³ | x, x²).
To test whether β_111 = 0: use SSR(x³ | x, x²).
To test whether both β_11 = 0 and β_111 = 0: use SSR(x², x³ | x).

38 Interpretation of Regression Models with Interactions
E{Y} = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_1 X_2
The change in mean response with a unit increase in X_1 when X_2 is held constant is β_1 + β_3 X_2.
Similarly, for a unit increase in X_2 when X_1 is held constant: β_2 + β_3 X_1.

39 Implementation of Interaction Regression Models
Center the predictor variables to avoid high multicollinearity: x_ik = X_ik - X̄_k.
Use prior knowledge to reduce the number of interactions: with 8 predictors there are 28 pairwise interaction terms in total; for p predictors, the number is p(p - 1)/2.

40 Qualitative Predictors
Examples: gender (male or female), purchase status (yes or no), disability status (not disabled, partly disabled, fully disabled).
A qualitative variable with c classes is represented by c - 1 indicator variables, each taking on the values 0 and 1.


42 Six Criteria
R²_p, R²_{a,p}, C_p, AIC_p, BIC_p (SBC_p), PRESS_p
Denote the total number of variables as P - 1, so there are P parameters in total. Here, 1 ≤ p ≤ P.
Coefficient of multiple determination: R²_p = 1 - SSE_p / SSTO
Adjusted coefficient of multiple determination:
R²_{a,p} = 1 - [(n - 1)/(n - p)] · (SSE_p / SSTO) = 1 - MSE_p / [SSTO/(n - 1)]

43 Mallows C_p Criterion
Concerned with the total mean squared error. Writing
Ŷ_i - μ_i = (E{Ŷ_i} - μ_i) + (Ŷ_i - E{Ŷ_i}),
we have
E(Ŷ_i - μ_i)² = (E{Ŷ_i} - μ_i)² + σ²{Ŷ_i}
Criterion measure:
Γ_p = (1/σ²) [ Σ_{i=1}^n (E{Ŷ_i} - μ_i)² + Σ_{i=1}^n σ²{Ŷ_i} ]
A good estimator of Γ_p is
C_p = SSE_p / MSE(X_1, ..., X_{P-1}) - (n - 2p)
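A sketch of C_p computed per the formula above on simulated data (the subset models are made up for illustration). One useful sanity check: for the full model, SSE_P / MSE(X_1, ..., X_{P-1}) = n - P, so C_P = P exactly:

```python
import numpy as np

rng = np.random.default_rng(11)
n, P = 50, 4                                   # P parameters in the full model
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, P - 1))])
Y = X_full @ np.array([1.0, 1.0, 0.0, 0.0]) + rng.normal(size=n)

def sse(X, Y):
    """SSE from an OLS fit of Y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r = Y - X @ b
    return r @ r

MSE_full = sse(X_full, Y) / (n - P)            # MSE(X_1, ..., X_{P-1})

def cp(X_sub):
    """Mallows C_p for the subset model with design matrix X_sub."""
    m, p = X_sub.shape
    return sse(X_sub, Y) / MSE_full - (m - 2 * p)

cp_full = cp(X_full)                           # equals P for the full model
cp_sub = cp(X_full[:, :2])                     # intercept + first predictor
```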

44 AIC and BIC
Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) (also called the Schwarz criterion, SBC, in the book) are two criteria that penalize model complexity. In the linear regression setting,
AIC_p = n log(SSE_p) - n log(n) + 2p
BIC_p = n log(SSE_p) - n log(n) + (log n) p
Roughly, you can think of these two criteria as penalizing models with many parameters (p in the case of linear regression).
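A sketch of AIC_p and BIC_p exactly as written above, on simulated data (the candidate models here are made up for illustration):

```python
import numpy as np

def aic_bic(X, Y):
    """AIC_p and BIC_p for an OLS fit, in the form given on the slide."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    sse = float(np.sum((Y - X @ b) ** 2))
    aic = n * np.log(sse) - n * np.log(n) + 2 * p
    bic = n * np.log(sse) - n * np.log(n) + np.log(n) * p
    return aic, bic

rng = np.random.default_rng(6)
n = 80
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
Y = X_full[:, :3] @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

aic_small, bic_small = aic_bic(X_full[:, :3], Y)   # true 3-parameter model
aic_full, bic_full = aic_bic(X_full, Y)            # adds an irrelevant predictor
```

Note the two criteria differ only in the per-parameter penalty: BIC_p - AIC_p = (log n - 2)p, so BIC penalizes extra parameters more heavily whenever n > e².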

45 PRESS_p, or Leave-One-Out Cross Validation
The PRESS_p (prediction sum of squares) criterion measures how well a subset model can predict the observed responses Y_i.
Let Ŷ_{i(i)} be the fitted value for case i from a model in which case i was left out during training. The PRESS_p criterion is then given by summing over all n cases:
PRESS_p = Σ_{i=1}^n (Y_i - Ŷ_{i(i)})²
PRESS_p values can be calculated without doing n separate regression runs.

46 PRESS_p, or Leave-One-Out Cross Validation
If we let d_i = Y_i - Ŷ_{i(i)} be the deleted residual for the i-th case, then we can rewrite
d_i = e_i / (1 - h_ii)
where e_i is the ordinary residual for the i-th case and h_ii is the i-th diagonal element of the hat matrix. We can obtain h_ii directly from
h_ii = X_i'(X'X)^(-1)X_i
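The shortcut d_i = e_i/(1 - h_ii) can be checked against a brute-force leave-one-out loop on simulated data; the two PRESS values agree exactly. A sketch:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T
b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b                                  # ordinary residuals
press_fast = np.sum((e / (1 - np.diag(H))) ** 2)   # hat-matrix shortcut

# Brute-force check: actually refit n times leaving one case out.
press_slow = 0.0
for i in range(n):
    keep = np.arange(n) != i
    bi, *_ = np.linalg.lstsq(X[keep], Y[keep], rcond=None)
    press_slow += (Y[i] - X[i] @ bi) ** 2
```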

47 Stepwise Regression Methods
An automatic search procedure.
Identifies a single best model.
Comes in several different formats.

48 Forward Stepwise Regression
A (greedy) procedure for identifying variables to include in the regression model. Repeat until finished:
1. Fit a simple linear regression model for each of the P - 1 X variables considered for inclusion. For each, compute the t* statistic for testing whether or not the slope is zero: t*_k = b_k / s{b_k}.
2. Pick the largest of the P - 1 t*_k values and include the corresponding X variable in the regression model if t*_k exceeds some significance threshold.
3. If the number of X variables included in the regression model is greater than one, check whether the model would be improved by dropping variables (using the t-test and a threshold again).
(Remember b_k is the estimate of β_k and s{b_k} is its estimated standard deviation.)

49 Forward Stepwise Regression (cont.)
Other criteria can be used in deciding which variables to add and delete, such as the F-test (full model vs. reduced model), AIC (the default option in R), BIC, or C_p.
Usually much more efficient than best subset regression.

50 Forward Regression
A simplified version of forward stepwise regression: no deletion step! Once a variable is in, it stays in from then on.

51 Backward Elimination
Start from the full model with P - 1 variables. Iteratively check whether any variable should be deleted from the model by some given criterion. This time, no addition step!


53 Unequal Error Variance
Y_i = β_0 + β_1 X_i1 + ... + β_{p-1} X_{i,p-1} + ε_i
Here the ε_i are independent N(0, σ_i²). (Originally: the ε_i are independent N(0, σ²).)
In matrix form:
σ²{ε} = diag(σ_1², σ_2², ..., σ_n²)

54 Known Error Variance
Define weights w_i = 1/σ_i² and let
W = diag(w_1, w_2, ..., w_n)
The weighted least squares and maximum likelihood estimator is
b_w = (X'WX)^(-1)X'WY
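A sketch of b_w on simulated data with σ_i treated as known (the variance pattern here is assumed for illustration). As a design note, weighting by w_i = 1/σ_i² is equivalent to ordinary least squares after rescaling each case by 1/σ_i:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 1))])
sigma_i = 0.2 + np.abs(X[:, 1])                # assumed-known error SDs
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * sigma_i

W = np.diag(1.0 / sigma_i**2)                  # w_i = 1 / sigma_i^2
b_w = np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)   # (X'WX)^{-1} X'WY
```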

55 Error Variance Known up to a Proportionality Constant
w_i = k · (1/σ_i²) for some constant k. Same estimator.

56 Unknown Error Variances
In reality, one rarely knows the variances σ_i².
Options: estimate the variance function or the standard deviation function; or use replicates or near replicates.

57 Estimation of Variance Function or Standard Deviation Function
Four steps (can be iterated several times to reach convergence):
1. Fit the regression model by unweighted least squares and analyze the residuals.
2. Estimate the variance function or the standard deviation function by regressing either the squared residuals or the absolute residuals on the appropriate predictor(s). (We know that the variance of ε_i is σ_i² = E(ε_i²) - (E(ε_i))² = E(ε_i²). Hence the squared residual e_i² is an estimator of σ_i².)
3. Use the fitted values from the estimated variance or standard deviation function to obtain the weights w_i.
4. Estimate the regression coefficients using these weights.

58 Ridge Estimators (Multicollinearity)
OLS: (X'X)b = X'Y
Transformed by the correlation transformation: r_XX b = r_YX
Ridge estimator: for a constant c ≥ 0,
(r_XX + cI) b^R = r_YX
c = 0 gives OLS; c > 0 gives a biased but much more stable estimator.
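A sketch of the ridge estimator on correlation-transformed variables, with two deliberately collinear simulated predictors (the constant c = 0.1 is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 50
Z = rng.normal(size=(n, 1))
X = np.hstack([Z + 0.05 * rng.normal(size=(n, 1)),   # two highly correlated predictors
               Z + 0.05 * rng.normal(size=(n, 1))])
Y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)

# Correlation transformation (as on the standardized-regression slides)
Xs = (X - X.mean(axis=0)) / (np.sqrt(n - 1) * X.std(axis=0, ddof=1))
Ys = (Y - Y.mean()) / (np.sqrt(n - 1) * Y.std(ddof=1))

r_XX = Xs.T @ Xs                               # correlation matrix of the predictors
r_YX = Xs.T @ Ys                               # correlations between Y and each predictor

c = 0.1
b_ols   = np.linalg.solve(r_XX, r_YX)                   # c = 0: OLS
b_ridge = np.linalg.solve(r_XX + c * np.eye(2), r_YX)   # ridge, c > 0
```

Because c > 0 shrinks every component of the solution, the ridge coefficient vector always has smaller norm than the OLS one, which is the stability the slide refers to.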


More information

Chapter 10 Building the Regression Model II: Diagnostics

Chapter 10 Building the Regression Model II: Diagnostics Chapter 10 Building the Regression Model II: Diagnostics 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 41 10.1 Model Adequacy for a Predictor Variable-Added

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore What is Multiple Linear Regression Several independent variables may influence the change in response variable we are trying to study. When several independent variables are included in the equation, the

More information

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I

PubH 7405: REGRESSION ANALYSIS. MLR: INFERENCES, Part I PubH 7405: REGRESSION ANALYSIS MLR: INFERENCES, Part I TESTING HYPOTHESES Once we have fitted a multiple linear regression model and obtained estimates for the various parameters of interest, we want to

More information

Chapter 14. Linear least squares

Chapter 14. Linear least squares Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given

More information

Chapter 12: Multiple Linear Regression

Chapter 12: Multiple Linear Regression Chapter 12: Multiple Linear Regression Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 55 Introduction A regression model can be expressed as

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Lecture 6: Linear Regression (continued)

Lecture 6: Linear Regression (continued) Lecture 6: Linear Regression (continued) Reading: Sections 3.1-3.3 STATS 202: Data mining and analysis October 6, 2017 1 / 23 Multiple linear regression Y = β 0 + β 1 X 1 + + β p X p + ε Y ε N (0, σ) i.i.d.

More information

Unit 11: Multiple Linear Regression

Unit 11: Multiple Linear Regression Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable

More information

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model Checking/Diagnostics Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics The session is a continuation of a version of Section 11.3 of MMD&S. It concerns

More information

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics

Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model Checking/Diagnostics Regression Analysis V... More Model Building: Including Qualitative Predictors, Model Searching, Model "Checking"/Diagnostics The session is a continuation of a version of Section 11.3 of MMD&S. It concerns

More information

Chapter 6 Multiple Regression

Chapter 6 Multiple Regression STAT 525 FALL 2018 Chapter 6 Multiple Regression Professor Min Zhang The Data and Model Still have single response variable Y Now have multiple explanatory variables Examples: Blood Pressure vs Age, Weight,

More information

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

Linear Models 1. Isfahan University of Technology Fall Semester, 2014 Linear Models 1 Isfahan University of Technology Fall Semester, 2014 References: [1] G. A. F., Seber and A. J. Lee (2003). Linear Regression Analysis (2nd ed.). Hoboken, NJ: Wiley. [2] A. C. Rencher and

More information

Introduction to Simple Linear Regression

Introduction to Simple Linear Regression Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

More information

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13)

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) 1. Weighted Least Squares (textbook 11.1) Recall regression model Y = β 0 + β 1 X 1 +... + β p 1 X p 1 + ε in matrix form: (Ch. 5,

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Regression Diagnostics for Survey Data

Regression Diagnostics for Survey Data Regression Diagnostics for Survey Data Richard Valliant Joint Program in Survey Methodology, University of Maryland and University of Michigan USA Jianzhu Li (Westat), Dan Liao (JPSM) 1 Introduction Topics

More information

STA 4210 Practise set 2b

STA 4210 Practise set 2b STA 410 Practise set b For all significance tests, use = 0.05 significance level. S.1. A linear regression model is fit, relating fish catch (Y, in tons) to the number of vessels (X 1 ) and fishing pressure

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Regression Models for Quantitative and Qualitative Predictors: An Overview

Regression Models for Quantitative and Qualitative Predictors: An Overview Regression Models for Quantitative and Qualitative Predictors: An Overview Polynomial regression models Interaction regression models Qualitative predictors Indicator variables Modeling interactions between

More information

Lecture 2: Linear and Mixed Models

Lecture 2: Linear and Mixed Models Lecture 2: Linear and Mixed Models Bruce Walsh lecture notes Introduction to Mixed Models SISG, Seattle 18 20 July 2018 1 Quick Review of the Major Points The general linear model can be written as y =

More information

2.2 Classical Regression in the Time Series Context

2.2 Classical Regression in the Time Series Context 48 2 Time Series Regression and Exploratory Data Analysis context, and therefore we include some material on transformations and other techniques useful in exploratory data analysis. 2.2 Classical Regression

More information

Regression Steven F. Arnold Professor of Statistics Penn State University

Regression Steven F. Arnold Professor of Statistics Penn State University Regression Steven F. Arnold Professor of Statistics Penn State University Regression is the most commonly used statistical technique. It is primarily concerned with fitting models to data. It is often

More information

Topic 18: Model Selection and Diagnostics

Topic 18: Model Selection and Diagnostics Topic 18: Model Selection and Diagnostics Variable Selection We want to choose a best model that is a subset of the available explanatory variables Two separate problems 1. How many explanatory variables

More information

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Lecture 2. The Simple Linear Regression Model: Matrix Approach Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution

More information

holding all other predictors constant

holding all other predictors constant Multiple Regression Numeric Response variable (y) p Numeric predictor variables (p < n) Model: Y = b 0 + b 1 x 1 + + b p x p + e Partial Regression Coefficients: b i effect (on the mean response) of increasing

More information

Linear regression methods

Linear regression methods Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response

More information

A Modern Look at Classical Multivariate Techniques

A Modern Look at Classical Multivariate Techniques A Modern Look at Classical Multivariate Techniques Yoonkyung Lee Department of Statistics The Ohio State University March 16-20, 2015 The 13th School of Probability and Statistics CIMAT, Guanajuato, Mexico

More information

Day 4: Shrinkage Estimators

Day 4: Shrinkage Estimators Day 4: Shrinkage Estimators Kenneth Benoit Data Mining and Statistical Learning March 9, 2015 n versus p (aka k) Classical regression framework: n > p. Without this inequality, the OLS coefficients have

More information

Ch14. Multiple Regression Analysis

Ch14. Multiple Regression Analysis Ch14. Multiple Regression Analysis 1 Goals : multiple regression analysis Model Building and Estimating More than 1 independent variables Quantitative( 量 ) independent variables Qualitative( ) independent

More information

No other aids are allowed. For example you are not allowed to have any other textbook or past exams.

No other aids are allowed. For example you are not allowed to have any other textbook or past exams. UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Sample Exam Note: This is one of our past exams, In fact the only past exam with R. Before that we were using SAS. In

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

Business Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata'

Business Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata' Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Model Evaluation and Selection Predictive Ability of a Model: Denition and Estimation We aim at achieving a balance between parsimony

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

Python 데이터분석 보충자료. 윤형기

Python 데이터분석 보충자료. 윤형기 Python 데이터분석 보충자료 윤형기 (hky@openwith.net) 단순 / 다중회귀분석 Logistic Regression 회귀분석 REGRESSION Regression 개요 single numeric D.V. (value to be predicted) 과 one or more numeric I.V. (predictors) 간의관계식. "regression"

More information

Statistics 262: Intermediate Biostatistics Model selection

Statistics 262: Intermediate Biostatistics Model selection Statistics 262: Intermediate Biostatistics Model selection Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Today s class Model selection. Strategies for model selection.

More information

Linear model selection and regularization

Linear model selection and regularization Linear model selection and regularization Problems with linear regression with least square 1. Prediction Accuracy: linear regression has low bias but suffer from high variance, especially when n p. It

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is

More information

Chapter 7 Student Lecture Notes 7-1

Chapter 7 Student Lecture Notes 7-1 Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model

More information

Linear Regression In God we trust, all others bring data. William Edwards Deming

Linear Regression In God we trust, all others bring data. William Edwards Deming Linear Regression ddebarr@uw.edu 2017-01-19 In God we trust, all others bring data. William Edwards Deming Course Outline 1. Introduction to Statistical Learning 2. Linear Regression 3. Classification

More information

STAT 705 Chapter 16: One-way ANOVA

STAT 705 Chapter 16: One-way ANOVA STAT 705 Chapter 16: One-way ANOVA Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 21 What is ANOVA? Analysis of variance (ANOVA) models are regression

More information

Lecture 11. Correlation and Regression

Lecture 11. Correlation and Regression Lecture 11 Correlation and Regression Overview of the Correlation and Regression Analysis The Correlation Analysis In statistics, dependence refers to any statistical relationship between two random variables

More information

Topic 4: Model Specifications

Topic 4: Model Specifications Topic 4: Model Specifications Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Functional Forms 1.1 Redefining Variables Change the unit of measurement of the variables will

More information

5. Multiple Regression (Regressioanalyysi) (Azcel Ch. 11, Milton/Arnold Ch. 12) The k-variable Multiple Regression Model

5. Multiple Regression (Regressioanalyysi) (Azcel Ch. 11, Milton/Arnold Ch. 12) The k-variable Multiple Regression Model 5. Multiple Regression (Regressioanalyysi) (Azcel Ch. 11, Milton/Arnold Ch. 12) The k-variable Multiple Regression Model The population regression model of a dependent variable Y on a set of k independent

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares ST 430/514 Recall the linear regression equation E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + + β k x k We have estimated the parameters β 0, β 1, β 2,..., β k by minimizing the sum of squared

More information

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R.

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R. Methods and Applications of Linear Models Regression and the Analysis of Variance Third Edition RONALD R. HOCKING PenHock Statistical Consultants Ishpeming, Michigan Wiley Contents Preface to the Third

More information