Math 423/533: The Main Theoretical Topics
Notation

- sample size $n$, data index $i$
- number of predictors $p$ ($p = 2$ for simple linear regression)
- $y_i$: response for individual $i$
- $x_i = (x_{i1}, \dots, x_{ip})$: $(1 \times p)$ row vector of predictors for individual $i$
- $X$: $(n \times p)$ matrix containing all predictors for all individuals $i = 1, \dots, n$
- $y = (y_1, \dots, y_n)^\top$: $(n \times 1)$ column vector of responses
- $Y_i$ and $Y$: random variables corresponding to the responses
Linear model assumptions

For $i = 1, \dots, n$,
$$E_{Y_i|x_i}[Y_i \mid x_i] = x_i\beta = \sum_{j=1}^{p} \beta_j x_{ij}, \qquad Var_{Y_i|x_i}[Y_i \mid x_i] = \sigma^2,$$
where $\beta = (\beta_1, \dots, \beta_p)^\top$ is the $(p \times 1)$ vector of regression coefficients, and $\sigma^2 > 0$ is the error variance. We assume also that $Y_1, \dots, Y_n$ are independent given $x_1, \dots, x_n$.
Linear model assumptions

In vector form,
$$E_{Y|X}[Y \mid X] = X\beta \quad (n \times 1) \qquad \text{and} \qquad Var_{Y|X}[Y \mid X] = \sigma^2 I_n \quad (n \times n).$$
This is equivalent to a model specification of
$$Y = X\beta + \epsilon,$$
where
$$E_{\epsilon|X}[\epsilon \mid X] = 0_n, \qquad Var_{\epsilon|X}[\epsilon \mid X] = \sigma^2 I_n.$$
The intercept

We usually consider including the special predictor $x_{i0} \equiv 1$, $i = 1, \dots, n$, and specify the model
$$E_{Y_i|x_i}[Y_i \mid x_i] = x_i\beta = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} = \sum_{j=0}^{p} \beta_j x_{ij}.$$
This model has $p + 1$ $\beta$ parameters. We will let $p$ count the total number of predictors, including the intercept term.
Simple linear regression

We specify
$$E_{Y_i|x_i}[Y_i \mid x_i] = \beta_0 + \beta_1 x_{i1} = [1 \;\; x_{i1}] \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = x_i\beta,$$
with $p = 2$ parameters in the regression model. This model posits a straight-line relationship between $x$ and $y$.
Least squares estimation in simple linear regression

On the basis of data $(x_{i1}, y_i)$, $i = 1, \dots, n$, we choose the line of best fit according to the least squares principle. We estimate the parameters $\beta = (\beta_0, \beta_1)^\top$ by $\widehat\beta$, where
$$\widehat\beta = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1})^2 = \arg\min_{\beta} S(\beta),$$
where we may also write, in vector form,
$$S(\beta) = (y - X\beta)^\top (y - X\beta).$$
We achieve the minimization by calculus.
The Normal Equations

We solve
$$\frac{\partial S(\beta)}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1}) = 0, \qquad \frac{\partial S(\beta)}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_{i1}(y_i - \beta_0 - \beta_1 x_{i1}) = 0.$$
These two equations can be written
$$n\beta_0 + \beta_1 \sum_{i=1}^{n} x_{i1} = \sum_{i=1}^{n} y_i, \qquad \beta_0 \sum_{i=1}^{n} x_{i1} + \beta_1 \sum_{i=1}^{n} x_{i1}^2 = \sum_{i=1}^{n} x_{i1} y_i,$$
or, in matrix form,
$$(X^\top X)\beta = X^\top y.$$
The Normal Equations

These equations are the Normal Equations. If the symmetric $p \times p = 2 \times 2$ matrix $X^\top X$ is non-singular, then we may write the solution
$$\widehat\beta = (X^\top X)^{-1} X^\top y,$$
which yields a $p \times 1 = 2 \times 1$ vector of least squares estimates. Explicitly, we have
$$\begin{bmatrix} \widehat\beta_0 \\ \widehat\beta_1 \end{bmatrix} = \begin{bmatrix} n & \sum x_{i1} \\ \sum x_{i1} & \sum x_{i1}^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum y_i \\ \sum x_{i1} y_i \end{bmatrix}.$$
Estimates

By direct inversion of the $2 \times 2$ matrix,
$$\begin{bmatrix} n & \sum x_{i1} \\ \sum x_{i1} & \sum x_{i1}^2 \end{bmatrix}^{-1} = \frac{1}{n \sum x_{i1}^2 - \left\{ \sum x_{i1} \right\}^2} \begin{bmatrix} \sum x_{i1}^2 & -\sum x_{i1} \\ -\sum x_{i1} & n \end{bmatrix}.$$
We write
$$S_{xx} = \sum_{i=1}^{n} x_{i1}^2 - \frac{1}{n}\left\{ \sum_{i=1}^{n} x_{i1} \right\}^2 = \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2, \qquad S_{xy} = \sum_{i=1}^{n} x_{i1} y_i - \frac{1}{n}\left\{ \sum_{i=1}^{n} x_{i1} \right\}\left\{ \sum_{i=1}^{n} y_i \right\} = \sum_{i=1}^{n} y_i (x_{i1} - \bar{x}_1).$$
Estimates (cont.)

Thus, after some algebra,
$$\widehat\beta_0 = \bar{y} - \widehat\beta_1 \bar{x}_1, \qquad \widehat\beta_1 = \frac{S_{xy}}{S_{xx}}.$$
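The closed-form estimates above are easy to check numerically. A minimal sketch, using a small hypothetical dataset (the values of `x` and `y` are invented for illustration):

```python
# Closed-form least squares estimates for simple linear regression,
# computed from S_xx and S_xy on a small hypothetical dataset.

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# S_xx = sum of (x_i - xbar)^2,  S_xy = sum of y_i * (x_i - xbar)
S_xx = sum((xi - xbar) ** 2 for xi in x)
S_xy = sum(yi * (xi - xbar) for xi, yi in zip(x, y))

beta1_hat = S_xy / S_xx              # slope estimate
beta0_hat = ybar - beta1_hat * xbar  # intercept estimate

print(beta0_hat, beta1_hat)  # 0.5 1.6
```

For this dataset $S_{xx} = 5$ and $S_{xy} = 8$, so the fitted line is $\widehat{y} = 0.5 + 1.6x$.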
Residuals and fitted values

Define, for $i = 1, \dots, n$,
$$e_i = y_i - (\widehat\beta_0 + \widehat\beta_1 x_{i1}) = y_i - \widehat{y}_i,$$
where $e_i$ is the $i$th residual and $\widehat{y}_i$ is the $i$th fitted value.
Statistical properties of least squares estimators

It is evident from the formula
$$\widehat\beta = (X^\top X)^{-1} X^\top y = Ay, \quad \text{say, where } A = (X^\top X)^{-1} X^\top,$$
that the least squares estimates are merely linear combinations of the observed responses $y = (y_1, \dots, y_n)^\top$. Specifically, in the simple linear regression,
$$\widehat\beta_0 = \sum_{i=1}^{n} \left( \frac{1}{n} - \bar{x}_1 c_i \right) y_i, \qquad \widehat\beta_1 = \sum_{i=1}^{n} c_i y_i,$$
where, for $i = 1, \dots, n$,
$$c_i = \frac{x_{i1} - \bar{x}_1}{S_{xx}}.$$
Statistical properties of the estimators

In random variable form, we have the estimators
$$\widehat\beta = (X^\top X)^{-1} X^\top Y = AY,$$
and thus, under the model assumptions
$$E_{Y|X}[Y \mid X] = X\beta \qquad \text{and} \qquad Var_{Y|X}[Y \mid X] = \sigma^2 I_n,$$
we can study distributional properties of the estimators.
Statistical properties of the estimators (cont.)

We have, using elementary properties of expectation and variance,
$$E_{Y|X}[\widehat\beta \mid X] = \beta \quad (p \times 1), \qquad Var_{Y|X}[\widehat\beta \mid X] = \sigma^2 (X^\top X)^{-1} \quad (p \times p),$$
with $p = 2$. Explicitly,
$$Var_{Y|X}[\widehat\beta_0 \mid X] = \sigma^2 \frac{\sum x_{i1}^2}{n S_{xx}} = \sigma^2 \left( \frac{1}{n} + \frac{(\bar{x}_1)^2}{S_{xx}} \right), \qquad Var_{Y|X}[\widehat\beta_1 \mid X] = \frac{\sigma^2}{S_{xx}}.$$
Residuals and Fitted values

For the simple linear regression:

1. $\sum_{i=1}^{n} e_i = 0$, so that $\bar{e} = 0$;
2. $\sum_{i=1}^{n} x_{i1} e_i = 0$;
3. $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \widehat{y}_i$;
4. $\sum_{i=1}^{n} \widehat{y}_i e_i = 0$;

that is, the observed residual vector $e$ is orthogonal to the observed $n \times 1$ vectors $1_n$, $x = (x_{11}, \dots, x_{n1})^\top$, and $\widehat{y} = (\widehat{y}_1, \dots, \widehat{y}_n)^\top$.
Residuals and Fitted values (cont.)

These results arise as $\widehat\beta$ solves
$$X^\top (y - X\widehat\beta) = 0_p,$$
and the results above follow immediately.
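The orthogonality properties can be verified directly. A sketch on the same kind of hypothetical dataset used earlier; the three printed sums should all be zero up to floating-point error:

```python
# Numerical check that least squares residuals are orthogonal to the
# intercept column, the predictor, and the fitted values (hypothetical data).

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
S_xx = sum((xi - xbar) ** 2 for xi in x)
S_xy = sum(yi * (xi - xbar) for xi, yi in zip(x, y))
b1 = S_xy / S_xx
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * xi for xi in x]          # fitted values
e = [yi - yh for yi, yh in zip(y, yhat)]   # residuals

# sum e_i, sum x_i e_i, sum yhat_i e_i: all zero up to rounding
print(sum(e),
      sum(xi * ei for xi, ei in zip(x, e)),
      sum(yh * ei for yh, ei in zip(yhat, e)))
```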
Estimating $\sigma^2$

Let
$$SS_{Res} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \widehat{y}_i)^2 = \sum_{i=1}^{n} y_i^2 - n(\bar{y})^2 - \widehat\beta_1 S_{xy} = SS_T - \widehat\beta_1 S_{xy},$$
say, where
$$SS_T = \sum_{i=1}^{n} y_i^2 - n(\bar{y})^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2.$$
We study the statistical properties of the random variable
$$\sum_{i=1}^{n} (Y_i - \widehat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - x_i\widehat\beta)^2 = (Y - X\widehat\beta)^\top (Y - X\widehat\beta),$$
where $\widehat\beta = (X^\top X)^{-1} X^\top Y$ is the vector of least squares estimators.
Estimating $\sigma^2$ (cont.)

But
$$\widehat{Y} = X\widehat\beta = X(X^\top X)^{-1} X^\top Y = HY,$$
say, where $H = X(X^\top X)^{-1} X^\top$ is the hat matrix. We can show that $H$ is symmetric, and that $HH = H$, so $H$ is idempotent.
Estimating $\sigma^2$ (cont.)

Now consider the simpler model where dependence on $x_{i1}$ is omitted, and we merely have an intercept term. Predictions in this model use the $(n \times 1)$ matrix
$$X = (1, 1, \dots, 1)^\top = 1_n,$$
yielding the corresponding hat matrix
$$H_1 = 1_n (1_n^\top 1_n)^{-1} 1_n^\top = \frac{1}{n} 1_n 1_n^\top,$$
which is merely the $(n \times n)$ matrix with all elements equal to $1/n$.
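The claimed properties of $H$ and $H_1$ are easy to confirm numerically. A minimal sketch on a hypothetical design:

```python
import numpy as np

# Build the hat matrix H = X (X'X)^{-1} X' for a hypothetical simple linear
# regression design, plus the intercept-only hat matrix H_1, and verify
# symmetry and idempotency.

x1 = np.array([1.0, 2.0, 3.0, 4.0])
n = len(x1)
X = np.column_stack([np.ones(n), x1])   # design matrix with intercept

H = X @ np.linalg.inv(X.T @ X) @ X.T    # hat matrix
H1 = np.full((n, n), 1.0 / n)           # intercept-only hat matrix: all 1/n

print(np.allclose(H, H.T), np.allclose(H @ H, H))  # symmetric, idempotent
```

Since the intercept column $1_n$ lies in the column space of $X$, we also have $HH_1 = H_1H = H_1$, which is what makes $H - H_1$ idempotent as claimed on the next slide.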
Estimating $\sigma^2$ (cont.)

We have that
$$SS_{Res} = (Y - X\widehat\beta)^\top (Y - X\widehat\beta) = Y^\top (I_n - H) Y,$$
where the $(n \times n)$ matrix $(I_n - H)$ is symmetric and idempotent. Now, we have the sum of squares decomposition
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \widehat{y}_i)^2 + \sum_{i=1}^{n} (\widehat{y}_i - \bar{y})^2,$$
or
$$SS_T = SS_{Res} + SS_R.$$
Estimating $\sigma^2$ (cont.)

Similarly to the previous result, we have
$$SS_T = Y^\top (I_n - H_1) Y \qquad \text{and} \qquad SS_R = Y^\top (H - H_1) Y,$$
yielding the representation
$$Y^\top (I_n - H_1) Y = Y^\top (I_n - H) Y + Y^\top (H - H_1) Y,$$
where the $(n \times n)$ matrices $(I_n - H_1)$ and $(H - H_1)$ are also symmetric and idempotent.
Estimating $\sigma^2$ (cont.)

Using the result for the expectation of a quadratic form: if $V$ is a $k$-dimensional random vector with
$$E[V] = \mu, \qquad Var[V] = \Sigma,$$
then for a $k \times k$ matrix $A$ we have
$$E[V^\top A V] = \text{trace}(A\Sigma) + \mu^\top A \mu.$$
It follows that
$$E_{Y|X}[Y^\top (I_n - H) Y] = (n - p)\sigma^2.$$
Hence an unbiased estimator of $\sigma^2$ is
$$\widehat\sigma^2 = \frac{SS_{Res}}{n - p} = MS_{Res},$$
with $p = 2$.
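The $(n - p)\sigma^2$ result follows because $(I_n - H)X\beta = 0$ kills the mean term and $\text{trace}(I_n - H) = n - p$. A sketch checking the trace identity on a hypothetical design with $p = 2$ columns:

```python
import numpy as np

# Why SS_Res/(n - p) is unbiased: E[Y'(I-H)Y] = sigma^2 * trace(I - H),
# since (I - H)X = 0 removes the mean term, and trace(I - H) = n - p.
# The design below is hypothetical, with p = 2 columns (intercept, slope).

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
n, p = len(x1), 2
X = np.column_stack([np.ones(n), x1])
H = X @ np.linalg.inv(X.T @ X) @ X.T

M = np.eye(n) - H
print(round(float(np.trace(M)), 6))  # 4.0, i.e. n - p
```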
Estimating $\sigma^2$ (cont.)

Using similar methods,
$$E_{Y|X}[SS_R \mid X] = E_{Y|X}[Y^\top (H - H_1) Y \mid X] = (p - 1)\sigma^2 + \beta_1^2 S_{xx}$$
and
$$E_{Y|X}[SS_T \mid X] = E_{Y|X}[Y^\top (I_n - H_1) Y \mid X] = (n - 1)\sigma^2 + \beta_1^2 S_{xx}.$$
Standard errors for the estimators

We have that
$$Var_{Y|X}[\widehat\beta \mid X] = \sigma^2 (X^\top X)^{-1},$$
which is estimated by $\widehat\sigma^2 (X^\top X)^{-1}$. The standard errors of the estimators are estimated by the square roots of the diagonal elements of this matrix; denote them by $\text{e.s.e}(\widehat\beta_j)$, $j = 0, 1$.
Hypothesis Testing

We can formulate hypothesis tests for the parameters provided we make the normality assumption
$$\epsilon \mid X \sim \text{Normal}(0_n, \sigma^2 I_n).$$
For $j = 0, 1$, to test
$$H_0: \beta_j = 0 \quad \text{vs} \quad H_1: \beta_j \neq 0,$$
we use the test statistic
$$t_j = \frac{\widehat\beta_j}{\text{e.s.e}(\widehat\beta_j)}.$$
Hypothesis Testing (cont.)

If $H_0$ is true, we have by standard distributional results that the corresponding statistic
$$T_j \sim \text{Student}(n - p),$$
with $p = 2$. We reject $H_0$ at significance level $\alpha$ if
$$|t_j| > t_{\alpha/2, n-p},$$
where $t_{\alpha, \nu}$ is the $1 - \alpha$ quantile of the Student-t distribution with $\nu$ degrees of freedom.
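A worked example of the $t$ statistic for the slope, on a small hypothetical dataset, using the closed-form quantities derived above:

```python
# t statistic for H0: beta_1 = 0 in simple linear regression,
# computed in pure Python on hypothetical data.

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n, p = len(x), 2
xbar, ybar = sum(x) / n, sum(y) / n
S_xx = sum((xi - xbar) ** 2 for xi in x)
S_xy = sum(yi * (xi - xbar) for xi, yi in zip(x, y))
b1 = S_xy / S_xx
b0 = ybar - b1 * xbar

SS_Res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma2_hat = SS_Res / (n - p)            # MS_Res, unbiased for sigma^2
ese_b1 = (sigma2_hat / S_xx) ** 0.5      # estimated standard error of b1

t1 = b1 / ese_b1
print(round(t1, 4))  # 11.3137; compare with t_{alpha/2, n-p} quantile
```

Here $\widehat\sigma^2 = 0.2/2 = 0.1$, so $t_1 = 1.6/\sqrt{0.1/5} = 8\sqrt{2} \approx 11.31$, far beyond any conventional critical value with $n - p = 2$ degrees of freedom.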
Confidence Intervals

A $(1 - \alpha) \times 100\%$ confidence interval for $\beta_j$ is
$$\widehat\beta_j \pm t_{\alpha/2, n-p} \, \text{e.s.e}(\widehat\beta_j), \quad j = 0, 1.$$
Global Model Adequacy

The $R^2$ statistic is defined by
$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_{Res}}{SS_T}$$
and is a measure of the global adequacy of $x$ as a predictor of $y$. The adjusted $R^2$ statistic is defined by
$$R^2_{Adj} = 1 - \frac{SS_{Res}/(n - p)}{SS_T/(n - 1)}$$
and is a measure that acknowledges that $SS_{Res}$ decreases in expectation as $p$ increases.
Local Model Adequacy

Residual plots are used to assess local model adequacy. If the model assumptions are correct, then the residual plots

- $e_i$ vs $i$
- $e_i$ vs $x_{i1}$
- $e_i$ vs $\widehat{y}_i$

should be patternless; that is, they should not exhibit systematic patterns in either mean-level or variability. The residuals should form a horizontal band around zero, with equal variability everywhere.
Prediction

Predictions from the model at a value of $x$ are formed by using the estimated regression coefficients; at $x = x_{i1}$ observed in the sample, the prediction equals the fitted value
$$\widehat{y}_i = \widehat\beta_0 + \widehat\beta_1 x_{i1}.$$
In vector form, we have $\widehat{y} = X\widehat\beta$. At $x = x_1^{new}$, we have the prediction
$$\widehat{y}^{new} = \widehat\beta_0 + \widehat\beta_1 x_1^{new}.$$
Confidence and Prediction Intervals

In the random variable form we have predictions
$$\widehat{Y} = X\widehat\beta = HY,$$
so that
$$E_{Y|X}[\widehat{Y} \mid X] = X\beta \qquad \text{and} \qquad Var_{Y|X}[\widehat{Y} \mid X] = \sigma^2 X(X^\top X)^{-1} X^\top = \sigma^2 H.$$
Therefore a $(1 - \alpha) \times 100\%$ confidence interval for the prediction at $x = x_{i1}$ is
$$\widehat{y}_i \pm t_{\alpha/2, n-p} \sqrt{\widehat\sigma^2 h_{ii}},$$
where $h_{ii}$ is the $(i, i)$th diagonal element of $H$.
Confidence and Prediction Intervals (cont.)

For a prediction at $x = x_1^{new}$, we have that
$$Var_{Y|X}[\widehat{Y}^{new} \mid X] = \sigma^2 x^{new} (X^\top X)^{-1} (x^{new})^\top = \sigma^2 h^{new},$$
and a $(1 - \alpha) \times 100\%$ confidence interval for the prediction at $x = x_1^{new}$ is
$$\widehat{y}^{new} \pm t_{\alpha/2, n-p} \sqrt{\widehat\sigma^2 h^{new}}.$$
Confidence and Prediction Intervals (cont.)

A prediction interval at $x = x_1^{new}$ incorporates the random variation that is present in the observations. Let
$$\widehat{Y}^{new}_O = \widehat{Y}^{new} + \epsilon^{new},$$
where $\epsilon^{new}$ is a zero mean, variance $\sigma^2$ random residual error, independent of all other random quantities. Then
$$Var_{Y|X}[\widehat{Y}^{new}_O \mid X] = Var_{Y|X}[\widehat{Y}^{new} \mid X] + Var_{Y|X}[\epsilon^{new} \mid X] = \sigma^2 h^{new} + \sigma^2 = \sigma^2 (1 + h^{new}).$$
Thus a $(1 - \alpha) \times 100\%$ prediction interval at $x = x_1^{new}$ is
$$\widehat{y}^{new} \pm t_{\alpha/2, n-p} \sqrt{\widehat\sigma^2 (1 + h^{new})}.$$
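For simple linear regression the leverage at a new point has the closed form $h^{new} = 1/n + (x^{new} - \bar{x}_1)^2 / S_{xx}$, which makes the width comparison concrete. A sketch with a hypothetical design and new point:

```python
# Half-width factors of confidence vs prediction intervals at a new point,
# using h_new = 1/n + (x_new - xbar)^2 / S_xx (simple linear regression).
# The data and x_new are hypothetical.

x = [1.0, 2.0, 3.0, 4.0]
n = len(x)
xbar = sum(x) / n
S_xx = sum((xi - xbar) ** 2 for xi in x)

x_new = 5.0
h_new = 1.0 / n + (x_new - xbar) ** 2 / S_xx

# The CI half-width is proportional to sqrt(h_new); the PI half-width to
# sqrt(1 + h_new), so the prediction interval is always the wider one.
print(h_new, 1 + h_new)  # 1.5 2.5
```

Note how $h^{new}$ grows quadratically as $x^{new}$ moves away from $\bar{x}_1$: extrapolated predictions carry wider intervals.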
The Analysis of Variance

The sums-of-squares decomposition
$$SS_T = SS_{Res} + SS_R$$
forms the basic component of the Analysis of Variance (ANOVA), as it describes how the overall observed variability in the response $y$ ($SS_T$) is decomposed into a component corresponding to the residual errors ($SS_{Res}$) and a component corresponding to the regression ($SS_R$).
The Analysis of Variance (cont.)

Under the assumption of Normality of the residual errors, $\epsilon \mid X \sim \text{Normal}(0_n, \sigma^2 I_n)$, and the hypothesis that $\beta_1 = 0$, we have the following results for the sums-of-squares random variables:
$$\frac{SS_T}{\sigma^2} = \frac{Y^\top (I_n - H_1) Y}{\sigma^2} \sim \chi^2_{n-1}, \qquad \frac{SS_{Res}}{\sigma^2} = \frac{Y^\top (I_n - H) Y}{\sigma^2} \sim \chi^2_{n-p}, \qquad \frac{SS_R}{\sigma^2} = \frac{Y^\top (H - H_1) Y}{\sigma^2} \sim \chi^2_{p-1},$$
with $SS_{Res}$ and $SS_R$ independent.
The Analysis of Variance (cont.)

Consequently, we can show that under the hypothesis, the random variable
$$F = \frac{SS_R/(p - 1)}{SS_{Res}/(n - p)}$$
has a Fisher-F distribution with $p - 1$ and $n - p$ degrees of freedom,
$$F \sim \text{Fisher}(p - 1, n - p).$$
We can construct a test of $H_0: \beta_1 = 0$ based on this result: we reject $H_0$ at significance level $\alpha$ if
$$F > F_{\alpha, p-1, n-p},$$
where $F_{\alpha, \nu_1, \nu_2}$ is the $(1 - \alpha)$ quantile of the Fisher-F distribution with $\nu_1$ and $\nu_2$ degrees of freedom.
The Analysis of Variance (cont.)

This test is equivalent to the test of $H_0: \beta_1 = 0$ based on the t-statistic; we have that
$$t_1^2 = \left\{ \frac{\widehat\beta_1}{\text{e.s.e}(\widehat\beta_1)} \right\}^2 = F,$$
and the two-tailed test based on $t_1$ is equivalent to the one-tailed test based on $F$.
The Analysis of Variance (cont.)

The ANOVA table arranges the required information in tabular form:

Source       SS        df       MS                F
Regression   SS_R      p - 1    SS_R/(p - 1)      F
Residual     SS_Res    n - p    SS_Res/(n - p)
Total        SS_T      n - 1
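The ANOVA quantities above can be computed by hand on a hypothetical dataset, checking both the decomposition $SS_T = SS_{Res} + SS_R$ and the identity $F = t_1^2$:

```python
# Worked ANOVA decomposition for simple linear regression on hypothetical
# data, verifying SS_T = SS_Res + SS_R and F = t_1^2.

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n, p = len(x), 2
xbar, ybar = sum(x) / n, sum(y) / n
S_xx = sum((xi - xbar) ** 2 for xi in x)
S_xy = sum(yi * (xi - xbar) for xi, yi in zip(x, y))
b1 = S_xy / S_xx
b0 = ybar - b1 * xbar

SS_T = sum((yi - ybar) ** 2 for yi in y)
SS_Res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
SS_R = sum((b0 + b1 * xi - ybar) ** 2 for xi in x)  # computed independently

F = (SS_R / (p - 1)) / (SS_Res / (n - p))
t1 = b1 / ((SS_Res / (n - p)) / S_xx) ** 0.5
print(round(F, 6), round(t1 ** 2, 6))  # 128.0 128.0
```

Here $SS_T = 13 = 0.2 + 12.8 = SS_{Res} + SS_R$, and the F statistic $128 = (8\sqrt{2})^2$ equals the squared slope $t$ statistic, as claimed.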
Maximum Likelihood Estimation

Under the assumption of Normality of the residual errors, $\epsilon \mid X \sim \text{Normal}(0_n, \sigma^2 I_n)$, so that
$$Y \mid X \sim \text{Normal}(X\beta, \sigma^2 I_n),$$
we may consider using a maximum likelihood (ML) procedure to estimate the model parameters $(\beta, \sigma^2)$. The likelihood function for data $y$ is
$$L(\beta, \sigma^2) = \left( \frac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left\{ -\frac{1}{2\sigma^2} (y - X\beta)^\top (y - X\beta) \right\}.$$
Maximum Likelihood Estimation (cont.)

The log-likelihood is
$$\ell(\beta, \sigma^2) = -\frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} (y - X\beta)^\top (y - X\beta) + \text{constant} = -\frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} S(\beta) + \text{constant}.$$
We seek to maximize this function with respect to $\beta$ and $\sigma^2$. It is evident that, in terms of $\beta$, the maximum value of $\ell(\beta, \sigma^2)$ is attained (for any $\sigma^2$) when $S(\beta)$ is minimized. Thus the maximum likelihood estimate of $\beta$ is identical to the least squares estimate.
Maximum Likelihood Estimation (cont.)

To maximize over $\sigma^2$, note that
$$\frac{\partial \ell(\beta, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} S(\beta).$$
Equating to zero, and setting $\beta = \widehat\beta$, we have the solution
$$\widehat\sigma^2_{ML} = \frac{1}{n} S(\widehat\beta) = \frac{1}{n} (y - X\widehat\beta)^\top (y - X\widehat\beta) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \widehat{y}_i)^2 = \frac{SS_{Res}}{n}.$$
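The ML variance estimate divides by $n$ rather than $n - p$, so it is biased downward by the factor $(n - p)/n$. A sketch contrasting the two on hypothetical data:

```python
# ML estimate of sigma^2 (divide by n) vs the unbiased estimate
# (divide by n - p), on hypothetical data.

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
n, p = len(x), 2
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum(yi * (xi - xbar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
SS_Res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

sigma2_ml = SS_Res / n           # maximum likelihood estimate
sigma2_unb = SS_Res / (n - p)    # unbiased estimate, MS_Res

# The ML estimate is smaller by exactly the factor (n - p)/n.
print(sigma2_ml, sigma2_unb)
```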
Multiple Linear Regression

The simple linear regression model extends to include multiple predictors,
$$E_{Y|X}[Y \mid X] = X\beta,$$
in a straightforward way. Least squares estimation, inference, testing, prediction, etc. proceed in precisely the same way. Model checking using residuals and $R^2$ also proceeds as for simple linear regression. F-testing procedures for assessing the utility of including all the predictors proceed as before, and there is a natural extension to the use of ANOVA tables in regression.
Hypothesis Testing

However, now we may also consider individual t-tests for the coefficients, from a single fit. Furthermore, we may consider simultaneous tests for collections of parameters using the F-test. We split
$$X = [X^{(1)} \; X^{(2)}] \qquad \text{and} \qquad \beta = [\beta^{(1)} \; \beta^{(2)}]$$
and consider the model
$$Y = X\beta + \epsilon = X^{(1)}\beta^{(1)} + X^{(2)}\beta^{(2)} + \epsilon.$$
We compare a simpler model with a more complex model, where the simpler model is obtained by hypothesizing that some $\beta$s are zero, that is,
$$\text{Full Model}: \; Y = X\beta + \epsilon \qquad \text{Reduced Model}: \; Y = X^{(1)}\beta^{(1)} + \epsilon;$$
that is, under the reduced model we assume $\beta^{(2)} = 0_r$.
Hypothesis Testing (cont.)

In general the parameter estimates arising from the two models will be different; that is, the $\beta^{(1)}$ estimates from the Full model will in general not be equal to the $\beta^{(1)}$ estimates from the Reduced model, as in the Full model
$$\widehat\beta = \begin{bmatrix} \widehat\beta^{(1)} \\ \widehat\beta^{(2)} \end{bmatrix} = \begin{bmatrix} \{X^{(1)}\}^\top X^{(1)} & \{X^{(1)}\}^\top X^{(2)} \\ \{X^{(2)}\}^\top X^{(1)} & \{X^{(2)}\}^\top X^{(2)} \end{bmatrix}^{-1} \begin{bmatrix} \{X^{(1)}\}^\top y \\ \{X^{(2)}\}^\top y \end{bmatrix},$$
and in the Reduced model
$$\widehat\beta^{(1)} = (\{X^{(1)}\}^\top X^{(1)})^{-1} \{X^{(1)}\}^\top y.$$
Hypothesis Testing (cont.)

We modify the earlier sums of squares decomposition
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \widehat{y}_i)^2 + \sum_{i=1}^{n} (\widehat{y}_i - \bar{y})^2, \qquad SS_T = SS_{Res} + SS_R,$$
to the uncorrected form
$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} (y_i - \widehat{y}_i)^2 + \sum_{i=1}^{n} \widehat{y}_i^2, \qquad SS_T^* = SS_{Res} + SS_R^*.$$
Note that the corrected and uncorrected versions are related by
$$SS_T = SS_T^* - n\{\bar{y}\}^2 \qquad \text{and} \qquad SS_R = SS_R^* - n\{\bar{y}\}^2.$$
Hypothesis Testing (cont.)

If $SS_R(\beta)$ and $SS_R(\beta^{(1)})$ denote the regression sums of squares from the Full and Reduced models respectively, the extra sum of squares due to $\beta^{(2)}$ in the presence of $\beta^{(1)}$ is
$$SS_R(\beta^{(2)} \mid \beta^{(1)}) = SS_R(\beta) - SS_R(\beta^{(1)}).$$
This facilitates the F-test of the null hypothesis
$$H_0: \beta^{(2)} = 0_r,$$
that is, that the Reduced model is an adequate simplification of the Full model.
Hypothesis Testing (cont.)

We perform a partial F-test using
$$F = \frac{SS_R(\beta^{(2)} \mid \beta^{(1)})/r}{MS_{Res}(\beta)} = \frac{(SS_{Res}(\beta^{(1)}) - SS_{Res}(\beta))/r}{SS_{Res}(\beta)/(n - p)},$$
which is distributed as $\text{Fisher}(r, n - p)$ if $H_0$ is true.
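The partial F-test is simple to carry out by fitting both models and comparing residual sums of squares. A sketch on wholly hypothetical data, comparing a Full model (intercept, $x_1$, $x_2$) to a Reduced model (intercept, $x_1$):

```python
import numpy as np

# Partial F-test comparing a Full model (intercept, x1, x2) against a
# Reduced model (intercept, x1). All data below are hypothetical.

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
y  = np.array([3.1, 3.9, 7.2, 7.8, 11.0, 11.9, 15.1, 15.8])
n = len(y)

X_full = np.column_stack([np.ones(n), x1, x2])   # p = 3 columns
X_red  = np.column_stack([np.ones(n), x1])       # beta_(2) = 0 under H0
p, r = X_full.shape[1], X_full.shape[1] - X_red.shape[1]

def ss_res(X, y):
    """Residual sum of squares from the least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e)

ss_full, ss_red = ss_res(X_full, y), ss_res(X_red, y)
F = ((ss_red - ss_full) / r) / (ss_full / (n - p))
print(F)  # compare with the Fisher(r, n - p) upper-alpha quantile
```

Dropping a predictor can never decrease the residual sum of squares, so the numerator is always non-negative.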
Hypothesis Testing (cont.)

The test may be used sequentially: for example, if the parameter is $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)$, then we have
$$SS_R(\beta_0, \beta_1, \beta_2, \beta_3) = SS_R(\beta_0) + SS_R(\beta_1 \mid \beta_0) + SS_R(\beta_2 \mid \beta_0, \beta_1) + SS_R(\beta_3 \mid \beta_0, \beta_1, \beta_2)$$
and also
$$SS_R(\beta_1, \beta_2, \beta_3 \mid \beta_0) = SS_R(\beta_1 \mid \beta_0) + SS_R(\beta_2 \mid \beta_0, \beta_1) + SS_R(\beta_3 \mid \beta_0, \beta_1, \beta_2).$$
We also have that, in general,
$$SS_R(\beta_1, \beta_2, \beta_3 \mid \beta_0) \neq SS_R(\beta_1, \beta_2, \beta_3).$$
Hypothesis Testing (cont.)

This latter decomposition allows us to test

- whether $X_1$ is worth including in the model;
- then whether $X_2$ is worth including, when $X_1$ is already included;
- then whether $X_3$ is worth including, when $X_1$ and $X_2$ are already included.
Hypothesis Testing (cont.)

Note: if the $X^{(1)}$ and $X^{(2)}$ blocks are orthogonal, that is,
$$\{X^{(1)}\}^\top X^{(2)} = 0_{(p-r) \times r},$$
then the $\beta$ estimates from the Full and Reduced models are equal, and we have the identity
$$SS_R(\beta^{(1)} \mid \beta^{(2)}) = SS_R(\beta^{(1)}).$$
Multicollinearity

Multicollinearity corresponds to dependence/correlation amongst the predictors. It

- can lead to inflation of the variance of estimators if present;
- can be measured by variance inflation factors, which are equivalent to $R^2$ measures constructed for the predictors;
- can be computed from the correlation matrix of the predictors.
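The variance inflation factor for predictor $j$ is $\text{VIF}_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the remaining predictors. A sketch on hypothetical data in which two predictors are deliberately near-collinear:

```python
import numpy as np

# Variance inflation factors via the regress-each-predictor-on-the-others
# definition: VIF_j = 1 / (1 - R_j^2). All data below are hypothetical.

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.1, 2.1, 2.9, 4.2, 4.8, 6.1])  # nearly collinear with x1
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])  # unrelated pattern
Z = np.column_stack([x1, x2, x3])
n, k = Z.shape

def vif(j):
    """VIF of predictor j from its regression on the other predictors."""
    target = Z[:, j]
    others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, target, rcond=None)
    e = target - others @ beta
    ss_res = float(e @ e)
    ss_tot = float(((target - target.mean()) ** 2).sum())
    return ss_tot / ss_res   # equals 1 / (1 - R_j^2)

vifs = [vif(j) for j in range(k)]
print(vifs)  # x1 and x2 show large VIFs; x3 stays near 1
```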
Special Types of Predictors

- polynomial terms;
- factor predictors: discrete predictors measured on a nominal (non-numerical) scale;
- interactions: an interaction is a modification of the effect of one predictor in the presence of another;
- higher-order interactions.
Special Types of Predictors (cont.)

Important issues include

- how to represent factor predictors using indicator functions;
- how to count parameters;
- the interpretation of interactions;
- removing the intercept.
Model Selection Strategies

Our goal is to find the simplest possible model that adequately explains the observed response data.

- Forward selection;
- Backward elimination;
- Stepwise selection.
Model Selection Criteria

- $R^2$-based methods: largest $R^2_{Adj}$, equivalent to minimum $MS_{Res}$;
- Mallows's $C_p$;
- AIC;
- BIC.
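AIC and BIC trade fit against complexity. One common convention for the Gaussian linear model (an assumption on my part; software packages differ by additive constants) is $\text{AIC} = n\log(SS_{Res}/n) + 2k$ and $\text{BIC} = n\log(SS_{Res}/n) + k\log n$, where $k$ counts the estimated coefficients. A sketch with hypothetical fit results:

```python
import math

# AIC/BIC under one common convention for Gaussian linear models
# (an assumed form; implementations differ by additive constants):
#   AIC = n*log(SS_Res/n) + 2k,  BIC = n*log(SS_Res/n) + k*log(n).

def aic(ss_res, n, k):
    return n * math.log(ss_res / n) + 2 * k

def bic(ss_res, n, k):
    return n * math.log(ss_res / n) + k * math.log(n)

# Hypothetical fits: the bigger model reduces SS_Res only slightly.
n = 20
aic_small, aic_big = aic(10.0, n, 2), aic(9.8, n, 3)
bic_small, bic_big = bic(10.0, n, 2), bic(9.8, n, 3)
print(aic_small < aic_big, bic_small < bic_big)  # True True
```

Both criteria prefer the smaller model here; since $\log n > 2$ for $n \geq 8$, BIC penalizes the extra parameter more heavily than AIC.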
Why the best model matters

Over-simplified model: if the chosen model omits important predictors, then

- the resulting estimates will be biased, have lower variance, and can have higher mean-squared error, depending on the magnitude of the effect of the omitted predictors;
- the estimate of $\sigma^2$ will be too high;
- predictions will have higher mean squared error.

If the omitted predictors are orthogonal to the included ones, then the bias is removed.
Why the best model matters (cont.)

Over-complex model: if the chosen model includes unnecessary predictors, then

- the resulting estimates will be unbiased, but have higher variance and higher mean-squared error;
- the estimate of $\sigma^2$ will be unbiased;
- predictions will have no bias, but higher variance and higher mean squared error.
Assessing the model fit

- PRESS residuals/statistic;
- leave-one-out/deletion residuals;
- leverage;
- influence in estimation and prediction;
- outliers;
- influence diagnostics.
Generalizing Least Squares

We may relax the variance assumption $Var_{Y|X}[Y \mid X] = \sigma^2 I_n$ to
$$Var_{Y|X}[Y \mid X] = \sigma^2 V,$$
where $V$ is a square, symmetric, non-singular matrix. This allows for

- unequal variances in the conditional response distribution;
- correlation amongst the residual errors $\epsilon$.

This leads to the amended least squares objective function
$$(y - X\beta)^\top V^{-1} (y - X\beta).$$
Generalizing Least Squares (cont.)

This leads to the estimates
$$\widehat\beta = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y,$$
and the properties of the estimators, predictions, etc. follow in a straightforward fashion. A special case is weighted least squares, where
$$V = \text{diag}(1/w_1, 1/w_2, \dots, 1/w_n),$$
which accommodates unequal variances.
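In the weighted case the GLS estimate can equivalently be obtained by rescaling each row by $\sqrt{w_i}$ and running ordinary least squares. A sketch with hypothetical data and weights:

```python
import numpy as np

# Weighted least squares: with V = diag(1/w_i), the GLS estimate
# (X' V^{-1} X)^{-1} X' V^{-1} y equals OLS applied to rows rescaled
# by sqrt(w_i). Data and weights below are hypothetical.

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y  = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
w  = np.array([1.0, 2.0, 1.0, 0.5, 1.0])   # precision weights
n = len(y)

X = np.column_stack([np.ones(n), x1])
Vinv = np.diag(w)                           # V^{-1} = diag(w_i)

# Direct GLS / WLS formula.
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

# Equivalent route: rescale rows by sqrt(w_i), then ordinary least squares.
sw = np.sqrt(w)
beta_ols, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

print(beta_gls, beta_ols)  # the two estimates agree
```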
Transforming the response

We may transform the response $y_i$ to a different form to make the linear regression model assumptions more appropriate:

- log;
- square root;
- Box-Cox transform.
More informationLecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is
Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y
More informationLecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012
Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed
More informationSimple Linear Regression
Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors
More informationDirection: This test is worth 250 points and each problem worth points. DO ANY SIX
Term Test 3 December 5, 2003 Name Math 52 Student Number Direction: This test is worth 250 points and each problem worth 4 points DO ANY SIX PROBLEMS You are required to complete this test within 50 minutes
More informationContents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects
Contents 1 Review of Residuals 2 Detecting Outliers 3 Influential Observations 4 Multicollinearity and its Effects W. Zhou (Colorado State University) STAT 540 July 6th, 2015 1 / 32 Model Diagnostics:
More informationChapter 12: Multiple Linear Regression
Chapter 12: Multiple Linear Regression Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 55 Introduction A regression model can be expressed as
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationMultiple Linear Regression
Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach
More informationLinear Regression In God we trust, all others bring data. William Edwards Deming
Linear Regression ddebarr@uw.edu 2017-01-19 In God we trust, all others bring data. William Edwards Deming Course Outline 1. Introduction to Statistical Learning 2. Linear Regression 3. Classification
More informationSTATISTICS 479 Exam II (100 points)
Name STATISTICS 79 Exam II (1 points) 1. A SAS data set was created using the following input statement: Answer parts(a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) () Give the
More information6. Multiple Linear Regression
6. Multiple Linear Regression SLR: 1 predictor X, MLR: more than 1 predictor Example data set: Y i = #points scored by UF football team in game i X i1 = #games won by opponent in their last 10 games X
More informationSMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning
SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems Lecture 10: Data and Regression Analysis Lecturer: Prof. Duane S. Boning 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationRegression Model Specification in R/Splus and Model Diagnostics. Daniel B. Carr
Regression Model Specification in R/Splus and Model Diagnostics By Daniel B. Carr Note 1: See 10 for a summary of diagnostics 2: Books have been written on model diagnostics. These discuss diagnostics
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationApplied linear statistical models: An overview
Applied linear statistical models: An overview Gunnar Stefansson 1 Dept. of Mathematics Univ. Iceland August 27, 2010 Outline Some basics Course: Applied linear statistical models This lecture: A description
More informationSTATISTICS 174: APPLIED STATISTICS TAKE-HOME FINAL EXAM POSTED ON WEBPAGE: 6:00 pm, DECEMBER 6, 2004 HAND IN BY: 6:00 pm, DECEMBER 7, 2004 This is a
STATISTICS 174: APPLIED STATISTICS TAKE-HOME FINAL EXAM POSTED ON WEBPAGE: 6:00 pm, DECEMBER 6, 2004 HAND IN BY: 6:00 pm, DECEMBER 7, 2004 This is a take-home exam. You are expected to work on it by yourself
More informationVariable Selection and Model Building
LINEAR REGRESSION ANALYSIS MODULE XIII Lecture - 39 Variable Selection and Model Building Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur 5. Akaike s information
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationProbability and Statistics Notes
Probability and Statistics Notes Chapter Seven Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Seven Notes Spring 2011 1 / 42 Outline
More information13 Simple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity
More informationDiagnostics and Remedial Measures
Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression
More informationLinear Regression Model. Badr Missaoui
Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationMatrix Approach to Simple Linear Regression: An Overview
Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationIntroduction to Statistical Data Analysis Lecture 8: Correlation and Simple Regression
Introduction to Statistical Data Analysis Lecture 8: and James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis 1 / 40 Introduction
More informationTopic 22 Analysis of Variance
Topic 22 Analysis of Variance Comparing Multiple Populations 1 / 14 Outline Overview One Way Analysis of Variance Sample Means Sums of Squares The F Statistic Confidence Intervals 2 / 14 Overview Two-sample
More informationSTATISTICS 110/201 PRACTICE FINAL EXAM
STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable
More informationMath 533 Extra Hour Material
Math 533 Extra Hour Material A Justification for Regression The Justification for Regression It is well-known that if we want to predict a random quantity Y using some quantity m according to a mean-squared
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationLinear model selection and regularization
Linear model selection and regularization Problems with linear regression with least square 1. Prediction Accuracy: linear regression has low bias but suffer from high variance, especially when n p. It
More informationTentative solutions TMA4255 Applied Statistics 16 May, 2015
Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationMATH Notebook 4 Spring 2018
MATH448001 Notebook 4 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 4 MATH448001 Notebook 4 3 4.1 Simple Linear Model.................................
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationMultivariate Linear Regression Models
Multivariate Linear Regression Models Regression analysis is used to predict the value of one or more responses from a set of predictors. It can also be used to estimate the linear association between
More informationREGRESSION ANALYSIS AND INDICATOR VARIABLES
REGRESSION ANALYSIS AND INDICATOR VARIABLES Thesis Submitted in partial fulfillment of the requirements for the award of degree of Masters of Science in Mathematics and Computing Submitted by Sweety Arora
More information