Lecture 11: Regression Methods I (Linear Regression)


1 Lecture 11: Regression Methods I (Linear Regression), Fall

2 Outline
1. Regression: Supervised Learning with Continuous Responses
2. Linear Models and Multiple Linear Regression
   - Ordinary Least Squares
   - Statistical inferences
   - Computational algorithms

3 Regression Models
If the response $Y$ takes real values, we refer to this type of supervised learning problem as a regression problem.
- linear regression models; parametric models
- nonparametric regression: splines, kernel estimators, local polynomial regression
- semiparametric regression
- Broad coverage: penalized regression, regression trees, support vector regression, quantile regression

4 Linear Regression Models
A standard linear regression model assumes
$y_i = x_i^T\beta + \epsilon_i$, with $\epsilon_i$ i.i.d., $E(\epsilon_i) = 0$, $Var(\epsilon_i) = \sigma^2$,
where
- $y_i$ is the response for the $i$th observation,
- $x_i \in R^d$ is the covariate vector,
- $\beta \in R^d$ is the $d$-dimensional parameter vector.
Common model assumptions:
- independence of errors
- constant error variance (homoscedasticity)
- $\epsilon$ independent of $X$.
Normality is not needed.
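As a concrete illustration, here is a minimal simulation sketch (Python with numpy assumed; all names are illustrative, not from the lecture). The uniform errors emphasize that normality is not part of the model:

```python
import numpy as np

# Minimal sketch: simulate n observations from y_i = x_i^T beta + eps_i
# with i.i.d. errors, E(eps) = 0, Var(eps) = sigma^2 (normality not required).
rng = np.random.default_rng(0)
n, d = 100, 3
beta_true = np.array([2.0, -1.0, 0.5])
sigma = 1.5

X = rng.normal(size=(n, d))
# Uniform errors centered at 0 illustrate that normality is not assumed;
# Var(Uniform(-a, a)) = a^2 / 3, so a = sigma * sqrt(3) gives variance sigma^2.
a = sigma * np.sqrt(3.0)
eps = rng.uniform(-a, a, size=n)
y = X @ beta_true + eps
```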

5 About Linear Models
Linear models have been a mainstay of statistics for the past 30 years and remain one of the most important tools. The covariates may come from different sources:
- quantitative inputs
- dummy coding of qualitative inputs
- transformed inputs: $\log(X)$, $X^2$, $\sqrt{X}$, ...
- basis expansions: $X_1, X_1^2, X_1^3, \ldots$ (polynomial representation)
- interactions between variables: $X_1 \cdot X_2$, ...

6 Review on Matrix Theory (I)
Let $A$ be an $m \times m$ matrix and let $I_m$ be the identity matrix of size $m$.
- The determinant of $A$ is $\det(A) = |A|$.
- The trace of $A$, $\mathrm{tr}(A)$, is the sum of the diagonal elements.
- The roots $\lambda_1, \ldots, \lambda_m$ of the $m$th-degree polynomial equation in $\lambda$, $|\lambda I_m - A| = 0$, are called the eigenvalues of $A$. The collection $\{\lambda_1, \ldots, \lambda_m\}$ is called the spectrum of $A$.
- Any nonzero $m \times 1$ vector $x_i \neq 0$ such that $Ax_i = \lambda_i x_i$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_i$.
- If $B$ is another $m \times m$ matrix, then $|AB| = |A||B|$ and $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.

7 Review on Matrix Theory (II)
The following are equivalent:
- $|A| \neq 0$;
- $\mathrm{rank}(A) = m$;
- $A^{-1}$ exists.

8 Orthogonal Matrix
An $m \times m$ matrix $A$ is symmetric if $A = A^T$. An $m \times m$ matrix $P$ is called an orthogonal matrix if $PP^T = P^TP = I_m$, or $P^{-1} = P^T$.
- If $P$ is an orthogonal matrix, then $|PP^T| = |P||P^T| = |P|^2 = |I_m| = 1$, so $|P| = \pm 1$.
- For any $m \times m$ matrix $A$, we have $\mathrm{tr}(PAP^T) = \mathrm{tr}(AP^TP) = \mathrm{tr}(A)$.
- $PAP^T$ and $A$ have the same eigenvalues, since $|\lambda I_m - PAP^T| = |\lambda PP^T - PAP^T| = |P|^2\,|\lambda I_m - A| = |\lambda I_m - A|$.

9 Spectral Decomposition of a Symmetric Matrix
For any $m \times m$ symmetric matrix $A$, there exists an orthogonal matrix $P$ such that $P^TAP = \Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_m\}$, where the $\lambda_i$'s are the eigenvalues of $A$. The corresponding eigenvectors of $A$ are the column vectors of $P$.
Denote the $m \times 1$ unit vectors by $e_i$, $i = 1, \ldots, m$, where $e_i$ has 1 in the $i$th position and zeros elsewhere. The $i$th column of $P$ is $p_i = Pe_i$, $i = 1, \ldots, m$. Note $PP^T = \sum_{i=1}^m p_ip_i^T = I_m$.
The spectral decomposition of $A$ is
$A = P\Lambda P^T = \sum_{i=1}^m \lambda_i\, p_ip_i^T$.
Consequently, $\mathrm{tr}(A) = \mathrm{tr}(\Lambda) = \sum_{i=1}^m \lambda_i$ and $|A| = |\Lambda| = \prod_{i=1}^m \lambda_i$.
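A quick numerical check of these identities (a sketch assuming numpy; not from the lecture):

```python
import numpy as np

# Minimal sketch: spectral decomposition A = P Lambda P^T for a symmetric A.
rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2           # symmetrize to obtain a symmetric matrix

lam, P = np.linalg.eigh(A)  # eigenvalues (ascending) and orthonormal eigenvectors
assert np.allclose(P @ np.diag(lam) @ P.T, A)    # A = P Lambda P^T
assert np.allclose(P.T @ P, np.eye(4))           # P is orthogonal
assert np.isclose(np.trace(A), lam.sum())        # tr(A) = sum of eigenvalues
assert np.isclose(np.linalg.det(A), lam.prod())  # |A| = product of eigenvalues
```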

10 Idempotent Matrices
An $m \times m$ matrix $A$ is idempotent if $A^2 = AA = A$.
The eigenvalues of an idempotent matrix are either zero or one:
$\lambda x = Ax = A(Ax) = A(\lambda x) = \lambda^2 x \;\Rightarrow\; \lambda = \lambda^2$.
A symmetric idempotent matrix $A$ is also referred to as a projection matrix. For any $x \in R^m$:
- the vector $y = Ax$ is the orthogonal projection of $x$ onto the subspace of $R^m$ generated by the column vectors of $A$;
- the vector $z = (I - A)x$ is the orthogonal projection of $x$ onto the complementary subspace, so that $x = y + z = Ax + (I - A)x$.

11 Projection Matrices
If $A$ is symmetric idempotent with $\mathrm{rank}(A) = r$, then:
- $A$ has $r$ eigenvalues equal to 1 and $m - r$ zero eigenvalues;
- $\mathrm{tr}(A) = \mathrm{rank}(A)$;
- $I_m - A$ is also symmetric idempotent, of rank $m - r$.

12 Matrix Notations for Linear Regression
- The response vector $y = (y_1, \ldots, y_n)^T$.
- The design matrix $X$. Assume the first column of $X$ is $\mathbf{1}$; the dimension of $X$ is $n \times (1 + d)$.
- The regression coefficients $\beta = (\beta_0, \beta_1^T)^T$.
- The error vector $\epsilon = (\epsilon_1, \ldots, \epsilon_n)^T$.
The linear model is written as $y = X\beta + \epsilon$, with estimated coefficients $\hat\beta$ and predicted response $\hat y = X\hat\beta$.

13 Ordinary Least Squares (OLS)
The most popular method for fitting the linear model is ordinary least squares (OLS):
$\min_\beta\; RSS(\beta) = (y - X\beta)^T(y - X\beta)$.
Normal equations: $X^T(y - X\hat\beta) = 0$, so
$\hat\beta = (X^TX)^{-1}X^Ty$ and $\hat y = X(X^TX)^{-1}X^Ty$.
The residual vector is $r = y - \hat y = (I - P_X)y$, and the residual sum of squares is $RSS = r^Tr$.
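A minimal OLS sketch under these formulas (numpy assumed; `lstsq` is included as the numerically safer route, a point the computational slides at the end return to):

```python
import numpy as np

# Minimal sketch: OLS via the normal equations (didactic) and via lstsq (preferred).
rng = np.random.default_rng(0)
n, d = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])  # first column is 1
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.5, size=n)

beta_ne = np.linalg.solve(X.T @ X, X.T @ y)      # solves the normal equations
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)  # numerically stabler (SVD based)
assert np.allclose(beta_ne, beta_ls)

r = y - X @ beta_ne                              # residual vector
rss = r @ r                                      # residual sum of squares
```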

14 Projection Matrix
Call the following square matrix the projection or hat matrix:
$P_X = X(X^TX)^{-1}X^T$.
Properties:
- symmetric and non-negative definite;
- idempotent: $P_X^2 = P_X$; the eigenvalues are 0's and 1's.
We have $X^TP_X = X^T$ and $X^T(I - P_X) = 0$. Note
$r = (I - P_X)y$, $RSS = y^T(I - P_X)y$, and $X^Tr = X^T(I - P_X)y = 0$.
The residual vector is orthogonal to the column space spanned by $X$, $\mathrm{col}(X)$.
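The hat-matrix properties above can be verified numerically; a small sketch, again assuming numpy:

```python
import numpy as np

# Minimal sketch: check the hat-matrix properties numerically.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)  # P_X = X (X^T X)^{-1} X^T
assert np.allclose(P, P.T)             # symmetric
assert np.allclose(P @ P, P)           # idempotent
assert np.isclose(np.trace(P), p)      # tr(P_X) = rank(X) = p

r = (np.eye(n) - P) @ y                # residual vector
assert np.allclose(X.T @ r, 0)         # residuals orthogonal to col(X)
```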

15 [Figure 3.2 from The Elements of Statistical Learning (Hastie, Tibshirani & Friedman, 2001): the n-dimensional geometry of least squares regression with two predictors. The outcome vector y is orthogonally projected onto the hyperplane spanned by the input vectors x1 and x2. The projection ŷ represents the vector of the least squares predictions.]

16 Sampling Properties of $\hat\beta$
$Var(\hat\beta) = \sigma^2(X^TX)^{-1}$.
The variance $\sigma^2$ can be estimated as $\hat\sigma^2 = RSS/(n - d - 1)$. This is an unbiased estimator, i.e., $E(\hat\sigma^2) = \sigma^2$.

17 Inferences for Gaussian Errors
Under the Normal assumption on the error $\epsilon$, we have:
- $\hat\beta \sim N(\beta, \sigma^2(X^TX)^{-1})$;
- $(n - d - 1)\hat\sigma^2/\sigma^2 \sim \chi^2_{n-d-1}$;
- $\hat\beta$ is independent of $\hat\sigma^2$.
To test $H_0: \beta_j = 0$, we use:
- if $\sigma^2$ is known, $z_j = \hat\beta_j/(\sigma\sqrt{v_j})$, which has a standard Normal distribution under $H_0$;
- if $\sigma^2$ is unknown, $t_j = \hat\beta_j/(\hat\sigma\sqrt{v_j})$, which has a $t_{n-d-1}$ distribution under $H_0$;
where $v_j$ is the $j$th diagonal element of $(X^TX)^{-1}$.

18 Confidence Interval for Individual Coefficients
Under the Normal assumption, the $100(1-\alpha)\%$ C.I. of $\beta_j$ is
$\hat\beta_j \pm t_{n-d-1;\,\alpha/2}\,\hat\sigma\sqrt{v_j}$,
where $t_{k;\nu}$ is the $1 - \nu$ percentile of the $t_k$ distribution.
In practice, we use the approximate $100(1-\alpha)\%$ C.I. of $\beta_j$:
$\hat\beta_j \pm z_{\alpha/2}\,\hat\sigma\sqrt{v_j}$,
where $z_{\alpha/2}$ is the $1 - \alpha/2$ percentile of the standard Normal distribution.
Even if the Gaussian assumption does not hold, this interval is approximately right, with coverage probability approaching $1 - \alpha$ as $n \to \infty$.
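A sketch combining this slide and the previous one: t-statistics, p-values, and both the exact-t and approximate-z intervals (Python with numpy and scipy assumed; names are illustrative):

```python
import numpy as np
from scipy import stats

# Minimal sketch: t-tests of H0: beta_j = 0 and 95% confidence intervals.
rng = np.random.default_rng(0)
n, p = 100, 3                        # p = 1 + d columns, intercept included
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
r = y - X @ beta_hat
sigma2_hat = (r @ r) / (n - p)       # RSS / (n - d - 1)
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))   # hat-sigma * sqrt(v_j)

t_stat = beta_hat / se
p_vals = 2 * stats.t.sf(np.abs(t_stat), df=n - p)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)  # t_{n-d-1; alpha/2}
z_crit = stats.norm.ppf(1 - alpha / 2)         # z_{alpha/2}
ci_t = np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se])
ci_z = np.column_stack([beta_hat - z_crit * se, beta_hat + z_crit * se])
```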

19 Review on Multivariate Normal Distributions
Distributions of quadratic forms (noncentral $\chi^2$):
- If $X \sim N_p(\mu, I_p)$, then $W = X^TX = \sum_{i=1}^p X_i^2 \sim \chi^2_p(\lambda)$, with $\lambda = \frac{1}{2}\mu^T\mu$.
- Special case: if $X \sim N_p(0, I_p)$, then $W = X^TX \sim \chi^2_p$.
- If $X \sim N_p(\mu, V)$ where $V$ is nonsingular, then $W = X^TV^{-1}X \sim \chi^2_p(\lambda)$, $\lambda = \frac{1}{2}\mu^TV^{-1}\mu$.
- If $X \sim N_p(\mu, V)$ with $V$ nonsingular, and $A$ is symmetric with $AV$ idempotent of rank $s$, then $W = X^TAX \sim \chi^2_s(\lambda)$, $\lambda = \frac{1}{2}\mu^TA\mu$.

20 Cochran's Theorem
Let $y \sim N_n(\mu, \sigma^2 I_n)$ and let $A_j$, $j = 1, \ldots, J$, be symmetric idempotent matrices with rank $s_j$. Furthermore, assume that $\sum_{j=1}^J A_j = I_n$ and $\sum_{j=1}^J s_j = n$. Then
(i) $W_j = \frac{1}{\sigma^2}y^TA_jy \sim \chi^2_{s_j}(\lambda_j)$, where $\lambda_j = \frac{1}{2\sigma^2}\mu^TA_j\mu$;
(ii) the $W_j$'s are mutually independent.
Essentially, we decompose $y^Ty$ into the (scaled) sum of its quadratic forms:
$\sum_{i=1}^n y_i^2 = y^TI_ny = \sum_{j=1}^J y^TA_jy$.

21 Application of Cochran's Theorem to Linear Models
Example: Assume $y \sim N_n(X\beta, \sigma^2 I_n)$. Define $A = I - P_X$ and
- the residual sum of squares: $RSS = y^TAy = \|r\|^2$;
- the sum of squares regression: $SSR = y^TP_Xy = \|\hat y\|^2$.
By Cochran's Theorem, we have
(i) $RSS/\sigma^2 \sim \chi^2_{n-d-1}$ and $SSR/\sigma^2 \sim \chi^2_{d+1}(\lambda)$, where $\lambda = (X\beta)^T(X\beta)/(2\sigma^2)$;
(ii) $RSS$ is independent of $SSR$. (Note $r \perp \hat y$.)
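A Monte Carlo sanity check of these two distributional claims (a sketch assuming numpy; note that under the lecture's convention $\lambda = \mu^TA\mu/(2\sigma^2)$, the noncentral $\chi^2_k(\lambda)$ has mean $k + 2\lambda$):

```python
import numpy as np

# Minimal sketch: check the means of RSS/sigma^2 and SSR/sigma^2 by simulation.
rng = np.random.default_rng(0)
n, d, sigma = 50, 2, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])
beta = np.array([1.0, 0.5, -0.5])
P = X @ np.linalg.solve(X.T @ X, X.T)
mu = X @ beta
lam = mu @ mu / (2 * sigma**2)      # noncentrality of SSR/sigma^2

rss, ssr = [], []
for _ in range(5000):
    y = mu + rng.normal(scale=sigma, size=n)
    yhat = P @ y
    rss.append((y - yhat) @ (y - yhat))
    ssr.append(yhat @ yhat)

print(np.mean(rss) / sigma**2, n - d - 1)          # ~ n-d-1 (central chi-square)
print(np.mean(ssr) / sigma**2, (d + 1) + 2 * lam)  # ~ df + 2*lambda (noncentral)
```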

22 F Distribution
- If $U_1 \sim \chi^2_p$, $U_2 \sim \chi^2_q$, and $U_1 \perp U_2$, then $F = \frac{U_1/p}{U_2/q} \sim F_{p,q}$.
- If $U_1 \sim \chi^2_p(\lambda)$, $U_2 \sim \chi^2_q$, and $U_1 \perp U_2$, then $F = \frac{U_1/p}{U_2/q} \sim F_{p,q}(\lambda)$ (noncentral $F$).
Example: Assume $y \sim N_n(X\beta, \sigma^2 I_n)$. Let $A = I - P_X$, and
$RSS = y^TAy = \|r\|^2$, $SSR = y^TP_Xy = \|\hat y\|^2$.
Then
$F = \frac{SSR/(d+1)}{RSS/(n-d-1)} \sim F_{d+1,\,n-d-1}(\lambda)$, with $\lambda = \|X\beta\|^2/(2\sigma^2)$.

23 Making Inferences about Multiple Parameters
Assume $X = [X_0, X_1]$, where $X_0$ consists of the first $k$ columns; partition $\beta = (\beta_0^T, \beta_1^T)^T$ correspondingly. To test $H_0: \beta_0 = 0$, we use
$F = \frac{(RSS_1 - RSS)/k}{RSS/(n-d-1)}$,
where
- $RSS_1 = y^T(I - P_{X_1})y$ (reduced model);
- $RSS = y^T(I - P_X)y$ (full model), with $RSS/\sigma^2 \sim \chi^2_{n-d-1}$;
- $RSS_1 - RSS = y^T(P_X - P_{X_1})y$.

24 Testing Multiple Parameters
Applying Cochran's Theorem to $RSS_1$, $RSS$, and $RSS_1 - RSS$:
- they respectively follow noncentral $\chi^2$ distributions, with noncentralities $(X\beta)^T(I - P_{X_1})(X\beta)/(2\sigma^2)$, $0$, and $(X\beta)^T(P_X - P_{X_1})(X\beta)/(2\sigma^2)$;
- $RSS$ and $RSS_1 - RSS$ are independent.
Then we have $F \sim F_{k,\,n-d-1}(\lambda)$, with $\lambda = (X\beta)^T(P_X - P_{X_1})(X\beta)/(2\sigma^2)$. Under $H_0$ we have $X\beta = X_1\beta_1$, so $\lambda = 0$ and $F \sim F_{k,\,n-d-1}$.

25 Nested Model Selection
To test for significance of groups of coefficients simultaneously, we use the $F$-statistic
$F = \frac{(RSS_0 - RSS_1)/(d_1 - d_0)}{RSS_1/(n - d_1 - 1)}$,
where
- $RSS_1$ is the RSS for the bigger model with $d_1 + 1$ parameters;
- $RSS_0$ is the RSS for the nested smaller model with $d_0 + 1$ parameters, i.e., with $d_1 - d_0$ parameters constrained to zero.
The $F$-statistic measures the change in RSS per additional parameter in the bigger model, normalized by $\hat\sigma^2$. Under the assumption that the smaller model is correct, $F \sim F_{d_1-d_0,\,n-d_1-1}$.
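A minimal sketch of this nested-model F test (numpy and scipy assumed; the helper name `rss` is illustrative):

```python
import numpy as np
from scipy import stats

# Minimal sketch: nested-model F test, H0: the last k coefficients are zero.
rng = np.random.default_rng(0)
n, d1, k = 100, 4, 2                 # full model: intercept + d1 slopes
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, d1))])
X0 = X1[:, : 1 + d1 - k]             # reduced model drops the last k columns
y = X0 @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)  # H0 is true here

def rss(X, y):
    """Residual sum of squares of the OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

rss0, rss1 = rss(X0, y), rss(X1, y)
F = ((rss0 - rss1) / k) / (rss1 / (n - d1 - 1))
p_value = stats.f.sf(F, k, n - d1 - 1)   # reference distribution F_{k, n-d1-1}
```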

26 Confidence Set
The approximate confidence set of $\beta$ is
$C_\beta = \{\beta : (\hat\beta - \beta)^T(X^TX)(\hat\beta - \beta) \le \hat\sigma^2\,\chi^2_{d+1;\,1-\alpha}\}$,
where $\chi^2_{k;\,1-\alpha}$ is the $1-\alpha$ percentile of the $\chi^2_k$ distribution. The confidence interval for the true function $f(x) = x^T\beta$ is $\{x^T\beta : \beta \in C_\beta\}$.

27 Gauss-Markov Theorem
Assume $s^T\beta$ is linearly estimable, i.e., there exists a linear estimator $b + c^Ty$ such that $E(b + c^Ty) = s^T\beta$. A function $s^T\beta$ is linearly estimable iff $s = X^Ta$ for some $a$.
Theorem: If $s^T\beta$ is linearly estimable, then $s^T\hat\beta$ is the best linear unbiased estimator (BLUE) of $s^T\beta$: for any $c^Ty$ satisfying $E(c^Ty) = s^T\beta$, we have $Var(s^T\hat\beta) \le Var(c^Ty)$.
In fact, $s^T\hat\beta$ is the best among all the unbiased estimators. (It is a function of the complete and sufficient statistic $(y^Ty, X^Ty)$.)
Question: Is it possible to find a slightly biased linear estimator but with smaller variance? ("Trade a little bias for a large reduction in variance.")

28 Linear Regression with Orthogonal Design
If $X$ is univariate (a single column $x$), the least squares estimate is
$\hat\beta = \frac{\sum_i x_iy_i}{\sum_i x_i^2} = \frac{\langle x, y\rangle}{\langle x, x\rangle}$.
If $X = [x_1, \ldots, x_d]$ has orthogonal columns, i.e., $\langle x_j, x_k\rangle = 0$ for $j \neq k$, or equivalently $X^TX = \mathrm{diag}(\|x_1\|^2, \ldots, \|x_d\|^2)$, then the OLS estimates are
$\hat\beta_j = \frac{\langle x_j, y\rangle}{\langle x_j, x_j\rangle}$, for $j = 1, \ldots, d$.
Each input has no effect on the estimation of the other parameters: multiple linear regression reduces to univariate regressions.
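A quick check that orthogonal columns reduce multiple regression to coordinate-wise univariate fits (sketch assuming numpy):

```python
import numpy as np

# Minimal sketch: with orthogonal columns, multiple OLS = d univariate fits.
rng = np.random.default_rng(0)
n, d = 60, 3
Q, _ = np.linalg.qr(rng.normal(size=(n, d)))  # orthonormal (hence orthogonal) columns
y = Q @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

beta_multi, *_ = np.linalg.lstsq(Q, y, rcond=None)
beta_uni = np.array([(Q[:, j] @ y) / (Q[:, j] @ Q[:, j]) for j in range(d)])
assert np.allclose(beta_multi, beta_uni)      # <x_j, y> / <x_j, x_j>, coordinate-wise
```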

29 How to orthogonalize X?
Consider the simple linear regression $y = \beta_0 + \beta_1 x + \epsilon$. We regress $x$ onto $\mathbf 1$ and obtain the residual $z = x - \bar x\mathbf 1$.
Orthogonalization process: the residual $z$ is orthogonal to the regressor $\mathbf 1$. The column space of $X$ is $\mathrm{span}\{\mathbf 1, x\}$. Note $\hat y \in \mathrm{span}\{\mathbf 1, x\} = \mathrm{span}\{\mathbf 1, z\}$, because
$\beta_0\mathbf 1 + \beta_1 x = \beta_0\mathbf 1 + \beta_1[\bar x\mathbf 1 + (x - \bar x\mathbf 1)] = \beta_0\mathbf 1 + \beta_1[\bar x\mathbf 1 + z] = (\beta_0 + \beta_1\bar x)\mathbf 1 + \beta_1 z = \eta_0\mathbf 1 + \beta_1 z$.
$\{\mathbf 1, z\}$ form an orthogonal basis for the column space of $X$.

30 How to orthogonalize X? (continued)
Estimation process: First, we regress $y$ onto $z$ for the OLS estimate of the slope $\hat\beta_1$:
$\hat\beta_1 = \frac{\langle y, z\rangle}{\langle z, z\rangle} = \frac{\langle y, x - \bar x\mathbf 1\rangle}{\langle x - \bar x\mathbf 1, x - \bar x\mathbf 1\rangle}$.
Second, we regress $y$ onto $\mathbf 1$ and get the coefficient $\hat\eta_0 = \bar y$. The OLS fit is given as
$\hat y = \hat\eta_0\mathbf 1 + \hat\beta_1 z = \hat\eta_0\mathbf 1 + \hat\beta_1(x - \bar x\mathbf 1) = (\hat\eta_0 - \hat\beta_1\bar x)\mathbf 1 + \hat\beta_1 x$.
Therefore, the OLS intercept is obtained as $\hat\beta_0 = \hat\eta_0 - \hat\beta_1\bar x = \bar y - \hat\beta_1\bar x$.
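The two-step estimation above in code (a sketch assuming numpy):

```python
import numpy as np

# Minimal sketch: simple regression via orthogonalization of the inputs.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

z = x - x.mean()                      # residual of regressing x on 1
beta1 = (y @ z) / (z @ z)             # regress y on z -> slope
beta0 = y.mean() - beta1 * x.mean()   # eta0 - beta1 * xbar, with eta0 = ybar

# agrees with the direct OLS fit
ref, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)
assert np.allclose([beta0, beta1], ref)
```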

31 [Figure 3.4 from The Elements of Statistical Learning (Hastie, Tibshirani & Friedman, 2001): least squares regression by orthogonalization of the inputs. The vector x2 is regressed on the vector x1, leaving the residual vector z. The regression of y on z gives the multiple regression coefficient of x2. Adding together the projections of y on each of x1 and z gives the least squares fit ŷ.]

32 How to orthogonalize X? (d = 2)
Consider $y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon$.
Orthogonalization process:
1. Regress $x_2$ onto $x_1$ and compute the residual $z_1 = x_2 - \gamma_{12}x_1$ (note $z_1 \perp x_1$).
2. Regress $x_3$ onto $(x_1, z_1)$ and compute the residual $z_2 = x_3 - \gamma_{13}x_1 - \gamma_{23}z_1$ (note $z_2 \perp \{x_1, z_1\}$).
Note $\mathrm{span}\{x_1, x_2, x_3\} = \mathrm{span}\{x_1, z_1, z_2\}$, because
$\beta_1x_1 + \beta_2x_2 + \beta_3x_3 = \beta_1x_1 + \beta_2(\gamma_{12}x_1 + z_1) + \beta_3(\gamma_{13}x_1 + \gamma_{23}z_1 + z_2) = (\beta_1 + \beta_2\gamma_{12} + \beta_3\gamma_{13})x_1 + (\beta_2 + \beta_3\gamma_{23})z_1 + \beta_3z_2 = \eta_1x_1 + \eta_2z_1 + \beta_3z_2$.

33 Estimation Process
We project $y$ onto the orthogonal basis $\{x_1, z_1, z_2\}$ one by one, and then recover the coefficients corresponding to the original columns of $X$.
- First, regress $y$ onto $z_2$ for the OLS estimate of the slope $\hat\beta_3$: $\hat\beta_3 = \langle y, z_2\rangle/\langle z_2, z_2\rangle$.
- Second, regress $y$ onto $z_1$, leading to the coefficient $\hat\eta_2$, and $\hat\beta_2 = \hat\eta_2 - \hat\beta_3\gamma_{23}$.
- Third, regress $y$ onto $x_1$, leading to the coefficient $\hat\eta_1$, and $\hat\beta_1 = \hat\eta_1 - \hat\beta_3\gamma_{13} - \hat\beta_2\gamma_{12}$.

34 Gram-Schmidt Procedure (Successive Orthogonalization)
1. Initialize $z_0 = x_0 = \mathbf 1$.
2. For $j = 1, \ldots, d$: regress $x_j$ on $z_0, z_1, \ldots, z_{j-1}$ to produce coefficients $\hat\gamma_{kj} = \langle z_k, x_j\rangle/\langle z_k, z_k\rangle$ for $k = 0, \ldots, j-1$, and the residual vector $z_j = x_j - \sum_{k=0}^{j-1}\hat\gamma_{kj}z_k$. ($\{z_0, z_1, \ldots, z_{j-1}\}$ are orthogonal.)
3. Regress $y$ on the residual $z_d$ to get $\hat\beta_d = \langle y, z_d\rangle/\langle z_d, z_d\rangle$.
4. Compute $\hat\beta_j$, $j = d-1, \ldots, 0$, successively in that order.
$\{z_0, z_1, \ldots, z_d\}$ forms an orthogonal basis for $\mathrm{Col}(X)$. The multiple regression coefficient $\hat\beta_j$ is the additional contribution of $x_j$ to $y$, after $x_j$ has been adjusted for $x_0, x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_d$.
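A direct transcription of the procedure into code (a sketch assuming numpy; `gram_schmidt` is an illustrative name, and step 4's back-substitution is carried out through the triangular system $\Gamma\beta = \eta$ introduced on the next slides):

```python
import numpy as np

# Minimal sketch: successive orthogonalization (Gram-Schmidt) for least squares.
# Returns the orthogonal basis Z and the upper-triangular Gamma with X = Z @ Gamma.
def gram_schmidt(X):
    n, p = X.shape
    Z = np.zeros((n, p))
    Gamma = np.eye(p)
    for j in range(p):
        z = X[:, j].copy()
        for k in range(j):
            Gamma[k, j] = (Z[:, k] @ X[:, j]) / (Z[:, k] @ Z[:, k])  # gamma_kj
            z -= Gamma[k, j] * Z[:, k]
        Z[:, j] = z                  # residual, orthogonal to z_0, ..., z_{j-1}
    return Z, Gamma

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(40), rng.normal(size=(40, 2))])
y = rng.normal(size=40)

Z, Gamma = gram_schmidt(X)
# Regress y on each orthogonal z_j, then back-solve for the original coefficients.
eta = np.array([(Z[:, j] @ y) / (Z[:, j] @ Z[:, j]) for j in range(X.shape[1])])
beta = np.linalg.solve(Gamma, eta)   # Gamma is upper triangular with unit diagonal
ref, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta, ref)
```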

35 Collinearity Issue
The $d$th coefficient is
$\hat\beta_d = \frac{\langle z_d, y\rangle}{\langle z_d, z_d\rangle}$.
If $x_d$ is highly correlated with some of the other $x_j$'s, then:
- the residual vector $z_d$ is close to zero;
- the coefficient $\hat\beta_d$ will be very unstable.
The variance estimate is $Var(\hat\beta_d) = \sigma^2/\|z_d\|^2$. The precision for estimating $\hat\beta_d$ depends on the length of $z_d$, i.e., on how much of $x_d$ is unexplained by the other $x_k$'s.

36 Two Computational Algorithms for Multiple Regression
Consider the normal equations $X^TX\beta = X^Ty$. We would like to avoid computing $(X^TX)^{-1}$ directly.
1. QR decomposition of $X$: $X = QR$, where $Q$ is orthonormal and $R$ is upper triangular. Essentially, a process of orthogonal matrix triangularization.
2. Cholesky decomposition of $X^TX$: $X^TX = RR^T$, where $R$ is lower triangular.

37 Matrix Formulation of Orthogonalization
In Step 2 of the Gram-Schmidt procedure, for $j = 1, \ldots, d$:
$z_j = x_j - \sum_{k=0}^{j-1}\hat\gamma_{kj}z_k \;\Longleftrightarrow\; x_j = \sum_{k=0}^{j-1}\hat\gamma_{kj}z_k + z_j$.
In matrix form, with $X = [x_1, \ldots, x_d]$ and $Z = [z_1, \ldots, z_d]$: $X = Z\Gamma$.
- The columns of $Z$ are orthogonal to each other.
- The matrix $\Gamma$ is upper triangular, with 1's on the diagonal.
Standardizing $Z$ using $D = \mathrm{diag}\{\|z_1\|, \ldots, \|z_d\|\}$:
$X = Z\Gamma = ZD^{-1}D\Gamma \equiv QR$, with $Q = ZD^{-1}$ and $R = D\Gamma$.

38 QR Decomposition
- The columns of $Q$ form an orthonormal basis for the column space of $X$. $Q$ is an $n \times d$ matrix with orthonormal columns, satisfying $Q^TQ = I$.
- $R$ is a $d \times d$ upper triangular matrix of full rank.
- $X^TX = (QR)^T(QR) = R^TQ^TQR = R^TR$.
The least squares solutions are
$\hat\beta = (X^TX)^{-1}X^Ty = R^{-1}R^{-T}R^TQ^Ty = R^{-1}Q^Ty$,
$\hat y = X\hat\beta = (QR)(R^{-1}Q^Ty) = QQ^Ty$.

39 QR Algorithm for Normal Equations
Regard $\hat\beta$ as the solution of the triangular linear system $R\beta = Q^Ty$:
1. Conduct the QR decomposition of $X$: $X = QR$ (Gram-Schmidt orthogonalization).
2. Compute $Q^Ty$.
3. Solve the triangular system $R\beta = Q^Ty$.
Computational complexity: $nd^2$.
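A sketch of the three steps (numpy and scipy assumed; `numpy.linalg.qr` stands in for Gram-Schmidt in step 1):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Minimal sketch: least squares through the QR decomposition, R beta = Q^T y.
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)

Q, R = np.linalg.qr(X)               # thin QR: Q is n x p, R is p x p upper triangular
beta = solve_triangular(R, Q.T @ y)  # back-substitution on the triangular system
ref, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta, ref)
```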

40 Cholesky Decomposition Algorithm
For any positive definite square matrix $A$, we have $A = RR^T$, where $R$ is a lower triangular matrix of full rank.
1. Compute $X^TX$ and $X^Ty$.
2. Factor $X^TX = RR^T$; then $\hat\beta = (R^T)^{-1}R^{-1}X^Ty$.
3. Solve the triangular system $Rw = X^Ty$ for $w$.
4. Solve the triangular system $R^T\beta = w$ for $\beta$.
Computational complexity: $d^3 + nd^2/2$ (can be faster than QR for small $d$, but can be less stable).
Prediction variance: $Var(\hat y_0) = Var(x_0^T\hat\beta) = \sigma^2\,x_0^T(R^T)^{-1}R^{-1}x_0$.
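The corresponding Cholesky route (a sketch assuming scipy's `cho_factor`/`cho_solve`, which together perform the two triangular solves of steps 3 and 4):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Minimal sketch: least squares via Cholesky factorization of X^T X.
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.normal(size=n)

c, low = cho_factor(X.T @ X)         # factor the positive definite X^T X
beta = cho_solve((c, low), X.T @ y)  # two triangular solves: R w = X^T y, R^T beta = w
ref, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta, ref)
```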
