Linear Algebra Review


Yang Feng, Columbia University
http://www.stat.columbia.edu/~yangfeng

Definition of Matrix

A matrix is a rectangular array of elements arranged in rows and columns, for example

$$A = \begin{pmatrix} 16000 & 23 \\ 33000 & 47 \\ 21000 & 35 \end{pmatrix}$$

The dimension of a matrix is its number of rows and columns; here it is expressed as $3 \times 2$.

Indexing a Matrix

Elements are indexed by row, then column:

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$

A matrix can also be written as $A = [a_{ij}]$, $i = 1, 2$; $j = 1, 2, 3$.

Square Matrix and Column Vector

A square matrix has an equal number of rows and columns:

$$\begin{pmatrix} 4 & 7 \\ 3 & 9 \end{pmatrix} \qquad \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

A column vector is a matrix with a single column:

$$\begin{pmatrix} 4 \\ 7 \\ 10 \end{pmatrix} \qquad \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \\ c_5 \end{pmatrix}$$

All vectors (row or column) are matrices, and all scalars are $1 \times 1$ matrices.

Transpose

The transpose of a matrix is another matrix in which the rows and columns have been interchanged:

$$A = \begin{pmatrix} 2 & 5 \\ 7 & 10 \\ 3 & 4 \end{pmatrix} \qquad A' = \begin{pmatrix} 2 & 7 & 3 \\ 5 & 10 & 4 \end{pmatrix}$$
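These basic operations are easy to try in R; a minimal sketch of the transpose example above (not part of the original slides):

```r
# Build the 3 x 2 example matrix, filling it row by row
A <- matrix(c(2, 5,
              7, 10,
              3, 4), nrow = 3, byrow = TRUE)

dim(A)   # dimension: 3 2
t(A)     # the 2 x 3 transpose
```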

Equality of Matrices

Two matrices are equal if they have the same dimension and all corresponding elements are equal:

$$A = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \qquad B = \begin{pmatrix} 4 \\ 7 \\ 3 \end{pmatrix}$$

$A = B$ implies $a_1 = 4$, $a_2 = 7$, $a_3 = 3$.

Matrix Addition and Subtraction

Addition and subtraction work element by element, so the dimensions must match. Let

$$A = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{pmatrix}$$

Then

$$A + B = \begin{pmatrix} 2 & 6 \\ 4 & 8 \\ 6 & 10 \end{pmatrix}$$

Multiplication of a Matrix by a Scalar

$$A = \begin{pmatrix} 2 & 7 \\ 9 & 3 \end{pmatrix} \qquad kA = k \begin{pmatrix} 2 & 7 \\ 9 & 3 \end{pmatrix} = \begin{pmatrix} 2k & 7k \\ 9k & 3k \end{pmatrix}$$

Multiplication of Two Matrices

$$A_{l \times m} \, B_{m \times n} = (AB)_{l \times n}, \qquad (AB)_{ij} = \sum_{k=1}^{m} A_{ik} B_{kj}$$

for $i = 1, \dots, l$ and $j = 1, \dots, n$.

Another Matrix Multiplication Example

$$A = \begin{pmatrix} 1 & 3 & 4 \\ 0 & 5 & 8 \end{pmatrix} \qquad B = \begin{pmatrix} 3 \\ 5 \\ 2 \end{pmatrix}$$

$$AB = \begin{pmatrix} 1 & 3 & 4 \\ 0 & 5 & 8 \end{pmatrix} \begin{pmatrix} 3 \\ 5 \\ 2 \end{pmatrix} = \begin{pmatrix} 26 \\ 41 \end{pmatrix}$$
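The same product in R, where %*% is matrix multiplication (a quick check, not from the slides):

```r
A <- matrix(c(1, 3, 4,
              0, 5, 8), nrow = 2, byrow = TRUE)
B <- matrix(c(3, 5, 2), ncol = 1)   # 3 x 1 column vector

A %*% B   # 2 x 1 result: 26, 41
```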

Special Matrices

If $A' = A$, then $A$ is a symmetric matrix:

$$A = \begin{pmatrix} 1 & 4 & 6 \\ 4 & 2 & 5 \\ 6 & 5 & 3 \end{pmatrix} = A'$$

If the off-diagonal elements of a matrix are all zeros, it is called a diagonal matrix:

$$A = \begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix} \qquad B = \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 10 & 0 \\ 0 & 0 & 0 & 5 \end{pmatrix}$$

Identity Matrix

A diagonal matrix whose diagonal entries are all ones is an identity matrix. Multiplication by an identity matrix leaves the pre- or post-multiplied matrix unchanged:

$$I = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{and} \qquad AI = IA = A$$

Vector and Matrix with All Elements Equal to One

$$\mathbf{1}_n = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}_{n \times 1} \qquad J_n = \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{pmatrix}_{n \times n} \qquad \mathbf{1}_n \mathbf{1}_n' = J_n$$

Linear Dependence and Rank of Matrix

Consider

$$A = \begin{pmatrix} 1 & 2 & 5 & 1 \\ 2 & 2 & 10 & 6 \\ 3 & 4 & 15 & 1 \end{pmatrix}$$

and think of this as a collection of column vectors. Note that the third column vector is a multiple of the first column vector.

Linear Dependence

When $m$ scalars $k_1, \dots, k_m$, not all zero, can be found such that

$$k_1 A_1 + \dots + k_m A_m = \mathbf{0},$$

where $\mathbf{0}$ denotes the zero column vector and $A_i$ is the $i$th column of matrix $A$, the $m$ column vectors are called linearly dependent. If the only set of scalars for which the equality holds is $k_1 = 0, \dots, k_m = 0$, the set of $m$ column vectors is linearly independent. In the previous example matrix the columns are linearly dependent:

$$5 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + 0 \begin{pmatrix} 2 \\ 2 \\ 4 \end{pmatrix} - 1 \begin{pmatrix} 5 \\ 10 \\ 15 \end{pmatrix} + 0 \begin{pmatrix} 1 \\ 6 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

Rank of Matrix

The rank of a matrix is defined to be the maximum number of linearly independent columns in the matrix. Rank properties include:

- The rank of a matrix is unique.
- The rank of a matrix can equivalently be defined as the maximum number of linearly independent rows.
- The rank of an $r \times c$ matrix cannot exceed $\min(r, c)$.
- The row rank and column rank of a matrix are equal.
- The rank of a matrix is preserved under nonsingular transformations: let $A$ ($n \times n$) and $C$ ($k \times k$) be nonsingular matrices; then for any $n \times k$ matrix $B$,

$$\text{rank}(B) = \text{rank}(AB) = \text{rank}(BC)$$
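A small numerical check of the example matrix in base R (qr(A)$rank returns the rank; a sketch, not from the slides):

```r
A <- matrix(c(1, 2,  5, 1,
              2, 2, 10, 6,
              3, 4, 15, 1), nrow = 3, byrow = TRUE)

qr(A)$rank              # 3, the maximum possible for a 3 x 4 matrix
qr(A[, c(1, 3)])$rank   # 1: the third column is a multiple of the first
```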

Inverse of Matrix

A matrix inverse is like a reciprocal:

$$6 \cdot \tfrac{1}{6} = \tfrac{1}{6} \cdot 6 = 1 \qquad x \cdot \tfrac{1}{x} = 1$$

For matrices,

$$A A^{-1} = A^{-1} A = I$$

Example

$$A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix} \qquad A^{-1} = \begin{pmatrix} -.1 & .4 \\ .3 & -.2 \end{pmatrix} \qquad A^{-1} A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

More generally, for a $2 \times 2$ matrix,

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad A^{-1} = \frac{1}{D} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad \text{where } D = ad - bc$$

Inverses of Diagonal Matrices are Easy

If

$$A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

then

$$A^{-1} = \begin{pmatrix} 1/3 & 0 & 0 \\ 0 & 1/4 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}$$

Finding the Inverse

Finding an inverse takes (for general matrices with no special structure) $O(n^3)$ operations, where $n$ is the number of rows in the matrix. We will assume that numerical packages can do this for us. In R, solve(A) gives the inverse of matrix A.
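For instance, solve() recovers the $2 \times 2$ inverse computed analytically above:

```r
A <- matrix(c(2, 4,
              3, 1), nrow = 2, byrow = TRUE)

solve(A)         # -0.1  0.4
                 #  0.3 -0.2
A %*% solve(A)   # the identity matrix, up to floating-point rounding
```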

Example: Solving a System of Simultaneous Equations

$$2y_1 + 4y_2 = 20$$
$$3y_1 + y_2 = 10$$

In matrix form,

$$\begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 20 \\ 10 \end{pmatrix} \qquad \Rightarrow \qquad \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}^{-1} \begin{pmatrix} 20 \\ 10 \end{pmatrix}$$
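In R, solve(A, b) solves the system directly without forming the inverse explicitly (a sketch):

```r
A <- matrix(c(2, 4,
              3, 1), nrow = 2, byrow = TRUE)
b <- c(20, 10)

solve(A, b)   # y1 = 2, y2 = 4
```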

List of Useful Matrix Properties

$$A + B = B + A$$
$$(A + B) + C = A + (B + C)$$
$$(AB)C = A(BC)$$
$$C(A + B) = CA + CB$$
$$k(A + B) = kA + kB$$
$$(A')' = A$$
$$(A + B)' = A' + B'$$
$$(AB)' = B'A'$$
$$(ABC)' = C'B'A'$$
$$(AB)^{-1} = B^{-1}A^{-1}$$
$$(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$$
$$(A^{-1})^{-1} = A, \qquad (A')^{-1} = (A^{-1})'$$

Random Vectors and Matrices

Let's say we have a vector consisting of three random variables:

$$\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}$$

The expectation of a random vector is defined as

$$E(\mathbf{Y}) = \begin{pmatrix} E(Y_1) \\ E(Y_2) \\ E(Y_3) \end{pmatrix}$$

Expectation of a Random Matrix

The expectation of a random matrix is defined similarly:

$$E(\mathbf{Y}) = [E(Y_{ij})], \qquad i = 1, \dots, n; \; j = 1, \dots, p$$

Variance-Covariance Matrix of a Random Vector

The variances of the three random variables, $\sigma^2(Y_i)$, and the covariances between any two of them, $\sigma(Y_i, Y_j)$, are assembled in the variance-covariance matrix of $\mathbf{Y}$:

$$\text{cov}(\mathbf{Y}) = \sigma^2\{\mathbf{Y}\} = \begin{pmatrix} \sigma^2(Y_1) & \sigma(Y_1, Y_2) & \sigma(Y_1, Y_3) \\ \sigma(Y_2, Y_1) & \sigma^2(Y_2) & \sigma(Y_2, Y_3) \\ \sigma(Y_3, Y_1) & \sigma(Y_3, Y_2) & \sigma^2(Y_3) \end{pmatrix}$$

Remember that $\sigma(Y_2, Y_1) = \sigma(Y_1, Y_2)$, so the covariance matrix is symmetric.

Derivation of Covariance Matrix

In vector terms the variance-covariance matrix is defined by

$$\sigma^2\{\mathbf{Y}\} = E[(\mathbf{Y} - E(\mathbf{Y}))(\mathbf{Y} - E(\mathbf{Y}))']$$

because

$$\sigma^2\{\mathbf{Y}\} = E\left[ \begin{pmatrix} Y_1 - E(Y_1) \\ Y_2 - E(Y_2) \\ Y_3 - E(Y_3) \end{pmatrix} \begin{pmatrix} Y_1 - E(Y_1) & Y_2 - E(Y_2) & Y_3 - E(Y_3) \end{pmatrix} \right]$$

Regression Example

Take a regression example with $n = 3$. The error terms $\epsilon_i$ have constant variance $\sigma^2(\epsilon_i) = \sigma^2$ and are uncorrelated, so that $\sigma(\epsilon_i, \epsilon_j) = 0$ for all $i \neq j$. The variance-covariance matrix for the random vector $\boldsymbol{\epsilon}$ is

$$\sigma^2\{\boldsymbol{\epsilon}\} = \begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix}$$

which can be written as $\sigma^2\{\boldsymbol{\epsilon}\} = \sigma^2 I$.

Basic Results

If $A$ is a constant matrix and $\mathbf{Y}$ is a random vector, then $\mathbf{W} = A\mathbf{Y}$ is a random vector and

$$E(A) = A$$
$$E(\mathbf{W}) = E(A\mathbf{Y}) = A\,E(\mathbf{Y})$$
$$\sigma^2\{\mathbf{W}\} = \sigma^2\{A\mathbf{Y}\} = A\,\sigma^2\{\mathbf{Y}\}\,A'$$
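These identities can be verified by simulation; a minimal R sketch, with an arbitrary $A$ and independent components of $\mathbf{Y}$ chosen by us for illustration:

```r
set.seed(1)
A <- matrix(c(1, 2, 0,
              0, 1, 1), nrow = 2, byrow = TRUE)

# Y: 3 independent components with means (1, 2, 3) and variances (1, 4, 9)
n <- 1e5
Y <- rbind(rnorm(n, 1, 1), rnorm(n, 2, 2), rnorm(n, 3, 3))

W <- A %*% Y
rowMeans(W)                      # close to A %*% c(1, 2, 3)
cov(t(W))                        # close to the theoretical value below
A %*% diag(c(1, 4, 9)) %*% t(A)  # A sigma2{Y} A'
```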

Multivariate Normal Density

Let $\mathbf{Y}$ be a vector of $p$ observations

$$\mathbf{Y} = (Y_1, Y_2, \dots, Y_p)'$$

and let $\boldsymbol{\mu}$ be a vector of the means of each of the $p$ observations

$$\boldsymbol{\mu} = (\mu_1, \mu_2, \dots, \mu_p)'$$

Multivariate Normal Density

Let $\Sigma$ be the variance-covariance matrix of $\mathbf{Y}$:

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_p^2 \end{pmatrix}$$

Then the multivariate normal density is given by

$$P(\mathbf{Y} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (\mathbf{Y} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{Y} - \boldsymbol{\mu}) \right]$$
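The density formula translates directly into base R. A sketch, with a function name of our own choosing (in practice a package such as mvtnorm provides this):

```r
# Evaluate the multivariate normal density at a point y
dmvnorm_manual <- function(y, mu, Sigma) {
  p <- length(mu)
  d <- y - mu
  quad <- drop(t(d) %*% solve(Sigma) %*% d)   # (Y - mu)' Sigma^{-1} (Y - mu)
  exp(-quad / 2) / ((2 * pi)^(p / 2) * sqrt(det(Sigma)))
}

dmvnorm_manual(c(0, 0), mu = c(0, 0), Sigma = diag(2))   # 1/(2*pi) = 0.159...
```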

Matrix Simple Linear Regression

Nothing new here, only matrix formalism for previous results. Remember the normal error regression model

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2), \quad i = 1, \dots, n$$

Expanded out this looks like

$$Y_1 = \beta_0 + \beta_1 X_1 + \epsilon_1$$
$$Y_2 = \beta_0 + \beta_1 X_2 + \epsilon_2$$
$$\vdots$$
$$Y_n = \beta_0 + \beta_1 X_n + \epsilon_n$$

which points towards an obvious matrix formulation.

Regression Matrices

If we identify the following matrices

$$\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} \qquad \mathbf{X} = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix} \qquad \boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \qquad \boldsymbol{\epsilon} = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}$$

we can write the linear regression equations in a compact form:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

Regression Matrices

Of course, in the normal regression model the expected value of each of the $\epsilon_i$ is zero, so we can write

$$E(\mathbf{Y}) = \mathbf{X}\boldsymbol{\beta}$$

This is because $E(\boldsymbol{\epsilon}) = \mathbf{0}$:

$$E(\boldsymbol{\epsilon}) = \begin{pmatrix} E(\epsilon_1) \\ E(\epsilon_2) \\ \vdots \\ E(\epsilon_n) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$

Error Covariance

Because the error terms are independent and have constant variance $\sigma^2$,

$$\sigma^2\{\boldsymbol{\epsilon}\} = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2 I$$

Matrix Normal Regression Model

In matrix terms the normal regression model can be written as

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^2 I)$$

Least Squares Estimation

Remember the normal equations that we derived:

$$n b_0 + b_1 \sum X_i = \sum Y_i$$
$$b_0 \sum X_i + b_1 \sum X_i^2 = \sum X_i Y_i$$

Note also that

$$\mathbf{X}'\mathbf{X} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{pmatrix} \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix} = \begin{pmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{pmatrix}$$

$$\mathbf{X}'\mathbf{Y} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} \sum Y_i \\ \sum X_i Y_i \end{pmatrix}$$

Least Squares Estimation

These equations are equivalent to the matrix equation

$$\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y}, \qquad \mathbf{b} = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}$$

with the solution given by

$$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$

when $(\mathbf{X}'\mathbf{X})^{-1}$ exists.
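A short R sketch on simulated data (the data and true coefficients are our choice), comparing the matrix formula with R's built-in lm():

```r
set.seed(42)
n <- 50
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n)   # true beta0 = 1, beta1 = 2

X <- cbind(1, x)            # design matrix with an intercept column
b <- solve(t(X) %*% X) %*% t(X) %*% y
b                           # matches coef(lm(y ~ x))
coef(lm(y ~ x))
```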

Fitted Values

$$\hat{\mathbf{Y}} = \mathbf{X}\mathbf{b}$$

because

$$\hat{\mathbf{Y}} = \begin{pmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{pmatrix} = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{pmatrix}$$

Fitted Values, Hat Matrix

Plugging in $\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$, we have

$$\hat{\mathbf{Y}} = \mathbf{X}\mathbf{b} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$

or

$$\hat{\mathbf{Y}} = H\mathbf{Y}, \qquad \text{where} \quad H = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$$

is called the hat matrix. Properties of the hat matrix $H$:
1. symmetric
2. idempotent: $HH = H$.
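Continuing the simulated example above (a sketch), both properties are easy to confirm numerically:

```r
H <- X %*% solve(t(X) %*% X) %*% t(X)

all.equal(H, t(H))      # TRUE: symmetric
all.equal(H, H %*% H)   # TRUE: idempotent
yhat <- H %*% y         # fitted values, the same as X %*% b
```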

Residuals

The residuals are

$$\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - H\mathbf{Y} = (I - H)\mathbf{Y}$$

The matrix $I - H$ is also symmetric and idempotent. The variance-covariance matrix of $\mathbf{e}$ is

$$\sigma^2\{\mathbf{e}\} = \sigma^2(I - H)$$

and we can estimate it by

$$s^2\{\mathbf{e}\} = MSE\,(I - H)$$

Analysis of Variance Results

We know

$$SSTO = \sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n}$$

Note that $\mathbf{Y}'\mathbf{Y} = \sum Y_i^2$ and, with $J$ the matrix with entries all equal to 1,

$$\frac{(\sum Y_i)^2}{n} = \frac{1}{n} \mathbf{Y}'J\mathbf{Y}$$

As a result,

$$SSTO = \mathbf{Y}'\mathbf{Y} - \frac{1}{n}\mathbf{Y}'J\mathbf{Y}$$

Analysis of Variance Results

Also, $SSE = \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2$ can be represented as

$$SSE = \mathbf{e}'\mathbf{e} = \mathbf{Y}'(I - H)'(I - H)\mathbf{Y} = \mathbf{Y}'(I - H)\mathbf{Y}$$

Notice that $H\mathbf{1} = \mathbf{1}$, so $(I - H)J = 0$. Finally, by similar reasoning,

$$SSR = \left( \left[H - \tfrac{1}{n}J\right]\mathbf{Y} \right)' \left( \left[H - \tfrac{1}{n}J\right]\mathbf{Y} \right) = \mathbf{Y}'\left[H - \tfrac{1}{n}J\right]\mathbf{Y}$$

It is easy to check that $SSTO = SSE + SSR$.

Sums of Squares as Quadratic Forms

When $n = 2$, an example of a quadratic form is

$$5Y_1^2 + 6Y_1Y_2 + 4Y_2^2$$

which can be expressed in matrix terms as

$$\begin{pmatrix} Y_1 & Y_2 \end{pmatrix} \begin{pmatrix} 5 & 3 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \mathbf{Y}'A\mathbf{Y}$$

In general, a quadratic form is defined as

$$\mathbf{Y}'A\mathbf{Y} = \sum_{i=1}^n \sum_{j=1}^n A_{ij} Y_i Y_j, \qquad A_{ij} = A_{ji}$$

Here $A$ is a symmetric $n \times n$ matrix, the matrix of the quadratic form.

Quadratic Forms for ANOVA

$$SSTO = \mathbf{Y}'\left[I - \tfrac{1}{n}J\right]\mathbf{Y}$$
$$SSE = \mathbf{Y}'[I - H]\mathbf{Y}$$
$$SSR = \mathbf{Y}'\left[H - \tfrac{1}{n}J\right]\mathbf{Y}$$
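Still using the simulated x, y, and H from the sketches above, the three quadratic forms, and the identity SSTO = SSE + SSR, can be checked numerically:

```r
n <- length(y)
J <- matrix(1, n, n)   # n x n matrix of ones
I <- diag(n)

SSTO <- drop(t(y) %*% (I - J / n) %*% y)
SSE  <- drop(t(y) %*% (I - H) %*% y)
SSR  <- drop(t(y) %*% (H - J / n) %*% y)

all.equal(SSTO, SSE + SSR)   # TRUE
```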

Inference in Regression Analysis

Regression coefficients: the variance-covariance matrix of $\mathbf{b}$ is

$$\sigma^2\{\mathbf{b}\} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$$

Mean response: to estimate the mean response at $X_h$, define

$$\mathbf{X}_h = \begin{pmatrix} 1 \\ X_h \end{pmatrix}$$

Then $\hat{Y}_h = \mathbf{X}_h'\mathbf{b}$, and the variance-covariance matrix of $\hat{Y}_h$ is

$$\sigma^2\{\hat{Y}_h\} = \mathbf{X}_h'\,\sigma^2\{\mathbf{b}\}\,\mathbf{X}_h = \sigma^2\,\mathbf{X}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}_h$$

Prediction of a new observation:

$$s^2\{\text{pred}\} = MSE\left(1 + \mathbf{X}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}_h\right)$$
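A final sketch with the same simulated data, estimating these quantities at an arbitrarily chosen $X_h = 5$:

```r
MSE  <- SSE / (n - 2)             # estimate of sigma^2
s2_b <- MSE * solve(t(X) %*% X)   # estimated var-cov matrix of b

Xh      <- c(1, 5)                # X_h = 5
s2_mean <- MSE * drop(t(Xh) %*% solve(t(X) %*% X) %*% Xh)         # mean response
s2_pred <- MSE * (1 + drop(t(Xh) %*% solve(t(X) %*% X) %*% Xh))   # new observation
```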