Properties of Matrices and Operations on Matrices

A common data structure for statistical analysis is a rectangular array, or matrix. Rows represent individual observational units, or just observations, and columns represent the variables or features that are observed for each unit. If the elements of a matrix X represent numeric observations on variables in the structure of a rectangular array as indicated above, the mathematical properties of X carry useful information about the observations and about the variables themselves. In addition, mathematical operations on the matrix may be useful in discovering structure in the data. These operations include various transformations and factorizations.

Symmetric Matrices

A matrix A with elements a_ij is said to be symmetric if each element a_ji has the same value as a_ij. Symmetric matrices have useful properties that we will mention from time to time.

Symmetric matrices provide a generalization of the inner product. If A is symmetric and x and y are conformable vectors, then the bilinear form x^T A y has the property that x^T A y = y^T A x, and hence this operation on x and y is commutative, which is one of the properties of an inner product. More generally, a bilinear form is a kernel function of the two vectors, and a symmetric matrix corresponds to a symmetric kernel.

An important type of bilinear form is x^T A x, which is called a quadratic form.
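As a quick numerical illustration (a minimal numpy sketch with arbitrary example matrices, not part of the original notes), we can verify that the bilinear form is commutative when A is symmetric:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                  # symmetrize an arbitrary matrix
x = rng.standard_normal(3)
y = rng.standard_normal(3)

print(np.isclose(x @ A @ y, y @ A @ x))   # True: x^T A y = y^T A x
print(x @ A @ x)                          # the quadratic form x^T A x
```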

Nonnegative Definite and Positive Definite Matrices

A real symmetric matrix A such that for any real conformable vector x the quadratic form x^T A x is nonnegative, that is, such that x^T A x ≥ 0, is called a nonnegative definite matrix. We denote the fact that A is nonnegative definite by A ⪰ 0. (Note that we consider the zero matrix, 0_{n×n}, to be nonnegative definite.)

If the quadratic form is strictly positive for all x ≠ 0, A is called a positive definite matrix and we write A ≻ 0.
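A small sketch (example matrices only) of how such matrices arise and how definiteness can be checked numerically, using the eigenvalue characterization that appears later in these notes:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 3))
A_nnd = B.T @ B                    # B^T B is always nonnegative definite
A_pd = A_nnd + 0.5 * np.eye(3)     # adding a positive multiple of I makes it positive definite

# For a symmetric matrix, definiteness can be read off its eigenvalues.
print(np.all(np.linalg.eigvalsh(A_nnd) >= -1e-12))   # nonnegative (up to round-off)
print(np.all(np.linalg.eigvalsh(A_pd) > 0))          # strictly positive
```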

Systems of Linear Equations

One of the most common uses of matrices is to represent a system of linear equations Ax = b. Whether or not the system has a solution (that is, whether or not for a given A and b there is an x such that Ax = b) depends on the number of linearly independent rows in A (that is, considering each row of A as a vector). The number of linearly independent rows of a matrix, which is also the number of linearly independent columns of the matrix, is called the rank of the matrix. A matrix is said to be of full rank if its rank is equal to either its number of rows or its number of columns.
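For example (a minimal numpy sketch; the matrix is an arbitrary illustration), the rank can be computed numerically, and the row rank equals the column rank:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],     # a multiple of the first row
              [0., 1., 1.]])
print(np.linalg.matrix_rank(A))     # 2: two linearly independent rows
print(np.linalg.matrix_rank(A.T))   # 2: the column rank is the same
```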

A square full rank matrix is called a nonsingular matrix. We call a matrix that is square but not full rank singular.

The system Ax = b has a solution if and only if rank([A b]) = rank(A), where [A b] is the matrix formed from A by adjoining b as an additional column. If a solution exists, the system is said to be consistent. The overdetermined systems that arise in common regression settings generally do not satisfy this condition.
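The rank condition is easy to check numerically. The sketch below uses a hypothetical helper is_consistent (example matrices only) to test whether b lies in the column space of A:

```python
import numpy as np

def is_consistent(A, b):
    """Return True when rank([A b]) == rank(A), i.e. Ax = b has a solution."""
    Ab = np.column_stack([A, b])
    return np.linalg.matrix_rank(Ab) == np.linalg.matrix_rank(A)

A = np.array([[1., 1.],
              [2., 2.]])                       # a rank-1 matrix
print(is_consistent(A, np.array([1., 2.])))    # True: b is in the column space
print(is_consistent(A, np.array([1., 3.])))    # False: no exact solution exists
```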

Matrix Inverses

If the system Ax = b is consistent, then x = A^- b is a solution, where A^- is any matrix such that A A^- A = A; we can see this by substituting: if x is any solution, then A(A^- b) = A A^- A x = A x = b.

Given a matrix A, a matrix A^- such that A A^- A = A is called a generalized inverse of A, and we denote it as indicated. If A is square and of full rank, the generalized inverse, which is unique, is called the inverse and is denoted by A^{-1}. It has a stronger property: A A^{-1} = A^{-1} A = I, where I is the identity matrix.
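As an illustration (numpy's pinv returns the Moore-Penrose inverse, which is one particular generalized inverse; the matrices are examples only), a consistent singular system can be solved with x = A^- b:

```python
import numpy as np

A = np.array([[1., 1.],
              [2., 2.]])              # singular, so no ordinary inverse exists
b = np.array([1., 2.])                # consistent: b is in the column space of A
x = np.linalg.pinv(A) @ b             # x = A^- b for one choice of A^-
print(np.allclose(A @ x, b))          # True

# For a square full rank matrix, the generalized inverse is the ordinary inverse.
M = np.array([[2., 1.],
              [1., 3.]])
print(np.allclose(np.linalg.inv(M) @ M, np.eye(2)))   # True: M^{-1} M = I
```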

To the general requirement A A^- A = A, we successively add three requirements that define special generalized inverses, sometimes called, respectively, g_2, g_3, and g_4 inverses. The general generalized inverse is sometimes called a g_1 inverse. The g_4 inverse is called the Moore-Penrose inverse.

For a matrix A, a Moore-Penrose inverse, denoted by A^+, is a matrix that has the following four properties.

1. A A^+ A = A. Any matrix that satisfies this condition is called a generalized inverse, and as we have seen above is denoted by A^-. For many applications, this is the only condition necessary. Such a matrix is also called a g_1 inverse, an inner pseudoinverse, or a conditional inverse.

2. A^+ A A^+ = A^+. A matrix A^+ that satisfies this condition is called an outer pseudoinverse. A g_1 inverse that also satisfies this condition is called a g_2 inverse or reflexive generalized inverse, and is denoted by A^*.

3. A^+ A is symmetric.

4. A A^+ is symmetric.
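The four conditions are easy to verify numerically; a minimal sketch (arbitrary example matrix, using numpy's pinv for the Moore-Penrose inverse):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))
Ap = np.linalg.pinv(A)

print(np.allclose(A @ Ap @ A, A))         # 1. A A+ A = A
print(np.allclose(Ap @ A @ Ap, Ap))       # 2. A+ A A+ = A+
print(np.allclose(Ap @ A, (Ap @ A).T))    # 3. A+ A is symmetric
print(np.allclose(A @ Ap, (A @ Ap).T))    # 4. A A+ is symmetric
```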

The Matrix X^T X

When numerical data are stored in the usual way in a matrix X, the matrix X^T X often plays an important role in statistical analysis. A matrix of this form is called a Gramian matrix, and it has some interesting properties.

First of all, we note that X^T X is symmetric; that is, the (i,j)th element, Σ_k x_{k,i} x_{k,j}, is the same as the (j,i)th element. Secondly, because (Xy)^T Xy ≥ 0 for any y, X^T X is nonnegative definite. Next we note that X^T X = 0 if and only if X = 0.
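These properties can be checked directly (a minimal sketch with an arbitrary example data matrix X):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 4))
G = X.T @ X                                      # the Gramian matrix

print(np.allclose(G, G.T))                       # symmetric
print(np.all(np.linalg.eigvalsh(G) >= -1e-12))   # nonnegative definite
y = rng.standard_normal(4)
print(y @ G @ y >= 0)                            # y^T (X^T X) y = (Xy)^T (Xy) >= 0
```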

The generalized inverses of X^T X have useful properties. First, we see from the definition that, for any generalized inverse (X^T X)^-, the matrix ((X^T X)^-)^T is also a generalized inverse of X^T X. (Note that (X^T X)^- is not necessarily symmetric.)

Also, we have X(X^T X)^- X^T X = X. This means that (X^T X)^- X^T is a generalized inverse of X. The Moore-Penrose inverse of X has an interesting relationship with a generalized inverse of X^T X: X X^+ = X(X^T X)^- X^T.

An important property of X(X^T X)^- X^T is its invariance to the choice of the generalized inverse of X^T X. The matrix X(X^T X)^- X^T has a number of other interesting properties in addition to those mentioned above:

(X(X^T X)^- X^T)(X(X^T X)^- X^T) = X(X^T X)^- (X^T X)(X^T X)^- X^T = X(X^T X)^- X^T;

that is, X(X^T X)^- X^T is idempotent. It is clear that the only idempotent matrix that is of full rank is the identity I.

Any real symmetric idempotent matrix is a projection matrix. The most familiar application of the matrix X(X^T X)^- X^T is in the analysis of the linear regression model y = Xβ + ɛ. This matrix projects the observed vector y onto a lower-dimensional subspace that represents the fitted model: ŷ = X(X^T X)^- X^T y. Projection matrices, as the name implies, generally transform or project a vector onto a lower-dimensional subspace.
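A minimal regression sketch (simulated example data, with numpy's pinv standing in for a generalized inverse of X^T X) showing that the projection matrix is symmetric and idempotent and that it produces the fitted values:

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(20), rng.standard_normal((20, 2))])   # design matrix
y = rng.standard_normal(20)                                        # response

H = X @ np.linalg.pinv(X.T @ X) @ X.T     # X (X^T X)^- X^T
print(np.allclose(H, H.T))                # symmetric
print(np.allclose(H @ H, H))              # idempotent
y_hat = H @ y                             # fitted values, in the column space of X
print(np.allclose(X.T @ (y - y_hat), 0))  # residuals are orthogonal to the columns of X
```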

Eigenvalues and Eigenvectors

Multiplication of a given vector by a square matrix may result in a scalar multiple of the vector. If A is an n×n matrix, v is a vector not equal to 0, and c is a scalar such that Av = cv, we say v is an eigenvector of A and c is an eigenvalue of A.

We should note how remarkable the relationship Av = cv is: the effect of a matrix multiplication of an eigenvector is the same as a scalar multiplication of the eigenvector. The eigenvector is an invariant of the transformation in the sense that its direction does not change under the matrix multiplication transformation.
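For example (a minimal numpy sketch with an arbitrary example matrix), each eigenpair returned by the eigensolver satisfies Av = cv:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
vals, vecs = np.linalg.eig(A)          # eigenvalues and column eigenvectors
for i in range(len(vals)):
    c, v = vals[i], vecs[:, i]
    print(np.allclose(A @ v, c * v))   # True: A v = c v for every eigenpair
```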

Eigenvalues and Eigenvectors

We immediately see that if an eigenvalue of a matrix A is 0, then A must be singular. We also note that if v is an eigenvector of A and t is any nonzero scalar, then tv is also an eigenvector of A. Hence, we can normalize eigenvectors, and we often do.

If A is symmetric, there are several useful facts about its eigenvalues and eigenvectors. The eigenvalues and eigenvectors of a (real) symmetric matrix are all real.

Eigenvalues and Eigenvectors

The eigenvectors of a symmetric matrix are (or can be chosen to be) mutually orthogonal. We can therefore represent a symmetric matrix A as A = V C V^T, where V is an orthogonal matrix whose columns are the eigenvectors of A and C is a diagonal matrix whose (i,i)th element is the eigenvalue corresponding to the eigenvector in the ith column of V. This is called the diagonal factorization of A.
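A short sketch of the diagonal factorization using numpy's symmetric eigensolver (example matrix only):

```python
import numpy as np

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
c, V = np.linalg.eigh(A)                 # real eigenvalues, orthonormal eigenvectors
C = np.diag(c)

print(np.allclose(V @ C @ V.T, A))       # A = V C V^T
print(np.allclose(V.T @ V, np.eye(3)))   # V is orthogonal
```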

Eigenvalues and Eigenvectors

If A is a nonnegative (positive) definite matrix, and c is an eigenvalue with corresponding eigenvector v, then multiplying both sides of the equation Av = cv on the left by v^T gives v^T A v = c v^T v ≥ 0 (> 0), and since v^T v > 0, we have c ≥ 0 (> 0).

The maximum modulus of any eigenvalue of a given matrix is of interest. This value is called the spectral radius, and for the matrix A it is denoted by ρ(A): ρ(A) = max_i |c_i|, where the c_i's are the eigenvalues of A. The spectral radius is very important in many applications, from both computational and statistical standpoints. The convergence of some iterative algorithms, for example, depends on bounds on the spectral radius.
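A small sketch of computing the spectral radius (the helper spectral_radius is hypothetical, and the matrix is an arbitrary example):

```python
import numpy as np

def spectral_radius(A):
    """Maximum modulus of the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
print(spectral_radius(A))   # 0.6; since rho(A) < 1, the powers A^k shrink toward 0
```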

Matrix Decomposition

Computations with matrices are often facilitated by first decomposing the matrix into multiplicative factors that are easier to work with computationally, or that reveal some important characteristics of the matrix. Some decompositions exist only for special types of matrices, such as symmetric matrices or positive definite matrices.

The Singular Value Decomposition

One of the most useful decompositions, and one that applies to all types of matrices, is the singular value decomposition. An n×m matrix A can be factored as A = U D V^T, where U is an n×n orthogonal matrix, V is an m×m orthogonal matrix, and D is an n×m diagonal matrix with nonnegative entries. The number of positive entries in D is the same as the rank of A. This factorization is called the singular value decomposition (SVD) or the canonical singular value factorization of A.
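A minimal numerical sketch of the factorization (arbitrary example matrix of rank 2), with the rank read off the number of positive singular values:

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [2., 4., 0.],
              [0., 0., 3.],
              [0., 0., 6.]])
U, s, Vt = np.linalg.svd(A)         # s holds the diagonal entries of D
D = np.zeros(A.shape)
np.fill_diagonal(D, s)

print(np.allclose(U @ D @ Vt, A))   # A = U D V^T
print(int(np.sum(s > 1e-12)))       # 2, the rank of A
```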

Singular Values and the Singular Value Decomposition

The elements on the diagonal of D, the d_i, are called the singular values of A. We can rearrange the entries in D so that d_1 ≥ d_2 ≥ ···, and by rearranging the columns of U correspondingly, nothing is changed. If the rank of the matrix is r, we have d_1 ≥ ··· ≥ d_r > 0, and if r < min(n, m), then d_{r+1} = ··· = d_{min(n,m)} = 0. In this case

D = [ D_r  0
      0    0 ],

where D_r = diag(d_1, ..., d_r).

From the factorization defining the singular values, we see that the singular values of A^T are the same as those of A.

Singular Values and the Singular Value Decomposition

For a matrix with more rows than columns, in an alternate definition of the singular value decomposition, the matrix U is n×m with orthogonal columns and D is an m×m diagonal matrix with nonnegative entries. Likewise, for a matrix with more columns than rows, the singular value decomposition can be defined as above but with the matrix V being m×n with orthogonal columns and D being n×n and diagonal with nonnegative entries.

If A is symmetric, its singular values are the absolute values of its eigenvalues.

SVD and the Moore-Penrose Inverse

The Moore-Penrose inverse of a matrix has a simple relationship to its SVD. If the SVD of A is given by A = U D V^T, then its Moore-Penrose inverse is A^+ = V D^+ U^T, as is easy to verify. The Moore-Penrose inverse of D is just the matrix D^+ formed by inverting all of the positive entries of D and leaving the other (zero) entries unchanged (and transposing, so that the dimensions conform when D is not square).
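The verification is straightforward numerically; a sketch (arbitrary example matrix) that builds V D^+ U^T from the SVD and compares it with numpy's pinv:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A)

Dp = np.zeros((A.shape[1], A.shape[0]))                   # D^+ has the transposed shape of D
np.fill_diagonal(Dp, np.where(s > 1e-12, 1.0 / s, 0.0))   # invert only the positive entries
A_plus = Vt.T @ Dp @ U.T                                  # V D^+ U^T

print(np.allclose(A_plus, np.linalg.pinv(A)))             # True
```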

Square Root Factorization of a Nonnegative Definite Matrix

If A is a nonnegative definite matrix (which, for me, means that it is symmetric), its eigenvalues are nonnegative, so we can write S = C^{1/2}, where S is a diagonal matrix whose elements are the square roots of the elements of the C matrix in the diagonal factorization of A. Now we observe that (V S V^T)^2 = V C V^T = A; hence, we write

A^{1/2} = V S V^T,

and we have (A^{1/2})^2 = A.
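A closing sketch (example matrix built as B^T B so that it is nonnegative definite) of the square root factorization:

```python
import numpy as np

B = np.array([[2., 1.],
              [0., 1.]])
A = B.T @ B                                   # nonnegative definite by construction
c, V = np.linalg.eigh(A)                      # diagonal factorization A = V C V^T
S = np.diag(np.sqrt(np.clip(c, 0.0, None)))   # clip guards against tiny negative round-off
A_half = V @ S @ V.T                          # A^{1/2} = V S V^T

print(np.allclose(A_half @ A_half, A))        # (A^{1/2})^2 = A
print(np.allclose(A_half, A_half.T))          # the square root is symmetric
```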