Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra

Similar documents
Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Knowledge Discovery and Data Mining 1 (VO) ( )

Math Camp II. Basic Linear Algebra. Yiqing Xu. Aug 26, 2014 MIT

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y.

3 (Maths) Linear Algebra

STAT200C: Review of Linear Algebra

A matrix over a field F is a rectangular array of elements from F. The symbol

Chapter 3. Matrices. 3.1 Matrices

Problem Set (T) If A is an m n matrix, B is an n p matrix and D is a p s matrix, then show

Eigenvalues and Eigenvectors

Chapter 1. Matrix Algebra

Lecture 15 Review of Matrix Theory III. Dr. Radhakant Padhi Asst. Professor Dept. of Aerospace Engineering Indian Institute of Science - Bangalore

Principal Component Analysis (PCA) Our starting point consists of T observations from N variables, which will be arranged in an T N matrix R,

Ma/CS 6b Class 20: Spectral Graph Theory

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX

Appendix A: Matrices

Chapter 4 - MATRIX ALGEBRA. ... a 2j... a 2n. a i1 a i2... a ij... a in

ELEMENTARY LINEAR ALGEBRA

Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0.

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε,

Topic 1: Matrix diagonalization

CS 246 Review of Linear Algebra 01/17/19

Elementary linear algebra

Introduction to Matrix Algebra

OHSx XM511 Linear Algebra: Solutions to Online True/False Exercises

Linear Algebra M1 - FIB. Contents: 5. Matrices, systems of linear equations and determinants 6. Vector space 7. Linear maps 8.

Linear Algebra: Matrix Eigenvalue Problems

Math Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p

JUST THE MATHS UNIT NUMBER 9.8. MATRICES 8 (Characteristic properties) & (Similarity transformations) A.J.Hobson

Math113: Linear Algebra. Beifang Chen

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 1 x 2. x n 8 (4) 3 4 2

Matrices A brief introduction

Lecture notes on Quantum Computing. Chapter 1 Mathematical Background

Ma/CS 6b Class 20: Spectral Graph Theory

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

Notes on Linear Algebra

1 Matrices and Systems of Linear Equations

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

Chapters 5 & 6: Theory Review: Solutions Math 308 F Spring 2015

Matrices. Chapter What is a Matrix? We review the basic matrix operations. An array of numbers a a 1n A = a m1...

Lecture Summaries for Linear Algebra M51A

Introduction to Matrices

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and

Mathematical Foundations of Applied Statistics: Matrix Algebra

Review of Linear Algebra

Matrices A brief introduction

Foundations of Matrix Analysis

A matrix is a rectangular array of. objects arranged in rows and columns. The objects are called the entries. is called the size of the matrix, and

CS 143 Linear Algebra Review

Chap 3. Linear Algebra

Fundamentals of Engineering Analysis (650163)

ELEMENTARY LINEAR ALGEBRA

ELEMENTARY LINEAR ALGEBRA

Linear Algebra March 16, 2019

Linear Algebra Primer

3 Matrix Algebra. 3.1 Operations on matrices

Equality: Two matrices A and B are equal, i.e., A = B if A and B have the same order and the entries of A and B are the same.

Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane.

Vectors and matrices: matrices (Version 2) This is a very brief summary of my lecture notes.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

ELE/MCE 503 Linear Algebra Facts Fall 2018

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Numerical Linear Algebra Homework Assignment - Week 2

Quick Tour of Linear Algebra and Graph Theory

Lecture Notes in Linear Algebra

Math 315: Linear Algebra Solutions to Assignment 7

Lecture 2: Linear operators

Introduction to Matrices

1 Matrices and Systems of Linear Equations. a 1n a 2n

Linear Algebra Highlights

The Singular Value Decomposition

Eigenvalues and Eigenvectors

MATH 431: FIRST MIDTERM. Thursday, October 3, 2013.

Matrix operations Linear Algebra with Computer Science Application

Recall the convention that, for us, all vectors are column vectors.

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra

1 The Basics: Vectors, Matrices, Matrix Operations

Chapter 5. Linear Algebra. A linear (algebraic) equation in. unknowns, x 1, x 2,..., x n, is. an equation of the form

linearly indepedent eigenvectors as the multiplicity of the root, but in general there may be no more than one. For further discussion, assume matrice

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2

Phys 201. Matrices and Determinants

Symmetric and anti symmetric matrices

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

2 b 3 b 4. c c 2 c 3 c 4

Econ Slides from Lecture 7

M. Matrices and Linear Algebra

AMS526: Numerical Analysis I (Numerical Linear Algebra)

Introduction Eigen Values and Eigen Vectors An Application Matrix Calculus Optimal Portfolio. Portfolios. Christopher Ting.

Linear Algebra: Lecture notes from Kolman and Hill 9th edition.

NOTES on LINEAR ALGEBRA 1

Mathematics. EC / EE / IN / ME / CE. for

EE731 Lecture Notes: Matrix Computations for Signal Processing

Systems of Algebraic Equations and Systems of Differential Equations

GATE Engineering Mathematics SAMPLE STUDY MATERIAL. Postal Correspondence Course GATE. Engineering. Mathematics GATE ENGINEERING MATHEMATICS

MATRICES ARE SIMILAR TO TRIANGULAR MATRICES

MATH 240 Spring, Chapter 1: Linear Equations and Matrices

Digital Workbook for GRA 6035 Mathematics

Stat 159/259: Linear Algebra Notes

Mathematical Foundations

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 1 Introduction to Linear Algebra

Transcription:

Massachusetts Institute of Technology Department of Economics 14.381 Statistics Guido Kuersteiner Lecture Notes on Matrix Algebra These lecture notes summarize some basic results on matrix algebra used frequently in statistics and econometrics. Where necessary they also review some more general results on linear algebra. Starred (*) paragraphs are optional. The most important thing is to remember the calculus rules highlighted as Results and numbered throughout the notes for easy reference. 1 Basic Concepts An m n matrix A is a rectangular array of real numbers a 11 a 12 a 1n a A = 21 a 22 a 2n... a m1 a m2 a mn The real numbers a ij are called the elements of A. An m 1 matrix x is a point in R m. It is called a column vector. A 1 m matrix is called a row vector. The sum of two matrices A, B of dimension m n is obtained by forming the sum of all individual elements of A and B such that the i, j-th element of (A + B) ij = a ij + b ij a 11 + b 11 a 1n + b 1n A + B =.. a m1 + b m1 a mn + b mn The product of a matrix A with a scalar λ is λa = Aλ with i, j-th element (λa) ij = λa ij for all i, j. The following properties follow easily A + B = B + A (A + B)+C = A +(B + C) (λ + µ)a = λa + µa µ(a + B) = µa + µb λ(µa) = (λµ)a 1

The null matrix is the matrix that has all elements equal to zero and is denoted by. A +( 1)A = The matrix product of A and B is defined when A is m n and B is n p. The i, k th element of AB is then given by nx (AB) ik = a ij b jk and AB is an m p matrix. The matrix product satisfies (AB)C = A(BC) A(B + C) = AB + AC (A + B)C = AC + BC Note that if AB exists, BA need not be defined and even if it exists if is usually not true that AB = BA. Matrices are often used to represent linear maps. An m n matrix maps R n R m. If x R n then AX = y R m. The matrix product then represents a sequence of maps applied to x, i.e.x 7 Ax = y and y 7 By = BAx = z. The transpose of an m n matrix A is denoted by A. If A has elements a ij then A has elements a ji.wehave j=1 (A ) = A (A + B) = A + B (AB) = B A 2 The rank of a matrix Definition 1 (Linear Independence) Let x 1...x n R n be vectors in R n. Then x 1...x n aresaidtobelinearlyindependentifthereexistsnosetofscalarsα 1...α j, not all zero, such that X α i x i =. If x i are not linearly independent then they are said to be linearly dependent. i Definition 2 (Rank) Thecolumnrankofanm n matrix A is the largest number of linearly independent column vectors A contains. The row rank is the largest number of linearly independent row vectors A contains. 2

It can be shown that the row rank and column rank are equal. The row and column rank of a matrix therefore is often referred to simply as the rank of the matrix. We denote the rank of A by r(a). Definition 3 (Column Space) The column space of A is the set of vectors generated by the columns of A, denotedby M(A) ={y : y = Ax, x R n }. The dimension of M(A) is r(a). Definition 4 (Non-Singular) AsquarematrixA of dimension n n is said to be non singular if r(a) =n. A is said to be singular if r(a) <n. Definition 5 (Null Space) The null space of A is the set of vectors x such that Ax =, denoted by N(A). If A is non singular then N(A) =. The dimension of N(A) is n r(a). 3 Inverses, Partitioned Matrices and Orthogonal Matrices If A is non-singular then there exists a non-singular matrix A 1 such that AA 1 = A 1 A = I where 1 I =... 1 is the identity matrix. The following facts about inverses are useful Result 1 i) (AB) 1 = B 1 A 1 ii)(a ) 1 =(A 1 ) Proof. For i) note that ABB 1 A 1 = AA 1 = I and B 1 A 1 AB = I. For ii) note that (A (A 1 ) ) = A 1 A = I and ((A 1 ) A ) = AA 1 = I Definition 6 AmatrixA is called orthogonal if A 1 = A. Two vectors x and y are orthogonal if x y =. 3

Itissometimesusefultopartitionamatrixintosubmatrices. A11 A 12 A = A 21 A 22 IfA is m n then where A 11 is m 1 n 1, A 12 is m 1 n 2, A 21 is m 2 n 1,andA 22 is m 2 n 2 and m 1 + m 2 = m and n 1 + n 2 = n. Then the sum of two matrices that are partitioned conformingly can be obtained as A11 + B 11 A 12 + B 12 A + B = A 21 + B 21 A 22 + B 22 where A ij and B ij for i, j =1, 2 have to have the same dimension m i n j. The product of two conformable matrices A and B canbeexpressedinpartitionedformifb 11 is n 1 p 1,B 12 is n 1 p 2,B 21 is n 2 p 1 and B 22 is n 2 p 2 such that and the product is of dimension m (p 1 + p 2 ). AB = A 11B 11 + A 12 B 21 A 11 B 12 + A 12 B 22 A 21 B 11 + A 22 B 21 A 21 B 12 + A 22 B 22 Result 2 If A is n n and both A 11 and A 22 are square non singular and A 12 = A 21 =then A 1 A 1 11 = A 1. 22 Result 3 If A is non singular and D = A 22 A 21 A 1 11 A 12 is non singular then à A 1 A 1 11 = + A 1 11 A 12D 1 A 11 A 1 11 A! 12D 1 D 1 A 21 A 1 11 D 1. Also if A is non singular and E is non singular where E = A 11 A 12 A 1 22 A 21, then A 1 E 1 E 1 A 12 A 1 22 = A 1 22 A 21E 1 A 1 22 + A 1 22 A 21E 1 A 12 A 1. 22 (*) Proof. We only give a proof for the second representation. The idea behind the proof is to solve the system of matrix equations AX = I for the unknown matrix X. By the properties of the inverse X then has to be the inverse. It is useful to partition X as X11 X 12 X = X 21 X 22 4

and to write the system of matrix equations more explicitly as the following set of equations A 11 X 11 + A 12 X 21 = I A 11 X 12 + A 12 X 22 = A 21 X 11 + A 22 X 21 = A 21 X 12 + A 22 X 22 = I. Since A 22 is non-sigular we can solve the third matrix equation for X 21 such that Substituting this result in the first equation leads to X 21 = A 1 22 A 21X 11. (1) A 11 X 11 A 12 A 1 22 A 21X 11 = I X 11 = (A 11 A 12 A 1 22 A 21) 1 (2) which leads to expressions for the partitioned inverse elements X 21 and X 11. In the same way we now solve the second equation for X 12 where X 12 = A 1 11 A 12X 22 (3) A 21 A 1 11 A 12X 22 + A 22 X 22 = I X 22 = (A 22 A 21 A 1 11 A 12) 1 (4) such that we have determined all the elements in of the inverse. In order to fully establish our result we need to show that X 22 =(A 22 A 21 A 1 11 A 12) 1 = A 1 22 + A 1 11 A 21E 1 A 12 A 1 22. By Lemma(1)itfollowsthat(I m +CB) 1 = I m C(I n + BC) 1 B for matrices B and C such that C is m n and B is n m and I m + CB as well as I n + BC are non singular. Now note that Let C = A 1 22 A 21 and B = A 1 11 A 12 then (A 22 A 21 A 1 11 A 12) 1 =(I A 1 22 A 21A 1 11 A 12) 1 A 1 22. (I A 1 22 A 21A 1 11 A 12) 1 = (I + A 1 22 A 21(I A 1 11 A 12A 1 22 A 21) 1 A 1 11 A 12 and the result follows. = I + A 1 22 A 21(A 11 A 12 A 1 22 A 21) 1 A 12 We now state the lemma that we used in the proof of the partitioned inverse formula. Lemma 1 (*) Let Cm n and Bn m s.t. I m + CB and I n + BC are non singular. Then (I m + CB) 1 = I m C(I n + BC) 1 B. 5

Proof. We use the following identity In B In B I m C I m In + BC = C I m. Then (In + BC) 1 (In + BC) 1 B(I m + CB) 1 In B C(I n + BC) 1 I m = C(I n + BC) 1 (I m + CB) 1 I m by (1), (2), (3) and (4). This implies C(I n + BC) 1 B +(I m + CB) 1 = I m. 4 The Determinant The determinant is defined as a function mapping the vectors of a n n matrix into a single number. If A is a n n matrix we use A to denote the determinant where this causes no confusion with the absolute value of a real number. Alternatively, the notation det A is used. The function plays a role in representing inverses of matrices and can be used as a test of whether a matrix is singular or non singular. We first give a formal definition. Definition 7 The determinant of a square matrix A with dimension n n is a mapping A A such that i) is linear in each row of A ii) if rank (A) <nthen A =and vice versa iii) I =1 An alternative way of representing the determinant is to view the matrix A as a collection of n row vectors {a 1,..., a n } where each a i R n. The determinant is then a function mapping the set of vectors {a 1,...,a n } into R. We can write D(a 1,..., a n ) R. Property i) of the previous definition the means that D(a 1,..., λa i + µa i,a i+1,...,a n )=λd(a 1,...,a i,..., a n )+µd(a 1,..., a i,...,a n ) for any scalars µ, λ R and any vectors a i,a i R. In particular note that D(a 1,..., λa i,..., a n )= λd(a 1,...,a i,..., a n ). A simple geometric interpretation of the determinant can be obtained in the case where n =2. Considerthepairofvectorsa 1,a 2 R 2. Geometrically, a 1 and a 2 span a parallelogram with edges a 1 and a 2. The absolute value of the determinant D(a 1,a 2 ) then is the area of the parallelogram. If we stretch one of the vectors by a factor λ then the area changes by a factor 6

λ. If we stretch both vectors by the factor λ then D(λa 1, λa 2 )=λ 2 such that the area changes by a factor λ 2. The following theorem establishes (without proof) that there always exists a function that satisfies the above definition. Theorem 4 The mapping exists and is unique. The discussion so far only established the existence of the determinant and some of its properties. It did not give an explicitly formula for computing the determinant. We note that the determinant can be computed by using the following recursive formulation. Note that nx A = ( 1) i+j a ij A ij i=1 where A ij is the matrix obtained from leaving out the i th row and the j th column. This formulation is called the expansion of. along the jth column. We rarely need to compute determinants explicitly by hand (computers were invented for that!). On the other hand the following calculus rules are useful. Result 5 Let A, B be n n matrices. Then AB = A B A 1 = 1/ A (*) Proof:. The idea of the proof is to establish that A B possesses all the properties of the determinant of AB. Because the determinant is the unique function with these properties this is enough to establish that AB = A B. Now first assume that B 6=. But then we can check that A := AB B has the required properties of a determinant which then establishes AB = A B. For this purpose let à be the matrix where the j-throwwasreplacedby αa jk k =1,..., n. Also use the notation [A] ij for the element a ij of the matrix A. Then i hãb = α X a jk b k` ` j` k i hãb = X a ik b k` `, and i 6= j i` so that again ÃB is a matrix where the j th row has been scaled by α. Then the properties of.. Using our claimed representation of à we note à = ÃB B ÃB = α AB by = α AB B = α A as required. Also, if rank A<nthen rank(ab) <nsuch that A = AB B =and I = B B =1 so in other words, AB = A B satisfies all the required properties. If B =then AB = and A B satisfies the requirements trivially. For the second result we note that I =1=A 1 A = A 1 A. 7

Result 6 Let A be a n n matrix. Then A = A. Proof. We only need to show that is also linear in the columns. Using the expansion along column j, wefind A = X ( 1) i+j αa ij à ij i where Ãij does not depend on the j th column. This shows that the determinant is linear in the columns of A. Result 7 Let A be a n n matrix. Then αa = α n A Proof. Note that αa = D(αa 1, αa 2,..., αa n )=αd(a 1, αa 2,..., αa n )=α 2 D(a 1,a 2,..., αa n ) by repeated application of the linearity property of.. Result 8 Let A be a n n matrix, B a m m matrix and C a n m matrix. Then A C det = A B. B Proof. We need to check that the function A B satisfies the three condition for a determinant. Let à and C be transformations of A and C by having one row replaced by λr i + µri where λ,µ are scalars and r i =(a i,c i ) R n+m and r i =(a i,c i ) R n+m are vectors. Let A i be the matrix A with ith row a i and A i the matrix A with ith row a i with corresponding definitions for C. Then à C det B = à B = λ A i + µ A A i C i A i Ci i B = λ det + µ det. B B The same argument holds for linear combinations of rows in B. This establishes linearity in the rows. If A is reduced rank then matrix A C B is of reduced rank and thus the determinant is zero. The same argument again holds if B is singular. Now assume A, B are full rank then A C x 1 = B x 2 8

implies Bx 2 =and Ax 1 + CX 2 =. Since x 2 =it follows that Ax 1 =. But because the null space of A is zero, it follows that x 1 =. So A C B is also full rank and A B 6=as required. Finally, if A = I n and B = I m and C =then I n I m =1, establishing the last property. We can establish a more general result for the determinant of a partitioned matrix similar to the result on partitioned inverses. Result 9 Let A11 A 12 A = A 21 A 22 where A 11 and A 22 are full rank n n, m m matrices. Then then Proof. Define the matrix such that A = A 11 A 22 A 21 A 1 = A 22 A 11 A 12 A 1 I A 1 11 B = A 12 I 11 A 12 22 A 21 A 11 AB = A 21 A 22 A 21 A 1 11 A 12 AB = A B = A 11 A 22 A 21 A 1 11 A 12 where the first equality follows from Result (5) and the second equality follows from Result (8) and the fact that B =1. 5 Quadratic Forms Let A be an n n matrix and x R n. Then x Ax = X i,j a ij x i x j 9

is a quadratic form. Note that x Ax is a scalar. Therefore, it follows that x Ax = x A x such that we can write x Ax =1/2x (A+A )x where (A + A ) /2 is symmetric. Thus assume without loss of generality that A is symmetric, i.e. a ij = a ji or A = A. Then A is positive definite if x Ax > x 6= positive semidefinite if x Ax x negative definite if x Ax < x 6= negative semidefinite if x Ax x Result 1 For any matrix B, A = B B is positive semi-definite. Proof. Let y = Bx. Then x Ax = x B Bx = y y = P i y2 i. Note that y y could be zero if x is chosen to lie in the null space of B. Result 11 A positive definite matrix is non singular. Proof. Assume that A is positive definite and A is singular. Then there exists an x 6= such that Ax =. But then x Ax =which contradicts that A is positive definite. Result 12 If B is m n with m n and r(b) =n then B B is positive definite. Proof. Let y = Bx where Bx is only zero if x =because B is of full column rank n. Then y 6= such that x B Bx = y y>. Definition 8 A n n matrix M is said to be idempotent if MM = M. Result 13 If M is symmetric then M is also positive semi-definite. Proof. For any x R n, let y = Mx. Then x Mx = x MMx by idempotency. By symmetry it then follows that x MMx = x M Mx = y y as shown before where y y =only if y = which occurs if Mx =. (This is possible because usually M is of reduced rank). 6 Eigenvalues and Eigenvectors In this section we study eigenvalues and eigenvectors of a given matrix A. These eigenvectors and eigenvalues can be used to transform the matrix A into a simpler form which in turn is useful to solve systems of linear equations and to analyze the properties of the mapping described by A. We say that λ is an eigenvalue of an n n matrix A with corresponding eigenvector x if Ax = λx (5) 1

for some x 6=. An x 6= satisfying this equation is an eigenvector corresponding to the eigenvalue λ. Note that Equation (5) can be written as (A λi)x =. This homogeneous system of equations has a nontrivial (x 6= )solution if and only if the matrix A λi is singular. The matrix A λi is singular if and only if the determinant A λi =. This leads to a characterization of the eigenvalues as solutions to the equation A λi =.Note that A λi is a polynomial of degree n in λ. To see this consider the representation of the determinant as det A λi = X ny ( 1) φ( 1,..., n) (a iji λ1 {i = j i }) where the summation is taken over all permutations ( 1,..., n ) of the set of integers (1,..., n), and φ(j 1,..., j n ) is the number of transpositions required to change (1,..., n) into (j 1,..., j n ). The polynomial A λi has at most n, possibly complex, distinct roots that can appear with multiplicities, and at least one (complex) root. Note, that the number of distinct roots can be less than n. In that case one or more roots occur with multiplicities. Also note that the eigenvectors associated with a given eigenvalue need not be unique. A simple example is the case where A = I. Then, the only eigenvalue of I is 1, occurring with multiplicity n. In this case all nonzero vectors x R n are eigenvectors of I. We now normalize the eigenvector x such that x x =1.Ifxis an eigenvector for λ and y an eigenvector for µ then y Ax = λy x i=1 and thus y Ax = µy x =(λ µ)y x implying that y x =. This shows that eigenvectors of distinct eigenvalues are orthogonal. If λ = µ and x 6= αy then any vector in the span of x and y is also a solution to (A λi)x == (A µi)y. Thus we can always choose x and y to be orthogonal. We have so far established that any n n matrix A has eigenvalues λ 1,...,λ n with orthogonal eigenvectors v i. We now use these results to transform a given matrix A into a simpler form. The next theorem states that for any n n matrix A there exists possibly complex valued matrices T and M such that M is upper triangular (all elements below the diagonal are zero) and T is unitary. Note that T is unitary if T T = I where T = X + iy for two matrices x and y of dimension n n, i = 1 and T = x iy.thematrixt is called the complex conjugate of T. Result 14 (Schur Decomposition Theorem) Let A be an n n matrix. Then there exists aunitaryn n matrix T and an upper triangular matrix M whose diagonal elements are the 11

eigenvalues of A such that P AP = M. (*) Proof. Let us proceed by induction, beginning with the 2 2 case. Let λ 1 be a characteristic root of A and c 1 an associated characteristic vector normalized by the condition c 1 c 1 =1where c 1 is the complex conjugate transpose of c 1. Let T be a matrix whose first column is c 1 and whose second column is chosen so that T is unitary. Then evaluating the expression T 1 AT as the product T 1 (AT ), weseethat T 1 AT = λ1 b 12 b 22 The quantity b 22 must equal λ 2 since T 1 AT has the same characteristic roots as A. Let us now show that we can use the reduction for the N th -order matrices to demonstrate the result for (N +1)-dimensional matrices. As before, let c 1 be a normalized characteristic vector associated with λ 1,andletN other vectors a 1,a 2,..., a N, be chosen so that the matrix T 1, whose columns are c 1,a 1,a 2,..., a N, is unitary. Then, as for N =2, we have λ 1 b 1 12 b 1 1N+1 T1 1 AT 1 =. B N where B N is an N N matrix. Since the characteristic equation of the right-hand side is (λ 1 λ) B N λi = it follows that the characteristic roots of B N are λ 2, λ 3,..., λ N+1, the remaining N characteristic roots of A. The inductive hypothesis asserts that there exists a unitary matrix T N such that λ 2 c 12 c 1N TN 1 B λ N T N = 3 c 2N.... λ N+1 Let T N+1 be the unitary (N +1) (N +1)matrix formed as follows: 1 T N+1 =. T N 12

Consider the expression (T 1 T N+1 ) 1 A(T 1 T N+1 ) = TN+1 1 1 (T1 AT 1 )T N+1 λ 1 b 12 b 1,N+1 λ = 1.... λ N+1 The matrix T 1 T N+1 is thus the required unitary matrix which reduces A to semidiagonal form. Our next result shows that under certain conditions on the matrix A we can find transformations that reduce A to a diagonal matrix. In order to prove this result we first show that the eigenvalues of a real, symmetric matrix are also real valued. Result 15 The eigenvalues of a real symmetric matrix A are real. Proof. Let x be an eigenvalue of A and x = u + iv its associated eigenvector. Then A(u + iv) =λ(u + iv) and (u iv) A(u + iv) =λ(u iu) (u + iv) because of symmetry of A it follows that iv Au = u Aiv and thus u Au + v Av = λ(u u + v v) which implies that λ is real. If λ is real then x canbechosentoberealaswell. Result 16 Let A be a real symmetric matrix. Then there exists an orthogonal matrix S and a diagonal matrix Λ whose diagonal elements are the eigenvalues of A such that S AS = Λ. Proof. By the previous result the eigenvalues are real and the eigenvectors can be chosen real. By the Schur decomposition theorem there exists an upper diagonal matrix M such that S AS = M. By symmetry S AS = M such that M = M which proves that M is diagonal. From the previous result we also know that the diagonal elements of M are real. Thus, write M = Λ. Since AS = SΛ it follows that the rows of S contain the eigenvectors corresponding to Λ. But we know that the eigenvectors can be chosen real by the previous result. This establishes the claim. 13

Result 17 If A is idempotent then the eigenvalues of A are or 1. Proof. Ax = λx Ax = AAx = λax = λ 2 x so λ 2 = λ which implies λ =or λ =1. Result 18 If A is real symmetric and idempotent n n matrix of rank k<nthen S 1,S 2 n k and n n k s.t. S 1 S 2 =,S2 S 1 =,S1 1S 1 = I and S2 S 2 = I such that A = S 1 S 1. Proof. By the previous result we know that the eigenvalues of A are either or 1. Since A has rank k there must exist exactly n k linearly independent vectors x such that Ax =. This shows that A has n k zero eigenvalues with the remaining nonzero eigenvalues equal to 1.We know that S AS = Λ where S is an orthogonal matrix of eigenvectors and Λ is the corresponding matrix of eigenvalues. Now arrange the eigenvalues such that Ik Λ =. Then arrange the eigenvectors accordingly such that I k A =[S 1 S 2 ] S 1 S 2 = S 1 S1. 1 The square root of a real symmetric matrix A is defined as A 1/2 A 1/2 = A. Note that A 1/2 is not the same as taking the square root of all elements in A! We want to find an explicit representation of A 1/2. By the previous result we can diagonalize the matrix A such that S AS = Λ or A = SΛS = SΛ 1/2 Λ 1/2 S, wherenowweareallowedtodefine Λ 1/2 = λ 1/2 1... λ 1/2 n because Λ 1/2, so defined, satisfies Λ 1/2 Λ 1/2 = Λ because Λ is a diagonal matrix. Then define A 1/2 = SΛ 1/2 S. It can be checked that A 1/2 A 1/2 = SΛ 1/2 S SΛ 1/2 S = SΛ 1/2 Λ 1/2 S = A. Also note that A 1 =(SΛS ) 1 = SΛ 1 S such that A 1/2 = SΛ 1/2 S 1. 14

Result 19 If Σ is a symmetric real n n matrix then there exists a matrix A such that AA = Σ. Proof. Let Σ = SΛS. Then A = SΛ 1/2. Result 2 If A is a symmetric real n n matrix then A = Q i λ i where λ i are the eigenvalues of A. Proof. We diagonalize the matrix A such that A = SΛS = S Λ S = SS Λ = I Λ = Q i λ i. The last equality can be shown by using a column expansion of the determinant of Λ. Result 21 If A is a symmetric real n n matrix and if A is positive definite then λ i >. Proof. Let x be an eigenvector of A such that Ax = λx where x 6=.Thenx Ax = λ > by positive definiteness of A. The trace of a square matrix A is definedasthesumofallthediagonalelements.leta be a n n matrix. Then define nx tr A = a ii. Result 22 i) tr αa = α tr A for α scalar ii) tr A =tra iii) tr(a + B) =tra +trb where A and B are n n matrices iv) tr(ab) =tr(ba) if AB is square v) Let a be a n 1 vector then a a =tr(a a)=traa. vi) if A is a n n matrix then tr A = Σλ i. vii) if A is real symmetric and idempotent then tr A =ranka. Proof. Most of these results can be immediately verified with some algebra. We show vi): By the Schur decomposition theorem and iv) we can write tr A =trsms =trs SM = tr M = P i λ i. Next we show vii): From previous results we know that Ã! tr A = tr(sλs Ik S )=tr [S 1 S 2 ] 1 i=1 = tr(s 1 S 1)=trI k = k where k is the number of non-zero eigenvalues. S 2 15