A proof of the Jordan normal form theorem

The Jordan normal form theorem states that any square matrix (with entries in an algebraically closed field, for instance in C) is similar to a block-diagonal matrix with Jordan blocks on the diagonal. To prove it, we first reformulate it in the following way.

Jordan normal form theorem. For any finite-dimensional vector space V and any linear operator A : V → V, there exist

- a decomposition of V into a direct sum of invariant subspaces of A,

  V = V_1 ⊕ V_2 ⊕ ... ⊕ V_k;

- a basis e_1^{(i)}, ..., e_{n_i}^{(i)} of V_i for each i = 1, ..., k such that

  (A - λ_i Id) e_1^{(i)} = 0,  (A - λ_i Id) e_2^{(i)} = e_1^{(i)},  ...,  (A - λ_i Id) e_{n_i}^{(i)} = e_{n_i - 1}^{(i)}

  for some λ_i (which may coincide or be different for different i).

The dimensions of these subspaces and the coefficients λ_i are determined uniquely up to permutation.
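For instance, taking one summand of dimension n_i = 3 just as an illustration, the relations above say that in the basis e_1^{(i)}, e_2^{(i)}, e_3^{(i)} the operator A acts by

A e_1^{(i)} = λ_i e_1^{(i)},  A e_2^{(i)} = λ_i e_2^{(i)} + e_1^{(i)},  A e_3^{(i)} = λ_i e_3^{(i)} + e_2^{(i)},

that is, by the 3 × 3 Jordan block

J_3(λ_i) = ( λ_i  1    0   )
           ( 0    λ_i  1   )
           ( 0    0    λ_i ).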

Invariant subspaces, direct sums, and block matrices

Recall the following important definitions.

Definition 1. A subspace U of a vector space V is called an invariant subspace of a linear operator A : V → V if for any u ∈ U we have A(u) ∈ U.

Definition 2. For any two subspaces U_1 and U_2 of a vector space V, their sum U_1 + U_2 is defined as the set of all vectors u_1 + u_2, where u_1 ∈ U_1, u_2 ∈ U_2. If U_1 ∩ U_2 = {0}, then the sum of U_1 and U_2 is called direct and is denoted by U_1 ⊕ U_2.

Choose any decomposition of V into a direct sum of two subspaces U_1 and U_2. Note that if u_1 ∈ U_1 and u_2 ∈ U_2, then u_1 + u_2 = 0 implies u_1 = u_2 = 0. Indeed, one can rewrite it as u_1 = -u_2 and use the fact that U_1 ∩ U_2 = {0}. This means that any vector of the direct sum can be represented in the form u_1 + u_2, where u_1 ∈ U_1 and u_2 ∈ U_2, in a unique way. An immediate consequence of this fact is the following important formula:

dim(U_1 ⊕ U_2) = dim U_1 + dim U_2.

Indeed, it is clear that we can get a basis for the direct sum by joining together bases of the summands.

Fix a basis for U_1 and a basis for U_2. Joining them together, we get a basis for V = U_1 ⊕ U_2. When we write down the matrix of any linear operator with respect to this basis, we get a block matrix

( A_11  A_12 )
( A_21  A_22 );

its splitting into blocks corresponds to the way our basis is split into two parts. Let us formulate two important facts which are immediate from the definition of the matrix of a linear operator:

1. A_12 = 0 if and only if U_2 is invariant;
2. A_21 = 0 if and only if U_1 is invariant.

Thus, this matrix is block-triangular if and only if one of the subspaces is invariant, and is block-diagonal if and only if both subspaces are invariant. This admits an obvious generalisation to the case of a larger number of summands in the direct sum.

Generalised eigenspaces

The first step of the proof is to decompose our vector space into a direct sum of invariant subspaces on which our operator has only one eigenvalue.

Definition 3. Let

N_1(λ) = Ker(A - λ Id),  N_2(λ) = Ker(A - λ Id)^2,  ...,  N_m(λ) = Ker(A - λ Id)^m,  ...

Clearly,

N_1(λ) ⊆ N_2(λ) ⊆ ... ⊆ N_m(λ) ⊆ ...

Since we only work with finite-dimensional vector spaces, this sequence of subspaces cannot be strictly increasing forever: if N_i(λ) ⊊ N_{i+1}(λ), then, obviously, dim N_{i+1}(λ) ≥ 1 + dim N_i(λ). It follows that for some k we have N_k(λ) = N_{k+1}(λ).

Lemma 1. We have N_k(λ) = N_{k+1}(λ) = N_{k+2}(λ) = ...

Let us prove that N_{k+l}(λ) = N_{k+l-1}(λ) by induction on l. The induction basis (the case l = 1) follows immediately from the choice of k. Suppose that N_{k+l}(λ) = N_{k+l-1}(λ); let us prove that N_{k+l+1}(λ) = N_{k+l}(λ). If we assume this is false, then there is a vector v with v ∈ N_{k+l+1}(λ) and v ∉ N_{k+l}(λ), that is,

(A - λ Id)^{k+l+1}(v) = 0,  (A - λ Id)^{k+l}(v) ≠ 0.

Put w = (A - λ Id)(v). Obviously, we have

(A - λ Id)^{k+l}(w) = 0,  (A - λ Id)^{k+l-1}(w) ≠ 0,

so w ∈ N_{k+l}(λ) but w ∉ N_{k+l-1}(λ), which contradicts the induction hypothesis, and our statement follows.
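For example, if A is the single Jordan block J_3(λ) acting on a three-dimensional space with basis e_1, e_2, e_3 (so that (A - λ Id)e_1 = 0, (A - λ Id)e_2 = e_1, (A - λ Id)e_3 = e_2), then

N_1(λ) = span(e_1) ⊊ N_2(λ) = span(e_1, e_2) ⊊ N_3(λ) = N_4(λ) = ... = V,

so the chain strictly increases until it stabilises, here at k = 3.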

Lemma 2. Ker(A - λ Id)^k ∩ Im(A - λ Id)^k = {0}.

Indeed, assume there is a vector v ∈ Ker(A - λ Id)^k ∩ Im(A - λ Id)^k. This means that (A - λ Id)^k(v) = 0 and that there exists a vector w such that v = (A - λ Id)^k(w). It follows that (A - λ Id)^{2k}(w) = 0, so w ∈ Ker(A - λ Id)^{2k} = N_{2k}(λ). But from the previous lemma we know that N_{2k}(λ) = N_k(λ), so w ∈ Ker(A - λ Id)^k. Thus v = (A - λ Id)^k(w) = 0, which is what we need.

Lemma 3. V = Ker(A - λ Id)^k ⊕ Im(A - λ Id)^k.

Indeed, consider the sum of these two subspaces; by Lemma 2 it is direct, so it is a subspace of V of dimension dim Ker(A - λ Id)^k + dim Im(A - λ Id)^k. Let A' = (A - λ Id)^k. Earlier we proved that for any linear operator its rank and the dimension of its kernel sum up to the dimension of the vector space on which it acts. Since rk A' = dim Im A', we have

dim Ker(A - λ Id)^k + dim Im(A - λ Id)^k = dim Ker A' + rk A' = dim V.

A subspace of V whose dimension is equal to dim V has to coincide with V, so the lemma follows.

Lemma 4. Ker(A - λ Id)^k and Im(A - λ Id)^k are invariant subspaces of A.

Indeed, note that A(A - λ Id) = (A - λ Id)A, so if (A - λ Id)^k(v) = 0, then (A - λ Id)^k(A(v)) = A((A - λ Id)^k(v)) = 0; and if v = (A - λ Id)^k(w), then A(v) = A((A - λ Id)^k(w)) = (A - λ Id)^k(A(w)).

To complete this step, we use induction on dim V. Note that on the invariant subspace Ker(A - λ Id)^k the operator A has only one eigenvalue, namely λ (if Av = μv for some 0 ≠ v ∈ Ker(A - λ Id)^k, then (A - λ Id)v = (μ - λ)v, and 0 = (A - λ Id)^k v = (μ - λ)^k v, so μ = λ), and the dimension of Im(A - λ Id)^k is less than dim V (provided λ is an eigenvalue of A), so we can apply the induction hypothesis to A acting on the vector space V' = Im(A - λ Id)^k. This results in the following

Theorem. For any linear operator A : V → V whose (distinct) eigenvalues are λ_1, ..., λ_k, there exist integers n_1, ..., n_k such that

V = Ker(A - λ_1 Id)^{n_1} ⊕ ... ⊕ Ker(A - λ_k Id)^{n_k}.
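As a small illustration (the matrix below is chosen only as an example), let A be the operator on C^3 given by the matrix

( 2  1  0 )
( 0  2  0 )
( 0  0  3 ),

whose eigenvalues are 2 and 3. Then Ker(A - 2 Id) = span(e_1), Ker(A - 2 Id)^2 = Ker(A - 2 Id)^3 = ... = span(e_1, e_2), and Ker(A - 3 Id) = span(e_3), so indeed

C^3 = Ker(A - 2 Id)^2 ⊕ Ker(A - 3 Id),

in agreement with the theorem (with n_1 = 2, n_2 = 1).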

Normal form for a nilpotent operator

The second step in the proof is to establish the Jordan normal form theorem for the case of an operator B : V → V for which B^k = 0 (such operators are called nilpotent). This basically completes the proof: afterwards we put B = A - λ Id and use the result that we already obtained; we discuss this more precisely below.

Let us modify slightly the notation we used in the previous section; put

N_1 = Ker B,  N_2 = Ker B^2,  ...,  N_m = Ker B^m,  ...

We have N_k = N_{k+1} = N_{k+2} = ... = V. To make our proof neater, we shall use the following definition.

Definition 4. For a vector space V and a subspace U ⊆ V, we say that a sequence of vectors e_1, ..., e_l is a basis of V relative to U if any vector v ∈ V can be uniquely represented in the form

v = c_1 e_1 + c_2 e_2 + ... + c_l e_l + u,

where c_1, ..., c_l are coefficients and u ∈ U. In particular, the only linear combination of e_1, ..., e_l that belongs to U is the trivial one (all coefficients equal to zero).

Example 1. The usual notion of a basis is contained in the new notion of a relative basis: a usual basis of V is a basis relative to U = {0}.

Definition 5. We say that a sequence of vectors e_1, ..., e_l is linearly independent relative to U if the only linear combination of e_1, ..., e_l that belongs to U is the trivial one (all coefficients equal to zero).

Exercise 1. Any sequence of vectors that is linearly independent relative to U can be extended to a basis relative to U.

Now we are going to prove our statement, constructing the required basis in k steps. First, find a basis of V = N_k relative to N_{k-1}. Let e_1, ..., e_s be the vectors of this basis.

Lemma 5. The vectors e_1, ..., e_s, B(e_1), ..., B(e_s) are linearly independent relative to N_{k-2}.

Indeed, assume that

c_1 e_1 + ... + c_s e_s + d_1 B(e_1) + ... + d_s B(e_s) ∈ N_{k-2}.

Since e_i ∈ N_k, we have B(e_i) ∈ N_{k-1}, and N_{k-2} ⊆ N_{k-1}, so

c_1 e_1 + ... + c_s e_s ∈ -d_1 B(e_1) - ... - d_s B(e_s) + N_{k-2} ⊆ N_{k-1},

which means that c_1 = ... = c_s = 0 (e_1, ..., e_s form a basis relative to N_{k-1}). Thus

d_1 B(e_1) + ... + d_s B(e_s) = B(d_1 e_1 + ... + d_s e_s) ∈ N_{k-2},

so d_1 e_1 + ... + d_s e_s ∈ N_{k-1}, and we deduce that d_1 = ... = d_s = 0 (again because e_1, ..., e_s form a basis relative to N_{k-1}), so the lemma follows.

Now we extend this collection of vectors by vectors f_1, ..., f_t which together with B(e_1), ..., B(e_s) form a basis of N_{k-1} relative to N_{k-2}.
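To illustrate the first two steps, take (just as an example) the nilpotent operator B on a three-dimensional space with basis e_1, e_2, e_3 defined by B(e_2) = e_1, B(e_1) = B(e_3) = 0, so that B^2 = 0 and

N_1 = Ker B = span(e_1, e_3),  N_2 = V  (so here k = 2).

A basis of V = N_2 relative to N_1 consists of the single vector e_2 (so s = 1). Then B(e_2) = e_1 lies in N_1, and extending it by f_1 = e_3 gives a basis B(e_2), f_1 of N_1 relative to N_0 = {0}. The resulting vectors e_2, B(e_2), f_1 form a basis of V and will produce one Jordan block of size 2 and one of size 1.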

In exactly the same way as Lemma 5 one can prove

Lemma 6. The vectors e_1, ..., e_s, B(e_1), ..., B(e_s), B^2(e_1), ..., B^2(e_s), f_1, ..., f_t, B(f_1), ..., B(f_t) are linearly independent relative to N_{k-3}.

We continue this extension process until we end up with a usual basis of V of the following form:

e_1, ..., e_s, B(e_1), ..., B(e_s), ..., B^{k-1}(e_1), ..., B^{k-1}(e_s),
f_1, ..., f_t, B(f_1), ..., B(f_t), ..., B^{k-2}(f_1), ..., B^{k-2}(f_t),
...,
g_1, ..., g_u,

where the first line contains vectors from N_k, vectors from N_{k-1}, ..., vectors from N_1; the second line contains vectors from N_{k-1}, vectors from N_{k-2}, ..., vectors from N_1; and so on, the last line containing just vectors from N_1.

To get a Jordan basis from this basis, we just renumber the basis vectors. Note that the vectors

v_1 = B^{k-1}(e_1),  v_2 = B^{k-2}(e_1),  ...,  v_{k-1} = B(e_1),  v_k = e_1

form a thread (chain) of vectors for which B(v_1) = 0 and B(v_i) = v_{i-1} for i > 1, which are precisely the formulas for the action of a Jordan block matrix (with eigenvalue 0). Arranging all the basis vectors in chains like this, we obtain a Jordan basis.

Remark 1. Note that if we denote by m_d the number of Jordan blocks of size d, we have

m_1 + m_2 + ... + m_k = dim N_1,
m_2 + ... + m_k = dim N_2 - dim N_1,
...,
m_k = dim N_k - dim N_{k-1},

so the sizes of the Jordan blocks are uniquely determined by the properties of our operator.

General case of the Jordan normal form theorem

From the second section we know that V can be decomposed into a direct sum of invariant subspaces Ker(A - λ_i Id)^{n_i}. From the third section, changing the notation and putting B = A - λ_i Id, we deduce that each of these subspaces can be decomposed into a direct sum of subspaces on which A acts by a Jordan block matrix; the sizes of the blocks can be computed from the dimension data listed above. This completes the proof of the Jordan normal form theorem.
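The block counts in Remark 1 are easy to verify by computer. The following short script is a minimal sketch (assuming the SymPy library is available; the matrix A and all variable names are chosen here purely for illustration): it computes dim N_d = dim Ker(A - λ Id)^d for each eigenvalue λ via rank-nullity and recovers the number m_d of Jordan blocks of each size d, which can then be compared with SymPy's built-in jordan_form.

from sympy import Matrix, eye

# Example matrix (chosen only for illustration): blocks J_2(2), J_1(2) and J_2(5).
A = Matrix([
    [2, 1, 0, 0, 0],
    [0, 2, 0, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 0, 5, 1],
    [0, 0, 0, 0, 5],
])
n = A.rows

for lam in A.eigenvals():                # the distinct eigenvalues of A
    # dims[d] = dim N_d = dim Ker (A - lam*Id)^d, computed by rank-nullity; dims[0] = 0.
    dims = [0]
    d = 1
    while True:
        dims.append(n - ((A - lam * eye(n)) ** d).rank())
        if dims[d] == dims[d - 1]:       # the chain N_1 <= N_2 <= ... has stabilised
            break
        d += 1
    k = d - 1                            # the largest block size for this eigenvalue
    # Remark 1: dim N_d - dim N_{d-1} is the number of blocks of size >= d,
    # so m_d = (dim N_d - dim N_{d-1}) - (dim N_{d+1} - dim N_d).
    for size in range(1, k + 1):
        m_d = (dims[size] - dims[size - 1]) - (dims[size + 1] - dims[size])
        if m_d:
            print(f"eigenvalue {lam}: {m_d} Jordan block(s) of size {size}")

P, J = A.jordan_form()                   # cross-check against SymPy's Jordan form
print(J)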