MA20216: Algebra 2A. Notes by Fran Burstall


Corrections by: Callum Kemp, Carlos Galeano Rios, Kate Powell, Tobias Beith, Krunoslav Lehman Pavasovic, Dan Corbie, Phaidra Anastasiadou, Louise Hannon, Vlad Brebeanu, Lauren Godfrey, Elizabeth Crowley, James Green, Reuben Russell, Ross Trigoll, Emerald Dilworth, George Milton, Caitlin Ray

Contents

1 Linear algebra: concepts and examples
  1.1 Vector spaces
  1.2 Subspaces
  1.3 Bases
    1.3.1 Standard bases
  1.4 Linear maps
    1.4.1 Vector spaces of linear maps
    1.4.2 Extension by linearity
    1.4.3 The rank-nullity theorem
2 Sums, products and quotients
  2.1 Sums and products
    2.1.1 Sums of subspaces
    2.1.2 Internal direct sum (two summands)
    2.1.3 Internal direct sums (many summands)
    2.1.4 External direct sums and products
  2.2 Quotients
3 Inner product spaces
  3.1 Inner products
    3.1.1 Definition and examples
    3.1.2 Cauchy–Schwarz inequality
  3.2 Orthogonality
    3.2.1 Orthonormal bases
    3.2.2 Orthogonal complements and orthogonal projection
4 Linear operators on inner product spaces
  4.1 Linear operators and their adjoints
    4.1.1 Linear operators and matrices
    4.1.2 Adjoints
    4.1.3 Linear isometries
  4.2 The spectral theorem
    4.2.1 Eigenvalues and eigenvectors
    4.2.2 Invariant subspaces and adjoints
    4.2.3 The spectral theorem for normal operators
    4.2.4 The spectral theorem for real self-adjoint operators
    4.2.5 The spectral theorem for symmetric and Hermitian matrices
    4.2.6 Singular value decomposition
5 Duality
  5.1 Dual spaces
  5.2 Solution sets and annihilators
  5.3 Transposes
6 Bilinearity
  6.1 Bilinear maps
  6.2 Bilinear forms and quadratic forms
    6.2.1 Bilinear forms and matrices
    6.2.2 Symmetric bilinear forms
    6.2.3 Quadratic forms
    6.2.4 Classification of symmetric bilinear and quadratic forms
A Further results
  A.1 More on sums and products
    A.1.1 Sums, products and linear maps
    A.1.2 Infinite sums and products

Chapter 1

Linear algebra: concepts and examples

Let us warm up by revising some of the key ideas from Algebra 1B. Along the way, we will see some new examples and prove a couple of new results.

1.1 Vector spaces

Recall from Algebra 1B, 2.1:

Definition. A vector space V over a field F is a set V with two operations:

addition V × V → V : (v, w) ↦ v + w, with respect to which V is an abelian group:
- v + w = w + v, for all v, w ∈ V;
- u + (v + w) = (u + v) + w, for all u, v, w ∈ V;
- there is a zero element 0 ∈ V for which v + 0 = v = 0 + v, for all v ∈ V;
- each element v ∈ V has an additive inverse −v ∈ V for which v + (−v) = 0 = (−v) + v.

scalar multiplication F × V → V : (λ, v) ↦ λv such that
- (λ + µ)v = λv + µv, for all v ∈ V, λ, µ ∈ F;
- λ(v + w) = λv + λw, for all v, w ∈ V, λ ∈ F;
- (λµ)v = λ(µv), for all v ∈ V, λ, µ ∈ F;
- 1v = v, for all v ∈ V.

We call the elements of F scalars and those of V vectors.

Examples.
1. Take V = F, the field itself, with addition and scalar multiplication the field addition and multiplication.
2. Fⁿ, the n-fold Cartesian product of F with itself, with component-wise addition and scalar multiplication:
  (λ₁, ..., λₙ) + (µ₁, ..., µₙ) := (λ₁ + µ₁, ..., λₙ + µₙ)
  λ(λ₁, ..., λₙ) := (λλ₁, ..., λλₙ).

3. Let Mₘ×ₙ(F) denote the set of m by n matrices (thus m rows and n columns) with entries in F. This is a vector space under entry-wise addition and scalar multiplication. Special cases are the vector spaces of column vectors Mₙ×₁(F) and row vectors M₁×ₙ(F). In computations, we often identify Fⁿ with Mₙ×₁(F) by associating x = (x₁, ..., xₙ) ∈ Fⁿ with the column vector with entries x₁, ..., xₙ.
4. Here is a very general example: let I be any set and V a vector space. Recall that V^I denotes the set {f : I → V} of all maps from I to V. I claim that V^I is a vector space under pointwise addition and scalar multiplication. That is, for f, g : I → V and λ ∈ F, we define
  (f + g)(i) := f(i) + g(i)
  (λf)(i) := λ(f(i)),
for all i ∈ I. The zero element is just the constant zero function, 0(i) := 0, and the additive inverses are defined pointwise also: (−f)(i) := −(f(i)).

Exercise.¹ Prove the claim! That is, show that V^I is a vector space under pointwise addition and scalar multiplication.

Remark. For suitable I, this last example captures many familiar vector spaces. For example:
- We identify Fⁿ with F^{1,...,n} by associating (x₁, ..., xₙ) ∈ Fⁿ with the map (i ↦ xᵢ).
- Similarly, we identify Mₘ×ₙ(F) with F^({1,...,m}×{1,...,n}) by associating the matrix A with the map (i, j) ↦ Aᵢⱼ.
- R^N is the set of real sequences {(aₙ)_{n∈N} : aₙ ∈ R} that played such a starring role in Analysis 1.

1.2 Subspaces

Definition. A vector (or linear) subspace of a vector space V over F is a non-empty subset U ⊆ V which is closed under addition and scalar multiplication: whenever u, u₁, u₂ ∈ U and λ ∈ F, then u₁ + u₂ ∈ U and λu ∈ U. In this case, we write U ≤ V. Say that U is trivial if U = {0} and proper if U ≠ V.

Of course, U is now a vector space in its own right using the addition and scalar multiplication of V.

Examples. A good way to see that something is a vector space is to see that it is a subspace of some V^I. That way, there is no need to verify all the tedious axioms (associativity, distributivity and so on).
1. The set c := {real convergent sequences} ≤ R^N and so is a vector space. This is part of the content of the Algebra of Limits Theorem in Analysis 1.

¹ Question 4 on sheet 1.

2. Let [a, b] ⊆ R be an interval and set
  C⁰[a, b] := {f : [a, b] → R | f is continuous},
the set of continuous functions. Then C⁰[a, b] ≤ R^[a,b]. This is most of the Algebra of Continuous Functions Theorem from Analysis 1.

1.3 Bases

Definitions. Let v₁, ..., vₙ be a list of vectors in a vector space V.
1. The span of v₁, ..., vₙ is span{v₁, ..., vₙ} := {λ₁v₁ + ··· + λₙvₙ | λᵢ ∈ F, 1 ≤ i ≤ n} ≤ V.
2. v₁, ..., vₙ span V (or are a spanning list for V) if span{v₁, ..., vₙ} = V.
3. v₁, ..., vₙ are linearly independent if, whenever λ₁v₁ + ··· + λₙvₙ = 0, then each λᵢ = 0, 1 ≤ i ≤ n, and linearly dependent otherwise.
4. v₁, ..., vₙ is a basis for V if they are linearly independent and span V.

Definition. A vector space is finite-dimensional if it admits a finite list of vectors as basis and infinite-dimensional otherwise. If V is finite-dimensional, the dimension of V, dim V, is the number of vectors in a (any) basis of V.

Here is a slightly different take on bases which can be helpful in practice:

Proposition 1.1. v₁, ..., vₙ is a basis for V if and only if any v ∈ V can be written in the form
  v = λ₁v₁ + ··· + λₙvₙ    (1.1)
for unique λ₁, ..., λₙ ∈ F.

Proof. First suppose that v₁, ..., vₙ is a basis and so spans. Then, for v ∈ V, we can find some λ₁, ..., λₙ ∈ F for which (1.1) holds. For uniqueness, suppose that v = Σᵢ λᵢvᵢ = Σᵢ µᵢvᵢ. Then
  0 = v − v = (λ₁ − µ₁)v₁ + ··· + (λₙ − µₙ)vₙ
and the linear independence of the vᵢ now forces each λᵢ = µᵢ.
Conversely, if the vᵢ have the unique-linear-combination property, they clearly span. As for linear independence, suppose that λ₁v₁ + ··· + λₙvₙ = 0. Since we also have 0 = 0v₁ + ··· + 0vₙ, the uniqueness tells us that each λᵢ = 0.

A very useful fact about bases that we shall use many times was proved in Algebra 1B:

Proposition 1.2 (Algebra 1B, Chapter 3, Theorem 7(b)). Any linearly independent list of vectors in a finite-dimensional vector space can be extended to a basis. In particular, any basis of a subspace U ≤ V extends to a basis of V.

Consequently:

Lemma 1.3. Let V be a finite-dimensional vector space and U ≤ V. Then
  dim U ≤ dim V
with equality if and only if U = V.

Proof. A basis of U contains no more vectors than one of V (extend it to a basis of V by Proposition 1.2), whence the inequality. In the case of equality, a basis of U is already a maximal linearly independent list of vectors in V and so must be a basis of V. Thus U = V.
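Proposition 1.1 has a direct computational reading: for V = Fⁿ, finding the coordinates of v with respect to a basis amounts to solving a single linear system. Here is a minimal NumPy sketch of this, assuming real scalars; the example data is ours, not from the notes.

import numpy as np

# Columns of P: a candidate basis v_1, v_2, v_3 of R^3.
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

# The columns form a basis iff P is invertible; then every v has unique
# coordinates lam with v = lam_1 v_1 + lam_2 v_2 + lam_3 v_3, i.e. P @ lam = v.
v = np.array([2.0, 3.0, 1.0])
lam = np.linalg.solve(P, v)      # the unique coordinates of Proposition 1.1
assert np.allclose(P @ lam, v)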

1.3.1 Standard bases

In general, finite-dimensional vector spaces have many bases and there is no good reason to prefer any particular one. However, some lucky vector spaces come equipped with a natural basis.

Proposition 1.4. For I a set and i ∈ I, define eᵢ ∈ F^I by
  eᵢ(j) = 1 if i = j, and eᵢ(j) = 0 if i ≠ j,
for all j ∈ I. If I is finite then (eᵢ)_{i∈I} is a basis, called the standard basis, of F^I. In particular, dim F^I = |I|.

Proof. The key observation is that there is a unique way to write v ∈ F^I as a linear combination of the eᵢ:
  v = Σ_{i∈I} v(i)eᵢ.
Indeed, for j ∈ I,
  (Σ_{i∈I} v(i)eᵢ)(j) = Σ_{i∈I} v(i)eᵢ(j) = Σ_{i≠j} v(i)·0 + v(j)·1 = v(j).
Thus Proposition 1.1 applies to show that the (eᵢ)_{i∈I} are a basis.

Examples. Identify Fⁿ with F^{1,...,n} and then eᵢ = (0, ..., 1, ..., 0) with a single 1 in the i-th place. Similarly, the vector space of column vectors has a standard basis with eᵢ the column vector with a single 1 in the i-th row. Finally, identifying Mₘ×ₙ(F) with F^({1,...,m}×{1,...,n}) yields the standard basis (e₍ᵢ,ⱼ₎)_{i,j} of Mₘ×ₙ(F), where e₍ᵢ,ⱼ₎ differs from the zero matrix by a single 1 in the i-th row and j-th column.

1.4 Linear maps

Definitions. A map φ : V → W of vector spaces over F is a linear map (or, in older books, linear transformation) if
  φ(v + w) = φ(v) + φ(w)
  φ(λv) = λφ(v),
for all v, w ∈ V, λ ∈ F.
The kernel of φ is ker φ := {v ∈ V | φ(v) = 0} ≤ V.
The image of φ is im φ := {φ(v) | v ∈ V} ≤ W.

Remark. φ is linear if and only if φ(v + λw) = φ(v) + λφ(w), for all v, w ∈ V, λ ∈ F, which has the virtue of being only one thing to prove.

Examples.
1. A ∈ Mₘ×ₙ(F) determines a linear map φ_A : Fⁿ → Fᵐ by φ_A(x) = y where, for 1 ≤ i ≤ m,
  yᵢ = Σ_{j=1}^n Aᵢⱼxⱼ.
Otherwise said, y is given by matrix multiplication: y = Ax.
2. For any vector space V, the identity map id_V : V → V is linear.
3. If φ : V → W and ψ : W → U are linear then so is ψ ∘ φ : V → U.
4. Recall that c is the vector space of convergent sequences. The map lim : c → R, (aₙ)_{n∈N} ↦ limₙ aₙ, is linear thanks to the Algebra of Limits Theorem in Analysis 1.
5. Integration, f ↦ ∫ₐᵇ f : C⁰[a, b] → R, is also linear.

Definition. A linear map φ : V → W is a (linear) isomorphism if there is a linear map ψ : W → V such that
  ψ ∘ φ = id_V,  φ ∘ ψ = id_W.
If there is an isomorphism V → W, say that V and W are isomorphic and write V ≅ W.

In Algebra 1B, we saw:

Lemma 1.5. φ : V → W is an isomorphism if and only if φ is a linear bijection (and then ψ = φ⁻¹).

1.4.1 Vector spaces of linear maps

Notation. For vector spaces V, W over F, denote by L_F(V, W) (or simply L(V, W)) the set {φ : V → W | φ is linear} of linear maps from V to W.

Theorem 1.6 (Linearity is a linear condition). L(V, W) is a vector space under pointwise addition and scalar multiplication. Otherwise said, L(V, W) ≤ W^V.

Proof. It is enough to show that L(V, W) is a vector subspace of W^V, that is, is non-empty and closed under addition and scalar multiplication.
First observe that the zero map 0 : v ↦ 0_W is linear: 0(v + λw) = 0 = 0 + λ0 = 0(v) + λ0(w). In particular, L(V, W) is non-empty.
Now let φ, ψ ∈ L(V, W) and show that φ + ψ is linear:
  (φ + ψ)(v + λw) = φ(v + λw) + ψ(v + λw)
    = φ(v) + λφ(w) + ψ(v) + λψ(w)
    = (φ(v) + ψ(v)) + λ(φ(w) + ψ(w))
    = (φ + ψ)(v) + λ(φ + ψ)(w),
for all v, w ∈ V, λ ∈ F. Here the first and last equalities are just the definition of pointwise addition while the middle equalities come from the linearity of φ, ψ and the vector space axioms of W.
Similarly, it is a simple exercise to see that if µ ∈ F and φ ∈ L(V, W) then µφ is also linear.

1.4.2 Extension by linearity

A linear map of a finite-dimensional vector space is completely determined by its action on a basis. More precisely:

Proposition 1.7 (Extension by linearity). Let V, W be vector spaces over F. Let v₁, ..., vₙ be a basis of V and w₁, ..., wₙ any vectors in W. Then there is a unique φ ∈ L(V, W) such that
  φ(vᵢ) = wᵢ, 1 ≤ i ≤ n.    (1.2)

Proof. We need to prove that such a φ exists and that there is only one. We prove existence first. Let v ∈ V. By Proposition 1.1, we know there are unique λ₁, ..., λₙ ∈ F for which v = λ₁v₁ + ··· + λₙvₙ and so we define φ(v) to be the only thing it could be:
  φ(v) := λ₁w₁ + ··· + λₙwₙ.
Let us show that this φ does the job. First, with λᵢ = 1 and λⱼ = 0, for i ≠ j, we see that φ(vᵢ) = Σ_{j≠i} 0wⱼ + 1wᵢ = wᵢ so that (1.2) holds. Now let us see that φ is linear: let v, w ∈ V with
  v = λ₁v₁ + ··· + λₙvₙ,  w = µ₁v₁ + ··· + µₙvₙ.
Then, for λ ∈ F,
  v + λw = (λ₁ + λµ₁)v₁ + ··· + (λₙ + λµₙ)vₙ
whence
  φ(v + λw) = (λ₁ + λµ₁)w₁ + ··· + (λₙ + λµₙ)wₙ
    = (λ₁w₁ + ··· + λₙwₙ) + λ(µ₁w₁ + ··· + µₙwₙ)
    = φ(v) + λφ(w).
For uniqueness, suppose that φ, φ′ ∈ L(V, W) both satisfy (1.2). Let v ∈ V and write v = λ₁v₁ + ··· + λₙvₙ. Then
  φ(v) = λ₁φ(v₁) + ··· + λₙφ(vₙ) = λ₁w₁ + ··· + λₙwₙ = λ₁φ′(v₁) + ··· + λₙφ′(vₙ) = φ′(v),
where the first and last equalities come from the linearity of φ, φ′ and the middle two from (1.2) for first φ and then φ′. We conclude that φ = φ′ and we are done.

Remark. In the context of Proposition 1.7, φ is an isomorphism if and only if w₁, ..., wₙ is a basis for W (exercise²!).

Here is an application which gives us another way to think about bases: we can view them as linear isomorphisms from Fⁿ. Let 𝓑 : v₁, ..., vₙ be a basis for V. Then Proposition 1.7 gives us a linear isomorphism φ_𝓑 : Fⁿ → V such that
  φ_𝓑(eᵢ) = vᵢ, 1 ≤ i ≤ n,    (1.3)
that is, φ_𝓑(x) = Σᵢ xᵢvᵢ. Conversely, any linear isomorphism φ : Fⁿ → V defines a unique basis via (1.3).

² This is question 2 on exercise sheet 2.
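Extension by linearity is also how one builds such a map in practice: for V = W = Fⁿ in standard coordinates, if the basis vectors vᵢ are the columns of an invertible matrix P and the targets wᵢ are the columns of W, then the unique φ of Proposition 1.7 has matrix WP⁻¹, since P⁻¹x recovers the coordinates λ of x. A small sketch, with invented example data and matrix names of our own choosing:

import numpy as np

# Basis v_1, v_2 of R^2 as the columns of P; arbitrary targets
# w_1, w_2 in R^3 as the columns of W.
P = np.array([[1.0, 1.0],
              [0.0, 2.0]])
W = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])

# The unique linear phi with phi(v_i) = w_i has matrix A = W P^{-1}:
# x has coordinates lam = P^{-1} x, and then phi(x) = W lam.
A = W @ np.linalg.inv(P)

assert np.allclose(A @ P[:, 0], W[:, 0])   # phi(v_1) = w_1
assert np.allclose(A @ P[:, 1], W[:, 1])   # phi(v_2) = w_2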

1.4.3 The rank-nullity theorem

Easily the most important result in Algebra 1B is the famous rank-nullity theorem:

Theorem 1.8 (Rank-nullity). Let φ : V → W be linear with V finite-dimensional. Then
  dim im φ + dim ker φ = dim V.

Using this, together with the observation that φ is injective if and only if ker φ = {0}, we saw in Algebra 1B:

Proposition 1.9. Let φ : V → W be linear with V, W finite-dimensional vector spaces of the same dimension: dim V = dim W. Then the following are equivalent:
1. φ is injective.
2. φ is surjective.
3. φ is an isomorphism.

Remark. Proposition 1.9 is flat-out false for infinite-dimensional V, W. For example, let S : R^N → R^N be the shift operator:
  S((a₀, a₁, ...)) := (a₁, ...).
We readily check that: S is linear; S surjects; S is not injective. For example, S((1, 0, 0, ...)) = 0.
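Theorem 1.8 is easy to check numerically: for the matrix A of a map Fⁿ → Fᵐ, dim im φ_A is the rank of A and dim ker φ_A is the nullity. The following sketch (example data ours; real scalars assumed) also exhibits a basis of the kernel via the singular value decomposition.

import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 6)).astype(float)   # a linear map R^6 -> R^4

rank = np.linalg.matrix_rank(A)      # dim im(phi_A)
nullity = A.shape[1] - rank          # dim ker(phi_A), by rank-nullity

# Right-singular vectors with zero singular value span the kernel.
_, s, Vt = np.linalg.svd(A)
kernel_basis = Vt[rank:]             # a (nullity x 6) array
assert np.allclose(A @ kernel_basis.T, 0)
assert rank + nullity == A.shape[1]  # Theorem 1.8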

Chapter 2

Sums, products and quotients

We will discuss various ways of building new vector spaces out of old ones.

Convention. In this chapter, all vector spaces are over the same field F unless we say otherwise.

2.1 Sums and products

2.1.1 Sums of subspaces

Definition. Let V₁, ..., Vₖ ≤ V. The sum V₁ + ··· + Vₖ is the set
  V₁ + ··· + Vₖ := {v₁ + ··· + vₖ | vᵢ ∈ Vᵢ, 1 ≤ i ≤ k}.

V₁ + ··· + Vₖ is the smallest subspace of V that contains each Vᵢ. More precisely:

Proposition 2.1. Let V₁, ..., Vₖ ≤ V. Then
(1) V₁ + ··· + Vₖ ≤ V.
(2) If W ≤ V and V₁, ..., Vₖ ≤ W then V₁, ..., Vₖ ≤ V₁ + ··· + Vₖ ≤ W.

Proof. It suffices to prove (2) since (1) then follows by taking W = V.
For (2), first note that V₁ + ··· + Vₖ is a subset of W: if vᵢ ∈ Vᵢ then vᵢ ∈ W so that v₁ + ··· + vₖ ∈ W since W is closed under addition.
Now observe that each Vᵢ ⊆ V₁ + ··· + Vₖ since we can write any vᵢ ∈ Vᵢ as 0 + ··· + vᵢ + ··· + 0 ∈ V₁ + ··· + Vₖ. In particular, V₁ + ··· + Vₖ is non-empty.
Finally, we show that V₁ + ··· + Vₖ is closed under addition and scalar multiplication. If v₁ + ··· + vₖ, w₁ + ··· + wₖ ∈ V₁ + ··· + Vₖ, with vᵢ, wᵢ ∈ Vᵢ for all i, then
  (v₁ + ··· + vₖ) + (w₁ + ··· + wₖ) = (v₁ + w₁) + ··· + (vₖ + wₖ) ∈ V₁ + ··· + Vₖ
since each vᵢ + wᵢ ∈ Vᵢ. Again, for λ ∈ F,
  λ(v₁ + ··· + vₖ) = λv₁ + ··· + λvₖ ∈ V₁ + ··· + Vₖ,
since λvᵢ ∈ Vᵢ.

2.1.2 Internal direct sum (two summands)

Here is an important special case of the sum construction.

Definition. Let V₁, V₂ ≤ V. V is the (internal) direct sum of V₁ and V₂ if
(a) V = V₁ + V₂;
(b) V₁ ∩ V₂ = {0}.
In this case, write V = V₁ ⊕ V₂ and say that V₂ is a complement of V₁ (and V₁ is a complement of V₂!).

[Figure 2.1: R³ as a direct sum of a line and a plane.]

An alternative take:

Proposition 2.2. For V₁, V₂ ≤ V, the following are equivalent:
(1) V = V₁ + V₂ and V₁ ∩ V₂ = {0}.
(2) Each v ∈ V can be written
  v = v₁ + v₂,
for unique vᵢ ∈ Vᵢ, i = 1, 2.

Proof. We show (1) implies (2) first. Let v ∈ V. Since V = V₁ + V₂, there are vᵢ ∈ Vᵢ, i = 1, 2, with v = v₁ + v₂. For the uniqueness, if v = v₁′ + v₂′ also with vᵢ′ ∈ Vᵢ then
  0 = v − v = (v₁ − v₁′) + (v₂ − v₂′)
yields v₁ − v₁′ = v₂′ − v₂ ∈ V₁ ∩ V₂ = {0}. Thus vᵢ = vᵢ′, i = 1, 2.
Now suppose (2) holds and prove (1): clearly we have V = V₁ + V₂. If v ∈ V₁ ∩ V₂ then we can write v = v₁ + 0 = 0 + v₂, with v₁ = v₂ = v. The uniqueness part of (2) now gives v₁ = v₂ = 0, that is, v = 0.

The situation is illustrated in Figure 2.2.

Dimensions add in direct sums:

Proposition 2.3. Let V = V₁ ⊕ V₂ with V finite-dimensional. Then
  dim V = dim V₁ + dim V₂.

Proof. Let v₁, ..., vₖ be a basis for V₁ and w₁, ..., wₘ be a basis for V₂. It suffices to show that v₁, ..., vₖ, w₁, ..., wₘ is a basis for V.

[Figure 2.2: R² = V₁ ⊕ V₂.]

For this, let v ∈ V. By Proposition 2.2, we have unique v′ ∈ V₁, v′′ ∈ V₂ for which v = v′ + v′′ while, by Proposition 1.1, there are unique scalars λᵢ, µⱼ ∈ F such that
  v′ = λ₁v₁ + ··· + λₖvₖ,  v′′ = µ₁w₁ + ··· + µₘwₘ.
We conclude that
  v = λ₁v₁ + ··· + λₖvₖ + µ₁w₁ + ··· + µₘwₘ,
for unique λᵢ, µⱼ ∈ F so that, by Proposition 1.1, v₁, ..., vₖ, w₁, ..., wₘ is a basis as required.

For finite-dimensional vector spaces, any subspace has a complement:

Proposition 2.4 (Complements exist). Let U ≤ V, a finite-dimensional vector space. Then there is a complement to U.

Proof. Let v₁, ..., vₖ be a basis for U and so a linearly independent list of vectors in V. By Proposition 1.2, we can extend the list to get a basis v₁, ..., vₙ of V. Set W = span{vₖ₊₁, ..., vₙ}: this is a complement to U.
Indeed, V = U + W since any v ∈ V can be written
  v = λ₁v₁ + ··· + λₙvₙ = (λ₁v₁ + ··· + λₖvₖ) + (λₖ₊₁vₖ₊₁ + ··· + λₙvₙ) ∈ U + W.
Further, if v ∈ U ∩ W we can write
  v = λ₁v₁ + ··· + λₖvₖ + 0vₖ₊₁ + ··· + 0vₙ = 0v₁ + ··· + 0vₖ + λₖ₊₁vₖ₊₁ + ··· + λₙvₙ
and uniqueness in Proposition 1.1 tells us that each λᵢ = 0 so that v = 0.

In fact, as Figure 2.3 illustrates, there are many complements to a given subspace.

[Figure 2.3: Each dashed line is a complement to the undashed subspace.]
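The proof of Proposition 2.4 is algorithmic: extend a basis of U greedily by standard basis vectors and take the span of the new vectors. A NumPy sketch of this recipe, with example data of our own:

import numpy as np

# U = span of the columns of B_U inside R^3.
B_U = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [0.0, 1.0]])

# Extend to a basis of R^3 (Propositions 1.2 and 2.4): append the standard
# basis and keep each column that increases the rank.
candidates = np.hstack([B_U, np.eye(3)])
basis_cols = []
for j in range(candidates.shape[1]):
    trial = basis_cols + [candidates[:, j]]
    if np.linalg.matrix_rank(np.column_stack(trial)) == len(trial):
        basis_cols.append(candidates[:, j])

P = np.column_stack(basis_cols)    # a basis of R^3 extending that of U
W = P[:, B_U.shape[1]:]            # the complement: span of the new columns

# Check R^3 = U (+) W: the stacked bases together have full rank 3, so
# dimensions add and the intersection is trivial.
assert np.linalg.matrix_rank(np.hstack([B_U, W])) == 3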

2.1.3 Internal direct sums (many summands)

We can have more than two summands in the direct sum construction. This is how the conditions of Proposition 2.2 generalise:

Proposition 2.5. Let V₁, ..., Vₖ ≤ V, k ≥ 2. Then the following are equivalent:
(1) V = V₁ + ··· + Vₖ and, for each 1 ≤ j ≤ k, Vⱼ ∩ (Σ_{i≠j} Vᵢ) = {0}.
(2) Any v ∈ V can be written v = v₁ + ··· + vₖ for unique vᵢ ∈ Vᵢ, 1 ≤ i ≤ k.

Proof. This is an exercise in imitating the proof of Proposition 2.2.

Definition. Let V₁, ..., Vₖ ≤ V. Say that V is the (internal) direct sum of the Vᵢ if either condition of Proposition 2.5 holds. In this case, write V = V₁ ⊕ ··· ⊕ Vₖ.

Remark. The condition on intersections in Proposition 2.5(1) is much more stringent than simply asking that each Vᵢ ∩ Vⱼ = {0}: the latter is simply not enough when k > 2.

2.1.4 External direct sums and products

There is a similar and very closely related construction where the Vᵢ are arbitrary vector spaces rather than subspaces of a fixed vector space V. For this, recall the Cartesian product of sets X₁, ..., Xₖ: this is
  X₁ × ··· × Xₖ := {(x₁, ..., xₖ) | xᵢ ∈ Xᵢ, 1 ≤ i ≤ k}.

The Cartesian product of vector spaces is a vector space under component-wise addition and scalar multiplication:

Theorem 2.6. Let V₁, ..., Vₖ be vector spaces over a field F. Then the Cartesian product V₁ × ··· × Vₖ is a vector space over F under component-wise addition and scalar multiplication:
  (v₁, ..., vₖ) + (w₁, ..., wₖ) = (v₁ + w₁, ..., vₖ + wₖ)
  λ(v₁, ..., vₖ) = (λv₁, ..., λvₖ).
The zero element is (0, ..., 0), where the zero in the i-th slot is the zero element of Vᵢ. Similarly, −(v₁, ..., vₖ) = (−v₁, ..., −vₖ).

Proof. This is a straightforward exercise: the vector space axioms for the product come by applying those of the factors Vᵢ to the components.

Definition. Let V₁, ..., Vₖ be vector spaces over a field F. The direct product or external direct sum of the Vᵢ is the Cartesian product of the Vᵢ equipped with the vector space structure of component-wise addition and scalar multiplication. This space is denoted V₁ × ··· × Vₖ or V₁ ⊕ ··· ⊕ Vₖ.

Remark. The latter notation is a bit confusing since we are already using it for the internal direct sum of subspaces. However, we are about to see that internal and external direct sums are essentially the same.

Dimensions add in direct products too:

Proposition 2.7. Let V₁, ..., Vₖ be finite-dimensional vector spaces. Then V₁ × ··· × Vₖ is also finite-dimensional and
  dim V₁ × ··· × Vₖ = dim V₁ + ··· + dim Vₖ.

Proof. We induct on k. For k = 1, there is nothing to prove. For the induction step, suppose that the formula holds for products with k − 1 factors. Now consider the map p : V₁ × ··· × Vₖ → V₁ given by p(v₁, ..., vₖ) = v₁. This is plainly linear with im p = V₁ and
  ker p = {0} × V₂ × ··· × Vₖ ≅ V₂ × ··· × Vₖ.
Thus, by the induction hypothesis,
  dim ker p = dim V₂ + ··· + dim Vₖ,
which, together with the rank-nullity theorem, yields dim V₁ × ··· × Vₖ = dim V₁ + ··· + dim Vₖ.

Remark. Another, more tedious, approach is to build a basis for the product out of bases for the Vᵢ: if v⁽ⁱ⁾₁, ..., v⁽ⁱ⁾ₙ₍ᵢ₎ is a basis for Vᵢ, we define n(i) elements of V₁ × ··· × Vₖ by setting
  v̂⁽ⁱ⁾ⱼ := (0, ..., v⁽ⁱ⁾ⱼ, ..., 0),
where all components but the i-th are zero. Then the collection of all v̂⁽ⁱ⁾ⱼ, 1 ≤ j ≤ n(i), 1 ≤ i ≤ k, can be shown to be a basis of V₁ × ··· × Vₖ.

We can now see the relation between internal and external direct sums: they are isomorphic in a natural way.

Theorem 2.8. Let V₁, ..., Vₖ ≤ V. Then V = V₁ ⊕ ··· ⊕ Vₖ (internal direct sum) if and only if the linear map Γ : V₁ × ··· × Vₖ → V given by
  Γ(v₁, ..., vₖ) = v₁ + ··· + vₖ
is an isomorphism.

Proof. Clearly Γ surjects exactly when V = V₁ + ··· + Vₖ. Moreover, Γ is injective if and only if, whenever v₁ + ··· + vₖ = w₁ + ··· + wₖ, with each vᵢ, wᵢ ∈ Vᵢ, then vᵢ = wᵢ, for all 1 ≤ i ≤ k. Otherwise said, Γ is bijective if and only if the condition of Proposition 2.5(2) holds, that is, when V = V₁ ⊕ ··· ⊕ Vₖ.

Corollary 2.9. If V = V₁ ⊕ ··· ⊕ Vₖ is an internal direct sum of finite-dimensional subspaces then
  dim V = dim V₁ + ··· + dim Vₖ.

Proof. By Theorem 2.8, we know that V ≅ V₁ × ··· × Vₖ and so we can apply Proposition 2.7.

Remark. On the other hand, we may view any direct product V₁ × ··· × Vₖ as an internal direct sum: define a subspace V̂ᵢ of V₁ × ··· × Vₖ by setting all components to zero except the i-th. Thus
  V̂ᵢ = {(0, ..., vᵢ, ..., 0) | vᵢ ∈ Vᵢ} ≤ V₁ × ··· × Vₖ.
Then each Vᵢ ≅ V̂ᵢ and V₁ × ··· × Vₖ = V̂₁ ⊕ ··· ⊕ V̂ₖ (internal direct sum).

Example. Let us compare R³ × R² with R⁵.
1. Is it true that R³ × R² = R⁵? No: elements of R⁵ are lists of five numbers (x₁, ..., x₅) while elements of R³ × R² are pairs of lists of numbers ((x₁, ..., x₃), (y₁, y₂)). However:
2. R³ × R² ≅ R⁵: we identify ((x₁, ..., x₃), (y₁, y₂)) with (x₁, ..., x₃, y₁, y₂).

3. Moreover, as in the previous remark, we can set
  R̂³ := {(x₁, x₂, x₃, 0, 0) | xᵢ ∈ R} ≅ R³
  R̂² := {(0, 0, 0, y₁, y₂) | yᵢ ∈ R} ≅ R²
and then R⁵ = R̂³ ⊕ R̂² (internal direct sum).
4. Many people see very little difference between R³ and R̂³ et cetera and simply write R⁵ = R³ × R² = R³ ⊕ R². While we may sympathise, we should remember in some far corner of our minds that these vector spaces are not quite identical.

There is more we could say about sums and products: in particular, one can define the direct sum and product of an infinite number of vector subspaces. However, in that case, the direct product is quite different to the direct sum. You can read about this in the very non-examinable Appendix A.1.

2.2 Quotients

Let U ≤ V. We construct a new vector space from U and V which is an abstract complement to U. The elements of this vector space are equivalence classes for the following equivalence relation:

Definition. Let U ≤ V. Say that v, w ∈ V are congruent modulo U if v − w ∈ U. In this case, we write v ≡ w mod U.

Lemma 2.10. Congruence modulo U is an equivalence relation.

Proof. Exercise¹!

Thus each v ∈ V lies in exactly one equivalence class [v] ⊆ V. What do these equivalence classes look like? Note that w ≡ v mod U if and only if w − v ∈ U or, equivalently, w = v + u, for some u ∈ U.

Definition. For v ∈ V, U ≤ V, the set v + U := {v + u | u ∈ U} ⊆ V is called a coset of U and v is called a coset representative of v + U.

We conclude that the equivalence class of v modulo U is the coset v + U.

[Figure 2.4: A subspace U ≤ R² and a coset v + U.]

Remark. In geometry, cosets of vector subspaces are called affine subspaces. Examples include lines in R² and lines and planes in R³, irrespective of whether they contain zero (as vector subspaces must).

¹ This is question 1 on exercise sheet 3.

Examples.
1. Fibres of a linear map: let φ : V → W be a linear map and let w = φ(v) ∈ im φ. Then v′ ∈ φ⁻¹{w} if and only if φ(v′) = φ(v) or, equivalently, φ(v − v′) = 0, that is, v − v′ ∈ ker φ. Thus
  φ⁻¹{w} = v + ker φ.
We shall see below that any coset arises this way for a suitable φ.
2. General solutions of inhomogeneous equations: here is a concrete version of the previous example. Consider the matrix
  B = ( 1   3  −2
        1  −4   1 )
and the corresponding linear map φ_B : R³ → R². Let us seek the general solution to the inhomogeneous linear equation
  φ_B(x) = (0, −4), equivalently, Bx = (0, −4)ᵀ.    (2.1)
One solution is (−1, 1, 1) while the general solution is the fibre of φ_B over (0, −4), which is (−1, 1, 1) + ker φ_B. Finding the kernel amounts to solving the homogeneous linear system Bx = 0, which we readily achieve to get that ker φ_B = span{(5, 3, 7)}, so that the general solution to (2.1) is
  (−1, 1, 1) + span{(5, 3, 7)} = {(5λ − 1, 3λ + 1, 7λ − 1) | λ ∈ R}.
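The computation in example 2 is easy to reproduce numerically (the signs in B follow the reconstruction of the general solution above). A small NumPy check that the whole coset x₀ + ker φ_B solves the equation:

import numpy as np

B = np.array([[1.0,  3.0, -2.0],
              [1.0, -4.0,  1.0]])
b = np.array([0.0, -4.0])

x0 = np.array([-1.0, 1.0, 1.0])        # one solution of Bx = b
assert np.allclose(B @ x0, b)

# ker(phi_B) from the SVD: right-singular vectors beyond the rank.
rank = np.linalg.matrix_rank(B)
kernel = np.linalg.svd(B)[2][rank:]    # here: a single unit vector, ~ (5,3,7)
assert np.allclose(B @ kernel.T, 0)

# Every point of the coset x0 + ker(phi_B) solves Bx = b.
for lam in (-1.0, 0.5, 2.0):
    assert np.allclose(B @ (x0 + lam * kernel[0]), b)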

Definition. Let U ≤ V. The quotient space V/U of V by U is the set V/U, pronounced "V mod U", of cosets of U:
  V/U := {v + U | v ∈ V}.
The quotient map q : V → V/U is defined by q(v) = v + U.

This is a vector space and q is a linear map:

Theorem 2.11. Let U ≤ V. Then, for v, w ∈ V, λ ∈ F,
  (v + U) + (w + U) := (v + w) + U
  λ(v + U) := (λv) + U
give well-defined operations of addition and scalar multiplication on V/U with respect to which V/U is a vector space and q : V → V/U is a linear map. Moreover, ker q = U and im q = V/U (so q surjects).

Proof. For readability, we use the equivalence class notation [v] = v + U = q(v). So our addition and scalar multiplication are given by
  [v] + [w] := [v + w]
  λ[v] := [λv]
and a key issue is to see that these are well-defined, that is, we get the same answers if we use different representatives of the cosets. More precisely, if [v] = [v′] and [w] = [w′], we must show that
  [v + w] = [v′ + w′],  [λv] = [λv′].    (2.2)
However, in this case, we have v − v′ = u₁ and w − w′ = u₂, for some u₁, u₂ ∈ U, and then
  (v + w) − (v′ + w′) = u₁ + u₂ ∈ U
  λv − λv′ = λu₁ ∈ U,
since U is a subspace, and this establishes (2.2).
As for the vector space axioms, these follow from those of V. For example:
  [v] + [w] = [v + w] = [w + v] = [w] + [v].
The zero element is [0] = 0 + U = U while the additive inverse of [v] is [−v] = (−v) + U.
The linearity of q comes straight from how we defined our addition and scalar multiplication:
  q(v + λw) = [v + λw] = [v] + λ[w] = q(v) + λq(w).
Finally, v ∈ ker q if and only if [v] = [0] if and only if v ∈ U while, for any v + U ∈ V/U, v + U = q(v) so that q surjects.

[Figure 2.5: The quotient map q.]

Corollary 2.12. Let U ≤ V. If V is finite-dimensional then so is V/U and
  dim V/U = dim V − dim U.

Proof. Apply rank-nullity to q, using ker q = U and im q = V/U.

Remark. Theorem 2.11 shows that:
1. Any U ≤ V is the kernel of a linear map.
2. Any coset v + U is the fibre of a linear map: indeed, v + U = q⁻¹{v + U}, where we read the v + U on the right as an element of V/U and that on the left as a subset of V!

Theorem 2.13 (First Isomorphism Theorem). Let φ : V → W be a linear map of vector spaces. Define φ̄ : V/ker φ → im φ by
  φ̄(v + ker φ) = φ(v).
Then φ̄ is a well-defined linear isomorphism. In particular, V/ker φ ≅ im φ.

Proof. Once again, we use equivalence class notation and write [v] for the coset v + ker φ. Thus φ̄ is defined by φ̄([v]) = φ(v).
First we show that φ̄ is well-defined: [v] = [v′] if and only if v − v′ ∈ ker φ if and only if φ(v − v′) = 0, or, equivalently, φ(v) = φ(v′).

To see that φ̄ is linear, we compute:
  φ̄([v₁] + λ[v₂]) = φ̄([v₁ + λv₂]) = φ(v₁ + λv₂) = φ(v₁) + λφ(v₂) = φ̄([v₁]) + λφ̄([v₂]),
for v₁, v₂ ∈ V, λ ∈ F.
Finally, we show that φ̄ is an isomorphism: first, [v] ∈ ker φ̄ if and only if v ∈ ker φ if and only if [v] = ker φ, the zero element of V/ker φ. Thus φ̄ injects. Further, if w ∈ im φ, then w = φ(v) = φ̄([v]), for some v ∈ V, so that φ̄ surjects.

Remarks.
1. Let q : V → V/ker φ be the quotient map and i : im φ → W the inclusion. Then the First Isomorphism Theorem shows that we may write φ as the composition i ∘ φ̄ ∘ q of a quotient map, an isomorphism and an inclusion.
2. This whole story of cosets, quotients and the First Isomorphism Theorem has versions in many other contexts such as group theory (see MA30237) and ring theory (MA20217).

Examples.
(1) Let φ ∈ L(V, W). For w = φ(v) ∈ im φ, we identified the fibre over w with a coset of ker φ: φ⁻¹{w} = v + ker φ. From this point of view, the isomorphism φ̄ : V/ker φ → im φ simply reads φ⁻¹{w} ↦ w.
(2) More practically, consider once again the matrix
  B = ( 1   3  −2
        1  −4   1 )
and the corresponding linear map φ_B : R³ → R². Now B has rank 2 (the rows are not proportional) so φ_B is onto. Thus the elements of R³/ker φ_B are the solution sets {x | Bx = y}, for each y ∈ R², and the isomorphism φ̄_B is
  {x | Bx = y} ↦ y.
This helps us to understand the vector space operations of R³/ker φ_B:
  {x | Bx = y₁} + λ{x | Bx = y₂} = {x | Bx = y₁ + λy₂}.

Chapter 3

Inner product spaces

In this chapter, we equip real or complex vector spaces with extra structure that generalises the familiar dot product.

Convention. In this chapter, we take the field F of scalars to be either R or C.

3.1 Inner products

3.1.1 Definition and examples

Recall the dot (or scalar) product on Rⁿ: for x = (x₁, ..., xₙ), y = (y₁, ..., yₙ) ∈ Rⁿ,
  x · y := x₁y₁ + ··· + xₙyₙ = xᵀy.
Using this we define:
- the length of x: |x| := √(x · x);
- the angle θ between x and y: x · y = |x||y| cos θ.

There is also a dot product on Cⁿ: for x, y ∈ Cⁿ,
  x · y = x̄₁y₁ + ··· + x̄ₙyₙ = x†y,
where x† (pronounced "x-dagger") is the conjugate transpose x̄ᵀ of x. We then have that
  x · x = Σᵢ x̄ᵢxᵢ = Σᵢ |xᵢ|²
is real, non-negative and vanishes exactly when x = 0.

We abstract the key properties of the dot product into the following:

Definition. Let V be a vector space over F (which is R or C). An inner product on V is a map V × V → F : (v, w) ↦ ⟨v, w⟩ which is:
(1) (conjugate) symmetric: ⟨w, v⟩ is the complex conjugate of ⟨v, w⟩, for all v, w ∈ V. In particular, ⟨v, v⟩ equals its own conjugate and so is real.
(2) linear in the second slot:
  ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩
  ⟨u, λv⟩ = λ⟨u, v⟩,
for all u, v, w ∈ V and λ ∈ F.

(3) positive definite: for all v ∈ V, ⟨v, v⟩ ≥ 0 with equality if and only if v = 0.

A vector space with an inner product is called an inner product space.

Remark. Any subspace U of an inner product space V is also an inner product space: just restrict ⟨ , ⟩ to U × U.

Let us spell out the implications of this definition in the real and complex cases.

Suppose first that F = R. Then the conjugate symmetry is just symmetry, ⟨v, w⟩ = ⟨w, v⟩, and it follows that we also have linearity in the first slot:
  ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩
  ⟨λv, u⟩ = λ⟨v, u⟩.
We summarise the situation by saying that a real inner product is a positive definite, symmetric, bilinear form. We shall have more to say about bilinear forms later in chapter 6.

Now let us turn to the case F = C. Now it is not the case that an inner product is linear in the first slot.

Definition. A map φ : V → W of complex vector spaces is conjugate linear (or anti-linear) if
  φ(v + w) = φ(v) + φ(w)
  φ(λv) = λ̄φ(v),
for all v, w ∈ V and λ ∈ F.

We see from properties (1) and (2) that a complex inner product has
  ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩
  ⟨λv, u⟩ = λ̄⟨v, u⟩
and so is conjugate linear in the first slot and linear in the second. Such a function is said to be sesquilinear (from the Latin sesqui, which means one-and-a-half). Thus an inner product on a complex vector space is a positive definite, conjugate symmetric, sesquilinear form.

Definition. Let V be an inner product space.
1. The norm of v ∈ V is ‖v‖ := √⟨v, v⟩ ≥ 0.
2. Say v, w ∈ V are orthogonal or perpendicular if ⟨v, w⟩ = 0. In this case, we write v ⊥ w.

Remarks.
1. The norm allows us to define the distance between v and w by ‖v − w‖. We can now do analysis on V: this is one of the Big Ideas in MA20218.
2. Warning: there is another convention for complex inner products which is prevalent in Analysis: there they ask that ⟨ , ⟩ be linear in the first slot and conjugate linear in the second. There are good reasons for either choice.
3. Physicists often write ⟨v|w⟩ for ⟨v, w⟩. Inner product spaces (especially infinite-dimensional ones) are the setting for quantum mechanics.

Examples.
1. The dot product on Rⁿ or Cⁿ is an inner product.
2. Let [a, b] ⊆ R be a closed, bounded interval. Define a real inner product on C⁰[a, b] by
  ⟨f, g⟩ := ∫ₐᵇ fg.
This is clearly symmetric, bilinear and non-negative. To see that it is definite, one must show that if ∫ₐᵇ f² = 0 then f = 0. This is an exercise in Analysis using the inertia property of continuous functions (see MA20218).

3. The set of square-summable sequences l² ⊆ R^N is given by
  l² := {(aₙ)_{n∈N} | Σ_{n∈N} aₙ² < ∞}.

Exercises.¹
(a) l² ≤ R^N.
(b) If a, b ∈ l² then Σ_{n∈N} aₙbₙ is absolutely convergent and then
  ⟨a, b⟩ := Σ_{n∈N} aₙbₙ
defines an inner product on l².
Hint: for x, y ∈ R, rearrange 0 ≤ (|x| − |y|)² to get
  2|x||y| ≤ x² + y²    (3.1a)
and then deduce
  (x + y)² ≤ 2(x² + y²).    (3.1b)
Judicious use of equations (3.1) and the comparison theorem from MA10207 will bake the cake.

¹ Question 7 on sheet 4.

Remark. Perhaps surprisingly, l² and C⁰[a, b] are closely related: this is what Fourier series are about: see MA20223.

3.1.2 Cauchy–Schwarz inequality

Here is one of the most important and ubiquitous inequalities in all of mathematics:

Theorem 3.1 (Cauchy–Schwarz inequality). Let V be an inner product space. For v, w ∈ V,
  |⟨v, w⟩| ≤ ‖v‖‖w‖    (3.2)
with equality if and only if v, w are linearly dependent, that is, either v = 0 or w = λv, for some λ ∈ F.

Proof. The idea of the proof is to write w = λv + u where u ⊥ v (see Figure 3.1) and then use the fact that ‖u‖² ≥ 0.
In detail, first note that if v = 0 then both sides of the inequality vanish and there is nothing to prove. Otherwise, let us seek λ ∈ F so that u := w − λv ⊥ v. We therefore need
  0 = ⟨v, w − λv⟩ = ⟨v, w⟩ − λ⟨v, v⟩
so that
  λ = ⟨v, w⟩/‖v‖².
The situation is shown in Figure 3.1. With λ and then u so defined, we have
  0 ≤ ‖u‖² = ⟨w − λv, w − λv⟩
    = ⟨w, w⟩ − λ⟨w, v⟩ − λ̄⟨v, w⟩ + λ̄λ⟨v, v⟩
    = ‖w‖² − ⟨v, w⟩⟨w, v⟩/‖v‖² − ⟨w, v⟩⟨v, w⟩/‖v‖² + ⟨v, w⟩⟨w, v⟩‖v‖²/‖v‖⁴
    = ‖w‖² − |⟨v, w⟩|²/‖v‖²,

where we used the sesquilinearity of the inner product to reach the second line and the conjugate symmetry to reach the third. Rearranging this yields
  |⟨v, w⟩|² ≤ ‖v‖²‖w‖²
and taking a square root gives us the Cauchy–Schwarz inequality.
Finally, we have equality if and only if ‖u‖ = 0 or, equivalently, u = 0, that is, w = λv.

[Figure 3.1: Construction of u.]

Examples.
1. Let (Ω, P) be a finite probability space. Then the space R^Ω of real random variables is an inner product space with
  ⟨f, g⟩ = E(fg) = Σ_{x∈Ω} f(x)g(x)P(x),
so long as P(x) > 0 for each x ∈ Ω (we need this for positive-definiteness). Now the square of the Cauchy–Schwarz inequality reads
  E(fg)² ≤ E(f²)E(g²).
2. For a, b ∈ l², the Cauchy–Schwarz inequality reads:
  |Σ_{n∈N} aₙbₙ| ≤ (Σ_{n∈N} aₙ²)^{1/2} (Σ_{n∈N} bₙ²)^{1/2}.

The Cauchy–Schwarz inequality is an essentially 2-dimensional result about the inner product space span{v, w}. Here are some more that are almost as fundamental:

Proposition 3.2. Let V be an inner product space and v, w ∈ V.
1. Pythagoras' Theorem: if v ⊥ w then
  ‖v + w‖² = ‖v‖² + ‖w‖².    (3.3)
2. Triangle inequality: ‖v + w‖ ≤ ‖v‖ + ‖w‖ with equality if and only if v = 0 or w = λv with λ ≥ 0.
3. Parallelogram identity: ‖v + w‖² + ‖v − w‖² = 2(‖v‖² + ‖w‖²).

Proof.
1. Exercise²: expand out ‖v + w‖² = ⟨v + w, v + w⟩.

² Question 2 on sheet 4.

[Figure 3.2: The identities of Proposition 3.2: (a) Pythagoras' Theorem; (b) Parallelogram identity.]

2. We prove ‖v + w‖² ≤ (‖v‖ + ‖w‖)². We have
  ‖v + w‖² = ‖v‖² + 2 Re⟨v, w⟩ + ‖w‖².
Now, by Cauchy–Schwarz,
  Re⟨v, w⟩ ≤ |⟨v, w⟩| ≤ ‖v‖‖w‖
so that
  ‖v + w‖² ≤ ‖v‖² + 2‖v‖‖w‖ + ‖w‖² = (‖v‖ + ‖w‖)²
with equality if and only if Re⟨v, w⟩ = |⟨v, w⟩| = ‖v‖‖w‖, in which case we first get w = λv, for some λ ∈ F, and then that Re λ = |λ| so that λ ≥ 0.
3. Exercise³!

³ Question 3 on sheet 4.

3.2 Orthogonality

3.2.1 Orthonormal bases

Definition. A list of vectors u₁, ..., uₖ in an inner product space V is orthonormal if, for all 1 ≤ i, j ≤ k,
  ⟨uᵢ, uⱼ⟩ = δᵢⱼ := 1 if i = j, 0 if i ≠ j.
If u₁, ..., uₖ is also a basis, we call it an orthonormal basis.

Example. The standard basis e₁, ..., eₙ of Fⁿ is orthonormal for the dot product.

Orthonormal bases are very cool. Here is why: if u₁, ..., uₖ is orthonormal and v ∈ span{u₁, ..., uₖ} then we can write
  v = λ₁u₁ + ··· + λₖuₖ.
How can we compute the coordinates λᵢ? In general, this amounts to solving a system of linear equations and so involves something tedious and lengthy like Gaussian elimination. However, in our case, things are much easier. Observe:
  ⟨uᵢ, v⟩ = ⟨uᵢ, Σⱼ λⱼuⱼ⟩ = Σⱼ λⱼ⟨uᵢ, uⱼ⟩ = Σⱼ λⱼδᵢⱼ = λᵢ.
Thus
  λᵢ = ⟨uᵢ, v⟩    (3.4)
which is very easy to compute.
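Equation (3.4) is worth seeing in action: with respect to an orthonormal basis, coordinates are computed by inner products alone, with no linear solve. A NumPy sketch, with a rotation of R³ as example data of our own:

import numpy as np

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])   # columns u_1, u_2, u_3

v = np.array([1.0, 2.0, 3.0])
lam = U.T @ v                     # all the lambda_i = <u_i, v> at once
assert np.allclose(U @ lam, v)    # v = sum_i lambda_i u_i (Lemma 3.3 below)
assert np.isclose(v @ v, lam @ lam)   # Bessel's equality (Proposition 3.6)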

Let us enshrine this analysis into the following lemma:

Lemma 3.3. Let V be an inner product space with orthonormal basis u₁, ..., uₙ and let v ∈ V. Then
  v = Σ_{i=1}^n ⟨uᵢ, v⟩uᵢ.

As an immediate consequence of (3.4):

Lemma 3.4. Any orthonormal list of vectors u₁, ..., uₖ is linearly independent.

Proof. If λ₁u₁ + ··· + λₖuₖ = 0 then (3.4) gives λᵢ = ⟨uᵢ, 0⟩ = 0.

What is more, these coordinates λᵢ are all you need to compute inner products.

Proposition 3.5. Let u₁, ..., uₙ be an orthonormal basis of an inner product space V. Let v = x₁u₁ + ··· + xₙuₙ and w = y₁u₁ + ··· + yₙuₙ. Then
  ⟨v, w⟩ = Σ_{i=1}^n x̄ᵢyᵢ = x · y.
Thus the inner product of two vectors is the dot product of their coordinates with respect to an orthonormal basis.

Proof. We simply expand out ⟨v, w⟩ by sesquilinearity:
  ⟨v, w⟩ = ⟨Σᵢ xᵢuᵢ, Σⱼ yⱼuⱼ⟩ = Σᵢ,ⱼ x̄ᵢyⱼ⟨uᵢ, uⱼ⟩ = Σᵢ,ⱼ x̄ᵢyⱼδᵢⱼ = Σᵢ x̄ᵢyᵢ = x · y.

To put it another way:

Proposition 3.6. Let u₁, ..., uₙ be an orthonormal basis of an inner product space V and v, w ∈ V. Then:
(1) Parseval's identity: ⟨v, w⟩ = Σ_{i=1}^n ⟨v, uᵢ⟩⟨uᵢ, w⟩.
(2) Bessel's equality: ‖v‖² = Σ_{i=1}^n |⟨v, uᵢ⟩|².

Proof. (1) This comes straight from Proposition 3.5, using conjugate symmetry of the inner product: x̄ᵢ is the conjugate of ⟨uᵢ, v⟩, which is ⟨v, uᵢ⟩. (2) Put v = w in (1).

All of this should make us eager to get our hands on orthonormal bases and so we would like to know if they always exist. To see that they do, we need the following construction:

Theorem 3.7 (Gram–Schmidt orthogonalisation). Let v₁, ..., vₘ be linearly independent vectors in an inner product space V. Then there is an orthonormal list u₁, ..., uₘ such that
  span{u₁, ..., uₖ} = span{v₁, ..., vₖ},
for all 1 ≤ k ≤ m, defined inductively by:
  u₁ := v₁/‖v₁‖,  uₖ := wₖ/‖wₖ‖,

where, for k > 1,
  wₖ := vₖ − Σ_{j=1}^{k−1} ⟨uⱼ, vₖ⟩uⱼ.

Proof. We induct with inductive hypothesis at k that u₁, ..., uₖ is orthonormal and that, for 1 ≤ l ≤ k,
  span{u₁, ..., uₗ} = span{v₁, ..., vₗ}.
At k = 1, this reads ‖u₁‖ = 1 and span{u₁} = span{v₁}, which is certainly true.
Now assume the hypothesis is true at k − 1 so that u₁, ..., uₖ₋₁ is orthonormal and span{u₁, ..., uₖ₋₁} = span{v₁, ..., vₖ₋₁}. Then
  span{u₁, ..., uₖ} = span{u₁, ..., uₖ₋₁, wₖ} = span{v₁, ..., vₖ₋₁, wₖ} = span{v₁, ..., vₖ}.
Moreover, for any i < k,
  ⟨uᵢ, wₖ⟩ = ⟨uᵢ, vₖ⟩ − Σ_{j<k} ⟨uⱼ, vₖ⟩⟨uᵢ, uⱼ⟩ = ⟨uᵢ, vₖ⟩ − Σ_{j<k} ⟨uⱼ, vₖ⟩δᵢⱼ = ⟨uᵢ, vₖ⟩ − ⟨uᵢ, vₖ⟩ = 0.
Thus wₖ ⊥ u₁, ..., uₖ₋₁ so that uₖ is also, whence u₁, ..., uₖ is orthonormal. Thus the inductive hypothesis is true at k and so at m by induction.

Remark. For practical purposes, we can get an easier-to-use formula (no square roots!) for wₖ by setting w₁ = v₁ and then replacing uⱼ by wⱼ/‖wⱼ‖, for j < k, to get:
  wₖ = vₖ − Σ_{j=1}^{k−1} (⟨wⱼ, vₖ⟩/‖wⱼ‖²) wⱼ.

Corollary 3.8. Any finite-dimensional inner product space V has an orthonormal basis.

Proof. Let v₁, ..., vₙ be any basis of V and apply Theorem 3.7 to get an orthonormal (and so linearly independent, by Lemma 3.4) list u₁, ..., uₙ with span{u₁, ..., uₙ} = span{v₁, ..., vₙ} = V. Thus the u₁, ..., uₙ span also and so are an orthonormal basis.

Example. Let U ≤ R³ be given by U = {x ∈ R³ | x₁ + x₂ + x₃ = 0}. Let us find an orthonormal basis for U. First we need a basis of U to start with: dim U = 2 (why?) with basis v₁ = (1, 0, −1), v₂ = (0, 1, −1). Then ‖v₁‖ = √2 so that u₁ = (1/√2, 0, −1/√2). Next, ⟨w₁, v₂⟩ = ⟨v₁, v₂⟩ = 1 so that
  w₂ = v₂ − (⟨w₁, v₂⟩/⟨w₁, w₁⟩) w₁ = (0, 1, −1) − (1/2)(1, 0, −1) = (−1/2, 1, −1/2).
This means that ‖w₂‖² = 1/4 + 1 + 1/4 = 3/2 so that
  u₂ = √(2/3) (−1/2, 1, −1/2) = (−1/√6, 2/√6, −1/√6).
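The Gram–Schmidt recipe of Theorem 3.7 translates line by line into code. The following sketch (ours, assuming real scalars) reproduces the worked example above:

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a linearly independent list (Theorem 3.7)."""
    us = []
    for v in vectors:
        w = v - sum(np.dot(u, v) * u for u in us)  # w_k = v_k - sum <u_j,v_k> u_j
        us.append(w / np.linalg.norm(w))           # u_k = w_k / ||w_k||
    return us

# The worked example: a basis of the plane x_1 + x_2 + x_3 = 0.
v1, v2 = np.array([1.0, 0.0, -1.0]), np.array([0.0, 1.0, -1.0])
u1, u2 = gram_schmidt([v1, v2])

assert np.allclose(u1, [1/np.sqrt(2), 0, -1/np.sqrt(2)])
assert np.allclose(u2, [-1/np.sqrt(6), 2/np.sqrt(6), -1/np.sqrt(6)])
assert np.isclose(np.dot(u1, u2), 0.0)             # the list is orthonormal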

Let us conclude our discussion of orthonormal bases with an application of Gram–Schmidt which has uses in Statistics (see MA20227) and elsewhere. First, a definition:

Definition. A matrix Q ∈ Mₙ×ₙ(R) is orthogonal if QᵀQ = Iₙ or, equivalently, Q has orthonormal columns with respect to the dot product. Here Iₙ is the n × n identity matrix.

Remark. The two conditions in this definition are indeed equivalent: if qᵢ is the i-th column of Q then (QᵀQ)ᵢⱼ = qᵢᵀqⱼ.

Theorem 3.9 (QR decomposition). Let A ∈ Mₙ×ₙ(R) be an invertible matrix. Then we can write A = QR, where Q is orthogonal and R is upper triangular (Rᵢⱼ = 0 if i > j) with positive entries on the diagonal.

Proof. We apply Theorem 3.7 to the columns of A to get the columns of Q. So let v₁, ..., vₙ be the columns of A. Since A is invertible, these are a basis so we can apply Theorem 3.7 to get an orthonormal basis u₁, ..., uₙ. Let Q be the orthogonal matrix whose columns are the uᵢ. Unravelling the formulae of the Gram–Schmidt procedure, we have
  v₁ = ‖v₁‖u₁
  v₂ = ‖w₂‖u₂ + ⟨u₁, v₂⟩u₁
and, more generally,
  vₖ = ‖wₖ‖uₖ + Σ_{j<k} ⟨uⱼ, vₖ⟩uⱼ.
Otherwise said, A = QR where Rₖₖ = ‖wₖ‖, Rⱼₖ = ⟨uⱼ, vₖ⟩, for j < k, and Rᵢⱼ = 0 if i > j.

To compute Q and R in practice, first do Gram–Schmidt orthogonalisation on the columns of A to get Q and then note that
  QᵀA = QᵀQR = IₙR = R
so that R = QᵀA, which is probably easier to compute than keeping track of intermediate coefficients in the orthogonalisation!

Remarks.
1. In pure mathematics, the QR decomposition is a special case of the Iwasawa decomposition.
2. We shall have more to say about orthogonal matrices in the next chapter; see section 4.1.3.
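Theorem 3.9 can be checked against a library routine: numpy.linalg.qr computes a QR factorisation, which becomes the unique one of the theorem once we insist on a positive diagonal for R. A sketch, with an example matrix of our own:

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])   # any invertible matrix will do

# Flip signs so that R has a positive diagonal, as in Theorem 3.9.
Q, R = np.linalg.qr(A)
signs = np.sign(np.diag(R))
Q, R = Q * signs, signs[:, None] * R

assert np.allclose(Q.T @ Q, np.eye(3))   # Q orthogonal
assert np.allclose(np.triu(R), R)        # R upper triangular
assert np.all(np.diag(R) > 0)            # positive diagonal entries
assert np.allclose(Q @ R, A)             # A = QR
assert np.allclose(R, Q.T @ A)           # R = Q^T A, as in the text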

3.2.2 Orthogonal complements and orthogonal projection

Definition. Let V be an inner product space and U ≤ V. The orthogonal complement U^⊥ of U (in V) is given by
  U^⊥ := {v ∈ V | ⟨u, v⟩ = 0, for all u ∈ U}.

[Figure 3.3: Orthogonal complements in R².]

Proposition 3.10. Let V be an inner product space and U ≤ V. Then
(1) U^⊥ ≤ V;
(2) U ∩ U^⊥ = {0};
(3) U ⊆ (U^⊥)^⊥.

Proof.
(1) This is a straightforward exercise using the second-slot linearity of the inner product.
(2) If u ∈ U ∩ U^⊥, ⟨u, u⟩ = 0 so that u = 0 by positive-definiteness of the inner product.
(3) If u ∈ U and w ∈ U^⊥ then ⟨w, u⟩ is the conjugate of ⟨u, w⟩ = 0, so that u ∈ (U^⊥)^⊥.

If U is finite-dimensional then U^⊥ is a complement to U in the sense of section 2.1.2 (even if V is infinite-dimensional!):

Theorem 3.11. Let U be a finite-dimensional subspace of an inner product space V. Then V is an internal direct sum: V = U ⊕ U^⊥.

Proof. By Proposition 3.10(2), we just need to prove that V = U + U^⊥. For this, let u₁, ..., uₖ be an orthonormal basis of U and let v ∈ V. We write
  v = (Σ_{i=1}^k ⟨uᵢ, v⟩uᵢ) + (v − Σ_{i=1}^k ⟨uᵢ, v⟩uᵢ) =: v₁ + v₂.
Now v₁ ∈ U, being in the span of the uᵢ, while, for 1 ≤ j ≤ k,
  ⟨uⱼ, v₂⟩ = ⟨uⱼ, v⟩ − Σ_{i=1}^k ⟨uᵢ, v⟩⟨uⱼ, uᵢ⟩ = ⟨uⱼ, v⟩ − ⟨uⱼ, v⟩ = 0.
Thus, for u = λ₁u₁ + ··· + λₖuₖ ∈ U,
  ⟨u, v₂⟩ = Σ_{j=1}^k λ̄ⱼ⟨uⱼ, v₂⟩ = 0
so that v₂ ∈ U^⊥.

Corollary 3.12. Let V be a finite-dimensional inner product space and U ≤ V. Then
(1) dim U^⊥ = dim V − dim U.
(2) U = (U^⊥)^⊥.

Proof.

(1) This is immediate from Proposition 2.3.
(2) Two applications of (1) give dim(U^⊥)^⊥ = dim V − dim U^⊥ = dim U while Proposition 3.10(3) gives U ⊆ (U^⊥)^⊥. We conclude that we have equality by Lemma 1.3.

Definition. Let V be an inner product space and U ≤ V such that V = U ⊕ U^⊥. We can write any v ∈ V as v = v₁ + v₂ for unique v₁ ∈ U, v₂ ∈ U^⊥. Define π_U : V → V, the orthogonal projection onto U, by
  π_U(v) = v₁.

Remark. π_{U^⊥}(v) = v₂ = v − v₁ so that π_{U^⊥} = id_V − π_U.

The situation is illustrated in Figure 3.4.

[Figure 3.4: Orthogonal projections.]

Proposition 3.13. Let V be an inner product space and U ≤ V such that V = U ⊕ U^⊥. Then
(1) π_U is linear.
(2) ker π_U = U^⊥.
(3) π_U restricted to U is id_U, so that im π_U = U.
(4) If U is finite-dimensional with orthonormal basis u₁, ..., uₖ then, for all v ∈ V,
  π_U(v) = Σ_{i=1}^k ⟨uᵢ, v⟩uᵢ.

Proof. Items (1)–(3) (which make sense for any direct sum) are exercises⁴. Item (4) is what we proved to establish Theorem 3.11.

⁴ Question 4 on sheet 2.
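In coordinates, Proposition 3.13(4) says that if Q is the matrix with orthonormal columns u₁, ..., uₖ then π_U has matrix QQᵀ (real case). A sketch of this, reusing the orthonormal basis found in the example of section 3.2.1:

import numpy as np

Q = np.column_stack([
    [1/np.sqrt(2), 0.0, -1/np.sqrt(2)],            # u_1 from the example
    [-1/np.sqrt(6), 2/np.sqrt(6), -1/np.sqrt(6)],  # u_2
])

P = Q @ Q.T                      # matrix of pi_U
v = np.array([1.0, 2.0, 4.0])
p = P @ v                        # pi_U(v) = sum_i <u_i, v> u_i

assert np.allclose(P @ p, p)             # pi_U restricted to U is the identity
assert np.allclose(Q.T @ (v - p), 0)     # v - pi_U(v) lies in U-perp
# Theorem 3.14 below: pi_U(v) is the nearest point of U to v.
for lam in np.linspace(-1.0, 1.0, 5):
    u = Q @ (Q.T @ v + lam * np.ones(2))           # some other points of U
    assert np.linalg.norm(v - p) <= np.linalg.norm(v - u) + 1e-12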

Let us conclude this chapter with an application to a minimisation problem that, among other things, underlies much of Fourier analysis (see MA20223).

Theorem 3.14. Let V be an inner product space and U ≤ V such that V = U ⊕ U^⊥. For v ∈ V, π_U(v) is the nearest point of U to v: for all u ∈ U,
  ‖v − π_U(v)‖ ≤ ‖v − u‖.

[Figure 3.5: The orthogonal projection minimises distance to U.]

Proof. As we see in Figure 3.5, this is just the Pythagoras theorem. Indeed, for u ∈ U, note that π_U(v) − u ∈ U while v − π_U(v) = π_{U^⊥}(v) ∈ U^⊥. Thus
  ‖v − π_U(v)‖² ≤ ‖v − π_U(v)‖² + ‖π_U(v) − u‖² = ‖(v − π_U(v)) + (π_U(v) − u)‖² = ‖v − u‖²,
where the first equality is Pythagoras' theorem (Proposition 3.2). Now take square roots!

Exercise. Read Example 6.58 on pages 199–200 of Axler's Linear Algebra Done Right to see a beautiful application of this result. He takes V = C⁰[−π, π] and U to be the space of polynomials of degree at most 5 to get an astonishingly accurate polynomial approximation to sin.

Chapter 4

Linear operators on inner product spaces

Convention. In this chapter, we once again take the field F of scalars to be either R or C.

4.1 Linear operators and their adjoints

4.1.1 Linear operators and matrices

Definition. Let V be a vector space over F. A linear operator on V is a linear map φ : V → V. The vector space of linear operators on V is denoted L(V) (instead of L(V, V)).

We saw in Algebra 1B that linear operators in the presence of a basis are closely related to square matrices:

Definition. Let V be a finite-dimensional vector space over F with basis 𝓑 = v₁, ..., vₙ and let φ ∈ L(V). The matrix of φ with respect to 𝓑 is the matrix A = (Aᵢⱼ) ∈ Mₙ×ₙ(F) for which
  φ(vⱼ) = Σ_{i=1}^n Aᵢⱼvᵢ,    (4.1)
for 1 ≤ j ≤ n.

In words, the j-th column of A holds the coefficients obtained by expanding out φ(vⱼ) in terms of the basis 𝓑. Equivalently,
  φ(x₁v₁ + ··· + xₙvₙ) = y₁v₁ + ··· + yₙvₙ, where y = Ax.

Remarks.
1. The map φ ↦ A is a linear isomorphism L(V) → Mₙ×ₙ(F) that sends composition of operators to multiplication of matrices: if ψ ∈ L(V) has matrix B with respect to 𝓑, then ψ ∘ φ has matrix BA.
2. This is a special case of the story from Algebra 1B where we use the same basis on the domain and codomain.
3. The fancy way to say the relation between φ and A is to use the isomorphism φ_𝓑 : Fⁿ → V corresponding to 𝓑 (see section 1.4.2). Then
  φ = φ_𝓑 ∘ φ_A ∘ φ_𝓑⁻¹,

or, equivalently, φ_𝓑 ∘ φ_A = φ ∘ φ_𝓑, so that the following diagram commutes:

  V ----φ----> V
  ↑            ↑
  φ_𝓑          φ_𝓑
  Fⁿ ---φ_A--> Fⁿ

(The assertion that such a diagram commutes is simply that the two maps one builds by following the arrows in two different ways coincide. However, the diagram also helps us keep track of where the various maps go!)

4.1.2 Adjoints

First a preliminary lemma:

Lemma 4.1 (Nondegeneracy Lemma). Let V be an inner product space and v ∈ V. Then ⟨v, w⟩ = 0, for all w ∈ V, if and only if v = 0.

Proof. For the forward implication, take w = v to get ⟨v, v⟩ = 0 and so v = 0 by positive-definiteness of the inner product. Conversely, if v = 0, ⟨v, w⟩ = 0, for any w ∈ V, since the inner product is anti-linear in the first slot¹.

Remark. To put this another way: V^⊥ = {0}.

Definition. Let V be an inner product space and φ ∈ L(V). An adjoint to φ is a linear operator φ* ∈ L(V) such that, for all v, w ∈ V, we have
  ⟨φ*(v), w⟩ = ⟨v, φ(w)⟩
or, equivalently, by conjugate symmetry, ⟨w, φ*(v)⟩ = ⟨φ(w), v⟩.

Adjoints are well-behaved under most linear map constructions:

Proposition 4.2. Let V be an inner product space and suppose φ, ψ ∈ L(V) have adjoints. Then φ ∘ ψ; φ + λψ, λ ∈ F; φ* and id_V all have adjoints, given by:
(1) (φ ∘ ψ)* = ψ* ∘ φ* (note the change of order here!).
(2) (φ + λψ)* = φ* + λ̄ψ*.
(3) (φ*)* = φ.
(4) id_V* = id_V.

Proof. These are all easy exercises².

When V is finite-dimensional, any φ ∈ L(V) has a unique adjoint:

Proposition 4.3. Let V be a finite-dimensional inner product space and φ ∈ L(V) a linear operator. Then
(1) φ has a unique adjoint φ*.
(2) Let u₁, ..., uₙ be an orthonormal basis of V with respect to which φ has matrix A. Then φ* has matrix A* := Āᵀ (which is Aᵀ when F = R).

¹ To spell it out: ⟨0, w⟩ = ⟨0 + 0, w⟩ = ⟨0, w⟩ + ⟨0, w⟩.
² Question 1 on sheet 6.
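Proposition 4.3(2) is easy to test numerically for V = Cⁿ with the dot product: the adjoint of φ_A is φ_{A†} where A† = Āᵀ. A sketch with example data of our own; note that numpy's vdot conjugates its first argument, matching the convention of this chapter.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_star = A.conj().T                  # the conjugate transpose A-dagger

for _ in range(5):
    v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    w = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    lhs = np.vdot(A_star @ v, w)     # <phi*(v), w>
    rhs = np.vdot(v, A @ w)          # <v, phi(w)>
    assert np.isclose(lhs, rhs)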