(VII.B) Bilinear Forms

There are two standard generalizations of the dot product on $\mathbb{R}^n$ to arbitrary vector spaces. The idea is to have a product that takes two vectors to a scalar. Bilinear forms are the more general version, and inner products (a kind of bilinear form) the more specific: they are the ones that look like the dot product (= Euclidean inner product) with respect to some basis. Here we tour the zoo of various classes of bilinear forms, before narrowing our focus to inner products in the next section.

Now let $V/\mathbb{R}$ be an $n$-dimensional vector space.

DEFINITION 1. A bilinear form is a function $B : V \times V \to \mathbb{R}$ such that
$$B(a\vec u + b\vec v, \vec w) = aB(\vec u, \vec w) + bB(\vec v, \vec w),$$
$$B(\vec u, a\vec v + b\vec w) = aB(\vec u, \vec v) + bB(\vec u, \vec w)$$
for all $\vec u, \vec v, \vec w \in V$. It is called symmetric if also $B(\vec u, \vec w) = B(\vec w, \vec u)$.

Now let us lay down the following law: whenever we are considering a bilinear form $B$ on a vector space $V$ (even if $V = \mathbb{R}^n$!), all orthogonality (or orthonormality) is relative to $B$: $\vec u \perp \vec w$ means $B(\vec u, \vec w) = 0$. Sounds fine, huh?

EXAMPLE 2. Take a look at the following (non-symmetric) bilinear form $B$ on $\mathbb{R}^2$:
$$B(\vec x, \vec y) = \begin{pmatrix} x_1 & x_2 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = x_2 y_1.$$
This is absolutely sick, because under this bilinear form we have $\hat e_1 \perp \hat e_2$, but it is not true that $\hat e_2 \perp \hat e_1$! So we usually consider symmetric bilinear forms.
Try instead
$$B(\vec x, \vec y) = \begin{pmatrix} x_1 & x_2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = x_1 y_1 - x_2 y_2;$$
$B$ is symmetric, but the notion of "norm" is not what we are used to from the dot product: $B(\hat e_2, \hat e_2) = -1$, while $B\big(\binom{1}{1}, \binom{1}{1}\big) = 0$; that is, under $B$ there exists a nonzero self-perpendicular vector $\binom{1}{1}$. Such pathologies with general bilinear forms are routine; the point of defining inner products in VII.C will be to get rid of them.

The Matrix of a Bilinear Form. Consider a basis $\mathcal{B} = \{\vec v_1, \ldots, \vec v_n\}$ for $V$, and write $\vec u = \sum u_i \vec v_i$, $\vec w = \sum w_i \vec v_i$. Let $B$ be any bilinear form on $V$ and set $b_{ij} = B(\vec v_i, \vec v_j)$; we have
$$B(\vec u, \vec w) = B\Big(\sum_i u_i \vec v_i, \sum_j w_j \vec v_j\Big) = \sum_{i,j} u_i b_{ij} w_j = \begin{pmatrix} u_1 & \cdots & u_n \end{pmatrix}\begin{pmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{pmatrix}\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix} =: {}^t[\vec u]_{\mathcal B}\,[B]^{\mathcal B}\,[\vec w]_{\mathcal B},$$
defining a matrix for $B$. We write the basis $\mathcal B$ as a superscript, because basis change does not work the same way as for transformation matrices: writing $S$ for the change-of-basis matrix with $[\vec u]_{\mathcal B} = S\,[\vec u]_{\mathcal C}$,
$${}^t[\vec u]_{\mathcal B}\,[B]^{\mathcal B}\,[\vec w]_{\mathcal B} = {}^t(S[\vec u]_{\mathcal C})\,[B]^{\mathcal B}\,(S[\vec w]_{\mathcal C}) = {}^t[\vec u]_{\mathcal C}\,\big({}^tS\,[B]^{\mathcal B}\,S\big)\,[\vec w]_{\mathcal C} \implies [B]^{\mathcal C} = {}^tS\,[B]^{\mathcal B}\,S.$$
Matrices related in this fashion (where $S$ is invertible) are said to be cogredient. However, we warn the reader that the only bilinear form or inner product on $\mathbb{R}^n$ for which $B(\hat e_i, \hat e_j) = \delta_{ij}$ (i.e., under which $\hat e$ is an orthonormal basis) is the dot product! In order to avoid confusion coming from $\hat e$, we will often work in a more abstract setting than $\mathbb{R}^n$.
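As a quick illustration of the cogredience rule (a supplementary example, with numbers chosen for convenience): let $B$ be the dot product on $\mathbb{R}^2$, so $[B]^{\hat e} = I_2$, and take $\mathcal C = \{\hat e_1,\ \hat e_1 + \hat e_2\}$. The columns of $S$ are the $\hat e$-coordinates of the new basis vectors, so
$$S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad [B]^{\mathcal C} = {}^tS\,I_2\,S = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix},$$
which one checks directly: for instance, $B(\hat e_1 + \hat e_2, \hat e_1 + \hat e_2) = 2$. Even the dot product has a non-identity matrix in most bases; this is the warning above in action.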
Symmetric Bilinear Forms. Next let $B$ be an arbitrary symmetric bilinear form, so that $\perp$ is a symmetric relation and the subspace
$$V^{\perp}_{\vec w} = \{\vec v \in V \mid B(\vec v, \vec w) = B(\vec w, \vec v) = 0\}$$
makes sense. Now the matrix of such a form (with respect to any basis) is already symmetric, that is, ${}^t[B]^{\mathcal B} = [B]^{\mathcal B}$; and we now explain how to put it into an especially nice form.

LEMMA 3. There exists a basis $\mathcal B$ such that
$$[B]^{\mathcal B} = \mathrm{diag}\{b_1, \ldots, b_r, 0, \ldots, 0\},$$
where the $b_i \neq 0$.

PROOF. By induction on $n = \dim V$ (clear for $n = 1$): assume the result for $(n-1)$-dimensional spaces and prove it for $V$. If $B(\vec u, \vec w) = 0$ for all $\vec u, \vec w \in V$, then $[B]^{\mathcal B} = 0$ in any basis and we are done. Moreover, if $B(\vec v, \vec v) = 0$ for all $\vec v \in V$, then (using symmetry and bilinearity)
$$2B(\vec u, \vec w) = B(\vec u, \vec w) + B(\vec w, \vec u) = B(\vec u + \vec w, \vec u + \vec w) - B(\vec u, \vec u) - B(\vec w, \vec w) = 0$$
for all $\vec u, \vec w$, and we're done once more. So assume otherwise: that there exists $\vec v_1 \in V$ such that $B(\vec v_1, \vec v_1) = b_1 \neq 0$.

Clearly $b_1 \neq 0 \implies$ (i) $\mathrm{span}\{\vec v_1\} \cap V^{\perp}_{\vec v_1} = \{\vec 0\}$. Furthermore, (ii) $V = \mathrm{span}\{\vec v_1\} + V^{\perp}_{\vec v_1}$: any $\vec w \in V$ can be written
$$\vec w = \big(\vec w - B(\vec w, \vec v_1)\,b_1^{-1}\,\vec v_1\big) + B(\vec w, \vec v_1)\,b_1^{-1}\,\vec v_1,$$
where the second term is in $\mathrm{span}\{\vec v_1\}$, and the first is in $V^{\perp}_{\vec v_1}$ since
$$B(\text{first term}, \vec v_1) = B(\vec w, \vec v_1) - B(\vec w, \vec v_1)\,b_1^{-1}\,B(\vec v_1, \vec v_1) = 0.$$
Combining (i) and (ii), $V = \mathrm{span}\{\vec v_1\} \oplus V^{\perp}_{\vec v_1}$. Applying the inductive hypothesis to $V^{\perp}_{\vec v_1}$ establishes the existence of a basis $\{\vec v_2, \ldots, \vec v_n\}$ with $B(\vec v_i, \vec v_j) = b_i\delta_{ij}$ $(i, j \geq 2)$. Combining this with the direct-sum decomposition of $V$, $\mathcal{B} = \{\vec v_1, \vec v_2, \ldots, \vec v_n\}$ is a basis for $V$. By definition of $V^{\perp}_{\vec v_1}$, $B(\vec v_1, \vec v_j) = B(\vec v_j, \vec v_1) = 0$ for $j \geq 2$. Therefore $B(\vec v_i, \vec v_j) = b_i\delta_{ij}$ for all $i, j$, and we're through.

Reorder the basis elements so that
$$[B]^{\mathcal B} = \mathrm{diag}\{\underbrace{b_1, \ldots, b_p}_{>0}, \underbrace{b_{p+1}, \ldots, b_{p+q}}_{<0}, 0, \ldots, 0\}$$
and then normalize: take
$$\tilde{\mathcal B} = \Big\{\tfrac{\vec v_1}{\sqrt{b_1}}, \ldots, \tfrac{\vec v_p}{\sqrt{b_p}};\ \tfrac{\vec v_{p+1}}{\sqrt{-b_{p+1}}}, \ldots, \tfrac{\vec v_{p+q}}{\sqrt{-b_{p+q}}};\ \vec v_{p+q+1}, \ldots, \vec v_n\Big\},$$
so that
$$[B]^{\tilde{\mathcal B}} = \mathrm{diag}\{\underbrace{1, \ldots, 1}_{p}, \underbrace{-1, \ldots, -1}_{q}, 0, \ldots, 0\}.$$
Suppose for some other basis $\mathcal C = \{\vec w_1, \ldots, \vec w_n\}$,
$$[B]^{\mathcal C} = \mathrm{diag}\{\underbrace{c_1, \ldots, c_{p'}}_{>0}, \underbrace{c_{p'+1}, \ldots, c_{p'+q'}}_{<0}, 0, \ldots, 0\};$$
are $p, q$ and $p', q'$ necessarily the same? Since $[B]^{\mathcal C}$ and $[B]^{\tilde{\mathcal B}}$ are cogredient they must have the same rank: $p + q = p' + q'$. Now any nonzero $\vec u \in \mathrm{span}\{\vec v_1, \ldots, \vec v_p\}$ has $B(\vec u, \vec u) > 0$, while a nonzero $\vec u \in \mathrm{span}\{\vec w_{p'+1}, \ldots, \vec w_n\}$ has $B(\vec u, \vec u) \leq 0$. Therefore these two spans can intersect only in $\{\vec 0\}$, which says the sum of their dimensions cannot exceed $\dim V$: $p + (n - p') \leq n$, which implies $p \leq p'$. An exactly symmetric argument (interchanging the $\vec v$'s and $\vec w$'s) shows that $p' \leq p$. So $p = p'$, and $q = q'$ follows immediately. We have proved

THEOREM 4 (Sylvester's Theorem). Given a symmetric bilinear form $B$ on $V/\mathbb{R}$, there exists a basis $\mathcal B = \{\vec v_1, \ldots, \vec v_n\}$ for $V$ such that $B(\vec v_i, \vec v_j) = 0$ for $i \neq j$, and $B(\vec v_i, \vec v_i) \in \{+1, -1, 0\}$ for each $i$, where the number of $+1$'s, $-1$'s, and $0$'s is well-defined (another such choice of basis will not change these numbers).

The corresponding statement for the matrix of $B$ gives rise to the following

COROLLARY 5. Any symmetric real matrix is cogredient to exactly one matrix of the form $\mathrm{diag}\{+1, \ldots, +1, -1, \ldots, -1, 0, \ldots, 0\}$.

For a given symmetric bilinear form (or symmetric matrix), we call the number of $+1$'s and $-1$'s in the form guaranteed by the Theorem (or Corollary) the index of positivity ($= p$) and index of negativity ($= q$) of $B$ (or $[B]$). Their sum $r = p + q$ is (for obvious reasons) referred to as the rank, and the pair $(p, q)$ (or triple $(p, q, n - r)$, or sometimes the difference $p - q$) is referred to as the signature of $B$ (or $[B]$).
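To see the algorithm in the proof of Lemma 3 in action, here is a small supplementary example. On $\mathbb{R}^2$ take $B(\vec x, \vec y) = x_1y_2 + x_2y_1$, so $[B]^{\hat e} = \left(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right)$. Both $B(\hat e_i, \hat e_i)$ vanish, but $B \not\equiv 0$, so following the proof we take $\vec v_1 = \hat e_1 + \hat e_2$ with $b_1 = B(\vec v_1, \vec v_1) = 2 > 0$. Then $V^{\perp}_{\vec v_1} = \{\vec w \mid w_1 + w_2 = 0\} = \mathrm{span}\{\vec v_2\}$ with $\vec v_2 = \hat e_1 - \hat e_2$ and $B(\vec v_2, \vec v_2) = -2$. Dividing each $\vec v_i$ by $\sqrt{2}$ yields $[B]^{\tilde{\mathcal B}} = \mathrm{diag}\{1, -1\}$: rank $2$ and signature $(1, 1)$.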
Anti-symmetric Bilinear Forms. Now if $B : V \times V \to \mathbb{R}$ is anti-symmetric (or "alternating"), i.e.
$$B(\vec u, \vec v) = -B(\vec v, \vec u) \quad \text{for all } \vec u, \vec v \in V,$$
then $\perp$ is still a symmetric relation. In this case the matrix of $B$ with respect to any basis is skew-symmetric, ${}^t[B]^{\mathcal B} = -[B]^{\mathcal B}$, and the analogue of Sylvester's Theorem is actually simpler:

PROPOSITION 6. There exists a basis $\mathcal B$ such that (in terms of matrix blocks)
$$[B]^{\mathcal B} = \begin{pmatrix} 0 & -I_m & 0 \\ I_m & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
where $r = 2m \leq n$ is the rank of $B$.

REMARK 7. This matrix can be written more succinctly as
$$[B]^{\mathcal B} = \begin{pmatrix} J_{2m} & 0 \\ 0 & 0 \end{pmatrix},$$
where we define
$$(1) \qquad J_{2m} := \begin{pmatrix} 0 & -I_m \\ I_m & 0 \end{pmatrix}.$$
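For instance (a supplementary illustration): on $\mathbb{R}^2$, take $B(\vec u, \vec w) := u_2w_1 - u_1w_2$, which up to sign is the $2 \times 2$ determinant of the matrix with columns $\vec u, \vec w$. This is alternating, and
$$[B]^{\hat e} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = J_2,$$
so here $m = 1$, the rank is $2$, and no change of basis was even needed.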
PROOF OF PROPOSITION 6. We induce on $n$: clearly the result holds ($B = 0$) if $n = 1$, since $B(\vec v, \vec v) = -B(\vec v, \vec v)$ must be $0$. Now assume the result is known for all dimensions less than $n$.

If $V$ (of dimension $n$) contains a vector $\vec v$ with $B(\vec v, \vec w) = 0$ for all $\vec w \in V$, take $\vec v_n := \vec v$. Let $U \subset V$ be any $(n-1)$-dimensional subspace with $V = \mathrm{span}\{\vec v\} \oplus U$, and obtain the rest of the basis by applying induction to $U$.

If $V$ does not contain such a vector, then the columns of $[B]$ (with respect to any basis) are independent and $\det([B]) \neq 0$, with the immediate consequence that $n = 2m$ is even (by Exercise IV.B.4). So choose $\vec v_{2m} \in V \setminus \{\vec 0\}$ arbitrarily, and $\vec v_m \in V$ so that $B(\vec v_{2m}, \vec v_m) = 1$. Writing $W := \mathrm{span}\{\vec v_m, \vec v_{2m}\}$ and $W^{\perp} := \{\vec v \in V \mid B(\vec v, \vec w) = 0\ \forall \vec w \in W\}$, we claim that $V = W \oplus W^{\perp}$. Indeed, if $\vec w \in W \cap W^{\perp}$ then $\vec w = a\vec v_m + b\vec v_{2m}$ has $0 = B(\vec w, \vec v_m) = b$ and $0 = B(\vec w, \vec v_{2m}) = -a$, whence $\vec w = \vec 0$. Moreover, given $\vec v \in V$ we may consider
$$\vec w := B(\vec v, \vec v_m)\,\vec v_{2m} - B(\vec v, \vec v_{2m})\,\vec v_m \in W$$
and $\vec w' := \vec v - \vec w$, which satisfies
$$B(\vec w', \vec v_m) = B(\vec v, \vec v_m) - B(\vec v, \vec v_m)B(\vec v_{2m}, \vec v_m) + B(\vec v, \vec v_{2m})\underbrace{B(\vec v_m, \vec v_m)}_{0} = 0$$
and
$$B(\vec w', \vec v_{2m}) = B(\vec v, \vec v_{2m}) - B(\vec v, \vec v_m)\underbrace{B(\vec v_{2m}, \vec v_{2m})}_{0} + B(\vec v, \vec v_{2m})B(\vec v_m, \vec v_{2m}) = 0.$$
Thus $\vec w' \in W^{\perp}$, $\vec v = \vec w + \vec w'$, and the claim is proved. Now apply induction to $W^{\perp}$ to get the remaining $2m - 2$ basis elements $\vec v_1, \ldots, \vec v_{m-1}, \vec v_{m+1}, \ldots, \vec v_{2m-1}$.

Hermitian Symmetric Forms. Given an $n$-dimensional complex vector space $V/\mathbb{C}$, a Hermitian-linear (or sesquilinear²) form $H : V \times V \to \mathbb{C}$ satisfies
$$H(\alpha\vec u, \vec w) = \bar\alpha H(\vec u, \vec w), \qquad H(\vec u, \alpha\vec w) = \alpha H(\vec u, \vec w),$$
$$H(\vec u_1 + \vec u_2, \vec w) = H(\vec u_1, \vec w) + H(\vec u_2, \vec w), \qquad H(\vec u, \vec w_1 + \vec w_2) = H(\vec u, \vec w_1) + H(\vec u, \vec w_2).$$

²from the Latin for "one and a half," since it's only conjugate-linear in the first entry
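The model example (a supplementary remark) is the complex analogue of the dot product: on $\mathbb{C}^n$,
$$H(\vec u, \vec w) := \sum_{i=1}^n \bar u_i w_i.$$
The conjugation in the first slot is exactly what makes $H(\vec v, \vec v) = \sum_i |v_i|^2$ real and nonnegative, something no $\mathbb{C}$-bilinear form could achieve (for such a form, $B(i\vec v, i\vec v) = -B(\vec v, \vec v)$).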
A sesquilinear form $H$ is Hermitian symmetric if it also satisfies $H(\vec u, \vec w) = \overline{H(\vec w, \vec u)}$; we will often call such forms simply Hermitian. The passage to matrices (where as usual $\mathcal B = \{\vec v_1, \ldots, \vec v_n\}$, $\vec u = \sum u_i\vec v_i$, $\vec w = \sum w_j\vec v_j$, $h_{ij} = H(\vec v_i, \vec v_j)$) looks like
$$H(\vec u, \vec w) = \sum_{i,j} \bar u_i\, h_{ij}\, w_j = {}^t\overline{[\vec u]_{\mathcal B}}\,[H]^{\mathcal B}\,[\vec w]_{\mathcal B};$$
the relevant version of cogredience for coordinate transformations is ${}^t\bar S\,[H]\,S$. If $H$ is Hermitian symmetric then $h_{ij} = \overline{h_{ji}}$; that is, $[H]^{\mathcal B} = {}^t\overline{[H]^{\mathcal B}}$ is a Hermitian symmetric matrix. For such forms/matrices Sylvester holds verbatim.

For the purpose of studying inner product spaces in the next two sections, you can think of Hermitian symmetric matrices ($A = {}^t\bar A$) as being the right complex generalization of the real symmetric matrices (which they include), and similarly for the forms. But there is an important caveat here: Hermitian forms aren't really bilinear over $\mathbb{C}$, due to their conjugate-linearity in the first argument. If we view $V$ as a $2n$-dimensional vector space over $\mathbb{R}$, then they are bilinear over $\mathbb{R}$. In fact, if we go that route, then we may view
$$H = B_{\mathrm{re}} + iB_{\mathrm{im}} : \mathbb{R}^{2n} \times \mathbb{R}^{2n} \to \mathbb{C} = \mathbb{R} \oplus i\mathbb{R}$$
as splitting up into two bilinear forms over $\mathbb{R}$. Moreover, since
$$B_{\mathrm{re}}(\vec v, \vec u) + iB_{\mathrm{im}}(\vec v, \vec u) = H(\vec v, \vec u) = \overline{H(\vec u, \vec v)} = B_{\mathrm{re}}(\vec u, \vec v) - iB_{\mathrm{im}}(\vec u, \vec v)$$
for all $\vec u, \vec v \in \mathbb{R}^{2n}$, $B_{\mathrm{re}}$ is symmetric and $B_{\mathrm{im}}$ is anti-symmetric.

Now given a basis $\mathcal B = \{\vec v_1, \ldots, \vec v_n\}$ for $V$ as a $\mathbb{C}$-vector space,
$$\mathcal B' := \{\vec v_1, \ldots, \vec v_n, i\vec v_1, \ldots, i\vec v_n\} =: \{\vec w_1, \ldots, \vec w_{2n}\}$$
is a basis for $V$ as an $\mathbb{R}$-vector space.
The matrix of the map³
$$\mathcal J := \{\text{multiplication by } i\} : V \to V$$
is clearly $[\mathcal J]_{\mathcal B'} = J_{2n}$ (as defined in (1)). The complex anti-linearity of $H$ in the first argument gives
$$B_{\mathrm{re}}(\mathcal J\vec u, \vec v) + iB_{\mathrm{im}}(\mathcal J\vec u, \vec v) = H(i\vec u, \vec v) = -iH(\vec u, \vec v) = B_{\mathrm{im}}(\vec u, \vec v) - iB_{\mathrm{re}}(\vec u, \vec v),$$
hence $B_{\mathrm{re}}(\mathcal J\vec u, \vec v) = B_{\mathrm{im}}(\vec u, \vec v)$ and $B_{\mathrm{im}}(\mathcal J\vec u, \vec v) = -B_{\mathrm{re}}(\vec u, \vec v)$. Writing $\vec v = \mathcal J\vec w$, this gives
$$B_{\mathrm{re}}(\mathcal J\vec u, \mathcal J\vec w) = B_{\mathrm{im}}(\vec u, \mathcal J\vec w) = -B_{\mathrm{im}}(\mathcal J\vec w, \vec u) = B_{\mathrm{re}}(\vec w, \vec u) = B_{\mathrm{re}}(\vec u, \vec w)$$
and similarly $B_{\mathrm{im}}(\mathcal J\vec u, \mathcal J\vec w) = B_{\mathrm{im}}(\vec u, \vec w)$. In this way one deduces that giving a Hermitian form on $V$ is equivalent to giving a (real) symmetric or antisymmetric form which is invariant under the complex structure map $\mathcal J$. Finally, if $[H]^{\mathcal B} = \mathrm{diag}\{+1, \ldots, +1, -1, \ldots, -1, 0, \ldots, 0\} =: D$ then one finds that
$$[B_{\mathrm{re}}]^{\mathcal B'} = \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} \quad\text{and}\quad [B_{\mathrm{im}}]^{\mathcal B'} = \begin{pmatrix} 0 & D \\ -D & 0 \end{pmatrix};$$
that is, the signature of $B_{\mathrm{re}}$ is exactly double that of $H$.

Quadratic Forms. Now let's return to unambiguously real vector spaces.

DEFINITION 8. A quadratic form is a function $Q : V \to \mathbb{R}$ of the form
$$Q(\vec u) = B(\vec u, \vec u),$$
for some symmetric bilinear form $B$.

³Conversely, given a $2n$-dimensional real vector space $V$ with a linear transformation $\mathcal J$ satisfying $\mathcal J^2 = -\mathrm{Id}$ (or equivalently, with matrix $J_{2n}$ in some basis), we call $\mathcal J$ a complex structure on $V$.
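To make Definition 8 concrete (a supplementary example): on $\mathbb{R}^2$, the symmetric matrix $\left(\begin{smallmatrix} 1 & 3 \\ 3 & 2 \end{smallmatrix}\right)$ gives
$$Q(x_1, x_2) = x_1^2 + 6x_1x_2 + 2x_2^2;$$
note how each off-diagonal entry contributes half of the corresponding cross-term's coefficient. Conversely, $B$ is recovered from $Q$ by polarization,
$$B(\vec u, \vec w) = \tfrac{1}{2}\big(Q(\vec u + \vec w) - Q(\vec u) - Q(\vec w)\big),$$
exactly the identity used in the proof of Lemma 3; so $Q$ and $B$ carry the same information.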
Let $\mathcal B = \{\vec v_1, \ldots, \vec v_n\}$, use this basis to expand a given vector $\vec u = \sum x_i\vec v_i$, and set $b_{ij} = B(\vec v_i, \vec v_j) =$ entries of $[B]^{\mathcal B}$; then
$$Q(\vec u) = {}^t[\vec u]_{\mathcal B}\,[B]^{\mathcal B}\,[\vec u]_{\mathcal B} = \sum_{i,j} x_i b_{ij} x_j.$$
Basically this is a homogeneous quadratic function of several variables, with cross-terms (like $x_2x_3$). Sylvester effectively says "so long" to the cross-terms. It gives us a new basis $\mathcal C = \{\vec w_1, \ldots, \vec w_n\}$ such that $[B]^{\mathcal C} = \mathrm{diag}\{\underbrace{+1, \ldots, +1}_{p}, \underbrace{-1, \ldots, -1}_{q}, 0, \ldots, 0\}$; expanding $\vec u = \sum y_i\vec w_i$ in this new basis, we have
$$(2) \qquad Q(\vec u) = {}^t[\vec u]_{\mathcal C}\,[B]^{\mathcal C}\,[\vec u]_{\mathcal C} = y_1^2 + \ldots + y_p^2 - (y_{p+1}^2 + \ldots + y_{p+q}^2).$$

Application to Calculus. Say $f : \mathbb{R}^n \to \mathbb{R}$ (nonlinear) has a stationary point at $\vec 0 = (0, \ldots, 0)$:
$$\frac{\partial f}{\partial x_1}\Big|_{\vec 0} = \ldots = \frac{\partial f}{\partial x_n}\Big|_{\vec 0} = 0.$$
Consider its Taylor expansion about $\vec 0$:
$$f(x_1, \ldots, x_n) = \text{constant} + \sum_{i,j=1}^n x_ix_j\,b_{ij} + \text{higher-order terms},$$
where
$$b_{ij} = \frac{1}{2}\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}\Big|_{\vec 0}.$$
Up to the overall factor of $\frac{1}{2}$, these are the entries of the so-called Hessian matrix of $f$ (evaluated at $\vec 0$); sometimes $\mathrm{Hess}(f)$ is just called the "matrix of second partials" in calculus texts. Now Sylvester's theorem, applied exactly as above, gives us a linear change of coordinates $(y_1, \ldots, y_n)$ so that the second term in the Taylor series takes exactly the form (2). This is otherwise known as completing squares! If $p = n$ then we have $y_1^2 + \ldots + y_n^2$ and a local minimum; if $q = n$ then $-y_1^2 - \ldots - y_n^2$ and a local maximum. The other cases where $p + q = n$ are saddle points; and if $p + q < n$ ($\mathrm{Hess}(f)|_{\vec 0}$ not of maximal rank) then one must look to the higher-order terms. Incidentally, in this latter case we say $\vec 0$ is a degenerate stationary point.

Is there any way you can tell what situation you're in without finding the basis guaranteed by Sylvester's theorem? Certainly if $\det(\mathrm{Hess}(f)|_{\vec 0}) \neq 0$ then $p + q = n$ and we have a nondegenerate critical point. Where do we go from there? If $n = 2$ (notice that this is the situation in your calculus text), then we can use the fact that determinants of cogredient matrices have the same sign:
$${}^tSAS = B \implies \det A \cdot (\det S)^2 = \det B.$$
If $\det(\mathrm{Hess}(f)|_{\vec 0}) > 0$ then either $(p, q) = (2, 0)$ or $(0, 2)$ (a "$++$" or "$--$" situation), which means a local extremum; if $\det(\mathrm{Hess}(f)|_{\vec 0}) < 0$ then we have $p = q = 1$ ("$+-$") and a saddle point. More generally ($n > 2$), if $\mathrm{Hess}(f)|_{\vec 0}$ is diagonalizable over $\mathbb{R}$, that could completely substitute for Sylvester (how?). In fact, in the next section we will see that real symmetric matrices can always be diagonalized over $\mathbb{R}$!
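A supplementary worked instance of the $n = 2$ test: let $f(x, y) = x^3 + y^3 - 3xy$. Then $\nabla f = (3x^2 - 3y,\ 3y^2 - 3x)$ vanishes exactly at $(0, 0)$ and $(1, 1)$, and
$$\mathrm{Hess}(f) = \begin{pmatrix} 6x & -3 \\ -3 & 6y \end{pmatrix}.$$
At $(0, 0)$ the determinant is $-9 < 0$, so $(p, q) = (1, 1)$: a saddle point. At $(1, 1)$ the determinant is $36 - 9 = 27 > 0$ with positive upper-left entry, so $(p, q) = (2, 0)$: a local minimum.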
Nondegenerate Bilinear Forms. A bilinear (or Hermitian) form $B$ on a vector space $V$ is said to be nondegenerate if and only if, for every nonzero vector $\vec v \in V$, there exists $\vec w \in V$ such that $B(\vec v, \vec w) \neq 0$. (Equivalently, the matrix of the form has nonzero determinant.)

A nondegenerate symmetric form $B$ is called orthogonal.⁴ If the signs in Sylvester's theorem are all positive (resp. negative), $B$ is positive (resp. negative) definite; otherwise, $B$ is indefinite of signature $(p, q)$, where $p + q = n$.

A nondegenerate anti-symmetric form $B$ is termed symplectic. In this case $\dim(V)$ is necessarily even, $n = 2m$, and (in some basis $\mathcal B$) $[B]^{\mathcal B} = J_{2m}$.

A nondegenerate Hermitian form $H$ is called unitary. The same terminology applies as in the orthogonal case.

⁴Calling a form "orthogonal" or "unitary" is a bit nonstandard, but is more consistent with the standard use of "symplectic."
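As a quick check on these definitions (supplementary): on $\mathbb{R}^2$ the symmetric form with matrix $\mathrm{diag}\{1, 0\}$, i.e. $B(\vec u, \vec w) = u_1w_1$, is degenerate, since $\hat e_2$ pairs to zero with every vector; and indeed the determinant vanishes. For a symmetric form, nondegeneracy is exactly the statement that no $0$'s appear in its Sylvester normal form, i.e. that $r = p + q = n$.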
EXAMPLE 9. On $\mathbb{R}^4$ with "spacetime" coordinates $(x, y, z, t)$, we define an indefinite orthogonal form of signature $(3, 1)$ via its quadratic form $x^2 + y^2 + z^2 - t^2$. The resulting Minkowski space is closely associated with the theory of special relativity. Nondegenerate symplectic and indefinite-orthogonal forms also play a central role in the topology of projective manifolds.

We conclude this section with a brief (and somewhat sketchy) account of the simplest of the so-called classical groups. This material will not be used in the remainder of the text.

Linear Algebraic Groups. These are those subgroups of $\mathrm{GL}_n(\mathbb{C})$ or $\mathrm{GL}_n(\mathbb{R})$, the general linear groups of invertible $n \times n$ matrices with entries in $\mathbb{C}$ or $\mathbb{R}$, that are defined by linear-algebraic conditions. We'll only be interested in examples of reductive such groups, which are the ones with the property that whenever a vector subspace $W$ is closed under their action on a vector space $V$, there is another subspace $W'$ also closed under this action, such that $V = W \oplus W'$. In general, one can show (Chevalley's Theorem) that the reductive linear algebraic groups are the subgroups of $\mathrm{GL}_n$ defined by the property of fixing a subalgebra of the tensor algebra⁵ $\bigoplus_{a,b} V^{\otimes a} \otimes (V^*)^{\otimes b}$, where $V = \mathbb{C}^n$ resp. $\mathbb{R}^n$. But that is a subject for a different course; here we'll just define a few of these groups, using only determinants and nondegenerate bilinear forms.

⁵We have not discussed tensor products of vector spaces, but they're not hard to define: given vector spaces $V$ and $W$ with bases $\{\vec v_i\}_{i=1}^n$ and $\{\vec w_j\}_{j=1}^m$, you simply take $V \otimes W$ to be the $nm$-dimensional vector space with basis $\{\vec v_i \otimes \vec w_j\}_{i=1,\ldots,n;\ j=1,\ldots,m}$. (These symbols obey the distributive property.) These aren't unfamiliar objects, either. You could try to prove, for instance, that $V^* \otimes W$ is the vector space of linear transformations from $V$ to $W$, or that a bilinear form is an element of $(V^*)^{\otimes 2}$.

For $B$ orthogonal resp. symplectic, the corresponding orthogonal resp. symplectic group consists of all $g \in \mathrm{GL}_n(F)$ ($F = \mathbb{R}$ or $\mathbb{C}$)
satisfying
$$B(g\vec v, g\vec w) = B(\vec v, \vec w) \quad \forall\, \vec v, \vec w \in F^n.$$
(We write $\mathrm{Sp}_n(F, B)$ resp. $\mathrm{O}_n(F, B)$ for these groups.) By Proposition 6, all the symplectic groups are isomorphic (via conjugation by a change-of-basis matrix) to
$$\mathrm{Sp}_{2m}(F) := \{g \in \mathrm{GL}_{2m}(F) \mid {}^tg\,J_{2m}\,g = J_{2m}\}$$
for $F = \mathbb{C}$ resp. $\mathbb{R}$. The real orthogonal groups are isomorphic to one of the
$$\mathrm{O}(p, q) := \{g \in \mathrm{GL}_{p+q}(\mathbb{R}) \mid {}^tg\,I_{p,q}\,g = I_{p,q}\}, \quad\text{where } I_{p,q} := \mathrm{diag}\{I_p, -I_q\}.$$
Note that $\mathrm{O}(p, q) \cong \mathrm{O}(q, p)$, and $\mathrm{O}(n, 0)$ is written $\mathrm{O}_n(\mathbb{R})$; these are called indefinite resp. definite orthogonal groups. All the complex orthogonal groups are isomorphic to
$$\mathrm{O}_n(\mathbb{C}) := \{g \in \mathrm{GL}_n(\mathbb{C}) \mid {}^tg\,g = I_n\}.$$
The unitary groups $\mathrm{U}_n(H)$ are defined by the property of preserving a Hermitian form: they consist of those $g \in \mathrm{GL}_n(\mathbb{C})$ with
$$(3) \qquad H(g\vec u, g\vec v) = H(\vec u, \vec v) \quad \forall\, \vec u, \vec v \in \mathbb{C}^n,$$
and are each conjugate to one of the
$$\mathrm{U}(p, q) := \{g \in \mathrm{GL}_{p+q}(\mathbb{C}) \mid {}^t\bar g\,I_{p,q}\,g = I_{p,q}\}.$$
(Write $\mathrm{U}(n, 0) =: \mathrm{U}(n)$ for the definite unitary group.)
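For a concrete glimpse (a supplementary example): $\mathrm{U}(1) = \{e^{i\theta}\} \subset \mathrm{GL}_1(\mathbb{C})$ preserves $H(u, w) = \bar uw$ on $\mathbb{C}$. Regarding $\mathbb{C}$ as $\mathbb{R}^2$ via $z \mapsto (\mathrm{Re}(z), \mathrm{Im}(z))$, multiplication by $e^{i\theta}$ becomes the real rotation matrix $\left(\begin{smallmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{smallmatrix}\right)$, which preserves both $B_{\mathrm{re}}(u, w) = \mathrm{Re}(\bar uw)$ (the dot product on $\mathbb{R}^2$) and $B_{\mathrm{im}}(u, w) = \mathrm{Im}(\bar uw) = u_1w_2 - u_2w_1$ (a symplectic form on $\mathbb{R}^2$): a first instance of the claim proved next.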
But this is a little deceptive, as (3) is not a $\mathbb{C}$-linear condition: the $\mathrm{U}(p, q)$ are actually real linear algebraic groups. If (as above) we think of $\mathbb{C}^n$ as $\mathbb{R}^{2n}$, and write $H = B_{\mathrm{re}} + iB_{\mathrm{im}}$, then I claim that
$$\mathrm{U}_n(H) = \mathrm{O}_{2n}(\mathbb{R}, B_{\mathrm{re}}) \cap \mathrm{Sp}_{2n}(\mathbb{R}, B_{\mathrm{im}})$$
as a subgroup of $\mathrm{GL}_{2n}(\mathbb{R})$. The inclusion $\subseteq$ is clear, since to preserve $H$, $\gamma \in \mathrm{GL}_{2n}(\mathbb{R})$ must preserve its real and imaginary parts. Conversely, we just need to show that preserving $B_{\mathrm{re}}$ and $B_{\mathrm{im}}$ also implies that $\gamma$ comes from an element of $\mathrm{GL}_n(\mathbb{C})$, or equivalently commutes with $\mathcal J$ (cf. Exercise 6 below). This follows from
$$B_{\mathrm{re}}(\gamma\mathcal J\vec u, \gamma\vec v) = B_{\mathrm{re}}(\mathcal J\vec u, \vec v) = B_{\mathrm{im}}(\vec u, \vec v) = B_{\mathrm{im}}(\gamma\vec u, \gamma\vec v) = B_{\mathrm{re}}(\mathcal J\gamma\vec u, \gamma\vec v)$$
and nondegeneracy of $B_{\mathrm{re}}$.

The special linear groups $\mathrm{SL}_n(F)$ consist of the elements of $\mathrm{GL}_n(F)$ with determinant $1$. They contain the symplectic groups (see Exercise 5), from which it also follows that $\mathrm{SL}_{2n}(\mathbb{R})$ contains $\mathrm{U}_n(H)$. On the other hand, $\mathrm{SL}_n(\mathbb{C})$ does not contain $\mathrm{U}_n(H)$; intersecting them gives the special unitary groups $\mathrm{SU}_n(H)$ (and $\mathrm{SU}(p, q)$). The special orthogonal groups (denoted $\mathrm{SO}_n(F, B)$, $\mathrm{SO}(p, q)$, etc.) are given by intersecting the orthogonal and special linear groups.

Jordan Decomposition. Let $G$ be one of the above linear algebraic groups (over $\mathbb{R}$ or $\mathbb{C}$).⁶ Since $G$ is a subgroup of $\mathrm{GL}_n(F)$ for some $n$, we may think of elements of $G$ as invertible matrices (with additional constraints) acting on $\mathbb{R}^n$ or $\mathbb{C}^n$. We conclude this section with a fundamental result that is closely related to the Jordan normal form.

DEFINITION 10. An element $g \in G$ is
semisimple if $g$ is diagonalizable over $\mathbb{C}$;
unipotent if a power of $(g - I_n)$ is zero.

THEOREM 11. Every $g \in G$ may be written UNIQUELY as a product
$$(4) \qquad g = g_{\mathrm{ss}}\,g_{\mathrm{un}}$$
of COMMUTING semisimple and unipotent elements OF $G$.

PROOF. Begin with the case of $G = \mathrm{GL}_n(\mathbb{C})$. A Jordan form matrix $J$ admits such a decomposition, since each block decomposes (note $\sigma \neq 0$, as $g$ is invertible):
$$\begin{pmatrix} \sigma & 1 & & \\ & \sigma & \ddots & \\ & & \ddots & 1 \\ & & & \sigma \end{pmatrix} = \begin{pmatrix} \sigma & & & \\ & \sigma & & \\ & & \ddots & \\ & & & \sigma \end{pmatrix}\begin{pmatrix} 1 & \sigma^{-1} & & \\ & 1 & \ddots & \\ & & \ddots & \sigma^{-1} \\ & & & 1 \end{pmatrix}.$$

⁶Even more generally, Theorem 11 will hold for all reductive linear algebraic groups over a perfect field, though that is obviously beyond our scope here.
By Theorem VI.E.8, we have
$$g = \gamma J(g)\gamma^{-1} = \gamma J_{\mathrm{ss}}J_{\mathrm{un}}\gamma^{-1} = \underbrace{\gamma J_{\mathrm{ss}}\gamma^{-1}}_{=:\, g_{\mathrm{ss}}}\,\underbrace{\gamma J_{\mathrm{un}}\gamma^{-1}}_{=:\, g_{\mathrm{un}}},$$
and $g_{\mathrm{ss}}, g_{\mathrm{un}}$ commute because $J_{\mathrm{ss}}, J_{\mathrm{un}}$ do. Conversely, given a decomposition (4) and $\vec v \in E_{\sigma}(g_{\mathrm{ss}})$, we have
$$(g - \sigma I_n)^n\vec v = (g_{\mathrm{un}}g_{\mathrm{ss}} - \sigma I_n)^n\vec v = \sigma^n(g_{\mathrm{un}} - I_n)^n\vec v = \vec 0$$
since $g_{\mathrm{ss}}$ and $g_{\mathrm{un}}$ commute, and so $\vec v \in E^s_{\sigma}(g)$. Writing $\mathbb{C}^n = \oplus_{j=1}^s E_{\sigma_j}(g_{\mathrm{ss}})$, this shows that $E_{\sigma_j}(g_{\mathrm{ss}}) \subseteq E^s_{\sigma_j}(g)$. Since the dimensions of the $E^s_{\sigma_j}(g)$ cannot sum to more than $n$, these inclusions are equalities. Therefore the eigenspaces of $g_{\mathrm{ss}}$, and so $g_{\mathrm{ss}}$ itself (and thus $g_{\mathrm{un}} = g\,g_{\mathrm{ss}}^{-1}$), are determined uniquely by $g$.

It remains to show that if $g$ belongs to one of the classical groups $G \subset \mathrm{GL}_n(\mathbb{C})$ above, then $g_{\mathrm{ss}}$ and $g_{\mathrm{un}}$ are as well. First, if $g$ has real entries, then $\ker^s(\sigma I - g)$ $(= E_{\sigma}(g_{\mathrm{ss}}))$ and $\ker^s(\bar\sigma I - g)$ $(= E_{\bar\sigma}(g_{\mathrm{ss}}))$ are conjugate, from which one deduces that $g_{\mathrm{ss}}$ (hence $g_{\mathrm{un}}$) is real. Next, if $\det(g) = 1$, then since the determinant of a unipotent matrix is always $1$, $\det(g_{\mathrm{ss}}) = \det(g) = 1 = \det(g_{\mathrm{un}})$.

Finally, if $g$ preserves a nondegenerate symmetric or alternating bilinear form $B$, we claim that $g_{\mathrm{ss}}, g_{\mathrm{un}}$ do too. (This will finish the proof, as all the above groups are cut out of $\mathrm{GL}_n(\mathbb{C})$ by some combination of these constraints.) Write $V_i := E^s_{\sigma_i}(g) = E_{\sigma_i}(g_{\mathrm{ss}})$. Since any vector decomposes into a sum of vectors in these $V_i$, it will suffice to show that
$$(5) \qquad B(g\vec v, g\vec w) = B(\vec v, \vec w) \quad \forall\, i, j,\ \vec v \in V_i,\ \vec w \in V_j$$
implies
$$(6) \qquad B(g_{\mathrm{ss}}\vec v, g_{\mathrm{ss}}\vec w) = B(\vec v, \vec w) \quad \forall\, i, j,\ \vec v \in V_i,\ \vec w \in V_j.$$
The latter is equivalent to
$$(7) \qquad B(V_i, V_j) = 0 \quad\text{if } \sigma_i\sigma_j \neq 1,$$
since the LHS of (6) is just $\sigma_i\sigma_jB(\vec v, \vec w)$. Suppose $\sigma_i\sigma_j \neq 1$; then $(\sigma_i^{-1}I - g)$ is invertible on $V_j$, and $g$ is invertible on $V_i$. Given $\vec v \in V_i$, $\vec w \in V_j$, write $\vec w_l := (\sigma_i^{-1}I - g)^{-l}\vec w \in V_j$ and $\vec v_l := g^{-l}\vec v \in V_i$; and note that $(g - \sigma_iI)^kV_i = \{\vec 0\}$ for some $k$. Applying (5),
$$B(\vec v, \vec w) = B\big(\vec v, (\sigma_i^{-1}I - g)\vec w_1\big) = \sigma_i^{-1}B(\vec v, \vec w_1) - B(\vec v, g\vec w_1) = \sigma_i^{-1}B(\vec v, \vec w_1) - B(\vec v_1, \vec w_1)$$
$$= \sigma_i^{-1}B\big((g - \sigma_iI)\vec v_1, \vec w_1\big) = \cdots = \sigma_i^{-k}B\big((g - \sigma_iI)^k\vec v_k, \vec w_k\big) = 0$$
establishes (7) and we are done.

Exercises

(1) A quadratic form on $\mathbb{R}^3$ is given by
$$Q(x, y, z) = x^2 + 3y^2 + z^2 - 4xy + 2xz - 2yz.$$
(a) Write the matrix of the corresponding symmetric bilinear form $B$ with respect to $\hat e$. (Be careful: if $-4$ is an entry in your matrix, it isn't quite correct.)
(b) Find $S$ such that ${}^tS\,[B]^{\hat e}\,S$ is diagonal, by following the steps in the proof of Sylvester's theorem.
(c) What is the signature of $B$?

(2) Consider the vector space $P_2(\mathbb{R})$ of polynomials of degree $\leq 2$, with (symmetric) bilinear form $B(f, g) := \int_0^1 f(t)g(t)\,dt$.
(a) Find a basis $\mathcal B$ of $P_2(\mathbb{R})$ in which $[B]^{\mathcal B} = I_3$. [Hint: work as if $B$ were the dot product and use Gram-Schmidt, starting from the basis $\{1, t, t^2\}$.]
(b) Let $T : P_2(\mathbb{R}) \to P_2(\mathbb{R})$ be the shift operator $Tf(t) = f(t - 1)$. Compute $[T]_{\mathcal B}$ (where $\mathcal B$ is the basis found in part (a)).

(3) Find a basis for the vector space(!) of all alternating bilinear forms on $\mathbb{R}^n$. What is its dimension? [Hint: you could use matrices.]

(4) Write out the proof of Sylvester's theorem for Hermitian forms.
(5) Show that $\mathrm{Sp}_{2m}(F)$ is in fact a subgroup of $\mathrm{SL}_{2m}(F)$. [Hint: use Jordan decomposition to reduce to $g_{\mathrm{ss}}$.]

(6) Deduce that the image of $\mathrm{GL}_n(\mathbb{C}) \hookrightarrow \mathrm{GL}_{2n}(\mathbb{R})$ given by regarding $\mathbb{C}^n$ as $\mathbb{R}^{2n}$ (via $(z_1, \ldots, z_n) \mapsto (\mathrm{Re}(z_1), \ldots, \mathrm{Re}(z_n), \mathrm{Im}(z_1), \ldots, \mathrm{Im}(z_n))$) consists of those elements commuting with $J_{2n}$.

(7) (i) Let $B$ be an orthogonal form on $F^n$, and $\vec w \in F^n$ be such that $B(\vec w, \vec w) = 2$. Show that (the matrix of) the reflection $\mu(\vec v) := \vec v - B(\vec w, \vec v)\vec w$ belongs to $\mathrm{O}_n(F, B)$.
(ii) Let $B$ be a symplectic form on $F^{2m}$, and $\vec w \in F^{2m}$ nonzero. Show that (the matrix of) the transvection $\tau(\vec v) := \vec v - B(\vec w, \vec v)\vec w$ belongs to $\mathrm{Sp}_{2m}(F, B)$.