Determinants of Block Matrices and Schur's Formula

by Istvan Kovacs, Daniel S. Silver*, and Susan G. Williams*

For what is the theory of determinants? It is an algebra upon algebra; a calculus which enables us to combine and foretell the results of algebraic operations, in the same way as algebra itself enables us to dispense with the performance of the special operations of arithmetic. All analysis must ultimately clothe itself under this form.
J. J. Sylvester, 1851

Introduction. Most undergraduate students become acquainted with the determinant of a matrix by the end of their second or third year of studies. Determinants are usually introduced to classify systems of linear equations. But as many students later learn, determinants are also a rich source of inspiration about algebra, geometry, and combinatorics.

The purpose of this article is twofold. Our first goal is to discuss Laplace expansions. This important generalization of the standard definition of the determinant has quietly slipped away from college textbooks and courses. Laplace expansions lead naturally to questions about the determinants of block matrices. One important consequence is a special case of Schur's Formula (see Proposition 1). Our second purpose is to present a generalization of Schur's Formula that is very natural and yet seems to be absent from the vast literature of matrix theory.

It is a pleasure to thank our colleague Professor Dan Flath for helpful suggestions.

Laplace Expansions. Let $A = (a_{ij})$ be an $n \times n$ matrix. (We will assume that the entries $a_{ij}$ of any matrix that we consider are contained in a ring $R$. The reader who is unfamiliar with rings should think of the entries as real or complex numbers.) The $n$th-order determinant of $A$, denoted here by $|A|$, is usually defined as a sum of $n!$ terms. Each term is the product of $+1$ or $-1$ together with $n$ elements of the matrix such that no two elements belong to the same row or column.
In the language of permutations, this is expressed by the following formula:
$$|A| = \sum_{\pi \in S_n} (\operatorname{sgn} \pi)\, a_{1\pi(1)} a_{2\pi(2)} \cdots a_{n\pi(n)}.$$
Here the summation is taken over the elements of the set $S_n$ of all possible permutations of the first $n$ natural numbers $\{1, 2, \dots, n\}$. The symbol $\operatorname{sgn} \pi$ denotes the parity of the permutation $\pi$ (see [5] or [6]). The determinant is an element of the coefficient ring $R$.

For the trivial case $n = 1$, the determinant of $A$ is equal to $a_{11}$. For $n = 2$, we have
$$(1)\qquad |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

*The second and third authors are partially supported by National Science Foundation Grant DMS-9704399.
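The $n!$-term permutation sum can be computed directly for small matrices. The following sketch (the function names `sgn` and `det` are ours, not the article's) implements the formula above and checks it against the $2 \times 2$ rule (1):

```python
from itertools import permutations
from math import prod

def sgn(pi):
    """Parity of a permutation pi, given as a tuple of 0-based indices:
    +1 if pi has an even number of inversions, -1 otherwise."""
    inversions = sum(pi[i] > pi[j]
                     for i in range(len(pi))
                     for j in range(i + 1, len(pi)))
    return -1 if inversions % 2 else 1

def det(a):
    """Determinant of a square matrix `a` (a list of rows) as the
    sum over all n! permutations of signed products."""
    n = len(a)
    return sum(sgn(pi) * prod(a[i][pi[i]] for i in range(n))
               for pi in permutations(range(n)))

# The 2x2 case reproduces formula (1): a11*a22 - a12*a21.
print(det([[2, 3], [5, 7]]))                     # 2*7 - 3*5 = -1
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3
```

Because the sum ranges over all $n!$ permutations, this is practical only for very small $n$; the point of the Laplace expansion discussed next is precisely to break large determinants into smaller ones.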
The computation of very high order determinants is a challenge even for the fastest computers. Software packages reduce the problem to the computation of determinants of smaller and smaller orders. The traditional approach is the well-known co-factor expansion of the determinant along a selected row or column. This is a special case of the less widely known Laplace expansion theorem, according to which the determinant can be expanded along a selected set of rows or columns.

What is a Laplace expansion? Let $A = (a_{ik})$ be any $n \times n$ matrix. Choose any $r$ rows or $r$ columns ($1 \le r \le n$) of the matrix $A$. Let us say that we have selected $r$ columns corresponding to the indices $1 \le k_1 < k_2 < \cdots < k_r \le n$. (The same argument can be carried out for the case of row selections, and notationwise there will be no difference.) Now form the $r \times r$ submatrix $A_r$ determined by a choice of the row indices
$$(2)\qquad 1 \le i_1 < i_2 < \cdots < i_r \le n.$$
Delete from $A$ all the rows and columns that contain elements from $A_r$. The remaining elements form an $(n-r) \times (n-r)$ submatrix which we denote by $A_{n-r}$. The matrices $A_r$ and $A_{n-r}$ can be considered complements of one another relative to $A$.

Compute $|A_{n-r}|$ and multiply it by $(-1)^K$, where $K$ is equal to the sum of the row indices and the column indices that determine the position of $A_r$ in $A$; that is,
$$K = i_1 + i_2 + \cdots + i_r + k_1 + k_2 + \cdots + k_r.$$
The number $(-1)^K |A_{n-r}|$ is called the co-factor of $A_r$. Since the number of possible choices of the row indices (2) is the same as the number of ways $r$ objects can be chosen from $n$ objects, the number of $r \times r$ submatrices $A_r^{(1)}, \dots, A_r^{(N)}$ that we can form is
$$N = \binom{n}{r} = \frac{n!}{r!(n-r)!}.$$

We are ready to state the Laplace Expansion Theorem. If $A$ is an $n \times n$ matrix, then for any choice of $r$ columns (or $r$ rows) of $A$,
$$|A| = \sum_{i=1}^{N} (-1)^{K_i} |A_r^{(i)}|\, |A_{n-r}^{(i)}|.$$
The case $r = 1$ is the familiar co-factor expansion of $|A|$. On the other hand, the theorem is trivial when $r = n$. (We make the convention that the co-factor of $A$ is equal to 1.
) The proof of the Laplace Expansion Theorem is not difficult, but we omit it. The interested reader can refer to [3] or [6].
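The theorem can be transcribed directly into code. The sketch below (our own illustrative names, not from the article) expands $|A|$ along a chosen set of columns, summing signed products of complementary minors over all $\binom{n}{r}$ row choices, and checks the result against the permutation-sum definition:

```python
from itertools import combinations, permutations
from math import prod

def det(a):
    """Reference determinant via the permutation-sum definition."""
    n = len(a)
    total = 0
    for pi in permutations(range(n)):
        inv = sum(pi[i] > pi[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inv * prod(a[i][pi[i]] for i in range(n))
    return total

def sub(a, rows, cols):
    """Submatrix of `a` on the given (increasing) row and column indices."""
    return [[a[i][j] for j in cols] for i in rows]

def laplace_det(a, cols):
    """Laplace expansion of |a| along the columns in `cols` (0-based,
    increasing). The sign (-1)**K is unaffected by switching from the
    article's 1-based indices, since K shifts by the even number 2r."""
    n, r = len(a), len(cols)
    rest_cols = [j for j in range(n) if j not in cols]
    total = 0
    for rows in combinations(range(n), r):
        rest_rows = [i for i in range(n) if i not in rows]
        K = sum(rows) + sum(cols)
        total += ((-1) ** K
                  * det(sub(a, rows, cols))
                  * det(sub(a, rest_rows, rest_cols)))
    return total

A = [[1, 2, 0, 3],
     [4, 5, 6, 0],
     [0, 7, 8, 9],
     [1, 0, 2, 3]]
# Every choice of columns yields the same value, |A|.
print(laplace_det(A, (0,)), laplace_det(A, (0, 1)), laplace_det(A, (1, 2, 3)), det(A))
```

With `cols = (0,)` this is exactly the familiar co-factor expansion along the first column.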
Pierre Simon Laplace was not the inventor of determinants. That honor probably belongs to Gottfried Wilhelm Leibniz, who defined determinants in a letter of 1693 to Guillaume de l'Hôpital [1]. Eighty years later Laplace developed a general theory of determinants in order to attack physical problems. The Laplace Expansion Theorem is an example of an ancient (more than 50 years old) theorem that bears the name of its true discoverer.

The Laplace Expansion Theorem can be used to prove the following result.

Proposition 1. Assume that
$$M = \begin{pmatrix} A & B \\ O & D \end{pmatrix}$$
is a partitioned matrix, where $A$ is an $m \times m$ matrix, $D$ is an $n \times n$ matrix, and $O$ is the $n \times m$ zero matrix. Then $|M| = |A|\,|D|$.

Proof. Apply the Laplace Expansion Theorem, choosing the first $m$ columns, and recall that a matrix with a zero row has zero determinant.

A block matrix is a matrix that has been partitioned into submatrices ("blocks") of the same size. Early in this century Issai Schur used Proposition 1 to compute the determinants of block matrices arising in the representation theory of groups. Schur considered a $2n \times 2n$ matrix $M$ and partitioned it into four $n \times n$ blocks $A$, $B$, $C$, and $D$ as shown below:
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}.$$
Schur proved that under certain commutativity assumptions on the blocks, one has
$$(3)\qquad |M| = |AD - BC|,$$
in close analogy with (1). In particular, this is true whenever the matrices $A$, $B$, $C$, and $D$ commute pairwise; the proof of (3) in this case often appears as an exercise in linear algebra texts such as [6, page 164]. Formula (3) entered the literature as Schur's formula [4]. It converts the computation of a $2n$th-order determinant to that of an $n$th-order determinant.

Schur's Formula Generalized. How does (3) generalize to higher order cases? In order to give our answer we need a short preparation. Consider now the ring $\mathrm{Mat}_n(\mathbf{C})$ of all $n \times n$ complex matrices under the operations of matrix addition and multiplication. Let us form a $k \times k$ block matrix $M$ of elements $A_{ij}$ of $\mathrm{Mat}_n(\mathbf{C})$ that pairwise commute; that is, let
$$M = \begin{pmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & & \vdots \\ A_{k1} & \cdots & A_{kk} \end{pmatrix},$$
where $A_{ij} A_{lm} = A_{lm} A_{ij}$ for all possible pairs of indices $i, j$ and $l, m$. In what follows, we will denote the determinant of $M$, viewed as a $k \times k$ matrix with entries in $\mathrm{Mat}_n(\mathbf{C})$, by $D(M)$. We will reserve the symbol $|M|$ for the determinant of $M$, regarded as an element of $\mathrm{Mat}_{kn}(\mathbf{C})$. It is important to realize that $D(M)$ is an $n \times n$ matrix.

The following result may be well known to a group of algebraists, but we have not found it in the literature.

Theorem 1. Assume that $M$ is a $k \times k$ block matrix of elements $A_{ij}$ of $\mathrm{Mat}_n(\mathbf{C})$ that pairwise commute. Then
$$(4)\qquad |M| = |D(M)| = \Bigl|\, \sum_{\pi \in S_k} (\operatorname{sgn} \pi)\, A_{1\pi(1)} A_{2\pi(2)} \cdots A_{k\pi(k)} \,\Bigr|.$$

Proof. We use induction on $k$. The case $k = 1$ is evident. We suppose that (4) is true for $k - 1$ and then prove it for $k$. Observe that the following matrix equation holds:
$$\begin{pmatrix} I & O & \cdots & O \\ -A_{21} & I & \cdots & O \\ \vdots & & \ddots & \vdots \\ -A_{k1} & O & \cdots & I \end{pmatrix}
\begin{pmatrix} I & O & \cdots & O \\ O & A_{11} & \cdots & O \\ \vdots & & \ddots & \vdots \\ O & O & \cdots & A_{11} \end{pmatrix} M
= \begin{pmatrix} A_{11} & * \\ O & N \end{pmatrix},$$
where $N$ is a $(k-1) \times (k-1)$ block matrix whose entries, $A_{11}A_{ij} - A_{i1}A_{1j}$, lie in $\mathrm{Mat}_n(\mathbf{C})$ and again pairwise commute. (The entries of the first column vanish precisely because $A_{11}$ and $A_{i1}$ commute.) For the sake of notation, we write this as
$$(5)\qquad PQM = R,$$
where the symbols are defined appropriately. By the multiplicative property of determinants and Proposition 1 we have
$$D(PQM) = D(P)\,D(Q)\,D(M) = A_{11}^{k-1} D(M)$$
and
$$D(R) = A_{11} D(N).$$
Hence we have
A k 1 11 D(M) = A 11D(N). Take the determinant of both sides. Using D(N) = N, a consequence of the induction hypothesis, together with (5) and Proposition 1 we see A 11 k 1 D(M) = A 11 D(N) = A 11 N = R = P Q M = A 11 k 1 M. If A 11 0 we can divide the sides by A 11 k 1 to get (4). For the case A 11 = 0, we recall that any square matrix can be made nonsingular by arbitrarily small changes in its entries, a consequence of the fact that the determinant function is continuous in each of the matrix entries. By approximating A 11 by nonsingular matrices, we see that (4) is valid in this case too. When k = 2, Theorem 1 reduces the order of the determinant from 2n to n; in this case it is a weaker form of Schur s Formula. In fact, (3) can be proved under less restrictive conditions. However, the commutativity conditions on the elements of the block matrices cannot simply be dropped. The reader can see this by considering the matrix M below. 1 0 0 0 0 0 0 1 M = 0 1 0 0 0 0 1 0 We conclude by describing a class of block matrices that satisfy the commutativity condition of Theorem 1. Matrices of this type arose in [7], and were the original motivation for this investigation. Let p ij (t) be polynomials, 1 i, j k, and let N be an n n matrix. Since the matrices p ij (N) commute pairwise, the block matrix p 11 (N) p 1k (N) M =..... p k1 (N) p kk (N) satisfies the hypothesis of Theorem 1. In fact, using the theorem we can say something about the determinant of M. Let p(t) be the determinant of p 11 (t) p 1k (t)....., p k1 (t) p kk (t) 5
and let $\zeta_1, \dots, \zeta_n$ be the (not necessarily distinct) eigenvalues of $N$. We leave the proof of the following assertion as an exercise.
$$|M| = \prod_{r=1}^{n} p(\zeta_r).$$

Bibliography

[1] Albert, A. Adrian, article on determinants in Encyclopaedia Britannica (Benton, 1960)
[2] Boyer, Carl B. and Merzbach, Uta C., A History of Mathematics (Wiley, 1989)
[3] Eves, H., Elementary Matrix Theory (Dover, 1966)
[4] Gantmacher, F. R., The Theory of Matrices (Chelsea, 1960)
[5] Halmos, P. R., Finite-Dimensional Vector Spaces (Van Nostrand, 1958)
[6] Hoffman, K. and Kunze, R., Linear Algebra (Prentice-Hall, 1971)
[7] Silver, Daniel S. and Williams, Susan G., Coloring link diagrams with a continuous palette, preprint

Department of Mathematics and Statistics
University of South Alabama
Mobile, AL 36688