Determinants: Uniqueness and Properties

2-2-2008

In order to show that there is only one determinant function on $M(n, \mathbb{R})$, I'm going to derive another formula for the determinant. It involves permutations of the rows, so I'll give a brief introduction to permutations first.

Definition. A permutation of the set $\{1, 2, \ldots, n\}$ is an invertible function $\sigma : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$. The set of all permutations of $\{1, 2, \ldots, n\}$ is denoted $S_n$ and is called the symmetric group on $n$ letters.

A transposition is a permutation in which two elements are swapped, and everything else doesn't move.

Recall that a function has an inverse if and only if it is bijective: different inputs produce different outputs (injective), and everything in the target set is an output of the function (surjective). So a permutation of $\{1, 2, \ldots, n\}$ is just a way of scrambling the numbers from 1 to $n$.

Example. There are 6 permutations of $\{1, 2, 3\}$. In one-line notation, writing $\sigma(1)\,\sigma(2)\,\sigma(3)$ for each permutation $\sigma$, they are
$$\begin{matrix} 123 & 132 & 213 \\ 321 & 231 & 312 \end{matrix}$$
The last two in the first row and the first one in the second row are transpositions. In each case, two elements are swapped, and the other element doesn't move. The one in the bottom right corner is the function $\sigma(1) = 3$, $\sigma(2) = 1$, $\sigma(3) = 2$.

How many permutations are there of $\{1, 2, \ldots, n\}$? There are $n$ places that 1 can be mapped to. Once you know where 1 goes, there are $n - 1$ places remaining where 2 can be mapped to (1 and 2 can't go to the same place, because permutations are injective). Continuing in this way, you find that the total number of possibilities is
$$n(n-1)(n-2) \cdots 2 \cdot 1 = n!.$$
Therefore, there are $n!$ permutations of $\{1, 2, \ldots, n\}$.

Permutations are multiplied by composing them as functions, i.e. by doing one permutation followed by another. The following result says that every permutation can be built out of transpositions.

Theorem.
(a) Every permutation can be written as a product of transpositions.
(b) No permutation can be written as a product of both an even and an odd number of transpositions.
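Both the count $n!$ and the even/odd dichotomy can be checked by brute force for small $n$. The following is a minimal Python sketch (not part of the original notes); it computes the sign by counting inversions, which gives the same parity as any decomposition into transpositions:

```python
from itertools import permutations
from math import factorial

def sgn(sigma):
    """Sign of a permutation of (0, 1, ..., n-1), via the number of
    inversions: each inversion can be undone by one swap, so the parity
    of the inversion count equals the parity of any decomposition of
    sigma into transpositions."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return 1 if inversions % 2 == 0 else -1

# There are n! permutations of {1, ..., n}; check this for n = 3.
perms = list(permutations(range(3)))
assert len(perms) == factorial(3) == 6

# The three transpositions of {1, 2, 3} are odd; the identity and the
# two 3-cycles are even.
evens = [p for p in perms if sgn(p) == 1]
odds = [p for p in perms if sgn(p) == -1]
print(evens)  # [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
print(odds)   # [(0, 2, 1), (1, 0, 2), (2, 1, 0)]
```

For instance, the permutation $\sigma(1) = 3$, $\sigma(2) = 1$, $\sigma(3) = 2$ from the example corresponds to the 0-indexed tuple `(2, 0, 1)`, which has two inversions and is therefore even.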
You'll probably see these results proved in a course in abstract algebra. You can see that the first part is reasonable: it says that any way of scrambling the elements of $\{1, 2, \ldots, n\}$ can be accomplished by swapping two elements at a time, one swap after another. The second part of the theorem is important, because it justifies the following definition.

Definition. A permutation is even if it can be written as a product of an even number of transpositions. A permutation is odd if it can be written as a product of an odd number of transpositions. If $\sigma$ is a permutation, the sign of $\sigma$ is
$$\operatorname{sgn}(\sigma) = \begin{cases} +1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}$$

Theorem. If $D : M(n, \mathbb{R}) \to \mathbb{R}$ is linear in each row and is 0 on matrices with equal rows, then
$$D(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, A_{1\sigma(1)} \cdots A_{n\sigma(n)}\, D(I).$$

Proof. Write
$$A = \begin{bmatrix} r_1 \\ \vdots \\ r_n \end{bmatrix}.$$
Observe that $r_i = \sum_j A_{ij} e_j$, where $e_j$ is the $j$-th standard basis vector (equivalently, the $j$-th row of the identity matrix). By linearity in the first row,
$$D(A) = D\begin{bmatrix} \sum_j A_{1j} e_j \\ r_2 \\ \vdots \\ r_n \end{bmatrix} = \sum_j A_{1j}\, D\begin{bmatrix} e_j \\ r_2 \\ \vdots \\ r_n \end{bmatrix}.$$
Now do the same with row 2:
$$D(A) = \sum_{j,k} A_{1j} A_{2k}\, D\begin{bmatrix} e_j \\ e_k \\ r_3 \\ \vdots \\ r_n \end{bmatrix}.$$
Do this for all $n$ rows, using $j_1, \ldots, j_n$ as the summation indices:
$$D(A) = \sum_{j_1, \ldots, j_n} A_{1j_1} A_{2j_2} \cdots A_{nj_n}\, D\begin{bmatrix} e_{j_1} \\ e_{j_2} \\ \vdots \\ e_{j_n} \end{bmatrix}.$$
Since $D$ is 0 on matrices with equal rows, I only need to consider terms where all the rows are different, i.e. terms where $j_1, \ldots, j_n$ are distinct. This means that $j_1, \ldots, j_n$ is a permutation of $1, \ldots, n$, and I'm summing over $\sigma \in S_n$:
$$D(A) = \sum_{\sigma \in S_n} A_{1\sigma(1)} A_{2\sigma(2)} \cdots A_{n\sigma(n)}\, D\begin{bmatrix} e_{\sigma(1)} \\ e_{\sigma(2)} \\ \vdots \\ e_{\sigma(n)} \end{bmatrix}.$$
Now
$$\begin{bmatrix} e_{\sigma(1)} \\ e_{\sigma(2)} \\ \vdots \\ e_{\sigma(n)} \end{bmatrix}$$
is just the identity matrix with its rows permuted by $\sigma$. Every permutation is a product of transpositions. By an earlier lemma, since $D$ is linear in the rows and is 0 on matrices with equal rows, a transposition (i.e. a row swap) multiplies the value of $D$ by $-1$. Hence,
$$D\begin{bmatrix} e_{\sigma(1)} \\ e_{\sigma(2)} \\ \vdots \\ e_{\sigma(n)} \end{bmatrix} = \operatorname{sgn}(\sigma)\, D(I).$$
Therefore,
$$D(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, A_{1\sigma(1)} A_{2\sigma(2)} \cdots A_{n\sigma(n)}\, D(I). \qquad \blacksquare$$

Since the determinant function defined by expansion by cofactors is alternating and linear in each row, it satisfies the conditions of the theorem. Since $\det I = 1$, the formula in the theorem becomes
$$\det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, A_{1\sigma(1)} \cdots A_{n\sigma(n)}.$$
I'll call this the permutation formula for the determinant.

Example. Compute the following real determinant using the permutation formula:
$$\det \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}.$$
I have to choose 3 entries from the matrix at a time, in such a way that each choice involves exactly one element from each row and each column. For each such choice, I take the product of the three elements and multiply by the sign of the permutation of the elements, which I'll describe below. Finally, I add up the results.

In order to do this systematically, focus on the first column. I can choose 1, 4, or 7 from column 1.

If I choose 1 from column 1, I can either choose 5 from column 2 and 9 from column 3, or 8 from column 2 and 6 from column 3. (Remember that I can't have two elements from the same row or column.)

If I choose 4 from column 1, I can either choose 2 from column 2 and 9 from column 3, or 8 from column 2 and 3 from column 3.
Finally, if I choose 7 from column 1, I can either choose 2 from column 2 and 6 from column 3, or 5 from column 2 and 3 from column 3.

This gives me 6 products:
$$(1)(5)(9), \quad (1)(6)(8), \quad (4)(2)(9), \quad (4)(8)(3), \quad (7)(2)(6), \quad (7)(5)(3).$$

Next, I have to attach a sign to each product. To do this, I count the number of row swaps I need to move the 1's in the identity matrix into the same positions as the numbers in the product. I'll illustrate with two examples.

For the product $(4)(8)(3)$:
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\xrightarrow{r_1 \leftrightarrow r_2}
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\xrightarrow{r_1 \leftrightarrow r_3}
\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
It took 2 row swaps to move the 1's into the same positions as 4, 8, and 3. Since 2 is even, the sign of $(4)(8)(3)$ is $+1$.

For the product $(7)(5)(3)$:
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\xrightarrow{r_1 \leftrightarrow r_3}
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$
It took 1 row swap to move the 1's into the same positions as 7, 5, and 3. Since 1 is odd, the sign of $(7)(5)(3)$ is $-1$.

Continuing in this fashion, I get
$$\det \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = (1)(5)(9) - (1)(6)(8) - (4)(2)(9) + (4)(8)(3) + (7)(2)(6) - (7)(5)(3) = 0.$$

Notice how ugly the computation was! While the permutation formula can be used for computations, it's easier to use row or column operations or expansion by cofactors.

Corollary. There is a unique determinant function $\det : M(n, \mathbb{R}) \to \mathbb{R}$.

Proof. A determinant function $D$ is linear in each row, is 0 on matrices with equal rows, and satisfies $D(I) = 1$. The theorem then shows that $D(A)$ is completely determined by $A$. In other words, the determinant function I defined by cofactor expansion is the only determinant function on $n \times n$ matrices. $\blacksquare$

Remark. Another way to express the proof of the Corollary is: if $D : M(n, \mathbb{R}) \to \mathbb{R}$ is alternating and linear in each row, then $D(A) = |A|\, D(I)$, where $|\cdot|$ denotes the determinant function on $M(n, \mathbb{R})$.

Corollary. If $A \in M(n, \mathbb{R})$, then $|A| = |A^T|$.

Proof.
$$|A^T| = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^n A^T_{i\sigma(i)} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^n A_{\sigma(i)i} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n A_{j\sigma^{-1}(j)}$$
$$= \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma^{-1}) \prod_{j=1}^n A_{j\sigma^{-1}(j)} = \sum_{\sigma^{-1} \in S_n} \operatorname{sgn}(\sigma^{-1}) \prod_{j=1}^n A_{j\sigma^{-1}(j)} = \sum_{\tau \in S_n} \operatorname{sgn}(\tau) \prod_{j=1}^n A_{j\tau(j)} = |A|. \qquad \blacksquare$$
I got the next-to-the-last equality by letting $\tau = \sigma^{-1}$; as $\sigma$ runs over all of $S_n$, so does $\tau$. (I also used the fact that $\operatorname{sgn}(\sigma) = \operatorname{sgn}(\sigma^{-1})$.)

Remark. Since the rows of $A$ are the columns of $A^T$ and vice versa, the Corollary implies that you can also use column operations to compute determinants. The allowable operations are swapping two columns, multiplying a column by a number, and adding a multiple of a column to another column. They have the same effects on the determinant as the corresponding row operations.

As I noted earlier, this result also means that you can compute determinants using cofactor expansions along rows as well as columns.

The uniqueness of determinant functions is useful in deriving properties of determinants.

Theorem. Let $A, B \in M(n, \mathbb{R})$. Then $|AB| = |A|\,|B|$.

Proof. Fix $B$, and define $D(A) = |AB|$. Let $r_i$ denote the $i$-th row of $A$. Then
$$D(A) = \left|\begin{bmatrix} r_1 B \\ r_2 B \\ \vdots \\ r_n B \end{bmatrix}\right|.$$
Now $|\cdot|$ is alternating, so interchanging two rows of $A$ interchanges the corresponding rows $r_i B$ in the determinant above, which multiplies $D(A)$ by $-1$. Hence, $D$ is alternating.

Next, I'll show that $D$ is linear in each row. If a row of $A$ is $kx + y$, the corresponding row of $AB$ is $(kx + y)B = k(xB) + yB$, so
$$D\begin{bmatrix} \vdots \\ kx + y \\ \vdots \end{bmatrix} = \left|\begin{bmatrix} \vdots \\ k(xB) + yB \\ \vdots \end{bmatrix}\right| = k \left|\begin{bmatrix} \vdots \\ xB \\ \vdots \end{bmatrix}\right| + \left|\begin{bmatrix} \vdots \\ yB \\ \vdots \end{bmatrix}\right| = k\, D\begin{bmatrix} \vdots \\ x \\ \vdots \end{bmatrix} + D\begin{bmatrix} \vdots \\ y \\ \vdots \end{bmatrix}.$$
This proves that $D$ is linear in each row. By the remark following the uniqueness theorem,
$$|AB| = D(A) = |A|\, D(I) = |A|\,|B|. \qquad \blacksquare$$
In words, this important result says that the determinant of a product is the product of the determinants.

Note that the same thing does not work for sums.

Example. It's not true that $\det(A + B) = \det A + \det B$. For example, in $M(2, \mathbb{R})$,
$$\det \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 1 \quad\text{and}\quad \det \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = 1.$$
But
$$\det\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \right) = \det \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0.$$

Theorem. Let $F$ be a field, and let $A \in M(n, F)$. Then $A$ is invertible if and only if $|A| \neq 0$.

Proof. If $A$ is invertible, then
$$|A|\,|A^{-1}| = |AA^{-1}| = |I| = 1.$$
This equation implies that $|A| \neq 0$ (since $|A| = 0$ would yield $0 = 1$).

Conversely, suppose that $|A| \neq 0$. Suppose that $A$ row reduces to the row reduced echelon matrix $R$, and consider the effect of elementary row operations on $|A|$. Swapping two rows multiplies the determinant by $-1$. Adding a multiple of a row to another row leaves the determinant unchanged. And multiplying a row by a nonzero number multiplies the determinant by that nonzero number. Clearly, no row operation will make the determinant 0 if it was nonzero to begin with.

Since $|A| \neq 0$, it follows that $|R| \neq 0$. Since $R$ is a row reduced echelon matrix with nonzero determinant, it can't have any all-zero rows. An $n \times n$ row reduced echelon matrix with no all-zero rows must be the identity, so $R = I$. Since $A$ row reduces to the identity, $A$ is invertible. $\blacksquare$

Corollary. Let $F$ be a field, and let $A \in M(n, F)$. If $A$ is invertible, then
$$|A^{-1}| = \frac{1}{|A|}.$$

Proof. I showed in proving the theorem that $|A|\,|A^{-1}| = 1$, so $|A^{-1}| = \dfrac{1}{|A|}$. $\blacksquare$

Here's an easy application of this result.

Corollary. Let $F$ be a field, and let $A, P \in M(n, F)$. If $P$ is invertible, then
$$|PAP^{-1}| = |A|.$$

Proof.
$$|PAP^{-1}| = |P|\,|A|\,|P^{-1}| = |P|\,|P^{-1}|\,|A| = |A|. \qquad \blacksquare$$
(The matrices $PAP^{-1}$ and $A$ are said to be similar. This will come up later when I talk about changing coordinates in vector spaces.)

Definition. Let $R$ be a commutative ring with identity, and let $A \in M(n, R)$. The classical adjoint $\operatorname{adj} A$ is the matrix whose $(i, j)$-th entry is
$$(\operatorname{adj} A)_{ij} = (-1)^{i+j} A(j \mid i),$$
where $A(j \mid i)$ is the determinant of the matrix obtained by deleting row $j$ and column $i$ of $A$. In other words, $\operatorname{adj} A$ is the transpose of the matrix of cofactors.

Example. Compute the adjoint of
$$A = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 1 \\ 1 & -1 & 2 \end{bmatrix}.$$
First, I'll compute the cofactors. The first line shows the cofactors of the first row, and the second line the cofactors of the second row.
$$+\begin{vmatrix} 1 & 1 \\ -1 & 2 \end{vmatrix} = 3, \quad -\begin{vmatrix} 0 & 1 \\ 1 & 2 \end{vmatrix} = 1, \quad +\begin{vmatrix} 0 & 1 \\ 1 & -1 \end{vmatrix} = -1,$$
$$-\begin{vmatrix} 0 & 3 \\ -1 & 2 \end{vmatrix} = -3, \quad +\begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix} = -1, \quad -\begin{vmatrix} 1 & 0 \\ 1 & -1 \end{vmatrix} = 1,$$
and the third line the cofactors of the third row:
$$+\begin{vmatrix} 0 & 3 \\ 1 & 1 \end{vmatrix} = -3, \quad -\begin{vmatrix} 1 & 3 \\ 0 & 1 \end{vmatrix} = -1, \quad +\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1.$$
The adjoint is the transpose of the matrix of cofactors:
$$\operatorname{adj} A = \begin{bmatrix} 3 & -3 & -3 \\ 1 & -1 & -1 \\ -1 & 1 & 1 \end{bmatrix}.$$

I'll need the following little lemma later on.

Lemma. Let $R$ be a commutative ring with identity, and let $A \in M(n, R)$. Then $(\operatorname{adj} A)^T = \operatorname{adj}(A^T)$.

Proof. Consider the $(i, j)$-th entries of the matrices on the two sides of the equation:
$$[(\operatorname{adj} A)^T]_{ij} = (\operatorname{adj} A)_{ji} = (-1)^{j+i} A(i \mid j), \qquad [\operatorname{adj}(A^T)]_{ij} = (-1)^{i+j} A^T(j \mid i).$$
The signs $(-1)^{j+i}$ and $(-1)^{i+j}$ are the same; what about the other factors? $A(i \mid j)$ is the determinant of the matrix formed by deleting the $i$-th row and the $j$-th column from $A$. $A^T(j \mid i)$ is the determinant of the matrix formed by deleting the $j$-th row and $i$-th column from $A^T$. But the $i$-th row of $A$ is the $i$-th column of $A^T$, and the $j$-th column of $A$ is the $j$-th row of $A^T$. So the two matrices that remain after these deletions are transposes of one another, and hence they have the same determinant. Thus, $A(i \mid j) = A^T(j \mid i)$. Hence, $[(\operatorname{adj} A)^T]_{ij} = [\operatorname{adj}(A^T)]_{ij}$. $\blacksquare$

Theorem. Let $R$ be a commutative ring with identity, and let $A \in M(n, R)$. Then
$$|A|\, I = A \operatorname{adj} A.$$

Proof. Expand $|A|$ by cofactors of row $i$:
$$|A| = \sum_j (-1)^{i+j} A_{ij}\, A(i \mid j).$$
Take $k \neq i$, and construct a new matrix by replacing row $k$ of $A$ with row $i$. Since the new matrix has two equal rows, its determinant is 0. On the other hand, expanding by cofactors of row $k$ (whose entries are now $A_{ij}$, and whose minors are unchanged),
$$0 = \sum_j (-1)^{k+j} A_{kj}\, A(k \mid j) = \sum_j (-1)^{k+j} A_{ij}\, A(k \mid j).$$
In words, this means that if you expand by cofactors of a row using the cofactors from another row, you get 0. You only get $|A|$ if you use the right cofactors, i.e. the cofactors for the given row. In terms of the Kronecker delta function,
$$\sum_j (-1)^{k+j} A_{ij}\, A(k \mid j) = \delta_{ik}\, |A| \quad \text{for all } i, k.$$
Interpret this equation as a matrix equation, where the two sides represent the $(i, k)$-th entries of their respective matrices. What are the respective matrices? The right side is obviously the $(i, k)$-th entry of $|A|\, I$. For the left side,
$$(A \operatorname{adj} A)_{ik} = \sum_j A_{ij} (\operatorname{adj} A)_{jk} = \sum_j A_{ij} (-1)^{j+k} A(k \mid j).$$
Therefore, $|A|\, I = A \operatorname{adj} A$. $\blacksquare$

I can use the theorem to obtain an important corollary. I already know that a matrix over a field is invertible if and only if its determinant is nonzero. The next result explains what happens over a commutative ring with identity, and also provides a formula for the inverse of a matrix.

Corollary. Let $R$ be a commutative ring with identity, and let $A \in M(n, R)$. $A$ is invertible if and only if $|A|$ is invertible in $R$, in which case
$$A^{-1} = |A|^{-1} \operatorname{adj} A.$$

Proof. First, suppose $A$ is invertible. Then $AA^{-1} = I$, so
$$|A|\,|A^{-1}| = |AA^{-1}| = |I| = 1.$$
Therefore, $|A|$ is invertible in $R$.

Conversely, suppose $|A|$ is invertible in $R$. I have
$$|A|\, I = A \operatorname{adj} A, \quad\text{so}\quad I = A \left( |A|^{-1} \operatorname{adj} A \right). \tag{1}$$
Take the last equation and replace $A$ with $A^T$:
$$I = A^T \left( |A^T|^{-1} \operatorname{adj}(A^T) \right).$$
Use the facts that $|A^T| = |A|$ and (from the earlier lemma) that $(\operatorname{adj} A)^T = \operatorname{adj}(A^T)$:
$$I = A^T\, |A|^{-1} (\operatorname{adj} A)^T.$$
Use the transpose rule $(MN)^T = N^T M^T$:
$$I = |A|^{-1} \left( (\operatorname{adj} A)\, A \right)^T.$$
Finally, transpose both sides, using the transpose rules $I^T = I$ and $(M^T)^T = M$:
$$I = \left( |A|^{-1} \operatorname{adj} A \right) A. \tag{2}$$
Equations (1) and (2) show that $|A|^{-1} \operatorname{adj} A$ multiplies $A$ to $I$ on both the left and the right. Therefore, $A$ is invertible, and $A^{-1} = |A|^{-1} \operatorname{adj} A$. $\blacksquare$

The last result can be used to find the inverse of a matrix. It's not very good for big matrices from a computational point of view: the usual row reduction algorithm uses fewer steps. However, it's not too bad for small matrices, say $3 \times 3$ or smaller.

Example. Compute the inverse of the following real matrix using cofactors:
$$A = \begin{bmatrix} 1 & 2 & 2 \\ 3 & 2 & 0 \\ -1 & 1 & 1 \end{bmatrix}.$$
First, I'll compute the cofactors. The first line shows the cofactors of the first row:
$$+\begin{vmatrix} 2 & 0 \\ 1 & 1 \end{vmatrix} = 2, \quad -\begin{vmatrix} 3 & 0 \\ -1 & 1 \end{vmatrix} = -3, \quad +\begin{vmatrix} 3 & 2 \\ -1 & 1 \end{vmatrix} = 5,$$
the second line the cofactors of the second row, and the third line the cofactors of the third row:
$$-\begin{vmatrix} 2 & 2 \\ 1 & 1 \end{vmatrix} = 0, \quad +\begin{vmatrix} 1 & 2 \\ -1 & 1 \end{vmatrix} = 3, \quad -\begin{vmatrix} 1 & 2 \\ -1 & 1 \end{vmatrix} = -3,$$
$$+\begin{vmatrix} 2 & 2 \\ 2 & 0 \end{vmatrix} = -4, \quad -\begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix} = 6, \quad +\begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix} = -4.$$
The adjoint is the transpose of the matrix of cofactors:
$$\operatorname{adj} A = \begin{bmatrix} 2 & 0 & -4 \\ -3 & 3 & 6 \\ 5 & -3 & -4 \end{bmatrix}.$$
Since $\det A = 6$, I have
$$A^{-1} = \frac{1}{6} \begin{bmatrix} 2 & 0 & -4 \\ -3 & 3 & 6 \\ 5 & -3 & -4 \end{bmatrix}.$$

Another consequence of the formula $|A|\, I = A \operatorname{adj} A$ is Cramer's rule, which gives a formula for the solution of a system of linear equations. As with the adjoint formula for the inverse of a matrix, Cramer's rule is not computationally efficient: it's better to use row reduction to solve systems.

Corollary (Cramer's rule). If $A$ is an invertible $n \times n$ matrix, the unique solution to $Ax = y$ is given by
$$x_i = \frac{|B_i|}{|A|},$$
where $B_i$ is the matrix obtained from $A$ by replacing its $i$-th column by $y$.

Proof. Start with $Ax = y$ and multiply both sides by $\operatorname{adj} A$, using the fact that $(\operatorname{adj} A) A = |A|\, I$:
$$(\operatorname{adj} A) A x = (\operatorname{adj} A) y, \quad\text{so}\quad |A|\, x_i = \sum_j (\operatorname{adj} A)_{ij}\, y_j = \sum_j (-1)^{i+j} y_j\, A(j \mid i).$$
What is the last sum? It is a cofactor expansion of $|A|$ along column $i$, where instead of the elements of $A$'s column $i$ I'm using the components of $y$. This is exactly $|B_i|$. $\blacksquare$

Example. Use Cramer's rule to solve the following system over $\mathbb{R}$:
$$\begin{aligned} 2x + y + z &= 1 \\ x + y - z &= 5 \\ 3x - y + 2z &= -2 \end{aligned}$$
In matrix form, this is
$$\begin{bmatrix} 2 & 1 & 1 \\ 1 & 1 & -1 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ -2 \end{bmatrix}.$$
I replace the successive columns of the coefficient matrix with the column of constants $1$, $5$, $-2$, in each case computing the determinant of the resulting matrix and dividing by the determinant of the coefficient matrix:
$$x = \frac{\begin{vmatrix} 1 & 1 & 1 \\ 5 & 1 & -1 \\ -2 & -1 & 2 \end{vmatrix}}{\begin{vmatrix} 2 & 1 & 1 \\ 1 & 1 & -1 \\ 3 & -1 & 2 \end{vmatrix}} = \frac{-10}{-7} = \frac{10}{7}, \quad
y = \frac{\begin{vmatrix} 2 & 1 & 1 \\ 1 & 5 & -1 \\ 3 & -2 & 2 \end{vmatrix}}{\begin{vmatrix} 2 & 1 & 1 \\ 1 & 1 & -1 \\ 3 & -1 & 2 \end{vmatrix}} = \frac{-6}{-7} = \frac{6}{7}, \quad
z = \frac{\begin{vmatrix} 2 & 1 & 1 \\ 1 & 1 & 5 \\ 3 & -1 & -2 \end{vmatrix}}{\begin{vmatrix} 2 & 1 & 1 \\ 1 & 1 & -1 \\ 3 & -1 & 2 \end{vmatrix}} = \frac{19}{-7} = -\frac{19}{7}.$$
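The arithmetic in this example can be checked numerically. The sketch below (not part of the original notes) implements the permutation formula for the determinant and Cramer's rule directly; the right-hand side is taken to be $(1, 5, -2)$, the constants that the quoted fractions solve:

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    """Determinant by the permutation formula:
    det M = sum over sigma in S_n of sgn(sigma) * M[1,sigma(1)] * ... * M[n,sigma(n)]."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        # sign of sigma via its inversion count
        inv = sum(1 for i in range(n) for j in range(i + 1, n)
                  if sigma[i] > sigma[j])
        term = (-1) ** inv
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def cramer(A, y):
    """Solve Ax = y by Cramer's rule: x_i = |B_i| / |A|, where B_i is A
    with its i-th column replaced by y."""
    d = det(A)
    n = len(A)
    solution = []
    for i in range(n):
        B = [row[:] for row in A]       # copy A ...
        for r in range(n):
            B[r][i] = y[r]              # ... and replace column i by y
        solution.append(Fraction(det(B), d))
    return solution

A = [[2, 1, 1], [1, 1, -1], [3, -1, 2]]
y = [1, 5, -2]
print(det(A))        # -7
print(cramer(A, y))  # [Fraction(10, 7), Fraction(6, 7), Fraction(-19, 7)]
```

The same `det` also confirms the earlier permutation-formula example: the matrix with rows $(1,2,3)$, $(4,5,6)$, $(7,8,9)$ has determinant 0.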
This looks pretty simple, doesn't it? But notice that you need to compute four $3 \times 3$ determinants to do this. It becomes more expensive to solve systems this way as the matrices get larger.

© 2008 by Bruce Ikenaga