Matrices: A brief introduction
Basilio Bona, DAUIN, Politecnico di Torino
Semester 1, 2014-15
Definitions

Definition: a matrix is a set of $N$ real or complex numbers organized in $m$ rows and $n$ columns, with $N = mn$:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & a_{ij} & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \equiv [a_{ij}], \quad i = 1,\dots,m, \quad j = 1,\dots,n$$

A matrix is always written as a boldface capital letter, as in $\mathbf{A}$.
To indicate matrix dimensions we use symbols such as $\mathbf{A}_{m \times n}$ or $\mathbf{A} \in \mathbb{F}^{m \times n}$, where $\mathbb{F} = \mathbb{R}$ for real elements and $\mathbb{F} = \mathbb{C}$ for complex elements.
Transpose matrix

Given a matrix $\mathbf{A}_{m \times n}$, its transpose is the matrix obtained by exchanging rows and columns:

$$\mathbf{A}^T_{n \times m} = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$

The following property holds: $(\mathbf{A}^T)^T = \mathbf{A}$.
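Square matrix

A matrix is said to be square when $m = n$.
A square $n \times n$ matrix is upper triangular when $a_{ij} = 0$ for all $i > j$:

$$\mathbf{A}_{n \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$

If a square matrix is upper triangular, its transpose is lower triangular, and vice versa:

$$\mathbf{A}^T_{n \times n} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{12} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}$$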
Symmetric matrix

A real square matrix is said to be symmetric if $\mathbf{A} = \mathbf{A}^T$, or equivalently $\mathbf{A} - \mathbf{A}^T = \mathbf{O}$.
In a real symmetric matrix there are at most $\frac{n(n+1)}{2}$ independent elements.

If a matrix $\mathbf{K}$ has complex elements $k_{ij} = a_{ij} + j b_{ij}$ (where $j = \sqrt{-1}$), its conjugate is $\overline{\mathbf{K}}$, with elements $\overline{k}_{ij} = a_{ij} - j b_{ij}$.
Given a complex matrix $\mathbf{K}$, the adjoint matrix $\mathbf{K}^*$ is defined as the conjugate transpose:

$$\mathbf{K}^* = \overline{\mathbf{K}}^T = \overline{\mathbf{K}^T}$$

A complex matrix is called self-adjoint or Hermitian when $\mathbf{K} = \mathbf{K}^*$. Some textbooks write the adjoint as $\mathbf{K}^\dagger$ or $\mathbf{K}^H$.
Diagonal matrix

A square matrix is diagonal if $a_{ij} = 0$ for $i \neq j$:

$$\mathbf{A}_{n \times n} = \operatorname{diag}(a_i) = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{bmatrix}$$

A diagonal matrix is always symmetric.
Skew-symmetric matrix

A square matrix is skew-symmetric or antisymmetric if

$$\mathbf{A} + \mathbf{A}^T = \mathbf{O} \quad\Longleftrightarrow\quad \mathbf{A} = -\mathbf{A}^T$$

Given the constraints of the above relation, a generic skew-symmetric matrix has the following structure:

$$\mathbf{A}_{n \times n} = \begin{bmatrix} 0 & a_{12} & \cdots & a_{1n} \\ -a_{12} & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -a_{1n} & -a_{2n} & \cdots & 0 \end{bmatrix}$$

In a skew-symmetric matrix there are at most $\frac{n(n-1)}{2}$ nonzero independent elements.
We will see later some important properties of $3 \times 3$ skew-symmetric matrices.
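Block matrix

It is possible to represent a matrix with blocks as

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \cdots & \mathbf{A}_{1n} \\ \vdots & \mathbf{A}_{ij} & \vdots \\ \mathbf{A}_{m1} & \cdots & \mathbf{A}_{mn} \end{bmatrix}$$

where the blocks $\mathbf{A}_{ij}$ have suitable dimensions. Given the following matrices

$$\mathbf{A}_1 = \begin{bmatrix} \mathbf{A}_{11} & \cdots & \mathbf{A}_{1n} \\ \mathbf{O} & \mathbf{A}_{ij} & \vdots \\ \mathbf{O} & \mathbf{O} & \mathbf{A}_{mn} \end{bmatrix}, \quad \mathbf{A}_2 = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{O} & \mathbf{O} \\ \vdots & \mathbf{A}_{ij} & \mathbf{O} \\ \mathbf{A}_{m1} & \cdots & \mathbf{A}_{mn} \end{bmatrix}, \quad \mathbf{A}_3 = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{O} & \mathbf{O} \\ \mathbf{O} & \mathbf{A}_{ij} & \mathbf{O} \\ \mathbf{O} & \mathbf{O} & \mathbf{A}_{mn} \end{bmatrix}$$

$\mathbf{A}_1$ is upper block triangular, $\mathbf{A}_2$ is lower block triangular, and $\mathbf{A}_3$ is block diagonal.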
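Matrix algebra

Matrices are elements of an algebra, i.e., a vector space together with a product operator. The main operations of this algebra are: product by a scalar, sum, and matrix product.

Product by a scalar:

$$\alpha\mathbf{A} = \alpha \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \alpha a_{12} & \cdots & \alpha a_{1n} \\ \alpha a_{21} & \alpha a_{22} & \cdots & \alpha a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha a_{m1} & \alpha a_{m2} & \cdots & \alpha a_{mn} \end{bmatrix}$$

Sum:

$$\mathbf{A} + \mathbf{B} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$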
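Matrix sum

Sum properties:

- $\mathbf{A} + \mathbf{O} = \mathbf{A}$
- $\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}$
- $(\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C})$
- $(\mathbf{A} + \mathbf{B})^T = \mathbf{A}^T + \mathbf{B}^T$

The null (neutral, zero) element $\mathbf{O}$ is called the null matrix.
The subtraction (difference) operation is defined using the scalar $\alpha = -1$:

$$\mathbf{A} - \mathbf{B} = \mathbf{A} + (-1)\mathbf{B}$$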
Matrix product

The operation is performed using the well-known "rows by columns" rule: the generic element $c_{ij}$ of the matrix product $\mathbf{C}_{m \times p} = \mathbf{A}_{m \times n} \cdot \mathbf{B}_{n \times p}$ is

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The bilinearity of the matrix product is guaranteed, since it is immediate to verify that, given a generic scalar $\alpha$, the following identity holds:

$$\alpha(\mathbf{A} \cdot \mathbf{B}) = (\alpha\mathbf{A}) \cdot \mathbf{B} = \mathbf{A} \cdot (\alpha\mathbf{B})$$
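The rows-by-columns rule can be sketched in a few lines of plain Python. This is only an illustration: matrices are assumed to be stored as lists of rows, and the function name `matmul` is a hypothetical choice, not something defined in these slides.

```python
def matmul(a, b):
    """Product of an m x n matrix a and an n x p matrix b (lists of rows)."""
    m, n, p = len(a), len(b), len(b[0])
    if any(len(row) != n for row in a):
        raise ValueError("number of columns of a must equal number of rows of b")
    # c[i][j] = sum over k of a[i][k] * b[k][j]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]      # 3 x 2
C = matmul(A, B)  # 2 x 2
print(C)          # [[7, 8], [16, 17]]
```

The inner dimension $n$ must agree, which is why the sketch raises an error when the shapes are incompatible.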
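Product properties

In general:

- $\mathbf{A} \cdot \mathbf{B} \cdot \mathbf{C} = (\mathbf{A} \cdot \mathbf{B}) \cdot \mathbf{C} = \mathbf{A} \cdot (\mathbf{B} \cdot \mathbf{C})$
- $\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C}$
- $(\mathbf{A} + \mathbf{B}) \cdot \mathbf{C} = \mathbf{A} \cdot \mathbf{C} + \mathbf{B} \cdot \mathbf{C}$
- $(\mathbf{A} \cdot \mathbf{B})^T = \mathbf{B}^T \cdot \mathbf{A}^T$
- the matrix product is non-commutative: $\mathbf{A} \cdot \mathbf{B} \neq \mathbf{B} \cdot \mathbf{A}$, apart from particular cases;
- $\mathbf{A} \cdot \mathbf{B} = \mathbf{A} \cdot \mathbf{C}$ does not imply $\mathbf{B} = \mathbf{C}$, apart from particular cases;
- $\mathbf{A} \cdot \mathbf{B} = \mathbf{O}$ does not imply $\mathbf{A} = \mathbf{O}$ or $\mathbf{B} = \mathbf{O}$, apart from particular cases.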
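A small worked example may help with the last three properties. The two matrices below are chosen purely for illustration and do not come from the slides.

```latex
\mathbf{A} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
\mathbf{B} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\quad\Longrightarrow\quad
\mathbf{A}\mathbf{B} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \mathbf{O}, \qquad
\mathbf{B}\mathbf{A} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \mathbf{A} \neq \mathbf{O}
```

Here the product is zero although neither factor is zero, the two products differ, and $\mathbf{A}\mathbf{B} = \mathbf{A}\mathbf{O}$ holds even though $\mathbf{B} \neq \mathbf{O}$.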
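Identity matrix

A neutral element with respect to the product exists and is called the identity matrix, written as $\mathbf{I}_n$ or simply $\mathbf{I}$ when no ambiguity arises. Given a rectangular matrix $\mathbf{A}_{m \times n}$, the following identities hold:

$$\mathbf{A}_{m \times n} = \mathbf{I}_m \mathbf{A}_{m \times n} = \mathbf{A}_{m \times n} \mathbf{I}_n$$

$$\mathbf{I} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$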
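Trace

The trace of a square matrix $\mathbf{A}_{n \times n}$ is the sum of its diagonal elements:

$$\operatorname{tr}(\mathbf{A}) = \sum_{k=1}^{n} a_{kk}$$

The trace satisfies the following properties:

- $\operatorname{tr}(\alpha\mathbf{A} + \beta\mathbf{B}) = \alpha \operatorname{tr}(\mathbf{A}) + \beta \operatorname{tr}(\mathbf{B})$
- $\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{B}\mathbf{A})$
- $\operatorname{tr}(\mathbf{A}) = \operatorname{tr}(\mathbf{A}^T)$
- $\operatorname{tr}(\mathbf{A}) = \operatorname{tr}(\mathbf{T}^{-1}\mathbf{A}\mathbf{T})$ for nonsingular $\mathbf{T}$ (see below)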
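As a quick numerical check of $\operatorname{tr}(\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{B}\mathbf{A})$, the following sketch uses plain Python lists. The helper names `matmul` and `trace` and the sample matrices are assumptions made for the example.

```python
def matmul(a, b):
    """Rows-by-columns product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[k][k] for k in range(len(a)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]
AB = matmul(A, B)   # [[10, 13], [20, 27]]
BA = matmul(B, A)   # [[3, 4], [23, 34]]
print(AB != BA, trace(AB), trace(BA))  # True 37 37
```

The two products differ, yet their traces coincide, as the property states.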
Determinant

Let $\mathbf{A}^{(ik)}$ denote the submatrix obtained from $\mathbf{A}$ by deleting row $i$ and column $k$, and let $A_{ik} = (-1)^{i+k} \det(\mathbf{A}^{(ik)})$ be the corresponding cofactor. The determinant of a square matrix $\mathbf{A}$ can then be defined "by row", i.e., choosing a generic row $i$,

$$\det(\mathbf{A}) = \sum_{k=1}^{n} a_{ik} (-1)^{i+k} \det(\mathbf{A}^{(ik)}) = \sum_{k=1}^{n} a_{ik} A_{ik}$$

or, choosing a generic column $j$, "by column":

$$\det(\mathbf{A}) = \sum_{k=1}^{n} a_{kj} (-1)^{k+j} \det(\mathbf{A}^{(kj)}) = \sum_{k=1}^{n} a_{kj} A_{kj}$$

Since these definitions are recursive and assume the computation of determinants of smaller-order minors, it is necessary to define the determinant of a $1 \times 1$ matrix (a scalar), which is simply $\det(a_{ij}) = a_{ij}$.
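The recursive definition translates almost literally into code. The sketch below expands along the first row; the names `minor` and `det` are hypothetical, and the approach is meant only to mirror the definition (its cost grows factorially with $n$, so it is not how determinants are computed in practice).

```python
def minor(a, i, j):
    """Submatrix of a with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]  # base case: determinant of a 1 x 1 matrix
    # with 0-based indices the sign (-1)^(i+k) for row i = 0 is (-1)^k
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(n))


print(det([[1, 2], [3, 4]]))                   # -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
print(det([[2, 5, 7], [0, 3, 1], [0, 0, 4]]))  # 24, product of the diagonal
```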
Properties of determinant

- $\det(\mathbf{A} \cdot \mathbf{B}) = \det(\mathbf{A}) \det(\mathbf{B})$
- $\det(\mathbf{A}^T) = \det(\mathbf{A})$
- $\det(k\mathbf{A}) = k^n \det(\mathbf{A})$
- if one makes $s$ exchanges between rows or columns of $\mathbf{A}$, obtaining a new matrix $\mathbf{A}_s$, then $\det(\mathbf{A}_s) = (-1)^s \det(\mathbf{A})$
- if $\mathbf{A}$ has two equal or proportional rows/columns, then $\det(\mathbf{A}) = 0$
- if $\mathbf{A}$ has a row or a column that is a linear combination of other rows or columns, then $\det(\mathbf{A}) = 0$
- if $\mathbf{A}$ is upper or lower triangular, then $\det(\mathbf{A}) = \prod_{i=1}^{n} a_{ii}$
- if $\mathbf{A}$ is block triangular, with $p$ blocks $\mathbf{A}_{ii}$ on the diagonal, then $\det(\mathbf{A}) = \prod_{i=1}^{p} \det \mathbf{A}_{ii}$
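Singular matrix and rank

A square matrix $\mathbf{A}$ is singular if $\det(\mathbf{A}) = 0$.
The rank of a matrix $\mathbf{A}_{m \times n}$ is the number $\rho(\mathbf{A}_{m \times n})$, computed as the maximum integer $p$ such that at least one nonzero minor $D_p$ of order $p$ exists.
The following properties hold:

- $\rho(\mathbf{A}) \leq \min\{m, n\}$
- if $\rho(\mathbf{A}) = \min\{m, n\}$, $\mathbf{A}$ is said to have full rank
- if $\rho(\mathbf{A}) < \min\{m, n\}$, the matrix does not have full rank and is said to be rank deficient
- $\rho(\mathbf{A}\mathbf{B}) \leq \min\{\rho(\mathbf{A}), \rho(\mathbf{B})\}$
- $\rho(\mathbf{A}) = \rho(\mathbf{A}^T)$
- $\rho(\mathbf{A}\mathbf{A}^T) = \rho(\mathbf{A}^T\mathbf{A}) = \rho(\mathbf{A})$
- if $\mathbf{A}$ is $n \times n$ and $\det \mathbf{A} = 0$, then it does not have full rank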
Invertible matrix

A square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is invertible or nonsingular if an inverse matrix $\mathbf{A}^{-1}_{n \times n}$ exists, such that

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}_n$$

The matrix is invertible iff $\rho(\mathbf{A}) = n$, i.e., it has full rank; this implies $\det(\mathbf{A}) \neq 0$.
The inverse matrix can be computed as

$$\mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \operatorname{Adj}(\mathbf{A})$$

The following properties hold: $(\mathbf{A}^{-1})^{-1} = \mathbf{A}$; $(\mathbf{A}^T)^{-1} = (\mathbf{A}^{-1})^T$.
The inverse matrix, if it exists, allows one to solve the matrix equation $\mathbf{y} = \mathbf{A}\mathbf{x}$ for the unknown $\mathbf{x}$, as $\mathbf{x} = \mathbf{A}^{-1}\mathbf{y}$.
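The adjugate formula can be sketched as follows, assuming that $\operatorname{Adj}(\mathbf{A})$ is the transpose of the cofactor matrix and that the entries are integers, so that `fractions.Fraction` gives exact results. All function names are hypothetical, and the cofactor approach is shown for clarity rather than efficiency.

```python
from fractions import Fraction


def minor(a, i, j):
    """Submatrix of a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    n = len(a)
    if n == 0:
        return 1        # convention for the empty minor of a 1 x 1 matrix
    if n == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(n))


def adjugate(a):
    """Transpose of the cofactor matrix."""
    n = len(a)
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]


def inverse(a):
    """Inverse as Adj(A) / det(A); assumes integer entries."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(a)]


A = [[4, 7], [2, 6]]
A_inv = inverse(A)   # entries 3/5, -7/10, -1/5, 2/5
y = [1, 2]
x = [sum(A_inv[i][k] * y[k] for k in range(2)) for i in range(2)]  # x = A^-1 y
print(A_inv, x)
```

Multiplying `A` by the computed `x` gives back `y`, which is the sense in which the inverse solves $\mathbf{y} = \mathbf{A}\mathbf{x}$.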
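Orthonormal matrix

A square matrix is orthonormal if $\mathbf{A}^{-1} = \mathbf{A}^T$. In this case the following identity holds:

$$\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A}^T = \mathbf{I}$$

Given two invertible square matrices $\mathbf{A}$ and $\mathbf{B}$ of equal dimension $n \times n$, the following identity holds:

$$(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}$$

An important result, called the inversion lemma, states that if $\mathbf{A}, \mathbf{C}$ are square invertible matrices and $\mathbf{B}, \mathbf{D}$ are matrices of suitable dimensions, then

$$(\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D})^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}(\mathbf{D}\mathbf{A}^{-1}\mathbf{B} + \mathbf{C}^{-1})^{-1}\mathbf{D}\mathbf{A}^{-1}$$

The matrix $(\mathbf{D}\mathbf{A}^{-1}\mathbf{B} + \mathbf{C}^{-1})$ must be invertible.
The inversion lemma is useful for computing the inverse of a sum of matrices $\mathbf{A}_1 + \mathbf{A}_2$ when $\mathbf{A}_2$ can be decomposed into the product $\mathbf{B}\mathbf{C}\mathbf{D}$ and $\mathbf{C}$ is easily invertible, for instance diagonal or triangular.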
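A quick sanity check of the inversion lemma is the scalar case, which is an illustrative special case chosen here and not part of the original slides. Take $\mathbf{A} = a$ and $\mathbf{C} = c$ as nonzero scalars with $a + c \neq 0$, and $\mathbf{B} = \mathbf{D} = 1$:

```latex
(a + c)^{-1}
  = a^{-1} - a^{-1}\,(a^{-1} + c^{-1})^{-1}\,a^{-1}
  = \frac{1}{a} - \frac{1}{a^{2}} \cdot \frac{ac}{a + c}
  = \frac{1}{a} - \frac{c}{a(a + c)}
  = \frac{1}{a + c}
```

Both sides agree, which matches the matrix formula term by term.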
Matrix derivative

If a matrix $\mathbf{A}(t)$ is composed of elements $a_{ij}(t)$ that are all differentiable with respect to $t$, then the matrix derivative is

$$\frac{d}{dt}\mathbf{A}(t) = \dot{\mathbf{A}}(t) = \left[\frac{d}{dt} a_{ij}(t)\right] = [\dot{a}_{ij}(t)]$$

If a square matrix $\mathbf{A}(t)$ has rank $\rho(\mathbf{A}(t)) = n$ for any time $t$, then the derivative of its inverse is

$$\frac{d}{dt}\mathbf{A}(t)^{-1} = -\mathbf{A}^{-1}(t)\,\dot{\mathbf{A}}(t)\,\mathbf{A}(t)^{-1}$$

Since the inverse operator is nonlinear, in general

$$\left[\frac{d\mathbf{A}(t)}{dt}\right]^{-1} \neq \frac{d}{dt}\left[\mathbf{A}(t)^{-1}\right]$$
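The formula for the derivative of the inverse follows from differentiating the identity $\mathbf{A}(t)\mathbf{A}(t)^{-1} = \mathbf{I}$ with the product rule. A short derivation sketch:

```latex
\frac{d}{dt}\left[\mathbf{A}(t)\,\mathbf{A}(t)^{-1}\right]
  = \dot{\mathbf{A}}(t)\,\mathbf{A}(t)^{-1} + \mathbf{A}(t)\,\frac{d}{dt}\mathbf{A}(t)^{-1}
  = \mathbf{O}
\quad\Longrightarrow\quad
\frac{d}{dt}\mathbf{A}(t)^{-1} = -\mathbf{A}(t)^{-1}\,\dot{\mathbf{A}}(t)\,\mathbf{A}(t)^{-1}
```

The order of the factors matters, because matrix products do not commute in general.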
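Symmetric and skew-symmetric decomposition

Given a real matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$, the two matrices

$$\mathbf{A}^T\mathbf{A} \in \mathbb{R}^{n \times n}, \qquad \mathbf{A}\mathbf{A}^T \in \mathbb{R}^{m \times m}$$

are both symmetric.
Given a square matrix $\mathbf{A}$, it is always possible to decompose it into a sum of two matrices, as follows:

$$\mathbf{A} = \mathbf{A}_s + \mathbf{A}_a$$

where

$$\mathbf{A}_s = \tfrac{1}{2}(\mathbf{A} + \mathbf{A}^T) \quad \text{(symmetric matrix)}, \qquad \mathbf{A}_a = \tfrac{1}{2}(\mathbf{A} - \mathbf{A}^T) \quad \text{(skew-symmetric matrix)}$$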
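The decomposition can be computed directly from its definition. The sketch below uses `fractions.Fraction` so that the factor $\tfrac{1}{2}$ is exact; the function names and the sample matrix are assumptions for illustration.

```python
from fractions import Fraction


def transpose(a):
    """Exchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def sym_skew_parts(a):
    """Return (A_s, A_a), the symmetric and skew-symmetric parts of a square matrix."""
    n = len(a)
    half = Fraction(1, 2)
    a_s = [[half * (a[i][j] + a[j][i]) for j in range(n)] for i in range(n)]
    a_a = [[half * (a[i][j] - a[j][i]) for j in range(n)] for i in range(n)]
    return a_s, a_a


A = [[1, 2], [4, 3]]
A_s, A_a = sym_skew_parts(A)
print(A_s)  # symmetric part, equal to [[1, 3], [3, 3]]
print(A_a)  # skew-symmetric part, equal to [[0, -1], [1, 0]]
```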
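Similarity transformation

Given a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ and a nonsingular square matrix $\mathbf{T} \in \mathbb{R}^{n \times n}$, the new matrix $\mathbf{B} \in \mathbb{R}^{n \times n}$, obtained as

$$\mathbf{B} = \mathbf{T}^{-1}\mathbf{A}\mathbf{T} \quad \text{or} \quad \mathbf{B} = \mathbf{T}\mathbf{A}\mathbf{T}^{-1}$$

is said to be similar to $\mathbf{A}$, and the transformation $\mathbf{T}$ is called a similarity transformation.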
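Eigenvalues and eigenvectors

Consider the similarity transformation between $\mathbf{A}$ and $\mathbf{\Lambda}$, where the latter is diagonal, $\mathbf{\Lambda} = \operatorname{diag}(\lambda_i)$, and

$$\mathbf{A} = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^{-1}, \qquad \mathbf{U} = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_n \end{bmatrix}$$

Multiplying $\mathbf{A}$ on the right by $\mathbf{U}$ one obtains

$$\mathbf{A}\mathbf{U} = \mathbf{U}\mathbf{\Lambda}$$

and then

$$\mathbf{A}\mathbf{u}_i = \lambda_i \mathbf{u}_i$$

This identity is the well-known formula that relates the matrix eigenvalues to the eigenvectors; the constant quantities $\lambda_i$ are the eigenvalues of $\mathbf{A}$, while the vectors $\mathbf{u}_i$ are the eigenvectors of $\mathbf{A}$, usually with non-unit norm.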
Eigenvalues and eigenvectors

Given a square matrix $\mathbf{A}_{n \times n}$, the solutions $\lambda_i$ (real or complex) of the characteristic equation

$$\det(\lambda\mathbf{I} - \mathbf{A}) = 0$$

are the eigenvalues of $\mathbf{A}$. $\det(\lambda\mathbf{I} - \mathbf{A})$ is a polynomial in $\lambda$, called the characteristic polynomial.
If the eigenvalues are all distinct, the vectors $\mathbf{u}_i$ that satisfy the identity

$$\mathbf{A}\mathbf{u}_i = \lambda_i \mathbf{u}_i$$

are the eigenvectors of $\mathbf{A}$.
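As a worked illustration of the characteristic equation, consider a $2 \times 2$ symmetric matrix. The matrix is chosen here only as an example.

```latex
\mathbf{A} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
\det(\lambda\mathbf{I} - \mathbf{A})
  = (\lambda - 2)^2 - 1
  = \lambda^2 - 4\lambda + 3
  = (\lambda - 1)(\lambda - 3)

\lambda_1 = 1:\ \mathbf{u}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad
\lambda_2 = 3:\ \mathbf{u}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
```

Substituting back confirms $\mathbf{A}\mathbf{u}_1 = \mathbf{u}_1$ and $\mathbf{A}\mathbf{u}_2 = 3\mathbf{u}_2$; any nonzero multiple of each eigenvector is also an eigenvector.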
Skew-symmetric matrices

A square matrix $\mathbf{S}$ is called skew-symmetric or antisymmetric when

$$\mathbf{S} + \mathbf{S}^T = \mathbf{O} \quad \text{or} \quad \mathbf{S} = -\mathbf{S}^T$$

A skew-symmetric matrix has the following structure:

$$\mathbf{S}_{n \times n} = \begin{bmatrix} 0 & s_{12} & \cdots & s_{1n} \\ -s_{12} & 0 & \cdots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -s_{1n} & -s_{2n} & \cdots & 0 \end{bmatrix}$$

Therefore it has at most $\frac{n(n-1)}{2}$ independent elements.
Skew-symmetric matrices

For $n = 3$ we have $\frac{n(n-1)}{2} = 3$, hence a skew-symmetric matrix has as many independent elements as a 3D vector $\mathbf{v}$.
Given a vector $\mathbf{v} = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}^T$ it is possible to build $\mathbf{S}$, and given a matrix $\mathbf{S}$ it is possible to extract the associated vector $\mathbf{v}$.
We indicate this fact using the symbol $\mathbf{S}(\mathbf{v})$, where, by convention,

$$\mathbf{S}(\mathbf{v}) = \begin{bmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{bmatrix}$$
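Skew-symmetric matrices

Some properties:

- Given any vector $\mathbf{v} \in \mathbb{R}^3$: $\mathbf{S}^T(\mathbf{v}) = -\mathbf{S}(\mathbf{v}) = \mathbf{S}(-\mathbf{v})$
- Given two scalars $\lambda_1, \lambda_2 \in \mathbb{R}$: $\mathbf{S}(\lambda_1\mathbf{u} + \lambda_2\mathbf{v}) = \lambda_1\mathbf{S}(\mathbf{u}) + \lambda_2\mathbf{S}(\mathbf{v})$
- Given any two vectors $\mathbf{v}, \mathbf{u} \in \mathbb{R}^3$: $\mathbf{S}(\mathbf{u})\mathbf{v} = \mathbf{u} \times \mathbf{v} = -\mathbf{v} \times \mathbf{u} = \mathbf{S}(-\mathbf{v})\mathbf{u} = \mathbf{S}^T(\mathbf{v})\mathbf{u}$

Therefore $\mathbf{S}(\mathbf{u})$ is the representation of the operator $(\mathbf{u} \times)$ and vice versa.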
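The correspondence between $\mathbf{S}(\mathbf{u})$ and the cross product can be checked numerically. The sketch below follows the sign convention for $\mathbf{S}(\mathbf{v})$ given earlier; the function names `skew`, `matvec`, and `cross` are hypothetical.

```python
def skew(v):
    """Skew-symmetric matrix S(v) associated with the 3D vector v."""
    v1, v2, v3 = v
    return [[0, -v3, v2],
            [v3, 0, -v1],
            [-v2, v1, 0]]


def matvec(m, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def cross(u, v):
    """Cross product u x v of two 3D vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


u = [1, 2, 3]
v = [4, 5, 6]
print(matvec(skew(u), v))  # [-3, 6, -3]
print(cross(u, v))         # [-3, 6, -3]
```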
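Skew-symmetric matrices

The matrix $\mathbf{S}(\mathbf{u})\mathbf{S}(\mathbf{u}) = \mathbf{S}^2(\mathbf{u})$ is symmetric and

$$\mathbf{S}^2(\mathbf{u}) = \mathbf{u}\mathbf{u}^T - \|\mathbf{u}\|^2 \mathbf{I}$$

Hence the dyadic product is

$$\mathbf{D}(\mathbf{u}, \mathbf{u}) = \mathbf{u}\mathbf{u}^T = \mathbf{S}^2(\mathbf{u}) + \|\mathbf{u}\|^2 \mathbf{I}$$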
Eigenvalues and eigenvectors of skew-symmetric matrices

Given a skew-symmetric matrix $\mathbf{S}(\mathbf{v})$, its eigenvalues are imaginary or zero:

$$\lambda_1 = 0, \qquad \lambda_{2,3} = \pm j\|\mathbf{v}\|$$

The eigenvector related to the eigenvalue $\lambda_1 = 0$ is $\mathbf{v}$; the other two eigenvectors are complex conjugate.
The set of $3 \times 3$ skew-symmetric matrices is a vector space, denoted as $so(3)$.
Given two skew-symmetric matrices $\mathbf{S}_1$ and $\mathbf{S}_2$, we call commutator or Lie bracket the following operator

$$[\mathbf{S}_1, \mathbf{S}_2] \overset{\text{def}}{=} \mathbf{S}_1\mathbf{S}_2 - \mathbf{S}_2\mathbf{S}_1$$

which is itself skew-symmetric.
Skew-symmetric matrices form a Lie algebra, which is related to the Lie group of orthogonal matrices.
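Orthogonal matrices

A square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is called orthogonal when

$$\mathbf{A}^T\mathbf{A} = \begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_n \end{bmatrix}$$

with $\alpha_i \neq 0$.
A square orthogonal matrix $\mathbf{U} \in \mathbb{R}^{n \times n}$ is called orthonormal when all the constants $\alpha_i$ are 1:

$$\mathbf{U}^T\mathbf{U} = \mathbf{U}\mathbf{U}^T = \mathbf{I}$$

Therefore $\mathbf{U}^{-1} = \mathbf{U}^T$.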
Orthonormal matrices

Other properties:

- The columns, as well as the rows, of $\mathbf{U}$ are orthogonal to each other and have unit norm.
- $\|\mathbf{U}\| = 1$.
- The determinant of $\mathbf{U}$ has unit modulus: $|\det(\mathbf{U})| = 1$, therefore it can be $+1$ or $-1$.
- Given a vector $\mathbf{x}$, its orthonormal transformation is $\mathbf{y} = \mathbf{U}\mathbf{x}$.
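Orthonormal matrices

If $\mathbf{U}$ is an orthonormal matrix, then $\|\mathbf{A}\mathbf{U}\| = \|\mathbf{U}\mathbf{A}\| = \|\mathbf{A}\|$. This property is also valid in general for unitary matrices, i.e., those with $\mathbf{U}^*\mathbf{U} = \mathbf{I}$.
When $\mathbf{U} \in \mathbb{R}^{3 \times 3}$, only 3 out of its 9 elements are independent.
The scalar product is invariant under orthonormal transformations:

$$(\mathbf{U}\mathbf{x}) \cdot (\mathbf{U}\mathbf{y}) = (\mathbf{U}\mathbf{x})^T(\mathbf{U}\mathbf{y}) = \mathbf{x}^T\mathbf{U}^T\mathbf{U}\mathbf{y} = \mathbf{x}^T\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$$

This means that vector lengths are invariant under orthonormal transformations:

$$\|\mathbf{U}\mathbf{x}\| = \sqrt{(\mathbf{U}\mathbf{x})^T(\mathbf{U}\mathbf{x})} = \sqrt{\mathbf{x}^T\mathbf{U}^T\mathbf{U}\mathbf{x}} = \sqrt{\mathbf{x}^T\mathbf{I}\mathbf{x}} = \sqrt{\mathbf{x}^T\mathbf{x}} = \|\mathbf{x}\|$$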
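The invariance of scalar products and lengths can be checked numerically with a rotation about the z axis, which is one convenient example of an orthonormal matrix. The helper names, the angle, and the test vectors are assumptions chosen for this sketch; comparisons use a tolerance because the arithmetic is floating point.

```python
import math


def rot_z(theta):
    """Rotation matrix about the z axis by the angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def matvec(m, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def dot(x, y):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


U = rot_z(0.7)
x = [1.0, 2.0, 3.0]
y = [-1.0, 0.5, 2.0]
Ux, Uy = matvec(U, x), matvec(U, y)

print(math.isclose(dot(Ux, Uy), dot(x, y)))                        # scalar product preserved
print(math.isclose(math.sqrt(dot(Ux, Ux)), math.sqrt(dot(x, x))))  # length preserved
```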
Orthonormal matrices

When considering orthonormal transformations, it is important to distinguish two cases:

- when $\det(\mathbf{U}) = +1$, $\mathbf{U}$ represents a proper rotation, or simply a rotation;
- when $\det(\mathbf{U}) = -1$, $\mathbf{U}$ represents an improper rotation, or reflection.

The set of rotations forms a continuous group that is non-commutative with respect to the product; the set of reflections does not have this property.
Intuitively this means that infinitesimal rotations exist, while infinitesimal reflections do not have any meaning.
Reflections are the most basic transformations in 3D space, in the sense that translations, rotations and roto-reflections (slidings) are obtained from the composition of two or three reflections.
Figure: Reflections.
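Orthonormal matrices

If $\mathbf{U}$ is a proper rotation matrix (orthonormal with $\det(\mathbf{U}) = +1$), the distributive property with respect to the cross product holds:

$$\mathbf{U}(\mathbf{x} \times \mathbf{y}) = (\mathbf{U}\mathbf{x}) \times (\mathbf{U}\mathbf{y})$$

(for a general matrix $\mathbf{A}$ this is not true).
For any proper rotation matrix $\mathbf{U}$ and generic vectors $\mathbf{x}$ and $\mathbf{y}$ the following holds:

$$\mathbf{U}\mathbf{S}(\mathbf{x})\mathbf{U}^T\mathbf{y} = \mathbf{U}\left(\mathbf{x} \times (\mathbf{U}^T\mathbf{y})\right) = (\mathbf{U}\mathbf{x}) \times (\mathbf{U}\mathbf{U}^T\mathbf{y}) = (\mathbf{U}\mathbf{x}) \times \mathbf{y} = \mathbf{S}(\mathbf{U}\mathbf{x})\mathbf{y}$$

where $\mathbf{S}(\mathbf{x})$ is the skew-symmetric matrix associated with $\mathbf{x}$; therefore:

$$\mathbf{U}\mathbf{S}(\mathbf{x})\mathbf{U}^T = \mathbf{S}(\mathbf{U}\mathbf{x}), \qquad \mathbf{U}\mathbf{S}(\mathbf{x}) = \mathbf{S}(\mathbf{U}\mathbf{x})\mathbf{U}$$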
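Bilinear and quadratic forms

A bilinear form associated with the matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ is the scalar quantity defined as

$$b(\mathbf{x}, \mathbf{y}) \overset{\text{def}}{=} \mathbf{x}^T\mathbf{A}\mathbf{y} = \mathbf{y}^T\mathbf{A}^T\mathbf{x}$$

A quadratic form associated with the square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is the scalar quantity defined as

$$q(\mathbf{x}) \overset{\text{def}}{=} \mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{x}^T\mathbf{A}^T\mathbf{x}$$

Every quadratic form associated with a skew-symmetric matrix $\mathbf{S}(\mathbf{y})$ is identically zero:

$$\mathbf{x}^T\mathbf{S}(\mathbf{y})\mathbf{x} \equiv 0 \quad \forall \mathbf{x}$$

Indeed, setting $\mathbf{w} = \mathbf{S}(\mathbf{y})\mathbf{x} = \mathbf{y} \times \mathbf{x}$, one obtains $\mathbf{x}^T\mathbf{S}(\mathbf{y})\mathbf{x} = \mathbf{x}^T\mathbf{w}$; but since, by definition, $\mathbf{w}$ is orthogonal to both $\mathbf{y}$ and $\mathbf{x}$, the scalar product $\mathbf{x}^T\mathbf{w}$ is always zero, and so is the quadratic form on the left-hand side.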
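Positive definite matrices 1

Recalling the standard decomposition of a generic square matrix $\mathbf{A}$ into a symmetric term $\mathbf{A}_s$ and a skew-symmetric term $\mathbf{A}_a$, one concludes that the quadratic form depends only on the symmetric part of the matrix:

$$q(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{x}^T(\mathbf{A}_s + \mathbf{A}_a)\mathbf{x} = \mathbf{x}^T\mathbf{A}_s\mathbf{x}$$

A square matrix $\mathbf{A}$ is said to be positive definite if the associated quadratic form $\mathbf{x}^T\mathbf{A}\mathbf{x}$ satisfies the following conditions:

$$\mathbf{x}^T\mathbf{A}\mathbf{x} > 0 \quad \forall \mathbf{x} \neq \mathbf{0}, \qquad \mathbf{x}^T\mathbf{A}\mathbf{x} = 0 \;\Leftrightarrow\; \mathbf{x} = \mathbf{0}$$

A square matrix $\mathbf{A}$ is said to be positive semidefinite if the associated quadratic form $\mathbf{x}^T\mathbf{A}\mathbf{x}$ satisfies the following condition:

$$\mathbf{x}^T\mathbf{A}\mathbf{x} \geq 0 \quad \forall \mathbf{x}$$

A square matrix $\mathbf{A}$ is said to be negative definite if $-\mathbf{A}$ is positive definite; similarly, a square matrix $\mathbf{A}$ is negative semidefinite if $-\mathbf{A}$ is positive semidefinite.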
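The statement that a quadratic form depends only on the symmetric part of the matrix can be verified on a small example. The matrices and the name `quad_form` are illustrative assumptions; `A_s` and `A_a` below are the symmetric and skew-symmetric parts of `A`, written out by hand.

```python
def quad_form(a, x):
    """Quadratic form x^T A x for a square matrix stored as a list of rows."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


A = [[1, 4],
     [0, 3]]
A_s = [[1, 2],
       [2, 3]]    # (A + A^T) / 2
A_a = [[0, 2],
       [-2, 0]]   # (A - A^T) / 2

x = [2, 1]
print(quad_form(A, x))    # 15
print(quad_form(A_s, x))  # 15, same value as for A
print(quad_form(A_a, x))  # 0, the skew-symmetric part contributes nothing
```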
Positive definite matrices 2

Often we use the following notations:

- positive definite matrix: $\mathbf{A} \succ 0$
- positive semidefinite matrix: $\mathbf{A} \succeq 0$
- negative definite matrix: $\mathbf{A} \prec 0$
- negative semidefinite matrix: $\mathbf{A} \preceq 0$

A necessary but not sufficient condition for a square matrix $\mathbf{A}$ to be positive definite is that the elements on its diagonal are all strictly positive.
A necessary and sufficient condition for a symmetric square matrix $\mathbf{A}$ to be positive definite is that all its eigenvalues are strictly positive.
Sylvester criterion

The Sylvester criterion states that a symmetric square matrix $\mathbf{A}$ is positive definite iff all its leading principal minors are strictly positive.
A positive definite matrix has full rank and is always invertible.
The associated quadratic form $\mathbf{x}^T\mathbf{A}\mathbf{x}$ satisfies the following inequality:

$$\lambda_{\min}(\mathbf{A}) \|\mathbf{x}\|^2 \leq \mathbf{x}^T\mathbf{A}\mathbf{x} \leq \lambda_{\max}(\mathbf{A}) \|\mathbf{x}\|^2$$

where $\lambda_{\min}(\mathbf{A})$ and $\lambda_{\max}(\mathbf{A})$ are, respectively, the minimum and the maximum eigenvalues.
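A minimal sketch of the Sylvester test follows, assuming a symmetric input matrix and checking the leading principal minors (the determinants of the top-left $k \times k$ submatrices). The function names are hypothetical and the cofactor-based determinant is used only to keep the example self-contained.

```python
def minor(a, i, j):
    """Submatrix of a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def leading_principal_minors(a):
    """Determinants of the top-left k x k submatrices, k = 1, ..., n."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]


def is_positive_definite(a):
    """Sylvester test; assumes that a is symmetric."""
    return all(d > 0 for d in leading_principal_minors(a))


A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
B = [[1, 2],
     [2, 1]]
print(leading_principal_minors(A), is_positive_definite(A))  # [2, 3, 4] True
print(leading_principal_minors(B), is_positive_definite(B))  # [1, -3] False
```

The matrix `B` has a positive diagonal but fails the test, which also illustrates that a positive diagonal is necessary but not sufficient.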
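Semidefinite matrix and rank

A positive semidefinite (but not positive definite) matrix $\mathbf{A}_{n \times n}$ has rank $\rho(\mathbf{A}) = r < n$, i.e., it has $r$ strictly positive eigenvalues and $n - r$ zero eigenvalues. The quadratic form goes to zero for every vector $\mathbf{x} \in \mathcal{N}(\mathbf{A})$, the null space of $\mathbf{A}$.
Given a real matrix of generic dimensions $\mathbf{A}_{m \times n}$, we have seen that both $\mathbf{A}^T\mathbf{A}$ and $\mathbf{A}\mathbf{A}^T$ are symmetric; in addition we know that

$$\rho(\mathbf{A}^T\mathbf{A}) = \rho(\mathbf{A}\mathbf{A}^T) = \rho(\mathbf{A})$$

These matrices have all real, nonnegative eigenvalues, and therefore they are positive definite or positive semidefinite. In particular, if $\mathbf{A}_{m \times n}$ has full rank, then

- if $m < n$, $\mathbf{A}^T\mathbf{A} \succeq 0$ and $\mathbf{A}\mathbf{A}^T \succ 0$,
- if $m = n$, $\mathbf{A}^T\mathbf{A} \succ 0$ and $\mathbf{A}\mathbf{A}^T \succ 0$,
- if $m > n$, $\mathbf{A}^T\mathbf{A} \succ 0$ and $\mathbf{A}\mathbf{A}^T \succeq 0$.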
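Matrix derivatives 1

If the elements of a matrix $\mathbf{A}$ are functions of a quantity $x$, one can define the matrix derivative with respect to $x$ as

$$\frac{d}{dx}\mathbf{A}(x) := \left[\frac{d a_{ij}}{dx}\right]$$

If $x$ is the time $t$, one writes

$$\frac{d}{dt}\mathbf{A}(t) \equiv \dot{\mathbf{A}}(t) := \left[\frac{d a_{ij}(t)}{dt}\right] \equiv [\dot{a}_{ij}]$$

If $\mathbf{A}$ is a function of time through the variable $x(t)$, then

$$\frac{d}{dt}\mathbf{A}(x(t)) \equiv \dot{\mathbf{A}}(x(t)) := \left[\frac{\partial a_{ij}(x)}{\partial x}\frac{dx(t)}{dt}\right] \equiv \left[\frac{\partial a_{ij}(x)}{\partial x}\right]\dot{x}(t)$$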
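Matrix derivatives 2

Given a scalar function of a vector argument $\phi(\mathbf{x})$, defined as $\phi(\cdot) : \mathbb{R}^n \to \mathbb{R}^1$, the gradient of the function $\phi$ with respect to $\mathbf{x}$ is a column vector

$$\nabla_{\mathbf{x}}\phi = \frac{\partial \phi}{\partial \mathbf{x}} := \begin{bmatrix} \dfrac{\partial \phi(\mathbf{x})}{\partial x_1} \\ \vdots \\ \dfrac{\partial \phi(\mathbf{x})}{\partial x_n} \end{bmatrix}, \quad \text{i.e.,} \quad \nabla_{\mathbf{x}} := \begin{bmatrix} \dfrac{\partial}{\partial x_1} \\ \vdots \\ \dfrac{\partial}{\partial x_n} \end{bmatrix} = \operatorname{grad}_{\mathbf{x}}$$

If $\mathbf{x}(t)$ is a differentiable time function, then

$$\frac{d\phi(\mathbf{x})}{dt} \equiv \dot{\phi}(\mathbf{x}) = \nabla^T_{\mathbf{x}}\phi(\mathbf{x})\,\dot{\mathbf{x}}$$

(Notice the convention: for us the gradient is a column vector, although many textbooks assume it is a row vector.)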
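Jacobian matrix

Given an $m \times 1$ vector function $\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_1(\mathbf{x}) & \cdots & f_m(\mathbf{x}) \end{bmatrix}^T$, $\mathbf{x} \in \mathbb{R}^n$, the Jacobian matrix (or simply the Jacobian) is an $m \times n$ matrix defined as

$$\mathbf{J}_f(\mathbf{x}) = \begin{bmatrix} \left(\dfrac{\partial f_1(\mathbf{x})}{\partial \mathbf{x}}\right)^T \\ \vdots \\ \left(\dfrac{\partial f_m(\mathbf{x})}{\partial \mathbf{x}}\right)^T \end{bmatrix} = \begin{bmatrix} \dfrac{\partial f_1(\mathbf{x})}{\partial x_1} & \cdots & \dfrac{\partial f_1(\mathbf{x})}{\partial x_n} \\ \vdots & \dfrac{\partial f_i(\mathbf{x})}{\partial x_j} & \vdots \\ \dfrac{\partial f_m(\mathbf{x})}{\partial x_1} & \cdots & \dfrac{\partial f_m(\mathbf{x})}{\partial x_n} \end{bmatrix} = \begin{bmatrix} (\operatorname{grad}_{\mathbf{x}} f_1)^T \\ \vdots \\ (\operatorname{grad}_{\mathbf{x}} f_m)^T \end{bmatrix}$$

and if $\mathbf{x}(t)$ is a differentiable time function, then

$$\dot{\mathbf{f}}(\mathbf{x}) \equiv \frac{d\mathbf{f}(\mathbf{x})}{dt} = \frac{d\mathbf{f}(\mathbf{x})}{d\mathbf{x}}\,\dot{\mathbf{x}}(t) = \mathbf{J}_f(\mathbf{x})\,\dot{\mathbf{x}}(t)$$

Notice that the rows of $\mathbf{J}_f$ are the transposes of the gradients of the individual functions.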
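When analytic derivatives are inconvenient, the Jacobian can be approximated by finite differences. The sketch below uses central differences; the function `numerical_jacobian`, the step size `h`, and the test function `f` are all assumptions made for illustration.

```python
def numerical_jacobian(f, x, h=1e-6):
    """Central-difference approximation of the m x n Jacobian of f at x."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h   # perturb only the j-th component
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)   # approximates d f_i / d x_j
    return J


def f(x):
    """Hypothetical test function from R^2 to R^2."""
    return [x[0] * x[1], x[0] + x[1] ** 2]


J = numerical_jacobian(f, [2.0, 3.0])
print(J)  # close to the analytic Jacobian [[3, 2], [1, 6]]
```

Row $i$ of the result approximates the transposed gradient of $f_i$, matching the row structure of $\mathbf{J}_f$.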
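Gradient

Given a bilinear form $b(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T\mathbf{A}\mathbf{y}$, we call gradients the following vectors:

- gradient with respect to $\mathbf{x}$: $\operatorname{grad}_{\mathbf{x}} b(\mathbf{x}, \mathbf{y}) \overset{\text{def}}{=} \dfrac{\partial b(\mathbf{x}, \mathbf{y})}{\partial \mathbf{x}} = \mathbf{A}\mathbf{y}$
- gradient with respect to $\mathbf{y}$: $\operatorname{grad}_{\mathbf{y}} b(\mathbf{x}, \mathbf{y}) \overset{\text{def}}{=} \dfrac{\partial b(\mathbf{x}, \mathbf{y})}{\partial \mathbf{y}} = \mathbf{A}^T\mathbf{x}$

Given the quadratic form $q(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x}$, we call gradient with respect to $\mathbf{x}$ the following vector:

$$\nabla_{\mathbf{x}} q(\mathbf{x}) \equiv \operatorname{grad}_{\mathbf{x}} q(\mathbf{x}) \overset{\text{def}}{=} \frac{\partial q(\mathbf{x})}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\mathbf{x}$$

which reduces to $2\mathbf{A}\mathbf{x}$ when $\mathbf{A}$ is symmetric.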
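The gradient of the quadratic form can be derived componentwise. The following sketch shows that the general result is $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$, which coincides with $2\mathbf{A}\mathbf{x}$ when $\mathbf{A}$ is symmetric:

```latex
q(\mathbf{x}) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\, x_i x_j
\quad\Longrightarrow\quad
\frac{\partial q}{\partial x_k}
  = \sum_{j=1}^{n} a_{kj}\, x_j + \sum_{i=1}^{n} a_{ik}\, x_i
  = (\mathbf{A}\mathbf{x})_k + (\mathbf{A}^T\mathbf{x})_k

\operatorname{grad}_{\mathbf{x}} q(\mathbf{x}) = (\mathbf{A} + \mathbf{A}^T)\,\mathbf{x}
  = 2\mathbf{A}\mathbf{x} \quad \text{if } \mathbf{A} = \mathbf{A}^T
```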