Matrices: A brief introduction
Basilio Bona
DAUIN, Politecnico di Torino
September 2013

Definitions

A matrix is a set of N real or complex numbers organized in m rows and n columns, with N = mn:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = [a_{ij}], \quad i = 1,\dots,m, \;\; j = 1,\dots,n$$

A matrix is always written as a boldface capital letter, as in A. To indicate matrix dimensions we use the symbols A_{m×n} or A ∈ F^{m×n}, where F = R for real elements and F = C for complex elements.

Transpose matrix

Given a matrix A_{m×n}, its transpose is the matrix obtained by exchanging rows and columns:

$$A^T_{n \times m} = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}$$

The following property holds: (A^T)^T = A.
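As a quick numerical illustration, here is a minimal numpy sketch (my own addition, not part of the original slides):

    import numpy as np

    # A 2x3 real matrix: m = 2 rows, n = 3 columns
    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

    print(A.shape)                     # (2, 3)
    print(A.T.shape)                   # (3, 2): rows and columns exchanged
    assert np.array_equal(A.T.T, A)    # (A^T)^T = A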

Square matrix

A matrix is said to be square when m = n. A square n×n matrix is upper triangular when a_ij = 0 for i > j:

$$A_{n \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$

If a square matrix is upper triangular, its transpose is lower triangular, and vice versa:

$$A^T_{n \times n} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{12} & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix}$$

Symmetric matrix

A real square matrix is said to be symmetric if A = A^T, i.e., A − A^T = O. A real symmetric matrix has at most n(n+1)/2 independent elements.

If a matrix K has complex elements k_{ij} = a_{ij} + j b_{ij} (where j = √−1), its conjugate is K̄, with elements k̄_{ij} = a_{ij} − j b_{ij}.

Given a complex matrix K, its adjoint K* is defined as the conjugate transpose, K* = (K̄)^T. A complex matrix is called self-adjoint or Hermitian when K = K*. Some textbooks denote this matrix by K^† or K^H.

Diagonal matrix

A square matrix is diagonal if a_ij = 0 for i ≠ j:

$$A_{n \times n} = \mathrm{diag}(a_i) = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{bmatrix}$$

A diagonal matrix is always symmetric.

Skew-symmetric matrix

A square matrix is skew-symmetric or antisymmetric if

A + A^T = O, i.e., A = −A^T

Given the constraints of the above relation, a generic skew-symmetric matrix has the following structure:

$$A_{n \times n} = \begin{bmatrix} 0 & a_{12} & \cdots & a_{1n} \\ -a_{12} & 0 & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ -a_{1n} & -a_{2n} & \cdots & 0 \end{bmatrix}$$

In a skew-symmetric matrix there are at most n(n−1)/2 nonzero independent elements. We will see below some important properties of skew-symmetric 3×3 matrices.

Block matrix

It is possible to represent a matrix with blocks as

$$A = \begin{bmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & A_{ij} & \vdots \\ A_{m1} & \cdots & A_{mn} \end{bmatrix}$$

where the blocks A_{ij} have suitable dimensions. Given the following matrices

$$A_1 = \begin{bmatrix} A_{11} & \cdots & A_{1n} \\ O & A_{ij} & \vdots \\ O & O & A_{mn} \end{bmatrix} \quad A_2 = \begin{bmatrix} A_{11} & O & O \\ \vdots & A_{ij} & O \\ A_{m1} & \cdots & A_{mn} \end{bmatrix} \quad A_3 = \begin{bmatrix} A_{11} & O & O \\ O & A_{ij} & O \\ O & O & A_{mn} \end{bmatrix}$$

A_1 is upper block triangular, A_2 is lower block triangular, and A_3 is block diagonal.

Matrix algebra

Matrices are elements of an algebra, i.e., a vector space together with a product operator. The main operations of this algebra are: product by a scalar, sum, and matrix product.

Product by a scalar:

$$\alpha A = \alpha \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \cdots & \alpha a_{1n} \\ \vdots & & \vdots \\ \alpha a_{m1} & \cdots & \alpha a_{mn} \end{bmatrix}$$

Sum:

$$A + B = \begin{bmatrix} a_{11}+b_{11} & \cdots & a_{1n}+b_{1n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \cdots & a_{mn}+b_{mn} \end{bmatrix}$$

Matrix sum

Sum properties:

A + O = A
A + B = B + A
(A + B) + C = A + (B + C)
(A + B)^T = A^T + B^T

The null (neutral, zero) element O is called the null matrix. The subtraction (difference) operation is defined using the scalar α = −1: A − B = A + (−1)B.

Matrix product

The operation is performed using the well-known "rows by columns" rule: the generic element c_ij of the matrix product C_{m×p} = A_{m×n} B_{n×p} is

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The bilinearity of the matrix product is guaranteed, since it is immediate to verify that, given a generic scalar α, the following identity holds:

α(A·B) = (αA)·B = A·(αB)
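A minimal numpy sketch (my own illustration) implementing the rows-by-columns rule and checking it against the built-in product:

    import numpy as np

    def matmul_rows_by_columns(A, B):
        """c_ij = sum_k a_ik * b_kj, written out explicitly."""
        m, n = A.shape
        n2, p = B.shape
        assert n == n2, "inner dimensions must agree"
        C = np.zeros((m, p))
        for i in range(m):
            for j in range(p):
                C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
        return C

    A = np.random.rand(3, 4)
    B = np.random.rand(4, 2)
    assert np.allclose(matmul_rows_by_columns(A, B), A @ B)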

Product

Product properties:

A·B·C = (A·B)·C = A·(B·C)
A·(B + C) = A·B + A·C
(A + B)·C = A·C + B·C
(A·B)^T = B^T A^T

In general:
the matrix product is non-commutative: A·B ≠ B·A, apart from particular cases;
A·B = A·C does not imply B = C, apart from particular cases;
A·B = O does not imply A = O or B = O, apart from particular cases.
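For instance, a quick numpy check that AB ≠ BA in general:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    B = np.array([[1.0, 0.0],
                  [1.0, 1.0]])

    print(A @ B)   # [[2. 1.], [1. 1.]]
    print(B @ A)   # [[1. 1.], [1. 2.]]  -- the two products differ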

Identity matrix

A neutral element with respect to the product exists and is called the identity matrix, written I_n, or simply I when no ambiguity arises. Given a rectangular matrix A_{m×n}, the following identities hold:

A_{m×n} = I_m A_{m×n} = A_{m×n} I_n

$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}$$

Idempotent matrix

Given a square matrix A ∈ R^{n×n}, its k-th power is

$$A^k = \prod_{l=1}^{k} A$$

A matrix is said to be idempotent if

A² = A, and hence A^n = A

Trace

The trace of a square matrix A_{n×n} is the sum of its diagonal elements:

$$\mathrm{tr}(A) = \sum_{k=1}^{n} a_{kk}$$

The trace satisfies the following properties:

tr(αA + βB) = α tr(A) + β tr(B)
tr(AB) = tr(BA)
tr(A) = tr(A^T)
tr(A) = tr(T^{-1} A T) for nonsingular T (see below)
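A small numpy sketch checking these properties on random matrices (a generic random T is nonsingular with probability one):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.random((3, 3))
    B = rng.random((3, 3))
    T = rng.random((3, 3))

    assert np.isclose(np.trace(A @ B), np.trace(B @ A))
    assert np.isclose(np.trace(A), np.trace(A.T))
    assert np.isclose(np.trace(A), np.trace(np.linalg.inv(T) @ A @ T))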

Minor

A minor of order p of a matrix A_{m×n} is the determinant D_p of a square submatrix obtained by selecting any p rows and p columns of A_{m×n}. (The formal definition of determinant is given below.)

There are as many minors as there are possible choices of p rows out of m and p columns out of n.

Given a matrix A_{m×n}, the principal minors of order k are the determinants D_k, with k = 1,...,min{m,n}, obtained by selecting the first k rows and the first k columns of A_{m×n}.

Minor and cofactor

Given A ∈ R^{n×n}, we indicate with A_{(ij)} ∈ R^{(n−1)×(n−1)} the matrix obtained by taking out the i-th row and the j-th column of A.

We define the minor D_rc of a generic element a_rc of a square matrix A_{n×n} as the determinant of the matrix obtained by taking out the r-th row and the c-th column, i.e.,

D_rc = det A_{(rc)}

We define the cofactor of an element a_rc of a square matrix A_{n×n} as the product

A_rc = (−1)^{r+c} D_rc

Determinant

Once the cofactor is defined, the determinant of a square matrix A can be defined by row, i.e., choosing a generic row i:

$$\det(A) = \sum_{k=1}^{n} a_{ik} (-1)^{i+k} \det(A_{(ik)}) = \sum_{k=1}^{n} a_{ik} A_{ik}$$

or, choosing a generic column j, we have the definition by column:

$$\det(A) = \sum_{k=1}^{n} a_{kj} (-1)^{k+j} \det(A_{(kj)}) = \sum_{k=1}^{n} a_{kj} A_{kj}$$

Since these definitions are recursive and assume the computation of determinants of smaller-order minors, it is necessary to define the determinant of a 1×1 matrix (a scalar): simply det(a_ij) = a_ij.
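A small recursive sketch of the cofactor (Laplace) expansion along the first row, checked against numpy. This is illustrative only: the recursion costs O(n!) and is not how determinants are computed in practice.

    import numpy as np

    def det_laplace(A):
        """Determinant by cofactor expansion along row 0."""
        n = A.shape[0]
        if n == 1:
            return A[0, 0]        # det of a 1x1 matrix is the scalar itself
        total = 0.0
        for k in range(n):
            A_0k = np.delete(np.delete(A, 0, axis=0), k, axis=1)  # drop row 0, column k
            total += A[0, k] * (-1) ** k * det_laplace(A_0k)      # (-1)^{0+k} * minor
        return total

    A = np.random.rand(4, 4)
    assert np.isclose(det_laplace(A), np.linalg.det(A))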

Properties of determinant det(a B) = det(a)det(b) det(a T ) = det(a) det(ka) = k n det(a) if one makes a number of s exchanges between rows or columns of A, obtaining a new matrix A s, we have det(a s ) = ( 1) s det(a) if A has two equal or proportional rows/columns, we have det(a) = 0 if A has a row or a column that is a linear combination of other rows or columns, we have det(a) = 0 if A è upper or lower triangular, we have det(a) = n i=1 a ii if A is block triangular, with p blocks A ii on the diagonal, we have det(a) = p i=1 deta ii Basilio Bona (DAUIN) Matrices September 2013 19 / 74

Singular matrix and rank

A matrix A is singular if det(A) = 0.

We define the rank of a matrix A_{m×n} as the number ρ(A_{m×n}) computed as the maximum integer p such that at least one nonzero minor D_p exists. The following properties hold:

ρ(A) ≤ min{m,n}
If ρ(A) = min{m,n}, A is said to have full rank.
If ρ(A) < min{m,n}, the matrix does not have full rank and is said to be rank-deficient.
ρ(AB) ≤ min{ρ(A), ρ(B)}
ρ(A) = ρ(A^T)
ρ(AA^T) = ρ(A^T A) = ρ(A)
If A is n×n and det(A) = 0, then A does not have full rank.

Invertible matrix

Given a square matrix A ∈ R^{n×n}, it is invertible or nonsingular if an inverse matrix A^{-1} ∈ R^{n×n} exists such that

A A^{-1} = A^{-1} A = I_n

The matrix is invertible iff ρ(A) = n, that is, iff it has full rank; this implies det(A) ≠ 0. The inverse matrix can be computed as

A^{-1} = (1/det(A)) Adj(A)

The following properties hold: (A^{-1})^{-1} = A; (A^T)^{-1} = (A^{-1})^T.

The inverse matrix, if it exists, allows one to solve the matrix equation y = Ax for the unknown x as x = A^{-1} y.
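A minimal numpy sketch of solving y = Ax through the inverse (illustrative; in practice np.linalg.solve is preferred, since it avoids forming A^{-1} explicitly):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    y = np.array([3.0, 5.0])

    assert not np.isclose(np.linalg.det(A), 0.0)   # full rank -> invertible
    x = np.linalg.inv(A) @ y                       # x = A^{-1} y
    assert np.allclose(A @ x, y)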

Orthonormal matrix

A square matrix is orthonormal if A^{-1} = A^T; the following identity holds:

A^T A = A A^T = I

Given two square matrices A and B of equal dimension n×n, the following identity holds:

(AB)^{-1} = B^{-1} A^{-1}

An important result, called the inversion lemma, establishes the following: if A, C are square invertible matrices and B, D are matrices of suitable dimensions, then

(A + BCD)^{-1} = A^{-1} − A^{-1} B (D A^{-1} B + C^{-1})^{-1} D A^{-1}

where the matrix (D A^{-1} B + C^{-1}) must be invertible. The inversion lemma is useful to compute the inverse of a sum of matrices A_1 + A_2 when A_2 is decomposable into the product BCD and C is easily invertible, for instance diagonal or triangular.
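A quick numerical check of the inversion lemma (my own sketch with random matrices, which satisfy the invertibility assumptions with probability one):

    import numpy as np

    rng = np.random.default_rng(1)
    n, k = 5, 2
    A = rng.random((n, n)) + n * np.eye(n)   # well-conditioned, invertible
    B = rng.random((n, k))
    C = rng.random((k, k)) + k * np.eye(k)
    D = rng.random((k, n))

    Ai = np.linalg.inv(A)
    lhs = np.linalg.inv(A + B @ C @ D)
    rhs = Ai - Ai @ B @ np.linalg.inv(D @ Ai @ B + np.linalg.inv(C)) @ D @ Ai
    assert np.allclose(lhs, rhs)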

Matrix derivative

If a matrix A(t) is composed of elements a_ij(t) that are all differentiable with respect to t, then the matrix derivative is

$$\frac{d}{dt}A(t) = \dot{A}(t) = \left[\frac{d}{dt} a_{ij}(t)\right] = [\dot{a}_{ij}(t)]$$

If a square matrix A(t) has rank ρ(A(t)) = n for all t, then the derivative of its inverse is

$$\frac{d}{dt}A(t)^{-1} = -A(t)^{-1}\,\dot{A}(t)\,A(t)^{-1}$$

Since the inverse operator is nonlinear, in general

$$\left[\frac{dA(t)}{dt}\right]^{-1} \neq \frac{d}{dt}\left[A(t)^{-1}\right]$$

Symmetric / skew-symmetric decomposition

Given a real matrix A ∈ R^{m×n}, the two matrices A^T A ∈ R^{n×n} and A A^T ∈ R^{m×m} are both symmetric.

Given a square matrix A, it is always possible to factor it into a sum of two matrices, as follows:

A = A_s + A_a

where

A_s = (1/2)(A + A^T) is a symmetric matrix
A_a = (1/2)(A − A^T) is a skew-symmetric matrix
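A numpy sketch of the decomposition (illustrative):

    import numpy as np

    A = np.random.rand(4, 4)
    As = 0.5 * (A + A.T)    # symmetric part
    Aa = 0.5 * (A - A.T)    # skew-symmetric part

    assert np.allclose(As, As.T)
    assert np.allclose(Aa, -Aa.T)
    assert np.allclose(A, As + Aa)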

Similarity transformation

Given a square matrix A ∈ R^{n×n} and a nonsingular square matrix T ∈ R^{n×n}, the new matrix B ∈ R^{n×n}, obtained as

B = T^{-1} A T    or    B = T A T^{-1}

is said to be similar to A, and the transformation T is called a similarity transformation.

Eigenvalues and eigenvectors

Consider the similarity transformation between A and Λ, where the latter is diagonal, Λ = diag(λ_i):

A = U Λ U^{-1},   U = [u_1 u_2 ... u_n]

Multiplying A on the right by U, one obtains AU = UΛ, and then

A u_i = λ_i u_i

This identity is the well-known formula that relates the matrix eigenvalues to the eigenvectors: the constant quantities λ_i are the eigenvalues of A, while the vectors u_i are the eigenvectors of A, in general with non-unit norm.

Eigenvalues and eigenvectors

Given a square matrix A_{n×n}, the solutions λ_i (real or complex) of the characteristic equation

det(λI − A) = 0

are the eigenvalues of A. det(λI − A) is a polynomial in λ, called the characteristic polynomial.

If the eigenvalues are all distinct, the vectors u_i that satisfy the identity A u_i = λ_i u_i are the eigenvectors of A.
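A numpy sketch checking the defining identity (illustrative):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    lam, U = np.linalg.eig(A)       # eigenvalues and (unit-norm) eigenvectors

    for i in range(2):
        assert np.allclose(A @ U[:, i], lam[i] * U[:, i])   # A u_i = lambda_i u_i
    print(lam)                      # 3.0 and 1.0 (order may vary)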

Generalized eigenvectors

If the eigenvalues are not all distinct, one obtains the so-called generalized eigenvectors, whose characterization goes beyond the scope of these notes.

From a geometrical point of view, the eigenvectors define those directions in R^n (the domain of the linear transformation represented by A) that are invariant under the transformation A, while the eigenvalues provide the related scale factors along these directions.

The set of eigenvalues of a matrix A will be indicated as Λ(A), that is, {λ_i(A)}; the set of eigenvectors of A will be indicated as {u_i(A)}.

In general, since the eigenvectors represent the invariant directions of the transformation, they are defined up to a constant factor, so they are usually normalized; this is a tacit assumption made here, unless otherwise stated.

Eigenvalue properties

Given a matrix A and its eigenvalues {λ_i(A)}, the following holds true:

{λ_i(A + cI)} = {λ_i(A) + c}

Given a matrix A and its eigenvalues {λ_i(A)}, the following holds true:

{λ_i(cA)} = {c λ_i(A)}

Given an upper or lower triangular matrix

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}, \qquad \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

its eigenvalues are the terms on the diagonal, {λ_i(A)} = {a_ii}; the same applies to a diagonal matrix.

Invariance of the eigenvalues

Given a matrix A_{n×n} and its eigenvalues {λ_i(A)}, the following holds true:

$$\det(A) = \prod_{i=1}^{n} \lambda_i \qquad \mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i$$

Given a general invertible transformation, represented by the matrix T, the eigenvalues of A are invariant under the similarity transformation

B = T^{-1} A T

or rather {λ_i(B)} = {λ_i(A)}.

Modal matrix

If we build a matrix M whose columns are the unit eigenvectors u_i(A) of A,

M = [u_1 ... u_n]

then the similarity transformation with respect to M results in a diagonal matrix:

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = M^{-1} A M$$

M takes the name of modal matrix. If A is symmetric, its eigenvalues are all real and the following identity holds:

Λ = M^T A M

In this particular case M is orthonormal.
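A numpy sketch of diagonalization via the modal matrix (the eigenvectors returned by eig are already unit-norm; eigh handles the symmetric case with an orthonormal M):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [2.0, 3.0]])                    # eigenvalues 5 and 2
    lam, M = np.linalg.eig(A)                     # columns of M are eigenvectors
    assert np.allclose(np.linalg.inv(M) @ A @ M, np.diag(lam))

    S = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                    # symmetric case
    lam_s, Ms = np.linalg.eigh(S)                 # Ms is orthonormal
    assert np.allclose(Ms.T @ S @ Ms, np.diag(lam_s))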

Singular value decomposition – SVD

Given a generic matrix A ∈ R^{m×n}, having rank r = ρ(A) ≤ s, with s = min{m,n}, it can be factorized in the following way:

Singular value decomposition (SVD):

$$A = U \Sigma V^T = \sum_{i=1}^{s} \sigma_i u_i v_i^T$$

The important elements of this decomposition are σ_i, u_i and v_i.

SVD

The σ_i(A) ≥ 0 are called singular values and are equal to the non-negative square roots of the eigenvalues of the symmetric matrix A^T A:

{σ_i(A)} = {√λ_i(A^T A)}

The σ_i ≥ 0 are listed in decreasing order:

σ_1 ≥ σ_2 ≥ ... ≥ σ_s ≥ 0

If the rank r < s, there are only r positive singular values; the remaining ones are zero:

σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0;   σ_{r+1} = ... = σ_s = 0

U is an orthonormal square matrix (m×m), U = [u_1 u_2 ... u_m], whose columns are the eigenvectors u_i of A A^T.

SVD

V is an orthonormal square matrix (n×n), V = [v_1 v_2 ... v_n], whose columns are the eigenvectors v_i of A^T A.

Σ is a rectangular matrix (m×n) with the following structure:

if m < n:  Σ = [Σ_s  O]
if m = n:  Σ = Σ_s
if m > n:  Σ = [Σ_s over O]  (Σ_s stacked on a zero block)

where Σ_s = diag(σ_i) is s×s diagonal, and its diagonal terms are the singular values.

SVD

Otherwise we can decompose A in a way that puts in evidence the positive singular values alone:

$$A = \underbrace{\begin{bmatrix} P & \bar{P} \end{bmatrix}}_{U} \underbrace{\begin{bmatrix} \Sigma_r & O \\ O & O \end{bmatrix}}_{\Sigma} \underbrace{\begin{bmatrix} Q^T \\ \bar{Q}^T \end{bmatrix}}_{V^T} = P \Sigma_r Q^T$$

where

P is an m×r orthonormal matrix;
P̄ is an m×(m−r) orthonormal matrix;
Q is an n×r orthonormal matrix, and Q̄ is an n×(n−r) orthonormal matrix;
Σ_r is an r×r diagonal matrix whose diagonal elements are the positive singular values σ_i, i = 1,...,r.
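A numpy sketch of both forms (numpy returns V^T directly; the compact P Σ_r Q^T form keeps only the r columns associated with positive singular values):

    import numpy as np

    A = np.random.rand(5, 3) @ np.random.rand(3, 4)   # 5x4 matrix with rank <= 3
    U, s, Vt = np.linalg.svd(A)                       # full SVD: U (5x5), s, Vt (4x4)

    Sigma = np.zeros(A.shape)
    np.fill_diagonal(Sigma, s)
    assert np.allclose(A, U @ Sigma @ Vt)

    r = np.sum(s > 1e-10)                             # numerical rank
    P, Sr, Qt = U[:, :r], np.diag(s[:r]), Vt[:r, :]
    assert np.allclose(A, P @ Sr @ Qt)                # compact form P Sigma_r Q^T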

SVD and rank

The rank r of A is equal to the number r ≤ s of nonzero singular values.

Given a generic matrix A ∈ R^{m×n}, the two matrices A^T A and A A^T are symmetric, have the same positive singular values, and differ only in the number of zero singular values.

Linear operators representation

Given two vector spaces X = R^n and Y = R^m, with dimensions n and m, and given two generic vectors x ∈ X and y ∈ Y, the generic linear transformation between the two spaces can be represented by the matrix operator A ∈ R^{m×n}, as follows:

y = Ax;   x ∈ R^n;   y ∈ R^m

Therefore a matrix can always be interpreted as an operator that transforms a vector from the domain X into the range Y. Conversely, a linear operator has at least one matrix that represents it.

Image space and null space

The image space or range of a transformation A is the subspace of Y defined by the following property:

R(A) = {y | y = Ax, x ∈ X};   R(A) ⊆ Y

The null space or kernel of a transformation A is the subspace of X defined by the following property:

N(A) = {x | Ax = 0, x ∈ X};   N(A) ⊆ X

The null space represents all the vectors in X that are transformed into the origin of Y. The dimensions of the range and kernel are called, respectively, the rank ρ(A) and the nullity ν(A):

ρ(A) = dim(R(A));   ν(A) = dim(N(A))

Image space and null space

If X and Y have finite dimensions, then the following equalities hold:

N(A)^⊥ = R(A^T)    R(A)^⊥ = N(A^T)
N(A) = R(A^T)^⊥    R(A) = N(A^T)^⊥

where ⊥ indicates the orthogonal complement of the corresponding (sub)space. We recall that {0}^⊥ = R^n.

The following orthogonal decompositions of the subspaces X and Y hold:

X = N(A) ⊕ N(A)^⊥ = N(A) ⊕ R(A^T)
Y = R(A) ⊕ R(A)^⊥ = R(A) ⊕ N(A^T)

where the symbol ⊕ represents the direct sum between subspaces.
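A numpy sketch computing a basis of N(A) from the SVD and checking the rank-nullity decomposition (scipy.linalg.null_space does the same job, but plain numpy suffices):

    import numpy as np

    A = np.random.rand(3, 5)              # almost surely rank 3
    U, s, Vt = np.linalg.svd(A)
    r = np.sum(s > 1e-10)                 # rank rho(A)
    null_basis = Vt[r:, :].T              # columns span N(A), dimension n - r

    assert r + null_basis.shape[1] == A.shape[1]      # rank + nullity = n
    assert np.allclose(A @ null_basis, 0.0)           # A x = 0 for every x in N(A)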

Generalized inverse

Given a generic real matrix A ∈ R^{m×n}, with m ≠ n, the inverse matrix is not defined. Nevertheless, it is possible to define a class of matrices A^−, called pseudo-inverses or generalized inverses, that satisfy the following relation:

A A^− A = A

If A has full rank, i.e., ρ(A) = min{m,n}, it is possible to define two classes of generalized inverses:

if m < n (i.e., ρ(A) = m), the right inverse of A is a matrix A_r ∈ R^{n×m} such that A A_r = I_m;
if n < m (i.e., ρ(A) = n), the left inverse of A is a matrix A_l ∈ R^{n×m} such that A_l A = I_n.

Pseudo-inverse matrix

Among the possible left or right inverses, two classes are important:

right pseudo-inverse (m < n): A_r^+ = A^T (A A^T)^{-1}. When ρ(A) = m, (A A^T)^{-1} exists.

left pseudo-inverse (n < m): A_l^+ = (A^T A)^{-1} A^T. When ρ(A) = n, (A^T A)^{-1} exists; this particular left pseudo-inverse (A^T A)^{-1} A^T is also known as the Moore-Penrose pseudo-inverse.

Moore-Penrose pseudo-inverse

In general, even if A^T A is not invertible, it is always possible to define a Moore-Penrose pseudo-inverse A^+ that satisfies the following relations:

A A^+ A = A
A^+ A A^+ = A^+
(A A^+)^T = A A^+
(A^+ A)^T = A^+ A

Left and right pseudo-inverses

The two pseudo-inverses A_r^+ and A_l^+ coincide with the traditional inverse matrix A^{-1} when A is square and full rank:

A^{-1} = A_r^+ = A_l^+ = A^+

The linear transformation associated with A ∈ R^{m×n}, y = Ax with x ∈ R^n and y ∈ R^m, is equivalent to a system of m linear equations in n unknowns, whose coefficients are the elements of A; this linear system can admit one solution, no solution, or an infinite number of solutions.

If we use the pseudo-inverses to solve the linear system y = Ax, we must distinguish two cases, assuming that A has full rank.
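A numpy check that the explicit formulas agree with np.linalg.pinv when A has full rank (random matrices have full rank with probability one):

    import numpy as np

    A = np.random.rand(3, 5)                  # m < n, full row rank
    Ar = A.T @ np.linalg.inv(A @ A.T)         # right pseudo-inverse
    assert np.allclose(A @ Ar, np.eye(3))
    assert np.allclose(Ar, np.linalg.pinv(A))

    B = np.random.rand(5, 3)                  # m > n, full column rank
    Bl = np.linalg.inv(B.T @ B) @ B.T         # left (Moore-Penrose) pseudo-inverse
    assert np.allclose(Bl @ B, np.eye(3))
    assert np.allclose(Bl, np.linalg.pinv(B))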

Linear systems solution 1

When n > m there are more unknowns than equations; among the infinitely many possible solutions x ∈ R^n, we choose the one with minimum norm ‖x*‖, given by

x* = A_r^+ y = A^T (A A^T)^{-1} y

All the other possible solutions of y = Ax are obtained as

x = x* + v = A_r^+ y + v

where v ∈ N(A) is a vector belonging to the null space of A, which has dimension n − m. These other possible solutions can be expressed also as

x = A_r^+ y + (I − A_r^+ A) w

where w ∈ R^n is a generic n×1 vector. The matrix I − A_r^+ A projects w onto the null space of A, transforming w into v ∈ N(A); this matrix is called a projection matrix.

Figure: Solution of y = Ax when n > m.
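A numpy sketch of the minimum-norm solution for an underdetermined system (illustrative, with a random full-row-rank A):

    import numpy as np

    A = np.random.rand(2, 4)                       # n = 4 unknowns, m = 2 equations
    y = np.random.rand(2)

    x_star = A.T @ np.linalg.inv(A @ A.T) @ y      # minimum-norm solution
    assert np.allclose(A @ x_star, y)

    w = np.random.rand(4)
    x_other = x_star + (np.eye(4) - np.linalg.pinv(A) @ A) @ w   # another exact solution
    assert np.allclose(A @ x_other, y)
    assert np.linalg.norm(x_star) <= np.linalg.norm(x_other) + 1e-12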

Linear systems solution 2

When m > n there are more equations than unknowns; no exact solution of y = Ax exists in general, only approximate solutions with an error e = y − Ax ≠ 0. Among these possible approximate solutions we choose the one minimizing the norm of the error:

x̂ = arg min_{x ∈ R^n} ‖y − Ax‖

The solution is

x̂ = A_l^+ y = (A^T A)^{-1} A^T y

Geometrically, Ax̂ is the orthogonal projection of y onto the subspace R(A), while the residual lies in the orthogonal complement R(A)^⊥ = N(A^T). The approximation error, also called the projection error, is

ê = (I − A A_l^+) y

and its norm is the lowest possible, as said above.

Figure: Solution of y = Ax when m > n.
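A numpy sketch of the least-squares case, checked against numpy's own solver (illustrative):

    import numpy as np

    A = np.random.rand(5, 2)                       # m = 5 equations, n = 2 unknowns
    y = np.random.rand(5)

    x_hat = np.linalg.inv(A.T @ A) @ A.T @ y       # left pseudo-inverse solution
    x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)   # numpy's least-squares solver
    assert np.allclose(x_hat, x_ls)

    e_hat = y - A @ x_hat                          # projection error
    assert np.allclose(A.T @ e_hat, 0.0)           # residual is orthogonal to R(A)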

Linear systems solution 3

The similarity between the projection matrix I − A_r^+ A and the matrix that gives the projection error, I − A A_l^+, is important and will be studied when projection matrices are treated.

In order to compute the generalized inverses, one can use the SVD. In particular, the pseudo-inverse is computed as

$$A^+ = V \begin{bmatrix} \Sigma_r^{-1} & O \\ O & O \end{bmatrix} U^T = Q \Sigma_r^{-1} P^T$$

Projections and projection matrices

The geometrical concept of the projection of a segment onto a plane can be extended and generalized to the elements of a vector space. This concept is important for the solution of a large number of problems, such as approximation, estimation, prediction and filtering problems.

Given an n-dimensional real vector space V(R^n), endowed with the scalar product, and a k ≤ n dimensional subspace W(R^k), it is possible to define the projection operator of vectors v ∈ V onto the subspace W. The projection operator is the square projection matrix P ∈ R^{n×n}, whose columns are the projections of the basis elements of V onto W.

A matrix is a projection matrix iff P² = P, i.e., iff it is idempotent.

The projection can be orthogonal or non-orthogonal; in the first case P is symmetric, in the second case it is generic. If P is a projection matrix, I − P is also a projection matrix.

Projection matrices

Some examples of projection matrices are those associated with the left pseudo-inverse,

P_1 = A A_l^+  and  P_2 = I − A A_l^+

and with the right pseudo-inverse,

P_3 = A_r^+ A  and  P_4 = I − A_r^+ A

From a geometrical point of view, P_1 projects every vector v ∈ V onto the range space R(A), while P_2 projects v onto its orthogonal complement R(A)^⊥ = N(A^T).
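A quick numpy check of idempotency and symmetry for these orthogonal projectors (sketch; A is random with full column rank):

    import numpy as np

    A = np.random.rand(5, 2)
    P1 = A @ np.linalg.pinv(A)        # projects onto R(A)
    P2 = np.eye(5) - P1               # projects onto R(A)-perp = N(A^T)

    assert np.allclose(P1 @ P1, P1)   # idempotent
    assert np.allclose(P2 @ P2, P2)
    assert np.allclose(P1, P1.T)      # orthogonal projection -> symmetric
    assert np.allclose(P1 @ A, A)     # vectors already in R(A) are unchanged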

Matrix norm 1

Similarly to what is done for vectors, it is possible to provide a measure of a matrix, i.e., of its magnitude, by defining a matrix norm. Since a matrix represents a linear transformation between vectors, the matrix norm measures how big this transformation is; but it must in some way normalize the result, to avoid that the magnitude of the transformed vector affects the norm. Hence the following definition:

$$\|A\| \stackrel{\mathrm{def}}{=} \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\|=1} \|Ax\|$$

Matrix norm 2

Given a square matrix A ∈ R^{n×n}, its norm must satisfy the following general (norm) axioms:

1. ‖A‖ > 0 for every A ≠ O;
2. ‖A‖ = 0 iff A = O;
3. ‖A + B‖ ≤ ‖A‖ + ‖B‖ (triangle inequality);
4. ‖αA‖ = |α| ‖A‖ for any scalar α and any matrix A;
5. ‖AB‖ ≤ ‖A‖ ‖B‖.

Given A ∈ R^{n×n} and its eigenvalues {λ_i(A)}, the following inequality holds true:

1/‖A^{-1}‖ ≤ |λ_i| ≤ ‖A‖,   i = 1,...,n

Matrix norm 3

Taking only real matrices into account, the most used norms are:

Spectral norm: ‖A‖₂ = √(max_i λ_i(A^T A))

Frobenius norm: ‖A‖_F = √(Σ_i Σ_j a_ij²) = √(tr(A^T A))

Max singular value: ‖A‖_σ = max_i σ_i(A)

Matrix norm 4

1-norm or max-norm: ‖A‖₁ = max_j Σ_{i=1}^m |a_ij|  (maximum column sum)

∞-norm: ‖A‖_∞ = max_i Σ_{j=1}^n |a_ij|  (maximum row sum)

In general,

‖A‖₂ = ‖A‖_σ  and  ‖A‖₂² ≤ ‖A‖₁ ‖A‖_∞
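These norms are all available in numpy; a small sketch checking the stated relations on a random matrix:

    import numpy as np

    A = np.random.rand(4, 3)
    n2 = np.linalg.norm(A, 2)            # spectral norm
    nF = np.linalg.norm(A, 'fro')        # Frobenius norm
    n1 = np.linalg.norm(A, 1)            # max column sum
    ninf = np.linalg.norm(A, np.inf)     # max row sum

    assert np.isclose(n2, np.linalg.svd(A, compute_uv=False).max())  # = sigma_max
    assert np.isclose(nF, np.sqrt(np.trace(A.T @ A)))
    assert n2 ** 2 <= n1 * ninf + 1e-12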

Skew-symmetric matrices

A square matrix S is called skew-symmetric or antisymmetric when

S + S^T = O, or S = −S^T

A skew-symmetric matrix has the following structure:

$$S_{n \times n} = \begin{bmatrix} 0 & s_{12} & \cdots & s_{1n} \\ -s_{12} & 0 & \cdots & s_{2n} \\ \vdots & & \ddots & \vdots \\ -s_{1n} & -s_{2n} & \cdots & 0 \end{bmatrix}$$

Therefore it has at most n(n−1)/2 independent elements.

Skew-symmetric matrices

For n = 3 we have n(n−1)/2 = 3, hence a skew-symmetric matrix has as many independent elements as a 3D vector v. Given a vector v = [v₁ v₂ v₃]^T it is possible to build S, and given a matrix S it is possible to extract the associated vector v. We indicate this fact using the symbol S(v), where, by convention,

$$S(v) = \begin{bmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{bmatrix}$$

Skew-symmetric matrices

Some properties:

Given any vector v ∈ R³: S^T(v) = −S(v) = S(−v)
Given two scalars λ₁, λ₂ ∈ R: S(λ₁u + λ₂v) = λ₁S(u) + λ₂S(v)
Given any two vectors v, u ∈ R³: S(u)v = u × v = −v × u = S(−v)u = S^T(v)u

Therefore S(u) is the matrix representation of the cross-product operator (u ×), and vice versa.
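A numpy sketch of S(v) and the cross-product identity (the helper skew() is my own naming):

    import numpy as np

    def skew(v):
        """Skew-symmetric matrix S(v) such that S(v) @ u = v x u."""
        return np.array([[0.0, -v[2], v[1]],
                         [v[2], 0.0, -v[0]],
                         [-v[1], v[0], 0.0]])

    u = np.random.rand(3)
    v = np.random.rand(3)

    assert np.allclose(skew(u) @ v, np.cross(u, v))   # S(u) v = u x v
    assert np.allclose(skew(u).T, -skew(u))           # S^T(u) = -S(u)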

Skew-symmetric matrices

The matrix S(u)S(u) = S²(u) is symmetric, and

S²(u) = u u^T − ‖u‖² I

Hence the dyadic product is

D(u,u) = u u^T = S²(u) + ‖u‖² I

Eigenvalues and eigenvectors of skew-symmetric matrices

Given a skew-symmetric matrix S(v), its eigenvalues are imaginary or zero:

λ₁ = 0,   λ₂,₃ = ±j‖v‖

The eigenvector related to the eigenvalue λ₁ = 0 is v; the other two are complex conjugate.

The set of skew-symmetric matrices is a vector space, denoted as so(3). Given two skew-symmetric matrices S₁ and S₂, we call commutator or Lie bracket the following operator:

[S₁, S₂] := S₁S₂ − S₂S₁

which is itself skew-symmetric. Skew-symmetric matrices form a Lie algebra, which is related to the Lie group of orthogonal matrices.

Orthogonal matrices

A square matrix A ∈ R^{n×n} is called orthogonal when

$$A^T A = \begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_n \end{bmatrix}$$

with α_i ≠ 0.

A square orthogonal matrix U ∈ R^{n×n} is called orthonormal when all the constants α_i equal 1:

U^T U = U U^T = I

Therefore U^{-1} = U^T.

Orthonormal matrices

Other properties:

The columns, as well as the rows, of U are orthogonal to each other and have unit norm.
‖U‖ = 1.
The determinant of U has unit modulus: |det(U)| = 1, therefore it can be +1 or −1.

Given a vector x, its orthonormal transformation is y = Ux.

Orthonormal matrices

If U is an orthonormal matrix, then ‖AU‖ = ‖UA‖ = ‖A‖. This property is in general valid also for unitary matrices, i.e., matrices such that U*U = I.

When U ∈ R^{3×3}, only 3 out of its 9 elements are independent.

The scalar product is invariant under orthonormal transformations:

(Ux)·(Uy) = (Ux)^T (Uy) = x^T U^T U y = x^T y = x·y

This means that vector lengths are invariant with respect to orthonormal transformations:

‖Ux‖ = √((Ux)^T (Ux)) = √(x^T U^T U x) = √(x^T x) = ‖x‖
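A numpy sketch with a planar rotation matrix (a proper rotation, det = +1), checking the invariances:

    import numpy as np

    th = 0.7
    U = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])      # rotation by th radians

    assert np.allclose(U.T @ U, np.eye(2))          # orthonormal: U^T U = I
    assert np.isclose(np.linalg.det(U), 1.0)        # proper rotation

    x = np.random.rand(2)
    y = np.random.rand(2)
    assert np.isclose((U @ x) @ (U @ y), x @ y)     # scalar product invariant
    assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))   # length invariant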

Orthonormal matrices

When considering orthonormal transformations, it is important to distinguish two cases: when det(U) = +1, U represents a proper rotation, or simply a rotation; when det(U) = −1, U represents an improper rotation or reflection.

The set of rotations forms a continuous group that is non-commutative with respect to the product; the set of reflections does not have this quality. Intuitively, this means that infinitesimal rotations exist, while infinitesimal reflections have no meaning.

Reflections are the most basic transformations in 3D spaces, in the sense that translations, rotations and roto-reflections (slidings) are obtained from the composition of two or three reflections.

Figure: Reflections.

Orthonormal matrices

If U is an orthonormal matrix, the distributive property with respect to the cross product holds:

U(x × y) = (Ux) × (Uy)

(with general matrices A this is not true). For any proper rotation matrix U and a generic vector x the following holds:

U S(x) U^T y = U(x × (U^T y)) = (Ux) × (U U^T y) = (Ux) × y = S(Ux) y

where S(x) is the skew-symmetric matrix associated with x; therefore:

U S(x) U^T = S(Ux)
U S(x) = S(Ux) U

Bilinear and quadratic forms

A bilinear form associated with the matrix A ∈ R^{m×n} is the scalar quantity defined as

b(x,y) := x^T A y = y^T A^T x

A quadratic form associated with the square matrix A ∈ R^{n×n} is the scalar quantity defined as

q(x) := x^T A x = x^T A^T x

Every quadratic form associated with a skew-symmetric matrix S(y) is identically zero:

x^T S(y) x ≡ 0   ∀x

Indeed, setting w = S(y)x = y × x, one obtains x^T S(y)x = x^T w; but since, by definition, w is orthogonal to both y and x, the scalar product x^T w is always zero, and so is the quadratic form on the left-hand side.

Positive definite matrices 1

Recalling the standard decomposition of a generic square matrix A into a symmetric term A_s and a skew-symmetric one A_a, one concludes that the quadratic form depends only on the symmetric part of the matrix:

q(x) = x^T A x = x^T (A_s + A_a) x = x^T A_s x

A square matrix A is said to be positive definite if the associated quadratic form x^T A x satisfies the following conditions:

x^T A x > 0   ∀x ≠ 0
x^T A x = 0   ⟺ x = 0

A square matrix A is said to be positive semidefinite if the associated quadratic form x^T A x satisfies the following condition:

x^T A x ≥ 0   ∀x

A square matrix A is said to be negative definite if −A is positive definite; similarly, a square matrix A is negative semidefinite if −A is positive semidefinite.

Positive definite matrices 2

Often we use the following notations:

positive definite matrix: A ≻ 0
positive semidefinite matrix: A ⪰ 0
negative definite matrix: A ≺ 0
negative semidefinite matrix: A ⪯ 0

A necessary but not sufficient condition for a square matrix A to be positive definite is that the elements on its diagonal are all strictly positive. A necessary and sufficient condition for a square matrix A to be positive definite is that all its eigenvalues are strictly positive.

Sylvester criterion

The Sylvester criterion states that a square matrix A is positive definite iff all its principal minors are strictly positive.

A positive definite matrix has full rank and is always invertible. The associated quadratic form x^T A x satisfies the following inequality:

λ_min(A) ‖x‖² ≤ x^T A x ≤ λ_max(A) ‖x‖²

where λ_min(A) and λ_max(A) are, respectively, the minimum and the maximum eigenvalues.
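A numpy sketch of the Sylvester criterion, using the principal minors D_k defined earlier (first k rows and columns):

    import numpy as np

    def is_positive_definite(A, tol=1e-12):
        """Sylvester criterion: all principal minors strictly positive."""
        n = A.shape[0]
        return all(np.linalg.det(A[:k, :k]) > tol for k in range(1, n + 1))

    A = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])          # a classic positive definite matrix

    assert is_positive_definite(A)
    assert np.all(np.linalg.eigvalsh(A) > 0)  # equivalent eigenvalue test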

Semidefinite matrices and rank

A positive semidefinite matrix A_{n×n} has rank ρ(A) = r < n, i.e., it has r strictly positive eigenvalues and n − r zero eigenvalues. The quadratic form goes to zero for every vector x ∈ N(A).

Given a real matrix of generic dimensions A_{m×n}, we have seen that both A^T A and A A^T are symmetric; in addition we know that

ρ(A^T A) = ρ(A A^T) = ρ(A)

These matrices have all real, non-negative eigenvalues, and therefore they are positive definite or positive semidefinite. In particular, if A_{m×n} has full rank, then:

if m < n, A^T A ⪰ 0 and A A^T ≻ 0;
if m = n, A^T A ≻ 0 and A A^T ≻ 0;
if m > n, A^T A ≻ 0 and A A^T ⪰ 0.

Matrix derivatives 1

If the elements of a matrix A are functions of a quantity x, one can define the matrix derivative with respect to x as

$$\frac{d}{dx}A(x) := \left[\frac{d a_{ij}}{dx}\right]$$

If x is the time t, one writes

$$\dot{A}(t) := \frac{d}{dt}A(t) = \left[\frac{d a_{ij}(t)}{dt}\right] = [\dot{a}_{ij}]$$

If A is a function of time through the variable x(t), then

$$\dot{A}(x(t)) := \frac{d}{dt}A(x(t)) = \left[\frac{\partial a_{ij}(x)}{\partial x}\,\frac{dx(t)}{dt}\right] = \left[\frac{\partial a_{ij}(x)}{\partial x}\,\dot{x}(t)\right]$$

Matrix derivatives 2

Given a scalar function φ(x) of a vector argument, φ(·): R^n → R, the gradient of the function φ with respect to x is the column vector

$$\nabla_x \varphi = \frac{\partial \varphi}{\partial x} := \begin{bmatrix} \frac{\partial \varphi(x)}{\partial x_1} \\ \vdots \\ \frac{\partial \varphi(x)}{\partial x_n} \end{bmatrix} = \mathrm{grad}_x \varphi$$

If x(t) is a differentiable time function, then

$$\frac{d\varphi(x)}{dt} = \nabla_x^T \varphi(x)\,\dot{x}$$

(Notice the convention: the gradient for us is a column vector, although many textbooks assume it is a row vector.)

Jacobian matrix

Given an m×1 vector function f(x) = [f₁(x) ... f_m(x)]^T, x ∈ R^n, the Jacobian matrix (or simply the Jacobian) is the m×n matrix defined as

$$J_f(x) = \begin{bmatrix} \left(\frac{\partial f_1(x)}{\partial x}\right)^T \\ \vdots \\ \left(\frac{\partial f_m(x)}{\partial x}\right)^T \end{bmatrix} = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & \frac{\partial f_i(x)}{\partial x_j} & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$

If x(t) is a differentiable time function, then

$$\dot{f}(x) = \frac{df(x)}{dt} = \frac{df(x)}{dx}\,\dot{x}(t) = J_f(x)\,\dot{x}(t)$$

Notice that the rows of J_f are the transposes of the gradients of the component functions:

$$J_f(x) = \begin{bmatrix} (\mathrm{grad}_x f_1)^T \\ \vdots \\ (\mathrm{grad}_x f_m)^T \end{bmatrix}$$

Gradient

Given a bilinear form b(x,y) = x^T A y, we call gradients the following vectors:

gradient with respect to x: grad_x b(x,y) := ∂b(x,y)/∂x = Ay
gradient with respect to y: grad_y b(x,y) := ∂b(x,y)/∂y = A^T x

Given the quadratic form q(x) = x^T A x, we call gradient with respect to x the following vector:

∇_x q(x) = grad_x q(x) := ∂q(x)/∂x = 2Ax

(valid for symmetric A; for a generic A the gradient is (A + A^T)x = 2A_s x, consistent with the fact that q depends only on the symmetric part of A).
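A final numpy sketch checking these gradients against central finite differences (my own addition; note the (A + A^T)x form for a non-symmetric A):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.random((3, 3))
    x = rng.random(3)
    y = rng.random(3)

    def num_grad(f, x, h=1e-6):
        """Central-difference gradient of a scalar function f at x."""
        g = np.zeros_like(x)
        for i in range(len(x)):
            e = np.zeros_like(x); e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2 * h)
        return g

    b = lambda x_: x_ @ A @ y               # bilinear form as a function of x
    assert np.allclose(num_grad(b, x), A @ y, atol=1e-5)

    q = lambda x_: x_ @ A @ x_              # quadratic form
    assert np.allclose(num_grad(q, x), (A + A.T) @ x, atol=1e-5)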