Computational Foundations of Cognitive Science

Lecture 1: Algebraic Properties of Matrix Operations; Transpose; Inner and Outer Product

Frank Keller
School of Informatics, University of Edinburgh
keller@inf.ed.ac.uk
February 23, 2010

Reading: Anton and Busby, Ch. 3.2

Properties of Matrix Operations

Matrix addition and scalar multiplication obey the laws familiar from arithmetic with real numbers.

Theorem: Properties of Addition and Scalar Multiplication
If a and b are scalars, and if the sizes of the matrices A, B, and C are such that the operations can be performed, then:

- A + B = B + A (commutative law for addition)
- A + (B + C) = (A + B) + C (associative law for addition)
- (ab)A = a(bA)
- (a + b)A = aA + bA
- (a - b)A = aA - bA
- a(A + B) = aA + aB
- a(A - B) = aA - aB

However, matrix multiplication is not commutative, i.e., in general AB != BA. There are three possible reasons for this:

- AB is defined, but BA is not (e.g., A is 2x3, B is 3x4);
- AB and BA are both defined, but differ in size (e.g., A is 2x3, B is 3x2);
- AB and BA are both defined and of the same size, but they are different.

Example: assume

A = [-1  0]    B = [1  2]
    [ 2  3]        [3  0]

then

AB = [-1  -2]    BA = [ 3  6]
     [11   4]         [-3  0]
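The non-commutativity example above can be checked numerically. The sketch below assumes NumPy is available and uses the `@` operator for matrix multiplication:

```python
import numpy as np

# The two matrices from the example above
A = np.array([[-1, 0],
              [ 2, 3]])
B = np.array([[1, 2],
              [3, 0]])

AB = A @ B  # [[-1, -2], [11, 4]]
BA = B @ A  # [[ 3,  6], [-3, 0]]

# AB and BA have the same shape but different entries: AB != BA
print(np.array_equal(AB, BA))  # False
```

This illustrates the third case listed above: both products are defined and 2x2, yet they differ.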
Properties of Matrix Multiplication

While the commutative law is not valid for matrix multiplication, many properties of multiplication of real numbers carry over.

Theorem: Properties of Matrix Multiplication
If a is a scalar, and if the sizes of the matrices A, B, and C are such that the operations can be performed, then:

- A(BC) = (AB)C (associative law for multiplication)
- A(B + C) = AB + AC (left distributive law)
- (B + C)A = BA + CA (right distributive law)
- A(B - C) = AB - AC
- (B - C)A = BA - CA
- a(BC) = (aB)C = B(aC)

Therefore, we can write A + B + C and ABC without parentheses.

Zero Matrix

A matrix whose entries are all zero is called a zero matrix. It is denoted as 0, or 0_(n x m) if the dimensions matter:

[0]    [0 0]    [0 0 0]
       [0 0]    [0 0 0]

The zero matrix plays a role in matrix algebra that is similar to that of 0 in the algebra of real numbers. But again, not all properties carry over.

Theorem: Properties of the Zero Matrix
If c is a scalar, and if the sizes of the matrices are such that the operations can be performed, then:

- A + 0 = 0 + A = A
- A - 0 = A
- A - A = A + (-A) = 0
- 0A = 0
- if cA = 0, then c = 0 or A = 0

However, the cancellation law of real numbers does not hold for matrices: if AB = AC and A != 0, then not in general B = C.

Example: assume

A = [0  1]    B = [1  1]    C = [2  5]
    [0  2]        [3  4]        [3  4]

It holds that

AB = AC = [3  4]
          [6  8]

So even though A != 0, we can't conclude that B = C.
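The failure of the cancellation law can likewise be verified with NumPy, using the three matrices from the example above:

```python
import numpy as np

# Matrices from the cancellation-law example
A = np.array([[0, 1],
              [0, 2]])
B = np.array([[1, 1],
              [3, 4]])
C = np.array([[2, 5],
              [3, 4]])

# AB and AC are equal even though B != C and A is not the zero matrix
print(np.array_equal(A @ B, A @ C))  # True
print(np.array_equal(B, C))          # False
print((A @ B).tolist())              # [[3, 4], [6, 8]]
```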
Identity Matrix

We saw that if cA = 0, then c = 0 or A = 0. Does this extend to the matrix-matrix product? In other words, can we conclude that if CA = 0, then C = 0 or A = 0?

Assume C = [0  1]
           [0  2]

Can you come up with an A so that CA = 0?

A square matrix with ones on the main diagonal and zeros everywhere else is an identity matrix. It is denoted as I, or I_n to indicate the size:

[1]    [1 0]    [1 0 0]    [1 0 0 0]
       [0 1]    [0 1 0]    [0 1 0 0]
                [0 0 1]    [0 0 1 0]
                           [0 0 0 1]

The identity matrix I plays a role in matrix algebra similar to that of 1 in the algebra of real numbers, where a * 1 = 1 * a = a.

Identity Matrix: Representing Images

Multiplying a matrix with I will leave that matrix unchanged.

Theorem: Identity Matrix
If A is an n x m matrix, then A I_m = A and I_n A = A.

Assume we use matrices to represent greyscale images. If we multiply an image with I, then it remains unchanged:

A I_3 = [a11 a12 a13] [1 0 0]   [a11 a12 a13]
        [a21 a22 a23] [0 1 0] = [a21 a22 a23] = A
                      [0 0 1]

I_2 A = [1 0] [a11 a12 a13]   [a11 a12 a13]
        [0 1] [a21 a22 a23] = [a21 a22 a23] = A

[image: 255 I rendered as a white diagonal on black; A I rendered identical to the image A]

Recall that 0 is black, and 255 is white.
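The identity property, and one possible answer to the puzzle above, can be checked with NumPy (`np.eye(n)` builds I_n; the particular choice of A below is just one of many that work):

```python
import numpy as np

C = np.array([[0, 1],
              [0, 2]])
# One possible answer to the puzzle: any A whose second row is zero
A = np.array([[1, 1],
              [0, 0]])
print(np.array_equal(C @ A, np.zeros((2, 2))))  # True: CA = 0, yet C != 0 and A != 0

# Identity matrices leave a matrix unchanged: A I_m = A and I_n A = A
M = np.array([[1, 2, 3],
              [4, 5, 6]])                  # a 2x3 matrix (n = 2, m = 3)
print(np.array_equal(M @ np.eye(3), M))    # True
print(np.array_equal(np.eye(2) @ M, M))    # True
```

So the implication "cA = 0 implies c = 0 or A = 0" does not carry over from the scalar-matrix to the matrix-matrix product.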
Transpose and Trace

Definition: Transpose
If A is an m x n matrix, then the transpose of A, denoted by A^T, is defined to be the n x m matrix that is obtained by making the rows of A into columns: (A)_ij = (A^T)_ji.

Examples:

A = [a11 a12 a13 a14]    B = [2 3]    C = [1 3 5]    D = [4]
    [a21 a22 a23 a24]        [1 4]
    [a31 a32 a33 a34]        [5 6]

A^T = [a11 a21 a31]    B^T = [2 1 5]    C^T = [1]    D^T = [4]
      [a12 a22 a32]          [3 4 6]          [3]
      [a13 a23 a33]                           [5]
      [a14 a24 a34]

If A is a square matrix, we can obtain A^T by interchanging the entries that are symmetrically positioned about the main diagonal:

A = [1 -2  4]    A^T = [ 1  3  5]
    [3  7  0]          [-2  7  8]
    [5  8 -6]          [ 4  0 -6]

Definition: Trace
If A is a square matrix, then the trace of A, denoted by tr(A), is defined to be the sum of the entries on the main diagonal of A. For the matrix A above:

tr(A) = tr(A^T) = 1 + 7 + (-6) = 2

Properties of the Transpose and the Trace

Theorem: Properties of the Transpose
If the sizes of the matrices are such that the stated operations can be performed, then:

- (A^T)^T = A
- (A + B)^T = A^T + B^T
- (A - B)^T = A^T - B^T
- (kA)^T = k A^T
- (AB)^T = B^T A^T

Theorem: Transpose and Dot Product

- Au . v = u . A^T v
- u . Av = A^T u . v

Theorem: Properties of the Trace
If A and B are square matrices of the same size, then:

- tr(A^T) = tr(A)
- tr(cA) = c tr(A)
- tr(A + B) = tr(A) + tr(B)
- tr(A - B) = tr(A) - tr(B)
- tr(AB) = tr(BA)
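Two of the less obvious identities above, (AB)^T = B^T A^T (note the reversed order) and tr(AB) = tr(BA), can be spot-checked on random matrices; the sketch below uses NumPy's `.T` attribute and `np.trace`:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 3))
B = rng.integers(-5, 5, size=(3, 3))

# (AB)^T = B^T A^T -- the factors swap order under transposition
assert np.array_equal((A @ B).T, B.T @ A.T)

# tr(AB) = tr(BA), even though in general AB != BA
assert np.trace(A @ B) == np.trace(B @ A)

# Trace of the example matrix from the slide above: 1 + 7 + (-6) = 2
M = np.array([[1, -2,  4],
              [3,  7,  0],
              [5,  8, -6]])
print(np.trace(M))  # 2
```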
Transpose: Representing Images

We can transpose a matrix representing an image:

[image: A^T shown as the image A flipped about its main diagonal; (A^T)^T restores the original image]

Inner and Outer Product

Definition: Inner and Outer Product
If u and v are column vectors with the same size, then u^T v is the inner product of u and v; if u and v are column vectors of any size, then u v^T is the outer product of u and v.

Theorem: Properties of Inner and Outer Product

- u^T v = tr(u v^T)
- u^T v = u . v = v . u = v^T u
- tr(u v^T) = tr(v u^T) = u . v

Example: assume

u = [1]    v = [-2]
    [3]        [ 5]

u^T v = [1 3] [-2] = [1(-2) + 3(5)] = [13] = 13
              [ 5]

u v^T = [1] [-2 5] = [1(-2)  1(5)] = [-2   5]
        [3]          [3(-2)  3(5)]   [-6  15]

In general:

u^T v = [u1 u2 ... un] [v1] = [u1 v1 + u2 v2 + ... + un vn] = u . v
                       [v2]
                       [..]
                       [vn]

u v^T = [u1] [v1 v2 ... vn] = [u1 v1  u1 v2  ...  u1 vn]
        [u2]                  [u2 v1  u2 v2  ...  u2 vn]
        [..]                  [ ...    ...   ...   ... ]
        [un]                  [un v1  un v2  ...  un vn]

Summary

- Matrix addition and scalar multiplication are commutative, associative, and distributive;
- matrix multiplication is associative and distributive, but not commutative: AB != BA;
- the zero matrix 0 consists of only zeros; the identity matrix I consists of ones on the diagonal and zeros everywhere else;
- transpose A^T: (A)_ij = (A^T)_ji;
- trace tr(A): sum of the entries on the main diagonal;
- the trace and the transpose are distributive;
- inner product: u^T v; outer product: u v^T.
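The inner- and outer-product example, and the identity u^T v = tr(u v^T), can be reproduced with NumPy (`@` on 1-D arrays computes the inner product; `np.outer` builds u v^T):

```python
import numpy as np

u = np.array([1, 3])
v = np.array([-2, 5])

inner = u @ v            # u^T v = 1*(-2) + 3*5 = 13
outer = np.outer(u, v)   # u v^T, a 2x2 matrix

print(inner)             # 13
print(outer.tolist())    # [[-2, 5], [-6, 15]]

# u^T v = tr(u v^T): the inner product is the trace of the outer product
print(np.trace(outer) == inner)  # True
```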