Lecture 19: Introduction to Linear Transformations

Winfried Just, Ohio University, October 11, 2017

Scope of this lecture

Linear transformations are important and useful. Many applications of linear algebra involve linear transformations. Linear algebra is much easier to understand when you look at it through the lens of linear transformations. Linear transformations are not hard to understand when one thinks of them in terms of concrete examples.

Here I will develop the theory of linear transformations only as far as it directly relates to the remainder of this course and omit its more abstract aspects. In particular, we will (almost always) think of a linear transformation as a certain type of function $T : \mathbb{R}^n \to \mathbb{R}^m$, where $\mathbb{R}$ is the set of real numbers.

Note: The textbook uses $L$ instead of $T$.

A motivating example

Think about all portfolios that contain only shares of Tried-and-True and of Get-Rich-Fast. The current value of such a portfolio can be represented by a vector $\vec{v}_{curr} = \begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$, where $x$ denotes the current value of its Tried-and-True shares and $y$ denotes the current value of its Get-Rich-Fast shares. If you are willing to think about negative values for $x$ and $y$ as shares owed, each point in $\mathbb{R}^2$ represents exactly one portfolio.

We cannot know for sure what will happen to the shares over the next year, but we do know that the value next year can also be represented as a point $\vec{v}_{next}$ in $\mathbb{R}^2$. We can think about the events of the coming year, whatever they may be, as transforming $\vec{v}_{curr}$ into $\vec{v}_{next}$. This description defines a function, or transformation, $T : \mathbb{R}^2 \to \mathbb{R}^2$ such that $T(\vec{v}_{curr}) = \vec{v}_{next}$.

Ohio University, since 1804. Department of Mathematics. Winfried Just, Ohio University. MATH3200, Lecture 19: Linear Transformations.

I wish I knew a formula for T...

... but I don't. Neither does anybody else. Next year, of course, every investor will have 20-20 hindsight. But a bit of abstraction is quite useful here: even without the formula, we can already determine two properties of $T$.

If you double the holdings in your current portfolio, its value next year will also increase by a factor of 2 relative to what it would have been for your current holdings. The same goes for tripling (multiplying by a factor of $\lambda = 3$ instead of $\lambda = 2$), and so on. In mathematical terms: $T(\lambda \vec{v}) = \lambda T(\vec{v})$ for all scalars $\lambda$.

Moreover, if you wish to merge your portfolio $\vec{v}$ with the portfolio $\vec{w}$ of your Significant Other, it does not matter whether the merger is done now or next year: $T(\vec{v} + \vec{w}) = T(\vec{v}) + T(\vec{w})$.

Linear transformations: Our definition

Definition (for the purpose of this course). Let $n, m$ be positive integers and let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a function. Then $T$ is called a linear transformation if it satisfies both of the following conditions for all vectors $\vec{v}, \vec{w}$ in $\mathbb{R}^n$ and all scalars $\lambda$ in $\mathbb{R}$:

(i) $T(\lambda \vec{v}) = \lambda T(\vec{v})$
(ii) $T(\vec{v} + \vec{w}) = T(\vec{v}) + T(\vec{w})$.

These two simple properties have many important consequences. They can be derived for the much broader class of functions $T : V \to W$, where $V, W$ are abstract vector spaces and the scalars are also allowed to be complex numbers or members of any algebraic field. This more general theory is rather abstract and gives linear transformations a reputation for being a difficult concept. Our definition will do here, but you should know that it is just a special case and that everything works the same way if, in particular, the scalars are complex numbers.
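Conditions (i) and (ii) can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original lecture; the helper names (`scale`, `add`, `close`, `looks_linear`) and the two sample functions `mix` and `shift` are made up for illustration. Passing the check on a few samples is only evidence of linearity, not a proof, whereas a failure is a genuine counterexample.

```python
# Vectors are plain Python lists of numbers. All names here are hypothetical.

def scale(lam, v):
    """Return the scalar multiple lam * v."""
    return [lam * vi for vi in v]

def add(v, w):
    """Return the vector sum v + w."""
    return [vi + wi for vi, wi in zip(v, w)]

def close(v, w, tol=1e-9):
    """Compare two vectors entry by entry, up to a small tolerance."""
    return all(abs(vi - wi) <= tol for vi, wi in zip(v, w))

def looks_linear(T, samples, scalars):
    """Spot-check conditions (i) and (ii) on the given samples only."""
    for v in samples:
        for lam in scalars:
            # Condition (i): T(lam * v) == lam * T(v)
            if not close(T(scale(lam, v)), scale(lam, T(v))):
                return False
        for w in samples:
            # Condition (ii): T(v + w) == T(v) + T(w)
            if not close(T(add(v, w)), add(T(v), T(w))):
                return False
    return True

def mix(v):
    """(x, y) -> (2x + y, x - y); satisfies both conditions."""
    return [2 * v[0] + v[1], v[0] - v[1]]

def shift(v):
    """(x, y) -> (x + 1, y); violates condition (i) already for lam = 0."""
    return [v[0] + 1, v[1]]

samples = [[1.0, 0.0], [0.0, 1.0], [2.0, -3.0]]
scalars = [0.0, 2.0, -1.5]
mix_ok = looks_linear(mix, samples, scalars)
shift_ok = looks_linear(shift, samples, scalars)
```

The shift fails because a linear transformation must send the zero vector to the zero vector, which follows from condition (i) with $\lambda = 0$.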

My Crystal Ball

Let us return to our example. Recall that the value of a portfolio is represented by a vector $\vec{v}_{curr} = \begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$, where $x$ denotes the value of its Tried-and-True shares and $y$ denotes the value of its Get-Rich-Fast shares. We defined a transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T(\vec{v}_{curr}) = \vec{v}_{next}$.

Now I'm going to tell you a secret. Will you keep it? My Crystal Ball tells me that Tried-and-True shares will gain 200% over the coming year, while Get-Rich-Fast shares will lose 50% of their value. Trust me.

What is the formula for $T$?

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 3x \\ 0.5y \end{bmatrix} \]

Where have we seen this before?

Stretching and compressing a sheet of rubber

In Lecture 6 this example was called $T_A(\vec{v}) = A\vec{v}$:

\[ T_A\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = A\vec{v} = \begin{bmatrix} 3 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3x \\ 0.5y \end{bmatrix} \]

The matrix here is

\[ A = \begin{bmatrix} 3 & 0 \\ 0 & 0.5 \end{bmatrix}. \]

This transformation was interpreted as a threefold stretch in the horizontal ($x$-) direction and a twofold compression in the vertical ($y$-) direction of a sheet of rubber that lies flat on a surface.

We see that the same transformation pops up in two very different contexts. It can be defined as multiplying a matrix by a vector.
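As a small illustration of "multiplying a matrix by a vector", here is a sketch in plain Python. The helper `mat_vec` and the sample vector are assumptions made for this example; the matrix is the threefold horizontal stretch and twofold vertical compression described above.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a column vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Threefold stretch in x, twofold compression in y.
A = [[3.0, 0.0],
     [0.0, 0.5]]

v = [2.0, 4.0]            # a sample point (x, y) = (2, 4), chosen arbitrarily
stretched = mat_vec(A, v)  # (3 * 2, 0.5 * 4)
```

Read as a portfolio, the same computation says that holdings worth 2 and 4 today would be worth 6 and 2 next year under the Crystal Ball's prediction.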

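Products of matrices and column vectors

Let $A$ be a matrix of order $m \times n$ and let $\vec{v}$ be an $n \times 1$ column vector.

\[ A\vec{v} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} w_1 \\ \vdots \\ w_m \end{bmatrix} \]

Then $A\vec{v}$ is an $m \times 1$ column vector. This defines a transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by $T_A(\vec{v}) = A\vec{v}$.

By general properties of matrix multiplication:

(i) $T_A(\lambda \vec{v}) = A(\lambda \vec{v}) = \lambda A\vec{v} = \lambda T_A(\vec{v})$
(ii) $T_A(\vec{v} + \vec{w}) = A(\vec{v} + \vec{w}) = A\vec{v} + A\vec{w} = T_A(\vec{v}) + T_A(\vec{w})$.

The transformation $T_A$ is linear!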
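A worked example with a non-square matrix may help to see why the domain and the target space can have different dimensions. The matrix and vector below are chosen only for illustration and do not come from the lecture.

```latex
% A is of order 2 x 3, so T_A maps R^3 to R^2.
\[
A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \end{bmatrix},
\qquad
\vec{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},
\qquad
T_A(\vec{v}) = A\vec{v}
= \begin{bmatrix} 1\cdot 1 + 2\cdot 1 + 0\cdot 1 \\ 0\cdot 1 + 1\cdot 1 + 3\cdot 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 4 \end{bmatrix}.
\]
% Property (i) with lambda = 2: doubling the input doubles the output.
\[
T_A(2\vec{v}) = A \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}
= \begin{bmatrix} 6 \\ 8 \end{bmatrix}
= 2\,T_A(\vec{v}).
\]
```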

Where have we seen this before?

We have already seen examples of transformations $T_A(\vec{v}) = A\vec{v}$ for $2 \times 2$ square matrices $A$ and interpreted them geometrically.

Now suppose you have a transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ that first compresses a sheet of rubber by a factor of 2 in the vertical direction, stretches it by a factor of 3 in the horizontal direction, and then rotates it by an angle of $\pi/3$. Given any two points on the sheet that are represented by vectors $\vec{v}$ and $\vec{w}$, can we determine where the point $\vec{v} + \vec{w}$ ends up, that is, $T(\vec{v} + \vec{w})$, by adding the vectors $T(\vec{v})$ and $T(\vec{w})$?

Homework 56: (a) Take a few minutes and try to figure out the answer relying exclusively on your geometric intuition. (b) Show that the transformation $T$ described above can be represented as $T = T_A$ for some matrix $A$.

After part (b) of Homework 56, the answer to our question becomes easy: the transformation $T_A$ must be linear, and the question is precisely whether $T$ has property (ii) of linear transformations. Yes!

Where have we seen this before?

Now consider two transformations $T_A : \mathbb{R}^n \to \mathbb{R}^m$ and $T_B : \mathbb{R}^k \to \mathbb{R}^p$. Here $A$ must have order $m \times n$ and $B$ must have order $p \times k$.

We would like to calculate the value of the composition $T_B \circ T_A(\vec{v}) = T_B(T_A(\vec{v}))$. This is done by first calculating $\vec{w} = T_A(\vec{v})$ (by definition a column vector of dimension $m$) and then applying the transformation $T_B$ to $\vec{w}$. But we can do this only if $\vec{w}$ is in the domain $\mathbb{R}^k$ of $T_B$. Thus we must have $m = k$. This is exactly the condition for when the product $BA$ is defined!

\[ T_B \circ T_A(\vec{v}) = T_B(T_A(\vec{v})) = B(A\vec{v}) = (BA)\vec{v} = T_{BA}(\vec{v}). \]

Compositions of these linear transformations correspond to matrix products.
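The identity $B(A\vec{v}) = (BA)\vec{v}$ can be checked on a concrete case. The sketch below uses plain Python; the helpers `mat_vec` and `mat_mul` and the particular matrices are assumptions chosen for illustration, with $A$ of order $3 \times 2$ and $B$ of order $2 \times 3$ so that $m = k = 3$.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(B, A):
    """Return the matrix product BA; B has as many columns as A has rows."""
    cols = list(zip(*A))  # columns of A
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]

A = [[1, 0],
     [2, 1],
     [0, 3]]          # order 3 x 2, so T_A maps R^2 to R^3
B = [[1, 1, 0],
     [0, 2, 1]]       # order 2 x 3, so T_B maps R^3 to R^2

v = [1, 2]

step_by_step = mat_vec(B, mat_vec(A, v))  # T_B(T_A(v))
BA = mat_mul(B, A)                        # order 2 x 2
via_product = mat_vec(BA, v)              # T_BA(v)
```

Both routes give the same vector, and the product $BA$ is a single $2 \times 2$ matrix that performs the two steps at once.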

Cashing in a portfolio

Let us return to our example of the value of a portfolio that is represented by a vector $\vec{v}_{curr} = \begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$, where $x$ denotes the value of its Tried-and-True shares and $y$ denotes the value of its Get-Rich-Fast shares. We defined a transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ by $T(\vec{v}_{curr}) = \vec{v}_{next}$.

My Crystal Ball says $T = T_A$, where $A = \begin{bmatrix} 3 & 0 \\ 0 & 0.5 \end{bmatrix}$.

The owner wants to transform the portfolio into cash a year from now.

Homework 57: (a) Show that cashing in means applying a linear transformation $T_B : \mathbb{R}^2 \to \mathbb{R}^1$ and find the matrix $B$. (b) Note that the entire transformation of the owner's current holdings $\vec{v}_{curr}$ into cash amounts to applying the transformation $T_B \circ T_A : \mathbb{R}^2 \to \mathbb{R}^1$, and that $T_B \circ T_A(\vec{v}_{curr}) = T_B(T_A(\vec{v}_{curr})) = T_{BA}(\vec{v}_{curr})$. Find $BA$.

Where have we seen this before?

Let $A$ be the coefficient matrix of a system of linear equations

\[ \begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned} \]

We can write the system in matrix form as $A\vec{x} = \vec{b}$, where $\vec{x} = [x_1, \ldots, x_n]^T$ and $\vec{b} = [b_1, \ldots, b_m]^T$. Thus $T_A(\vec{x}) = \vec{b}$.

We can deduce the following:

Theorem. When the transformation $T_A$ maps $\mathbb{R}^n$ onto $\mathbb{R}^m$, the above system must be consistent. When the transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$ is one-to-one, the above system cannot be underdetermined.

Recall some definitions

Consider any function $T : X \to Y$, where $X$ and $Y$ are arbitrary sets.

The range $R$ of $T$ is the set of all $y$ in $Y$ such that there exists some $x$ in $X$ with $T(x) = y$.

The function $T$ is onto if $Y = R$, that is, if every $y$ in $Y$ is a function value $T(x)$ for some $x$ in $X$.

The function $T$ is one-to-one if $T$ never takes the same value for two different $x$ in $X$, that is, if $T(x_1) \neq T(x_2)$ whenever $x_1 \neq x_2$.

Homework 58: Read the definitions of consistent and underdetermined systems from a previous lecture if need be. Then reread the theorem on the previous slide a few times and convince yourself that it is nothing but a translation of onto and one-to-one into the context of $T_A$ and systems of linear equations with coefficient matrix $A$.
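Two small examples may make the notions of onto and one-to-one concrete for transformations of the form $T_A$. The matrices are chosen here purely for illustration and are not taken from the lecture; the examples are phrased in terms of existence and uniqueness of solutions of $A\vec{x} = \vec{b}$.

```latex
% Example 1: onto but not one-to-one.
\[
A = \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad
T_A : \mathbb{R}^2 \to \mathbb{R}^1, \qquad
T_A\!\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = x_1 + x_2 .
\]
% Every b is attained, e.g. by [b, 0]^T, so x_1 + x_2 = b always has a solution.
% But [b, 0]^T and [b - 1, 1]^T are different inputs with the same value b,
% so the solution is never unique.

% Example 2: one-to-one but not onto.
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
T_A : \mathbb{R}^2 \to \mathbb{R}^3, \qquad
T_A\!\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)
= \begin{bmatrix} x_1 \\ x_2 \\ 0 \end{bmatrix} .
\]
% Different inputs give different outputs, so a solution of A x = b is unique
% whenever it exists. But b = [0, 0, 1]^T is not in the range,
% so A x = b has no solution for this b.
```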

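Flashback: Products of matrices and column vectors

Let $A$ be a matrix of order $m \times n$ and let $\vec{v}$ be an $n \times 1$ column vector.

\[ A\vec{v} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} w_1 \\ \vdots \\ w_m \end{bmatrix} \]

Then $A\vec{v}$ is an $m \times 1$ column vector. This defines a transformation $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by $T_A(\vec{v}) = A\vec{v}$.

By general properties of matrix multiplication:

(i) $T_A(\lambda \vec{v}) = A(\lambda \vec{v}) = \lambda A\vec{v} = \lambda T_A(\vec{v})$
(ii) $T_A(\vec{v} + \vec{w}) = A(\vec{v} + \vec{w}) = A\vec{v} + A\vec{w} = T_A(\vec{v}) + T_A(\vec{w})$.

The transformation $T_A$ is linear!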

Are all linear transformations like this?

Suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation. Does there always exist a matrix $A$ such that $T = T_A$, that is, $T(\vec{x}) = A\vec{x}$ for all $\vec{x}$ in $\mathbb{R}^n$?

Not quite. When $P$ is the matrix of transition probabilities of a Markov chain with $n$ states, and probability distributions are written as row vectors $\vec{x}$, then $T_P(\vec{x}) = \vec{x}P$ defines a linear transformation $T_P : \mathbb{R}^n \to \mathbb{R}^n$ that cannot be written as in the above question.

Homework 59: If we write probability distributions as column vectors instead, then for the above transformation $T_P$ there does exist a matrix $A$ such that $T_P(\vec{x}) = A\vec{x}$ for all $\vec{x}$ in $\mathbb{R}^n$. (a) Find $A$. (b) Find one example of $P$ for weather.com-light examples of Markov chains so that $P = A$ and another example with $P \neq A$.

All linear transformations of column vectors are of this form

Theorem (Matrix representation of linear transformations). Suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation. If both the elements of the domain $\mathbb{R}^n$ of $T$ and the function values $T(\vec{x})$ in $\mathbb{R}^m$ are treated as column vectors, then there exists a matrix $A$ of order $m \times n$ such that $T = T_A$, that is, $T(\vec{x}) = A\vec{x}$ for all $\vec{x}$ in $\mathbb{R}^n$.

Proof: We need some notation. Recall the following definition.

Definition. A vector $\vec{w}$ is a linear combination of vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ if there exist scalars $d_1, d_2, \ldots, d_n$ such that $\vec{w} = d_1\vec{v}_1 + d_2\vec{v}_2 + \cdots + d_n\vec{v}_n$.

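Review: The vectors $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$

Fix a positive integer $n$ and treat the elements of $\mathbb{R}^n$ as column vectors. Let $\vec{x}$ be an $n \times 1$ column vector in $\mathbb{R}^n$. Then

\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + x_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \]

Recall the notation:

\[ \vec{e}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \vec{e}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad \vec{e}_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \]

Then every $n \times 1$ column vector $\vec{x}$ is a linear combination of $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$, and the coefficients of this combination are unique.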
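For instance, with $n = 3$ the decomposition reads as follows; the particular vector is an arbitrary choice made for illustration.

```latex
\[
\begin{bmatrix} 4 \\ -1 \\ 2 \end{bmatrix}
= 4 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ (-1) \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ 2 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= 4\vec{e}_1 - \vec{e}_2 + 2\vec{e}_3 .
\]
% The coefficients 4, -1, 2 are forced: they are exactly the entries of the vector.
```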

Review: The standard basis in $\mathbb{R}^n$

Fix a positive integer $n$ and treat the elements of $\mathbb{R}^n$ as column vectors.

A basis for $\mathbb{R}^n$ is a set of $n \times 1$ column vectors $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n$ such that every $n \times 1$ column vector $\vec{x}$ can be expressed as a linear combination $d_1\vec{b}_1 + d_2\vec{b}_2 + \cdots + d_n\vec{b}_n$ in exactly one way.

The vectors $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$ of the previous slide form the so-called standard basis of $\mathbb{R}^n$.

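The vectors $T(\vec{e}_1), T(\vec{e}_2), \ldots, T(\vec{e}_n)$

Now we are ready to prove the theorem. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. Consider the $m \times 1$ column vectors

\[ T(\vec{e}_1) = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad T(\vec{e}_2) = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad T(\vec{e}_n) = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \]

Let

\[ A = [T(\vec{e}_1), T(\vec{e}_2), \ldots, T(\vec{e}_n)] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]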

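We show that $T = T_A$

Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and let $A$ be the matrix defined on the previous slide. Then

\[ A\vec{e}_1 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} = T(\vec{e}_1) \]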

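We show that $T = T_A$, continued

Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and let $A$ be the matrix defined on the previous slide. Then

\[ A\vec{e}_2 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} = T(\vec{e}_2) \]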

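We show that $T = T_A$, continued

Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and let $A$ be the matrix defined on the previous slide. Then

\[ A\vec{e}_n = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = T(\vec{e}_n) \]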

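We show that $T = T_A$, completed

Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation, and let $A$ be the matrix defined on the previous slide. We have shown that $A\vec{e}_j = T(\vec{e}_j)$ for $j = 1, 2, \ldots, n$.

By properties of matrix multiplication and linearity of $T$, for all $\vec{x}$ in $\mathbb{R}^n$:

\[ \begin{aligned} T_A(\vec{x}) = A\vec{x} &= A(x_1\vec{e}_1 + x_2\vec{e}_2 + \cdots + x_n\vec{e}_n) \\ &= x_1 A\vec{e}_1 + x_2 A\vec{e}_2 + \cdots + x_n A\vec{e}_n \\ &= x_1 T(\vec{e}_1) + x_2 T(\vec{e}_2) + \cdots + x_n T(\vec{e}_n) \\ &= T(x_1\vec{e}_1) + T(x_2\vec{e}_2) + \cdots + T(x_n\vec{e}_n) \\ &= T(x_1\vec{e}_1 + x_2\vec{e}_2 + \cdots + x_n\vec{e}_n) \\ &= T(\vec{x}). \end{aligned} \]

The fourth equality uses property (i) of linear transformations and the fifth uses property (ii). This completes the proof of the theorem.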
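The construction in the proof translates directly into a procedure: apply $T$ to each standard basis vector and use the results as the columns of $A$. Below is a minimal sketch in plain Python. The helper names (`standard_basis`, `matrix_of`, `mat_vec`) and the sample transformation `T` from $\mathbb{R}^3$ to $\mathbb{R}^2$ are assumptions made for illustration, not something taken from the lecture or the textbook.

```python
def standard_basis(n):
    """Return e_1, ..., e_n as lists; e_j has a 1 in position j and 0 elsewhere."""
    return [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]

def matrix_of(T, n):
    """Build the matrix whose j-th column is T(e_j), stored as a list of rows."""
    columns = [T(e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def T(v):
    """A sample linear transformation (x, y, z) -> (x + 2y, 3z - y)."""
    x, y, z = v
    return [x + 2 * y, 3 * z - y]

A = matrix_of(T, 3)        # columns are T(e_1), T(e_2), T(e_3)
x = [2.0, -1.0, 4.0]
direct = T(x)              # apply T directly
via_matrix = mat_vec(A, x) # apply T_A
```

For a function that is not linear, the same procedure still produces a matrix, but $A\vec{x}$ will in general differ from $T(\vec{x})$; the proof is what guarantees agreement in the linear case.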

Some practice problems

Homework 60: For each of the following linear transformations $T$, find the matrix $A$ such that $T = T_A$:

(a) $T : \mathbb{R}^2 \to \mathbb{R}^2$, $T(\vec{e}_1) = [2, 3]^T$, $T(\vec{e}_2) = [-1, 4]^T$
(Note that the letter T is doing double duty here: first it denotes the transformation, and then it appears as a superscript that indicates the transpose of a vector.)
(b) $T : \mathbb{R}^2 \to \mathbb{R}^2$, $T([1, 1]^T) = [2, 3]^T$, $T([-1, 1]^T) = [-1, 4]^T$
(c) $T : \mathbb{R}^3 \to \mathbb{R}^2$, $T(\vec{e}_1) = [2, 3]^T$, $T(\vec{e}_2) = T(\vec{e}_3) = 2T(\vec{e}_1)$