ROOTS AND COMPLEX NUMBERS


MAT1341 Introduction to Linear Algebra
Mike Newman
1. Complex Numbers (class notes)

ROOTS

Polynomials have roots; you probably know how to find the roots of a quadratic, say x^2 - 5x + 6, and to factor it as (x - 2)(x - 3). Some terminology: a polynomial is monic if its leading coefficient is 1, and the multiplicity of a root is the number of times it appears as a factor. We can identify a monic polynomial by its roots, counting multiplicity. Note that this doesn't mean that a polynomial is the same thing as its roots: x^2 - 5x + 6 is not the same thing as x = 2, 3. It just means that given the polynomial, you could in principle determine the roots, and given the roots you could multiply out the factors to get the polynomial.

    polynomial                 roots
    x^2 - 5x + 6               2, 3
    x^3 - 13x^2 + 35x + 49     -1, 7, 7
    x^2 - 2                    sqrt(2), -sqrt(2)

The last example is interesting, because sqrt(2) is not a number we can write down precisely in a concrete way (i.e., as a decimal expansion). The polynomial x^2 - 2 is more easily describable, because we can write down the coefficients (1, 0 and -2) quite easily in a precise way. In many ways, the simplest way to describe sqrt(2) is "a positive number which is a root of x^2 - 2".

Consider the polynomial x^2 + 1. It certainly has no real roots, but it would be nice if it had some kind of root. For lack of a better term, we call one of the roots i. There should be a second root (why?). Perhaps we should call it j? But we notice that (-i)^2 = i^2, so -i must be the other root. It is wrong to say that i is "positive" and -i is "negative": neither is a real number, so neither can be positive or negative. So our correspondence between polynomials and roots works here too:

    x^2 + 1                    i, -i

We'll content ourselves with saying that i is a number such that i^2 = -1. You can think of it as if i = sqrt(-1). It's sometimes called an "imaginary" number. You shouldn't take that word too literally: sqrt(2) is sometimes called an "irrational" (i.e., crazy) number, and we get along with it just fine.
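The polynomial-roots correspondence above can be spot-checked numerically. Here is a minimal sketch in Python (an illustration, not part of the original notes); Python writes i as 1j.

```python
def p(x):
    # the cubic from the table: x^3 - 13x^2 + 35x + 49 has roots -1, 7, 7
    return x**3 - 13*x**2 + 35*x + 49

assert p(-1) == 0 and p(7) == 0

# x^2 + 1 has no real roots, but i (written 1j) and -i are both roots
def q(x):
    return x**2 + 1

assert q(1j) == 0 and q(-1j) == 0
```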
COMPLEX NUMBERS

The complex numbers are what we get when we add i to the real numbers and allow all the arithmetic we want.

Definition 1.1. C = {a + bi | a, b in R}

This says that a complex number looks like a + bi where a and b are real numbers: for instance, 2 - 7i, -2 + pi*i. We write Re(a + bi) = a and Im(a + bi) = b to indicate the real part and the imaginary part: for instance Re(-2 + pi*i) = -2 and Im(2 - 7i) = -7. The "+" in a + bi isn't meant as an addition that we would actually evaluate; it just tells us that the two parts form one complex number.

These notes are intended for students in mike's course. For other uses please say hi to mnewman@uottawa.ca.

There is a nice graphical interpretation of complex numbers, where the horizontal axis is the real axis and the vertical axis is the imaginary axis. We'll see this in class.

Arithmetic of complex numbers is much like real arithmetic.

    (a + bi) +/- (c + di) = (a +/- c) + (b +/- d)i
    (a + bi)(c + di) = ac + adi + bic + bidi = ac + (ad + bc)i + bd i^2 = (ac - bd) + (ad + bc)i

We used the fact that i^2 = -1, which is just another way of saying that i is a root of x^2 + 1 (right?).

There are two new operations: conjugation and absolute value. The conjugate of z = a + bi is written with a bar over the number; the absolute value is written with vertical bars.

    conj(a + bi) = a - bi        |a + bi| = sqrt(a^2 + b^2)

Exercise 1.2. Show that z conj(z) = |z|^2. (hint: write z = a + bi where a and b are real numbers, and expand both sides.)

To divide complex numbers we need to modify them.

    (a + bi)/(c + di) = (a + bi)/(c + di) * (c - di)/(c - di)
                      = (a + bi)(c - di) / ((c + di)(c - di))
                      = ((ac + bd) + (-ad + bc)i) / (c^2 + d^2)
                      = (ac + bd)/(c^2 + d^2) + ((-ad + bc)/(c^2 + d^2)) i

We want to write the answer in so-called Cartesian form, namely alpha + beta*i for real numbers alpha, beta. What we did can be succinctly expressed in terms of conjugates and absolute values:

    z/w = z conj(w) / (w conj(w)) = z conj(w) / |w|^2

The final division is a division by a real number, so we don't need anything new for that (right?).

Notice that real numbers are just special complex numbers with no imaginary part: they look like a + 0i, where a is a real number. A graphical way of saying this is that real numbers are the complex numbers that are on the real axis. A third way of saying this is that real numbers are the complex numbers that are equal to their own conjugate.

Exercise 1.3. Check that if we apply the definition of complex absolute value to a real number, we get the ordinary real absolute value. Check that the conjugate of a real number is itself. Check that if we apply the previous method of adding, subtracting, multiplying and dividing complex numbers to real numbers then we get the ordinary real operations.
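These operations can be tried out directly, since Python has a built-in complex type. A small sketch (not from the notes; the helper `divide` is a hypothetical name) that checks Exercise 1.2 and implements division via the conjugate:

```python
import math, cmath

z = 2 - 7j
w = 3 + 4j

assert z.conjugate() == 2 + 7j                            # conjugation flips Im
assert math.isclose(abs(z)**2, (z * z.conjugate()).real)  # z·conj(z) = |z|^2

def divide(z, w):
    # Cartesian division as in the notes: multiply by conj(w)/conj(w),
    # so the final division is by the real number |w|^2
    return z * w.conjugate() / abs(w)**2

assert cmath.isclose(divide(z, w), z / w)
```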
(hint: write real numbers as a + 0i, c + 0i and treat them like complex numbers)

POLYNOMIALS

Our motivation for considering complex numbers was roots of polynomials. More precisely, it would be nice if the following were true:

Theorem 1.4. FALSE! FALSE! FALSE! If p(x) = x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 is a polynomial with real coefficients then it has exactly n real roots counting multiplicity; in other words p(x) = (x - alpha_1)(x - alpha_2) ... (x - alpha_n) where each alpha_j is in R. FALSE! FALSE! FALSE!

Please take your pen (not pencil!) and cross out the word "theorem" immediately above. This "theorem" is completely wrong. In fact you already know an example where it fails: x^2 + 1 is a polynomial with real coefficients but no real roots at all. Here is the real version. (haha!)

Theorem 1.5. If p(x) = x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 is a polynomial with complex coefficients then it has exactly n complex roots counting multiplicity; in other words p(x) = (x - alpha_1)(x - alpha_2) ... (x - alpha_n) where each alpha_j is in C.

Since real polynomials are special kinds of complex polynomials, this says that even if we only ever care about real polynomials, we need the complex numbers anyway. We'll see this explicitly when we examine eigenvalues later. But it also says that we'll never need anything beyond complex numbers (technically: the complex numbers are algebraically closed). We can factor any real polynomial into complex factors.

There is one other useful observation.

Theorem 1.6. Let p(x) be a real polynomial. Then the complex roots of p(x) come in conjugate pairs. Namely, if a + bi is a root of multiplicity m then a - bi is also a root of multiplicity m.

Example 1.7. Given that the polynomial x^3 - x^2 + 2 has 1 + i as a root, factor it.

You can (and should!) check that 1 + i is a root, by plugging it in and seeing that you get zero. Since the polynomial is real, we know that 1 - i must also be a root. So we have that (x - (1 + i))(x - (1 - i)) = x^2 - 2x + 2 is a factor. Long division gives that the remaining factor is (x + 1), so p(x) = (x - (1 + i))(x - (1 - i))(x + 1).

Exercise 1.8. Find a polynomial with 3 + 4i as a root. Find a polynomial with 3 + 4i as a double root (multiplicity two). Find a polynomial with 3 + 4i and 2 as simple (multiplicity one) roots.

Exercise 1.9. A cubic polynomial is a polynomial of degree three: a_3 x^3 + a_2 x^2 + a_1 x + a_0. Show that every cubic polynomial has a real root. (hint: three is an odd number)
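Example 1.7 can be checked mechanically. The sketch below (an illustration, not from the notes; `mul` is a hypothetical helper) plugs in the three roots and multiplies the factors back together on coefficient lists.

```python
def p(x):
    return x**3 - x**2 + 2

# 1 + i is a root, so by Theorem 1.6 its conjugate 1 - i is too
assert p(1 + 1j) == 0
assert p(1 - 1j) == 0
assert p(-1) == 0            # the remaining real root

def mul(a, b):
    # polynomial multiplication on coefficient lists, highest degree first
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (x^2 - 2x + 2)(x + 1) should give x^3 - x^2 + 0x + 2
assert mul([1, -2, 2], [1, 1]) == [1, -1, 0, 2]
```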

MAT1341 Introduction to Linear Algebra
Mike Newman
2. Vectors (class notes)

GEOMETRY AND COORDINATES

A vector is an object that has a magnitude and a direction. Some examples are:

    10 metres, straight up
    30 km/h, north-east
    7 units, at 30 degrees

The last one might look a little strange: in order to understand its meaning we need some kind of reference (how big is one unit? what is the reference line for directions?).

We can picture a vector as an arrow: the length of the arrow is the magnitude and the direction of the arrow is the direction of the vector. Notice that a vector does not have a fixed basepoint. It is a relative position. We often draw a vector with its basepoint at the origin of some coordinate system. This works particularly well in two dimensions. If we draw a vector with its basepoint at the origin then we can identify the coordinates of the head of the arrow as the coordinates of the vector.

As an example, consider the vector u with magnitude 5 and direction 37 degrees above horizontal (in two dimensions). Let P be the point at the head of the vector u. If we move u so that its basepoint is at the origin we get the picture on the left; if we move the basepoint to (-2, 1) we get the picture on the right. We have that, in each case,

    P = [4, 3]^T,  u = [4, 3]^T        (basepoint at the origin)
    P = [2, 4]^T,  u = [4, 3]^T        (basepoint at (-2, 1))

The coordinates of a point indicate its absolute position relative to the axes. The coordinates of a vector indicate the position of its head relative to its tail.

We typically write vectors as columns of numbers. Occasionally we write them as comma-separated lists, and sometimes as rows with a T at the end: this stands for "transpose", which means interchange rows and columns. So the following all mean the same thing:

    u = a column with entries 4, 3        u = (4, 3)        u = [4 3]^T

In the notes I will typically write vectors in bold, like this: u. We can refer to the individual components by indices.
So for u = [4 3]^T we have u_1 = 4 and u_2 = 3.
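The "head relative to tail" idea can be sketched in a couple of lines of Python (an illustration, not part of the notes): the same vector u moves any basepoint by the same amount.

```python
# A vector is a relative position: u = (4, 3) moves any basepoint
# 4 to the right and 3 up.
def head(basepoint, u):
    return tuple(p + ui for p, ui in zip(basepoint, u))

u = (4, 3)
assert head((0, 0), u) == (4, 3)    # basepoint at the origin: P = (4, 3)
assert head((-2, 1), u) == (2, 4)   # basepoint at (-2, 1):    P = (2, 4)
```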

ARITHMETIC

We can add vectors coordinate-wise, and similarly multiply them by a real number.

Example 2.1.

    [4, 3]^T + [-2, 1]^T = [4 + (-2), 3 + 1]^T = [2, 4]^T
    (-2/3) [4, 3]^T = [(-2/3)(4), (-2/3)(3)]^T = [-8/3, -2]^T

In fact vectors can have any number of coordinates (although the pictures don't work as well with more than two). We can only add vectors of the same size, so in general, in n dimensions,

    x + y = [x_1, x_2, ..., x_n]^T + [y_1, y_2, ..., y_n]^T = [x_1 + y_1, x_2 + y_2, ..., x_n + y_n]^T
    a x = a [x_1, x_2, ..., x_n]^T = [a x_1, a x_2, ..., a x_n]^T

We write R^n for the set of all n-dimensional real vectors. Formally:

    R^n = { [x_1, x_2, ..., x_n]^T | x_1, x_2, ..., x_n in R }

So for instance R^2 is the ordinary plane, and R^3 is ordinary space. Our intuition works best in R^2 and R^3. We want to work with vectors in higher-dimensional spaces too.

Here are some rules which we notice are true for arithmetic in R^n.

Proposition 2.2. Let u, v, w be in R^n and a, b in R. Then:

    u + v = v + u
    u + (v + w) = (u + v) + w
    u + 0 = u
    u + (-u) = 0
    a(u + v) = au + av
    (a + b)u = au + bu
    a(bu) = (ab)u
    1u = u

Note that 0 is the vector with all components equal to zero, and that 1 is the number 1. Also notice that some of the additions above are vector additions and some are real-number additions; likewise some of the multiplications are scalar multiplication of a vector and some are multiplication of two real numbers.

Higher-dimensional space is not just an abstract notion. In many contexts (e.g., statistics, experimental physics, ...) the relevant dimension is the number of unknown parameters, which has nothing to do with spatial dimension.
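The coordinate-wise operations are short enough to write out; here is a sketch (not from the notes; `add` and `scale` are hypothetical helper names) that also checks two of the rules of Proposition 2.2 on sample vectors.

```python
# Coordinate-wise vector arithmetic in R^n
def add(x, y):
    assert len(x) == len(y)          # only vectors of the same size add
    return [xi + yi for xi, yi in zip(x, y)]

def scale(a, x):
    return [a * xi for xi in x]

assert add([4, 3], [-2, 1]) == [2, 4]            # Example 2.1

x, y = [1, 2, 5], [4, 0, -3]
assert add(x, y) == add(y, x)                    # u + v = v + u
assert add(x, scale(-1, x)) == [0, 0, 0]         # u + (-u) = 0
```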

Let's prove one of these, say commutativity: u + v = v + u.

    u + v = [u_1, u_2, ..., u_n]^T + [v_1, v_2, ..., v_n]^T     (definition of vectors in R^n)
          = [u_1 + v_1, u_2 + v_2, ..., u_n + v_n]^T            (definition of vector addition)
          = [v_1 + u_1, v_2 + u_2, ..., v_n + u_n]^T            (commutativity of addition of real numbers)
          = [v_1, v_2, ..., v_n]^T + [u_1, u_2, ..., u_n]^T     (definition of vector addition)
          = v + u

Exercise 2.3. Prove the remaining properties of Proposition 2.2.

LINEAR COMBINATIONS

A linear combination of vectors is a weighted sum. If we have k vectors x_1, x_2, ..., x_k and k real numbers a_1, a_2, ..., a_k then we can form their linear combination

    a_1 x_1 + a_2 x_2 + ... + a_k x_k

This turns out to be a key concept in linear algebra. Note that here x_1 is a vector; it is not the first coordinate of the vector x.

Example 2.4. Foreshadowing things to come: is [3, 4]^T a linear combination of [1, 1]^T and [1, 2]^T? This is the same as asking if there are real numbers a and b such that

    a [1, 1]^T + b [1, 2]^T = [3, 4]^T,   i.e.,   a + b = 3 and a + 2b = 4.

Again, we want to solve a system of linear equations. We'll describe a nice general all-purpose way to do this later, but for now we can notice that a = 2 and b = 1 will work. So yes, the given vector is a linear combination of the other two.

Exercise 2.5. Without doing any calculations (but using the previous example), is [3, 4]^T a linear combination of [1, 2]^T and [1, 1]^T?
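Example 2.4 is a 2x2 linear system, so it can be solved directly. A sketch (not part of the notes; `solve2` is a hypothetical helper using Cramer's rule, which the notes have not introduced yet):

```python
# Solve  a11*a + a12*b = b1,  a21*a + a22*b = b2  by Cramer's rule
def solve2(a11, a12, a21, a22, b1, b2):
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)

# a + b = 3, a + 2b = 4  gives  a = 2, b = 1
a, b = solve2(1, 1, 1, 2, 3, 4)
assert (a, b) == (2.0, 1.0)

# check the linear combination: 2*[1,1] + 1*[1,2] = [3,4]
assert [a * 1 + b * 1, a * 1 + b * 2] == [3.0, 4.0]
```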

DOT PRODUCT

The dot product (sometimes called scalar product) is a way of multiplying two vectors in R^n to give a scalar (i.e., a real number):

    x . y = x_1 y_1 + x_2 y_2 + ... + x_n y_n

For instance [4, 3]^T . [4, 3]^T = 4^2 + 3^2. Of course 4^2 + 3^2 = 5^2, where 5 is the length of the vector. A picture will help here! In other words, if we define the norm of x, written ||x||, to be the length (magnitude) of the vector x, then we have in general that

    x . x = ||x||^2

This is nothing more or less than a restatement of Pythagoras' Theorem. In fact it is true in general.

Exercise 2.6. A little more challenging. Using Pythagoras' Theorem in two dimensions, prove that the length of a vector x in R^n is sqrt(x_1^2 + x_2^2 + ... + x_n^2); in other words that sqrt(x . x) really is the length of x.

Notice that the norm of a vector in R^2 looks a lot like the absolute value of a complex number:

    |a + bi| = sqrt(a^2 + b^2)        ||[a, b]^T|| = sqrt(a^2 + b^2)

In fact, C looks a lot like R^2. This is no coincidence: they share many features in common, as we'll see when we consider vector spaces shortly. But they aren't the same: multiplying complex numbers has no direct analogue in R^2.

The dot product does behave nicely, for the most part.

Proposition 2.7. Let u, v, w be in R^n and a in R. Then:

    u . v = v . u
    u . 0 = 0
    (au) . v = a(u . v) = u . (av)
    (u + v) . w = u . w + v . w

We'll check one of these, namely that (au) . v = a(u . v).

    (au) . v = (a [u_1, ..., u_n]^T) . [v_1, ..., v_n]^T         (definition of vectors in R^n)
             = [au_1, ..., au_n]^T . [v_1, ..., v_n]^T           (definition of scalar multiplication)
             = (au_1)v_1 + (au_2)v_2 + ... + (au_n)v_n           (definition of dot product)
             = a(u_1 v_1 + u_2 v_2 + ... + u_n v_n)              (real arithmetic)
             = a(u . v)                                          (definition of dot product)
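The dot product and the norm are easy to compute by hand or in code. A quick sketch (an illustration, not from the notes) that reproduces the 3-4-5 example and the property proved above:

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

x = [4, 3]
assert dot(x, x) == 25           # x.x = ||x||^2 = 5^2
assert norm(x) == 5.0

# (au).v = a(u.v), the part of Proposition 2.7 proved in the notes
u, v, a = [1, 2, 3], [4, -1, 0], 7
assert dot([a * ui for ui in u], v) == a * dot(u, v)
```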

Exercise 2.8. Verify the remaining assertions in Proposition 2.7.

PROJECTION

If we imagine two vectors x and y in R^2, and place them so that their basepoints coincide, we would like to measure how similar they are. One way to do this is to consider one of them, say y, as a reference, and project x onto y. Thus we imagine x as consisting of two parts: one along the direction of y, and one perpendicular to y. The part parallel to y is the projection of x onto y, denoted proj_y(x).

(picture: x and y with angle theta between them; proj_y(x) lies along y)

Proposition 2.9. The projection is given by proj_y(x) = ((y . x)/(y . y)) y.

Notice that

    x = proj_y(x) + (x - proj_y(x))

By vector addition, the second term is the part of x that is perpendicular to y. It's the dotted line in the picture. We say that vectors are orthogonal if they are perpendicular. Directly from the picture, we see that x and y are orthogonal, written x ⊥ y, exactly when proj_y(x) = 0.

Exercise 2.10. Prove that proj_y(x) = 0 if and only if x . y = 0.

Exercise 2.11. Prove that proj_y(x) = 0 if and only if proj_x(y) = 0. Do this twice: once based on the geometry of the picture, once based on the formula.

Proposition 2.12. x ⊥ y if and only if x . y = 0.

So dot products are related to angles between vectors. We can say more. Let theta be the angle between x and y. Then

    cos(theta) = ||proj_y(x)|| / ||x||
               = || ((x . y)/(y . y)) y || / ||x||
               = ((x . y)/(y . y)) ||y|| / ||x||
               = (x . y) / (||y|| ||x||)

The fraction (x . y)/(y . y) is a real number. So in the middle step, we used Proposition 2.7; more precisely we used the part that we proved, twice. Rephrased we find the following, which should look vaguely familiar.

Proposition 2.13. For any x, y in R^n, x . y = ||x|| ||y|| cos(theta).

Why a new word ("orthogonal") if it means the same as the old ("perpendicular")? The notion of orthogonality extends to situations where there is no direct geometric meaning. For instance, functions can be orthogonal, in a way which does not mean they "meet at 90 degrees".
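Proposition 2.9 and the decomposition of x into parallel and perpendicular parts can be checked numerically. A sketch (not from the notes; `proj` is a hypothetical helper name):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def proj(y, x):
    # Proposition 2.9: proj_y(x) = ((y.x)/(y.y)) y
    c = dot(y, x) / dot(y, y)
    return [c * yi for yi in y]

x, y = [3, 4], [1, 0]
assert proj(y, x) == [3.0, 0.0]     # the part of x along y

# the leftover part x - proj_y(x) really is orthogonal to y
r = [xi - pi for xi, pi in zip(x, proj(y, x))]
assert dot(r, y) == 0
```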

MAT1341 Introduction to Linear Algebra
Mike Newman
3. Geometry: Lines and Planes (class notes)

LINES

A line can be determined by a basepoint P and a direction d. We can think of it as all the possible points that are in a direction d from P. Formally a line looks like:

    {P + td | t in R}

This is often called the parametric form of a line, because it has a parameter. For instance, consider the following two lines:

    l_1 : { [1, 1]^T + s [2, 1]^T | s in R }        l_2 : { [3, 2]^T + t [1, 0.5]^T | t in R }

The equations look different, but from the pictures it seems that they are identical. The basepoint of l_1 is on l_2, because if we solve

    [1, 1]^T = [3, 2]^T + t [1, 0.5]^T

we obtain t = -2. Also the direction vector of l_1 is a multiple of the direction vector of l_2, so it is parallel.

Exercise 3.1. Show that the basepoint of l_2 is on l_1. Show that the direction vector of l_2 is a multiple of the direction vector of l_1.

Note that we can easily check if two lines are parallel, by checking if their direction vectors are multiples of each other (i.e., if their direction vectors, though possibly different, have the same direction!).

Example 3.2. We can write the y = mx + b form of the line l_1 by considering a general point (x, y) on the line:

    [x, y]^T = [1, 1]^T + s [2, 1]^T,   i.e.,   x = 1 + 2s and y = 1 + s.

Substituting the first equation into the second we obtain y = x/2 + 1/2.

Exercise 3.3. Find directly the y = mx + b form of the line l_2. Of course you know the answer already...
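The two checks in the text (basepoint of one line on the other, direction vectors that are multiples) take only a few lines to verify. A sketch, not part of the notes:

```python
# l1 and l2 as (basepoint, direction) pairs
base1, dir1 = (1, 1), (2, 1)
base2, dir2 = (3, 2), (1, 0.5)

# direction vectors are multiples of each other: (2, 1) = 2 * (1, 0.5)
assert (2 * dir2[0], 2 * dir2[1]) == dir1

# basepoint of l1 lies on l2: solve (1, 1) = (3, 2) + t (1, 0.5)
t = (base1[0] - base2[0]) / dir2[0]
assert t == -2.0
assert (base2[0] + t * dir2[0], base2[1] + t * dir2[1]) == (1.0, 1.0)
```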

The same line can have seemingly different equations. This is a slight disadvantage of the parametric form for the equation of a line, but this form has one huge advantage: it works consistently in any dimension. Here is a line in R^6 (picture not available due to space-time transmogrification constraints):

    l_3 : { [1, 1, 0, 2, 3, 1]^T + t [2, 1, 2, 1, 3, 0]^T | t in R }

Here we used the transpose way of writing vectors (to save space on the page!).

Exercise 3.4. What is the y = mx + b form of the line l_3? What is its slope? Or perhaps, what are its slopes? What is its y-intercept? Does it even have a y-intercept?

PLANES IN R^3

A plane doesn't have one single direction (in fact it has infinitely many...) but in R^3 it has a single perpendicular direction. This is called the normal to the plane. Fix some basepoint P on the plane, and consider some other arbitrary point Q = (x, y, z) on the plane. If the normal vector is u, then we want to express the fact that u is perpendicular to the vector from P to Q. We get the equation:

    (Q - P) . u = 0

So for instance the plane passing through the point P = (1, 1, 1) and perpendicular to the normal vector [2, 0, -1]^T is given by

    [x - 1, y - 1, z - 1]^T . [2, 0, -1]^T = 0
    2x + 0y - z = (1)(2) + (1)(0) + (1)(-1)
    2x - z = 1

A plane can have many different basepoints: in fact any point on the plane will do. And there are lots of different normal vectors: any scalar multiple will work, since only the direction matters.

Exercise 3.5. Find the equation of the plane containing the point (1, 0, 1) and having normal vector [4, 0, -2]^T. Explain why the answer looks so eerily familiar.

Exercise 3.6. For the plane 2x + 3y - 5z = 1 find a normal vector and a basepoint.

INTERSECTIONS

We can find the intersection of two lines by setting them equal.

Example 3.7. Find the intersection of

    l_1 : { [1, 1, 1]^T + s [2, 1, 0]^T | s in R }        l_2 : { [2, 2, 0]^T + t [1, 0, 1]^T | t in R }

(footnote: Do I really mean any scalar multiple?)
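The recipe "plane through P with normal n" can be written out directly: the equation is n . Q = n . P. A sketch (an illustration, not from the notes; `plane` is a hypothetical helper):

```python
# Plane through P with normal n: n.(Q - P) = 0, i.e. n.Q = n.P
def plane(P, n):
    d = sum(ni * pi for ni, pi in zip(n, P))
    return (*n, d)        # coefficients (a, b, c, d) of ax + by + cz = d

assert plane((1, 1, 1), (2, 0, -1)) == (2, 0, -1, 1)    # 2x - z = 1

# Exercise 3.5: scaled normal gives the scaled equation of the same plane
assert plane((1, 0, 1), (4, 0, -2)) == (4, 0, -2, 2)    # 2 * (2x - z = 1)
```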

We want to find a value of s in the first line and a value of t in the second line that give the same point:

    [1, 1, 1]^T + s [2, 1, 0]^T = [2, 2, 0]^T + t [1, 0, 1]^T,
    i.e.,   1 + 2s = 2 + t,   1 + s = 2,   1 = t.

The second equation says that s = 1, the third says that t = 1, and the first is then consistent. So s = 1, t = 1 and we get the point (3, 2, 1). We notice that these two lines are not parallel, since their direction vectors are not multiples of each other.

Exercise 3.8. Will the intersection of two lines always be a point? Will two lines that are not parallel always intersect? Will two lines that are parallel ever intersect?

Two lines that do intersect lie in a common plane (this is true in any dimension). So we should be able to find the equation of this plane, at least in R^3 (in higher dimensions we haven't seen what the equation of a plane looks like). It turns out that the vector [1, -2, -1]^T is orthogonal to both direction vectors.

Exercise 3.9. Verify that [1, -2, -1]^T is orthogonal to [2, 1, 0]^T and [1, 0, 1]^T. Notice that this can be checked directly. If you hadn't been told the vector [1, -2, -1]^T, how would you have found it?

Exercise 3.10. Given that the plane containing l_1 and l_2 contains the point (3, 2, 1) and has normal vector [1, -2, -1]^T, write down its equation.

We can also find intersections between two planes. Note that the equation of a plane has a different form than the equation of a line, so technically this looks a little different.

Example 3.11. Find the intersection of the planes 2x + 4y + 4z = 7 and 6x - 3y + 2z = 1.

We will substitute one into the other. There are various ways; let's solve the second for z and substitute into the first:

    6x - 3y + 2z = 1   =>   z = (1 - 6x + 3y)/2
    2x + 4y + 4z = 7   =>   2x + 4y + 4 (1 - 6x + 3y)/2 = 7
                            2x + 4y + 2 - 12x + 6y = 7
                            -10x + 10y = 5
                            -2x + 2y = 1

The intersection is then

    -2x + 2y = 1,   z = (1 - 6x + 3y)/2

The first looks like the equation of a line, but it is unclear why there are two equations. In fact it is a line.
We can regard the second equation as giving the "height" of the line described by the first. Let's rewrite it in parametric form. To do this we need a parameter: we can choose x = t. Then we write the other variables in terms of t:

    -2x + 2y = 1   =>   y = (1 + 2x)/2 = (1 + 2t)/2 = 1/2 + t
    z = (1 - 6x + 3y)/2 = (1 - 6t + 3(1/2 + t))/2 = 5/4 - (3/2)t
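As a sanity check on Example 3.11 (an illustration, not part of the notes), every point of the parametric line should satisfy both original plane equations:

```python
# points of the line: (t, 1/2 + t, 5/4 - (3/2)t)
for t in [-2, 0, 1, 3.5]:
    x, y, z = t, 0.5 + t, 1.25 - 1.5 * t
    assert 2*x + 4*y + 4*z == 7      # first plane
    assert 6*x - 3*y + 2*z == 1      # second plane
```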

The parametric form is then built up row by row from the template

    [x, y, z]^T = (some vector) + t (some vector)

Each of x, y and z is the sum of a number and a multiple of t. Since x is the parameter, we fill in the first row so that it says x = 0 + 1t. Then we fill in the second row so that it gives the equation for y, namely y = 1/2 + t, and likewise for z. The result is

    [x, y, z]^T = [0, 1/2, 5/4]^T + t [1, 1, -3/2]^T

Exercise 3.12. What are the normal vectors of the two planes in the previous example? Are they parallel?

Exercise 3.13. Will the intersection of two planes always be a line? Will two planes that are not parallel always intersect? Will two planes that are parallel ever intersect?

CROSS PRODUCT

There is one thing left to explain. How did we find a vector orthogonal to two given vectors in R^3? This is the cross product, which is a way of multiplying two vectors to get a third, which is orthogonal to both of them. It is defined as

    v x w = [v_1, v_2, v_3]^T x [w_1, w_2, w_3]^T = [v_2 w_3 - w_2 v_3, -v_1 w_3 + w_1 v_3, v_1 w_2 - w_1 v_2]^T

Exercise 3.14. Verify that v x w is orthogonal to both v and w, and so prove that the formula given above for v x w is correct.

There are a variety of mnemonic ways of remembering this formula, often involving writing the terms in a slightly different order (e.g. u_3 v_2 instead of v_2 u_3, etc.). It can also be written in a nice circular way. There is also a nice formula in terms of determinants, which states that

    v x w = det of the matrix with columns [i-hat, j-hat, k-hat]^T, [v_1, v_2, v_3]^T, [w_1, w_2, w_3]^T

Here i-hat = [1, 0, 0]^T, j-hat = [0, 1, 0]^T, k-hat = [0, 0, 1]^T are the standard basis vectors of R^3. Some of you might know this, or something similar. If you don't, then don't worry: we'll see determinants later.

Exercise 3.15. Using the definition of cross product, verify that i-hat x j-hat = k-hat, j-hat x k-hat = i-hat, k-hat x i-hat = j-hat. Verify that j-hat x i-hat = -k-hat, k-hat x j-hat = -i-hat, i-hat x k-hat = -j-hat. This reveals a strange property of this product: it is not commutative!

Proposition 3.16. Let u, v, w be in R^3 and a in R. Then:

    u x v = -(v x u)
    u x 0 = 0
    u x u = 0
    (au) x v = a(u x v) = u x (av)
    (u + v) x w = u x w + v x w
    u x (v + w) = u x v + u x w
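The definition translates directly into code. A sketch (an illustration, not from the notes) that recovers the normal vector used for the plane containing l_1 and l_2, and checks Exercise 3.15:

```python
# The cross product in R^3, straight from the definition
def cross(v, w):
    return (v[1]*w[2] - w[1]*v[2],
            -v[0]*w[2] + w[0]*v[2],
            v[0]*w[1] - w[0]*v[1])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# the normal to the plane containing l1 and l2
n = cross((2, 1, 0), (1, 0, 1))
assert n == (1, -2, -1)
assert dot(n, (2, 1, 0)) == 0 and dot(n, (1, 0, 1)) == 0

# Exercise 3.15: i x j = k, but j x i = -k (not commutative!)
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
assert cross(i, j) == k
assert cross(j, i) == (0, 0, -1)
```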

When two real numbers multiply to give zero, it's because one of them already is zero. Not so with cross products: the cross product of any vector with itself gives the zero vector. Also note that the cross product is not associative: in general (u x v) x w is not equal to u x (v x w).

Let's prove one of these, namely that (u + v) x w = u x w + v x w.

    (u + v) x w = ([u_1, u_2, u_3]^T + [v_1, v_2, v_3]^T) x [w_1, w_2, w_3]^T
                = [u_1 + v_1, u_2 + v_2, u_3 + v_3]^T x [w_1, w_2, w_3]^T
                                                         (definition of vector addition)
                = [(u_2 + v_2)w_3 - w_2(u_3 + v_3), -(u_1 + v_1)w_3 + w_1(u_3 + v_3), (u_1 + v_1)w_2 - w_1(u_2 + v_2)]^T
                                                         (definition of cross product)
                = [u_2 w_3 - w_2 u_3 + v_2 w_3 - w_2 v_3, -u_1 w_3 + w_1 u_3 - v_1 w_3 + w_1 v_3, u_1 w_2 - w_1 u_2 + v_1 w_2 - w_1 v_2]^T
                                                         (real arithmetic)
                = [u_2 w_3 - w_2 u_3, -u_1 w_3 + w_1 u_3, u_1 w_2 - w_1 u_2]^T + [v_2 w_3 - w_2 v_3, -v_1 w_3 + w_1 v_3, v_1 w_2 - w_1 v_2]^T
                                                         (definition of vector addition)
                = u x w + v x w                          (definition of cross product)

Exercise 3.17. Prove the remaining parts of Proposition 3.16.

There is a nice relationship between the cross product and the dot product.

Theorem 3.18. Let x, y be in R^3. Then ||x x y||^2 + (x . y)^2 = (||x|| ||y||)^2.

Exercise 3.19. Prove Theorem 3.18 by using the definition of cross product, dot product and norm, and expanding.

We also have the following.

Proposition 3.20. For any x, y in R^3, ||x x y|| = ||x|| ||y|| sin(theta).

Exercise 3.21. Using Theorem 3.18 and the fact that cos^2(theta) + sin^2(theta) = 1, prove Proposition 3.20.

One consequence of this result is that we see again that x x x = 0. Furthermore, if x and y are parallel then x x y = 0. Here are two different proofs of this. Firstly, if they are parallel then the angle between them is either 0 or pi, and so sin(theta) = 0 and by Proposition 3.20 ||x x y|| = 0. The only vector whose length is zero is 0. Secondly, if x and y are parallel, then y = kx, and then using Proposition 3.16 we get

    x x y = x x (kx) = k(x x x) = k0 = 0
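Theorem 3.18 can be checked numerically on a sample pair of vectors; squaring everything avoids square roots entirely. A sketch, not part of the notes:

```python
# Numeric check of Theorem 3.18: ||x×y||^2 + (x·y)^2 = (||x|| ||y||)^2
def cross(v, w):
    return (v[1]*w[2] - w[1]*v[2], -v[0]*w[2] + w[0]*v[2], v[0]*w[1] - w[0]*v[1])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

x, y = (1, 2, 3), (4, -1, 2)
lhs = dot(cross(x, y), cross(x, y)) + dot(x, y) ** 2
rhs = dot(x, x) * dot(y, y)          # (||x|| ||y||)^2 without any sqrt
assert lhs == rhs

# parallel vectors have zero cross product: x × (kx) = 0
assert cross(x, tuple(5 * xi for xi in x)) == (0, 0, 0)
```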

MAT1341 Introduction to Linear Algebra
Mike Newman
4. Vector Spaces (class notes)

INTRODUCTION

We have seen how columns of numbers have a certain algebraic structure. In fact there are a vast number of objects that have essentially the same structure, meaning that they have an addition and a scalar multiplication that behave much like columns of numbers. Our goal is to find a way to unify them, and see them all as different aspects of the same thing. We'll start with some examples, and then give a formal definition.

EQUATIONS

Consider the set of linear equations in three variables x, y, z. Formally, equations in three variables look like ax + by + cz = d for arbitrary real constants a, b, c, d:

    E = {ax + by + cz = d | a, b, c, d in R}

For example:

    E_1 : 2x + 3y - 5z = 3
    E_2 : x - z = 6
    E_3 : 0 = 1

We don't need all three variables to appear: there is a tacit +0y in E_2. Also, we are simply considering the equations, not solving them. So E_3 is a perfectly valid equation (that happens to have no solutions).

The addition and scalar multiplication are possibly what you expected. In other words, equations add by like terms, and scalar multiplication multiplies each term:

    E_1 + E_2 : 3x + 3y - 6z = 9
    -7(E_1) : -14x - 21y + 35z = -21

Although we might be tempted to divide through by 3, E_1 + E_2 is not the same thing as the equation x + y - 2z = 3 (although of course they have the same solutions). As an equation, the function of the variables is to tell us which terms we can and can't combine.

Formally, there is a new addition here, and we should actually be writing something like E_1 ⊕ E_2 to indicate equation-addition and k ⊙ E_1 to indicate scalar multiplication. Even the "+" and "-" and "=" in E_1 really only serve to separate the terms. The only time ordinary addition happens is when we do things like add 2x to x, computing 2 + 1 = 3 to get 3x. But it is convenient to use the same symbols for all of the different operations.
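One way to see that these equations behave like vectors is to store each equation as its tuple of coefficients, so that equation-addition and scalar multiplication become coordinate-wise operations. A sketch (an illustration, not part of the notes):

```python
# An equation ax + by + cz = d stored as its coefficient tuple (a, b, c, d)
def eq_add(E, F):
    return tuple(e + f for e, f in zip(E, F))

def eq_scale(k, E):
    return tuple(k * e for e in E)

E1 = (2, 3, -5, 3)    # 2x + 3y - 5z = 3
E2 = (1, 0, -1, 6)    # x - z = 6

assert eq_add(E1, E2) == (3, 3, -6, 9)            # 3x + 3y - 6z = 9
assert eq_scale(-7, E1) == (-14, -21, 35, -21)    # -14x - 21y + 35z = -21
```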
There is one very special equation

Z: 0 = 0

This has the special property that for any equation E, we always have E + Z = E. Try it!

Exercise 4.1. Verify that E + Z = E for any equation E.

Exercise 4.2. Let E be any equation. Let F = (−1)E. Show that E + F = Z.

These notes are intended for students in mike's course. For other uses please say hi to mnewman@uottawa.ca.
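Since equation-addition only ever touches the coefficients, we can spot-check it by storing an equation ax + by + cz = d as the tuple (a, b, c, d). This representation is our own bookkeeping device, not notation from the notes.

```python
# An equation ax + by + cz = d stored as its coefficient tuple (a, b, c, d);
# equation-addition and scalar multiplication then act coefficient-wise.

def eq_add(E, F):
    return tuple(e + f for e, f in zip(E, F))

def eq_scale(k, E):
    return tuple(k * e for e in E)

E1 = (2, 3, -5, 3)     # 2x + 3y - 5z = 3
E2 = (1, 0, -1, 6)     # x - z = 6 (the tacit 0y made explicit)

assert eq_add(E1, E2) == (3, 3, -6, 9)          # 3x + 3y - 6z = 9
assert eq_scale(-7, E1) == (-14, -21, 35, -21)  # -14x - 21y + 35z = -21

Z = (0, 0, 0, 0)       # the special equation 0 = 0
assert eq_add(E1, Z) == E1                      # Exercise 4.1: E + Z = E
assert eq_add(E1, eq_scale(-1, E1)) == Z        # Exercise 4.2: E + (-1)E = Z
```

Notice that every `+` inside `eq_add` is ordinary addition of real numbers; the function itself is the new equation-addition.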

Exercise 4.3. Go back through this section, and identify each + as being either an addition of equations, an addition of real numbers, or a placeholder in an equation.

FUNCTIONS

Functions are recipes for taking an input and producing an output. We'll consider functions of one variable that produce a single output.

F(x) = {f | f: R → R}

This just says that F(x) is the set of all functions that take an input x, which must be a real number, and map it to an output, which must also be a real number. Two functions add to produce a function that gives the sum of the two addends, and scalar multiply to give a function that produces a scalar multiple of its argument. This is sometimes called point-wise addition, because we add by adding the values at each point x. That sounds complicated, but it means what you would hopefully think it does. Here are some examples of functions. If s = f + g is their sum and m = −7g, then

f(x) = sin(x)
g(x) = x^3 − cos(x)
s(x) = (f + g)(x) = sin(x) + x^3 − cos(x)
m(x) = (−7g)(x) = −7(x^3 − cos(x)) = −7x^3 + 7 cos(x)

The + in f + g is addition of functions, not numbers. It might be nice to use a different symbol, like f ⊕ g, but again, the point is that it behaves a lot like ordinary addition so we keep the same symbol.

Example 4.4. Let's show that function-addition is commutative. Let f and g be arbitrary functions, and let a = f + g and b = g + f. We will show that a = b by showing that a(x) = b(x) for every possible x.

a(x) = (f + g)(x)
     = f(x) + g(x)     definition of function addition
     = g(x) + f(x)     addition of real numbers is commutative
     = (g + f)(x)      definition of function addition
     = b(x)

Exercise 4.5. Show that function-addition is associative. Namely, for any three functions f, g and h, with a = (f + g) + h and b = f + (g + h), show that a = b.

Exercise 4.6. Go back through this section, and identify each + as being either an addition of real numbers or an addition of functions.
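Point-wise addition says the sum of two functions is a new function. A short Python sketch makes this concrete; `func_add` and `func_scale` are our hypothetical names for the operations.

```python
import math

# Point-wise operations on functions: the "sum" of f and g is a new
# function whose value at each x is the real number f(x) + g(x).

def func_add(f, g):
    return lambda x: f(x) + g(x)

def func_scale(k, f):
    return lambda x: k * f(x)

f = math.sin
g = lambda x: x**3 - math.cos(x)

s = func_add(f, g)       # s(x) = sin(x) + x^3 - cos(x)
m = func_scale(-7, g)    # m(x) = -7x^3 + 7cos(x)

for x in (-2.0, 0.0, 0.5, 3.1):
    assert s(x) == f(x) + g(x)
    assert m(x) == -7 * (x**3 - math.cos(x))
    # function-addition is commutative because real addition is (Example 4.4)
    assert func_add(f, g)(x) == func_add(g, f)(x)
```

The `+` inside the lambda is addition of real numbers; the call `func_add(f, g)` is the function-addition.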
SUBSETS OF R^n

We already know about R^n; let's consider a subset of it. Here are two examples:

U = {(t, 2t) | t ∈ R}    W = {(s, 2s + 1) | s ∈ R}

In other words, some of the pairs of real numbers, but not all of them. So for instance (3, 6) and (−0.25, −0.5) are both in U, while (0, 1) and (−1, −1) are both in W.

Exercise 4.7. Are there any vectors that are in U and W? That is, are there any x ∈ U ∩ W?
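The descriptions of U and W translate directly into membership tests. A small sketch (the predicate names `in_U` and `in_W` are ours):

```python
# Membership tests for the two subsets of R^2 above: a pair belongs to U
# exactly when its second entry is double the first, and to W when it is
# one more than double the first.

def in_U(p):
    return p[1] == 2 * p[0]

def in_W(p):
    return p[1] == 2 * p[0] + 1

assert in_U((3, 6)) and in_U((-0.25, -0.5))
assert in_W((0, 1)) and in_W((-1, -1))
assert not in_U((0, 1)) and not in_W((3, 6))
```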

In this case, we know lots about these kinds of subsets already, precisely because they are subsets of something we already know. For instance, for any x, y ∈ U we have x + y = y + x.

Example 4.8. Prove that x + y = y + x for any x, y ∈ U.

x + y = (a, 2a) + (b, 2b)     definition of vectors in U
      = (a + b, 2a + 2b)      definition of vector addition
      = (b + a, 2b + 2a)      commutativity of addition of real numbers
      = (b, 2b) + (a, 2a)     definition of vector addition
      = y + x

Actually, we already proved this before we even thought of this particular set U: we proved that any two vectors in R^n commute, for any n. In fact if you look at the proof in Example 4.8, it is the same as the corresponding proof in R^2 except that we restrict the vectors to being in U. If something is true in general, then it is certainly true in a special case; in exactly this way, many things that were true for R^2 are automatically true for U. These are sometimes called inherited properties.

Exercise 4.9. Let u, v ∈ W. Is it true that u + v = v + u? If so, prove it like we did above. If not, give a counter-example. How does this differ from U?

Exercise 4.10. Which properties of Proposition 2.2 are automatically true for U and W?

There is an important difference between U and W. Let x, y ∈ U and u, v ∈ W.

x + y = (a, 2a) + (b, 2b)          u + v = (c, 2c + 1) + (d, 2d + 1)
      = (a + b, 2a + 2b)                 = (c + d, 2c + 2d + 2)
      = ((a + b), 2(a + b))              = ((c + d), 2(c + d) + 2)

Is x + y ∈ U? If so, then it must have the right form, so we have to find a value for t such that a + b = t and 2(a + b) = 2t. The first equation says that t = a + b, and this works in the second. So x + y ∈ U.

Is u + v ∈ W? If so, then it must have the right form, so we have to find a value for s such that c + d = s and 2(c + d) + 2 = 2s + 1. The first equation says that s = c + d, but then the second equation says that 2c + 2d + 2 = 2c + 2d + 1, which says that 2 = 1. This is false, so u + v ∉ W.

Exercise 4.11. Let x ∈ U and u ∈ W, and p, q ∈ R. Is px ∈ U? Is qu ∈ W? Prove your answer in both cases.

Exercise 4.12.
Can you identify a geometric difference between U and W that explains the algebraic differences?

MATRICES

A matrix is an object with rows and columns, and each entry is a number. Examples:

[ 2  1  5 ]    [ 2  0  π ]    [ 3 ]
[ 3  0  1 ]    [ 2  3  0 ]    [ 0 ]
               [ 1  3  0 ]    [ 4 ]

The last example is interesting: column vectors are a special kind of matrix. This inspires us to define addition and scalar multiplication in a similar fashion, entry-wise. So for example:

[ 2  1 ]   [ 7  6 ]   [ 2+7  1+6 ]   [ 9    7  ]
[ 0  π ] + [ 1  2 ] = [ 0+1  π+2 ] = [ 1  π+2  ]

     [ 2   1 ]   [ (−6)(2)  (−6)(1)  ]   [ −12  −6 ]
(−6) [ 0  −π ] = [ (−6)(0)  (−6)(−π) ] = [   0  6π ]

We can only add matrices of the same size. Formally we write M_{m×n}(R) for the set of matrices with m rows, n columns and entries that are real numbers. Sometimes we just write M_{m×n} and it is understood that we are talking about real matrices. Occasionally we will have use for matrices with complex entries.

Example 4.13. We will prove that addition of matrices is commutative. Since we don't know what size they are, it's helpful to use the notation a_ij to indicate the entry of the matrix A in the i-th row and j-th column.

A + B = [ a_11 ... a_1n ]   [ b_11 ... b_1n ]
        [  ...  ...  .. ] + [  ...  ...  .. ]          matrix notation
        [ a_m1 ... a_mn ]   [ b_m1 ... b_mn ]

      = [ a_11 + b_11 ... a_1n + b_1n ]
        [     ...      ...     ...    ]                definition of matrix addition
        [ a_m1 + b_m1 ... a_mn + b_mn ]

      = [ b_11 + a_11 ... b_1n + a_1n ]
        [     ...      ...     ...    ]                real addition is commutative
        [ b_m1 + a_m1 ... b_mn + a_mn ]

      = [ b_11 ... b_1n ]   [ a_11 ... a_1n ]
        [  ...  ...  .. ] + [  ...  ...  .. ]          definition of matrix addition
        [ b_m1 ... b_mn ]   [ a_m1 ... a_mn ]

      = B + A

This should be looking familiar...

POLYNOMIALS

A polynomial in one variable is an object of the form

a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0

where each a_j is a real number, n is a non-negative integer and a_n ≠ 0. The value n is the highest power of x that appears in the polynomial, and is called the degree. We already saw polynomials from the point of view of solving them; here we are only interested in their global structure. There is one exception: the polynomial 0 + 0x + 0x^2 + ... = 0 has degree 0.

We add polynomials by like terms, and scalar multiplication applies to all terms. For example, here are some polynomials, and some polynomial-arithmetic on them.

p(x) = −2x^2 + 3x + 7
q(x) = 3x^5 + x^2 − 7
p(x) + q(x) = (3 + 0)x^5 + (1 − 2)x^2 + (0 + 3)x + (7 − 7) = 3x^5 − x^2 + 3x
(−1)q(x) = (−1)(3)x^5 + (−1)(1)x^2 + (−1)(−7) = −3x^5 − x^2 + 7

When we add by like terms, if a term is missing then it is presumed zero. Strictly speaking, the sum above should contain calculations like (0 + 0)x^157, but we omit those because they all just give zero and putting them in would make these notes way too long.

We write P(x) for the set of all polynomials in the variable x, with addition and scalar multiplication defined term-wise. Sometimes it will be convenient to write P_n(x) for the set of all polynomials of degree at most n. This is possibly a little different than you expected; of the two examples above, both p(x) and q(x) are in P_5(x) and also in P_15(x), but only p(x) is in P_2(x). Often we omit the (x) if the variable is obvious, writing P or P_2. But note that we never omit the subscript, because that means something different: P contains all polynomials, P_2 contains only the polynomials of degree 2, 1 and 0.

Exercise 4.14. Give an example of a polynomial of degree 0. Describe all polynomials of degree 0.

It should come as no surprise to you that polynomial-addition and polynomial-scalar-multiplication behave nicely.

Exercise 4.15. Show that p(x) + q(x) = q(x) + p(x) for any p(x), q(x) ∈ P(x).

Similarly, there is a special polynomial Z(x) that has the property that r(x) + Z(x) = r(x) for any polynomial r(x) ∈ P. You might hazard a guess as to what Z(x) is, but let's pretend we have no idea and find it.

Example 4.16. We find Z(x) by solving the equation r(x) + Z(x) = r(x) for an arbitrary polynomial r(x). Let the degree of r be n.
If we write out the equation r(x) + Z(x) = r(x) in detail we get:

r(x) + Z(x) = (r_0 + r_1 x + ... + r_n x^n + 0x^{n+1} + 0x^{n+2} + ...) + (z_0 + z_1 x + ...)
            = (r_0 + z_0) + (r_1 + z_1)x + ... + (r_n + z_n)x^n + (z_{n+1})x^{n+1} + (z_{n+2})x^{n+2} + ...

(Note that the degree of the sum might be different than the degree of r or Z.) This equation says that for every single index j, we must have r_j + z_j = r_j. This is an equation involving real numbers, so the only option is z_j = 0, and so Z(x) = 0 + 0x + ... = 0.

Exercise 4.17. Go back through this section, and identify each + as being either an addition of real numbers, an addition of polynomials, or a placeholder in a polynomial.

AXIOMS

All of these types of objects (and many others!) have some structure in common. This is the motivation for defining a vector space.

Definition 4.18. A vector space V has three parts:
A collection of objects
A way of adding these objects together
A way of scalar multiplying these objects by real numbers

The addition and scalar multiplication must obey the following rules; that is, each of the following axioms must hold for all vectors u, v, w and all real numbers α, β.

Two closure axioms:
1) u + v ∈ V
2) αu ∈ V

Arithmetical axioms:
3) u + v = v + u
4) (u + v) + w = u + (v + w)
5) α(u + v) = αu + αv
6) (α + β)u = αu + βu
7) α(βu) = (αβ)u
8) 1u = u

Two special axioms:
9) There is a special vector ZERO such that u + ZERO = u.
10) For each u there is a special u′ such that u + u′ = ZERO.

Typically we refer to ZERO as 0, the zero vector. It is sometimes called an additive identity. Also, u′ is often called the negative of u and written as −u, the additive inverse of u.

Exercise 4.19. In Definition 4.18, we used the same notation for vector addition and real number addition, and for scalar multiplication of a vector and multiplication of two real numbers. Go through the definition carefully and identify each type of operation.

Exercise 4.20. Check that the following are all vector spaces (satisfy all the axioms of Definition 4.18).
R^n for a fixed n, with addition coordinate-wise and scalar multiplication of all coordinates.
M_{m×n} for fixed m, n, with addition entry-wise and scalar multiplication of all entries.
P(x), with addition by like terms and scalar multiplication of all terms.
P_n(x) for fixed n, with addition by like terms and scalar multiplication of all terms.
F(x), with addition and scalar multiplication defined point-wise.
This is potentially a very long exercise, but you should work through enough of it so that you can see that it is a long and tedious but essentially straightforward exercise.

Exercise 4.21. Check if the following sets are vector spaces.

U = {(t, 2t) | t ∈ R}    W = {(s, 2s + 1) | s ∈ R}

Notice that the operations are part of the definition. Often the operations associated with a vector space are "standard": we added matrices, equations, functions, polynomials etc. in what you might think of as the natural way.
Often in such cases we won't mention the operations, and we'll say things like "... the vector space of polynomials of degree at most three..." instead of the more pedantically correct

"... the vector space of polynomials of degree at most three, with addition defined by like terms and scalar multiplication acting on all terms..."

Whenever we omit the description of the operations like this, we will always mean the standard operations. However, there is nothing in Definition 4.18 that says what addition has to be, only how it has to behave. There are (many!) vector spaces with operations that look quite different. There are even instances where the same vectors can have different operations.

Exercise 4.22. Let S^2 be the set of pairs of positive real numbers, with addition and scalar multiplication defined as follows.

S^2 = {(a, b) | a, b > 0}

u ⊕ v = (u_1, u_2) ⊕ (v_1, v_2) = (u_1 v_1, u_2 v_2)
k ⊙ u = k ⊙ (u_1, u_2) = ((u_1)^k, (u_2)^k)

So S^2 looks superficially like the positive vectors of R^2, but the way these vectors interact is completely different. Show that S^2 is a vector space. (To avoid confusion we have used ⊕ and ⊙ to indicate vector addition and scalar multiplication of a vector by a real number in S^2; all other arithmetical symbols relate to ordinary real arithmetic.)
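The exotic operations on S^2 can be sampled numerically before attempting the full axiom check. A sketch under our own naming (`s_add` for ⊕, `s_scale` for ⊙), using exact `Fraction` arithmetic so negative exponents stay exact:

```python
from fractions import Fraction

# The exotic operations on S^2 (pairs of positive reals): "addition" is
# entry-wise multiplication, "scalar multiplication" is entry-wise powers.

def s_add(u, v):
    return (u[0] * v[0], u[1] * v[1])

def s_scale(k, u):
    return (u[0] ** k, u[1] ** k)

u = (Fraction(2), Fraction(3))
v = (Fraction(5), Fraction(1, 2))

assert s_add(u, v) == s_add(v, u)          # axiom 3: commutativity
assert s_add(u, (1, 1)) == u               # (1, 1) plays the role of ZERO
assert s_add(u, s_scale(-1, u)) == (1, 1)  # (-1) (.) u is the additive inverse
# axiom 5: k (.) (u (+) v) = (k (.) u) (+) (k (.) v), via (ab)^k = a^k b^k
assert s_scale(2, s_add(u, v)) == s_add(s_scale(2, u), s_scale(2, v))
```

Each assertion is one instance of one axiom; the exercise asks for proofs valid for all positive pairs, for which the exponent laws of real arithmetic do the work.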

MAT1341 Introduction to Linear Algebra        Mike Newman
5. Subspaces        class notes

MOTIVATION: SPACES INSIDE SPACES

A particularly important kind of vector space is one that lives inside of another, and inherits the operations of the big space.

Definition 5.1. Let V be some vector space, with addition and scalar multiplication defined in some way. Let U be a subset of V, with the same operations as V. If U is then a vector space in its own right, then we say that it is a subspace of V.

We have already seen at least one example; in fact you have already checked one as an exercise. Here it is again, rephrased slightly.

Exercise 5.2. Check if the sets

U = {(t, 2t) | t ∈ R}    W = {(s, 2s + 1) | s ∈ R}

are subspaces of R^2.

Note that much of the work of checking is already covered by checking that R^2 is a vector space. Recall that we proved that addition in U is commutative in Example 4.8, only to observe afterwards that we had already done this by checking that all of R^2 is commutative. It is worth noticing this formally.

Theorem 5.3 (Subspace Test). Let V be some vector space, with addition and scalar multiplication defined in some way. Let U be a subset of V, with the same operations as V. If U satisfies the two closure axioms, then it is a vector space. That is, if it satisfies the two closure axioms then it satisfies all of the vector space axioms.

So in order to answer Exercise 5.2 it suffices to check the two closure axioms for U and then for W. We've already done this... but in case you've forgotten it makes a nice exercise!

Exercise 5.4. Check the two closure axioms for U and for W.

There is a slightly different form that some people find useful. It is possible to check both closure axioms simultaneously.

Theorem 5.5 (Subspace Test, version 2). A set U satisfies the two closure axioms if and only if

kx + y ∈ U for all x, y ∈ U and k ∈ R

Sometimes an even quicker test is possible, but we'll need to take a slight theoretical detour first.
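The combined test of Theorem 5.5 can be sampled on U and W from Exercise 5.2. This sketch (helper names `in_U`, `in_W`, `combo` are ours) is evidence, not proof, but it shows the test in action:

```python
# Subspace Test, version 2, tried on U = {(t, 2t)} and W = {(s, 2s+1)}:
# kx + y always lands back in U, while W already fails for k = 1.

def in_U(p):
    return p[1] == 2 * p[0]

def in_W(p):
    return p[1] == 2 * p[0] + 1

def combo(k, x, y):
    # the vector kx + y in R^2
    return (k * x[0] + y[0], k * x[1] + y[1])

for k in (-3, 0, 1, 7):
    for t, r in ((1, -2), (0, 5), (4, 4)):
        # closure of U: k(t,2t) + (r,2r) = (kt+r, 2(kt+r)) is again in U
        assert in_U(combo(k, (t, 2 * t), (r, 2 * r)))
        if k == 1:
            # the sum of two vectors of W has second entry 2(t+r)+2, not
            # 2(t+r)+1, so it misses W: the test fails
            assert not in_W(combo(k, (t, 2 * t + 1), (r, 2 * r + 1)))
```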

SPECIAL VECTORS ARE UNIQUE

Theorem 5.6. Let V be a vector space. Then the special vector 0 is unique; that is, there is only one vector that acts as an additive identity.

Proof. Let's assume that z_1 and z_2 are both zero vectors. That is, for every vector x ∈ V, we have both x + z_1 = x and x + z_2 = x. Then

z_1 = z_1 + z_2    since z_2 is a zero-vector
    = z_2 + z_1    vector addition is commutative
    = z_2          since z_1 is a zero-vector

In a similar vein we have the following.

Theorem 5.7. Let V be a vector space and x ∈ V. Then the special vector −x is unique; that is, there is only one vector that acts as an additive inverse for x.

Proof. Let's assume that x′ and x′′ are both additive inverses for x. That is, x + x′ = 0 and x + x′′ = 0. Then

x + x′ = 0
x′′ + (x + x′) = x′′ + 0     add x′′ to both sides
(x′′ + x) + x′ = x′′ + 0     arithmetic axioms
0 + x′ = x′′ + 0             since x′′ is an additive inverse of x
x′ = x′′                     since 0 is the additive identity

In fact we can say more. We can find the zero vector and find the additive inverse by direct computation. Recall that in Example 4.16 we discovered what the zero polynomial was. There is a way to do this directly.

Theorem 5.8. Let V be a vector space and x ∈ V. Then 0 = 0x and −x = (−1)x. This is why we generally write 0 for the zero vector instead of ZERO, and −x for the additive inverse instead of x′.

Proof. First of all we have

0x = (0 + 0)x                        since 0 + 0 = 0 (numbers, not vectors)
0x = 0x + 0x                         arithmetic axioms
0x + (−(0x)) = 0x + 0x + (−(0x))     −(0x) is the additive inverse of 0x
0 = 0x + 0                           a vector plus its additive inverse gives 0
0 = 0x                               adding 0 to a vector gives the vector

and so 0 = 0x for any x ∈ V. Next we have

0 = 0x                just proved!
0 = (1 + (−1))x       1 + (−1) = 0 as numbers
0 = 1x + (−1)x        arithmetic axioms
0 = x + (−1)x         arithmetic axioms

This says that (−1)x is a vector which, when added to x, produces 0. So it is the additive inverse of x.
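Theorem 5.8 is most striking in a space with non-standard operations. In S^2 from Exercise 4.22, computing 0 ⊙ x and (−1) ⊙ x mechanically produces the zero vector and the additive inverse, with no guessing. A sketch under our own naming (`s_add` for ⊕, `s_scale` for ⊙):

```python
from fractions import Fraction

# Theorem 5.8 in the exotic space S^2: 0 (.) x gives the zero vector and
# (-1) (.) x gives the additive inverse, computed directly.

def s_add(u, v):
    return (u[0] * v[0], u[1] * v[1])

def s_scale(k, u):
    return (u[0] ** k, u[1] ** k)

x = (Fraction(2), Fraction(3))

zero = s_scale(0, x)           # (2^0, 3^0) = (1, 1)
assert zero == (1, 1)          # the zero vector of S^2 is (1, 1), not (0, 0)
assert s_add(x, zero) == x     # it really acts as an additive identity

neg = s_scale(-1, x)           # (1/2, 1/3)
assert s_add(x, neg) == zero   # x "plus" its inverse gives the zero vector
```

This is exactly the computation Exercise 5.10 below asks for, so treat the code as a way to check your own answer.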

From now on we will write 0 for the zero vector and −x for the additive inverse of x.

Example 5.9. Find the vector 0 in the space P_3. You can probably guess the answer, but let's pretend we don't know. Pick any polynomial we want in P_3, say x^2 − 13x. Multiply it by the scalar 0 to obtain

(0)(x^2 − 13x) = (0)(1)x^2 + (0)(−13)x = 0x^2 + 0x = 0

(The final 0 in that equation is actually a polynomial...) So 0 = 0x^3 + 0x^2 + 0x + 0.

The previous example was perhaps not so inspiring. But in fact the same approach works in any vector space.

Exercise 5.10. Find the vector 0 in the space S^2 of Exercise 4.22. In the same space, find the vector −x for x = (2, 3). In the same space, find the vector −0.

QUICK CHECK

One of the benefits of Theorem 5.8 is that it gives a quick check for subspaces. If we have a vector space V and a subset U ⊆ V, then we know that the zero-vector of V behaves like a zero vector of U also. Furthermore, if U is to be a subspace, it can have only one zero, by Theorem 5.6, so it must have the same zero as V. So we get the following check.

Proposition 5.11 (Subspace Quick Check). Let V be a vector space and U a subset of V. Let 0 be the zero vector of V. If 0 ∉ U then U is not a subspace.

We know that 0 = (0, 0) in the vector space R^2. Let's apply Proposition 5.11 to Exercise 5.2. A quick check reveals that (0, 0) ∈ U. Conclusion: this tells us nothing; we need to check the closure axioms (which we already did, of course!). Another quick check reveals that (0, 0) ∉ W. Conclusion: W is not a subspace, and we needn't check the closure axioms; in fact we know that at least one of them will fail...

MORE EXAMPLES

Example 5.12. Show that diagonal matrices form a subspace of matrices (all of size n × n). A matrix is diagonal if it has zeros off the diagonal. Precisely, if A is diagonal then a_ij = 0 whenever i ≠ j. Diagonal matrices look like:

A = [ a_11   0   ...   0   ]    B = [ b_11   0   ...   0   ]
    [  0   a_22  ...   0   ]        [  0   b_22  ...   0   ]
    [ ...        ...  ...  ]        [ ...        ...  ...  ]
    [  0     0   ...  a_nn ]        [  0     0   ...  b_nn ]

They are clearly a subset of M_{n×n}, so all we need to do is apply the Subspace Test, namely check the two closure axioms. First let's try the quick check: is the zero n × n matrix diagonal? Yes it is, since it has zeros off the diagonal. This tells us... nothing. The quick check is mostly useful when you suspect that the subset is not a subspace. Diagonal matrices most certainly are a subspace, as

the following verification of the closure axioms attests:

A + B = [ a_11 + b_11       0        ...       0       ]
        [      0       a_22 + b_22   ...       0       ]
        [     ...                    ...      ...      ]
        [      0            0        ...  a_nn + b_nn  ]

   kA = [ ka_11   k0   ...   k0   ]   [ ka_11    0    ...    0   ]
        [  k0   ka_22  ...   k0   ] = [   0    ka_22  ...    0   ]
        [  ...         ...  ...   ]   [  ...          ...   ...  ]
        [  k0    k0    ...  ka_nn ]   [   0      0    ...  ka_nn ]

In both cases there are zeros off the diagonal, so both results are diagonal matrices and the closure axioms are verified.

Example 5.13. Let U be the set of polynomials p(x) with p(0) = 1. Precisely, U = {p(x) | p(0) = 1}. Is U a subspace (of P, of course)? We'll check the second closure axiom. Let a(x) ∈ U, k ∈ R and let f = ka:

f(x) = ka(x) = k(a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0)
     = ka_n x^n + ka_{n−1} x^{n−1} + ... + ka_1 x + ka_0

Now we ask the question: is the result in U? In this case that means: is f(0) = 1?

f(0) = ka_n (0)^n + ka_{n−1} (0)^{n−1} + ... + ka_1 (0) + ka_0 = ka_0

But a_0 = a(0), and since a(x) ∈ U, we have that a(0) = a_0 = 1. So f(0) = ka_0 = k. If this is in U, then it must be the case that for every k ∈ R we have k = 1. Reread the previous sentence: we're done.

Exercise 5.14. We don't really have to, but check the first closure axiom for U = {p(x) | p(0) = 1}. Also, check the quick test: is the zero polynomial in U?

Exercise 5.15. Show that P_3 is a subspace of P. Is R^3 a subspace of R^4? Can you find a subspace of R^4 that is "like" R^3? Does "like" mean "is equal to"? What does "like" mean?

Exercise 5.16. In what way is P_3 like R^4? Are they the same space? Really? If so then what is P like?
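Example 5.13 can be watched failing numerically. Storing a polynomial as its coefficient list (constant term first, a representation we choose for convenience), scaling a polynomial with p(0) = 1 by k gives a polynomial whose value at 0 is k:

```python
# Example 5.13 in code: U = {p : p(0) = 1} is not closed under scalar
# multiplication, and the quick check of Proposition 5.11 also fails.

def poly_eval(coeffs, x):
    # coefficient list, constant term first, so degree j sits at index j
    return sum(c * x**j for j, c in enumerate(coeffs))

a = [1, -13, 1]                # x^2 - 13x + 1, so a(0) = 1 and a is in U
assert poly_eval(a, 0) == 1

for k in (0, 2, -5):
    f = [k * c for c in a]     # f = ka
    assert poly_eval(f, 0) == k         # f(0) = k, which is 1 only if k = 1

assert poly_eval([2 * c for c in a], 0) != 1   # so 2a leaves U
assert poly_eval([0], 0) != 1                  # zero polynomial is not in U either
```

The last line is the quick check: the zero polynomial takes the value 0 at every point, so it is not in U, and U cannot be a subspace.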

MAT1341 Introduction to Linear Algebra        Mike Newman
6. Span        class notes

HOW DO YOU DESCRIBE A VECTOR SPACE?

Example 6.1. Consider the following two subspaces of R^2.

X = {t(3, 1) | t ∈ R}    Y = {(x, y) | x − 3y = 0}

The description of X is constructive. It's easy to generate vectors in X: just choose any real number for t and plug it in. The description of Y is descriptive. It's easy to check if any particular vector is in Y: just plug it in. The description of X tells us what directions are permitted. The description of Y tells us which directions are forbidden. The description of X is similar to the way we wrote lines earlier (in fact X is a line). The description of Y is similar to the way we wrote planes earlier (actually Y is a line too).

Exercise 6.2. Check that X and Y of Example 6.1 are indeed subspaces of R^2.

Exercise 6.3. Show that in fact X = Y in Example 6.1.

SPAN

Let V be a vector space and {x_1, x_2, ..., x_k} be a set of k vectors in V. We define the span of {x_1, x_2, ..., x_k} to be the set of all linear combinations of them; formally,

span{x_1, x_2, ..., x_k} = {α_1 x_1 + α_2 x_2 + ... + α_k x_k | α_1, ..., α_k ∈ R}

If U is a subspace of V (possibly U = V) and every vector in U can be written as a linear combination β_1 x_1 + β_2 x_2 + ... + β_k x_k for some β_1, ..., β_k, then we say that {x_1, x_2, ..., x_k} spans U. Sometimes we say that {x_1, x_2, ..., x_k} is a spanning set for U. Note that {x_1, x_2, ..., x_k} is a set containing k vectors, but the span of this set is a subspace containing an infinite number of vectors.

Example 6.4. Recall the subspace X of Example 6.1. We see that by definition

X = {t(3, 1) | t ∈ R} = span{(3, 1)}

This suggests a nice connection between subspaces and spans.

Theorem 6.5. Let {x_1, x_2, ..., x_k} be a set of vectors in some vector space V. Then span{x_1, x_2, ..., x_k} is a subspace of V (and hence a vector space).

Proof. The proof is perhaps more straightforward than you might expect.
We just use the subspace test, Theorem 5.3. Let u, v ∈ span{x_1, x_2, ..., x_k}; this means that u and v are both linear combinations of the x_j's, i.e., that we can write

u = α_1 x_1 + α_2 x_2 + ... + α_k x_k
v = β_1 x_1 + β_2 x_2 + ... + β_k x_k

There is one exception: if the set is empty (containing no vectors at all) then its span is {0} (containing exactly one vector, the zero vector).
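The closure idea behind this proof can be sampled on the one-vector span of Example 6.4: combining two multiples of (3, 1) just combines their coefficients, and every such multiple also passes the descriptive test for Y. The helper names `add` and `scale` are ours.

```python
# A span is closed under the vector operations: sums and scalar multiples
# of linear combinations of (3, 1) are again multiples of (3, 1).

def scale(k, p):
    return (k * p[0], k * p[1])

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

x1 = (3, 1)
u = scale(2.5, x1)           # alpha = 2.5
v = scale(-4.0, x1)          # beta = -4.0

# u + v is the multiple of x1 with coefficient alpha + beta
assert add(u, v) == scale(2.5 + (-4.0), x1)

# every vector of span{(3,1)} satisfies the descriptive test x - 3y = 0,
# illustrating Exercise 6.3 (X = Y)
for t in (-2.0, 0.0, 1.5):
    p = scale(t, x1)
    assert p[0] - 3 * p[1] == 0
```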