Vector Calculus Math 213 Course Notes Cary Ross Humber November 28, 2016


Contents

Preface

1 Linear Algebra Primer
  1 Vectors
    1.1.1 Vector Operations
    1.1.2 Some Geometric Concepts
    1.1.3 Linear Independence, Bases, other definitions
    1.1.4 Projection
  2 A little about matrices
    1.2.1 Determinant Formulas
    1.2.2 Determinant Geometry
    1.2.3 Cross product, triple product
    1.2.4 Lines and Planes

2 Multivariable functions
    2.0.1 Level Sets
    2.0.2 Sections
  1 Derivatives
    2.1.1 What's wrong with partial derivatives?
    2.1.2 Directional derivatives
  2 Tangent vectors and planes
    2.2.1 Surfaces in R^3
  3 Coordinates
  4 Parametric Curves
  5 Parametric surfaces
  6 Practice problems

3 Exterior Forms
  1 Constant Forms
    3.1.1 1-Forms
  2 2-Forms
    3.2.1 Wedge (Exterior) product
  3 k-forms
    3.3.1 Wedge product again
  4 Vector fields
  5 Differential forms
    3.5.1 Exterior derivative
    3.5.2 Pullbacks/Change of coordinates
  6 Practice problems

4 Integration and the fundamental correspondence
  1 The correspondence between vector fields and differential forms
  2 Flux integrals
    4.2.1 Line integrals and work
    4.2.2 Orientations
    4.2.3 Integration of 3-forms
  3 Practice problems

5 Stokes' Theorem
  1 Surfaces with boundary
  2 The generalized Stokes' Theorem
    5.2.1 Stokes' Theorem for 1-surfaces
    5.2.2 Stokes' Theorem for 2-surfaces
    5.2.3 Stokes' Theorem for 3-surfaces

A Coordinate representations

B Some applications of differential forms and vector calculus
  1 Extreme values
    2.1.1 Constrained Extrema
  2 Maxwell's equations

Chapter 1: Linear Algebra Primer

1 Vectors

The majority of our calculus will take place in 2-dimensional and 3-dimensional space. Occasionally, we may work in higher dimensions. For our purposes, a vector is like a point in space, along with a direction. Other information, such as the magnitude or length of a vector, can be determined from this point and direction. We visualize a vector as an arrow emanating from the origin, which we often denote by O, and ending at this point.

The space (so-called vector space)

R^2 = {(x_1, x_2) : x_1, x_2 ∈ R}

consists of pairs of real numbers. Such a pair, which we often denote by a single letter (bold, hatted, or with an arrow on top), is a vector in R^2. The convention taken for these notes is to denote vectors by bold letters. It is typical to express a vector x in column form

x = ( x_1 )
    ( x_2 )

on a chalkboard/whiteboard, or whenever space is not a concern. Whenever space is at a premium, it is just as typical to denote the same vector x in row form x = (x_1, x_2).

The space R^3 consists of 3-tuples of real numbers, or real 3-component vectors. Just as with R^2, we can express R^3 as the set

R^3 = {(x_1, x_2, x_3) : x_1, x_2, x_3 ∈ R}.

Each vector x ∈ R^3 consists of three components, each of which is a real number.

In general, the space R^n consists of n-tuples of real numbers, or real n-component vectors,

R^n = {(x_1, ..., x_n) : x_j ∈ R, j = 1, ..., n}.

The higher the dimension, the more space is preserved by using the row form x = (x_1, ..., x_n).

1.1.1 Vector Operations

There are two basic vector operations: vector addition and scalar multiplication. Both operations are defined component-wise. Given two vectors a, b ∈ R^n with component forms a = (a_1, a_2, ..., a_n) and b = (b_1, b_2, ..., b_n), the vector sum a + b is the vector obtained by adding the components of a to those of b,

a + b = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n).

Similarly, if α ∈ R is a scalar, the scalar multiple αa is obtained by multiplying each component of a by α,

αa = (αa_1, αa_2, ..., αa_n).

In what follows, whether we are discussing R^2, R^3 or R^n in general, we denote the zero vector by 0, which is simply the vector with 0 in every component. With respect to vector addition and scalar multiplication, the following conditions are satisfied for all α, β ∈ R and a, b, c ∈ R^n:

V1) a + b = b + a
V2) (a + b) + c = a + (b + c)
V3) a + 0 = a
V4) a + (−a) = 0
V5) 1a = a
V6) α(βa) = (αβ)a
V7) (α + β)a = αa + βa
V8) α(a + b) = αa + αb.

Example 1.1 Let a = (2, 1) and b = (−3, 4). Then

a + b = (2 − 3, 1 + 4) = (−1, 5).

The vector a + b is the diagonal of the parallelogram with sides a and b, as depicted in Figure 1.1.
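As a quick sketch, the component-wise operations above can be written in Python, with vectors represented as plain tuples (the function names here are illustrative, not part of the notes):

```python
# Component-wise vector addition and scalar multiplication in R^n,
# with vectors represented as Python tuples.

def vec_add(a, b):
    """Return the vector sum a + b, component-wise."""
    assert len(a) == len(b), "vectors must live in the same R^n"
    return tuple(ai + bi for ai, bi in zip(a, b))

def scalar_mul(alpha, a):
    """Return the scalar multiple alpha * a, component-wise."""
    return tuple(alpha * ai for ai in a)

# Example 1.1: a = (2, 1), b = (-3, 4)
a, b = (2, 1), (-3, 4)
print(vec_add(a, b))       # (-1, 5), the diagonal of the parallelogram
print(scalar_mul(2, a))    # (4, 2)
```

The same two functions work unchanged in any dimension, which mirrors the fact that the axioms V1)–V8) are stated for R^n.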

Figure 1.1: The vector a + b is the diagonal of the parallelogram with sides a, b.

1.1.2 Some Geometric Concepts

Given two vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in R^n, we define their inner product by

⟨x, y⟩ = Σ_{j=1}^{n} x_j y_j.

The term inner product is synonymous with scalar product: if we input two vectors, the output is a scalar (real number). This particular inner product is often called the dot product in vector calculus texts, so the same formula may be denoted

x · y = Σ_{j=1}^{n} x_j y_j.

Soon, we will see what the inner product tells us about the geometric relationship between two (or more) vectors.

Another important scalar quantity is the length or magnitude of a vector. This is a scalar associated with a single vector, whereas the inner product is a scalar associated with two vectors. However, these quantities are related. The norm (in particular, the Euclidean norm)

of the vector x ∈ R^n is

‖x‖ = ⟨x, x⟩^{1/2} = ( Σ_{j=1}^{n} x_j^2 )^{1/2}.

In other words, the quantity ‖x‖^2 is the inner product of x with itself. Geometrically, the norm ‖x‖ represents the length of x. The Cauchy-Schwarz inequality gives another relationship between the norm and inner product, namely

|⟨a, b⟩| ≤ ‖a‖ ‖b‖   (1.1)

for any a, b ∈ R^n. Though simple, the Cauchy-Schwarz inequality is very powerful. Another powerful inequality is the triangle inequality

| ‖a‖ − ‖b‖ | ≤ ‖a ± b‖ ≤ ‖a‖ + ‖b‖.   (1.2)

Theorem 1.2 (Properties of the inner product). Let α, β be real numbers and let a, b, c ∈ R^n. Then

1. ⟨αa, b⟩ = α⟨a, b⟩
2. ⟨a, βb⟩ = β⟨a, b⟩
3. ⟨a + b, c⟩ = ⟨a, c⟩ + ⟨b, c⟩
4. ⟨a, b + c⟩ = ⟨a, b⟩ + ⟨a, c⟩

These properties highlight why it is often preferable to work with inner products rather than norms. With inner products, scalars factor out of both arguments. In contrast, the analogous property for norms is ‖αa‖ = |α| ‖a‖; only the absolute value factors out, in general. Since inner products have nicer properties, whenever it makes sense to do so, we will often square norms so that they become inner products.

Definition 1.3. A vector a ∈ R^n is called a unit vector if ‖a‖ = 1.

From any vector b ∈ R^n we can obtain a unit vector by normalizing it: the vector u = b/‖b‖ has norm 1.

Let a = (a_1, a_2), b = (b_1, b_2) be two vectors in R^2. We want to determine an expression for the angle ϕ between a and b. Let ϕ_a, ϕ_b be the angles between the positive x-axis (e_1-axis) and a, b, respectively. To each vector there corresponds a right triangle,

Figure 1.2: The angle ϕ between a and b.

whose side lengths correspond to the components of the vector and whose hypotenuse is the norm, as depicted in Figure 1.2. This gives the following trigonometric relations:

sin ϕ_a = a_2/‖a‖,  sin ϕ_b = b_2/‖b‖
cos ϕ_a = a_1/‖a‖,  cos ϕ_b = b_1/‖b‖
tan ϕ_a = a_2/a_1,  tan ϕ_b = b_2/b_1.

If ϕ is the angle between a and b, then ϕ = ϕ_a − ϕ_b. Thus,

cos ϕ = cos(ϕ_a − ϕ_b)
      = cos ϕ_a cos ϕ_b + sin ϕ_a sin ϕ_b
      = (a_1/‖a‖)(b_1/‖b‖) + (a_2/‖a‖)(b_2/‖b‖)
      = (a_1 b_1 + a_2 b_2) / (‖a‖ ‖b‖),

where the numerator in the last expression is ⟨a, b⟩. Note that this analysis holds in higher dimensions as well. Thus, the angle ϕ between vectors a, b ∈ R^n is determined by

cos ϕ = ⟨a, b⟩ / (‖a‖ ‖b‖),   (1.3)

where ϕ ∈ (0, π). We can just as well let ϕ be in the closed interval [0, π], which includes the possibility that a, b lie on the same line. If the angle between a and b is 0 or π, a and b are called parallel.
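Formula (1.3) translates directly into a short computation. The following Python sketch (illustrative function names, not part of the notes) clamps the cosine to [−1, 1] before inverting, to guard against floating-point roundoff:

```python
import math

# Angle between two vectors via cos(phi) = <a,b> / (||a|| ||b||), formula (1.3).

def inner(x, y):
    """Inner (dot) product of two vectors given as sequences."""
    return sum(xj * yj for xj, yj in zip(x, y))

def norm(x):
    """Euclidean norm ||x|| = <x,x>^(1/2)."""
    return math.sqrt(inner(x, x))

def angle(a, b):
    """Angle in [0, pi] between nonzero vectors a and b."""
    c = inner(a, b) / (norm(a) * norm(b))
    c = max(-1.0, min(1.0, c))  # guard against roundoff outside [-1, 1]
    return math.acos(c)

# The angle between (1, 0) and (1, 1) is pi/4.
print(angle((1, 0), (1, 1)))   # 0.7853981633974484, i.e. pi/4
```

Since the code only uses the inner product and norm, it works in any R^n, matching the remark that the analysis holds in higher dimensions.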

Proof of the Cauchy-Schwarz inequality. The Cauchy-Schwarz inequality is a direct consequence of the cosine identity (1.3). By (1.3), for vectors a, b ∈ R^n we have

cos ϕ = ⟨a, b⟩ / (‖a‖ ‖b‖).

Since −1 ≤ cos ϕ ≤ 1, this yields

−1 ≤ ⟨a, b⟩ / (‖a‖ ‖b‖) ≤ 1,

hence

−‖a‖ ‖b‖ ≤ ⟨a, b⟩ ≤ ‖a‖ ‖b‖,

which is equivalent to |⟨a, b⟩| ≤ ‖a‖ ‖b‖.

Definition 1.4. Two vectors a, b ∈ R^n are called orthogonal (or perpendicular) if ⟨a, b⟩ = 0. Note that, in this case, by (1.3)

cos ϕ = ⟨a, b⟩ / (‖a‖ ‖b‖) = 0  ⟹  ϕ = π/2.

If a and b are orthogonal, we denote this by a ⊥ b.

Example 1.5 Let's compute the angle ϕ between a = (1, 1) and

b = ( (1 + √3)/2 , (√3 − 1)/2 ).

We have norms

‖a‖ = √(1² + 1²) = √2

and

‖b‖² = ( (1 + √3)/2 )² + ( (√3 − 1)/2 )² = (4 + 2√3)/4 + (4 − 2√3)/4 = 2,

so ‖b‖ = √2. The inner product is

⟨a, b⟩ = 1 · (1 + √3)/2 + 1 · (√3 − 1)/2 = √3.

Thus, the angle ϕ between a and b is determined by

cos ϕ = ⟨a, b⟩ / (‖a‖ ‖b‖) = √3 / (√2 · √2) = √3/2.

Keeping in mind that ϕ must be in the interval (0, π), we have ϕ = π/6.

Proposition 1.6 (Parallelogram Law).

‖a + b‖² + ‖a − b‖² = 2‖a‖² + 2‖b‖²   (1.4)

Proof. We will give a proof of the parallelogram law using the law of cosines. For a triangle with angles A, B, C and opposing sides of length a, b, c, respectively, such as the triangle depicted in Figure 1.3, the law of cosines states

a² = b² + c² − 2bc cos A
b² = a² + c² − 2ac cos B
c² = a² + b² − 2ab cos C.   (1.5)

The geometric relationship between the vectors a, b, a + b and a − b is depicted in Figure 1.4. First, let ϕ denote the angle between a and a + b. By (1.3),

cos ϕ = ⟨a, a + b⟩ / (‖a‖ ‖a + b‖).   (1.6)

By the law of cosines, we have

‖b‖² = ‖a‖² + ‖a + b‖² − 2‖a‖ ‖a + b‖ cos ϕ
     = ‖a‖² + ‖a + b‖² − 2‖a‖ ‖a + b‖ · ⟨a, a + b⟩ / (‖a‖ ‖a + b‖)
     = ‖a‖² + ‖a + b‖² − 2⟨a, a + b⟩
     = ‖a‖² + ‖a + b‖² − 2( ‖a‖² + ⟨a, b⟩ ).   (1.7)

Thus, we have

‖a‖² + ‖b‖² = ‖a + b‖² − 2⟨a, b⟩.   (1.8)

Now, let θ denote the angle between a and b. By (1.3),

cos θ = ⟨a, b⟩ / (‖a‖ ‖b‖).   (1.9)

Figure 1.3: The relationship between A, B, C and a, b, c for the law of cosines.

Figure 1.4: The parallelogram generated by a, b, with diagonals a + b and a − b.

By the law of cosines, we have

‖a − b‖² = ‖a‖² + ‖b‖² − 2‖a‖ ‖b‖ cos θ
         = ‖a‖² + ‖b‖² − 2‖a‖ ‖b‖ · ⟨a, b⟩ / (‖a‖ ‖b‖)
         = ‖a‖² + ‖b‖² − 2⟨a, b⟩.   (1.10)

Thus, we have

‖a‖² + ‖b‖² = ‖a − b‖² + 2⟨a, b⟩.   (1.11)

Adding equations (1.8) and (1.11), we arrive at the parallelogram law

2‖a‖² + 2‖b‖² = ‖a + b‖² + ‖a − b‖².

Proposition 1.7. The vectors a, b are orthogonal if and only if ‖a + b‖² = ‖a‖² + ‖b‖².

Proof. By the parallelogram law (1.4),

‖a + b‖² + ‖a − b‖² = 2‖a‖² + 2‖b‖².

If a, b are orthogonal, then ⟨a, b⟩ = 0. Expanding the term ‖a − b‖² in terms of inner products, we have

‖a − b‖² = ⟨a − b, a − b⟩
         = ⟨a, a − b⟩ − ⟨b, a − b⟩
         = ⟨a, a⟩ − ⟨a, b⟩ − ⟨b, a⟩ + ⟨b, b⟩
         = ‖a‖² + ‖b‖²,

since ⟨a, b⟩ = 0 = ⟨b, a⟩. Since ‖a − b‖² = ‖a‖² + ‖b‖², the parallelogram law reduces to ‖a + b‖² = ‖a‖² + ‖b‖². Conversely, expanding ‖a + b‖² = ‖a‖² + 2⟨a, b⟩ + ‖b‖² shows that the identity ‖a + b‖² = ‖a‖² + ‖b‖² forces ⟨a, b⟩ = 0.

1.1.3 Linear Independence, Bases, other definitions

Definitions 1.8.

1.8.1) Given a collection of vectors {v_1, ..., v_k} in R^n, a linear combination of these vectors is an expression of the form

Σ_{j=1}^{k} α_j v_j = α_1 v_1 + ⋯ + α_k v_k,

where α_j ∈ R for j = 1, ..., k.

1.8.2) A collection of vectors {v_1, ..., v_k} in R^n is called linearly independent if

Σ_{j=1}^{k} α_j v_j = 0 implies α_1 = ⋯ = α_k = 0.

Otherwise, the collection is called linearly dependent.

Geometrically, any two vectors are linearly independent if they do not lie on the same line. Suppose a, b ∈ R^n are linearly dependent. This means that we can find α, β ∈ R, not both 0, such that αa + βb = 0. Then, by rearranging the previous equality,

a = −(β/α) b,

or, equivalently,

b = −(α/β) a.

Thus, the vectors a, b are scalar multiples of each other, so they lie on the same line.

1.8.3) The span of {v_1, ..., v_k} in R^n is the set of all linear combinations of {v_1, ..., v_k},

span{v_1, ..., v_k} = {α_1 v_1 + ⋯ + α_k v_k : α_j ∈ R, j = 1, ..., k}.

We say R^n (or a subspace of R^n; see below) is spanned by {v_1, ..., v_k} if any vector x ∈ R^n can be expressed as a linear combination of v_1, ..., v_k; that is,

x = Σ_{j=1}^{k} α_j v_j

for some scalars α_1, ..., α_k ∈ R.

The span of a single vector a ∈ R^n is simply the (infinite) line on which a lies, since span{a} = {αa : α ∈ R} is the set of all scalar multiples of a, and scalar multiples of a vector lie on the same line.

Let's look at the span of two linearly independent vectors. First, why do we want them to be linearly independent? If we take two linearly dependent vectors, they lie on the same line. That is, if a, b ∈ R^n are linearly dependent, then

span{a, b} = span{a} = span{b}.

Geometrically, we do not gain any new information by considering a linearly dependent collection. Now, if a, b ∈ R^n are linearly independent, then span{a, b} generates a plane.

1.8.4) A basis for R^n is a collection of linearly independent vectors that span R^n.

In other words, a collection of vectors {v_1, v_2, ..., v_n} is a basis for R^n if every vector a ∈ R^n is uniquely expressible as a linear combination of the basis vectors. This means, given any vector a ∈ R^n, we can find unique scalars α_1, α_2, ..., α_n such that

a = α_1 v_1 + α_2 v_2 + ⋯ + α_n v_n.   (1.12)

In this case, the scalars (α_1, α_2, ..., α_n) are called the coordinates of a relative to the basis {v_1, ..., v_n} (or simply coordinates). It is a basic theorem of linear algebra that any basis for R^n consists of exactly n vectors.

We often deal with the canonical basis {e_1, ..., e_n}, where e_j ∈ R^n has a 1 in the j-th component and 0 in all other components. For R^2, this is {e_1, e_2}, where e_1 = (1, 0) and

e_2 = (0, 1). For R^3, the canonical basis is {e_1, e_2, e_3}, where e_1 = (1, 0, 0), e_2 = (0, 1, 0) and e_3 = (0, 0, 1). The canonical basis {e_1, e_2, e_3} consists of unit vectors directed along the x, y and z axes, respectively. The notation {i, j, k} for the canonical basis of R^3 (in that order) is quite common in vector calculus texts, as well as other texts that rely heavily on vector calculus (e.g. electrodynamics and other physics texts).

1.8.5) A collection of vectors {v_1, ..., v_k} in R^n is called orthonormal if ⟨v_i, v_j⟩ = 0 for i ≠ j and ⟨v_j, v_j⟩ = ‖v_j‖² = 1 for all j = 1, ..., k. Equivalently, this collection is orthonormal if ⟨v_i, v_j⟩ = δ_ij, where

δ_ij = 1 if i = j, and δ_ij = 0 if i ≠ j.

In other words, an orthonormal collection is a collection of mutually orthogonal unit vectors. The canonical bases for R^2 and R^3 are both orthonormal.

To any collection {v_1, ..., v_k} of vectors in R^n there is an associated n × k matrix whose columns are the vectors,

A = ( v_1 v_2 ⋯ v_k ).

For instance, the canonical basis {e_1, e_2, e_3} for R^3 corresponds to the matrix

A = ( e_1 e_2 e_3 ) = ( 1 0 0 )
                      ( 0 1 0 )
                      ( 0 0 1 ).

Given this correspondence, numerous statements about collections of vectors can be translated into statements about the corresponding matrix (and vice versa).

Theorem 1.9. Let {v_1, ..., v_n} be a collection of vectors in R^n, and let A ∈ R^{n×n} be the matrix associated with this collection, A = ( v_1 ⋯ v_n ). The following statements are equivalent.

1. The collection {v_1, ..., v_n} is a basis for R^n.
2. The matrix A is invertible.
3. det(A) ≠ 0.
4. For any b ∈ R^n, the nonhomogeneous system Ax = b has a solution. Moreover, the solution x ∈ R^n is unique.
5. The homogeneous system Ax = 0 has only the trivial solution x = 0.
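Anticipating the determinant formula of Section 1.2.1, the equivalence of statements 1 and 3 in Theorem 1.9 can be tested numerically in the 2 × 2 case. A minimal Python sketch (the function names are illustrative):

```python
# Statement 3 of Theorem 1.9 for vectors in R^2:
# {v1, v2} is a basis for R^2 exactly when det(v1 v2) != 0.

def det2(v1, v2):
    """Determinant of the 2x2 matrix with columns v1, v2."""
    return v1[0] * v2[1] - v1[1] * v2[0]

def is_basis_r2(v1, v2):
    """True when {v1, v2} is linearly independent, hence a basis of R^2."""
    return det2(v1, v2) != 0

print(is_basis_r2((1, 0), (0, 1)))   # True: the canonical basis
print(is_basis_r2((1, 2), (2, 4)))   # False: (2, 4) = 2*(1, 2), dependent
```

Note that the second pair fails precisely because the two vectors lie on the same line, matching the geometric picture of linear dependence above.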

1.1.4 Projection

Suppose a, b ∈ R^3 are linearly independent (hence, not on the same line). Since a, b are linearly independent, they span (or generate) a plane in R^3. Let

M = span{a, b} = {αa + βb : α, β ∈ R}

denote this plane. By definition, {a, b} is a basis for M (the vectors are linearly independent and span M); however, as we will see, it is often convenient to have an orthonormal basis. Another linear algebra fact, which we will not cover in detail at this point, is that we can always determine an orthonormal basis.

Let {v_1, v_2} be an orthonormal basis for M, where v_1 = b/‖b‖. At this point, we do not need to know what v_2 is; we can use the fact that {v_1, v_2} is orthonormal to determine v_2! Since {v_1, v_2} is a basis for M,

a = α_1 v_1 + α_2 v_2,   (1.13)

for some α_1, α_2 ∈ R. First, take the inner product of both sides of the previous equation with v_1. Then

⟨a, v_1⟩ = α_1 ⟨v_1, v_1⟩ + α_2 ⟨v_2, v_1⟩   (1.14)
         = α_1,   (1.15)

which follows since v_1 is a unit vector (‖v_1‖ = 1) and v_1, v_2 are orthogonal to each other (⟨v_1, v_2⟩ = 0). Similarly, if we take the inner product of (1.13) with v_2, we have

⟨a, v_2⟩ = α_1 ⟨v_1, v_2⟩ + α_2 ⟨v_2, v_2⟩   (1.16)
         = α_2,   (1.17)

which also follows by the orthonormality of {v_1, v_2}. Now, if we go back to the equation a = α_1 v_1 + α_2 v_2, we know the values of the scalars α_1, α_2 and the vector v_1, so we can solve for v_2. But first, let's rewrite α_1 as

α_1 = ⟨a, v_1⟩ = ⟨a, b/‖b‖⟩ = ⟨a, b⟩/‖b‖.

Then, solving for v_2, we have

v_2 = (1/α_2)(a − α_1 v_1) = (a − α_1 v_1)/α_2   (1.18)
    = ( a − (⟨a, b⟩/⟨b, b⟩) b ) / ‖ a − (⟨a, b⟩/⟨b, b⟩) b ‖,   (1.19)

where, since v_2 is a unit vector, α_2 = ‖a − α_1 v_1‖.

We have done several things here. First of all, we have determined formulae for an orthonormal basis of the plane spanned by {a, b}, namely

v_1 = b/‖b‖,  v_2 = ( a − (⟨a, b⟩/⟨b, b⟩) b ) / ‖ a − (⟨a, b⟩/⟨b, b⟩) b ‖.

Secondly, α_1 is the component of the vector a parallel to the vector b, while

α_1 v_1 = (⟨a, b⟩/⟨b, b⟩) b   (1.20)

is the orthogonal projection of a onto b. The orthogonal projection of a onto b may be denoted proj_b a.

Example 1.10 Let a = (1, 1) and b = (2, 1). We have

⟨a, b⟩ = 1 · 2 + 1 · 1 = 3

and

⟨b, b⟩ = 2 · 2 + 1 · 1 = 5.

So the projection of a onto b is

(⟨a, b⟩/⟨b, b⟩) b = (3/5)(2, 1) = (6/5, 3/5),

as depicted in Figure 1.5.

The projection of one vector onto another yields another geometric interpretation of the inner product, as depicted in Figure 1.6. It is straightforward to verify that

⟨a, b⟩ = ‖proj_b a‖ ‖b‖   (1.21)
       = ‖proj_a b‖ ‖a‖.   (1.22)

2 A little about matrices

Although we will not cover the details (yet), at first glance a matrix may seem a mere bookkeeping strategy, yet matrices are much more. At this point, it is perfectly acceptable to think of a matrix as a collection of numbers, organized by rows and

Figure 1.5: The orthogonal projection of a onto b.

Figure 1.6: The inner product ⟨a, b⟩ equals l times the length of b, where the length l is precisely ‖proj_b a‖.

columns (as with a spreadsheet). As with vectors, we denote a matrix by a single letter,

often a capital letter. For instance,

A = ( 2 3 )
    ( 7 1 )
    ( 0 4 )

is what we call a real 3 × 2 matrix; real since each entry is a real number, and 3 × 2 to denote that A has 3 rows and 2 columns. A real m × n matrix has m rows and n columns. We denote the (vector) space of all real m × n matrices by R^{m×n}.

When speaking of an arbitrary (non-specific) matrix A, we denote the corresponding entries by the lower-case counterpart with indices. For instance, if A ∈ R^{m×n}, we may express this in the form

A = ( a_11 a_12 a_13 ⋯ a_1n )
    ( a_21 a_22 a_23 ⋯ a_2n )
    (  ⋮    ⋮    ⋮       ⋮  )
    ( a_m1 a_m2 a_m3 ⋯ a_mn ).

That is, the entry in the i-th row and j-th column is denoted by a_ij; the first index is for the row and the second index for the column. If we wish to be more concise, we may write A = (a_ij), i = 1, ..., m, j = 1, ..., n. A matrix is called square if it has an equal number of columns and rows.

1.2.1 Determinant Formulas

As with vectors, we would like to associate to any given matrix a scalar which tells us something about the matrix. For vectors, we have norms and inner products, which yield information in the form of a scalar number. We can, in fact, define norms and inner products for matrices, but these do not suffice for our purposes. What we need is the determinant of a square matrix. The determinant of a square matrix tells us when a matrix is invertible or not (see the discussion in the previous section). More importantly for our purposes, the determinant yields useful geometric information.

Arguably, the determinant of a 2 × 2 matrix is the most important, as the formula for larger matrices relies on the 2 × 2 case. So, we'll start with 2 × 2. An arbitrary real 2 × 2 matrix takes the form

A = ( a b )
    ( c d ),

where a, b, c, d ∈ R. In this case, the determinant of A, denoted det(A), is

det(A) = ad − bc.   (1.23)

For example, if

A = ( 2 −3 )
    ( 1  7 ),

then det(a) = 2(7) ( 3)1 = 17. Now, suppose A R 3 3 is the matrix a 11 a 12 a 13 A = a 21 a 22 a 23. a 31 a 32 a 33 Then, the determinant of A is given by ( ) ( ) ( ) a22 a det(a) = a 11 det 23 a21 a a a 32 a 12 det 23 a21 a + a 33 a 31 a 13 det 22 33 a 31 a 32 = a 11 (a 22 a 33 a 32 a 23 ) a 12 (a 21 a 33 a 31 a 23 ) + a 13 (a 21 a 32 a 31 a 22 ). Let s take a closer look at (1.24) a 11 a 12 a 13 a 11 a 12 a 13 a 11 a 12 a 13 det(a) = a 11 det a 21 a 22 a 23 a 12 det a 21 a 22 a 23 + a 13 det a 21 a 22 a 23 a 31 a 32 a 33 a 31 a 32 a 33 a 31 a 32 a 33 (1.24) (1.25) Formula (1.25) computes the determinant of A by expansion along the first row. Notice that there are three terms, each one corresponding to an entry in the first row of A. For the first entry in row one, a 11, we remove row 1 and column 1 to obtain the submatrix ( ) a22 a 23. a 32 a 33 Then, we multiply a 11 by the determinant of this submatrix. For the next entry in row one, a 12, we remove row 1 and column 2 to obtain the submatrix ( ) a21 a 23. a 31 a 33 Now, we multiply a 12 by the determinant of this submatrix. However, this time there is a negative thrown in front. Why?... More importantly, what we should see is that the the indices for each a ij coefficient tell us which row and column to remove. So, for the last term in (1.25) corresponding to a 13, we remove row 1 and column 3 to obtain ( ) a21 a 22. a 31 a 32 Again, we compute the determinant of this 2 2 submatrix, then multiply by the corresponding coefficient a 13. Notice that this term does not have a negative in front. 16

Working our way from left to right along the first row, the sign in front of each term alternates between + and −. If there were more terms in the determinant formula, this alternating behavior would continue. Indeed, this must be taken into account when computing the determinant of a matrix larger than 3 × 3. However, knowing the determinant formula for matrices no larger than 3 × 3 is sufficient for our needs.

Example 1.11 Let

A = (  1  2  8 )
    (  0  1 −3 )
    ( −1  3  5 ).

We will compute det(A) by (1.24). We have

det(A) = 1 det( 1 −3 )  −  2 det(  0 −3 )  +  8 det(  0 1 )
              ( 3  5 )          ( −1  5 )          ( −1 3 )
       = 1(5 − (−9)) − 2(0 − 3) + 8(0 − (−1))
       = 14 + 6 + 8 = 28.

1.2.2 Determinant Geometry

So, what can the determinant tell us geometrically? We will first show that if a, b ∈ R^2, then the determinant of the matrix whose columns are a, b gives the area of the parallelogram they generate. First, let's assume one of the vectors is directed along a coordinate axis; namely, let a = (a_1, 0) and b = (b_1, b_2). Moreover, let's assume a_1, b_1, b_2 are all positive. We will compute the area of the parallelogram depicted in Figure 1.7.

Without referring to a known area formula, there are several ways to determine the area of the shaded region. One way is to decompose the parallelogram into a smaller rectangle and two right triangles (however, this only works if b_1 < a_1). One can also determine the area of a larger rectangle and subtract the area of two right triangles, which works no matter how a_1 and b_1 compare. Using this method, as depicted in Figure 1.8, we find that the area of the large rectangle is (a_1 + b_1) b_2, while each right triangle has area (1/2) b_1 b_2. Thus, the area of the parallelogram is

(a_1 + b_1) b_2 − b_1 b_2 = a_1 b_2.

Of course, in general, such a parallelogram may be in any quadrant, such as the parallelogram depicted in Figure 1.9. You should convince yourself that if the vectors a, b were rotated so that a is on the positive x-axis, the area of the parallelogram would not change. It turns out that if a = (a_1, a_2) and b = (b_1, b_2), the area of this parallelogram

Figure 1.7: Any two linearly independent vectors a, b ∈ R^2 generate a parallelogram.

Figure 1.8: The area of the parallelogram can be computed by subtracting the area of the shaded triangular regions from the rectangle of area (a_1 + b_1) b_2.

is given by

| det( a_1 b_1 ) |  =  | a_1 b_2 − a_2 b_1 |.
|     ( a_2 b_2 ) |

Figure 1.9: An arbitrary parallelogram in the plane, generated by two linearly independent vectors a, b ∈ R^2.

Note that the absolute value is necessary, so that a positive value is yielded for the area, regardless of the signs of the vector components.

1.2.3 Cross product, triple product

One of the primary reasons we need matrices is to define the determinant, which in turn we need to define the cross product of vectors.

Definition 1.12. Given two vectors a = (a_1, a_2, a_3), b = (b_1, b_2, b_3) in R^3, we define the cross product of a and b as the vector

a × b = (a_2 b_3 − a_3 b_2, a_3 b_1 − a_1 b_3, a_1 b_2 − a_2 b_1).   (1.26)

We will see later in the semester, after introducing differential forms, that the cross product is a particular type of wedge product. For this reason, the cross product may also be denoted a ∧ b.

A convenient method for computing the cross product is to compute

a × b = det( e_1 e_2 e_3 )
            ( a_1 a_2 a_3 )
            ( b_1 b_2 b_3 ).   (1.27)

Caution must be exercised when using formula (1.27)! When expanding the determinant, the e_1, e_2, e_3 are symbolic, in a sense; these vectors tell us which component the resulting scalars belong to.

Example 1.13 Let a = (1, −2, 1) and b = (2, 1, 1). To compute the cross product a × b, we compute

det( e_1 e_2 e_3 )
   (  1  −2   1 )  =  (−2 − 1)e_1 − (1 − 2)e_2 + (1 − (−4))e_3
   (  2   1   1 )
                   =  −3e_1 + e_2 + 5e_3 = (−3, 1, 5).

The following properties of the cross product hold:

C1) b × a = −(a × b)
C2) ⟨a × b, a⟩ = 0 = ⟨a × b, b⟩
C3) a, b are linearly dependent if and only if a × b = 0
C4) a × (b + c) = a × b + a × c
C5) (a + b) × c = a × c + b × c
C6) (αa) × b = α(a × b) = a × (αb), where α ∈ R.

Property C2) states that $a \times b$ is orthogonal to both $a$ and $b$. Going back to Example 1.13, one can directly verify that $a \times b = (-3, 1, 5)$ is orthogonal to both $a$ and $b$. These vectors are depicted in Figure 1.10.

Figure 1.10: The vectors $a$, $b$ and $a \times b$.

The standard basis vectors $\{e_1, e_2, e_3\}$ for $\mathbb{R}^3$ satisfy
\[
e_1 \times e_2 = e_3 \tag{1.28}
\]
\[
e_3 \times e_1 = e_2 \tag{1.29}
\]
\[
e_2 \times e_3 = e_1. \tag{1.30}
\]

Let $\varphi$ denote the angle between $a$ and $b$; then
\[
\|a \times b\| = \|a\|\,\|b\|\sin\varphi. \tag{1.31}
\]
Furthermore, if $a, b$ are linearly independent, (1.31) yields the area of the parallelogram generated by $a$ and $b$. Consider the parallelogram generated by two linearly independent vectors $a, b$ in $\mathbb{R}^3$, as depicted in Figure 1.11. As we did in the planar case, we can decompose the parallelogram into a rectangular region and two right triangular regions. Notice that the base of the right triangle corresponds precisely to $\mathrm{proj}_a b$, the projection of $b$ onto $a$, as depicted in Figure 1.12. Moreover, if $\varphi$ is the angle between $a$ and $b$, the height of this triangle is $\|b\|\sin\varphi$.

Definition 1.14. Given vectors $a, b, c \in \mathbb{R}^3$, the quantity $\langle a, b \times c \rangle$ is called the scalar triple product.

Figure 1.11: The parallelogram.

Figure 1.12: The right triangular portion of the parallelogram generated by $a$ and $b$, with base $\mathrm{proj}_a b$.

If $a, b, c \in \mathbb{R}^3$ are linearly independent, then the volume of the parallelepiped generated by $a, b, c$, as depicted in Figure 1.13, is given by $|\langle a, b \times c \rangle|$. Equivalently, the volume can be computed by the triple product $|\langle c, a \times b \rangle|$.

1.2.4 Lines and Planes

In this section we will derive various equations of lines and planes in space. Whether or not we distinguish notationally, we must be careful to conceptually distinguish between points and vectors in $\mathbb{R}^3$.
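The equality of the two triple-product expressions for the volume can be checked on a concrete parallelepiped. This sketch is not part of the original notes and assumes NumPy; the particular vectors are chosen only for illustration.

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 2.0, 0.0])
c = np.array([0.0, 1.0, 3.0])

vol1 = abs(np.dot(a, np.cross(b, c)))   # |<a, b x c>|
vol2 = abs(np.dot(c, np.cross(a, b)))   # |<c, a x b>|, the same volume
# The triple product is also the determinant of the matrix with columns a, b, c.
vol3 = abs(np.linalg.det(np.column_stack([a, b, c])))
print(vol1, vol2, vol3)   # all three agree
```

All three expressions give the volume 6 for this choice of vectors.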

Figure 1.13: The parallelepiped generated by $a$, $b$, $c$.

Given two vectors $a$ and $b$, the vector starting at the tip of $a$ and ending at the tip of $b$ is $b - a$.

Recall that we can define a line in the plane by specifying two points, a point and a slope, or a slope and an intercept. For instance, the line through $(a_1, a_2)$ and $(b_1, b_2)$ is
\[
y = a_2 + \frac{b_2 - a_2}{b_1 - a_1}(x - a_1). \tag{1.32}
\]
Notice that the graph of this line goes through the tips of the vectors $a = (a_1, a_2)$ and $b = (b_1, b_2)$.

In addition to the form given above, we can express the same line in parametric form; that is, in terms of a real parameter $t$ on which both coordinates may depend. Rather than specifying a point and a slope, we can specify a point and a direction. Since the line goes through the tips of $a$ and $b$, the line is parallel to the vector $b - a$ (equivalently, $a - b$). The parametric equation
\[
(x(t), y(t)) = (a_1, a_2) + t(b_1 - a_1, b_2 - a_2), \tag{1.33}
\]
where $t \in \mathbb{R}$, defines the same line in terms of vectors. The algebraic equivalence of this equation and (1.32) follows from setting $t = \frac{x - a_1}{b_1 - a_1}$. Making this substitution in (1.32) yields $y = a_2 + t(b_2 - a_2)$, which is exactly the second coordinate of (1.33). Similarly, if $t = \frac{x - a_1}{b_1 - a_1}$, then $x = a_1 + t(b_1 - a_1)$. Since both the $x$ and $y$ coordinates depend on the parameter $t$, this dependence is denoted in the parametric equation (1.33). We define lines in space by similar means.
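The equivalence of the slope form (1.32) and the parametric form (1.33) can be checked numerically. This sketch is not part of the original notes and assumes NumPy; the points $a$ and $b$ are arbitrary illustrative choices.

```python
import numpy as np

a = np.array([1.0, 2.0])   # (a1, a2)
b = np.array([4.0, 7.0])   # (b1, b2)

def slope_form(x):
    # Equation (1.32): y = a2 + (b2 - a2)/(b1 - a1) * (x - a1)
    return a[1] + (b[1] - a[1]) / (b[0] - a[0]) * (x - a[0])

def parametric(t):
    # Equation (1.33): (x(t), y(t)) = a + t(b - a)
    return a + t * (b - a)

# Every sample of the parametric line satisfies the slope-form equation.
ok = all(np.isclose(parametric(t)[1], slope_form(parametric(t)[0]))
         for t in np.linspace(-2.0, 2.0, 9))
print(ok)   # True
```

Note that $t = 0$ and $t = 1$ recover the tips of $a$ and $b$ respectively.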

Equation of line through a point in direction of a vector: The line $l$ through the point $(x_1, x_2, x_3)$ in the direction $a = (a_1, a_2, a_3)$ is given by the equation
\[
l(t) = (x_1, x_2, x_3) + t(a_1, a_2, a_3) \quad \text{for } t \in \mathbb{R}. \tag{1.34}
\]

In this case, the line $l$ may be expressed as $l(t) = (x(t), y(t), z(t))$, in terms of its coordinate functions. Thus, the coordinate functions $x(t), y(t), z(t)$ are given by
\[
x(t) = x_1 + t a_1, \quad y(t) = x_2 + t a_2, \quad z(t) = x_3 + t a_3.
\]
Although the line $l$ goes through the point corresponding to the vector $x = (x_1, x_2, x_3)$, it is also common to express this line as
\[
l(t) = x + t a, \quad t \in \mathbb{R}, \tag{1.35}
\]
since the coordinate expressions are equivalent.

Now, if we want the line passing through the tips of two vectors $a$ and $b$, we need the line pointed in the direction $b - a$ (or $a - b$). This line passes through the points corresponding to the vectors $a$ and $b$.

Equation of line through tips of vectors: The line $l$ through the tips of the vectors $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$ is given by the equation
\[
l(t) = a + t(b - a) \quad \text{for } t \in \mathbb{R}. \tag{1.36}
\]

In terms of the coordinate functions, $l$ has the form
\[
l(t) = (x(t), y(t), z(t)) = (a_1 + t(b_1 - a_1),\ a_2 + t(b_2 - a_2),\ a_3 + t(b_3 - a_3)). \tag{1.37}
\]
The same line is defined by any one of the following expressions:
\[
l(t) = a + t(a - b), \quad l(t) = b + t(b - a), \quad l(t) = b + t(a - b).
\]
In terms of an infinite line, any one of these equations will typically suffice. However, if we want a line segment (that is, a finite portion of a line) we often want the segment directed as we see fit. A line segment can be obtained by restricting the values of the parameter $t$. For instance, the line segment beginning at the tip of $a$ and ending at the tip of $b$ is given by $l(t) = a + t(b - a)$, for $t \in [0, 1]$. Notice that $l(0) = a$ and $l(1) = b$. The line segment directed from $b$ to $a$ may be obtained by traversing $l(t) = a + t(b - a)$ backwards. That is, take $\tilde{l}(t) = l(1 - t) = a + (1 - t)(b - a)$.

Then, $\tilde{l}(0) = b$ and $\tilde{l}(1) = a$.

There are various ways of describing any given plane in $\mathbb{R}^3$. For instance, the $x,y$-plane may be expressed as the set of points in $\mathbb{R}^3$ for which the $z$-coordinate is 0,
\[
\{(x, y, 0) \mid x, y \in \mathbb{R}\}.
\]
More formally, this means the $x,y$-plane is the span of $\{e_1, e_2\}$. Similarly, if we fix $z = \alpha$, where $\alpha$ is a fixed real number, the set $\{(x, y, \alpha) \mid x, y \in \mathbb{R}\}$ is a plane in $\mathbb{R}^3$, parallel to the $x,y$-plane and shifted vertically by $\alpha$.

This seems simple enough, but what if we took the $x,y$-plane and tilted it about the origin? The result would still be a plane, but it would not be as straightforward to describe by simply fixing one of the coordinates. We need to know what distinguishing characteristics can be used to fully and unambiguously describe a plane in $\mathbb{R}^3$.

One such characteristic is any vector which is orthogonal to the plane. For instance, any vector in the $x,y$-plane is orthogonal to the $z$-axis (more particularly, orthogonal to the vector $e_3$). Any vector which is parallel to $e_3$ is also orthogonal to the $x,y$-plane. However, this characteristic alone is not enough to uniquely define the $x,y$-plane. The vector $e_3$ (or any parallel vector) is also orthogonal to all of the planes $\{(x, y, \alpha) \mid x, y \in \mathbb{R}\}$, where $\alpha$ is fixed. We need more information to completely define a plane.

Specifying a point contained in the plane, along with an orthogonal vector, provides enough information to completely define a plane in $\mathbb{R}^3$. For instance, there is only one plane in $\mathbb{R}^3$ which is orthogonal to $e_3$ and contains the point $(1, -2, 0)$: the $x,y$-plane. Suppose we want to determine an equation defining the plane in $\mathbb{R}^3$ which contains the point $(x_1, x_2, x_3)$ and which is orthogonal to the vector $a = (a_1, a_2, a_3)$. Let $(x, y, z)$ be any other point in this plane. Then, the vector $b = (x - x_1, y - x_2, z - x_3)$, directed from one point to the other, must lie on this plane.
Since the vector $a$ is orthogonal to all vectors on this plane, we must have $\langle a, b \rangle = 0$, hence
\[
a_1(x - x_1) + a_2(y - x_2) + a_3(z - x_3) = 0. \tag{1.38}
\]
Any point $(x, y, z)$ on this plane must satisfy equation (1.38).

Equation of plane orthogonal to given vector: The plane $P$ in $\mathbb{R}^3$ containing the point $(x_1, x_2, x_3)$ which is orthogonal to the vector $a = (a_1, a_2, a_3)$ satisfies the equation
\[
a_1(x - x_1) + a_2(y - x_2) + a_3(z - x_3) = 0. \tag{1.39}
\]

Suppose $(p_1, p_2, p_3)$ is a point in space and let $p$ denote the corresponding vector. We would like to determine the distance between this point and the plane $P$ defined by (1.39). If $(x_1, x_2, x_3)$ is a point on $P$, let $u = (x_1 - p_1, x_2 - p_2, x_3 - p_3)$ denote the vector directed from $(p_1, p_2, p_3)$ to $(x_1, x_2, x_3)$. The distance from $p$ to $P$ is the norm of the projection of $u$ onto $a$. We have
\[
\mathrm{proj}_a u = \frac{\langle u, a \rangle}{\langle a, a \rangle} a,
\]
\[
\langle \mathrm{proj}_a u, \mathrm{proj}_a u \rangle = \frac{\langle x - p, a \rangle^2}{\langle a, a \rangle^2} \langle a, a \rangle = \frac{\langle x - p, a \rangle^2}{\|a\|^2},
\]
where $x$ is the vector with coordinates $(x_1, x_2, x_3)$. Hence, the distance from $p = (p_1, p_2, p_3)$ to the plane $P$ is given by
\[
\frac{|\langle x - p, a \rangle|}{\|a\|}. \tag{1.40}
\]
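Formulas (1.39) and (1.40) can be exercised on a plane whose geometry is obvious, such as the $x,y$-plane itself. This sketch is not part of the original notes and assumes NumPy; the specific normal vector and points are illustrative choices.

```python
import numpy as np

a = np.array([0.0, 0.0, 2.0])     # a normal vector to the xy-plane
x0 = np.array([1.0, -2.0, 0.0])   # a point lying on the plane
p = np.array([3.0, 4.0, 5.0])     # the point whose distance we want

def on_plane(q):
    # Equation (1.39): <a, q - x0> = 0.
    return np.isclose(np.dot(a, q - x0), 0.0)

# Distance formula (1.40): |<x0 - p, a>| / ||a||.
dist = abs(np.dot(x0 - p, a)) / np.linalg.norm(a)

print(on_plane(np.array([7.0, 7.0, 0.0])))   # True: any point with z = 0
print(dist)                                   # 5.0, the height of p above the plane
```

The computed distance is 5, matching the $z$-coordinate of $p$ above the $x,y$-plane.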

Chapter 2

Multivariable functions

We are interested in defining functions of more than one variable, for the purpose of calculus in multiple variables (vector calculus). A real-valued function $f$ in $n$ variables $x_1, x_2, \ldots, x_n$ is a map which sends $n$ real numbers (the inputs) to a single real number (the output), wherever $f$ is defined. We denote the fact that $f$ accepts $n$ variables as inputs and outputs a single real number by $f : \mathbb{R}^n \to \mathbb{R}$, which is read as "$f$ maps $\mathbb{R}^n$ to $\mathbb{R}$." For example, $f(x, y) = x^2 + y^2$ is a real-valued function of 2 variables, so we would write $f : \mathbb{R}^2 \to \mathbb{R}$.

Definition 2.1. A real-valued function, $f$, from $\mathbb{R}^n$ to $\mathbb{R}$ is a mapping sending each $n$-tuple of real numbers to a single real number, wherever defined. In this case, we write $f : \mathbb{R}^n \to \mathbb{R}$.

Equivalently, if $f : \mathbb{R}^n \to \mathbb{R}$, $f$ sends a vector $x \in \mathbb{R}^n$ to a real number, which we denote as $f(x)$. Thus, if $x = (x_1, x_2, \ldots, x_n)$, the output may be expressed as $f(x_1, x_2, \ldots, x_n)$ or $f(x)$. In fact, we have already dealt with a function which maps vectors to scalars, namely the norm. If we temporarily define $f : \mathbb{R}^3 \to \mathbb{R}$ by
\[
f(a) = f(a_1, a_2, a_3) = \sqrt{a_1^2 + a_2^2 + a_3^2},
\]
then $f(a) = \|a\|$ is the real-valued function which determines the length of a given vector $a$.

We will also deal with maps which output multiple variables, in addition to accepting multiple inputs. If we write $g : \mathbb{R}^n \to \mathbb{R}^m$, then $g$ accepts $n$ real numbers as inputs and outputs $m$ real numbers. When $m = 1$, $g$ is exactly a real-valued function defined on $\mathbb{R}^n$, as above.

Definition 2.2. A vector-valued function, $g$, from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a mapping sending each $n$-tuple of real numbers to an $m$-tuple of real numbers, wherever defined. In this case,

we write $g : \mathbb{R}^n \to \mathbb{R}^m$. Here, the inputs and the outputs can be regarded as vectors. If $x \in \mathbb{R}^n$, then the output $g(x)$ is a vector in $\mathbb{R}^m$.

Remark 2.3. Before discussing numerous examples, let's address one technical detail. When we write $g : \mathbb{R}^n \to \mathbb{R}^m$, we do not necessarily mean that the domain of $g$ is all of $\mathbb{R}^n$, nor do we mean that the range of $g$ is all of $\mathbb{R}^m$. Rather, if $g : \mathbb{R}^n \to \mathbb{R}^m$, then the domain of $g$ is a subset of $\mathbb{R}^n$ and the range of $g$ is a subset of $\mathbb{R}^m$. Also note that, whenever necessary, we will denote the domain of a function $g$ by $\mathrm{dom}(g)$.

The equation of a line in $\mathbb{R}^3$ is an example of a vector-valued function. Specifically, the line $l : \mathbb{R} \to \mathbb{R}^3$ defined by
\[
l(t) = (x_1 + t a_1,\ x_2 + t a_2,\ x_3 + t a_3) = (x(t), y(t), z(t))
\]
is a function which maps a single real number $t$ to a vector $(x(t), y(t), z(t))$ in $\mathbb{R}^3$.

Before diving deeper into the subject of vector-valued functions, we will discuss real-valued functions in more detail. Eventually, we want to develop a calculus for multivariable functions, both real- and vector-valued. As in single variable calculus, we want to develop limits, differentiation, extreme values, integration, etc., and see how these ideas come up in applications. First, how can we visualize a function of more than one variable? In particular, what does the graph of a function $f : \mathbb{R}^n \to \mathbb{R}$ look like?

Definition 2.4. If $f : \mathbb{R}^n \to \mathbb{R}$ is a real-valued function, the graph of $f$ is the subset of $\mathbb{R}^{n+1}$ defined by
\[
\mathrm{graph}(f) = \{(x_1, x_2, \ldots, x_n, f(x_1, x_2, \ldots, x_n)) \mid x_1, \ldots, x_n \in \mathbb{R}\}. \tag{2.1}
\]

Equivalently, if we express the input as a vector $x = (x_1, \ldots, x_n)$, then the graph of $f$ may be expressed more concisely as
\[
\mathrm{graph}(f) = \{(x, f(x)) \mid x \in \mathbb{R}^n\}. \tag{2.2}
\]
Since $x \in \mathbb{R}^n$ and $f(x) \in \mathbb{R}$, $\mathrm{graph}(f)$ is indeed a subset of $\mathbb{R}^{n+1}$.

Many of our examples will be functions from $\mathbb{R}^2$ to $\mathbb{R}$. If $f : \mathbb{R}^2 \to \mathbb{R}$, then
\[
\mathrm{graph}(f) = \{(x, y, f(x, y)) \mid x, y \in \mathbb{R}\}
\]
is a subset of $\mathbb{R}^3$. In this particular case, a point $(x, y, z)$ in $\mathbb{R}^3$ lies on the graph of $f$ if $z = f(x, y)$.
Thus, the function value $f(x, y)$ determines the $z$ component, or height.

2.0.1 Level Sets

One particular method for determining graphical information about $f$ is to determine the level sets of $f$. Determining the level sets of $f$ is analogous to determining $x$ and $y$ intercepts of a single variable function $h : \mathbb{R} \to \mathbb{R}$.

Definition 2.5. If $f : \mathbb{R}^n \to \mathbb{R}$ and $\alpha \in \mathbb{R}$, the $\alpha$-level set of $f$ is the set of $(x_1, \ldots, x_n)$ such that $f(x_1, \ldots, x_n) = \alpha$. If we denote the $\alpha$-level set of $f$ by $L_\alpha(f)$, then
\[
L_\alpha(f) = \{(x_1, \ldots, x_n) \mid f(x_1, \ldots, x_n) = \alpha\}.
\]

If $f : \mathbb{R}^2 \to \mathbb{R}$, the $\alpha$-level set of $f$ is the set
\[
L_\alpha(f) = \{(x, y) \in \mathbb{R}^2 \mid f(x, y) = \alpha\},
\]
which is often called a level curve in this particular case. If $f : \mathbb{R}^3 \to \mathbb{R}$, the $\alpha$-level set of $f$,
\[
L_\alpha(f) = \{(x, y, z) \in \mathbb{R}^3 \mid f(x, y, z) = \alpha\},
\]
is called a level surface. For a function $f : \mathbb{R}^2 \to \mathbb{R}$, the $\alpha$-level curve of $f$ corresponds to setting $z = \alpha$.

Example 2.6 Let $z = f(x, y) = \sqrt{x^2 + y^2}$. The 0-level curve, $L_0(f)$, is the set of $(x, y) \in \mathbb{R}^2$ for which $\sqrt{x^2 + y^2} = 0$. The equation $\sqrt{x^2 + y^2} = 0$, equivalently $x^2 + y^2 = 0$, is only satisfied for $(x, y) = (0, 0)$. This tells us that the point $(0, 0, 0)$ (i.e., the origin) is on the graph of $f$ and, moreover, it is the only point on the graph with $z$ coordinate 0.

2.0.2 Sections

In addition to determining level sets of a multivariable function $f$, we can determine more graphical information by determining cross-sections of the graph. A cross-section (or slice) of the graph of $f$ is the intersection of a plane with $\mathrm{graph}(f)$. For level sets, we determine the intersection of $\mathrm{graph}(f)$ with a plane parallel to the $xy$-plane. In contrast, for cross-sections we determine the intersection of $\mathrm{graph}(f)$ with a vertical plane. In this context, a vertical plane is any plane which is not parallel to the $xy$-plane.

For example, the intersection of $\mathrm{graph}(f)$ with the $yz$-plane is a cross-section. The $yz$-plane is the set $\{(0, y, z) \mid y, z \in \mathbb{R}\}$ and the intersection with $\mathrm{graph}(f)$ is the set
\[
\{(0, y, f(0, y)) \mid y \in \mathbb{R}\}. \tag{2.3}
\]

Figure 2.1: The 0, 1 and 2 level curves of $f(x, y) = \sqrt{x^2 + y^2}$.

Figure 2.2: The 0, 1 and 2 level sets of $f(x, y) = \sqrt{x^2 + y^2}$ lifted to the corresponding $z$ values.

This cross-section is precisely the portion of the graph which lies on the $yz$-plane. Another common cross-section is the intersection of $\mathrm{graph}(f)$ with the $xz$-plane. The $xz$-plane is the set $\{(x, 0, z) \mid x, z \in \mathbb{R}\}$ and the intersection with $\mathrm{graph}(f)$ is the set
\[
\{(x, 0, f(x, 0)) \mid x \in \mathbb{R}\}, \tag{2.4}
\]
which is the portion of the graph which lies on the $xz$-plane.

Typically, level sets give enough information that, along with one or two cross-sections, we can sketch the graph of $f$ fairly accurately.

Example (2.6 Continued). Returning to Example 2.6, for which $f(x, y) = \sqrt{x^2 + y^2}$, let's determine the cross-sections of $\mathrm{graph}(f)$ with the $yz$ and $xz$ planes. A point on the graph

of $f$ is of the form $(x, y, \sqrt{x^2 + y^2})$. For the $yz$-plane cross-section, we simply substitute $x = 0$, which yields the set of points
\[
\{(0, y, \sqrt{y^2}) \mid y \in \mathbb{R}\} = \{(0, y, |y|) \mid y \in \mathbb{R}\}.
\]
This is the graph of the absolute value function, $z = |y|$, placed on the $yz$-plane. Similarly, the $xz$-plane cross-section can be obtained by substituting $y = 0$, which yields the set of points
\[
\{(x, 0, \sqrt{x^2}) \mid x \in \mathbb{R}\} = \{(x, 0, |x|) \mid x \in \mathbb{R}\}.
\]
This is the graph of the absolute value function, $z = |x|$, placed on the $xz$-plane. These cross-sections, along with the previously determined level curves, are depicted in Figure 2.3. A full surface plot of $f(x, y) = \sqrt{x^2 + y^2}$ can be seen in Figure 2.4.

Figure 2.3: The 0, 1 and 2 level sets, along with the $xz$ and $yz$ cross-sections of $\mathrm{graph}(f)$. The graph of $f(x, y) = \sqrt{x^2 + y^2}$ is a cone.

Figure 2.4: A surface plot of $z = f(x, y)$.
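The level curves and cross-sections of the cone in Example 2.6 can be confirmed numerically. This sketch is not part of the original notes and assumes NumPy.

```python
import numpy as np

def f(x, y):
    # The cone of Example 2.6.
    return np.sqrt(x**2 + y**2)

# Level curves: every point on the circle of radius alpha maps to alpha.
alpha = 2.0
theta = np.linspace(0.0, 2.0 * np.pi, 100)
on_circle = f(alpha * np.cos(theta), alpha * np.sin(theta))
print(np.allclose(on_circle, alpha))   # True

# yz cross-section (x = 0) is z = |y|; xz cross-section (y = 0) is z = |x|.
y = np.linspace(-3.0, 3.0, 50)
print(np.allclose(f(0.0, y), np.abs(y)))   # True
x = np.linspace(-3.0, 3.0, 50)
print(np.allclose(f(x, 0.0), np.abs(x)))   # True
```

This matches the sketch in Figure 2.3: circular level curves, with absolute-value cross-sections.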

Example 2.7 Let $g : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[
g(x, y) = \tfrac{1}{3}\sqrt{36 - 4x^2 - y^2}.
\]
The 0-level set, $L_0(g)$, is the set of $(x, y)$ for which $0 = \tfrac{1}{3}\sqrt{36 - 4x^2 - y^2}$, hence $4x^2 + y^2 = 36$. Equivalently,
\[
\tfrac{1}{9}x^2 + \tfrac{1}{36}y^2 = 1,
\]
which is the equation of an ellipse with radii $r_x = 3$ and $r_y = 6$. The $\alpha$-level set, $L_\alpha(g)$, is characterized by the equation $3\alpha = \sqrt{36 - 4x^2 - y^2}$, hence $4x^2 + y^2 = 36 - 9\alpha^2$. Equivalently, for $(x, y)$ to be in $L_\alpha(g)$ we must have
\[
\frac{4}{36 - 9\alpha^2}x^2 + \frac{1}{36 - 9\alpha^2}y^2 = 1.
\]
Again, this is the equation of an ellipse, but with radii
\[
r_x = \frac{\sqrt{36 - 9\alpha^2}}{2} = \frac{3}{2}\sqrt{4 - \alpha^2}, \qquad r_y = \sqrt{36 - 9\alpha^2} = 3\sqrt{4 - \alpha^2}.
\]
From this we see that we must have $0 \le \alpha \le 2$: the square root is nonnegative, and $36 - 9\alpha^2 \ge 0$ forces $|\alpha| \le 2$. For instance, if $\alpha = 1$ the corresponding ellipse has radii $r_x = \tfrac{3}{2}\sqrt{3}$ and $r_y = 3\sqrt{3}$.

For the $xz$-plane cross-section, where $y = 0$, we have $z = \tfrac{1}{3}\sqrt{36 - 4x^2}$, which becomes the ellipse $\tfrac{1}{9}x^2 + \tfrac{1}{4}z^2 = 1$ with radii $r_x = 3$ and $r_z = 2$ in the $xz$-plane. Similarly, the $yz$-plane cross-section is characterized by $z = \tfrac{1}{3}\sqrt{36 - y^2}$, corresponding to the ellipse $\tfrac{1}{36}y^2 + \tfrac{1}{4}z^2 = 1$ with radii $r_y = 6$ and $r_z = 2$ in the $yz$-plane. The corresponding cross-sections, level sets and surface plot are depicted in Figures 2.5-2.7.
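The level-set radii derived in Example 2.7 can be verified by sampling the claimed ellipse and evaluating $g$ on it. This sketch is not part of the original notes and assumes NumPy.

```python
import numpy as np

def g(x, y):
    return np.sqrt(36.0 - 4.0 * x**2 - y**2) / 3.0

alpha = 1.0
rx = 1.5 * np.sqrt(4.0 - alpha**2)   # (3/2) sqrt(4 - alpha^2)
ry = 3.0 * np.sqrt(4.0 - alpha**2)   # 3 sqrt(4 - alpha^2)

# Sample the ellipse with these radii and evaluate g on it.
theta = np.linspace(0.0, 2.0 * np.pi, 100)
x, y = rx * np.cos(theta), ry * np.sin(theta)
print(np.allclose(g(x, y), alpha))   # True: this ellipse is exactly the 1-level set
```

Every sampled point of the ellipse evaluates to $\alpha = 1$, confirming the radii formulas.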

Figure 2.5: The $xz$ and $yz$ cross-sections of $\mathrm{graph}(g)$. The graph of $g(x, y) = \tfrac{1}{3}\sqrt{36 - 4x^2 - y^2}$ is the upper half of an ellipsoid.

Figure 2.6: The 0, 1 and 1.5 level sets of $\mathrm{graph}(g)$. The graph of $g(x, y) = \tfrac{1}{3}\sqrt{36 - 4x^2 - y^2}$ is the upper half of an ellipsoid.

2.1 Derivatives

Some of the techniques used for visualizing multivariable functions can help us make sense of derivatives/rates of change for multivariable functions. In particular, if $f : \mathbb{R}^2 \to \mathbb{R}$ is a real-valued function and we slice the graph of $f$, we obtain a single variable function. The intersection of $\mathrm{graph}(f)$ with a plane parallel to the $yz$-plane yields a curve depending only on $z$ and $y$. Equivalently, taking $x$ to be a fixed number yields the corresponding single variable function. The intersection of $\mathrm{graph}(f)$ with a plane parallel to the $xz$-plane yields a curve depending only on $z$ and $x$, which corresponds to the single variable function obtained by fixing $y$.

In the same manner, we may regard $x$ as fixed and then compute a derivative as we would in single variable calculus. That is, if $x$ is not changing, $f(x, y)$ is a function depending only on $y$, so we may compute the instantaneous rate of change of $f$ with respect to $y$ by
\[
\lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}. \tag{2.5}
\]

Figure 2.7: A surface plot of $z = g(x, y)$.

If the limit in (2.5) exists, we call this the partial derivative of $f$ with respect to $y$, at the point $(x, y)$, which we denote by $\frac{\partial f}{\partial y}(x, y)$. Whenever the point at which a partial derivative is evaluated need not be specified, it is conventional to denote the partial derivative as simply $\frac{\partial f}{\partial y}$. Similarly, if we regard $y$ as fixed, $f(x, y)$ depends only on changes in $x$. The instantaneous rate of change of $f$ with respect to $x$ is given by
\[
\lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}. \tag{2.6}
\]
If the limit in (2.6) exists, this is the partial derivative of $f$ with respect to $x$, at the point $(x, y)$, which is denoted by $\frac{\partial f}{\partial x}(x, y)$, or $\frac{\partial f}{\partial x}$ whenever it is unnecessary to specify the point.

Example 2.8 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the real-valued function $f(x, y) = x^2\cos y$. By definition,

we have
\[
\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{(x + h)^2\cos y - x^2\cos y}{h} = \lim_{h \to 0} \frac{(2xh + h^2)\cos y}{h} = \lim_{h \to 0}(2x + h)\cos y = 2x\cos y.
\]
Notice that the same result is obtained by regarding $y$ as a constant and differentiating as usual, using the power rule in this case. In the same way, we may compute $\frac{\partial f}{\partial y}$ without resorting to limits. Thus,
\[
\frac{\partial f}{\partial y} = -x^2\sin y.
\]

The definition of partial derivatives is analogous for any real-valued function $f : \mathbb{R}^n \to \mathbb{R}$.

Definition 2.9. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a real-valued function. The partial derivative of $f$ with respect to $x_j$, at the point $(x_1, \ldots, x_n)$, is defined by
\[
\frac{\partial f}{\partial x_j} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_j + h, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}, \tag{2.7}
\]
which is the result of differentiating $f$ with respect to $x_j$, regarding all other variables as fixed. Whenever it is necessary to denote the point at which the partial derivative is evaluated, we denote (2.7) by $\frac{\partial f}{\partial x_j}(x_1, \ldots, x_n)$.

The limit definition of $\frac{\partial f}{\partial x_j}$ can also be expressed as
\[
\frac{\partial f}{\partial x_j} = \lim_{h \to 0} \frac{f(x + h e_j) - f(x)}{h}, \tag{2.8}
\]
where $e_j = (0, \ldots, 1, \ldots, 0)$, with the 1 in the $j$th component, is the $j$th standard basis vector for $\mathbb{R}^n$. We denote the operation of partial differentiation with respect to the $j$th variable by $\frac{\partial}{\partial x_j}$.
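The limit computation in Example 2.8 can be reproduced symbolically. This sketch is not part of the original notes and assumes SymPy is available; it computes the partial derivatives and also evaluates the limit definition (2.6) directly.

```python
import sympy as sp

x, y, h = sp.symbols('x y h')
f = x**2 * sp.cos(y)

fx = sp.diff(f, x)   # differentiate, treating y as constant
fy = sp.diff(f, y)   # differentiate, treating x as constant
print(fx)            # 2*x*cos(y)
print(fy)            # -x**2*sin(y)

# The limit definition (2.6) gives the same result as the power rule:
fx_limit = sp.limit(((x + h)**2 * sp.cos(y) - f) / h, h, 0)
print(sp.simplify(fx_limit - fx))   # 0
```

The symbolic limit agrees with the shortcut of holding $y$ fixed and applying the power rule.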

By only considering the rate of change of $f$ as the variable $x_j$ changes, we are computing the rate of change in a direction parallel to the respective coordinate axis. For example, if $f : \mathbb{R}^2 \to \mathbb{R}$, then $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are rates of change parallel to the $x$-axis and $y$-axis, respectively. That is, if we consider a particle moving along the graph of $f$, then we only know how much the height ($z$-coordinate) of the particle changes as the particle moves parallel to the $x$ and $y$ axes.

Example 2.10 Let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by $f(x, y, z) = a^{\sqrt{z^2 - xy}}$, where $a > 0$ is a constant. Recall the exponential derivative formula
\[
\frac{d}{dt}a^t = \ln(a)\,a^t.
\]
Then,
\[
\frac{\partial f}{\partial x} = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{\partial}{\partial x}\left(\sqrt{z^2 - xy}\right) = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{1}{2\sqrt{z^2 - xy}}\,\frac{\partial}{\partial x}\left(z^2 - xy\right) = -\frac{\ln(a)\,y\,a^{\sqrt{z^2 - xy}}}{2\sqrt{z^2 - xy}}.
\]
Similarly, we have
\[
\frac{\partial f}{\partial y} = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{\partial}{\partial y}\left(\sqrt{z^2 - xy}\right) = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{1}{2\sqrt{z^2 - xy}}\,\frac{\partial}{\partial y}\left(z^2 - xy\right) = -\frac{\ln(a)\,x\,a^{\sqrt{z^2 - xy}}}{2\sqrt{z^2 - xy}}
\]
and
\[
\frac{\partial f}{\partial z} = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{\partial}{\partial z}\left(\sqrt{z^2 - xy}\right) = \ln(a)\,a^{\sqrt{z^2 - xy}}\,\frac{1}{2\sqrt{z^2 - xy}}\,\frac{\partial}{\partial z}\left(z^2 - xy\right) = \frac{\ln(a)\,z\,a^{\sqrt{z^2 - xy}}}{\sqrt{z^2 - xy}}.
\]
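The chain-rule computations in Example 2.10 are easy to get wrong by hand, so a symbolic check is worthwhile. This sketch is not part of the original notes and assumes SymPy.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
a = sp.symbols('a', positive=True)
u = sp.sqrt(z**2 - x*y)
f = a**u

# Compare SymPy's derivatives against the hand-computed expressions.
fx = sp.diff(f, x)
expected_fx = -sp.log(a) * y * a**u / (2 * u)
print(sp.simplify(fx - expected_fx))   # 0

fz = sp.diff(f, z)
expected_fz = sp.log(a) * z * a**u / u
print(sp.simplify(fz - expected_fz))   # 0
```

Both differences simplify to zero, matching the formulas derived in the example.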

Of course, we can take multiple partial derivatives of a function $f$, whenever the corresponding limits are defined, in the same way that we compute multiple derivatives of a single variable function. For instance, if $f : \mathbb{R}^n \to \mathbb{R}$ we can differentiate with respect to $x_j$, then differentiate with respect to $x_k$. If the corresponding limit exists, this is a second order partial derivative of $f$, denoted by
\[
\frac{\partial}{\partial x_k}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial^2 f}{\partial x_k\,\partial x_j}.
\]
Notice that the order in which we differentiate is from right to left, or inside to outside. If $k = j$, we denote this as
\[
\frac{\partial^2 f}{\partial x_j^2},
\]
which is the second partial derivative of $f$ with respect to $x_j$. If $f : \mathbb{R}^2 \to \mathbb{R}$, there are 4 second order partial derivatives, namely
\[
\frac{\partial^2 f}{\partial x^2}, \quad \frac{\partial^2 f}{\partial y\,\partial x}, \quad \frac{\partial^2 f}{\partial x\,\partial y}, \quad \frac{\partial^2 f}{\partial y^2}.
\]

Remark 2.11. Depending on the properties of the real-valued function $f$, the order in which we take partial derivatives may matter! For a function $f : \mathbb{R}^2 \to \mathbb{R}$, we may have
\[
\frac{\partial^2 f}{\partial y\,\partial x} \neq \frac{\partial^2 f}{\partial x\,\partial y}
\]
in general. However, we will discuss a particular class of functions for which these mixed partial derivatives are equal.

Example (2.8 continued). Consider $f(x, y) = x^2\cos y$. In Example 2.8 we computed the first partials
\[
\frac{\partial f}{\partial x} = 2x\cos y, \qquad \frac{\partial f}{\partial y} = -x^2\sin y.
\]
The mixed second derivatives are
\[
\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}(2x\cos y) = -2x\sin y
\]
and
\[
\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}(-x^2\sin y) = -2x\sin y,
\]

so $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$ in this case. The remaining second derivatives are
\[
\frac{\partial^2 f}{\partial x^2} = 2\cos y \qquad \text{and} \qquad \frac{\partial^2 f}{\partial y^2} = -x^2\cos y.
\]

Now, if $f : \mathbb{R}^n \to \mathbb{R}^m$ is a vector-valued function, we can define partial derivatives similarly. Since the output $f(x)$ is a vector in $\mathbb{R}^m$ for each $x \in \mathbb{R}^n$, it is convenient to introduce the notation
\[
f(x) = (f^1(x), f^2(x), \ldots, f^m(x)),
\]
where each $f^i : \mathbb{R}^n \to \mathbb{R}$ is a real-valued function, $i = 1, \ldots, m$. We call $f^i$ the $i$th coordinate function of $f$. Rather than defining derivatives of $f$ directly, we can use existing derivative definitions for each of the $m$ coordinate functions. Since each coordinate function maps $\mathbb{R}^n$ to $\mathbb{R}$, there are $n$ partial derivatives for each of the $m$ coordinate functions.

Remark 2.12. The coordinate functions can also be denoted using subscripts, $f_i$, rather than superscripts, $f^i$. However, the use of subscripts can be confusing when we opt to denote partial derivatives with subscripts. For example, we will represent the partial derivative of the 2nd coordinate function with respect to $x_3$ by
\[
\frac{\partial f^2}{\partial x_3}.
\]

Definition 2.13. If $f : \mathbb{R}^n \to \mathbb{R}^m$ is a vector-valued function, the Jacobian of $f$ at the point $x$ is the $m \times n$ matrix
\[
Df(x) = \begin{pmatrix}
\frac{\partial f^1}{\partial x_1} & \frac{\partial f^1}{\partial x_2} & \cdots & \frac{\partial f^1}{\partial x_n} \\
\frac{\partial f^2}{\partial x_1} & \frac{\partial f^2}{\partial x_2} & \cdots & \frac{\partial f^2}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f^m}{\partial x_1} & \frac{\partial f^m}{\partial x_2} & \cdots & \frac{\partial f^m}{\partial x_n}
\end{pmatrix}, \tag{2.9}
\]
where each partial derivative is evaluated at the point $x$. The $(i, j)$-entry in $Df(x)$ is $\frac{\partial f^i}{\partial x_j}$. The Jacobian of $f$ is also called, simply, the matrix of partial derivatives of $f$.
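The mixed-partial computation in the continuation of Example 2.8 can be checked symbolically. This sketch is not part of the original notes and assumes SymPy.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * sp.cos(y)

fyx = sp.diff(sp.diff(f, x), y)   # d/dy (df/dx)
fxy = sp.diff(sp.diff(f, y), x)   # d/dx (df/dy)
print(fyx)   # -2*x*sin(y)
print(fxy)   # -2*x*sin(y): the mixed partials agree for this f

print(sp.diff(f, x, 2))   # 2*cos(y)
print(sp.diff(f, y, 2))   # -x**2*cos(y)
```

All four second order partial derivatives match the hand computations above.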

The Jacobian of a real-valued function $f : \mathbb{R}^n \to \mathbb{R}$ has a special name: the gradient of $f$, denoted $\mathrm{grad}\,f$ or $\nabla f$. In accordance with the above definition, the gradient of $f$ is the row vector
\[
\mathrm{grad}\,f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right). \tag{2.10}
\]
If $p$ is a point, we denote the gradient of $f$ evaluated at $p$ by $\mathrm{grad}_p f$, or $\nabla f(p)$.

Example 2.14 If $r : \mathbb{R}^3 \to \mathbb{R}$ is defined by $r(x, y, z) = \sqrt{x^2 + y^2 + z^2}$, then
\[
\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \quad \frac{\partial r}{\partial y} = \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \quad \frac{\partial r}{\partial z} = \frac{z}{\sqrt{x^2 + y^2 + z^2}},
\]
hence
\[
\mathrm{grad}\,r = \left(\frac{x}{\sqrt{x^2 + y^2 + z^2}}, \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).
\]
If $p = (1, 2, -1)$, then $\mathrm{grad}_p r = \frac{1}{\sqrt{6}}(1, 2, -1)$.

Example 2.15 Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the vector-valued function defined by
\[
f(x, y) = (x^2\cos y,\ y^2\sin x).
\]
In this case, we have two coordinate functions
\[
f^1(x, y) = x^2\cos y \quad \text{and} \quad f^2(x, y) = y^2\sin x.
\]
The corresponding partial derivatives are
\[
\frac{\partial f^1}{\partial x} = 2x\cos y, \quad \frac{\partial f^1}{\partial y} = -x^2\sin y
\]
and
\[
\frac{\partial f^2}{\partial x} = y^2\cos x, \quad \frac{\partial f^2}{\partial y} = 2y\sin x.
\]
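Examples 2.14 and 2.15 can both be reproduced symbolically. This sketch is not part of the original notes and assumes SymPy; it computes the gradient of $r$ at $p = (1, 2, -1)$ and assembles the Jacobian of Example 2.15.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Gradient of r(x, y, z) = sqrt(x^2 + y^2 + z^2), as in Example 2.14.
r = sp.sqrt(x**2 + y**2 + z**2)
grad_r = sp.Matrix([[sp.diff(r, v) for v in (x, y, z)]])
at_p = grad_r.subs({x: 1, y: 2, z: -1})
print(at_p)   # the vector (1, 2, -1) scaled by 1/sqrt(6)

# Jacobian of f(x, y) = (x^2 cos y, y^2 sin x), as in Example 2.15.
F = sp.Matrix([x**2 * sp.cos(y), y**2 * sp.sin(x)])
J = F.jacobian(sp.Matrix([x, y]))
print(J)   # rows: (2x cos y, -x^2 sin y) and (y^2 cos x, 2y sin x)
```

The printed Jacobian entries match the four partial derivatives computed in Example 2.15.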