
Functions of Several Variables

A function of several variables is just what it sounds like. It may be viewed in at least three different ways. We will use a function of two variables as an example. The function $z = f(x, y)$ may be viewed as a function of the two independent variables $x$ and $y$. It may be viewed as a function defined at different points $(x, y)$ in the plane. It may be viewed as a function whose domain is the set of vectors $\langle x, y \rangle$ or $x\mathbf{i} + y\mathbf{j}$.

Limits of Functions of Several Variables

We define a limit of a function of several variables essentially the same way we define a limit for an ordinary function. In the second definition, $\mathbf{x}$ and $\mathbf{c}$ are points (or vectors) and $|\mathbf{x} - \mathbf{c}|$ is the distance between them.

Definition 1 (Limit). $\lim_{x \to c} f(x) = L$ if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that $|f(x) - L| < \varepsilon$ whenever $0 < |x - c| < \delta$.

Definition 2 (Limit). $\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L$ if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that $|f(\mathbf{x}) - L| < \varepsilon$ whenever $0 < |\mathbf{x} - \mathbf{c}| < \delta$.

Properties of Limits

Rule of Thumb: If a property of limits makes sense when translated to refer to a limit of a function of several variables, then it is valid for a function of several variables.

For example, the limit of a sum will be the sum of the limits, the limit of a difference will be the difference of the limits, the limit of a product will be the product of the limits, and the limit of a quotient will be the quotient of the limits, provided the limit of the denominator is not 0.

Continuity

The definition of continuity for a function of several variables is essentially the same as the definition for an ordinary function.

Definition 3 (Continuity). A function $f$ is continuous at $c$ if $\lim_{x \to c} f(x) = f(c)$.

Definition 4 (Continuity for a Function of Several Variables). A function $f$ is continuous at $\mathbf{c}$ if $\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = f(\mathbf{c})$.

As with ordinary functions, functions of several variables will generally be continuous except where there's an obvious reason for them not to be.
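One way to see what an "obvious reason" can look like is a quotient whose denominator vanishes. The sketch below uses an example that is not from the text, $f(x, y) = xy/(x^2 + y^2)$, and evaluates it along straight lines $y = mx$ approaching the origin. Along each such line the value is the constant $m/(1 + m^2)$, so different lines give different values and no limit can exist at $(0, 0)$. The names `f` and `value_along_line` are illustrative.

```python
def f(x, y):
    """Example function (an assumption, not from the text): xy / (x^2 + y^2)."""
    return x * y / (x * x + y * y)


def value_along_line(m, x):
    """Evaluate f at the point (x, m*x) on the line y = m*x."""
    return f(x, m * x)


# Approach the origin along three different lines and print the values seen.
for m in (0.0, 1.0, 2.0):
    values = [value_along_line(m, 10.0 ** (-k)) for k in range(1, 6)]
    print(f"slope {m}: values near the origin = {values}")
```

Because the value depends on the slope of the approach path, no single number $L$ can satisfy Definition 2 at the origin, so this $f$ cannot be made continuous there.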

Partial Derivatives

For a function of several variables, we have partial derivatives with respect to each of its variables. The definition is based on the definition of an ordinary derivative.

Definition 5 (Derivative). Let $f : \mathbb{R} \to \mathbb{R}$.
$$\frac{df}{dx}(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

Definition 6 (Partial Derivative). Let $f : \mathbb{R}^2 \to \mathbb{R}$.
$$\frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \qquad \frac{\partial f}{\partial y}(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$$

The obvious generalizations hold for functions with more than two independent variables.

Calculation of Partial Derivatives

Effectively, we calculate the partial derivative of a function with respect to one of its independent variables by acting as if the other independent variables were actually constants.

Notation

The following notations for the partial derivatives of a function $z = f(x, y)$ are equivalent.
$$f_x = \frac{\partial f}{\partial x} = \frac{\partial z}{\partial x} = f_1 = D_1 f = D_x f$$
$$f_y = \frac{\partial f}{\partial y} = \frac{\partial z}{\partial y} = f_2 = D_2 f = D_y f$$

Higher Order Derivatives

Since a partial derivative is itself a function of several variables, it has its own partial derivatives.
$$(f_x)_y = f_{xy} = f_{12} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}$$
$$(f_y)_x = f_{yx} = f_{21} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}$$

Changing the Order of Differentiation

Theorem 1 (Clairaut's Theorem). If $f_{xy}$ and $f_{yx}$ are both continuous on a disk containing $(a, b)$, then $f_{xy}(a, b) = f_{yx}(a, b)$.
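As a numerical illustration of Definition 6 and of the statement of Theorem 1, the following sketch estimates partial derivatives with difference quotients. The sample function $f(x, y) = x^2 \sin y$ and all helper names are assumptions made for this example. It uses central differences rather than the one-sided quotient in the definition, purely for better accuracy.

```python
import math


def f(x, y):
    """Sample function chosen for illustration: f(x, y) = x^2 * sin(y)."""
    return x * x * math.sin(y)


def partial_x(func, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative in x (y held fixed)."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative in y (x held fixed)."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


def mixed_xy(func, x, y, h=1e-4):
    """Estimate f_xy = (f_x)_y: differentiate in x first, then in y."""
    return partial_y(lambda s, t: partial_x(func, s, t, h), x, y, h)


def mixed_yx(func, x, y, h=1e-4):
    """Estimate f_yx = (f_y)_x: differentiate in y first, then in x."""
    return partial_x(lambda s, t: partial_y(func, s, t, h), x, y, h)


a, b = 1.5, 0.7
print("f_x estimate:", partial_x(f, a, b), " exact:", 2 * a * math.sin(b))
print("f_y estimate:", partial_y(f, a, b), " exact:", a * a * math.cos(b))
print("f_xy estimate:", mixed_xy(f, a, b), " f_yx estimate:", mixed_yx(f, a, b))
```

For this smooth sample function both mixed estimates should agree with the hand-computed value $2x \cos y$ up to discretization and rounding error.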

Proof. Let $\varphi(h) = f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)$. The motivation comes from writing either $f_{xy}$ or $f_{yx}$ as a limit.

We may write $\varphi(h) = \alpha(y + h) - \alpha(y)$, where $\alpha(t) = f(x + h, t) - f(x, t)$. The Mean Value Theorem implies $\alpha(y + h) - \alpha(y) = \alpha'(t)h$ for some $t$ between $y$ and $y + h$. Since $\alpha'(t) = f_2(x + h, t) - f_2(x, t)$, we have $\varphi(h) = [f_2(x + h, t) - f_2(x, t)]h$.

If we write $\beta(s) = f_2(s, t)$, then $f_2(x + h, t) - f_2(x, t) = \beta(x + h) - \beta(x)$. By the Mean Value Theorem, $\beta(x + h) - \beta(x) = \beta'(s)h$ for some $s$ between $x$ and $x + h$. Since $\beta'(s) = f_{21}(s, t)$, we get $f_2(x + h, t) - f_2(x, t) = f_{21}(s, t)h$, so $\varphi(h) = f_{21}(s, t)h^2$.

Thus $\dfrac{\varphi(h)}{h^2} = f_{21}(s, t) \to f_{21}(x, y)$ as $h \to 0$, since $f_{21}$ is continuous at $(x, y)$.

A similar calculation, with a possibly different intermediate point $(s, t)$, shows $\dfrac{\varphi(h)}{h^2} = f_{12}(s, t) \to f_{12}(x, y)$ as $h \to 0$, showing $f_{12}(x, y) = f_{21}(x, y)$.

Tangent Planes

Consider a surface $z = f(x, y)$ and suppose we are interested in the plane tangent to the surface at the point $(a, b, c)$, where $c = f(a, b)$.

Since $\frac{\partial z}{\partial x}$ represents about how much $z$ will change if $x$ changes by 1 and $y$ is fixed, it seems reasonable to expect the vector $\langle 1, 0, \frac{\partial z}{\partial x} \rangle$ to be tangent to the surface. (Here, and elsewhere as we look at tangent planes, tangent plane approximations and differentials, the partial derivative shown really means the partial derivative's value at the relevant point, in this case $(a, b)$.)

Similarly, it is reasonable to expect the vector $\langle 0, 1, \frac{\partial z}{\partial y} \rangle$ to be tangent to the surface.

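We thus expect
$$\mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial z}{\partial x} \\ 0 & 1 & \frac{\partial z}{\partial y} \end{vmatrix} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$$
to be a normal vector to the tangent plane. We thus take $\mathbf{n} = \left\langle -\frac{\partial z}{\partial x}, -\frac{\partial z}{\partial y}, 1 \right\rangle$.

We thus get $\left\langle -\frac{\partial z}{\partial x}, -\frac{\partial z}{\partial y}, 1 \right\rangle \cdot \langle x - a, y - b, z - c \rangle = 0$ as an equation for the tangent plane, or
$$-\frac{\partial z}{\partial x}(x - a) - \frac{\partial z}{\partial y}(y - b) + (z - c) = 0,$$
or
$$z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
This should be reminiscent of the Point-Slope Formula for the equation of a line.

Tangent Hyperplanes

It generalizes to
$$y - b = \sum_{i=1}^{n} \frac{\partial y}{\partial x_i}(x_i - a_i)$$
as an equation for the hyperplane tangent to the hypersurface $y = f(x_1, x_2, \ldots, x_n)$ at the point $(a_1, a_2, \ldots, a_n, b)$.

Tangent Plane Approximations and Differentials

If we take $z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b)$ and solve for $z$, we get
$$z = c + \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
This should be reminiscent of the Tangent Line Approximation for ordinary functions. We may use this formula to approximate $f(x, y)$ at a point $(x, y)$ close to a point $(a, b)$.

Definition 7 (Differentials).
$$dx = \Delta x = x - a, \qquad dy = \Delta y = y - b, \qquad dz = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$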
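The tangent plane approximation can be tried on a concrete surface. The sketch below is an illustration under assumptions: the sample surface $z = \sqrt{x^2 + y^2}$, the base point $(3, 4)$, and the helper names are all chosen here and do not come from the text. It compares the tangent plane value $c + dz$ with the true value of $f$ at a nearby point.

```python
import math


def f(x, y):
    """Sample surface chosen for illustration: z = sqrt(x^2 + y^2)."""
    return math.sqrt(x * x + y * y)


def partials_f(x, y):
    """Partial derivatives of the sample f, computed by hand: (x/r, y/r)."""
    r = math.sqrt(x * x + y * y)
    return x / r, y / r


def tangent_plane(a, b, x, y):
    """z = c + f_x(a, b)(x - a) + f_y(a, b)(y - b), where c = f(a, b)."""
    c = f(a, b)
    fx, fy = partials_f(a, b)
    return c + fx * (x - a) + fy * (y - b)


a, b = 3.0, 4.0      # base point, where f(a, b) = 5
x, y = 3.02, 3.97    # nearby point, so dx = 0.02 and dy = -0.03

approx = tangent_plane(a, b, x, y)
exact = f(x, y)
print("tangent plane value:", approx)
print("true value:         ", exact)
print("difference:         ", abs(approx - exact))
```

The difference is small compared with the size of $dx$ and $dy$, which is the sense in which the approximation is good close to $(a, b)$.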

We may use the differential $dz$ to approximate the change $\Delta z = \Delta f$ of a function $f(x, y)$ if the independent variables $x$ and $y$ change by amounts $dx$ and $dy$. This generalizes in the obvious way to functions of more than two variables.

Differentiability

Recall that for an ordinary function $y = f(x)$ which was differentiable at a point, we found
$$\frac{dy - \Delta y}{\Delta x} \to 0 \text{ as } \Delta x \to 0.$$
We take the analogue of this as a definition of differentiability for functions of several variables. We state the definition for the case of a function of two variables; the variation for more variables should be obvious.

Definition 8 (Differentiable). We say a function $f(x, y)$ is differentiable at a point if
$$\frac{dz - \Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \to 0 \text{ as } \sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0.$$

Recall $\sqrt{(\Delta x)^2 + (\Delta y)^2}$ is the distance between $(x, y)$ and the point in question. Effectively, we are defining a function of several variables to be differentiable when an approximation using differentials is reasonable.

We still need a reasonable way of determining whether a function is differentiable. This is given by the following theorem.

Theorem 2. If both partial derivatives of a function $z = f(x, y)$ are continuous in some open disc $\{(x, y) : (x - a)^2 + (y - b)^2 < r\}$ centered at $(a, b)$, then $f(x, y)$ is differentiable at $(a, b)$.

Proof. We need to show
$$\frac{dz - \Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \to 0 \text{ as } \sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0.$$
With the partial derivatives in $dz$ evaluated at $(a, b)$, we may write
$$\Delta z - dz = f(x, y) - f(a, b) - \left(\frac{\partial z}{\partial x}(a, b)(x - a) + \frac{\partial z}{\partial y}(a, b)(y - b)\right)$$
$$= f(x, y) - f(a, y) - \frac{\partial z}{\partial x}(a, b)(x - a) + f(a, y) - f(a, b) - \frac{\partial z}{\partial y}(a, b)(y - b).$$

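By the Mean Value Theorem, $f(x, y) - f(a, y) = \frac{\partial z}{\partial x}(x^*, y)(x - a)$ for some $x^*$ between $a$ and $x$ if $x$ is close enough to $a$. Similarly, $f(a, y) - f(a, b) = \frac{\partial z}{\partial y}(a, y^*)(y - b)$ for some $y^*$ between $b$ and $y$ if $y$ is close enough to $b$.

We thus get
$$\Delta z - dz = \frac{\partial z}{\partial x}(x^*, y)(x - a) - \frac{\partial z}{\partial x}(a, b)(x - a) + \frac{\partial z}{\partial y}(a, y^*)(y - b) - \frac{\partial z}{\partial y}(a, b)(y - b)$$
$$= \left(\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}(a, b)\right)(x - a) + \left(\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}(a, b)\right)(y - b).$$

Since both $\dfrac{|x - a|}{\sqrt{(x - a)^2 + (y - b)^2}} \le 1$ and $\dfrac{|y - b|}{\sqrt{(x - a)^2 + (y - b)^2}} \le 1$, we have
$$\left|\frac{\left(\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}(a, b)\right)(x - a)}{\sqrt{(x - a)^2 + (y - b)^2}}\right| \le \left|\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}(a, b)\right| \to 0$$
and
$$\left|\frac{\left(\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}(a, b)\right)(y - b)}{\sqrt{(x - a)^2 + (y - b)^2}}\right| \le \left|\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}(a, b)\right| \to 0,$$
since both partial derivatives are continuous near $(a, b)$. Adding the two terms shows that the quotient in the definition of differentiability tends to 0.

The Chain Rule

For an ordinary function, if $y = f(u)$ and $u = g(x)$, making $y = (f \circ g)(x)$ a composite function, we can differentiate with respect to $x$ using the Chain Rule: $\dfrac{dy}{dx} = \dfrac{dy}{du}\dfrac{du}{dx}$.

Suppose we have a function $z = f(x, y)$, but $x = g(t)$ and $y = h(t)$, making $z = f(g(t), h(t))$ a composite function of $t$. We can come up with a variation of the Chain Rule, which holds under appropriate conditions. The conditions we will assume are that all the relevant derivatives exist and are continuous near $t$ and all the relevant partial derivatives exist and are continuous near $(g(t), h(t))$.

By the definition of a derivative,
$$\frac{dz}{dt} = \lim_{k \to 0} \frac{f(g(t + k), h(t + k)) - f(g(t), h(t))}{k}.$$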

We can rewrite the numerator as
$$f(g(t + k), h(t + k)) - f(g(t), h(t)) = [f(g(t + k), h(t + k)) - f(g(t), h(t + k))] + [f(g(t), h(t + k)) - f(g(t), h(t))].$$

Using the Mean Value Theorem, the first difference may be written
$$f(g(t + k), h(t + k)) - f(g(t), h(t + k)) = f_1(u, h(t + k))[g(t + k) - g(t)],$$
where $u$ is between $g(t + k)$ and $g(t)$. But, also by the Mean Value Theorem, $g(t + k) - g(t) = g'(t^*)k$, where $t^*$ is between $t$ and $t + k$. We thus have
$$f(g(t + k), h(t + k)) - f(g(t), h(t + k)) = f_1(u, h(t + k))\,g'(t^*)\,k.$$
Similarly,
$$f(g(t), h(t + k)) - f(g(t), h(t)) = f_2(g(t), v)\,h'(t^{**})\,k,$$
where $v$ is between $h(t)$ and $h(t + k)$ and $t^{**}$ is between $t$ and $t + k$.

We thus get
$$\frac{dz}{dt} = \lim_{k \to 0} \frac{f_1(u, h(t + k))\,g'(t^*)\,k + f_2(g(t), v)\,h'(t^{**})\,k}{k}$$
$$= \lim_{k \to 0} \left[ f_1(u, h(t + k))\,g'(t^*) + f_2(g(t), v)\,h'(t^{**}) \right] = f_1(g(t), h(t))\,g'(t) + f_2(g(t), h(t))\,h'(t).$$

Using Leibniz Notation, this may be written as
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$
This is one variation of the Chain Rule.

Partial Derivatives Via the Chain Rule

Suppose $z = f(x, y)$, while $x = g(s, t)$ and $y = h(s, t)$. Then $z = f(g(s, t), h(s, t))$ can be thought of as a function of $s$ and $t$. We might then want to calculate the partial derivatives $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.

By the nature of partial differentiation, the Chain Rule we just derived can be adjusted to give formulas for these partial derivatives.
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$$
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}$$
If we have functions involving more than two variables, this may be adjusted in the hopefully obvious way.

Implicit Differentiation

The Chain Rule may be used to derive a formula for implicit differentiation.
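Since the derivation of that formula rests on the Chain Rule, a quick numerical sanity check of the first variation, $dz/dt = f_1 \cdot g'(t) + f_2 \cdot h'(t)$, may be useful before going on. The sketch below makes its own choices: the sample functions $f(x, y) = x^2 y$, $g(t) = \cos t$, $h(t) = \sin t$ and the helper names are assumptions for illustration. It compares the Chain Rule value with a central difference quotient of the composite function.

```python
import math


def f(x, y):
    """Sample outer function (illustrative): f(x, y) = x^2 * y."""
    return x * x * y


def g(t):
    """Sample inner function x = g(t) = cos(t)."""
    return math.cos(t)


def h(t):
    """Sample inner function y = h(t) = sin(t)."""
    return math.sin(t)


def chain_rule_derivative(t):
    """dz/dt = f_1(g(t), h(t)) g'(t) + f_2(g(t), h(t)) h'(t), derivatives by hand."""
    x, y = g(t), h(t)
    f1 = 2 * x * y   # partial derivative of f in its first variable
    f2 = x * x       # partial derivative of f in its second variable
    return f1 * (-math.sin(t)) + f2 * math.cos(t)


def difference_quotient(t, k=1e-6):
    """Central difference quotient for the composite z(t) = f(g(t), h(t))."""
    return (f(g(t + k), h(t + k)) - f(g(t - k), h(t - k))) / (2 * k)


t0 = 0.8
print("Chain Rule value:   ", chain_rule_derivative(t0))
print("difference quotient:", difference_quotient(t0))
```

The two printed values should agree to several decimal places; the small discrepancy comes from the finite step `k`.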

Theorem 3 (Implicit Differentiation). If a differentiable function $y = f(x)$ is defined implicitly by an equation $F(x, y) = 0$, then, wherever $\frac{\partial F}{\partial y} \ne 0$,
$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{\partial F / \partial x}{\partial F / \partial y}.$$

Note: We have assumed $y = f(x)$ is differentiable. We are not here dealing with how one knows whether such a function is differentiable. In general, if such a function is not differentiable, it will be relatively obvious.

Proof. Using the Chain Rule,
$$\frac{dF}{dx} = \frac{\partial F}{\partial x}\frac{dx}{dx} + \frac{\partial F}{\partial y}\frac{dy}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx}.$$
Since $F(x, y) = 0$, it follows that $\frac{dF}{dx} = 0$, so $\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0$. Solving for $\frac{dy}{dx}$, we get $\frac{\partial F}{\partial y}\frac{dy}{dx} = -\frac{\partial F}{\partial x}$, so $\frac{dy}{dx} = -\dfrac{\partial F / \partial x}{\partial F / \partial y}$.

Directional Derivatives

Consider a function $z = f(x, y)$ and its graph, which will be a surface. The partial derivative $\frac{\partial z}{\partial x}$ may be thought of as representing how fast the surface is rising above one's head if one is walking on the $xy$-plane in the direction of the $x$-axis. Similarly, the partial derivative $\frac{\partial z}{\partial y}$ may be thought of as representing how fast the surface is rising above one's head if one is walking on the $xy$-plane in the direction of the $y$-axis.

For a given unit vector $\mathbf{u}$, we define the directional derivative $D_{\mathbf{u}} z$ to represent how fast the surface is rising above one's head if one is walking on the $xy$-plane in the direction of $\mathbf{u}$.

Definition 9 (Directional Derivative). Let $f : \mathbb{R}^n \to \mathbb{R}$ and let $\mathbf{u} \in \mathbb{R}^n$ be a unit vector. Let $g(t) = f(\mathbf{x} + \mathbf{u}t)$. Then $D_{\mathbf{u}} f(\mathbf{x}) = g'(0)$ is called the directional derivative of $f$ at $\mathbf{x}$ in the direction of $\mathbf{u}$.
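Definition 9 can be turned directly into a numerical estimate: form $g(t) = f(\mathbf{x} + \mathbf{u}t)$ and approximate $g'(0)$ with a difference quotient. The sketch below is illustrative only. The sample function $f(x, y) = x^2 + 3xy$, the point $(1, 2)$, and the helper `directional_derivative` are assumptions, and the input direction is normalized inside the helper so that $\mathbf{u}$ is a unit vector.

```python
import math


def f(point):
    """Sample function on R^2 (illustrative): f(x, y) = x^2 + 3xy."""
    x, y = point
    return x * x + 3 * x * y


def directional_derivative(func, point, direction, h=1e-6):
    """Estimate g'(0) for g(t) = func(point + t*u) with a central difference."""
    norm = math.sqrt(sum(c * c for c in direction))
    u = [c / norm for c in direction]          # unit vector in the given direction
    forward = [p + h * c for p, c in zip(point, u)]
    backward = [p - h * c for p, c in zip(point, u)]
    return (func(forward) - func(backward)) / (2 * h)


p = [1.0, 2.0]
# In the coordinate directions the result should match the partial derivatives,
# which for this f are f_x = 2x + 3y and f_y = 3x.
print("direction (1, 0):", directional_derivative(f, p, [1.0, 0.0]))
print("direction (0, 1):", directional_derivative(f, p, [0.0, 1.0]))
print("direction (3, 4):", directional_derivative(f, p, [3.0, 4.0]))
```

The helper works for any dimension $n$, since it only uses coordinatewise arithmetic on the point and the direction.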

Note that if $n = 1$, then the directional derivative is the same as the ordinary derivative, while the directional derivatives in the directions of the coordinate axes are the same as the partial derivatives.

The Del Operator and the Gradient

Definition 10 (Del Operator). $\nabla = \left( \dfrac{\partial}{\partial x}, \dfrac{\partial}{\partial y} \right)$

Note this is really just a symbolic entity. By itself, it is meaningless, but we use it as a mnemonic device.

Definition 11 (Gradient). $\operatorname{grad} f = \nabla f = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y} \right)$

The gradient turns out to be convenient when calculating directional derivatives. It also generalizes to higher dimensions.

Calculating Directional Derivatives

Theorem 4. If all the partial derivatives of $z = f(\mathbf{x})$ are continuous in some open ball centered at $\mathbf{x}$, then $D_{\mathbf{u}} f(\mathbf{x}) = (\nabla f) \cdot \mathbf{u}$.

This theorem gives us a convenient way to calculate any directional derivative of a function and also shows that it is sufficient to be able to calculate all the partial derivatives.

We will prove the theorem for $\mathbb{R}^2$, but a similar proof will work for higher dimensions; only the notation would get messier.

Proof. Consider a function $f(x, y)$ and a unit vector $\mathbf{u} = \langle a, b \rangle$. Let $z = g(t)$ be defined by letting $z = f(x, y)$, where $x = x_0 + at$, $y = y_0 + bt$. By definition, $D_{\mathbf{u}} f(x_0, y_0) = g'(0)$. By the Chain Rule,
$$g'(t) = \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = \frac{\partial z}{\partial x}\,a + \frac{\partial z}{\partial y}\,b = (\nabla z) \cdot \mathbf{u}.$$
Evaluating this at 0 gives the result.

Maximum Value of the Directional Derivative

$D_{\mathbf{u}} f = (\nabla f) \cdot \mathbf{u} = |\nabla f|\,|\mathbf{u}| \cos\theta = |\nabla f| \cos\theta$, where $\theta$ is the angle between $\nabla f$ and $\mathbf{u}$ and $|\mathbf{u}| = 1$.

Since $-1 \le \cos\theta \le 1$, the maximal value occurs when $\theta = 0$ and $\cos\theta = 1$, in other words, when $\mathbf{u}$ is in the same direction as $\nabla f$.

There's a catch: This depends on the property $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}|\,|\mathbf{v}| \cos\theta$, which we've seen for $\mathbb{R}^2$ and $\mathbb{R}^3$, but whose very meaning is unclear for higher dimensions.
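In $\mathbb{R}^2$, where the $\cos\theta$ formula is already available, both Theorem 4 and the maximality claim can be explored numerically. The sketch below is a rough check under assumptions: the sample function $f(x, y) = x^2 + 3xy$, the point $(1, 2)$, and the helper names are chosen here for illustration. It evaluates $(\nabla f) \cdot \mathbf{u}$ for unit vectors sampled around the circle and compares the best direction found with the direction of the gradient.

```python
import math


def grad_f(x, y):
    """Gradient of the sample function f(x, y) = x^2 + 3xy, computed by hand."""
    return (2 * x + 3 * y, 3 * x)


def directional_derivative_from_gradient(grad, u):
    """Theorem 4: D_u f = (grad f) . u for a unit vector u."""
    return grad[0] * u[0] + grad[1] * u[1]


def unit_vector(theta):
    """Unit vector <cos(theta), sin(theta)> at angle theta from the x-axis."""
    return (math.cos(theta), math.sin(theta))


grad = grad_f(1.0, 2.0)
grad_norm = math.hypot(grad[0], grad[1])

# Sample 3600 unit vectors (every tenth of a degree) and keep the best one.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(
    angles,
    key=lambda th: directional_derivative_from_gradient(grad, unit_vector(th)),
)
best_value = directional_derivative_from_gradient(grad, unit_vector(best_angle))

print("largest sampled D_u f:", best_value, " |grad f|:", grad_norm)
print("best sampled angle:", best_angle, " angle of grad f:", math.atan2(grad[1], grad[0]))
```

Up to the sampling resolution, the largest directional derivative equals $|\nabla f|$ and is attained in the direction of $\nabla f$.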

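Cauchy-Schwarz Inequality

We can give $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}|\,|\mathbf{v}| \cos\theta$ meaning through the Cauchy-Schwarz Inequality
$$|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}|\,|\mathbf{v}|.$$
We will show the Cauchy-Schwarz Inequality holds in any dimension, with equality holding if and only if one vector is a multiple of the other.

Consider a vector $\mathbf{u} - t\mathbf{v}$. Certainly $(\mathbf{u} - t\mathbf{v}) \cdot (\mathbf{u} - t\mathbf{v}) \ge 0$, with equality holding if and only if $\mathbf{u} = t\mathbf{v}$.

Since $(\mathbf{u} - t\mathbf{v}) \cdot (\mathbf{u} - t\mathbf{v}) = \mathbf{u} \cdot \mathbf{u} - 2t\,\mathbf{u} \cdot \mathbf{v} + t^2\,\mathbf{v} \cdot \mathbf{v} = |\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2$, we get
$$|\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2 \ge 0.$$

If $\mathbf{v} = \mathbf{0}$, the inequality to be proved is trivial. Otherwise, it follows that the quadratic equation $|\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2 = 0$ in $t$ can't have more than one solution, so the discriminant $(-2\,\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2 |\mathbf{u}|^2$ can't be positive.

In other words, $(-2\,\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2 |\mathbf{u}|^2 \le 0$, so $4(\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2 |\mathbf{u}|^2 \le 0$, so $(\mathbf{u} \cdot \mathbf{v})^2 - |\mathbf{v}|^2 |\mathbf{u}|^2 \le 0$, so $(\mathbf{u} \cdot \mathbf{v})^2 \le |\mathbf{v}|^2 |\mathbf{u}|^2$, so $|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}|\,|\mathbf{v}|$.

Equality clearly holds if and only if either $\mathbf{u} - t\mathbf{v} = \mathbf{0}$ for some $t$ or $\mathbf{v} = \mathbf{0}$, in other words, if and only if either $\mathbf{u}$ is a scalar multiple of $\mathbf{v}$ or $\mathbf{v} = \mathbf{0}$.

Cauchy-Schwarz and Directional Derivatives

Since $|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}|\,|\mathbf{v}|$, it follows that for nonzero vectors
$$-1 \le \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|} \le 1.$$
We may thus define the angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ by
$$\theta = \arccos\left( \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|} \right).$$
It follows that $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}|\,|\mathbf{v}| \cos\theta$, so the argument we used before about the directional derivative being maximal in the direction of the gradient can legitimately be used.

Tangent Planes and Gradients

Recall the formula for the plane tangent to the surface $z = f(x, y)$ at a point $(a, b)$:
$$z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
Using the language of gradients, this could be written in the form $z - c = (\nabla f) \cdot \langle x - a, y - b \rangle$ or $z - c = (\nabla f) \cdot (\mathbf{x} - \mathbf{x}_0)$, where $\mathbf{x} = \langle x, y \rangle$ and $\mathbf{x}_0 = \langle a, b \rangle$.

One standard form for the equation of a plane is $\mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0) = 0$, with $\mathbf{n}$ being a normal to the plane and the vectors now taken in $\mathbb{R}^3$. Moving everything to one side, the tangent plane equation becomes $\left\langle \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, -1 \right\rangle \cdot \langle x - a, y - b, z - c \rangle = 0$. It follows that $\left\langle \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}, -1 \right\rangle$, which is the gradient of the function $f(x, y) - z$ of three variables, is normal to the tangent plane.

Tangent Planes for Surfaces Defined Implicitly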

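Suppose a surface is the graph of an equation $\varphi(x, y, z) = 0$. At most points (where there is a tangent plane and the tangent plane isn't vertical), a portion of the surface near the point can be considered the graph of a function $z = f(x, y)$ defined implicitly by the equation $\varphi(x, y, z) = 0$ along with some side conditions.

By the formula for implicit differentiation,
$$\frac{\partial z}{\partial x} = -\frac{\partial \varphi / \partial x}{\partial \varphi / \partial z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{\partial \varphi / \partial y}{\partial \varphi / \partial z},$$
so the equation of the tangent plane may be written
$$z - c = -\frac{\partial \varphi / \partial x}{\partial \varphi / \partial z}(x - a) - \frac{\partial \varphi / \partial y}{\partial \varphi / \partial z}(y - b).$$
Simplifying:
$$\frac{\partial \varphi}{\partial z}(z - c) = -\frac{\partial \varphi}{\partial x}(x - a) - \frac{\partial \varphi}{\partial y}(y - b),$$
$$\frac{\partial \varphi}{\partial x}(x - a) + \frac{\partial \varphi}{\partial y}(y - b) + \frac{\partial \varphi}{\partial z}(z - c) = 0.$$
This can also be written in the form $(\nabla \varphi) \cdot \langle x - a, y - b, z - c \rangle = 0$.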
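A concrete surface makes these formulas easy to test. The sketch below is an illustration under assumptions: the sphere $\varphi(x, y, z) = x^2 + y^2 + z^2 - 14 = 0$, the point $(1, 2, 3)$ on it, and all helper names are chosen here and are not from the text. It compares the implicit differentiation formulas with difference quotients of the explicit upper hemisphere $z = \sqrt{14 - x^2 - y^2}$, and evaluates the left side of the tangent plane equation.

```python
import math


def phi(x, y, z):
    """Sample implicit surface (illustrative): the sphere x^2 + y^2 + z^2 - 14 = 0."""
    return x * x + y * y + z * z - 14.0


def grad_phi(x, y, z):
    """Gradient of the sample phi, computed by hand."""
    return (2 * x, 2 * y, 2 * z)


def f(x, y):
    """Upper hemisphere z = f(x, y), the function defined implicitly near (1, 2, 3)."""
    return math.sqrt(14.0 - x * x - y * y)


a, b, c = 1.0, 2.0, 3.0
px, py, pz = grad_phi(a, b, c)

# Implicit differentiation: dz/dx = -phi_x / phi_z and dz/dy = -phi_y / phi_z.
zx_implicit = -px / pz
zy_implicit = -py / pz

# Central-difference estimates from the explicit formula, for comparison.
h = 1e-6
zx_numeric = (f(a + h, b) - f(a - h, b)) / (2 * h)
zy_numeric = (f(a, b + h) - f(a, b - h)) / (2 * h)


def tangent_plane_residual(x, y, z):
    """Left side of (grad phi) . <x - a, y - b, z - c> = 0."""
    return px * (x - a) + py * (y - b) + pz * (z - c)


print("dz/dx implicit vs numeric:", zx_implicit, zx_numeric)
print("dz/dy implicit vs numeric:", zy_implicit, zy_numeric)

# A point built from z - c = z_x (x - a) + z_y (y - b) should give residual 0.
x, y = 1.3, 1.9
z = c + zx_implicit * (x - a) + zy_implicit * (y - b)
print("tangent plane residual:", tangent_plane_residual(x, y, z))
```

The residual being zero (up to rounding) reflects that the two forms of the tangent plane equation describe the same plane.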