Course Summary, Math 211

table of contents

I. Functions of several variables.
II. $\mathbb{R}^n$.
III. Derivatives.
IV. Taylor's Theorem.
V. Differential Geometry.
VI. Applications. 1. Best affine approximations. 2. Optimization. 3. Lagrange multipliers. 4. Conservation of energy.

I. Functions of several variables.

Definition 1.1. Let $S$ and $T$ be sets. The Cartesian product of $S$ and $T$ is the set of ordered pairs $S \times T := \{(s, t) \mid s \in S,\ t \in T\}$.

Definition 1.2. Let $S$ and $T$ be sets. A function from $S$ to $T$ is a subset $W$ of the Cartesian product $S \times T$ such that: (i) for each $s \in S$ there is an element of $W$ whose first component is $s$, i.e., there is an element $(s, t) \in W$ for some $t \in T$; and (ii) if $(s, t)$ and $(s, t')$ are in $W$, then $t = t'$. Notation: if $(s, t) \in W$, we write $f(s) = t$. The subset $W$, which is by definition the function $f$, is also called the graph of $f$.

Definition 1.3. Let $f\colon S \to T$ be a function between sets $S$ and $T$.
1. $f$ is one-to-one or injective if $f(x) = f(y)$ only if $x = y$.
2. The image or range of $f$ is $\{f(s) \in T \mid s \in S\}$. The image will be denoted by $\operatorname{im}(f)$ or $f(S)$.
3. $f$ is onto if $\operatorname{im}(f) = T$.
4. The domain of $f$ is $S$ and the codomain of $f$ is $T$.
5. The inverse image of $t \in T$ is $f^{-1}(t) := \{s \in S \mid f(s) = t\}$.
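Definitions 1.2 and 1.3 can be made concrete in code. Below is a minimal sketch (the helper names `is_function`, `image`, `is_injective`, and `inverse_image` are illustrative, not from the course) that represents a function as its graph, a set of ordered pairs, and checks the defining conditions.

```python
# A function f: S -> T represented as its graph, a set of ordered pairs.
S = {1, 2, 3}
T = {"a", "b"}
W = {(1, "a"), (2, "a"), (3, "b")}

def is_function(W, S, T):
    """Check conditions (i) and (ii) of Definition 1.2."""
    every_s_covered = all(any(s == pair[0] for pair in W) for s in S)
    single_valued = all(t1 == t2 for (s1, t1) in W for (s2, t2) in W if s1 == s2)
    return every_s_covered and single_valued

def image(W):
    """The image of Definition 1.3.2: all second components."""
    return {t for (_, t) in W}

def is_injective(W):
    """Definition 1.3.1: distinct inputs never share an output."""
    return all(s1 == s2 for (s1, t1) in W for (s2, t2) in W if t1 == t2)

def inverse_image(W, t):
    """Definition 1.3.5: all inputs mapping to t."""
    return {s for (s, t2) in W if t2 == t}

print(is_function(W, S, T))   # True: W is a genuine function
print(image(W))               # {"a", "b"}, so f is onto T
print(is_injective(W))        # False: 1 and 2 both map to "a"
print(inverse_image(W, "a"))  # {1, 2}
```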
Definition 1.4. Let $f\colon S \to T$ and $g\colon T \to U$. The composition of $f$ and $g$ is the function $g \circ f\colon S \to U$ given by $(g \circ f)(s) := g(f(s))$.

Definition 1.5. $\mathbb{R}^n$ is the Cartesian product of $\mathbb{R}$ with itself $n$ times. We think of $\mathbb{R}^n$ as the set of ordered $n$-tuples of real numbers: $\mathbb{R}^n := \{(a_1, \dots, a_n) \mid a_i \in \mathbb{R},\ 1 \le i \le n\}$. The elements of $\mathbb{R}^n$ are called points or vectors.

Definition 1.6. A function of several variables is a function of the form $f\colon S \to \mathbb{R}^m$ where $S \subseteq \mathbb{R}^n$. Writing $f(x) = (f_1(x), \dots, f_m(x))$, the function $f_i\colon S \to \mathbb{R}$, for each $i = 1, \dots, m$, is called the $i$-th component function of $f$.

Definition 1.7. Let $f$ be a function of several variables, $f\colon S \to \mathbb{R}^m$, with $S \subseteq \mathbb{R}^n$. If $n = 1$, then $f$ is a parametrized curve; if $n = 2$, then $f$ is a parametrized surface. In general, we say $f$ is a parametrized $n$-surface.

Definition 1.8. A vector field is a function of the form $f\colon S \to \mathbb{R}^m$ where $S \subseteq \mathbb{R}^m$.

Definition 1.9. If $f\colon S \to \mathbb{R}$ with $S \subseteq \mathbb{R}^n$, a level set of $f$ is the inverse image of a point in $\mathbb{R}$. A drawing showing several level sets is called a contour diagram for $f$.

II. $\mathbb{R}^n$.

linear structure.

Definition 2.1. The $i$-th coordinate of $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ is $a_i$. For $i = 1, \dots, n$, define the $i$-th standard basis vector for $\mathbb{R}^n$ to be the vector $e_i$ whose coordinates are all zero except the $i$-th coordinate, which is 1.

Definition 2.2. The additive inverse of $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ is the vector $-a := (-a_1, \dots, -a_n)$.

Definition 2.3. In $\mathbb{R}^n$, define $0 := (0, \dots, 0)$, the vector whose coordinates are all 0.

Definition 2.4. (Linear structure on $\mathbb{R}^n$.) If $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ are points in $\mathbb{R}^n$ and $s \in \mathbb{R}$, define
$$a + b = (a_1, \dots, a_n) + (b_1, \dots, b_n) := (a_1 + b_1, \dots, a_n + b_n)$$
$$sa = s(a_1, \dots, a_n) := (sa_1, \dots, sa_n).$$
The point $a + b$ is the translation of $a$ by $b$ (or of $b$ by $a$), and $sa$ is the dilation of $a$ by a factor of $s$. Define $a - b := a + (-b)$.
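The linear structure of Definition 2.4 is straightforward to sketch in code; the function names below (`add`, `scale`, `basis`) are ours, not the course's.

```python
# Vectors in R^n as tuples; translation and dilation from Definition 2.4.

def add(a, b):
    """Componentwise vector addition."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(s, a):
    """Dilation of a by the factor s."""
    return tuple(s * ai for ai in a)

def basis(i, n):
    """The i-th standard basis vector e_i in R^n (1-indexed), Definition 2.1."""
    return tuple(1.0 if j == i - 1 else 0.0 for j in range(n))

a, b = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
print(add(a, b))               # (5.0, 7.0, 9.0)
print(scale(2.0, a))           # (2.0, 4.0, 6.0)
print(add(a, scale(-1.0, a)))  # a + (-a) = (0.0, 0.0, 0.0)
print(basis(2, 3))             # e_2 = (0.0, 1.0, 0.0)
```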
metric structure.

Definition 2.5. The dot product on $\mathbb{R}^n$ is the function $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ given by
$$(a_1, \dots, a_n) \cdot (b_1, \dots, b_n) := \sum_{i=1}^n a_i b_i.$$
The dot product is also called the inner product or scalar product. If $a, b \in \mathbb{R}^n$, the dot product is denoted by $a \cdot b$, as above, or sometimes by $(a, b)$ or $\langle a, b \rangle$.

Definition 2.6. The norm or length of a vector $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ is
$$\|a\| := \sqrt{a \cdot a} = \sqrt{\sum_{i=1}^n a_i^2}.$$
The norm can also be denoted by $|a|$.

Definition 2.7. The vector $a \in \mathbb{R}^n$ is a unit vector if $\|a\| = 1$.

Definition 2.8. Let $p \in \mathbb{R}^n$ and $r \in \mathbb{R}$.
1. The open ball of radius $r$ centered at $p$ is the set $B_r(p) := \{a \in \mathbb{R}^n \mid \|a - p\| < r\}$.
2. The closed ball of radius $r$ centered at $p$ is the set $\overline{B}_r(p) := \{a \in \mathbb{R}^n \mid \|a - p\| \le r\}$.
3. The sphere of radius $r$ centered at $p$ is the set $S_r(p) := \{a \in \mathbb{R}^n \mid \|a - p\| = r\}$.

Definition 2.9. The distance between $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ in $\mathbb{R}^n$ is
$$d(a, b) := \|a - b\| = \sqrt{\sum_{i=1}^n (a_i - b_i)^2}.$$

Definition 2.10. Points $a, b \in \mathbb{R}^n$ are perpendicular or orthogonal if $a \cdot b = 0$.

Definition 2.11. Suppose $a, b$ are nonzero vectors in $\mathbb{R}^n$. The angle between them is defined to be
$$\cos^{-1} \frac{a \cdot b}{\|a\| \, \|b\|}.$$
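The metric structure above can be sketched directly; the helper names here are ours, and `math.acos` implements the $\cos^{-1}$ of Definition 2.11.

```python
import math

def dot(a, b):
    """Dot product of Definition 2.5."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Norm of Definition 2.6."""
    return math.sqrt(dot(a, a))

def distance(a, b):
    """Distance of Definition 2.9."""
    return norm(tuple(ai - bi for ai, bi in zip(a, b)))

def angle(a, b):
    """Angle between nonzero vectors (Definition 2.11), in radians."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

a, b = (1.0, 0.0), (1.0, 1.0)
print(dot(a, b))       # 1.0
print(norm(b))         # sqrt(2) ~ 1.41421
print(distance(a, b))  # 1.0
print(angle(a, b))     # pi/4, i.e. 45 degrees
```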
Definition 2.12. Let $a, b \in \mathbb{R}^n$ with $b \ne 0$. The component of $a$ along $b$ is the scalar
$$c := a \cdot \frac{b}{\|b\|} = \frac{a \cdot b}{\|b\|}.$$
The projection of $a$ along $b$ is the vector
$$c\,\frac{b}{\|b\|} = \frac{a \cdot b}{\|b\|^2}\, b,$$
where $c$ is the component of $a$ along $b$.

affine subspaces.

Definition 2.13. A nonempty subset $W \subseteq \mathbb{R}^n$ is a linear subspace if it is closed under vector addition and scalar multiplication. This means that: (i) if $a, b \in W$ then $a + b \in W$, and (ii) if $a \in W$ and $s \in \mathbb{R}$, then $sa \in W$.

Definition 2.14. A vector $v \in \mathbb{R}^n$ is a linear combination of vectors $v_1, \dots, v_k \in \mathbb{R}^n$ if there are scalars $a_1, \dots, a_k \in \mathbb{R}$ such that $v = \sum_{i=1}^k a_i v_i$.

Definition 2.15. A subspace $W \subseteq \mathbb{R}^n$ is spanned by a subset $S \subseteq \mathbb{R}^n$ if every element of $W$ can be written as a linear combination of elements of $S$. If $W$ is spanned by $S$, we write $\operatorname{span}(S) = W$.

Definition 2.16. The dimension of a linear subspace $W \subseteq \mathbb{R}^n$ is the smallest number of vectors needed to span $W$.

Definition 2.17. Let $W$ be a subset of $\mathbb{R}^n$ and let $p \in \mathbb{R}^n$. The set $p + W := \{p + w \mid w \in W\}$ is called the translation of $W$ by $p$. An affine subspace of $\mathbb{R}^n$ is any subset of the form $p + W$ where $W$ is a linear subspace of $\mathbb{R}^n$. In this case, the dimension of the affine subspace is defined to be the dimension of $W$.

Definition 2.18. A $k$-plane in $\mathbb{R}^n$ is an affine subspace of dimension $k$. A line is a 1-plane, and a hyperplane is an $(n-1)$-plane.

affine functions.

Definition 2.19. A function $L\colon \mathbb{R}^n \to \mathbb{R}^m$ is a linear function (or transformation or map) if it preserves vector addition and scalar multiplication. This means that for all $a, b \in \mathbb{R}^n$ and for all $s \in \mathbb{R}$,
1. $L(a + b) = L(a) + L(b)$;
2. $L(sa) = sL(a)$.

Definition 2.20. (Linear structure on the space of linear functions.) Let $L$ and $M$ be linear functions with domain $\mathbb{R}^n$ and codomain $\mathbb{R}^m$.
1. Define the linear function $L + M\colon \mathbb{R}^n \to \mathbb{R}^m$ by $(L + M)(v) := L(v) + M(v)$ for all $v \in \mathbb{R}^n$.
2. If $s \in \mathbb{R}$, define the linear function $sL\colon \mathbb{R}^n \to \mathbb{R}^m$ by $(sL)(v) := sL(v)$ for all $v \in \mathbb{R}^n$.

Definition 2.21. A function $f\colon \mathbb{R}^n \to \mathbb{R}^m$ is an affine function (or transformation or map) if it is the translation of a linear function. This means that there is a linear function $L\colon \mathbb{R}^n \to \mathbb{R}^m$ and a point $p \in \mathbb{R}^m$ such that $f(v) = p + L(v)$ for all $v \in \mathbb{R}^n$.

Definition 2.22. Let $W$ be a $k$-dimensional affine subspace of $\mathbb{R}^n$. A parametric equation for $W$ is any affine function $f\colon \mathbb{R}^k \to \mathbb{R}^n$ whose image is $W$.

Definition 2.23. An $m \times n$ matrix is a rectangular block of real numbers with $m$ rows and $n$ columns. The real number appearing in the $i$-th row and $j$-th column is called the $i,j$-th entry of the matrix. We write $A = (a_{ij})$ for the matrix whose $i,j$-th entry is $a_{ij}$.

Definition 2.24. (Linear structure on matrices.) Let $A = (a_{ij})$ and $B = (b_{ij})$ be $m \times n$ matrices. Define $A + B := (a_{ij} + b_{ij})$. If $s \in \mathbb{R}$, define $sA := (sa_{ij})$.

Definition 2.25. (Multiplication of matrices.) Let $A = (a_{ij})$ be an $m \times k$ matrix, and let $B = (b_{ij})$ be a $k \times n$ matrix. Define the product $AB$ to be the $m \times n$ matrix whose $i,j$-th entry is $\sum_{l=1}^k a_{il} b_{lj}$.

Definition 2.26. Let $A = (a_{ij})$ be an $m \times n$ matrix. The linear function determined by (or associated with) $A$ is the function $L_A\colon \mathbb{R}^n \to \mathbb{R}^m$ such that
$$L_A(x_1, \dots, x_n) = \Big(\sum_{j=1}^n a_{1j} x_j, \dots, \sum_{j=1}^n a_{mj} x_j\Big).$$

Definition 2.27. Let $L\colon \mathbb{R}^n \to \mathbb{R}^m$ be a linear function. The matrix determined by (or associated with) $L$ is the $m \times n$ matrix whose $i$-th column is the image of the $i$-th standard basis vector for $\mathbb{R}^n$ under $L$, i.e., $L(e_i)$.

Definition 2.28. An $n \times n$ matrix $A$ is invertible or nonsingular if there is an $n \times n$ matrix $B$ such that $AB = I_n$, where $I_n$ is the identity matrix whose entries consist of 1s along the diagonal and 0s otherwise. In this case, $B$ is called the inverse of $A$ and denoted $A^{-1}$.

theorems

Theorem 2.1. Let $a, b, c \in \mathbb{R}^n$ and $s, t \in \mathbb{R}$. Then
1. $a + b = b + a$.
2. $(a + b) + c = a + (b + c)$.
3. $0 + a = a + 0 = a$.
4. $a + (-a) = (-a) + a = 0$.
5. $1a = a$ and $(-1)a = -a$.
6. $(st)a = s(ta)$.
7. $(s + t)a = sa + ta$.
8. $s(a + b) = sa + sb$.

Theorem 2.2. Let $a, b, c \in \mathbb{R}^n$ and $s \in \mathbb{R}$. Then
1. $a \cdot b = b \cdot a$.
2. $a \cdot (b + c) = a \cdot b + a \cdot c$.
3. $(sa) \cdot b = s(a \cdot b)$.
4. $a \cdot a \ge 0$.
5. $a \cdot a = 0$ if and only if $a = 0$.

Theorem 2.3. Let $a, b \in \mathbb{R}^n$ and $s \in \mathbb{R}$. Then
1. $\|a\| \ge 0$.
2. $\|a\| = 0$ if and only if $a = 0$.
3. $\|sa\| = |s| \, \|a\|$.
4. $|a \cdot b| \le \|a\| \, \|b\|$ (Cauchy-Schwarz inequality).
5. $\|a + b\| \le \|a\| + \|b\|$ (triangle inequality).

Theorem 2.4. Let $a, b \in \mathbb{R}^n$ be nonzero vectors. Then
$$-1 \le \frac{a \cdot b}{\|a\| \, \|b\|} \le 1.$$
This shows that our definition of angle makes sense.

Theorem 2.5. (Pythagorean theorem.) Let $a, b \in \mathbb{R}^n$. If $a$ and $b$ are perpendicular, then $\|a\|^2 + \|b\|^2 = \|a + b\|^2$.

Theorem 2.6. Any linear subspace of $\mathbb{R}^n$ is spanned by a finite subset.
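The inequalities of Theorem 2.3 and the Pythagorean theorem can be spot-checked numerically; a small sketch on random vectors (the helper names are ours):

```python
import math, random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

random.seed(0)
for _ in range(100):
    a = [random.uniform(-1, 1) for _ in range(4)]
    b = [random.uniform(-1, 1) for _ in range(4)]
    s = [x + y for x, y in zip(a, b)]
    assert abs(dot(a, b)) <= norm(a) * norm(b) + 1e-12  # Cauchy-Schwarz
    assert norm(s) <= norm(a) + norm(b) + 1e-12         # triangle inequality

# Pythagorean theorem (Theorem 2.5) for a perpendicular pair:
a, b = (3.0, 0.0), (0.0, 4.0)
s = (3.0, 4.0)
print(dot(a, b))                             # 0.0: perpendicular
print(norm(a)**2 + norm(b)**2, norm(s)**2)   # both equal 25.0
```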
Theorem 2.7. If $a = (a_1, \dots, a_n) \ne 0$ and $p = (p_1, \dots, p_n)$ are elements of $\mathbb{R}^n$, then
$$H := \{x \in \mathbb{R}^n \mid (x - p) \cdot a = 0\}$$
is a hyperplane. In other words, the set of solutions $(x_1, \dots, x_n)$ to the equation $a_1 x_1 + \dots + a_n x_n = d$, where $d = \sum_{i=1}^n a_i p_i$, is a hyperplane. Conversely, every hyperplane is the set of solutions to an equation of this form.

Theorem 2.8. If $L\colon \mathbb{R}^n \to \mathbb{R}^m$ is a linear function and $W \subseteq \mathbb{R}^n$ is a linear subspace, then $L(W)$ is a linear subspace of $\mathbb{R}^m$.

Theorem 2.9. A linear map is determined by its action on the standard basis vectors. In other words: if you know the images of the standard basis vectors, you know the image of an arbitrary vector.

Theorem 2.10. The image of the linear map determined by a matrix is the span of the columns of that matrix.

Theorem 2.11. Let $W$ be a $k$-dimensional subspace of $\mathbb{R}^n$ spanned by vectors $v_1, \dots, v_k$, and let $p \in \mathbb{R}^n$. Then a parametric equation for the affine space $p + W$ is
$$f\colon \mathbb{R}^k \to \mathbb{R}^n, \qquad (a_1, \dots, a_k) \mapsto p + \sum_{i=1}^k a_i v_i.$$

Theorem 2.12. Let $L$ be a linear function and let $A$ be the matrix determined by $L$. Then the linear map determined by $A$ is $L$. (The converse also holds, switching the roles of $L$ and $A$.)

Theorem 2.13. The linear structures on linear maps and on their associated matrices are compatible: let $L$ and $M$ be linear functions with associated matrices $A$ and $B$, respectively, and let $s \in \mathbb{R}$. Then the matrix associated with $L + M$ is $A + B$, and the matrix associated with $sL$ is $sA$.

Theorem 2.14. Let $L\colon \mathbb{R}^n \to \mathbb{R}^k$ and $M\colon \mathbb{R}^k \to \mathbb{R}^m$ be linear functions with associated matrices $A$ and $B$, respectively. Then the matrix associated with the composition $M \circ L$ is the product $BA$.

III. Derivatives.

Definition 3.1. A subset $U \subseteq \mathbb{R}^n$ is open if for each $u \in U$ there is a nonempty open ball centered at $u$ contained entirely in $U$: there exists a real number $r > 0$ such that $B_r(u) \subseteq U$.
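Theorem 2.14 above can be checked numerically: applying two matrices in succession agrees with applying their product once. A minimal sketch (the helper names `mat_vec` and `mat_mul` are ours):

```python
def mat_vec(A, x):
    """Apply the linear map L_A of Definition 2.26."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) for row in A)

def mat_mul(A, B):
    """Matrix product of Definition 2.25."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4], [5, 6]]   # matrix of L: R^2 -> R^3
B = [[1, 0, 1], [0, 1, 1]]     # matrix of M: R^3 -> R^2
x = (1.0, 2.0)

# M(L(x)) versus the single matrix BA applied to x (Theorem 2.14):
print(mat_vec(B, mat_vec(A, x)))  # (22.0, 28.0)
print(mat_vec(mat_mul(B, A), x))  # (22.0, 28.0), the same answer
```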
Definition 3.2. A point $u \in \mathbb{R}^n$ is a limit point of a subset $S \subseteq \mathbb{R}^n$ if every open ball centered at $u$, $B_r(u)$, contains a point of $S$ different from $u$.

Definition 3.3. Let $f\colon S \to \mathbb{R}^m$ be a function with $S \subseteq \mathbb{R}^n$. Let $s$ be a limit point of $S$. The limit of $f(x)$ as $x$ approaches $s$ is $v \in \mathbb{R}^m$ if for all real numbers $\epsilon > 0$ there is a real number $\delta > 0$ such that
$$0 < \|x - s\| < \delta \text{ and } x \in S \implies \|f(x) - v\| < \epsilon.$$
Notation: $\lim_{x \to s} f(x) = v$.

Definition 3.4. Let $f\colon S \to \mathbb{R}^m$ with $S \subseteq \mathbb{R}^n$, and let $s \in S$. The function $f$ is continuous at $s \in S$ if for all real numbers $\epsilon > 0$ there is a real number $\delta > 0$ such that
$$\|x - s\| < \delta \text{ and } x \in S \implies \|f(x) - f(s)\| < \epsilon.$$
(Thus, $f$ is continuous at a limit point $s \in S$ if and only if $\lim_{x \to s} f(x) = f(s)$, and $f$ is automatically continuous at all points of $S$ which are not limit points of $S$.) The function $f$ is continuous on $S$ if it is continuous at each point of $S$.

Definition 3.5. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$, and let $e_i$ be the $i$-th standard basis vector for $\mathbb{R}^n$. The $i$-th partial of $f$ at $u \in U$ is the vector in $\mathbb{R}^m$
$$\frac{\partial f}{\partial x_i}(u) := \lim_{t \to 0} \frac{f(u + t e_i) - f(u)}{t},$$
provided this limit exists.

Definition 3.6. Let $f\colon U \to \mathbb{R}$ with $U$ an open subset of $\mathbb{R}^n$. Let $u \in U$, and let $v \in \mathbb{R}^n$ be a unit vector. The directional derivative of $f$ at $u$ in the direction of $v$ is the real number
$$\frac{\partial f}{\partial v}(u) := \lim_{t \to 0} \frac{f(u + t v) - f(u)}{t},$$
provided this limit exists. The directional derivative of $f$ at $u$ in the direction of an arbitrary nonzero vector $w$ is defined to be the directional derivative of $f$ at $u$ in the direction of the unit vector $w / \|w\|$.

Definition 3.7. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$. Then $f$ is differentiable at $u \in U$ if there is a linear function $Df_u\colon \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{h \to 0} \frac{\|f(u + h) - f(u) - Df_u(h)\|}{\|h\|} = 0.$$
The linear function $Df_u$ is then called the derivative of $f$ at $u$. The notation $f'(u)$ is sometimes used instead of $Df_u$. The function $f$ is differentiable on $U$ if it is differentiable at each point of $U$.

Definition 3.8. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$. The Jacobian matrix of $f$ at $u \in U$ is the $m \times n$ matrix of partial derivatives of the component functions of $f$:
$$Jf(u) := \left(\frac{\partial f_i}{\partial x_j}(u)\right) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(u) & \cdots & \dfrac{\partial f_1}{\partial x_n}(u) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(u) & \cdots & \dfrac{\partial f_m}{\partial x_n}(u) \end{pmatrix}.$$
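The Jacobian matrix of Definition 3.8 can be approximated by replacing each limit in Definition 3.5 with a small finite difference. A sketch for a hand-picked example (the function and helper names are ours):

```python
# Approximate the Jacobian of f(x, y) = (x*y, x + y, x**2) by central
# finite differences and compare with the partials computed by hand.

def f(x, y):
    return (x * y, x + y, x ** 2)

def numerical_jacobian(f, u, h=1e-6):
    """m x n matrix of difference quotients (f(u + h e_j) - f(u - h e_j)) / 2h."""
    n, m = len(u), len(f(*u))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(u); up[j] += h
        um = list(u); um[j] -= h
        fp, fm = f(*up), f(*um)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

u = (2.0, 3.0)
J_exact = [[3.0, 2.0],   # partials of x*y  are (y, x)
           [1.0, 1.0],   # partials of x+y  are (1, 1)
           [4.0, 0.0]]   # partials of x**2 are (2x, 0)
J_num = numerical_jacobian(f, u)
print(J_num)  # entrywise close to J_exact
```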
1. The $i$-th column of the Jacobian matrix is the $i$-th partial derivative of $f$ at $u$ and is called the $i$-th principal tangent vector to $f$ at $u$.
2. If $n = 1$, then $f$ is a parametrized curve and the Jacobian matrix consists of a single column. This column is the tangent vector to $f$ at $u$ or the velocity of $f$ at $u$, and its length is the speed of $f$ at $u$. We write
$$f'(u) = (f_1'(u), \dots, f_m'(u))$$
for this tangent vector.
3. If $m = 1$, the Jacobian matrix consists of a single row. This row is called the gradient vector for $f$ at $u$ and denoted $\nabla f(u)$ or $\operatorname{grad} f(u)$:
$$\nabla f(u) := \left(\frac{\partial f}{\partial x_1}(u), \dots, \frac{\partial f}{\partial x_n}(u)\right).$$

theorems

Theorem 3.1. Let $f\colon S \to \mathbb{R}^m$ and $g\colon S \to \mathbb{R}^m$ where $S$ is a subset of $\mathbb{R}^n$.
1. The limit of a function is unique.
2. The limit $\lim_{x \to s} f(x)$ exists if and only if the corresponding limits for each of the component functions, $\lim_{x \to s} f_i(x)$, exist. In that case,
$$\lim_{x \to s} f(x) = \left(\lim_{x \to s} f_1(x), \dots, \lim_{x \to s} f_m(x)\right).$$
3. Define $f + g\colon S \to \mathbb{R}^m$ by $(f + g)(x) := f(x) + g(x)$. If $\lim_{x \to s} f(x) = a$ and $\lim_{x \to s} g(x) = b$, then $\lim_{x \to s} (f + g)(x) = a + b$. Similarly, if $t \in \mathbb{R}$, define $tf\colon S \to \mathbb{R}^m$ by $(tf)(x) := t(f(x))$. If $\lim_{x \to s} f(x) = a$, then $\lim_{x \to s} (tf)(x) = ta$.
4. If $m = 1$, define $(fg)(x) := f(x)g(x)$ and $(f/g)(x) := f(x)/g(x)$ (provided $g(x) \ne 0$). If $\lim_{x \to s} f(x) = a$ and $\lim_{x \to s} g(x) = b$, then $\lim_{x \to s} (fg)(x) = ab$ and, if $b \ne 0$, then $\lim_{x \to s} (f/g)(x) = a/b$.
5. If $m = 1$ and $g(x) \le f(x)$ for all $x$, then $\lim_{x \to s} g(x) \le \lim_{x \to s} f(x)$, provided these limits exist.

Theorem 3.2. Let $f\colon S \to \mathbb{R}^m$ and $g\colon S \to \mathbb{R}^m$ where $S$ is a subset of $\mathbb{R}^n$.
1. The function $f$ is continuous if and only if the inverse image of every open subset of $\mathbb{R}^m$ under $f$ is the intersection of an open subset of $\mathbb{R}^n$ with $S$.
2. The function $f$ is continuous at $s$ if and only if each of its component functions is continuous at $s$.
3. The composition of continuous functions is continuous.
4. The functions $f + g$ and $tf$ for $t \in \mathbb{R}$, as above, are continuous at $s \in S$ provided $f$ and $g$ are continuous at $s$.
5. If $m = 1$ and $f$ and $g$ are continuous at $s \in S$, then $fg$ and $f/g$ are continuous at $s$ (provided $g(s) \ne 0$ in the latter case).
6. A function whose component functions are polynomials is continuous.

Theorem 3.3. If $f\colon \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then $f$ is differentiable at each $p \in \mathbb{R}^n$, and $Df_p = f$.

Theorem 3.4. (The chain rule.) Let $f\colon U \to \mathbb{R}^k$ and $g\colon V \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$ and $V$ is an open subset of $\mathbb{R}^k$. Suppose that $f(U) \subseteq V$ so that we can form the composition $g \circ f\colon U \to \mathbb{R}^m$. Suppose that $f$ is differentiable at $p \in U$ and $g$ is differentiable at $f(p)$; then $g \circ f$ is differentiable at $p$, and
$$D(g \circ f)_p = Dg_{f(p)} \circ Df_p.$$
In terms of Jacobian matrices, we have
$$J(g \circ f)(p) = Jg(f(p)) \, Jf(p).$$

Theorem 3.5. Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$. Then $f$ is differentiable at $p \in U$ if and only if each component function $f_i\colon U \to \mathbb{R}$ is differentiable, and in that case,
$$Df_p(v) = (D{f_1}_p(v), \dots, D{f_m}_p(v))$$
for all $v \in \mathbb{R}^n$.

Theorem 3.6. Let $f\colon U \to \mathbb{R}$ where $U$ is an open subset of $\mathbb{R}^n$. If the directional derivative of $f$ at $u \in U$ in the direction of the unit vector $v \in \mathbb{R}^n$ exists, it is equal to the dot product $\nabla f(u) \cdot v$.

Theorem 3.7. Let $f\colon U \to \mathbb{R}$ be a differentiable function on an open subset $U \subseteq \mathbb{R}^n$. The gradient vector $\nabla f(u)$ of $f$ at $u \in U$ points in the direction of quickest increase of $f$, and its magnitude gives the rate of increase of $f$ in that direction.

Theorem 3.8. Let $f\colon U \to \mathbb{R}$ be a differentiable function on an open subset $U \subseteq \mathbb{R}^n$. The gradient vector $\nabla f(u)$ of $f$ at $u$ is perpendicular to the level set of $f$ through $u$, i.e., to $f^{-1}(f(u))$. More precisely, let $h\colon I \to U$ be a differentiable function on an open interval $I \subseteq \mathbb{R}$ containing the origin with $h(0) = u$. Suppose that $f \circ h$ is constant, i.e., the image of $h$ lies in the level set through $u$. Then the gradient of $f$ at $u$ is perpendicular to the tangent to $h$ at 0:
$$h'(0) \cdot \nabla f(u) = 0.$$

Theorem 3.9. Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$. Suppose that the partial derivative of each of the component functions of $f$ exists at $a \in U$. Then each partial derivative of $f$ exists at $a$ and
$$\frac{\partial f}{\partial x_i}(a) = \left(\frac{\partial f_1}{\partial x_i}(a), \dots, \frac{\partial f_m}{\partial x_i}(a)\right).$$

Theorem 3.10. Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$, and let $u \in U$. The second partial derivatives $\partial^2 f(u)/\partial x_i \partial x_j$ and $\partial^2 f(u)/\partial x_j \partial x_i$ are equal if they exist and are continuous.

Theorem 3.11. Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$. If $f$ is differentiable at $u \in U$, then each of the first partial derivatives of each of the component functions exists, and $Df_u$ is the linear map determined by the Jacobian matrix $Jf(u)$.

Theorem 3.12. Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$. If each of the first partial derivatives of each of the component functions of $f$ exists at $u \in U$ and is continuous, then $f$ is differentiable at $u$. In this case, $f$ is said to be continuously differentiable at $u$.

Theorem 3.13. (The inverse function theorem.) Let $f\colon U \to \mathbb{R}^n$ be a function with continuous partial derivatives on the open set $U \subseteq \mathbb{R}^n$. Suppose that the Jacobian matrix $Jf(u)$ is invertible at some point $u \in U$. Then there is an open subset $V \subseteq U$ containing $u$ such that $f$ is injective when restricted to $V$ and its inverse (defined on $f(V)$) is differentiable, with Jacobian matrix $Jf(u)^{-1}$ at $f(u)$.

Theorem 3.14. (The implicit function theorem.) Suppose $f\colon \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ is a function with continuous partial derivatives. Suppose that the $m \times m$ matrix
$$\left(\frac{\partial f_i}{\partial x_{n+j}}(u)\right), \qquad 1 \le i, j \le m,$$
is invertible for some $u = (u_1, u_2) \in \mathbb{R}^n \times \mathbb{R}^m$ with $f(u) = 0$. Then there is an open set $U \subseteq \mathbb{R}^n$ containing $u_1$ and an open set $V \subseteq \mathbb{R}^m$ containing $u_2$ such that for each $x \in U$ there is a unique $g(x) \in V$ with $f(x, g(x)) = 0$. The function $g$ is differentiable.

Theorem 3.15. (The rank theorem.)
Suppose $f\colon U \to V$ is a smooth mapping and that $f$ has constant rank $k$ in some open neighborhood of $p \in U$. Then there exist open sets $\tilde{U} \subseteq U$ containing $p$ and $\tilde{V} \subseteq V$ containing $f(p)$, along with diffeomorphisms
$$\phi\colon \tilde{U} \to U', \qquad \psi\colon \tilde{V} \to V'$$
onto open subsets $U' \subseteq \mathbb{R}^n$ and $V' \subseteq \mathbb{R}^m$, such that
$$\psi \circ f \circ \phi^{-1}(x_1, \dots, x_n) = (x_1, \dots, x_k, 0, \dots, 0).$$

IV. Taylor's Theorem.

Definition 4.1. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$, and let $k$ be a nonnegative integer. A partial derivative of order $k$ of $f$ at $u \in U$ is defined recursively as follows: (i) the zeroth-order partial derivative is $f(u)$; (ii) a $k$-th order partial derivative, with $k > 0$, is any partial derivative of a $(k-1)$-th order partial derivative.

Definition 4.2. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$. Suppose that all partial derivatives of $f$ of order less than or equal to $k$ exist and are continuous. The Taylor polynomial of order $k$ for $f$ at $u \in U$ is
$$P^k_u f(x_1, \dots, x_n) := \sum_{i_1 + \dots + i_n \le k} \frac{1}{i_1! \cdots i_n!} \, \frac{\partial^{i_1 + \dots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(u) \, (x_1 - u_1)^{i_1} \cdots (x_n - u_n)^{i_n}.$$
If the partial derivatives of every order exist and are continuous at $u$, one defines the Taylor series for $f$ at $u$, $P_u f$, by replacing $k$ by $\infty$ in the above displayed equation.

theorems

Theorem 4.1. (Taylor's theorem in one variable.) Let $f\colon S \to \mathbb{R}$ with $S \subseteq \mathbb{R}$. Suppose that $S$ contains an open interval containing the closed interval $[a, b]$. Also suppose that all the derivatives up to order $k$ exist and are continuous on $[a, b]$ and the $(k+1)$-th derivative exists on $(a, b)$. Let $x, y \in [a, b]$. Then there exists a number $c$ between $x$ and $y$ such that
$$f(x) = P^k_y f(x) + \frac{1}{(k+1)!} \frac{d^{k+1} f}{dx^{k+1}}(c) \, (x - y)^{k+1},$$
where $P^k_y f$ is the $k$-th order Taylor polynomial for $f$ at $y$.

Theorem 4.2. (Taylor's theorem in several variables.) Let $f\colon U \to \mathbb{R}^m$ where $U$ is an open subset of $\mathbb{R}^n$. Suppose that the partial derivatives of $f$ up to order $k$ exist and are continuous on $U$ and that the partial derivatives of order $k + 1$ exist on $U$. Let $u, x \in U$ and suppose that the line segment $u + t(x - u)$, $0 \le t \le 1$, is contained in $U$. Then there exists a number $c$ between 0 and 1 such that
$$f(x) = P^k_u f(x) + r(c(x - u)),$$
where $P^k_u f$ is the $k$-th order Taylor polynomial for $f$ at $u$ and $r\colon U \to \mathbb{R}^m$ is a function such that $\lim_{v \to 0} r(v)/\|v\|^{k+1} = 0$.

V. Differential geometry.
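Before working with metrics, Taylor's theorem in one variable (Theorem 4.1) is easy to illustrate numerically. A sketch for $\cos$ at $y = 0$ (the helper name `taylor_cos` is ours): the approximation error shrinks rapidly as the order $k$ grows.

```python
import math

def taylor_cos(x, k):
    """k-th order Taylor polynomial of cos at y = 0:
    the even-degree terms (-1)^(i/2) x^i / i! with i <= k."""
    return sum((-1) ** (i // 2) * x ** i / math.factorial(i)
               for i in range(0, k + 1, 2))

x = 0.5
for k in (2, 4, 8):
    print(k, abs(math.cos(x) - taylor_cos(x, k)))  # error shrinks as k grows
```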
Definition 5.1. The tangent space to $\mathbb{R}^n$ at $u \in \mathbb{R}^n$ is $\mathbb{R}^n$ labeled with the point $u$; more formally, it is the Cartesian product $\{u\} \times \mathbb{R}^n$. We denote the tangent space at $u$ by $\mathbb{R}^n_u$ (also written $T_u\mathbb{R}^n$).

Definition 5.2. If $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$ is differentiable at $u \in U$, the tangent map for $f$ at $u$ is defined to be the derivative of $f$ at $u$, thought of as a mapping between tangent spaces, $Df_u\colon \mathbb{R}^n_u \to \mathbb{R}^m_{f(u)}$.

Definition 5.3. Let $f\colon \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable function. The first fundamental form for $f$ at $p \in \mathbb{R}^n$ is the function $\langle\ ,\ \rangle_p\colon T_p\mathbb{R}^n \times T_p\mathbb{R}^n \to \mathbb{R}$ given by
$$\langle u, v \rangle_p := Df_p(u) \cdot Df_p(v).$$
Given the first fundamental form, define:
1. The length of $u \in T_p\mathbb{R}^n$ is $\|u\|_p := \sqrt{\langle u, u \rangle_p}$.
2. Vectors $u, v \in T_p\mathbb{R}^n$ are perpendicular if $\langle u, v \rangle_p = 0$.
3. The angle between nonzero $u, v \in T_p\mathbb{R}^n$ is $\arccos \dfrac{\langle u, v \rangle_p}{\|u\|_p \, \|v\|_p}$.
4. The component of $u \in T_p\mathbb{R}^n$ along nonzero $v \in T_p\mathbb{R}^n$ is $\dfrac{\langle u, v \rangle_p}{\|v\|_p}$.

Definition 5.4. Let $f\colon \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable function. The first fundamental form matrix is the $n \times n$ symmetric matrix $I$ with $i,j$-th entry $f_{x_i} \cdot f_{x_j}$:
$$I = \left(f_{x_i} \cdot f_{x_j}\right).$$

Definition 5.5. A pseudo-metric on $\mathbb{R}^n$ is a symmetric $n \times n$ matrix $I$ whose entries are real-valued functions on $\mathbb{R}^n$. Given any such $I$, define for each $p \in \mathbb{R}^n$ and for all $u, v \in T_p\mathbb{R}^n$
$$\langle u, v \rangle_p := u^T I(p) \, v.$$
The matrix $I$ is called a metric on $\mathbb{R}^n$ if $\langle\ ,\ \rangle_p$ is positive definite for each $p \in \mathbb{R}^n$, i.e., for all $u \in T_p\mathbb{R}^n$, $\langle u, u \rangle_p \ge 0$ with equality exactly when $u = 0$. Given the form $\langle\ ,\ \rangle_p$, one defines lengths, distances, angles, and components as with the first fundamental form.

Definition 5.6. Let $c\colon [a, b] \to \mathbb{R}^n$ be a parametrized curve. The length of $c$ is
$$\operatorname{length}(c) = \int_a^b \|c'(t)\| \, dt.$$
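The integral in Definition 5.6 can be approximated by summing the lengths of short chords along a fine partition; for the unit circle the answer should be close to $2\pi$. A sketch (the helper name `curve_length` is ours):

```python
import math

def curve_length(c, a, b, steps=10000):
    """Polygonal approximation of Definition 5.6: sum the distances
    between successive points c(t_i) on a fine partition of [a, b]."""
    total = 0.0
    prev = c(a)
    for i in range(1, steps + 1):
        t = a + (b - a) * i / steps
        cur = c(t)
        total += math.dist(prev, cur)
        prev = cur
    return total

circle = lambda t: (math.cos(t), math.sin(t))
print(curve_length(circle, 0.0, 2 * math.pi))  # close to 2*pi ~ 6.28319
```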
Definition 5.7. Let $c\colon [a, b] \to \mathbb{R}^n$ be a parametrized curve. Suppose $\alpha\colon [a', b'] \to [a, b]$ is a function such that $\alpha'$ is continuous, $\alpha'(t) > 0$ for all $t$, $\alpha(a') = a$, and $\alpha(b') = b$. Then the curve $\tilde{c} = c \circ \alpha\colon [a', b'] \to \mathbb{R}^n$ is called a reparametrization of $c$.

Definition 5.8. Let $u\colon [a, b] \to \mathbb{R}^n$ be a parametrized curve, and let $I$ be a pseudo-metric on $\mathbb{R}^n$. The length of $u$ with respect to $I$ is
$$\operatorname{length}(u) = \int_a^b \|u'(t)\|_{u(t)} \, dt,$$
where the length of the vector $u'(t)$ is its length with respect to $I$ as an element of the tangent space $T_{u(t)}\mathbb{R}^n$.

Definition 5.9. Let $u\colon [a, b] \to \mathbb{R}^n$ and $f\colon \mathbb{R}^n \to \mathbb{R}^m$, so $c = f \circ u$ is a parametrized curve on the image of $f$. Then $c$ is a geodesic if
$$c''(t) \cdot f_{x_i}(u(t)) = 0$$
for $i = 1, \dots, n$.

Definition 5.10. Fix a metric $I$, and denote its $i,j$-th entry by $g_{ij}$. Since $I$ is positive definite, it turns out that it must be an invertible matrix, and we denote the entries of the inverse by $g^{ij}$:
$$I = (g_{ij}), \qquad I^{-1} = (g^{ij}).$$
For each $i, j, l \in \{1, \dots, n\}$ define a Christoffel symbol
$$\Gamma^l_{ij} = \frac{1}{2} \sum_{k=1}^n \left(\frac{\partial g_{jk}}{\partial x_i} + \frac{\partial g_{ki}}{\partial x_j} - \frac{\partial g_{ij}}{\partial x_k}\right) g^{kl}.$$
Let $x(t) = (x_1(t), \dots, x_n(t))$ be a curve in $\mathbb{R}^n$. Then $x(t)$ is a geodesic if it satisfies the system of differential equations
$$\ddot{x}_k + \sum_{i,j} \Gamma^k_{ij} \, \dot{x}_i \dot{x}_j = 0, \qquad k = 1, \dots, n.$$
In the case $n = 2$ we can write
$$I = \begin{pmatrix} E & F \\ F & G \end{pmatrix}, \qquad I^{-1} = \frac{1}{\Delta} \begin{pmatrix} G & -F \\ -F & E \end{pmatrix},$$
where $\Delta = \det I = EG - F^2$. In this case, the Christoffel symbols are:
$$\Gamma^1_{11} = \frac{1}{2\Delta}(G E_x - 2F F_x + F E_y) \qquad \Gamma^2_{11} = \frac{1}{2\Delta}(2E F_x - E E_y - F E_x)$$
$$\Gamma^1_{12} = \frac{1}{2\Delta}(G E_y - F G_x) \qquad\qquad \Gamma^2_{12} = \frac{1}{2\Delta}(E G_x - F E_y)$$
$$\Gamma^1_{22} = \frac{1}{2\Delta}(2G F_y - G G_x - F G_y) \qquad \Gamma^2_{22} = \frac{1}{2\Delta}(E G_y - 2F F_y + F G_x),$$
and the equations for a geodesic are:
$$\ddot{x}_1 + \dot{x}_1^2 \, \Gamma^1_{11} + 2\dot{x}_1 \dot{x}_2 \, \Gamma^1_{12} + \dot{x}_2^2 \, \Gamma^1_{22} = 0$$
$$\ddot{x}_2 + \dot{x}_1^2 \, \Gamma^2_{11} + 2\dot{x}_1 \dot{x}_2 \, \Gamma^2_{12} + \dot{x}_2^2 \, \Gamma^2_{22} = 0.$$

theorems

Theorem 5.1. Let $f\colon \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable function, and let $p \in \mathbb{R}^n$. For $u, v \in T_p\mathbb{R}^n$,
$$\langle u, v \rangle_p = u^T I(p) \, v = \begin{pmatrix} u_1 & \cdots & u_n \end{pmatrix} \begin{pmatrix} f_{x_1}(p) \cdot f_{x_1}(p) & \cdots & f_{x_1}(p) \cdot f_{x_n}(p) \\ \vdots & \ddots & \vdots \\ f_{x_n}(p) \cdot f_{x_1}(p) & \cdots & f_{x_n}(p) \cdot f_{x_n}(p) \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.$$

Theorem 5.2. The first fundamental form is a symmetric, positive definite form:
1. $\langle x, y \rangle_u = \langle y, x \rangle_u$ for all $x, y \in \mathbb{R}^2_u$.
2. $\langle x + y, z \rangle_u = \langle x, z \rangle_u + \langle y, z \rangle_u$ for all $x, y, z \in \mathbb{R}^2_u$.
3. $\langle sx, y \rangle_u = s \langle x, y \rangle_u$ for all $x, y \in \mathbb{R}^2_u$ and $s \in \mathbb{R}$.
4. $\langle x, x \rangle_u \ge 0$ for all $x \in \mathbb{R}^2_u$, and $\langle x, x \rangle_u = 0$ if and only if $x = 0$.

Theorem 5.3. Consider $\mathbb{R}^n$ with metric $I$. For each $u, v \in \mathbb{R}^n$, there is an $\epsilon > 0$ and a unique geodesic $h\colon (-\epsilon, \epsilon) \to \mathbb{R}^n$ with $h(0) = u$ and $h'(0) = v$.

Theorem 5.4. Geodesics give, locally, the shortest distance between two points on a surface: consider $\mathbb{R}^n$ with metric $I$, and let $u \in \mathbb{R}^n$. Then there is a ball of radius $r$ centered at $u$, $B_r(u)$, such that for any $v \in B_r(u)$, the geodesic joining $u$ to $v$ is shorter than any other curve joining $u$ to $v$.

VI. Applications.

VI.1. Best affine approximations.

Definition 6.1.1. Let $f\colon U \to \mathbb{R}^m$ be a differentiable function on an open set $U \subseteq \mathbb{R}^n$. The best affine approximation to $f$ at $u \in U$ is the affine function
$$Tf_u\colon \mathbb{R}^n \to \mathbb{R}^m, \qquad x \mapsto f(u) + Df_u(x - u),$$
where $Df_u$ is the derivative of $f$ at $u$.
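Definition 6.1.1 can be sketched for a hand-picked example (the function and the point are ours, not from the course): for $f(x, y) = x^2 + y^2$ at $u = (1, 1)$, the derivative is represented by the gradient $(2, 2)$, so $Tf_u(x, y) = 2 + 2(x - 1) + 2(y - 1)$, and the error vanishes faster than $\|x - u\|$.

```python
def f(x, y):
    return x ** 2 + y ** 2

def Tf(x, y):
    """Best affine approximation to f at u = (1, 1)."""
    return 2.0 + 2.0 * (x - 1.0) + 2.0 * (y - 1.0)

# The ratio |f - Tf_u| / ||x - u|| tends to 0 as x -> u (cf. Theorem 6.1.1):
for h in (0.1, 0.01, 0.001):
    err = abs(f(1 + h, 1 + h) - Tf(1 + h, 1 + h))
    step = (2 * h ** 2) ** 0.5    # ||x - u|| for x = (1 + h, 1 + h)
    print(h, err / step)          # ratio shrinks with h
```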
Definition 6.1.2. With notation as in Definition 6.1.1, the image of $Tf_u$ is called the (embedded) tangent space to $f$ at $u$.

theorems

Theorem 6.1.1. Let $f\colon U \to \mathbb{R}^m$ be a differentiable function on an open set $U \subseteq \mathbb{R}^n$, and let $u \in U$. If $A\colon \mathbb{R}^n \to \mathbb{R}^m$ is an affine function with $A(u) = f(u)$ and
$$\lim_{x \to u} \frac{\|f(x) - A(x)\|}{\|x - u\|} = 0,$$
then $A$ is $Tf_u$, the best affine approximation to $f$ at $u$.

Theorem 6.1.2. Let $f\colon U \to \mathbb{R}^m$ with $U$ an open subset of $\mathbb{R}^n$, and let $h\colon V \to U$ where $V$ is an open subset of $\mathbb{R}$. Let $c := f \circ h$ be the corresponding parametrized curve on the surface $f$. Let $u \in U$, and suppose that $c(0) = f(u)$. Then $f(u) + c'(0)$ is contained in the embedded tangent space to $f$ at $u$. Conversely, each element of the embedded tangent space to $f$ at $u$ can be written as $f(u) + c'(0)$ for some parametrized curve $c$ on $f$ with $c(0) = f(u)$.

VI.2. Optimization.

Definition 6.2.1. The set $S \subseteq \mathbb{R}^n$ is closed if its complement is open, i.e., if for every $x$ not in $S$, there is a nonempty open ball centered at $x$ which does not intersect $S$.

Definition 6.2.2. The set $S \subseteq \mathbb{R}^n$ is bounded if it is contained in some open ball centered at the origin, i.e., if there is a real number $r > 0$ such that $\|s\| < r$ for all $s \in S$.

Definition 6.2.3. A point $s \in S \subseteq \mathbb{R}^n$ is in the interior of $S$ if there is a nonempty open ball centered at $s$ contained entirely in $S$. Otherwise, $s$ is on the boundary of $S$.

Definition 6.2.4. Let $f\colon S \to \mathbb{R}$ where $S \subseteq \mathbb{R}^n$.
1. $s \in S$ is a (global) maximum for $f$ if $f(s) \ge f(s')$ for all $s' \in S$.
2. $s \in S$ is a (global) minimum for $f$ if $f(s) \le f(s')$ for all $s' \in S$.
3. $s \in S$ is a local maximum for $f$ if $s$ is an interior point and $f(s) \ge f(s')$ for all $s'$ in some nonempty open ball centered at $s$.
4. $s \in S$ is a local minimum for $f$ if $s$ is an interior point and $f(s) \le f(s')$ for all $s'$ in some nonempty open ball centered at $s$.
5. A global extremum for $f$ is a global maximum or global minimum for $f$. A local extremum for $f$ is a local maximum or local minimum for $f$.
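On a closed and bounded set, a continuous function attains global extrema (Theorem 6.2.1 below), and a crude grid search can locate them approximately. A sketch for a hand-picked example (the function and grid size are ours):

```python
# Grid search for global extrema of f(x, y) = x**2 - y**2 on the closed,
# bounded square [-1, 1] x [-1, 1] (illustrating Definition 6.2.4).

def f(x, y):
    return x ** 2 - y ** 2

N = 200
pts = [(-1 + 2 * i / N, -1 + 2 * j / N)
       for i in range(N + 1) for j in range(N + 1)]
vals = [f(x, y) for x, y in pts]
print(max(vals), min(vals))  # 1.0 at (+-1, 0), -1.0 at (0, +-1)
```

Note that both extrema lie on the boundary of the square; on the open square, $f$ would attain neither, which is why Theorem 6.2.1 requires the set to be closed.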
Definition 6.2.5. Let $f\colon S \to \mathbb{R}$ where $S \subseteq \mathbb{R}^n$. A point $s \in S$ is a critical or stationary point of $f$ if it is an interior point, $f$ is differentiable at $s$, and $\nabla f(s) = 0$, i.e., all the partial derivatives of $f$ vanish at $s$.

Definition 6.2.6. A critical point which is not a local extremum is called a saddle point.

Definition 6.2.7. Let $f\colon U \to \mathbb{R}$ be a function on an open set $U \subseteq \mathbb{R}^n$ with continuous partial derivatives up to order 3. Let $u \in U$ be a critical point of $f$. The quadratic form for $f$ at $u$ is defined by $Q_u f(x) := P^2_u f(x + u)$, where $P^2_u f$ is the second order Taylor polynomial for $f$ at $u$. Hence,
$$Q_u f(x_1, \dots, x_n) := \sum_{i_1 + \dots + i_n = 2} \frac{1}{i_1! \cdots i_n!} \, \frac{\partial^2 f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(u) \, x_1^{i_1} \cdots x_n^{i_n}.$$

theorems

Theorem 6.2.1. Let $S \subseteq \mathbb{R}^n$ be a closed and bounded set. Let $f\colon S \to \mathbb{R}$ be a continuous function. Then $f$ has a maximum and a minimum in $S$.

Theorem 6.2.2. Let $f\colon U \to \mathbb{R}$ be a differentiable function on an open set $U \subseteq \mathbb{R}^n$. If $u \in U$ is a local extremum for $f$, then $u$ is a critical point for $f$.

Theorem 6.2.3. Let $f\colon U \to \mathbb{R}$ be a function on an open set $U \subseteq \mathbb{R}^n$ with continuous partial derivatives up to order 3. Suppose the quadratic form $Q_u f$ for $f$ at a critical point $u$ is nonzero. Then $u$ is a local minimum, a local maximum, or a saddle point according as $Q_u f$ has a local minimum, a local maximum, or a saddle at 0.

VI.3. Lagrange multipliers.

Theorem 6.3.1. Let $f$ and $g_1, \dots, g_k$ be $k + 1$ differentiable, real-valued functions on an open subset $U \subseteq \mathbb{R}^n$. Let $u \in U$. Suppose that if there are constants $a_0, a_1, \dots, a_k$ for which
$$a_0 \nabla f(u) + a_1 \nabla g_1(u) + \dots + a_k \nabla g_k(u) = 0,$$
then $a_0 = a_1 = \dots = a_k = 0$. Then there are points $u_1$ and $u_2$ arbitrarily close to $u$ such that $g_i(u) = g_i(u_1) = g_i(u_2)$ for $i = 1, \dots, k$ and $f(u_1) < f(u) < f(u_2)$.

VI.4. Conservation of energy.

Definition 6.4.1. Let $f\colon U \to \mathbb{R}^n$ be a differentiable function on an open set $U \subseteq \mathbb{R}^n$ (i.e., $f$ is a differentiable vector field on $U$). Then $f$ is conservative if there is a differentiable function $\phi\colon U \to \mathbb{R}$ such that $f = \operatorname{grad} \phi$. In that case, we call $\psi := -\phi$ a potential (energy) function for $f$ (so, in that case, $f = -\operatorname{grad} \psi$).
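Whether a given $\phi$ works in Definition 6.4.1 can be spot-checked numerically by comparing the field with a finite-difference gradient. A sketch for a hand-picked example (the field and $\phi$ are ours):

```python
# The vector field f(x, y) = (2x, 2y) is conservative with
# phi(x, y) = x**2 + y**2, since f = grad(phi); the potential is psi = -phi.

def phi(x, y):
    return x ** 2 + y ** 2

def grad_phi(x, y, h=1e-6):
    """Central-difference approximation to grad(phi)."""
    return ((phi(x + h, y) - phi(x - h, y)) / (2 * h),
            (phi(x, y + h) - phi(x, y - h)) / (2 * h))

def f(x, y):
    return (2 * x, 2 * y)

g = grad_phi(1.0, 2.0)
print(g, f(1.0, 2.0))  # numerically equal: approximately (2, 4) each
```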
Definition 6.4.2. Let $f\colon U \to \mathbb{R}^n$ be a differentiable function on an open set $U \subseteq \mathbb{R}^n$, and let $h\colon I \to U$ be a function on an open subset $I \subseteq \mathbb{R}$ whose second derivatives exist. Suppose there is a constant $m$ such that
$$f(h(t)) = m \, h''(t).$$
Then $h$ is said to satisfy Newton's Law.

Definition 6.4.3. With notation as in Definition 6.4.2, define the kinetic energy of $h$ to be $\frac{1}{2} m \|h'(t)\|^2$.

theorems

Theorem 6.4.1. With notation as in Definition 6.4.2, the sum of the potential and the kinetic energy, $\psi(h(t)) + \frac{1}{2} m \|h'(t)\|^2$, is constant.
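Theorem 6.4.1 can be checked on a hand-picked one-dimensional example (our choice, not the course's): for the field $f(x) = -x$ with $m = 1$ we have $f = \operatorname{grad}\phi$ with $\phi(x) = -x^2/2$, so $\psi(x) = x^2/2$, and Newton's law $f(h(t)) = h''(t)$ becomes $h'' = -h$, solved by $h(t) = \cos t$.

```python
import math

def energy(t):
    """Total energy psi(h(t)) + (1/2) h'(t)^2 along h(t) = cos(t)."""
    h, h_prime = math.cos(t), -math.sin(t)
    psi = h ** 2 / 2            # potential energy
    kinetic = h_prime ** 2 / 2  # kinetic energy (Definition 6.4.3, m = 1)
    return psi + kinetic

for t in (0.0, 0.7, 1.9, 3.0):
    print(t, energy(t))  # always 0.5: total energy is conserved
```

Here the conservation is just the identity $\cos^2 t + \sin^2 t = 1$, which is exactly what Theorem 6.4.1 predicts for this system.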