Math 452 - Advanced Calculus II

Manifolds and Lagrange Multipliers

In this section, we will investigate the structure of critical points of differentiable functions. In practice, one is often trying to maximize or minimize a particular function subject to certain constraints. When these constraints are sufficiently nice, i.e., they are the zero sets of continuously differentiable maps, we can use the extra structure provided by the constraints both to find and to characterize the critical points of our original map. This is called the Lagrange multiplier method.

In order to do this, we need to introduce the notion of a manifold in $\mathbb{R}^n$, which is a subset that locally looks like some lower-dimensional Euclidean space. Roughly, a $k$-manifold $M \subseteq \mathbb{R}^n$ has the property that an open neighborhood of any given point in $M$ looks like an open neighborhood of a point in $\mathbb{R}^k$ for some fixed $k \le n$. We will make this more precise shortly. Manifolds have several nice features reminiscent of $\mathbb{R}^n$, and this extra geometric structure will allow us to translate problems about differentiable maps attaining critical values on a manifold into features of the manifold itself.

What we will see is that certain nice subsets of zero sets of continuously differentiable maps are often manifolds whose internal geometry is determined by the differentiable maps in question. The Lagrange multiplier method thus involves solving critical value problems for some differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ on some manifold $M \subseteq \mathbb{R}^n$ defined by some continuously differentiable map $g : \mathbb{R}^n \to \mathbb{R}^m$, via direct calculations involving various quantities related to the map $g$ itself.

1 Critical points of functions along curves

We first want to work out the easiest situation, when the critical point of a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ lies in some open set in $\mathbb{R}^n$ on which $f$ is defined and differentiable.

Definition 1. Let $D$ be a compact subset of $\mathbb{R}^n$.
We say that the function $f : D \to \mathbb{R}$ has a local maximum (respectively, local minimum) on $D$ at the point $p \in D$ if and only if there exists an open ball $B \subseteq D$ centered at $p$ such that $f(x) \le f(p)$ [respectively, $f(x) \ge f(p)$] for all points $x \in B$.

Recall the well-known result from single-variable calculus that if the differentiable function $f : \mathbb{R} \to \mathbb{R}$ has a local maximum or local minimum at $p \in \mathbb{R}$, then $f'(p) = 0$.

Theorem 1. Let $S \subseteq \mathbb{R}^n$, and let $\varphi : \mathbb{R} \to S$ be a differentiable curve with $\varphi(0) = a$. If $f$ is a differentiable real-valued function defined on some open set containing $S$, and $f$ has a local
maximum (or local minimum) on $S$ at $a$, then the gradient vector $\nabla f(a)$ is orthogonal to the velocity vector $\varphi'(0)$.

Proof. Use the chain rule: the single-variable function $f \circ \varphi$ has a local maximum (or local minimum) at $0$, so $0 = (f \circ \varphi)'(0) = \nabla f(a) \cdot \varphi'(0)$.

Corollary 1. If $U$ is an open set of $\mathbb{R}^n$ and $a \in U$ is a point at which the differentiable function $f : U \to \mathbb{R}$ has a local maximum or local minimum, then $\nabla f(a) = 0$.

Proof. It suffices to prove that $\nabla f(a) \cdot v = 0$ for every $v \in \mathbb{R}^n$. For each $v \in \mathbb{R}^n$, define $\varphi : \mathbb{R} \to \mathbb{R}^n$ by $\varphi(t) = a + tv$, so that $\varphi'(t) \equiv v$. Now use Theorem 1.

2 Single-constraint Optimization

In this section, we will focus on single-constraint optimization problems. We will begin by introducing the notion of an $(n-1)$-manifold in $\mathbb{R}^n$.

2.1 $(n-1)$-manifolds in $\mathbb{R}^n$

As suggested in the discussion at the beginning, the key idea here is that every point in an $(n-1)$-manifold in $\mathbb{R}^n$ has an open neighborhood which looks like an open neighborhood of a point in $\mathbb{R}^{n-1}$. We now make this precise.

Definition 2. The projection mapping $\pi_i : \mathbb{R}^n \to \mathbb{R}^{n-1}$ is defined by removing the $i$-th coordinate:
$$\pi_i(x_1, \dots, x_n) = (x_1, \dots, \widehat{x_i}, \dots, x_n) = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \in \mathbb{R}^{n-1}.$$

The open neighborhoods mentioned above are called patches. The idea of a patch is that, up to reordering the coordinates, a patch in an $(n-1)$-manifold looks like the graph of some differentiable function $h : U \to \mathbb{R}$ for some open $U \subseteq \mathbb{R}^{n-1}$.

Definition 3 ($(n-1)$-dimensional patch). The set $P \subseteq \mathbb{R}^n$ is called an $(n-1)$-dimensional patch if and only if for some integer $i$ with $1 \le i \le n$ there exist an open set $U \subseteq \mathbb{R}^{n-1}$ and a differentiable function $h : U \to \mathbb{R}$ such that
$$P = \{x \in \mathbb{R}^n : \pi_i(x) \in U \text{ and } x_i = h(\pi_i(x))\}.$$
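To illustrate the definition of a patch, here is a small worked example. The unit circle in $\mathbb{R}^2$ and the names $P_{\mathrm{top}}$ and $P_{\mathrm{right}}$ are chosen here purely for illustration and do not appear in the notes.

```latex
% Illustrative example (not from the notes): two patches on the unit circle, n = 2.
% Top open arc: take i = 2, so \pi_2(x_1, x_2) = x_1, with U = (-1, 1).
\[
  P_{\mathrm{top}} = \{ x \in \mathbb{R}^2 : x_1 \in (-1,1) \text{ and } x_2 = h(x_1) \},
  \qquad h(t) = \sqrt{1 - t^2}.
\]
% h is differentiable on U because 1 - t^2 > 0 there.
% Right open arc: take i = 1, so \pi_1(x_1, x_2) = x_2, with the same U and h.
\[
  P_{\mathrm{right}} = \{ x \in \mathbb{R}^2 : x_2 \in (-1,1) \text{ and } x_1 = h(x_2) \}.
\]
% The point (1, 0) lies in P_right but not in P_top.
```

The point $(1, 0)$ shows why the index $i$ is allowed to vary: near it the circle is not the graph of a differentiable function of $x_1$, but it is the graph of a differentiable function of $x_2$.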
Remark 1. The definition of an $(n-1)$-dimensional patch is equivalent to having a permutation $x_{i_1}, \dots, x_{i_n}$ of the coordinates $x_1, \dots, x_n$ and a differentiable function $h : U \to \mathbb{R}$ on an open set $U \subseteq \mathbb{R}^{n-1}$ such that
$$P = \{x \in \mathbb{R}^n : (x_{i_1}, \dots, x_{i_{n-1}}) \in U \text{ and } x_{i_n} = h(x_{i_1}, \dots, x_{i_{n-1}})\}.$$
This observation is relevant to how we will define a $k$-dimensional manifold later.

Remark 2. The terminology "patch" is not standard. In differential geometry and manifold theory, patches are called charts.

We are now ready for one of our main definitions:

Definition 4 ($(n-1)$-manifold). The set $M \subseteq \mathbb{R}^n$ is called an $(n-1)$-dimensional manifold if and only if each point $a \in M$ lies in an open subset $U \subseteq \mathbb{R}^n$ such that $U \cap M$ is an $(n-1)$-dimensional patch.

2.2 Manifolds have tangent planes

As suggested before, a key property of a manifold is having a tangent plane of the same dimension at each point. We now make this latter concept precise:

Definition 5 (Tangent planes). A set $M \subseteq \mathbb{R}^n$ is said to have a $k$-dimensional tangent plane at the point $a \in M$ if the union of all tangent lines to differentiable curves on $M$ passing through $a$ is a $k$-dimensional plane.

Theorem 2. If $M$ is an $(n-1)$-dimensional manifold in $\mathbb{R}^n$, then at each of its points $M$ has an $(n-1)$-dimensional tangent plane.

Proof. Let $a \in M$. It suffices to prove that the set of all velocity vectors to differentiable curves on $M$ passing through $a$ is an $(n-1)$-dimensional subspace of $\mathbb{R}^n$. Since $M$ is a manifold, there exists a differentiable function $h$, defined on an open subset of $\mathbb{R}^{n-1}$ and taking values in $\mathbb{R}$, such that for some $i$ we have $x_i = h(x_1, \dots, \widehat{x_i}, \dots, x_n)$ for all points $(x_1, \dots, x_n) \in M$ sufficiently close to $a$. Without loss of generality, we may assume that $i = n$.

Let $\varphi : \mathbb{R} \to M$ be a differentiable curve with $\varphi(0) = a$. Let $\pi_n : \mathbb{R}^n \to \mathbb{R}^{n-1}$ be the standard projection, and define $\psi : \mathbb{R} \to \mathbb{R}^{n-1}$ by $\psi = \pi_n \circ \varphi$. By the definition of a manifold and the choice of $h$, there exists some open $U \subseteq \mathbb{R}^n$ containing $a$ such that $\varphi(t) = (\psi(t), h(\psi(t)))$
when $\varphi(t) \in U \cap M$, which happens when $t$ is sufficiently close to $0$. Setting $\psi(0) = b$, we now apply the chain rule and obtain
$$\varphi'(0) = \big(\psi'(0), \nabla h(b) \cdot \psi'(0)\big) = \sum_{i=1}^{n-1} \varphi_i'(0)\,(e_i, D_i h(b)),$$
where $e_1, \dots, e_{n-1}$ are the unit basis vectors in $\mathbb{R}^{n-1}$. Hence $\varphi'(0)$ lives in the $(n-1)$-dimensional vector subspace of $\mathbb{R}^n$ spanned by the $n-1$ vectors
$$(e_1, D_1 h(b)), \dots, (e_{n-1}, D_{n-1} h(b)),$$
which are clearly linearly independent.

Since we have shown that the tangent line at $a$ of any given differentiable curve lying on $M$ and passing through $a$ lies in this subspace, it now suffices to prove that any vector in this subspace is the velocity vector of some curve of this type. To that end, let
$$v = \sum_{i=1}^{n-1} v_i\,(e_i, D_i h(b)),$$
so that $v$ lies in this subspace. Define a differentiable curve
$$\varphi_v(t) = (b + tw, h(b + tw)), \quad \text{where } w = (v_1, \dots, v_{n-1}) \in \mathbb{R}^{n-1}.$$
Since $U$ is open and $\varphi_v$ is continuous with $\varphi_v(0) = a$, there exists $\epsilon > 0$ such that $\varphi_v(t) \in U \cap M$ whenever $|t| < \epsilon$. Hence, we get a differentiable curve $\varphi_v : (-\epsilon, \epsilon) \to M$ (differentiability of $\varphi_v$ follows easily from the fact that $h$ is differentiable) such that $\varphi_v(0) = a$ and $\varphi_v'(0) = v$, which is what we wanted.

2.3 Recognizing manifolds

In this subsection, we will state one theorem (to be proven later) and prove another. The latter states that the zero sets of continuously differentiable functions are manifolds wherever the gradient does not vanish, and it gives a specific description of their tangent planes. We will prove the next theorem in the next chapter:

Theorem (Implicit Function Theorem). Let $g : \mathbb{R}^n \to \mathbb{R}$ be continuously differentiable and suppose that $g(a) = 0$ and $D_n g(a) \ne 0$. Then there exist
a neighborhood $U \subseteq \mathbb{R}^n$ of $a$, a neighborhood $V \subseteq \mathbb{R}^{n-1}$ of $(a_1, \dots, a_{n-1})$, and a differentiable function $f : V \to \mathbb{R}$ such that
$$U \cap g^{-1}(0) = \{x \in \mathbb{R}^n : (x_1, \dots, x_{n-1}) \in V \text{ and } x_n = f(x_1, \dots, x_{n-1})\}.$$

Remark 3. Note the similarity between the role of the function $f$ in the IFT and that of the function $h$ in the definition of an $(n-1)$-manifold.

Theorem 3. Suppose that $g : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. If
$$M = \{x \in g^{-1}[0] \subseteq \mathbb{R}^n : \nabla g(x) \ne 0\},$$
then $M$ is an $(n-1)$-manifold. Moreover, if $a \in M$, then $\nabla g(a)$ is orthogonal to the tangent plane to $M$ at $a$.

Proof. Let $a \in M$, so that $g(a) = 0$ but $\nabla g(a) \ne 0$. Then there exists $1 \le i \le n$ such that $D_i g(a) \ne 0$. In order to apply the IFT (Theorem 2.3) as stated, we define a new function $G : \mathbb{R}^n \to \mathbb{R}$ by
$$G(x_1, \dots, x_n) = g(x_1, \dots, x_{i-1}, x_n, x_i, \dots, x_{n-1}),$$
which has the property that $G(b) = 0$ and $D_n G(b) \ne 0$, where $b = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_n, a_i)$. By the Implicit Function Theorem 2.3, there exist open sets $U \subseteq \mathbb{R}^n$ and $V \subseteq \mathbb{R}^{n-1}$ and a differentiable function $F : V \to \mathbb{R}$ such that
$$U \cap G^{-1}(0) = \{x \in \mathbb{R}^n : (x_1, \dots, x_{n-1}) \in V \text{ and } x_n = F(x_1, \dots, x_{n-1})\}.$$
Set
$$W = \{(x_1, \dots, x_n) \in \mathbb{R}^n : (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n, x_i) \in U\}.$$
Then
$$W \cap M = \{(x_1, \dots, x_n) \in W : x_i = F(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)\},$$
making $W \cap M$ an $(n-1)$-dimensional patch by definition. Hence $M$ is an $(n-1)$-manifold.

To see the "moreover" statement, it suffices to prove that $\nabla g(a)$ is orthogonal to the velocity vector at $a$ of any differentiable curve on $M$ passing through $a$. Thus let $\varphi : \mathbb{R} \to M$ be a differentiable curve with $\varphi(0) = a$. Since $g(x) = 0$ for all $x \in M$, we see that $g \circ \varphi \equiv 0$. Hence the chain rule gives us
$$\nabla g(a) \cdot \varphi'(0) = (g \circ \varphi)'(0) = 0,$$
which completes the proof.
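As a quick sanity check of Theorem 3, consider the following worked example. The choice of $g$ (the unit sphere in $\mathbb{R}^3$), the point $a$, and the curve $\varphi$ are assumptions made here for illustration and are not taken from the notes.

```latex
% Illustrative example (not from the notes): the unit sphere in R^3.
\begin{align*}
  g(x, y, z) &= x^2 + y^2 + z^2 - 1, &
  \nabla g(x, y, z) &= (2x, 2y, 2z).
\end{align*}
% The gradient vanishes only at the origin, which is not on g^{-1}[0],
% so M = g^{-1}[0] is the whole sphere and Theorem 3 makes it a 2-manifold.
% Check orthogonality at a = (0, 0, 1) along one curve on M:
\begin{align*}
  \varphi(t) &= (\sin t, 0, \cos t), &
  \varphi(0) &= (0, 0, 1) = a, &
  \varphi'(0) &= (1, 0, 0), \\
  \nabla g(a) &= (0, 0, 2), &
  \nabla g(a) \cdot \varphi'(0) &= 0.
\end{align*}
% Contrast: for g(x, y) = xy the zero set is the union of the two axes and
% \nabla g(x, y) = (y, x) vanishes at the origin, so (0, 0) is excluded from M.
```

The contrast case suggests why the hypothesis $\nabla g(x) \ne 0$ matters: at the crossing point of the two axes the zero set does not look like the graph of a single function.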
Theorem 4. Suppose $g : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and let $M$ be the set of points $x \in \mathbb{R}^n$ at which $g(x) = 0$ and $\nabla g(x) \ne 0$. If the differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ attains a local maximum or minimum on $M$ at the point $a \in M$, then
$$\nabla f(a) = \lambda \nabla g(a)$$
for some number $\lambda$, called the Lagrange multiplier.

Proof. By Theorem 3, $M$ is an $(n-1)$-manifold, so $M$ has an $(n-1)$-dimensional tangent plane by Theorem 2. By Theorems 1 and 3, the vectors $\nabla f(a)$ and $\nabla g(a)$ are both orthogonal to this tangent plane. Since the orthogonal complement of an $(n-1)$-dimensional subspace of $\mathbb{R}^n$ is 1-dimensional, it follows that $\nabla f(a)$ and $\nabla g(a)$ are collinear. Since $\nabla g(a) \ne 0$, this implies that $\nabla f(a)$ is a scalar multiple of $\nabla g(a)$, completing the proof.

3 Multiple-constraint Optimization

Definition 6. The set $P \subseteq \mathbb{R}^n$ is called a $k$-dimensional patch if and only if there exist a permutation $x_{i_1}, \dots, x_{i_n}$ of $x_1, \dots, x_n$ and a differentiable function $h : U \to \mathbb{R}^{n-k}$, for $U \subseteq \mathbb{R}^k$ open, such that
$$P = \{x \in \mathbb{R}^n : (x_{i_1}, \dots, x_{i_k}) \in U \text{ and } (x_{i_{k+1}}, \dots, x_{i_n}) = h(x_{i_1}, \dots, x_{i_k})\}.$$

Definition 7. The set $M \subseteq \mathbb{R}^n$ is called a $k$-dimensional manifold if and only if each point $a \in M$ lies in an open subset $U \subseteq \mathbb{R}^n$ such that $U \cap M$ is a $k$-dimensional patch.

Theorem 5. If $M$ is a $k$-dimensional manifold in $\mathbb{R}^n$ then, at each of its points, $M$ has a $k$-dimensional tangent plane.

Proof. On the blackboard.

Theorem (Implicit Mapping Theorem). Let $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) be a continuously differentiable map. Suppose that $g(a) = 0$ and that the rank of the derivative matrix $g'(a)$ is $m$. Then there exist a permutation $x_{i_1}, \dots, x_{i_n}$ of the coordinates in $\mathbb{R}^n$, an open set $U \subseteq \mathbb{R}^n$ containing $a$, an open subset $V \subseteq \mathbb{R}^{n-m}$ containing $b = \pi_{[n-m+1,n]}(a_{i_1}, \dots, a_{i_n}) = (a_{i_1}, \dots, a_{i_{n-m}})$, and a differentiable mapping $h : V \to \mathbb{R}^m$ such that each point $x \in U$ lies on $S = g^{-1}[0]$ if and only if $(x_{i_1}, \dots, x_{i_{n-m}}) \in V$ and
$$(x_{i_{n-m+1}}, \dots, x_{i_n}) = h(x_{i_1}, \dots, x_{i_{n-m}}).$$

Theorem 6. Suppose that $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) is continuously differentiable.
If $M$ is the set of all points $x \in S = g^{-1}[0]$ for which the rank of $g'(x)$ is $m$, then $M$ is an $(n-m)$-manifold. Given $a \in M$, the gradient vectors $\nabla g_1(a), \dots, \nabla g_m(a)$ are all orthogonal to the tangent plane to $M$ at $a$.
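The following small example may help make the rank condition concrete. The map $g$ (a cylinder cut by a plane in $\mathbb{R}^3$), the point $a$, and the curve $\varphi$ are illustrative assumptions, not examples from the notes.

```latex
% Illustrative example (not from the notes): n = 3, m = 2.
\begin{align*}
  g_1(x, y, z) &= x^2 + y^2 - 1, & \nabla g_1 &= (2x, 2y, 0), \\
  g_2(x, y, z) &= z - x,         & \nabla g_2 &= (-1, 0, 1).
\end{align*}
% On g^{-1}[0] we have x^2 + y^2 = 1, so \nabla g_1 is nonzero with third
% component 0, while \nabla g_2 has third component 1. The two gradients are
% therefore linearly independent, g'(x) has rank 2, and M = g^{-1}[0] is a
% 1-manifold (an ellipse).
% Check orthogonality at a = (1, 0, 1) along a curve on M:
\begin{align*}
  \varphi(t) &= (\cos t, \sin t, \cos t), &
  \varphi(0) &= a, &
  \varphi'(0) &= (0, 1, 0), \\
  \nabla g_1(a) \cdot \varphi'(0) &= (2, 0, 0) \cdot (0, 1, 0) = 0, &
  \nabla g_2(a) \cdot \varphi'(0) &= (-1, 0, 1) \cdot (0, 1, 0) = 0.
\end{align*}
```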
Theorem 7. Suppose $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) is continuously differentiable and let $M$ be the set of points $x \in \mathbb{R}^n$ such that $g(x) = 0$ and the gradient vectors $\nabla g_1(x), \dots, \nabla g_m(x)$ are linearly independent. If the differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ attains a local maximum or minimum on $M$ at the point $a \in M$, then there exist real numbers $\lambda_1, \dots, \lambda_m$ (called Lagrange multipliers) such that
$$\nabla f(a) = \lambda_1 \nabla g_1(a) + \cdots + \lambda_m \nabla g_m(a).$$
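To see how Theorem 7 is used in a computation, here is a sketch of a two-constraint problem. The objective $f$ and the constraints $g_1, g_2$ are assumptions chosen for illustration and do not come from the notes.

```latex
% Illustrative example (not from the notes): extremize f on the circle
% cut out of the unit sphere by the plane z = 0.
\begin{align*}
  f(x, y, z) &= x + y + z, &
  g_1(x, y, z) &= x^2 + y^2 + z^2 - 1, &
  g_2(x, y, z) &= z.
\end{align*}
% On g = 0 the gradients (2x, 2y, 0) and (0, 0, 1) are linearly independent,
% so every point of the circle lies in M.
% Lagrange condition \nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2:
\begin{align*}
  1 &= 2\lambda_1 x, &
  1 &= 2\lambda_1 y, &
  1 &= 2\lambda_1 z + \lambda_2.
\end{align*}
% With z = 0 the third equation gives \lambda_2 = 1; the first two give
% x = y = 1/(2\lambda_1). Substituting into x^2 + y^2 = 1:
\begin{align*}
  \frac{2}{4\lambda_1^2} = 1
  \quad\Longrightarrow\quad
  \lambda_1 = \pm\frac{1}{\sqrt{2}},
  \qquad
  (x, y, z) = \pm\Big(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0\Big),
  \qquad
  f = \pm\sqrt{2}.
\end{align*}
```

Since the circle is compact and $f$ is continuous, $f$ attains a maximum and a minimum on it, and by Theorem 7 both must occur among these two candidates: the maximum is $\sqrt{2}$ and the minimum is $-\sqrt{2}$.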