Chapter 3. Differentiable Mappings


1. Differentiable Mappings

Let $V$ and $W$ be two linear spaces over $\mathbb{R}$. A mapping $L$ from $V$ to $W$ is called a linear mapping if $L(u+v) = Lu + Lv$ for all $u, v \in V$ and $L(\lambda v) = \lambda(Lv)$ for all $\lambda \in \mathbb{R}$ and $v \in V$. If $L$ is a linear mapping from $V$ to $W$, then for any $c \in \mathbb{R}$, $cL$ is a linear mapping. Moreover, if $L$ and $M$ are two linear mappings from $V$ to $W$, then $L+M$ is a linear mapping. Let $U$, $V$ and $W$ be linear spaces. If $M$ is a linear mapping from $U$ to $V$ and $L$ is a linear mapping from $V$ to $W$, then the composite mapping $L \circ M$ is a linear mapping from $U$ to $W$.

Now assume that $V = \mathbb{R}^k$ and $W = \mathbb{R}^m$, where $k$ and $m$ are positive integers. If $L$ is a linear mapping from $\mathbb{R}^k$ to $\mathbb{R}^m$, then there exists a unique matrix $B = (b_{ij})_{1\le i\le m,\,1\le j\le k}$ such that
$$
L(x) = Bx = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1k} \\ b_{21} & b_{22} & \cdots & b_{2k} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mk} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}, \qquad x = (x_1, x_2, \ldots, x_k) \in \mathbb{R}^k.
$$
The matrix $B$ is the matrix representation of the linear mapping $L$ with respect to the standard bases. We call $B$ the standard matrix of $L$. Often we will use $B$ to denote the linear mapping $x \mapsto Bx$, $x \in \mathbb{R}^k$.

Suppose that $B$ and $C$ are the standard matrices of linear mappings $L$ and $M$ from $\mathbb{R}^k$ to $\mathbb{R}^m$, respectively. Then $B + C$ is the standard matrix of $L + M$. Moreover, if $c$ is a real number, then $cB$ is the standard matrix of $cL$. Furthermore, if $C$ is the standard matrix of a linear mapping $M$ from $\mathbb{R}^d$ to $\mathbb{R}^k$, and if $B$ is the standard matrix of a linear mapping $L$ from $\mathbb{R}^k$ to $\mathbb{R}^m$, then $BC$ is the standard matrix of the composite linear mapping $L \circ M$ from $\mathbb{R}^d$ to $\mathbb{R}^m$.

The norm of a linear mapping $L$ from $\mathbb{R}^k$ to $\mathbb{R}^m$ is defined by
$$
\|L\| := \sup\{\|L(x)\| : x \in \mathbb{R}^k,\ \|x\| \le 1\}.
$$
Thus $\|L\| \ge 0$. It is easily seen that $\|L\| = 0$ if and only if $L(x) = 0$ for all $x \in \mathbb{R}^k$. Moreover, for any real number $c$, $\|cL\| = |c|\,\|L\|$. If $M$ is also a linear mapping from $\mathbb{R}^k$ to $\mathbb{R}^m$, then $\|L + M\| \le \|L\| + \|M\|$.
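The correspondence between composition of linear mappings and multiplication of their standard matrices can be checked numerically. A minimal sketch in pure Python (the dimensions and matrix entries below are arbitrary illustrative choices, not from the text):

```python
# Sketch: the standard matrix of L o M is the product B C, where B is the
# standard matrix of L and C that of M.  Entries are illustrative choices.

def mat_vec(A, x):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(B, C):
    """Multiply an m-by-k matrix B with a k-by-d matrix C."""
    d = len(C[0])
    return [[sum(B[i][t] * C[t][j] for t in range(len(C))) for j in range(d)]
            for i in range(len(B))]

B = [[1.0, 2.0, 0.0],   # standard matrix of L : R^3 -> R^2
     [0.0, 1.0, 3.0]]
C = [[2.0, 1.0],        # standard matrix of M : R^2 -> R^3
     [0.0, 1.0],
     [1.0, 1.0]]

x = [1.0, -2.0]
# L(M(x)) computed by composing the mappings ...
lhs = mat_vec(B, mat_vec(C, x))
# ... equals (BC)x computed from the product matrix.
rhs = mat_vec(mat_mul(B, C), x)
assert lhs == rhs
```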

Let $U$ be a nonempty open subset of $\mathbb{R}^k$. A mapping $f$ from $U$ to $\mathbb{R}^m$ is said to be differentiable at a point $a$ in $U$ if there exists a linear mapping $L$ from $\mathbb{R}^k$ to $\mathbb{R}^m$ such that
$$
\lim_{x \to a} \frac{\|f(x) - f(a) - L(x-a)\|}{\|x - a\|} = 0.
$$
The linear mapping $L$ satisfying the above condition is unique. This linear mapping is denoted by $df_a$ and is called the differential of $f$ at $a$. If $f$ is differentiable at every point in $U$, then we call $f$ a differentiable mapping from $U$ to $\mathbb{R}^m$.

Theorem 1.1. Let $f$ be a mapping from an open set $U$ in $\mathbb{R}^k$ to $\mathbb{R}^m$, and let $a$ be a point in $U$. Suppose that $f(x) = (f_1(x), f_2(x), \ldots, f_m(x))$ for $x \in U$, where $f_1, f_2, \ldots, f_m$ are real-valued functions on $U$. Then $f$ is differentiable at $a$ if and only if $f_1, f_2, \ldots, f_m$ are differentiable at $a$. If this is the case, then the standard matrix of the differential $df_a$ is
$$
\begin{bmatrix} D_1 f_1(a) & D_2 f_1(a) & \cdots & D_k f_1(a) \\ D_1 f_2(a) & D_2 f_2(a) & \cdots & D_k f_2(a) \\ \vdots & \vdots & & \vdots \\ D_1 f_m(a) & D_2 f_m(a) & \cdots & D_k f_m(a) \end{bmatrix}.
$$

Proof. Suppose that $f$ is differentiable at $a$ and the standard matrix of the differential $df_a$ is $B = (b_{ij})_{1\le i\le m,\,1\le j\le k}$. For each $i \in \{1, \ldots, m\}$, let $v_i$ be the $i$th row vector $(b_{i1}, \ldots, b_{ik}) \in \mathbb{R}^k$. We have
$$
|f_i(x) - f_i(a) - \langle v_i, x-a\rangle| \le \|f(x) - f(a) - L(x-a)\|.
$$
It follows that
$$
\lim_{x\to a} \frac{|f_i(x) - f_i(a) - \langle v_i, x-a\rangle|}{\|x-a\|} = 0.
$$
Hence, for each $i \in \{1, \ldots, m\}$, $f_i$ is differentiable at $a$ and $b_{ij} = D_j f_i(a)$ for $1 \le i \le m$ and $1 \le j \le k$.

Conversely, suppose that $f_i$ is differentiable at $a$ for each $i \in \{1, \ldots, m\}$. For each $i$, there exists a vector $v_i \in \mathbb{R}^k$ such that
$$
\lim_{x\to a} \frac{|f_i(x) - f_i(a) - \langle v_i, x-a\rangle|}{\|x-a\|} = 0.
$$
Let $B$ be the $m \times k$ matrix with $v_1, v_2, \ldots, v_m$ as its rows, and let $L$ be the corresponding linear mapping from $\mathbb{R}^k$ to $\mathbb{R}^m$. Then we have
$$
\|f(x) - f(a) - L(x-a)\| \le \sum_{i=1}^m |f_i(x) - f_i(a) - \langle v_i, x-a\rangle|.
$$

Consequently,
$$
\lim_{x\to a} \frac{\|f(x)-f(a)-L(x-a)\|}{\|x-a\|} = 0.
$$
This shows that the mapping $f$ is differentiable at $a$.

We define the Jacobian matrix of $f$ at $a$ to be
$$
Df(a) := \big(D_j f_i(a)\big)_{1\le i\le m,\,1\le j\le k}.
$$

An application of the mean value theorem gives the following useful result for differentiable mappings.

Theorem 1.2. Let $f = (f_1, \ldots, f_m)$ be a differentiable mapping from an open set $U$ in $\mathbb{R}^k$ to $\mathbb{R}^m$. Let $a$ and $b$ be two distinct points in $U$ such that the closed line segment $[a,b]$ is contained in $U$. If $K$ is a real number such that $\|Df(x)\| \le K$ for all $x$ in the open line segment $(a,b)$, then $\|f(b)-f(a)\| \le K\|b-a\|$.

Proof. The theorem is obviously true if $f(b) = f(a)$. In what follows we assume that $f(b) \ne f(a)$. For a vector $v = (v_1,\ldots,v_m) \in \mathbb{R}^m$ we define
$$
h(x) := \langle v, f(x)\rangle = v_1 f_1(x) + \cdots + v_m f_m(x), \qquad x \in U.
$$
Then $h$ is a differentiable function on $U$. By the mean value theorem (Theorem 4.1 in Chapter 2), there exists some $x \in (a,b)$ such that
$$
h(b) - h(a) = \langle \nabla h(x),\, b-a\rangle = v\,[Df(x)]\,(b-a),
$$
where $v$ is regarded as a $1\times m$ matrix, $Df(x)$ is an $m\times k$ matrix, and $b-a$ is regarded as a $k\times 1$ matrix. Since $\|Df(x)\| \le K$, it follows that
$$
\langle v, f(b)-f(a)\rangle \le \|v\|\,\|Df(x)\|\,\|b-a\| \le K\,\|v\|\,\|b-a\|.
$$
Choosing $v := [f(b)-f(a)]/\|f(b)-f(a)\|$ in the above inequalities, we obtain $\|v\| = 1$ and $\langle v, f(b)-f(a)\rangle = \|f(b)-f(a)\|$. Therefore, $\|f(b)-f(a)\| \le K\|b-a\|$.

The following theorem gives the chain rule for the composition of two differentiable mappings.

Theorem 1.3. Let $f$ be a mapping from an open set $U$ in $\mathbb{R}^k$ to $\mathbb{R}^m$, and let $g$ be a mapping from an open set $V$ in $\mathbb{R}^m$ to $\mathbb{R}^n$. Suppose that $a \in U$ and $f(U) \subseteq V$. If $f$ is differentiable at $a$, and if $g$ is differentiable at $b := f(a)$, then the composite mapping $g \circ f$ is differentiable at $a$ and
$$
d(g\circ f)_a = dg_{f(a)} \circ df_a.
$$
Consequently, $D(g\circ f)(a) = Dg(f(a))\,Df(a)$.

Proof. We write $S$ for $dg_b$ and write $T$ for $df_a$. Let $\varepsilon > 0$. Since $g$ is differentiable at $b = f(a)$, there exists $r > 0$ such that $y \in B_r(b)$ implies $y \in V$ and
$$
\|g(y) - g(b) - S(y-b)\| \le \varepsilon\|y-b\|.
$$
Since $f$ is differentiable at $a$, there exists $\delta > 0$ such that $x \in B_\delta(a)$ implies $x \in U$, $f(x) \in B_r(b)$ and
$$
\|f(x)-f(a)-T(x-a)\| \le \varepsilon\|x-a\|.
$$
In what follows we assume that $x \in B_\delta(a)$. Then we have
$$
\|f(x)-f(a)\| \le \|f(x)-f(a)-T(x-a)\| + \|T(x-a)\| \le (\varepsilon + \|T\|)\|x-a\|.
$$
Moreover,
$$
\|S(f(x)-f(a)) - ST(x-a)\| \le \|S\|\,\|f(x)-f(a)-T(x-a)\| \le \varepsilon\|S\|\,\|x-a\|.
$$
Since $f(x)$ lies in $B_r(b)$, we have
$$
\|g(f(x)) - g(f(a)) - S(f(x)-f(a))\| \le \varepsilon\|f(x)-f(a)\|.
$$
By using the triangle inequality, we derive from the above inequalities that
$$
\|g\circ f(x) - g\circ f(a) - ST(x-a)\| \le \varepsilon\|S\|\,\|x-a\| + \varepsilon\|f(x)-f(a)\| \le \varepsilon(\|S\| + \varepsilon + \|T\|)\|x-a\|.
$$
This shows that $g\circ f$ is differentiable at $a$ and $d(g\circ f)_a = S \circ T$. In other words, $d(g\circ f)_a = dg_{f(a)} \circ df_a$.

In the above theorem, the mapping $f$ can be represented as
$$
y_s = f_s(x_1,\ldots,x_k), \qquad s = 1,\ldots,m, \quad (x_1,\ldots,x_k) \in U,
$$
and the mapping $g$ can be represented as
$$
z_i = g_i(y_1,\ldots,y_m), \qquad i = 1,\ldots,n, \quad (y_1,\ldots,y_m) \in V.
$$

If we use the traditional notation, then the chain rule can be expressed as
$$
\frac{\partial z_i}{\partial x_j} = \sum_{s=1}^m \frac{\partial z_i}{\partial y_s}\frac{\partial y_s}{\partial x_j}, \qquad i = 1,\ldots,n, \quad j = 1,\ldots,k.
$$

Example. Let $f$ be the mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by
$$
u = \rho\cos\theta, \quad v = \rho\sin\theta, \qquad (\rho,\theta) \in \mathbb{R}^2,
$$
and let $g$ be the mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by
$$
x = u^2 - v^2, \quad y = 2uv, \qquad (u,v) \in \mathbb{R}^2.
$$
We have
$$
Df(\rho,\theta) = \begin{bmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{bmatrix}
\quad\text{and}\quad
Dg(u,v) = \begin{bmatrix} 2u & -2v \\ 2v & 2u \end{bmatrix}.
$$
By the chain rule we obtain $D(g\circ f)(\rho,\theta) = Dg(\rho\cos\theta, \rho\sin\theta)\,Df(\rho,\theta)$. Consequently,
$$
D(g\circ f)(\rho,\theta) = \begin{bmatrix} 2\rho\cos\theta & -2\rho\sin\theta \\ 2\rho\sin\theta & 2\rho\cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{bmatrix} = \begin{bmatrix} 2\rho\cos(2\theta) & -2\rho^2\sin(2\theta) \\ 2\rho\sin(2\theta) & 2\rho^2\cos(2\theta) \end{bmatrix}.
$$

2. The Jacobian Determinant

Let $f$ be a differentiable mapping from an open set $U$ in $\mathbb{R}^k$ to $\mathbb{R}^k$. For a point $a \in U$, the Jacobian determinant of $f$ at $a$ is defined to be
$$
J_f(a) := \det(Df(a)) = \det\big(D_j f_i(a)\big)_{1\le i,j\le k}.
$$

Example 1. Let $f$ be the mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by $u = \rho\cos\theta$, $v = \rho\sin\theta$, $(\rho,\theta)\in\mathbb{R}^2$. Then
$$
J_f(\rho,\theta) = \begin{vmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{vmatrix} = \rho.
$$

Example 2. Let $g$ be the mapping from $\mathbb{R}^3$ to $\mathbb{R}^3$ given by
$$
x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi, \quad z = \rho\cos\phi, \qquad (\rho,\theta,\phi)\in\mathbb{R}^3.
$$

Then
$$
J_g(\rho,\theta,\phi) = \begin{vmatrix} \cos\theta\sin\phi & -\rho\sin\theta\sin\phi & \rho\cos\theta\cos\phi \\ \sin\theta\sin\phi & \rho\cos\theta\sin\phi & \rho\sin\theta\cos\phi \\ \cos\phi & 0 & -\rho\sin\phi \end{vmatrix} = -\rho^2\sin\phi.
$$

We are in a position to review basic properties of determinants. Let $A = (a_{ij})_{1\le i,j\le n}$ be an $n\times n$ matrix of real numbers. If $n = 1$, we define $\det(a_{11}) := a_{11}$. Suppose that $n > 1$. For a fixed pair $(i,j)$ ($1\le i,j\le n$) we use $A_{ij}$ to denote the $(n-1)\times(n-1)$ matrix obtained by deleting the $i$th row and the $j$th column from $A$. We define
$$
\det A := \sum_{j=1}^n (-1)^{1+j} a_{1j}\det A_{1j}.
$$
In particular, if $A = (a_{ij})_{1\le i,j\le 2}$ is a $2\times 2$ matrix, then
$$
\det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} := a_{11}a_{22} - a_{12}a_{21}.
$$
If $A = (a_{ij})_{1\le i,j\le 3}$ is a $3\times 3$ matrix, then
$$
\det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} := a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.
$$

For an $n\times n$ matrix $A = (a_{ij})_{1\le i,j\le n}$ we use $A_i$ to denote its $i$th column ($i = 1,\ldots,n$). Thus $A$ can be written as $[A_1, A_2, \ldots, A_n]$. By using an induction argument we can easily verify the following properties of determinants.

(d1) If $A$ is the identity matrix, that is, $a_{ij} = 1$ for $i = j$ and $a_{ij} = 0$ for $i \ne j$, then $\det A = 1$.

(d2) The determinant of a matrix $A$ is a multilinear function of its columns. More precisely, if the $i$th column is equal to a sum of two column vectors, $A_i = A_i' + A_i''$, then
$$
\det[A_1,\ldots,A_{i-1}, A_i' + A_i'', A_{i+1},\ldots,A_n] = \det[A_1,\ldots,A_{i-1}, A_i', A_{i+1},\ldots,A_n] + \det[A_1,\ldots,A_{i-1}, A_i'', A_{i+1},\ldots,A_n].
$$
Furthermore, if $c \in \mathbb{R}$, then
$$
\det[A_1,\ldots,A_{i-1}, cA_i, A_{i+1},\ldots,A_n] = c\det[A_1,\ldots,A_{i-1}, A_i, A_{i+1},\ldots,A_n].
$$

(d3) If two adjacent columns of a matrix $A$ are equal, i.e., if $A_i = A_{i+1}$ for some $i$ in $\{1,\ldots,n-1\}$, then $\det A = 0$.

The above three conditions (d1), (d2), and (d3) characterize the properties of determinants. In other words, all the properties of determinants can be derived from (d1), (d2), and (d3). Let us derive the following property:

(d4) If two columns of a matrix are interchanged, then its determinant changes sign.

We establish this property first when two adjacent columns $A_i$ and $A_{i+1}$ are interchanged. By (d3) we have
$$
\det[\ldots, A_i + A_{i+1}, A_i + A_{i+1}, \ldots] = 0.
$$
Applying the multilinear property (d2) to the above determinant, we obtain
$$
\det[\ldots, A_i, A_i, \ldots] + \det[\ldots, A_i, A_{i+1}, \ldots] + \det[\ldots, A_{i+1}, A_i, \ldots] + \det[\ldots, A_{i+1}, A_{i+1}, \ldots] = 0.
$$
By (d3) we have $\det[\ldots, A_i, A_i, \ldots] = 0$ and $\det[\ldots, A_{i+1}, A_{i+1}, \ldots] = 0$. Hence,
$$
\det[\ldots, A_{i+1}, A_i, \ldots] = -\det[\ldots, A_i, A_{i+1}, \ldots].
$$

Now we strengthen property (d3) as follows:

(d5) If two columns of a matrix $A$ are equal, then $\det A = 0$.

Assume that two columns of the matrix $A$ are equal. We can change the matrix by successive interchanges of adjacent columns until we obtain a matrix $A'$ with two equal adjacent columns. By what has been proved for (d4) we have $\det A = \det A'$ or $\det A = -\det A'$. But $\det A' = 0$ by (d3). Hence $\det A = 0$.

We can now finish the proof of (d4). Suppose that the $i$th column and the $j$th column are interchanged, where $1 \le i < j \le n$. By (d5) we have
$$
\det[\ldots, A_i + A_j, \ldots, A_i + A_j, \ldots] = 0.
$$
Expanding the above determinant as before, we obtain
$$
\det[\ldots, A_j, \ldots, A_i, \ldots] = -\det[\ldots, A_i, \ldots, A_j, \ldots].
$$

The following property is also useful.

(d6) If one adds a scalar multiple of one column to another column, then the value of the determinant does not change.
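Properties (d4) and (d6) can be spot-checked numerically, using the first-row cofactor expansion from the definition as the determinant. This is a sanity check on a single arbitrary $3\times 3$ example, not a proof:

```python
# Spot check of (d4) and (d6) on one 3x3 matrix, with det computed by the
# first-row cofactor expansion.  The matrix A is an arbitrary choice.

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def col_op(A, op):
    """Apply op to the list of columns of A and return the resulting matrix."""
    cols = op([list(c) for c in zip(*A)])
    return [list(r) for r in zip(*cols)]

A = [[2, 0, 1], [1, 3, -1], [0, 2, 4]]

# (d4): interchanging two columns changes the sign of the determinant.
swapped = col_op(A, lambda c: [c[2], c[1], c[0]])
assert det(swapped) == -det(A)

# (d6): adding a scalar multiple of one column to another leaves det unchanged.
sheared = col_op(A, lambda c: [[x + 5 * y for x, y in zip(c[0], c[1])], c[1], c[2]])
assert det(sheared) == det(A)
```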

Suppose that the $i$th column $A_i$ of a matrix $A$ is replaced by $A_i + cA_j$, where $j \ne i$ and $c \in \mathbb{R}$. By (d2) we have
$$
\det[\ldots, A_{i-1}, A_i + cA_j, A_{i+1}, \ldots] = \det[\ldots, A_{i-1}, A_i, A_{i+1}, \ldots] + c\det[\ldots, A_{i-1}, A_j, A_{i+1}, \ldots].
$$
There are two determinants on the right of the above equality. The first determinant is just $\det A$, and the second determinant is equal to $0$, since two of its columns are equal. This verifies (d6).

For $c \in \mathbb{R}$ and $i \in \{1,\ldots,n\}$, let $Q_i(c)$ be the matrix obtained from the $n\times n$ identity matrix $I$ by multiplying its $i$th column by $c$. For $1 \le i < j \le n$, we use $P_{ij}$ to denote the matrix obtained from $I$ by interchanging the $i$th column and the $j$th column. For $\alpha \in \mathbb{R}$ and $i \ne j$, we use $R_{ij}(\alpha)$ to denote the matrix obtained from $I$ by adding the $\alpha$ multiple of the $j$th column to the $i$th column. A square matrix is called an elementary matrix if it has one of the forms $P_{ij}$, $Q_i(c)$, or $R_{ij}(\alpha)$.

Theorem 2.1. If $A$ and $B$ are two square matrices of the same size, then $\det(AB) = (\det A)(\det B)$ and $\det(A^T) = \det A$.

Proof. Let $A$ and $B$ be two $n\times n$ matrices. Then $B$ can be written as a product of elementary matrices. Hence, in order to prove $\det(AB) = (\det A)(\det B)$, it suffices to show that $\det(AE) = (\det A)(\det E)$ for each elementary matrix $E$. If $E = P_{ij}$, then $AP_{ij}$ is the matrix obtained by interchanging the $i$th column and the $j$th column of $A$; hence $\det(AP_{ij}) = -\det A = (\det A)(\det P_{ij})$. If $E = Q_i(c)$, then $AQ_i(c)$ is the matrix obtained from $A$ by multiplying its $i$th column by $c$; hence $\det(AQ_i(c)) = c\det A = (\det A)(\det Q_i(c))$. If $E = R_{ij}(\alpha)$, then $AR_{ij}(\alpha)$ is the matrix obtained from $A$ by adding the $\alpha$ multiple of the $j$th column to the $i$th column; hence $\det(AR_{ij}(\alpha)) = \det A = (\det A)(\det R_{ij}(\alpha))$. This completes the proof of $\det(AB) = (\det A)(\det B)$.

Let us show that $\det E^T = \det E$ for any elementary matrix $E$. Indeed, $P_{ij}^T = P_{ij}$ and $Q_i(c)^T = Q_i(c)$. Moreover, $R_{ij}(\alpha)^T = R_{ji}(\alpha)$. Hence, $\det R_{ij}(\alpha)^T = 1 = \det R_{ij}(\alpha)$. The matrix $A$ can be written as $A = E_1\cdots E_k$, where $E_1,\ldots,E_k$ are elementary matrices. We have
$$
\det A^T = \det(E_k^T\cdots E_1^T) = (\det E_k^T)\cdots(\det E_1^T) = (\det E_k)\cdots(\det E_1) = \det A.
$$
This completes the proof of $\det A^T = \det A$.

An $n\times n$ matrix $A$ is said to be invertible if there exists an $n\times n$ matrix $B$ such that $AB = BA = I$. Such a matrix $B$ is uniquely determined by $A$. This matrix $B$ is called the inverse of $A$ and will be denoted by $A^{-1}$.

Theorem 2.2. A square matrix $A$ is invertible if and only if $\det A \ne 0$.

Proof. Let $A$ be an $n\times n$ matrix. If $A$ is invertible, then there exists an $n\times n$ matrix $B$ such that $AB = I$. It follows that
$$
1 = \det I = \det(AB) = (\det A)(\det B).
$$
This shows that $\det A \ne 0$.

If $E$ is an elementary matrix and $\det E \ne 0$, then $E$ is invertible. Indeed, $P_{ij}$ is invertible since $P_{ij}P_{ij} = I$. Moreover, $R_{ij}(\alpha)$ is invertible, because $R_{ij}(\alpha)R_{ij}(-\alpha) = I$. If $E = Q_i(c)$ and $\det E \ne 0$, then $c = \det E \ne 0$. In this case $Q_i(c)$ is invertible, since $Q_i(1/c)Q_i(c) = Q_i(c)Q_i(1/c) = I$.

Now suppose that $\det A \ne 0$. We write $A$ as $A = E_1\cdots E_k$, where $E_1,\ldots,E_k$ are elementary matrices. Since $\det A = \det(E_1)\cdots\det(E_k)$ and $\det A \ne 0$, we have $\det E_j \ne 0$ for each $j \in \{1,\ldots,k\}$. By what has been proved, each $E_j$ is invertible. Consequently,
$$
(E_k^{-1}\cdots E_1^{-1})A = A(E_k^{-1}\cdots E_1^{-1}) = I.
$$
This shows that $A$ is invertible.

3. The Inverse Function Theorem

The main theorem of this section establishes sufficient conditions for the existence of a local inverse of a continuously differentiable mapping.

Theorem 3.1. Let $U$ be an open set in $\mathbb{R}^k$ and let $f = (f_1,\ldots,f_k)$ be a continuously differentiable mapping from $U$ to $\mathbb{R}^k$. Suppose that $a$ is a point in $U$ such that $J_f(a) \ne 0$. Then there exist an open set $U_1$ with $a \in U_1 \subseteq U$ and an open set $V_1$ with $f(a) \in V_1 \subseteq f(U)$ such that $f$ is a one-to-one mapping from $U_1$ onto $V_1$. Moreover, the inverse mapping $g$ of $f|_{U_1}$ is continuously differentiable on $V_1$.

Proof. Let $S$ denote the Jacobian matrix $Df(a) = (D_j f_i(a))_{1\le i,j\le k}$. Since the Jacobian determinant $J_f(a) \ne 0$, the matrix $S$ is invertible. Let $T := S^{-1}$. For given $y \in \mathbb{R}^k$,

consider the mapping $h$ from $U$ to $\mathbb{R}^k$ defined by
$$
h(x) := x - T(f(x) - y), \qquad x \in U.
$$
If there exists $x^* \in U$ such that $h(x^*) = x^*$, then $T(f(x^*) - y) = 0$. Since $T$ is invertible, it follows that $f(x^*) = y$. Thus, the problem of solving the equation $f(x) = y$ is reduced to the problem of finding a fixed point of the mapping $h$.

We observe that
$$
Dh(x) = I - T\,Df(x), \qquad x \in U.
$$
In particular, $Dh(a) = I - T\,Df(a) = I - TS = 0$, where $0$ stands for the $k\times k$ matrix with all entries being $0$. Since $f$ is continuously differentiable on $U$, so is $h$. Hence, there exists some $r > 0$ such that $B_r(a) \subseteq U$ and $\|Dh(x)\| < 1/2$ for all $x \in B_r(a)$. Consequently, the matrix $I - Dh(x)$ is invertible. Thus, the Jacobian matrix $Df(x) = T^{-1}(I - Dh(x))$ is invertible for all $x \in B_r(a)$. Moreover, by Theorem 1.2 we have
$$
\|h(x') - h(x'')\| \le \tfrac12\|x' - x''\| \qquad \text{for all } x', x'' \in B_r(a).
$$
It follows that
$$
\|T[f(x') - f(x'')]\| = \|[x' - x''] - [h(x') - h(x'')]\| \ge \|x' - x''\| - \|h(x') - h(x'')\| \ge \tfrac12\|x' - x''\|.
$$
Therefore,
$$
\|f(x') - f(x'')\| \ge \frac{1}{2\|T\|}\|x' - x''\| \qquad \text{for all } x', x'' \in B_r(a).
$$
In particular, $f|_{B_r(a)}$ is one-to-one.

Let $\delta := r/(2\|T\|)$ and $V_1 := B_\delta(b)$, where $b := f(a)$. Let $y \in V_1$. Our goal is to find $x \in U$ such that $f(x) = y$. For this purpose, we use the following iteration scheme: $x^0 := a$ and $x^{k+1} := h(x^k)$ for $k = 0, 1, 2, \ldots$ We shall use mathematical induction to prove that the statement
$$
P_k:\quad x^{k+1} \in B_r(a) \ \text{ and } \ \|x^{k+1} - x^k\| < \frac{r}{2^{k+1}}
$$
is true for all $k \in \mathbb{N}_0$. For $k = 0$ we have $x^1 - x^0 = -T(f(x^0) - y) = T(y - b)$. It follows that
$$
\|x^1 - a\| = \|x^1 - x^0\| \le \|T\|\,\|y - b\| < \|T\|\,\delta = \frac r2.
$$
This verifies $P_0$. Suppose that $k > 0$ and $P_j$ is true for all $j < k$. By the induction hypothesis, $x^k, x^{k-1} \in B_r(a)$. Hence,
$$
\|x^{k+1} - x^k\| = \|h(x^k) - h(x^{k-1})\| \le \tfrac12\|x^k - x^{k-1}\| < \tfrac12\cdot\frac{r}{2^k} = \frac{r}{2^{k+1}}.
$$

Moreover,
$$
\|x^{k+1} - a\| = \Big\|\sum_{j=0}^{k}(x^{j+1} - x^j)\Big\| \le \sum_{j=0}^{k}\|x^{j+1} - x^j\| < \sum_{j=0}^{k}\frac{r}{2^{j+1}} < r.
$$
Thus $x^{k+1} \in B_r(a)$. This verifies $P_k$ and thereby completes the induction procedure.

Since $\|x^{k+1} - x^k\| < r/2^{k+1}$ for all $k \in \mathbb{N}_0$, the sequence $(x^k)_{k=0,1,\ldots}$ converges to some $x^*$ in $\mathbb{R}^k$. Letting $k \to \infty$ on both sides of the equation $x^{k+1} = h(x^k)$, we obtain $x^* = h(x^*)$. Therefore, $f(x^*) = y$. Furthermore,
$$
\|x^* - a\| = \lim_{k\to\infty}\|x^{k+1} - a\| \le \sum_{j=0}^{\infty}\|x^{j+1} - x^j\| < r.
$$

Let $U_1 := B_r(a) \cap f^{-1}(V_1)$. Then $U_1$ is an open set and $a \in U_1 \subseteq U$. By what has been proved, for any $y \in V_1$, there exists a unique $x^* \in B_r(a)$ such that $f(x^*) = y$. Clearly, $x^* \in B_r(a) \cap f^{-1}(V_1) = U_1$. This shows that $f$ is a one-to-one mapping from $U_1$ onto $V_1$. Moreover, $f(a) = b \in V_1 = f(U_1) \subseteq f(U)$.

Let $g = (g_1,\ldots,g_k)$ be the inverse mapping of $f|_{U_1}$. For $v \in V_1$, we wish to show that $g$ is differentiable at $v$. Let $y \in V_1$, $x := g(y) \in U_1$ and $u := g(v) \in U_1$. Then $y = f(x)$ and $v = f(u)$. Let $S_u$ denote the Jacobian matrix $Df(u)$. Then $S_u$ is invertible. We have
$$
g(y) - g(v) - S_u^{-1}(y - v) = -S_u^{-1}\big[f(x) - f(u) - S_u(x - u)\big].
$$
It follows that
$$
\|g(y) - g(v) - S_u^{-1}(y - v)\| \le \|S_u^{-1}\|\,\|f(x) - f(u) - S_u(x - u)\|.
$$
Note that $\|y - v\| = \|f(x) - f(u)\| \ge \|x - u\|/(2\|T\|)$. Since $f$ is differentiable at $u$, we have
$$
\lim_{x\to u}\frac{\|f(x) - f(u) - S_u(x - u)\|}{\|x - u\|} = 0.
$$
Consequently,
$$
\lim_{y\to v}\frac{\|g(y) - g(v) - S_u^{-1}(y - v)\|}{\|y - v\|} = 0.
$$
Therefore, $g$ is differentiable at $v$, and $Dg(v) = S_u^{-1} = (Df(u))^{-1}$. Since $Df$ is continuous on $U_1$, we conclude that $Dg$ is continuous on $V_1$.

Example. Let $f = (f_1, f_2)$ be the mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by
$$
f_1(x_1,x_2) := x_1^2 - x_2^2, \quad f_2(x_1,x_2) := 2x_1x_2, \qquad (x_1,x_2)\in\mathbb{R}^2.
$$
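The iteration scheme from the proof of Theorem 3.1 can be tried numerically on this mapping. A minimal sketch, in which the base point $a = (1,1)$ and the target $y = (0.1, 2.1)$ are arbitrary choices close enough for the contraction argument to apply:

```python
# Sketch of the proof's iteration: to solve f(x) = y near a point a with
# invertible Jacobian S = Df(a), iterate h(x) = x - T (f(x) - y), T = S^{-1}.
# Here f(x1, x2) = (x1^2 - x2^2, 2 x1 x2), a = (1, 1), y = (0.1, 2.1).

def f(x):
    x1, x2 = x
    return (x1 ** 2 - x2 ** 2, 2 * x1 * x2)

# Df(1,1) = [[2, -2], [2, 2]], so T = Df(1,1)^{-1}:
T = [[0.25, 0.25], [-0.25, 0.25]]

y = (0.1, 2.1)
x = (1.0, 1.0)            # x^0 := a
for _ in range(50):       # x^{k+1} := h(x^k)
    r = (f(x)[0] - y[0], f(x)[1] - y[1])
    x = (x[0] - T[0][0] * r[0] - T[0][1] * r[1],
         x[1] - T[1][0] * r[0] - T[1][1] * r[1])

fx = f(x)
assert abs(fx[0] - y[0]) < 1e-10 and abs(fx[1] - y[1]) < 1e-10
```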

Given $(y_1,y_2) \in \mathbb{R}^2$, we wish to solve the system of equations
$$
f_1(x_1,x_2) = y_1, \qquad f_2(x_1,x_2) = y_2.
$$
We have $y_1^2 = (x_1^2 - x_2^2)^2$ and $y_2^2 = 4x_1^2x_2^2$. It follows that $y_1^2 + y_2^2 = (x_1^2 + x_2^2)^2$. Hence, $x_1^2 + x_2^2 = \sqrt{y_1^2 + y_2^2}$. Thus, for $(y_1,y_2) = (0,0)$, the only solution is $(x_1,x_2) = (0,0)$. For $(y_1,y_2) \ne (0,0)$, we derive from $x_1^2 + x_2^2 = \sqrt{y_1^2+y_2^2}$ and $x_1^2 - x_2^2 = y_1$ that
$$
(x_1,x_2) = \pm\Big(\sqrt{\tfrac12\big[y_1 + \sqrt{y_1^2+y_2^2}\,\big]},\ \sqrt{\tfrac12\big[-y_1 + \sqrt{y_1^2+y_2^2}\,\big]}\Big) \quad\text{for } y_2 \ge 0
$$
or
$$
(x_1,x_2) = \pm\Big(\sqrt{\tfrac12\big[y_1 + \sqrt{y_1^2+y_2^2}\,\big]},\ -\sqrt{\tfrac12\big[-y_1 + \sqrt{y_1^2+y_2^2}\,\big]}\Big) \quad\text{for } y_2 < 0.
$$
Consequently, $f$ maps $\mathbb{R}^2\setminus\{(0,0)\}$ two-to-one onto $\mathbb{R}^2\setminus\{(0,0)\}$.

Let us compute the Jacobian of $f$. We have
$$
J_f(x_1,x_2) = \begin{vmatrix} D_1f_1(x_1,x_2) & D_2f_1(x_1,x_2) \\ D_1f_2(x_1,x_2) & D_2f_2(x_1,x_2) \end{vmatrix} = \begin{vmatrix} 2x_1 & -2x_2 \\ 2x_2 & 2x_1 \end{vmatrix} = 4(x_1^2 + x_2^2).
$$
By Theorem 3.1, for $(a_1,a_2) \ne (0,0)$, there exist an open set $U_1$ in $\mathbb{R}^2$ containing $(a_1,a_2)$ and an open set $V_1$ in $\mathbb{R}^2$ containing $f(a_1,a_2)$ such that $f$ is a one-to-one mapping from $U_1$ onto $V_1$. Indeed, if we choose $r := \sqrt{a_1^2 + a_2^2} > 0$, then $f|_{B_r(a_1,a_2)}$ is one-to-one. Let $g = (g_1,g_2)$ be the inverse of $f|_{B_r(a_1,a_2)}$. By Theorem 3.1 and the chain rule we have
$$
Dg(y_1,y_2) = \frac{1}{2(x_1^2+x_2^2)}\begin{bmatrix} x_1 & x_2 \\ -x_2 & x_1 \end{bmatrix},
$$
where $(y_1,y_2) = f(x_1,x_2)$ for $(x_1,x_2) \in B_r(a_1,a_2)$.

In the above example, the inverse mapping could be found in explicit form. This is not possible in general, but the Inverse Function Theorem is still applicable. It gives us a powerful tool for analyzing mappings and curvilinear coordinates.

4. The Implicit Function Theorem

Theorem 4.1. Let $f = (f_1,\ldots,f_m)$ be a continuously differentiable mapping from an open set $U$ in $\mathbb{R}^{k+m}$ to $\mathbb{R}^m$. Each $f_i$ ($i = 1,\ldots,m$) is a function of $(x_1,\ldots,x_k,y_1,\ldots,y_m)$. Suppose that $(a,b) = (a_1,\ldots,a_k,b_1,\ldots,b_m)$ is a point in $U$ such that $f_i(a,b) = 0$ for $i = 1,\ldots,m$. If
$$
\det\Big(\frac{\partial f_i}{\partial y_j}(a,b)\Big)_{1\le i,j\le m} \ne 0,
$$

then there exist an open set $V$ in $\mathbb{R}^k$ containing $a = (a_1,\ldots,a_k)$ and a continuously differentiable mapping $g = (g_1,\ldots,g_m)$ from $V$ to $\mathbb{R}^m$ such that $g(a) = b = (b_1,\ldots,b_m)$ and, for $i = 1,\ldots,m$,
$$
f_i\big(x_1,\ldots,x_k, g_1(x_1,\ldots,x_k),\ldots,g_m(x_1,\ldots,x_k)\big) = 0 \qquad \text{for all } (x_1,\ldots,x_k) \in V.
$$

Proof. Let $F = (F_1,\ldots,F_k,F_{k+1},\ldots,F_{k+m})$ be the mapping from $U$ to $\mathbb{R}^{k+m}$ given by $F_i(x,y) = x_i$ for $i = 1,\ldots,k$ and $F_{k+j}(x,y) = f_j(x,y)$ for $j = 1,\ldots,m$, where $x := (x_1,\ldots,x_k)$ and $y := (y_1,\ldots,y_m)$. Clearly, $F(a,b) = (a_1,\ldots,a_k,0,\ldots,0)$. The Jacobian matrix of $F$ at $(a,b)$ is
$$
\begin{bmatrix} I & 0 \\ S & T \end{bmatrix},
$$
where $I$ is the $k\times k$ identity matrix, $0$ is the $k\times m$ zero matrix,
$$
S = \Big(\frac{\partial f_i}{\partial x_j}(a,b)\Big)_{1\le i\le m,\,1\le j\le k} \quad\text{and}\quad T = \Big(\frac{\partial f_i}{\partial y_j}(a,b)\Big)_{1\le i,j\le m}.
$$
By our assumption, $\det T \ne 0$. Hence, the Jacobian determinant $J_F(a,b) \ne 0$. By the Inverse Function Theorem, there exist an open set $U_1$ in $\mathbb{R}^{k+m}$ with $(a,b) \in U_1 \subseteq U$ and an open set $V_1$ in $\mathbb{R}^{k+m}$ with $F(a,b) \in V_1 \subseteq F(U)$ such that $F$ is a one-to-one mapping from $U_1$ onto $V_1$. Let $G := (G_1,G_2,\ldots,G_{k+m})$ be the inverse mapping of $F|_{U_1}$. Then $G$ is continuously differentiable on $V_1$.

Set $V := \{(x_1,\ldots,x_k) \in \mathbb{R}^k : (x_1,\ldots,x_k,0,\ldots,0) \in V_1\}$. Then $V$ is an open set in $\mathbb{R}^k$. Moreover, since $(a_1,\ldots,a_k,0,\ldots,0) = F(a,b) \in V_1$, we have $a = (a_1,\ldots,a_k) \in V$. For $j = 1,\ldots,m$, let
$$
g_j(x_1,\ldots,x_k) := G_{k+j}(x_1,\ldots,x_k,0,\ldots,0), \qquad (x_1,\ldots,x_k)\in V.
$$
Then $g := (g_1,\ldots,g_m)$ is a continuously differentiable mapping from $V$ to $\mathbb{R}^m$. Since $F(a,b) = (a_1,\ldots,a_k,0,\ldots,0) = (a,0)$, we have $G(a,0) = (a,b)$. In light of the definition of $g$ we see that $b = g(a)$. Moreover, since $F \circ G$ is the identity mapping on $V_1$, we have $F(G(x,0)) = (x,0)$ for all $x = (x_1,\ldots,x_k) \in V$. Consequently, we obtain $G_i(x,0) = x_i$ for $i = 1,\ldots,k$ and $F_{k+j}(G(x,0)) = 0$ for $j = 1,\ldots,m$. Therefore, for $j = 1,\ldots,m$ we have
$$
f_j\big(x_1,\ldots,x_k,g_1(x_1,\ldots,x_k),\ldots,g_m(x_1,\ldots,x_k)\big) = 0 \qquad \text{for all } (x_1,\ldots,x_k)\in V.
$$
This completes the proof of the theorem.
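The theorem can be illustrated numerically in the simplest case $k = m = 1$. The function $f(x,y) = x^2 + y^2 - 1$ and the point $(a,b) = (0.6, 0.8)$ below are illustrative choices, not from the text; since $f_y(a,b) = 2b \ne 0$, the solution $y = g(x) = \sqrt{1-x^2}$ exists near $a$, and implicit differentiation gives $g'(x) = -f_x/f_y = -x/y$:

```python
import math

# Illustration of the implicit function theorem with k = m = 1 for
# f(x, y) = x^2 + y^2 - 1 at (a, b) = (0.6, 0.8), where f_y(a, b) != 0.

def f(x, y):
    return x**2 + y**2 - 1

a, b = 0.6, 0.8
assert abs(f(a, b)) < 1e-15     # (a, b) lies on the zero set

def g(x):
    return math.sqrt(1 - x**2)  # the implicit solution y = g(x) near a

# g solves f(x, g(x)) = 0 near a ...
for x in (0.5, 0.6, 0.7):
    assert abs(f(x, g(x))) < 1e-12

# ... and g'(a) matches -f_x/f_y, checked by a central difference.
h = 1e-6
numeric = (g(a + h) - g(a - h)) / (2 * h)
implicit = -(2 * a) / (2 * g(a))   # = -x/y = -0.75 at (0.6, 0.8)
assert abs(numeric - implicit) < 1e-8
```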

Let Z := {(x 1,, x k, y 1,, y m ) U 1 : f j (x 1,, x k, y 1,, y m ) = 0 for j = 1,, m} Let ϕ be the mapping given by ϕ(x 1,, x k ) := ( x 1,, x k, g 1 (x 1,, x k ),, g m (x 1,, x k ) ), (x 1,, x k ) V From the above proof we see that ϕ is a one-to-one mapping from V onto Z Example 1 Let S 1 and S 2 be two surfaces in IR 3 represented by S 1 := {(x, y, z) IR 3 : x 2 (y 2 + z 2 ) = 5} and S 2 := {(x, y, z) IR 3 : (x z) 2 + y 2 = 2} Clearly, (1, 1, 2) S 1 S 2 Let F and G be the functions given by F (x, y, z) := x 2 (y 2 + z 2 ) 5 and G(x, y, z) := (x z) 2 + y 2 2, (x, y, z) IR 3 At the point (1, 1, 2) we have F y G y F z G z = 2x 2 y 2x 2 z 2y 2(x z) = 2 4 2 2 = 4 0 By Theorem 41 we can find an interval V in IR containing 1, an open set U in IR 3 containing (1, 1, 2), and functions f and g from V to IR such that f(1) = 1, g(1) = 2, and (f, g) maps V one-to-one and onto U (S 1 S 2 ) In particular, F (x, f(x), g(x)) = 0 and G(x, f(x), g(x)) = 0 for all x V By using the chain rule it follows that Consequently, F x + F y f (x) + F z g (x) = 0 and G x + G y f (x) + G z g (x) = 0 2x(y 2 + z 2 ) + 2x 2 yf (x) + 2x 2 zg (x) = 0 and 2(x z) + 2yf (x) 2(x z)g (x) = 0 Solving the above system of equations for f (x) and g (x) we obtain f (x) = y2 z + z 3 xy 2 x 2 z x 2 y and g (x) = x2 xz y 2 z 2 x 2, x V 14

Example 2. Consider the system of equations
$$
F(x,y,u,v) = u^2 + v^2 - x^2 - y = 0 \quad\text{and}\quad G(x,y,u,v) = u - v + xy - 1 = 0.
$$
Clearly, $F(2,1,1,2) = 0$ and $G(2,1,1,2) = 0$. At the point $(2,1,1,2)$ we have
$$
\begin{vmatrix} F_u & F_v \\ G_u & G_v \end{vmatrix} = \begin{vmatrix} 2u & 2v \\ 1 & -1 \end{vmatrix} = \begin{vmatrix} 2 & 4 \\ 1 & -1 \end{vmatrix} = -6 \ne 0.
$$
By Theorem 4.1 we can find an open set $V$ in $\mathbb{R}^2$ containing $(2,1)$, and functions $f$ and $g$ from $V$ to $\mathbb{R}$ such that $f(2,1) = 1$, $g(2,1) = 2$, and
$$
F(x,y,f(x,y),g(x,y)) = 0 \quad\text{and}\quad G(x,y,f(x,y),g(x,y)) = 0 \qquad \text{for all } (x,y)\in V.
$$
Differentiating both sides of the above equations with respect to $x$ (with $u = f(x,y)$ and $v = g(x,y)$), we obtain
$$
F_x + F_u\frac{\partial f}{\partial x} + F_v\frac{\partial g}{\partial x} = 0 \quad\text{and}\quad G_x + G_u\frac{\partial f}{\partial x} + G_v\frac{\partial g}{\partial x} = 0.
$$
Consequently,
$$
\frac{\partial f}{\partial x} = \frac{x - yv}{u+v} \quad\text{and}\quad \frac{\partial g}{\partial x} = \frac{x + yu}{u+v}, \qquad (x,y)\in V.
$$
Similarly, differentiating with respect to $y$ yields
$$
F_y + F_u\frac{\partial f}{\partial y} + F_v\frac{\partial g}{\partial y} = 0 \quad\text{and}\quad G_y + G_u\frac{\partial f}{\partial y} + G_v\frac{\partial g}{\partial y} = 0.
$$
Hence we obtain
$$
\frac{\partial f}{\partial y} = \frac{1 - 2xv}{2(u+v)} \quad\text{and}\quad \frac{\partial g}{\partial y} = \frac{1 + 2xu}{2(u+v)}, \qquad (x,y)\in V.
$$

5. Constrained Optimization

In this section, as an application of the implicit function theorem, we study the Lagrange multiplier method for constrained optimization.

Theorem 5.1. Let $f, g_1,\ldots,g_k$ be real-valued continuously differentiable functions of $(x_1,\ldots,x_n)$ defined on an open set $U$ in $\mathbb{R}^n$ with $n > k$. Let
$$
Z := \{z \in U : g_1(z) = \cdots = g_k(z) = 0\}.
$$
Suppose that there exist a point $a$ in $Z$ and an open ball $B_r(a) \subseteq U$ such that $f(z) \ge f(a)$ (or $f(z) \le f(a)$) for all $z \in Z\cap B_r(a)$. Suppose also that the rows of the Jacobian matrix
$$
\Big(\frac{\partial g_i}{\partial x_j}(a)\Big)_{1\le i\le k,\,1\le j\le n}
$$

are linearly independent. Then there exist real numbers $\lambda_1,\ldots,\lambda_k$ such that
$$
\nabla f(a) + \lambda_1\nabla g_1(a) + \cdots + \lambda_k\nabla g_k(a) = 0.
$$

Proof. By a permutation of the index set $\{1,\ldots,n\}$ if necessary, we may assume that
$$
\det\Big(\frac{\partial g_i}{\partial x_j}(a)\Big)_{1\le i,j\le k} \ne 0.
$$
Consequently, we can find real numbers $\lambda_1,\ldots,\lambda_k$ such that the equality
$$
\frac{\partial f}{\partial x_j}(a) + \sum_{i=1}^k \lambda_i\frac{\partial g_i}{\partial x_j}(a) = 0
$$
holds for $j = 1,\ldots,k$. Our proof will be complete if we can show that the above equality also holds for $j = k+1,\ldots,n$.

Suppose that $a = (a_1,\ldots,a_n)$. We write $a = (a',a'')$, where $a' := (a_1,\ldots,a_k)$ and $a'' := (a_{k+1},\ldots,a_n)$. By the Implicit Function Theorem, there exist an open set $V$ in $\mathbb{R}^{n-k}$ containing $a'' = (a_{k+1},\ldots,a_n)$ and a continuously differentiable mapping $\phi = (\phi_1,\ldots,\phi_k)$ from $V$ to $\mathbb{R}^k$ such that $\phi(a_{k+1},\ldots,a_n) = (a_1,\ldots,a_k)$ and, for $i = 1,\ldots,k$,
$$
g_i\big(\phi_1(x_{k+1},\ldots,x_n),\ldots,\phi_k(x_{k+1},\ldots,x_n),x_{k+1},\ldots,x_n\big) = 0 \qquad \text{for all } (x_{k+1},\ldots,x_n)\in V.
$$
For $(x_{k+1},\ldots,x_n)\in V$, define
$$
h(x_{k+1},\ldots,x_n) := f\big(\phi_1(x_{k+1},\ldots,x_n),\ldots,\phi_k(x_{k+1},\ldots,x_n),x_{k+1},\ldots,x_n\big).
$$
Then $h$ is a continuously differentiable function on $V$ and it attains a local minimum (or a local maximum) at the point $a'' = (a_{k+1},\ldots,a_n)$. Hence, for $m = k+1,\ldots,n$ we have
$$
0 = \frac{\partial h}{\partial x_m}(a'') = \sum_{j=1}^k \frac{\partial f}{\partial x_j}(a)\frac{\partial \phi_j}{\partial x_m}(a'') + \frac{\partial f}{\partial x_m}(a),
$$
where the chain rule has been used to derive the second equality. Furthermore, we have
$$
\sum_{j=1}^k \frac{\partial g_i}{\partial x_j}(a)\frac{\partial \phi_j}{\partial x_m}(a'') + \frac{\partial g_i}{\partial x_m}(a) = 0, \qquad i = 1,\ldots,k, \quad m = k+1,\ldots,n.
$$
Consequently,
$$
\sum_{j=1}^k \frac{\partial f}{\partial x_j}(a)\frac{\partial \phi_j}{\partial x_m}(a'') + \frac{\partial f}{\partial x_m}(a) + \sum_{i=1}^k \lambda_i\Big(\sum_{j=1}^k \frac{\partial g_i}{\partial x_j}(a)\frac{\partial \phi_j}{\partial x_m}(a'') + \frac{\partial g_i}{\partial x_m}(a)\Big) = 0.
$$

It follows that
$$
\frac{\partial f}{\partial x_m}(a) + \sum_{i=1}^k \lambda_i\frac{\partial g_i}{\partial x_m}(a) = 0, \qquad m = k+1,\ldots,n.
$$
This completes the proof.

The above theorem gives the following method to find local minima or maxima of a continuously differentiable function $f$ subject to the constraints $g_1 = \cdots = g_k = 0$. Set up the Lagrange function
$$
L(x_1,\ldots,x_n) := f(x_1,\ldots,x_n) + \sum_{i=1}^k \lambda_i g_i(x_1,\ldots,x_n), \qquad (x_1,\ldots,x_n)\in U,
$$
where $\lambda_1,\ldots,\lambda_k$ are Lagrange multipliers. Solve the system of $n+k$ equations
$$
\begin{cases} \dfrac{\partial L}{\partial x_j}(x_1,\ldots,x_n) = 0 & \text{for } j = 1,\ldots,n, \\[4pt] g_i(x_1,\ldots,x_n) = 0 & \text{for } i = 1,\ldots,k, \end{cases}
$$
for $(x_1,\ldots,x_n)$ and $(\lambda_1,\ldots,\lambda_k)$.

Example. Let us find the extreme values (maximum and minimum) of the function $f(x_1,x_2,x_3) = x_1^3 + x_2^3 + x_3^3$ subject to the constraints $x_1^2 + x_2^2 + x_3^2 = 4$ and $x_1 + x_2 + x_3 = 1$. Note that the set
$$
E := \{(x_1,x_2,x_3)\in\mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 = 4,\ x_1 + x_2 + x_3 = 1\}
$$
is a compact set. So $f$ attains its maximum and minimum on $E$. We use the Lagrange multiplier method to find the maximum and minimum of $f$ on $E$. Let $g_1(x_1,x_2,x_3) := x_1^2 + x_2^2 + x_3^2 - 4$ and $g_2(x_1,x_2,x_3) := x_1 + x_2 + x_3 - 1$. The Lagrange function is
$$
L(x_1,x_2,x_3,\lambda_1,\lambda_2) = (x_1^3 + x_2^3 + x_3^3) + \lambda_1(x_1^2 + x_2^2 + x_3^2 - 4) + \lambda_2(x_1 + x_2 + x_3 - 1).
$$
Setting $\partial L/\partial x_j = 0$ for $j = 1, 2, 3$, we obtain
$$
3x_1^2 + 2\lambda_1 x_1 + \lambda_2 = 0, \quad 3x_2^2 + 2\lambda_1 x_2 + \lambda_2 = 0, \quad 3x_3^2 + 2\lambda_1 x_3 + \lambda_2 = 0.
$$

It follows that
$$
\begin{vmatrix} 3x_1^2 & 2x_1 & 1 \\ 3x_2^2 & 2x_2 & 1 \\ 3x_3^2 & 2x_3 & 1 \end{vmatrix} = 0.
$$
Consequently,
$$
(x_1 - x_2)(x_1 - x_3)(x_2 - x_3) = 0.
$$
Thus, one and only one of the cases $x_1 = x_2$, $x_1 = x_3$, and $x_2 = x_3$ must occur. Suppose that $x_1 = x_2$. This together with $x_1 + x_2 + x_3 = 1$ yields $x_3 = 1 - 2x_1$. Substituting $x_2 = x_1$ and $x_3 = 1 - 2x_1$ into the equation $x_1^2 + x_2^2 + x_3^2 = 4$, we obtain
$$
x_1^2 + x_1^2 + (1 - 2x_1)^2 = 4.
$$
It has two solutions:
$$
x_1 = \frac13 + \frac{\sqrt{22}}{6} \quad\text{and}\quad x_1 = \frac13 - \frac{\sqrt{22}}{6}.
$$
Hence, the optimization problem has the following solutions:
$$
\Big(\frac13 + \frac{\sqrt{22}}{6},\ \frac13 + \frac{\sqrt{22}}{6},\ \frac13 - \frac{\sqrt{22}}{3}\Big) \quad\text{and}\quad \Big(\frac13 - \frac{\sqrt{22}}{6},\ \frac13 - \frac{\sqrt{22}}{6},\ \frac13 + \frac{\sqrt{22}}{3}\Big).
$$
Other solutions are obtained from the above solutions by permutations of $\{x_1, x_2, x_3\}$. Note that the rows of the Jacobian matrix
$$
\begin{bmatrix} \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} & \dfrac{\partial g_1}{\partial x_3} \\[6pt] \dfrac{\partial g_2}{\partial x_1} & \dfrac{\partial g_2}{\partial x_2} & \dfrac{\partial g_2}{\partial x_3} \end{bmatrix} = \begin{bmatrix} 2x_1 & 2x_2 & 2x_3 \\ 1 & 1 & 1 \end{bmatrix}
$$
are linearly independent at any of these points. It is easily seen that the first set of solutions corresponds to the minimum value, and the second set of solutions corresponds to the maximum value.
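The two candidate points can be checked numerically: both satisfy the constraints, and evaluating $f$ confirms which is the minimizer and which the maximizer.

```python
import math

# Numerical check of the Lagrange example: the two candidate points satisfy
# both constraints, and f is smaller at the first than at the second.

s = math.sqrt(22)
p1 = (1/3 + s/6, 1/3 + s/6, 1/3 - s/3)   # candidate minimizer
p2 = (1/3 - s/6, 1/3 - s/6, 1/3 + s/3)   # candidate maximizer

def f(p):
    return sum(t**3 for t in p)

for p in (p1, p2):
    assert abs(sum(t**2 for t in p) - 4) < 1e-12   # x1^2 + x2^2 + x3^2 = 4
    assert abs(sum(p) - 1) < 1e-12                 # x1 + x2 + x3 = 1

assert f(p1) < f(p2)   # first point gives the minimum, second the maximum
```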