John Nachbar
Washington University in St. Louis
April 11, 2014

The Inverse Function Theorem¹

1 Overview.

If a function $f : \mathbb{R} \to \mathbb{R}$ is $C^1$ and if its derivative is strictly positive at some $x^* \in \mathbb{R}$, then, by continuity of the derivative, there is an open interval $U$ containing $x^*$ such that the derivative is strictly positive for every $x \in U$. The Mean Value Theorem then implies that $f$ is strictly increasing on $U$, and hence that $f$ maps $U$ 1-1 onto $f(U)$. Let $V = f(U)$. Hence $f : U \to V$ is invertible. A similar argument applies if the derivative is strictly negative.

The Inverse Function Theorem generalizes and strengthens the previous observation. If $f : \mathbb{R}^N \to \mathbb{R}^N$ is $C^r$ and if the matrix $Df(x^*)$ is invertible for some $x^* \in \mathbb{R}^N$, then there is an open set $U \subseteq \mathbb{R}^N$, with $x^* \in U$, such that $f$ maps $U$ 1-1 onto $V = f(U)$. Hence $f : U \to V$ is invertible. Moreover, $V$ is open, the inverse function $f^{-1} : V \to U$ is $C^r$, and for any $x \in U$, setting $y = f(x)$,
\[
Df^{-1}(y) = [Df(x)]^{-1}.
\]
The last fact allows us to compute $Df^{-1}(y)$ even when we cannot derive $f^{-1}$ analytically.

Here are four examples in $\mathbb{R}$.

Example 1. Let $f(x) = e^x$, which is $C^\infty$. Then $Df(x) = e^x > 0$ for all $x$. Hence I can take $U = \mathbb{R}$ and $V = f(\mathbb{R}) = (0, \infty)$. The inverse function is then $f^{-1} : V \to \mathbb{R}$ defined by $f^{-1}(y) = \ln(y)$, which is $C^\infty$. Suppose, for example, that $x = 1$; then $y = e$. Since $Df^{-1}(y) = 1/y$, $Df^{-1}(e) = 1/e$. On the other hand, $Df(x) = e^x$, so $Df(1) = e$. Hence $Df^{-1}(y) = 1/Df(x)$, as claimed above.

Example 2. Let $f(x) = x^2$, which is $C^\infty$. Then $Df(x) = 2x$ for all $x$. At $x^* = 1$, $Df(1) = 2 > 0$, so the Inverse Function Theorem guarantees existence of a local inverse. In fact, I can take $U = (0, \infty)$ and $V = f(U) = (0, \infty)$. The inverse function is then $f^{-1} : V \to \mathbb{R}$ defined by $f^{-1}(y) = \sqrt{y}$, which is $C^\infty$.

Example 3. Again let $f(x) = x^2$. At $x^* = 0$, $Df(0) = 0$, so the hypothesis of the Inverse Function Theorem is violated. $f$ does not, in fact, have an inverse in any neighborhood of 0. For example, if $U = (-1, 1)$ then $V = [0, 1)$. If $y = 0.01$ then we would have to have both $f^{-1}(0.01) = 0.1$ and $f^{-1}(0.01) = -0.1$.

¹ This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.
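As a quick numerical illustration of Example 1, here is a minimal Python sketch (helper names are ad hoc) that compares a central finite-difference derivative of $f^{-1} = \ln$ at $y = e$ with $1/Df(1) = 1/e$; the two numbers agree up to the finite-difference error.

```python
# Minimal numerical check of Example 1: for f(x) = e^x, the theorem predicts
# Df^{-1}(y) = 1/Df(x) at y = f(x). Compare a finite-difference derivative of
# the inverse, ln, at y = e with 1/Df(1) = 1/e.
import math

def f(x):
    return math.exp(x)

def f_inv(y):
    return math.log(y)

x = 1.0
y = f(x)                                       # y = e
h = 1e-6
fd = (f_inv(y + h) - f_inv(y - h)) / (2 * h)   # central-difference estimate of Df^{-1}(e)
print(fd, 1 / f(x))                            # both should be close to 1/e ~ 0.367879
```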

Example 4. Let $f(x) = x^3$, which is $C^\infty$. Then $Df(x) = 3x^2$ for all $x$. At $x^* = 0$, $Df(0) = 0$, so the hypothesis of the Inverse Function Theorem is violated. But $f$ is invertible. In fact, I can take $U = V = \mathbb{R}$ and the inverse is $f^{-1} : \mathbb{R} \to \mathbb{R}$ defined by $f^{-1}(y) = y^{1/3}$. But $f^{-1}$ is not differentiable at 0.

An illuminating, but more abstract, way to view the Inverse Function Theorem is the following. A linear isomorphism is a linear function $h : \mathbb{R}^N \to \mathbb{R}^N$, $h(x) = Ax$, such that the $N \times N$ matrix $A$ is invertible. A $C^r$ diffeomorphism is a $C^r$ function $f : U \to V$, where both $U$ and $V$ are open, such that $f$ maps $U$ 1-1 onto $V$ and the inverse function $f^{-1} : V \to U$ is also $C^r$. One can think of a diffeomorphism as a generalization of a linear isomorphism. The Inverse Function Theorem then says that if the linear approximation to $f$ at $x^*$ is an isomorphism, then $f$ is a diffeomorphism on an open set containing $x^*$.²

My proofs below follow those in Rudin (1976), Lang (1988), and Spivak (1965).

2 Some Necessary Preliminaries

The argument sketched above for $N = 1$ relied on the Mean Value Theorem. Unfortunately, the Mean Value Theorem does not strictly generalize to $N > 1$, as the following example illustrates.

Example 5. Define $f : \mathbb{R}^2 \to \mathbb{R}^2$ by $f(x_1, x_2) = (e^{x_1}\cos(x_2), e^{x_1}\sin(x_2))$. Note that $f(0, 0) = f(0, 2\pi)$. Therefore, if the Mean Value Theorem held in $\mathbb{R}^2$, there would have to be an $x$ for which $Df(x)$ is the zero matrix. But
\[
Df(x) = \begin{bmatrix} e^{x_1}\cos(x_2) & -e^{x_1}\sin(x_2) \\ e^{x_1}\sin(x_2) & e^{x_1}\cos(x_2) \end{bmatrix},
\]
which has full rank for all $x$.

Fortunately, there is an implication of the Mean Value Theorem that does generalize and that is strong enough to imply invertibility of $f$. Consider first the case where $N = 1$. Suppose that there is a number $c$ such that $|Df(x)| \le c$ for all $x$. By the Mean Value Theorem, for any $x_1, x_0 \in \mathbb{R}$, there is an $x \in \mathbb{R}$ such that $f(x_1) - f(x_0) = Df(x)(x_1 - x_0)$, hence $|f(x_1) - f(x_0)| = |Df(x)|\,|x_1 - x_0|$, hence
\[
|f(x_1) - f(x_0)| \le c\,|x_1 - x_0|.
\]
Theorem 1 below generalizes this inequality to $N \ge 1$.

² The affine approximation to $f$ at $x^*$ is $H(x) = Df(x^*)(x - x^*) + f(x^*)$. The linear approximation is the linear portion of this, namely $h(x) = Df(x^*)x$.
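The following minimal Python sketch (function names are ad hoc) illustrates Example 5 numerically: it checks that $f(0,0)$ and $f(0,2\pi)$ coincide while the Jacobian determinant $e^{2x_1}$ is never zero, so no mean-value point with $Df(x) = 0$ can exist.

```python
# Numerical illustration of Example 5: f(x1, x2) = (e^{x1} cos x2, e^{x1} sin x2)
# takes the same value at (0, 0) and (0, 2*pi), yet det Df(x) = e^{2 x1} > 0 everywhere.
import numpy as np

def f(x):
    return np.array([np.exp(x[0]) * np.cos(x[1]), np.exp(x[0]) * np.sin(x[1])])

def Df(x):
    return np.array([
        [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])],
        [np.exp(x[0]) * np.sin(x[1]),  np.exp(x[0]) * np.cos(x[1])],
    ])

print(f(np.array([0.0, 0.0])), f(np.array([0.0, 2 * np.pi])))  # numerically equal values
for x in np.random.default_rng(0).normal(size=(5, 2)):
    print(np.linalg.det(Df(x)))                                # always e^{2 x1} > 0
```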

To state Theorem 1, I need to define a norm for matrices. I use the same norm that I used when defining what it means for a function to be $C^1$. Explicitly, for any $N \times N$ matrix $A$, define the norm of $A$ to be
\[
\|A\| = \max_{x \in S} \|Ax\|,
\]
where $S = \{x \in \mathbb{R}^N : \|x\| \le 1\}$ is the solid unit ball centered at the origin in $\mathbb{R}^N$. Since $S$ is compact and the norm on $\mathbb{R}^N$ is continuous, $\|A\|$ is well defined, and one can verify that this norm does, in fact, satisfy all the required properties of a norm. For later reference, note that for any $x \ne 0$, since $x/\|x\|$ has norm 1,
\[
\|Ax\| = \left\|A\,\frac{x}{\|x\|}\right\|\,\|x\| \le \|A\|\,\|x\|. \tag{1}
\]

I can now state and prove the generalized Mean Value Theorem.

Theorem 1 (Generalized Mean Value Theorem). Suppose that $U \subseteq \mathbb{R}^N$ is convex and open, that $f : U \to \mathbb{R}^N$ is differentiable, and that there is a number $c > 0$ such that for all $x \in U$, $\|Df(x)\| \le c$. Then for all $x_1, x_0 \in U$,
\[
\|f(x_1) - f(x_0)\| \le c\,\|x_1 - x_0\|.
\]

Proof. Take any $x_1, x_0 \in U$. Define $g : \mathbb{R} \to \mathbb{R}^N$ by $g(t) = (1 - t)x_0 + t x_1$. Since $U$ is open and convex, there is an $\varepsilon > 0$ sufficiently small such that $g(t) \in U$ for $t \in (-\varepsilon, 1 + \varepsilon)$.

Define $h : (-\varepsilon, 1 + \varepsilon) \to \mathbb{R}^N$ by $h(t) = f(g(t))$. For ease of notation, let $x_t = g(t)$. Then, by the Chain Rule,
\[
Dh(t) = Df(g(t))\,Dg(t) = Df(x_t)(x_1 - x_0).
\]
Hence, by inequality (1),
\[
\|Dh(t)\| \le \|Df(x_t)\|\,\|x_1 - x_0\| \le c\,\|x_1 - x_0\|.
\]
Note that this holds for all $t \in (-\varepsilon, 1 + \varepsilon)$.

To complete the proof, I argue that there is a $t \in (0, 1)$ such that
\[
\|f(x_1) - f(x_0)\| \le \|Dh(t)\|.
\]
To see this, define $\varphi : (-\varepsilon, 1 + \varepsilon) \to \mathbb{R}$ by
\[
\varphi(t) = h(t) \cdot (h(1) - h(0)).
\]
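For readers who want to experiment, here is a minimal Python sketch of the matrix norm just defined, under the assumption that the vector norm on $\mathbb{R}^N$ is the Euclidean norm (in which case the operator norm coincides with the largest singular value); it also spot-checks inequality (1). The names are ad hoc.

```python
# Sketch of ||A|| = max over the unit ball of ||Ax||. With the Euclidean vector
# norm this operator norm equals the largest singular value of A, so a crude
# maximization over random unit vectors can be sanity-checked against numpy's 2-norm.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))

# Crude estimate: evaluate ||Ax|| at many random unit vectors.
xs = rng.normal(size=(100_000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
estimate = np.linalg.norm(xs @ A.T, axis=1).max()

exact = np.linalg.norm(A, 2)       # largest singular value
print(estimate, exact)             # estimate <= exact, and close with many samples

# Inequality (1): ||Ax|| <= ||A|| ||x|| for a few random x.
for x in rng.normal(size=(3, 3)):
    assert np.linalg.norm(A @ x) <= exact * np.linalg.norm(x) + 1e-12
```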

By the standard Mean Value Theorem on $\mathbb{R}$, there is a $t^* \in (0, 1)$ such that
\[
\varphi(1) - \varphi(0) = D\varphi(t^*),
\]
or
\[
\varphi(1) - \varphi(0) = Dh(t^*) \cdot (h(1) - h(0)).
\]
On the other hand, by direct substitution,
\[
\varphi(1) - \varphi(0) = h(1) \cdot (h(1) - h(0)) - h(0) \cdot (h(1) - h(0)) = \|h(1) - h(0)\|^2.
\]
Combining and applying the Cauchy-Schwarz inequality yields
\[
\|h(1) - h(0)\|^2 \le \|Dh(t^*)\|\,\|h(1) - h(0)\|.
\]
Dividing through by $\|h(1) - h(0)\|$ yields
\[
\|h(1) - h(0)\| \le \|Dh(t^*)\|
\]
(if $\|h(1) - h(0)\| = 0$, then the inequality holds trivially). Finally, the result follows by noting that $h(1) = f(x_1)$, $h(0) = f(x_0)$, and $\|Dh(t^*)\| \le c\,\|x_1 - x_0\|$.

The Inverse Function Theorem also requires that both $U$ and $V = f(U)$ are open. For $N = 1$, this can be established by using, among other things, the fact that the image of an interval under a continuous function is also an interval. This fact, however, has no natural analog in $\mathbb{R}^N$. Therefore, the proof below instead establishes openness using the Contraction Mapping Theorem, which is discussed in the notes on fixed points.

Briefly, let $(X, d)$ be a metric space and let $\varphi : X \to X$. $\varphi$ is a contraction iff there is a $c \in (0, 1)$ such that for all $x, \hat x \in X$, $d(\varphi(x), \varphi(\hat x)) \le c\,d(x, \hat x)$. A point $x^* \in X$ is a fixed point of $\varphi$ iff $\varphi(x^*) = x^*$.

Theorem 2 (Contraction Mapping Theorem). Let $(X, d)$ be a complete metric space and let $\varphi : X \to X$ be a contraction. Then $\varphi$ has a unique fixed point.

3 Statement and Proof of the Inverse Function Theorem

Theorem 3 (Inverse Function Theorem). Let $f : \mathbb{R}^N \to \mathbb{R}^N$ be $C^r$, where $r$ is a positive integer. For $x^* \in \mathbb{R}^N$, suppose that $Df(x^*)$ is invertible. Then there are open sets $U, V \subseteq \mathbb{R}^N$, with $x^* \in U$, such that $f$ maps $U$ 1-1 onto $V = f(U)$. Hence $f : U \to V$ is invertible. Moreover, the inverse function $f^{-1} : V \to U$ is $C^r$, and for any $y \in V$, setting $x = f^{-1}(y)$,
\[
Df^{-1}(y) = [Df(x)]^{-1}. \tag{2}
\]
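Before turning to the proof, here is a minimal Python sketch (names are ad hoc) of how the Contraction Mapping Theorem is used computationally: iterating a contraction from any starting point converges to its unique fixed point. The example map is $\varphi = \cos$ restricted to $[-1, 1]$, which maps that interval into itself and is a contraction there since $|D\cos(x)| = |\sin(x)| \le \sin(1) < 1$ on $[-1, 1]$.

```python
# Fixed-point iteration for a contraction: the iterates x, phi(x), phi(phi(x)), ...
# converge to the unique fixed point guaranteed by the Contraction Mapping Theorem.
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=1000):
    """Iterate x <- phi(x) until successive iterates are within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter")

print(fixed_point(math.cos, 1.0))   # ~0.7390851332, the unique fixed point of cos on [-1, 1]
```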

Proof. Let $y^* = f(x^*)$. For ease of notation, assume that $Df(x^*) = I$, the $N \times N$ identity matrix. For the more general case, define $F(x) = [Df(x^*)]^{-1}f(x)$. Then $DF(x^*) = I$. Apply the argument below to $F$ to get a $C^r$ inverse $F^{-1}$, and note that $f^{-1}(y) = F^{-1}([Df(x^*)]^{-1}y)$, which is $C^r$ since $F^{-1}$ is $C^r$. In this sense, the case $Df(x^*) = I$ is without loss of generality. For a similar reason, one can simplify notation by assuming $x^* = y^* = 0$, but I will not do so.

For any $y \in \mathbb{R}^N$, define $\varphi_y : \mathbb{R}^N \to \mathbb{R}^N$ by
\[
\varphi_y(x) = x - f(x) + y.
\]
For any $y$, $\varphi_y(x) = x$ iff $f(x) = y$. Below, I use the Contraction Mapping Theorem on $\varphi_y$ to find $x = f^{-1}(y)$.

Note next that for any $y$, $D\varphi_y(x) = I - Df(x)$. Then, for any $y$, $D\varphi_y(x^*) = 0$. Since $Df$ is continuous, there is an open set $U \subseteq \mathbb{R}^N$ containing $x^*$ (which we can take to be an open ball, hence convex) such that for any $x \in U$, for any $y$,
\[
\|D\varphi_y(x)\| < \frac{1}{2}.
\]
(It is important that the right-hand side be smaller than 1; it is not important that it be, in particular, 1/2.) This implies that, for any $x \in U$, $Df(x)$ is invertible, since if $Df(x)$ is singular then there is a $z \in \mathbb{R}^N$ of norm 1 such that $Df(x)z = 0$, but then $\|D\varphi_y(x)\| \ge \|D\varphi_y(x)z\| = \|z\| = 1 > 1/2$.

By Theorem 1, for any $x, \hat x \in U$, and for any $y$,
\[
\|\varphi_y(\hat x) - \varphi_y(x)\| \le \frac{1}{2}\,\|\hat x - x\|. \tag{3}
\]
Moreover, again since $Df$ is continuous, since the inverse operation on invertible matrices is continuous, and since $\|[Df(x^*)]^{-1}\| = \|I\| = 1$, I can take $U$ such that for all $x \in U$,
\[
\|[Df(x)]^{-1}\| \le 2. \tag{4}
\]
(The fact that I take the bound to be 2, rather than, say, 1000, is not important; all that matters is that the bound exists.)

Let $V = f(U)$. Inequality (3) implies that $f$ must be 1-1 on $U$: if $f(\hat x) = f(x)$ with $\hat x \ne x$, then $\varphi_y(\hat x) - \varphi_y(x) = \hat x - x$, which contradicts inequality (3). Since $f$ maps $U$ 1-1 onto $V$, the inverse function $f^{-1} : V \to U$ is well defined.

I must show that $V$ is open. (Since $U$ is essentially arbitrary, this will also establish that $f$ is an open mapping: the image of any open set is open.) Take any $\hat y \in V$. I must show that there is an open ball around $\hat y$ contained in $V$. Let $\hat x = f^{-1}(\hat y)$; since $f$ is 1-1, $\hat x$ is well defined. Take any $\delta > 0$ such that the closed ball $K = \overline{N}_\delta(\hat x)$ is contained in $U$. Take the open ball around $\hat y$ to be $N_{\delta/2}(\hat y)$. I need to show that this open ball is a subset of $V$. Take any $y \in N_{\delta/2}(\hat y)$. I must show that $y \in V$.
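As an aside, the contraction just constructed can be run numerically. The Python sketch below (ad hoc names) applies the equivalent update $x \mapsto x - [Df(x^*)]^{-1}(f(x) - y)$, i.e. the map $\varphi_y$ for the normalized function $F(x) = [Df(x^*)]^{-1}f(x)$, to the map from Example 5 with $x^* = (0, 0)$, and recovers $f^{-1}(y)$ for a $y$ close to $f(x^*) = (1, 0)$.

```python
# Solving f(x) = y near x* by iterating phi_y(x) = x - [Df(x*)]^{-1} (f(x) - y),
# with f the map from Example 5 and x* = (0, 0), where Df(x*) = I.
import numpy as np

def f(x):
    return np.array([np.exp(x[0]) * np.cos(x[1]), np.exp(x[0]) * np.sin(x[1])])

def Df(x):
    return np.array([
        [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])],
        [np.exp(x[0]) * np.sin(x[1]),  np.exp(x[0]) * np.cos(x[1])],
    ])

x_star = np.array([0.0, 0.0])
A_inv = np.linalg.inv(Df(x_star))          # here just the identity matrix
y = np.array([1.05, 0.1])                  # a target value close to f(x*) = (1, 0)

x = x_star.copy()
for _ in range(50):                        # fixed-point iteration x <- phi_y(x)
    x = x - A_inv @ (f(x) - y)

print(x)          # approximately f^{-1}(y): (0.0533..., 0.0949...)
print(f(x), y)    # f at the computed point reproduces y
```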

Inequality (3) implies that for any $x \in K$,
\[
\|\varphi_y(x) - \hat x\| \le \|\varphi_y(x) - \varphi_y(\hat x)\| + \|\varphi_y(\hat x) - \hat x\| \le \frac{1}{2}\|x - \hat x\| + \|y - \hat y\| < \frac{\delta}{2} + \frac{\delta}{2} = \delta.
\]
Thus $\varphi_y$ maps $K$ into itself. Since $K$ is complete, and since by inequality (3), $\varphi_y$ is a contraction, the Contraction Mapping Theorem implies that there is an $x \in K$ such that $\varphi_y(x) = x$, hence there is an $x \in K$ such that $f(x) = y$. Since $K \subseteq U$, this implies $y \in V$, as was to be shown.

The last step is to show that $f^{-1}$ is $C^r$. I show first that $f^{-1}$ is continuous. Fix any $y \in V$, take any other $\hat y \in V$, and set $x = f^{-1}(y)$, $\hat x = f^{-1}(\hat y)$. Then
\[
\|f^{-1}(\hat y) - f^{-1}(y)\| = \|\hat x - x\| = \|\varphi_y(\hat x) + \hat y - \varphi_y(x) - y\| \le \|\varphi_y(\hat x) - \varphi_y(x)\| + \|\hat y - y\| \le \frac{1}{2}\|\hat x - x\| + \|\hat y - y\|,
\]
where the last inequality comes from inequality (3). Rearranging, and substituting $x = f^{-1}(y)$, $\hat x = f^{-1}(\hat y)$,
\[
\|f^{-1}(\hat y) - f^{-1}(y)\| \le 2\,\|\hat y - y\|, \tag{5}
\]
which implies that $f^{-1}$ is continuous.

I next claim that $f^{-1}$ is differentiable on $V$, and that, moreover, for any $y \in V$, setting $x = f^{-1}(y)$, $Df^{-1}(y) = [Df(x)]^{-1}$. Consider any $\hat y \in V$, $\hat y \ne y$. Set $\hat x = f^{-1}(\hat y)$. Then, since
\[
f^{-1}(\hat y) - f^{-1}(y) - [Df(x)]^{-1}(\hat y - y) = \hat x - x - [Df(x)]^{-1}(f(\hat x) - f(x)) = -[Df(x)]^{-1}\bigl(f(\hat x) - f(x) - Df(x)(\hat x - x)\bigr),
\]
and since $\|[Df(x)]^{-1}\| \le 2$ (by inequality (4)),
\[
\|f^{-1}(\hat y) - f^{-1}(y) - [Df(x)]^{-1}(\hat y - y)\| \le 2\,\|f(\hat x) - f(x) - Df(x)(\hat x - x)\|.
\]
Therefore, to show that
\[
\lim_{\hat y \to y} \frac{\|f^{-1}(\hat y) - f^{-1}(y) - [Df(x)]^{-1}(\hat y - y)\|}{\|\hat y - y\|} = 0,
\]
it suffices to show that
\[
\lim_{\hat y \to y} \frac{\|f(\hat x) - f(x) - Df(x)(\hat x - x)\|}{\|\hat y - y\|} = 0.
\]

But since $f$ is differentiable,
\[
\lim_{\hat x \to x} \frac{\|f(\hat x) - f(x) - Df(x)(\hat x - x)\|}{\|\hat x - x\|} = 0.
\]
By continuity of $f^{-1}$, $\hat y \to y$ implies $\hat x \to x$. In addition, inequality (5) establishes that
\[
\frac{\|\hat x - x\|}{\|\hat y - y\|} \le 2.
\]
Hence
\[
\lim_{\hat y \to y} \frac{\|f(\hat x) - f(x) - Df(x)(\hat x - x)\|}{\|\hat y - y\|} = \lim_{\hat y \to y} \frac{\|f(\hat x) - f(x) - Df(x)(\hat x - x)\|}{\|\hat x - x\|} \cdot \frac{\|\hat x - x\|}{\|\hat y - y\|} = 0,
\]
as was to be proved.

Since $Df^{-1}(y) = [Df(x)]^{-1}$ for any $y \in V$, since $Df$ is $C^{r-1}$ (because $f$ is $C^r$), and since the inverse operation on invertible matrices is smooth, this establishes that $Df^{-1}$ is $C^{r-1}$. Hence $f^{-1}$ is $C^r$, as was to be shown.

Remark 1. Equation (2) is consistent with the Chain Rule. Explicitly, suppose that we are simply told that $f^{-1}$ exists and is differentiable. Define $h : U \to U$ by $h(x) = f^{-1}(f(x))$. Then, by the Chain Rule,
\[
Dh(x) = Df^{-1}(f(x))\,Df(x).
\]
On the other hand, since $h(x) = x$,
\[
Dh(x) = I,
\]
where $I$ is the $N \times N$ identity matrix. Combining, and setting $y = f(x)$,
\[
Df^{-1}(y)\,Df(x) = I. \tag{6}
\]
Rearranging this yields equation (2).

Remark 2. Example 4 shows that $Df(x^*)$ being of full rank is not necessary for existence of an inverse. Equation (2) implies, however, that $Df(x^*)$ being of full rank is necessary for $f^{-1}$ to be differentiable at $y^* = f(x^*)$. The Inverse Function Theorem thus says that this necessary condition for a differentiable local inverse is also sufficient.

Remark 3. The above proof of the Inverse Function Theorem generalizes easily to infinite-dimensional spaces. See Lang (1988). A common alternative proof, used, for example, in Spivak (1965), replaces the contraction mapping argument with an argument based on existence of maxima for continuous functions defined on compact sets. This approach relies on Heine-Borel, which does not generalize.
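Equation (2) is also easy to check numerically. The Python sketch below (ad hoc names) uses the explicit local inverse of the Example 5 map near $(1, 0)$, namely $f^{-1}(y) = (\log\|y\|, \operatorname{atan2}(y_2, y_1))$, and compares a finite-difference Jacobian of $f^{-1}$ at $y = f(x)$ with $[Df(x)]^{-1}$; the two matrices agree up to finite-difference error.

```python
# Numerical check of equation (2): Df^{-1}(y) = [Df(x)]^{-1}, using the Example 5
# map and its explicit local inverse near (1, 0).
import numpy as np

def f(x):
    return np.array([np.exp(x[0]) * np.cos(x[1]), np.exp(x[0]) * np.sin(x[1])])

def f_inv(y):
    return np.array([np.log(np.linalg.norm(y)), np.arctan2(y[1], y[0])])

def Df(x):
    return np.array([
        [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])],
        [np.exp(x[0]) * np.sin(x[1]),  np.exp(x[0]) * np.cos(x[1])],
    ])

x = np.array([0.1, 0.2])
y = f(x)
h = 1e-6
# Finite-difference Jacobian of f^{-1} at y, one column per coordinate direction.
J_fd = np.column_stack([
    (f_inv(y + h * e) - f_inv(y - h * e)) / (2 * h) for e in np.eye(2)
])
print(J_fd)
print(np.linalg.inv(Df(x)))   # should match J_fd up to finite-difference error
```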

References

Lang, S. (1988): Differential Manifolds. Springer, 2nd edn.

Rudin, W. (1976): Principles of Mathematical Analysis. McGraw-Hill, New York, 3rd edn.

Spivak, M. (1965): Calculus on Manifolds. Benjamin/Cummings.