Calculus 2502A - Advanced Calculus I Fall : Local minima and maxima

Similar documents
REVIEW OF DIFFERENTIAL CALCULUS

LESSON 23: EXTREMA OF FUNCTIONS OF 2 VARIABLES OCTOBER 25, 2017

MA102: Multivariable Calculus

SIMPLE MULTIVARIATE OPTIMIZATION

e x = 1 + x + x2 2! + x3 If the function f(x) can be written as a power series on an interval I, then the power series is of the form

Lecture 8: Maxima and Minima

OR MSc Maths Revision Course

Polynomial functions right- and left-hand behavior (end behavior):

September Math Course: First Order Derivative

MATH Max-min Theory Fall 2016

Math 2204 Multivariable Calculus Chapter 14: Partial Derivatives Sec. 14.7: Maximum and Minimum Values

Test 3 Review. y f(a) = f (a)(x a) y = f (a)(x a) + f(a) L(x) = f (a)(x a) + f(a)

Math Maximum and Minimum Values, I

Tangent spaces, normals and extrema

MAXIMA AND MINIMA CHAPTER 7.1 INTRODUCTION 7.2 CONCEPT OF LOCAL MAXIMA AND LOCAL MINIMA

446 CHAP. 8 NUMERICAL OPTIMIZATION. Newton's Search for a Minimum of f(x,y) Newton s Method

DEPARTMENT OF MATHEMATICS AND STATISTICS UNIVERSITY OF MASSACHUSETTS. MATH 233 SOME SOLUTIONS TO EXAM 2 Fall 2018

Extrema of Functions of Several Variables

Math 212-Lecture Interior critical points of functions of two variables

The Derivative. Appendix B. B.1 The Derivative of f. Mappings from IR to IR

Higher order derivative

2x (x 2 + y 2 + 1) 2 2y. (x 2 + y 2 + 1) 4. 4xy. (1, 1)(x 1) + (1, 1)(y + 1) (1, 1)(x 1)(y + 1) 81 x y y + 7.

Chapter 2: Unconstrained Extrema

Solutions to the Calculus and Linear Algebra problems on the Comprehensive Examination of January 28, 2011

Faculty of Engineering, Mathematics and Science School of Mathematics

MTH4101 CALCULUS II REVISION NOTES. 1. COMPLEX NUMBERS (Thomas Appendix 7 + lecture notes) ax 2 + bx + c = 0. x = b ± b 2 4ac 2a. i = 1.

3 Applications of partial differentiation

Basics of Calculus and Algebra

Engg. Math. I. Unit-I. Differential Calculus

x x implies that f x f x.

8. Diagonalization.

Math 118, Fall 2014 Final Exam

Math 5BI: Problem Set 6 Gradient dynamical systems

Math 121 Winter 2010 Review Sheet

Functions of Several Variables

Exam 3 MATH Calculus I

ALGEBRAIC GEOMETRY HOMEWORK 3

1 Overview. 2 A Characterization of Convex Functions. 2.1 First-order Taylor approximation. AM 221: Advanced Optimization Spring 2016

Suppose that f is continuous on [a, b] and differentiable on (a, b). Then

Mathematic 108, Fall 2015: Solutions to assignment #7

Transpose & Dot Product

Lecture 4: Optimization. Maximizing a function of a single variable

1. Which one of the following points is a singular point of. f(x) = (x 1) 2/3? f(x) = 3x 3 4x 2 5x + 6? (C)

1. Introduction. 2. Outlines

3.5 Quadratic Approximation and Convexity/Concavity

Transpose & Dot Product

1 Lecture 25: Extreme values

Module Two: Differential Calculus(continued) synopsis of results and problems (student copy)

Study Guide/Practice Exam 3

The University of British Columbia Midterm 1 Solutions - February 3, 2012 Mathematics 105, 2011W T2 Sections 204, 205, 206, 211.

Math 10b Ch. 8 Reading 1: Introduction to Taylor Polynomials

HW3 - Due 02/06. Each answer must be mathematically justified. Don t forget your name. 1 2, A = 2 2

Student Study Session Topic: Interpreting Graphs

a x a y = a x+y a x a = y ax y (a x ) r = a rx and log a (xy) = log a (x) + log a (y) log a ( x y ) = log a(x) log a (y) log a (x r ) = r log a (x).

Math (P)refresher Lecture 8: Unconstrained Optimization

Functions of Several Variables

3.2. Polynomial Functions and Their Graphs. Copyright Cengage Learning. All rights reserved.

Multivariable Calculus

Solutions to the Multivariable Calculus and Linear Algebra problems on the Comprehensive Examination of January 31, 2014

106 Chapter 5 Curve Sketching. If f(x) has a local extremum at x = a and. THEOREM Fermat s Theorem f is differentiable at a, then f (a) = 0.

Contents. 2 Partial Derivatives. 2.1 Limits and Continuity. Calculus III (part 2): Partial Derivatives (by Evan Dummit, 2017, v. 2.

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema

MA202 Calculus III Fall, 2009 Laboratory Exploration 2: Gradients, Directional Derivatives, Stationary Points

Multivariable Calculus Notes. Faraad Armwood. Fall: Chapter 1: Vectors, Dot Product, Cross Product, Planes, Cylindrical & Spherical Coordinates

2.2 The Limit of a Function

MATH 019: Final Review December 3, 2017

Calculus for the Life Sciences II Assignment 6 solutions. f(x, y) = 3π 3 cos 2x + 2 sin 3y

Section 3.2 Polynomial Functions and Their Graphs

Implicit Functions, Curves and Surfaces

Optimization. Sherif Khalifa. Sherif Khalifa () Optimization 1 / 50

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018

10/22/16. 1 Math HL - Santowski SKILLS REVIEW. Lesson 15 Graphs of Rational Functions. Lesson Objectives. (A) Rational Functions

x +3y 2t = 1 2x +y +z +t = 2 3x y +z t = 7 2x +6y +z +t = a

14.7: Maxima and Minima

Solutions to Homework 7

Math 210, Final Exam, Fall 2010 Problem 1 Solution. v cosθ = u. v Since the magnitudes of the vectors are positive, the sign of the dot product will

B553 Lecture 1: Calculus Review

= c, we say that f ( c ) is a local

4 Partial Differentiation

AB Calculus Diagnostic Test

f(x, y) = 1 sin(2x) sin(2y) and determine whether each is a maximum, minimum, or saddle. Solution: Critical points occur when f(x, y) = 0.

Local Maximums and Local Minimums of Functions. f(x, y) has a local minimum at the point (x 0, y 0 ) if f(x 0, y 0 ) f(x, y) for

Convex Functions and Optimization

Math 180, Final Exam, Fall 2007 Problem 1 Solution

CHAPTER 4: HIGHER ORDER DERIVATIVES. Likewise, we may define the higher order derivatives. f(x, y, z) = xy 2 + e zx. y = 2xy.

Meaning of the Hessian of a function in a critical point

This exam will be over material covered in class from Monday 14 February through Tuesday 8 March, corresponding to sections in the text.

EC /11. Math for Microeconomics September Course, Part II Problem Set 1 with Solutions. a11 a 12. x 2

Functions. A function is a rule that gives exactly one output number to each input number.

Maxima and Minima. (a, b) of R if

Series Solutions of Differential Equations

Maxima and Minima. Marius Ionescu. November 5, Marius Ionescu () Maxima and Minima November 5, / 7

Bob Brown Math 251 Calculus 1 Chapter 4, Section 4 1 CCBC Dundalk

Announcements. Topics: Homework: - sections , 6.1 (extreme values) * Read these sections and study solved examples in your textbook!

Preliminaries Lectures. Dr. Abdulla Eid. Department of Mathematics MATHS 101: Calculus I

Partial Derivatives October 2013

Math 1314 Lesson 24 Maxima and Minima of Functions of Several Variables

4 3A : Increasing and Decreasing Functions and the First Derivative. Increasing and Decreasing. then

Partial Derivatives. w = f(x, y, z).

Nonlinear Optimization

Transcription:

Calculus 50A - Advanced Calculus I Fall 014 14.7: Local minima and maxima Martin Frankland November 17, 014 In these notes, we discuss the problem of finding the local minima and maxima of a function. 1 Background and terminology Definition 1.1. Let f : D R be a function, with domain D R n. A point a D is called: a local minimum of f if f( a) f( x) holds for all x in some neighborhood of a. a local maximum of f if f( a) f( x) holds for all x in some neighborhood of a. a global minimum (or absolute minimum) of f if f( a) f( x) holds for all x D. a global maximum (or absolute maximum) of f if f( a) f( x) holds for all x D. An extremum means a minimum or a maximum. Definition 1.. An interior point a D is called a: critical point of f if f is not differentiable at a or if the condition f( a) = 0 holds, i.e., all first-order partial derivatives of f are zero at a. saddle point of f if it is a critical point which is not a local extremum. In other words, for every neighborhood U of a, there is some point x U satisfying f( x) < f( a) and some point x U satisfying f( x) > f( a). Remark 1.3. Global minima or maxima are not necessarily unique. For example, the function f : R R defined by f(x) = sin x has a global maximum with value 1 at the points:..., 3π, π, 5π,... = { π + kπ k Z } 1

and a global minimum with value 1 at the points:..., π, 3π, 7π,... = { 3π + kπ k Z }. For a sillier example, consider a constant function g( x) = c for all x D. Then every point x D is simultaneously a global minimum and a global maximum of g. Proposition 1.4. Let a be an interior point of the domain D. If a is a local extremum of f, then any partial derivative of f which exists at a must be zero. Proof. Assume that the partial derivative D m f( a) exists, for some 1 m n. Then the function of a single variable m th coordinate {}}{ g(x) := f(a 1, a,..., x,..., a n 1, a n ) is differentiable at x = a m and has a local extremum at x = a m. Therefore its derivative vanishes at that point: g (a m ) = D m f( a) = 0. Remark 1.5. The converse does not hold: a critical point of f need not be a local extremum. For example, the function f(x) = x 3 has a critical point at x = 0 which is a saddle point. Upshot: of f. In order to find the local extrema of f, it suffices to consider the critical points

One-dimensional case Recall the second derivative test from single variable calculus. Proposition.1 (Second derivative test). Let f : D R be a function of a single variable, with domain D R. Let a D be a critical point of f such that the first derivative f exists in a neighborhood of a, and the second derivative f (a) exists. If f (a) > 0 holds, then f has a local minimum at a. If f (a) < 0 holds, then f has a local maximum at a. If f (a) = 0 holds, then the test is inconclusive. Remark.. The following examples illustrate why the test is inconclusive when the critical point a satisfies f (a) = 0. The function f(x) = x 4 has a local minimum at 0. The function f(x) = x 4 has a local maximum at 0. The function f(x) = x 3 has a saddle point at 0. Example.3. Find all local extrema of the function f : R R defined by f(x) = 3x 4 4x 3 1x + 10. Solution. Note that f is twice differentiable on all of R, and moreover the domain of f is open in R (so that there are no boundary points to check). Let us find the critical points of f: f (x) = 1x 3 1x 4x = 0 x 3 x x = 0 x ( x x ) = 0 x(x + 1)(x ) = 0 x = 1, 0, or. Let us use the second derivative test to classify these critical points: f (x) = 1 ( 3x x ) from which we conclude: f ( 1) = 1 (3 + ) = 1(3) > 0 f (0) = 1 ( ) < 0 f () = 1 (1 4 ) = 1(6) > 0 3

x = 1 is a local minimum of f, with value f( 1) = 3 + 4 1 + 10 = 5. x = 0 is a local maximum of f, with value f(0) = 10. x = is a local minimum of f, with value f() = 3(16) 4(8) 1(4) + 10 =. Remark.4. In this example, the function f has no global maximum, because it grows arbitrarily large: lim x + f(x) = +. Moreover, the condition lim x f(x) = +, together with the extreme value theorem, implies that f reaches a global minimum. Therefore, x = is in fact the global minimum of f, with value f() =, as illustrated in Figure 1. 40 f(x) 0 1 1 3 x 0 Figure 1: Graph of the function f in Example.3. 4

3 Two-dimensional case Theorem 3.1 (Second derivatives test). Let f : D R be a function of two variables, with domain D R. Let (a, b) D be a critical point of f such that the second partial derivatives f xx, f xy, f yx, f yy exist in a neighborhood of (a, b) and are continuous at (a, b), which implies in particular f xy (a, b) = f yx (a, b). The discriminant of f is the determinant D := f xx f xy f yx f yy = f xx f yy f xy. Now consider the discriminant D = D(a, b) at the point (a, b). If D > 0 and f xx (a, b) > 0 hold, then f has a local minimum at (a, b). If D > 0 and f xx (a, b) < 0 hold, then f has a local maximum at (a, b). If D < 0 holds, then f has a saddle point at (a, b). If D = 0 holds, then the test is inconclusive. Remark 3.. The following examples illustrate why the test is inconclusive when the discriminant satisfies D = 0. The function f(x, y) = x + y 4 has a (strict) local minimum at (0, 0). The function f(x, y) = x has a (non-strict) local minimum at (0, 0). The function f(x, y) = x y 4 has a (strict) local maximum at (0, 0). The function f(x, y) = x has a (non-strict) local maximum at (0, 0). The function f(x, y) = x + y 3 has a saddle point at (0, 0). The function f(x, y) = x + y 3 has a saddle point at (0, 0). The function f(x, y) = x y 4 has a saddle point at (0, 0). Example 3.3 (# 14.7.11). Find all local extrema and saddle points of the function f : R R defined by f(x, y) = x 3 1xy + 8y 3. 5

Solution. Note that f is infinitely many times differentiable on its domain R. Let us find the critical points of f: { f x (x, y) = 3x 1y = 0 f y (x, y) = 1x + 4y = 0 { 4y = x x = y 4y = (y ) = 4y 4 y = y 4 y(y 3 1) = 0 y = 0 or y 3 = 1 y = 0 or y = 1 y = 0 x = (0) = 0 y = 1 x = (1) = which implies that (0, 0) and (, 1) are the only points that could be critical points of f, and indeed they are. Let us compute the discriminant of f: f xx (x, y) = 6x f xy (x, y) = 1 f yy (x, y) = 48y D(x, y) = f xx (x, y)f yy (x, y) f xy (x, y) = (6x)(48y) ( 1) = 1 (xy 1). At the critical point (0, 0), the discriminant is: so that (0, 0) is a saddle point. D(0, 0) = 1 (0 1) < 0 At the critical point (, 1), the discriminant is: D(, 1) = 1 (4 1) > 0 and we have f xx (, 1) = 6() = 1 > 0, so that (, 1) is a local minimum with value f(, 1) = 8 4 + 8 = 8, as illustrated in Figure.. 6

500 0 z 500 4 x 0 0 4 4 4 y 4 Why does it work? Figure : Graph of the function f in Example 3.3. 4.1 Quadratic approximation The second derivatives test relies on the quadratic approximation of f at the point (a, b), which is the Taylor polynomial of degree at that point. Recall that for functions of a single variable f(x), the quadratic approximation of f at a is: f(x) f(a) + f (a)(x a) + f (a) (x a). Since the variable of interest is the displacement from the basepoint a, let us rewrite h := x a and thus: f(a + h) f(a) + f (a)h + f (a) h. If a is a critical point of f, then we have f (a) = 0 and the quadratic approximation becomes: f(a + h) f(a) + f (a) h. If f (a) 0 holds, then the quadratic term will determine the local behavior of f around a: f (a) > 0 yields a local minimum while f (a) < 0 yields a local maximum. Idea: If f (a) > 0 holds, then f qualitatively behaves like f(a + h) = f(a) + h when h is small. If f (a) < 0 holds, then f behaves like f(a + h) = f(a) h. 7

For functions of two variables f(x, y), the quadratic approximation at (a, b) is: f(x, y) f(a, b) + f x (a, b)(x a) + f y (a, b)(y b)+ (x a) (y b) + f xx (a, b) + f xy (a, b)(x a)(y b) + f yy (a, b) Again, consider the displacement from the basepoint (a, b) and write and thus: h := (h, k) := (x a, y b) f(a + h, b + k) f(a, b) + f x (a, b)h + f y (a, b)k + f xx (a, b) h + f xy(a, b)hk + f yy (a, b) k. In vector notation, which is convenient in higher dimension, this can be rewritten as: f(a + h, b + k) f(a, b) + f(a, b) h + 1 h T H f (a, b) h where H f (a, b) is the Hessian of f at the point (a, b), defined as the matrix of second partial derivatives: [ ] fxx f H f = xy. Here h T denotes the row vector [ h k ], which is the transpose of the column vector h = f yx If (a, b) is a critical point of f, then we have f x (a, b) = 0 and f y (a, b) = 0, and the quadratic approximation becomes: f(a + h, b + k) f(a, b) + f xx (a, b) h + f xy(a, b)hk + f yy (a, b) k f yy = f(a, b) + 1 h T H f (a, b) h. [ ] h. k 4. Behavior close to a critical point The Hessian H f (a, b) should be viewed as a symmetric bilinear form on the tangent space of the domain D at the point (a, b). Given two direction vectors v and w, the Hessian outputs the number v T H f (a, b) w = D w D v f(a, b) where D v f(a, b) denotes the derivative of f in the (non-normalized) direction v at the point (a, b). For example, taking standard basis vectors v = ı and w = ı yields the number ı T H f (a, b) ı = D ı D ı f(a, b) = f xx (a, b) 8

and taking v = ı and w = j yields the number ı T H f (a, b) j = D j D ı f(a, b) = f xy (a, b). From linear algebra, we know that a symmetric bilinear form is classified, up to a change of basis, by its signature: how many of its eigenvalues are positive, zero, or negative. Let λ 1, λ be the eigenvalues of H f (a, b). Depending on the signs of λ 1 and λ, the bilinear form H f (a, b) will be equivalent (up to a change of basis) to one of the following: [ ] 1 0 if λ 0 1 1 and λ are both positive. [ ] 1 0 if one λ 0 1 i is positive and the other is negative. [ ] 1 0 if λ 0 1 1 and λ are both negative. [ ] 1 0 if one λ 0 0 i is positive and the other is zero. [ ] 1 0 if one λ 0 0 i is negative and the other is zero. [ ] 0 0 if λ 0 0 1 and λ are both zero. Upshot: To analyze the behavior of f close to a critical point (a, b), the first step is to find the signs of the eigenvalues of the Hessian H f (a, b). The discriminant of f was defined as the determinant of the Hessian, which is the product of the eigenvalues: D(a, b) = det H f (a, b) = λ 1 λ. Definition 4.1. A critical point (a, b) of f is called non-degenerate if the Hessian H f (a, b) is non-degenerate, i.e., has only non-zero eigenvalues. Otherwise, the critical point is called degenerate, which means that the Hessian H f (a, b) has an eigenvalue 0. Note that a critical point is degenerate if and only if the product of the eigenvalues is zero: λ 1 λ = 0; equivalently, the discriminant is zero: D(a, b) = 0. Example 4.. For all the functions described in Example 3., the origin (0, 0) is a degenerate critical point of f. As we have seen, the function f can have all kinds of behaviors close to a degenerate critical point. In contrast, close to a non-degenerate critical point, the behavior of f is dictated by the signature of the Hessian. If the discriminant satisfies D(a, b) 0, then the critical point (a, b) is non-degenerate, and its signature is one of the following: 9

[ ] 1 0 0 1 [ ] 1 0 0 1 [ ] 1 0. 0 1 The case of one positive eigenvalue and one negative eigenvalue happens if and only if the discriminant is negative: D = λ 1 λ < 0. In the case D > 0, there are still two possible signatures: [ ] 1 0 0 1 [ ] 1 0. 0 1 To distinguish between the two cases, Sylvester s criterion (from linear algebra) tells us that both eigenvalues are positive if and only if the top left entry of the matrix is positive: f xx (a, b) > 0. Remark 4.3. For this last step, one could also use the equivalent condition f yy (a, b) > 0, or that the trace of the matrix is positive: f xx (a, b) + f yy (a, b) = λ 1 + λ > 0. Summary: Here is a reinterpretation of the second derivatives test. D = 0 one of the eigenvalues λ i is zero. D < 0 one of the eigenvalues is positive and the other is negative. D > 0 and f xx > 0 both eigenvalues λ i are positive. D > 0 and f xx < 0 both eigenvalues λ i are negative. Idea: If both eigenvalues are positive, then f qualitatively behaves like f(a + h, b + k) = f(a, b) + h + k when h and k are small. If one eigenvalue is positive and the other is negative, then f behaves like f(a+h, b+k) = f(a, b) + h k. If both eigenvalues are negative, then f behaves like f(a + h, b + k) = f(a, b) h k. 10