Max-Min Problems in $\mathbb{R}^n$ and the Hessian Matrix

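Andrilli/Hecker, Elementary Linear Algebra, 4th ed. March 15, 2010. Copyright 2010, Elsevier Inc. All rights reserved.

Prerequisite: Section 6.3, Orthogonal Diagonalization

In this section, we study the problem of finding local maxima and minima for real-valued functions on $\mathbb{R}^n$. The method we describe is the higher-dimensional analogue of finding critical points and applying the second derivative test to functions on $\mathbb{R}$, as studied in first-semester calculus.

Taylor's Theorem in $\mathbb{R}^n$

Let $f \in C^2(\mathbb{R}^n)$, where $C^2(\mathbb{R}^n)$ is the set of real-valued functions defined on $\mathbb{R}^n$ having continuous second partial derivatives. The method for solving for local extreme points of $f$ relies upon Taylor's Theorem with second degree remainder terms, which we state here without proof. (In the following theorem, an open hypersphere centered at $\mathbf{x}_0$ is a set of the form $\{\mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| < r\}$ for some positive real number $r$.)

THEOREM 1 (Taylor's Theorem in $\mathbb{R}^n$) Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, let $\mathbf{u}$ be a unit vector in $\mathbb{R}^n$, and let $t \in \mathbb{R}$ such that $\mathbf{x}_0 + t\mathbf{u} \in A$. Suppose $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$; that is, $f \in C^2(A)$. Then there is a $c$ with $0 \le c \le t$ such that

$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} (t u_i) + \frac{1}{2} \sum_{i=1}^{n} \left.\frac{\partial^2 f}{\partial x_i^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i^2) + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \left.\frac{\partial^2 f}{\partial x_i \partial x_j}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i u_j).$$

Taylor's Theorem in $\mathbb{R}^n$ is derived from the familiar Taylor's Theorem in $\mathbb{R}$ by applying it to the function $g(t) = f(\mathbf{x}_0 + t\mathbf{u})$. In $\mathbb{R}^2$, the formula in Taylor's Theorem is

$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \left.\frac{\partial f}{\partial x}\right|_{\mathbf{x}_0} (t u_1) + \left.\frac{\partial f}{\partial y}\right|_{\mathbf{x}_0} (t u_2) + \frac{1}{2} \left.\frac{\partial^2 f}{\partial x^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_1^2) + \frac{1}{2} \left.\frac{\partial^2 f}{\partial y^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_2^2) + \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_1 u_2).$$

Recall that the gradient of $f$ is defined by $\nabla f = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right]$. If we let $\mathbf{v} = t\mathbf{u}$, then, in $\mathbb{R}^2$, $\mathbf{v} = [v_1, v_2] = [t u_1, t u_2]$, and so the sum $\left.\frac{\partial f}{\partial x}\right|_{\mathbf{x}_0} (t u_1) + \left.\frac{\partial f}{\partial y}\right|_{\mathbf{x}_0} (t u_2)$ simplifies to $\nabla f|_{\mathbf{x}_0} \cdot \mathbf{v}$. Also, since $f$ has continuous second partial derivatives, we have $\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}$. Therefore,

$$\frac{1}{2} \frac{\partial^2 f}{\partial x^2} (t^2 u_1^2) + \frac{1}{2} \frac{\partial^2 f}{\partial y^2} (t^2 u_2^2) + \frac{\partial^2 f}{\partial x \partial y} (t^2 u_1 u_2) = \frac{1}{2} v_1 \left( \frac{\partial^2 f}{\partial x^2} v_1 + \frac{\partial^2 f}{\partial x \partial y} v_2 \right) + \frac{1}{2} v_2 \left( \frac{\partial^2 f}{\partial y \partial x} v_1 + \frac{\partial^2 f}{\partial y^2} v_2 \right) = \frac{1}{2} \mathbf{v}^T \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix} \mathbf{v},$$

where all second partial derivatives are evaluated at $\mathbf{x}_0 + c\mathbf{u}$ and $\mathbf{v}$ is considered to be a column vector. The matrix

$$\mathbf{H} = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}$$

in this expression is called the Hessian matrix for $f$. Thus, in the $\mathbb{R}^2$ case, with $\mathbf{v} = t\mathbf{u}$, the formula in Taylor's Theorem can be written as

$$f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f|_{\mathbf{x}_0} \cdot \mathbf{v} + \frac{1}{2} \mathbf{v}^T \, \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \, \mathbf{v},$$

for some $k$ with $0 \le k \le 1$ (where $k = c/t$). While we have derived this result in $\mathbb{R}^2$, the same formula holds in $\mathbb{R}^n$, where the Hessian $\mathbf{H}$ is the matrix whose $(i, j)$ entry is $\frac{\partial^2 f}{\partial x_i \partial x_j}$.

Critical Points

If $A$ is a subset of $\mathbb{R}^n$, then we say that $f : A \to \mathbb{R}$ has a local maximum at a point $\mathbf{x}_0 \in A$ if and only if there is an open neighborhood $U$ of $\mathbf{x}_0$ such that $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for all $\mathbf{x} \in U$. A local minimum for a function $f$ is defined analogously.

THEOREM 2 Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, and let $f : A \to \mathbb{R}$ have continuous first partial derivatives on $A$. If $f$ has a local maximum or a local minimum at $\mathbf{x}_0$, then $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

Proof. If $\mathbf{x}_0$ is a local maximum, then $f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0) \le 0$ for small $h$. Then $\lim_{h \to 0^+} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} \le 0$. Similarly, $\lim_{h \to 0^-} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} \ge 0$. Hence, for the limit to exist, we must have $\left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} = 0$. Since this is true for each $i$, $\nabla f(\mathbf{x}_0) = \mathbf{0}$. A similar proof works for local minimums. QED

Points $\mathbf{x}_0$ at which $\nabla f(\mathbf{x}_0) = \mathbf{0}$ are called critical points.

Example 1 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = 7x^2 + 6xy + 2x + 7y^2 - 22y + 23$. Then $\nabla f = [14x + 6y + 2,\ 6x + 14y - 22]$. We find critical points for $f$ by solving $\nabla f = \mathbf{0}$. This is the linear system

$$\begin{cases} 14x + 6y + 2 = 0 \\ 6x + 14y - 22 = 0 \end{cases}$$

which has the unique solution $\mathbf{x}_0 = [-1, 2]$. Hence, by Theorem 2, $(-1, 2)$ is the only possible extreme point for $f$. (We will see later that $(-1, 2)$ is a local minimum.)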
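The critical-point computation in Example 1 can be checked with a short script. This is a minimal sketch rather than part of the original text: it assumes the gradient $[14x + 6y + 2,\ 6x + 14y - 22]$ from the example, and the helper names `solve_2x2` and `grad_f` are hypothetical. It solves the linear system $\nabla f = \mathbf{0}$ exactly by Cramer's rule with rational arithmetic, then substitutes the solution back into the gradient.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] [x, y]^T = [b1, b2]^T by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular system: no unique critical point")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y


def grad_f(x, y):
    """Gradient assumed from Example 1: [14x + 6y + 2, 6x + 14y - 22]."""
    return (14 * x + 6 * y + 2, 6 * x + 14 * y - 22)


# grad f = 0 is the linear system 14x + 6y = -2, 6x + 14y = 22.
critical_point = solve_2x2(14, 6, 6, 14, -2, 22)

# Substituting back should give the zero vector.
gradient_at_critical_point = grad_f(*critical_point)
```

Because the coefficient matrix has nonzero determinant ($14 \cdot 14 - 6 \cdot 6 = 160$), the system has exactly one solution, which matches the claim that $f$ has a single critical point.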

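Sufficient Conditions for Local Extreme Points

If $\mathbf{x}_0$ is a critical point for a function $f$, how can we determine whether $\mathbf{x}_0$ is a local maximum or a local minimum? For functions on $\mathbb{R}$, we have the second derivative test from calculus, which says that if $f''(x_0) < 0$, then $x_0$ is a local maximum, but if $f''(x_0) > 0$, then $x_0$ is a local minimum. We now derive a similar test in $\mathbb{R}^n$. Consider the following formula from Taylor's Theorem:

$$f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \mathbf{v} + \frac{1}{2} \mathbf{v}^T \, \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \, \mathbf{v}.$$

At a critical point, $\nabla f(\mathbf{x}_0) = \mathbf{0}$, and so

$$f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \frac{1}{2} \mathbf{v}^T \, \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \, \mathbf{v}.$$

Hence, if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \mathbf{v}$ is positive for all small nonzero vectors $\mathbf{v}$, then $f$ will have a local minimum at $\mathbf{x}_0$. (Similarly, if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \mathbf{v}$ is negative, $f$ will have a local maximum.) But since we assume that $f$ has continuous second partial derivatives, $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \mathbf{v}$ is continuous in $\mathbf{v}$ and $k$, and will be positive for small $\mathbf{v}$ if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ is positive for all nonzero $\mathbf{v}$. Hence,

THEOREM 3 Given the conditions of Taylor's Theorem for a set $A$ and a function $f : A \to \mathbb{R}$, $f$ has a local minimum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, $f$ has a local maximum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

Positive Definite Quadratic Forms

If $\mathbf{v}$ is a vector in $\mathbb{R}^n$, and $\mathbf{A}$ is an $n \times n$ matrix, the expression $\mathbf{v}^T \mathbf{A} \mathbf{v}$ is known as a quadratic form. (For more details on the general theory of quadratic forms, see Section 8.11.) A quadratic form such that $\mathbf{v}^T \mathbf{A} \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$ is said to be positive definite. Similarly, a quadratic form such that $\mathbf{v}^T \mathbf{A} \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$ is said to be negative definite.

Now, in particular, the expression $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ in Theorem 3 is a quadratic form. Theorem 3 then says that if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ is a positive definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. Theorem 3 also says that if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ is a negative definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.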
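To make the definition of a quadratic form concrete, the following sketch evaluates $\mathbf{v}^T \mathbf{A} \mathbf{v}$ for a few sample vectors. The function name `quadratic_form` and the three matrices are hypothetical, chosen only for illustration; they do not come from the text. Note that sampling a handful of vectors can show that a form is not definite (by exhibiting values of both signs), but it cannot by itself prove that a form is positive or negative definite.

```python
def quadratic_form(A, v):
    """Return v^T A v for a square matrix A (a list of rows) and a vector v."""
    n = len(v)
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))


# Hypothetical matrices, chosen only for illustration.
A_pos = [[2, 0], [0, 3]]     # v^T A v = 2*v1^2 + 3*v2^2, positive for v != 0
A_neg = [[-2, 0], [0, -3]]   # the negative of A_pos, so negative for v != 0
A_mixed = [[1, 0], [0, -1]]  # v^T A v = v1^2 - v2^2, takes both signs

samples = [(1, 0), (0, 1), (1, 1), (-2, 5)]

values_pos = [quadratic_form(A_pos, v) for v in samples]
values_neg = [quadratic_form(A_neg, v) for v in samples]
values_mixed = [quadratic_form(A_mixed, v) for v in samples]
```

The third matrix gives a form that is neither positive definite nor negative definite: it is positive at $(1, 0)$, negative at $(0, 1)$, and zero at the nonzero vector $(1, 1)$.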
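Therefore, we need a method to determine whether a quadratic form of this type is positive definite or negative definite. Now, the Hessian matrix $\mathbf{H}|_{\mathbf{x}_0}$, which we will abbreviate as $\mathbf{H}$, is symmetric because $\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$ (since $f \in C^2(A)$). Hence, by Theorem 6.20, $\mathbf{H}$ can be orthogonally diagonalized. That is, there is an orthogonal matrix $\mathbf{P}$ such that $\mathbf{P}\mathbf{H}\mathbf{P}^T = \mathbf{D}$, a diagonal matrix, and so, $\mathbf{H} = \mathbf{P}^T \mathbf{D} \mathbf{P}$. Hence, $\mathbf{v}^T \mathbf{H} \mathbf{v} = \mathbf{v}^T \mathbf{P}^T \mathbf{D} \mathbf{P} \mathbf{v} = (\mathbf{P}\mathbf{v})^T \mathbf{D} (\mathbf{P}\mathbf{v})$. Letting $\mathbf{w} = \mathbf{P}\mathbf{v}$, we get $\mathbf{v}^T \mathbf{H} \mathbf{v} = \mathbf{w}^T \mathbf{D} \mathbf{w}$. But $\mathbf{P}$ is nonsingular, so as $\mathbf{v}$ ranges over all of $\mathbb{R}^n$, so does $\mathbf{w}$, and vice versa. Thus, $\mathbf{v}^T \mathbf{H} \mathbf{v} > 0$ for all nonzero $\mathbf{v}$ if and only if $\mathbf{w}^T \mathbf{D} \mathbf{w} > 0$ for all nonzero $\mathbf{w}$.

Now, $\mathbf{D}$ is diagonal, and so $\mathbf{w}^T \mathbf{D} \mathbf{w} = d_{11} w_1^2 + d_{22} w_2^2 + \cdots + d_{nn} w_n^2$. But the $d_{ii}$'s are the eigenvalues of $\mathbf{H}$. Thus, it follows that $\mathbf{w}^T \mathbf{D} \mathbf{w} > 0$ for all nonzero $\mathbf{w}$ if and only if all of these eigenvalues are positive. (Set $\mathbf{w} = \mathbf{e}_i$ for each $i$ to prove the "only if" part of this statement.) Similarly, $\mathbf{w}^T \mathbf{D} \mathbf{w} < 0$ for all nonzero $\mathbf{w}$ if and only if all of these eigenvalues are negative. Hence,

THEOREM 4 A symmetric matrix $\mathbf{A}$ defines a positive definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if all of the eigenvalues of $\mathbf{A}$ are positive. A symmetric matrix $\mathbf{A}$ defines a negative definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if all of the eigenvalues of $\mathbf{A}$ are negative.

Hence, Theorem 3 can be restated as follows: Given the conditions of Taylor's Theorem for a set $A$ and a function $f : A \to \mathbb{R}$:
(1) if all of the eigenvalues of $\mathbf{H}|_{\mathbf{x}_0}$ are positive at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$, and
(2) if all of the eigenvalues of $\mathbf{H}|_{\mathbf{x}_0}$ are negative at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.

Example 2 Consider the function $f(x, y) = 7x^2 + 6xy + 2x + 7y^2 - 22y + 23$. In Example 1, we found that $f$ has a critical point at $\mathbf{x}_0 = [-1, 2]$. Now, the Hessian matrix for $f$ at $\mathbf{x}_0$ is

$$\mathbf{H} = \left.\begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}\right|_{\mathbf{x}_0} = \begin{bmatrix} 14 & 6 \\ 6 & 14 \end{bmatrix}.$$

But $p_{\mathbf{H}}(x) = x^2 - 28x + 160$, which has roots $x = 8$ and $x = 20$. Thus, $\mathbf{H}$ has all eigenvalues positive, and hence, $\mathbf{v}^T \mathbf{H} \mathbf{v}$ is positive definite. Theorem 4 then tells us that $\mathbf{x}_0 = [-1, 2]$ is a local minimum for $f$.

Local Maxima and Minima in $\mathbb{R}^2$

It can be shown (see Exercise 3) that a symmetric $2 \times 2$ matrix $\mathbf{A}$ defines a positive definite quadratic form ($\mathbf{v}^T \mathbf{A} \mathbf{v} > 0$ for all nonzero $\mathbf{v}$) if and only if $a_{11} > 0$ and $|\mathbf{A}| > 0$. Similarly, a symmetric $2 \times 2$ matrix defines a negative definite quadratic form ($\mathbf{v}^T \mathbf{A} \mathbf{v} < 0$ for all nonzero $\mathbf{v}$) if and only if $a_{11} < 0$ and $|\mathbf{A}| > 0$.

Example 3 Suppose $f(x, y) = 2x^2 - 2x^2 y^2 + 2y^2 + 24y - x^4 - y^4$.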
First, we look for critical points by solving the system

$$\begin{cases} \dfrac{\partial f}{\partial x} = 4x - 4xy^2 - 4x^3 = 4x\left(1 - (y^2 + x^2)\right) = 0 \\[2ex] \dfrac{\partial f}{\partial y} = -4x^2 y + 4y + 24 - 4y^3 = -4y(x^2 + y^2) + 4y + 24 = 0 \end{cases}$$

Now $\frac{\partial f}{\partial x} = 0$ yields $x = 0$ or $y^2 + x^2 = 1$. If $x = 0$, then $\frac{\partial f}{\partial y} = 0$ gives $4y + 24 - 4y^3 = 0$. The unique real solution to this equation is $y = 2$. Thus, $[0, 2]$ is a critical point. If $x \ne 0$, then $y^2 + x^2 = 1$. From $\frac{\partial f}{\partial y} = 0$, we have $0 = -4y(1) + 4y + 24 = 24$, a contradiction, so there is no critical point when $x \ne 0$.
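The search for critical points above can be spot-checked numerically. The sketch below assumes $f(x, y) = 2x^2 - 2x^2y^2 + 2y^2 + 24y - x^4 - y^4$, with partial derivatives as in the system just solved; the name `grad_f` and the sample angles are hypothetical. It confirms that the gradient vanishes at $[0, 2]$, that $y = 2$ is the only real root on the line $x = 0$, and that $\frac{\partial f}{\partial y}$ is stuck at $24$ on the circle $x^2 + y^2 = 1$.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = 2x^2 - 2x^2 y^2 + 2y^2 + 24y - x^4 - y^4."""
    df_dx = 4 * x - 4 * x * y**2 - 4 * x**3
    df_dy = -4 * x**2 * y + 4 * y + 24 - 4 * y**3
    return (df_dx, df_dy)


# The gradient vanishes at the critical point [0, 2].
grad_at_critical_point = grad_f(0, 2)

# On the line x = 0, df/dy = 0 reduces to y^3 - y - 6 = 0, which factors as
# (y - 2)(y^2 + 2y + 3). The quadratic factor has negative discriminant,
# so y = 2 is the only real root.
cubic_at_2 = 2**3 - 2 - 6
quadratic_discriminant = 2**2 - 4 * 1 * 3

# On the circle x^2 + y^2 = 1 (the other way to make df/dx vanish),
# df/dy collapses to the constant 24, so no critical point lies there.
circle_values = [grad_f(math.cos(t), math.sin(t))[1] for t in (0.3, 1.0, 2.5)]
```

The circle check uses floating-point arithmetic, so the sampled values agree with $24$ only up to rounding error.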

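Next, we compute the Hessian matrix at the critical point $[0, 2]$:

$$\mathbf{H} = \left.\begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}\right|_{[0,2]} = \left.\begin{bmatrix} 4 - 4y^2 - 12x^2 & -8xy \\ -8xy & -4x^2 + 4 - 12y^2 \end{bmatrix}\right|_{[0,2]} = \begin{bmatrix} -12 & 0 \\ 0 & -44 \end{bmatrix}.$$

Since the $(1, 1)$ entry is negative and $|\mathbf{H}| > 0$, $\mathbf{H}$ defines a negative definite quadratic form and so $f$ has a local maximum at $[0, 2]$.

An Example in $\mathbb{R}^3$

Example 4 Consider the function $g(x, y, z) = 5x^2 + 2xz + 4xy + 10x + 3z^2 - 6yz - 6z + 5y^2 + 12y + 21$. We find the critical points by solving the system

$$\begin{cases} \dfrac{\partial g}{\partial x} = 10x + 2z + 4y + 10 = 0 \\[2ex] \dfrac{\partial g}{\partial y} = 4x - 6z + 10y + 12 = 0 \\[2ex] \dfrac{\partial g}{\partial z} = 2x + 6z - 6y - 6 = 0 \end{cases}$$

Using row reduction to solve this linear system yields the unique critical point $[-9, 12, 16]$. The Hessian matrix at $[-9, 12, 16]$ is

$$\mathbf{H} = \left.\begin{bmatrix} \frac{\partial^2 g}{\partial x^2} & \frac{\partial^2 g}{\partial x \partial y} & \frac{\partial^2 g}{\partial x \partial z} \\ \frac{\partial^2 g}{\partial y \partial x} & \frac{\partial^2 g}{\partial y^2} & \frac{\partial^2 g}{\partial y \partial z} \\ \frac{\partial^2 g}{\partial z \partial x} & \frac{\partial^2 g}{\partial z \partial y} & \frac{\partial^2 g}{\partial z^2} \end{bmatrix}\right|_{[-9,12,16]} = \begin{bmatrix} 10 & 4 & 2 \\ 4 & 10 & -6 \\ 2 & -6 & 6 \end{bmatrix}.$$

A lengthy computation produces $p_{\mathbf{H}}(x) = x^3 - 26x^2 + 164x - 8$. The roots of $p_{\mathbf{H}}(x)$ are approximately $0.04916$, $10.6011$, and $15.3497$. Since all of these eigenvalues for $\mathbf{H}$ are positive, $[-9, 12, 16]$ is a local minimum for $g$.

Failure of the Hessian Matrix Test

In calculus, we discovered that the second derivative test fails when the second derivative is zero at a critical point. A similar situation is true in $\mathbb{R}^n$. If the Hessian matrix at a critical point has 0 as an eigenvalue, and all other eigenvalues have the same sign, then the function $f$ could have a local maximum, a local minimum, or neither at this critical point. Of course, if the Hessian matrix at a critical point has two eigenvalues with opposite signs, the critical point is not a local extreme point (why?). Exercise 2 illustrates these concepts.

New Vocabulary

$C^2(\mathbb{R}^n)$ (functions from $\mathbb{R}^n$ to $\mathbb{R}$ having continuous second partial derivatives)
critical point (of a function)
gradient (of a function on $\mathbb{R}^n$)
Hessian matrix
local maximum (of a function on $\mathbb{R}^n$)
local minimum (of a function on $\mathbb{R}^n$)
negative definite quadratic form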

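open hypersphere (in $\mathbb{R}^n$)
positive definite quadratic form
Taylor's Theorem (in $\mathbb{R}^n$)

Highlights

The gradient of a function $f : \mathbb{R}^n \to \mathbb{R}$ is defined by $\nabla f = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right]$.

Let $A$ be an open hypersphere about $\mathbf{x}_0$, and let $f$ be a function on $A$ with continuous partial derivatives. If $f$ has a local maximum or minimum at $\mathbf{x}_0$, then $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

For a function $f : \mathbb{R}^n \to \mathbb{R}$, its corresponding Hessian matrix is the $n \times n$ matrix whose $(i, j)$ entry is $\frac{\partial^2 f}{\partial x_i \partial x_j}$. In particular, for a function $f : \mathbb{R}^2 \to \mathbb{R}$, the Hessian matrix is $\mathbf{H} = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}$.

Taylor's Theorem in $\mathbb{R}^n$: Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, let $\mathbf{u}$ be a unit vector in $\mathbb{R}^n$, and let $t \in \mathbb{R}$ such that $\mathbf{x}_0 + t\mathbf{u} \in A$. Suppose $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$; that is, $f \in C^2(A)$. Then there is a $c$ with $0 \le c \le t$ such that

$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} (t u_i) + \frac{1}{2} \sum_{i=1}^{n} \left.\frac{\partial^2 f}{\partial x_i^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i^2) + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \left.\frac{\partial^2 f}{\partial x_i \partial x_j}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i u_j).$$

In particular, we have $f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f|_{\mathbf{x}_0} \cdot \mathbf{v} + \frac{1}{2} \mathbf{v}^T \, \mathbf{H}|_{\mathbf{x}_0 + k\mathbf{v}} \, \mathbf{v}$, for some $k$ with $0 \le k \le 1$.

Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$. If $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$, then $f$ has a local minimum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, $f$ has a local maximum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

A quadratic form is an expression of the form $\mathbf{v}^T \mathbf{A} \mathbf{v}$, where $\mathbf{v}$ is a vector in $\mathbb{R}^n$, and $\mathbf{A}$ is an $n \times n$ matrix. A positive definite quadratic form is one such that $\mathbf{v}^T \mathbf{A} \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, a negative definite quadratic form is one such that $\mathbf{v}^T \mathbf{A} \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

For a function $f : \mathbb{R}^n \to \mathbb{R}$ having Hessian matrix $\mathbf{H}$, if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ is a positive definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. Similarly, if $\mathbf{v}^T \mathbf{H}|_{\mathbf{x}_0} \mathbf{v}$ is a negative definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.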
A symmetric matrix $\mathbf{A}$ defines a positive definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if all of the eigenvalues of $\mathbf{A}$ are positive. A symmetric matrix $\mathbf{A}$ defines a negative definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if all of the eigenvalues of $\mathbf{A}$ are negative.

If $f : \mathbb{R}^n \to \mathbb{R}$ has Hessian matrix $\mathbf{H}$, and all eigenvalues of $\mathbf{H}$ are positive at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. If $f : \mathbb{R}^n \to \mathbb{R}$ has Hessian matrix $\mathbf{H}$, and all eigenvalues of $\mathbf{H}$ are negative at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.

A symmetric $2 \times 2$ matrix $\mathbf{A}$ has a positive definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if $a_{11} > 0$ and $|\mathbf{A}| > 0$. Similarly, a symmetric $2 \times 2$ matrix has a negative definite quadratic form $\mathbf{v}^T \mathbf{A} \mathbf{v}$ if and only if $a_{11} < 0$ and $|\mathbf{A}| > 0$.

EXERCISES

1. In each part, solve for all critical points for the given function. Then, for each critical point, use the Hessian matrix to determine whether the critical point is a local maximum, a local minimum, or neither.
★ a) $f(x, y) = x^3 + x^2 + 2xy - 3x + y^2$
b) f(x; y) = 6x + 4xy + y + 8x 9y
★ c) f(x; y) = x + xy + x + y y + 5
d) f(x; y) = x + x y x + xy + xy x + y y y (Hint: To solve for critical points, first set @f @f @x @y = 0.)
★ e) f(x; y; z) = x +xy +xz +y 4 +4y z +6y z y +4yz 4yz +z 4 z

2. a) Show that f(x; y) = (x ) 4 + (y ) has a local minimum at [; ], but its Hessian matrix at [; ] has 0 as an eigenvalue.
b) Show that f(x; y) = (x ) 4 + (y ) has a critical point at [; ], its Hessian matrix at [; ] has all nonnegative eigenvalues, but [; ] is not a local extreme point for f.
c) Show that f(x; y) = (x+1) 4 (y+) 4 has a local maximum at [ 1; ], but its Hessian matrix at [ 1; ] is O and thus has all of its eigenvalues equal to zero.
d) Show that f(x; y; z) = (x 1) (y ) +(z ) 4 does not have any local extreme points. Then verify that its Hessian matrix has eigenvalues of opposite sign at the function's only critical point.

3. a) Prove that a symmetric matrix $\mathbf{A} = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ defines a positive definite quadratic form if and only if $a > 0$ and $|\mathbf{A}| > 0$. (Hint: Compute $p_{\mathbf{A}}(x)$ and show that both roots are positive if and only if $a > 0$ and $|\mathbf{A}| > 0$.)
b) Prove that a symmetric $2 \times 2$ matrix $\mathbf{A}$ defines a negative definite quadratic form if and only if $a_{11} < 0$ and $|\mathbf{A}| > 0$.

★ 4. True or False:
a) If $f : \mathbb{R}^n \to \mathbb{R}$ has continuous second partial derivatives, then the Hessian matrix is symmetric.
b) Every symmetric matrix $\mathbf{A}$ defines either a positive definite or a negative definite quadratic form.
c) A Hessian matrix for a function with continuous second partial derivatives evaluated at any point is diagonalizable.
d) 5 v T v is a positive definite quadratic form.
e) v T 0 0 4 0 9 0 5 0 0 4 v is a positive definite quadratic form.

Answers to Selected Exercises

(1) (a) Critical points: $(1, -1)$, $(-1, 1)$; local minimum at $(1, -1)$
(c) Critical point: ( ; ); local minimum at ( ; )
(e) Critical points: (0; 0; 0); ( 1 ; 1 ; 1 ); ( 1 ; 1 ; 1 ); local minimums at ( 1 ; 1 ; 1 ); ( 1 ; 1 ; 1 )
(4) (a) T (b) F (c) T (d) T (e) F