Max-Min Problems in $\mathbb{R}^n$ and the Hessian Matrix

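Prerequisite: Section 6.3, Orthogonal Diagonalization

Andrilli/Hecker, Elementary Linear Algebra, 4th ed. March 15, 2010. Copyright 2010, Elsevier Inc. All rights reserved.

In this section, we study the problem of finding local maxima and minima for real-valued functions on $\mathbb{R}^n$. The method we describe is the higher-dimensional analogue to finding critical points and applying the second derivative test to functions on $\mathbb{R}$ studied in first-semester calculus.

Taylor's Theorem in $\mathbb{R}^n$

Let $f \in C^2(\mathbb{R}^n)$, where $C^2(\mathbb{R}^n)$ is the set of real-valued functions defined on $\mathbb{R}^n$ having continuous second partial derivatives. The method for solving for local extreme points of $f$ relies upon Taylor's Theorem with second degree remainder terms, which we state here without proof. (In the following theorem, an open hypersphere centered at $\mathbf{x}_0$ is a set of the form $\{\mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| < r\}$ for some positive real number $r$.)

THEOREM 1 (Taylor's Theorem in $\mathbb{R}^n$) Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, let $\mathbf{u}$ be a unit vector in $\mathbb{R}^n$, and let $t \in \mathbb{R}$ such that $\mathbf{x}_0 + t\mathbf{u} \in A$. Suppose $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$; that is, $f \in C^2(A)$. Then there is a $c$ with $0 \le c \le t$ such that

$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} (t u_i) + \frac{1}{2}\sum_{i=1}^{n} \left.\frac{\partial^2 f}{\partial x_i^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i^2) + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \left.\frac{\partial^2 f}{\partial x_i \partial x_j}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i u_j).$$

Taylor's Theorem in $\mathbb{R}^n$ is derived from the familiar Taylor's Theorem in $\mathbb{R}$ by applying it to the function $g(t) = f(\mathbf{x}_0 + t\mathbf{u})$. In $\mathbb{R}^2$, the formula in Taylor's Theorem is

$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \left.\frac{\partial f}{\partial x}\right|_{\mathbf{x}_0}(t u_1) + \left.\frac{\partial f}{\partial y}\right|_{\mathbf{x}_0}(t u_2) + \frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{\mathbf{x}_0 + c\mathbf{u}}(t^2 u_1^2) + \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{\mathbf{x}_0 + c\mathbf{u}}(t^2 u_2^2) + \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{\mathbf{x}_0 + c\mathbf{u}}(t^2 u_1 u_2).$$

Recall that the gradient of $f$ is defined by $\nabla f = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right]$. If we let $\mathbf{v} = t\mathbf{u}$, then, in $\mathbb{R}^2$, $\mathbf{v} = [v_1, v_2] = [t u_1, t u_2]$, and so the sum $\left.\frac{\partial f}{\partial x}\right|_{\mathbf{x}_0}(t u_1) + \left.\frac{\partial f}{\partial y}\right|_{\mathbf{x}_0}(t u_2)$ simplifies to $\nabla f|_{\mathbf{x}_0} \cdot \mathbf{v}$. Also, since $f$ has continuous second partial derivatives, we have $\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}$. Therefore, with all second partial derivatives evaluated at $\mathbf{x}_0 + c\mathbf{u}$,

$$\frac{1}{2}\frac{\partial^2 f}{\partial x^2}(t^2 u_1^2) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(t^2 u_2^2) + \frac{\partial^2 f}{\partial x \partial y}(t^2 u_1 u_2) = \frac{1}{2}\left(\frac{\partial^2 f}{\partial x^2} v_1^2 + 2\frac{\partial^2 f}{\partial x \partial y} v_1 v_2 + \frac{\partial^2 f}{\partial y^2} v_2^2\right) = \frac{1}{2}\,\mathbf{v}^T \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} \mathbf{v},$$

where $\mathbf{v}$ is considered to be a column vector. The matrix

$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}$$

in this expression is called the Hessian matrix for $f$. Thus, in the $\mathbb{R}^2$ case, with $\mathbf{v} = t\mathbf{u}$, the formula in Taylor's Theorem can be written as

$$f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f|_{\mathbf{x}_0} \cdot \mathbf{v} + \frac{1}{2}\,\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v},$$

for some $k$ with $0 \le k \le 1$ (where $k = c/t$). While we have derived this result in $\mathbb{R}^2$, the same formula holds in $\mathbb{R}^n$, where the Hessian $H$ is the matrix whose $(i, j)$ entry is $\frac{\partial^2 f}{\partial x_i \partial x_j}$.

Critical Points

If $A$ is a subset of $\mathbb{R}^n$, then we say that $f : A \to \mathbb{R}$ has a local maximum at a point $\mathbf{x}_0 \in A$ if and only if there is an open neighborhood $U$ of $\mathbf{x}_0$ such that $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for all $\mathbf{x} \in U$. A local minimum for a function $f$ is defined analogously.

THEOREM 2 Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, and let $f : A \to \mathbb{R}$ have continuous first partial derivatives on $A$. If $f$ has a local maximum or a local minimum at $\mathbf{x}_0$, then $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

Proof If $\mathbf{x}_0$ is a local maximum, then $f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0) \le 0$ for small $h$. Then, $\lim_{h \to 0^+} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} \le 0$. Similarly, $\lim_{h \to 0^-} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} \ge 0$. Hence, for the limit to exist, we must have $\left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} = 0$. Since this is true for each $i$, $\nabla f|_{\mathbf{x}_0} = \mathbf{0}$. A similar proof works for local minimums. QED

Points $\mathbf{x}_0$ at which $\nabla f(\mathbf{x}_0) = \mathbf{0}$ are called critical points.

Example 1 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = 7x^2 + 6xy + 2x + 7y^2 - 22y + 23$. Then $\nabla f = [14x + 6y + 2,\; 6x + 14y - 22]$. We find critical points for $f$ by solving $\nabla f = \mathbf{0}$. This is the linear system

$$\begin{cases} 14x + 6y + 2 = 0 \\ 6x + 14y - 22 = 0 \end{cases}$$

which has the unique solution $\mathbf{x}_0 = [-1, 2]$. Hence, by Theorem 2, $(-1, 2)$ is the only possible extreme point for $f$. (We will see later that $(-1, 2)$ is a local minimum.)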
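The critical-point computation in Example 1 can be checked numerically. The sketch below assumes NumPy is available and reads the gradient as $\nabla f = [14x + 6y + 2,\; 6x + 14y - 22]$; the names `grad_f`, `M`, `b`, and `critical_point` are illustrative and do not come from the text.

```python
import numpy as np

# Gradient of the function in Example 1, read as
# grad f = [14x + 6y + 2, 6x + 14y - 22] (an assumption about the garbled source).
def grad_f(p):
    x, y = p
    return np.array([14 * x + 6 * y + 2, 6 * x + 14 * y - 22])

# Setting grad f = 0 gives the linear system M p = b.
M = np.array([[14.0, 6.0],
              [6.0, 14.0]])
b = np.array([-2.0, 22.0])

# M is nonsingular (determinant 160), so the critical point is unique.
critical_point = np.linalg.solve(M, b)

print("critical point:", critical_point)
print("gradient there:", grad_f(critical_point))
```

Solving the system with `np.linalg.solve` mirrors the hand computation; evaluating `grad_f` at the result is only a sanity check that the gradient vanishes there.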

Sufficient Conditions for Local Extreme Points

If $\mathbf{x}_0$ is a critical point for a function $f$, how can we determine whether $\mathbf{x}_0$ is a local maximum or a local minimum? For functions on $\mathbb{R}$, we have the second derivative test from calculus, which says that if $f''(x_0) < 0$, then $x_0$ is a local maximum, but if $f''(x_0) > 0$, then $x_0$ is a local minimum. We now derive a similar test in $\mathbb{R}^n$.

Consider the following formula from Taylor's Theorem:

$$f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot \mathbf{v} + \frac{1}{2}\,\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}.$$

At a critical point, $\nabla f(\mathbf{x}_0) = \mathbf{0}$, and so $f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \frac{1}{2}\,\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}$. Hence, if $\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}$ is positive for all small nonzero vectors $\mathbf{v}$, then $f$ will have a local minimum at $\mathbf{x}_0$. (Similarly, if $\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}$ is negative, $f$ will have a local maximum.) But since we assume that $f$ has continuous second partial derivatives, $\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}$ is continuous in $\mathbf{v}$ and $k$, and will be positive for small $\mathbf{v}$ if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ is positive for all nonzero $\mathbf{v}$. Hence,

THEOREM 3 Given the conditions of Taylor's Theorem for a set $A$ and a function $f : A \to \mathbb{R}$, $f$ has a local minimum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, $f$ has a local maximum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

Positive Definite Quadratic Forms

If $\mathbf{v}$ is a vector in $\mathbb{R}^n$, and $A$ is an $n \times n$ matrix, the expression $\mathbf{v}^T A \mathbf{v}$ is known as a quadratic form. (For more details on the general theory of quadratic forms, see Section 8.11.) A quadratic form such that $\mathbf{v}^T A \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$ is said to be positive definite. Similarly, a quadratic form such that $\mathbf{v}^T A \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$ is said to be negative definite.

Now, in particular, the expression $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ in Theorem 3 is a quadratic form. Theorem 3 then says that if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ is a positive definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. Theorem 3 also says that if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ is a negative definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$. Therefore, we need a method to determine whether a quadratic form of this type is positive definite or negative definite.

Now, the Hessian matrix, which we will abbreviate as $H$, is symmetric because $\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$ (since $f \in C^2(A)$). Hence, by Theorem 6.20, $H$ can be orthogonally diagonalized. That is, there is an orthogonal matrix $P$ such that $P H P^T = D$, a diagonal matrix, and so, $H = P^T D P$. Hence, $\mathbf{v}^T H \mathbf{v} = \mathbf{v}^T P^T D P \mathbf{v} = (P\mathbf{v})^T D (P\mathbf{v})$. Letting $\mathbf{w} = P\mathbf{v}$, we get $\mathbf{v}^T H \mathbf{v} = \mathbf{w}^T D \mathbf{w}$. But $P$ is nonsingular, so as $\mathbf{v}$ ranges over all of $\mathbb{R}^n$, so does $\mathbf{w}$, and vice versa. Thus, $\mathbf{v}^T H \mathbf{v} > 0$ for all nonzero $\mathbf{v}$ if and only if $\mathbf{w}^T D \mathbf{w} > 0$ for all nonzero $\mathbf{w}$. Now, $D$ is diagonal, and so $\mathbf{w}^T D \mathbf{w} = d_{11} w_1^2 + d_{22} w_2^2 + \cdots + d_{nn} w_n^2$. But the $d_{ii}$'s are the eigenvalues of $H$. Thus, it follows that $\mathbf{w}^T D \mathbf{w} > 0$ for all nonzero $\mathbf{w}$ if and only if all of these eigenvalues are positive. (Set $\mathbf{w} = \mathbf{e}_i$ for each $i$ to prove the "only if" part of this statement.) Similarly, $\mathbf{w}^T D \mathbf{w} < 0$ for all nonzero $\mathbf{w}$ if and only if all of these eigenvalues are negative. Hence,

THEOREM 4 A symmetric matrix $A$ defines a positive definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if all of the eigenvalues of $A$ are positive. A symmetric matrix $A$ defines a negative definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if all of the eigenvalues of $A$ are negative.

Hence, Theorem 3 can be restated as follows: Given the conditions of Taylor's Theorem for a set $A$ and a function $f : A \to \mathbb{R}$:
(1) if all of the eigenvalues of $H$ are positive at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$, and
(2) if all of the eigenvalues of $H$ are negative at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.

Example 2 Consider the function $f(x, y) = 7x^2 + 6xy + 2x + 7y^2 - 22y + 23$. In Example 1, we found that $f$ has a critical point at $\mathbf{x}_0 = [-1, 2]$. Now, the Hessian matrix for $f$ at $\mathbf{x}_0$ is

$$H = \left.\begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}\right|_{\mathbf{x}_0} = \begin{bmatrix} 14 & 6 \\ 6 & 14 \end{bmatrix}.$$

But $p_H(x) = x^2 - 28x + 160$, which has roots $x = 8$ and $x = 20$. Thus, $H$ has all eigenvalues positive, and hence, $\mathbf{v}^T H \mathbf{v}$ is positive definite. Theorem 4 then tells us that $\mathbf{x}_0 = [-1, 2]$ is a local minimum for $f$.

Local Maxima and Minima in $\mathbb{R}^2$

It can be shown (see Exercise 3) that a symmetric $2 \times 2$ matrix $A$ defines a positive definite quadratic form ($\mathbf{v}^T A \mathbf{v} > 0$ for all nonzero $\mathbf{v}$) if and only if $a_{11} > 0$ and $|A| > 0$. Similarly, a symmetric $2 \times 2$ matrix defines a negative definite quadratic form ($\mathbf{v}^T A \mathbf{v} < 0$ for all nonzero $\mathbf{v}$) if and only if $a_{11} < 0$ and $|A| > 0$.

Example 3 Suppose $f(x, y) = 2x^2 - 2x^2 y^2 + 2y^2 + 24y - x^4 - y^4$.
First, we look for critical points by solving the system

$$\begin{cases} \dfrac{\partial f}{\partial x} = 4x - 4xy^2 - 4x^3 = 4x\left(1 - (y^2 + x^2)\right) = 0 \\[2mm] \dfrac{\partial f}{\partial y} = -4x^2 y + 4y + 24 - 4y^3 = -4y(x^2 + y^2) + 4y + 24 = 0 \end{cases}$$

Now $\frac{\partial f}{\partial x} = 0$ yields $x = 0$ or $y^2 + x^2 = 1$. If $x = 0$, then $\frac{\partial f}{\partial y} = 0$ gives $4y + 24 - 4y^3 = 0$. The unique real solution to this equation is $y = 2$. Thus, $[0, 2]$ is a critical point. If $x \ne 0$, then $y^2 + x^2 = 1$. From $\frac{\partial f}{\partial y} = 0$, we have $0 = -4y(1) + 4y + 24 = 24$, a contradiction, so there is no critical point when $x \ne 0$.
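The case analysis above can be sanity-checked numerically. This is a minimal sketch, assuming NumPy and reading the partial derivatives as $\frac{\partial f}{\partial x} = 4x - 4xy^2 - 4x^3$ and $\frac{\partial f}{\partial y} = -4x^2 y + 4y + 24 - 4y^3$; the helper name `grad_f` and the sampling of the unit circle are illustrative choices, not part of the text.

```python
import numpy as np

# Partial derivatives of the function in Example 3, as read from the text
# (the coefficient 24 is an assumption recovered from the surrounding algebra).
def grad_f(x, y):
    fx = 4 * x - 4 * x * y**2 - 4 * x**3
    fy = -4 * x**2 * y + 4 * y + 24 - 4 * y**3
    return np.array([fx, fy])

# Case x = 0: df/dy = 0 reduces to the cubic -4y^3 + 4y + 24 = 0.
roots = np.roots([-4.0, 0.0, 4.0, 24.0])
real_roots = roots[np.abs(roots.imag) < 1e-9].real
print("real roots of the cubic:", real_roots)

# Case x != 0: df/dx = 0 forces x^2 + y^2 = 1. Sampling that circle shows
# df/dy is constant there, so it can never vanish.
theta = np.linspace(0.0, 2.0 * np.pi, 50)
fy_on_circle = grad_f(np.cos(theta), np.sin(theta))[1]
print("df/dy on the unit circle (min, max):", fy_on_circle.min(), fy_on_circle.max())

# The gradient vanishes at the critical point found by hand.
print("gradient at [0, 2]:", grad_f(0.0, 2.0))
```

The cubic factors as $-4(y - 2)(y^2 + 2y + 3)$, and the quadratic factor has negative discriminant, which is why only one real root survives the filter.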

Next, we compute the Hessian matrix at the critical point $[0, 2]$.

$$H = \left.\begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}\right|_{[0,2]} = \left.\begin{bmatrix} 4 - 4y^2 - 12x^2 & -8xy \\ -8xy & -4x^2 + 4 - 12y^2 \end{bmatrix}\right|_{[0,2]} = \begin{bmatrix} -12 & 0 \\ 0 & -44 \end{bmatrix}.$$

Since the $(1, 1)$ entry is negative and $|H| > 0$, $H$ defines a negative definite quadratic form and so $f$ has a local maximum at $[0, 2]$.

An Example in $\mathbb{R}^3$

Example 4 Consider the function $g(x, y, z) = 5x^2 + 2xz + 4xy + 10x + 3z^2 - 6yz - 6z + 5y^2 + 12y + 21$. We find the critical points by solving the system

$$\begin{cases} \dfrac{\partial g}{\partial x} = 10x + 2z + 4y + 10 = 0 \\[2mm] \dfrac{\partial g}{\partial y} = 4x - 6z + 10y + 12 = 0 \\[2mm] \dfrac{\partial g}{\partial z} = 2x + 6z - 6y - 6 = 0 \end{cases}$$

Using row reduction to solve this linear system yields the unique critical point $[-9, 12, 16]$. The Hessian matrix at $[-9, 12, 16]$ is

$$H = \left.\begin{bmatrix} \dfrac{\partial^2 g}{\partial x^2} & \dfrac{\partial^2 g}{\partial x \partial y} & \dfrac{\partial^2 g}{\partial x \partial z} \\[2mm] \dfrac{\partial^2 g}{\partial y \partial x} & \dfrac{\partial^2 g}{\partial y^2} & \dfrac{\partial^2 g}{\partial y \partial z} \\[2mm] \dfrac{\partial^2 g}{\partial z \partial x} & \dfrac{\partial^2 g}{\partial z \partial y} & \dfrac{\partial^2 g}{\partial z^2} \end{bmatrix}\right|_{[-9,12,16]} = \begin{bmatrix} 10 & 4 & 2 \\ 4 & 10 & -6 \\ 2 & -6 & 6 \end{bmatrix}.$$

A lengthy computation produces $p_H(x) = x^3 - 26x^2 + 164x - 8$. The roots of $p_H(x)$ are approximately $0.04916$, $10.6011$, and $15.3497$. Since all of these eigenvalues for $H$ are positive, $[-9, 12, 16]$ is a local minimum for $g$.

Failure of the Hessian Matrix Test

In calculus, we discovered that the second derivative test fails when the second derivative is zero at a critical point. A similar situation is true in $\mathbb{R}^n$. If the Hessian matrix at a critical point has 0 as an eigenvalue, and all other eigenvalues have the same sign, then the function $f$ could have a local maximum, a local minimum, or neither at this critical point. Of course, if the Hessian matrix at a critical point has two eigenvalues with opposite signs, the critical point is not a local extreme point (why?). Exercise 2 illustrates these concepts.

New Vocabulary

$C^2(\mathbb{R}^n)$ (functions from $\mathbb{R}^n$ to $\mathbb{R}$ having continuous second partial derivatives)
critical point (of a function)
gradient (of a function on $\mathbb{R}^n$)
Hessian matrix
local maximum (of a function on $\mathbb{R}^n$)
local minimum (of a function on $\mathbb{R}^n$)
negative definite quadratic form

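open hypersphere (in $\mathbb{R}^n$)
positive definite quadratic form
Taylor's Theorem (in $\mathbb{R}^n$)

Highlights

The gradient of a function $f : \mathbb{R}^n \to \mathbb{R}$ is defined by $\nabla f = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right]$.

Let $A$ be an open hypersphere about $\mathbf{x}_0$, and let $f$ be a function on $A$ with continuous partial derivatives. If $f$ has a local maximum or minimum at $\mathbf{x}_0$, then $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

For a function $f : \mathbb{R}^n \to \mathbb{R}$, its corresponding Hessian matrix is the $n \times n$ matrix whose $(i, j)$ entry is $\frac{\partial^2 f}{\partial x_i \partial x_j}$. In particular, for a function $f : \mathbb{R}^2 \to \mathbb{R}$, the Hessian matrix is
$$H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2mm] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}.$$

Taylor's Theorem in $\mathbb{R}^n$: Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$, let $\mathbf{u}$ be a unit vector in $\mathbb{R}^n$, and let $t \in \mathbb{R}$ such that $\mathbf{x}_0 + t\mathbf{u} \in A$. Suppose $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$; that is, $f \in C^2(A)$. Then there is a $c$ with $0 \le c \le t$ such that
$$f(\mathbf{x}_0 + t\mathbf{u}) = f(\mathbf{x}_0) + \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}_0} (t u_i) + \frac{1}{2}\sum_{i=1}^{n} \left.\frac{\partial^2 f}{\partial x_i^2}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i^2) + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \left.\frac{\partial^2 f}{\partial x_i \partial x_j}\right|_{\mathbf{x}_0 + c\mathbf{u}} (t^2 u_i u_j).$$
In particular, we have $f(\mathbf{x}_0 + \mathbf{v}) = f(\mathbf{x}_0) + \nabla f|_{\mathbf{x}_0} \cdot \mathbf{v} + \frac{1}{2}\,\mathbf{v}^T H|_{\mathbf{x}_0 + k\mathbf{v}}\, \mathbf{v}$, for some $k$ with $0 \le k \le 1$.

Let $A$ be an open hypersphere centered at $\mathbf{x}_0 \in \mathbb{R}^n$. If $f : A \to \mathbb{R}$ has continuous second partial derivatives throughout $A$, then $f$ has a local minimum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, $f$ has a local maximum at a critical point $\mathbf{x}_0$ if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

A quadratic form is an expression of the form $\mathbf{v}^T A \mathbf{v}$, where $\mathbf{v}$ is a vector in $\mathbb{R}^n$, and $A$ is an $n \times n$ matrix. A positive definite quadratic form is one such that $\mathbf{v}^T A \mathbf{v} > 0$ for all nonzero vectors $\mathbf{v}$. Similarly, a negative definite quadratic form is one such that $\mathbf{v}^T A \mathbf{v} < 0$ for all nonzero vectors $\mathbf{v}$.

For a function $f : \mathbb{R}^n \to \mathbb{R}$ having Hessian matrix $H$, if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ is a positive definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. Similarly, if $\mathbf{v}^T H|_{\mathbf{x}_0}\, \mathbf{v}$ is a negative definite quadratic form at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.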
A symmetric matrix $A$ defines a positive definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if all of the eigenvalues of $A$ are positive. A symmetric matrix $A$ defines a negative definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if all of the eigenvalues of $A$ are negative.

If $f : \mathbb{R}^n \to \mathbb{R}$ has Hessian matrix $H$, and all eigenvalues of $H$ are positive at a critical point $\mathbf{x}_0$, then $f$ has a local minimum at $\mathbf{x}_0$. If $f : \mathbb{R}^n \to \mathbb{R}$ has Hessian matrix $H$, and all eigenvalues of $H$ are negative at a critical point $\mathbf{x}_0$, then $f$ has a local maximum at $\mathbf{x}_0$.

A symmetric $2 \times 2$ matrix $A$ defines a positive definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if $a_{11} > 0$ and $|A| > 0$. Similarly, a symmetric $2 \times 2$ matrix defines a negative definite quadratic form $\mathbf{v}^T A \mathbf{v}$ if and only if $a_{11} < 0$ and $|A| > 0$.

EXERCISES

1. In each part, solve for all critical points for the given function. Then, for each critical point, use the Hessian matrix to determine whether the critical point is a local maximum, a local minimum, or neither.
★(a) f(x; y) = x + x + xy x + y
(b) f(x; y) = 6x + 4xy + y + 8x 9y
★(c) f(x; y) = x + xy + x + y y + 5
(d) f(x; y) = x + x y x + xy + xy x + y y y (Hint: To solve for critical points, first set $\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y} = 0$.)
★(e) f(x; y; z) = x +xy +xz +y 4 +4y z +6y z y +4yz 4yz +z 4 z

2. (a) Show that f(x; y) = (x ) 4 + (y ) has a local minimum at [; ], but its Hessian matrix at [; ] has 0 as an eigenvalue.
(b) Show that f(x; y) = (x ) 4 + (y ) has a critical point at [; ], its Hessian matrix at [; ] has all nonnegative eigenvalues, but [; ] is not a local extreme point for f.
(c) Show that f(x; y) = (x+1) 4 (y+) 4 has a local maximum at [ 1; ], but its Hessian matrix at [ 1; ] is $O$ and thus has all of its eigenvalues equal to zero.
(d) Show that f(x; y; z) = (x 1) (y ) +(z ) 4 does not have any local extreme points. Then verify that its Hessian matrix has eigenvalues of opposite sign at the function's only critical point.

3. (a) Prove that a symmetric $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ defines a positive definite quadratic form if and only if $a > 0$ and $|A| > 0$. (Hint: Compute $p_A(x)$ and show that both roots are positive if and only if $a > 0$ and $|A| > 0$.)
(b) Prove that a symmetric $2 \times 2$ matrix $A$ defines a negative definite quadratic form if and only if $a_{11} < 0$ and $|A| > 0$.

★4. True or False:
(a) If $f : \mathbb{R}^n \to \mathbb{R}$ has continuous second partial derivatives, then the Hessian matrix is symmetric.
(b) Every symmetric matrix $A$ defines either a positive definite or a negative definite quadratic form.
(c) A Hessian matrix for a function with continuous second partial derivatives evaluated at any point is diagonalizable.
(d) 5 v T v is a positive definite quadratic form.
(e) 0 0 v T 4 0 9 0 5 v is a positive definite quadratic form. 0 0 4
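As a numerical companion to the exercises, the eigenvalue test of Theorem 4 can be sketched in a few lines. This is a minimal sketch, assuming NumPy; the function name `classify_critical_point`, the tolerance `tol`, and the returned labels are hypothetical choices, and the sample Hessians are the ones read from Examples 2, 3, and 4.

```python
import numpy as np

def classify_critical_point(hessian, tol=1e-12):
    """Classify a critical point from the symmetric Hessian evaluated there.

    `tol` is an illustrative absolute threshold for treating an eigenvalue
    as zero; it is not prescribed by the text.
    """
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    if np.all(eigenvalues > tol):
        return "local minimum"            # positive definite quadratic form
    if np.all(eigenvalues < -tol):
        return "local maximum"            # negative definite quadratic form
    if eigenvalues.min() < -tol and eigenvalues.max() > tol:
        return "not a local extreme point"  # eigenvalues of opposite sign
    return "inconclusive"                 # a zero eigenvalue: the test fails

# Hessians as read from Examples 2, 3, and 4 (assumed values).
print(classify_critical_point([[14, 6], [6, 14]]))
print(classify_critical_point([[-12, 0], [0, -44]]))
print(classify_critical_point([[10, 4, 2], [4, 10, -6], [2, -6, 6]]))
# A zero Hessian, as in Exercise 2(c), gives no information.
print(classify_critical_point([[0, 0], [0, 0]]))
```

Using the eigenvalues directly works in any dimension; for $2 \times 2$ Hessians the same answer follows from the signs of $a_{11}$ and $|A|$, which avoids an eigenvalue computation.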

Answers to Selected Exercises

(1) (a) Critical points: (1; 1), ( 1; 1); local minimum at (1; 1)
(c) Critical point: ( ; ); local minimum at ( ; )
(e) Critical points: (0; 0; 0); ( 1 ; 1 ( 1 ; 1 ; 1 ); ( 1 ; 1 ; 1 ) ; 1 ); ( 1 ; 1 ; 1 ); local minimums at
(4) (a) T (b) F (c) T (d) T (e) F