Brief Review of Functions of Several Variables
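Differentiation

Recall, a function $f : \mathbb{R} \to \mathbb{R}$ is differentiable at $x \in \mathbb{R}$ if

$$\lim_{\theta \to 0} \frac{f(x+\theta) - f(x)}{\theta}$$

exists. When this limit exists we call it $\frac{df(x)}{dx}$ or $f'(x)$.

Now, when $f : \mathbb{R}^n \to \mathbb{R}$, what do we mean by differentiability at $x \in \mathbb{R}^n$?

NOTE: When $f : \mathbb{R} \to \mathbb{R}$ we can restate the above definition as follows: $f$ is differentiable at $x \in \mathbb{R}$ if there is a number, $\eta(x)$, so that

$$\lim_{\theta \to 0} \frac{f(x+\theta) - f(x) - \eta(x)\,\theta}{\theta} = 0$$

and, clearly, $\eta(x) = f'(x)$.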
The $n$-dimensional analogue follows.

Def: The function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be differentiable at $x \in \mathbb{R}^n$ if there is a vector $\eta(x)$ ($n$-dimensional of course) so that

$$\lim_{h \to 0} \frac{f(x+h) - f(x) - \eta^T(x)\,h}{\|h\|} = 0$$

where $h$ is an $n$-vector and $\eta^T(x)\,h$ denotes the inner product of these two vectors.

HW9: Show that if $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $x$ then $\eta(x)$ is unique.
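Partial Derivatives

Def: $f : \mathbb{R}^n \to \mathbb{R}$ is said to have a partial derivative at $x$ with respect to $x_i$ if

$$\lim_{\theta \to 0} \frac{f(x + \theta e_i) - f(x)}{\theta}$$

exists. Here $e_i = (0, \dots, 0, 1, 0, \dots, 0)^T$, with the 1 in the $i$-th location, is the $i$-th unit vector (note that $\|\theta e_i\| = |\theta|$).

Equivalently, we may say $f : \mathbb{R}^n \to \mathbb{R}$ has a partial derivative at $x$ with respect to $x_i$ if there is a number $\eta_i(x)$ so that

$$\lim_{\theta \to 0} \frac{f(x + \theta e_i) - f(x) - \eta_i(x)\,\theta}{\theta} = 0$$

(of course, we then write $\eta_i(x) = \frac{\partial f(x)}{\partial x_i}$).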
Fact 1: If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $x$ then $f$ has partial derivatives at $x$.

Proof: Since $f$ is differentiable at $x$ then, no matter how the vectors $h$ approach 0, we have that there is a vector $\eta(x)$ so that

$$\lim_{h \to 0} \frac{f(x+h) - f(x) - \eta^T(x)\,h}{\|h\|} = 0.$$

Therefore, let these $h$ vectors be of the form $h(\theta) = \theta e_i$, and then $h(\theta) \to 0$ as $\theta \to 0$. Then

$$\lim_{\theta \to 0} \frac{f(x + \theta e_i) - f(x) - \theta\,\eta^T(x)\,e_i}{\theta} = 0$$

implies, by uniqueness, that $\eta_i(x) = \eta^T(x)\,e_i$.
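Since $\eta_i(x) = \frac{\partial f(x)}{\partial x_i}$, this shows that the $i$-th component of $\eta(x)$ is, in fact, $\frac{\partial f(x)}{\partial x_i}$, the $i$-th partial derivative at $x$. Therefore, in what follows we will replace $\eta(x)$ with

$$\nabla f(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \dots, \frac{\partial f(x)}{\partial x_n} \right)^T,$$

the vector of partial derivatives (sometimes called the gradient vector at $x$).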
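As an illustration, the limit that defines a partial derivative suggests a numerical check of the gradient vector: replace the limit by a small central difference along each unit vector $e_i$. The following is a minimal Python sketch, not part of the original notes. The test function `f`, its hand-computed gradient `grad_f`, the helper `numerical_gradient`, and the step size are assumptions chosen only for illustration.

```python
def numerical_gradient(f, x, step=1e-6):
    """Approximate each partial derivative by a central difference along e_i."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step    # x + step * e_i
        backward[i] -= step   # x - step * e_i
        grad.append((f(forward) - f(backward)) / (2 * step))
    return grad


def f(x):
    # Hypothetical test function f(x1, x2) = x1^2 + 3*x1*x2.
    return x[0] ** 2 + 3 * x[0] * x[1]


def grad_f(x):
    # Gradient of the test function, computed by hand.
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


point = [1.0, 2.0]
approx = numerical_gradient(f, point)
exact = grad_f(point)
```

The two vectors `approx` and `exact` should agree up to rounding error. Agreement at one point is only a sanity check, not a proof of differentiability.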
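HW10: Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as follows:

$$f(x_1, x_2) = \begin{cases} \dfrac{x_1^3}{x_1^2 + x_2^2}, & \text{if } x_1 \neq 0 \text{ or } x_2 \neq 0 \\ 0, & \text{if } x_1 = x_2 = 0 \end{cases}$$

Show that $f$ has partial derivatives at $x = (0,0)$ but that $f$ is not differentiable at $x$.

Directional Derivatives

Let $f : \mathbb{R}^n \to \mathbb{R}$.

Def: $f$ is said to be differentiable at $x$ in the direction $d$ if

$$\lim_{\theta \to 0} \frac{f(x + \theta d) - f(x)}{\theta\,\|d\|}$$

exists.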
When this limit exists, we will write it as $D_d(x)$, which is said to be a directional derivative in the direction $d$.

[Note: When $d = e_i$, we have $D_d(x) = \frac{\partial f(x)}{\partial x_i}$. Also, $D_{\theta d}(x) = D_d(x)$ for $\theta > 0$.]

Fact 2: Let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable at $x$. Let $d \neq 0$ be any direction. Then,

$$D_d(x) = \frac{\nabla f^T(x)\,d}{\|d\|}.$$
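Proof: Since $f$ is differentiable at $x$ we know that

$$\lim_{h \to 0} \frac{f(x+h) - f(x) - \nabla f^T(x)\,h}{\|h\|} = 0$$

no matter how the vectors $h$ approach 0. Now let $h(\theta) = \theta d$; then $h \to 0$ as $\theta \to 0$. Therefore,

$$\lim_{\theta \to 0} \frac{f(x + \theta d) - f(x) - \theta\,\nabla f^T(x)\,d}{\theta\,\|d\|} = 0$$

or, equivalently, by the definition of $D_d(x)$,

$$\lim_{\theta \to 0} \frac{f(x + \theta d) - f(x) - \theta\,D_d(x)\,\|d\|}{\theta\,\|d\|} = 0.$$

Therefore, by uniqueness, we must have $\nabla f^T(x)\,d = D_d(x)\,\|d\|$.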
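Fact 2 can be checked numerically on a simple example by comparing the difference quotient from the definition with the formula $\nabla f^T(x)\,d / \|d\|$. The sketch below is only an illustration under stated assumptions: the function `f`, the point `x`, the direction `d`, and the step `1e-6` are hypothetical choices, not taken from the notes.

```python
import math


def f(x):
    # Hypothetical smooth test function f(x1, x2) = x1^2 + 3*x1*x2.
    return x[0] ** 2 + 3 * x[0] * x[1]


def grad_f(x):
    # Gradient of the test function, computed by hand.
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


def norm(d):
    # Euclidean norm of the vector d.
    return math.sqrt(sum(di * di for di in d))


def directional_quotient(f, x, d, theta):
    # Difference quotient (f(x + theta*d) - f(x)) / (theta * ||d||).
    shifted = [xi + theta * di for xi, di in zip(x, d)]
    return (f(shifted) - f(x)) / (theta * norm(d))


x = [1.0, 2.0]
d = [3.0, 4.0]

# Value predicted by Fact 2: grad f(x)^T d / ||d||.
formula_value = sum(g * di for g, di in zip(grad_f(x), d)) / norm(d)

# Value of the defining quotient for a small positive theta.
quotient_value = directional_quotient(f, x, d, 1e-6)
```

For this example the gradient at $x$ is $(8, 3)$ and $\|d\| = 5$, so the formula gives $36/5$; the quotient approaches the same number as the step shrinks.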
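Taylor's Theorem

Recall, Taylor's Theorem (for $f : \mathbb{R} \to \mathbb{R}$): Let $f : \mathbb{R} \to \mathbb{R}$ have an $n$-th derivative and assume the $(n-1)$-st derivative is continuous. Let $x \neq x^*$. Then there exists a $\theta \in [0,1]$ so that, with $\bar{x} = x^* + \theta (x - x^*)$,

$$f(x) = f(x^*) + \sum_{k=1}^{n-1} \frac{f^{(k)}(x^*)\,(x - x^*)^k}{k!} + \frac{f^{(n)}(\bar{x})\,(x - x^*)^n}{n!}.$$

We now want to derive some form of this result for functions of several variables, $f : \mathbb{R}^n \to \mathbb{R}$.

Taylor's 1st and 2nd order results in $\mathbb{R}^n$: Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous 1st partial derivatives. Then, for any two points $x, x^*$ so that $x \neq x^*$, there exists a $\theta \in [0,1]$ so that

(i) $f(x) = f(x^*) + \nabla f^T(\theta x + (1-\theta) x^*)\,(x - x^*)$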
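[NOTE: For two vectors $x, x^*$, the line through these two points is $\{\, y \mid y = \theta x + (1-\theta) x^*,\ \theta \in \mathbb{R} \,\}$. The line segment joining these two points is $\{\, y \mid y = \theta x + (1-\theta) x^*,\ \theta \in [0,1] \,\}$.]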
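and if, in addition, $f$ has continuous 2nd partials,

(ii) $f(x) = f(x^*) + \nabla f^T(x^*)\,(x - x^*) + \frac{1}{2}\,(x - x^*)^T H(\theta x + (1-\theta) x^*)\,(x - x^*)$

where $H(z)$ is the Hessian matrix for $f$, evaluated at $z$. That is,

$$H(z) = \begin{pmatrix} \dfrac{\partial^2 f(z)}{\partial z_1^2} & \dfrac{\partial^2 f(z)}{\partial z_1 \partial z_2} & \cdots & \dfrac{\partial^2 f(z)}{\partial z_1 \partial z_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 f(z)}{\partial z_n \partial z_1} & \dfrac{\partial^2 f(z)}{\partial z_n \partial z_2} & \cdots & \dfrac{\partial^2 f(z)}{\partial z_n^2} \end{pmatrix}$$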
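Proof: (i) Let $g(\theta) = f(\theta x + (1-\theta) x^*)$ for $\theta \in [0,1]$. Then $g(1) = f(x)$ and $g(0) = f(x^*)$. Using Taylor's Theorem on $\mathbb{R}$ we get, for some $\theta \in [0,1]$,

$$g(1) = g(0) + g'(\theta)$$

or, by the chain rule, since $g'(\theta) = \nabla f^T(\theta x + (1-\theta) x^*)\,(x - x^*)$,

$$f(x) = f(x^*) + \nabla f^T(\theta x + (1-\theta) x^*)\,(x - x^*).$$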
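(ii) (Recall, by continuity we have

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i},$$

i.e., $H(x)$ is a symmetric matrix.) Using Taylor's Theorem on $\mathbb{R}$ we have

$$g(1) = g(0) + g'(0) + \frac{1}{2}\,g''(\theta)$$

for some $\theta \in [0,1]$, or, by the chain rule, since $g'(0) = \nabla f^T(x^*)\,(x - x^*)$ and $g''(\theta) = (x - x^*)^T H(\theta x + (1-\theta) x^*)\,(x - x^*)$,

$$f(x) = f(x^*) + \nabla f^T(x^*)\,(x - x^*) + \frac{1}{2}\,(x - x^*)^T H(\theta x + (1-\theta) x^*)\,(x - x^*).$$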
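Approximation Using Taylor's Series

Definition: For a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ the first order approximation at $x^*$ is defined to be

$$\hat{f}_1(x) = f(x^*) + \nabla f^T(x^*)\,(x - x^*).$$

Definition: For $f : \mathbb{R}^n \to \mathbb{R}$ with continuous 2nd partials, the second order approximation at $x^*$ is defined to be

$$\hat{f}_2(x) = f(x^*) + \nabla f^T(x^*)\,(x - x^*) + \frac{1}{2}\,(x - x^*)^T H(x^*)\,(x - x^*).$$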
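To see how the two approximations behave near the expansion point, one can evaluate both for a concrete function. The sketch below is illustrative only: the function $f(x_1, x_2) = e^{x_1 + 2 x_2}$, the expansion point, and the evaluation point are assumptions made for the example, and the gradient and Hessian are written out by hand.

```python
import math


def f(x):
    # Hypothetical test function f(x1, x2) = exp(x1 + 2*x2).
    return math.exp(x[0] + 2 * x[1])


def grad_f(x):
    # Gradient computed by hand: (f, 2f).
    v = f(x)
    return [v, 2 * v]


def hessian_f(x):
    # Hessian computed by hand: [[f, 2f], [2f, 4f]] (symmetric).
    v = f(x)
    return [[v, 2 * v], [2 * v, 4 * v]]


def first_order(x, x_star):
    # f(x*) + grad f(x*)^T (x - x*)
    step = [a - b for a, b in zip(x, x_star)]
    return f(x_star) + sum(g * s for g, s in zip(grad_f(x_star), step))


def second_order(x, x_star):
    # First order approximation plus (1/2) (x - x*)^T H(x*) (x - x*)
    step = [a - b for a, b in zip(x, x_star)]
    H = hessian_f(x_star)
    n = len(step)
    quad = sum(step[i] * H[i][j] * step[j] for i in range(n) for j in range(n))
    return first_order(x, x_star) + 0.5 * quad


x_star = [0.0, 0.0]
x = [0.1, 0.05]
exact = f(x)
error_first = abs(first_order(x, x_star) - exact)
error_second = abs(second_order(x, x_star) - exact)
```

With these choices the first order value is 1.2 and the second order value is 1.22, while $f(x) = e^{0.2}$; the second order error is the smaller of the two, as expected close to $x^*$.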
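Necessary Conditions for Local Minima

Consider the optimization problem

$$\min_{x \in F} f(x) \qquad \text{(P)}$$

where $f : \mathbb{R}^n \to \mathbb{R}$, $F \subseteq \mathbb{R}^n$.

Definition (feasible direction): The vector $d \in \mathbb{R}^n$, $d \neq 0$, is said to be a feasible direction at $x \in F$ if there is a number $\bar{\theta} > 0$ so that the vector $x + \theta d \in F$ for all $\theta \in [0, \bar{\theta}]$.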
We let $D(x; F)$ denote the set of feasible directions at $x \in F$.

Theorem 4 (1st order necessary condition (FONC) for local minimum): Let $F \subseteq \mathbb{R}^n$ and let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable. If $x^* \in F$ is a local minimum then, for all $d \in D(x^*; F)$, we must have

$$\nabla f^T(x^*)\,d \geq 0.$$
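Proof: If there is a $d \in D(x^*; F)$ so that $\nabla f^T(x^*)\,d < 0$ we must then have

$$\lim_{\theta \to 0^+} \frac{f(x^* + \theta d) - f(x^*)}{\theta} < 0$$

so that there is a $\theta_0 > 0$ with $f(x^* + \theta d) < f(x^*)$ for all $\theta \in (0, \theta_0]$. But $d \in D(x^*; F)$, so there is a $\bar{\theta} > 0$ with $x^* + \theta d \in F$ for all $\theta \in [0, \bar{\theta}]$. Let $\hat{\theta} = \min \{\theta_0, \bar{\theta}\}$ and we then have $f(x^* + \theta d) < f(x^*)$ for all $\theta \in (0, \hat{\theta}]$, and such $x^* + \theta d \in F$. Hence $x^*$ is not a local minimum.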
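Corollary: If $x^* \in \operatorname{int} F$ is a local minimum, then $\nabla f(x^*) = 0$.

Proof: $x^* \in \operatorname{int} F$ implies $D(x^*; F) = \mathbb{R}^n \setminus \{0\}$. Hence $\nabla f^T(x^*)\,d \geq 0$ for all $d \neq 0$. Applying this to both $d$ and $-d$ gives $\nabla f^T(x^*)\,d = 0$ for all $d$, so $\nabla f(x^*) = 0$.

HW11: Show whether or not (for $x^* \in F$) $\nabla f^T(x^*)\,d \geq 0$ for all $d \in D(x^*; F)$ implies $x^*$ is a local minimum.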
Theorem 5 (2nd order necessary conditions for local minimum (SONC)): Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous 2nd partials and let $x^* \in F$ be a local minimum. Then for any $d \in D(x^*; F)$ we have

(i) $\nabla f^T(x^*)\,d \geq 0$, and
(ii) if $\nabla f^T(x^*)\,d = 0$, then $d^T H(x^*)\,d \geq 0$.

Proof: (i) is just a restatement of Theorem 4. Therefore, assume $d \in D(x^*; F)$ is such that $\nabla f^T(x^*)\,d = 0$. Let $x(\theta) = x^* + \theta d$ for $\theta \geq 0$. By Taylor's Theorem, for some $\tau \in [0,1]$,

$$f(x(\theta)) = f(x^*) + \theta\,\nabla f^T(x^*)\,d + \frac{\theta^2}{2}\,d^T H(x^* + \tau \theta d)\,d.$$
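Now, if $d^T H(x^*)\,d < 0$ we see, by continuity of the second partials, there exists a $\hat{\theta} > 0$ so that $d^T H(x^* + \theta d)\,d < 0$ for all $\theta \in [0, \hat{\theta}]$, and then, for such $\theta$'s with $\theta > 0$, $f(x(\theta)) < f(x^*)$. But, again, $d \in D(x^*; F)$ implies there is a $\bar{\theta} > 0$ so that $x^* + \theta d \in F$ for all $\theta \in [0, \bar{\theta}]$. By letting $0 < \theta \leq \min \{\hat{\theta}, \bar{\theta}\}$ we see that $x^*$ cannot be a local minimum.

Corollary: Let $x^*$ be an interior point of $F$ which is a local minimum. Then,

(i) $\nabla f(x^*) = 0$, and
(ii) $d^T H(x^*)\,d \geq 0$ for all $d \in \mathbb{R}^n$

(i.e., $H(x^*)$ is a positive semi-definite matrix, which we abbreviate as PSD).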
Proof: $D(x^*; F) = \mathbb{R}^n \setminus \{0\}$, so Theorem 5 applies to every $d \neq 0$.

Definition: A matrix $A$ is said to be positive semi-definite (PSD) if for all $x \in \mathbb{R}^n$, $x^T A x \geq 0$.

A matrix $A$ is said to be positive definite (PD) if for all $x \in \mathbb{R}^n$, $x \neq 0$, $x^T A x > 0$.

A matrix $A$ is said to be negative semi-definite (NSD) if $-A$ is PSD (i.e., for all $x \in \mathbb{R}^n$, $x^T A x \leq 0$).

A matrix $A$ is said to be negative definite (ND) if $-A$ is PD (i.e., for all $x \in \mathbb{R}^n$, $x \neq 0$, $x^T A x < 0$).
[Note: The corollary of Theorem 5 states that if $x^* \in \operatorname{int} F$ is a local minimum for $\min_{x \in F} f(x)$, then $\nabla f(x^*) = 0$ and $H(x^*)$ is PSD.]

HW12: Show whether the following is true. If $x^* \in F$ is such that $\nabla f(x^*) = 0$ and $H(x^*)$ is PSD then $x^*$ is a local minimum.

HW13: Construct a simple example to show that when a local minimum $x^*$ is on the boundary of $F$ then it may be that $H(x^*)$ is not PSD.
Sufficient Conditions for Local Minima

Definition: $x^* \in F$ is a strict local minimum for $\min_{x \in F} f(x)$ if there is an $\varepsilon > 0$ so that for all $x \in F \cap N(x^*, \varepsilon)$, $x \neq x^*$, we have $f(x) > f(x^*)$.

Theorem 6 (2nd order sufficient condition for local minimum): Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous 2nd partials and let $x^* \in F$. If

(i) $\nabla f(x^*) = 0$, and
(ii) $H(x^*)$ is PD,

then $x^*$ is a strict local minimum.
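Proof: Suppose $x^*$ is not a strict local minimum. Then for all $\varepsilon > 0$ there is an $x \in F \cap N(x^*, \varepsilon)$, $x \neq x^*$, so that $f(x) \leq f(x^*)$. Let $\varepsilon_k = 1/k$ for $k \geq 1$, $k$ integer-valued, and pick such a point $x^k \in F \cap N(x^*, \varepsilon_k)$ for each $k$. Then $f(x^k) \leq f(x^*)$ for all $k$, where $x^k \to x^*$. By Taylor's Theorem and $\nabla f(x^*) = 0$, for each $k$,

$$f(x^k) = f(x^*) + \frac{1}{2}\,(x^k - x^*)^T H(\theta_k x^k + (1-\theta_k) x^*)\,(x^k - x^*)$$

for some $\theta_k \in [0,1]$. Therefore, for each $k$,

$$(x^k - x^*)^T H(\theta_k x^k + (1-\theta_k) x^*)\,(x^k - x^*) \leq 0$$

and this implies that $H(x^*)$ cannot be PD.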
HW14: Show this latter claim.

HW15: Construct an example where an interior point $x^* \in \operatorname{int} F$ is a strict local minimum but $H(x^*)$ is PSD and not PD (i.e., the conditions of Theorem 6 are not necessary for a strict local minimum).
Brief Review of Quadratic Forms

Recall, for functions $f : \mathbb{R}^n \to \mathbb{R}$ with continuous second partials, we have

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{\partial^2 f(x)}{\partial x_j \partial x_i}$$

and, therefore, $H(x) = H^T(x)$ (i.e., the Hessian matrix is symmetric).

HW16: (a) If $A$ is PD show that $a_{ii} > 0$, $i = 1, \dots, n$. (b) If $A$ is PSD show that $a_{ii} \geq 0$, $i = 1, \dots, n$.

HW17: Let $A$ be PD. Show that $A$ is a nonsingular matrix.
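Note:

$$x^T A x = \sum_{i} \sum_{j} x_i a_{ij} x_j = \sum_{i} a_{ii} x_i^2 + \sum_{i} \sum_{j \neq i} a_{ij} x_i x_j;$$

if $A$ is also symmetric, we may write

$$x^T A x = \sum_{i} a_{ii} x_i^2 + 2 \sum_{i} \sum_{j > i} x_i a_{ij} x_j.$$

Example: Let

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}$$

be a symmetric $2 \times 2$ matrix with determinant $|A| = a_{11} a_{22} - a_{12}^2$. Then

$$x^T A x = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2.$$

Now, assume $a_{11} \neq 0$. Then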
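$$\begin{aligned} x^T A x &= a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2 \\ &= a_{11} \left[ x_1^2 + 2\,\frac{a_{12}}{a_{11}}\,x_1 x_2 + \frac{a_{12}^2}{a_{11}^2}\,x_2^2 \right] + \left( a_{22} - \frac{a_{12}^2}{a_{11}} \right) x_2^2 \\ &= a_{11} \left( x_1 + \frac{a_{12}}{a_{11}}\,x_2 \right)^2 + \frac{a_{11} a_{22} - a_{12}^2}{a_{11}}\,x_2^2 \\ &= a_{11} \left( x_1 + \frac{a_{12}}{a_{11}}\,x_2 \right)^2 + \frac{|A|}{a_{11}}\,x_2^2. \end{aligned}$$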
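Now, if $a_{11} > 0$ and $|A| = a_{11} a_{22} - a_{12}^2 > 0$ we see that $x^T A x > 0$ for all $x \neq 0$. That is, for a $2 \times 2$ symmetric matrix we see that a sufficient condition for $A$ to be PD is that $a_{11} > 0$, $|A| > 0$.

We now show that this condition is also necessary. Since $a_{11} > 0$ is necessary (HW16), we need only show $|A| > 0$. Now,

$$x^T A x = a_{11} \left( x_1 + \frac{a_{12}}{a_{11}}\,x_2 \right)^2 + \frac{|A|}{a_{11}}\,x_2^2;$$

if $|A| \leq 0$ choose $x = (x_1, x_2)$ as follows: $x_2 = 1$, $x_1 = -\frac{a_{12}}{a_{11}}\,x_2$. Then $x^T A x = \frac{|A|}{a_{11}} \leq 0$, $x \neq 0$, and $A$ is not PD; contradiction.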
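The above motivates the next theorem, which we state without proof.

Theorem 7: Let $A$ be an $n \times n$ matrix which is also symmetric (i.e., $A = A^T$). Then $A$ is PD if, and only if,

$$a_{11} > 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{vmatrix} > 0, \quad \dots, \quad |A| > 0,$$

and therefore (since $|-A| = (-1)^n |A|$), $A$ is ND if, and only if,

$$a_{11} < 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{vmatrix} < 0, \quad \dots, \quad (-1)^n |A| > 0.$$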
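The test in Theorem 7 is easy to mechanize for small symmetric matrices by computing the leading principal minors one at a time. The following Python sketch is an illustration only; the helper names (`determinant`, `leading_principal_minors`, `is_positive_definite`, `is_negative_definite`) and the example matrices are assumptions, and the determinant is computed by plain cofactor expansion, which is adequate only for small $n$.

```python
def determinant(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * determinant(minor)
    return total


def leading_principal_minors(A):
    # Determinants of the upper-left k x k blocks, k = 1, ..., n.
    return [determinant([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_definite(A):
    # Theorem 7 (A assumed symmetric): all leading principal minors positive.
    return all(m > 0 for m in leading_principal_minors(A))


def is_negative_definite(A):
    # A is ND if, and only if, -A is PD.
    negated = [[-a for a in row] for row in A]
    return is_positive_definite(negated)


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # minors 2, 3, 4
B = [[1, 2], [2, 1]]                        # minors 1, -3
minors_A = leading_principal_minors(A)
minors_B = leading_principal_minors(B)
```

Here `A` passes the test and `B` fails it at the second minor. The sketch assumes its input is symmetric and does not check this.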
HW18: Let $A$ be an $n \times n$ matrix which is PSD. For each $\varepsilon > 0$ define $A(\varepsilon) = A + \varepsilon I$, where $I$ is the $n \times n$ identity matrix. Show that $A(\varepsilon)$ is PD.

As a result of HW18, we have the following Corollary to Theorem 7 when $A$ is also symmetric.
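Corollary: If $A$ is a symmetric $n \times n$ matrix which is also PSD then

$$a_{11} \geq 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix} \geq 0, \quad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{vmatrix} \geq 0, \quad \dots, \quad |A| \geq 0.$$

Proof: Since $A(\varepsilon) = (a_{ij}(\varepsilon))$ is PD we must have

$$a_{11}(\varepsilon) > 0, \quad \begin{vmatrix} a_{11}(\varepsilon) & a_{12}(\varepsilon) \\ a_{12}(\varepsilon) & a_{22}(\varepsilon) \end{vmatrix} > 0, \quad \dots, \quad |A(\varepsilon)| > 0.$$

But the determinant is a continuous function of the elements of the matrix and, therefore, by letting $\varepsilon \to 0$ we have

$$a_{11} \geq 0, \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix} \geq 0, \quad \dots, \quad |A| \geq 0.$$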
HW19: Provide a simple example of a symmetric matrix with the above property which is also not PSD.

Note: Let $A$ be an $n \times n$ matrix and define the quadratic function $Q : \mathbb{R}^n \to \mathbb{R}$ by $Q(x) = x^T A x$. Then $Q(x) > 0$ for all $x \neq 0$ if, and only if, $A$ is PD, and similarly for PSD, ND, and NSD. In fact, this is the motivation for the term positive definite. That is, if $Q(x)$ is positive for all $x \neq 0$, we say $A$ is PD, and conversely.
Also note that when we are talking about quadratic forms $Q(x) = x^T A x$ we may assume, without loss of generality, that $A$ is symmetric. In particular, if $A$ is not symmetric, we may replace $A$ by $\hat{A} = \frac{1}{2}(A + A^T)$, and we see that $Q(x) = \hat{Q}(x) = x^T \hat{A} x$ for all $x \in \mathbb{R}^n$, and $\hat{A}$ is symmetric.
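A quick numerical illustration of this symmetrization, under assumed data: the nonsymmetric matrix `A`, the sample vectors, and the helper names below are hypothetical choices for the example, not part of the notes.

```python
def quadratic_form(A, x):
    # Q(x) = x^T A x, written as a double sum over the entries of A.
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    # A_hat = (A + A^T) / 2
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]


A = [[1.0, 4.0], [0.0, 3.0]]   # not symmetric
A_hat = symmetrize(A)          # [[1.0, 2.0], [2.0, 3.0]]

samples = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [2.0, 3.0]]
differences = [abs(quadratic_form(A, x) - quadratic_form(A_hat, x)) for x in samples]
```

Every entry of `differences` should be zero up to rounding, since the off-diagonal terms $a_{ij} x_i x_j + a_{ji} x_j x_i$ are unchanged when both coefficients are replaced by their average.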
So far we have developed a necessary and sufficient condition for a symmetric matrix $A$ to be PD (ND); but we have developed only a necessary condition for a symmetric matrix $A$ to be PSD (NSD). We therefore now sketch a condition which is both necessary and sufficient for a symmetric matrix to be PSD. We first proceed somewhat abstractly as follows:

Let $A$ be an $n \times n$ matrix. Assume there is an $n \times n$ matrix $B$ so that $B^T = B^{-1}$
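(i.e., $B$ is an orthogonal matrix) and there is an $n$-vector $\lambda = (\lambda_1, \dots, \lambda_n)^T$ so that $B^T A B = \lambda I$, where $\lambda I$ denotes the diagonal matrix with diagonal entries $\lambda_1, \dots, \lambda_n$.

Now, for any $x \in \mathbb{R}^n$ define $y = B^T x$, so that $x = By$. Then

$$x^T A x = (By)^T A (By) = y^T B^T A B y = y^T B^{-1} A B y = y^T (\lambda I) y = \sum_{i=1}^{n} \lambda_i y_i^2.$$

We therefore see that $x^T A x \geq 0$, all $x$, if, and only if, $\lambda_i \geq 0$, $i = 1, \dots, n$. Similarly, we see that $x^T A x > 0$, all $x \neq 0$, if, and only if, $\lambda_i > 0$, $i = 1, \dots, n$.

HW20: Prove this last claim.

We now need to discuss the existence of such a matrix $B$ and such a vector $\lambda$.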
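HW21: Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be distinct eigenvalues of the symmetric $n \times n$ matrix $A$. Let $x^i$ and $x^j$ be eigenvectors associated with $\lambda_i$ and $\lambda_j$, $i \neq j$. Show that $(x^i)^T x^j = 0$. Therefore, if we normalize the eigenvectors $x^1, \dots, x^n$ (i.e., $\|x^i\| = 1$, using the Euclidean norm) we then get

$$(x^i)^T x^j = \delta_{ij}, \quad \text{where } \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j. \end{cases}$$

Now show that $x^1, \dots, x^n$ are linearly independent (and therefore the columns of $B = [x^1, \dots, x^n]$ form an orthonormal basis for $\mathbb{R}^n$). Now show that $B^T = B^{-1}$ and $B^T A B = \lambda I$.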
Now conclude that $A$ is PD (PSD) if, and only if, all eigenvalues of $A$ are positive (nonnegative), and $A$ is ND (NSD) if, and only if, all the eigenvalues of $A$ are negative (nonpositive). Also, what happens to the above if not all eigenvalues are distinct?

HW22: Provide an example which shows that a nonsymmetric matrix $A$ may have nonnegative eigenvalues and yet $A$ is not PSD.

Theorem 6 provided a sufficient condition for a point $x^*$ to be a strict local minimum. In particular, it is required in that theorem that $H(x^*)$ be PD. The above results on PD, PSD, etc., provided conditions under which, in some cases, $H(x^*)$ can be checked for PD. However, there is a variant of the hypotheses of Theorem 6 which is sufficient for $x^*$ to be a local minimum (not necessarily strict), but this hypothesis is seldom checkable.
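For a symmetric $2 \times 2$ matrix the eigenvalue characterization of PD, PSD, ND, and NSD stated above can be carried out in closed form, since the eigenvalues of $\begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}$ are $\frac{a_{11} + a_{22}}{2} \pm \sqrt{\left(\frac{a_{11} - a_{22}}{2}\right)^2 + a_{12}^2}$. The Python sketch below is an illustration under assumptions: the function names, the tolerance `tol`, and the convention of returning the first matching label are choices made here, not part of the notes.

```python
import math


def symmetric_2x2_eigenvalues(a11, a12, a22):
    # Closed-form eigenvalues of the symmetric matrix [[a11, a12], [a12, a22]].
    mean = 0.5 * (a11 + a22)
    radius = math.sqrt((0.5 * (a11 - a22)) ** 2 + a12 ** 2)
    return mean - radius, mean + radius


def classify(a11, a12, a22, tol=1e-12):
    """Return the first matching label among PD, PSD, ND, NSD, indefinite."""
    low, high = symmetric_2x2_eigenvalues(a11, a12, a22)
    if low > tol:
        return "PD"          # both eigenvalues positive
    if low >= -tol:
        return "PSD"         # smallest eigenvalue is (numerically) zero
    if high < -tol:
        return "ND"          # both eigenvalues negative
    if high <= tol:
        return "NSD"         # largest eigenvalue is (numerically) zero
    return "indefinite"      # one negative and one positive eigenvalue


labels = [
    classify(2.0, 1.0, 2.0),     # eigenvalues 1 and 3
    classify(1.0, 1.0, 1.0),     # eigenvalues 0 and 2
    classify(-2.0, 1.0, -2.0),   # eigenvalues -3 and -1
    classify(-1.0, 1.0, -1.0),   # eigenvalues -2 and 0
    classify(1.0, 2.0, 1.0),     # eigenvalues -1 and 3
]
```

Note that the zero matrix is both PSD and NSD; with the ordering used in this sketch it is reported as PSD.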
HW23: Let $f : \mathbb{R}^n \to \mathbb{R}$ have continuous 2nd partials and let $x^* \in F$. Show that if

(i) $\nabla f(x^*) = 0$, and
(ii) $H(x)$ is PSD on some neighborhood of $x^*$,

then $x^*$ is a local minimum.

[Note: $H(x)$ PSD on some neighborhood of $x^*$ means there is an $\varepsilon > 0$ such that, for all $x \in N(x^*, \varepsilon)$, the matrix $H(x)$ is PSD.]
Introduction to Convexity

Perhaps the most important analytic tool in the theory of optimization is the notion of convexity. This concept is so fundamental that a good deal of time must be spent on this subject in order to fully appreciate the structure of optimization problems and associated results. In what follows we use the following notation: Let $x^1, x^2 \in \mathbb{R}^n$. We define

$$[x^1, x^2] = \{\, y \mid y = \theta x^1 + (1-\theta) x^2,\ \theta \in [0,1] \,\}$$

and

$$(x^1, x^2) = \{\, y \mid y = \theta x^1 + (1-\theta) x^2,\ \theta \in (0,1) \,\}.$$

Note that $[x^1, x^2] = [x^2, x^1]$ and $(x^1, x^2) = (x^2, x^1)$.

Definition: A set $C \subseteq \mathbb{R}^n$ is convex if $[x^1, x^2] \subseteq C$ for all $x^1, x^2 \in C$.

We also have the following definitions.
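Definition: Let $C \subseteq \mathbb{R}^n$ be a convex set ($C \neq \emptyset$). A function $f : C \to \mathbb{R}$ is convex if for all $x^1, x^2 \in C$, we have

$$f(\theta x^1 + (1-\theta) x^2) \leq \theta f(x^1) + (1-\theta) f(x^2)$$

for all $\theta \in [0,1]$.

Definition: Let $C$ be a convex subset of $\mathbb{R}^n$. $f : C \to \mathbb{R}$ is concave if $-f$ is convex. That is, for all $x^1, x^2 \in C$,

$$f(\theta x^1 + (1-\theta) x^2) \geq \theta f(x^1) + (1-\theta) f(x^2)$$

for all $\theta \in [0,1]$.

Definition: Let $C \neq \emptyset$, $C \subseteq \mathbb{R}^n$, $C$ convex. $f : C \to \mathbb{R}$ is strictly convex if for all $x^1, x^2 \in C$, $x^1 \neq x^2$,

$$f(\theta x^1 + (1-\theta) x^2) < \theta f(x^1) + (1-\theta) f(x^2)$$

for all $\theta \in (0,1)$.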
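The defining inequality for convexity can be probed numerically on a finite sample of points, which is a useful way to find counterexamples (although passing such a test on finitely many points proves nothing). The sketch below works on $C = [0, \infty) \subseteq \mathbb{R}$; the helper `convexity_violations`, the sample points, and the two test functions are assumptions chosen for illustration.

```python
import math


def convexity_violations(f, points, thetas, tol=1e-12):
    """Collect (x1, x2, theta) where f(theta*x1 + (1-theta)*x2) exceeds the chord value."""
    violations = []
    for x1 in points:
        for x2 in points:
            for theta in thetas:
                combo = theta * x1 + (1 - theta) * x2
                chord = theta * f(x1) + (1 - theta) * f(x2)
                if f(combo) > chord + tol:
                    violations.append((x1, x2, theta))
    return violations


points = [0.0, 0.5, 1.0, 2.0, 4.0]          # sample points in C = [0, infinity)
thetas = [0.0, 0.25, 0.5, 0.75, 1.0]

# x^2 is convex, so no sampled triple should violate the inequality.
square_violations = convexity_violations(lambda x: x * x, points, thetas)

# sqrt is concave and not convex on C, so violations are expected.
sqrt_violations = convexity_violations(math.sqrt, points, thetas)
```

For example, with $x^1 = 0$, $x^2 = 4$, and $\theta = 1/2$ the square root gives $\sqrt{2} > 1$, which violates the convexity inequality.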
HW24: (There are no local minima which are not global minima.) Let $C \subseteq \mathbb{R}^n$ be convex and let $f : C \to \mathbb{R}$ be convex. If $x^*$ is a local minimum for $\min_{x \in C} f(x)$ then $x^*$ is a global minimum.

The level sets of convex functions have an important property.

Theorem 8: Let $C \neq \emptyset$ be convex, $C \subseteq \mathbb{R}^n$. Let $f : C \to \mathbb{R}$ be convex. Then

$$L_f(\alpha) = \{\, x \in C \mid f(x) \leq \alpha \,\}$$

is convex for all $\alpha \in \mathbb{R}$.
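Proof: Let $\alpha \in \mathbb{R}$ and let $x^1, x^2 \in L_f(\alpha)$. Then $f(x^i) \leq \alpha$, $i = 1, 2$. Then, for $\theta \in [0,1]$, we have

$$\theta f(x^1) + (1-\theta) f(x^2) \leq \theta \alpha + (1-\theta) \alpha = \alpha.$$

But, by convexity of $f$, we then have

$$f(\theta x^1 + (1-\theta) x^2) \leq \alpha,$$

so that $\theta x^1 + (1-\theta) x^2 \in L_f(\alpha)$.

Note: The above shows that if $f$ is concave on $C$, then $\{\, x \in C \mid f(x) \geq \alpha \,\}$ is convex for all $\alpha \in \mathbb{R}$.

Note: A nonconvex function may have $L_f(\alpha)$ convex for all $\alpha \in \mathbb{R}$.

Proof: [Figure: graph of a nonconvex function on $C = [0, \infty)$ whose level set $L_f(\alpha)$ is an interval.] For instance, $f(x) = \sqrt{x}$ on $C = [0, \infty)$ is not convex, yet $L_f(\alpha) = [0, \alpha^2]$ for $\alpha \geq 0$ and $L_f(\alpha) = \emptyset$ for $\alpha < 0$.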
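Definition: Let $C \subseteq \mathbb{R}^n$, $C \neq \emptyset$ and convex. Let $f : C \to \mathbb{R}$. The lower epigraph of $f$ is defined to be

$$G_f = \{\, (x, \alpha) \in C \times \mathbb{R} \mid f(x) \leq \alpha \,\}.$$

The upper epigraph is defined with the inequality reversed.

Theorem 9 (relating convex sets and functions): Let $C \subseteq \mathbb{R}^n$, $C \neq \emptyset$, convex, and let $f : C \to \mathbb{R}$. Then $f$ is convex if, and only if, the lower epigraph of $f$ is a convex set.

Proof: Assume $f$ is convex and let $(x^1, \alpha_1), (x^2, \alpha_2) \in G_f$. Then $f(x^i) \leq \alpha_i$, $i = 1, 2$, and then

$$\lambda f(x^1) + (1-\lambda) f(x^2) \leq \lambda \alpha_1 + (1-\lambda) \alpha_2, \quad \lambda \in [0,1].$$

But convexity of $f$ implies

$$f(\lambda x^1 + (1-\lambda) x^2) \leq \lambda f(x^1) + (1-\lambda) f(x^2) \leq \lambda \alpha_1 + (1-\lambda) \alpha_2,$$

so that

$$(\lambda x^1 + (1-\lambda) x^2,\ \lambda \alpha_1 + (1-\lambda) \alpha_2) \in G_f.$$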
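Hence $G_f$ is convex.

Now assume $f$ is not convex. Then there exist vectors $x^1, x^2 \in C$, $x^1 \neq x^2$, and $\theta \in (0,1)$ so that

$$f(\theta x^1 + (1-\theta) x^2) > \theta f(x^1) + (1-\theta) f(x^2).$$

Define $\alpha_1 = f(x^1)$ and $\alpha_2 = f(x^2)$. Then $(x^1, \alpha_1) \in G_f$ and $(x^2, \alpha_2) \in G_f$, but

$$(\theta x^1 + (1-\theta) x^2,\ \theta \alpha_1 + (1-\theta) \alpha_2) \notin G_f,$$

so $G_f$ is not convex.

Above, we showed there are nonconvex functions so that $L_f(\alpha)$ is convex for all $\alpha \in \mathbb{R}$. Such functions have a name.

Definition: Let $C \subseteq \mathbb{R}^n$, $C \neq \emptyset$, $C$ convex. A function $f : C \to \mathbb{R}$ is said to be quasi-convex if $L_f(\alpha)$ is convex for all $\alpha \in \mathbb{R}$.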
HW25: Let $C \neq \emptyset$ and convex and let $f : C \to \mathbb{R}$. Show that $f$ is quasi-convex if, and only if, the following property holds: For all $x^1, x^2 \in C$ and all $\theta \in [0,1]$,

$$f(\theta x^1 + (1-\theta) x^2) \leq \max \{ f(x^1), f(x^2) \}.$$

[Note: $f$ is quasi-concave on $C$ convex if $-f$ is quasi-convex; i.e., $\{\, x \in C \mid f(x) \geq \alpha \,\}$ is convex for all $\alpha \in \mathbb{R}$. Therefore, the above HW states that $f$ is quasi-concave if, and only if,

$$f(\theta x^1 + (1-\theta) x^2) \geq \min \{ f(x^1), f(x^2) \}.]$$

HW26: (simple facts about convexity)

(a) Let $f_1$ and $f_2$ be two convex functions on a convex set $C$. Show that $\alpha_1 f_1 + \alpha_2 f_2$ is convex on $C$ if $\alpha_1 \geq 0$, $\alpha_2 \geq 0$.

(b) The intersection of any collection of convex sets in $\mathbb{R}^n$ is also convex.
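(c) Let $C \neq \emptyset$, $C \subseteq \mathbb{R}^n$. Then $C$ is convex if, and only if, the following condition holds: For any integer $k \geq 1$, let $x^1, x^2, \dots, x^k$ be members of $C$ and let $\theta_1, \theta_2, \dots, \theta_k$ be $k$ nonnegative numbers so that $\sum_{i=1}^{k} \theta_i = 1$. Then

$$\sum_{j=1}^{k} \theta_j x^j \in C.$$

(d) Let $C \neq \emptyset$, $C \subseteq \mathbb{R}^n$, $C$ convex. Let $f : C \to \mathbb{R}$. Then $f$ is convex if, and only if, the following holds: For any integer $k \geq 1$, let $x^1, x^2, \dots, x^k$ be in $C$ and let $\theta_1, \theta_2, \dots, \theta_k$ be nonnegative numbers so that $\sum_{i=1}^{k} \theta_i = 1$. Then

$$f\left( \sum_{i=1}^{k} \theta_i x^i \right) \leq \sum_{i=1}^{k} \theta_i f(x^i).$$

(e) Let $C \neq \emptyset$, $C \subseteq \mathbb{R}^n$, $C$ convex. Let $f : C \to \mathbb{R}$. Then $f$ is quasi-convex if, and only if, the following property holds: For any integer $k \geq 1$, let $x^1, x^2, \dots, x^k \in C$ and let $\theta_1, \theta_2, \dots, \theta_k$ be nonnegative numbers so that $\sum_{i=1}^{k} \theta_i = 1$. Then

$$f\left( \sum_{i=1}^{k} \theta_i x^i \right) \leq \max \{ f(x^1), f(x^2), \dots, f(x^k) \}.$$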
An important application of convexity is the characterization of the set of feasible directions. Let $C \subseteq \mathbb{R}^n$ be convex and let $x \in C$. We wish to characterize $D(x, C)$.

In the first place, let $y \in C$, $y \neq x$. Then $y - x$ is a feasible direction since, by convexity of $C$, $x + \theta (y - x) \in C$ for all $\theta \in [0,1]$. Therefore,

$$\{\, y - x \mid y \in C,\ y \neq x \,\} \subseteq D(x, C).$$

On the other hand, let $d \in D(x, C)$. Then there is a $\bar{\theta} > 0$ so that $x + \theta d \in C$ for all $\theta \in [0, \bar{\theta}]$. Define $y = x + \bar{\theta} d$; then $y \in C$ and $y - x = \bar{\theta} d$, which says the feasible direction $d$ is proportional to some $y - x$ where $y \in C$. Note also that since only direction is important, there is essentially no difference between the direction $d$ and the direction $\theta d$ for all $\theta > 0$. Therefore, we may state the following
Theorem 10: Let $C \neq \emptyset$, $C$ convex, $C \subseteq \mathbb{R}^n$. Let $x \in C$. Then, the set of feasible directions at $x$, $D(x, C)$, may be taken to be

$$\{\, y - x \mid y \in C,\ y \neq x \,\}.$$

HW27: Is the following true: Let $C$ be a nonempty convex subset of $\mathbb{R}^n$ and let $f : C \to \mathbb{R}$ be quasi-convex. Let $x^* \in C$ be a local minimum. Then $x^*$ is a global minimum for $\min_{x \in C} f(x)$?

Definition: Let $C \neq \emptyset$ and convex. $f : C \to \mathbb{R}$ is strictly quasi-convex if for all $x^1, x^2 \in C$ so that $f(x^1) \neq f(x^2)$, we also have

$$f(\theta x^1 + (1-\theta) x^2) < \max \{ f(x^1), f(x^2) \}$$

for all $\theta \in (0,1)$.

HW28: Let $x^* \in C$ be a local minimum for $\min_{x \in C} f(x)$ where $C$ is convex and $f$ is strictly quasi-convex. Show that $x^*$ is a global minimum.
HW29: Is a convex function also strictly quasi-convex? Is a quasi-convex function also strictly quasi-convex? Is a strictly quasi-convex function also quasi-convex? Is a strictly convex function also strictly quasi-convex?

HW30: Let $C \neq \emptyset$, $C$ convex, and let $f : C \to \mathbb{R}$ be quasi-convex. If $x^* \in C$ is a strict local minimum then $x^*$ is a global minimum (in fact, a unique global minimum).