EE2 Mathematics : Fnctions of Mltiple Variables http://www2.imperial.ac.k/ nsjones These notes are not identical word-for-word with m lectres which will be gien on the blackboard. Some of these notes ma contain more eamples than the corresponding lectre while in other cases the lectre ma contain more detailed working o are therefore strongl adised to attend lectres. Contents 1 Partial Differentiation and Mltiariable Fnctions 1 1.1 Partial Differentials Reminder....................... 1 1.2 The Total Differential............................ 1 1.3 The Chain Rle................................ 2 2 Change of Variables 3 3 Talor s theorem for mlti-ariable fnctions 5 4 Gradient 5 5 Stationar Points 6 5.1 Characterizing stationar points with the Hessian:.......... 9 6 Leibnitz Integral Rle 9
1 1 Partial Differentiation and Mltiariable Fnctions In the following we will be considering fnctions of mltiple ariables f(,...). We will principall consider the fnctions of jst two ariables, f(, ), bt most of the concepts discssed can be generalized to mltiple ariables. In the lectres o will be introdced to perspectie plots and (constant) contor plots. This part of the corse can be iewed as introdcing the basics of optimization: something critical for qantitatie design. 1.1 Partial Differentials Reminder From starting point ( 0, 0 ), we will consider small changes in the, -plane of size (, ) and the small changes that the correspond to in f, of size f. Consider small changes to with = constant = 0. It follows that f = f( 0 +, 0 ) f( 0, 0 ) and the instantaneos slope in the -direction is then f lim 0 = lim f( 0 +, 0 ) f( 0, 0 ) 0 = (1.1) =0 (if the limit eists). This is called the partial deriatie of f with respect to while keeping constant. There are a ariet of notations for partial deriaties e.g.: = ( ) = = f (1.2) note that in the last two eqialences, the fact that is held constant is simpl assmed. The partial differential with respect to, is defined similarl. =0 Eample : If f(, ) = 3 sin() then = 3 2 sin()+ 3 cos() and also = 4 cos(). 2 f 2 We can also take the partial deriatie of a partial deriatie ielding epressions like = f or 2 f = f. 1.2 The Total Differential What is the change in f for an small step (, )? We ll se a relatiel standard argment (in the lectres we ll se a graphical argment) to obtain the total differential: f = f( +, + ) f(, ) f = f( +, + ) + ( f(, + ) + f(, + )) f(, ) f = f( +, + ) f(, + ) f(, + ) f(, ) + (1.3)
2 This resembles the definitions for the partial differentials sed preiosl. For (, ) sfficientl small it becomes f (, ) + (, ). (1.4) (the first term might seem to approimate (,+ ) ) bt this in trn approimates (,) ). In the limit, 0 this becomes the Total Differential: df = (, ) d + (, ) d (1.5) where df, d, d are called abstract differentials. This generalises to df = 1 d 1 + 2 d 2 +... n d n for fnctions of n ariables. Eample: Consider the fnction f(, ) = 2 + 3 2. The constant contor f = 1 passes throgh (0, 1). What is the ale of on the same contor when = 0.1? Along the contor f = 0. It follows from Eq. 1.4 that (,) + (,) = 0 = (2 + ) + ( 6) ealated at (0, 1) so and so 1 + 1.017. 6 1.3 The Chain Rle This considers the setting where f(, ) bt and are both themseles fnctions of a ariable sch that (), (). This is a natral setting e.g. one might hae a fnction (like fel consmption) depending on distance and speed bt, in trn, both of these qantities might be fnctions of time. E.g. f = b + where = a 2 /2 and = a. Since f(, ) = f() it follows that df is well defined (since f is a niariate fnction of ). d df d = lim f 0 from this follows the Chain Rle: = lim 0 (,) + (,) (1.6) df d (, ) d = d + (, ) Note that this onl applies when f is a niariate fnction of. Eample: From the eample aboe and sing Eq. (1.7) df d d d. (1.7) = (b + ).a +.a which simplifies to ba + 3 2 a2 2. The chain rle generalizes to fnctions of more ariables as df = d 1 d 1 d... n d n d. + 2 d 2 d +
3 2 Change of Variables Sppose f = f(, ) and that = (, ) and = (, ) (this is a generalization of the preios sitation where we had (t) and (t)). So we can write f = f(, ) = f((, ), (, )) = f(, ). It is sefl to be able to toggle between these co-ordinate sstems when tring to sole problems. We can write ot arios ersions of the total differential: df = d + d (since f(, )) (2.1) df = d + d (since f(, )) (2.2) d = d + d (since (, )) (2.3) d = d + d (since (, )) (2.4) We can now sbstitte (2.3) and (2.4) into (2.2) obtaining: df = ( d + d) + ( d + d) (2.5) Groping terms in (2.5) and comparing terms in d and d with (2.1) we find: = + (2.6) which epresses a change of ariables in f. And similarl = + (2.7) Rle of thmb: One can obtain change of ariables epressions from the total differential b mltipling throgh b. I.e. we can obtain (2.6) b writing = + and mltipling throgh b. Note that we re sing and and not d and d becase = (, ) and = (, ). Eample of a change of ariables: Sppose I want to go from thinking abot a fnction in polar co-ordinates, f(r, θ) = r 2 sin θ to representing it in Cartesian co-ordinates. Find where = r cos θ, = r sin θ, r = ( 2 + 2 ) and θ = tan 1 /. Writing the total differential for f(θ, r) as df = r θ dr + θ r dθ. Using the rle of thmb aboe we hae = r r θ + θ θ r (2.8)
4 it follows that = 2r sin θ 1 2 (2 + 2 ) 1 2 2 + r 2 1 cos θ 1 + 2 /. (2.9) 2 2 This simplifies to = / 2 + 2. Note 1 What abot doble partial differentials? 2 f = ( ) = [ ][ ]f. So 2 [ 2 f = 2 r r + ] [ θ r 2 θ r r r + ] θ r 2 θ f (2.10) r. How shold I interpret epressions like this? E.g. what is the first term when I epand the aboe? [ ] f = [ ] (2.11) r r r r r r r r = [ (/r) r r r + ] 2 f (2.12) r r 2 How do we obtain eqations like (2.10)? Well let s consider Eqation (2.8) and 2 f. 2 Writing g(r, θ) = we are then at libert to consider the total differential of dg(r, θ) in terms of the abstract differentials dr and dθ. B constrction, 2 f = g and we can obtain 2 g b sing the aboe rle of thmb and mltipling dg(r, θ) throgh b : this ields Eq. (2.8) in terms of g not f. Now we can take this epression for g and eliminate g sing g(r, θ) =. One then obtains an epression similar to Eqation (2.10). Note 2 Implicit epressions hae cropped p alread e.g. f(,, z) = 0. First of all it might be sefl to remember how, in the niariate case o hae been taght to do implicit differentation. E.g. Remind orself how o wold find d when d 2 + log + 3 = 0 (hint: o don t do it b rearranging for an epression = h()). So what abot the mltiariate z case if I want to find e.g.? Eample: for = z 3 + z. Differentiate with respect to holding constant (and noting the obios, bt perhaps helpfl to some, that [ ]z = z ). It follows from the epression for that 0 = 3z 2 z + z + z. We can then rearrange and sole for z = z. 3z 2 + Note 3 There are two identities which hold for f(,, z) = 0 which o ma hae encontered preiosl. These are called the reciprocit relation and the cclic relation. [ ] 1 = z (2.13) z z. z. z = 1 (2.14) Tr and obtain these for orself. This can be done b writing ot the total differentials for d(, z) and d(, z) and sing the epression for d to eliminate d from the epression for
5 d. Then, b considering the scenarios first when z is constant and then when is constant o can obtain the aboe. 3 Talor s theorem for mlti-ariable fnctions Reminder: in niariate case f() = f( 0 ) + ( 0 )f ( 0 ) + ( 0) 2 f ( 0 ) +... + ( 0) n 1 f n 1 ( 0 ) +... (3.1) 2! (n 1)! where we cold also hae written 0 =. Generalizing to two ariables: f(, ) = f( 0, 0 ) + + + 1 2! [ ] 2 f 2 ( )2 + 2 2 f + 2 f 2 ( )2 +... (3.2) Where we define = 0 and = 0 and ealate the deriaties at 0, 0. We cold write the qadratic term in different notation as ( + )( + )f where and act onl on f not on or. The aboe epansion then becomes: f(, ) = n=0 For more than two dimensions we can write this as [( 1 n! + ) n f(, )] (3.3) 0, 0 or f() = n=0 1 n! [( )n f()] =0 (3.4) f() = f( 0 ) + i i i + 1 2! i j 2 f i j i j +... (3.5) Hessian: If f( 1... n ) we can define H ij = i j f( 1,..., n ). The matri H is called the Hessian. H ij appears in the eqation aboe (3.5). 4 Gradient The gradient ector is called grad f or f = (, ). If f = f( 1... n ) then the same notation holds for higher dimensions and the i th component of f i = i. The total differential df = d+ d can be written as df = ds f where ds = (d, d). Instead of ds we cold also think abot small (finite) steps S = (, ) or S S (where S is a nit ector in direction S). We can ths write
6 Rewriting the dot prodct: f S f = S S f (4.1) f f S cos θ (4.2) where θ is the angle between ectors f and S. Eqation 4.2 is sefl: it relates the change in the fnction f, f, to f and S. If S is a step which is parallel to f then θ is zero and f is maimized. This means that small steps in a direction parallel to f are in the direction of greatest change of f. This allows an interpretation of f: it points in the direction of greatest change of f. Similarl if S is perpendiclar to f then Eqation 4.2 tells s that the change in f is zero. In this case S is a motion along a constant contor of f (or we re located at one of f s stationar points - see later). This gies s an interpretation for the direction of f bt what abot its magnitde? f(, ) tells s the gradient in the direction f(, ). In Eq. 4.2 if S f then θ = 0 and so rearranging we find that f. In the limit of infinitesimal steps ds in the f S direction f this becomes df = f. I.e. preiosl we took slices throgh or landscape ds which were parallel to the co-ordinate aes, e.g. setting = 0. We cold then calclate gradients like = 0. Bt we can also, for an point (, ), take a slice throgh the fnction in a direction f(, ) which passes throgh (, ). The instantaneos rate of change of f along this slice, and at (, ), is the ale of f(, ). Recap: f(, ) is a ector pointing in the direction of greatest change of f at the point (, ). Its magnitde is eqal to the rate of change of f in this direction (at (, )). From aboe, if ds is not parallel to f then a step, ds, of infinitesimal size in direction â has a rate of change of f specified b df = f â = f cos θ where θ is the angle between ds â and f. A standard eample: Consider the fnction f = 2 + z. Find i) f ii) Find the rate of change of f ( df ) in a direction a = (1, 2, 3) at point (1, 2, 1) iii) what is the direction of the ds largest rate of change at this point, and what is its magnitde? i) f = (2, 2 + z, ). ii) First find â then df = f â. At (1, 2, 1) f = (4, 0, 2) ds so df = 10 ds 14. iii) This is the direction f = (4, 0, 2) and of magnitde f = (20). 5 Stationar Points Uniariate case: When we differentiate f() and sole for the 0 sch that df = 0, the d d gradient is zero at these points. What abot the rate of change of gradient: ( df ) at the d d minimm 0? For a minimm the gradient increases as 0 0 + ( > 0). It follows
7 that d2 f d > 0. The opposite is tre for a maimm: 2 f < 0, the gradient decreases pon d 2 d 2 positie steps awa from 0. For a point of inflection d2 f = 0. d 2 Mltiariate case: Stationar points occr when f = 0. In 2-d this is (, ) = 0, namel, a generalization of the niariate case. Recall that df = d + d can be written as df = ds f where ds = (d, d). If f = 0 at ( 0, 0 ) then an infinitesimal step ds awa from ( 0, 0 ) will still leae f nchanged, i.e. df = 0. There are three tpes of stationar points of f(, ): Maima, Minima and Saddle Points. We ll draw some of their properties on the board. We will now attempt to find was of identifing the character of each of the stationar points of f(, ). Consider a Talor epansion abot a stationar point ( 0, 0 ). We know that f = 0 at ( 0, 0 ) so writing (, ) = ( 0, 0 ) we find: f = f(, ) f( 0, 0 ) 0 + 1 [ ] 2 f 2! 2 ( )2 + 2 2 f + 2 f 2 ( )2. (5.1) Maima: At a maimm all small steps awa from ( 0, 0 ) lead to f < 0. From Eq. (5.1) it follows that for a maimm: 2 f + 2 f + 2 f < 0 (5.2) for all (, ). Since this holds for all (, ) this incldes = 0. It follows that f < 0 (5.3) (also f < 0 b similar argments). Since Eq. (5.2) holds for arbitrar (, ), it mst also hold for (, ) = (λ, ) (i.e. along the locs = λ ). In this case Eq. (5.2) becomes: Mltipling throgh b f < 0 we obtain: λ 2 f + 2λf + f < 0. (5.4) λ 2 f 2 + 2λf f + f f > 0 (5.5) f 2 f f < (f λ + f ) 2, (5.6) for all λ (this last step is a sefl trick). The right-hand side of Eqation (5.6) can be set to zero (b picking λ appropriatel) bt can t be smaller (becase it is a sqared term) so: f 2 f f < 0. (5.7) Minima: The same tpe of argments appl for minima where all small steps from ( 0, 0 ) lead to f > 0.
8 (and also f > 0). f > 0 (5.8) f 2 f f < 0. (5.9) Saddle Points: In the Talor epansion for f what happens if = λ sch that Eq. (5.1) is zero to second order? λ 2 f + 2λf + f = 0. (5.10) We can se the same trick of mltipling throgh b f and rearranging as we did in Eqs. (5.5,5.6) and noting that (f λ + f ) 2 > 0 for all λ we find: f 2 f f > 0 (5.11) which is the condition for a stationar point to be a saddle point. Note that f, f are nconstrained. Eq. (5.10) can be soled for real roots λ 1, λ 2. This specifies two directions = λ 1 and = λ 2 on which f( 0 +, 0 + ) = f( 0, 0 ) to second order (these correspond the the asmptotes of the saddle s locall hperbolic contors). Smmar: The sfficient conditions for a stationar point to be a Ma, Min, Saddle are: Ma : f < 0 and f 2 f f < 0 (5.12) Min : f > 0 and f 2 f f < 0 (5.13) Saddle : f 2 f f > 0 (5.14) Note that these are not necessar conditions: consider f(, ) = 4 + 4 (a qestion in one of or eamples classes). To classif the stationar points in sch cases the Talor epansion sed in Eq. (5.1) mst be taken to higher order. A standard eample: Find and classif the stationar points of f(, ) = 3 3 2 + 2 2 and sketch its contors. From f = 0 it follows that f = 3 2 6 + 2 = 0 and f = 2 2 = 0. It follows that = and so that 3 2 4 = 0 and ths = 0, 4. Stationar points are ths (0, 0) and 3 ( 4, 4). 3 3 We can calclate the possible second deriaties (eqialent to finding the Hessian mentioned earlier): f = 6 6; f = 2; f = 2. At (0, 0) b sbstitting in the releant ales one finds f < 0 and f 2 f f < 0. Ths (0, 0) is a maimm. At ( 4, 4 ) we find 3 3 we hae a saddle point, since f 2 f f > 0. In order to sketch this we need to find the asmptotes of the locall hperbolic contors abot the saddle point. Which two directions hae f = 0 (to second order) at the saddle point? I.e. what = λ 1 and = λ 2 are sch that f( 0 +, 0 + ) = f( 0, 0 ) (to second order)? As aboe, we need to sole λ 2 f + 2λf + f = 0 at ( 4, 4 ): obtaining 3 3 2(λ 2 + 2λ 1) = 0 so λ 1,2 = 1 ± 2.
9 5.1 Characterizing stationar points with the Hessian: We can write Eq. (5.1) in terms of the Hessian: 2 f + 2 f + 2 f = S T H S (5.15) where S = (, ) and here: ( f f H = f f ). (5.16) A local maimm has S T H S < 0 for all S. This is the same as saing the matri H is negatie definite (later o ll learn this means its eigenales are strictl negatie). The condition f 2 f f < 0 can be interpreted as the statement that the determinant of H mst be positie. Recall that deth = f f f 2. A local minimm has S T H S > 0 for all S. This is eqialent to H being positie definite (strictl positie eigenales). The conditions for being a minimm are ths deth > 0 and f > 0. A saddle point has deth < 0 and H is called indefinite (its eigenales hae mied signs). If deth = 0 higher order terms in the Talor series are reqired to characterize the stationar point. [A notationall inoled, and non-eaminable, generalisation of the aboe, for fnctions of k ariables, is that a sfficient condition for a stationar point to be a maimm is to hae ( 1) i D (i) > 0 for all i k where D (i) = deth (i) and where H (i) is a sbmatri of H composed of all entries H lm with indices l, m < i + 1. H (i) is sometimes called the i th order leading principal minor of H. For a minimm the eqialent condition is that D (i) > 0 for all i k] 6 Leibnitz Integral Rle Differentiation of integrals of fnctions of mltiple ariables Consider: Leibnitz Integral Rle is then: df () d Eample: If F () = 3 F () = t=() t=() f(, t)dt (6.1) = f(, ()) d t=() f(, ())d d d + dt (6.2) t=() 2 sin t t + tdt find df. Using Eq. (6.2) we hae d
10 df () d sin 4 = ( + 3 ).3 2 sin 3 3 ( + 2 ).2 + cos t dt (6.3) 3 2 2 Where the integral on the right-handside becomes [ sin t ] 3 2.