Chapter 4. Several-variable calculus

4.1 Derivatives of Functions of Several Variables

4.1.1 Functions of Several Variables

• A function $f$ of $n$ variables $(x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$ is an entity that operates on these variables to produce a real number $y = f(x_1, x_2, \ldots, x_n)$.
• $x_1, x_2, \ldots, x_n$ are called the independent variables, and $y$ the dependent variable.
• We write $f: \mathbb{R}^n \to \mathbb{R}$ to indicate that $f$ maps $\mathbb{R}^n$ (or a domain within $\mathbb{R}^n$) into $\mathbb{R}$.

4.1.2 Geometric Interpretation

For a function of two variables, $f(x, y)$, consider $(x, y)$ as defining a point $P$ in the $xy$-plane. Let the value of $f(x, y)$ be taken as the length $PP'$ drawn parallel to the $z$-axis (or the height of point $P'$ above the plane). Then as $P$ moves in the $xy$-plane, $P'$ maps out a surface in space whose equation is $z = f(x, y)$.

Just as the graph of a function of one variable is cut only once by each vertical line (constant $x$), here the surface can be cut only once by each vertical line (constant $x$ and $y$).

Example: $f(x, y) = 6 - 2x - 3y$. The surface $z = 6 - 2x - 3y$, i.e. $2x + 3y + z = 6$, is a plane which intersects:
• the $x$-axis where $y = z = 0$, i.e. $x = 3$;
• the $y$-axis where $x = z = 0$, i.e. $y = 2$;
• the $z$-axis where $x = y = 0$, i.e. $z = 6$.

Example: $f(x, y) = x^2 - y^2$. In the plane $x = 0$ there is a maximum at $y = 0$; in the plane $y = 0$ there is a minimum at $x = 0$. The whole surface is shaped like a horse's saddle, and the point $(0, 0)$ is called a saddle point (of which, more later).

[Figure: the surface $z = x^2 - y^2$.]

Example: $f(x, y) = x^2 + y^2$. The intersection with the plane $x = 0$ is the parabola $z = y^2$, and with the plane $y = 0$ is the parabola $z = x^2$. This surface is symmetric about the $z$-axis, and is a paraboloid (parabolic bowl).

[Figure: the surface $z = x^2 + y^2$.]

4.1.3 Partial Derivatives

Given a function of several variables, we could choose to hold all but one of these variables fixed at arbitrarily chosen values, thereby obtaining a function of one variable (the remaining one), which could then be differentiated.

Definition. Given a function $f(x_1, \ldots, x_n)$ of $n$ variables and an integer $k$ between 1 and $n$, the partial derivative

$$\frac{\partial f}{\partial x_k} = f_{x_k} = \partial_{x_k} f$$

of $f$ with respect to the variable $x_k$ is the derivative of $f$ with respect to $x_k$ only, while the remaining $n - 1$ variables are all held fixed. Explicitly,

$$\frac{\partial f}{\partial x_k} = f_{x_k} = \lim_{\delta x_k \to 0} \frac{f(x_1, \ldots, x_{k-1}, x_k + \delta x_k, x_{k+1}, \ldots, x_n) - f(x_1, \ldots, x_n)}{\delta x_k}, \qquad (4.1)$$

itself a function of $(x_1, \ldots, x_n)$.

In practice the variables held fixed act as constants:

$$f(x) = 3x^4 + \sin(2x) \;\Rightarrow\; \frac{df}{dx} = 12x^3 + 2\cos(2x),$$
$$f(x, y) = yx^4 + \sin(yx) \;\Rightarrow\; \frac{\partial f}{\partial x} = 4yx^3 + y\cos(yx).$$

Geometrical interpretation of partial derivatives in the case $n = 2$

Recall that the graph of $f$ is the surface $z = f(x, y)$ with the $z$ coordinate measured vertically upwards. The cross section of this surface cut by a vertical plane $y = $ constant is a curve whose slope (gradient) is the partial derivative $f_x$ (see figure). Similarly, $f_y$ is the slope of the cross section of the graph by a vertical plane $x = $ constant. One may interpret the partial derivatives $f_x$ and $f_y$ as the slopes encountered when walking over the surface in the $x$ and $y$ directions respectively.
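The limit in (4.1) can be approximated numerically, which gives a quick sanity check on a hand calculation. The sketch below is illustrative only: the helper name `partial`, the step size `h` and the sample point are assumptions, and it uses a central difference rather than the one-sided quotient in (4.1) because the central form is usually more accurate for the same step.

```python
import math

def partial(f, point, k, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect
    to variable number k (0-based), holding every other variable fixed."""
    up = list(point)
    down = list(point)
    up[k] += h
    down[k] -= h
    return (f(*up) - f(*down)) / (2 * h)

def f(x, y):
    # The example from the text: f(x, y) = y x^4 + sin(yx).
    return y * x**4 + math.sin(y * x)

def f_x_exact(x, y):
    # The partial derivative found by hand: 4 y x^3 + y cos(yx).
    return 4 * y * x**3 + y * math.cos(y * x)

x0, y0 = 1.2, 0.7  # arbitrary sample point (an assumption)
approx = partial(f, (x0, y0), 0)
exact = f_x_exact(x0, y0)
print(approx, exact)
```

The two printed values should agree to several decimal places; a large discrepancy would usually point to an algebra slip in the hand-derived formula.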

Remark. It is clear from the definition that the partial derivative with respect to a particular variable obeys the same sum, product and quotient rules D II to D IV as the ordinary (single variable) derivative, i.e. if $u$ and $v$ are both functions of $x_1, \ldots, x_n$ then, for $k = 1, \ldots, n$,

$$\frac{\partial}{\partial x_k}(u + v) = \frac{\partial u}{\partial x_k} + \frac{\partial v}{\partial x_k}, \qquad (4.2)$$
$$\frac{\partial}{\partial x_k}(uv) = u\frac{\partial v}{\partial x_k} + v\frac{\partial u}{\partial x_k}, \qquad (4.3)$$
$$\frac{\partial}{\partial x_k}\left(\frac{u}{v}\right) = \frac{1}{v^2}\left(v\frac{\partial u}{\partial x_k} - u\frac{\partial v}{\partial x_k}\right) \quad (v \neq 0). \qquad (4.4)$$

Corresponding to D I, we have the result that

$$f(x_1, \ldots, x_n) \text{ is independent of } x_k \text{ iff } \frac{\partial f}{\partial x_k} \text{ is zero for all } (x_1, \ldots, x_n), \qquad (4.5)$$

and the consequent result

$$f(x_1, \ldots, x_n) = \text{constant iff the } n \text{ partial derivatives are all zero for all } (x_1, \ldots, x_n). \qquad (4.6)$$

Corresponding to the chain rule D V, we have the result that, if $g$ is a function of $x_1, \ldots, x_n$ and $f$ a function of a single variable, then

$$\frac{\partial}{\partial x_k}\left[f\{g(x_1, \ldots, x_n)\}\right] = f'\{g(x_1, \ldots, x_n)\}\,\frac{\partial g}{\partial x_k}. \qquad (4.7)$$

For example, if $f(g) = \sin g$ and $g(x, y) = x^2 + xy$ then

$$f(x, y) = \sin(x^2 + xy), \qquad f_x = (2x + y)\cos(x^2 + xy).$$

A more powerful and very important generalization of the chain rule is coming up later in this chapter.

Example: Calculate the partial derivatives of the functions: (a) $f(x, y) = x^2 + 2xy^2 + y^3$; (b) $f(x, y, z) = xz + e^{yz} + \sin(xy)$.

(a) Holding $y$ constant gives $\dfrac{\partial f}{\partial x} = 2x + 2y^2 + 0$. Holding $x$ constant gives $\dfrac{\partial f}{\partial y} = 0 + 4xy + 3y^2$.

(b) Holding both $y$ and $z$ constant gives $f_x = z + 0 + y\cos(xy)$. Holding both $x$ and $z$ constant gives $f_y = 0 + ze^{yz} + x\cos(xy)$. Holding both $x$ and $y$ constant gives $f_z = x + ye^{yz} + 0$.

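Example (implicit partial differentiation): If $z$ is a function of two independent variables $x$ and $y$, and $z$ satisfies $xz + \ln z = 2x + 3y$, find $\dfrac{\partial z}{\partial x}$ in terms of $x$, $y$ and $z$.

Differentiating each term in the equation with respect to $x$, holding $y$ constant and treating $z$ as a function of $x$, we obtain

$$x\frac{\partial z}{\partial x} + z + \frac{1}{z}\frac{\partial z}{\partial x} = 2,$$

so that

$$\frac{\partial z}{\partial x} = \frac{z(2 - z)}{1 + xz}.$$

4.1.4 Second and Higher Order Partial Derivatives

Since $\dfrac{\partial f}{\partial x} = f_x$ and $\dfrac{\partial f}{\partial y} = f_y$ are themselves functions of $x$ and $y$, they themselves have partial derivatives, for which we use the notations

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = (f_x)_x = f_{xx}, \qquad (4.8)$$
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = (f_x)_y = f_{xy}, \qquad (4.9)$$
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = (f_y)_x = f_{yx}, \qquad (4.10)$$
$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = (f_y)_y = f_{yy}. \qquad (4.11)$$

This notation extends in the obvious way to higher order derivatives and to functions of three or more variables. The derivatives $f_{xy}$ and $f_{yx}$ are called mixed derivatives.

Example: If $f(x, y) = x^4y^2 - x^2y^6$ then

$$\frac{\partial f}{\partial x} = 4x^3y^2 - 2xy^6, \qquad \frac{\partial f}{\partial y} = 2x^4y - 6x^2y^5,$$
$$\frac{\partial^2 f}{\partial x^2} = 12x^2y^2 - 2y^6, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 8x^3y - 12xy^5,$$
$$\frac{\partial^2 f}{\partial y^2} = 2x^4 - 30x^2y^4, \qquad \frac{\partial^2 f}{\partial x\,\partial y} = 8x^3y - 12xy^5.$$

Mixed Derivatives Theorem

If $f_x$, $f_y$ and $f_{xy}$ exist and are continuous, then $f_{yx}$ exists and $f_{xy} = f_{yx}$.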

We will not prove this theorem (we have not fully defined the word "continuous"); but for reasonable functions it will always apply. This means that to calculate a mixed derivative we can differentiate in either order. For third-order derivatives the mixed derivatives theorem gives $f_{xxy} = f_{xyx} = f_{yxx}$ and so on (check for yourself in the last example).

Example: Verify the Mixed Derivatives Theorem for the function $f(x, y) = xy^3 + x\sin xy$.

Using the sum, product and chain rules, we see that

$$f_x = y^3 + \sin xy + xy\cos xy,$$

and hence that

$$f_{xy} = (f_x)_y = 3y^2 + x\cos xy + (x\cos xy - x^2y\sin xy) = 3y^2 + 2x\cos xy - x^2y\sin xy.$$

Similarly, $f_y = 3xy^2 + x^2\cos xy$, so

$$f_{yx} = (f_y)_x = 3y^2 + (2x\cos xy - x^2y\sin xy) = f_{xy}.$$

Example: In 3 dimensions, the distance $r$ of a point from the origin is given in terms of its Cartesian coordinates $x$, $y$ and $z$ by

$$r = \sqrt{x^2 + y^2 + z^2} = (x^2 + y^2 + z^2)^{1/2}.$$

Show that the function $\phi(x, y, z) = 1/r = (x^2 + y^2 + z^2)^{-1/2}$ obeys Laplace's equation

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} = 0$$

(except at the origin).

By the chain rule,

$$\frac{\partial \phi}{\partial x} = -\tfrac{1}{2}(x^2 + y^2 + z^2)^{-3/2}(2x) = -x(x^2 + y^2 + z^2)^{-3/2}.$$

Therefore, by the product and chain rules,

$$\frac{\partial^2 \phi}{\partial x^2} = -(x^2 + y^2 + z^2)^{-3/2} + (-x)\left(-\tfrac{3}{2}\right)(x^2 + y^2 + z^2)^{-5/2}(2x) = -(x^2 + y^2 + z^2)^{-3/2} + 3x^2(x^2 + y^2 + z^2)^{-5/2}.$$

Similarly, by symmetry,

$$\frac{\partial^2 \phi}{\partial y^2} = -(x^2 + y^2 + z^2)^{-3/2} + 3y^2(x^2 + y^2 + z^2)^{-5/2},$$
$$\frac{\partial^2 \phi}{\partial z^2} = -(x^2 + y^2 + z^2)^{-3/2} + 3z^2(x^2 + y^2 + z^2)^{-5/2}.$$

Adding the three equations above now gives

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} + \frac{\partial^2 \phi}{\partial z^2} = -3(x^2 + y^2 + z^2)^{-3/2} + 3(x^2 + y^2 + z^2)(x^2 + y^2 + z^2)^{-5/2} = -3(x^2 + y^2 + z^2)^{-3/2} + 3(x^2 + y^2 + z^2)^{-3/2} = 0.$$
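The Laplace result for $\phi = 1/r$ can be spot-checked numerically. The sketch below is a rough check rather than a proof: it estimates each unmixed second partial derivative with a standard three-point central difference and sums them. The helper names, the step `h` and the sample point are assumptions chosen for illustration.

```python
def phi(x, y, z):
    # phi = 1/r = (x^2 + y^2 + z^2)^(-1/2), undefined at the origin.
    return (x * x + y * y + z * z) ** -0.5

def second_partial(f, point, k, h=1e-3):
    """Three-point central-difference estimate of the second partial
    derivative of f with respect to variable number k (0-based)."""
    up = list(point)
    down = list(point)
    up[k] += h
    down[k] -= h
    return (f(*up) - 2 * f(*point) + f(*down)) / h**2

def laplacian(f, point, h=1e-3):
    # Sum of the three unmixed second partial derivatives.
    return sum(second_partial(f, point, k, h) for k in range(3))

point = (1.0, -2.0, 0.5)  # arbitrary point away from the origin
print(second_partial(phi, point, 0))  # a single term is clearly non-zero
print(laplacian(phi, point))          # the sum should be very close to zero
```

Each individual second derivative is of ordinary size, but their sum cancels to within the discretisation error, which is what Laplace's equation predicts.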

4.2 Linear Approximations and Tangents

4.2.1 Tangent to Graph of a Function of One Variable

The tangent to the curve $y = f(x)$ at $A = (a, f(a))$ is the straight line through $A$ with slope $f'(a)$, i.e. it has the equation

$$y = f(a) + (x - a)f'(a). \qquad (4.12)$$

NB1. For this line, $\dfrac{dy}{dx} = f'(a)$ and $y = f(a)$ at $x = a$.

NB2. The RHS consists of the first two terms of the Taylor expansion of $f$ about $x = a$ (i.e. it is the best linear approximation to $f(x)$ near $x = a$).

Example: Find the linear approximation to $f(x) = 1 + x^2$ near $x = 2$.

If $f(x) = 1 + x^2$ then $f'(x) = 2x$. At the point $x = 2$ we have $f = 5$ and $f' = 4$. Therefore the linear approximation is

$$f(x) \approx 5 + 4(x - 2) = 4x - 3.$$

4.2.2 Tangent Plane to Graph of a Function of Two Variables

By analogy with the above, this is the (best) linear approximation to $f$ near $(a, b)$, as given by the constant and linear terms of the two-variable Taylor series (appendix F). It is the plane whose equation is

$$z = f(a, b) + (x - a)f_x(a, b) + (y - b)f_y(a, b). \qquad (4.13)$$

NB: For this plane, $\dfrac{\partial z}{\partial x} = f_x(a, b)$, $\dfrac{\partial z}{\partial y} = f_y(a, b)$ and $z = f(a, b)$ at $x = a$ and $y = b$, i.e. we have matched the first derivatives and the value of the function at $(a, b)$.

Example: Find the tangent plane to the surface $z = f(x, y) = x^2 + y^2$ at the point $x = 1$, $y = 2$.

If $f(x, y) = x^2 + y^2$ then $f_x = 2x$ and $f_y = 2y$. At the point $(1, 2)$ we have $f = 5$, $f_x = 2$ and $f_y = 4$. Thus the tangent plane is

$$z = 5 + 2(x - 1) + 4(y - 2) = 2x + 4y - 5.$$

4.3 Directional derivatives and the gradient vector

For $f(x, y)$, $f_x$ and $f_y$ measure the rates of change of $f$ along the $x$ and $y$ directions. How can we calculate the rate of change of $f$ in any direction? We need to know how much $f$ changes when both $x$ and $y$ change by small amounts.

Near $x = a$ and $y = b$, $f(x, y)$ is approximately given by equation (4.13) for the tangent plane. Let $x$ change by a (vanishingly) small amount $dx$, and $y$ by $dy$ (i.e. $x = a + dx$, $y = b + dy$); then

$$f(x, y) \approx f(a, b) + (x - a)f_x(a, b) + (y - b)f_y(a, b),$$
$$f(a + dx, b + dy) \approx f(a, b) + (dx)f_x(a, b) + (dy)f_y(a, b).$$

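The change in $f$ is $df = f(a + dx, b + dy) - f(a, b)$, so

$$df = (dx)f_x(a, b) + (dy)f_y(a, b) = \nabla f \cdot d\mathbf{r},$$

where we have defined the two-dimensional vector representing the change in $x$ and $y$,

$$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} = (dx, dy),$$

and the two-dimensional gradient vector

$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right). \qquad (4.14)$$

We can, additionally, write $d\mathbf{r} = \mathbf{u}\,dr$, where $\mathbf{u}$ is a unit vector in the direction of $d\mathbf{r}$ and $dr$ is the magnitude of the change. Then $df = \nabla f \cdot \mathbf{u}\,dr$, and so

$$\text{rate of change of } f \text{ in the direction of } \mathbf{u} = \frac{df}{dr} = \nabla f \cdot \mathbf{u}.$$

The above generalises to functions of more than two variables. E.g. for a function of three variables, $f(x, y, z)$, the three-dimensional gradient vector is

$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right). \qquad (4.15)$$

4.3.1 Two properties of the gradient

The change $df$ in $f$ due to a change in the position by $d\mathbf{r} = \mathbf{u}\,dr$ is given by

$$df = \nabla f \cdot d\mathbf{r} = \nabla f \cdot \mathbf{u}\,dr = |\nabla f|\,dr\cos\theta, \qquad (4.16)$$

where $\theta$ is the angle between the vectors $d\mathbf{r}$ and $\nabla f$. We look at the cases where $d\mathbf{r}$ is parallel or perpendicular to $\nabla f$.

Property 1. From (4.16), the direction $d\mathbf{r}$ for which $df$ is a maximum is that for which $\cos\theta = 1$, or $\theta = 0$, i.e. $d\mathbf{r}$ in the direction of $\nabla f$. Thus:

At any point, $\nabla f$ points in the direction in which $f$ is increasing most rapidly, and its magnitude $|\nabla f|$ gives this maximum rate of change; i.e. $\nabla f$ points uphill.

Property 2. From (4.16), $df = 0$ corresponds to $\theta = \pi/2$, when $\nabla f$ and $d\mathbf{r}$ are perpendicular. But $df = 0$ means that $f$ has not changed, so $d\mathbf{r}$ lies along the surface $f = $ constant. Thus:

At any point, $\nabla f$ is perpendicular to the surface $f = $ constant through that point.

NB $f = $ constant is a contour of the function $f$.

For a function of two variables, these two properties are illustrated in the following picture: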

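Example: If $f(x, y, z) = z^3 + 3x^2y^2 + \sin z$, find $\nabla f$.

The three partial derivatives are

$$\frac{\partial f}{\partial x} = 0 + 6xy^2 + 0 = 6xy^2, \qquad \frac{\partial f}{\partial y} = 0 + 6x^2y + 0 = 6x^2y, \qquad \frac{\partial f}{\partial z} = 3z^2 + 0 + \cos z = 3z^2 + \cos z,$$

so $\nabla f = (6xy^2,\; 6x^2y,\; 3z^2 + \cos z)$.

Example: If $f(x, y, z) = x^2 + xy + z$, find $\nabla f$. What is the rate of change of $f$ along the direction $\mathbf{i} + 2\mathbf{j} + 2\mathbf{k}$ at the point $P(1, 1, 1)$? What is the magnitude of the maximum rate of change of $f$ at this point?

$$\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (2x + y,\; x,\; 1).$$

Now, at the point $P(1, 1, 1)$, $\nabla f = (3, 1, 1)$. To find the rate of change of $f$ along a vector $\mathbf{v} = (1, 2, 2)$, we need the unit vector along this direction, which is

$$\hat{\mathbf{v}} = \frac{1}{\sqrt{1^2 + 2^2 + 2^2}}(1, 2, 2) = \left(\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3}\right).$$

So the rate of change of $f$ in this direction is

$$\nabla f \cdot \hat{\mathbf{v}} = (3, 1, 1) \cdot \left(\tfrac{1}{3}, \tfrac{2}{3}, \tfrac{2}{3}\right) = 1 + \tfrac{2}{3} + \tfrac{2}{3} = \tfrac{7}{3}.$$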
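The directional-derivative calculation above amounts to a few vector operations, sketched here in plain Python. The helper names `unit` and `dot` are illustrative assumptions; the gradient formula is the one derived in the example. The last line evaluates $|\nabla f|$, which by Property 1 is the largest rate of change over all directions.

```python
import math

def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x^2 + xy + z, as derived in the example.
    return (2 * x + y, x, 1.0)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def unit(v):
    # Scale v to unit length so that the dot product is a rate per unit distance.
    norm = math.sqrt(dot(v, v))
    return tuple(c / norm for c in v)

g = grad_f(1.0, 1.0, 1.0)         # gradient at P(1, 1, 1)
rate = dot(g, unit((1, 2, 2)))    # rate of change along i + 2j + 2k
steepest = math.sqrt(dot(g, g))   # |grad f|: the largest possible rate
print(g, rate, steepest)
```

Normalising the direction vector first matters: using $(1, 2, 2)$ directly would give a value three times too large.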

The maximum rate of change of $f$ at the point $P$ is $|\nabla f| = \sqrt{11}$.

Example: Find a unit vector perpendicular to the surface $z = x^2 + y^2$ at the point $A(1, 2, 5)$.

Method A (via the tangent plane). Earlier, we found the tangent plane to this surface at this point to be $2x + 4y - z = 5$. The vector equation of a plane can be written as $\mathbf{r} \cdot \mathbf{n} = a$, where $\mathbf{r} = (x, y, z)$ and $\mathbf{n}$ is a vector perpendicular to the plane. By inspection, we see that $\mathbf{n} = (2, 4, -1)$ is such a vector, and so a unit vector in this direction is

$$\hat{\mathbf{n}} = \frac{1}{\sqrt{2^2 + 4^2 + 1^2}}(2, 4, -1) = \frac{1}{\sqrt{21}}(2, 4, -1).$$

Method B (treat the surface as a contour of a function of three variables). The equation of the surface can be written as $x^2 + y^2 - z = 0$, so if we define a function $f(x, y, z) = x^2 + y^2 - z$ we can say the surface is the contour of the function $f$ given by $f(x, y, z) = 0 = $ constant. We know, from Property 2 above, that $\nabla f$ is perpendicular to the surface $f = $ constant:

$$\nabla f = (2x, 2y, -1) = (2, 4, -1) \text{ at the point } A(1, 2, 5).$$

So, as before, a unit vector perpendicular to the surface is

$$\hat{\mathbf{n}} = \frac{1}{\sqrt{2^2 + 4^2 + 1^2}}(2, 4, -1) = \frac{1}{\sqrt{21}}(2, 4, -1).$$

4.4 Stationary (Critical) Points of a Function of Two Variables

4.4.1 Definition

For a function of two variables, $f(x, y)$, a stationary point $(x^*, y^*)$ is defined to be a point at which the gradient vector is zero:

$$\nabla f\big|_{(x^*, y^*)} = \left(f_x(x^*, y^*),\; f_y(x^*, y^*)\right) = (0, 0), \qquad (4.17)$$

i.e. both of the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ are zero at that point.

The value $z^* = f(x^*, y^*)$ of $f$ at $(x^*, y^*)$ is the corresponding stationary value (SV).

4.4.2 Classification of SPs of a Function of Two Variables

There are three main types of stationary point for a function of two variables: maximum, minimum and saddle points. These are sketched as follows.

Maximum: A local peak in the function. To get a peak, we must ensure that when the point $(x, y)$ moves away from $(x^*, y^*)$ a small distance in any direction, the value of $f(x, y)$ always decreases.

Minimum: A local trough in the function. To get a trough, we must ensure that when the point $(x, y)$ moves away from $(x^*, y^*)$ a small distance in any direction, the value of $f(x, y)$ always increases.

Saddle point: Looks like a horse's saddle. Moving off in some directions away from $(x^*, y^*)$ leads to an increase in $f$, while moving off in other directions leads to a decrease in $f$.

Contours: We can represent the landscape of the surface $z = f(x, y)$ by contour lines, which are curves in the $(x, y)$ plane on which $f(x, y)$ takes different constant values.

Around a maximum, the value of $f(x, y)$ is always smaller than its value $z^*$ at the maximum. The contours are closed loops around the stationary point.

Around a minimum, $f(x, y) > z^*$ and again the contours are closed loops around the stationary point.

The representation of a saddle point by contour lines has the characteristic appearance depicted below. At the level of the saddle there are two contour lines which cross at the saddle. These two crossing contour lines separate two regions in which $f > z^*$ from two regions in which $f < z^*$. Thus, as we move away from the saddle in different directions, there are two pairs of opposite directions in which $f$ stays fixed (along the crossing contour lines), and these directions separate two opposite ranges of direction in which $f$ increases from two opposite ranges of direction in which $f$ decreases.


To investigate what type a given stationary point is, we must look at what values $f$ takes close to this point. Consider the Taylor expansion (from appendix F) of $f(x, y)$ about a point $(x^*, y^*)$:

$$f(x, y) = f(x^*, y^*) + (x - x^*)f_x(x^*, y^*) + (y - y^*)f_y(x^*, y^*) + \tfrac{1}{2}(x - x^*)^2 f_{xx}(x^*, y^*) + (x - x^*)(y - y^*)f_{xy}(x^*, y^*) + \tfrac{1}{2}(y - y^*)^2 f_{yy}(x^*, y^*) + \text{higher order terms}$$

(NB this matches the first and second derivatives of $f(x, y)$ at the point $(x^*, y^*)$).

Suppose that $(x^*, y^*)$ is a stationary point. Then $f_x(x^*, y^*) = f_y(x^*, y^*) = 0$, and if we label the values

$$f(x^*, y^*) = z^*, \quad x - x^* = \delta x, \quad y - y^* = \delta y, \quad f_{xx}(x^*, y^*) = A, \quad f_{xy}(x^*, y^*) = B, \quad f_{yy}(x^*, y^*) = C, \qquad (4.18)$$

we can rewrite the Taylor series in the form

$$f(x, y) = z^* + \tfrac{1}{2}\left[A\,\delta x^2 + 2B\,\delta x\,\delta y + C\,\delta y^2\right] + \text{higher order terms}, \qquad (4.19)$$

where it is convenient to write

$$Q(\delta x, \delta y) = A\,\delta x^2 + 2B\,\delta x\,\delta y + C\,\delta y^2 \qquad (4.20)$$

for the quadratic expression in the square brackets.

Let's look at the values of $Q$ around a circle surrounding the stationary point, i.e. let

$$\delta x = \delta s\cos\theta \quad \text{and} \quad \delta y = \delta s\sin\theta,$$

where $\theta$ is an angle we can vary. Note that:

• For a minimum, $Q$ will always be positive ($f > z^*$).
• For a maximum, $Q$ will always be negative ($f < z^*$).
• For a saddle, $Q$ will change sign around the circle.

Substituting in, we get

$$Q(\delta s\cos\theta, \delta s\sin\theta) = \delta s^2\left[A\cos^2\theta + 2B\cos\theta\sin\theta + C\sin^2\theta\right] = \delta s^2\left[\tfrac{1}{2}A(1 + \cos 2\theta) + B\sin 2\theta + \tfrac{1}{2}C(1 - \cos 2\theta)\right].$$

After a few more trig identities (see Appendix G) we get

$$Q(\delta s\cos\theta, \delta s\sin\theta) = \delta s^2\left[\tfrac{1}{2}(A + C) + R\cos(2\theta - \phi)\right],$$

where $R > 0$ and

$$R^2 = \tfrac{1}{4}(A + C)^2 + (B^2 - AC),$$

and the angle $\phi$ is such that

$$\tfrac{1}{2}(A - C) = R\cos\phi \quad \text{and} \quad B = R\sin\phi.$$

Consider

$$\frac{Q}{\delta s^2} = \tfrac{1}{2}(A + C) + R\cos(2\theta - \phi).$$

As $\theta$ varies, this oscillates with amplitude $R$ about an average value of $\tfrac{1}{2}(A + C)$. Hence, if

$$R > \tfrac{1}{2}|A + C|$$

then the oscillations are large enough to change the sign of $Q$ as $\theta$ is varied, giving a saddle point. This condition simplifies to

$$R^2 > \tfrac{1}{4}(A + C)^2, \quad \text{i.e.} \quad AC - B^2 < 0.$$

Hence, the condition for a saddle point is:

$$f_{xx}f_{yy} - f_{xy}^2 < 0 \qquad \text{Condition for a saddle point.}$$

If it is not a saddle point, then it is a maximum or a minimum. We can determine which by looking at the sign of $f_{xx}$ (or $f_{yy}$). Appendix G gives more details. Hence:

$$f_{xx}f_{yy} - f_{xy}^2 > 0, \quad f_{xx} > 0 \qquad \text{Condition for a minimum.}$$
$$f_{xx}f_{yy} - f_{xy}^2 > 0, \quad f_{xx} < 0 \qquad \text{Condition for a maximum.}$$

Note that in BOTH cases the function $f_{xx}f_{yy} - f_{xy}^2$ must be POSITIVE at the SP.

Example: Locate and classify the stationary points of the function $f(x, y) = 12x^3 + y^3 + 12x^2y - 75y$.

$$f_x = 36x^2 + 24xy = 12x(3x + 2y), \qquad f_y = 3y^2 + 12x^2 - 75 = 3(4x^2 + y^2 - 25).$$

SPs are given by $f_x = 0$, $f_y = 0$. For $f_x = 0$ we have $x(3x + 2y) = 0$, so EITHER $x = 0$ OR $3x + 2y = 0$, i.e. $y = -\tfrac{3}{2}x$.

If $x = 0$ then $f_y = 3(y^2 - 25) = 0 \Rightarrow y = \pm 5$.

If $y = -\tfrac{3}{2}x$ then

$$f_y = 3\left(4x^2 + y^2 - 25\right) = 3\left(4x^2 + \tfrac{9}{4}x^2 - 25\right) = 3\left(\tfrac{25}{4}x^2 - 25\right) = \tfrac{75}{4}\left(x^2 - 4\right),$$

and so $x = \pm 2$, $y = -\tfrac{3}{2}x = \mp 3$.

So there are 4 SPs, $(0, 5)$, $(0, -5)$, $(2, -3)$ and $(-2, 3)$, with respective SVs $-250$, $250$, $150$ and $-150$.

The 2nd order partial derivatives are

$$f_{xx} = 72x + 24y = 24(3x + y), \qquad f_{xy} = 24x, \qquad f_{yy} = 6y.$$

At $(0, 5)$: $f_{xx} = 120 > 0$, $f_{xy} = 0$, $f_{yy} = 30$, $H = f_{xx}f_{yy} - f_{xy}^2 = 3600 > 0$, so this SP is a minimum.

At $(0, -5)$: $f_{xx} = -120 < 0$, $f_{xy} = 0$, $f_{yy} = -30$, $H = 3600 > 0$, so this SP is a maximum.

At $(2, -3)$: $f_{xx} = 72$, $f_{xy} = 48$, $f_{yy} = -18$, $H = -72 \times 18 - 48^2 < 0$, so this SP is a saddle point.

At $(-2, 3)$: $f_{xx} = -72$, $f_{xy} = -48$, $f_{yy} = 18$, $H = -72 \times 18 - 48^2 < 0$, so this SP is a saddle point.

[Figure: sketch of the contours.] For the connectivity, it helps to note the stationary values.

4.4.3 Definition: Hessian

The function $f_{xx}f_{yy} - f_{xy}^2$ is called the Hessian $H(x, y)$ of $f$. It may be written as a $2 \times 2$ determinant:

$$H(x, y) = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}.$$
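The second-derivative test used in the worked example can be written as a small routine. This is a sketch under stated assumptions: the function names are illustrative, the second derivatives are the ones computed by hand above, and the case $H = 0$ is simply reported as inconclusive because the test says nothing there.

```python
def classify(fxx, fxy, fyy):
    """Second-derivative test at a stationary point of a function of two
    variables, given the three second partial derivatives there."""
    hessian = fxx * fyy - fxy**2
    if hessian < 0:
        return "saddle"
    if hessian > 0:
        return "minimum" if fxx > 0 else "maximum"
    return "inconclusive"  # H = 0: the test gives no answer

def second_derivatives(x, y):
    # For f(x, y) = 12x^3 + y^3 + 12x^2 y - 75y (from the example).
    return 72 * x + 24 * y, 24 * x, 6 * y

stationary_points = [(0, 5), (0, -5), (2, -3), (-2, 3)]
results = {p: classify(*second_derivatives(*p)) for p in stationary_points}
for p, kind in results.items():
    print(p, kind)
```

Checking the sign of $H$ before looking at $f_{xx}$ mirrors the order of the argument in the text: a negative $H$ settles the question on its own.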

4.4.4 Definition: Degenerate stationary point

A stationary point $(x^*, y^*)$ at which $H^* = H(x^*, y^*) = 0$ is said to be degenerate. Such stationary points will be excluded from this course. They require further investigation, involving cubic or higher order terms in the Taylor expansion.

4.5 Lagrange Multipliers

4.5.1 Introductory example

Suppose we want to find the area of the smallest circle centred on the origin which touches the line $y = -3x + 4$.

The diagram shows three candidate circles: the smallest is too small, as it fails to touch the line; the largest is too large (we can do better); the ideal circle just touches the line in one place. Note that this means the line is the tangent line to the circle at that point, and the normal to the circle is also normal to the line.

Now we can write the question as: minimise $f(x, y) = \pi(x^2 + y^2)$ such that $g(x, y) = y + 3x - 4 = 0$. Each candidate circle is a contour $f(x, y) = $ constant and the line is of the form $g(x, y) = $ constant, so to make the two normals parallel we put

$$\nabla f = \lambda\nabla g$$

and we retain the constraint $g(x, y) = 0$. This procedure locates a stationary value of $f$ subject to the constraint $g = 0$, which is a candidate for a maximum or minimum. The quantity $\lambda$ is called a Lagrange multiplier.

In this case we have $\nabla f = (2\pi x, 2\pi y)$ and $\nabla g = (3, 1)$, so we put

$$2\pi x = 3\lambda \quad \text{and} \quad 2\pi y = \lambda,$$

and the constraint $g(x, y) = 0$ gives

$$\frac{\lambda}{2\pi} + \frac{9\lambda}{2\pi} - 4 = 0 \;\Rightarrow\; \lambda = \frac{4\pi}{5},$$

and so $x = 6/5$, $y = 2/5$. Hence the area of the circle is $\pi(36/25 + 4/25) = \pi(40/25) = 8\pi/5$.
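The circle example can be cross-checked by brute force. The sketch below first reproduces the Lagrange solution from the two gradient equations and the constraint, then scans points along the line $y = -3x + 4$ for the smallest area. The scan range and the grid spacing are arbitrary assumptions, so the scan is only accurate to about the grid spacing.

```python
import math

def area(x, y):
    # Area of the circle centred on the origin that passes through (x, y).
    return math.pi * (x * x + y * y)

# Lagrange conditions: 2*pi*x = 3*lam and 2*pi*y = lam, with 3x + y = 4.
# Substituting x and y into the constraint gives 10*lam / (2*pi) = 4.
lam = 4 / (10 / (2 * math.pi))
x_star = 3 * lam / (2 * math.pi)
y_star = lam / (2 * math.pi)

# Brute-force comparison: sample x on [-3, 3] in steps of 1e-4 along the line.
samples = [-3 + 6 * i / 60000 for i in range(60001)]
x_best = min(samples, key=lambda x: area(x, -3 * x + 4))

print(lam, (x_star, y_star), area(x_star, y_star))
print(x_best, area(x_best, -3 * x_best + 4))
```

The scan agreeing with the Lagrange point also confirms that this stationary value is a minimum rather than a maximum, which the multiplier equations alone do not establish.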

4.5.2 General principle

In general, to minimise or maximise $f(x_1, \ldots, x_n)$ subject to a constraint $g(x_1, \ldots, x_n) = $ constant, we set

$$\nabla f = \lambda\nabla g,$$

where the unknown $\lambda$ is called the Lagrange multiplier. The equation $\nabla f = \lambda\nabla g$ and the constraint $g = $ constant give $n + 1$ equations altogether, enough in principle to solve for the $n + 1$ unknowns $x_1, \ldots, x_n$ and $\lambda$.

Why does this work? With reference to the picture above, we can make the following two comments:

1. The smallest value of $f$ on the contour $g = $ constant is where this contour just touches a contour $f = $ constant. For contours to just touch, the perpendiculars to the contours must be parallel, so $\nabla f = \lambda\nabla g$.

2. To minimise $f$ along the contour $g = $ constant we require only that the component of $\nabla f$ along the contour is zero; we are only interested in changes of $f$ along the contour. So $\nabla f$ is allowed to have a component perpendicular to the contour, and we can set $\nabla f = \lambda\nabla g$ for some unknown $\lambda$.

Example 1. Minimise $f(x, y, z) = x^2 + y^2 + z^2$ subject to the constraint $x - 2y + z = 3$.

We write the constraint condition as $g(x, y, z) = x - 2y + z = 3$, so we can calculate

$$\nabla f = (2x, 2y, 2z), \qquad \nabla g = (1, -2, 1),$$

and to have $\nabla f = \lambda\nabla g$ requires:

$$2x = \lambda; \qquad 2y = -2\lambda; \qquad 2z = \lambda.$$

We substitute these into the constraint condition:

$$0 = x - 2y + z - 3 = \tfrac{1}{2}\lambda + 2\lambda + \tfrac{1}{2}\lambda - 3 = 3\lambda - 3 \;\Rightarrow\; \lambda = 1,$$

to determine the point $(x, y, z) = \left(\tfrac{1}{2}, -1, \tfrac{1}{2}\right)$, at which $f(x, y, z) = \left(\tfrac{1}{2}\right)^2 + 1^2 + \left(\tfrac{1}{2}\right)^2 = 3/2$.

Warning: This procedure does not tell us the difference between a minimum and a maximum of $f$, so you may need to check some other values to verify you have the correct solution. In this case, taking $(3, 0, 0)$ (which satisfies the constraint) gives $f = 9$, which is greater than the value at our stationary point, so we have found a minimum.

Example 2. Find the maximum area of a rectangle with perimeter $P$.

We need to maximise the area $A = xy$ subject to the constraint $g(x, y) = 2x + 2y = P$. So $\nabla A = \lambda\nabla g$ gives

$$y = 2\lambda, \qquad x = 2\lambda,$$

so $y = x$. Substituting into the constraint, we have

$$4x = P \;\Rightarrow\; x = \frac{P}{4} = y.$$

Hence $A = P^2/16$. This is a maximum, as can be seen by e.g. choosing $x = P/6$ and $y = P/3$ (which satisfies the constraint), which gives the smaller value $A = P^2/18$.

Extension. If there is more than one constraint, e.g. $g = 0$, $h = 0$, then we use more than one Lagrange multiplier: solve $\nabla f = \lambda\nabla g + \mu\nabla h$ for $x$, $y$, $z$, $\lambda$ and $\mu$ subject to $g = 0$, $h = 0$.

4.6 The Chain Rule

We have seen (section 4.1.3) the chain rule for $f\{g(x_1, \ldots, x_n)\}$. Consider $f(x, y)$ where $x$ and $y$ are functions of another variable $t$ (i.e. $x(t)$ and $y(t)$). If $t$ increases by a small amount $\Delta t$ then $x$ increases by $\Delta x \approx \Delta t\,\dfrac{dx}{dt}$ and $y$ increases by $\Delta y \approx \Delta t\,\dfrac{dy}{dt}$, so the change in position is

$$\Delta\mathbf{r} = (\Delta x, \Delta y) \approx \Delta t\left(\frac{dx}{dt}, \frac{dy}{dt}\right) = \Delta t\,\frac{d\mathbf{r}}{dt},$$

where

$$\frac{d\mathbf{r}}{dt} = \left(\frac{dx}{dt}, \frac{dy}{dt}\right)$$

is the rate of change of position with $t$.

Now, recall from our work on directional derivatives: the change in $f$ for this small change in position is

$$\Delta f \approx \nabla f \cdot \Delta\mathbf{r} \approx \nabla f \cdot \frac{d\mathbf{r}}{dt}\,\Delta t.$$

If we rearrange and let $\Delta t \to 0$ we obtain the chain rule for a function of two variables, which is

$$\frac{df}{dt} = \nabla f \cdot \frac{d\mathbf{r}}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \qquad (4.21)$$

NB $f$ depends on $x$ and $y$ [so partial derivatives $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$] whilst $x$ and $y$ depend on just $t$ [so ordinary derivatives $\dfrac{dx}{dt}$, $\dfrac{dy}{dt}$]. Thus $f$ depends on $t$ and has the ordinary derivative $\dfrac{df}{dt}$ given by the chain rule (4.21).

Example: If $f(x, y) = x^2 + y^2$, where $x = \sin t$, $y = t^3$, then

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = 2x\cos t + 2y \cdot 3t^2 = 2\sin t\cos t + 6t^5.$$

Of course in this simple example we can check the result by substituting for $x$ and $y$ before differentiation to give $f(t) = (\sin t)^2 + (t^3)^2$, so $\dfrac{df}{dt} = 2\sin t\cos t + 6t^5$ as before.

The chain rule extends directly to functions of three or more variables, and to include implicit differentiation.

Example: If $f(x, y, z) = \ln(2x - 3y + 4z)$, where $x = e^t$, $y = \ln t$, $z = \cosh t$, then

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = \frac{2e^t}{2x - 3y + 4z} - \frac{3(1/t)}{2x - 3y + 4z} + \frac{4\sinh t}{2x - 3y + 4z} = \frac{2e^t - 3/t + 4\sinh t}{2e^t - 3\ln t + 4\cosh t}.$$

4.6.1 Extended chain rule

For $f(x, y)$ suppose that $x$ and $y$ depend on two variables $s$ and $t$ (e.g. polar co-ordinates, $x = s\cos t$, $y = s\sin t$). Then

$$\frac{\partial\mathbf{r}}{\partial s} = \left(\frac{\partial x}{\partial s}, \frac{\partial y}{\partial s}\right) \quad \text{and} \quad \frac{\partial\mathbf{r}}{\partial t} = \left(\frac{\partial x}{\partial t}, \frac{\partial y}{\partial t}\right) \qquad (4.22)$$

are two vectors representing the rate of change of position with $s$ and $t$ respectively.

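Changing either $s$ or $t$ changes $x$ and $y$, and so changes $f$, i.e. producing $\dfrac{\partial f}{\partial s}$ and $\dfrac{\partial f}{\partial t}$ according to the extended chain rule

$$\frac{\partial f}{\partial s} = \nabla f \cdot \frac{\partial\mathbf{r}}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}, \qquad (4.23)$$
$$\frac{\partial f}{\partial t} = \nabla f \cdot \frac{\partial\mathbf{r}}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}. \qquad (4.24)$$

Example: $f(x, y) = x^2y^3$, where $x = s - t^2$, $y = s + 2t$. Then

$$\frac{\partial f}{\partial x} = 2xy^3 \quad \text{and} \quad \frac{\partial f}{\partial y} = 3x^2y^2,$$

and

$$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} = 2xy^3 \cdot 1 + 3x^2y^2 \cdot 1 = xy^2(2y + 3x) = (s - t^2)(s + 2t)^2(5s + 4t - 3t^2),$$
$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t} = 2xy^3(-2t) + 3x^2y^2(2) = 2xy^2(-2ty + 3x) = 2(s - t^2)(s + 2t)^2(3s - 2st - 7t^2).$$

Example:
(i) If $f$ is a function of $x$ and $y$, where $x = e^s\cos t$, $y = e^s\sin t$, prove that $\sin t\,\dfrac{\partial f}{\partial s} + \cos t\,\dfrac{\partial f}{\partial t} = e^s\dfrac{\partial f}{\partial y}$.
(ii) If $f$ is a function of $z/x$ and $x/y$, prove that $x\dfrac{\partial f}{\partial x} + y\dfrac{\partial f}{\partial y} + z\dfrac{\partial f}{\partial z} = 0$.

(i) If $x = e^s\cos t$ and $y = e^s\sin t$ then

$$\frac{\partial x}{\partial s} = e^s\cos t, \qquad \frac{\partial x}{\partial t} = -e^s\sin t, \qquad \frac{\partial y}{\partial s} = e^s\sin t, \qquad \frac{\partial y}{\partial t} = e^s\cos t.$$

It follows that

$$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}e^s\cos t + \frac{\partial f}{\partial y}e^s\sin t, \qquad \frac{\partial f}{\partial t} = -\frac{\partial f}{\partial x}e^s\sin t + \frac{\partial f}{\partial y}e^s\cos t.$$

Combining these two equations we have

$$\sin t\,\frac{\partial f}{\partial s} + \cos t\,\frac{\partial f}{\partial t} = e^s\frac{\partial f}{\partial y}.$$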
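The identity in part (i) holds for any differentiable $f$, so it can be spot-checked with a concrete choice. In the sketch below the test function $f(x, y) = x^2y + \sin y$, the sample values of $s$ and $t$, and the helper `diff` are all assumptions made for illustration; the derivatives with respect to $s$ and $t$ are estimated by central differences, while $\partial f/\partial y$ is entered by hand.

```python
import math

def f(x, y):
    # Hypothetical test function; any smooth f(x, y) would serve.
    return x * x * y + math.sin(y)

def f_y(x, y):
    # Its partial derivative with respect to y, computed by hand.
    return x * x + math.cos(y)

def composite(s, t):
    # f expressed through s and t via x = e^s cos t, y = e^s sin t.
    return f(math.exp(s) * math.cos(t), math.exp(s) * math.sin(t))

def diff(g, a, h=1e-6):
    # Central-difference derivative of a one-variable function g at a.
    return (g(a + h) - g(a - h)) / (2 * h)

s0, t0 = 0.3, 1.1  # arbitrary sample values
df_ds = diff(lambda s: composite(s, t0), s0)
df_dt = diff(lambda t: composite(s0, t), t0)

lhs = math.sin(t0) * df_ds + math.cos(t0) * df_dt
x0, y0 = math.exp(s0) * math.cos(t0), math.exp(s0) * math.sin(t0)
rhs = math.exp(s0) * f_y(x0, y0)
print(lhs, rhs)
```

Agreement at one point is not a proof, but a mismatch would reveal a sign error in the derivatives of $x$ and $y$ straight away.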

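(ii) Let $u = z/x$ and $v = x/y$. Then

$$\frac{\partial u}{\partial x} = -\frac{z}{x^2}, \quad \frac{\partial u}{\partial y} = 0, \quad \frac{\partial u}{\partial z} = \frac{1}{x}, \quad \frac{\partial v}{\partial x} = \frac{1}{y}, \quad \frac{\partial v}{\partial y} = -\frac{x}{y^2} \quad \text{and} \quad \frac{\partial v}{\partial z} = 0,$$

and so

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = -\frac{z}{x^2}\frac{\partial f}{\partial u} + \frac{1}{y}\frac{\partial f}{\partial v},$$
$$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y} = -\frac{x}{y^2}\frac{\partial f}{\partial v},$$
$$\frac{\partial f}{\partial z} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial z} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial z} = \frac{1}{x}\frac{\partial f}{\partial u}.$$

Multiplying these by $x$, $y$ and $z$ respectively and adding, the terms cancel in pairs, so

$$x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} + z\frac{\partial f}{\partial z} = 0$$

as required.

4.6.2 Definition of the Jacobian

We could write the extended chain rule (4.23), (4.24) in matrix-vector form as follows:

$$\begin{pmatrix} \dfrac{\partial f}{\partial s} \\[2mm] \dfrac{\partial f}{\partial t} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial y}{\partial s} \\[2mm] \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \end{pmatrix} \begin{pmatrix} \dfrac{\partial f}{\partial x} \\[2mm] \dfrac{\partial f}{\partial y} \end{pmatrix}, \qquad (4.25)$$

which leads us naturally to the Jacobian matrix

$$J = \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial y}{\partial s} \\[2mm] \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \end{pmatrix}, \qquad (4.26)$$

whose determinant is the Jacobian of the transformation from $x, y$ to $s, t$:

$$\frac{\partial(x, y)}{\partial(s, t)} = \begin{vmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial y}{\partial s} \\[2mm] \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \end{vmatrix} = \frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial x}{\partial t}\frac{\partial y}{\partial s}. \qquad (4.27)$$

NB The rows of the matrix $J$ are the vectors $\dfrac{\partial\mathbf{r}}{\partial s} = \left(\dfrac{\partial x}{\partial s}, \dfrac{\partial y}{\partial s}\right)$ and $\dfrac{\partial\mathbf{r}}{\partial t} = \left(\dfrac{\partial x}{\partial t}, \dfrac{\partial y}{\partial t}\right)$ expressing the rate of change of position $(x, y)$ with $s$ and $t$ respectively. Geometrically, the magnitude of the Jacobian is

$$\left|\frac{\partial(x, y)}{\partial(s, t)}\right| = \left|\frac{\partial\mathbf{r}}{\partial s} \times \frac{\partial\mathbf{r}}{\partial t}\right| = \left|\frac{\partial\mathbf{r}}{\partial s}\right|\left|\frac{\partial\mathbf{r}}{\partial t}\right|\sin\theta,$$

where $\theta$ is the angle between $\dfrac{\partial\mathbf{r}}{\partial s}$ and $\dfrac{\partial\mathbf{r}}{\partial t}$. Hence the magnitude of the Jacobian is the area of the parallelogram whose sides are $\dfrac{\partial\mathbf{r}}{\partial s}$ and $\dfrac{\partial\mathbf{r}}{\partial t}$.

4.6.3 Change of Variables

Suppose we have $x$ and $y$ expressed in terms of two other variables $s$ and $t$. How would we go about finding expressions for $s$ and $t$ in terms of $x$ and $y$? Is this always possible?

Carrying out a change of variables

The procedure for reversing a change of variables is to look for combinations of $x$ and $y$ which eliminate all dependence on one of $s$ and $t$. This is best shown by example.

Example: If $x = s^2t$ and $y = t^2/s$, find $s$ and $t$ in terms of $x$ and $y$.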

We begin by looking for a combination of $x$ and $y$ that has no $t$-dependence. From the first equation, we can write $t = x/s^2$, so we substitute this into the second equation:

$$y = (x/s^2)^2/s = x^2/s^5,$$

and manipulate this result to give $s$:

$$s^5 = x^2/y \;\Rightarrow\; s = x^{2/5}y^{-1/5}.$$

We can then substitute this into either of the definitions to eliminate $s$; we choose the second:

$$y = t^2x^{-2/5}y^{1/5} \;\Rightarrow\; t = x^{1/5}y^{2/5},$$

so the full solution is $s = x^{2/5}y^{-1/5}$ and $t = x^{1/5}y^{2/5}$.

Example: If $x = s\cos t$ and $y = s\sin t$, find $s$ and $t$ in terms of $x$ and $y$.

Here we start by eliminating $t$. The simplest way to do this is to use the identity $\sin^2 t + \cos^2 t = 1$:

$$x^2 + y^2 = s^2\cos^2 t + s^2\sin^2 t = s^2 \;\Rightarrow\; s = (x^2 + y^2)^{1/2},$$

and to eliminate $s$ we simply divide the two expressions:

$$y/x = s\sin t/(s\cos t) = \tan t \;\Rightarrow\; t = \tan^{-1}(y/x).$$

Example: If $x = s^2t$ and $y = s^4t^2 + 2s^2t + 4$, express $s$ and $t$ in terms of $x$ and $y$.

We try to eliminate $t$ by using the $x$-equation,

$$x = s^2t \;\Rightarrow\; t = xs^{-2},$$

in the $y$-equation:

$$y = s^4t^2 + 2s^2t + 4 = s^4[xs^{-2}]^2 + 2s^2[xs^{-2}] + 4 = x^2 + 2x + 4,$$

and we find that we have no $s$-dependence in this equation, so we can't rearrange to find $s$. In this case it is not possible to determine $s$ and $t$ from values of $x$ and $y$. Since $y$ can be written in terms of $x$ only, $y$ and $x$ are not independent.

Jacobian and change of variables

In the example above, we could not find $s$ and $t$ from $x$ and $y$. The critical quantity here is the Jacobian:

$$\frac{\partial(x, y)}{\partial(s, t)} = \frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial x}{\partial t}\frac{\partial y}{\partial s} = (2st)(2s^4t + 2s^2) - (s^2)(4s^3t^2 + 4st) = 4s^5t^2 + 4s^3t - 4s^5t^2 - 4s^3t = 0.$$

In general, it is only possible to change variables and change back again if the Jacobian of the transformation is not zero.

When the Jacobian is zero, the area of the parallelogram whose sides are $\dfrac{\partial\mathbf{r}}{\partial s}$ and $\dfrac{\partial\mathbf{r}}{\partial t}$ is zero, which means $\dfrac{\partial\mathbf{r}}{\partial s}$ and $\dfrac{\partial\mathbf{r}}{\partial t}$ are parallel. Changes in either $s$ or $t$ give changes in the position $(x, y)$ in the same direction, so $y$ and $x$ are not independent, and $s$ and $t$ cannot be uniquely determined from $x$ and $y$.
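The Jacobian test can be illustrated numerically. The sketch below estimates the determinant (4.27) by central differences for two of the transformations above: the polar change of variables, for which the determinant works out to $s$, and the last example, where it vanishes. The function name `jacobian`, the step `h` and the sample points are illustrative assumptions.

```python
import math

def jacobian(x, y, s, t, h=1e-6):
    """Central-difference estimate of d(x, y)/d(s, t) = x_s*y_t - x_t*y_s
    for a transformation given by two functions x(s, t) and y(s, t)."""
    x_s = (x(s + h, t) - x(s - h, t)) / (2 * h)
    x_t = (x(s, t + h) - x(s, t - h)) / (2 * h)
    y_s = (y(s + h, t) - y(s - h, t)) / (2 * h)
    y_t = (y(s, t + h) - y(s, t - h)) / (2 * h)
    return x_s * y_t - x_t * y_s

# Polar-type transformation x = s cos t, y = s sin t (invertible for s > 0).
polar = jacobian(lambda s, t: s * math.cos(t),
                 lambda s, t: s * math.sin(t), 2.0, 0.7)

# Last example: x = s^2 t, y = s^4 t^2 + 2 s^2 t + 4, where y depends on x only.
degenerate = jacobian(lambda s, t: s * s * t,
                      lambda s, t: s**4 * t * t + 2 * s * s * t + 4, 1.3, 0.4)

print(polar)       # close to s = 2.0
print(degenerate)  # close to zero
```

A determinant that is zero only up to rounding error, as in the second case, signals that the transformation cannot be inverted near that point.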