Introduction to the Calculus of Variations

Jim Fischer
March 20, 1999

Abstract

This is a self-contained paper which introduces a fundamental problem in the calculus of variations, the problem of finding extreme values of functionals. The reader should have a solid background in one-variable calculus.

Contents

1 Introduction
2 Partial Derivatives
3 The Chain Rule
4 Statement of the Problem
5 The Euler-Lagrange Equation
6 The Brachistochrone Problem
7 Concluding Remarks

1 Introduction

We begin with an introduction to partial differentiation of functions of several variables. After partial derivatives are introduced we discuss some forms of the chain rule. In Section 4 we formulate the fundamental problem. In Section 5 we state and prove the fundamental result (the Euler-Lagrange equation). We conclude the paper with a solution of one of the most famous problems from the calculus of variations, the brachistochrone problem.

2 Partial Derivatives

Given a function of one variable, say $f(x)$, we define the derivative of $f(x)$ at $x = a$ to be

\[
f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h},
\]

provided this limit exists. For a function of several variables the total derivative is not as easy to define. However, if we set all but one of the independent variables equal to constants, we can define the partial derivative of the function by using a limit similar to the one above. For example, if $f$ is a function of the variables $x$, $y$ and $z$, we can set $x = a$ and $y = b$ and define the partial derivative of $f$ with respect to $z$ to be

\[
\frac{\partial f}{\partial z}(a, b, z) = \lim_{h \to 0} \frac{f(a, b, z + h) - f(a, b, z)}{h},
\]

wherever this limit exists. Since $a$ and $b$ are arbitrary, the partial derivative is a function of all three variables $x$, $y$ and $z$. The partial derivative of $f$ with respect to $z$ gives the instantaneous rate of change of $f$ in the $z$ direction. The partial derivatives of $f$ with respect to $x$ and $y$ are defined in a similar way. Computing partial derivatives is no harder than computing ordinary one-variable derivatives: one simply treats the fixed variables as constants.

Example 1. Suppose $f(x, y, z) = x^2 y^2 z^2 + y\cos(z)$. Then

\[
\begin{aligned}
\frac{\partial f}{\partial x}(x, y, z) &= 2xy^2z^2, \\
\frac{\partial f}{\partial y}(x, y, z) &= 2x^2yz^2 + \cos(z), \\
\frac{\partial f}{\partial z}(x, y, z) &= 2x^2y^2z - y\sin(z).
\end{aligned}
\]

We can take higher order partial derivatives by continuing in the same manner. In the example above, first taking the partial derivative of $f$ with respect to $y$ and then with respect to $z$ yields

\[
\frac{\partial^2 f}{\partial z\,\partial y} = 4x^2yz - \sin(z).
\]

Such a derivative is called a mixed partial derivative of $f$. From a well known theorem of advanced calculus, if the second order partial derivatives of $f$ exist in a neighborhood of the point $(a, b, c)$ and are continuous at $(a, b, c)$, then the mixed partial derivatives do not depend on the order in which they are taken. That is, for example,

\[
\frac{\partial^2 f}{\partial z\,\partial y}(a, b, c) = \frac{\partial^2 f}{\partial y\,\partial z}(a, b, c).
\]

This result was first proved by Leonhard Euler in 1734.

Exercise 1. Let $f(x, y, z) = \frac{1 + y^2}{z^2}$. Compute all three partial derivatives of $f$.

Exercise 2. For $f$ in Example 1, compute a couple of mixed partial derivatives and verify that the order in which you differentiate does not matter.

3 The Chain Rule

We begin with a review of the chain rule for functions of one variable. Suppose $f(x)$ is a differentiable function of $x$ and $x = x(t)$ is a differentiable function of $t$. By the chain rule theorem, the composite function $z(t) = f(x(t))$ is a differentiable function of $t$ and

\[
\frac{dz}{dt} = \frac{dz}{dx}\,\frac{dx}{dt}. \tag{3.1}
\]

For example, if $f(x) = \sin(x)$ and $x(t) = t^2$, then the derivative with respect to $t$ of $z = \sin(t^2)$ is given by $\cos(t^2) \cdot 2t$.

It turns out that there is a chain rule for functions of several variables. For example, suppose $x$ and $y$ are functions of $t$ and consider the function $z = [x(t)]^2 + 3[y(t)]^3$. We can think of $z$ as the composite of the function $f(x, y) = x^2 + 3y^3$ with the functions $x(t)$ and $y(t)$. By a chain rule theorem for functions of several variables,

\[
\frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt}. \tag{3.2}
\]

Note the similarity between (3.1) and (3.2). For functions of several variables, one needs to keep track of each of the independent variables separately, applying a chain rule to each. The hypotheses of the chain rule theorem require the function $z = f(x, y)$ to have continuous partial derivatives and $x(t)$ and $y(t)$ to be differentiable.
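The closed forms in Example 1 and the chain rule (3.2) can be spot-checked numerically. The sketch below is only an illustration: the helper names `partial` and `mixed_partial`, the evaluation point, and the choices $x(t) = \cos t$ and $y(t) = t^2$ are assumptions made here, not part of the original text. It approximates each derivative with central differences and compares the result with the formulas above.

```python
import math

def f(x, y, z):
    # The function from Example 1.
    return x**2 * y**2 * z**2 + y * math.cos(z)

def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func with
    respect to the variable at position index, evaluated at point."""
    forward = list(point)
    backward = list(point)
    forward[index] += h
    backward[index] -= h
    return (func(*forward) - func(*backward)) / (2 * h)

def mixed_partial(func, point, first, second, h=1e-4):
    """Differentiate with respect to variable `first`, then `second`."""
    inner = lambda *p: partial(func, p, first, h)
    return partial(inner, point, second, h)

# An arbitrary evaluation point (hypothetical choice).
point = (1.5, -0.5, 2.0)
x, y, z = point

# Closed forms from Example 1.
fx = 2 * x * y**2 * z**2
fy = 2 * x**2 * y * z**2 + math.cos(z)
fz = 2 * x**2 * y**2 * z - y * math.sin(z)
fzy = 4 * x**2 * y * z - math.sin(z)

numeric = [partial(f, point, i) for i in range(3)]
zy = mixed_partial(f, point, 1, 2)  # y first, then z
yz = mixed_partial(f, point, 2, 1)  # z first, then y

# Chain rule (3.2) for z = x(t)^2 + 3*y(t)^3 with the illustrative
# choices x(t) = cos(t) and y(t) = t^2 (assumptions, not from the text).
def z_of_t(t):
    return math.cos(t) ** 2 + 3 * (t ** 2) ** 3

t0 = 0.7
x0, y0 = math.cos(t0), t0 ** 2
dz_chain = 2 * x0 * (-math.sin(t0)) + 9 * y0 ** 2 * (2 * t0)
dz_numeric = (z_of_t(t0 + 1e-5) - z_of_t(t0 - 1e-5)) / 2e-5
```

The two mixed estimates `zy` and `yz` agree up to finite-difference error, which is the numerical counterpart of the order-independence theorem quoted above.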

4 Statement of the Problem

We begin with a simple example. Let $P$ and $Q$ be two points in the $xy$-plane and consider the collection of all smooth curves which connect $P$ to $Q$. Let $y(x)$ be such a curve with $P = (a, y(a))$ and $Q = (b, y(b))$. The arc-length of the curve $y(x)$ is given by the integral

\[
\int_a^b \sqrt{1 + [y'(x)]^2}\,dx.
\]

Suppose now that we wish to determine which curve will minimize the above integral. Certainly our knowledge of ordinary geometry suggests that the curve which minimizes the arc-length is the straight line connecting $P$ to $Q$. However, what if instead we were interested in finding which curve minimizes a different integral? For example, consider the integral

\[
\int_a^b \frac{\sqrt{1 + [y'(x)]^2}}{y(x)}\,dx.
\]

It is not obvious what choice of $y(x)$ will result in minimizing this integral. Further, it is not at all obvious that such a minimum exists! One way to proceed is to notice that the above integrals can be viewed as special kinds of functions, functions whose inputs are functions and whose outputs are real numbers. For example we could write

\[
F[y(x)] = \int_a^b \sqrt{1 + [y'(x)]^2}\,dx.
\]

More generally we could write

\[
F[y(x)] = \int_a^b f(x, y(x), y'(x))\,dx.
\]

A function like $F$ is actually called a functional; this name is used to distinguish $F$ from ordinary real-valued functions whose domains consist of ordinary variables. The function $f$ in the integral is to be viewed as an ordinary function of the variables $x$, $y$ and $y'$ (this should become more clear in the next section).¹ One of the fundamental problems with which the calculus of variations is concerned is locating the extrema of functionals.

¹ We don't call $f(x, y, y')$ a functional because its range is not $\mathbb{R}$.
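The idea that a functional takes a whole curve as input and returns a single number can be illustrated with a short numerical sketch. Everything named here is an assumption made for illustration: the helper `functional`, the midpoint rule used for the integral, and the two sample curves joining $P = (0, 0)$ to $Q = (1, 1)$. It evaluates the arc-length functional on a straight line and on a parabola.

```python
import math

def functional(f, y, dy, a, b, n=2000):
    """Approximate F[y] = integral from a to b of f(x, y(x), y'(x)) dx
    using the composite midpoint rule with n subintervals."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x, y(x), dy(x))
    return total * h

def arc_length_integrand(x, y, p):
    # p stands for the slope y'(x); x and y are unused for arc length.
    return math.sqrt(1.0 + p * p)

# Two curves joining P = (0, 0) and Q = (1, 1), each given with its derivative.
line_y, line_dy = (lambda x: x), (lambda x: 1.0)
parab_y, parab_dy = (lambda x: x * x), (lambda x: 2.0 * x)

length_line = functional(arc_length_integrand, line_y, line_dy, 0.0, 1.0)
length_parabola = functional(arc_length_integrand, parab_y, parab_dy, 0.0, 1.0)

# Known closed forms for comparison.
exact_line = math.sqrt(2.0)
exact_parabola = (2.0 * math.sqrt(5.0) + math.asinh(2.0)) / 4.0
```

The straight line gives the smaller value, as geometry suggests. Comparing two curves proves nothing about all curves, though, which is why a general tool is needed.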

Before we formally state the problem, we need to specify the domain of $F$ more precisely. Consider the interval $[a, b] \subset \mathbb{R}$ and define $C^1[a,b]$ to be the set

\[
C^1[a,b] = \{\, y(x) \mid y : [a, b] \to \mathbb{R},\ y \text{ has a continuous first derivative on } [a, b] \,\}.
\]

The set $C^2[a,b]$ is defined analogously, requiring a continuous second derivative. We will consider only functionals which have certain desirable properties. Let $F$ be a functional whose domain is $C^1[a,b]$,

\[
F[y(x)] = \int_a^b f(x, y(x), y'(x))\,dx.
\]

We will require that the function $f$ in the integral have continuous partial derivatives with respect to $x$, $y$ and $y'$. We require the continuity of derivatives because we will need to apply chain rules and the Leibniz rule for differentiation. We now state the fundamental problem.

Problem: Let $F$ be a functional defined on $C^2[a,b]$, with $F[y(x)]$ given by

\[
F[y(x)] = \int_a^b f(x, y(x), y'(x))\,dx.
\]

Suppose the functional $F$ attains a minimum (or maximum) value.² How do we determine the curve $y(x)$ which produces such a minimum (maximum) value for $F$?

In the next section we will show that the minimizing curve $y(x)$ must satisfy a differential equation known as the Euler-Lagrange equation.

5 The Euler-Lagrange Equation

We begin this section with the fundamental result:

Theorem 1. If $y(x)$ is a curve in $C^2[a,b]$ which minimizes the functional

\[
F[y(x)] = \int_a^b f(x, y(x), y'(x))\,dx,
\]

then the following differential equation must be satisfied:

\[
\frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0.
\]

This equation is called the Euler-Lagrange equation.

² This is an important assumption, for there do exist functionals which have no extrema.

Before proving this theorem, we consider an example.

Example 2. If $F[y(x)] = \int_a^b \sqrt{1 + [y'(x)]^2}\,dx$, then $f = \sqrt{1 + [y']^2}$ and the Euler-Lagrange equation is given by:

\[
\begin{aligned}
0 &= \frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) \\
  &= 0 - \frac{d}{dx}\left(\frac{y'(x)}{\sqrt{1 + [y'(x)]^2}}\right) \\
  &= -\,\frac{\sqrt{1 + [y'(x)]^2}\;y''(x) - [y'(x)]^2\,y''(x)\left(1 + [y'(x)]^2\right)^{-1/2}}{1 + [y'(x)]^2} \\
  &= -\,\frac{\left(1 + [y'(x)]^2\right)y''(x) - [y'(x)]^2\,y''(x)}{\left(1 + [y'(x)]^2\right)^{3/2}} \\
  &= -\,\frac{y''(x)}{\left(1 + [y'(x)]^2\right)^{3/2}}.
\end{aligned}
\]

Exercise 3. Show that the solution to

\[
0 = \frac{y''(x)}{\left(1 + [y'(x)]^2\right)^{3/2}}
\]

is a straight line, that is, $y(x) = Ax + B$. Is this a proof that the shortest path between two points is a straight line?

The proof of Theorem 1 relies on three things: the Leibniz rule, integration by parts and Lemma 1. It is assumed that the reader is familiar with integration by parts. We will discuss the Leibniz rule later, and we state and prove Lemma 1 now.

Lemma 1. Let $M(x)$ be a continuous function on the interval $[a, b]$. Suppose that for any continuous function $h(x)$ with $h(a) = h(b) = 0$ we have

\[
\int_a^b M(x)h(x)\,dx = 0.
\]

Then $M(x)$ is identically zero³ on $[a, b]$.

³ If $M$ is only assumed to be integrable rather than continuous, the conclusion weakens to $M$ being zero almost everywhere. This means that the set of $x$ values where the function is not zero has a length of zero.

Proof of Lemma 1: Since $h(x)$ can be any continuous function with $h(a) = h(b) = 0$, we choose $h(x) = -M(x)(x - a)(x - b)$. Clearly $h(x)$ is continuous since $M$ is continuous. Also, $M(x)h(x) \ge 0$ on $[a, b]$ (check this). But if the definite integral of a continuous non-negative function is zero, then the function itself must be zero. So we conclude that

\[
0 = M(x)h(x) = [M(x)]^2\,[-(x - a)(x - b)].
\]

This and the fact that $-(x - a)(x - b) > 0$ on $(a, b)$ imply that $[M(x)]^2 = 0$ on $(a, b)$, and hence on $[a, b]$ by continuity. Finally, $[M(x)]^2 = 0$ on $[a, b]$ implies that $M(x) = 0$ on $[a, b]$.

Proof of Theorem 1: Suppose $y(x)$ is a curve which minimizes the functional $F$. That is, for any other permissible curve $g(x)$, $F[y(x)] \le F[g(x)]$. The basic idea in this proof will be to construct a function of one real variable, say $H(\epsilon)$, which has the following properties:

1. $H(\epsilon)$ is a differentiable function near $\epsilon = 0$.
2. $H(0)$ is a local minimum for $H$.

After constructing $H$, we show that Property 2 implies the Euler-Lagrange equation must be satisfied. We begin by constructing a variation of $y(x)$. Let $\epsilon$ be a small real number (positive or negative), and consider the new function given by

\[
y_\epsilon(x) = y(x) + \epsilon h(x),
\]

where $h(x) \in C^2[a,b]$ and $h(a) = h(b) = 0$. We can now define the function $H$ to be

\[
H(\epsilon) = F[y_\epsilon(x)].
\]

Since $y_0(x) = y(x)$ and $y(x)$ minimizes $F[y(x)]$, it follows that $\epsilon = 0$ minimizes $H(\epsilon)$. Now, since $H(0)$ is a minimum value for $H$, we know from ordinary calculus that $H'(0) = 0$. The function $H$ can be differentiated by using the Leibniz rule:⁴

\[
\frac{d}{d\epsilon}\bigl(H(\epsilon)\bigr) = \frac{d}{d\epsilon}\int_a^b f(x, y_\epsilon, y_\epsilon')\,dx = \int_a^b \frac{\partial}{\partial \epsilon} f(x, y_\epsilon, y_\epsilon')\,dx.
\]

⁴ For a proof of the Leibniz rule, consult a text on advanced calculus.
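To make the construction of $H$ concrete, here is a small numerical sketch for the arc-length functional. The specific choices are assumptions for illustration only: the interval $[0, 1]$, the minimizing curve $y(x) = x$, the variation $h(x) = \sin(\pi x)$ (which vanishes at both endpoints), and the midpoint rule for the integral. It evaluates $H(\epsilon) = F[y + \epsilon h]$ and estimates $H'(0)$ with a central difference.

```python
import math

def arc_length(y_prime, a, b, n=2000):
    """Midpoint-rule approximation of the integral of sqrt(1 + y'(x)^2)
    over [a, b], given the derivative y_prime of the curve."""
    step = (b - a) / n
    return step * sum(
        math.sqrt(1.0 + y_prime(a + (i + 0.5) * step) ** 2) for i in range(n)
    )

def H(eps):
    # y_eps(x) = x + eps * sin(pi * x), so y_eps'(x) = 1 + eps * pi * cos(pi * x).
    return arc_length(
        lambda x: 1.0 + eps * math.pi * math.cos(math.pi * x), 0.0, 1.0
    )

# Central-difference estimate of H'(0); the step d is an arbitrary small number.
d = 1e-4
H_prime_at_0 = (H(d) - H(-d)) / (2 * d)

# Sample values on either side of eps = 0.
H_at_0 = H(0.0)
H_right = H(0.1)
H_left = H(-0.1)
```

Under these assumptions `H_at_0` is the length of the straight line, `H_prime_at_0` is zero up to rounding, and both perturbed curves are longer, matching Properties 1 and 2 above.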

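[Figure 1: A variation of $y(x)$. The curve $y(x)$ and a variation of it both join $P$ to $Q$ over the interval from $a$ to $b$ on the $x$ axis.]

Applying the chain rule within the integral we obtain:

\[
\begin{aligned}
\frac{\partial}{\partial \epsilon} f(x, y_\epsilon, y_\epsilon')
  &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial \epsilon}
   + \frac{\partial f}{\partial y_\epsilon}\frac{\partial y_\epsilon}{\partial \epsilon}
   + \frac{\partial f}{\partial y_\epsilon'}\frac{\partial y_\epsilon'}{\partial \epsilon} && (5.1) \\
  &= \frac{\partial f}{\partial y_\epsilon}\frac{\partial y_\epsilon}{\partial \epsilon}
   + \frac{\partial f}{\partial y_\epsilon'}\frac{\partial y_\epsilon'}{\partial \epsilon} && (5.2) \\
  &= \frac{\partial f}{\partial y_\epsilon}\,h(x)
   + \frac{\partial f}{\partial y_\epsilon'}\,h'(x) && (5.3)
\end{aligned}
\]

Exercise 4. Show that equations (5.1) through (5.3) are true.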

From these computations, we have

\[
H'(\epsilon) = \int_a^b \left( \frac{\partial f}{\partial y_\epsilon}\,h(x) + \frac{\partial f}{\partial y_\epsilon'}\,h'(x) \right) dx. \tag{5.4}
\]

Evaluating this equation at $\epsilon = 0$ yields

\[
0 = \int_a^b \left( \frac{\partial f}{\partial y}\,h(x) + \frac{\partial f}{\partial y'}\,h'(x) \right) dx. \tag{5.5}
\]

At this point we would like to apply Lemma 1, but in order to do so we must first apply integration by parts to the second term in the above integral. Once this is done, the following equation is obtained from equation (5.5):

\[
0 = \int_a^b \left[ \frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) \right] h(x)\,dx. \tag{5.6}
\]

Finally, since this procedure works for any function $h(x)$ with $h(a) = h(b) = 0$, we can apply Lemma 1 and conclude that

\[
0 = \frac{\partial f}{\partial y} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right).
\]

This completes the proof of Theorem 1.

Exercise 5. Verify equation (5.6) by doing the integration by parts in equation (5.5).

Beltrami Identity

Often in applications, the function $f$ which appears in the integrand does not depend directly on the variable $x$. In these situations, the Euler-Lagrange equation takes a particularly nice form. This simplification of the Euler-Lagrange equation is known as the Beltrami identity. We present the Beltrami identity without proof; it is not obvious how it arises from the Euler-Lagrange equation, but its derivation is straightforward.

The Beltrami Identity: If $\frac{\partial f}{\partial x} = 0$ then the Euler-Lagrange equation is equivalent to

\[
f - y'\,\frac{\partial f}{\partial y'} = C, \tag{5.7}
\]

where $C$ is a constant.

Exercise 6. Use the Beltrami identity to produce the differential equation in Example 2.
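For readers who want to see where the Beltrami identity comes from, one possible derivation is sketched here. It is an addition for illustration and not taken from the original text. It assumes $y \in C^2[a,b]$ and that $f$ is smooth enough for the chain rule to apply. Differentiating the left side of the identity with respect to $x$ along a curve $y(x)$ gives:

```latex
\begin{align*}
\frac{d}{dx}\left( f - y'\,\frac{\partial f}{\partial y'} \right)
  &= \frac{\partial f}{\partial x}
   + \frac{\partial f}{\partial y}\,y'
   + \frac{\partial f}{\partial y'}\,y''
   - y''\,\frac{\partial f}{\partial y'}
   - y'\,\frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \\
  &= \frac{\partial f}{\partial x}
   + y' \left[ \frac{\partial f}{\partial y}
   - \frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) \right].
\end{align*}
```

If $y$ satisfies the Euler-Lagrange equation, the bracket vanishes, and if in addition $\frac{\partial f}{\partial x} = 0$, the whole derivative is zero, so $f - y'\,\frac{\partial f}{\partial y'}$ is constant. Conversely, a constant value forces the bracket to vanish wherever $y' \neq 0$.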

6 The Brachistochrone Problem

Suppose $P$ and $Q$ are two points in the plane. Imagine there is a thin, flexible wire connecting the two points. Suppose $P$ is above $Q$, and we let a frictionless bead travel down the wire impelled by gravity alone. By changing the shape of the wire we might alter the amount of time it takes for the bead to travel from $P$ to $Q$. The brachistochrone problem (or quickest descent problem) is concerned with determining what shape (if any) will result in the bead reaching the point $Q$ in the least amount of time. This problem was posed by Johann Bernoulli in 1696 and was solved soon afterward by several mathematicians, among them Isaac Newton. In this section we set up the relevant functional and then apply Theorem 1 to see what the differential equation associated with this problem looks like. Finally, we provide a solution to this differential equation.

First, we let a curve $y(x)$ that connects $P$ and $Q$ represent the wire. As before, assume $P = (a, y(a))$ and $Q = (b, y(b))$. We will restrict ourselves to curves that belong to $C^2[a,b]$. Given such a curve, the time it takes for the bead to go from $P$ to $Q$ is given by the functional⁵

\[
F[y(x)] = \int_P^Q \frac{ds}{v}, \tag{6.1}
\]

where $ds = \sqrt{1 + [y'(x)]^2}\,dx$ and $v = v(x)$ is the speed of the bead. By conservation of energy (the kinetic energy gained equals the potential energy lost, the bead starting from rest at $P$) we obtain

\[
\frac{1}{2}\,m\,[v(x)]^2 = m g\,\bigl(y(x) - y(a)\bigr).
\]

This allows us to rewrite the functional (6.1) as

\[
F[y(x)] = \int_a^b \frac{\sqrt{1 + [y'(x)]^2}}{\sqrt{2g\,\bigl(y(x) - y(a)\bigr)}}\,dx.
\]

Assuming a minimum time exists, we can apply Theorem 1 to the functional $F$. Notice that the integrand does not depend directly on the variable $x$, and therefore we can apply the Beltrami identity. We can also make computations a little easier by letting $P = (0, 0)$. The resulting equation is then

\[
\frac{\sqrt{1 + [y'(x)]^2}}{\sqrt{2g\,y(x)}} - y'(x)\,\frac{y'(x)}{\sqrt{1 + [y'(x)]^2}\,\sqrt{2g\,y(x)}} = C. \tag{6.2}
\]

⁵ For convenience we use the down direction to represent positive $y$ values.

Equation (6.2) simplifies to

\[
\left(1 + [y'(x)]^2\right) y(x) = \frac{1}{2gC^2} = k^2. \tag{6.3}
\]

Finally, equation (6.3) is well known and the solution is a cycloid.⁶ The parametric equations of the cycloid are given by

\[
x(\theta) = \tfrac{1}{2}k^2(\theta - \sin\theta), \qquad y(\theta) = \tfrac{1}{2}k^2(1 - \cos\theta).
\]

Example 3. With $P = (0, 0)$ and $Q = (\frac{\pi}{2}, 1)$, we have $k = 1$ and $0 \le \theta \le \pi$. Figure 2 shows the cycloid solution to the brachistochrone problem.

A strange property of the cycloid is the following: if we let the frictionless bead start from rest at any point on the cycloid, the amount of time it takes to reach the point $Q$ is always the same.

[Figure 2: The cycloid solution for Example 3, running from $P$ to $Q = (\pi/2, 1)$.]

⁶ The cycloid curve is the path of a fixed point on the rim of a wheel as the wheel is rolled along a straight line.
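The cycloid of Example 3 can be checked numerically in two ways: that it satisfies equation (6.3), and that it gives a shorter descent time than the straight wire from $P$ to $Q$. The sketch below is an illustration under stated assumptions, none of which come from the original text: $g = 9.81$ with lengths in metres, the helper name `descent_time`, a chord-and-midpoint approximation of $\int ds / v$, and a $t^2$ parametrisation of the straight line chosen so that the integrand stays well behaved near the starting point.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (an assumed value)

def descent_time(x, y, t0, t1, n=4000):
    """Approximate the integral of ds / v along the parametric curve
    (x(t), y(t)), t0 <= t <= t1, for a bead released from rest at y = 0.
    y is measured downward, so the speed is v = sqrt(2 * G * y)."""
    step = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        ta = t0 + i * step
        tb = ta + step
        ds = math.hypot(x(tb) - x(ta), y(tb) - y(ta))  # chord length
        v = math.sqrt(2.0 * G * y(0.5 * (ta + tb)))    # speed at the midpoint
        total += ds / v
    return total

k = 1.0
r = 0.5 * k ** 2  # radius of the generating wheel

cycloid_x = lambda th: r * (th - math.sin(th))
cycloid_y = lambda th: r * (1.0 - math.cos(th))

# Straight line from P = (0, 0) to Q = (pi/2, 1), parametrised by t**2.
line_x = lambda t: (math.pi / 2.0) * t * t
line_y = lambda t: t * t

t_cycloid = descent_time(cycloid_x, cycloid_y, 0.0, math.pi)
t_line = descent_time(line_x, line_y, 0.0, 1.0)

def ode_left_side(th):
    """Left side of (6.3), (1 + y'^2) * y, evaluated on the cycloid."""
    slope = math.sin(th) / (1.0 - math.cos(th))  # dy/dx = (dy/dth) / (dx/dth)
    return (1.0 + slope ** 2) * cycloid_y(th)
```

Under these assumptions the cycloid time comes out near 0.71 s and the straight-line time near 0.84 s, and `ode_left_side` returns $k^2 = 1$ for any $\theta$ strictly between $0$ and $\pi$.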

7 Concluding Remarks

After introducing the notion of partial derivatives and the chain rule for functions of several variables, we were able to state a problem with which the calculus of variations is concerned. This is the problem of identifying extreme values of functionals (which are functions of functions). In Section 5 we showed that, under the assumption that a minimum (or maximum) solution exists, the solution must satisfy the Euler-Lagrange equation. The analysis which led to this equation relied heavily on the fact from ordinary calculus that the derivative of a function at an extreme value is zero (provided that the derivative exists there). We admit that many of the analytical details have been omitted and encourage the interested reader to look further into these matters.

It turns out that problems like the brachistochrone can be extended to situations with an arbitrary number of variables as well as to regions which are non-Euclidean. In fact, many books about general relativity deduce the Einstein field equations via a variational approach which is based on the ideas that were discussed in this paper.

References

[1] Widder, D., Advanced Calculus, Prentice-Hall, 1961.
[2] Boyer, C., A History of Mathematics, John Wiley and Sons, 1991.
[3] Troutman, J., Variational Calculus with Elementary Convexity, Springer-Verlag, 1980.