A Short Essay on Variational Calculus

A Short Essay on Variational Calculus Keonwook Kang, Chris Weinberger and Wei Cai Department of Mechanical Engineering, Stanford University Stanford, CA 94305-4040 May 3, 2006 Contents 1 Definition of a functional 1 2 First variation 2 3 Essential and natural boundary conditions 4 4 Second variation 5 5 Functional derivative 6 6 Examples 7 6.1 Lagrange s equation of classical mechanics........................ 7 6.2 Tight string under lateral loading............................. 8 6.3 Phase field equation for microstructure evolution.................... 9 6.4 Schrödinger s equation of quantum mechanics...................... 9 1 Definition of a functional The calculus of variations deals with functionals, which are functions of a function, to put it simply. For example, the calculus of variations can be used to find an unknown function that minimizes or maximizes a functional. Let us begin our discussion with the definition of a functional. We are familiar with the definition of a function. It is a mapping from one number (or a set of numbers) to another value. For example, f(x) = x 2 (1) is a function, which maps x = 2 to f(x) = 4, and x = 3 to f(x) = 9, etc. On the other hand, a functional is a mapping from a function (or a set of functions) to a value. For example, π 0 [y(x)] 2 dx is a functional. Notice that we use square brackets for a functional, to signify the fact that its argument is a function. When y(x) = x, π 0 x 2 dx = π3 3 1 (2) (3)

When y(x) = sin x, π 0 sin 2 x dx = π 2 Therefore, functional I[y(x)] maps y(x) = x to π 3 /3 and maps y(x) = sin x to π/2. Because an integral maps a function to a number, a functional usually involves an integral. The following form of functional often appears in the calculus of variations, F (x, y(x), y (x)) dx (5) where y (x) dy(x)/dx. Many engineering problems can be formulated into a variational form, in which we need to find a function y(x) that minimizes (or maximizes) this functional, subject to certain boundary conditions, such as y( ) = a and y(x 2 ) = b. Notice that while I[y(x)] is a functional, F (x, y, y ) is simply a function that maps three numbers into one number. 2 First variation For a function f(x), its differential, df, is how much f changes if its argument, x, changes by an infinitesimal amount dx. For example, when f(x) = x 2, df = 2x dx If x is at a minimum (or maximum) of f(x), then df should be zero for an infinitesimal change of x. In this example, the minimum occurs at x = 0. For a functional I[y(x)], the corresponding term is its (first) variation, δi. δi is how much I changes if its argument, the function y(x), changes by an infinitesimal amount δy(x). This is illustrated in Fig. 1, where function y(x) is shown in bold line. If the function undergoes a small change, ỹ(x) = y(x) + δy(x) (7) where δy(x) is a small, continuous function, then the variation of the functional is, δi = I[ỹ(x)] I[y(x)] (8) Our task is obtain the explicit expression of δi. Suppose that the functional I[y(x)] takes the form of Eq.(5). To obtain δi, let us define δy(x) = ɛφ(x) (9) where φ(x) is an arbitrary continuous function whose value is of the order 1 and ɛ is an infinitesimal number. We also require φ(x) to be sufficiently smooth so that ỹ (x) is well defined. 1 We can now Taylor expand I[ỹ(x)] in terms of ɛ. I[ỹ(x)] = = x 1 x2 = I[y(x)] + ɛ F (x, y(x) + ɛφ(x), y (x) + ɛφ (x)) dx [ F (x, y(x), y (x)) + [ ɛφ(x) + φ(x) + φ (x) ] ɛφ (x) + O(ɛ 2 ) ] dx (4) (6) dx + O(ɛ 2 ) (10) 1 Another restriction on φ(x) is that φ(x) and φ (x) have the same order of smallness. In this case, the variation in y is said to be weak [2]. 2

y δy(x) ~ y(x) y(x) B A x 2 x Figure 1: Function y(x) (bold line) and a small variation from it, ỹ(x) = y(x) + δy(x). δy(x) is a small, continuous function which satisfies the boundary conditions δy( ) = δy(x 2 ) = 0. Keeping terms only up to the first order, we have, [ ] δi = ɛ φ(x) + φ (x) dx [ ] = δy(x) + δy (x) dx (11) Usually, it is more desirable to express δi solely in terms of δy(x), and get rid of δy (x) in the expression. This can be done by integration-by-parts. δi = ɛ = ɛ x 1 x2 φ(x) dx + ɛ dφ(x) [ ] x2 φ(x) dx + ɛ φ(x) ɛ ( ) d dx φ(x) dx (12) If we constrain the value of function y(x) at and x 2 to be constant, i.e., apply boundary conditions φ( ) = φ(x 2 ) = 0, then, 2 [ δi = ɛ d ( )] dx φ(x) dx [ = d ( )] dx δy(x) dx (13) Therefore, if y(x) is a minimum (or a maximum) of functional I[y(x)], subjected to the constraint that y( ) and y(x 2 ) are fixed, then y(x) must satisfy the differential equation, d ( ) dx = 0 (14) for x (, x 2 ). This is called the Euler-Lagrange equation. 2 The same conclusion can be made if y(x) satisfies periodic boundary conditions, y() = y(x 2), y () = y (x 2). 3

Example. What is the shortest path between two points A and B, as shown in in Fig.1? Of course, the answer is a straight line. Here we will derive this result from variational calculus. Given a curve y(x) connecting points A and B, the length of the curve is, L[y(x)] = F (x, y(x), y (x)) dx = 1 + (y (x)) 2 dx (15) Therefore, L is a functional of y(x). The minimum of functional L must satisfy the Euler-Lagrange equation (14). Notice that in this case F has no explicit dependence on x or y(x), so that, = 0 and = The Euler-Lagrange equation becomes ( ) d y (x) = 0 d + (y (x)) 2 y (x) 1 + (y (x)) 2 The solution is that y (x)=const, hence the shortest path is a straight line. 3 Essential and natural boundary conditions In the above derivation of the Euler-Lagrange equation, we assumed that δy( ) = δy(x 2 ) = 0. This makes the second term in Eq. (12) vanish. In this case, the function y(x) satisfies the boundary condition, y( ) = y 1 y(x 2 ) = y 2 (16) (17) where y 1 and y 2 are constants. This is called the essential (or geometric) boundary condition. This is the appropriate boundary condition for some applications, such as the example discussed above. However, in other applications, we may need to apply other types of boundary conditions to the function y(x). If we still want the second term in Eq. (12) to vanish (so that we obtain the familiar Euler- Lagrange equation), but allowing δy( ) and δy(x 2 ) to be non-zero, then we need to have, = 0 (18) x=x1 = 0 (19) x=x2 This is called a natural boundary condition. A system may also have a natural boundary condition at one end (x = ) and an essential boundary condition at the other end (x = x 2 ). If the functional involves higher derivatives of y(x), such as, F (x, y(x), y (x), y (x)) dx (20) 4

then the minimum (or maximum) of I[y(x)] satisfies the following Euler-Lagrange equation, d 2 ( ) dx 2 d ( ) dx + = 0 (21) provided that y(x) is subjected to the following essential or natural boundary conditions, essential y( ) = y 1 y ( ) = y 1 y(x 2 ) = y 2 y (x 2 ) = y 2 natural d dx ( ) = 0 at x = = 0 at x = d dx ( ) = 0 at x = x 2 = 0 at x = x 2 In general, when the functional I contains higher derivatives of y(x), i=0 the Euler-Lagrange equation is n ( ) ( 1) i di dx i (i) = 0 F (x, y, y, y,, y (n) ) dx (22) when appropriate boundary condition is applied. 4 Second variation When y(x) satisfies the Euler-Lagrange equation, the first variation δi vanishes and y(x) is an extremum of the functional I[y(x)]. To find out whether y(x) is a minimum or a maximum of I[y(x)], we need to look at the second variation of I. Let us continue the Taylor expansion in Eq. (10) to the second order, I[ỹ(x)] = = I[y(x)] + ɛ + ɛ2 2 F (x, y(x) + ɛφ(x), y (x) + ɛφ (x)) dx [ ] φ(x) + φ (x) dx [ 2 F 2 φ2 (x) + 2 F 2 φ 2 (x) + 2 F φ(x)φ (x) ] dx + O(ɛ 3 ) (23) The second variation of I is the term that is of the second order of ɛ, δ 2 I = ɛ2 x2 [ 2 ] F 2 2 φ2 (x) + 2 F 2 φ 2 (x) + 2 F φ(x)φ (x) dx = 1 [ 2 ] F 2 2 δy2 (x) + 2 F 2 δy 2 (x) + 2 F δy(x)δy (x) dx (24) When the first variation δi vanishes, y(x) is a maximum if the second variation δ 2 I > 0 (for arbitrary φ(x)), or a minimum if δ 2 I < 0. If δ 2 I = 0, we will have to examine the higher order variations of I to determine whether y(x) is a minimum or a maximum. For more discussions see [2, 3]. 5

5 Functional derivative For a function with multiple arguments, f(, x 2,, x n ), if the differential df can be written as, df = n g i (,, x n ) dx i (25) i=1 then g i (,, x n ) is called the (partial) derivative of f with respect to x i, for i = 1,, n, f x i = g i (,, x n ) (26) Notice that the derivative of a function f(,, x n ) with respect to x i g i (,, x n ). Similarly, if the (first) variation of a functional is another function b a can be written as, F (y(x), y (x)) dx (27) δi = b a g(x) δy(x) dx then the functional derivative of I is (28) δi = g(x) (29) δy(x) The functional derivative of a functional I[y(x)] is a function g(x). In the following, we discuss the connection between the (ordinary) derivative and the functional derivative. Imagine that we approximate function y(x) by a piece-wise linear curve that passes through a set of points (x i, y i ), i = 0,, n + 1, where x i = i, y i = y(x i ), x 0 = a, and x n+1 = b, as shown in Fig. 2. Let y(x) subject to essential boundary conditions, so that y 0 = y(a) and y n+1 = y(b) are fixed. Then the functional I[y(x)] can be approximated by a sum, where b a = lim n I n (y 1, y 2,, y n ) F (y(x), y (x)) dx n i=0 ( F y i, y ) i+1 y i lim n I n(y 1,, y n ) (30) n F (y i, z i ) (31) i=0 z i y i+1 y i (32) 6

y δy k δy k a... y k+1 x 0 x 2 x 3 x k-1 x k x k+1 x n+1 y k... b x Figure 2: A piece-wise linear approximation of function y(x). x i = i, y i = y(x i ), x 0 = a, x n+1 = b. Consider that one of the node, (x k, y k ) moves vertically by δy k, the variational derivative δi/δy at x = x k is the change of I divided by the change of area δy (shade region) in the limit of 0. Therefore, functional I[y(x)] is the limit of an ordinary function I n (y 1,, y n ) when the number of its arguments, n, goes to infinity. The partial derivative of I n with respect to one of its arguments is, I n k = ( [ + 1 y=yk d y=yk dx 1 y =z k 1 ( ) ] x=x k y =z k Compare this with the definition of the functional derivative, we have, δi δy(x) = d ( ) I n 1 dx = lim 0 k ) (33) Therefore, the functional derivative δi/δy(x) can be interpreted as the following. Imagine a small and localized perturbation of y(x) near point x, δi/δy(x) is the change of I divided by the change of area under the curve y(x). 6 Examples 6.1 Lagrange s equation of classical mechanics The equations of motion for particles in classical mechanics can be obtained from Hamilton s principle or the principle of least action. Let q be the coordinate of a particle, t be the time, so that q(t) is the trajectory of this particle. Suppose that the particle s position at times t 1 and t 2 are known, (34) q(t 1 ) = q 1 q(t 2 ) = q 2 (35) (36) i.e., q(t) is subjected to essential boundary conditions. The question is: what trajectory q(t) would the particle take to go through points q 1 and q 2 exactly at times t 1 and t 2? 7

The Hamilton s principle states that the real trajectory of the particle is the one that minimizes (or maximizes) the action functional [5], S[q(t)] = t2 t 1 L(t, q(t), q (t)) dt (37) where L(t, q(t), q (t)) is the Lagrangian of the system, which is usually the kinetic energy minus the potential energy, L(q, q ) = 1 2 m(q (t)) 2 V (q) (38) The real trajectory of the particle must have δs = 0, and hence satisfies the Euler-Lagrange equation, ( ) d L dt q L q = 0. (39) When the Lagrangian takes the form of Eq. (38), this leads to the following equation of motion, mq (t) = V q (40) which is identical to the Newton s equation of motion F = ma, where F = V q is the force, a = q (t) is the acceleration. In general, if the system has n degrees of freedom, q 1,, q n, the Lagrange s equations of motion are ( ) d L L = 0 for all i = 1,2,,n. dt q i q i 6.2 Tight string under lateral loading When a system subjected to external loading reaches thermal equilibrium, its Gibbs free energy is minimized. This thermodynamic principle allows us to determine the equilibrium shape of an elastic body subjected to external loading. For example, consider an elastic string subjected to lateral loading w(x), as shown in Fig. 3. Ignoring the entropic contribution, the Gibbs free energy of the system is simply the enthalpy H, i.e. the store elastic energy minus the work done by the external load. Assuming the string has a line tension T, the enthalpy can be written as the following functional of the shape of the string y(x), (in the limit of y (x) 1) H[y(x)] = L 0 Therefore, the variational derivative is 1 2 T (y (x)) 2 w(x) y(x) dx (41) δh δy(x) = T y (x) w(x) (42) Because the equilibrium shape of the string minimizes the functional H[y(x)], it satisfies the following Euler-Lagrange equation, T y (x) + w(x) = 0 (43) 8

0 w(x) y(x) L x y Figure 3: Distributed load w(x) is applied to the string hanging tightly between two walls. The deflection y(x) and its derivative is assumed to be very small because the string is tight. 6.3 Phase field equation for microstructure evolution Many systems evolves in the direction that minimizes its Gibbs free energy. This observation can be used to construct evolution equations of a wide range of systems, such as the phase field model of material microstructure [7]. For example, consider a simple model of crystal growth by solidification from the melt. The system may be described by a phase field, φ(x), such that φ(x) 1 if the material at x is solid and φ(x) 1 if it is liquid. The free energy F of the system is a functional of the phase field φ(x), which can be written as, F [φ(x)] = L 0 U(φ(x) 1) 2 (φ(x) + 1) 2 + ɛ (φ (x)) 2 dx (44) The first term in the integrand accounts for the system s preference in either one of the two states, φ = ±1. The second term in the integrand introduces a free energy penalty for the liquid-solid interface. A simple evolution equation for φ(x) is that φ(x) d dtφ(x) follows the steepest descent direction, i.e. the direction of the variational derivative, φ(x) = k δf δφ(x) = k [ 4Uφ(x) (φ(x) 2 1) ɛ φ (x) ] (45) where k is a rate constant. 6.4 Schrödinger s equation of quantum mechanics An electron in quantum mechanics is described by its wave function ψ(x). For a single electron moving in a potential V (x), the Hamiltonian operator is, 2 Ĥ = 2 + V (x) (46) 2m x2 This means that, Ĥψ(x) = 2 2 ψ(x) 2m x 2 + V (x)ψ(x) (47) The energy E of the electron is a functional of its wave function ψ(x), E[ψ(x)] = ψ (x)ĥψ(x) dx (48) 9

For simplicity, we will assume that ψ(x) is real, so that its complex conjugate ψ (x) = ψ(x). We will also assume that ψ( ) = ψ( ) = 0, i.e. assume essential boundary conditions. The ground state wave function is the minimizer of functional E[ψ(x)]. However, the wave function must satisfy the normalization condition, ψ (x)ψ(x) dx = 1 (49) Therefore, the ground state wave function satisfies the following Euler-Lagrange equation, where δe δψ(x) = 0 (50) E [ψ(x)] = E[ψ(x)] λ ψ (x)ψ(x) dx (51) and λ is the Lagrange multiplier. The explicit expression of the variational derivative is, δe [ δψ(x) = 2 2 2 ] ψ(x) 2m x 2 + V (x) ψ(x) λ ψ(x) (52) Therefore, the ground state wave function satisfies the following eigen equation, Ĥψ(x) = λψ(x) (53) We can further show that λ = E[ψ(x)] in this case, so that the constant λ is usually written as E. This leads to the time-independent Schrödinger s equation, [ 2 2 ] 2m x 2 + V (x) ψ(x) = E ψ(x) (54) References [1] R. S. Schechter, The Variational Method in Engineering, (McGrow-Hill Book Company, 1967). [2] C. Fox, An Introduction to the Calculus of Variations, (Oxford University Press, 1950). [3] I. M. Gelfand and S. V. Fomin, Calculus of Variations, (Dover, New York, 2000). [4] B. van Brunt, The Calculus of Variations, (Springer, 2004). [5] L. D. Landau and E. M. Lifshitz, Mechanics, (Pergamon Press, 1969). [6] C. Lanczos, The Variational Principles of Mechanics, (Dover, New York, 1970).h 9 [7] V. V. Bulatov and W. Cai, Computer Simulation of Dislocations, (Oxford University Press, to be published, 2006), chapter 11. 10