Non-Dimensionalization of Differential Equations

September 9, 2014

1 Non-dimensionalization

The most elementary idea of dimensional analysis is this: in any equation, the dimensions of the right-hand side and the left-hand side must match. For example, an equation describing the velocity of a ball, v = f(v₀, g, t), must have units of velocity, say meters per second, on both sides or it does not make any sense. If we are measuring velocity in meters per second, the equation will contain constants that are tied to our units of measure, meters and seconds. The specific form of the velocity equation is v = v₀ + g t, where v₀ is the initial velocity, g is the gravitational acceleration at the Earth's surface, and t is time. Including the units, the equation becomes

v m/sec = v₀ m/sec + 9.8 m/sec² · t sec,

and we can see that the dimensions of the equation match up: m/sec = m/sec + (m/sec²)(sec). The problem with this equation is that if we gave it to someone who measured velocities in feet per second, she would get the wrong answer if she didn't convert the measurements on her instruments to meters per second. In fact, the multimillion-dollar Mars Climate Orbiter was destroyed because of just this kind of error: engineers working on different parts of the mission were using different units, and one group forgot to convert from English units to SI units before sending their findings to another group. The solution to this problem, it seems, is to express the equation in the symbolic form v = v₀ + g t. As long as v is measured in feet per minute, or meters per second, or even decicubits per fortnight, with the other physical quantities similarly measured and the constant g given in the same system, the result will be correct. We do this kind of dimensional matching almost unconsciously (if we have been at this long enough!). For instance, suppose we wish to model the concentration of salt in a lake.
More specifically, suppose the lake holds 10⁶ cubic meters of water before the salt starts polluting it; that it has a single inlet and a single outlet, both flowing at a rate of 3 cubic meters per second; and that at some time t₀ a manufacturing plant upriver begins dumping salt into the river, so that the river enters the lake carrying a concentration of 1 kilogram of salt per cubic meter. To develop a differential equation describing the situation, we form a difference that describes the change in salt over an interval of time Δt. Represent the salt at time t by S(t), and the salt a time Δt later by S(t + Δt); then the change in salt over the period Δt is S(t + Δt) − S(t). This goes on the left-hand side and has units of kilograms. On the right-hand side we describe the physical situation: the change in mass of salt per unit time, times the time Δt. This is salt per unit time in minus salt per unit time out, times Δt. Specifically, salt flows in at (1 kg/m³)(3 m³/sec) and flows out at (S/10⁶ kg/m³)(3 m³/sec), so the change
in salt over the interval Δt sec is (3 − 3 × 10⁻⁶ S) kg/sec · Δt sec. The last step is to divide both sides by Δt and take the limit as Δt goes to zero (now ignoring the unit of kilograms):

lim_{Δt → 0} (S(t + Δt) − S(t)) / Δt = dS/dt = 3 − 3 × 10⁻⁶ S.

We can solve this ODE and have a formula for the total amount of salt in the lake at any time, and we can use the volume of the lake to determine the concentration. Now, while we were able to cancel the SI unit names, and we ignored the remaining mass unit, we still have units involved in this equation implicitly in the form of the constants 3 and 3 × 10⁻⁶, both of which arose from the units used to measure the volume of the lake, the concentration of salt, and the rate of flow of the rivers. How can we make an equation that automatically accounts for differing units of measure, as we did for the velocity equation? And how do we find equations like the velocity equation for much more complicated modeling tasks, for instance when we do not know for sure what the physics or the chemistry and so on actually are, or when we do not know which variables might be the most important and which might be ignored? The answer lies in a more formal and exacting application of dimensional analysis, sometimes called non-dimensionalizing equations or finding dimensionally homogeneous equations.

1.1 Example from physics

Before we get into the full machinery of the method, let us do an example from physics in the way it is usually done in a physics class, and analyze the results. We begin by assembling the differential equation for the unforced, frictionless pendulum using Newton's second law F = ma, force equals mass times acceleration, with the assumption that all of the mass of the pendulum is concentrated at the center of the bob.
On the left is the mathematical expression for mass times tangential acceleration, m r θ̈, where m is the mass of the bob, r is the length of the rod, θ is measured counterclockwise from the straight-down position, and we have used the standard physicists' notation for the second derivative with respect to time, θ̈ = d²θ/dt². On the right are the specific physical forces in the θ direction, −m g sin θ, so we have

m r θ̈ = −m g sin θ.

Now divide through by m r, which is never zero, to get

θ̈ = −(g/r) sin θ.

Here is a trick worth remembering. There is no reason we must measure time in seconds. Let us choose a new unit of time measurement, τ = k t, for some k which we have not yet determined. Now suppose we are measuring time in these units. Then θ = θ(τ) = θ(τ(t)). Using the chain rule, we find

θ̈ = d/dt (dθ/dt) = d/dt (dθ/dτ · dτ/dt) = d/dt (dθ/dτ · k) = d²θ/dτ² · dτ/dt · k = d²θ/dτ² · k² = k² θ'',

where the primes on θ indicate differentiation with respect to the new time variable τ. This simply says that the change in a property with respect to one measure of time is a constant times the change in that property with respect to another measure of time. This makes good sense. Now we have

k² θ'' = −(g/r) sin θ,
Figure 1: The physical pendulum.

which we divide by k² to get

θ'' = −(g/r)(1/k²) sin θ.

If we let k = √(g/r), then we see that k has units of √((L/T²)/L) = 1/T, and the differential equation becomes

θ'' + sin θ = 0.

In terms of the actual constants of gravity and length, we are measuring time as a function of the gravitational acceleration and the length of the pendulum rod. There is something much more interesting happening here, as we shall soon see. Now, to see what k = √(g/r) really means, let us go back to the original equation θ̈ = −(g/r) sin θ and solve it using the usual methods from differential equations. Because it is nonlinear, we will use a trick to approximate the solution, replacing the sine function with the linear term of its power series (Taylor series) expansion,

sin θ = θ − θ³/3! + ...

The linear approximation sin θ ≈ θ is good to within about 1% for angles θ < 15°, so if we solve θ̈ + (g/r) θ = 0, the solution will be very close to correct for small excursions of the pendulum.
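That 1% claim is easy to verify numerically. The following short Python check (our own, not part of the derivation) computes the relative error of the approximation sin θ ≈ θ at 15 degrees:

```python
import math

# Check the claim that sin(theta) is within about 1% of theta for theta < 15 degrees.
theta = math.radians(15.0)
rel_error = abs(theta - math.sin(theta)) / math.sin(theta)
print(rel_error)  # roughly 0.011, i.e. about 1%
```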
Figure 2: The sine function and the function θ.

Using the usual methods for second-order linear ODEs, we find

θ = K₁ sin(√(g/r) t) + K₂ cos(√(g/r) t).

We see that the scaling we chose for time in the full model of the pendulum is the frequency of the solution to the linearized problem. So the natural time scale to use when modeling a pendulum is the time it takes to complete a small-amplitude swing, actually the time it takes for a swing in the limit as the amplitude goes to zero (why?). We also see that it is the magnitude of the gravitational acceleration and the length of the shaft that determine the period of the pendulum, and the mass does not figure into it at all. Actually, this is not quite correct, because we assumed the pendulum shaft was rigid and had no mass (incompatible assumptions!) and that the mass of the bob was concentrated at its center (more realistic). In a real pendulum, the distributed mass of the shaft displaces the actual center of gravity of the pendulum (see any calculus book for moment-of-inertia formulas for various shapes). Now, why should this be? Why should a physical object like a pendulum prefer one particular measure of time over another? Does it have a preferred length scale too? A preferred gravitational acceleration or velocity? The answer is that a system prefers a system of measurements that makes it dimensionless. A dimensionless system is the simplest possible version of the system, and the best, because it is scalable; in other words, it models the phenomenon over a range of sizes. In particular, if we have two pendula, one a meter long and one a foot long, measuring time in units of √(r/g), with r in meters and g as 9.8 meters per second squared for the first, and r in feet and g as 32 feet per second squared for the second, will result in the same dimensionally homogeneous equation. Thus the same solution is valid for either case, and any variation of the gravity or pendulum length will only scale the time axis.
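This scale invariance can be checked numerically. Below is a minimal Python sketch (the integrator and function names are our own choices) that integrates θ̈ = −(g/r) sin θ for two physically different pendula and compares the results at the same dimensionless time τ = t√(g/r):

```python
import math

def pendulum_theta(g, r, theta0, tau_end, n=20000):
    """Integrate theta'' = -(g/r) sin(theta), starting from rest at theta0,
    up to dimensionless time tau_end = t*sqrt(g/r); return the final angle."""
    k = math.sqrt(g / r)   # natural frequency, units 1/time
    t_end = tau_end / k    # convert dimensionless time back to physical time
    dt = t_end / n
    theta, omega = theta0, 0.0
    for _ in range(n):     # semi-implicit (symplectic) Euler steps
        omega += -(g / r) * math.sin(theta) * dt
        theta += omega * dt
    return theta

# Same initial angle, very different pendula, same *dimensionless* elapsed time:
a = pendulum_theta(9.8, 1.0, 0.3, 6.0)    # 1 m pendulum, SI units
b = pendulum_theta(32.0, 3.28, 0.3, 6.0)  # the same pendulum measured in feet
print(abs(a - b))  # essentially zero: the dimensionless dynamics coincide
```

Both runs reduce to exactly the same recursion in the dimensionless variables, so the trajectories agree to within floating-point rounding.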
Adding friction, the original equation becomes

m r θ̈ = −γ θ̇ − m g sin θ.
Applying the same technique as before, we find

k² θ'' = −(γ/(m r)) k θ' − (g/r) sin θ,

resulting in, after a little algebra,

θ'' + (γ/(m √(r g))) θ' + sin θ = 0.

What should we do about this last constant? The answer will come later.

1.2 Example from ecology

In a closed ecological system reside a population of rabbits, who have all the clover they need to grow without bound, and a population of foxes, who depend entirely on the rabbits for their sustenance. Because the population of rabbits will grow exponentially, while the population of foxes will die out exponentially, a first pass at a system, without interaction, is

ḟ = −k₁ f,    ṙ = k₂ r.

Adding an interaction term, with the assumption that the interaction between foxes and rabbits (fox catches and eats rabbit) can be modeled as a proportionality depending on the joint probability of a rabbit and a fox occupying some small capture region, we obtain
ḟ = −k₁ f + k₃ f r,    ṙ = k₂ r − k₄ f r.

In this formulation we have four free parameters, k₁, k₂, k₃, k₄, some of which are probably unnecessary. Because we are free to choose the units we use to measure time (days, weeks, years) and number of creatures (number of individuals, number of average family sizes, percentage of total population, etc.), we can choose our units to eliminate some of these parameters by combining them in dimensionless groups. With that in mind, let us propose new variables proportional to our current ones, say τ = a t, F = b f, R = c r. We will adjust a, b, c to give the simplest possible model. Then substitute f = (1/b) F, r = (1/c) R, ḟ = df/dt = (a/b) dF/dτ ≡ (a/b) F', ṙ = dr/dt = (a/c) dR/dτ ≡ (a/c) R' to get

(a/b) F' = −k₁ (1/b) F + k₃ (1/(bc)) F R    (1)
(a/c) R' = k₂ (1/c) R − k₄ (1/(bc)) F R,    (2)

or

F' = −(k₁/a) F + (k₃/(ac)) F R    (3)
R' = (k₂/a) R − (k₄/(ba)) F R.    (4)

Let us eliminate the constant multipliers one by one. It is clear that we should let a = k₁ or a = k₂. Which one we choose is arbitrary: we are choosing the time rescaling a to be either the decay rate of the foxes or the growth rate of the rabbits. Choosing a = k₁, we obtain

F' = −F + (k₃/(k₁ c)) F R    (5)
R' = (k₂/k₁) R − (k₄/(b k₁)) F R.    (6)

Now, if we choose c = k₃/k₁ and b = k₄/k₁, and rename the ratio k₂/k₁ = α, we have eliminated all but one parameter. The single parameter α, the ratio of the growth rate of the rabbits to the decay rate of the foxes, apparently carries all the important information:

F' = −F + F R    (7)
R' = α R − F R.    (8)

By choosing to rescale our units for the fox and rabbit populations and our measurement of time, we have simplified the problem immensely. We can now run numerical experiments and analytical procedures on our model over a realistic range of a single parameter. It is important to realize that we have done nothing but change the density of tick marks on the three axes. Several plots of solutions to this system can be seen in Figure 1.2.
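Equations (7) and (8) are easy to experiment with numerically. Here is a minimal Python sketch (the function name and step sizes are our own choices); as a sanity check, it verifies that the interior equilibrium (F, R) = (α, 1), where both right-hand sides vanish, stays fixed under integration:

```python
def simulate(alpha, F0, R0, tau_end=20.0, n=20000):
    """Integrate the nondimensional system F' = -F + F*R, R' = alpha*R - F*R
    with forward Euler; return the state at tau_end."""
    F, R = F0, R0
    dtau = tau_end / n
    for _ in range(n):
        dF = (-F + F * R) * dtau
        dR = (alpha * R - F * R) * dtau
        F, R = F + dF, R + dR
    return F, R

# The interior equilibrium is (F, R) = (alpha, 1):
# F' = 0 when R = 1, and R' = 0 when F = alpha.
F, R = simulate(alpha=0.8, F0=0.8, R0=1.0)
print(F, R)  # stays at the equilibrium (0.8, 1.0)
```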
2 Stability Analysis

For linear systems (and also for linearizations of nonlinear systems, as we shall see), the eigenvalues and eigenvectors completely characterize the local qualitative dynamics of most equilibrium points. Suppose we have a linear system ẋ = Ax, or in more detail,

[ẋ; ẏ] = [a b; c d] [x; y].

If b = c = 0 then the system is decoupled and the solutions have the form x = x₀ e^{at}, y = y₀ e^{dt}. Otherwise solutions will (generically) be of the form x̄ = x̄₀ e^{λ₁ t} v₁, ȳ = ȳ₀ e^{λ₂ t} v₂ for some constants λ₁, λ₂ and vectors v₁, v₂. Substituting x̄ into the equation, we get A v₁ = λ₁ v₁, or (A − λ₁ I) v₁ = 0, which, for v₁ ≠ 0, has a solution only when λ₁ satisfies det(A − λ₁ I) = 0, and likewise for λ₂, v₂. Upon finding λ₁ and λ₂, these can be substituted into (A − λI)v = 0 to find v₁ and v₂. With the eigenvectors v₁, v₂ and eigenvalues λ₁, λ₂ in hand, we can decompose the matrix A in the problem ẋ = Ax as A = P D P⁻¹, where P is the matrix whose columns are the eigenvectors v₁, v₂, and D has the eigenvalues λ₁, λ₂ on its diagonal and zeroes elsewhere, to obtain

ẋ = P D P⁻¹ x    or    P⁻¹ ẋ = D P⁻¹ x.

Now a vector is the same object whether written in one basis or another: Ix and P x̄ represent the same vector, so the components of x̄, the coefficients of the column vectors v₁, v₂ making up P, are given by P⁻¹ x = x̄. In other words, P⁻¹ rewrites x in the basis spanned by the columns of P, and the differential equation is now in the form x̄' = D x̄, which is uncoupled in the new variables. Now we solve x̄' = D x̄ to get x̄ = x̄₀ e^{λ₁ t} v₁ and ȳ = ȳ₀ e^{λ₂ t} v₂. Linear combinations of these two solutions (with coefficients x̄₀, ȳ₀) give solutions beginning anywhere in the phase space. We can then convert this solution back to a solution in the standard basis by writing v₁, v₂ as linear combinations of the standard basis vectors, and taking linear combinations of these. Write v₁ = p [1; 0] + q [0; 1] and v₂ = r [1; 0] + s [0; 1].
Then

e^{λ₁ t} v₁ = e^{λ₁ t} (p [1; 0] + q [0; 1])    and    e^{λ₂ t} v₂ = e^{λ₂ t} (r [1; 0] + s [0; 1]),

and so

x = p e^{λ₁ t} + r e^{λ₂ t}
y = q e^{λ₁ t} + s e^{λ₂ t}.

The λᵢ can be real or a complex-conjugate pair, and depending on the nature of these numbers, the solutions can have very different characters. The stability type of the equilibrium point is determined by the form of the eigenvalues. The eigenvalues of a matrix A = [a b; c d] are found by solving

det(A − λI) = det [a − λ, b; c, d − λ] = 0,

or

(a − λ)(d − λ) − bc = λ² − (a + d) λ + (ad − bc) = 0,

which has solutions

λ₁,₂ = ((a + d) ± √((a + d)² − 4(ad − bc))) / 2 = τ/2 ± √((τ/2)² − Δ),

where we have renamed the trace τ = a + d and the determinant Δ = ad − bc of the matrix A. Then, depending on whether τ and the quantity under the radical are negative, positive, or zero, we will have either stable, unstable, or neutral dynamics, which can be subdivided into the several topological types shown in the chart of Figure 3.

Figure 3: This diagram encodes the equilibrium type in terms of the sign of the trace and the relative sizes of the trace and determinant.

In addition to the equilibria shown, there are other types of limiting behavior, most notably limit cycles and strange attractors. A limit cycle is a non-equilibrium solution z(t) with the property that z(t + T) = z(t) for some finite T. A limit cycle is a topological circle, and may be knotted if it lives in three dimensions. A strange attractor is the limit set of a system that settles to neither an equilibrium nor a limit cycle. It is a much more complicated object that we will not discuss here, except to note that ordinary differential equation models for such simple systems as a forced oscillator, pendulum, or epidemic model can exhibit the chaotic behavior that leads to a strange attractor.
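The trace-determinant classification just described can be expressed directly in code. The following Python function (a sketch of ours; the tolerance handling is a choice, not from the text) returns the equilibrium type of ẋ = Ax:

```python
def classify(a, b, c, d, eps=1e-12):
    """Classify the equilibrium of xdot = A x for A = [[a, b], [c, d]]
    using the trace tau and determinant Delta."""
    tau = a + d
    delta = a * d - b * c
    disc = tau * tau - 4 * delta    # discriminant of lambda^2 - tau*lambda + Delta
    if delta < -eps:
        return "saddle"             # real eigenvalues of opposite sign
    if abs(tau) <= eps and delta > eps:
        return "center"             # purely imaginary eigenvalues
    kind = "node" if disc >= -eps else "spiral"
    return ("stable " if tau < 0 else "unstable ") + kind

print(classify(0, 1, -1, 0))    # center
print(classify(-1, 0, 0, -2))   # stable node
print(classify(1, -1, 1, 1))    # unstable spiral
```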
Figure 4: The various types of equilibria determined by the eigenvalues of the local linearization.

A simple example of a differential equation that can lead to a limit cycle is

dy/dx = (y³ + x²y − x − y) / (x³ + xy² − x + y).

This can be written more simply in polar coordinates as θ̇ = C, ṙ = a r(1 − r), and its solution curves are shown in Figure 5. By plotting ṙ against r in the second equation of this system, we can see that, except for the equilibria r = 0 and r = 1, any initial condition approaches the circle r = 1 by spiraling counterclockwise inward or outward.

Figure 5: An attracting cycle with equation θ̇ = C, ṙ = a r(1 − r) in polar coordinates.

The Lotka-Volterra predator-prey model ẋ = ax − bxy, ẏ = −cy + dxy exhibits closed periodic orbits, and the van der Pol oscillator ÿ − ɛ(1 − y²)ẏ + y = 0 exhibits a limit cycle. Graphs of solutions to these equations are shown below in Figure 6.
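The claim that solutions spiral onto r = 1 can be checked by integrating the radial equation alone. A short Python sketch (our own, with arbitrary step sizes) starts on either side of the cycle:

```python
def radius(r0, a=1.0, t_end=20.0, n=20000):
    """Forward-Euler integration of r' = a*r*(1 - r), the radial equation
    of the limit-cycle example; returns r at time t_end."""
    r, dt = r0, t_end / n
    for _ in range(n):
        r += a * r * (1.0 - r) * dt
    return r

print(radius(0.1), radius(2.0))  # both approach 1.0, from inside and outside
```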
Figure 6: Limit cycles of the Lotka-Volterra system (left) and the van der Pol oscillator (right).

A linear system has an equilibrium point at (0, 0) and nowhere else. Nonlinear systems can have any number of equilibrium points. We can perform the same kind of analysis near each of these equilibrium points if we linearize the local dynamics. The next section elaborates on this idea.

3 Local Linearization

If a system is linear, we can determine the stability type of the equilibrium point by the method above for 2 × 2 systems, and we can, at least in principle, find an algebraic solution for the eigenvalues and eigenvectors of linear systems up to 4 × 4. Beyond this, we are solving a fifth-degree or higher polynomial to get the eigenvalues, and there is no general solution to these in terms of radicals and powers. More often than not, our model is a nonlinear system of ODEs, and in addition there are equilibrium points other than the origin. In these cases, we would like to build a locally linear system and do the same analysis we did for linear systems near each equilibrium point. For most types of equilibria, the eigenvalues of the linearized system are sufficient to determine entirely the qualitative dynamics near the equilibrium point. Of all the types of equilibria, only the true nonlinear dynamics near centers of the linearized system remain a mystery. We will introduce a method to deal with problem cases like these shortly. Now we look at the method of local linearization for a two-dimensional system. The method generalizes easily to higher-dimensional systems. Suppose we have an equilibrium point at (x*, y*). Then near that point we can set up a coordinate system (u, v) = (x − x*, y − y*), and nearby we may expand the right-hand side of the nonlinear differential equations

[ẋ; ẏ] = [f(x, y); g(x, y)]

in a Taylor series. We get [u̇; v̇] = [f(u + x*, v + y*); g(u + x*, v + y*)], where

f(u + x*, v + y*) = f(x*, y*) + (u f_x(x*, y*) + v f_y(x*, y*)) + (1/2!)(u² f_xx(x*, y*) + 2uv f_xy(x*, y*) + v² f_yy(x*, y*)) + ...
Figure 7: Nonlinear stable and unstable manifolds W^s and W^u may be well approximated by the stable and unstable linear manifolds E^s and E^u near the equilibrium.

and

g(u + x*, v + y*) = g(x*, y*) + (u g_x(x*, y*) + v g_y(x*, y*)) + (1/2!)(u² g_xx(x*, y*) + 2uv g_xy(x*, y*) + v² g_yy(x*, y*)) + ....

Now because (x*, y*) is an equilibrium point, f(x*, y*) = 0 and g(x*, y*) = 0. Ignoring quadratic and higher terms, we obtain for f and g

f(x, y) ≈ u f_x(x*, y*) + v f_y(x*, y*),
g(x, y) ≈ u g_x(x*, y*) + v g_y(x*, y*),

or in matrix form,

[u̇; v̇] = [f_x, f_y; g_x, g_y]|_(x*, y*) [u; v].

The matrix J = [f_x, f_y; g_x, g_y] is known as the Jacobian matrix, and we can find its eigenvalues and eigenvectors at each equilibrium point (x*, y*)ᵢ to determine the local dynamics.
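In practice the entries of the Jacobian can be approximated by finite differences when analytic derivatives are inconvenient. Here is a Python sketch (the example system, with unit coefficients and equilibrium at (1, 1), is a hypothetical choice of ours, not taken from the text) that builds J numerically at an equilibrium and reads off the trace and determinant:

```python
def f(x, y):
    return x * (1.0 - y)   # hypothetical prey equation with unit coefficients

def g(x, y):
    return y * (x - 1.0)   # hypothetical predator equation; equilibrium at (1, 1)

def jacobian(f, g, xs, ys, h=1e-6):
    """Central-difference approximation of [[f_x, f_y], [g_x, g_y]] at (xs, ys)."""
    fx = (f(xs + h, ys) - f(xs - h, ys)) / (2 * h)
    fy = (f(xs, ys + h) - f(xs, ys - h)) / (2 * h)
    gx = (g(xs + h, ys) - g(xs - h, ys)) / (2 * h)
    gy = (g(xs, ys + h) - g(xs, ys - h)) / (2 * h)
    return [[fx, fy], [gx, gy]]

J = jacobian(f, g, 1.0, 1.0)
tau = J[0][0] + J[1][1]
delta = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(tau, delta)  # trace ~0, determinant ~1: the linearization is a center
```

Note that a center of the linearization is exactly the problem case flagged above: the linear analysis alone cannot decide the nonlinear behavior there.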