Pseudo-Time Methods for Constrained Optimization Problems Governed by PDE

Shlomo Ta'asan
Carnegie Mellon University and Institute for Computer Applications in Science and Engineering

Abstract

In this paper we present a novel method for solving optimization problems governed by partial differential equations. Existing methods use gradient information in marching toward the minimum, where the constrained PDE is solved once (sometimes only approximately) per each optimization step. Such methods can be viewed as marching techniques on the intersection of the state and costate hypersurfaces while improving the residuals of the design equation at each iteration. In contrast, the method presented here marches on the design hypersurface and at each iteration improves the residuals of the state and costate equations. The new method is usually much less expensive per iteration step, since in most problems of practical interest the design equation involves far fewer unknowns than either the state or costate equations. Convergence is shown using energy estimates for the evolution equations governing the iterative process. Numerical tests show that the new method allows the solution of the optimization problem at a cost equivalent to solving the analysis problem just a few times, independent of the number of design parameters. The method can be applied using single grid iterations as well as with multigrid solvers.

This research was supported by the National Aeronautics and Space Administration under NASA Contract No. NAS1-19480 while the author was in residence at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley Research Center, Hampton, VA 23681-0001.
1 Introduction

In the last few years there has been a growing interest in the numerical solution of optimization problems governed by large scale problems. This new interest is a direct result of the improvement in computer technology. Probably the most challenging problems are those which involve complex governing equations such as the Euler, Navier-Stokes, acoustic wave, and Maxwell equations. Some global quantities governed by the solutions of such equations are required to be minimized (maximized) in terms of some prescribed design variables. The resulting mathematical problem is formulated as a constrained optimization problem which can sometimes be viewed as a control problem.

Most existing algorithms use gradient information for reaching the minimum, possibly together with preconditioners for accelerating convergence. Efficient gradient calculation can be done using the adjoint equations, and in the area of aerodynamic design this approach was first suggested in [J]. There the steepest descent method was employed and the adjoint equations were used for efficient calculation of gradients. In this approach, each optimization step requires the solution of the state and costate equations, and an efficient implementation is achieved by using multigrid methods for both equations. No acceleration of the optimization process was involved in this work.

The one shot method proposed in [T] for control problems also uses the adjoint method together with multigrid acceleration for state and costate, but in addition includes an acceleration of the minimization process. Its development so far has been for problems with elliptic partial differential equations as constraints. The main idea is that smooth perturbations in the data of the problem introduce smooth changes in the solution, and highly oscillatory changes in the data produce highly oscillatory changes in the solution. Moreover, highly oscillatory changes are localized.
These observations enable the construction of very efficient optimization procedures, in addition to very efficient solvers for the state and costate variables. Design variables that correspond to smooth changes in the solution are solved for on coarse levels, and those corresponding to highly oscillatory changes are solved for on appropriate finer grids. The resulting method can be viewed as a preconditioning of the gradient descent method where the new condition number is independent of the grid size, and is of order one. Thus, within a few optimization iterations one reaches the minimum. The method was first developed for a small dimensional parameter space, where the optimization was done on the coarsest grid, yet converging to the fine grid solution [T]. Later, in [TSK], [TSK2], the method was applied to cases of a moderate number of design variables, where the optimization was done on a few of the coarsest grids. The extension of these ideas to the infinite dimensional parameter space was done in [AT], [AT2], where both boundary control as well as shape design problems were treated. In [AT], [AT2] an important new analysis of the structure of the functional near the minimum was introduced. That analysis also enables the construction of efficient relaxations for multigrid methods and of preconditioners for single grid techniques. Moreover, it can give essential information about the structure of the minimum, including the condition number of the optimization problem and the well-posedness (ill-posedness) of the problem, and can
suggest appropriate regularization techniques. Experiments with the one shot method for finite dimensional and infinite dimensional design spaces showed that the convergence rate is practically independent of the number of design parameters.

The necessity of using multigrid algorithms in the one shot methods is certainly a disadvantage, since in many engineering applications the underlying solvers do not use multigrid methods. This drawback has led to inquiries in other directions, but still aiming at algorithms that solve the full optimization problem in one shot, i.e., at a cost not much larger than that of solving the analysis problem. The first observation made was that the solution when using the adjoint method is an intersection point of three hypersurfaces describing the state equation, the costate equation and the design equation (together forming the necessary conditions of the minimization problem). The adjoint method can be viewed as marching on the intersection of the hypersurfaces corresponding to the state and costate variables, in the direction of the intersection with the design hypersurface. Since in most applications the number of design variables is significantly smaller than the number of state or costate variables, marching in the design hypersurface is a much less expensive process than the adjoint method, and may serve as a solution process for the optimization problem. Methods that march on the design hypersurface are not based on gradients, and their convergence properties are different.

In this paper we construct and analyze methods of this type by embedding the necessary conditions into an evolution equation so that the solution evolves in the design hypersurface. Energy estimates are used to prove convergence. The new methods, which are stable approximations to evolution processes, can be accelerated using multigrid or other acceleration techniques. Numerical results for model problems are presented and demonstrate the effectiveness of the method.
It is shown that the full optimization problem is solved in a computer time equivalent to just a few solutions of the analysis problem. The method seems to converge at a rate independent of the number of design variables.

2 On Adjoint Methods

We consider the following constrained minimization problem

    min_u  E(u, φ(u))    (1)

subject to

    L(u, φ) = 0    (2)

where L(u, ·) is a partial differential operator (possibly nonlinear) defined on a Hilbert space H of functions defined on a domain Ω. The design variable u is assumed to be in some other Hilbert space U, for example, functions defined on the boundary of Ω, or on part of it. The (formal) necessary conditions are

    L(u, φ) = 0          (State Equation)
    L_φ* λ + E_φ = 0     (Costate Equation)
    L_u* λ + E_u = 0     (Design Equation)    (3)
and we assume the existence of solutions for both the state and costate equations. It can be shown that the gradient of E(u, φ(u)) is given by

    A(u) = L_u* λ(u) + E_u(u, φ(u))    (4)

where φ(u), λ(u) are the solutions of the state and costate equations. The quantity −A(u) can serve as a minimization direction (steepest descent). The adjoint method consists of solving the state and costate equations at each update step of the design variables. Thus it can be viewed as an approximation to the following evolution process:

    L(u, φ) = 0    (5)
    L_φ* λ + E_φ = 0    (6)
    u_t + L_u* λ + E_u = 0    (7)

where u_t represents the derivative of u with respect to the pseudo-time variable t introduced into the problem. The actual iteration method is obtained by replacing u_t with (u^{n+1} − u^n)/δt, for a sufficiently small time step δt. The full algorithm can be viewed as a solver for the equation

    A(u) = 0    (8)

for the variable u. A crucial quantity to consider when analyzing the efficiency of different algorithms is the mapping

    u → A(u)    (9)

For problems arising from partial differential equations this mapping is a differential or a pseudo-differential operator, and bad conditioning is anticipated. Preconditioning of basic iterative methods, such as steepest descent, is needed. The one shot methods [T], [TSK], [AT], [AT2] were aiming at a preconditioning of the gradient algorithm in such a way that an order one condition number is obtained. In such a case the number of minimization steps required to reach the minimum is independent of the size of the problem, i.e., of the number of unknowns for u. This approach was found to be very successful for elliptic equations. The idea is to exploit the locality of high frequencies in the algorithm, as well as the fact that high frequencies in the design variables are related to high frequencies of the state variable and vice versa. Finite and infinite dimensional design spaces have been considered, with applications to aerodynamic and other shape design problems.
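The structure of the adjoint iteration described above can be sketched on a small discrete linear-quadratic model. The matrix, target, step size and iteration count below are illustrative assumptions, not taken from the paper; the point is only the structure: each pseudo-time step solves the state and costate equations and then updates the design variable.

```python
import numpy as np

# Minimal finite-dimensional sketch of the adjoint (steepest-descent)
# iteration of Section 2, for an illustrative linear-quadratic model:
#   min_u  1/2 |phi - phi0|^2 + 1/2 |u|^2   subject to   L phi = u.

n = 20
h = 1.0 / (n + 1)
# 1D discrete Laplacian with homogeneous Dirichlet boundary conditions
L = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
phi0 = np.sin(np.pi * h * np.arange(1, n + 1))    # target state

def gradient(u):
    """A(u): solve the state and costate equations, then form the
    residual of the design equation (here L_u = -I, so L_u* lam = -lam)."""
    phi = np.linalg.solve(L, u)                   # state:   L phi = u
    lam = np.linalg.solve(L.T, -(phi - phi0))     # costate: L^T lam = -(phi - phi0)
    return u - lam                                # design residual A(u)

u = np.zeros(n)
dt = 0.5                                          # pseudo-time step
for _ in range(200):
    u = u - dt * gradient(u)                      # u^{n+1} = u^n - dt * A(u^n)

print(np.linalg.norm(gradient(u)))                # design-equation residual, near 0
```

Note that each pseudo-time step costs two full PDE solves (state and costate); the methods of Section 3 are designed to avoid exactly this expense.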
Another possible direction, which has not been explored, is to construct single grid preconditioners based on the form of the symbol of the operator A. This idea will be discussed elsewhere.
3 The New Approach

Figure 1: Hypersurfaces for the state, costate and design equations

The solution of the minimization problem is the intersection point of the hypersurfaces defined by the state, costate, and design equations; see Figure 1. Gradient descent methods for constrained optimization problems march along the intersection of the state and costate hypersurfaces. Each step in such a process requires the solution of two large scale problems, namely, the discretized PDEs. Since in many applications the number of unknowns in either the state or costate equations is significantly larger than that in the design equation, marching on the hypersurface defined by the design equation is a much less expensive process than marching along the state and costate hypersurfaces. This is the main idea of the new approach. Each step in the minimization algorithms presented here improves the solution of the state and costate equations, for example, by improving the distance to the hypersurfaces defined by the state and costate equations. In addition, each step is such that the approximate solution lies on the design hypersurface.

The construction of algorithms that march along the design hypersurface and converge to the minimum of the optimization problem can be done for a wide class of problems governed by PDE. The approach taken here is to view iterative methods for the solution of the state and costate equations as stable approximations to evolution equations governed by the constrained PDE. The construction of the method is done in two steps. In the first, the stationary PDE (the necessary conditions) is embedded into an evolution PDE for which the solution evolves in the design hypersurface, and an energy estimate ensuring convergence is derived. The second step involves a stable and consistent discretization of the pseudo-time dependent problem, which is usually straightforward.

A technical difficulty which needs some explanation is related to the problem of staying on the design hypersurface. Assume that we are given an iterative method for calculating the solutions of the state and costate equations. Let the changes produced in φ and λ be φ̃ and λ̃ respectively. In order to remain on the design hypersurface it is necessary to calculate a change in u, namely ũ, such that

    A(u + ũ, φ + φ̃, λ + λ̃) = 0    (10)

An approximation to this equation is

    A_u ũ = −A_φ φ̃ − A_λ λ̃    (11)

This representation is useful when A_u is an invertible operator. Note that the solution of this equation involves a system whose size is identical to the number of design variables, which is significantly smaller than that of the state or costate equations. While the normal operator A_u* A_u is invertible, it is not convenient to work with; the operator A_u itself is simpler and easier to manipulate. In practice, A_u may not be invertible, and the update of u then requires a different process.

In problems arising from partial differential equations in which the design variables are defined on the boundary only, the design equation is an additional boundary condition for the system, for the extra unknowns, namely, the design variables. In that case, each iteration step of the method presented here requires the simultaneous solution of the three boundary conditions for the state equation, the costate equation and the design equation. These three conditions together involve only a fraction more work than the boundary conditions for the state and costate equations alone. In cases where the set of three boundary conditions cannot be solved for the boundary state, costate, and design variables, one should include equations from the interior. This is the typical case when considering systems of partial differential equations in several dimensions.
In case the linearized operator L_φ is coercive and the design equation can be solved for the design variables, keeping the state and costate variables fixed, one can view the method presented here as an approximation to the following time dependent problem:

    φ_t + L(u, φ) = 0    (12)
    λ_t + L_φ^T λ + E_φ = 0    (13)
    L_u* λ + E_u = 0    (14)

where the last equation is essentially an extra boundary condition for the design variables.
4 Examples

In this section we show a few examples using the ideas outlined in the previous section. We prove an energy estimate for each of the examples, ensuring convergence.

Example I: Distributed Control. Let Ω ⊂ ℝⁿ and consider the optimization problem

    min_u  (1/2) ∫_Ω (φ − φ₀)² dx + (1/2) ∫_Ω u² dx    (15)

subject to

    Δφ = u   in Ω
    φ = 0    on ∂Ω    (16)

The necessary conditions are

    Δφ = u,            φ = 0  on ∂Ω
    Δλ + φ − φ₀ = 0,   λ = 0  on ∂Ω
    u − λ = 0    (17)

Consider the pseudo-time embedding

    φ_t = Δφ − u
    λ_t = Δλ + φ − φ₀
    u = λ
    φ = 0,  λ = 0   on ∂Ω    (18)

We show that the error terms in φ, λ, u tend to zero as t approaches infinity. These error terms satisfy the same equations as the corresponding quantities φ, λ, u but with zero source terms, so without loss of generality we consider the problem with φ₀ = 0. The proof uses Poincare's inequality in the form

    ‖ψ‖² ≤ C (‖∇ψ‖² + (∫_Γ ψ ds)²),    C > 0    (19)

where Γ ⊂ ∂Ω and C > 0 is a constant independent of ψ ∈ H¹(Ω). The norm used above and in the rest of the paper is the L² norm on Ω. This inequality will be applied to functions vanishing on the part of the boundary denoted by Γ. For this example we take Γ = ∂Ω, since the boundary condition for the errors in both φ and λ is zero on ∂Ω. Multiplying the first two equations in (18) by φ and λ respectively, integration by parts and the Poincare inequality give

    (1/2) d/dt (‖φ‖² + ‖λ‖²) = −‖∇φ‖² − ‖∇λ‖² ≤ −C (‖φ‖² + ‖λ‖²)    (20)
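The pseudo-time embedding of Example I can be sketched numerically. The sketch below works in one space dimension with an illustrative grid, target, time step and iteration count (all assumptions, not from the paper): state, costate and design variables are advanced together, and the design equation u = λ is imposed at every step, so the iterate marches on the design hypersurface.

```python
import numpy as np

# 1D sketch of the pseudo-time embedding of Example I (distributed control):
#   phi_t = Lap phi - u,  lam_t = Lap lam + phi - phi0,  u = lam,
# with phi = lam = 0 on the boundary.

n = 20                        # interior grid points
h = 1.0 / (n + 1)
dt = 0.4 * h * h              # inside the explicit stability limit
x = h * np.arange(1, n + 1)
phi0 = np.sin(np.pi * x)      # target state

def lap(v):
    """1D discrete Laplacian with homogeneous Dirichlet conditions."""
    return (np.concatenate(([0.0], v[:-1])) - 2.0 * v
            + np.concatenate((v[1:], [0.0]))) / h**2

phi = np.zeros(n)
lam = np.zeros(n)
u = np.zeros(n)
for _ in range(8000):
    phi = phi + dt * (lap(phi) - u)            # phi_t = Lap phi - u
    lam = lam + dt * (lap(lam) + phi - phi0)   # lam_t = Lap lam + phi - phi0
    u = lam                                    # design equation u - lam = 0

# residuals of the stationary necessary conditions
print(np.linalg.norm(lap(phi) - u), np.linalg.norm(lap(lam) + phi - phi0))
```

In contrast to the adjoint iteration, no equation is solved exactly during the march; the state and costate residuals decay together, as predicted by the energy estimate.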
for some constant C > 0, independent of φ and λ. This implies that ‖φ‖² + ‖λ‖² decays exponentially, like exp(−2Ct). That is, the pseudo-time embedding converges to the minimum at a rate determined by the constant C.

Example II: Boundary Control. The next example is a boundary control problem. Let Ω ⊂ ℝⁿ, Γ₁ ∪ Γ₂ = ∂Ω, Γ₁ ∩ Γ₂ = ∅, and consider

    min_u  (1/2) ∫_{Γ₁} (φ − φ₀)² ds + (1/2) ∫_{Γ₁} u² ds    (21)

subject to

    Δφ = 0   in Ω
    ∂φ/∂n = u   on Γ₁
    φ = 0   on Γ₂    (22)

The necessary conditions are

    Δφ = 0,   Δλ = 0   in Ω
    ∂φ/∂n = u,   ∂λ/∂n = φ − φ₀,   u + λ = 0   on Γ₁
    φ = 0,   λ = 0   on Γ₂    (23)

The time embedding used for this problem is

    φ_t = Δφ
    λ_t = Δλ
    ∂φ/∂n = u,   ∂λ/∂n = φ − φ₀,   u + λ = 0   on Γ₁
    φ = 0,   λ = 0   on Γ₂    (24)

In this case Poincare's inequality is applied with Γ = Γ₂. Similarly to the previous example it can be shown that

    (1/2) d/dt (‖φ‖² + ‖λ‖²) = −‖∇φ‖² − ‖∇λ‖² ≤ −C (‖φ‖² + ‖λ‖²)    (25)

with a different constant than that of Example I. Again this estimate implies the exponential decay of the errors. Thus, convergence of φ and λ is ensured, and therefore also of u, from the Neumann boundary condition for φ.
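A one-dimensional analogue of the boundary-control embedding of Example II can be worked out by hand and checked numerically. Take Ω = (0,1) with Γ₁ = {x = 0} (Neumann control and tracking) and Γ₂ = {x = 1} (φ = λ = 0); the grid, time step and target value T below are illustrative assumptions. For this tiny problem the minimizer is explicit: the stationary state is linear with φ(0) = u, and minimizing (φ(0) − T)²/2 + u²/2 gives u = T/2.

```python
import numpy as np

# 1D analogue of the boundary-control embedding of Example II:
#   phi_t = phi_xx,  lam_t = lam_xx,
#   dphi/dn = u,  dlam/dn = phi(0) - T,  u + lam = 0   at x = 0,
#   phi = lam = 0                                      at x = 1.

n = 20
h = 1.0 / n
dt = 0.4 * h * h
T = 1.0                        # target for phi at x = 0

phi = np.zeros(n)              # nodes x_i = i*h, i = 0..n-1; phi(1) = 0 fixed
lam = np.zeros(n)
u = 0.0

def rhs(v, flux):
    """Discrete Laplacian; a ghost node encodes dv/dn = flux at x = 0."""
    d = np.empty_like(v)
    ghost = v[1] + 2.0 * h * flux          # v(-h) from the Neumann condition
    d[0] = (ghost - 2.0 * v[0] + v[1]) / h**2
    d[1:-1] = (v[:-2] - 2.0 * v[1:-1] + v[2:]) / h**2
    d[-1] = (v[-2] - 2.0 * v[-1] + 0.0) / h**2    # Dirichlet 0 at x = 1
    return d

for _ in range(20000):
    phi = phi + dt * rhs(phi, u)           # phi_t = phi_xx
    lam = lam + dt * rhs(lam, phi[0] - T)  # lam_t = lam_xx
    u = -lam[0]                            # design equation u + lam = 0

print(u)   # approaches T/2 = 0.5
```

The design equation is enforced at every step through the boundary condition only, at negligible extra cost per iteration.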
Example III: Systems of First Order. Let Ω ⊂ ℝⁿ with x ∈ Ω, let A₁, ..., Aₙ be symmetric constant matrices, and let φ = (φ₁, ..., φ_m) be defined on Ω. Let A = (A₁, ..., Aₙ) and let n denote the outward normal to the boundary. The matrix A·n is symmetric and therefore has real eigenvalues and a complete set of eigenvectors. Let φ = (φ₋, φ₀, φ₊) be the decomposition of φ into the direct sum of the subspaces of A·n corresponding to negative, zero and positive eigenvalues, respectively. For simplicity we also assume that A·n has zero eigenvalues only at isolated points of the boundary. Consider the following problem:

    min_u  (1/2) ∫_{Γ₁} (φ₊ − g)² ds + (α/2) ∫_{Γ₁} u² ds    (26)

where φ is the solution of

    Σ_{j=1}^n A_j ∂φ/∂x_j = 0   in Ω
    (A·n)₋ φ₋ = u   on Γ₁
    (A·n)₋ φ₋ = 0   on Γ₂    (27)

and Γ₁ ∪ Γ₂ = ∂Ω, Γ₁ ∩ Γ₂ = ∅. We further assume that there exists a constant K > 0 such that if φ₋ = φ₊ = 0 on the boundary for a time interval larger than K, then φ = 0.

The necessary conditions for the above optimization problem are

    Σ_{j=1}^n A_j ∂φ/∂x_j = 0
    Σ_{j=1}^n A_j ∂λ/∂x_j = 0
    (A·n)₋ φ₋ = u,   (A·n)₊ λ₊ = −ε(φ₊ − g),   λ₋ + εα u = 0   on Γ₁
    (A·n)₋ φ₋ = 0,   (A·n)₊ λ₊ = 0   on Γ₂    (28)

where ε is an arbitrary positive number (a rescaling of the costate variable); we use it to derive the convergence estimate. Consider the following time embedding:

    φ_t + Σ_{j=1}^n A_j ∂φ/∂x_j = 0
    λ_t − Σ_{j=1}^n A_j ∂λ/∂x_j = 0
    (A·n)₋ φ₋ = u,   (A·n)₊ λ₊ = −ε(φ₊ − g),   λ₋ + εα u = 0   on Γ₁
    (A·n)₋ φ₋ = 0,   (A·n)₊ λ₊ = 0   on Γ₂    (29)

For the analysis of the behavior of the errors we take g = 0, and using integration by parts we obtain

    (1/2) d/dt (‖φ‖² + ‖λ‖²) = −(1/2)⟨(A·n)₊ φ₊, φ₊⟩ − (1/2)⟨(A·n)₋ φ₋, φ₋⟩ + (1/2)⟨(A·n)₊ λ₊, λ₊⟩ + (1/2)⟨(A·n)₋ λ₋, λ₋⟩    (30)
where the norms denote L² norms on Ω and ⟨·,·⟩ denotes inner products on the boundary. Eliminating u from the boundary conditions, we obtain the following boundary conditions that must be satisfied by φ, λ on Γ₁:

    εα (A·n)₋ φ₋ + λ₋ = 0    (31)
    (A·n)₊ λ₊ + ε φ₊ = 0    (32)

Substituting these into the energy estimate we obtain

    (1/2) d/dt (‖φ‖² + ‖λ‖²) = −(1/2)⟨((A·n)₊ − ε²(A·n)₊⁻¹) φ₊, φ₊⟩_{Γ₁} − (1/2)⟨((A·n)₋ − ε²α²(A·n)₋³) φ₋, φ₋⟩_{Γ₁} − (1/2)⟨(A·n)₊ φ₊, φ₊⟩_{Γ₂} + (1/2)⟨(A·n)₋ λ₋, λ₋⟩_{Γ₂}    (33)

The conditions

    (A·n)₊ − ε²(A·n)₊⁻¹ ≥ 0,    ε²α²(A·n)₋² − I ≥ 0   on Γ₁    (34)

imply that the change in energy is non-positive; that is, the energy is either decreasing or stabilized. Stabilization of the energy can occur only at the value zero, since otherwise there would exist a nonzero solution of the evolution equations such that φ₋ and φ₊ are zero on the boundary for all times, in contradiction to the assumption about the constraint PDE. Since ε was arbitrary in this analysis, we can choose it small enough so that the first condition holds. Then, if α is large enough, the second condition holds as well. Thus, we obtain convergence if α is not too small.

5 Numerical Results

In this section we demonstrate the effectiveness of the methods discussed here for an optimization problem governed by inviscid incompressible flow. Let Ω = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}, Γ₁ = {(x, 0) | 0 ≤ x ≤ 1}, Γ₂ = {(x, 1) | 0 ≤ x ≤ 1}. The minimization problem is given by

    min_u  (1/2) ∫_{Γ₁} (φ − φ₀)² dx + (γ/2) ∫_{Γ₁} u² dx    (35)

subject to

    Δφ = 0   in Ω
    ∂φ/∂n = u   on Γ₁
    φ = g(x)   on Γ₂    (36)

and all functions are assumed to be periodic in the x direction.
5.1 Finite Dimensional Design Space

We assume that u has the form

    u(x) = Σ_{j=1}^q β_j f_j(x)    (37)

where the β_j are constants to be determined and the f_j(x) are prescribed functions. The necessary conditions for this problem are easily derived, and we use the following time embedding as a solution process:

    φ_t − Δφ = 0
    λ_t − Δλ = 0
    ∂φ/∂n = u,   ∂λ/∂n = φ − φ₀   on Γ₁
    φ = g,   λ = 0   on Γ₂
    ∫₀¹ f_j(x) λ(x, 0) dx + γ β_j = 0,   j = 1, ..., q    (38)

This time dependent process was approximated by a Jacobi relaxation, where at each time step all boundary conditions are satisfied. The residual history is given in Figure 2 and shows that the convergence rate is independent of the number of design variables.

5.2 Infinite Dimensional Design Space

In this case we look for u in a function space, namely L²(0, 1). The necessary conditions are the stationary solution of the following time evolution equation, which was used in the computation:

    φ_t − Δφ = 0
    λ_t − Δλ = 0
    ∂φ/∂n = u,   ∂λ/∂n = φ − φ₀,   λ(x, 0) + γ u(x) = 0   on Γ₁
    φ = g(x),   λ = 0   on Γ₂    (39)

This time dependent process was approximated by a Jacobi relaxation, where at each time step all boundary conditions are satisfied. The residual history is given in Figure 3 and shows that the convergence rate is independent of the grid size. It can be seen from that figure that the numbers of iterations for the analysis problem and for the full optimization problem are not much different.
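The infinite dimensional process of Section 5.2 can be sketched directly. The grid size, the data g = 0, the target φ₀(x) = cos(2πx), the weight γ = 1 and the iteration count below are illustrative assumptions; the paper's Jacobi relaxation is replaced here by an equivalent explicit pseudo-time step.

```python
import numpy as np

# Sketch of the Section 5.2 evolution process: Laplace equations on the unit
# square, periodic in x, Dirichlet data on top (Gamma_2), Neumann control
# u(x) and the design equation lam(x,0) + gamma*u(x) = 0 on the bottom
# (Gamma_1).

N, M = 16, 16                  # x resolution and number of y intervals
h = 1.0 / N
dt = 0.2 * h * h               # inside the 2D explicit stability limit
gamma = 1.0
x = h * np.arange(N)
phi0 = np.cos(2.0 * np.pi * x) # target trace on the bottom boundary

def heat_rhs(v, flux):
    """Five-point Laplacian on an (N, M+1) array, periodic in x; a ghost
    row encodes dv/dn = flux at y = 0; row M (y = 1) is held fixed."""
    vxx = (np.roll(v, 1, axis=0) - 2.0 * v + np.roll(v, -1, axis=0)) / h**2
    d = np.zeros_like(v)
    ghost = v[:, 1] + 2.0 * h * flux
    d[:, 0] = vxx[:, 0] + (ghost - 2.0 * v[:, 0] + v[:, 1]) / h**2
    d[:, 1:M] = vxx[:, 1:M] + (v[:, 0:M-1] - 2.0 * v[:, 1:M] + v[:, 2:M+1]) / h**2
    return d

phi = np.zeros((N, M + 1))     # phi[:, M] = g = 0 (Dirichlet, held fixed)
lam = np.zeros((N, M + 1))     # lam[:, M] = 0
u = np.zeros(N)
for _ in range(40000):
    phi = phi + dt * heat_rhs(phi, u)                 # dphi/dn = u on Gamma_1
    lam = lam + dt * heat_rhs(lam, phi[:, 0] - phi0)  # dlam/dn = phi - phi0
    u = -lam[:, 0] / gamma                            # lam(x,0) + gamma*u = 0

print(np.abs(heat_rhs(phi, u)).max(), np.abs(heat_rhs(lam, phi[:, 0] - phi0)).max())
```

The finite dimensional variant of Section 5.1 differs only in the design update, which projects −λ(x, 0)/γ onto the span of the prescribed basis functions f_j.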
Figure 2: Residual history for the analysis and full optimization methods, grid 32x32, γ = 0.1
Figure 3: Residual history for the analysis and full optimization methods, grid 32x32

6 Conclusions

In this paper we have introduced pseudo-time methods for the efficient solution of optimization problems governed by partial differential equations. In these methods the marching toward the solution of the optimization problem is done on the design hypersurface rather than on the intersection of the hypersurfaces for the state and costate equations. Very efficient solvers are obtained, as indicated by the proofs as well as by the numerical examples included. The methods allow the solution of the full optimization problem at a computational cost similar to that of solving the constraint PDE. The methods do not require gradient calculations; however, it is essential to use them with the adjoint equations. The methods offer an alternative to gradient descent methods. Their implementation is straightforward and can be done using multigrid algorithms or single grid iterations.

Acknowledgment

The author wishes to thank David Kinderlehrer for valuable discussions.

References

[AT] E. Arian and S. Ta'asan: Shape Optimization in One Shot. Proceedings of a Workshop on Optimal Design and Control held at Blacksburg, VA, April 8-9, 1994.

[AT2] E. Arian and S. Ta'asan: Multigrid One Shot Methods for Optimal Control Problems: Infinite Dimensional Control. ICASE Report No. 94-52.

[BD] F. Beux and A. Dervieux: A Hierarchical Approach for Shape Optimization. INRIA Rapports de Recherche No. 1868 (1993).

[CGBN] A. Carle, L.L. Green, C.H. Bischof and P. Newman: "Application of Automatic Differentiation in CFD". AIAA 94-2197, June 1994.
[GNH] L. Green, P. Newman and K. Haigler: "Sensitivity Derivatives for Advanced CFD Algorithms and Viscous Modeling Parameters via Automatic Differentiation". AIAA 93-3321, 1993.

[HMY] W.P. Huffman, R.G. Melvin, D.P. Young, F.T. Johnson, J.E. Bussoletti, M.B. Bieterman and C.L. Hilmes: "Practical Design and Optimization in Computational Fluid Dynamics". AIAA 93-3111, July 1993.

[J] A. Jameson: "Aerodynamic Design via Control Theory". J. Sci. Comp., Vol. 3, 1988.

[T] S. Ta'asan: One-Shot Methods for Optimal Control of Distributed Parameter Systems I: Finite Dimensional Control. ICASE Report No. 91-2, 1991.

[TSK] S. Ta'asan, M.D. Salas and G. Kuruvila: Aerodynamic Design and Optimization in One Shot. Proceedings of the AIAA 30th Aerospace Sciences Meeting and Exhibit, Jan. 6-9, 1992, Reno, NV.

[TSK2] S. Ta'asan, M.D. Salas and G. Kuruvila: A New Approach to Aerodynamic Optimization. Proceedings of the First European Conference on Numerical Methods in Engineering, Sept. 1992, Brussels, Belgium.