Lecture Note III: Least-Squares Method

Zhiqiang Cai
Department of Mathematics, Purdue University, West Lafayette, IN 47907-1395, U.S.A.
October 4, 2004

In this chapter, we present least-squares methods for second-order scalar partial differential equations, the elasticity equations of solids, incompressible Newtonian fluid flow, and Maxwell's equations in electromagnetics.

1 A General Methodology

We give a general methodology for the design of least-squares methods applied to a first-order system of partial differential equations. Consider the following first-order partial differential system defined on a bounded domain $\Omega \subset \mathbb{R}^d$ ($d = 2$ or $3$):

\[
\begin{cases}
\mathcal{L}\,U = F & \text{in } \Omega, \\
\mathcal{B}\,U = G & \text{on } \partial\Omega,
\end{cases}
\tag{1.1}
\]

where $\mathcal{L} = (L_{ij})_{m\times n}$ is a block $m \times n$ matrix differential operator of at most first order, $\mathcal{B} = (B_{ij})_{l\times n}$ is a block $l \times n$ matrix operator, $U = (U_i)_{n\times 1}$ is the unknown, $F = (F_i)_{m\times 1}$ is a given block vector-valued function defined in $\Omega$, and $G = (G_i)_{l\times 1}$ is a given block vector-valued function defined on $\partial\Omega$. Assume that the first-order system (1.1) has a unique solution $U$.

Boundary conditions in a least-squares formulation can be imposed either strongly in the solution space or weakly by adding boundary functionals. For simplicity of presentation, we impose them in the solution space $\Phi$. Assume that $\Phi$ is chosen so that the least-squares functional is well defined. Define the least-squares functional by

\[
G(U; F) = \sum_{i=1}^{m} \Big\| \sum_{j=1}^{n} L_{ij} U_j - F_i \Big\|^2_{k_i,\Omega},
\tag{1.2}
\]

where $\|\cdot\|_{k_i,\Omega}$ denotes a Sobolev norm and $k_i = -1$ or $0$. If $k_i = 0$ for all $i$, $G(U; F)$ is referred to as the $L^2$ norm least-squares functional; otherwise, it is referred to as the inverse norm least-squares functional. Denote the Laplace operator by $\Delta$ and the $L^2$ inner product by $(f, g) = \int_\Omega f g \, dx$. Then

\[
\|\cdot\|_{0,\Omega} = (\cdot,\cdot)^{1/2}
\quad\text{and}\quad
\|\cdot\|_{-1,\Omega} = \big((-\Delta)^{-1}\cdot,\cdot\big)^{1/2}.
\]

The least-squares minimization problem is to minimize the least-squares functional over $\Phi$: find $U \in \Phi$ such that

\[
G(U; F) = \min_{V \in \Phi} G(V; F).
\tag{1.3}
\]

This is equivalent to solving the normal equation

\[
\mathcal{L}^* K \mathcal{L}\, U = \mathcal{L}^* K F,
\tag{1.4}
\]

where $\mathcal{L}^*$ is the adjoint operator of $\mathcal{L}$ with respect to the $L^2$ inner product and $K$ is a block diagonal operator with each block associated with the $H^{k_i}(\Omega)$ norm:

\[
K = \mathrm{diag}(K_1, \ldots, K_m), \qquad K_i = I \ \text{or}\ (-\Delta)^{-1}.
\]

That is, each diagonal block of $K$ is either the identity or the inverse of the Laplacian. For the $L^2$ least squares, we have $K = I$. The normal operator $\mathcal{L}^* K \mathcal{L}$ is a differential operator of at most second order. The variational form of (1.4) is to find $U \in \Phi$ such that

\[
b(U, V) \equiv (K \mathcal{L} U, \mathcal{L} V) = (K F, \mathcal{L} V) \equiv f(V) \qquad \forall\, V \in \Phi.
\tag{1.5}
\]

It is easy to check that $G(V; 0) = b(V, V)$.

The design of a least-squares method consists of choosing the first-order system (1.1) and the least-squares norms so that the least-squares problem (i.e., the normal equation, the minimization problem, or the weak form) can be solved numerically both effectively and efficiently; that is,

(a) the least-squares variables can be discretized with optimal accuracy;
(b) the resulting algebraic system can be solved with optimal complexity.

It is well known that problems with the identity or Laplace operators can be solved numerically with both optimal accuracy and optimal complexity. Recently, it was shown that this is also true for problems involving the $H(\mathrm{div})$ and $H(\mathrm{curl})$ operators

\[
H_{\mathrm{div}} = I - \nabla\nabla\cdot
\quad\text{and}\quad
H_{\mathrm{curl}} = I + \nabla\times\nabla\times,
\]

provided that Raviart-Thomas elements [17] are used for $H(\mathrm{div})$ and edge elements [3] are used for $H(\mathrm{curl})$. Hence, one wants to develop the least-squares method so that the normal operator $\mathcal{L}^* K \mathcal{L}$ is equivalent to a block diagonal operator whose diagonal blocks are the identity, Laplacian, $H_{\mathrm{div}}$, or $H_{\mathrm{curl}}$ operators:

\[
D = \mathrm{diag}(D_1, \ldots, D_n), \qquad D_i = I,\ -\Delta,\ H_{\mathrm{div}},\ \text{or}\ H_{\mathrm{curl}}.
\]

With such an equivalence, the least-squares problem can be solved numerically with optimal accuracy and optimal complexity. Moreover, finite element spaces for different variables $U_i$ can be chosen independently and, hence, based solely on their approximation properties and implementation/computational costs.

The above equivalence between $\mathcal{L}^* K \mathcal{L}$ and $D$ means that there exist positive constants $\alpha_0$ and $\alpha_1$ such that

\[
\alpha_0 \sum_{i=1}^{n} \|V_i\|^2_{D_i} \le b(V, V)
\quad\text{and}\quad
b(U, V) \le \alpha_1 \Big( \sum_{i=1}^{n} \|U_i\|^2_{D_i} \Big)^{1/2} \Big( \sum_{i=1}^{n} \|V_i\|^2_{D_i} \Big)^{1/2}
\tag{1.6}
\]

for all $U, V \in \Phi$. Here $\|\cdot\|_{D_i}$ denotes the $L^2$, $H^1$, $H(\mathrm{div})$, or $H(\mathrm{curl})$ norm. The main task in analyzing least-squares methods is to establish (1.6).

Many physical models involve parameters such as the Lamé constants for solids and the viscosity parameters for fluids. It is then important to establish the equivalence with constants independent of these parameters, because parameter-independent equivalence implies robustness of the least-squares method with respect to these parameters.

1.1 Least-Squares Approximation

Assume that $\Phi^h$ is a finite-dimensional subspace of $\Phi$ satisfying the following approximation property:

\[
\inf_{V^h \in \Phi^h} \Big( \sum_{i=1}^{n} \|V_i - V_i^h\|^2_{D_i} \Big)^{1/2} \le C_a\, h^s
\tag{1.7}
\]

for all $V = (V_i)_{n\times 1} \in \Phi$. Then the least-squares approximation is to find $U^h \in \Phi^h$ such that

\[
G(U^h; F) = \min_{V \in \Phi^h} G(V; F).
\tag{1.8}
\]

Equivalently, find $U^h \in \Phi^h$ such that

\[
b(U^h, V) = f(V) \qquad \forall\, V \in \Phi^h.
\tag{1.9}
\]

Theorem 1.1. Let $U$ and $U^h$ be the solutions of (1.5) and (1.9), respectively. Assume that the equivalence (1.6) and the approximation property (1.7) hold. Then we have the following error estimate:

\[
\Big( \sum_{i=1}^{n} \|U_i - U_i^h\|^2_{D_i} \Big)^{1/2} \le C_a\, \frac{\alpha_1}{\alpha_0}\, h^s.
\tag{1.10}
\]

Proof: The difference of (1.5) and (1.9) gives the error equation

\[
b(U - U^h, V) = 0 \qquad \forall\, V \in \Phi^h.
\tag{1.11}
\]

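It follows from (1.6), the error equation (1.11), and (1.7) that, for any $V \in \Phi^h$,

\[
\alpha_0 \sum_{i=1}^{n} \|U_i - U_i^h\|^2_{D_i}
\le b(U - U^h, U - U^h) = b(U - U^h, U - V)
\le \alpha_1 \Big( \sum_{i=1}^{n} \|U_i - U_i^h\|^2_{D_i} \Big)^{1/2} \Big( \sum_{i=1}^{n} \|U_i - V_i\|^2_{D_i} \Big)^{1/2}.
\]

Taking the infimum over $V \in \Phi^h$ and using (1.7), the right-hand side is bounded by

\[
\alpha_1\, C_a\, h^s \Big( \sum_{i=1}^{n} \|U_i - U_i^h\|^2_{D_i} \Big)^{1/2}.
\]

Dividing both sides by $\alpha_0 \big( \sum_{i=1}^{n} \|U_i - U_i^h\|^2_{D_i} \big)^{1/2}$ yields (1.10) and, hence, the theorem.

1.2 Mesh Refinement Indicator

Let $U$ be the solution of (1.1) and let $V \in \Phi$ be a computed approximation to $U$. Below, $a \approx b$ means that $a$ and $b$ bound each other up to the constants $\alpha_0$ and $\alpha_1$ of (1.6). Then (1.1) and (1.6) imply

\[
G(V; F) = \sum_{i=1}^{m} \Big\| \sum_{j=1}^{n} L_{ij} V_j - F_i \Big\|^2_{k_i,\Omega}
= \sum_{i=1}^{m} \Big\| \sum_{j=1}^{n} L_{ij} (V_j - U_j) \Big\|^2_{k_i,\Omega}
\approx \sum_{i=1}^{n} \|V_i - U_i\|^2_{D_i}.
\tag{1.12}
\]

Since

\[
G(0; F) = \sum_{i=1}^{m} \|F_i\|^2_{k_i,\Omega}
= \sum_{i=1}^{m} \Big\| \sum_{j=1}^{n} L_{ij} U_j \Big\|^2_{k_i,\Omega}
\approx \sum_{i=1}^{n} \|U_i\|^2_{D_i},
\]

combining with (1.12) gives

\[
\frac{G(V; F)}{G(0; F)} \approx \frac{\sum_{i=1}^{n} \|V_i - U_i\|^2_{D_i}}{\sum_{i=1}^{n} \|U_i\|^2_{D_i}}.
\tag{1.13}
\]

Relation (1.12) means that the value of the least-squares functional at $V$ measures the absolute difference between the solution $U$ and the approximation $V$ in the norm induced by the functional, while (1.13) measures the corresponding relative difference. Therefore, the value of the least-squares functional at $V$ on each element is likely to give a reasonable mesh refinement indicator. This is especially useful for nonlinear problems.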
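The relations (1.4), (1.6), and (1.12) have a simple finite-dimensional analogue that can be checked numerically. The sketch below is only an illustration and not part of the method in these notes: it replaces the differential operator by a random full-rank matrix `L` (an assumption), takes $K = I$, and uses the squared extreme singular values of `L` as the constants $\alpha_0$ and $\alpha_1$. All function and variable names are hypothetical.

```python
import numpy as np

def ls_functional(L, V, F):
    """Discrete analogue of G(V; F) = ||L V - F||^2 with K = I."""
    r = L @ V - F
    return float(r @ r)

def solve_normal_equation(L, F):
    """Solve the normal equation L^T L U = L^T F, the analogue of (1.4)."""
    return np.linalg.solve(L.T @ L, L.T @ F)

rng = np.random.default_rng(0)
m, n = 8, 5                        # m equations, n unknowns
L = rng.standard_normal((m, n))    # stand-in for the first-order operator
U = rng.standard_normal(n)         # "exact" solution
F = L @ U                          # consistent data, so L U = F has a unique solution

U_ne = solve_normal_equation(L, F)

# Analogue of (1.6): alpha0 |V|^2 <= |L V|^2 <= alpha1 |V|^2
sigma = np.linalg.svd(L, compute_uv=False)
alpha0, alpha1 = sigma[-1] ** 2, sigma[0] ** 2

# Analogue of (1.12): the functional value brackets the squared error
V = U + 0.1 * rng.standard_normal(n)   # a perturbed "approximation"
err2 = float((V - U) @ (V - U))
G_val = ls_functional(L, V, F)
print(alpha0 * err2, G_val, alpha1 * err2)
```

The printed triple should be ordered increasingly: the computable quantity `G_val` is sandwiched between multiples of the (normally unknown) squared error, which is the property that makes the functional usable as an error indicator.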

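2 Second-Order Scalar PDEs

Consider the following second-order elliptic boundary value problem:

\[
\begin{cases}
-\nabla\cdot(A\nabla p) + Xp = f & \text{in } \Omega, \\
p = 0 & \text{on } \Gamma_D, \\
n\cdot A\nabla p = 0 & \text{on } \Gamma_N,
\end{cases}
\tag{2.1}
\]

where $A$ is a $d \times d$ symmetric matrix of functions in $L^\infty(\Omega)$ and $X$ is a linear differential operator of at most first order. We assume that $A$ is uniformly symmetric positive definite: there exist positive constants $0 < \lambda \le \Lambda$ such that

\[
\lambda\, \xi^T \xi \le \xi^T A\, \xi \le \Lambda\, \xi^T \xi
\tag{2.2}
\]

for all $\xi \in \mathbb{R}^d$ and almost all $x \in \Omega$. The corresponding variational form of system (2.1) is to find $p \in V$ such that

\[
a(p, q) = f(q) \qquad \forall\, q \in V,
\tag{2.3}
\]

where

\[
V = \begin{cases}
H^1_{0,D}(\Omega) & \text{if } \mathrm{mes}(\Gamma_D) \ne 0, \\
\hat{H}^1(\Omega) & \text{otherwise},
\end{cases}
\]

with $H^1_{0,D}(\Omega) = \{ v \in H^1(\Omega) : v = 0 \text{ on } \Gamma_D \}$ and $\hat{H}^1(\Omega) = \{ v \in H^1(\Omega) : \int_\Omega v \, dx = 0 \}$, and the bilinear and linear forms are defined by

\[
a(p, q) = (A\nabla p, \nabla q) + (Xp, q) \quad\text{and}\quad f(q) = (f, q),
\]

respectively. Under appropriate assumptions on $\Gamma_D$ and $X$, problem (2.3) is uniquely solvable in $H^1_{0,D}(\Omega)$ for any $f \in H^{-1}(\Omega)$, or uniquely solvable in $\hat{H}^1(\Omega)$ if and only if $f$ satisfies the compatibility condition $\int_\Omega f \, dx = 0$.

2.1 First-Order System of PDEs

For (2.1), we consider two first-order systems. To this end, introducing the flux variable $u = -A\nabla p$, problem (2.1) may be rewritten as a first-order system of partial differential equations as follows:

\[
\mathcal{L}\, U \equiv
\begin{pmatrix} A^{-1/2} & A^{1/2}\nabla \\ \nabla\cdot & X \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
= \begin{pmatrix} 0 \\ f \end{pmatrix} \equiv F \quad \text{in } \Omega
\tag{2.4}
\]

with boundary conditions

\[
p = 0 \ \text{on } \Gamma_D \quad\text{and}\quad n\cdot u = 0 \ \text{on } \Gamma_N.
\tag{2.5}
\]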

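Based on this system, we will consider two functionals: the div least-squares functional and the inverse norm least-squares functional.

Note that if $u$ is sufficiently smooth, then the properly scaled solution $A^{-1}u$ of (2.4) is curl free, i.e., $\nabla\times(A^{-1}u) = 0$, and that the homogeneous Dirichlet boundary condition on $\Gamma_D$ implies the tangential flux condition $n\times(A^{-1}u) = 0$ on $\Gamma_D$. We then have a redundant but consistent first-order system:

\[
\mathcal{L}\, U \equiv
\begin{pmatrix} A^{-1/2} & A^{1/2}\nabla \\ \nabla\cdot & X \\ \nabla\times A^{-1} & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
= \begin{pmatrix} 0 \\ f \\ 0 \end{pmatrix} \equiv F
\tag{2.6}
\]

with boundary conditions

\[
p = 0,\quad n\times(A^{-1}u) = 0 \ \text{on } \Gamma_D, \quad\text{and}\quad n\cdot u = 0 \ \text{on } \Gamma_N.
\tag{2.7}
\]

Based on this system, we will consider the div-curl least-squares functional.

2.2 Div Least-Squares Functional

Let $H_N(\mathrm{div}; \Omega)$ denote the following subspace of $H(\mathrm{div}; \Omega)$:

\[
H_N(\mathrm{div}; \Omega) = \{ v \in H(\mathrm{div}; \Omega) : n\cdot v = 0 \text{ on } \Gamma_N \}.
\]

For any $(v, q) \in H_N(\mathrm{div}; \Omega) \times V \equiv \Phi$, consider the following div least-squares functional:

\[
G(v, q; f) = \|A^{-1/2} v + A^{1/2}\nabla q\|^2_{0,\Omega} + \|\nabla\cdot v + Xq - f\|^2_{0,\Omega}.
\tag{2.8}
\]

The corresponding normal operator is

\[
\mathcal{L}^*\mathcal{L} =
\begin{pmatrix}
A^{-1} - \nabla\,\mathrm{div} & \nabla - \nabla X \\
-\mathrm{div} + X^*\mathrm{div} & -\mathrm{div}\, A\nabla + X^* X
\end{pmatrix}
\tag{2.9}
\]

with $\mathcal{L}$ defined in (2.4), and the corresponding bilinear and linear forms are

\[
b(u, p; v, q) = (A^{-1/2} u + A^{1/2}\nabla p,\ A^{-1/2} v + A^{1/2}\nabla q) + (\nabla\cdot u + Xp,\ \nabla\cdot v + Xq)
\tag{2.10}
\]

\[
f(v, q) = (f,\ \nabla\cdot v + Xq).
\tag{2.11}
\]

The main task of this section is to establish the following equivalence:

\[
\mathcal{L}^*\mathcal{L} \sim
\begin{pmatrix} I - \nabla\,\mathrm{div} & 0 \\ 0 & -\Delta \end{pmatrix}.
\]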
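As a sanity check on the normal operator (2.9), it may help to write it out in the simplest case. The following worked computation assumes $A = I$ and $X = 0$ (an illustrative special case, not treated separately in the notes) and uses only the formal adjoint relation $(\nabla q, v) = -(q, \nabla\cdot v)$, which holds for $(v, q) \in H_N(\mathrm{div};\Omega) \times V$ because the boundary term vanishes.

```latex
% Special case A = I, X = 0 of (2.4) and (2.9).
\[
\mathcal{L} = \begin{pmatrix} I & \nabla \\ \nabla\cdot & 0 \end{pmatrix},
\qquad
\mathcal{L}^* = \begin{pmatrix} I & -\nabla \\ -\nabla\cdot & 0 \end{pmatrix},
\qquad
\mathcal{L}^*\mathcal{L} =
\begin{pmatrix} I - \nabla\nabla\cdot & \nabla \\ -\nabla\cdot & -\Delta \end{pmatrix}.
\]
% Quadratic form: the off-diagonal blocks only produce the cross term 2(v, \nabla q).
\[
b(v,q;v,q) = \|v + \nabla q\|^2_{0,\Omega} + \|\nabla\cdot v\|^2_{0,\Omega}
= \|v\|^2_{0,\Omega} + \|\nabla\cdot v\|^2_{0,\Omega} + \|\nabla q\|^2_{0,\Omega} + 2\,(v, \nabla q).
\]
```

The diagonal blocks are exactly $I - \nabla\,\mathrm{div}$ and $-\Delta$, and the cross term equals $-2(\nabla\cdot v, q)$, which is of lower order in $q$. This is the heuristic reason to expect the stated equivalence; the rigorous statement is the theorem that follows.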

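Theorem 2.1. There exist positive constants $\alpha_0$ and $\alpha_1$ such that

\[
\alpha_0 \big( \|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \big) \le b(v, q; v, q) = G(v, q; 0)
\tag{2.12}
\]

for any $(v, q) \in H_N(\mathrm{div}; \Omega) \times V$ and

\[
b(u, p; v, q) \le \alpha_1 \big( \|u\|^2_{H(\mathrm{div})} + \|p\|^2_{1,\Omega} \big)^{1/2} \big( \|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \big)^{1/2}
\tag{2.13}
\]

for any $(u, p)$ and any $(v, q)$ in $H_N(\mathrm{div}; \Omega) \times V$.

Proof: (2.13) is a direct consequence of the Cauchy-Schwarz and triangle inequalities. To show the validity of (2.12), we first establish that

\[
\|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \le C \big( G(v, q; 0) + \|q\|^2_{0,\Omega} \big).
\tag{2.14}
\]

It follows from integration by parts (the boundary term vanishes since $q = 0$ on $\Gamma_D$ and $n\cdot v = 0$ on $\Gamma_N$), the Cauchy-Schwarz inequality, the Poincaré inequality, and (2.2) that

\[
\begin{aligned}
\|A^{1/2}\nabla q\|^2_{0,\Omega}
&= (A^{1/2}\nabla q + A^{-1/2} v,\ A^{1/2}\nabla q) - (v, \nabla q) \\
&= (A^{1/2}\nabla q + A^{-1/2} v,\ A^{1/2}\nabla q) + (\nabla\cdot v, q) \\
&= (A^{1/2}\nabla q + A^{-1/2} v,\ A^{1/2}\nabla q) + (\nabla\cdot v + Xq,\ q) - (Xq, q) \\
&\le \|A^{1/2}\nabla q + A^{-1/2} v\|_{0,\Omega}\, \|A^{1/2}\nabla q\|_{0,\Omega} + \|\nabla\cdot v + Xq\|_{0,\Omega}\, \|q\|_{0,\Omega} + \|Xq\|_{0,\Omega}\, \|q\|_{0,\Omega} \\
&\le \big( \|A^{1/2}\nabla q + A^{-1/2} v\|_{0,\Omega} + C \|q\|_{0,\Omega} \big) \|A^{1/2}\nabla q\|_{0,\Omega} + \|\nabla\cdot v + Xq\|_{0,\Omega}\, \|q\|_{0,\Omega}.
\end{aligned}
\]

Combining this with the fact that $2ab \le a^2 + b^2$, we have

\[
\|q\|^2_{1,\Omega} \le C \|A^{1/2}\nabla q\|^2_{0,\Omega} \le C \big( G(v, q; 0) + \|q\|^2_{0,\Omega} \big).
\tag{2.15}
\]

(2.2), the triangle inequality, and (2.15) give

\[
\|v\|^2_{0,\Omega} \le \Lambda \|A^{-1/2} v\|^2_{0,\Omega}
\le 2\Lambda \big( \|A^{-1/2} v + A^{1/2}\nabla q\|^2_{0,\Omega} + \|A^{1/2}\nabla q\|^2_{0,\Omega} \big)
\le C \big( G(v, q; 0) + \|q\|^2_{0,\Omega} \big).
\]

By the triangle inequality and (2.15), we have

\[
\|\nabla\cdot v\|^2_{0,\Omega}
\le 2 \big( \|\nabla\cdot v + Xq\|^2_{0,\Omega} + \|Xq\|^2_{0,\Omega} \big)
\le 2 \|\nabla\cdot v + Xq\|^2_{0,\Omega} + C \|q\|^2_{1,\Omega}
\le C \big( G(v, q; 0) + \|q\|^2_{0,\Omega} \big).
\]

Combining the above three inequalities yields (2.14).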

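With (2.14), we show the validity of (2.12) by a compactness argument. To this end, assume that (2.12) is not true. This implies that there exists a sequence $\{(v_n, q_n)\} \subset H_N(\mathrm{div}; \Omega) \times V$ such that

\[
\|v_n\|^2_{H(\mathrm{div})} + \|q_n\|^2_{1,\Omega} = 1
\quad\text{and}\quad
G(v_n, q_n; 0) \le \frac{1}{n}.
\tag{2.16}
\]

Since $V$ is compactly contained in $L^2(\Omega)$, there exists a subsequence $\{q_{n_k}\} \subset V$ which converges in $L^2(\Omega)$. For any $k$, $l$ and $(v_{n_k}, q_{n_k}),\ (v_{n_l}, q_{n_l}) \in H_N(\mathrm{div}; \Omega) \times V$, it follows from (2.14) and the triangle inequality that

\[
\begin{aligned}
\|v_{n_k} - v_{n_l}\|^2_{H(\mathrm{div})} + \|q_{n_k} - q_{n_l}\|^2_{1,\Omega}
&\le C \big( G(v_{n_k} - v_{n_l},\ q_{n_k} - q_{n_l}; 0) + \|q_{n_k} - q_{n_l}\|^2_{0,\Omega} \big) \\
&\le C \big( G(v_{n_k}, q_{n_k}; 0) + G(v_{n_l}, q_{n_l}; 0) + \|q_{n_k} - q_{n_l}\|^2_{0,\Omega} \big) \to 0,
\end{aligned}
\]

which implies that $\{(v_{n_k}, q_{n_k})\}$ is a Cauchy sequence in the complete space $H_N(\mathrm{div}; \Omega) \times V$. Hence, there exists $(v, q) \in H_N(\mathrm{div}; \Omega) \times V$ such that

\[
\lim_{k\to\infty} \big( \|v_{n_k} - v\|_{H(\mathrm{div})} + \|q_{n_k} - q\|_{1,\Omega} \big) = 0.
\]

Next, we show that

\[
q = 0 \quad\text{and}\quad v = 0,
\tag{2.17}
\]

which contradicts (2.16), since then

\[
0 = \|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} = \lim_{k\to\infty} \big( \|v_{n_k}\|^2_{H(\mathrm{div})} + \|q_{n_k}\|^2_{1,\Omega} \big) = 1.
\]

To this end, for any $\varphi \in V$, integration by parts and the Cauchy-Schwarz inequality give

\[
\begin{aligned}
a(q_{n_k}, \varphi) &= (A\nabla q_{n_k}, \nabla\varphi) + (Xq_{n_k}, \varphi) \\
&= (A\nabla q_{n_k} + v_{n_k},\ \nabla\varphi) + (Xq_{n_k} + \nabla\cdot v_{n_k},\ \varphi) \\
&\le C\, G^{1/2}(v_{n_k}, q_{n_k}; 0)\, \|\varphi\|_{1,\Omega}.
\end{aligned}
\]

Since $\lim_{k\to\infty} q_{n_k} = q$ in $V$, we then have

\[
a(q, \varphi) = \lim_{k\to\infty} a(q_{n_k}, \varphi) \le \lim_{k\to\infty} C\, G^{1/2}(v_{n_k}, q_{n_k}; 0)\, \|\varphi\|_{1,\Omega} = 0.
\]

Replacing $\varphi$ by $-\varphi$ shows that $a(q, \varphi) = 0$ for all $\varphi \in V$. Because (2.3) has a unique solution, we have that $q = 0$. Now, $v = 0$ follows from (2.14):

\[
\|v\|^2_{H(\mathrm{div})} = \lim_{k\to\infty} \|v_{n_k}\|^2_{H(\mathrm{div})}
\le C \lim_{k\to\infty} \big( G(v_{n_k}, q_{n_k}; 0) + \|q_{n_k}\|^2_{0,\Omega} \big) = 0.
\]

This completes the proof of (2.17) and, hence, the theorem.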
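In one simple case the coercivity constant in (2.12) can be made explicit without the compactness argument. The sketch below assumes $\Omega = (0,1)$, $A = 1$, $X = 0$, and $\Gamma_D = \partial\Omega$, so that $V = H^1_0(0,1)$ and $H_N(\mathrm{div};\Omega) = H^1(0,1)$. These choices, and the use of the sharp Poincaré constant $1/\pi$ on the unit interval, are illustrative assumptions and not part of the notes.

```latex
% Here G(v,q;0) = \|v + q'\|^2 + \|v'\|^2 and \|\cdot\| is the L^2(0,1) norm.
\begin{align*}
\|q'\|^2 &= (q' + v,\ q') - (v,\ q') = (q' + v,\ q') + (v',\ q)
  && \text{(integration by parts, } q(0) = q(1) = 0\text{)} \\
&\le \|q' + v\|\,\|q'\| + \tfrac{1}{\pi}\,\|v'\|\,\|q'\|
  && \text{(Cauchy-Schwarz and } \|q\| \le \tfrac{1}{\pi}\|q'\|\text{)}.
\end{align*}
% Divide by \|q'\| and apply Cauchy-Schwarz in R^2 to the pair (\|q'+v\|, \|v'\|):
\begin{align*}
\|q'\| &\le \|q' + v\| + \tfrac{1}{\pi}\|v'\| \le (1 + \pi^{-2})^{1/2}\, G(v,q;0)^{1/2}, \\
\|v\|  &\le \|v + q'\| + \|q'\| \le \big(1 + (1 + \pi^{-2})^{1/2}\big)\, G(v,q;0)^{1/2}, \\
\|v'\|^2 &\le G(v,q;0), \qquad
\|q\|_1^2 = \|q\|^2 + \|q'\|^2 \le (1 + \pi^{-2})\,\|q'\|^2 .
\end{align*}
% Adding up:
\[
\|v\|^2 + \|v'\|^2 + \|q\|_1^2
\le \Big[ \big(1 + (1+\pi^{-2})^{1/2}\big)^2 + 1 + (1+\pi^{-2})^2 \Big]\, G(v,q;0).
\]
```

So in this special case (2.12) holds with $1/\alpha_0$ equal to the bracketed number. The argument works directly because $X = 0$ removes the lower-order term $\|q\|_{0,\Omega}$ that forces the compactness step in the general proof.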

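2.3 Inverse Norm Least-Squares Functional

For any $(v, q) \in H_N(\mathrm{div}; \Omega) \times V \equiv \Phi$, consider the following least-squares functional:

\[
G(v, q; f) = \|A^{-1/2} v + A^{1/2}\nabla q\|^2_{0,\Omega} + \|\nabla\cdot v + Xq - f\|^2_{-1,D}.
\tag{2.18}
\]

Let

\[
K = \begin{pmatrix} I & 0 \\ 0 & (-\Delta_D)^{-1} \end{pmatrix},
\tag{2.19}
\]

where $(-\Delta_D)^{-1}$ is the solution operator of the Laplace equation with homogeneous Dirichlet boundary conditions on $\Gamma_D$, so that $\|r\|^2_{-1,D} = ((-\Delta_D)^{-1} r, r)$. Then the corresponding normal operator is

\[
\mathcal{L}^* K \mathcal{L} =
\begin{pmatrix}
A^{-1} - \nabla(-\Delta_D)^{-1}\mathrm{div} & \nabla - \nabla(-\Delta_D)^{-1} X \\
-\mathrm{div} + X^*(-\Delta_D)^{-1}\mathrm{div} & -\mathrm{div}\, A\nabla + X^*(-\Delta_D)^{-1} X
\end{pmatrix}
\tag{2.20}
\]

with $\mathcal{L}$ defined in (2.4), and the corresponding bilinear and linear forms are

\[
b(u, p; v, q) = (A^{-1/2} u + A^{1/2}\nabla p,\ A^{-1/2} v + A^{1/2}\nabla q) + \big( (-\Delta_D)^{-1}(\nabla\cdot u + Xp),\ \nabla\cdot v + Xq \big)
\tag{2.21}
\]

\[
f(v, q) = \big( (-\Delta_D)^{-1} f,\ \nabla\cdot v + Xq \big),
\tag{2.22}
\]

respectively.

Theorem 2.2. There exist positive constants $\alpha_0$ and $\alpha_1$ such that

\[
\alpha_0 \big( \|v\|^2_{0,\Omega} + \|q\|^2_{1,\Omega} \big) \le b(v, q; v, q) = G(v, q; 0)
\tag{2.23}
\]

for any $(v, q) \in H_N(\mathrm{div}; \Omega) \times V$ and

\[
b(u, p; v, q) \le \alpha_1 \big( \|u\|^2_{0,\Omega} + \|p\|^2_{1,\Omega} \big)^{1/2} \big( \|v\|^2_{0,\Omega} + \|q\|^2_{1,\Omega} \big)^{1/2}
\tag{2.24}
\]

for any $(u, p)$ and any $(v, q)$ in $H_N(\mathrm{div}; \Omega) \times V$.

Proof: The theorem may be proved in a similar fashion to Theorem 2.1.

This theorem gives the following equivalence:

\[
\mathcal{L}^* K \mathcal{L} \sim \begin{pmatrix} I & 0 \\ 0 & -\Delta \end{pmatrix}.
\]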
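Evaluating the inverse norm in (2.18) requires applying the solution operator $(-\Delta_D)^{-1}$, i.e., solving a Poisson problem. The sketch below illustrates this in one dimension only: it assumes $\Omega = (0,1)$ with Dirichlet conditions at both ends, a P1 finite element stiffness matrix, and a lumped load vector. The function name `h_minus_one_norm_sq` and these discretization choices are hypothetical and are not the discrete $H^{-1}$ norm construction of [4].

```python
import numpy as np

def h_minus_one_norm_sq(f, N):
    """Approximate ||f||_{-1,D}^2 = ((-Delta_D)^{-1} f, f) on (0,1) with P1 elements."""
    h = 1.0 / N
    x = np.linspace(0.0, 1.0, N + 1)[1:-1]          # interior nodes
    # P1 stiffness matrix of -d^2/dx^2 with homogeneous Dirichlet conditions
    K = (2.0 * np.eye(N - 1)
         - np.eye(N - 1, k=1)
         - np.eye(N - 1, k=-1)) / h
    b = h * f(x)                                     # lumped load vector, b_i ~ (f, phi_i)
    w = np.linalg.solve(K, b)                        # nodal values of the discrete (-Delta_D)^{-1} f
    return float(b @ w)                              # approximates (f, w)

# For f = sin(pi x): (-Delta_D)^{-1} f = sin(pi x) / pi^2, so the norm squared is 1 / (2 pi^2).
value = h_minus_one_norm_sq(lambda x: np.sin(np.pi * x), 64)
exact = 1.0 / (2.0 * np.pi ** 2)
print(value, exact)
```

Each evaluation of the norm costs one linear solve with the stiffness matrix, which gives a concrete sense of why inverse norm functionals are more expensive to evaluate than $L^2$ norm functionals.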

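2.4 Div-Curl Least-Squares Functional

We use the following space to define the div-curl least-squares functional for the extended system (2.6). Let

\[
H(\mathrm{curl}\, A; \Omega) = \{ v \in L^2(\Omega)^d : \nabla\times(A^{-1} v) \in L^2(\Omega)^{2d-3} \},
\tag{2.25}
\]

which is a Hilbert space under the norm

\[
\|v\|_{H(\mathrm{curl}\, A)} \equiv \big( \|v\|^2_{0,\Omega} + \|\nabla\times(A^{-1} v)\|^2_{0,\Omega} \big)^{1/2}.
\]

When $A$ is the identity matrix in (2.25), we use the simpler notation $H(\mathrm{curl}; \Omega)$. Define the subspaces

\[
H_D(\mathrm{curl}\, A; \Omega) = \{ v \in H(\mathrm{curl}\, A; \Omega) : n\times(A^{-1} v) = 0 \text{ on } \Gamma_D \}
\]

and

\[
W = H_N(\mathrm{div}; \Omega) \cap H_D(\mathrm{curl}\, A; \Omega).
\]

For $(v, q) \in W \times V \equiv \Phi$, the div-curl least-squares functional is given by

\[
G(v, q; f) = \|A^{-1/2} v + A^{1/2}\nabla q\|^2_{0,\Omega} + \|\nabla\cdot v + Xq - f\|^2_{0,\Omega} + \|\nabla\times(A^{-1} v)\|^2_{0,\Omega}.
\tag{2.26}
\]

The corresponding normal operator is

\[
\mathcal{L}^*\mathcal{L} =
\begin{pmatrix}
A^{-1} - \nabla\,\mathrm{div} + A^{-1}\nabla\times\nabla\times A^{-1} & \nabla - \nabla X \\
-\mathrm{div} + X^*\mathrm{div} & -\mathrm{div}\, A\nabla + X^* X
\end{pmatrix}
\tag{2.27}
\]

with $\mathcal{L}$ defined in (2.6), and the corresponding bilinear and linear forms are

\[
b(u, p; v, q) = (A^{-1/2} u + A^{1/2}\nabla p,\ A^{-1/2} v + A^{1/2}\nabla q) + (\nabla\cdot u + Xp,\ \nabla\cdot v + Xq) + (\nabla\times(A^{-1} u),\ \nabla\times(A^{-1} v))
\tag{2.28}
\]

\[
f(v, q) = (f,\ \nabla\cdot v + Xq),
\tag{2.29}
\]

respectively. It follows from Theorem 2.1 that we have the following equivalence:

\[
\mathcal{L}^*\mathcal{L} \sim
\begin{pmatrix} I - \nabla\,\mathrm{div} + A^{-1}\nabla\times\nabla\times A^{-1} & 0 \\ 0 & -\Delta \end{pmatrix}
\sim
\begin{pmatrix} -\Delta & 0 \\ 0 & -\Delta \end{pmatrix}.
\tag{2.30}
\]

The second equivalence requires sufficient smoothness of the coefficients and the boundary (see [7] for the proof).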

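Theorem 2.3. There exist positive constants $\alpha_0$ and $\alpha_1$ such that

\[
\alpha_0 \big( \|v\|^2_{H(\mathrm{div})} + \|\nabla\times(A^{-1} v)\|^2_{0,\Omega} + \|q\|^2_{1,\Omega} \big) \le b(v, q; v, q)
\tag{2.31}
\]

for any $(v, q) \in W \times V$ and

\[
b(u, p; v, q) \le \alpha_1 \big( \|u\|^2_{H(\mathrm{div})} + \|\nabla\times(A^{-1} u)\|^2_{0,\Omega} + \|p\|^2_{1,\Omega} \big)^{1/2} \big( \|v\|^2_{H(\mathrm{div})} + \|\nabla\times(A^{-1} v)\|^2_{0,\Omega} + \|q\|^2_{1,\Omega} \big)^{1/2}
\tag{2.32}
\]

for any $(u, p),\ (v, q) \in W \times V$.

2.5 Least-Squares Problems

For the solution space $\Phi$, we have the following equivalent least-squares problems:

minimization problem: find $(u, p) \in \Phi$ such that

\[
G(u, p; f) = \min_{(v, q) \in \Phi} G(v, q; f);
\tag{2.33}
\]

variational problem: find $(u, p) \in \Phi$ such that

\[
b(u, p; v, q) = f(v, q) \qquad \forall\, (v, q) \in \Phi.
\tag{2.34}
\]

2.6 Least-Squares Approximation

In this subsection, we consider least-squares finite element approximation based on the div least-squares functional only. Approximation based on the div-curl least-squares functional may be studied in a similar fashion. There are two numerical approximations based on the inverse norm least-squares functional: the mesh-dependent norm approach in [1] and the discrete $H^{-1}$ norm approach in [4].

Assume that $\Omega$ is a polygonal domain, and let $\mathcal{T}_h$ be a quasi-regular triangulation of $\Omega$ with triangular/tetrahedral or rectangular elements of size $O(h)$. Denote spaces of polynomials on an element $K \subset \mathbb{R}^d$ as follows: $P_k(K)$ is the space of polynomials of degree at most $k$;

\[
P_{k_1,k_2}(K) = \Big\{ p(x_1, x_2) : p(x_1, x_2) = \sum_{i \le k_1,\ j \le k_2} a_{ij}\, x_1^i x_2^j \Big\}, \qquad d = 2;
\]

\[
P_{k_1,k_2,k_3}(K) = \Big\{ p(x_1, x_2, x_3) : p(x_1, x_2, x_3) = \sum_{i \le k_1,\ j \le k_2,\ k \le k_3} a_{ijk}\, x_1^i x_2^j x_3^k \Big\}, \qquad d = 3.
\]

Denote the local Raviart-Thomas (RT) space of index $k \ge 0$ on an element $K$ by

\[
RT_k(K) =
\begin{cases}
P_k(K)^d + (x_1, \ldots, x_d)\, P_k(K), & K = \text{triangle/tetrahedron}, \\
P_{k+1,k}(K) \times P_{k,k+1}(K), & K = \text{rectangle},\ d = 2, \\
P_{k+1,k,k}(K) \times P_{k,k+1,k}(K) \times P_{k,k,k+1}(K), & K = \text{rectangle},\ d = 3.
\end{cases}
\]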

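Degrees of freedom for $RT_0(K) = (a + b x_1,\ c + b x_2)$ on a triangle or $RT_0(K) = (a + b x_1,\ c + d x_2)$ on a rectangle are the normal components of the vector field on all edges (faces) of two- (three-) dimensional elements. See [5] for the choice of degrees of freedom for the RT space of index $k$. They are chosen to ensure continuity of the normal component of the vector field at interfaces of elements. Then one can define the $H(\mathrm{div}; \Omega)$ conforming Raviart-Thomas space of order $k \ge 0$ [17] by

\[
RT_k = \{ v \in H(\mathrm{div}; \Omega) : v|_K \in RT_k(K) \ \ \forall\, K \in \mathcal{T}_h \},
\]

which has the approximation properties

\[
\inf_{\varphi \in RT_k} \|v - \varphi\|_{0,\Omega} \le C\, h^r\, \|v\|_{r,\Omega} \qquad \text{for } 1 \le r \le k+1,
\tag{2.35}
\]

\[
\inf_{\varphi \in RT_k} \|\nabla\cdot(v - \varphi)\|_{0,\Omega} \le C\, h^r\, \|\nabla\cdot v\|_{r,\Omega} \qquad \text{for } 0 \le r \le k+1.
\tag{2.36}
\]

Denote the space of continuous piecewise polynomials of degree $k$ by

\[
S_k = \{ q \in H^1(\Omega) : q|_K \in P_k(K) \ \ \forall\, K \in \mathcal{T}_h \},
\]

which has the following approximation property:

\[
\inf_{\varphi \in S_k} \big( \|q - \varphi\|_{0,\Omega} + h\, \|q - \varphi\|_{1,\Omega} \big) \le C\, h^{r+1}\, \|q\|_{r+1,\Omega} \qquad \text{for } 0 \le r \le k.
\tag{2.37}
\]

Then the least-squares approximation is to find $(u_h, p_h) \in RT_k \times S_k$ such that

\[
b(u_h, p_h; v, q) = f(v, q) \qquad \forall\, (v, q) \in RT_k \times S_k.
\tag{2.38}
\]

Theorem 2.4. Let $(u, p)$ and $(u_h, p_h)$ be the solutions of (2.34) and (2.38), respectively. Then we have the following error estimate:

\[
\|u - u_h\|_{H(\mathrm{div})} + \|p - p_h\|_{1,\Omega}
\le C\, \frac{\alpha_1}{\alpha_0}\, h^r \big( \|p\|_{r+1,\Omega} + \|u\|_{r,\Omega} + \|\nabla\cdot u\|_{r,\Omega} \big)
\le C\, \frac{\alpha_1}{\alpha_0}\, h^r \big( \|p\|_{r+1,\Omega} + \|f\|_{r,\Omega} \big).
\tag{2.39}
\]

Proof: (2.39) follows from the equivalence in Theorem 2.1 (used as in the proof of Theorem 1.1), the approximation properties in (2.35), (2.36), and (2.37), and the facts that

\[
\|u\|_{r,\Omega} \le C \|p\|_{r+1,\Omega}
\quad\text{and}\quad
\|\nabla\cdot u\|_{r,\Omega} = \|f - Xp\|_{r,\Omega} \le \|f\|_{r,\Omega} + C \|p\|_{r+1,\Omega}.
\]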
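The discrete problem (2.38) can be tried out on a one-dimensional model problem. The sketch below is a minimal illustration under several assumptions that are not in the notes: $\Omega = (0,1)$, $A = 1$, $X = 0$, $\Gamma_D = \partial\Omega$, a uniform mesh, and continuous piecewise linear functions for both $u$ and $p$ (in one dimension $H(\mathrm{div})$ reduces to $H^1$, so continuous piecewise linears play the role of $RT_0$). The functional is discretized with two-point Gauss quadrature by stacking the weighted residuals of $u + p' = 0$ and $u' = f$ into a matrix `R`, so that the normal equation of the discrete functional is `R.T @ R x = R.T @ g`. The names `lsfem_1d` and `nodal_errors` are hypothetical.

```python
import numpy as np

def lsfem_1d(N, f):
    """Minimize ||u + p'||^2 + ||u' - f||^2 over P1 x P1 on a uniform mesh of (0,1)."""
    h = 1.0 / N
    nodes = np.linspace(0.0, 1.0, N + 1)
    nu = N + 1                                    # u: one unknown per node
    ndof = nu + (N - 1)                           # p: interior nodes only (p = 0 on the boundary)
    gauss = 0.5 * (1.0 + np.array([-1.0, 1.0]) / np.sqrt(3.0))  # 2-point Gauss on [0, 1]
    sw = np.sqrt(0.5 * h)                         # square root of each quadrature weight
    rows, rhs = [], []
    for e in range(N):                            # loop over elements [x_e, x_{e+1}]
        for t in gauss:
            phi = np.array([1.0 - t, t])          # P1 shape functions at the quadrature point
            dphi = np.array([-1.0, 1.0]) / h      # their derivatives
            r1 = np.zeros(ndof)                   # weighted residual of u + p' = 0
            r2 = np.zeros(ndof)                   # weighted residual of u' = f
            for a, node in enumerate((e, e + 1)):
                r1[node] += sw * phi[a]
                r2[node] += sw * dphi[a]
                if 0 < node < N:                  # interior p unknown
                    r1[nu + node - 1] += sw * dphi[a]
            rows += [r1, r2]
            rhs += [0.0, sw * f(nodes[e] + t * h)]
    R, g = np.array(rows), np.array(rhs)
    sol = np.linalg.solve(R.T @ R, R.T @ g)       # normal equation of the discrete functional
    u_h = sol[:nu]
    p_h = np.concatenate(([0.0], sol[nu:], [0.0]))
    return nodes, u_h, p_h

def nodal_errors(N):
    """Maximum nodal errors for the exact solution p = sin(pi x), u = -p'."""
    f = lambda x: np.pi ** 2 * np.sin(np.pi * x)
    x, u_h, p_h = lsfem_1d(N, f)
    err_u = np.max(np.abs(u_h + np.pi * np.cos(np.pi * x)))
    err_p = np.max(np.abs(p_h - np.sin(np.pi * x)))
    return err_u, err_p

errors = {N: nodal_errors(N) for N in (16, 32, 64)}
for N, (eu, ep) in errors.items():
    print(N, eu, ep)
```

The matrix `R.T @ R` is the Galerkin matrix of the bilinear form (2.10) in this setting, since the quadrature is exact for the piecewise quadratic integrands; it is symmetric positive definite by the coercivity (2.12). Printing the errors for successive refinements gives a rough check of the convergence predicted by (2.39).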

2.7 Comparison of Least-Squares Methods

In this section, we make a simple comparison of the least-squares methods. Below, "+" marks an advantage and "-" marks a drawback.

The div least-squares method has the following numerical properties:
+ optimal finite element approximation;
+ optimal fast multigrid solver if Raviart-Thomas elements are used for the flux.

The mesh-dependent least-squares method has the following properties:
+ optimal finite element approximation;
- no known fast iterative solver.

The discrete $H^{-1}$ norm least-squares method has the following properties:
+ optimal finite element approximation;
+ uniformly well preconditioned by multigrid or domain decomposition;
- expensive evaluations of the discrete $H^{-1}$ norm.

The div-curl least-squares method has the following properties:
+ finite element approximations are $H^1$-optimally accurate in each variable, including new variables;
+ standard multigrid methods applied to the resulting discrete equation have optimal complexity;
- additional smoothness of the original problem is required for the second equivalence in (2.30).

2.8 Boundary Least-Squares Functional

Denote by $H^{-1}(\Omega)$ the dual space of $H^1(\Omega)$ with the dual norm

\[
\|v\|_{-1,\Omega} = \sup_{q \in H^1(\Omega)} \frac{\langle v, q \rangle}{\|q\|_{1,\Omega}},
\]

where the bracket $\langle v, q \rangle$ denotes the duality pairing between $H^{-1}(\Omega)$ and $H^1(\Omega)$. When $\Gamma_N \ne \partial\Omega = \Gamma_N \cup \Gamma_D$, denote by $H^{-1/2}(\Gamma_N)$ the dual space of

\[
H^{1/2}_{00}(\Gamma_N) = \{ v|_{\Gamma_N} : v \in H^1_{0,D}(\Omega) \}.
\]

In this section, we need the generalized Poincaré-Friedrichs inequality

\[
\begin{cases}
\|q\|_{1,\Omega} \le C \big( \|\nabla q\|_{0,\Omega} + \|q\|_{0,\Gamma_D} \big) & \forall\, q \in H^1(\Omega), \ \text{if } \mathrm{mes}(\Gamma_D) \ne 0, \\
\|q\|_{1,\Omega} \le C\, \|\nabla q\|_{0,\Omega} & \forall\, q \in \hat{H}^1(\Omega), \ \text{otherwise};
\end{cases}
\tag{2.40}
\]

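and the trace inequalities: for any subset $\Gamma \subset \partial\Omega$ with positive measure,

\[
\|q\|_{1/2,\Gamma} \le \|q\|_{1,\Omega} \quad \forall\, q \in H^1(\Omega),
\qquad
\|n\cdot v\|_{-1/2,\Gamma} \le \|v\|_{H(\mathrm{div})} \quad \forall\, v \in H(\mathrm{div}; \Omega).
\tag{2.41}
\]

The first inequality in (2.41) follows from the definition. The second inequality in (2.41) follows from the definition, Green's formula, and the Cauchy-Schwarz inequality: for any $q \in H^1(\Omega)$ with $q = 0$ on $\partial\Omega \setminus \Gamma$,

\[
\frac{\int_\Gamma q\, n\cdot v \, ds}{\|q\|_{1,\Omega}}
= \frac{\int_{\partial\Omega} q\, n\cdot v \, ds}{\|q\|_{1,\Omega}}
= \frac{\int_\Omega q\, \nabla\cdot v \, dx + \int_\Omega v\cdot\nabla q \, dx}{\|q\|_{1,\Omega}}
\le \|v\|_{H(\mathrm{div})}.
\]

Consider the non-homogeneous boundary conditions

\[
p = g \ \text{on } \Gamma_D \quad\text{and}\quad -n\cdot A\nabla p = h \ \text{on } \Gamma_N
\]

(that is, $n\cdot u = h$ on $\Gamma_N$ for the flux $u = -A\nabla p$) and the following least-squares functional:

\[
G(v, q; \hat{f}) = \|A^{-1/2} v + A^{1/2}\nabla q\|^2_{0,\Omega} + \|\nabla\cdot v + Xq - f\|^2_{0,\Omega} + \|q - g\|^2_{1/2,\Gamma_D} + \|n\cdot v - h\|^2_{-1/2,\Gamma_N}
\tag{2.42}
\]

for $(v, q) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)$, where $\hat{f} = (f, g, h)$. Then the least-squares problem for (2.4) is to minimize this quadratic functional over $H(\mathrm{div}; \Omega) \times H^1(\Omega)$: find $(u, p) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)$ such that

\[
G(u, p; \hat{f}) = \inf_{(v, q) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)} G(v, q; \hat{f}).
\tag{2.43}
\]

It is easy to see that the variational form for (2.43) is to find $(u, p) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)$ such that

\[
b(u, p; v, q) = f(v, q) \qquad \forall\, (v, q) \in H(\mathrm{div}; \Omega) \times H^1(\Omega),
\tag{2.44}
\]

where the bilinear form $b(\cdot\,;\cdot) : \big( H(\mathrm{div}; \Omega) \times H^1(\Omega) \big)^2 \to \mathbb{R}$ is defined by

\[
b(u, p; v, q) = (A^{-1/2} u + A^{1/2}\nabla p,\ A^{-1/2} v + A^{1/2}\nabla q)_{0,\Omega} + (\nabla\cdot u + Xp,\ \nabla\cdot v + Xq)_{0,\Omega} + \langle p, q \rangle_{1/2,\Gamma_D} + \langle n\cdot u,\ n\cdot v \rangle_{-1/2,\Gamma_N}
\]

and the linear form $f(\cdot) : H(\mathrm{div}; \Omega) \times H^1(\Omega) \to \mathbb{R}$ is defined by

\[
f(v, q) = (f,\ \nabla\cdot v + Xq)_{0,\Omega} + \langle g, q \rangle_{1/2,\Gamma_D} + \langle h,\ n\cdot v \rangle_{-1/2,\Gamma_N}.
\]

Here $\langle\cdot,\cdot\rangle_{1/2,\Gamma_D}$ and $\langle\cdot,\cdot\rangle_{-1/2,\Gamma_N}$ denote the inner products inducing the norms $\|\cdot\|_{1/2,\Gamma_D}$ and $\|\cdot\|_{-1/2,\Gamma_N}$.

Theorem 2.5. There exist positive constants $\alpha_0$ and $\alpha_1$ such that

\[
\alpha_0 \big( \|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \big) \le b(v, q; v, q)
\tag{2.45}
\]

for any $(v, q) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)$ and

\[
b(u, p; v, q) \le \alpha_1 \big( \|u\|^2_{H(\mathrm{div})} + \|p\|^2_{1,\Omega} \big)^{1/2} \big( \|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \big)^{1/2}
\tag{2.46}
\]

for any $(u, p),\ (v, q) \in H(\mathrm{div}; \Omega) \times H^1(\Omega)$.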

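Proof: The continuity of the bilinear form $b(\cdot\,;\cdot)$ in (2.46) is an immediate consequence of the Cauchy-Schwarz and trace inequalities. To show the coercivity of the bilinear form in (2.45), it suffices to prove that

\[
\|v\|^2_{H(\mathrm{div})} + \|q\|^2_{1,\Omega} \le C \big( b(v, q; v, q) + \|q\|^2_{0,\Omega} \big),
\tag{2.47}
\]

because (2.45) then follows from a standard compactness argument (see the proof of Theorem 2.1). To this end, first note that, using the triangle and trace inequalities,

\[
\begin{aligned}
\Big| \int_{\partial\Omega} q\, n\cdot v \, ds \Big|
&\le \Big| \int_{\Gamma_D} q\, n\cdot v \, ds \Big| + \Big| \int_{\Gamma_N} q\, n\cdot v \, ds \Big| \\
&\le \|q\|_{1/2,\Gamma_D}\, \|n\cdot v\|_{-1/2,\Gamma_D} + \|q\|_{1/2,\Gamma_N}\, \|n\cdot v\|_{-1/2,\Gamma_N} \\
&\le \|q\|_{1/2,\Gamma_D}\, \|v\|_{H(\mathrm{div})} + \|q\|_{1,\Omega}\, \|n\cdot v\|_{-1/2,\Gamma_N} \\
&\le \|q\|_{1/2,\Gamma_D} \big( \|v\|_{0,\Omega} + \|\nabla\cdot v + Xq\|_{0,\Omega} + \|Xq\|_{0,\Omega} \big) + \|q\|_{1,\Omega}\, \|n\cdot v\|_{-1/2,\Gamma_N} \\
&\le C \|q\|_{1/2,\Gamma_D}\, \|A^{-1/2} v\|_{0,\Omega} + C \|A^{1/2}\nabla q\|_{0,\Omega} \big( \|q\|_{1/2,\Gamma_D} + \|n\cdot v\|_{-1/2,\Gamma_N} \big) + C\, b(v, q; v, q).
\end{aligned}
\]

The triangle inequality gives that

\[
\|A^{-1/2} v\|_{0,\Omega} \le \|A^{-1/2} v + A^{1/2}\nabla q\|_{0,\Omega} + \|A^{1/2}\nabla q\|_{0,\Omega},
\tag{2.48}
\]

which, together with the above inequality, implies that

\[
\Big| \int_{\partial\Omega} q\, n\cdot v \, ds \Big|
\le C \big( \|q\|_{1/2,\Gamma_D} + \|n\cdot v\|_{-1/2,\Gamma_N} \big) \|A^{1/2}\nabla q\|_{0,\Omega} + C\, b(v, q; v, q).
\tag{2.49}
\]

It follows from integration by parts, the Cauchy-Schwarz and Poincaré-Friedrichs inequalities, and (2.49) that

\[
\begin{aligned}
\|A^{1/2}\nabla q\|^2_{0,\Omega}
&= (A^{1/2}\nabla q + A^{-1/2} v,\ A^{1/2}\nabla q)_{0,\Omega} + (q, \nabla\cdot v)_{0,\Omega} - \int_{\partial\Omega} q\, n\cdot v \, ds \\
&\le \|A^{1/2}\nabla q + A^{-1/2} v\|_{0,\Omega}\, \|A^{1/2}\nabla q\|_{0,\Omega} + \|q\|_{0,\Omega}\, \|\nabla\cdot v\|_{0,\Omega} + \Big| \int_{\partial\Omega} q\, n\cdot v \, ds \Big| \\
&\le C \big( \|A^{1/2}\nabla q + A^{-1/2} v\|_{0,\Omega} + \|\nabla\cdot v + Xq\|_{0,\Omega} + \|n\cdot v\|_{-1/2,\Gamma_N} + \|q\|_{1/2,\Gamma_D} \big) \|A^{1/2}\nabla q\|_{0,\Omega} \\
&\qquad + \|q\|_{0,\Omega}\, \|Xq\|_{0,\Omega} + C\, b(v, q; v, q).
\end{aligned}
\]

Hence, since $\|Xq\|_{0,\Omega} \le C \|q\|_{1,\Omega}$ and $2ab \le a^2 + b^2$,

\[
\|q\|^2_{1,\Omega} + \|Xq\|^2_{0,\Omega} \le C \big( \|A^{1/2}\nabla q\|^2_{0,\Omega} + \|q\|^2_{0,\Omega} \big) \le C \big( b(v, q; v, q) + \|q\|^2_{0,\Omega} \big).
\]

Combining this with (2.48) and the triangle inequality yields

\[
\|A^{-1/2} v\|^2_{0,\Omega} + \|\nabla\cdot v\|^2_{0,\Omega} \le C \big( b(v, q; v, q) + \|q\|^2_{0,\Omega} \big).
\]

This completes the proof of (2.47) and, hence, the theorem.

For a numerical approach based on the boundary least-squares functional, see [18].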

References

[1] A. K. Aziz, R. B. Kellogg, and A. B. Stephens, Least squares methods for elliptic systems, Math. Comp., 44:169 (1985), 53-70.
[2] S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer-Verlag, New York, 1994.
[3] A. Bossavit, Computational Electromagnetism: Variational Formulations, Complementarity, Edge Elements, Academic Press, San Diego, 1998.
[4] J. Bramble, R. Lazarov, and J. Pasciak, A least-squares approach based on a discrete minus one inner product for first order systems, Math. Comp., 66 (1997), 935-955.
[5] F. Brezzi and M. Fortin, Mixed and Hybrid Finite Element Methods, Springer-Verlag, New York, 1991.
[6] Z. Cai, R. D. Lazarov, T. Manteuffel, and S. McCormick, First-order system least squares for second-order partial differential equations: Part I, SIAM J. Numer. Anal., 31:6 (1994), 1785-1799.
[7] Z. Cai, T. Manteuffel, and S. McCormick, First-order system least squares for second-order partial differential equations: Part II, SIAM J. Numer. Anal., 34 (1997), 425-454.
[8] G. F. Carey and Y. Shen, Convergence studies of least-squares finite elements for first order systems, Comm. Appl. Numer. Meth., 5 (1989), 427-434.
[9] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, New York, 1978.
[10] C. L. Chang, Finite element approximation for grad-div type systems in the plane, SIAM J. Numer. Anal., 29 (1992), 452-461.
[11] E. D. Eason, A review of least squares methods for solving partial differential equations, Int. J. Numer. Meth. Engrg., 10 (1976), 1021-1046.
[12] V. Girault and P. A. Raviart, Finite Element Methods for Navier-Stokes Equations: Theory and Algorithms, Springer-Verlag, New York, 1986.
[13] P. Grisvard, Elliptic Problems in Nonsmooth Domains, Pitman, Boston, 1985.
[14] A. I. Pehlivanov and G. F. Carey, Error estimates for least-squares mixed finite elements, Math. Mod. Numer. Anal., 28 (1994), 499-516.

[15] A. I. Pehlivanov, G. F. Carey, and R. D. Lazarov, Least squares mixed finite elements for second order elliptic problems, SIAM J. Numer. Anal., 31 (1994), 1368-1377.
[16] A. I. Pehlivanov, G. F. Carey, R. D. Lazarov, and Y. Shen, Convergence of least squares finite elements for first order ODE systems, Computing, 1993.
[17] P. A. Raviart and J. M. Thomas, A mixed finite element method for second order elliptic problems, Lect. Notes Math. 606, Springer-Verlag, Berlin and New York, 1977, 292-315.
[18] G. Starke, Multilevel boundary functionals for least-squares mixed finite element methods, SIAM J. Numer. Anal., 36 (1999), 1065-1077.

Homework

Consider problem (2.1) with $Xp = b\cdot\nabla p + c\,p$. Study the dependence of the constants $\alpha_0$ and $\alpha_1$ in Theorem 2.1 on the diffusion coefficient $A$, the convection coefficient $b$, and the reaction coefficient $c$ for the following cases:

1. $A = a(x)\, I$, $b = 0$, and $c = 0$.
2. $A = I$, $b$ is a constant, and $c = 0$.
3. $A = I$, $b = 0$, and $c = \omega$, where $\omega > 0$ is a constant.