A hybrid domain decomposition method based on one-level FETI and BDDC algorithms


Jungho Lee
Computer Science and Mathematics Division, Oak Ridge National Laboratory
July 26, 2

Abstract

A three-level domain decomposition is considered: bodies in contact with each other are divided into subdomains, which in turn are unions of elements. An approach based purely on FETI (finite element tearing and interconnecting) algorithms with only Lagrange multipliers as unknowns, which has been developed by the engineering community, does not lead to an algorithm that is scalable with respect to the number of subdomains in each body. Instead, we consider a new method based on the saddle point formulation of the FETI methods with both displacement vectors and Lagrange multipliers as unknowns. The resulting system is solved with a block-diagonal preconditioner which combines the one-level FETI and the BDDC (balancing domain decomposition by constraints) methods. We show that this new method is scalable with respect to the number of subdomains. A model contact problem is solved by a nonlinear algorithm which combines the new method and a primal-dual active set method.

Keywords: domain decomposition, scalable algorithms, FETI, BDDC, contact problems, primal-dual active set method

AMS Subject Classification: 65F, 65K, 65N55

1 Introduction

We consider contact problems without friction. In [8, 2], we considered the FETI-FETI method, which is a domain decomposition method for a linearized contact problem. We showed that the FETI-FETI method, which has been used in the engineering community [2], has a condition number estimate which grows linearly with the number of subdomains, or processors. This paper is a sequel to [8]; we introduce a scalable alternative to the FETI-FETI method, which we call a hybrid method. This method combines the one-level FETI and the BDDC methods. In this paper, we assume the use of an active set method to solve our contact problem.
In each step of an active set method, the active set is updated, and a minimization problem on the current active set is approximately solved, until a desired accuracy is achieved. Thus, an active set method requires (nonlinear) outer iterations in which the active set is updated and (linear) inner iterations in which a minimization problem is solved. We use the primal-dual active set strategy viewed as a semismooth Newton method, proposed and analyzed in [3, 4, 9], to determine the outer iterations.

This paper is organized as follows. In Section 2, we review the one-level FETI, FETI-DP (dual-primal FETI), and BDDC methods. We introduce a nonlinear model problem and briefly describe the FETI-FETI method in Section 3. We introduce the hybrid method and provide an eigenvalue analysis of its preconditioned operator in Sections 4 and 5. In Section 6 we present numerical results which confirm the scalability of the hybrid method. In Section 7, we solve a model problem using a combination of a primal-dual active set method and our hybrid method.

This author's work was supported in part by the U.S. Department of Energy under contracts DE-FG2-6ER2578 and DE-FC2-ER25482 and in part by National Science Foundation grant DMS-5325. This submission was sponsored by a contractor of the United States Government under contract DE-AC5-OR22725 with the United States Department of Energy. The United States Government retains, and the publisher, by accepting this submission for publication, acknowledges that the United States Government retains, a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this submission, or allow others to do so, for United States Government purposes. An earlier version of the material contained in this paper already appeared in the Ph.D. thesis of the author, see [2].

2 Building blocks of the Hybrid method

In this section, we briefly review some popular domain decomposition methods with nonoverlapping subdomains.

2.1 A model problem and notation

We consider a second-order scalar elliptic problem on a bounded domain $\Omega \subset \mathbb{R}^n$, $n = 2, 3$. We denote the boundary of $\Omega$ by $\partial\Omega$, and assume that homogeneous Dirichlet boundary conditions are imposed on $\partial\Omega_D \subset \partial\Omega$, which is a subset of $\partial\Omega$ with positive measure. Let $\partial\Omega_N := \partial\Omega \setminus \partial\Omega_D$ be its complement. The corresponding Sobolev space in which the solution will be found is $H_0^1(\Omega, \partial\Omega_D) := \{v \in H^1(\Omega) : v = 0 \text{ on } \partial\Omega_D\}$. We find $u \in H_0^1(\Omega, \partial\Omega_D)$ such that

\[ a(u,v) = f(v), \quad \forall v \in H_0^1(\Omega, \partial\Omega_D), \tag{2.1} \]

where

\[ a(u,v) := \int_\Omega \rho(x)\, \nabla u \cdot \nabla v \, dx, \qquad f(v) = \int_\Omega f v \, dx. \tag{2.2} \]

Note that (2.1) is equivalent to the following minimization problem:

\[ \min_{u \in H_0^1(\Omega, \partial\Omega_D)} \tfrac{1}{2}\, a(u,u) - f(u). \tag{2.3} \]

We decompose $\Omega$ into $N$ nonoverlapping subdomains $\Omega_i$, $i = 1, \ldots, N$, each of which is the union of shape-regular elements, with the finite element nodes on the boundaries of neighboring subdomains matching across the interface $\Gamma := \left( \bigcup_{i=1}^N \partial\Omega_i \right) \setminus \partial\Omega_D$. In three dimensions, $\Gamma$ is the union of faces, edges and vertices: faces, regarded as open subsets of $\Gamma$, are shared by two subdomains; edges, regarded as open subsets of the boundaries of the faces, are shared by more than two subdomains; vertices are endpoints of edges. In two dimensions, $\Gamma$ is the union of edges and vertices: edges, regarded as open subsets of $\Gamma$, are shared by two subdomains; vertices, as in three dimensions, are endpoints of edges. We assume that $\rho(x) = \rho_i \ge \rho_{\min} > 0$, $x \in \Omega_i$, $i = 1, \ldots, N$.
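As a concrete illustration of the equivalence between (2.1) and (2.3), the following minimal sketch assembles a one-dimensional analogue with piecewise linear elements and a piecewise constant coefficient, solves the linear system, and evaluates the discrete energy. The interval, the mesh size, the coefficient values, and the helper names (`assemble`, `solve_tridiagonal`, `energy`) are assumptions made for this example only; they do not come from the paper.

```python
# 1D analogue of (2.1)-(2.3) on (0, 1) with u(0) = 0 and a free right end.
# Mesh, coefficients and load are illustrative assumptions.

def assemble(rho_per_element, h):
    """Stiffness matrix of a(u, v) = int rho u' v' dx for P1 elements,
    with the Dirichlet node at x = 0 already removed."""
    n = len(rho_per_element)               # n elements -> n free nodes
    A = [[0.0] * n for _ in range(n)]
    for e, rho in enumerate(rho_per_element):
        k = rho / h                        # element matrix is k * [[1, -1], [-1, 1]]
        left, right = e - 1, e             # free-node indices; left == -1 is the Dirichlet node
        A[right][right] += k
        if left >= 0:
            A[left][left] += k
            A[left][right] -= k
            A[right][left] -= k
    return A

def solve_tridiagonal(A, b):
    """Thomas algorithm for a symmetric positive definite tridiagonal system."""
    n = len(b)
    diag = [A[i][i] for i in range(n)]
    off = [A[i][i + 1] for i in range(n - 1)]
    rhs = list(b)
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - off[i] * x[i + 1]) / diag[i]
    return x

def energy(A, f, u):
    """Discrete version of (1/2) a(u, u) - f(u)."""
    n = len(u)
    Au = [sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
    return 0.5 * sum(ui * yi for ui, yi in zip(u, Au)) - sum(fi * ui for fi, ui in zip(f, u))

h = 0.25
rho = [1.0, 1.0, 10.0, 10.0]               # rho_1 = 1 on (0, 1/2), rho_2 = 10 on (1/2, 1)
A = assemble(rho, h)
f = [h, h, h, h / 2]                       # load vector for the source f = 1
u = solve_tridiagonal(A, f)                # solves the discrete form of (2.1)
```

Any perturbation of `u` increases `energy(A, f, u)`, which is the discrete counterpart of the statement that the solution of (2.1) minimizes (2.3).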
We also introduce the corresponding set of interface nodes $\Gamma_h := \left( \bigcup_{i=1}^N \partial\Omega_{i,h} \right) \setminus \partial\Omega_h$, where $\partial\Omega_h$ and $\partial\Omega_{i,h}$ are the sets of finite element nodes on $\partial\Omega$ and $\partial\Omega_i$, respectively. We also define local bilinear forms and linear functionals,

\[ a^{(i)}(u,v) := \int_{\Omega_i} \rho(x)\, \nabla u \cdot \nabla v \, dx, \qquad f^{(i)}(v) = \int_{\Omega_i} f v \, dx. \tag{2.4} \]

In the rest of this section, we discuss the choice of the space of finite element functions in the one-level FETI, FETI-DP, and BDDC methods. We denote a standard finite element space of continuous, piecewise linear functions on $\Omega_i$ by $W^{(i)}$. We will always assume that these functions vanish on $\partial\Omega_D$. Each $W^{(i)}$ is decomposed into a subdomain interior part $W_I^{(i)}$ and a subdomain interface part $W_\Gamma^{(i)}$:

\[ W^{(i)} = W_I^{(i)} \oplus W_\Gamma^{(i)}. \]

We denote the associated product spaces by $W := \prod_{i=1}^N W^{(i)}$, $W_I := \prod_{i=1}^N W_I^{(i)}$, and $W_\Gamma := \prod_{i=1}^N W_\Gamma^{(i)}$. The functions in $W$ and $W_\Gamma$ are in general discontinuous across the interface, whereas the finite element solutions are continuous across the interface. Therefore we introduce the continuous subspaces of $W$ and $W_\Gamma$, denoted by $\widehat{W}$ and $\widehat{W}_\Gamma$, respectively. For the FETI-DP and BDDC methods, we will also need a subspace $\widetilde{W} \subset W$, intermediate between $W$ and $\widehat{W}$, which consists of finite element functions which satisfy certain continuity constraints. The corresponding interface space is denoted by $\widetilde{W}_\Gamma$. We introduce the following decomposition of $\widetilde{W}_\Gamma$:

\[ \widetilde{W}_\Gamma = W_\Delta \oplus \widehat{W}_\Pi = \left( \prod_{i=1}^N W_\Delta^{(i)} \right) \oplus \widehat{W}_\Pi, \]

where $\widehat{W}_\Pi$ is a subspace of continuous functions and $W_\Delta$ is a subspace of functions which are allowed to be discontinuous across the interface. More precisely, in the two-dimensional case $\widehat{W}_\Pi$ is spanned by the subdomain vertex nodal basis functions, i.e., it consists of functions which are nonzero only at subdomain vertices. Accordingly, $W_\Delta^{(i)} \subset W_\Gamma^{(i)}$ consists of functions which are zero at the vertices of the subdomain $\Omega_i$. In other words, $\widetilde{W}_\Gamma$ consists of functions that are continuous at subdomain vertices. In the three-dimensional case, such vertex constraints are not enough to ensure scalability and we need to enforce continuity of the edge averages as well; see, for instance, [6]. See Figure 1 for a depiction of $W$, $\widetilde{W}$, and $\widehat{W}$ in the two-dimensional case.

For each subdomain $\Omega_i$, $i = 1, \ldots, N$, we assemble local stiffness matrices $A^{(i)}$ and local load vectors $f^{(i)}$ obtained by integrating appropriate expressions over individual subdomains. We also introduce scaling factors $\delta_i^\dagger(x)$ for each node $x \in \Gamma_h \cap \partial\Omega_{i,h}$, $i = 1, \ldots, N$: for $\gamma \in [1/2, \infty)$,

\[ \delta_i^\dagger(x) = \frac{\rho_i^\gamma}{\sum_{j \in N_x} \rho_j^\gamma}, \quad x \in \partial\Omega_{i,h} \cap \Gamma_h. \]

Here, $N_x$ is the set of indices $j$ of the subdomains such that $x \in \partial\Omega_{j,h}$.

Figure 1: $W$, $\widetilde{W}$ and $\widehat{W}$. (a) $W$: One-Level FETI. (b) $\widetilde{W}$: FETI-DP. (c) $\widehat{W}$: BDDC.

2.2 The One-Level FETI method

In this subsection, we review the one-level FETI method, following [26, Section 6.3]. We use the finite element functions in the space $W$ to discretize the minimization problem (2.3). Since the functions in $W$ are in general discontinuous across the interface, we need to enforce the continuity condition explicitly:

\[ \min_{u \in W} \tfrac{1}{2} u^T A u - f^T u, \quad \text{subject to } Bu = 0, \tag{2.5} \]

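In (2.5),

\[ A = \begin{bmatrix} A^{(1)} & & \\ & \ddots & \\ & & A^{(N)} \end{bmatrix}, \qquad f = \begin{bmatrix} f^{(1)} \\ \vdots \\ f^{(N)} \end{bmatrix}, \]

and $Bu = 0$ represents the continuity constraints across the interface, where $B = [B^{(1)}, B^{(2)}, \ldots, B^{(N)}]$ is a matrix with entries $0$, $1$, $-1$ such that $Bu = 0$ if and only if all the values of $u$ associated with more than one subdomain boundary coincide. The columns of $B$ which correspond to the interior nodes of $\Omega_i$ are zero. Thus, $B = [0 \;\; B_\Gamma]$ when the interior degrees of freedom are ordered first. We call $B$ a jump operator. Introducing a vector of Lagrange multipliers $\lambda$ to enforce the continuity constraint $Bu = 0$, we obtain the following Karush-Kuhn-Tucker (KKT) system: Find $(u, \lambda) \in W \times \mathrm{range}(B)$ such that

\[ \begin{aligned} Au + B^T \lambda &= f, \\ Bu &= 0. \end{aligned} \tag{2.6} \]

The multiplier $\lambda$ is unique only up to an additive element of $\ker(B^T)$. The space of Lagrange multipliers, $U$, is therefore chosen as $\mathrm{range}(B)$. Eliminating the interior unknowns in each subdomain, we obtain the following: Find $(u_\Gamma, \lambda) \in W_\Gamma \times \mathrm{range}(B_\Gamma)$ such that

\[ \begin{aligned} S u_\Gamma + B_\Gamma^T \lambda &= g_\Gamma, \\ B_\Gamma u_\Gamma &= 0, \end{aligned} \tag{2.7} \]

where

\[ S = \begin{bmatrix} S^{(1)} & & \\ & \ddots & \\ & & S^{(N)} \end{bmatrix}, \qquad g_\Gamma = \begin{bmatrix} g_\Gamma^{(1)} \\ \vdots \\ g_\Gamma^{(N)} \end{bmatrix}, \]

\[ S^{(i)} = A_{\Gamma\Gamma}^{(i)} - A_{\Gamma I}^{(i)} A_{II}^{(i)-1} A_{\Gamma I}^{(i)T}, \qquad g_\Gamma^{(i)} = f_\Gamma^{(i)} - A_{\Gamma I}^{(i)} A_{II}^{(i)-1} f_I^{(i)}, \qquad i = 1, \ldots, N, \]

and $B_\Gamma = [B_\Gamma^{(1)}, B_\Gamma^{(2)}, \ldots, B_\Gamma^{(N)}]$ is obtained by removing the zero columns of $B$ that correspond to the interior nodes of individual subdomains, so that $Bu = B_\Gamma u_\Gamma$ where $B = [0 \;\; B_\Gamma]$ and $u^T = [u_I^T \;\; u_\Gamma^T]$.

In all FETI methods, we reduce the KKT system (2.7) to an equation for $\lambda$ alone, by solving the first equation of (2.7) for $u_\Gamma$. The matrices $A$ in (2.6) and $S$ in (2.7), however, are generally singular when there are subdomains with boundaries which do not intersect the Dirichlet boundary $\partial\Omega_D$. We call such subdomains floating. In such a case the solution of the first equation of (2.7) exists if and only if $g_\Gamma - B_\Gamma^T \lambda \in \mathrm{range}(S)$; this requirement leads to the introduction of a projection $P$, which will be introduced shortly. First, we introduce a matrix $R$ such that $\mathrm{range}(R) = \ker(S)$:

\[ R = \begin{bmatrix} R^{(1)} & & \\ & \ddots & \\ & & R^{(N)} \end{bmatrix}, \]

where $R^{(i)}$ consists of the null vectors of $S^{(i)}$, $i = 1, \ldots, N$. Subdomains with nonsingular stiffness matrices do not contribute to the matrix $R$, i.e., $R^{(i)}$ is an empty matrix if the boundary of the subdomain $\Omega_i$ intersects the Dirichlet boundary $\partial\Omega_D$.

We now solve the first equation of (2.7) for $u_\Gamma$:

\[ u_\Gamma = S^\dagger (g_\Gamma - B_\Gamma^T \lambda) + R\alpha \quad \text{if } g_\Gamma - B_\Gamma^T \lambda \in \mathrm{range}(S) = \ker(S)^\perp = \mathrm{range}(R)^\perp, \tag{2.8} \]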

In (2.8), $S^\dagger$ is a pseudoinverse of $S$ and $\alpha$ has to be determined. Substituting (2.8) into the second equation of (2.7), we obtain

\[ B_\Gamma S^\dagger B_\Gamma^T \lambda = B_\Gamma S^\dagger g_\Gamma + B_\Gamma R \alpha. \tag{2.9} \]

We introduce the notation $F := B_\Gamma S^\dagger B_\Gamma^T$, $d := B_\Gamma S^\dagger g_\Gamma$, $G := B_\Gamma R$, $e := R^T g_\Gamma$ and $P := I - G (G^T G)^{-1} G^T$. Note that $P$ is a projection operator whose range is orthogonal to the range of $G$. We apply $P$ to (2.9) to eliminate the term with $\alpha$, using $PG = 0$, and rewrite the orthogonality condition in (2.8) as $G^T \lambda = e$ to obtain the following:

\[ \begin{cases} P F \lambda = P d, \\ G^T \lambda = e. \end{cases} \tag{2.10} \]

We define the space $V := \{\mu \in U : B_\Gamma^T \mu \in \mathrm{range}(S)\} = \ker(G^T)$, which we call the space of admissible increments, following Chen and Mandel [7]. The one-level FETI method is a preconditioned conjugate gradient method applied to

\[ P F \lambda = P d, \quad \lambda \in \lambda_0 + V, \tag{2.11} \]

where $\lambda_0$ is chosen such that $G^T \lambda_0 = e$. Here, we only consider the Dirichlet preconditioner $M_D^{-1} := B_{D,\Gamma} S B_{D,\Gamma}^T$, where $B_{D,\Gamma} = [B_{D,\Gamma}^{(1)}, \ldots, B_{D,\Gamma}^{(N)}]$ is a scaled jump operator. $B_{D,\Gamma}^{(i)}$ is obtained as follows: each nonzero entry of $B_\Gamma^{(i)}$ is related to the Lagrange multiplier enforcing the continuity at a node $x \in \partial\Omega_{i,h} \cap \partial\Omega_{j,h}$ and is multiplied by $\delta_j^\dagger(x)$ to produce the corresponding entry of $B_{D,\Gamma}^{(i)}$. With this choice of preconditioner, the preconditioned operator of the one-level FETI method has the following condition number bound:

\[ \kappa \le C \left( 1 + \log(H/h) \right)^2, \tag{2.12} \]

where $\kappa$ denotes the condition number of the preconditioned operator in the appropriate subspace, $H$ is the subdomain diameter, and $h$ is the element size. For a proof of (2.12), see [24] or [26, Section 6.3]. Thus the convergence rate of the one-level FETI method depends only polylogarithmically on the number of degrees of freedom of a subdomain.

2.3 The FETI-DP method

In this subsection, we closely follow the notation of [2]. For more details on various FETI-DP methods, see, e.g., [2, 6, 2, 26] and the references therein. In the FETI-DP method, we use finite element functions in $\widetilde{W}$ to discretize (2.3).
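Working in the space $\widetilde{W}$ amounts to splitting the finite element nodes into interior nodes, dual interface nodes (where jumps are allowed), and primal nodes (where continuity is built in). The sketch below performs this split for a structured grid in two dimensions. It assumes a unit square cut into $p \times p$ square subdomains of $m \times m$ elements each, Dirichlet data on the whole boundary, and subdomain vertices as the only primal nodes; the function name `classify_nodes` and the labels are hypothetical and chosen for this example.

```python
# Label the nodes of a structured grid as interior ("I"), dual ("Delta") or primal ("Pi").
# Assumptions: unit square, p x p square subdomains with m x m elements each,
# homogeneous Dirichlet data on the entire boundary, vertex constraints only.

def classify_nodes(p, m):
    n = p * m                                   # elements per coordinate direction
    labels = {}
    for ix in range(n + 1):
        for iy in range(n + 1):
            if ix in (0, n) or iy in (0, n):
                labels[(ix, iy)] = "dirichlet"  # not part of the interface
                continue
            # A node on a subdomain boundary line belongs to two subdomains
            # in that coordinate direction, otherwise to one.
            nx = 2 if ix % m == 0 else 1
            ny = 2 if iy % m == 0 else 1
            shared_by = nx * ny
            if shared_by == 1:
                labels[(ix, iy)] = "I"
            elif shared_by == 2:
                labels[(ix, iy)] = "Delta"      # node on an edge
            else:
                labels[(ix, iy)] = "Pi"         # subdomain vertex (cross point)
    return labels

labels = classify_nodes(p=2, m=4)
counts = {}
for kind in labels.values():
    counts[kind] = counts.get(kind, 0) + 1
```

For the $2 \times 2$ decomposition above there is a single cross point, so the primal space has dimension one, while every other interface node carries one unknown per adjacent subdomain.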
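We first note that the local stiffness matrices $A^{(i)}$ and the local load vectors $f^{(i)}$ can be written as follows:

\[ A^{(i)} = \begin{bmatrix} A_{II}^{(i)} & A_{\Delta I}^{(i)T} & A_{\Pi I}^{(i)T} \\ A_{\Delta I}^{(i)} & A_{\Delta\Delta}^{(i)} & A_{\Pi\Delta}^{(i)T} \\ A_{\Pi I}^{(i)} & A_{\Pi\Delta}^{(i)} & A_{\Pi\Pi}^{(i)} \end{bmatrix}, \qquad f^{(i)} = \begin{bmatrix} f_I^{(i)} \\ f_\Delta^{(i)} \\ f_\Pi^{(i)} \end{bmatrix}, \tag{2.13} \]

where $I$, $\Delta$, and $\Pi$ indicate the index sets corresponding to the interior nodes, dual nodes, i.e., those of $W_\Delta^{(i)}$, and primal nodes, i.e., those of $W_\Pi^{(i)}$, respectively. We introduce the matrix $\widetilde{A}$, which can be thought of as the restriction of $A$, defined for the functions in $W$, to the subspace $\widetilde{W}$:

\[ \widetilde{A} = \begin{bmatrix} A_{II}^{(1)} & A_{\Delta I}^{(1)T} & & & & \widetilde{A}_{\Pi I}^{(1)T} \\ A_{\Delta I}^{(1)} & A_{\Delta\Delta}^{(1)} & & & & \widetilde{A}_{\Pi\Delta}^{(1)T} \\ & & \ddots & & & \vdots \\ & & & A_{II}^{(N)} & A_{\Delta I}^{(N)T} & \widetilde{A}_{\Pi I}^{(N)T} \\ & & & A_{\Delta I}^{(N)} & A_{\Delta\Delta}^{(N)} & \widetilde{A}_{\Pi\Delta}^{(N)T} \\ \widetilde{A}_{\Pi I}^{(1)} & \widetilde{A}_{\Pi\Delta}^{(1)} & \cdots & \widetilde{A}_{\Pi I}^{(N)} & \widetilde{A}_{\Pi\Delta}^{(N)} & \widetilde{A}_{\Pi\Pi} \end{bmatrix}. \tag{2.14} \]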

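Here,

\[ \widetilde{A}_{\Pi I}^{(i)} = R_\Pi^{(i)T} A_{\Pi I}^{(i)}, \qquad \widetilde{A}_{\Pi\Delta}^{(i)} = R_\Pi^{(i)T} A_{\Pi\Delta}^{(i)}, \quad i = 1, \ldots, N, \qquad \widetilde{A}_{\Pi\Pi} = \sum_{i=1}^N R_\Pi^{(i)T} A_{\Pi\Pi}^{(i)} R_\Pi^{(i)}, \]

where $R_\Pi^{(i)} : \widehat{W}_\Pi \to W_\Pi^{(i)}$, $i = 1, \ldots, N$, is a restriction operator which extracts the relevant subdomain component belonging to $W_\Pi^{(i)}$ from a vector in $\widehat{W}_\Pi$. As in the one-level FETI method, we introduce a vector of Lagrange multipliers and obtain the following saddle point problem: Find $(u, \lambda) \in \widetilde{W} \times \mathrm{range}(\widetilde{B})$ such that

\[ \begin{aligned} \widetilde{A} u + \widetilde{B}^T \lambda &= \widetilde{f}, \\ \widetilde{B} u &= 0. \end{aligned} \tag{2.15} \]

Again, $\widetilde{B}$ is a jump operator such that $\widetilde{B} u = 0$, $u \in \widetilde{W}$, if and only if the values of $u$ associated with more than one subdomain coincide. Eliminating the interior unknowns of each subdomain from the system (2.15), we obtain: Find $(u_\Gamma, \lambda) \in \widetilde{W}_\Gamma \times \mathrm{range}(\widetilde{B}_\Gamma)$ such that

\[ \begin{aligned} \widetilde{S} u_\Gamma + \widetilde{B}_\Gamma^T \lambda &= \widetilde{g}_\Gamma, \\ \widetilde{B}_\Gamma u_\Gamma &= 0. \end{aligned} \tag{2.16} \]

$\widetilde{S}$ can also be regarded as the restriction of $S$, defined on $W_\Gamma$, to the subspace $\widetilde{W}_\Gamma$: $\widetilde{S} = \overline{R}_\Gamma^T S \overline{R}_\Gamma$, where $\overline{R}_\Gamma : \widetilde{W}_\Gamma \to W_\Gamma$ is a direct sum of restriction operators that extract the subdomain part belonging to $W_\Gamma^{(i)}$ from a vector in $\widetilde{W}_\Gamma$. The matrix $\widetilde{A}$, and therefore also $\widetilde{S}$, are nonsingular, so we can solve the first equation of (2.16) for $u_\Gamma$ and substitute the resulting expression into the second equation of (2.16):

\[ \widetilde{B}_\Gamma \widetilde{S}^{-1} \widetilde{B}_\Gamma^T \lambda = \widetilde{B}_\Gamma \widetilde{S}^{-1} \widetilde{g}_\Gamma. \tag{2.17} \]

The Dirichlet preconditioner used in the FETI-DP algorithms to solve the equation (2.17) is $\widetilde{B}_{D,\Gamma} \widetilde{S} \widetilde{B}_{D,\Gamma}^T$, where $\widetilde{B}_{D,\Gamma} = [\widetilde{B}_{D,\Gamma}^{(1)}, \ldots, \widetilde{B}_{D,\Gamma}^{(N)}]$ is obtained in exactly the same manner as $B_{D,\Gamma}$ in Section 2.2. With this choice of preconditioner, the preconditioned operator for the FETI-DP method also has the condition number bound (2.12). For a proof of this convergence bound for the two-dimensional case, see, e.g., [25]. For three-dimensional scalar elliptic problems and linear elasticity problems, see [7] and [6], respectively.

2.4 The BDDC method

In this subsection, we review the BDDC method, following [27]. The discretized problem on the entire domain $\Omega$ is: Find $(u_I, u_\Gamma) \in (W_I, \widehat{W}_\Gamma)$ such that

\[ \begin{bmatrix} A_{II} & A_{\Gamma I}^T \\ A_{\Gamma I} & A_{\Gamma\Gamma} \end{bmatrix} \begin{bmatrix} u_I \\ u_\Gamma \end{bmatrix} = \begin{bmatrix} f_I \\ f_\Gamma \end{bmatrix}. \tag{2.18} \]

The equation (2.18) can be rewritten as

\[ \begin{bmatrix} A_{II}^{(1)} & & & A_{\Gamma I}^{(1)T} R_\Gamma^{(1)} \\ & \ddots & & \vdots \\ & & A_{II}^{(N)} & A_{\Gamma I}^{(N)T} R_\Gamma^{(N)} \\ R_\Gamma^{(1)T} A_{\Gamma I}^{(1)} & \cdots & R_\Gamma^{(N)T} A_{\Gamma I}^{(N)} & \sum_{i=1}^N R_\Gamma^{(i)T} A_{\Gamma\Gamma}^{(i)} R_\Gamma^{(i)} \end{bmatrix} \begin{bmatrix} u_I^{(1)} \\ \vdots \\ u_I^{(N)} \\ u_\Gamma \end{bmatrix} = \begin{bmatrix} f_I^{(1)} \\ \vdots \\ f_I^{(N)} \\ \sum_{i=1}^N R_\Gamma^{(i)T} f_\Gamma^{(i)} \end{bmatrix}. \tag{2.19} \]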

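In (2.19), $R_\Gamma^{(i)} : \widehat{W}_\Gamma \to W_\Gamma^{(i)}$ are restriction operators which extract the subdomain parts. Eliminating the interior unknowns of each subdomain, i.e., eliminating the upper left block of (2.19), we obtain

\[ \widehat{S} u_\Gamma = g_\Gamma, \tag{2.20} \]

where

\[ \widehat{S} = \sum_{i=1}^N R_\Gamma^{(i)T} \left( A_{\Gamma\Gamma}^{(i)} - A_{\Gamma I}^{(i)} A_{II}^{(i)-1} A_{\Gamma I}^{(i)T} \right) R_\Gamma^{(i)} = \sum_{i=1}^N R_\Gamma^{(i)T} S^{(i)} R_\Gamma^{(i)} = R_\Gamma^T S R_\Gamma, \tag{2.21} \]

and

\[ g_\Gamma = \sum_{i=1}^N R_\Gamma^{(i)T} \left( f_\Gamma^{(i)} - A_{\Gamma I}^{(i)} A_{II}^{(i)-1} f_I^{(i)} \right). \]

$R_\Gamma : \widehat{W}_\Gamma \to W_\Gamma$ and $S$ are direct sums of $R_\Gamma^{(i)}$ and $S^{(i)}$, respectively. From (2.21), we can see that $\widehat{S}$ can be regarded as the restriction of $S$, defined on $W_\Gamma = \prod_{i=1}^N W_\Gamma^{(i)}$, to the continuous subspace $\widehat{W}_\Gamma$. We can also view $\widehat{S}$ as the restriction of $\widetilde{S}$ to $\widehat{W}_\Gamma$: $\widehat{S} = \widetilde{R}_\Gamma^T \widetilde{S} \widetilde{R}_\Gamma$, where $\widetilde{R}_\Gamma : \widehat{W}_\Gamma \to \widetilde{W}_\Gamma$. We introduce a few more restriction operators that constitute $\widetilde{R}_\Gamma$. $R_\Delta^{(i)} : \widehat{W}_\Gamma \to W_\Delta^{(i)}$ extracts the part that belongs to $W_\Delta^{(i)}$ from a vector in $\widehat{W}_\Gamma$. $R_{\Gamma\Pi} : \widehat{W}_\Gamma \to \widehat{W}_\Pi$ is defined similarly. Thus

\[ \widetilde{R}_\Gamma = \begin{bmatrix} R_\Delta^{(1)T} & \cdots & R_\Delta^{(N)T} & R_{\Gamma\Pi}^T \end{bmatrix}^T. \]

We also define $R_{D,\Delta}^{(i)}$, which are scaled versions of $R_\Delta^{(i)}$; each row of $R_\Delta^{(i)}$ has exactly one nonzero entry corresponding to a node $x$ on the subdomain interface. Multiplying each such entry with $\delta_i^\dagger(x)$ results in the scaled version $R_{D,\Delta}^{(i)}$, and replacing each $R_\Delta^{(i)}$ in $\widetilde{R}_\Gamma$ by $R_{D,\Delta}^{(i)}$ gives $\widetilde{R}_{D,\Gamma}$. In the BDDC method, we use $M^{-1} = \widetilde{R}_{D,\Gamma}^T \widetilde{S}^{-1} \widetilde{R}_{D,\Gamma}$ as the preconditioner. With this choice of preconditioner, the BDDC and the FETI-DP algorithms have the same set of eigenvalues, except possibly for eigenvalues equal to 0 and 1; see [22, 23, 2].

3 A Model Problem

3.1 Motivation and Notations

Our ultimate goal is to solve multi-body contact problems without friction. Contact problems are characterized by an active area of contact, which is unknown a priori, and inequality constraints such as non-penetration conditions; see, e.g., [2]. We consider two slightly different approaches to derive the same saddle point formulation we are going to solve. In the first approach the method of attack is an active set method from the outset, in which we solve a sequence of equality constrained minimization problems. We choose to solve a saddle point formulation of each such equality constrained problem for a reason we will explain below. In the second approach, described e.g. in [9], a complementarity problem of the original minimization problem is first considered.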
This problem can be expressed as a single nonlinear equation, and we can use a semismooth Newton method to solve this nonlinear problem. It turns out that the linear problems that are solved in this approach are identical to the saddle point problems of the first approach, and the two approaches differ only in the solution update process. In this paper, we develop the theory only for scalar elliptic problems with inequality constraints and present the following model problem, which is taken from [5, 6], as a motivation:

\[ \begin{aligned} &\min \sum_{i=1}^{2} \left( \frac{1}{2} \int_{\Omega_i} |\nabla u_i|^2 \, dx - \int_{\Omega_i} f u_i \, dx \right) \\ &\text{where } u_i \in H^1(\Omega_i),\ i = 1, 2, \quad \Omega_1 = (0,1) \times (0,1),\ \Omega_2 = (1,2) \times (0,1), \\ &u_1 = 0 \text{ on } \Gamma_u = \{0\} \times (0,1), \\ &u_2 - u_1 \ge 0 \text{ on } \Gamma_c = \{1\} \times (0,1). \end{aligned} \tag{3.1} \]

The reason we consider only scalar elliptic problems is that the inequality constraints in scalar elliptic problems are much simpler than those in linear elasticity problems, and this simplicity allows us to focus on the analysis of the preconditioned operator. We consider a generalization of (3.1), a minimization problem with multiple bodies $\Omega_i$, $i = 1, \ldots, N$, each of which has many degrees of freedom and is decomposed into subdomains $\Omega_{i,j}$, $i = 1, \ldots, N$, $j = 1, \ldots, N_i$, constrained by an inequality condition:

\[ \begin{aligned} &\min \sum_{i=1}^{N} \left( \frac{1}{2} \int_{\Omega_i} \rho\, |\nabla u_i|^2 \, dx - \int_{\Omega_i} f u_i \, dx \right) \\ &\text{where } u_i \in H^1(\Omega_i),\ i = 1, \ldots, N, \quad u_i = 0 \text{ on } \Gamma_u^i, \quad \sum_{i=1}^{N} B^{(i)} u_i \le 0. \end{aligned} \tag{3.2} \]

We assume that the boundary of at least one body is clamped, i.e., $\partial\Omega_D := \bigcup_{i=1}^N \Gamma_u^i \ne \emptyset$. We assume that there are no traction forces. We assume that $\rho(x) = \rho_{i,j} \ge \rho_{\min} > 0$, $x \in \Omega_{i,j}$, $\forall i, j$. We also assume the existence of a constant $C$, independent of $i$, such that

\[ \rho_i := \max_j \rho_{i,j} \le C \rho_{i,j}, \quad \forall i, j. \tag{3.3} \]

We introduce two types of global interfaces. The first one is $\Gamma_{gl} := \bigcup_{i \ne j} \partial\Omega_i \cap \partial\Omega_j$, and can be viewed as the potential contact area between the bodies: in the model problem, this is $\Gamma_c$. The second one, the current contact area, is denoted by $\Gamma_{gl}^k$, where $\Gamma_{gl}^k \subset \Gamma_{gl}$; the superscript $k$ refers to the outer iteration of the active set method and reminds us that the current active set/area changes. In each outer iteration of the active set method some of the inequality constraints are adopted as the corresponding equality constraints and the rest are ignored, and $\Gamma_{gl,h}^k$, the discrete version of $\Gamma_{gl}^k$, can be viewed as the collection of the nodes at which equality constraints are being imposed. We also introduce the local interfaces $\Gamma_{loc}^{(i)} := \bigcup_{j \ne k} (\partial\Omega_{i,j} \cap \partial\Omega_{i,k})$, $i = 1, \ldots, N$. We denote the union of the free boundaries by $\partial\Omega_F := \left( \bigcup_i \partial\Omega_i \right) \setminus (\partial\Omega_D \cup \Gamma_{gl})$. We denote the standard finite element space of continuous, piecewise linear functions on $\Omega_{i,j}$ by $W^{(i,j)}$.
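Each $W^{(i,j)}$ is decomposed into a subdomain interior part $W_I^{(i,j)}$ and a subdomain interface part $W_\Gamma^{(i,j)}$, the latter associated with the nodes of $\partial\Omega_{i,j}$ on the local and global interfaces. We also recall that all functions vanish on the Dirichlet boundary $\partial\Omega_D$. We define the associated product spaces $W^{(i)} := \prod_{j=1}^{N_i} W^{(i,j)}$ and $W_\Gamma^{(i)} := \prod_{j=1}^{N_i} W_\Gamma^{(i,j)}$. We introduce spaces analogous to $\widetilde{W}$ and $\widehat{W}$ of Section 2. Functions in $W_\Gamma^{(i)}$ are in general discontinuous across the local interface $\Gamma_{loc}^{(i)}$, and we define $\widehat{W}_\Gamma^{(i)}$ as the continuous subspace of $W_\Gamma^{(i)}$. $\widetilde{W}_\Gamma^{(i)} := W_\Delta^{(i)} \oplus \widehat{W}_\Pi^{(i)}$ is an intermediate space between $\widehat{W}_\Gamma^{(i)}$ and $W_\Gamma^{(i)}$. In the two-dimensional case, which is the focus of our analysis in this paper, $\widetilde{W}_\Gamma^{(i)}$ consists of functions that are continuous at subdomain vertices. Also, we let $\widetilde{W}^{(i)} := W_I^{(i)} \oplus \widetilde{W}_\Gamma^{(i)}$. We introduce the matrices $A^{(i)}$, which are the direct sums of the stiffness matrices $A^{(i,j)}$, $j = 1, \ldots, N_i$, for the individual subdomains:

\[ A^{(i)} = \begin{bmatrix} A^{(i,1)} & & \\ & \ddots & \\ & & A^{(i,N_i)} \end{bmatrix}, \quad i = 1, \ldots, N. \tag{3.4} \]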
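The direct-sum structure in (3.4) carries over to the subdomain Schur complements, which are formed as in (2.7), one subdomain at a time. The following minimal sketch uses a deliberately tiny one-dimensional floating subdomain with one interior node and two interface nodes; this setup and the helper names (`solve`, `schur_complement`, `direct_sum`, `floating_subdomain`) are assumptions made for illustration. It shows that the Schur complement of a floating subdomain is singular, with the constant vector in its kernel.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting (small dense matrices)."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def schur_complement(A_II, A_GI, A_GG):
    """S = A_GG - A_GI A_II^{-1} A_GI^T, with G standing for the interface block."""
    correction = matmul(A_GI, solve(A_II, transpose(A_GI)))
    m = len(A_GG)
    return [[A_GG[i][j] - correction[i][j] for j in range(m)] for i in range(m)]

def direct_sum(blocks):
    """Block-diagonal matrix built from square blocks, as in (3.4)."""
    n = sum(len(b) for b in blocks)
    D = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                D[offset + i][offset + j] = value
        offset += len(b)
    return D

def floating_subdomain(rho, h):
    """1D subdomain made of two elements: one interior node, two interface nodes."""
    k = rho / h
    return [[2 * k]], [[-k], [-k]], [[k, 0.0], [0.0, k]]   # A_II, A_GI, A_GG

S1 = schur_complement(*floating_subdomain(rho=1.0, h=0.5))
S2 = schur_complement(*floating_subdomain(rho=10.0, h=0.5))
S = direct_sum([S1, S2])
```

Here `S1` equals `[[1, -1], [-1, 1]]` up to rounding, so multiplying it by the constant vector gives zero; this is the kernel that the matrices $R$ of Section 2.2 collect.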

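Figure 2: $\widetilde{W}_c$ for FETI-FETI, $\widehat{W}_c$ for Hybrid. (a) FETI-FETI. (b) Hybrid. Both panels are annotated with the sizes $H_b$ and $H_s$.

We also introduce the Schur complements

\[ S^{(i)} = \begin{bmatrix} S^{(i,1)} & & \\ & \ddots & \\ & & S^{(i,N_i)} \end{bmatrix}, \qquad S^{(i,j)} = A_{\Gamma\Gamma}^{(i,j)} - A_{\Gamma I}^{(i,j)} A_{II}^{(i,j)-1} A_{\Gamma I}^{(i,j)T}, \quad j = 1, \ldots, N_i. \]

The finite element formulation of the problem (3.1) in the spaces $\widetilde{W}_c := \prod_{i=1}^N \widetilde{W}^{(i)}$ and $\widetilde{W}_{\Gamma,c} := \prod_{i=1}^N \widetilde{W}_\Gamma^{(i)}$ has been considered in the FETI-FETI method [2, 8]. In the hybrid method, we formulate the problem in the space $\widehat{W}_{\Gamma,c} := \prod_{i=1}^N \widehat{W}_\Gamma^{(i)}$. We will briefly describe the FETI-FETI method here. The problem (3.1) can be expressed as

\[ \min_{u \in \widetilde{W}_c} \tfrac{1}{2} u^T \widetilde{A}_c u - f_c^T u, \quad \text{with } B_c u \le 0. \tag{3.5} \]

Recalling that we are using an active set method to deal with the inequality conditions, we formulate the minimization problem on the current active set:

\[ \min_{u \in \widetilde{W}_c} \tfrac{1}{2} u^T \widetilde{A}_c u - f_c^T u, \quad \text{with } Z^k B_c u = 0, \tag{3.6} \]

where

\[ \widetilde{A}_c = \begin{bmatrix} \widetilde{A}^{(1)} & & \\ & \ddots & \\ & & \widetilde{A}^{(N)} \end{bmatrix}, \quad u = \begin{bmatrix} u^{(1)} \\ \vdots \\ u^{(N)} \end{bmatrix}, \quad f_c = \begin{bmatrix} f^{(1)} \\ \vdots \\ f^{(N)} \end{bmatrix}, \quad u^{(i)} \in \widetilde{W}^{(i)},\ i = 1, \ldots, N, \]

and

\[ B_c = \begin{bmatrix} B_{loc} \\ B_{gl} \end{bmatrix} = \begin{bmatrix} B_{loc}^{(1)} & & \\ & \ddots & \\ & & B_{loc}^{(N)} \\ B_{gl}^{(1)} & \cdots & B_{gl}^{(N)} \end{bmatrix}, \quad B_{loc}^{(i)} = \begin{bmatrix} B_{loc}^{(i,1)} & \cdots & B_{loc}^{(i,N_i)} \end{bmatrix},\ i = 1, \ldots, N, \quad Z^k = \begin{bmatrix} I & \\ & Z_{gl}^k \end{bmatrix}. \]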
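The roles of a jump operator and of the selection matrix $Z_{gl}^k$ can be seen on a toy configuration. The sketch below is only an illustration under stated assumptions: four degrees of freedom that are two copies of each of two interface nodes, and a hand-picked active set. The names `jump_operator`, `selection`, and `apply` are hypothetical helpers, not part of any library.

```python
def jump_operator(pairs, n_dofs):
    """One row per pair (a, b) of duplicated degrees of freedom; the row computes u[a] - u[b]."""
    B = []
    for a, b in pairs:
        row = [0] * n_dofs
        row[a], row[b] = 1, -1
        B.append(row)
    return B

def selection(active, m):
    """Identity matrix of size m with the diagonal entries of inactive constraints set to zero."""
    return [[1 if (i == j and i in active) else 0 for j in range(m)] for i in range(m)]

def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed layout: dofs 0 and 2 sit at one interface node, dofs 1 and 3 at another.
B_gl = jump_operator([(0, 2), (1, 3)], n_dofs=4)
Z_gl = selection(active={0}, m=2)          # only the first constraint is currently active

u = [0.5, 0.1, 0.5, 0.4]                   # matches at the first node, jumps at the second
jump = apply(B_gl, u)                      # first entry is zero, second is not
masked = apply(Z_gl, jump)                 # the inactive constraint is ignored
```

With this `u`, the masked jump vanishes although `u` is discontinuous at the second node, which is exactly what an equality constraint of the form $Z^k B_c u = 0$ permits.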

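The constraint $Z^k B_c u = 0$ in (3.6) indicates the continuity constraint across the local subdomain interfaces $\Gamma_{loc}^{(i)}$, $i = 1, \ldots, N$, as well as the continuity constraint across the global area of contact $\Gamma_{gl}^k$. $Z_{gl}^k$ is a square matrix obtained by replacing some of the diagonal entries of the identity matrix with zeros; only the entries corresponding to the nodes at which an equality is being imposed are retained. We use the superscript $k$ as an indication that $Z_{gl}^k$ and $Z^k$ change in each iteration of the active set method. We have $B_{loc}^{(i)} u^{(i)} = 0$, $u^{(i)} \in \widetilde{W}^{(i)}$, exactly when the values associated with more than one subdomain on the body $\Omega_i$ coincide. Note that $B_{loc}^{(i)}$ has nonzero columns only for the components of $W_\Delta^{(i)}$. We also introduce a scaled jump operator, $B_{D,c}$:

\[ B_{D,c} = \begin{bmatrix} B_{loc,D} \\ B_{gl,D} \end{bmatrix} = \begin{bmatrix} B_{loc,D}^{(1)} & & \\ & \ddots & \\ & & B_{loc,D}^{(N)} \\ B_{gl,D}^{(1)} & \cdots & B_{gl,D}^{(N)} \end{bmatrix}, \qquad B_{loc,D}^{(i)} = \begin{bmatrix} B_{loc,D}^{(i,1)} & \cdots & B_{loc,D}^{(i,N_i)} \end{bmatrix},\ i = 1, \ldots, N. \]

$B_{loc,D}$ and $B_{gl,D}$ are obtained in the same manner as the $B_{D,\Gamma}$ of the one-level FETI method (see Section 2.2). The nonzero entry of $B_{loc}^{(i,j)}$ associated with the Lagrange multiplier for the continuity at the node $x \in \partial\Omega_{i,j} \cap \partial\Omega_{i,k}$, multiplied by

\[ \delta_{i,k}^\dagger(x) = \frac{\rho_{i,k}^\gamma(x)}{\sum_{s \in N_{x,loc}} \rho_{i,s}^\gamma(x)}, \]

where $N_{x,loc}$ is the set of indices of the subdomains of $\Omega_i$ with $x$ on their boundary, is the corresponding entry of $B_{loc,D}^{(i,j)}$. Similarly, the nonzero entry of $B_{gl}^{(i)}$ associated with the Lagrange multiplier for the continuity at the node $x \in \partial\Omega_i \cap \partial\Omega_j$, multiplied by

\[ \delta_j^\dagger(x) = \frac{\sum_{s \in N_{x,loc}^{(j)}} \rho_{j,s}^\gamma(x)}{\sum_{k \in N_{x,gl},\ t \in N_{x,loc}^{(k)}} \rho_{k,t}^\gamma(x)}, \]

where $N_{x,gl}$ is the set of indices of the subdomains of any body which share the node $x$ on their boundary, is the corresponding entry of $B_{gl,D}^{(i)}$. Eliminating the interior unknowns in all subdomains of each body, we obtain the following reduced minimization problem,

\[ \min_{u_\Gamma \in \widetilde{W}_{\Gamma,c}} \tfrac{1}{2} u_\Gamma^T \widetilde{S}_c u_\Gamma - g_c^T u_\Gamma, \quad \text{with } Z^k B_{\Gamma,c} u_\Gamma = 0, \tag{3.7} \]

where

\[ \widetilde{S}_c = \begin{bmatrix} \widetilde{S}^{(1)} & & \\ & \ddots & \\ & & \widetilde{S}^{(N)} \end{bmatrix}, \qquad u_\Gamma = \begin{bmatrix} u_\Gamma^{(1)} \\ \vdots \\ u_\Gamma^{(N)} \end{bmatrix}, \qquad u_\Gamma^{(i)} \in \widetilde{W}_\Gamma^{(i)},\ i = 1, \ldots, N. \]

This minimization problem has the following KKT system:

\[ \begin{bmatrix} \widetilde{S}_c & (Z^k B_{\Gamma,c})^T \\ Z^k B_{\Gamma,c} & 0 \end{bmatrix} \begin{bmatrix} u_\Gamma \\ \lambda \end{bmatrix} = \begin{bmatrix} g_c \\ 0 \end{bmatrix}. \tag{3.8} \]

It is natural to reduce this system to an equation for $\lambda$ as in the one-level FETI method and solve it with the PCG method in a proper subspace, using the following preconditioner: $M_D^{-1} := Z^k B_{D,c} \widetilde{S}_c B_{D,c}^T Z^k$. The resulting method, i.e., the FETI-FETI method, turns out not to be scalable with respect to the number of subdomains; see [8, 2]. We now present a scalable alternative, which we name a hybrid method.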
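The interplay between the outer active set iteration and the equality-constrained KKT systems of the form (3.8) can be sketched on a very small problem. The code below is a minimal illustration of a primal-dual active set loop for $\min \frac{1}{2} u^T A u - f^T u$ subject to $Bu \le 0$, under several assumptions that are ours and not the paper's: $A$ is symmetric positive definite (no floating bodies), the KKT systems are solved directly by dense Gaussian elimination rather than by a preconditioned Krylov method, and the zero rows of $Z^k B$ are simply dropped. The data and the function names are made up for the example.

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system M x = rhs."""
    n = len(M)
    T = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[piv] = T[piv], T[c]
        for r in range(c + 1, n):
            factor = T[r][c] / T[c][c]
            T[r] = [a - factor * b for a, b in zip(T[r], T[c])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))) / T[i][i]
    return x

def active_set_solve(A, B, f, c=1.0, max_outer=20):
    """Primal-dual active set sketch for min 1/2 u^T A u - f^T u subject to B u <= 0."""
    n, m = len(A), len(B)
    u, lam = [0.0] * n, [0.0] * m
    active = None
    for _ in range(max_outer):
        Bu = [sum(B[i][j] * u[j] for j in range(n)) for i in range(m)]
        # Diagonal of Z^k: 1 where the constraint is treated as an equality.
        z = [1 if lam[i] + c * Bu[i] > 0 else 0 for i in range(m)]
        if z == active:                       # active set unchanged: stop
            break
        active = z
        idx = [i for i in range(m) if z[i]]
        rows = [B[i] for i in idx]            # nonzero rows of Z^k B
        k = len(rows)
        # KKT matrix [[A, (Z B)^T], [Z B, 0]] restricted to the active rows.
        K = [A[i] + [rows[r][i] for r in range(k)] for i in range(n)]
        K += [rows[r] + [0.0] * k for r in range(k)]
        sol = solve(K, f + [0.0] * k)
        u = sol[:n]
        lam = [0.0] * m
        for r, i in enumerate(idx):
            lam[i] = sol[n + r]
    return u, lam, active

# Two "bodies" with two contact nodes each; the constraint rows read u_body1 - u_body2 <= 0.
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
B = [[1.0, 0.0, -1.0, 0.0],
     [0.0, 1.0, 0.0, -1.0]]
f = [1.0, -1.0, -1.0, 1.0]                    # pushes the first node pair together, the second apart
u, lam, active = active_set_solve(A, B, f)
```

In this example the loop stops after the active set settles on the first node pair only: the bodies touch there with a positive multiplier, and they separate at the second pair, where the multiplier is zero.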

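4 A Hybrid Method

The hybrid method introduced in this section is a scalable alternative to the FETI-FETI method. We use finite element functions in the space $\widehat{W}_{\Gamma,c} := \prod_{i=1}^N \widehat{W}_\Gamma^{(i)}$ to discretize the contact problem. $W_{OL}^{(i)}$ denotes a finite element space on $\partial\Omega_i \cap \Gamma_{gl}$. Here, OL stands for one-level; this is because we will use one-level FETI type preconditioners, which are obtained by regarding each body as a single subdomain without further decomposition. We introduce the Schur complement $\widehat{S}^{(i)}$ on $\widehat{W}_\Gamma^{(i)}$, which can be obtained by restricting $S^{(i)}$ to $\widehat{W}_\Gamma^{(i)}$:

\[ \widehat{S}^{(i)} = R_\Gamma^{(i)T} S^{(i)} R_\Gamma^{(i)}, \quad i = 1, \ldots, N, \]

where $R_\Gamma^{(i)} : \widehat{W}_\Gamma^{(i)} \to W_\Gamma^{(i)}$. We also introduce restriction operators $\widetilde{R}_\Gamma^{(i)} : \widehat{W}_\Gamma^{(i)} \to \widetilde{W}_\Gamma^{(i)}$,

\[ \widetilde{R}_\Gamma^{(i)} = \begin{bmatrix} R_\Delta^{(i,1)} \\ \vdots \\ R_\Delta^{(i,N_i)} \\ R_{\Gamma\Pi}^{(i)} \end{bmatrix}, \]

where $R_\Delta^{(i,j)}$ extracts from a vector in $\widehat{W}_\Gamma^{(i)}$ the part that belongs to $W_\Delta^{(i,j)}$, and $R_{\Gamma\Pi}^{(i)} : \widehat{W}_\Gamma^{(i)} \to \widehat{W}_\Pi^{(i)}$ is defined similarly. We also define the scaled versions $\widetilde{R}_{D,\Gamma}^{(i)}$:

\[ \widetilde{R}_{D,\Gamma}^{(i)} = \begin{bmatrix} R_{D,\Delta}^{(i,1)} \\ \vdots \\ R_{D,\Delta}^{(i,N_i)} \\ R_{\Gamma\Pi}^{(i)} \end{bmatrix}. \]

Here, $R_{D,\Delta}^{(i,j)}$ is obtained as follows: a nonzero entry of $R_\Delta^{(i,j)}$, which corresponds to a node $x \in \partial\Omega_{i,j,h} \setminus \partial\Omega_{i,h}$, is multiplied by $\delta_{i,j}^\dagger(x)$, where

\[ \delta_{i,j}^\dagger(x) := \frac{\rho_{i,j}^\gamma(x)}{\sum_{k \in N_{x,loc}} \rho_{i,k}^\gamma(x)}. \]

The restriction of the minimization problem (3.7) in $\widetilde{W}_{\Gamma,c}$ to the subspace $\widehat{W}_{\Gamma,c}$ is as follows:

\[ \min_{u_\Gamma \in \widehat{W}_{\Gamma,c}} \tfrac{1}{2} u_\Gamma^T \widehat{S}_c u_\Gamma - \widehat{g}_c^T u_\Gamma, \quad \text{with } \widehat{Z}^k \widehat{B}_{\Gamma,c} u_\Gamma = 0, \tag{4.1} \]

where

\[ \widehat{S}_c = \begin{bmatrix} \widehat{S}^{(1)} & & \\ & \ddots & \\ & & \widehat{S}^{(N)} \end{bmatrix}, \]

and where $\widehat{B}_{\Gamma,c}$ only retains the rows and columns of $B_{\Gamma,c}$ which correspond to $\widehat{W}_{\Gamma,c}$, and $\widehat{Z}^k$ is obtained by removing irrelevant columns and rows of $Z^k$. In addition to $\widehat{B}_{\Gamma,c}$, we define another jump operator $B_{OL}$ which acts on vectors of the space $\prod_{i=1}^N W_{OL}^{(i)}$. This operator is needed in the preconditioner for the hybrid method. Recall that $\widehat{B}_{\Gamma,c}$ acts on vectors of the space $\widehat{W}_{\Gamma,c}$ and only has rows corresponding to the Lagrange multipliers enforcing the continuity between different bodies, i.e., continuity across $\Gamma_{gl}$. Thus $B_{OL}$ and $\widehat{B}_{\Gamma,c}$ differ only in that $\widehat{B}_{\Gamma,c}$ has a number of zero columns which correspond to the nodes on $\Gamma_{loc,h}$. We note that $B_{OL}$ can be regarded as the jump operator for the one-level FETI method resulting from viewing each body as a subdomain. We can define the scaled jump operator $B_{OL,D}$ in the usual way. Introducing a vector of Lagrange multipliers $\lambda$, we arrive at the following saddle point formulation of (4.1): Find $(u_\Gamma, \lambda) \in \widehat{W}_{\Gamma,c} \times \mathrm{range}(\widehat{B}_{\Gamma,c})$ such that

\[ \begin{bmatrix} \widehat{S}_c & (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \\ \widehat{Z}^k \widehat{B}_{\Gamma,c} & 0 \end{bmatrix} \begin{bmatrix} u_\Gamma \\ \lambda \end{bmatrix} = \begin{bmatrix} \widehat{g}_c \\ 0 \end{bmatrix}. \tag{4.2} \]

We can solve (4.2) by reducing the system to an equation for $\lambda$ alone in a proper subspace, but that requires inverting $\widehat{S}_c$, which is expensive. Instead, we keep the saddle point problem (4.2) as is and solve it by a Krylov subspace method which can deal with indefinite systems, such as the preconditioned conjugate residual (PCR) method. Due to the singularity of the matrix $\widehat{S}_c$, the solution of the upper part of the system (4.2) exists if and only if $\widehat{g}_c - (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \lambda \in \mathrm{range}(\widehat{S}_c)$. Most of the discussion here concerning this issue will be very similar to that of Section 2.2 on one-level FETI methods. As in the one-level FETI method, we introduce a matrix $R_c$ such that $\mathrm{range}(R_c) = \ker(\widehat{S}_c)$:

\[ R_c = \begin{bmatrix} R^{(1)} & & \\ & \ddots & \\ & & R^{(N)} \end{bmatrix}, \]

where $R^{(i)}$ consists of the null vectors of $\widehat{S}^{(i)}$, $i = 1, \ldots, N$. In the PCR iterations, we will use an initial vector of Lagrange multipliers $\lambda_0$ which satisfies $\widehat{g}_c - (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \lambda_0 \in \mathrm{range}(\widehat{S}_c)$, and an increment $\mu$ with $(\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \mu \in \mathrm{range}(\widehat{S}_c)$. Therefore the space of admissible increments is defined as follows:

\[ V^k := \{ \mu \in \mathrm{range}(\widehat{B}_{\Gamma,c}) : (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \mu \in \mathrm{range}(\widehat{S}_c) \} = \ker(G^{kT}), \]

where $G^k := \widehat{Z}^k \widehat{B}_{\Gamma,c} R_c$. We introduce a projection operator $P^k$ for the Lagrange multipliers, which is an orthogonal projection from $U$ to $V^k = \ker(G^{kT})$:

\[ P^k := I - G^k (G^{kT} G^k)^{-1} G^{kT}. \]

We also introduce a subspace $\widehat{W}_{\Gamma,R} := \mathrm{range}(\widehat{S}_c)$ of $\widehat{W}_{\Gamma,c}$. We rewrite (4.2) in terms of vectors in the subspace $\widehat{W}_{\Gamma,R} \times V^k$. First, noting that any admissible $\lambda$ has a decomposition of the form $\lambda = \lambda_0 + \mu$, $\mu \in V^k$, we rewrite the leading equation of (4.2) as

\[ \widehat{S}_c u_\Gamma + (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \mu = \widehat{g}_c - (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \lambda_0. \tag{4.3} \]

Using (4.3) and that $P^{kT} \mu = P^k \mu = \mu$, we can rewrite (4.2):

\[ \begin{bmatrix} \widehat{S}_c & (P^k \widehat{Z}^k \widehat{B}_{\Gamma,c})^T \\ \widehat{Z}^k \widehat{B}_{\Gamma,c} & 0 \end{bmatrix} \begin{bmatrix} u_\Gamma \\ \mu \end{bmatrix} = \begin{bmatrix} \widehat{g}_c - (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \lambda_0 \\ 0 \end{bmatrix}. \tag{4.4} \]

The solution of (4.4) satisfies

\[ \begin{bmatrix} \widehat{S}_c & (P^k \widehat{Z}^k \widehat{B}_{\Gamma,c})^T \\ P^k \widehat{Z}^k \widehat{B}_{\Gamma,c} & 0 \end{bmatrix} \begin{bmatrix} u_\Gamma \\ \mu \end{bmatrix} = \begin{bmatrix} \widehat{g}_c - (\widehat{Z}^k \widehat{B}_{\Gamma,c})^T \lambda_0 \\ 0 \end{bmatrix}. \tag{4.5} \]

We use the system (4.5) in order to make sure that our iterates are in the subspace $\widehat{W}_{\Gamma,R} \times V^k$. Note that the displacement part $u_\Gamma$ of the solution of (4.5) does not necessarily satisfy the continuity condition $\widehat{Z}^k \widehat{B}_{\Gamma,c} u_\Gamma = 0$; we can recover a solution which satisfies all constraints via the operation $u_\Gamma - R_c (G^{kT} G^k)^{-1} G^{kT} \widehat{B}_{\Gamma,c} u_\Gamma$, see [5, 2, 8].

We now consider a second approach to solve (3.5), following [9]; (3.5) can be rewritten as

\[ \min_{u_\Gamma \in \widehat{W}_{\Gamma,c}} \tfrac{1}{2} u_\Gamma^T \widehat{S}_c u_\Gamma - \widehat{g}_c^T u_\Gamma, \quad \text{with } \widehat{B}_{\Gamma,c} u_\Gamma \le 0, \tag{4.6} \]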
(4.5)

We use the system (4.5) in order to make sure that our iterates are in the subspace $\hat{W}_{\Gamma,R} \times V$. Note that the displacement part $u$ of the solution of (4.5) does not necessarily satisfy the continuity condition $\hat{Z}^k B_{\Gamma,c} u = 0$; we can recover a solution which satisfies all constraints via the operation

$$u - R_c (G^{kT} G^k)^{-1} G^{kT} B_{\Gamma,c} u,$$

see 5, 2, 8.

We now consider a second approach to solve (3.5), following 9; (3.5) can be rewritten as

$$\min_{u \in \hat{W}_{\Gamma,c}} \frac{1}{2} u^T \hat{S}_c u - \hat{g}_c^T u, \quad \text{with } B_{\Gamma,c} u \le 0, \tag{4.6}$$

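which is equivalent to the following problem:

$$\begin{cases} \hat{S}_c u + B_{\Gamma,c}^T \lambda = \hat{g}_c, \\ B_{\Gamma,c} u \le 0, \quad \lambda \ge 0, \quad \lambda^T B_{\Gamma,c} u = 0. \end{cases} \tag{4.7}$$

The complementarity condition given in the second line is equivalent to

$$C(u,\lambda) := \lambda - \max(0, \lambda + c B_{\Gamma,c} u) = 0, \tag{4.8}$$

for each $c > 0$. The system (4.7) can thus be expressed as the following nonlinear system of equations:

$$\begin{cases} \hat{S}_c u + B_{\Gamma,c}^T \lambda = \hat{g}_c, \\ C(u,\lambda) = 0. \end{cases} \tag{4.9}$$

It follows that a Newton step for the nonlinear system (4.9) is

$$\begin{bmatrix} \hat{S}_c & B_{\Gamma,c}^T \\ -c B_{\Gamma,c,\mathcal{A}^k} & I_{\mathcal{I}^k} \end{bmatrix} \begin{bmatrix} \delta u^k \\ \delta\lambda^k \end{bmatrix} = \begin{bmatrix} \hat{g}_c - (\hat{S}_c u^k + B_{\Gamma,c}^T \lambda^k) \\ -C(u^k,\lambda^k) \end{bmatrix} \tag{4.10}$$

and

$$u^{k+1} = u^k + \delta u^k, \quad \lambda^{k+1} = \lambda^k + \delta\lambda^k, \tag{4.11}$$

where

$$\mathcal{I}^k = \{i : (\lambda^k + c B_{\Gamma,c} u^k)_i \le 0\}, \quad \mathcal{A}^k = \{i : (\lambda^k + c B_{\Gamma,c} u^k)_i > 0\}, \tag{4.12}$$

and $B_{\Gamma,c,\mathcal{A}^k}$ results from replacing row $i$ of $B_{\Gamma,c}$ with zeros, for all $i \notin \mathcal{A}^k$. $I_{\mathcal{I}^k}$ is defined similarly. We can rewrite the second equation of (4.10) as follows:

$$(c B_{\Gamma,c} \delta u^k)_i = -(c B_{\Gamma,c} u^k)_i, \; i \in \mathcal{A}^k, \quad \text{and} \quad (\delta\lambda^k)_i = -\lambda^k_i, \; i \in \mathcal{I}^k, \tag{4.13}$$

and the first equation as

$$\hat{S}_c \delta u^k + B_{\Gamma,c,\mathcal{A}^k}^T (\delta\lambda^k)_{\mathcal{A}^k} + B_{\Gamma,c,\mathcal{I}^k}^T (\delta\lambda^k)_{\mathcal{I}^k} = \hat{g}_c - \left(\hat{S}_c u^k + B_{\Gamma,c,\mathcal{A}^k}^T (\lambda^k)_{\mathcal{A}^k} + B_{\Gamma,c,\mathcal{I}^k}^T (\lambda^k)_{\mathcal{I}^k}\right), \tag{4.14}$$

which is equivalent to

$$\hat{S}_c \delta u^k + B_{\Gamma,c,\mathcal{A}^k}^T (\delta\lambda^k)_{\mathcal{A}^k} = \hat{g}_c - \left(\hat{S}_c u^k + B_{\Gamma,c,\mathcal{A}^k}^T (\lambda^k)_{\mathcal{A}^k}\right), \tag{4.15}$$

due to (4.13). Consequently, we can rewrite the Newton step defined by (4.10) and (4.11) as

$$\begin{bmatrix} \hat{S}_c & B_{\Gamma,c,\mathcal{A}^k}^T \\ c B_{\Gamma,c,\mathcal{A}^k} & 0 \end{bmatrix} \begin{bmatrix} \delta u^k \\ \delta\lambda^k \end{bmatrix} = \begin{bmatrix} \hat{g}_c - \left(\hat{S}_c u^k + B_{\Gamma,c,\mathcal{A}^k}^T (\lambda^k)_{\mathcal{A}^k}\right) \\ -c B_{\Gamma,c,\mathcal{A}^k} u^k \end{bmatrix} \tag{4.16}$$

and

$$u^{k+1} = u^k + \delta u^k, \quad \lambda^{k+1} = \lambda^k + \delta\lambda^k, \quad \text{where } (\delta\lambda^k)_i = -\lambda^k_i, \; i \in \mathcal{I}^k. \tag{4.17}$$

Note that with a given pair $(u^k, \lambda^k)$, (4.16) and (4.17) are equivalent to

$$\begin{bmatrix} \hat{S}_c & B_{\Gamma,c,\mathcal{A}^k}^T \\ c B_{\Gamma,c,\mathcal{A}^k} & 0 \end{bmatrix} \begin{bmatrix} u^{k+1} \\ \lambda^{k+1} \end{bmatrix} = \begin{bmatrix} \hat{g}_c \\ 0 \end{bmatrix}. \tag{4.18}$$

Notice the similarity between (4.2) and (4.18); in fact, $B_{\Gamma,c,\mathcal{A}^k} = \hat{Z}^k B_{\Gamma,c}$ and with $c = 1$, (4.2) and (4.18) are the same. In the following we will solve (4.9), with $c = 1$, via Newton's method as defined by (4.16) and (4.17).

We now discuss our choice of preconditioner. Let $A_{OL}^{(i)}$ denote the stiffness matrix for the entire body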

Ω i : this needs to be distinguished from A, which is a direct sum of stiffness matrices for individual subdomains (see (3.4)). We have A OL = A A A T A, i =,,N, where A corresponds to the nodes on Ω i gl and A to those interior of Ω i, etc. We define the corresponding Schur complement S OL := A A A A T and also a block-diagonal matrix for the entire system as the direct sum of the Schur complements for the individual bodies: S OL := S () OL Since inverting A can be expensive in practice, we need to solve A x = b approximately; we propose a way of doing this in Section 7. We now introduce the following block-diagonal preconditioner for the system: B PR M = BDDC P R P k M D P k (4.9) M BDDC = S (N) OL where P R := R c (Rc T R c ) Rc T is an orthogonal projection operator onto range(ŝc) and R ()T () D, S R () D, where M D We rewrite the KKT system (4.5) as A := Ŝ c (P k Ẑ k B,c ) T P k Ẑ k B,c. R (N) T (N) D, S R (N) D, = Ẑk B OL,DS OL B T OL,DẐkT., Ax = F, (4.2), x := u λ and F := ĝc. (4.2) n our hybrid method, we use a preconditioned conjugate residual (PCR) method with B as the preconditioner. For a description of the preconditioned conjugate residual method, see 26, 8. 5 Convergence Estimates Suppose the following system is solved with the PCR method with the preconditioner M: We define Au = b. (5.) K(M A) = µ max µ = max{ λ : λ σ(m A)} { λ : λ σ(m A)}, (5.2) where σ(m A) is the spectrum of M A. We have the following result, see 26, C.6.2; a proof can be found in 8, Section 9.5. 4

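Lemma 5.1. Let $A$ be regular and symmetric and $M$ symmetric and positive definite. Then, after $k$ steps of the PCR algorithm, the norm of the residual is bounded by

$$\|M^{-1/2} r_k\|_2 \le \frac{2\rho^{\mu}}{1 + \rho^{2\mu}} \|M^{-1/2} r_0\|_2,$$

where $\rho = \dfrac{K - 1}{K + 1}$ with $K = K(M^{-1}A)$, and $\mu \in \mathbb{Z}$ is such that $k/2 - 1 < \mu \le k/2$.

According to Lemma 5.1, we need to study the spectrum of the preconditioned operator $\mathcal{B}^{-1}\mathcal{A}$, which has the same spectrum as $\mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2}$.

General Case

We first study a general case, assuming that

$$\mathcal{A} = \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix}, \quad \mathcal{B} = \begin{bmatrix} \hat{A} & 0 \\ 0 & \hat{C} \end{bmatrix}, \quad \alpha_0 u^T \hat{A} u \le u^T A u \le \alpha_1 u^T \hat{A} u, \; \forall u. \tag{5.3}$$

We assume that $A, \hat{A} \in \mathbb{R}^{n \times n}$ and $\hat{C} \in \mathbb{R}^{m \times m}$ are real, symmetric, and positive definite. Then,

$$\mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2} = \begin{bmatrix} \hat{A}^{-1/2} A \hat{A}^{-1/2} & \hat{A}^{-1/2} B^T \hat{C}^{-1/2} \\ \hat{C}^{-1/2} B \hat{A}^{-1/2} & 0 \end{bmatrix}.$$

In the following, we use the notation $\tilde{A} := \hat{A}^{-1/2} A \hat{A}^{-1/2}$ and $\tilde{B} := \hat{C}^{-1/2} B \hat{A}^{-1/2}$. Note that

$$\alpha_0 u^T u \le u^T \tilde{A} u \le \alpha_1 u^T u, \; \forall u. \tag{5.4}$$

We study the cases $\hat{A} = A$ and $\hat{A} \ne A$ separately. When $\hat{A} = A$, $\tilde{A}$ is simply the identity matrix and

$$\mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2} = \begin{bmatrix} I & \tilde{B}^T \\ \tilde{B} & 0 \end{bmatrix}. \tag{5.5}$$

Lemma 5.2. Let $\mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2}$ be defined as in (5.5). We then have

$$K(\mathcal{B}^{-1}\mathcal{A}) = K(\mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2}) = \frac{1/2 + \sqrt{1/4 + \lambda_{\max}}}{-1/2 + \sqrt{1/4 + \lambda_{\min}}},$$

where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of $\tilde{B}^T \tilde{B}$, respectively.

Proof. We consider the following eigenvalue problem:

$$\begin{bmatrix} I & \tilde{B}^T \\ \tilde{B} & 0 \end{bmatrix} \begin{bmatrix} u \\ \lambda \end{bmatrix} = t \begin{bmatrix} u \\ \lambda \end{bmatrix},$$

which is equivalent to

$$u + \tilde{B}^T \lambda = t u, \qquad \tilde{B} u = t \lambda. \tag{5.6}$$

Notice that $t \ne 0$ due to the nonsingularity of $\mathcal{A}$ and $\mathcal{B}$. Substituting the second equation of (5.6) into the first, we obtain

$$u + \frac{1}{t} \tilde{B}^T \tilde{B} u = t u.$$

Denoting the eigenvalues of $\tilde{B}^T \tilde{B}$ by $\lambda_i$, $i = 1, \ldots, n$, we obtain

$$(1 + \lambda_i / t - t) u = 0, \quad i = 1, \ldots, n.$$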

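Since $u = 0$ leads to $\lambda = 0$, we need to solve $1 + \lambda_i / t - t = 0$, $i = 1, \ldots, n$, which are equivalent to the quadratic equations $t^2 - t - \lambda_i = 0$. Their solutions are $1/2 \pm \sqrt{1/4 + \lambda_i}$ and thus

$$\sigma(\mathcal{B}^{-1}\mathcal{A}) = \{1/2 \pm \sqrt{1/4 + \lambda_i} : i = 1, \ldots, n\}.$$

Clearly,

$$\max\{|\lambda| : \lambda \in \sigma(\mathcal{B}^{-1}\mathcal{A})\} = 1/2 + \sqrt{1/4 + \lambda_{\max}} \quad \text{and} \quad \min\{|\lambda| : \lambda \in \sigma(\mathcal{B}^{-1}\mathcal{A})\} = -1/2 + \sqrt{1/4 + \lambda_{\min}},$$

where $\lambda_{\max} := \max_{i=1,\ldots,n} \lambda_i$ and $\lambda_{\min} := \min_{i=1,\ldots,n} \lambda_i$.

We now consider the case $\hat{A} \ne A$. Then the eigenvalue analysis of

$$\mathcal{A}_1 := \mathcal{B}^{-1/2}\mathcal{A}\mathcal{B}^{-1/2} = \begin{bmatrix} \tilde{A} & \tilde{B}^T \\ \tilde{B} & 0 \end{bmatrix}$$

is not as easy, and we left- and right-multiply this symmetrized preconditioned operator with $C^{-1/2} = \begin{bmatrix} \tilde{A}^{-1/2} & 0 \\ 0 & I \end{bmatrix}$ to obtain

$$\mathcal{A}_2 := C^{-1/2} \mathcal{A}_1 C^{-1/2} = \begin{bmatrix} I & \tilde{A}^{-1/2} \tilde{B}^T \\ \tilde{B} \tilde{A}^{-1/2} & 0 \end{bmatrix}. \tag{5.7}$$

Eigenvalues of $\mathcal{A}_2$ can be analyzed in the same manner as in Lemma 5.2. To relate the spectrum of $\mathcal{A}_1$ to the spectrum of $\mathcal{A}_2$, we use the Courant-Fischer Minimax Theorem.

Theorem 5.3 (Courant-Fischer). Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix with real eigenvalues $\lambda_i$, $i = 1, \ldots, n$, which are ordered so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. Then

$$\lambda_k = \max_{\dim(V) = k} \; \min_{x \in V, \, x \ne 0} \frac{x^T A x}{x^T x}, \tag{5.8}$$

$$\lambda_k = \min_{\dim(V) = n-k+1} \; \max_{x \in V, \, x \ne 0} \frac{x^T A x}{x^T x}. \tag{5.9}$$

Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ denote the eigenvalues of $\mathcal{A}_2$, and $\tilde{\lambda}_1 \ge \tilde{\lambda}_2 \ge \cdots \ge \tilde{\lambda}_n$ the eigenvalues of $\mathcal{A}_1$. Suppose $\lambda_k > 0$ and $\lambda_{k+1} < 0$, where $\lambda_k$ is the smallest positive eigenvalue of $\mathcal{A}_2$ and $\lambda_{k+1}$ the largest negative eigenvalue of $\mathcal{A}_2$. Also, let $q_i$, $i = 1, \ldots, n$, denote the eigenvectors of $\mathcal{A}_2$ such that $\mathcal{A}_2 q_i = \lambda_i q_i$ and $q_i^T q_j = \delta_{ij}$, $i, j = 1, \ldots, n$. Using (5.8) and the fact that $\mathcal{A}_1 = C^{1/2} \mathcal{A}_2 C^{1/2}$, we have

$$\tilde{\lambda}_k = \max_{\dim(V) = k} \; \min_{x \in V, \, x \ne 0} \frac{x^T \mathcal{A}_1 x}{x^T x} = \max_{\dim(V) = k} \; \min_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \cdot \frac{(C^{1/2} x)^T (C^{1/2} x)}{x^T x}.$$

For $V := C^{-1/2} \, \mathrm{span}\{q_1, q_2, \ldots, q_k\}$, we have

$$\min_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \ge \lambda_k.$$

Noting that, due to the definition of $C$ and (5.4),

$$\alpha_0 x^T x \le x^T C x \le \alpha_1 x^T x, \; \forall x, \tag{5.10}$$

we have

$$\min_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \cdot \frac{(C^{1/2} x)^T (C^{1/2} x)}{x^T x} \ge \lambda_k \alpha_0.$$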

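Taking the maximum over all $k$-dimensional subspaces on the left hand side of the previous equation, we obtain

$$\tilde{\lambda}_k \ge \lambda_k \alpha_0.$$

Similarly, using (5.9), we have

$$\tilde{\lambda}_{k+1} = \min_{\dim(V) = n-k} \; \max_{x \in V, \, x \ne 0} \frac{x^T \mathcal{A}_1 x}{x^T x} = \min_{\dim(V) = n-k} \; \max_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \cdot \frac{(C^{1/2} x)^T (C^{1/2} x)}{x^T x}.$$

For $V := C^{-1/2} \, \mathrm{span}\{q_{k+1}, \ldots, q_n\}$, we have

$$\max_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \le \lambda_{k+1}$$

and

$$\max_{x \in V, \, x \ne 0} \frac{(C^{1/2} x)^T \mathcal{A}_2 (C^{1/2} x)}{(C^{1/2} x)^T (C^{1/2} x)} \cdot \frac{(C^{1/2} x)^T (C^{1/2} x)}{x^T x} \le \lambda_{k+1} \alpha_0.$$

Taking the minimum on the left hand side of the previous equation, we obtain

$$\tilde{\lambda}_{k+1} \le \lambda_{k+1} \alpha_0.$$

By a similar argument,

$$\tilde{\lambda}_1 \le \lambda_1 \alpha_1 \quad \text{and} \quad \tilde{\lambda}_n \ge \lambda_n \alpha_1.$$

Letting $\lambda_{\max}$ and $\lambda_{\min}$ denote the maximum and the minimum eigenvalues of $\tilde{A}^{-1/2} \tilde{B}^T \tilde{B} \tilde{A}^{-1/2}$, respectively, we obtain

$$K(\mathcal{A}_1) = \frac{\max\{|\tilde{\lambda}_1|, |\tilde{\lambda}_n|\}}{\min\{|\tilde{\lambda}_k|, |\tilde{\lambda}_{k+1}|\}} \le \frac{\alpha_1 \max\{\lambda_1, |\lambda_n|\}}{\alpha_0 \min\{\lambda_k, |\lambda_{k+1}|\}} = \frac{\alpha_1}{\alpha_0} K(\mathcal{A}_2) \le \frac{\alpha_1}{\alpha_0} \cdot \frac{1/2 + \sqrt{1/4 + \lambda_{\max}}}{-1/2 + \sqrt{1/4 + \lambda_{\min}}}, \tag{5.11}$$

where the second inequality follows from the definition of $\mathcal{A}_2$ in (5.7) and Lemma 5.2. Noticing that

$$\lambda_{\max}(\tilde{A}^{-1/2} \tilde{B}^T \tilde{B} \tilde{A}^{-1/2}) \le \lambda_{\max}(\tilde{B}^T \tilde{B}) \, \lambda_{\max}(\tilde{A}^{-1})$$

and

$$\lambda_{\min}(\tilde{A}^{-1/2} \tilde{B}^T \tilde{B} \tilde{A}^{-1/2}) \ge \lambda_{\min}(\tilde{B}^T \tilde{B}) \, \lambda_{\min}(\tilde{A}^{-1}), \qquad \alpha_1^{-1} u^T u \le u^T \tilde{A}^{-1} u \le \alpha_0^{-1} u^T u,$$

we rewrite (5.11) in terms of $\lambda_{\max}$ and $\lambda_{\min}$, now the maximum and the minimum eigenvalues of $\tilde{B}^T \tilde{B}$, and obtain:

$$K(\mathcal{A}_1) \le \frac{\alpha_1}{\alpha_0} \cdot \frac{1/2 + \sqrt{1/4 + \lambda_{\max}/\alpha_0}}{-1/2 + \sqrt{1/4 + \lambda_{\min}/\alpha_1}}. \tag{5.12}$$

Special Case

We now use these results to study the convergence bound of our preconditioned system $\mathcal{B}^{-1}\mathcal{A}$, where $\mathcal{B}$ and $\mathcal{A}$ are defined in (4.19) and (4.21), respectively. We have

$$A = \hat{S}_c, \quad B = P_k \hat{Z}^k B_{\Gamma,c}, \quad \hat{A}^{-1} = P_R M_{BDDC}^{-1} P_R, \quad \hat{C}^{-1} = P_k M_D^{-1} P_k.$$

Notice that $A$, $\hat{A}$, and $\hat{C}$ are now singular. However, this does not pose any problem, since in the application of the PCR method our iterates will be in a proper subspace in which those matrices will be nonsingular. From (5.12), we can see that the extreme eigenvalues of $\tilde{B}^T \tilde{B}$ and $\alpha_0, \alpha_1$ in (5.3) are important parameters, where

$$\tilde{B}^T \tilde{B} = \hat{A}^{-1/2} B^T \hat{C}^{-1} B \hat{A}^{-1/2},$$

which has the same spectrum as $B \hat{A}^{-1} B^T \hat{C}^{-1}$. In our case,

$$B \hat{A}^{-1} B^T \hat{C}^{-1} = P_k \hat{Z}^k B_{\Gamma,c} P_R M_{BDDC}^{-1} P_R^T B_{\Gamma,c}^T \hat{Z}^{kT} P_k \, P_k M_D^{-1} P_k.$$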
(5.13)

The following lemma indicates the spectral equivalence between the matrices $A$ and $\hat{A}$; for a proof, see 2.

Lemma 5.4. ( x T Âx x T Ax C + log ( Hs h )) 2 x T Âx, (5.4) for all x range(ŝ). Thus, we can study the spectrum of BA B T Ĉ = P k Ẑ k B,c P R Ŝ P R = P k Ẑ k B,c Ŝ instead of that of the matrix (5.3), where Lemma 5.5. Proof. Let B,c = B T v, where A (i,) B T R (i,)t A (i,) B,c Ŝ Ŝ = T B,cẐkT P k P k M D P k B T,cẐkT P k P k Ẑ k B OL,D S OL B T OL,D Ẑ kt P k (5.5) Ŝ () Ŝ (N). B T,cẐkT P k = B OL S OL BT OL Ẑ kt P k. () (N) B,, B,B OL = B () OL,,B (N) OL. Note that the solution of Ŝ u = v range(ŝ ), can be obtained from the following equation: A (i,2) R (i,2)t A (i,2) A (i,)t R (i,) A (i,2)t R (i,ni)t = A (i,ni) A (i,ni). B T v A (i,ni)t Ni j= R(i,j)T R (i,2) R (i,ni) A (i,j) R(i,j) u (i,) u (i,2)... u (i,ni) û, (5.6) where R (i,j) : Ŵ (i,j) T W is a restriction operator. Noting that all entries of B v, corresponding to the nodes on loc, are zero and eliating those entries results in BT OL v, we can rearrange the system (5.6): A OL u = A A A T A u u = B T OL v where u is the displacement on gl Ω i. The equivalence of (5.6) and (5.7) shows that B OL S OL BT OL Ẑ kt P k. B Ŝ (5.7) B T Ẑ kt P k = Due to Lemma 5.5, the operator (5.5) can be written as P k Ẑ k B OL S OL BT OL Ẑ kt P k P k Ẑ k B OL,D S OL B T OL,D Ẑ kt P k. (5.8) The proof of the following lemma proceeds, line by line, as the proof of 26, Theorem 6.5. 8

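Lemma 5.6. For $\lambda \in \mathrm{range}(P_k)$,

$$\langle \lambda, \lambda \rangle \le \langle P_k \hat{Z}^k B_{OL} S_{OL}^{-1} B_{OL}^T \hat{Z}^{kT} P_k \, P_k \hat{Z}^k B_{OL,D} S_{OL} B_{OL,D}^T \hat{Z}^{kT} P_k \lambda, \lambda \rangle \le C (1 + \log(H_b/h))^2 \langle \lambda, \lambda \rangle.$$

We now can derive a concrete bound for (5.12), using Lemmas 5.4 and 5.6. In our case, $\lambda_{\max} = C(1 + \log(H_b/h))^2 (1 + \log(H_s/h))^2$, $\lambda_{\min} = 1$, $\alpha_1 = C(1 + \log(H_s/h))^2$, and $\alpha_0 = 1$. Assuming that $H_b/h$ and $H_s/h$ are large enough, we have

$$\frac{1}{2} + \sqrt{\frac{1}{4} + \frac{\lambda_{\max}}{\alpha_0}} \approx \sqrt{\frac{\lambda_{\max}}{\alpha_0}}, \tag{5.19}$$

and

$$-\frac{1}{2} + \sqrt{\frac{1}{4} + \frac{\lambda_{\min}}{\alpha_1}} = -\frac{1}{2} + \frac{1}{2}\sqrt{1 + \frac{4\lambda_{\min}}{\alpha_1}} = -\frac{1}{2} + \frac{1}{2}\left(1 + \frac{2\lambda_{\min}}{\alpha_1} + O\left(\left(\frac{4\lambda_{\min}}{\alpha_1}\right)^2\right)\right). \tag{5.20}$$

Combining (5.12), (5.19), and (5.20), we have

$$K(\mathcal{A}_1) \le C (1 + \log(H_b/h)) (1 + \log(H_s/h))^5. \tag{5.21}$$

We obtain

Theorem 5.7. Let $\mathcal{B}$, $\mathcal{A}$, and $K(\mathcal{B}^{-1}\mathcal{A})$ be defined as in (4.19), (4.21), and (5.2), respectively. Then we have the following bound:

$$K(\mathcal{B}^{-1}\mathcal{A}) \le C (1 + \log(H_b/h)) (1 + \log(H_s/h))^5.$$

6 Numerical Experiments

Recall that an active set method consists of outer iterations, in which the active set is updated, and inner iterations, in which auxiliary equality constrained problems are solved on the current active set. In this section, we solve the latter by using the hybrid method. We note that such problems were solved using the FETI-FETI method in [20, 18]. We solve the following minimization problem:

$$\min \sum_{i=1}^{N_b \times N_b} \left( \frac{1}{2} \int_{\Omega_i} |\nabla u_i|^2 \, dx - \int_{\Omega_i} f u_i \, dx \right), \tag{6.1}$$

where $\Omega_i \subset \mathbb{R}^2$, $i = 1, \ldots, N_b \times N_b$, are square bodies with side length $H_b := 1/N_b$ which form the $N_b \times N_b$ system $\Omega = \bigcup_{i=1}^{N_b \times N_b} \bar{\Omega}_i = [0,1] \times [0,1]$. We require $u_i \in H^1(\Omega_i)$, $u_i|_{\partial\Omega_i \cap \partial\Omega} = 0$. Each $\Omega_i$ is decomposed into $N_s \times N_s$ square subdomains, each of which is discretized by square bilinear elements of side length $h$. Also, $\Gamma := \bigcup_{i \ne j} \partial\Omega_i \cap \partial\Omega_j$ denotes the interface between the bodies.

We consider two linearized problems, each with a different contact area between the bodies. In the first problem, the entire $\Gamma$ is considered as the contact area, i.e., we require the continuity of the displacement vector across the entire $\Gamma$. In the second problem, continuity is imposed only on the middle third of the faces between the bodies. We use the preconditioned conjugate residual method.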
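The preconditioned conjugate residual iteration mentioned above can be sketched in a few lines. The following is a minimal pure-Python illustration, not the MATLAB code used for the experiments: it uses a three-term recurrence that keeps the search directions orthogonal in the inner product induced by $A M^{-1} A$, which is one standard way to write PCR for symmetric indefinite $A$ and symmetric positive definite $M$. The function names (`pcr`, `matvec`, `dot`), the 3-by-3 saddle point matrix, and the diagonal preconditioner are all hypothetical and chosen only so that the example is self-contained; the relative tolerance of 1e-5 is an assumption about the stopping criterion.

```python
# Minimal preconditioned conjugate residual (PCR) sketch for A u = b with
# A symmetric (possibly indefinite) and an SPD preconditioner M^{-1}.
# Pure Python; the small example below is illustrative, not from the paper.

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def pcr(A, b, apply_Minv, tol=1e-5, max_iter=50):
    """Reduce the residual over the preconditioned Krylov space."""
    u = [0.0] * len(b)
    r = list(b)                          # r_0 = b - A u_0 with u_0 = 0
    r0_norm = dot(r, r) ** 0.5
    p = apply_Minv(r)                    # first search direction
    p_old = MAp_old = den_old = None
    iters = 0
    while iters < max_iter and dot(r, r) ** 0.5 >= tol * r0_norm:
        Ap = matvec(A, p)
        MAp = apply_Minv(Ap)
        den = dot(Ap, MAp)               # (A p, M^{-1} A p)
        if den == 0.0:                   # p = 0: nothing left to correct
            break
        alpha = dot(r, MAp) / den
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        # Three-term recurrence: orthogonalize M^{-1} A p against p and p_old
        # in the inner product (A x, M^{-1} A y).
        AMAp = matvec(A, MAp)
        beta0 = dot(AMAp, MAp) / den
        p_new = [m - beta0 * pi for m, pi in zip(MAp, p)]
        if p_old is not None:
            beta1 = dot(AMAp, MAp_old) / den_old
            p_new = [pn - beta1 * po for pn, po in zip(p_new, p_old)]
        p_old, MAp_old, den_old = p, MAp, den
        p = p_new
        iters += 1
    return u, iters

# Saddle point matrix [[S, B^T], [B, 0]] with S = diag(2, 3) and B = [1, 1].
A = [[2.0, 0.0, 1.0],
     [0.0, 3.0, 1.0],
     [1.0, 1.0, 0.0]]
b = [1.0, 2.0, 3.0]
# Block-diagonal SPD preconditioner diag(S^{-1}, (B S^{-1} B^T)^{-1}).
minv_diag = [1.0 / 2.0, 1.0 / 3.0, 1.0 / (1.0 / 2.0 + 1.0 / 3.0)]
apply_Minv = lambda v: [d * vi for d, vi in zip(minv_diag, v)]

u, iters = pcr(A, b, apply_Minv)
residual = [bi - ai for bi, ai in zip(b, matvec(A, u))]
print(iters, u, dot(residual, residual) ** 0.5)
```

The block-diagonal structure of the example preconditioner mirrors the structure of the preconditioner in (4.19), with exact inverses standing in for the BDDC and FETI-type blocks.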
All our experiments have been performed in MATLAB, and the stopping criterion is $\|r_n\|_2 / \|r_0\|_2 < 10^{-5}$, where $r_n$ and $r_0$ are the $n$-th and initial residuals, respectively. In Table 1, the results obtained with the hybrid method are presented. We have three parameters: the number of bodies across $\Omega$ ($N_b = 1/H_b$), the number of subdomains across each body ($N_s = H_b/H_s$), and the number of elements across each subdomain ($H_s/h$). We vary one parameter while keeping the

other two fixed. The results for the first set of experiments, with the entire $\Gamma$ as the contact surface, are shown in Column (I); those for the second set of experiments, with a reduced contact area, are shown in Column (II). We observe that the iteration counts are independent of $1/H_b$ and logarithmically dependent on $H_s/h$. The iteration counts from Table 1 are also plotted in Figure 3. Very similar numerical results have been obtained independently by Klawonn and Rheinbach; see [13] and [14].

Table 1: Results for the hybrid method. iter denotes the iteration counts. Area on which continuity is imposed between bodies: $\Gamma$, i.e., the entire interface, for (I), and only a proper subset of $\Gamma$ for (II).

Columns: $1/H_b$, $H_b/H_s$, $H_s/h$, iter (I), iter (II)

2 2 2 4 2 6 2 8 2 2 4 2 6 8 8 8 8 2 8 9 4 8 8 6 7 8 8 7 7 2 2 4 3 8 3 5 6 4 6 32 5 7 64 6 9 28 7 2

7 Active set method combined with the hybrid method

It is well known that an active set method can often be slow due to a poor initial guess. However, it has been shown in [9] that a certain primal-dual active set strategy, viewed as a Newton method, has superlinear convergence provided that the initial point is close to the solution. A very efficient strategy for finding a good initial active set and $\lambda^0$ was discussed briefly in [20, Chapter 5] and in full detail in [19]. In our experiments this turns out to be a good estimate of the optimal active set. We set $u^0 = 0$. The following is the complete algorithm:

1. Set $u^0 = 0$ and choose $\lambda^0$ as described in [20, Chapter 5], [19]. Set $k = 0$.

2. Set $\mathcal{I}^k = \{i : (\lambda^k + B_{\Gamma,c} u^k)_i \le 0\}$, $\mathcal{A}^k = \{i : (\lambda^k + B_{\Gamma,c} u^k)_i > 0\}$.

3. Solve

$$\begin{bmatrix} \hat{S}_c & B_{\Gamma,c,\mathcal{A}^k}^T \\ B_{\Gamma,c,\mathcal{A}^k} & 0 \end{bmatrix} \begin{bmatrix} u^{k+1} \\ \lambda^{k+1} \end{bmatrix} = \begin{bmatrix} \hat{g}_c \\ 0 \end{bmatrix} \tag{7.1}$$

and set $\lambda^{k+1} = 0$ on $\mathcal{I}^k$.

4. Stop if $\mathcal{A}^{k+1} = \mathcal{A}^k$ and $\mathcal{I}^{k+1} = \mathcal{I}^k$. Otherwise return to 2.

Figure 3: Iteration counts for the hybrid method for two different contact areas between the bodies, namely $\Gamma$, i.e., the entire interface, for (I), and only a proper subset of $\Gamma$ for (II). Panels: (a) $1/H_b$ varies; (b) $H_b/H_s$ varies; (c) $H_s/h$ varies, plotted against $\log_2(H_s/h)$. Each panel shows the iteration counts for Experiments (I) and Experiments (II).

We have solved the nonlinear model problem (3.1) with the method described above; the results are reported in Table 2. Recall that the application of our preconditioner requires the solution of linear systems with the matrices $A_{II}^{(i)}$, $i = 1, \ldots, N$, i.e., solving Dirichlet problems for the bodies; this could potentially be expensive, and in such a case these Dirichlet problems need to be solved inexactly, e.g., by an iteration. We describe efficient preconditioners $\tilde{A}_{II}^{(i)-1}$ for $A_{II}^{(i)}$, for the preconditioned conjugate gradient method. Let $\hat{W}$ and $W^{(i,j)}$ denote the spaces of continuous finite element functions on $\Gamma_{loc}$ and $\partial\Omega_{i,j} \cap \Gamma_{loc}$, which are similar to $\hat{W}^{(i,j)}$ and $W$, respectively. Also, we define a restriction operator $R^{(i,j)} : \hat{W} \to W^{(i,j)}$. After a symmetric permutation, we can obtain

$$A_{II}^{(i)} = \begin{bmatrix} A_{II}^{(i,1)} & & & A_{\Gamma I}^{(i,1)T} R^{(i,1)} \\ & A_{II}^{(i,2)} & & A_{\Gamma I}^{(i,2)T} R^{(i,2)} \\ & & \ddots & \vdots \\ R^{(i,1)T} A_{\Gamma I}^{(i,1)} & R^{(i,2)T} A_{\Gamma I}^{(i,2)} & \cdots & \sum_{j=1}^{N_i} R^{(i,j)T} A_{\Gamma\Gamma}^{(i,j)} R^{(i,j)} \end{bmatrix},$$

where the block rows for $j = 3, \ldots, N_i$ follow the same pattern. The solution of $A_{II}^{(i)} x = b$ can be found by a block factorization. More precisely, with

$$\hat{S} := \sum_{j=1}^{N_i} R^{(i,j)T} \left( A_{\Gamma\Gamma}^{(i,j)} - A_{\Gamma I}^{(i,j)} A_{II}^{(i,j)-1} A_{\Gamma I}^{(i,j)T} \right) R^{(i,j)}, \tag{7.2}$$

we have

$$x_I^{(j)} = A_{II}^{(i,j)-1} \left( b_I^{(j)} - A_{\Gamma I}^{(i,j)T} R^{(i,j)} x_\Gamma \right), \quad j = 1, \ldots, N_i, \tag{7.3}$$

where

$$\hat{S} x_\Gamma = b_\Gamma - \sum_{j=1}^{N_i} R^{(i,j)T} A_{\Gamma I}^{(i,j)} A_{II}^{(i,j)-1} b_I^{(j)}. \tag{7.4}$$

Solving (7.4) can be expensive; the solution of $\tilde{A}_{II}^{(i)} x = b$ can be defined as

$$x_I^{(j)} = A_{II}^{(i,j)-1} \left( b_I^{(j)} - A_{\Gamma I}^{(i,j)T} R^{(i,j)} x_\Gamma \right), \quad j = 1, \ldots, N_i, \tag{7.5}$$

where

$$x_\Gamma = R_{D,\Gamma}^T \tilde{S}^{-1} R_{D,\Gamma} \left( b_\Gamma - \sum_{j=1}^{N_i} R^{(i,j)T} A_{\Gamma I}^{(i,j)} A_{II}^{(i,j)-1} b_I^{(j)} \right), \tag{7.6}$$

with $R_{D,\Gamma}$ and $\tilde{S}$ defined similarly as $R_{D,\Gamma}^{(i)}$ and $\tilde{S}^{(i)}$, respectively.

In Table 2, notice that the iteration counts for the inner minimizations do not increase rapidly as we increase the number of elements per subdomain or the number of subdomains per body, which is an indication of the scalability of the hybrid algorithm. Also, notice that it takes at most two outer iterations to reach the optimal solution, which is an indication of the effectiveness of the strategy to find an initial active set [19].

Table 2: Results: primal-dual active set method + hybrid method. outer it. denotes the number of outer iterations of the active set method; inner it. denotes the number of iterations needed to solve the inner minimization problems by the PCR method, until the norm of the residual has been reduced by $10^{-5}$, on the active faces identified in the outer iterations. total it. denotes the total number of inner iterations.

N_sub (1/H)   H/h   N_dof(λ)   N_dof(total)   outer it.   inner it.   total it.
16 (4)        4     17         561            2           16, 16      32
16 (4)        8     33         2145           2           20, 19      39
16 (4)        12    49         4753           2           22, 20      42
16 (4)        16    65         8385           2           26, 24      50
64 (8)        4     33         2145           2           18, 17      35
64 (8)        8     65         8385           1           23          23
64 (8)        12    97         18721          1           27          27
64 (8)        16    129        33153          1           29          29
144 (12)      4     49         4753           1           19          19
144 (12)      8     97         18721          2           24, 22      46
144 (12)      12    145        41905          2           28, 24      52
144 (12)      16    193        74305          2           30, 27      57
256 (16)      4     65         8385           1           19          19
256 (16)      8     129        33153          1           26          26
256 (16)      12    193        74305          1           28          28
256 (16)      16    257        131841         1           32          32

Figure 4: Solution of the model problem, from different angles. $N_{sub} = 16$, $H/h = 8$.

References

1 Philip Avery and Charbel Farhat. The FETI family of domain decomposition methods for inequality-constrained quadratic programming: Application to contact problems with conforming and nonconforming interfaces. Computer Methods in Applied Mechanics and Engineering, 198(21-26):1673-1683, 2009. Advances in Simulation-Based Engineering Sciences - Honoring J. Tinsley Oden.

2 Philip Avery, Gert Rebel, Michel Lesoinne, and Charbel Farhat. A numerically scalable dual-primal substructuring method for the solution of contact problems, part I: the frictionless case. Comput. Methods Appl. Mech. Engrg., 193(23-26):2403-2426, 2004.

3 M. Bergounioux, M. Haddou, M. Hintermüller, and K. Kunisch. A comparison of a Moreau-Yosida-based active set strategy and interior point methods for constrained optimal control problems. SIAM J. on Optimization, 11(2):495-521, 2000.

4 Maïtine Bergounioux, Kazufumi Ito, and Karl Kunisch. Primal-dual strategy for constrained optimal control problems. SIAM J. Control Optim., 37(4):1176-1194, 1999.

5 Zdeněk Dostál. Optimal quadratic programming algorithms. With applications to variational inequalities, volume 23 of Springer Optimization and Its Applications. Springer, New York, 2009.

6 Zdeněk Dostál, David Horák, and Dan Stefanica. A scalable FETI-DP algorithm for a semi-coercive variational inequality. Comput. Methods Appl. Mech. Engrg., 196(8):1369-1379, 2007.

7 Charbel Farhat, Po-Shu Chen, and Jan Mandel. A scalable Lagrange multiplier based domain decomposition method for time-dependent problems. Internat. J. Numer. Methods Engrg., 38:3831-3853, 1995.

8 Wolfgang Hackbusch. Iterative solution of large sparse systems of equations, volume 95 of Applied Mathematical Sciences. Springer-Verlag, New York, 1994. Translated and revised from the 1991 German original.

9 M. Hintermüller, K. Ito, and K. Kunisch. The primal-dual active set strategy as a semismooth Newton method. SIAM J. on Optimization, 13(3):865-888, 2002.

10 Kazufumi Ito and Karl Kunisch. Augmented Lagrangian methods for nonsmooth, convex optimization in Hilbert spaces. Nonlinear Anal., 41(5-6):591-616, 2000.

11 Axel Klawonn and Oliver Rheinbach. A parallel implementation of dual-primal FETI methods for three-dimensional linear elasticity using a transformation of basis. SIAM J. Sci. Comput., 28(5):1886-1906, 2006.

12 Axel Klawonn and Oliver Rheinbach. Robust FETI-DP methods for heterogeneous three dimensional elasticity problems. Comput. Methods Appl. Mech. Engrg., 196(8):1400-1414, 2007.

13 Axel Klawonn and Oliver Rheinbach. A hybrid approach to 3-level FETI. PAMM Proc. Appl. Math. Mech., 8(1):10841-10843, 2008.

14 Axel Klawonn and Oliver Rheinbach. Highly scalable parallel domain decomposition methods with an application to biomechanics. ZAMM Z. Angew. Math. Mech., 90(1):5-32, 2010.

15 Axel Klawonn and Olof B. Widlund. A domain decomposition method with Lagrange multipliers and inexact solvers for linear elasticity. SIAM J. Sci. Comput., 22(4):1199-1219, 2000.

16 Axel Klawonn and Olof B. Widlund. Dual-primal FETI methods for linear elasticity. Comm. Pure Appl. Math., 59(11):1523-1572, 2006.

17 Axel Klawonn, Olof B. Widlund, and Maksymilian Dryja. Dual-primal FETI methods for three-dimensional elliptic problems with heterogeneous coefficients. SIAM J. Numer. Anal., 40(1):159-179, 2002.

18 Jungho Lee. Convergence analysis of a new domain decomposition method for a linearized contact problem. Submitted to SIAM J. Sci. Comput.

19 Jungho Lee. A strategy of finding an initial active set for inequality constrained quadratic programming problems. Submitted to Optimization Methods and Software.

20 Jungho Lee. A Hybrid Domain Decomposition Method and its Applications to Contact Problems. PhD thesis, Courant Institute of Mathematical Sciences, September 2009.

21 Jing Li and Olof B. Widlund. FETI-DP, BDDC, and block Cholesky methods. Internat. J. Numer. Methods Engrg., 66(2):250-271, 2006.

22 Jan Mandel and Clark R. Dohrmann. Convergence of a balancing domain decomposition by constraints and energy minimization. Numer. Linear Algebra Appl., 10(7):639-659, 2003. Dedicated to the 70th birthday of Ivo Marek.

23 Jan Mandel, Clark R. Dohrmann, and Radek Tezaur. An algebraic theory for primal and dual substructuring methods by constraints. Appl. Numer. Math., 54(2):167-193, 2005.

24 Jan Mandel and Radek Tezaur. Convergence of a substructuring method with Lagrange multipliers. Numer. Math., 73(4):473-487, 1996.

25 Jan Mandel and Radek Tezaur. On the convergence of a dual-primal substructuring method. Numer. Math., 88(3):543-558, 2001.

26 Andrea Toselli and Olof Widlund. Domain decomposition methods - algorithms and theory, volume 34 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 2005.

27 Xue Tu. Three-level BDDC in two dimensions. Internat. J. Numer. Methods Engrg., 69(1):33-59, 2007.