A Jacobi Davidson Algorithm for Large Eigenvalue Problems from Opto-Electronics


A Jacobi-Davidson Algorithm for Large Eigenvalue Problems from Opto-Electronics

Peter Arbenz (1), Urban Weber (1), Ratko Veprek (2), Bernd Witzigmann (2)
(1) Institute of Computational Science, ETH Zurich; (2) Integrated Systems Laboratory, ETH Zurich

Work in progress. RANMEP, Hsinchu, Taiwan, January 4-8, 2008.

Introduction I

In solid state physics, the (electronic) band structure of a solid describes the ranges of energy that an electron can assume. Electrons of a single free-standing atom occupy atomic orbitals that form a discrete set of energy levels. When a large number of atoms are brought together to form a solid, the number of orbitals becomes exceedingly large, and the difference in energy between them becomes very small. Bands of energy levels form rather than discrete energy levels. Intervals that contain no orbitals are called band gaps. (From wikipedia.org)

Introduction II

Any solid has a large number of bands, in theory infinitely many. However, all but a few lie at energies so high that any electron reaching them escapes from the solid. These bands are usually disregarded.

Introduction III

The uppermost occupied band is called the valence band, by analogy to the valence electrons of individual atoms. The lowermost unoccupied band is called the conduction band, because only when electrons are excited to the conduction band can current flow in these materials.

Introduction IV

A numerical evaluation of the band structure takes into account the periodic nature of a crystal lattice. The Schrödinger equation for a single particle in a crystal lattice is given by

\left(-\frac{\hbar^2}{2m_0}\nabla^2 + V(\mathbf{r})\right)\Psi(\mathbf{r}) = E\,\Psi(\mathbf{r}), \qquad (1)

where the potential V exhibits the crystal periodicities: V(r) = V(r + R) for all lattice vectors R. Solutions of (1) are Bloch functions

\Psi_{n\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\, u_{n\mathbf{k}}(\mathbf{r}), \qquad (2)

where u_{n\mathbf{k}}(r) is lattice-periodic.

Introduction V

Here, k is called the wave vector and is related to the direction of motion of the electron in the crystal; n is the band index, which simply numbers the energy bands. The wave vector k takes on values within the Brillouin zone corresponding to the crystal lattice. Plugging (2) into (1) yields

\left(-\frac{\hbar^2}{2m_0}\left(\nabla^2 + 2i\,\mathbf{k}\cdot\nabla - k^2\right) + V(\mathbf{r})\right) u_{n\mathbf{k}}(\mathbf{r}) = E_n(\mathbf{k})\, u_{n\mathbf{k}}(\mathbf{r}). \qquad (3)

For each of the possible k there are infinitely many bands n = 1, 2, ....
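The periodic eigenproblem (3) can be made concrete with a tiny 1D finite-difference model. The sketch below is my own illustration, not code from the talk: it assumes hbar = m0 = 1, a unit cell [0, 2*pi), and a model potential V(x) = cos(x), and returns the lowest bands E_n(k).

```python
import numpy as np

def bloch_bands(k, N=64, nbands=4):
    """Lowest bands E_n(k) of a 1D model crystal, cf. eq. (3), with
    hbar = m0 = 1 and unit cell [0, 2*pi); V(x) = cos(x) is an
    illustrative potential, not one from the talk."""
    h = 2 * np.pi / N
    x = h * np.arange(N)
    I = np.eye(N)
    # periodic central differences: first (antisymmetric) and second derivative
    D = (np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)) / (2 * h)
    D2 = (np.roll(I, 1, axis=1) - 2 * I + np.roll(I, -1, axis=1)) / h**2
    # H(k) = -(1/2)(D2 + 2ik D - k^2 I) + V  is Hermitian
    H = -0.5 * (D2 + 2j * k * D - k**2 * I) + np.diag(np.cos(x))
    return np.linalg.eigvalsh(H)[:nbands]
```

Sweeping k over the Brillouin zone and plotting the returned values traces out the band diagram.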

Introduction VI

There are a number of possible ways to compute the spectral bands.

1. Solve (3) for k = k_0 = 0 and determine the rest of the eigenvalues by perturbation theory (effective mass m*).
2. Solve (3) in a subspace spanned by a small number of eigenfunctions (zone-centered Bloch functions) u_{n0} (Kane, 1957).
3. In the k·p method the two approaches are combined by writing

\Phi(\mathbf{r}) = \sum_{i=1}^{m} g_i(\mathbf{r})\, u_{i0}(\mathbf{r}). \qquad (4)

Here, the so-called envelopes g_i replace the plane wave e^{ik·r} in (2) (Luttinger-Kohn, 1955; Löwdin, 1951).

We go for the k·p method for its superior accuracy.

Introduction VII

We finally arrive at an eigenvalue problem for the envelopes:

\left(\sum_{i,j=1}^{3} \partial_i\, H^{(2)}_{ij}(\mathbf{r})\, \partial_j + \sum_{i=1}^{3} H^{(1)}_i(\mathbf{r})\, \partial_i + H^{(0)}(\mathbf{r}) + U(\mathbf{r})\right) \mathbf{g}(\mathbf{r}) = E\, \mathbf{g}(\mathbf{r}). \qquad (5)

Here, the perturbation U(r) takes into account an impurity potential of the material, a quantum dot, etc. Note that g is an m-vector (field). For the numerical computation, g is represented by piecewise (linear) finite elements.

Quantum dot

Probability density of an excited state in a valence band in a pyramidal quantum dot.

The linear algebra problem

We end up with the generalized complex Hermitian eigenvalue problem

Ax = \lambda Mx, \qquad (6)

where A^* = A \in \mathbb{C}^{n\times n} and M^T = M \in \mathbb{R}^{n\times n}. Depending on the bands included, A is either definite or indefinite; the latter holds if valence and conduction bands are taken into account. Just a small number (k = 1, ..., 8) of bands are used; the matrices have k×k blocks. Only a few of the eigenvalues of (6) closest to 0 are desired.

Transformation complex Hermitian → real symmetric I

For A = A_r + i\,A_i \in \mathbb{C}^{m\times n}, A_r, A_i \in \mathbb{R}^{m\times n}, define the mapping (Day & Heroux, 2001)

\varphi: \mathbb{C}^{m\times n} \to \mathbb{R}^{2m\times 2n}, \qquad A \mapsto \varphi(A) := \begin{pmatrix} A_r & -A_i \\ A_i & A_r \end{pmatrix}, \qquad (7)

which transforms a complex m×n matrix into a real 2m×2n matrix. If the dimensions admit the operations, then

\varphi(AB) = \varphi(A)\,\varphi(B). \qquad (8)

Many codes for solving eigenvalue problems, in particular for large sparse eigenvalue problems, are available only in real arithmetic. Very often the sparsity structures of the real and imaginary parts of the underlying matrices differ considerably.
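A few lines of NumPy make the mapping (7) and property (8) concrete. This is a dense illustration of my own; real codes exploit the block sparsity instead of forming dense blocks:

```python
import numpy as np

def phi(A):
    """Day-Heroux mapping (7): complex m x n matrix -> real 2m x 2n matrix."""
    Ar, Ai = A.real, A.imag
    return np.block([[Ar, -Ai],
                     [Ai,  Ar]])
```

For a Hermitian A, phi(A) is real symmetric, which is what lets real symmetric eigensolvers be applied to the complex problem.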

Transformation complex Hermitian → real symmetric II

We can rewrite the complex eigenvalue problem

Ax = x\lambda \qquad (9)

in the real form

\varphi(Ax) = \varphi(A)\varphi(x) = \varphi(x\lambda) = \varphi(x)\varphi(\lambda), \qquad (10)

or

\begin{pmatrix} A_r & -A_i \\ A_i & A_r \end{pmatrix} \begin{pmatrix} x_r & -x_i \\ x_i & x_r \end{pmatrix} = \begin{pmatrix} x_r & -x_i \\ x_i & x_r \end{pmatrix} \begin{pmatrix} \lambda_r & -\lambda_i \\ \lambda_i & \lambda_r \end{pmatrix}. \qquad (11)

So a complex eigenvalue of A becomes a pair of complex conjugate eigenvalues of \varphi(A). A Hermitian \Rightarrow \lambda_i = 0.

Transformation complex Hermitian → real symmetric III

Special cases:

\varphi(x^*)\varphi(x) = \varphi(\|x\|^2): \qquad \begin{pmatrix} x_r^T & x_i^T \\ -x_i^T & x_r^T \end{pmatrix} \begin{pmatrix} x_r & -x_i \\ x_i & x_r \end{pmatrix} = \|x\|^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

Enforcing x^* y = 0 thus means that \binom{y_r}{y_i} is made orthogonal to both \binom{x_r}{x_i} and \binom{-x_i}{x_r}. The companion vector \binom{-y_i}{y_r} will then automatically be orthogonal to these two vectors. Computation is done with just one vector.
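A quick numerical check of this orthogonality structure (a dense sketch; the vector names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = z - x * (np.vdot(x, z) / np.vdot(x, x))   # enforce x^* y = 0

def real_vec(v):
    return np.concatenate([v.real, v.imag])     # (v_r; v_i)

def companion(v):
    return np.concatenate([-v.imag, v.real])    # (-v_i; v_r)

# x^* y = 0 makes (y_r; y_i) orthogonal to (x_r; x_i) and (-x_i; x_r) ...
assert abs(real_vec(y) @ real_vec(x)) < 1e-12
assert abs(real_vec(y) @ companion(x)) < 1e-12
# ... and the companion vector (-y_i; y_r) is then automatically orthogonal too
assert abs(companion(y) @ real_vec(x)) < 1e-12
assert abs(companion(y) @ companion(x)) < 1e-12
```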

Computing only a few eigenpairs

Case 1: A is positive definite. Compute the eigenvalues one by one, starting with the smallest.

Case 2: A is indefinite. (M is always spd.)
- Use approaches employing harmonic Ritz values/vectors. Preconditioner?
- Generate two new positive definite eigenvalue problems

(A - \sigma_i M) M^{-1} (A - \sigma_i M) x = \lambda Mx, \qquad i = 1, 2, \qquad (12)

where \sigma_1 is close to the smallest positive eigenvalue of (A, M) and \sigma_2 is close to the smallest negative eigenvalue of (A, M) (Vömel, 2007; Shi & Shu, 2006). Compute the eigenvalues of (12) with shift \sigma_1 one by one, starting with the smallest; do the same with shift \sigma_2. Preconditioner? AMG preconditioner?
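Why (12) targets the eigenvalues of (A, M) closest to sigma can be seen on a toy example. The sketch below is mine and takes M = I for simplicity, so the transformed operator is just (A - sigma*I)^2 and its smallest eigenvalue belongs to the eigenvalue of A nearest sigma:

```python
import numpy as np

rng = np.random.default_rng(2)
# small indefinite symmetric A with known spectrum; spd M = I for simplicity
lam = np.array([-3.0, -1.0, 0.5, 2.0, 4.0])
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q @ np.diag(lam) @ Q.T
sigma1 = 0.4                        # near the smallest positive eigenvalue
S = A - sigma1 * np.eye(5)
C = S @ S                           # (A - sigma1 M) M^{-1} (A - sigma1 M), M = I
mu = np.linalg.eigvalsh(C)
# the smallest eigenvalue of the spd problem is (0.5 - 0.4)^2, i.e. it
# comes from the eigenvalue of A closest to sigma1
assert np.isclose(mu[0], (0.5 - sigma1) ** 2)
```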

Symmetric Jacobi-Davidson (JDSYM) (Sleijpen & van der Vorst, 1996; Geus, 2003)

Let V_k = span{v_1, ..., v_k}, with v_k^T M v_j = \delta_{kj}, be the current search space (not a Krylov space).

- Rayleigh-Ritz-Galerkin procedure: extract the Ritz pair (\tilde\lambda, \tilde q) in V_k with \tilde\lambda closest to some target value \tau.
- Convergence: if \|r_k\|_{M^{-1}} := \|(A - \tilde\lambda M)\tilde q\|_{M^{-1}} < \varepsilon \|\tilde q\|_M, then we have found an eigenpair.
- Solve the correction equation for t_k \perp_M \tilde q:

(I - M\tilde q\tilde q^T)(A - \eta_k M)(I - \tilde q\tilde q^T M)\, t_k = -r_k, \qquad \tilde q^T M t_k = 0. \qquad (13)

- M-orthonormalize t_k against V_k to obtain v_{k+1}.
- Expand the search space: V_{k+1} = span{v_1, ..., v_{k+1}}.
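The loop above can be sketched in a few dozen lines of NumPy. This is a deliberately simplified stand-in of my own, not the JDSYM implementation from the talk: M = I, dense algebra, the correction equation is solved by a projected least-squares solve instead of preconditioned QMRS, and restarts (jmin/jmax) are omitted.

```python
import numpy as np

def jdsym(A, tau, tol=1e-9, eps_tr=1e-2, maxit=30, seed=0):
    """Minimal symmetric Jacobi-Davidson sketch (M = I, dense algebra).

    The shift eta is the target tau far from convergence and the Ritz
    value once the residual drops below eps_tr, as in the talk."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    V = (v / np.linalg.norm(v)).reshape(n, 1)
    theta, q = 0.0, V[:, 0]
    for _ in range(maxit):
        # Rayleigh-Ritz extraction: Ritz pair with value closest to tau
        w, S = np.linalg.eigh(V.T @ A @ V)
        j = int(np.argmin(np.abs(w - tau)))
        theta, q = w[j], V @ S[:, j]
        r = A @ q - theta * q
        if np.linalg.norm(r) < tol:
            break
        eta = tau if np.linalg.norm(r) > eps_tr else theta
        # correction equation: (I - qq^T)(A - eta I)(I - qq^T) t = -r
        P = np.eye(n) - np.outer(q, q)
        t = np.linalg.lstsq(P @ (A - eta * np.eye(n)) @ P, -r, rcond=None)[0]
        # orthogonalize against the search space and expand it
        t -= V @ (V.T @ t)
        nt = np.linalg.norm(t)
        if nt < 1e-12:
            break
        V = np.column_stack([V, t / nt])
    return theta, q
```

On a small symmetric test matrix with tau = 0 this converges to the eigenvalue closest to zero.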

Remarks on JDSYM

- In exact arithmetic, one-vector JD is the Rayleigh quotient iteration: eigenvalue approximations converge cubically.
- The shift \eta_k in (13) is set to the target value \tau initially (≈ inverse iteration) and to the Rayleigh quotient \rho(\tilde q) close to convergence.
- Keep the subspace size bounded: if k = j_max, reduce the size of the search space to j_min, using the j_min best Ritz vectors in V_{j_max} to define V_{j_min}.
- The correction equation is solved only approximately. We use a Krylov space method: QMRS, which admits an indefinite preconditioner (Freund, 1992).
- Eigenvectors corresponding to higher eigenvalues are computed in the orthogonal complement of the previously computed eigenvectors.

Preconditioning the correction equation

The correction equation is given by

(I - M\tilde q\tilde q^T)(A - \eta_k M)(I - \tilde q\tilde q^T M)\, t_k = -r_k, \qquad \tilde q^T M t_k = 0.

Preconditioning means solving systems of the form

(I - M\tilde q\tilde q^T)\, K\, (I - \tilde q\tilde q^T M)\, c = b, \qquad \tilde q^T M c = 0, \qquad (14)

where K is a preconditioner for A - \rho_k M. As we are looking for just a few of the smallest eigenvalues, we take K \approx A - \sigma M, where \sigma is close to the desired eigenvalues; usually \sigma = \tau. We use the same aggregation-based AMG preconditioner K for all correction equations (Vaněk, Brezina & Mandel, 1996). (Remember: we only compute a few eigenpairs.)
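Applying the projected preconditioner (14) needs only solves with K itself: writing Kc = b + gamma*Mq and enforcing q^T M c = 0 determines gamma, a rank-one correction of the unprojected solve. The dense sketch below is my own (function and variable names are mine); it assumes q^T M q = 1 and q^T b = 0, which hold for the JD residual:

```python
import numpy as np

def apply_projected_prec(solve_K, M, q, b):
    """Solve (I - M q q^T) K (I - q q^T M) c = b with q^T M c = 0,
    given a solver for K alone; assumes q^T M q = 1 and q^T b = 0."""
    Kinv_b = solve_K(b)
    Kinv_Mq = solve_K(M @ q)
    # enforce q^T M c = 0 for c = Kinv_b + gamma * Kinv_Mq
    gamma = -(q @ (M @ Kinv_b)) / (q @ (M @ Kinv_Mq))
    return Kinv_b + gamma * Kinv_Mq
```

Note that the second solve, for K^{-1}Mq, can be reused across all iterations with the same q.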

Setup procedure for an abstract multigrid solver

1: Define the number of levels, L
2: for level l = 0, ..., L-1 do
3:   if l < L-1 then
4:     Define prolongator P_l
5:     Define restriction R_l = P_l^T
6:     K_{l+1} = R_l K_l P_l
7:     Define smoother S_l
8:   else
9:     Prepare for solving with K_l
10:  end if
11: end for

Smoothed aggregation (SA) AMG preconditioner I

1. Build the adjacency graph G_0 of K_0 = K. (Take the m×m block structure into account.)
2. Group graph vertices into contiguous subsets, called aggregates; each aggregate represents a coarser-grid vertex. Typical aggregates: 3×3×3 nodes of the graph, up to 5×5×5 nodes if aggressive coarsening is used ((Par)METIS).

Note: the matrices K_1, K_2, ... need much less memory space than K_0! Typical operator complexity for SA: 1.4 (!!!)
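Step 2 can be illustrated with a deliberately naive greedy aggregation over the adjacency structure. This sketch is mine; production coarsening (e.g. in ML or via (Par)METIS) is far more elaborate:

```python
import numpy as np

def greedy_aggregate(adj):
    """Greedy aggregation sketch: each still-unassigned vertex seeds an
    aggregate consisting of itself and its unassigned neighbors.

    adj: list of neighbor lists (the adjacency structure of G_0).
    Returns (aggregate index per vertex, number of aggregates)."""
    n = len(adj)
    agg = -np.ones(n, dtype=int)
    n_agg = 0
    for v in range(n):
        if agg[v] == -1:
            agg[v] = n_agg
            for w in adj[v]:
                if agg[w] == -1:
                    agg[w] = n_agg
            n_agg += 1
    return agg, n_agg
```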

Smoothed aggregation (SA) AMG preconditioner II

3. Define a grid transfer operator: the low-energy (near-kernel) modes are chopped according to the aggregation,

B_l = \begin{pmatrix} B_1^{(l)} \\ \vdots \\ B_{n_{l+1}}^{(l)} \end{pmatrix}, \qquad B_j^{(l)} = \text{rows of } B_l \text{ corresponding to the grid points assigned to the } j\text{-th aggregate}.

Let B_j^{(l)} = Q_j^{(l)} R_j^{(l)} be the QR factorization of B_j^{(l)}. Then

B_l = \tilde P_l\, B_{l+1}, \qquad \tilde P_l^T \tilde P_l = I, \qquad \text{with } \tilde P_l = \operatorname{diag}(Q_1^{(l)}, \ldots, Q_{n_{l+1}}^{(l)}) \text{ and } B_{l+1} = \begin{pmatrix} R_1^{(l)} \\ \vdots \\ R_{n_{l+1}}^{(l)} \end{pmatrix}.

The columns of B_{l+1} span the near kernel of K_{l+1}. Notice: the matrices K_l are not used in constructing the tentative prolongators \tilde P_l, the near kernels B_l, and the graphs G_l.
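Step 3 in code: a dense sketch (mine) of the tentative prolongator built from per-aggregate QR factorizations of the near-kernel block rows, with the exactness property B = P @ B_coarse and P^T P = I:

```python
import numpy as np

def tentative_prolongator(B, aggregates):
    """Build the tentative prolongator from the near-kernel B (n x m):
    one QR factorization per aggregate; returns (P, B_coarse) with
    B = P @ B_coarse and P^T P = I."""
    n, m = B.shape
    n_agg = len(aggregates)
    P = np.zeros((n, n_agg * m))
    B_coarse = np.zeros((n_agg * m, m))
    for j, rows in enumerate(aggregates):
        Q, R = np.linalg.qr(B[rows, :])        # B_j = Q_j R_j
        P[rows, j * m:(j + 1) * m] = Q         # block diagonal of Q factors
        B_coarse[j * m:(j + 1) * m, :] = R     # stacked R factors
    return P, B_coarse
```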

Smoothed aggregation (SA) AMG preconditioner III

4. For elliptic problems it is advisable to perform an additional step, to obtain smoothed aggregation (SA):

P_l = (I_l - \omega_l D_l^{-1} K_l)\, \tilde P_l, \qquad \omega_l = \frac{4/3}{\lambda_{\max}(D_l^{-1} K_l)} \qquad \text{(smoothed prolongator)}.

In non-smoothed aggregation, P_l = \tilde P_l.

5. Smoother S_l: a polynomial smoother. Choose a Chebyshev polynomial that is small on the upper part of the spectrum of K_l (Adams, Brezina, Hu & Tuminaro, 2003). It parallelizes perfectly, and its quality is independent of the processor number.
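Step 4 as a dense sketch of my own (lambda_max is computed exactly here; practical codes estimate it with a few power-method steps):

```python
import numpy as np

def smooth_prolongator(K, P_tent):
    """SA damping step: P = (I - omega * D^{-1} K) P_tent with
    omega = (4/3) / lambda_max(D^{-1} K)."""
    D_inv = 1.0 / np.diag(K)
    DK = D_inv[:, None] * K                  # D^{-1} K
    lam_max = np.max(np.abs(np.linalg.eigvals(DK)))
    omega = (4.0 / 3.0) / lam_max
    return P_tent - omega * (DK @ P_tent)
```

On a 1D Laplacian, smoothing turns the piecewise-constant tentative basis into overlapping, lower-energy basis functions, which is the point of the extra step.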

Matrix-free multigrid

Our approach: pcg with (almost) smoothed aggregation AMG preconditioning.

- We set K = K_0 = A - \sigma M in the positive definite case and K = (A - \sigma M) M_{lumped}^{-1} (A - \sigma M) in the indefinite case. (We only apply K_0, we do not form it.)
- We lump because M^{-1} is dense.
- G_0 is the graph corresponding to (A - \sigma M) M_{lumped}^{-1} (A - \sigma M).
- P_0 is not smoothed, i.e. P_0 = \tilde P_0.
- K_1 = \tilde P_0^T K_0 \tilde P_0 is formed explicitly.
- All graphs, including G_0, are constructed.
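Row-sum lumping replaces M by the diagonal matrix of its row sums, so M_lumped^{-1} is diagonal and the product (A - sigma*M) M_lumped^{-1} (A - sigma*M) stays sparse. A tiny check on a 1D FEM mass matrix (my example, element size h = 1):

```python
import numpy as np

# 1D piecewise-linear FEM mass matrix on 6 nodes (interior stencil h/6 * [1 4 1])
n = 6
M = (np.diag(4.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / 6.0
# lumping: M_lumped = diag(M @ 1) -- diagonal, spd, trivially invertible
M_lumped = np.diag(M @ np.ones(n))
assert np.allclose(np.diag(M_lumped), M.sum(axis=1))
assert np.isclose(M_lumped.sum(), M.sum())   # total mass is preserved
assert np.all(np.diag(M_lumped) > 0)
```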

The software environment: Trilinos

The Trilinos Project is an effort to develop parallel solver algorithms and libraries within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific applications. See http://software.sandia.gov/trilinos/

- Provides means to distribute (multi)vectors and (sparse) matrices (Epetra and EpetraExt packages).
- Provides solvers that work on these distributed data. Here we use iterative solvers and preconditioners (packages AztecOO/IFPACK), the smoothed aggregation multilevel AMG preconditioner (ML), direct solver wrappers (Amesos), and data distribution for parallelization (Zoltan/ParMETIS).
- The quality of the documentation depends on the package.

Parallel mesh reading

Implementation: HDF5 format/library.
- Binary file format allows for efficient I/O.
- Allows for parallel I/O; mesh reading scales with the number of processors.

Mesh file content:
- A list of vertex coordinates (x, y, z)
- A list of tetrahedra (4 vertices)
- A list of surface triangles (3 vertices)

Timings for a small quantum wire problem (4×4, spd)

Problem size n = 49860, 2n = 99720. Number of nonzeros in the graph = 1259826.

Some JDSYM parameters: itmax=1000, linitmax=50, kmax=8, jmin=8, jmax=16, tau=0.0e+00, jdtol=1.0e-08, eps_tr=1.0e-02, toldecay=1.5e+00, sigma=0.0e+00, linsolver=qmrs, blksize=1.

Convergence history of a typical JD run

Performance with spd problems

Parallel efficiency E(p) = t(1)/(p t(p)). Results for 2DBigWire:

p | t [sec] | E(p) | t(prec) | n_outer | avg. n_inner
1 | 400 | 1.00 | 243 | 59 | 15.6
2 | 181 | 1.10 | 147 | 56 | 15.8
4 | 140 | 0.71 | 115 | 56 | 15.7
8 | 215 | 0.23 | 178 | 61 | 17.1

Performance with indefinite problems

Isoprobability surface at a probability density of 0.006 of an elevated electron state of a pyramidal indium arsenide (InAs) quantum dot, calculated using 8×8 k·p.

Conclusions

- JD solves the spd eigenvalue problem very efficiently (with the right number of processors).
- The SA multilevel preconditioner is unbeatable (memory consumption, scalability) if it is applicable.
- Uncertainties remain with respect to success with the indefinite system: Does the preconditioner work? Is the near-null space right? What is a good stopping criterion? How is the accuracy affected by squaring the matrix?

The End