Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

C. David Sherrill
School of Chemistry and Biochemistry
Georgia Institute of Technology

May 1998

Introduction

Most standard textbook approaches to solving systems of linear equations or diagonalizing matrices are described as direct methods, and they typically require a fixed number of mathematical operations which depends on the dimensions of the problem. These methods generally require access to matrix elements in random order, which poses serious difficulties for the very large matrices typically encountered in computational quantum chemistry: random access of large disk files becomes prohibitively expensive, and often the matrices are too large even to store on disk! In such cases, one may avoid the need for random access to individual matrix elements by turning to iterative techniques, which require only the repeated evaluation of matrix-vector products. Unfortunately, iterative methods are not guaranteed to converge, and they can have difficulties when the matrix is not diagonally dominant or when there are nearly degenerate solutions.

The well-known Davidson method [1] for the iterative solution of the lowest few eigenvalues and eigenvectors of large, symmetric matrices combines some of the features of direct and iterative techniques. Although only matrix-vector operations are required, and there is no need to explicitly store the Hamiltonian matrix, Davidson's method also uses direct methods to diagonalize a small Hamiltonian matrix formed in the subspace of all trial CI vectors that have been considered up to the present iteration. The current estimates of the eigenvalues of the full Hamiltonian matrix are obtained as the eigenvalues of the small Hamiltonian matrix, and the current CI vectors are obtained as the linear combinations of the trial vectors whose coefficients are given by the eigenvectors of the small Hamiltonian matrix. Pople and co-workers later used related ideas to iteratively solve the large systems of linear equations occurring in the coupled-perturbed Hartree-Fock method [2].
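To make the subspace idea concrete before turning to DIIS, here is a minimal Python/NumPy sketch of a Davidson-style subspace diagonalization for the lowest eigenpair. It is an illustration, not Davidson's production algorithm: the explicit matrix A, the function name davidson_lowest, the small shift in the preconditioner, and the convergence threshold are all assumptions made for the example; a real CI code forms the matrix-vector products without ever storing A.

```python
import numpy as np

def davidson_lowest(A, n_iter=30, tol=1e-8):
    """Sketch of a Davidson-style subspace iteration for the lowest
    eigenpair of a symmetric matrix A (illustration only; a real CI
    code never stores A, it only evaluates products A @ b)."""
    n = A.shape[0]
    b = np.zeros(n)
    b[0] = 1.0                            # initial trial vector
    V = [b]                               # orthonormal trial vectors
    for _ in range(n_iter):
        Vm = np.column_stack(V)
        H_small = Vm.T @ A @ Vm           # small subspace Hamiltonian
        theta, S = np.linalg.eigh(H_small)
        lam, c = theta[0], S[:, 0]        # lowest subspace eigenpair
        x = Vm @ c                        # current full-space estimate
        r = A @ x - lam * x               # residual vector
        if np.linalg.norm(r) < tol:
            break
        # Davidson's diagonal preconditioner for the correction vector
        delta = r / (lam - np.diag(A) + 1e-12)
        delta -= Vm @ (Vm.T @ delta)      # orthogonalize against the subspace
        delta /= np.linalg.norm(delta)
        V.append(delta)
    return lam, x
```

The essential point is the one made above: the expensive full-space matrix enters only through matrix-vector products, while direct diagonalization is applied only to the small subspace matrix.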

In 1980, Pulay published a somewhat similar method [3] known as the direct inversion of the iterative subspace (DIIS). Like the Davidson method, DIIS applies direct methods to a small linear algebra problem (now a system of linear equations instead of an eigenvalue problem) in a subspace formed by taking a set of trial vectors from the full-dimensional space. Pulay found that DIIS could be useful for accelerating the convergence of self-consistent-field (SCF) procedures and, to a lesser extent, geometry optimizations.

The Mathematics of DIIS

Suppose that we have a set of trial vectors $\{p_i\}$ which have been generated during the iterative solution of a problem. Now let us form a set of residual vectors defined as

$$\Delta p_i = p_{i+1} - p_i. \tag{1}$$

The DIIS method assumes that a good approximation to the final solution $p_f$ can be obtained as a linear combination of the previous guess vectors,

$$p = \sum_{i=1}^{m} c_i \, p_i, \tag{2}$$

where $m$ is the number of previous vectors (in practice, only the most recent few vectors are used). The coefficients $c_i$ are obtained by requiring that the associated residual vector,

$$\Delta p = \sum_{i=1}^{m} c_i \, \Delta p_i, \tag{3}$$

approximates the zero vector in a least-squares sense. Furthermore, the coefficients are required to add to one,

$$\sum_{i=1}^{m} c_i = 1. \tag{4}$$

The motivation for the latter requirement can be seen as follows. Each of our trial solutions $p_i$ can be written as the exact solution plus an error term, $p_f + e_i$. Then the DIIS approximate solution is given by

$$p = \sum_{i=1}^{m} c_i \left( p_f + e_i \right) = p_f \sum_{i=1}^{m} c_i + \sum_{i=1}^{m} c_i \, e_i. \tag{5}$$

Hence, we wish to minimize the actual error, which is the second term in the equation above (of course, in practice we do not know the $e_i$, only the $\Delta p_i$); doing so would make the second term vanish, leaving only the first term. For $p = p_f$, we must then have $\sum_{i=1}^{m} c_i = 1$.

Thus, we wish to minimize the norm of the residuum vector,

$$\langle \Delta p | \Delta p \rangle = \sum_{ij} c_i c_j \langle \Delta p_i | \Delta p_j \rangle, \tag{6}$$

subject to the constraint (4). These requirements can be satisfied by minimizing the following function with Lagrangian multiplier $\lambda$,

$$\mathcal{L} = c^{\dagger} B c - \lambda \left( \sum_i c_i - 1 \right), \tag{7}$$

where $B$ is the matrix of overlaps

$$B_{ij} = \langle \Delta p_i | \Delta p_j \rangle. \tag{8}$$

We can minimize $\mathcal{L}$ with respect to a coefficient $c_k$ to obtain (assuming real quantities)

$$\frac{\partial \mathcal{L}}{\partial c_k} = 0 = \sum_j c_j B_{kj} + \sum_i c_i B_{ik} - \lambda = 2 \sum_i c_i B_{ik} - \lambda. \tag{9}$$

We can absorb the factor of 2 into $\lambda$ to obtain the following matrix equation, which is eq. (6) of Pulay [3]:

$$\begin{pmatrix}
B_{11} & B_{12} & \cdots & B_{1m} & -1 \\
B_{21} & B_{22} & \cdots & B_{2m} & -1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
B_{m1} & B_{m2} & \cdots & B_{mm} & -1 \\
-1 & -1 & \cdots & -1 & 0
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \\ \lambda \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ -1 \end{pmatrix}. \tag{10}$$
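As a concrete illustration of eqs. (8) and (10), the following Python/NumPy sketch builds the B matrix from a list of error vectors and solves the bordered linear system for the coefficients $c_i$. This code is not part of the original note; the function name diis_extrapolate and the dense numpy.linalg.solve call are assumptions made for the example.

```python
import numpy as np

def diis_extrapolate(trial_vecs, error_vecs):
    """Solve the bordered system of eq. (10) and return the DIIS
    interpolant p = sum_i c_i p_i of eq. (2)."""
    m = len(error_vecs)
    # B_ij = <Delta p_i | Delta p_j>, eq. (8)
    B = np.array([[np.dot(ei, ej) for ej in error_vecs] for ei in error_vecs])

    # Bordered DIIS matrix and right-hand side, eq. (10)
    A = np.zeros((m + 1, m + 1))
    A[:m, :m] = B
    A[m, :m] = -1.0                       # constraint row: -sum_i c_i = -1
    A[:m, m] = -1.0                       # Lagrange-multiplier column
    rhs = np.zeros(m + 1)
    rhs[m] = -1.0

    c = np.linalg.solve(A, rhs)[:m]       # drop lambda, keep c_1..c_m
    return sum(ci * pi for ci, pi in zip(c, trial_vecs))
```

As noted above, only the most recent few vectors are kept in practice; if the error vectors become nearly linearly dependent, B is nearly singular, and discarding the oldest vectors before solving is a common remedy.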

Programming DIIS

The DIIS procedure seems so simple that further comment on specific computational implementations might appear superfluous. However, I have found that the precise computational details are absolutely crucial for effective interpolation. Hence, I will describe here my implementation of DIIS for the optimization of orbitals in a two-step CASSCF program. There are probably many variations on this implementation which would also work, but often seemingly inconsequential changes can make dramatic differences in efficiency.

In the two-step CASSCF procedure, one begins with a set of guess orbitals, solves the full CI problem in the active space, determines the gradient for orbital rotations, takes a step in orbital rotation (theta) space down the gradient direction (i.e., obtains new guess orbitals), and repeats the process iteratively until convergence. To allow DIIS interpolation, one can express the current set of guess orbitals as the result of the multiplication of a set of Givens rotation matrices by a matrix of reference orbitals ($C_{\mu p} = \sum_q C^o_{\mu q} U_{qp}$; see [4]). The rotation angles which define the unitary transformation $U$ (a product of Givens rotation matrices) comprise a vector of parameters, $p$. In this case, one can define the error vectors as the differences between subsequent sets of orbital rotation angles, or one could also reasonably choose the orbital gradient vector.

In my detcas program, the regular theta step is determined using a Newton-Raphson approach with an approximate, diagonal orbital Hessian. This is equivalent to scaling the orbital gradient to a new coordinate system. Since the step in theta space is just the scaled gradient, the scaled gradient is the same as the difference between successive theta vectors (apart from a sign) before the DIIS procedure starts. However, I have found it much better to associate the gradient vector with the next iteration's theta vector, not with the theta vector from which it was computed. In other words, it is best to change eq. (1) to the following:

$$\Delta p_{i+1} = p_{i+1} - p_i. \tag{11}$$

Another general consideration is that one does not want to add an interpolated vector to the list of vectors $\{p_i\}$ unless it contains some new character to add to the subspace. Otherwise, linear dependencies can result. An outline of my DIIS procedure for the detcas program is given below:

1. Using current orbitals $p_i$, obtain the scaled orbital gradient $g_i$.

2. Take the Newton-Raphson step $p_{i+1} = p_i - g_i$.

3. Add $p_{i+1}$ to the list of vectors. Add $\Delta p_{i+1} = -g_i$ to the list of error vectors.

4. Perform the DIIS interpolation to obtain a new guess vector. Overwrite $p_{i+1}$ with the DIIS interpolant. (This vector will never be added to the list of vectors.)

5. Increment $i$ and begin a new cycle.
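A minimal sketch of these five steps, reusing the hypothetical diis_extrapolate helper from the earlier example, might look as follows. The callback gradient, the diagonal Hessian scaling, the convergence test, and the subspace limit max_subspace are all assumptions for illustration; the actual detcas program works with orbital rotation angles and interleaves the CI step, which is not shown here.

```python
import numpy as np

def orbital_diis_loop(theta, gradient, hessian_diag,
                      max_iter=50, tol=1e-8, max_subspace=6):
    """Sketch of the detcas-style DIIS procedure outlined above.

    theta        : initial vector of orbital rotation angles
    gradient     : hypothetical callback returning the orbital gradient
    hessian_diag : approximate diagonal orbital Hessian used to scale the step
    """
    vecs, errs = [], []                     # trial vectors and error vectors
    for _ in range(max_iter):
        g = gradient(theta) / hessian_diag  # step 1: scaled orbital gradient
        if np.linalg.norm(g) < tol:
            break
        theta_next = theta - g              # step 2: Newton-Raphson step

        vecs.append(theta_next)             # step 3: add trial vector ...
        errs.append(-g)                     # ... and error vector, eq. (11)
        if len(vecs) > max_subspace:        # keep only the most recent vectors
            vecs.pop(0)
            errs.pop(0)

        if len(vecs) > 1:
            # Step 4: overwrite the step with the DIIS interpolant; the
            # interpolated vector itself is never added to the lists.
            theta_next = diis_extrapolate(vecs, errs)

        theta = theta_next                  # step 5: increment i, new cycle
    return theta
```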

References

[1] E. R. Davidson, J. Comput. Phys. 17, 87 (1975).

[2] J. A. Pople, R. Krishnan, H. B. Schlegel, and J. S. Binkley, Int. J. Quantum Chem. Symp. 13, 225 (1979).

[3] P. Pulay, Chem. Phys. Lett. 73, 393 (1980).

[4] M. Head-Gordon and J. A. Pople, J. Phys. Chem. 92, 3063 (1988).