Optimization of quantum Monte Carlo wave functions by energy minimization

Julien Toulouse, Roland Assaraf, Cyrus J. Umrigar
Laboratoire de Chimie Théorique, Université Pierre et Marie Curie and CNRS, Paris, France.
Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, New York, USA.
Email: toulouse@lct.jussieu.fr
Web page: www.lct.jussieu.fr/pagesperso/toulouse/
Collaborators: Claudia Filippi (Leiden University), Richard G. Hennig (Cornell University), Sandro Sorella (SISSA, Trieste).
January 2008
Outline
1 Wave function optimization
2 Calculation of pair densities
3 Conclusions
Trial wave function

Jastrow-Slater wave function:
\[ |\Psi(\mathbf{p})\rangle = \hat{J}(\boldsymbol{\alpha}) \sum_{i=1}^{N_{\text{CSF}}} c_i\, |C_i\rangle \]
- $\hat{J}(\boldsymbol{\alpha})$ = Jastrow factor (with e-e, e-n, e-e-n terms)
- $|C_i\rangle$ = configuration state function (CSF) = linear combination of Slater determinants of given symmetry

The Slater determinants are made of orbitals expanded on a Slater basis:
\[ \phi_k(\mathbf{r}) = \sum_{\mu=1}^{N_{\text{basis}}} \lambda_{k\mu}\, \chi_\mu(\mathbf{r}), \qquad \chi(\mathbf{r}) = N(\zeta)\, r^{n-1} e^{-\zeta r}\, S_{l,m}(\theta,\phi) \]

Parameters to optimize, $\mathbf{p} = \{\boldsymbol{\alpha}, \mathbf{c}, \boldsymbol{\lambda}, \boldsymbol{\zeta}\}$: Jastrow parameters $\boldsymbol{\alpha}$, CSF coefficients $\mathbf{c}$, orbital coefficients $\boldsymbol{\lambda}$ and basis exponents $\boldsymbol{\zeta}$.
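As a small illustration of the Slater basis functions above, the following sketch evaluates the radial part $N(\zeta)\, r^{n-1} e^{-\zeta r}$ and checks its normalization numerically. It assumes the usual radial normalization $N(\zeta) = (2\zeta)^{n+1/2}/\sqrt{(2n)!}$ with the angular factor $S_{l,m}$ normalized separately; the function names and the quadrature settings are hypothetical choices made for this example.

```python
import math

def sto_norm(n, zeta):
    """Radial normalization N(zeta), assuming int_0^inf r^2 R(r)^2 dr = 1."""
    return (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))

def sto_radial(r, n, zeta):
    """Radial part of a Slater function: N(zeta) * r^(n-1) * exp(-zeta * r)."""
    return sto_norm(n, zeta) * r ** (n - 1) * math.exp(-zeta * r)

def radial_norm_integral(n, zeta, r_max=40.0, steps=20000):
    """Midpoint-rule estimate of int_0^r_max r^2 R(r)^2 dr (illustrative quadrature)."""
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += (r * sto_radial(r, n, zeta)) ** 2
    return total * h

# Example: a 2p-like function (n = 2) with an arbitrary exponent.
norm_2p = radial_norm_integral(2, 1.5)
```

The exponent $\zeta$ enters both the prefactor and the exponential, which is why varying it changes the shape and the normalization of the basis function at the same time.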
Wave function optimization: why and how?

Important for both VMC and DMC in order to
- reduce the systematic error
- reduce the statistical uncertainty

How to optimize?
- Until recently: minimization of the variance of the energy. This is adequate for the few Jastrow parameters but does not work well for the many CSF and orbital parameters.
- More recently: minimization of the energy (possibly with a small fraction of variance), in order to optimize all the parameters well and because the energy is a better criterion.
Wave function parametrization

- Jastrow parameters $\boldsymbol{\alpha}$, CSF coefficients $\mathbf{c}$, basis exponents $\boldsymbol{\zeta}$: no difficulty
- orbital coefficients $\boldsymbol{\lambda}$ are redundant = bad parametrization

Reparametrization of orbital coefficients $\boldsymbol{\lambda} \to \boldsymbol{\kappa}$ (used in MCSCF):
\[ |\Psi(\mathbf{p})\rangle = \hat{J}(\boldsymbol{\alpha})\, e^{\hat{\kappa}(\boldsymbol{\kappa})} \sum_{i=1}^{N_{\text{CSF}}} c_i\, |C_i\rangle \]
where $\hat{\kappa}(\boldsymbol{\kappa})$ is the generator of rotations in orbital space (occupied and virtual):
\[ \hat{\kappa}(\boldsymbol{\kappa}) = \sum_{k<l} \kappa_{kl} \left( \hat{E}_{kl} - \hat{E}_{lk} \right) \]
and $\hat{E}_{kl} = \hat{a}^\dagger_{k\uparrow} \hat{a}_{l\uparrow} + \hat{a}^\dagger_{k\downarrow} \hat{a}_{l\downarrow}$ is the singlet excitation operator.

Some points to note:
- non-redundant parametrization
- orthonormalization of orbitals preserved if basis exponents are not varied
- can be generalized if basis exponents are varied
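The orthonormality statement can be illustrated at the level of the orbital coefficient matrix: the exponential of a real antisymmetric matrix is orthogonal, so rotating orthonormal orbitals with it keeps them orthonormal. The sketch below is only a one-particle picture of the many-body operator $e^{\hat{\kappa}}$, with a hypothetical three-orbital space, arbitrary $\kappa_{kl}$ values, and a plain Taylor-series matrix exponential chosen for simplicity.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(a, terms=30):
    """Matrix exponential by truncated Taylor series (fine for small norms)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def kappa_matrix(kappa, n):
    """Antisymmetric matrix built from the independent parameters kappa[(k, l)], k < l."""
    m = [[0.0] * n for _ in range(n)]
    for (k, l), value in kappa.items():
        m[k][l] = value
        m[l][k] = -value
    return m

# Hypothetical rotation parameters for a 3-orbital space.
kappa = {(0, 1): 0.3, (0, 2): -0.1, (1, 2): 0.2}
U = expm_taylor(kappa_matrix(kappa, 3))
U_transpose = [list(col) for col in zip(*U)]
UtU = matmul(U_transpose, U)  # should be close to the identity
```

Only the $k<l$ entries are free parameters, which is the sense in which the parametrization is non-redundant.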
Energy minimization in VMC

We have worked on three methods, with $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$:

Newton method
\[ E(\mathbf{p}) \approx E(\mathbf{p}_0) + \sum_i \frac{\partial E(\mathbf{p}_0)}{\partial p_i}\, \Delta p_i + \frac{1}{2} \sum_{i,j} \frac{\partial^2 E(\mathbf{p}_0)}{\partial p_i\, \partial p_j}\, \Delta p_i\, \Delta p_j \]

Linear method
\[ \Psi(\mathbf{p}) \approx \Psi(\mathbf{p}_0) + \sum_i \frac{\partial \Psi(\mathbf{p}_0)}{\partial p_i}\, \Delta p_i \]
$\Longrightarrow$ diagonalization of $\hat{H}$ in the basis $\{\Psi(\mathbf{p}_0), \partial\Psi(\mathbf{p}_0)/\partial p_i\}$:
\[ \mathbf{H}\, \Delta\mathbf{p} = E\, \mathbf{S}\, \Delta\mathbf{p} \]

Perturbative method
approximate resolution of $\mathbf{H}\, \Delta\mathbf{p} = E\, \mathbf{S}\, \Delta\mathbf{p}$ by nonorthogonal perturbation theory
Linear optimization method: principle

Expansion of the wave function around $\mathbf{p}_0$ to linear order in $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$:
\[ |\Psi^{[1]}(\mathbf{p})\rangle = |\Psi_0\rangle + \sum_j \Delta p_j\, |\Psi_j\rangle \]
where $|\Psi_0\rangle = |\Psi(\mathbf{p}_0)\rangle$ and $|\Psi_j\rangle = \partial |\Psi(\mathbf{p}_0)\rangle / \partial p_j$.

Normalization of the wave function chosen so that the derivatives $|\Psi_j\rangle$ are orthogonal to $|\Psi_0\rangle$.

Minimization of the energy $\Longrightarrow$ generalized eigenvalue equation:
\[ \begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\text{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
where $E_0 = \langle\Psi_0|\hat{H}|\Psi_0\rangle$, $g_i = \partial E(\mathbf{p}_0)/\partial p_i$, $H_{ij} = \langle\Psi_i|\hat{H}|\Psi_j\rangle$, $S_{ij} = \langle\Psi_i|\Psi_j\rangle$.

Update of the parameters: $\mathbf{p}_0 \to \mathbf{p}_0 + \Delta\mathbf{p}$.
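A minimal sketch of the linear method on a toy problem may help fix ideas. It assumes a hypothetical 2x2 model Hamiltonian and a one-parameter normalized trial state $\Psi(p) = (\cos p, \sin p)$, for which the derivative is automatically orthogonal to $\Psi_0$ and $S = 1$. With a single parameter, the generalized eigenvalue equation reduces to a quadratic in $E_{\text{lin}}$, which is solved in closed form here. None of these names or values come from the slides.

```python
import math

# Hypothetical symmetric 2x2 model Hamiltonian [[A, B], [B, C]] (illustration only).
A, B, C = -1.0, 0.3, 0.5

def apply_h(v):
    return (A * v[0] + B * v[1], B * v[0] + C * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def energy(p):
    psi = (math.cos(p), math.sin(p))
    return dot(psi, apply_h(psi))

def linear_method_step(p0):
    """One linear-method step: returns (E_lin, dp) for the one-parameter model."""
    psi0 = (math.cos(p0), math.sin(p0))    # Psi(p0), normalized
    psi1 = (-math.sin(p0), math.cos(p0))   # dPsi/dp, orthogonal to psi0
    e0 = dot(psi0, apply_h(psi0))
    g = 2.0 * dot(psi1, apply_h(psi0))     # energy gradient dE/dp
    h11 = dot(psi1, apply_h(psi1))
    s11 = dot(psi1, psi1)                  # equals 1 in this model
    # Lowest root of (e0 - E)(h11 - E*s11) - g^2/4 = 0.
    qa, qb, qc = s11, -(h11 + e0 * s11), e0 * h11 - 0.25 * g * g
    e_lin = (-qb - math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    # Second row of the eigenvalue equation: g/2 + h11*dp = E_lin*s11*dp.
    dp = -0.5 * g / (h11 - e_lin * s11)
    return e_lin, dp

def optimize(p0, n_iter=6):
    p, history = p0, []
    for _ in range(n_iter):
        history.append(energy(p))
        _, dp = linear_method_step(p)
        p += dp                            # update p0 -> p0 + dp
    history.append(energy(p))
    return p, history

exact = 0.5 * (A + C) - math.sqrt((0.5 * (A - C)) ** 2 + B ** 2)
p_opt, energies = optimize(0.3)
```

In this toy model the basis $\{\Psi_0, \Psi_1\}$ spans the whole space, so $E_{\text{lin}}$ is already exact after one diagonalization; the iterations are needed only because the wave function depends nonlinearly on $p$.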
Linear optimization method: robustness

The linear method is equivalent to a stabilized Newton method:
\[ \begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\text{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \quad\Longleftrightarrow\quad \begin{cases} (\mathbf{h} + 2\Delta E\, \mathbf{S})\, \Delta\mathbf{p} = -\mathbf{g} \\ 2\Delta E = -\mathbf{g}^T \Delta\mathbf{p} \end{cases} \]
where $\mathbf{h} = 2(\mathbf{H} - E_0 \mathbf{S})$ is an approximate Hessian, and $\Delta E = E_0 - E_{\text{lin}} > 0$ is the energy stabilization.
$\Longrightarrow$ more robust than the Newton method

In quantum chemistry, it is known as the super-CI method or the augmented Hessian method.

Additional stabilization: $H_{ij} \to H_{ij} + a\, \delta_{ij}$ where $a \geq 0$.
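The equivalence with a stabilized Newton method follows from reading the generalized eigenvalue equation row by row. The short derivation below uses only the symbols defined on this slide.

```latex
\begin{align*}
\text{Row 1:}\quad & E_0 + \tfrac{1}{2}\,\mathbf{g}^T \Delta\mathbf{p} = E_{\text{lin}}
  \;\Longrightarrow\; 2\,\Delta E = 2\,(E_0 - E_{\text{lin}}) = -\,\mathbf{g}^T \Delta\mathbf{p}, \\
\text{Row 2:}\quad & \tfrac{1}{2}\,\mathbf{g} + \mathbf{H}\,\Delta\mathbf{p} = E_{\text{lin}}\,\mathbf{S}\,\Delta\mathbf{p}
  \;\Longrightarrow\; 2\,(\mathbf{H} - E_0\,\mathbf{S})\,\Delta\mathbf{p}
  + 2\,(E_0 - E_{\text{lin}})\,\mathbf{S}\,\Delta\mathbf{p} = -\,\mathbf{g}, \\
& \text{i.e.}\quad (\mathbf{h} + 2\,\Delta E\,\mathbf{S})\,\Delta\mathbf{p} = -\,\mathbf{g},
  \qquad \mathbf{h} = 2\,(\mathbf{H} - E_0\,\mathbf{S}).
\end{align*}
```

Since $\mathbf{S}$ is positive definite and $\Delta E > 0$, the term $2\Delta E\, \mathbf{S}$ shifts the approximate Hessian towards positive definiteness, which is the source of the added robustness.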
Linear optimization method: on a finite VMC sample

The generalized eigenvalue equation is estimated as
\[ \begin{pmatrix} E_0 & \mathbf{g}_R^T/2 \\ \mathbf{g}_L/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\text{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
with
\[ g_{L,i}/2 = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})}\, \frac{H(\mathbf{R})\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2}, \qquad g_{R,j}/2 = \left\langle \frac{\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})}\, \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2}, \]
\[ H_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})}\, \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \;\text{(non-symmetric!)}, \qquad S_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})}\, \frac{\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2}. \]

$\Longrightarrow$ Zero-variance principle of Nightingale et al. (PRL 2001): if there is some $\Delta\mathbf{p}$ such that $\Psi_0(\mathbf{R}) + \sum_j \Delta p_j \Psi_j(\mathbf{R}) = \Psi_{\text{exact}}(\mathbf{R})$, then $\Delta\mathbf{p}$ is found with zero variance.

In practice, these non-symmetric estimators reduce the fluctuations on $\Delta\mathbf{p}$ by 1 or 2 orders of magnitude.
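The zero-variance principle can be checked on a toy case. The sketch below assumes a 1D harmonic oscillator, $H = -\tfrac{1}{2} d^2/dx^2 + \tfrac{1}{2}x^2$, with the hypothetical trial function $\Psi_0 = (1 + c\,x^2)\,e^{-x^2/2}$ and derivative function $\Psi_1 = x^2 e^{-x^2/2}$, so that the exact ground state is $\Psi_0 - c\,\Psi_1$. For simplicity it does not orthogonalize $\Psi_1$ to $\Psi_0$ (the overlap matrix is kept in full), and it uses uniformly drawn sample points rather than points sampled from $\Psi_0^2$, to emphasize that the result does not depend on the sample.

```python
import math
import random

C_TRIAL = 0.3  # hypothetical coefficient in Psi_0 = (1 + c x^2) exp(-x^2/2)

def ratios(x, c=C_TRIAL):
    """Return Psi_1/Psi_0, (H Psi_0)/Psi_0, (H Psi_1)/Psi_0 at point x."""
    psi0 = 1.0 + c * x * x
    h_psi0 = 0.5 + c * (2.5 * x * x - 1.0)   # uses H[x^2 g] = (2.5 x^2 - 1) g
    h_psi1 = 2.5 * x * x - 1.0
    return (x * x) / psi0, h_psi0 / psi0, h_psi1 / psi0

def solve_on_sample(points):
    """Build non-symmetric estimators on the sample and solve the 2x2 problem."""
    n = len(points)
    r, e0, e1 = zip(*(ratios(x) for x in points))
    mean = lambda values: sum(values) / n
    # Overlap and (non-symmetric) Hamiltonian matrices in the basis {Psi_0, Psi_1}.
    s = [[1.0, mean(r)], [mean(r), mean([a * a for a in r])]]
    h = [[mean(e0), mean(e1)],
         [mean([a * b for a, b in zip(r, e0)]),
          mean([a * b for a, b in zip(r, e1)])]]
    # det(h - E s) = a2 E^2 + a1 E + a0 = 0; take the lowest root.
    a2 = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    a1 = -(h[0][0] * s[1][1] + h[1][1] * s[0][0]
           - h[0][1] * s[1][0] - h[1][0] * s[0][1])
    a0 = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    e_lin = (-a1 - math.sqrt(a1 * a1 - 4.0 * a2 * a0)) / (2.0 * a2)
    # First row: (h00 - E s00) + (h01 - E s01) dp = 0.
    dp = -(h[0][0] - e_lin * s[0][0]) / (h[0][1] - e_lin * s[0][1])
    return e_lin, dp

def random_points(seed, n=50):
    rng = random.Random(seed)
    return [rng.uniform(-2.5, 2.5) for _ in range(n)]

result_a = solve_on_sample(random_points(1))
result_b = solve_on_sample(random_points(2))
```

Two different 50-point samples give the same $E_{\text{lin}} = 0.5$ and $\Delta p = -c$ up to rounding, because the eigenvalue equation is satisfied pointwise when the exact state lies in the linear space.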
Linear optimization method: mixing a fraction of variance

How to minimize the energy variance with the linear method?
\[ V = \min_{\Delta\mathbf{p}} \left\{ V_0 + \mathbf{g}_V^T \Delta\mathbf{p} + \frac{1}{2}\, \Delta\mathbf{p}^T \mathbf{h}_V\, \Delta\mathbf{p} \right\} \]
\[ V = \min_{\Delta\mathbf{p}} \frac{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} }{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} } \]
\[ \Longrightarrow\quad \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = V \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} \]
The matrix on the left-hand side is the matrix to add to the energy matrix.
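The step from the quadratic model to the ratio of quadratic forms relies on the two expressions agreeing through second order in $\Delta\mathbf{p}$. The sketch below checks this numerically for a single parameter; the values of $V_0$, $g_V$, $h_V$ and $S$ are arbitrary numbers chosen for illustration, not data from the talk.

```python
def quadratic_model(dp, v0, gv, hv):
    """Second-order model V0 + gV*dp + (1/2)*hV*dp^2."""
    return v0 + gv * dp + 0.5 * hv * dp * dp

def rayleigh_quotient(dp, v0, gv, hv, s):
    """Ratio of the two quadratic forms for a single parameter."""
    numerator = v0 + gv * dp + (0.5 * hv + v0 * s) * dp * dp
    denominator = 1.0 + s * dp * dp
    return numerator / denominator

# Hypothetical one-parameter values (illustration only).
V0, GV, HV, S = 0.7, -0.5, 1.2, 0.8

def mismatch(dp):
    return rayleigh_quotient(dp, V0, GV, HV, S) - quadratic_model(dp, V0, GV, HV)

# The mismatch should scale like dp^3: halving dp divides it by about 8.
ratio = mismatch(0.02) / mismatch(0.01)
```

The $V_0 \mathbf{S}$ term in the lower-right block is what compensates the denominator, so that the two expressions differ only at third order.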
Simultaneous optimization of all parameters in VMC

Optimization of 24 Jastrow, 49 CSF, 64 orbital and 12 exponent parameters for the C$_2$ molecule:

[Figure: VMC energy (Hartree) versus iteration number (0 to 6), with an inset magnifying iterations 2 to 6.]

$\Longrightarrow$ The energy converges with an accuracy of 1 mhartree in about 4 or 5 iterations.
Systematic improvement in QMC

For the C$_2$ molecule: total energies for a series of fully optimized Jastrow-Slater wave functions:

[Figure: VMC and DMC energies (Hartree) for the wave functions J*SD, J*CAS(8,5), J*CAS(8,7), J*CAS(8,8) and J*RAS(8,26), compared with the CCSD(T)/cc-pVQZ and exact energies.]

$\Longrightarrow$ Systematic improvement in VMC and DMC!
Potential energy curve of the C$_2$ molecule ($^1\Sigma_g^+$)

With fully optimized Jastrow single-determinant wave functions:

[Figure: energy (Hartree) versus interatomic distance (Bohr, 1 to 10) for VMC J SD and DMC J SD, compared with a Morse potential.]

$\Longrightarrow$ Single-determinant DMC is size-consistent with broken spin-symmetry at dissociation, $\langle\Psi_{\text{FN}}|\hat{S}^2|\Psi_{\text{FN}}\rangle = 2$.
Potential energy curve of the C$_2$ molecule ($^1\Sigma_g^+$)

With fully optimized Jastrow multi-determinant wave functions:

[Figure: energy (Hartree) versus interatomic distance (Bohr, 1 to 10) for VMC J CAS(8,8) and DMC J CAS(8,8), compared with a Morse potential.]

$\Longrightarrow$ Multi-determinant DMC gives a well depth with chemical accuracy (1 kcal/mol $\approx$ 0.04 eV): $E_{\text{DMC}} = 6.482(3)$ eV vs $E_{\text{exact}} = 6.44(2)$ eV.
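The chemical-accuracy statement is simple arithmetic, spelled out in the sketch below. It assumes the thermochemical calorie (4.184 J) and a rounded conversion factor of 96.485 kJ/mol per eV, and it takes the quoted well depths to be in eV; the variable names are illustrative.

```python
KCAL_TO_KJ = 4.184           # thermochemical calorie, in kJ per kcal
EV_TO_KJ_PER_MOL = 96.485    # 1 eV per particle expressed in kJ/mol (rounded)

def kcal_per_mol_to_ev(value):
    return value * KCAL_TO_KJ / EV_TO_KJ_PER_MOL

chemical_accuracy_ev = kcal_per_mol_to_ev(1.0)

# Difference between the quoted DMC and exact well depths (central values only).
well_depth_error_ev = 6.482 - 6.44
well_depth_error_kcal = well_depth_error_ev / chemical_accuracy_ev
```

The difference of the central values is just under 1 kcal/mol, and it is comparable to the quoted uncertainty of 0.02 eV on the exact value.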
Well depths of second-row homonuclear diatomic molecules

VMC and DMC errors in well depths for some fully optimized Jastrow-Slater wave functions:

[Figure: error in well depth (eV, from 0 to -1.5) for the molecules Li$_2$, Be$_2$, B$_2$, C$_2$, N$_2$, O$_2$ and F$_2$, with curves for VMC J SD, DMC J SD, VMC J CAS and DMC J CAS.]

$\Longrightarrow$ Near chemical accuracy in DMC with Jastrow CAS.
Calculation of pair densities

Spherically and system-averaged pair density = intracule density:
\[ I(u) = \sum_{i<j} \int \frac{d\Omega_{\mathbf{u}}}{4\pi} \int d\mathbf{R}\; \Psi(\mathbf{R})^2\, \delta(\mathbf{r}_{ij} - \mathbf{u}) \]
e.g., it gives the Coulombic electron-electron interaction energy:
\[ W_{ee} = \int_0^\infty du\; 4\pi u^2\, I(u)\, \frac{1}{u} \]

Usefulness
- qualitative analysis of electronic structure (Cioslowski, Gill, Ugalde, etc.)
- quantitative predictions beyond usual DFT (Gori-Giorgi, Perdew, Savin, etc.)
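Calculation of intracules in QMC

Usual histogram method:
\[ I(u) \approx \sum_{i<j} \int \frac{d\Omega_{\mathbf{u}}}{4\pi} \int d\mathbf{R}\; \Psi(\mathbf{R})^2\, \frac{1_{[\mathbf{u} - \Delta\mathbf{u}/2,\; \mathbf{u} + \Delta\mathbf{u}/2]}(\mathbf{r}_{ij})}{\Delta u^3} \]

Problems
- large statistical uncertainties due to large variances, especially at small $u$
- systematic errors due to the approximate $\Psi(\mathbf{R})$ but also due to the discretization over $u$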
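The histogram approach can be sketched as follows. This version bins the scalar distance $u = |\mathbf{r}_{ij}|$ into spherical shells of approximate volume $4\pi u^2 \Delta u$, which is an assumption made for simplicity rather than the exact form used on the slide. The "configurations" are independent Gaussian positions for two particles, a stand-in for points sampled from $\Psi^2$, not a real wave function.

```python
import math
import random

def intracule_histogram(pairs, u_max, n_bins):
    """Radial histogram estimate of I(u) from sampled two-particle positions."""
    du = u_max / n_bins
    counts = [0] * n_bins
    for r1, r2 in pairs:
        k = int(math.dist(r1, r2) / du)
        if k < n_bins:
            counts[k] += 1
    centers = [(k + 0.5) * du for k in range(n_bins)]
    # Divide by the number of samples and by the approximate shell volume.
    values = [c / (len(pairs) * 4.0 * math.pi * u * u * du)
              for c, u in zip(counts, centers)]
    return centers, values, counts

def toy_configurations(n, seed=0):
    """Hypothetical stand-in for VMC configurations: independent Gaussian positions."""
    rng = random.Random(seed)
    point = lambda: tuple(rng.gauss(0.0, 1.0) for _ in range(3))
    return [(point(), point()) for _ in range(n)]

U_MAX, N_BINS = 12.0, 40
centers, values, counts = intracule_histogram(toy_configurations(5000), U_MAX, N_BINS)
du = U_MAX / N_BINS

# Sum rule: the integral of 4*pi*u^2*I(u) is the number of pairs (1 here).
normalization = sum(4.0 * math.pi * u * u * v * du for u, v in zip(centers, values))
# Coulomb-type integral W_ee = int du 4*pi*u^2 I(u) / u, evaluated on the grid.
w_ee = sum(4.0 * math.pi * u * v * du for u, v in zip(centers, values))
```

The first bin collects only a small fraction of the samples because the shell volume vanishes as $u \to 0$, which is one way to see why the statistical uncertainty is largest at small $u$.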
Calculation of an observable in VMC

Energy
- Estimator: $E_L(\mathbf{R}) = \dfrac{H(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$
- Systematic error: $\delta E = O(\delta\Psi^2)$
- Variance: $\sigma^2(E_L) = O(\delta\Psi^2)$
$\Longrightarrow$ quadratic Zero-Variance Zero-Bias property

Arbitrary observable $\hat{O}$ (which does not commute with $\hat{H}$)
- Estimator: $O_L(\mathbf{R}) = \dfrac{O(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$
- Systematic error: $\delta O = O(\delta\Psi)$
- Variance: $\sigma^2(O_L) = O(1)$
$\Longrightarrow$ no quadratic Zero-Variance Zero-Bias property
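The zero-variance property of the energy can be seen on a toy example. The sketch below assumes a 1D harmonic oscillator, $H = -\tfrac{1}{2} d^2/dx^2 + \tfrac{1}{2}x^2$, with the hypothetical trial function $\Psi = e^{-p x^2}$, for which $E_L(x) = p + x^2(\tfrac{1}{2} - 2p^2)$ and $\Psi^2$ is a Gaussian that can be sampled directly. The exact ground state corresponds to $p = 0.5$.

```python
import random
import statistics

def local_energy(x, p):
    """E_L = (H Psi)/Psi for Psi = exp(-p x^2) and H = -1/2 d2/dx2 + 1/2 x^2."""
    return p + x * x * (0.5 - 2.0 * p * p)

def sample_psi_squared(p, n, rng):
    """Psi^2 = exp(-2 p x^2) is a Gaussian with variance 1/(4p)."""
    sigma = (1.0 / (4.0 * p)) ** 0.5
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def energy_and_variance(p, n=20000, seed=0):
    rng = random.Random(seed)
    e_loc = [local_energy(x, p) for x in sample_psi_squared(p, n, rng)]
    return statistics.fmean(e_loc), statistics.pvariance(e_loc)

exact_mean, exact_var = energy_and_variance(0.5)   # exact wave function
approx_mean, approx_var = energy_and_variance(0.6) # slightly wrong exponent
```

With the exact exponent the local energy is constant, so the variance vanishes for any sample; with $p = 0.6$ the energy error is small (second order in the wave function error) while the variance becomes nonzero.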
Zero-Variance Zero-Bias estimators (Assaraf & Caffarel)

Consider the $\lambda$-dependent Hamiltonian $\hat{H}_\lambda = \hat{H} + \lambda \hat{O}$ with an associated trial wave function $\Psi_\lambda = \Psi + \lambda \Psi' + \cdots$

The Hellmann-Feynman theorem suggests defining the ZVZB estimator
\[ \left\langle O_L^{\text{ZVZB}}(\mathbf{R}) \right\rangle_{\Psi^2} = \left( \frac{dE_\lambda}{d\lambda} \right)_{\lambda=0} = \left\langle O_L(\mathbf{R}) \right\rangle_{\Psi^2} + \left\langle O_L^{\text{ZV}}(\mathbf{R}) \right\rangle_{\Psi^2} + \left\langle O_L^{\text{ZB}}(\mathbf{R}) \right\rangle_{\Psi^2}, \]
with the ZV term
\[ O_L^{\text{ZV}}(\mathbf{R}) = \left[ \frac{H(\mathbf{R})\Psi'(\mathbf{R})}{\Psi'(\mathbf{R})} - E_L(\mathbf{R}) \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})} \]
and the ZB term
\[ O_L^{\text{ZB}}(\mathbf{R}) = 2\left[ E_L(\mathbf{R}) - E \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})}. \]

Quadratic Zero-Variance Zero-Bias property
- Systematic error: $\delta O^{\text{ZVZB}} = O(\delta\Psi^2 + \delta\Psi\, \delta\Psi')$
- Variance: $\sigma^2\!\left(O_L^{\text{ZVZB}}\right) = O(\delta\Psi^2 + \delta\Psi'^2 + \delta\Psi\, \delta\Psi')$
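The three terms of the estimator can be recovered by differentiating the variational energy directly. The derivation below is a sketch assuming real wave functions and a Hermitian $\hat{H}$, with $E = \langle\Psi|\hat{H}|\Psi\rangle / \langle\Psi|\Psi\rangle$.

```latex
\begin{align*}
E_\lambda &= \frac{\langle \Psi_\lambda | \hat{H} + \lambda \hat{O} | \Psi_\lambda \rangle}
                  {\langle \Psi_\lambda | \Psi_\lambda \rangle},
  \qquad \Psi_\lambda = \Psi + \lambda \Psi' + \cdots \\
\left( \frac{dE_\lambda}{d\lambda} \right)_{\lambda=0}
  &= \frac{\langle \Psi | \hat{O} | \Psi \rangle}{\langle \Psi | \Psi \rangle}
   + \frac{\langle \Psi | \hat{H} | \Psi' \rangle + \langle \Psi' | \hat{H} | \Psi \rangle}
          {\langle \Psi | \Psi \rangle}
   - 2E\, \frac{\langle \Psi | \Psi' \rangle}{\langle \Psi | \Psi \rangle} \\
  &= \left\langle O_L \right\rangle_{\Psi^2}
   + \left\langle \frac{H \Psi'}{\Psi} + E_L \frac{\Psi'}{\Psi} - 2E \frac{\Psi'}{\Psi} \right\rangle_{\Psi^2} \\
  &= \left\langle O_L \right\rangle_{\Psi^2}
   + \left\langle \left[ \frac{H \Psi'}{\Psi'} - E_L \right] \frac{\Psi'}{\Psi} \right\rangle_{\Psi^2}
   + \left\langle 2 \left[ E_L - E \right] \frac{\Psi'}{\Psi} \right\rangle_{\Psi^2}.
\end{align*}
```

The last line is obtained by adding and subtracting $E_L \Psi'/\Psi$, which splits the correction into the ZV term (zero average, reduces the variance) and the ZB term (corrects the bias to first order in the wave function error).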
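Calculation of intracules in QMC

Simplest approximate wave function derivative:
\[ \Psi'(\mathbf{R}) = -\frac{1}{4\pi} \sum_{i<j} \int \frac{d\Omega_{\mathbf{u}}}{4\pi}\, \frac{1}{|\mathbf{r}_{ij} - \mathbf{u}|}\, \Psi(\mathbf{R}) \]

ZVZB improved estimator (+ possible refinements), with the convention $\mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i$:
\[ I(u) = -\frac{1}{2\pi} \sum_{i<j} \int \frac{d\Omega_{\mathbf{u}}}{4\pi} \int d\mathbf{R}\; \Psi(\mathbf{R})^2 \left[ \frac{\nabla_{\mathbf{r}_j} \Psi(\mathbf{R})}{\Psi(\mathbf{R})} \cdot \frac{\mathbf{r}_{ij} - \mathbf{u}}{|\mathbf{r}_{ij} - \mathbf{u}|^3} + \left( E_L(\mathbf{R}) - E \right) \frac{1}{|\mathbf{r}_{ij} - \mathbf{u}|} \right] \]

Advantages
- reduction of variance
- reduction of systematic error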
Comparison of the estimators

Intracule $I(u)$ of the He atom in VMC (100000 configurations):

[Figure: $I(u)$ (a.u.) versus interelectronic distance $u$ (a.u., 0 to 2), showing the accurate intracule together with the histogram, ZV and ZVZB estimators, each with a HF wave function.]

$\Longrightarrow$ reduction of statistical uncertainty and systematic error by several orders of magnitude
Systematic improvement of the intracule

Correlation hole $4\pi u^2 [I(u) - I_{\text{HF}}(u)]$ of the C$_2$ molecule in VMC for a series of wave functions:

[Figure: $4\pi u^2 [I(u) - I_{\text{HF}}(u)]$ (a.u.) versus interelectronic distance $u$ (a.u., 0 to 5) for the wave functions Jastrow HF, Jastrow SD, Jastrow CAS(8,5), Jastrow CAS(8,7) and Jastrow CAS(8,8).]
Conclusions

Summary
- Efficient wave function optimization method by energy minimization in VMC.
- Achievement of systematic improvement and near chemical accuracy.
- Improved QMC estimators for calculating pair densities.

References
- Toulouse, Umrigar, JCP 126, 084102 (2007)
- Umrigar, Toulouse, Filippi, Sorella, Hennig, PRL 98, 110201 (2007)
- Toulouse, Assaraf, Umrigar, JCP 126, 244112 (2007)
Web page: www.lct.jussieu.fr/pagesperso/toulouse/

Future work
- Direct minimization of the DMC energy.
- Optimization of molecular geometries.
- Wave function optimization for excited states.