Beyond the Hartree-Fock Approximation: Configuration Interaction

The Hartree-Fock (HF) method describes the electronic wavefunction with a single determinant (a single electronic configuration). For a given basis set, the method provides the variationally lowest energy (as the expectation value of the Hamiltonian) that a single determinant can give, and hence the best single-determinant wavefunction. The method provides a fairly good description of the ground state of most molecules near the equilibrium geometry, and it forms the basis for most electronic structure calculations in the literature up to 1995 or so. Nevertheless, no matter how large and flexible the basis set is, the HF method can only provide an approximation to the exact (true) wavefunction, because the exact wavefunction cannot be described as a single determinant (i.e. a single electronic configuration).

A closed-shell (all electrons paired) singlet configuration can always be written as a single determinant. Open-shell singlets require at least a two-determinant description. Non-singlets (doublets, triplets, etc.) require special methods. The unrestricted HF (UHF) method uses one set of spatial orbitals for the spin-up electrons and a different set for the spin-down electrons. That effectively doubles the computation time relative to a closed-shell calculation but does in many cases provide a decent description of the unpaired spin distribution. Restricted open-shell HF (ROHF) pairs as many electrons as possible in doubly occupied orbitals and then couples the remaining unpaired electrons to this closed shell using specialized techniques.

A Hartree-Fock calculation with the largest, most flexible basis set imaginable produces an energy called the Hartree-Fock limit. This energy is always higher (less negative) than the exact, nonrelativistic energy of the system.
The difference between the exact, nonrelativistic energy and the energy at the Hartree-Fock limit is called the correlation energy:

$$E(\text{Correlation}) = E(\text{exact, nonrelativistic}) - E(\text{Hartree-Fock limit})$$

The main deficiency of the Hartree-Fock method is that it does not account properly for the "correlated" motion of the electrons. We talk about "dynamical" vs. "non-dynamical" correlation. The latter arises from near-degeneracies (or real degeneracies) of states in the system, i.e. more than one electronic configuration of similar energy (transition metals, for example). Dynamical correlation refers more directly to the energy associated with the motion of the electrons being correlated rather than independent of each other. Correlation of the motions of electrons with parallel spin is partially enforced by the Pauli principle and accounted for via the determinantal description of the wavefunction, but there is no correlation of the motion of electrons with opposite spin in the Hartree-Fock wavefunction.

The Hartree-Fock method, based on the variational principle, produces a wavefunction which is too "averaged" over the positions of the electrons; it does not take sufficient account of the "instantaneous" interactions among the electrons. In the HF method, the electrons get "too close" to each other; $\langle r_{12} \rangle$ is too small, i.e. $\langle 1/r_{12} \rangle$ is too large. There is too much electron-electron repulsion in a HF wavefunction.

The absence of correlation among electrons of opposite spin leads to a number of undesirable features of the Hartree-Fock approximation. Chief among them is the inability of most closed-shell Hartree-Fock wavefunctions to properly describe dissociation, and hence potential energy curves at longer bond lengths. There are problems with even the simplest of all molecules, H$_2$. The ground state of H$_2$ has a singlet wavefunction (symmetric space part, antisymmetric spin part)

$$\Psi = \varphi_1(1)\varphi_1(2)\,\frac{1}{\sqrt{2}}\,[\alpha(1)\beta(2) - \beta(1)\alpha(2)]$$

with

$$\varphi_1(1) = (2+2S_{AB})^{-1/2}\{1s_A(1) + 1s_B(1)\} \quad \text{and} \quad \varphi_1(2) = (2+2S_{AB})^{-1/2}\{1s_A(2) + 1s_B(2)\}$$

within a minimal basis set framework.
Ignoring all square roots and the spin functions, we find (by multiplying out the terms) that the spatial part of the wavefunction takes the form

$$\Psi \sim \varphi_1(1)\varphi_1(2) = \{1s_A(1) + 1s_B(1)\}\{1s_A(2) + 1s_B(2)\} = 1s_A(1)1s_A(2) + 1s_A(1)1s_B(2) + 1s_B(1)1s_A(2) + 1s_B(1)1s_B(2)$$

The first term in this expression has both electrons on atom A (corresponding to the electronic configuration H$_A^-$); the last term has both electrons on atom B (corresponding to H$_B^-$); the two central terms have one electron on A and one on B. The two central terms are covalent, but both the first and the last terms are ionic. Thus, this wavefunction has 50% covalent and 50% ionic character, which does not seem right for a "covalently" bound molecule like H$_2$. Furthermore, at large internuclear separations, the wavefunction for H$_2$ must represent two neutral hydrogen atoms and hence have the "purely covalent" form

$$\Psi_{cov} = \{1s_A(1)1s_B(2) + 1s_B(1)1s_A(2)\}\,\frac{1}{\sqrt{2}}\,[\alpha(1)\beta(2) - \beta(1)\alpha(2)]$$

which cannot be written as a single determinant! So the basic HF wavefunction (single determinant) can never be correct at long distances, and it will always have problems describing bond-breaking processes correctly. Although inclusion of some ionic character in the H$_2$ ground state wavefunction does lower the energy relative to a purely covalent description, a 50% admixture is far too much and contributes to the Hartree-Fock energy always being too high (and binding energies consequently too low).

In most cases, the equilibrium geometries of molecules are reproduced well by Hartree-Fock theory, but dissociation energies are underestimated. Energies of transition state structures, which often involve partially broken bonds and unusual bonding arrangements, are typically overestimated by Hartree-Fock theory, leading to overestimation of activation energy barriers. Vibrational frequencies, in particular bond stretches, are mostly overestimated by Hartree-Fock theory, often by 10-15%.

Configuration Interaction

Improvements to the theory may be achieved by using a linear combination of Slater determinants, each Slater determinant describing an electronic configuration different from the (ground state) Hartree-Fock determinant.
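Returning to the H2 example, the following is a minimal sketch of how mixing in a second configuration can remove the excess ionic character. It assumes, beyond what the text states, an antibonding orbital 1s_A - 1s_B as the only virtual orbital of the minimal basis, ignores overlap and normalization (as the text does), and uses the squared expansion coefficients as a crude measure of ionic weight. The names `ionic_fraction` and `two_config` are illustrative only.

```python
import numpy as np

# A spatial two-electron function is stored as a 2x2 matrix C, meaning
# sum_pq C[p, q] * chi_p(1) * chi_q(2) with chi_0 = 1s_A and chi_1 = 1s_B.
# Diagonal entries (AA, BB) are ionic terms, off-diagonal (AB, BA) covalent.
sigma_g = np.array([1.0, 1.0])    # 1s_A + 1s_B, the occupied (bonding) MO
sigma_u = np.array([1.0, -1.0])   # 1s_A - 1s_B, assumed antibonding virtual MO

def ionic_fraction(c):
    """Share of the squared coefficients that sits on the ionic terms."""
    weights = c ** 2
    return (weights[0, 0] + weights[1, 1]) / weights.sum()

# Hartree-Fock configuration: both electrons in sigma_g.
hf = np.outer(sigma_g, sigma_g)
# Doubly excited configuration: both electrons in sigma_u.
doubly_excited = np.outer(sigma_u, sigma_u)

def two_config(lam):
    """Two-configuration function hf - lam * doubly_excited (unnormalized)."""
    return hf - lam * doubly_excited

print("HF ionic fraction:", ionic_fraction(hf))
for lam in (0.25, 0.5, 1.0):
    print("lambda =", lam, "ionic fraction:", ionic_fraction(two_config(lam)))
```

In this sketch a mixing parameter of 1 cancels the ionic terms completely and leaves the purely covalent form, while intermediate values leave a reduced ionic share. In an actual calculation the mixing coefficient would come from the variational procedure, not be chosen by hand.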

Let us assume we have carried out a Hartree-Fock calculation for the electronic ground state of some closed-shell (molecular) system with N basis functions and n electrons. We thus have the n/2 molecular orbitals of lowest energy doubly occupied and (N - n/2) molecular orbitals (the virtuals) unoccupied. In the following, the occupied orbitals are denoted by indices i, j, k, etc., and the virtuals by indices a, b, c, etc. We construct new determinants by replacing occupied orbitals with (formerly) unoccupied orbitals, corresponding to configurations which are electronically excited relative to the ground state. A single substitution, $i \to a$, produces a singly excited configuration state function, $\Psi_i^a$; two substitutions, $i \to a$ and $j \to b$, produce a doubly excited configuration, $\Psi_{ij}^{ab}$, and so on. We now write an improved wavefunction as

$$\Psi = \sum_{s'} a_{s'} \Psi_{s'} = a_0 \Psi_0 + \sum_{s} a_s \Psi_s = a_0 \Psi_{HF} + \sum_{i,a} a_i^a \Psi_i^a + \sum_{i<j,\,a<b} a_{ij}^{ab} \Psi_{ij}^{ab} + \dots$$

where $\Psi_0 = \Psi_{HF}$ is the original ground state Hartree-Fock wavefunction and $\Psi_s$ (s = 1, 2, 3, ...) is a Slater determinant for an excited configuration. The unknown coefficients, $a_s$, are determined by the linear variation principle from the solution of a set of secular equations

$$\sum_{s} (H_{st} - E_i \delta_{st})\, a_s = 0, \qquad t = 0, 1, 2, 3, \dots$$

and hence a secular determinant

$$|H_{st} - E_i \delta_{st}| = 0$$

The various configurations are mutually orthogonal ($\langle \Psi_0 | \Psi_i^a \rangle = \langle \Psi_0 | \Psi_{ij}^{ab} \rangle = \langle \Psi_i^a | \Psi_{jk}^{bc} \rangle = 0$; $\langle \Psi_i^a | \Psi_j^b \rangle = \delta_{ij}\delta_{ab}$ and $\langle \Psi_{ij}^{ab} | \Psi_{kl}^{cd} \rangle = \delta_{ik}\delta_{jl}\delta_{ac}\delta_{bd}$, and so on), hence the presence of the delta functions in the secular equations. $H_{st}$ is an "interaction" matrix element between two configurations,

$$H_{st} = \langle \Psi_s | H | \Psi_t \rangle$$

Since the Hamiltonian includes only one- and two-electron operators, all matrix elements $H_{st}$ involving two configurations $\Psi_s$ and $\Psi_t$ which differ by three molecular orbitals or more are automatically zero.

The lowest root of the determinant provides the (improved) energy of the electronic ground state; the second lowest root is an estimate of the energy of the lowest electronically excited state; and so on. Substitution of a particular root back into the secular equations provides the appropriate set of determinantal coefficients for the state in question. The improved wavefunction is a linear combination of configuration state functions which have been allowed to interact with each other: a configuration interaction wavefunction.

If we consider the form of the configuration interaction Hamiltonian matrix, we find that along the diagonal we have the energies of the pure configurations, $E_0 = E_{HF} = \langle \Psi_0 | H | \Psi_0 \rangle$, $E_i^a = \langle \Psi_i^a | H | \Psi_i^a \rangle$, $E_{ij}^{ab} = \langle \Psi_{ij}^{ab} | H | \Psi_{ij}^{ab} \rangle$, and so on. The off-diagonal elements are the interaction elements between the ground state configuration, $\Psi_0$, and the singly excited configurations, $\Psi_i^a$; between the ground state configuration, $\Psi_0$, and the doubly excited configurations, $\Psi_{ij}^{ab}$; between pairs of singly excited configurations, $\Psi_i^a$ and $\Psi_j^b$; between singly and doubly excited configurations; between pairs of doubly excited configurations; and so on.

The number of possible configurations increases rapidly as the degree of electron excitation (substitution) increases. In practice, it is customary to truncate the extent of substitution and limit the calculation to explicitly include only configurations singly and doubly excited relative to the ground state determinant (CISD, QCISD, or CCSD).
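Because the configurations are orthonormal, the secular equations amount to an ordinary symmetric eigenvalue problem. The sketch below illustrates this with a small, entirely hypothetical 3x3 CI matrix (the numbers are made up for illustration and do not come from any calculation); `solve_ci` is an illustrative helper name, and `numpy.linalg.eigh` is used as a stand-in for the specialized solvers needed for realistic matrix sizes.

```python
import numpy as np

def solve_ci(h):
    """Solve sum_s (H_st - E delta_st) a_s = 0 for a symmetric CI matrix.

    Returns the roots in ascending order and the coefficient vectors
    as columns (column k belongs to root k).
    """
    energies, coeffs = np.linalg.eigh(h)
    return energies, coeffs

# Hypothetical CI matrix in hartree; index 0 plays the role of Psi_0.
H = np.array([
    [-1.00,  0.05,  0.10],
    [ 0.05, -0.40,  0.02],
    [ 0.10,  0.02, -0.20],
])

energies, coeffs = solve_ci(H)
ground_energy = energies[0]      # lowest root: improved ground state energy
ground_vector = coeffs[:, 0]     # coefficients a_s of the ground state

print("reference energy H_00:", H[0, 0])
print("lowest CI root       :", ground_energy)
print("weight of Psi_0      :", ground_vector[0] ** 2)
```

The lowest root can never lie above the diagonal element of the reference configuration, which is the variational lowering described in the text; the higher roots serve as estimates of excited state energies.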
Sometimes, estimates are made of what triple excitations might contribute (CISD(T), QCISD(T), CCSD(T)).
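The growth of the configuration space with excitation level can be made concrete with a simple count. The sketch below counts Slater determinants in a spin-orbital picture (2N spin orbitals, n of them occupied in the reference) for a system with 18 electrons and 38 basis functions. It is a raw combinatorial count under the assumption that no spin, spatial symmetry, or frozen-core restrictions are applied, so it is an upper bound and will be larger than estimates that exploit those reductions. The function names are illustrative.

```python
from math import comb

def n_excited(n_electrons, n_basis, level):
    """Determinants differing from the reference by `level` spin orbitals."""
    n_virtual = 2 * n_basis - n_electrons   # unoccupied spin orbitals
    return comb(n_electrons, level) * comb(n_virtual, level)

def n_full_ci(n_electrons, n_basis):
    """All ways to place n electrons in 2N spin orbitals."""
    return comb(2 * n_basis, n_electrons)

n, N = 18, 38
for level, name in enumerate(["reference", "singles", "doubles", "triples", "quadruples"]):
    print(f"{name:>10}: {n_excited(n, N, level):.3e}")
print(f"{'full CI':>10}: {n_full_ci(n, N):.3e}")
```

Summing the counts over all excitation levels reproduces the full CI count, which is why truncation at singles and doubles reduces the problem so drastically.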

[Example: Methanol in a 6-31G(d) basis set has 38 basis functions and 18 electrons. Distributing 18 electrons over 38 orbitals, corresponding to full CI, can be done in approximately $10^{13}$ different ways!]

Inclusion of only singly excited configurations leads to the CIS (configuration interaction, singles) method (also called SECI (singly excited CI) or SCI). However, since

$$H_{0s} = \langle \Psi_0 | H | \Psi_i^a \rangle = \langle \Psi_{HF} | H | \Psi_i^a \rangle = 0$$

always (a result known as "Brillouin's Theorem"), there is no mixing of the singly excited states with the (closed-shell) ground state and hence no lowering of the ground state energy from this type of configuration interaction. $H_{st} = \langle \Psi_s | H | \Psi_t \rangle = \langle \Psi_i^a | H | \Psi_j^b \rangle$ is generally not zero, so this type of calculation provides energies and wavefunctions of the electronically excited states (singlets and/or triplets), but it does not improve the ground state wavefunction or energy beyond its Hartree-Fock description. All triplet states are of course orthogonal to the closed-shell ground state due to the difference in spin multiplicities, so configuration interaction calculations can be carried out for one spin multiplicity at a time.

Inclusion of only doubly excited configurations leads to the CID (configuration interaction, doubles) method (also called DCI). The doubly excited configurations have nonzero matrix elements with the ground state and with each other (i.e., $H_{0s} = \langle \Psi_0 | H | \Psi_{ij}^{ab} \rangle = \langle \Psi_{HF} | H | \Psi_{ij}^{ab} \rangle$ and $H_{st} = \langle \Psi_s | H | \Psi_t \rangle = \langle \Psi_{ij}^{ab} | H | \Psi_{kl}^{cd} \rangle$ are generally not equal to zero), so there will be mixing of the configurations. In particular, the doubly excited states mix into the ground state. The inclusion of doubly substituted determinants in the CI leads to an improved ground state wavefunction and a considerable lowering of the ground state energy.

When both singly and doubly excited configurations are included in the calculation, we have the CISD (configuration interaction, singles and doubles) method (also called SDCI).
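The different roles of singles and doubles can be illustrated with a toy model. The sketch below uses a hypothetical 3x3 matrix with one Hartree-Fock configuration, one single, and one double; the matrix elements are invented for illustration, with the HF-single element set to zero to mimic Brillouin's theorem. `lowest_root` is an illustrative helper that diagonalizes the sub-block spanned by the chosen configurations.

```python
import numpy as np

# Index 0 = Hartree-Fock, 1 = a single excitation, 2 = a double excitation.
# All numbers are hypothetical (hartree).
H = np.array([
    [-1.00,  0.00,  0.10],   # H_0S = 0 (Brillouin), H_0D != 0
    [ 0.00, -0.50,  0.05],   # singles couple to doubles only
    [ 0.10,  0.05, -0.30],
])

def lowest_root(h, configs):
    """Lowest eigenvalue of the CI matrix restricted to `configs`."""
    sub = h[np.ix_(configs, configs)]
    return np.linalg.eigvalsh(sub)[0]

e_hf = H[0, 0]
e_cis = lowest_root(H, [0, 1])       # HF + singles
e_cid = lowest_root(H, [0, 2])       # HF + doubles
e_cisd = lowest_root(H, [0, 1, 2])   # HF + singles + doubles

print("E(HF)  :", e_hf)
print("E(CIS) :", e_cis)
print("E(CID) :", e_cid)
print("E(CISD):", e_cisd)
```

In this model the CIS ground state energy stays at the Hartree-Fock value, the doubles lower it, and adding the singles on top of the doubles lowers it only slightly further, because they reach the ground state only through their coupling to the doubles.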
The singly excited configurations do interact with the doubly excited configurations (i.e., $H_{st} = \langle \Psi_s | H | \Psi_t \rangle = \langle \Psi_i^a | H | \Psi_{jk}^{bc} \rangle$ is generally not equal to zero) and hence do influence, and lower, the energy of the ground state and improve the ground state wavefunction. It is an indirect effect and thus not nearly as large as the effect of the doubles alone.

A CISD calculation may easily include a million (or more) configurations, and special techniques have been developed to extract the lowest eigenvalues and eigenvectors of matrices of that size ("Davidson's method"). One of the largest configuration interaction calculations carried out to date (as of about the year 2000) was a full configuration interaction calculation (i.e. all possible levels of substitution were considered) on water in a double-zeta basis set, which included more than one billion ($10^9$) configurations.

The number of configurations included in a calculation may be reduced via the use of a "window". For example, a "10up-20down" CISD calculation would include all single and double excitations involving the 10 lowest virtual and 20 highest (doubly) occupied orbitals. Such reductions in excitation space are popular in semi-empirical methods ("ZINDO") but should be approached with caution, since the calculated results may depend strongly on the size of the window, in particular when small windows are applied. It is customary, though, in many CI applications to ignore excitations from the core (frozen core approximation) and thus only include excitations of the valence electrons into the virtual orbitals.

The most time-consuming step of a CISD calculation scales approximately as $n^2 N^4$ (n = number of electrons, N = number of basis functions). The "rule of thumb" is that configuration interaction calculations should not be performed unless the computational facilities permit the use of basis sets of at least double-zeta plus polarization function quality; triple-zeta plus diffuse functions on the "atoms of importance" are, of course, even better. [Radial correlation (in-out); angular correlation (left-right)] Thus, configuration interaction calculations on real molecules are always very demanding in terms of computer memory, CPU speed, and disk space.
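To get a feel for what the approximate $n^2 N^4$ scaling implies, the following sketch compares the relative cost of two CISD calculations. It assumes the scaling law holds exactly and ignores prefactors, which is a simplification; `cisd_cost_ratio` and the system sizes are hypothetical.

```python
def cisd_cost_ratio(n1, N1, n2, N2):
    """Cost of calculation 2 relative to calculation 1, assuming n^2 * N^4 scaling."""
    return (n2 ** 2 * N2 ** 4) / (n1 ** 2 * N1 ** 4)

# Same molecule (10 electrons), basis set doubled from 20 to 40 functions.
print("doubling the basis  :", cisd_cost_ratio(10, 20, 10, 40))

# Twice as many electrons and twice as many basis functions.
print("doubling the system :", cisd_cost_ratio(10, 20, 20, 40))
```

Under this assumption, improving the basis set from double-zeta to a larger set on a fixed molecule quickly becomes the dominant cost, which is consistent with the caution expressed above about computational demands.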
Differential correlation energy contributions can strongly influence reaction energies and, in particular, activation energies. Relative stabilities of isomers may also be strongly affected by electron correlation when the isomers show very different bonding patterns (such as single vs. multiple bonds, or lone pairs vs. single bonds). It is still relatively rare that geometries are optimized at the CISD level for a molecule containing more than 3-4 atoms. Instead, the correlation energy calculations are carried out at the (fixed) molecular geometry obtained from a lower level calculation, such as Hartree-Fock with a good basis set. Such calculations are called "single-point" calculations.

Pople developed a simple nomenclature (involving "slashes", /) for reporting the level of a calculation and the geometry used in the calculation. The format is

Method2/Basis2//Method1/Basis1

which means "a single-point calculation using Method2 and Basis2 at the geometry optimized using Method1 and Basis1". For example, CISD/6-311++G**//HF/6-311G** implies performing one configuration interaction calculation including all singly and doubly excited determinants with the 6-311++G** basis set (valence triple-zeta + polarization + diffuse function quality) on a geometry that was obtained by geometry optimization at the Hartree-Fock level with the 6-311G** basis set (valence triple-zeta plus polarization quality).

An estimate of the effect of triple substitutions is available through the QCISD(T) method ("Quadratic CI including Single and Double substitutions", with the effect of Triple substitutions included non-iteratively, i.e. not included to variational convergence). Other popular methods based on configuration interaction include the CCSD ("Coupled Cluster Single and Double substitutions") and CCSD(T) methods. CISD is comparable to CCSD and QCISD(T) is comparable to CCSD(T). CCSD(T) calculations with large basis sets (roughly triple-zeta quality plus several sets of polarization functions) are generally considered to be "state of the art" at this moment; next year may be different.

A "simple" correction formula has been developed by Davidson ("Davidson correction", S. R. Langhoff and E. R. Davidson, Int. J. Quantum Chem., 8, 61 (1974)) to account for the energetic effect of all higher (triple, quadruple, ...) substitutions,

$$E(\text{correction}) = (1 - a_0^2)\,\Delta E_{CISD}$$

where $\Delta E_{CISD} = E_{CISD} - E_{HF}$ is the correlation energy obtained at the CISD level and $a_0$ is the coefficient of the Hartree-Fock configuration in the CISD wavefunction.
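As a small worked illustration of the Davidson correction, the sketch below evaluates the formula for made-up input values. The energies and the coefficient are hypothetical, not results of any calculation, and the function assumes that $a_0$ is taken from a normalized CISD wavefunction; `davidson_correction` is an illustrative name.

```python
def davidson_correction(e_hf, e_cisd, a0):
    """Davidson correction (1 - a0^2) * (E_CISD - E_HF), energies in hartree.

    Assumes a0 is the Hartree-Fock coefficient in a normalized CISD wavefunction.
    """
    return (1.0 - a0 ** 2) * (e_cisd - e_hf)

# Hypothetical values for illustration only.
e_hf = -76.000
e_cisd = -76.200
a0 = 0.97

correction = davidson_correction(e_hf, e_cisd, a0)
print("CISD correlation energy:", e_cisd - e_hf)
print("Davidson correction    :", correction)
print("corrected total energy :", e_cisd + correction)
```

The correction has the same sign as the CISD correlation energy and vanishes as $a_0$ approaches 1, i.e. when the Hartree-Fock configuration already dominates the wavefunction.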