High-level Quantum Chemistry Methods and Benchmark Datasets for Molecules
Markus Schneider
Fritz Haber Institute of the MPS, Berlin, Germany
École Polytechnique Fédérale de Lausanne, Switzerland
Isfahan University of Technology, 11 May 2016
Outline
1 Introduction
2 Levels of Theory
3 Wavefunction-based methods
  - Hartree-Fock
  - Configuration interaction (CI)
  - Coupled-Cluster (CC)
  - DLPNO-CCSD(T)
  - 2nd order Møller-Plesset (MP2)
  - Higher order Møller-Plesset
4 Practical considerations
  - Counterpoise correction
  - Extrapolation schemes and basis sets
5 Benchmark datasets
6 Summary
I. Introduction
- An (often) very basic task: calculate properties of a system!
- System: solids, surfaces, molecules, ...
- Properties: potential energy, free energy, lattice constant, atomic distances, UV spectra, IR spectra, ...
- Example: conformational search of cation-peptide systems in the gas phase
Introduction
- Which quantum chemistry computer program should I use?
- There are many! https://en.wikipedia.org/wiki/list_of_quantum_chemistry_and_solid-state_physics_software
Introduction
- Define your problem and evaluate the choice of software:
  - license (GPL, academic, commercial, ...)
  - support, manual, ...
  - basis set (numeric atomic orbitals (NAO), plane waves (PW), Gaussian type orbitals (GTO), ...)
  - method (semi-empirical, DFT, post-Hartree-Fock, ...)
- The choice is often a compromise between accuracy and computational cost.
- One should verify the accuracy of the method of choice:
  - verify against experiment
  - verify against theory (higher-level method)
Introduction: verification against experiment
- But be careful. Example: AcFA6-Na+ at 4 K
- [Figure: relative energies of the conformers in eV (0.000 to 0.200) and kcal/mol (0.0 to 5.0), computed with PBE0+MBD (tight) and with PBE0+MBD (tight) + ZPE (PBE+MBD)]
- Theory predicts one distinctive global minimum, BUT this conformer is experimentally excluded.
- Who's wrong? Experiment or theory?
II. Levels of Theory - Overview
Hierarchy of methods, ordered from highest to lowest computational cost and (?) accuracy:
- Full CI
- Wavefunction-based methods (CCSD(T), MP2, RPA, ...)
- DFT (xc-functionals: LDA, GGA, hybrids, ...)
- Semi-empirical methods (AM1, PM3, tight-binding, ...)
- Empirical methods (force fields)
Levels of Theory: Timings
- Qualitative example: Phenylalanine + Ca2+
- Timings depend on basis set, integration grid, implementation, ...
Levels of Theory: Empirical Force Field
Potential energy function:
  $E_{pot} = E_{bond} + E_{angle} + E_{torsion} + E_{Coulomb} + E_{vdW}$
- Bonded terms, e.g.:
  $E_{bond} = \sum_{i<j} K^{b}_{ij} (r_{ij} - r^{0}_{ij})^2$
- Non-bonded terms:
  $E_{Coulomb} = \sum_{i<j} \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}$
  $E_{vdW} = \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right]$
- Cations are always non-bonded!
- Examples: OPLS, CHARMM22
- Parameters are derived from specific data sets (with experimental or QM-derived properties) by a fitting algorithm.
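The individual force-field terms above can be sketched in a few lines of Python. This is only an illustration, not the implementation of OPLS or CHARMM22: the function names are hypothetical, and the Coulomb prefactor assumes charges in units of e, distances in Å and energies in kcal/mol.

```python
# Assumption: charges in e, distances in Angstrom, energies in kcal/mol.
# 332.06 is the usual value of 1/(4*pi*eps0) in these units.
COULOMB_CONST = 332.06


def bond_energy(k_b, r, r0):
    """Harmonic bond term K_b * (r - r0)^2 for one bonded pair."""
    return k_b * (r - r0) ** 2


def coulomb_energy(q_i, q_j, r):
    """Coulomb interaction of two fixed point charges at distance r."""
    return COULOMB_CONST * q_i * q_j / r


def vdw_energy(a_ij, b_ij, r):
    """Lennard-Jones type term A/r^12 - B/r^6 for one non-bonded pair."""
    return a_ij / r ** 12 - b_ij / r ** 6


def nonbonded_energy(pairs):
    """Sum Coulomb and vdW terms over (q_i, q_j, a_ij, b_ij, r) tuples."""
    return sum(coulomb_energy(q_i, q_j, r) + vdw_energy(a, b, r)
               for q_i, q_j, a, b, r in pairs)
```

Because the charges enter as fixed numbers, a cation described this way keeps the same charge in every conformer.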
Levels of Theory: Empirical Force Field
- Example: Histidine + Zn2+ + H2O
- [Figure: force-field global minimum vs. DFT global minimum; histogram of the Hirshfeld charge of the Zn2+ cation over the structure data set (count vs. Hirshfeld charge, roughly 0.5 to 0.8)]
- The spread of Hirshfeld charges contradicts the idea of fixed point charges.
- Possible solutions: polarizable force fields (e.g. AMOEBA), or charges buffered or scaled by a factor.
Levels of Theory: Density-functional theory (DFT)
- Covered in this workshop: talks by Sergey Levchenko ("DFT in Practice") and Weitao Yang ("DFT and the Exchange-Correlation Functional"), and practical session 1 last Tuesday.
- Solve the Kohn-Sham equations self-consistently:
  $\left[ -\frac{\hbar^2}{2m} \nabla^2 + \tilde{V}(\mathbf{r}) \right] \phi_i(\mathbf{r}) = \varepsilon_i \phi_i(\mathbf{r})$
  with non-interacting Kohn-Sham orbitals $\phi_i(\mathbf{r})$ that fulfill
  $\sum_{i=1}^{N} |\phi_i(\mathbf{r})|^2 = n(\mathbf{r})$
  and the effective potential
  $\tilde{V} = V_{ext}(\mathbf{r}) + e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d^3r' + V_{xc}[n(\mathbf{r})]$
- Density-functional theory (DFT) is an exact method, but the density-functional approximation (DFA) is not.
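Levels of Theory: Density-functional approximations
- Local-density approximation (LDA):
  $E^{LDA}_{xc} = \int \varepsilon_{xc}(n)\, n(\mathbf{r})\, d^3r$
- Generalized gradient approximation (GGA):
  $E^{GGA}_{xc} = \int \varepsilon_{xc}(n, \nabla n)\, n(\mathbf{r})\, d^3r$
- Hybrid functionals include the Hartree-Fock exact exchange $E^{HF}_{x}$:
  $E^{HF}_{x} = -\frac{1}{2} \sum_{i,j} \iint \phi_i^*(\mathbf{r}) \phi_j(\mathbf{r}) \frac{1}{|\mathbf{r} - \mathbf{r}'|} \phi_i(\mathbf{r}') \phi_j^*(\mathbf{r}')\, d^3r\, d^3r'$
- Example, PBE0 (with $\alpha = 1/4$):
  $E^{PBE0}_{xc} = \alpha E^{HF}_{x} + (1 - \alpha) E^{PBE}_{x} + E^{PBE}_{c}$
- Open question: does higher computational cost always mean higher accuracy?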
III. Wavefunction-based methods
Wavefunction-based methods: Quick recap
- An n-electron system is described by the Schrödinger equation in the Born-Oppenheimer approximation (nuclei fixed, or treated classically and separately from the electrons):
  $H^{BO} \psi = E \psi$
  $H^{BO} = -\sum_{i=1}^{n} \frac{1}{2} \nabla_i^2 - \sum_{i=1}^{n} \sum_{j=1}^{N} \frac{Z_j}{|\mathbf{r}_i - \mathbf{R}_j|} + \frac{1}{2} \sum_{i \neq j}^{n} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}$
- An analytical solution is impossible, so numerical techniques are used.
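Wavefunction-based methods: Hartree-Fock
- n quasi-independent electrons are described by an (antisymmetric) Slater determinant:
  $\psi(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_n) = \frac{1}{\sqrt{n!}} \begin{vmatrix} \phi_1(\mathbf{r}_1) & \phi_2(\mathbf{r}_1) & \dots & \phi_n(\mathbf{r}_1) \\ \phi_1(\mathbf{r}_2) & \phi_2(\mathbf{r}_2) & \dots & \phi_n(\mathbf{r}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_1(\mathbf{r}_n) & \phi_2(\mathbf{r}_n) & \dots & \phi_n(\mathbf{r}_n) \end{vmatrix}$
- Variational principle: the ground state energy satisfies $E_0 \le E_{HF} = \langle \psi_{HF} | H | \psi_{HF} \rangle$.
- Optimize the orbitals with respect to $E_{HF}$, with
  $E_{HF} = \sum_{i=1}^{n} \left\langle \phi_i \left| -\frac{1}{2} \nabla_i^2 - \sum_{j=1}^{N} \frac{Z_j}{|\mathbf{r}_i - \mathbf{R}_j|} \right| \phi_i \right\rangle + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (J_{ij} - K_{ij})$
- $J_{ij}$, Coulomb term: Coulomb repulsion between electrons
- $K_{ij}$, exchange term: no classical counterpart; corrects for the self-interaction $J_{ii}$ of the electrons
- The self-consistent solution yields $E_{HF}$ and $\psi_{HF}$.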
Wavefunction-based methods: Hartree-Fock
- The Hartree-Fock method is size consistent.
- The Hartree-Fock method is a mean-field approximation; in reality, electrons are correlated.
- Remaining correlation error: $E_{corr} = E_{exact} - E_{HF}$
- Exact energy = Hartree-Fock + correlation energy = mean field + instantaneous electron-electron interaction
- The correlation energy $E_{corr}$ is only about 1% of $E_{exact}$, but it is chemically significant.
- How to calculate $E_{corr}$? Use HF as the starting point and apply a post-HF wavefunction-based method on top.
Wavefunction-based methods: Configuration interaction (CI)
- Post-Hartree-Fock linear variational method
- Instead of one Slater determinant, use a linear combination of Slater determinants:
  $|\Psi\rangle = c_0 |\Psi_{HF}\rangle + \sum_{i,a} c_i^a |\Psi_i^a\rangle + \sum_{i<j}^{occ.} \sum_{a<b}^{unocc.} c_{ij}^{ab} |\Psi_{ij}^{ab}\rangle + \dots$
- $\Psi_{HF}$: HF Slater determinant
- $\Psi_i^a$: determinant with one occupied orbital replaced by a virtual orbital
- $\Psi_{ij}^{ab}$: determinant with two occupied orbitals replaced by two virtual ones
- Full CI: take into account all possible Slater determinants obtained by exciting all possible electrons to all possible virtual orbitals.
Wavefunction-based methods: Configuration interaction (CI)
Truncation of the expansion:
- CIS: configuration interaction singles
- CID: configuration interaction doubles
- CISD: configuration interaction singles and doubles
- [Figure: orbital occupation diagrams of the HF reference and of example singly and doubly excited determinants]
Wavefunction-based methods: Configuration interaction (CI)
- Problem: truncated CI is not size-consistent.
- Slow convergence with excitation level: many configurations must be taken into account in order to approximate the exact energy.
- Full CI + infinite basis set = exact solution of the Schrödinger equation
- The full CI method scales exponentially and is suitable only for very small systems.
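To make the exponential scaling concrete, the number of determinants in a full CI expansion can be counted directly. The sketch below assumes a fixed number of alpha and beta electrons distributed over the spatial orbitals and ignores any reduction by spatial symmetry; the function name is hypothetical.

```python
from math import comb


def fci_determinants(n_orbitals, n_alpha, n_beta):
    """Number of Slater determinants in full CI.

    Every way of placing the alpha electrons in the spatial orbitals can be
    combined with every way of placing the beta electrons.
    """
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


# H2 in a minimal basis: 2 orbitals, 1 alpha and 1 beta electron.
h2_minimal = fci_determinants(2, 1, 1)

# Ten electrons (5 alpha, 5 beta) in 24 orbitals, e.g. a water-sized
# problem in a double-zeta basis, with all electrons correlated.
ten_electrons = fci_determinants(24, 5, 5)
```

The second case already gives about 1.8 billion determinants, which is why full CI is restricted to very small systems.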
Wavefunction-based methods: Coupled-Cluster (CC)
- Again, replace occupied orbitals in the HF Slater determinant by virtual orbitals (so-called excitations):
  $|\Psi_{CC}\rangle = e^{T} |\Psi_{HF}\rangle = e^{T_1 + T_2 + T_3 + \dots} |\Psi_{HF}\rangle$
  $T_1 = \sum_{i,a} t_i^a\, a_a^{+} a_i, \quad T_2 = \frac{1}{4} \sum_{i,j,a,b} t_{ij}^{ab}\, a_a^{+} a_b^{+} a_i a_j, \quad T_3 = \dots$
- $i, j$: occupied (internal) HF orbitals
- $a, b$: canonical unoccupied (external; virtual) orbitals
- $t_i^a, t_{ij}^{ab}$: singles and doubles wavefunction amplitudes (CCSD), to be determined
- $a^{+}, a$: creation and annihilation operators
- Total energy:
  $E = \langle \Psi_{HF} | e^{-T} H^{BO} e^{T} | \Psi_{HF} \rangle = E_{HF} + E_{corr}, \quad E_{HF} = \langle \Psi_{HF} | H^{BO} | \Psi_{HF} \rangle$
Wavefunction-based methods: Coupled-Cluster (CC)
- Do the math: the singles and doubles residuals determine the wavefunction amplitudes $t_i^a, t_{ij}^{ab}$:
  $R_i^a = \langle \Psi_i^a | e^{-T} H e^{T} | \Psi_{HF} \rangle = 0$
  $R_{ij}^{ab} = \langle \Psi_{ij}^{ab} | e^{-T} H e^{T} | \Psi_{HF} \rangle = 0$
  with $|\Psi_i^a\rangle = a_a^{+} a_i |\Psi_{HF}\rangle$ and $|\Psi_{ij}^{ab}\rangle = a_a^{+} a_b^{+} a_i a_j |\Psi_{HF}\rangle$
- This is a set of non-linear equations; the iterative solution gives the wavefunction amplitudes $t_i^a, t_{ij}^{ab}$.
- Currently being implemented in FHI-aims
Wavefunction-based methods: Configuration interaction (CI) vs. Coupled-Cluster (CC)
- Configuration interaction (CI): $|\Psi_{CI}\rangle = (1 + T_1 + T_2 + \dots) |\Psi_{HF}\rangle$
- Coupled-cluster (CC): $|\Psi_{CC}\rangle = e^{T_1 + T_2 + \dots} |\Psi_{HF}\rangle$
- Example: consider only doubles
  - CI: $|\Psi_{CID}\rangle = (1 + T_2) |\Psi_{HF}\rangle$
  - CC: $|\Psi_{CCD}\rangle = e^{T_2} |\Psi_{HF}\rangle = \left( 1 + T_2 + \frac{T_2^2}{2!} + \dots \right) |\Psi_{HF}\rangle$
- $T_2$: doubles excitation operator (one electron pair)
- $T_2^2$: simultaneous doubles excitation of two non-interacting electron pairs
- Truncated CI is not size consistent; CC is size consistent.
Wavefunction-based methods: Coupled-Cluster (CC)
- CCSD errors are often larger than chemical accuracy (about 1 kcal/mol).
- CCSDT scales as $O(N^8)$.
- Instead use CCSD(T), i.e. a perturbative triples correction based on the wavefunction amplitudes obtained from CCSD; it scales as $O(N^7)$.
- CCSD(T) is often very accurate: the "gold standard" of quantum chemistry.
- A sufficiently large basis set is required.
- Still restricted to rather small systems
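A quick way to read these scaling laws is to ask how much more expensive a calculation becomes when the system size grows by a given factor. The sketch below is only a formal estimate that ignores prefactors and implementation details. The exponents for CCSD(T), CCSDT and MP2 are the ones quoted on these slides; the exponent 6 for CCSD is the standard textbook value and is an addition here.

```python
# Formal scaling exponents O(N^p); CCSD = 6 is not stated on the slides.
SCALING_EXPONENTS = {"MP2": 5, "CCSD": 6, "CCSD(T)": 7, "CCSDT": 8}


def cost_factor(method, size_ratio):
    """Factor by which the formal cost grows when N grows by size_ratio."""
    return size_ratio ** SCALING_EXPONENTS[method]


# Doubling the system size:
factors_for_doubling = {m: cost_factor(m, 2) for m in SCALING_EXPONENTS}
```

Doubling the system makes a CCSD(T) calculation formally 128 times more expensive, compared with 32 times for MP2.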
Wavefunction-based methods: DLPNO-CCSD(T)
- How can the computational cost of CCSD(T) be reduced?
- Reference: C. Riplinger, F. Neese, J. Chem. Phys., 138, 034106 (2013).
- Localization, followed by a truncation method and/or tail estimation
- $E_{corr}$ = sum of individual pair correlation energies, which motivates the strong pair approximation
Wavefunction-based methods: DLPNO-CCSD(T)
- The external space (virtual orbitals) is typically one order of magnitude larger than the internal space (occupied orbitals): this is the bottleneck.
- Truncation scheme, based on the occupation number
- The virtual orbital representation is not fixed: one can transform between different virtual orbital sets, e.g.
  $t_i^a = \sum_{r} Q_{ii}^{ar}\, t_i^r, \quad t_{ij}^{ab} = \sum_{r,s} Q_{ij}^{ar}\, t_{ij}^{rs}\, Q_{ij}^{bs}$
- $i, j$: occupied (internal) HF orbitals
- $a, b$: canonical virtual (external) orbitals
- $r, s$: virtual (external) orbitals with respect to the new virtual basis
- Domain approximation: restrict the virtual space to a subset (domain) of the orbitals for each $i$ or pair $ij$:
  $t_i^a = \sum_{r \in [i]} Q_{ii}^{ar}\, t_i^r, \quad t_{ij}^{ab} = \sum_{r,s \in [i,j]} Q_{ij}^{ar}\, t_{ij}^{rs}\, Q_{ij}^{bs}$
Wavefunction-based methods: DLPNO-CCSD(T)
Which new virtual basis of orbitals?
- Canonical Molecular Orbitals (MOs)? No: too chaotic a space, not local.
- Projected Atomic Orbitals (PAOs)? Yes, but PAOs are maximally localized: is that an accurate description of the external space?
- Pair Natural Orbitals (PNOs)? Yes!
  - constructed from the so-called virtual pair density
  - localized in the same region of space as the internal electron pair, but with a certain amount of delocalization
  - excitations are only allowed into the respective local domains
- L(ocal)PNO approach = strong pair approximation + PNOs
Wavefunction-based methods: DLPNO-CCSD(T)
- The LPNO method expands the PNOs in terms of virtual MOs: the PNOs are local, but the expansion is not. It scales as $O(N^5)$.
- Instead, expand the PNOs in terms of PAOs: D(omain-based)LPNO-CCSD(T).
- Scales almost linearly
- Implemented in ORCA
Wavefunction-based methods: 2nd order Møller-Plesset perturbation theory (MP2)
- Perturbative treatment: $H = H_0 + \lambda V$
  - $H_0$: unperturbed Hamiltonian whose solution is known; here: Hartree-Fock
  - $\lambda V$: small perturbation; here: correlation potential
- MP2 correlation energy:
  $E_{MP2} = \sum_{i,j,a,b} \frac{\langle \phi_i(1) \phi_j(2) | \frac{1}{r_{12}} | \phi_a(1) \phi_b(2) \rangle \left[ 2 \langle \phi_a(1) \phi_b(2) | \frac{1}{r_{12}} | \phi_i(1) \phi_j(2) \rangle - \langle \phi_a(1) \phi_b(2) | \frac{1}{r_{12}} | \phi_j(1) \phi_i(2) \rangle \right]}{\epsilon_i + \epsilon_j - \epsilon_a - \epsilon_b}$
- $\phi_{i,j}$: occupied orbitals
- $\phi_{a,b}$: unoccupied (virtual) orbitals
- $\epsilon_{i,j,a,b}$: corresponding orbital energies
- Problematic for systems with a small HOMO-LUMO gap (the denominator becomes small)
- Scales as $O(N^5)$
- Implemented in FHI-aims
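Once the two-electron integrals and orbital energies are available, the MP2 sum above is a plain quadruple loop. The sketch below assumes a closed-shell system, real orbitals, and integrals stored as a nested list `g[i][j][a][b]` holding the value of <ij|1/r12|ab>; the function name and the storage layout are illustrative assumptions, not the FHI-aims implementation.

```python
def mp2_energy(g, eps_occ, eps_virt):
    """Closed-shell MP2 correlation energy from <ij|ab> integrals.

    g[i][j][a][b] : two-electron integral <ij|1/r12|ab> (real orbitals)
    eps_occ       : orbital energies of the occupied orbitals
    eps_virt      : orbital energies of the virtual orbitals
    """
    energy = 0.0
    for i, e_i in enumerate(eps_occ):
        for j, e_j in enumerate(eps_occ):
            for a, e_a in enumerate(eps_virt):
                for b, e_b in enumerate(eps_virt):
                    direct = g[i][j][a][b]
                    # For real orbitals <ab|ji> equals <ji|ab>.
                    exchange = g[j][i][a][b]
                    denominator = e_i + e_j - e_a - e_b
                    energy += direct * (2.0 * direct - exchange) / denominator
    return energy


# Toy input: one occupied and one virtual orbital (made-up numbers).
toy_energy = mp2_energy([[[[0.1]]]], eps_occ=[-0.5], eps_virt=[0.5])
```

The loop makes the small-gap problem visible: as the occupied and virtual orbital energies approach each other, the denominator goes to zero and individual terms blow up. The evaluation itself is only an $O(N^4)$ loop; the $O(N^5)$ cost comes from transforming the integrals to the molecular orbital basis.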
Wavefunction-based methods: Higher order Møller-Plesset perturbation theory
- The performance of MP3 is very often worse than that of MP2.
- Computational costs increase with higher order.
- The Møller-Plesset series can diverge for higher orders n (Olsen et al.; J. Chem. Phys., 2000, 112, pp 9736-9748).
IV. Practical considerations - Counterpoise correction
- In weakly bound clusters the intermolecular interaction is artificially strengthened: basis set superposition error (BSSE).
- Monomer A approaches monomer B: monomer A borrows basis functions from monomer B, and vice versa, so the dimer is artificially stabilized.
- Problem: inconsistent treatment at short distances (extra basis functions available) and long distances (extra basis functions not available)
- In the complete basis-set limit, BSSE goes to 0.
- Always apply the counterpoise correction (CP): S. F. Boys, F. Bernardi; Mol. Phys., 1970, 19, 553-566
Practical considerations: Counterpoise correction
- Uncorrected interaction energy $E_{int}(AB)$:
  $E_{int}(AB) = E_{AB}(AB) - E_{A}(A) - E_{B}(B)$
  Subscripts denote the basis, parentheses denote the system (here: assuming rigid conformers; often a reasonable approximation).
- Estimation of the basis set superposition error (BSSE):
  $E_{BSSE}(A) = E_{AB}(A) - E_{A}(A)$
  $E_{BSSE}(B) = E_{AB}(B) - E_{B}(B)$
  - $E_{AB}(A) < E_{A}(A)$, so $E_{BSSE}(A) < 0$ (the error is stabilizing)
  - $E_{AB}(B) < E_{B}(B)$, so $E_{BSSE}(B) < 0$ (the error is stabilizing)
- Example: to calculate $E_{AB}(A)$, include only the basis functions of monomer B, not its atoms (electrons and nuclear charges): ghost atoms.
- Counterpoise-corrected interaction energy:
  $E^{CP}_{int}(AB) = E_{AB}(AB) - E_{AB}(A) - E_{AB}(B)$
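The bookkeeping of the five energies involved is easy to get wrong, so here is a minimal sketch in Python. The function and argument names are hypothetical, and the five total energies are assumed to come from separate calculations with any quantum chemistry code (two of them with ghost atoms).

```python
def counterpoise(e_ab_dimer, e_a_own, e_b_own, e_a_ghost, e_b_ghost):
    """Counterpoise analysis for a dimer AB with rigid monomers.

    e_ab_dimer : E_AB(AB), dimer in the dimer basis
    e_a_own    : E_A(A), monomer A in its own basis
    e_b_own    : E_B(B), monomer B in its own basis
    e_a_ghost  : E_AB(A), monomer A in the dimer basis (B as ghost atoms)
    e_b_ghost  : E_AB(B), monomer B in the dimer basis (A as ghost atoms)
    """
    uncorrected = e_ab_dimer - e_a_own - e_b_own
    bsse_a = e_a_ghost - e_a_own
    bsse_b = e_b_ghost - e_b_own
    corrected = e_ab_dimer - e_a_ghost - e_b_ghost
    return {
        "uncorrected": uncorrected,
        "bsse": bsse_a + bsse_b,
        "corrected": corrected,
    }


# Made-up energies in arbitrary units, only to show the bookkeeping.
result = counterpoise(-10.0, -4.0, -5.0, -4.1, -5.2)
```

By construction, the corrected interaction energy equals the uncorrected one minus the total BSSE, so a negative (stabilizing) BSSE makes the corrected binding weaker.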
Practical considerations: Extrapolation schemes and basis sets
- Problem for wavefunction-based methods:
  - they require a large basis set
  - convergence with basis set size is slow
  - but computational costs are high
- Solution: use an extrapolation scheme
  - calculate for smaller basis sets
  - extrapolate to basis-set completeness
Practical considerations: Extrapolation schemes and basis sets
- Systematic basis sets are required, i.e. basis sets designed to converge systematically to the complete-basis-set (CBS) limit using empirical extrapolation techniques.
- In contrast, for e.g. DFT moderately sized basis sets are enough to reach the complete basis set limit.
  - e.g. numeric atom-centered basis functions in FHI-aims: minimal, tier1, tier2, tier3, ...
  - e.g. Pople basis sets: 6-31G*, 6-311G*, ...
    - widely used (HF, DFT, ...)
    - mostly used for light elements
    - not too systematic
Practical considerations: Extrapolation schemes and basis sets
- Systematic basis sets are required, e.g. the correlation-consistent basis sets (Dunning et al.; J. Chem. Phys., 1989, 90 (2), pp 1007-1023)
- Denoted cc-pVNZ (correlation-consistent polarized valence N-zeta):
  - cc-pVDZ: double-zeta
  - cc-pVTZ: triple-zeta
  - cc-pVQZ: quadruple-zeta
  - cc-pV5Z: quintuple-zeta
  - cc-pCVNZ: including core polarization (e.g. for cations where the treatment of semi-core states is important)
  - aug-cc-pVNZ: augmented versions with added diffuse functions
- Systematic, popular, performance well known
- Use for MP2 and coupled-cluster; not for HF and DFT.
Practical considerations: Extrapolation schemes and basis sets
- e.g. Ahlrichs/Karlsruhe basis sets (Phys. Chem. Chem. Phys., 2005, 7, pp 3297-3305):
  - def2-SVP
  - def2-TZV: valence triple-zeta
  - def2-TZVP: valence triple-zeta plus polarization
  - def2-TZVPP: valence triple-zeta plus polarization (doubly polarized)
  - def2-QZVPP: valence quadruple-zeta plus polarization (doubly polarized)
  - also efficient for HF, DFT
  - use the doubly polarized versions (PP) for MP2, coupled-cluster, ...
- In FHI-aims: NAO-VCC-nZ (Numeric Atom-centered Orbitals - Valence Correlation Consistent n-zeta) (Igor Ying Zhang et al.; New J. Phys., 2013, 15, 123033)
  - use for MP2, RPA, GW, ...
  - constructed following Dunning's correlation-consistent recipe
Practical considerations: Extrapolation schemes
- Simple two-point extrapolation scheme by Halkier et al. (Chem. Phys. Lett., 1998, 286, 243):
  $E_{CBS} = \frac{E[n_1]\, n_1^3 - E[n_2]\, n_2^3}{n_1^3 - n_2^3}$
- $n_1, n_2$: basis set cardinal numbers
  - n = 3: triple-zeta
  - n = 4: quadruple-zeta
- The scheme is used for the correlation energy, and sometimes also for the total energy.
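The two-point formula translates directly into code. The sketch below uses a hypothetical function name and checks itself on synthetic energies that follow an exact $n^{-3}$ law, for which the extrapolation must recover the limit.

```python
def extrapolate_two_point(e_n1, n1, e_n2, n2):
    """Two-point n^-3 extrapolation to the complete-basis-set limit.

    e_n1, e_n2 : energies obtained with cardinal numbers n1 and n2
    """
    return (e_n1 * n1 ** 3 - e_n2 * n2 ** 3) / (n1 ** 3 - n2 ** 3)


# Synthetic data (not real calculations): E(n) = -1.0 + 0.5 / n^3.
e_tz = -1.0 + 0.5 / 3 ** 3
e_qz = -1.0 + 0.5 / 4 ** 3
e_cbs = extrapolate_two_point(e_tz, 3, e_qz, 4)
```

The formula is symmetric in the two basis sets, so the order in which the triple-zeta and quadruple-zeta results are passed does not matter.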
Practical considerations: Extrapolation schemes
- More precise: the correlation energy $E_{corr}$ and the SCF energy $E_{SCF}$ behave differently with basis set size, so use different extrapolation schemes for each, e.g.:
- SCF extrapolation scheme by Karton and Martin (Theor. Chem. Acc., 2006, 115, pp 330-333):
  $E^{n}_{SCF} = E^{CBS}_{SCF} + A e^{-\alpha \sqrt{n}}$
  - $A$, $\alpha$ and the CBS-extrapolated energy $E^{CBS}_{SCF}$: 3 parameters to be determined by least-squares fitting
  - needs at least 3 different basis set calculations, e.g. n = T/Q/5
- Correlation energy extrapolation scheme by Truhlar (Chem. Phys. Lett., 1998, 294, pp 45-48):
  $E^{n}_{corr} = E^{CBS}_{corr} + B n^{-\beta}$
  - $B$, $\beta$ and the CBS-extrapolated energy $E^{CBS}_{corr}$: 3 parameters to be determined by least-squares fitting
  - needs at least 3 different basis set calculations, e.g. n = T/Q/5
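With exactly three basis sets the three-parameter fit becomes an exact solve. The sketch below does this for the power-law form of the correlation energy, $E^{n}_{corr} = E^{CBS}_{corr} + B n^{-\beta}$, using bisection on $\beta$. It assumes monotonic convergence of the three energies; the function name and the search interval for $\beta$ are arbitrary choices made for this illustration, not taken from the cited papers.

```python
def fit_power_law(e1, e2, e3, cardinals=(3, 4, 5),
                  beta_lo=0.1, beta_hi=20.0, tol=1e-12):
    """Solve E_n = E_cbs + B * n**(-beta) exactly through three points.

    Returns (e_cbs, b, beta). The ratio of successive energy differences
    depends only on beta and grows with beta, so beta is found by bisection.
    """
    n1, n2, n3 = cardinals
    target = (e1 - e2) / (e2 - e3)

    def ratio(beta):
        return (n1 ** -beta - n2 ** -beta) / (n2 ** -beta - n3 ** -beta)

    lo, hi = beta_lo, beta_hi
    if not ratio(lo) <= target <= ratio(hi):
        raise ValueError("energies are not compatible with the beta range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    b = (e1 - e2) / (n1 ** -beta - n2 ** -beta)
    e_cbs = e3 - b * n3 ** -beta
    return e_cbs, b, beta


# Synthetic data (not real calculations): E_cbs = -1.0, B = 0.5, beta = 3.
energies = [-1.0 + 0.5 * n ** -3.0 for n in (3, 4, 5)]
e_cbs_fit, b_fit, beta_fit = fit_power_law(*energies)
```

With more than three basis sets, the same model would be fitted in the least-squares sense instead of being solved exactly.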
V. Benchmark datasets
- Purpose:
  - provide highly accurate QM calculations of molecular structures, energies, and properties
  - used as benchmarks for testing
  - used for the parameterization of computational methods
- A collection of datasets forms a database, e.g.:
  - Benchmark Energy and Geometry DataBase (BEGDB) (Řezáč et al.; Collect. Czech. Chem. Commun., 2008, 73, pp 1261-1270), http://www.begdb.com/
  - NIST Computational Chemistry Comparison and Benchmark DataBase (CCCBDB), http://cccbdb.nist.gov/
Benchmark datasets: S22, Noncovalent Complexes
(Jurečka et al.; Phys. Chem. Chem. Phys., 2006, 8, pp 1985-1993)
- Widely popular set of 22 small (< 30 atoms) complexes containing only C, N, O and H, with single, double and triple bonds
- Mix of hydrogen-bonded and dispersion-bonded complexes; typical noncovalent interactions are represented
- Geometry relaxation using counterpoise-corrected gradient optimization
- Several methods applied; smaller complexes optimized with CCSD(T) using cc-pVTZ/cc-pVQZ without counterpoise correction
Benchmark datasets: S66, A Well-balanced Database of Benchmark Interaction Energies Relevant to Biomolecular Structure
(Řezáč et al.; J. Chem. Theory Comput., 2011, 7 (8), pp 2427-2438)
- 66 noncovalent complexes:
  - 23 hydrogen-bond dominated complexes
  - 23 dispersion-dominated complexes
  - 20 complexes with mixed electrostatic/dispersion interaction
- MP2/CBS calculations in aug-cc-pVTZ/aug-cc-pVQZ + CCSD(T) correction in aug-cc-pVDZ
Benchmark datasets: X40, Noncovalent Interactions of Halogenated Molecules
(Řezáč et al.; J. Chem. Theory Comput., 2012, 8 (11), pp 4285-4292)
- 40 noncovalent complexes of organic halides, halohydrides, and halogen molecules
- Variety of interaction types
- Composite CCSD(T)/CBS scheme applied
- Triple-zeta basis sets on all atoms but hydrogen
Benchmark datasets: L7, Large Noncovalent Complexes
(Sedlák et al.; J. Chem. Theory Comput., 2013, 9 (8), pp 3364-3374)
- Seven large complexes (number of atoms: 48-112)
- Levels of theory:
  - MP2/CBS binding energies + QCISD(T)/6-31G*(0.25) correction
  - MP2/CBS binding energies + QCISD(T)/aug-cc-pVDZ correction
  - MP2/CBS binding energies + CCSD(T)/6-31G**(0.25,0.15) correction
Benchmark datasets
- ... and many more
VI. Summary
- Wavefunction-based methods are often very accurate but computationally expensive (large basis set required).
- CI, CCSD(T), DLPNO-CCSD(T), MP2, ...: numerically challenging
- The counterpoise correction must be applied!
- Almost always: use an extrapolation scheme.
- Use appropriate basis sets (cc-, def2-, ...).
- Verify the method you use for production.
- But: don't get lost in becoming too accurate.
Thank you very much!