Towards ab initio Device Design via Quasiparticle Self-Consistent GW Theory

Mark van Schilfgaarde and Takao Kotani, Arizona State University

Outline
- Limitations of the local density approximation, the GW approximation and other ab initio methods
- The QSGW method, and its limitations
- Can we calculate the necessary quantities?
- Steps to building a reliable, truly ab initio device simulator

Practical Example: Ab initio Device Simulator
Inputs needed to simulate a realistic electron device:
1. Electron energy bands
2. Scattering matrix elements
   - electron-phonon, electron-electron (scattering, polarons)
   - impact ionization (3rd-generation solar cell designs)
   - recombination through defects
3. Many-body and quantum transport effects, e.g. excitons and tunneling
Inputs are fed into a Boltzmann equation solver (or another type of solver that takes quantum effects into account).
[Figure: GaAs MESFET]
In standard bulk devices the mechanisms are well known, but the effects are always parameterized through models.
Big issue: are the models reasonable in new environments (quantum dots) where little is known?

Local Density Approximation
LDA: the standard ab initio method.
- Quasiparticle (QP) levels are often very poor
- Linear response and transport are all suspect
- No way to systematically improve on the basic framework
Examples:
- Semiconductor bandgaps and m* are underestimated
- Poor Na bands
- Schottky barriers: the LDA puts the Fermi level too close to the valence band at the metal/semiconductor contact

Many attempts to extend the LDA
The LDA is a good start, but it is often unsuitable, e.g. for electronic devices (poor energy bands), for many oxide compounds, and for any compound containing an f-shell element.
Many attempts to extend or improve on the LDA:
- Self-Interaction Correction (Perdew, Zunger, PRB 23, 5048 (1981))
- LDA+U (Anisimov, Zaanen, Andersen, Phys. Rev. B 44, 943 (1991))
- LDA+Screened exchange (Seidl et al, PRB 53, 3764 (1996))
- LDA+DMFT (Anisimov et al, J. Phys. C9, 7359 (1997))
- Exact Exchange and OEP (Kotani, PRL 74, 2989 (1995))
- Mix Hartree-Fock with LDA (B3LYP, Becke)
All have significant successes to their credit, but each improves one or another property in some special cases. All have limited applicability. None can be systematically refined.

What is needed to do better?
Is the error because the local V_xc is assumed to be the same as in the homogeneous electron gas (HEG)? This is the perspective of the Optimized Effective Potential (OEP).
Exact Exchange (EXX): do not assume a universal form for V_x, but calculate it explicitly from an OEP procedure ("density-functionalize" Hartree-Fock):
$V_x^{\mathrm{HF}}(\mathbf{r},\mathbf{r}') \rightarrow V_x^{\mathrm{OEP}}(\mathbf{r})$
Initially attractive: it seemed to fix the LDA gap underestimate.
- Kotani, PRL 74, 2989 (1995): first EXX calculation
- Stadele et al, PRB 59, 10031 (1999): "We find that the exact exchange formalism, augmented by local density or generalized gradient correlations, yields both structural and optical properties in excellent agreement with experiment."
The OEP procedure is truly ab initio (in contrast to most other extensions to the LDA), but it neglects the corresponding treatment for correlation, V_c.

Failures of Exact Exchange
Good agreement in semiconductor gaps turns out to be an artifact of a fortuitous cancellation between two kinds of errors:
1. Neglect of the corresponding treatment for correlation, V_c.
2. Discontinuity in V_x introduced by density-functionalizing the nonlocal Hartree-Fock potential to make a local effective potential.
Fix 1: add the RPA correlation V_c to the OEP potential. QP levels in Si and Fe revert approximately to the LDA result.
EXX in metals: bands and magnetic moments are far from experiment (as they are in HF). Adding V_c to the OEP largely restores the situation back to the LDA.
[Figure legend: EXX+OEP, LDA, EXX]

Needed: some restoration of nonlocality
LDA+U, Self-Interaction Correction, Screened Exchange
Assume the predominant error comes from the locality of the effective single-particle potential. What is needed is some nonlocal correction to the LDA V_xc, represented through matrix elements:
$V_{xc}(\mathbf{r},\mathbf{r}') - V_{xc}^{\mathrm{LDA}}(\mathbf{r}) = \Delta V(\mathbf{r},\mathbf{r}') \rightarrow \Delta V_{RL,R'L'}$
- LDA+U, SIC: on-site term ΔV_{RL,R'L'} with R=R', l=l', e.g. l=2. Especially applied to localized states, e.g. d and f orbitals.
- LDA+U: ΔV_{RL,RL'} is generated from some parameterized (empirical) U.
- SIC: Hartree-Fock on selected orbitals. In effect a kind of U that is not empirical, but it is not screened either.
- Screened exchange: ΔV_{RL,R'L'} is nonlocal everywhere. Usually it is generated from a model dielectric function. Designed to improve bands in semiconductors.
All have significant successes to their credit.

Limitations to methods with onsite nonlocality
- LDA+U: results depend on the choice of the (unknown) U.
- Constrained LDA calculations can be used to obtain U, but the results are often nonsensical: U ~ 5 eV in Fe → AF insulator!
- U is usually treated as an empirical parameter.
- What to remove from the LDA (double counting) is a serious problem.
- Onsite nonlocality is a patch to fix correlations between localized states (d and f electrons). Semiconductor bandgaps are essentially the same as in the LDA.
[Figure: LDA energy bands for CuInSe2]

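GW: includes full nonlocality at the RPA level
Start from some non-interacting hamiltonian H_0.
1. $H_0 = -\frac{\nabla^2}{2} + V_{\mathrm{eff}}(\mathbf{r},\mathbf{r}') \;\Rightarrow\; G_0 = \frac{1}{\omega - H_0}$. Example: $H_0 = H^{\mathrm{LDA}}$
2. $\Pi = -i\,G_0 \times G_0$ (RPA polarization function)
3. $W = \epsilon^{-1} v = (1 - v\Pi)^{-1} v$, with $v(\mathbf{r},\mathbf{r}') = \frac{1}{|\mathbf{r}-\mathbf{r}'|}$
4. $\Sigma = i\,G_0 W$ (self-energy). Dynamically screened exchange; HF theory is recovered by ε → 1.
$H(\mathbf{r},\mathbf{r}',\omega) = -\frac{\nabla^2}{2} + V^{\mathrm{ext}}(\mathbf{r}) + V^{\mathrm{H}}(\mathbf{r}) + \Sigma(\mathbf{r},\mathbf{r}',\omega)$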
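
Step 3 can be illustrated in the simplest possible setting. The sketch below is not the authors' implementation: it assumes a homogeneous electron gas in Hartree atomic units, treats v and Π as scalars at a single wavevector q, and uses the static, long-wavelength limit Π = -k_TF²/(4π), where the Thomas-Fermi wavevector `k_tf` and the function names are illustrative choices. Under those assumptions W = (1 - vΠ)^{-1} v reduces to the familiar Thomas-Fermi screened interaction 4π/(q² + k_TF²).

```python
import math

def bare_coulomb(q):
    """Bare Coulomb interaction v(q) = 4*pi/q^2 (Hartree atomic units)."""
    return 4.0 * math.pi / q**2

def screened_coulomb(q, k_tf):
    """Scalar version of W = (1 - v*Pi)^(-1) v at wavevector q.

    Assumption (not from the slides): static, long-wavelength
    polarization of the electron gas, Pi = -k_TF^2 / (4*pi).
    """
    pi0 = -k_tf**2 / (4.0 * math.pi)
    v = bare_coulomb(q)
    eps = 1.0 - v * pi0      # dielectric function, eps = 1 - v*Pi
    return v / eps           # W = eps^(-1) v

q, k_tf = 0.5, 1.2           # hypothetical values, in inverse bohr
w = screened_coulomb(q, k_tf)
print(w, 4.0 * math.pi / (q**2 + k_tf**2))
```

Setting `k_tf = 0` gives ε = 1 and returns the bare interaction, which is the scalar analogue of recovering Hartree-Fock theory when ε → 1.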

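Standard GW Approximation
Start from some non-interacting hamiltonian H_0.
- Near universal: use H^LDA for H_0. Then GW ≈ G^LDA W^LDA.
- G^LDA W^LDA: applied almost exclusively to semiconductors
- Big advantage: truly ab initio, not model based
- Evaluated as a perturbation to H_0 (no self-consistency):
$E_{\mathbf{k}n} = \epsilon_{\mathbf{k}n}^{\mathrm{LDA}} + Z_{\mathbf{k}n}^{\mathrm{LDA}} \langle \psi_{\mathbf{k}n}^{\mathrm{LDA}} | \Sigma(\epsilon_{\mathbf{k}n}^{\mathrm{LDA}}) - V_{xc}^{\mathrm{LDA}}(\mathbf{r}) | \psi_{\mathbf{k}n}^{\mathrm{LDA}} \rangle$
- Big drawback: not satisfactory when the LDA is poor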
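
The perturbative formula above can be sketched for a single state. This is a toy illustration, not the authors' code: it assumes the diagonal matrix elements of Σ and V_xc are already available as plain numbers, and it takes the standard definition Z = 1/(1 - ∂Σ/∂ω) evaluated at the LDA eigenvalue, which the slide does not spell out. The function name `qp_energy` and the linear model self-energy are hypothetical.

```python
def qp_energy(eps_lda, sigma, vxc, d_omega=1e-4):
    """One-shot QP energy E = eps + Z * (Sigma(eps) - Vxc) for one state.

    sigma: callable returning the (real) diagonal matrix element
           <psi|Sigma(omega)|psi>; vxc: the matrix element <psi|Vxc|psi>.
    Assumption: Z = 1 / (1 - dSigma/domega) at omega = eps_lda,
    estimated here by a central finite difference.
    """
    dsigma = (sigma(eps_lda + d_omega) - sigma(eps_lda - d_omega)) / (2.0 * d_omega)
    z = 1.0 / (1.0 - dsigma)
    return eps_lda + z * (sigma(eps_lda) - vxc), z

# Hypothetical model self-energy, linear in omega near the LDA eigenvalue (eV).
def toy_sigma(omega):
    return -10.0 - 0.25 * (omega - 1.0)

e_qp, z = qp_energy(eps_lda=1.0, sigma=toy_sigma, vxc=-10.5)
print(e_qp, z)
```

With these made-up numbers the unrenormalized shift Σ - V_xc = 0.5 eV is reduced by Z = 0.8 to 0.4 eV.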

Characteristics of GW
- GW incorporates, in a universal, parameter-free manner, the same kinds of nonlocalities found in LDA+U, LDA+Screened exchange, etc.
- On-site and off-site terms are all included
- No special, orbital-dependent treatment of, e.g., d states
- No ambiguities about U or double-counting terms
- 1-particle eigenvalues (energy bands) correspond to true excitations (not so in the LDA)
- Results depend on the starting hamiltonian H_0. In practice, H_0 is almost always H^LDA.
G^LDA W^LDA: applied almost exclusively to semiconductors
- Perturbation (no self-consistency)
- Not satisfactory when the LDA is poor

Results from the G^LDA W^LDA Approximation
- Bandgaps are better, but too small
- If the LDA has the wrong band ordering, e.g. a negative gap as in Ge, InN, InSb, G^LDA W^LDA cannot undo the wrong topology. Result: negative-mass conduction bands.
- Bands and magnetic moments in MnAs are worse than in the LDA
- Many other problems: see PRB 74, 245125 (2006)

Limitations to the G^LDA W^LDA Approximation
The conventional wisdom that G^LDA W^LDA is accurate to ~0.1 eV was an artifact of:
1. Most applications were to covalent semiconductors
2. Use of the PP approximation
3. Σ(valence only), neglecting corrections to the LDA core
Our conclusion: G^LDA W^LDA is essentially no better than the LDA for many correlated systems, e.g. NiO, CoO, ErAs, and MnAs.
Needed: something beyond G^LDA W^LDA.

Full Self-consistent GW
Our strategy: make GW self-consistent.
$\Pi = -i\,G G$
$W = (1 - v\Pi)^{-1} v$
$n = -i \int \frac{d\omega}{2\pi}\, G \;\Rightarrow\; V^{\mathrm{H}}$
$\Sigma = i\,G W$
$G = \frac{1}{\omega - (T + V^{\mathrm{ext}} + V^{\mathrm{H}} + \Sigma)}$
Full self-consistent GW looks good as a formal theory:
- Based on the Luttinger-Ward functional
- Keeps symmetry for G
- Conserving approximation
But it is poor in practice, even for the HEG: Z-factor cancellation is not satisfied (cond-mat/0611002).
[Figure legend: G_0 W_0, GW, Noninteracting]

Z-factor cancellation (cond-mat/0611002)
$\Sigma = G W \Gamma$
Suppose W is exact. Then
$\Gamma = 1 - \frac{\partial \Sigma}{\partial \omega} = \frac{1}{Z}$ for $q', \omega' \rightarrow 0$ (Ward identity)
$G = Z G_0 + (\text{incoherent part})$
$G W \Gamma \approx G_0 W + \text{incoherent part}$
A similar discussion holds for the proper polarization Π:
$\Pi = -i\,G G \Gamma \approx -i\,G_0 G_0 + \text{incoherent part}$
Z-factor cancellation: QP-like contributions, in a complicated way.
In full scGW (no Γ) there is no Z-factor cancellation.

Quasiparticle Self-consistency
Principle: can we find a good H_0 in place of H^LDA? How do we find the best possible H_0?
- Requires a prescription for minimizing the difference between the full hamiltonian H and H_0.
- In the context of GW theory, we call it the Quasiparticle Self-Consistent GW (QSGW) approximation.
QSGW: a self-consistent perturbation theory where self-consistency determines the best H_0 (within the GW approximation).
PRL 96, 226402 (2006)

QSGW: a self-consistent perturbation theory
Partition H into H_0 + ΔV (noninteracting + residual) in such a way as to minimize ΔV:
$G_0 = \frac{1}{\omega - H_0} \;\xrightarrow{\ \mathrm{GWA}\ }\; G = \frac{1}{\omega - (H_0 + \Delta V(\omega))}$
$\left(\omega - (H_0 + \Delta V(\omega))\right) G(\omega) = \delta(\mathbf{r}-\mathbf{r}')$
If the GWA is meaningful, $G_0 \approx G$:
$\left(\omega - (H_0 + \Delta V(\omega))\right) G_0(\omega) \approx \delta(\mathbf{r}-\mathbf{r}') \;\Rightarrow\; \Delta V(\omega)\, G_0(\omega) \approx 0$
We seek the G_0(ω) that most closely satisfies the equation of motion (cond-mat/0611002).

QP Self-consistency
The prescription for minimizing ΔV may be viewed as a prescription for minimizing the difference between G and G_0.
(A) $G_0 \xrightarrow{\ GW\ } G$
(B) $G \rightarrow G_0$
(B) is determined as follows.

Minimize the difference, in a norm M, between ψ[H] and ψ[H_0].
[Equation: definition of the norm M]
[Equation: (approximate) result of minimizing M]
At self-consistency, E_i of G matches E_i of G_0 (real parts).
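
For orientation, the static potential reported in the cited QSGW paper (PRL 96, 226402) has the form V_xc = (1/2) Σ_ij |ψ_i⟩ {Re[Σ(E_i)]_ij + Re[Σ(E_j)]_ij} ⟨ψ_j|, where Re denotes the Hermitian part. The sketch below applies that prescription to a made-up 2×2 self-energy given in the basis of the eigenfunctions of H_0; it assumes this is the equation the slide displayed, and `toy_sigma`, the energies, and the function names are purely illustrative.

```python
def hermitian_part(a):
    """Return (A + A^dagger) / 2 for a square matrix stored as nested lists."""
    n = len(a)
    return [[0.5 * (a[i][j] + a[j][i].conjugate()) for j in range(n)]
            for i in range(n)]

def qsgw_static_vxc(sigma, energies):
    """Static, Hermitian Vxc_ij = 1/2 { Re[Sigma(E_i)]_ij + Re[Sigma(E_j)]_ij }.

    sigma: callable omega -> matrix of Sigma(omega) in the QP basis.
    energies: QP energies E_i of the current H_0.
    """
    n = len(energies)
    herm = [hermitian_part(sigma(e)) for e in energies]   # Re[Sigma(E_i)]
    return [[0.5 * (herm[i][i][j] + herm[j][i][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical energy-dependent, non-Hermitian self-energy (arbitrary units).
def toy_sigma(omega):
    return [[complex(-1.0 - 0.2 * omega, -0.1), complex(0.3, 0.05 * omega)],
            [complex(0.3, -0.05 * omega + 0.02), complex(0.5 - 0.1 * omega, -0.2)]]

energies = [-1.0, 2.0]
vxc = qsgw_static_vxc(toy_sigma, energies)
print(vxc)
```

The result is Hermitian by construction, and its diagonal elements equal Re Σ_ii(E_i), so to first order the eigenvalues of H_0 reproduce the real parts of the QP energies of G.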

QSGW extracts the (nearly) optimal independent-particle (bare QP) picture from G
H_0 represents the bare QP.
(1) Bare QPs are the fundamental primary excitations.
(2) They interact through the bare Coulomb interaction.
If (bare QP energy + eigenfunction) ≈ (dressed QP energy + eigenfunction), then the perturbation expansion is meaningful.
The perturbation can go beyond the RPA, GW level (QSGW). At whatever level, starting from the bare QP seems to give a better expansion series than true self-consistent GW.

Our numerical technique
1. All-electron FP-LMTO (including local orbitals). Generates accurate eigenfunctions (essentially the same as LAPW + local orbitals).
2. Mixed basis expansion for v and W: an almost complete basis to expand products Ψ Ψ of (Bloch) eigenfunctions.
3. No plasmon-pole approximation.
4. Calculate Σ for all electrons. Cores are treated at the Hartree-Fock level or better.
5. Offset-Γ method to handle the 1/q² divergence in v and W.
Kotani built FP-GW starting from the ASA-GW code by F. Aryasetiawan.

Results
QSGW has been applied to a wide range of materials (bands, dielectric function, spin susceptibility).
QSGW works very well!
- Very reliable for itinerant electronic structure
- Universally applicable: reasonable results for correlated d systems and even for f systems
- Satisfactory improvements over existing methods
- Truly ab initio: no empirical or adjustable parameters
- Examine remaining disagreements with experiment: errors are systematic and reasonable
See PRL 96, 226402 (2006), cond-mat/061100
f systems: cond-mat/0610528
Spin splitting for ZB semiconductors: PRL 96, 086405

QSGW results for sp bonded systems
[Figure legend: QSGW, G^LDA W^LDA, LDA]
Errors are small and systematic:
- Γ-Γ transitions are overestimated by 0.2 ± 0.1 eV (a little worse in transition-metal oxides, like TiO2 and SrTiO3)
- Other transitions are overestimated by 0.1 ± 0.1 eV

QSGW results for sp bonded systems II
[Figure: energy bands of GaAs and Na. LDA: broken blue; QPscGW: green; G^LDA W^LDA: dotted red; O: experiment]
GaAs:
- m*(QSGW) = 0.073, m*(LDA) = 0.022, m*(expt) = 0.067
- Gap too large by ~0.3 eV
- Band dispersions accurate to ~0.1 eV
- Ga d level well described
Na:
- Bandwidth reduced by 15%

QSGW results for sp bonded systems III

QSGW theory applied to simple d systems
[Figure: Fe (minority) bands]
- d-band exchange splitting and bandwidths are systematically improved relative to the LDA
- Generally good agreement with photoemission
- Magnetic moments: small systematic errors (slightly overestimated)

QSGW calculation of dielectric response
The dielectric function has approximately the correct shape, but the plasmon peak is too high relative to experiment and the Wannier excitons are missing.
Both errors originate from electron-hole interactions (ladder diagrams) missing in GW.
[Figure: ladder diagrams built from G and W]

Optical Dielectric constant ε
- Shift in Im ε(ω) ⇒ Re ε(ω→0) should be too small
- ε is universally ~20% smaller than experiment; this is true for many kinds of insulators
Return to this later.

Systematics of Errors
Unoccupied states are universally too high:
- ~0.2 eV for sp semiconductors
- <~1 eV for itinerant d (SrTiO3, TiO2)
- >~1 eV for less itinerant d (NiO)
- >~3 eV for f (Gd, Er, Yb)
Peaks in Im ε(ω) are also too high.
ε is 20% too small.
Magnetic moments are slightly overestimated.

Consequences of improving Π(q,ω)
The errors are all consistent with missing electron-hole correlation in the polarization function Π(q,ω), i.e. excitonic effects.
- Ladder diagrams seem to reliably correct Im ε(ω) starting from the QP picture in a variety of materials systems, e.g. Si and NiO(?) (Bechstedt et al) and Cu2O (Reining et al, PRL 2006).
- They red-shift the peaks in Im ε(ω).
- With ladders, W(q,ω) should be very accurate. In particular, ε increased by 20% in insulators scales W(0,0), and ω→0 dominates the contribution to Σ = iGW.
- Scaling Σ by 0.8 is found empirically to largely eliminate the ~0.2 eV gap overestimate in sp semiconductors.
- Corrections are expected to grow with localization. This qualitatively explains the systematic increase in errors from sp to d to f orbitals.

Significance
Fundamentally, transport is not a one-particle property!
There is a growing recognition that transport can be accurately described by a one-particle picture, provided the appropriate one-particle picture is taken.
Near-optimal one-body descriptions of transport appear to be very close to optimal many-body descriptions:
We find maximizing the overlap of a Slater determinant composed of single-particle states to the many-body current-carrying state is more important than energy minimization for defining single-particle approximations in a system with open boundary conditions. Thus the most suitable single particle effective potential is not one commonly in use by electronic structure methods, such as the Hartree-Fock or Kohn-Sham approximations.
Fagas, Delaney, and Greer, Phys. Rev. B 73, 241314 (2006)
QSGW is a way of constructing a near-optimal one-particle H_0.

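Matrix elements
Many kinds of matrix elements are possible, depending on the kind of scattering. Is it possible to calculate them within QSGW?
Example 1: dynamical matrix, normal modes for phonons
$V_{ij} = \frac{\partial^2 E}{\partial R_i\, \partial R_j}$ (the slide expresses this through $W(R_i, R_j)$ and the nuclear charges $Z_i Z_j e^2$)
Example 2: electron-phonon interaction
$g(\mu;\ \mathbf{k}+\mathbf{q}\,i,\ \mathbf{k}\,j) = \mathbf{e}_\mu(\mathbf{q}) \cdot \int \psi^*_{\mathbf{k}+\mathbf{q}\,i}(\mathbf{r})\, \frac{\partial W(\mathbf{R},\mathbf{r})}{\partial \mathbf{R}}\, \psi_{\mathbf{k}j}(\mathbf{r})\, d\mathbf{r}$, where $\mathbf{e}_\mu(\mathbf{q})$ is the normal-mode eigenvector
Example 3: Auger ionization process: comes directly from the bubbles in the RPA ring diagrams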
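
As a reminder of how normal modes follow from a dynamical matrix (Example 1), here is a minimal sketch for a textbook model rather than an ab initio calculation: a 1D diatomic chain with nearest-neighbour springs. The force constant `k_spring`, the masses, and the lattice constant `a` are hypothetical inputs; in the ab initio setting the force constants would instead come from second derivatives of the total energy.

```python
import cmath
import math

def diatomic_chain_frequencies(q, k_spring, m1, m2, a=1.0):
    """Phonon frequencies (acoustic, optical) of a 1D diatomic chain.

    Builds the 2x2 mass-weighted dynamical matrix D(q) for nearest-neighbour
    springs and returns the square roots of its two eigenvalues.
    """
    d11 = 2.0 * k_spring / m1
    d22 = 2.0 * k_spring / m2
    d12 = -k_spring * (1.0 + cmath.exp(-1j * q * a)) / math.sqrt(m1 * m2)
    # Eigenvalues of a 2x2 Hermitian matrix: trace/2 +/- sqrt((trace/2)^2 - det)
    half_trace = 0.5 * (d11 + d22)
    det = d11 * d22 - abs(d12) ** 2
    disc = math.sqrt(max(half_trace**2 - det, 0.0))
    w2_acoustic = max(half_trace - disc, 0.0)   # clamp round-off below zero
    w2_optical = half_trace + disc
    return math.sqrt(w2_acoustic), math.sqrt(w2_optical)

# Zone centre and zone boundary for unit spring constant and masses 1 and 2.
print(diatomic_chain_frequencies(0.0, 1.0, 1.0, 2.0))
print(diatomic_chain_frequencies(math.pi, 1.0, 1.0, 2.0))
```

At q = 0 the acoustic branch vanishes and the optical frequency is sqrt(2K(1/m1 + 1/m2)); at the zone boundary the two sublattices decouple, giving sqrt(2K/m2) and sqrt(2K/m1).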

Applications
1. Solar cells: energy band structure in CuInSe2. The QSGW gap is ~1.1 eV (ambiguity in atomic positions). Good ab initio prediction of m* and of the character of the eigenfunctions.
2. Dresselhaus splitting from spin-orbit (S-O) coupling. Responsible for:
   - Dyakonov-Perel spin relaxation
   - spin relaxation anisotropy
   - the mechanism behind some spin filters
   [Figure axes: [010], [100]]
3. Spin waves and exchange interactions in magnetic systems, e.g. Fe, MnAs. Example: MnO. QSGW spin waves are in good agreement with experiment. LDA: 3-4× too high; LDA+U is 2× too high.
Works well in both simple, itinerant systems and complex, highly correlated systems.
[Figure: Spin waves in MnO]

Application to CuInSe2
[Figure: energy bands. Green = In character; Red = Cu character; Black = Se character]
- The QSGW gap is ~1.1 eV (ambiguity because the internal displacement is not accurately known)
- The conduction band is almost pure In sp (typical for a wide-gap semiconductor)
- The valence band is mostly Cu d, with some Se mixed in (atypical)

Valence Band structure in CuInSe2 near Γ
- An additional splitting of the HH and LH bands at Γ
- Hole bands are highly anisotropic: m*(1,0,0), m*(1,1,0) and m*(0,0,1) are all quite different
- SO splitting ~150 meV

Valence Band structure, CuInSe2 vs CuGaSe2
[Figure: valence bands of CuInSe2 and CuGaSe2]
Effective masses:
            m_c     m_HH(100)  m_LH   m_SO   m_HH(001)  m_LH   m_SO
CuInSe2     0.096   0.36       0.17   0.30   0.18       0.55   0.36
CuGaSe2     0.12    0.38       0.22   0.30   0.16       0.50   0.48

Barriers to practical realization
QSGW is too expensive to put into large simulations:
- Construction of Σ is very time-consuming
- A large basis set is used to construct a reliable Σ
Quantum: a simplified treatment is necessary for many-atom structures (QDs) or to predict energy levels and scattering of defects.
- Given Σ(full), rotate to a minimal basis by matching eigenfunctions in an energy window near the Fermi level. It is possible to generate tight-binding-sized hamiltonians.
- Self-consistency: QSGW has Hartree-Fock structure (but with screened interactions). Extract effective U's; then proceed as in an HF calculation. Minimum assumptions.
Classical: feed bands and scattering rates to a BTE solver.
- For monolithic heterogeneous structures (HEMT, HBT), some interpolation of energy bands is necessary.

Proof-of-concept: simulate I-V characteristics of a GaAs FET
[Figure: Simulation of a GaAs FET]
Still to do:
- Scattering matrix elements should be calculated from ab initio theory, e.g. phonon spectra, e-p interaction
- It is not practical or feasible to calculate energy bands for a monolithic heterogeneous structure (HEMT, HBT)
- A simplified treatment is necessary for many-atom structures (QDs) or to predict energy levels and scattering of defects
- The BTE solver misses some quantum effects, e.g. tunneling and traps by defects, that must be patched in

Conclusions
The QSGW method:
- Self-consistent perturbation theory; self-consistency is constructed to minimize the size of the perturbation, giving an optimum partitioning between H_0 and ΔV = H − H_0.
- Perturbation around the bare QP picture is physically more transparent, and there is some justification for it.
- The QSGW method works very well in practice! It reliably treats a variety of properties in a wide range of materials, with satisfactory improvements over existing approaches.
- QSGW is truly ab initio: a dependable starting point for better, many-body calculations.
Many other properties
Engine for ab initio device simulation

End