united nations educational, scientific and cultural organization
the abdus salam international centre for theoretical physics
international atomic energy agency

SMR 1595-7
Joint DEMOCRITOS - ICTP School on CONTINUUM QUANTUM MONTE CARLO METHODS
12-23 January 2004

SUMMARY AND PROBLEMS WITH VARIATIONAL METHODS
David M. CEPERLEY
Beckman Institute for Advanced Studies & Technology, University of Illinois at Urbana-Champaign - N.C.S.A., Il-61801 Urbana, U.S.A.

These are preliminary lecture notes, intended only for distribution to participants.
strada costiera, 11 - 34014 trieste italy - sci_info@ictp.trieste.it - www.ictp.trieste.it
Summary and problems with variational methods
Powerful method, since you can use any trial function.
Scaling (computational effort vs. size) is almost classical.
Learn directly about what works in wavefunctions.
No sign problem.
Optimization is time consuming.
Energy is insensitive to the order parameter.
Non-energetic properties are less accurate: their error is first order, O(1), in the error of the trial function, vs. second order, O(2), for the energy.
Difficult to find out how accurate results are.
Favors simple states over more complicated states, e.g. solid over liquid, polarized over unpolarized.
What goes into the trial wave function comes out! (GIGO: garbage in, garbage out.)
We need a more automatic method: Projector Monte Carlo.

Summary of Variational (VMC)
[Figure: error (a.u.) vs. computer time (sec) on log-log axes, 1.E-05 to 1.E+01 a.u. and 1.E+00 to 1.E+07 sec; curves labeled "Simple trial function" and "Better trial function".]
Dependence of energy on wavefunction
3d electron fluid at a density $r_s = 10$. Kwon, Ceperley, Martin, Phys. Rev. B58, 6800, 1998.
Wavefunctions: Slater-Jastrow (SJ), three-body (3B), backflow (BF), fixed-node (FN).
Energy $\langle\phi|H|\phi\rangle$ converges to the ground state.
Variance $\langle\phi|[H-E]^2|\phi\rangle$ converges to zero.
Using 3B-BF gains a factor of 4.
Using DMC gains a factor of 4.
[Figure: energy (from -0.109 to -0.107) vs. variance (from 0 to 0.1); points labeled FN-SJ and FN-BF.]

Projector Monte Carlo (variants: Green's function MC, Diffusion MC, Reptation MC)
Project a single state using the Hamiltonian:
$$\phi(t) = e^{-(H-E_T)t}\,\phi(0)$$
We show that this is a diffusion + branching operator. Maybe we can interpret it as a probability.
But is this a probability? Yes, for bosons, since the ground state can be made real and non-negative.
But all excited states must have sign changes. This is the "sign problem."
For efficiency we do importance sampling.
Avoid the sign problem with the fixed-node method.
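Diffusion Monte Carlo
How do we analyze this operator?
$$\psi(R,t) = e^{-(H-E_T)t}\,\psi(R,0)$$
Expand into exact eigenstates of H:
$$H\phi_\alpha = E_\alpha\phi_\alpha, \qquad \psi(R,0) = \sum_\alpha \phi_\alpha(R)\,\langle\phi_\alpha|\psi(0)\rangle$$
Then the evolution is simple in this basis:
$$\psi(R,t) = \sum_\alpha \phi_\alpha(R)\,e^{-t(E_\alpha-E_T)}\,\langle\phi_\alpha|\psi(0)\rangle$$
The long time limit is the lowest energy state that overlaps with the initial state, usually the ground state:
$$\lim_{t\to\infty}\psi(R,t) = \phi_0(R)\,e^{-t(E_0-E_T)}\,\langle\phi_0|\psi(0)\rangle$$
$E_0 \approx E_T \Rightarrow$ normalization fixed.
How to carry out on the computer?

Operator notation: the Green's function
$$-\frac{d\hat\rho}{dt} = \hat H\hat\rho, \qquad \hat\rho = e^{-\hat H t}$$
We define the coordinate Green's function (or density matrix) by:
$$G(R\to R';t) = \langle R|e^{-t\hat H}|R'\rangle$$
Roughly the probability density of going from $R_0$ to $R$ in "time" t (but is it a probability?).
Properties:
$$-\frac{\partial G(R_0\to R;t)}{\partial t} = \hat H\,G(R_0\to R;t)$$
$$G(R_0\to R;0) = \delta(R-R_0)$$
$$G(R_0\to R;t) = \sum_\alpha \phi_\alpha^*(R_0)\,\phi_\alpha(R)\,e^{-tE_\alpha}$$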
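The eigenstate expansion on the Diffusion Monte Carlo slide can be checked on a small matrix. The following is a minimal sketch, assuming a hypothetical 3-level symmetric Hamiltonian and the helper name `project` (neither is from the lecture); it applies $e^{-(H-E_T)t}$ in the eigenbasis and shows that, with $E_T = E_0$, the projected state tends to the ground state with fixed normalization.

```python
import numpy as np

def project(H, psi0, t, e_trial=0.0):
    """Apply exp(-(H - e_trial) t) to psi0 using the eigenbasis of a symmetric H."""
    energies, states = np.linalg.eigh(H)      # H phi_a = E_a phi_a
    coeffs = states.T @ psi0                  # <phi_a | psi(0)>
    return states @ (np.exp(-t * (energies - e_trial)) * coeffs)

# Hypothetical 3-level Hamiltonian, chosen only for illustration.
H = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.3],
              [0.0, 0.3, 3.0]])
psi0 = np.ones(3)                             # initial state with nonzero ground-state overlap

energies, states = np.linalg.eigh(H)
e0, phi0 = energies[0], states[:, 0]

psi_t = project(H, psi0, t=30.0, e_trial=e0)  # E_T = E_0 keeps the norm fixed
overlap = abs(phi0 @ psi_t) / np.linalg.norm(psi_t)
print("overlap with ground state:", overlap)
print("norm of projected state:", np.linalg.norm(psi_t))
```

Choosing `e_trial` above or below `e0` makes the norm grow or decay exponentially, which is the reason the trial energy has to be tuned in an actual simulation.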
Frobenius Theorem
When can we consider the wavefunction as a probability? First, how about the Green's function?
$$G(R_0\to R;0) = \delta(R-R_0) \ge 0$$
Trotter's theorem implies it continues to be positive at all times:
$$G(R_0\to R;t) \ge 0$$
So if we start with a non-negative function it will stay non-negative, and can be interpreted as a p.d.f.
Not true for all Hamiltonians (requires off-diagonal matrix elements to be non-positive): not pseudopotentials, not magnetic fields.
Only true for the bosonic ground state.

Monte Carlo process
Now consider the variable "t" as a continuous time (it is really imaginary time).
Take the derivative with respect to time to get the evolution:
$$-\frac{\partial\psi(R,t)}{\partial t} = (H-E_T)\,\psi(R,t), \qquad H = -\sum_i \frac{\hbar^2}{2m_i}\nabla_i^2 + V(R)$$
This is a diffusion + branching process:
$$-\frac{\partial\psi(R,t)}{\partial t} = -\sum_i \frac{\hbar^2}{2m_i}\nabla_i^2\,\psi(R,t) \quad\text{(diffusion)}$$
$$-\frac{\partial\psi(R,t)}{\partial t} = (V(R)-E_T)\,\psi(R,t) \quad\text{(branching)}$$
Justify in terms of Trotter's theorem.
Requires interpretation of the wavefunction as a probability density. But is it? Only in the boson ground state. Otherwise there are nodes. Come back to this later.
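Trotter's theorem
How do we find the solution of:
$$\frac{d\hat\rho}{dt} = (A+B)\,\hat\rho$$
The operator solution is:
$$\hat\rho = e^{(A+B)t}$$
Trotter's theorem (1959):
$$\hat\rho = \lim_{n\to\infty}\left[e^{\frac{t}{n}\hat A}\,e^{\frac{t}{n}\hat B}\right]^n$$
Assumes that A, B and A+B are reasonable operators.
$$\langle R_0|\left[e^{\frac{t}{n}\hat A}e^{\frac{t}{n}\hat B}\right]^n|R_n\rangle = \langle R_0|e^{\frac{t}{n}\hat A}|R'_1\rangle\langle R'_1|e^{\frac{t}{n}\hat B}|R_1\rangle \cdots \langle R_{n-1}|e^{\frac{t}{n}\hat A}|R'_n\rangle\langle R'_n|e^{\frac{t}{n}\hat B}|R_n\rangle$$
This means we just have to figure out what each operator does independently and then alternate their effect. This is rigorous in the limit $n\to\infty$.
In the DMC case A is the diffusion operator and B is the branching operator.
Just like "molecular dynamics": at small time we evaluate each operator separately.

Evaluation of kinetic density matrix
$$\langle r|e^{-\tau\hat T}|r'\rangle = \sum_\alpha \phi_\alpha^*(r)\,\phi_\alpha(r')\,e^{-\tau\epsilon_\alpha}$$
In PBC the eigenfunctions of $\hat T$ are $\Omega^{-1/2}e^{-ikr}$ and the eigenvalues are $\lambda k^2$ (with $\lambda = \hbar^2/2m$):
$$\langle r|e^{-\tau\hat T}|r'\rangle = \sum_k \frac{1}{\Omega}\,e^{-ikr}\,e^{ikr'}\,e^{-\tau\lambda k^2}$$
Convert to an integral:
$$\langle r|e^{-\tau\hat T}|r'\rangle = \frac{1}{(2\pi)^3}\int dk\,e^{ik(r'-r)-\tau\lambda k^2} = (4\pi\lambda\tau)^{-3/2}\,e^{-(r-r')^2/4\lambda\tau}$$
Danger: this makes an assumption about boundaries and statistics.
This is a diffusion process.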
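Trotter's theorem from the slide above can be illustrated numerically on small matrices. This is a minimal sketch, assuming two hypothetical non-commuting symmetric 2x2 matrices and the helper names `expm_sym` and `trotter` (none of these come from the lecture); it compares $[e^{tA/n}e^{tB/n}]^n$ with $e^{(A+B)t}$ for increasing n.

```python
import numpy as np

def expm_sym(M):
    """Matrix exponential of a real symmetric matrix via its eigen-decomposition."""
    w, v = np.linalg.eigh(M)
    return (v * np.exp(w)) @ v.T

def trotter(A, B, t, n):
    """Approximate exp((A+B) t) by [exp(A t/n) exp(B t/n)]^n."""
    step = expm_sym(A * t / n) @ expm_sym(B * t / n)
    return np.linalg.matrix_power(step, n)

# Hypothetical non-commuting symmetric operators, for illustration only.
A = -np.array([[2.0, -1.0],
               [-1.0, 2.0]])      # "kinetic-like" part (off-diagonal coupling)
B = -np.diag([0.0, 1.5])          # "potential-like" part (diagonal)

exact = expm_sym((A + B) * 1.0)
errors = [np.abs(trotter(A, B, 1.0, n) - exact).max() for n in (1, 10, 100)]
print(errors)                     # the error shrinks roughly like 1/n
```

The roughly 1/n decay of the error reflects the commutator of A and B, which is why a finite number of time slices leaves a time-step error.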
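Putting this together
n is the number of time slices. τ is the time-step. V is diagonal.
$$\hat\rho = e^{-\beta(\hat T+\hat V)}, \qquad \hat\rho = \lim_{n\to\infty}\left[e^{-\tau\hat T}e^{-\tau\hat V}\right]^n, \qquad \tau = \beta/n$$
$$\langle r|e^{-\tau\hat T}|r'\rangle = (4\pi\lambda\tau)^{-3/2}\,e^{-(r-r')^2/4\lambda\tau}$$
$$\langle r|e^{-\tau\hat V}|r'\rangle = \delta(r-r')\,e^{-\tau V(r)}$$
$$\langle R_0|e^{-n\tau\hat H}|R_n\rangle \sim \langle R_0|e^{-\tau\hat T}|R_1\rangle e^{-\tau V(R_1)} \cdots \langle R_{n-1}|e^{-\tau\hat T}|R_n\rangle e^{-\tau V(R_n)}$$
The error at finite n comes from the commutator and is roughly:
$$e^{-\frac{\tau^2}{2}[\hat T,\hat V]}$$
Diffusion preserves normalization but the potential does not!

Basic DMC algorithm
Construct an ensemble (population P(0)) sampled from the trial wavefunction: $\{R_1, R_2, \ldots, R_P\}$.
Go through the ensemble and diffuse each one (timestep τ):
$$R'_k = R_k + \sqrt{2\lambda\tau}\,\zeta(t), \qquad \zeta = \text{ndrn (normally distributed random number)}$$
Branch:
$$\text{number of copies} = \left\lfloor e^{-\tau(V(R)-E_T)} + u \right\rfloor, \qquad u = \text{uprn (uniform pseudo-random number)}$$
The trial energy $E_T$ is adjusted to keep the population fixed.
$$E_0 = \lim_{t\to\infty}\frac{\int dR\,H\phi(R,t)}{\int dR\,\phi(R,t)} \approx \langle V(R)\rangle_{\phi(\infty)}$$
Problems:
Branching is uncontrolled.
What do we do about fermi statistics?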
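The steps of the basic DMC algorithm can be sketched in a few lines. The example below is only an illustration under stated assumptions: a 1D harmonic oscillator $H = -\lambda\,d^2/dx^2 + x^2/2$ with $\lambda = 1/2$ (exact $E_0 = 0.5$), a hypothetical feedback strength `kappa`, and a simple logarithmic population feedback on the trial energy. The function name `basic_dmc` and all parameter values are choices made here, not the lecturer's code.

```python
import numpy as np

def basic_dmc(n_target=500, n_steps=2000, tau=0.01, lam=0.5, kappa=1.0, seed=0):
    """Unguided DMC for H = -lam d^2/dx^2 + x^2/2 (exact E_0 = 0.5 when lam = 0.5)."""
    rng = np.random.default_rng(seed)
    potential = lambda x: 0.5 * x**2
    walkers = rng.normal(size=n_target)            # initial ensemble
    e_trial, samples = potential(walkers).mean(), []
    for step in range(n_steps):
        # Diffusion: R' = R + sqrt(2 lam tau) * (normally distributed random number)
        walkers = walkers + np.sqrt(2.0 * lam * tau) * rng.normal(size=walkers.size)
        # Branching: copies = floor(exp(-tau (V - E_T)) + uniform random number)
        weights = np.exp(-tau * (potential(walkers) - e_trial))
        copies = np.floor(weights + rng.uniform(size=walkers.size)).astype(int)
        walkers = np.repeat(walkers, copies)
        # Feedback (assumed form): lower E_T when the population exceeds the target
        v_mean = potential(walkers).mean()
        e_trial = v_mean - kappa * np.log(walkers.size / n_target)
        if step >= n_steps // 4:                   # discard equilibration steps
            samples.append(v_mean)                 # <V> over the walker distribution
    return float(np.mean(samples)), walkers.size

energy, population = basic_dmc()
print(energy, population)
```

The energy is estimated as the average of V over the walkers, following the last formula on the slide. Without a trial function the branching weights fluctuate with the bare potential, which is the "uncontrolled branching" problem listed above.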
Population Bias
Having the right trial energy guarantees that the population will on the average be stable, but fluctuations will always cause the population to either grow too large or too small.
Various ways to control the population.
Suppose $P_0$ is the desired population and $P(t)$ is the current population. How much do we have to lower $E_T$ to make $P(t+T) = P_0$?
$$P(t+T) = e^{-T\,\delta E_T}\,P(t) = P_0 \quad\Rightarrow\quad \delta E_T = \frac{\ln(P(t)/P_0)}{T}$$
Feedback procedure (sign chosen so that $E_T$ is lowered when $P > P_0$):
$$E_T = E_{T0} - \kappa\,\ln(P/P_0), \qquad \kappa > 0$$
There will be a (small) bias in the energy caused by a limited population.

Importance Sampling
Kalos 1970, Ceperley 1979
Why should we sample the wavefunction? The physically correct pdf is $|\phi|^2$.
Importance sample (multiply) by the trial wave function:
$$f(R,t) \equiv \psi_T(R)\,\phi(R,t), \qquad \lim_{t\to\infty} f(R,t) \equiv \psi_T(R)\,\phi_0(R)$$
Commute $\psi_T$ through H:
$$-\frac{\partial f(R,t)}{\partial t} = \psi_T(R)\,H\left[f(R,t)/\psi_T(R)\right]$$
$$-\frac{\partial f(R,t)}{\partial t} = -\lambda\nabla^2 f + 2\lambda\nabla\cdot\left(f\,\nabla\ln\psi_T(R)\right) + \left(\psi_T^{-1}H\psi_T\right)f(R,t)$$
Evolution = diffusion + drift + branching.
Use an accept/reject step for more accurate evolution. Make the acceptance ratio > 99%. This determines the time step.
We have three terms in the evolution equation. Trotter's theorem still applies.
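The drift and branching terms of the importance-sampled evolution can be made concrete for a simple trial function. The sketch below assumes a 1D harmonic oscillator $H = -\lambda\,d^2/dx^2 + x^2/2$ with $\lambda = 1/2$ and a hypothetical Gaussian trial function $\psi_T = e^{-a x^2/2}$; the function names are illustrative only. It evaluates the drift velocity $2\lambda\nabla\ln\psi_T$ and the local energy $\psi_T^{-1}H\psi_T$.

```python
import numpy as np

LAM = 0.5  # lambda = hbar^2 / 2m in units where hbar = m = 1

def log_psi(x, a):
    """Hypothetical Gaussian trial function: ln psi_T = -a x^2 / 2."""
    return -0.5 * a * x**2

def drift(x, a):
    """Drift velocity 2 * lambda * d(ln psi_T)/dx."""
    return 2.0 * LAM * (-a * x)

def local_energy(x, a):
    """psi_T^{-1} H psi_T for H = -lambda d^2/dx^2 + x^2/2."""
    return LAM * a - LAM * a**2 * x**2 + 0.5 * x**2

x = np.linspace(-2.0, 2.0, 5)
print(local_energy(x, a=1.0))   # constant: psi_T is the exact ground state
print(local_energy(x, a=0.8))   # varies with x: branching weights fluctuate
print(drift(x, a=0.8))          # points toward larger psi_T (toward x = 0)
```

When the trial function is exact the local energy is constant, so the branching term produces no fluctuations at all; this is the zero-variance property that makes importance sampling effective.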
Brownian Dynamics
Consider a big molecule in a solvent. In the high viscosity limit the "master equation" (Smoluchowski or Fokker-Planck eq.) is:
$$\frac{\partial\rho(R,t)}{\partial t} = D\nabla^2\rho(R,t) - \beta D\,\nabla\left[F(R)\,\rho(R,t)\right]$$
$$R(t+\tau) = R(t) + \tau\beta D F(R(t)) + \eta(t), \qquad \langle\eta(t)\rangle = 0, \qquad \langle\eta(t)^2\rangle = 2\tau D$$
$$G(R\to R') = c\,\exp\left[-\frac{(R'-R-\beta D F(R)\tau)^2}{4D\tau}\right]$$
This is also the equation for Diffusion Quantum Monte Carlo without branching. Borrow the rejection technique developed for that.

Green's function for a gradient
What is the Green's function for the operator $F\nabla$? The variables separate into 1D problems.
Evolution equation for the Green's function:
$$\frac{\partial G(x,t)}{\partial t} = -F\,\frac{\partial G(x,t)}{\partial x}$$
Solution:
$$G(x,t) = h(x-Ft)$$
This operator just causes the probability distribution to drift in the direction of F.
Smoluchowski equation for Brownian motion: it was the effect of a gravitational field on the motion of colloids.
In practice, we limit the gradient so the walk is not pushed too far.
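The Brownian dynamics update $R(t+\tau) = R(t) + \tau\beta D F + \eta$ can be tried on a case with a known answer. This is a minimal sketch, assuming a hypothetical harmonic force $F = -kx$ and illustrative parameter values (not from the lecture); the stationary distribution should then be the Boltzmann distribution with variance $1/(\beta k)$, up to a small time-step error.

```python
import numpy as np

def brownian_step(x, force, tau, beta, diff, rng):
    """One Euler step: x + tau*beta*D*F(x) + eta, with <eta> = 0 and <eta^2> = 2*tau*D."""
    eta = np.sqrt(2.0 * tau * diff) * rng.normal(size=x.shape)
    return x + tau * beta * diff * force(x) + eta

rng = np.random.default_rng(2)
beta, diff, k = 2.0, 1.0, 1.5            # hypothetical parameters
x = np.zeros(20000)                      # 20000 independent walkers, all starting at 0
for _ in range(2000):
    x = brownian_step(x, lambda r: -k * r, 0.005, beta, diff, rng)

variance = x.var()
print(variance, 1.0 / (beta * k))        # sampled variance vs. Boltzmann value
```

The residual difference between the two printed numbers is the time-step error of the Euler step, which an accept/reject step would remove.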
To the pure diffusion algorithm we have added a drift step that pushes the random walk in directions of increasing trial function:
$$R' = R + 2\lambda\tau\,\nabla\ln\psi_T(R)$$
Branching is now controlled by the local energy:
$$E_L(R) - E_T = \psi^{-1}(R)\,H\psi(R) - E_T$$
Because of the zero variance principle, fluctuations are controlled.
The cusp condition can limit infinities coming from singular potentials.
We still determine $E_T$ by keeping the asymptotic population stable:
$$E_0 = \lim_{t\to\infty}\frac{\int dR\,\phi(R,t)\,H\psi_T(R)}{\int dR\,f(R,t)} \approx \langle E_L(R)\rangle_{f(\infty)}$$
Must have accurate "time" evolution. Adding an accept/reject step is a major improvement.
How do we deal with fermi statistics?

Importance sampled Green's function:
$$G(R\to R') = \frac{\psi(R')}{\psi(R)}\,\langle R|e^{-\tau H}|R'\rangle$$
Exact property of the DMC Green's function:
$$\Psi^2(R)\,G(R\to R') = \Psi^2(R')\,G(R'\to R)$$
We enforce detailed balance to decrease time step errors:
$$A(s\to s') = \min\left[1, \frac{G(s'\to s)\,\psi^2(s')}{G(s\to s')\,\psi^2(s)}\right]$$
VMC satisfies detailed balance.
Typically we choose the time step to have a 99% acceptance ratio.
The method gives the exact result if either the time step is zero or the trial function is exact.
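The drift-diffusion move with the accept/reject step can be sketched for a single coordinate. The example below assumes a 1D harmonic oscillator with $\lambda = 1/2$ and a hypothetical Gaussian trial function $\psi_T = e^{-A x^2/2}$ with A = 0.8; branching is omitted, so the walkers sample $\psi_T^2$ and the average local energy is the variational energy, $A/4 + 1/(4A) = 0.5125$ for this choice. All names and parameter values are illustrative assumptions.

```python
import numpy as np

LAM, A = 0.5, 0.8   # lambda = hbar^2/2m; A is a hypothetical variational parameter

def log_psi(x):
    return -0.5 * A * x**2

def drift(x):
    return -2.0 * LAM * A * x                       # 2 * lambda * d(ln psi_T)/dx

def e_local(x):
    return LAM * A + 0.5 * (1.0 - 2.0 * LAM * A**2) * x**2

def log_green(x_from, x_to, tau):
    """Log of the drift-diffusion Green's function G(x_from -> x_to), up to a constant."""
    return -(x_to - x_from - tau * drift(x_from))**2 / (4.0 * LAM * tau)

def sample(n_walkers=2000, n_steps=400, tau=0.2, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_walkers)
    accepted = 0
    for _ in range(n_steps):
        x_new = x + tau * drift(x) + np.sqrt(2.0 * LAM * tau) * rng.normal(size=n_walkers)
        # A(s->s') = min[1, G(s'->s) psi^2(s') / (G(s->s') psi^2(s))]
        log_ratio = (log_green(x_new, x, tau) + 2.0 * log_psi(x_new)
                     - log_green(x, x_new, tau) - 2.0 * log_psi(x))
        accept = np.log(rng.uniform(size=n_walkers)) < log_ratio
        x = np.where(accept, x_new, x)
        accepted += accept.sum()
    return float(e_local(x).mean()), accepted / (n_walkers * n_steps)

energy, acceptance = sample()
print(energy, acceptance)
```

Because rejected moves restore detailed balance with respect to $\psi_T^2$, the sampled distribution has no time-step error even at this fairly large τ; in a DMC run one would reduce τ until the acceptance ratio is near 99%.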
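Schematic of DMC
Ensemble evolves according to: diffusion, drift, branching.
[Figure: schematic of the ensemble passing through diffusion, drift and branching steps.]

Mixed estimators
Problem is that PMC samples the wrong distribution. OK for the energy.
$$\langle A\rangle_M \equiv \frac{\int dR\,\psi^*(R)\,A\,\phi(R)}{\int dR\,\psi^*(R)\,\phi(R)}, \qquad \langle A\rangle_0 \equiv \frac{\int dR\,\phi^*(R)\,A\,\phi(R)}{\int dR\,\phi^*(R)\,\phi(R)}, \qquad \langle A\rangle_V \equiv \frac{\int dR\,\psi^*(R)\,A\,\psi(R)}{\int dR\,\psi^*(R)\,\psi(R)}$$
Linear extrapolation helps correct this systematic error:
$$\langle A\rangle_0 \approx 2\langle A\rangle_M - \langle A\rangle_V + O\left((\phi-\psi)^2\right)$$
$$\langle A\rangle_0 \approx \frac{\langle A\rangle_M^2}{\langle A\rangle_V} + O\left((\phi-\psi)^2\right) \quad\text{for the density}$$
$$\langle A\rangle_M = \langle A\rangle_V \;\Rightarrow\; \int dR\,(\phi-\psi)^2 \text{ minimized wrt } A$$
Other solutions:
Maximum overlap
Forward walking
Reptation/path integrals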
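The effect of the extrapolated estimators can be seen on a two-state toy problem. This is a minimal sketch, assuming a hypothetical 2x2 observable A, an "exact" state φ and a trial state ψ = φ + εδ (all invented for illustration); it compares the error of the mixed estimator with the errors of the linear and ratio extrapolations as ε shrinks.

```python
import numpy as np

def expect(bra, op, ket):
    """<bra|op|ket> / <bra|ket> for real vectors."""
    return (bra @ op @ ket) / (bra @ ket)

A = np.array([[1.0, 1.0],
              [1.0, 2.0]])           # hypothetical observable
phi = np.array([1.0, 0.0])           # stands in for the exact state
delta = np.array([0.0, 1.0])         # direction of the trial-function error

errs = {}
for eps in (0.1, 0.01):
    psi = phi + eps * delta          # trial state with error of order eps
    a_m = expect(psi, A, phi)        # mixed estimator
    a_v = expect(psi, A, psi)        # variational estimator
    a_0 = expect(phi, A, phi)        # exact expectation value
    errs[eps] = (abs(a_m - a_0),
                 abs(2.0 * a_m - a_v - a_0),
                 abs(a_m**2 / a_v - a_0))
    print(eps, errs[eps])
```

Reducing ε by a factor of 10 reduces the mixed-estimator error by about 10 but the extrapolated errors by roughly 100, consistent with the second-order remainder in the formulas above.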
Other projector functions can be used
$$G(E) = e^{-\tau(E-E_T)} \quad\text{Diffusion MC}$$
$$G(E) = \left[1+\tau(E-E_T)\right]^{-1} \quad\text{Green's Function MC}$$
$$G(E) = 1-\tau(E-E_T) \quad\text{Power MC}$$
Common effect in the long-time (iteration) limit: the ground state remains after many iterations.
$$\tau = -\left.\frac{dG}{dE}\right|_{E_T} = \text{time step}$$
For all 3 cases:
$$\lim_{n\to\infty} G^n(E) = e^{-n\tau(E-E_T)}$$
The 3rd choice generates a Krylov sequence. It only works for bounded spectra such as a lattice model.

Green's Function Monte Carlo
Kalos, Levesque, Verlet, Phys. Rev. A9, 2178 (1974).
It is possible to make a zero time-step-error method.
Works with the integral formulation of DMC:
$$G(R,R') = \langle R|\left[1+\tau(H-E_T)\right]^{-1}|R'\rangle = \int_0^\infty \frac{d\beta}{\tau}\,\langle R|e^{-\beta\left(\frac{1}{\tau}+H-E_T\right)}|R'\rangle$$
Sample the time-step from a Poisson distribution.
Express the operator in a series expansion and sample the terms stochastically:
$$G(R,R') = H(R,R') + \int dR''\,G(R,R'')\,K(R'',R')$$
Recent revival: "Continuous time Monte Carlo" for lattice models.
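The three projector functions can be compared directly on a small spectrum. The sketch below assumes a hypothetical three-level spectrum with $E_0 = E_T = 0$ and illustrative values of τ and n (none from the lecture); it prints the weight of each level relative to the ground state after n iterations.

```python
import numpy as np

def g_diffusion(e, tau, e_t):
    return np.exp(-tau * (e - e_t))          # Diffusion MC

def g_green(e, tau, e_t):
    return 1.0 / (1.0 + tau * (e - e_t))     # Green's Function MC

def g_power(e, tau, e_t):
    return 1.0 - tau * (e - e_t)             # Power MC

energies = np.array([0.0, 1.0, 3.0])         # hypothetical spectrum, E_0 = 0
tau, e_t, n = 0.1, 0.0, 200

ratios = {}
for g in (g_diffusion, g_green, g_power):
    weights = g(energies, tau, e_t) ** n     # effect of n iterations on each level
    ratios[g.__name__] = weights / weights[0]
    print(g.__name__, ratios[g.__name__])
```

All three suppress the excited levels after many iterations. The power projector only does so while $|1-\tau(E-E_T)| < 1$ for every level, which is why it needs a bounded spectrum.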
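Exact fermion methods
How can we do fermion simulations? The initial condition can be made real but not positive (for more than 1 electron in the same spin state).
In transient estimate or released-node methods one carries along the sign as a weight and samples the modulus:
$$\phi(t) = e^{-(\hat H-E_T)t}\,\mathrm{sign}(\phi(R,0))\,|\phi(R,0)|$$
Do not forbid crossing of the nodes, but carry along the sign when walks cross.
What's wrong with node release:
Because walks don't die at the nodes, the computational effort increases (bosonic noise).
The signal is in the cancellation, which dominates.
Monte Carlo can add but not subtract.

Variational-Projector Approach (Transient Estimate)
$$\Psi(\beta) = e^{-\frac{\beta}{2}H}\,\Psi$$
$$Z(\beta) = \langle\Psi(\beta)|\Psi(\beta)\rangle = \langle\Psi|e^{-\beta H}|\Psi\rangle = \int dR_0\ldots dR_p\,\Psi(R_0)\,\langle R_0|e^{-\tau H}|R_1\rangle\cdots\langle R_{p-1}|e^{-\tau H}|R_p\rangle\,\Psi(R_p), \qquad \tau = \frac{\beta}{p}$$
$$E(\beta) = \frac{\langle\Psi(\beta)|H|\Psi(\beta)\rangle}{\langle\Psi(\beta)|\Psi(\beta)\rangle} = \langle E_L(R_0)\rangle_\beta$$
$$-\frac{dE(\beta)}{d\beta} = \frac{\langle\Psi(\beta)|(H-E(\beta))^2|\Psi(\beta)\rangle}{\langle\Psi(\beta)|\Psi(\beta)\rangle} = \sigma^2(\beta) = \langle E_L(R_0)\,E_L(R_p)\rangle_\beta - E(\beta)^2 > 0$$
$\Psi(\beta)$ converges to the exact ground state.
E is an upper bound converging to the exact answer monotonically, because $-dE/d\beta = \sigma^2(\beta)$ is positive.
Sampling the modulus and carrying the sign as a weight, with $\sigma(R) = \pm 1$ the sign of $\Psi(R)$ (not to be confused with the variance $\sigma^2(\beta)$):
$$Z(\beta) = \int dR_0\ldots dR_p\,|\Psi(R_0)|\,\langle R_0|e^{-\tau H}|R_1\rangle\cdots\langle R_{p-1}|e^{-\tau H}|R_p\rangle\,|\Psi(R_p)|\,\sigma(R_0)\,\sigma(R_p)$$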
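The monotonic convergence of the transient estimate can be verified on a small matrix. This is a minimal sketch, assuming a hypothetical 3-level symmetric Hamiltonian and trial vector (both invented for illustration); it evaluates $E(\beta) = \langle\Psi|He^{-\beta H}|\Psi\rangle / \langle\Psi|e^{-\beta H}|\Psi\rangle$ in the eigenbasis.

```python
import numpy as np

def transient_energy(H, psi, beta):
    """E(beta) = <psi|H e^{-beta H}|psi> / <psi|e^{-beta H}|psi> for a symmetric H."""
    e, v = np.linalg.eigh(H)
    # Squared overlaps times Boltzmann-like factors (shifted by E_0 for stability)
    w = (v.T @ psi) ** 2 * np.exp(-beta * (e - e[0]))
    return float((w * e).sum() / w.sum())

# Hypothetical Hamiltonian and trial state, for illustration only.
H = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.3],
              [0.0, 0.3, 3.0]])
psi = np.ones(3)
e0 = np.linalg.eigh(H)[0][0]

betas = np.linspace(0.0, 12.0, 25)
energies = np.array([transient_energy(H, psi, b) for b in betas])
print(energies[0], energies[-1], e0)   # starts at the variational energy, ends near E_0
```

At β = 0 the value is the variational energy of the trial state; it then decreases toward $E_0$ from above. In an actual fermion simulation the statistical noise grows with β, so the projection has to be stopped at a finite β.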
[Figure residue: labels "VMC", "300K", "DMC timestep"; system: bcc hydrogen, N=16.]

Model fermion problem: Particle in a box
Symmetric potential: V(r) = V(-r)
Antisymmetric state: φ(r) = -φ(-r)
[Figure: initial (trial) state and final (exact) state in the box, with positive walkers on one side of the node and negative walkers on the other.]
Sign of walkers fixed by initial position. They are allowed to diffuse freely.
f(r) = number of positive walkers minus number of negative walkers.
Node is dynamically established by the diffusion process (cancellation of positive and negative walkers).
$$E(t) = \frac{\langle\sigma(0)\,\sigma(t)\,E(t)\rangle}{\langle\sigma(0)\,\sigma(t)\rangle}$$
Scaling in Released-Node
[Figure: initial distribution and later distribution of positive and negative walkers.]
At any point, positive and negative walkers will tend to cancel, so the signal is drowned out by the fluctuations.
Signal/noise ratio is:
$$e^{-t[E_F-E_B]}, \qquad t = \text{projection time}$$
$E_F$ and $E_B$ are the Fermion and Bose energies (proportional to N).
Converges, but at a slower rate. Higher accuracy requires larger t.
For general excited states:
$$\text{CPUtime} \propto \epsilon^{-2\left(1+\frac{E_F}{E_g}\right)} \approx \epsilon^{-2N\frac{e_F}{E_g}}$$
Exponential complexity!
Not a fermion problem but an excited state problem.
Cancellation is difficult in high dimensions.

Exact fermion calculations
Possible for the electron gas for up to 60 electrons.
2DEG at rs=1, N=26.
Transient estimate calculation with SJ and BF-3B trial functions.
$$\langle\Psi_T|e^{-tH}|\Psi_T\rangle$$
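The released-node cost estimate can be turned into a quick calculation. The sketch below assumes the scaling CPU time ∝ $\epsilon^{-2(1+E_F/E_g)}$ as reconstructed on the slide above, with $E_F = N e_F$ measured relative to the Bose energy; the function name and the numerical values of $e_F$ and $E_g$ are hypothetical.

```python
import math

def released_node_cost(eps, e_fermi, e_gap):
    """Relative CPU time ~ eps^(-2 (1 + E_F / E_g)); e_fermi is measured from the Bose energy."""
    return eps ** (-2.0 * (1.0 + e_fermi / e_gap))

eps = 0.1                      # target accuracy (relative units)
e_per_particle = 0.05          # hypothetical e_F per particle
e_gap = 1.0                    # hypothetical gap E_g

costs = {n: released_node_cost(eps, n * e_per_particle, e_gap) for n in (0, 10, 20, 40)}
for n, cost in costs.items():
    print(n, f"{cost:.3g}")    # cost grows exponentially with N
```

With $E_F = 0$ the familiar $\epsilon^{-2}$ Monte Carlo scaling is recovered; each additional particle multiplies the cost by a fixed factor, which is the exponential complexity referred to above.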
General statement of the "fermion problem"
Given a system with N fermions, a known Hamiltonian and a property O (usually the energy):
How much time T will it take to estimate O to an accuracy ε?
How does T scale with N and ε?
If you can map the quantum system onto an equivalent problem in classical statistical mechanics then:
$$T \propto N^\alpha\,\epsilon^{-2}, \qquad 0 < \alpha < 4$$
This would be a "solved" quantum problem!
All approximations must be controlled!
Algebraic scaling in N!
E.g. properties of Boltzmann or Bose systems in equilibrium.

"Solved Problems"
1-D problem (simply forbid exchanges).
Bosons and Boltzmanons at any temperature.
Some lattice models: Heisenberg model, 1/2 filled Hubbard model on a bipartite lattice (Hirsch).
Spin symmetric systems with purely attractive interactions: u<0 Hubbard model, nuclear Gaussian model.
Harmonic oscillators or systems with many symmetries.
Any problem with $\langle i|H|j\rangle \le 0$.
Fermions in special boxes.
Other lattice models.
Kalos and coworkers have invented a pairing method, but it is not clear whether it is approximation free and stable.
The sign problem
The fermion problem is intellectually and technologically very important.
Progress is possible, but danger: the problem may be more subtle than you first might think. New ideas are needed.
No fermion methods are perfect, but QMC is competitive with other methods and more general.
The fermion problem is one of a group of related problems in quantum mechanics (e.g. dynamics).
Feynman argues that general many-body quantum simulation is exponentially slow on a classical computer.
Maybe we have to "solve" quantum problems using "analog" quantum computers: programmable quantum computers that can emulate any quantum system.

Fixed-node method
Initial distribution is a pdf. It comes from a VMC simulation:
$$f(R,0) = |\psi_T(R)|^2$$
Drift term pushes walks away from the nodes.
Impose the condition:
$$\phi(R) = 0 \text{ when } \psi_T(R) = 0$$
This is the fixed-node BC.
Will give an upper bound to the exact energy, the best upper bound consistent with the FNBC:
$$E_{FN} \ge E_0, \qquad E_{FN} = E_0 \text{ if } \phi_0(R)\,\psi(R) \ge 0 \text{ for all } R$$
f(R,t) has a discontinuous gradient at the nodal location.
Accurate method because Bose correlations are done exactly.
Scales well, like the VMC method, as $N^3$. Classical complexity.
Can be generalized from the continuum to lattice, finite temperature, magnetic fields, ...
One needs trial functions with accurate nodes.
Proof of fixed-node theorem
Suppose we solve the S.E. in a subvolume V determined by the nodes of an antisymmetric trial function:
$$\hat H\phi_{FN} = E_{FN}\,\phi_{FN} \text{ inside } V$$
Extend the solution to all space with the permutation operator:
$$\hat\phi_{FN}(R) \equiv \frac{1}{N!}\sum_P (-1)^P\,\phi_{FN}(PR)$$
Inside a given sub-volume only permutations of a given sign (±) contribute. Hence the extended solution is non-zero.
Evaluate the variational energy of the extended trial function:
$$E_0 \le \frac{\sum_{PP'}(-1)^{P+P'}\int dR\,\phi_{FN}^*(PR)\,\hat H\,\phi_{FN}(P'R)}{\sum_{PP'}(-1)^{P+P'}\int dR\,\phi_{FN}^*(PR)\,\phi_{FN}(P'R)} = E_{FN} \le E_{VMC}$$
Edges of volumes do not contribute to the integral since the extended solution vanishes there.

Nodal Properties
If we know the sign of the exact wavefunction (the nodes), we can solve the fermion problem with the fixed-node method.
If φ(R) is real, the nodes are φ(R)=0, where R is the 3N dimensional vector.
Nodes are a 3N-1 dimensional surface. (Do not confuse with single particle orbital nodes!)
Coincidence points $r_i = r_j$ are 3N-3 dimensional hyper-planes.
In 1 spatial dimension these "points" exhaust the nodes: the fermion problem is easy to solve in 1D with the "no crossing rule."
Coincidence points (and other symmetries) only constrain nodes in higher dimensions; they do not determine them.
The nodal surfaces define nodal volumes. How many nodal volumes are there?
Conjecture: there are typically only 2 different volumes (+ and -) except in 1D (but only demonstrated for free particles).
Nodal Picture: 2d slice thru 3d space
[Figure: nodal surface seen by one free electron with the other electrons held fixed.]
Nodes pass thru the positions of the other electrons.
Divides space into 2 regions.
Wavelength given by interparticle spacing.

SPIN?
How do we treat spin in QMC?
For extended systems we use the $S_z$ representation.
We have a fixed number of up and down electrons and we antisymmetrize among electrons with the same spin.
This leads to 2 Slater determinants.
For a given trial function, its real part is also a trial function (but it may have different symmetries), for example momentum:
$$\left(e^{ikr}, e^{-ikr}\right) \quad\text{or}\quad \left(\cos(kr), \sin(kr)\right)$$
For the ground state, without magnetic fields or spin-orbit interaction, we can always work with real functions.
However, in some cases it may be better to work with complex functions.
Fixed-Phase method
Ortiz, Martin, DMC 1993
Generalize the FN method to complex trial functions:
$$\Psi(R) = e^{-U(R)}$$
Since the Hamiltonian is Hermitian, the variational energy is real:
$$E_V = \frac{\int dR\,e^{-2\Re U(R)}\left[V(R) + \lambda\nabla^2\Re U(R) - \lambda\left[\Re\nabla U(R)\right]^2 + \lambda\left[\Im\nabla U(R)\right]^2\right]}{\int dR\,e^{-2\Re U(R)}}$$
We see only one place where the energy depends on the phase of the wavefunction.
If we fix the phase, then we add this term to the potential energy. In a magnetic field we get also the vector potential:
$$\text{effective potential} = V(R) + \sum_i \lambda_i\left[A(r_i) + \Im\nabla_i U(R)\right]^2$$
We can now do VMC or DMC and get upper bounds as before.
The imaginary part of the local energy will not be zero unless the right phase is used.
Used for twisted boundary conditions, magnetic fields, vortices, phonons, spin states, ...

Problem with core electrons
Bad scaling in both VMC and DMC.
In VMC, energy fluctuations from the core dominate the calculation.
In DMC, the time step will be controlled by core dynamics.
Solution is to eliminate core states by a pseudopotential.
Conventional solution: semi-local form
$$\langle r|\hat v_{e\text{-}core}|r'\rangle = v_{local}(r)\,\delta(r-r') + \sum_l v_l(r)\,P_l(\cos\theta_{rr'})$$
where $\theta_{rr'}$ is the angle between r and r'.
Ensures that valence electrons go into well defined valence states, with the wavefunction and energy for each angular momentum state prescribed.
PP is non-local: OK for VMC. Leads to an extra MC integral.
But DMC uses a locality approximation and good trial functions. Extra approximation.
Summary of T=0 methods: Variational (VMC), Fixed-node (FN), Released-node (RN)
[Figure: error (a.u.) vs. computer time (sec) on log-log axes, 1.E-05 to 1.E+01 a.u. and 1.E+00 to 1.E+07 sec; curves labeled VMC, FN and RN, for a simple and a better trial function.]

Problems with projector methods
Fixed-node is a super-variational method.
DMC dynamics is determined by the Hamiltonian.
Zero-variance principle allows very accurate calculation of the ground state energy if the trial function is good.
Projector methods need a trial wavefunction for accuracy. They are essentially methods that perturb from the trial function to the exact function. (Note: if you don't use a trial function, you are perturbing from the ideal gas.)
Difficulty calculating properties other than energy. We must use "extrapolated estimators" or "forward walking":
$$f(R,\infty) = \phi_0(R)\,\psi_T(R), \text{ not } \phi_0^2(R)$$
Bad for phase transitions and finite temperature, complex systems.
Path Integral MC solves some of these problems.