Susumu YAMADA 1,3, Toshiyuki IMAMURA 2,3, Masahiko MACHIDA 1,3
1 Dynamical Variation of Eigenvalue Problems in Density-Matrix Renormalization-Group Code
PP12, Feb. 15
1 Center for Computational Science and e-systems, Japan Atomic Energy Agency
2 The University of Electro-Communications
3 CREST (JST)
2 Outline
- Strongly correlated quantum systems
- Parallelization scheme for the density matrix renormalization group method
- Communication strategy for a massively parallel computer
- Numerical experiment
- Auto-tuning for the parallel DMRG method
- Conclusion
3 Strongly-correlated Quantum Systems
A typical example: high-Tc cuprate superconductors, e.g. $\mathrm{Bi_2Sr_2CaCu_2O_{8-\delta}}$.
[Figure: crystalline structure built from BiO, SrO, CuO, and Ca layers, alternating between superconducting layers (SL) and insulating layers (IL); inset of the CuO$_2$ plane with Cu and O sites.]
The simple model: Hubbard model.
[Figure: Hubbard Hamiltonian on a lattice, with hopping $t$ between sites and on-site repulsion $U$.]
- $U$: Coulomb interaction
- $t$: hopping parameter
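For reference, assuming the standard single-band form with nearest-neighbour hopping (the notation on the slide itself may differ, and sign conventions vary between references), the Hubbard Hamiltonian is usually written as:

```latex
H = -t \sum_{\langle i,j \rangle, \sigma}
      \left( c^{\dagger}_{i\sigma} c_{j\sigma} + \mathrm{h.c.} \right)
    + U \sum_{i} n_{i\uparrow} n_{i\downarrow},
\qquad n_{i\sigma} = c^{\dagger}_{i\sigma} c_{i\sigma}
```

Here $c^{\dagger}_{i\sigma}$ creates an electron with spin $\sigma$ on site $i$, and $\langle i,j \rangle$ runs over neighbouring sites.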
4 Density Matrix Renormalization Group
[Figure: superblock made of a system block and an environment block, each extended by renormalization steps; labels: 2-D direction, leg-direction, A L (system), A R (environment).]
- Direct extension of the DMRG method toward 2D models: the dimension of the Hamiltonian increases exponentially.
- Hence: parallelization of DMRG.
5 Target of parallelization
The time-consuming operations of the DMRG method:
- Solving all eigenpairs of a density matrix (dense matrix): ScaLAPACK.
- Solving the ground state of the Hamiltonian for the superblock (large sparse matrix): an iterative method is generally utilized (Lanczos method, LOBPCG method, ...).
The most time-consuming operation of the iterative method: Hamiltonian (large sparse matrix)-vector multiplication.
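The role of the matrix-vector product in the ground-state solve can be illustrated with a small Lanczos sketch. This is a minimal NumPy version with full reorthogonalization, not the authors' code; the name `lanczos_ground_state` and its parameters are assumptions made for this example. The Hamiltonian enters only through the user-supplied `matvec` callable, which is called once per iteration.

```python
import numpy as np

def lanczos_ground_state(matvec, n, num_iter=50, seed=0):
    """Approximate the lowest eigenpair of a symmetric operator.

    matvec: callable returning H @ v for a length-n vector v (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    basis = [v]
    alphas, betas = [], []
    for _ in range(min(num_iter, n)):
        w = matvec(basis[-1])          # the dominant cost in practice
        alphas.append(basis[-1] @ w)
        # Full reorthogonalization against all previous Lanczos vectors.
        for u in basis:
            w = w - (u @ w) * u
        beta = np.linalg.norm(w)
        if beta < 1e-12:               # Krylov space exhausted
            break
        betas.append(beta)
        basis.append(w / beta)
    m = len(alphas)
    off = betas[:m - 1]
    # Small tridiagonal matrix whose lowest eigenpair approximates that of H.
    T = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    evals, evecs = np.linalg.eigh(T)
    Q = np.array(basis[:m]).T
    return evals[0], Q @ evecs[:, 0]
```

In a DMRG code the `matvec` callable would be the superblock Hamiltonian-vector product, which is why that single operation is the natural target for parallelization.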
6 Parallelization using feature of model
Superblock for quasi-2D model: divide the model into 3 blocks, $H \to H_l, H_c, H_r$.
[Figure: superblock consisting of Block 1 to Block 4 with state indices $i_1, i_2, i_3, i_4$.]
The Hamiltonian $H$ is decomposed as
$H = I_4 \otimes I_3 \otimes H_l + I_4 \otimes H_c \otimes I_1 + H_r \otimes I_2 \otimes I_1$,
where $I_i$ is the identity matrix whose dimension is the same as the number of states of block $i$.
Hamiltonian-vector multiplication:
$Hv = (I_4 \otimes I_3 \otimes H_l)v + (I_4 \otimes H_c \otimes I_1)v + (H_r \otimes I_2 \otimes I_1)v$,
i.e. 3 matrix-vector multiplications.
7 Parallelization of matrix-vector multiplication
Convert the vector $v$ into matrices $V_l$, $V_c$, and $V_r$ in consideration of the direct product with the identity matrix:
- $(I_4 \otimes I_3 \otimes H_l)v \to H_l V_l$
- $(I_4 \otimes H_c \otimes I_1)v \to H_c V_c$
- $(H_r \otimes I_2 \otimes I_1)v \to H_r V_r$
Three sparse matrix-vector multiplications become three sparse matrix-dense matrix multiplications.
Parallelization of the sparse matrix-dense matrix multiplication:
- Partition the dense matrix columnwise; the computation cost can be partitioned equally.
- Transformation of the partitioned data of matrices $V_l$, $V_c$, and $V_r$: all-to-all communication.
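The reshaping idea can be checked numerically on a tiny example. The sketch below is an illustration under stated assumptions: the function name `apply_superblock`, the `dims` tuple, and the storage convention ($v$ stored with $i_4$ as the slowest index and $i_1$ as the fastest) are choices made here, and dense NumPy arrays stand in for the sparse matrices $H_l$, $H_c$, $H_r$.

```python
import numpy as np

def apply_superblock(Hl, Hc, Hr, v, dims):
    """Compute H v without ever forming H.

    dims = (n1, n2, n3, n4): number of states of blocks 1..4 (assumed layout).
    Hl acts on (i2, i1), Hc on (i3, i2), Hr on (i4, i3).
    """
    n1, n2, n3, n4 = dims
    # (I4 x I3 x Hl) v  ->  Hl @ Vl, with Vl of shape (n2*n1, n4*n3)
    Vl = v.reshape(n4 * n3, n2 * n1).T
    out_l = (Hl @ Vl).T.reshape(-1)
    # (I4 x Hc x I1) v  ->  Hc @ Vc, with Vc of shape (n3*n2, n4*n1)
    Vc = v.reshape(n4, n3 * n2, n1).transpose(1, 0, 2).reshape(n3 * n2, n4 * n1)
    out_c = (Hc @ Vc).reshape(n3 * n2, n4, n1).transpose(1, 0, 2).reshape(-1)
    # (Hr x I2 x I1) v  ->  Hr @ Vr, with Vr of shape (n4*n3, n2*n1)
    Vr = v.reshape(n4 * n3, n2 * n1)
    out_r = (Hr @ Vr).reshape(-1)
    return out_l + out_c + out_r

def explicit_superblock(Hl, Hc, Hr, dims):
    """Reference construction with Kronecker products (small sizes only)."""
    n1, n2, n3, n4 = dims
    I1, I2, I3, I4 = (np.eye(n) for n in (n1, n2, n3, n4))
    return (np.kron(I4, np.kron(I3, Hl))
            + np.kron(I4, np.kron(Hc, I1))
            + np.kron(Hr, np.kron(I2, I1)))
```

Each of the three products is a matrix times a dense matrix whose columns are independent, which is what makes a columnwise partition across processes straightforward.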
8 Communication for transformation between partitioned matrices
The all-to-all communication can realize the transformation between the data of the partitioned matrices $V_l$, $V_c$, and $V_r$.
[Figure: Ex. all-to-all communication on 4 processes, with communications labelled COM1, COM2, COM3, ... between $V_l$, $V_c$, and $V_r$, and the conflicts marked.]
- The communication conflict occurs because all processes communicate simultaneously.
- The all-to-all communication is not suitable for a massively parallel computer.
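Why a change of partitioning needs an all-to-all exchange can be seen in a single-process simulation. The sketch below assumes a simple case that is not spelled out on the slide: a matrix partitioned by columns over `P` simulated processes is re-partitioned by rows. The function name and the list-of-blocks representation are hypothetical.

```python
import numpy as np

def redistribute_columns_to_rows(col_blocks):
    """Simulate an all-to-all that turns a column partition into a row partition.

    col_blocks[p] is the column block owned by simulated process p.
    The number of rows is assumed to be divisible by the number of processes.
    """
    P = len(col_blocks)
    r = col_blocks[0].shape[0] // P
    # send[p][q]: the piece process p must send to process q.
    send = [[col_blocks[p][q * r:(q + 1) * r, :] for q in range(P)]
            for p in range(P)]
    # All-to-all: process q receives one piece from every process p.
    recv = [[send[p][q] for p in range(P)] for q in range(P)]
    # Each process glues its received pieces into its row block.
    return [np.hstack(recv[q]) for q in range(P)]
```

Every process sends a piece to every other process, so all `P * (P - 1)` messages are in flight at once, which is the situation in which conflicts occur.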
9 2-step communication
All-to-all communication on all processes can be avoided by doubling the communication.
[Figure: Ex. 2-step communication on 4 processes, transforming between $V_l$, $V_c$, and $V_r$ with two communication steps per transformation.]
- The amount of communication data in each step is the same as in the all-to-all communication.
- The communication conflict decreases.
- But the total amount of communication data becomes double.
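One common way to split an all-to-all into two steps, shown here as an assumption rather than as the authors' exact scheme, is to exchange first inside groups of consecutive ranks and then between processes that share the same position in their group. The single-process simulation below uses hypothetical names and checks that the result equals the direct exchange.

```python
def direct_all_to_all(send):
    """send[src][dst] is the item src wants to deliver to dst."""
    P = len(send)
    return [[send[src][dst] for src in range(P)] for dst in range(P)]

def two_step_all_to_all(send, group_size):
    """Same result as direct_all_to_all, routed through one relay per item."""
    P = len(send)
    assert P % group_size == 0
    S = group_size
    # Step 1 (inside each group): src hands the item to the member of its
    # own group that has the same local index as the final destination.
    staged = [dict() for _ in range(P)]
    for src in range(P):
        group = src // S
        for dst in range(P):
            relay = group * S + dst % S
            staged[relay][(src, dst)] = send[src][dst]
    # Step 2 (between groups): each relay forwards to processes with the
    # same local index, so it only talks to one process per group.
    recv = [[None] * P for _ in range(P)]
    for relay in range(P):
        for (src, dst), item in staged[relay].items():
            assert relay % S == dst % S
            recv[dst][src] = item
    return recv
```

In this scheme every item travels twice, which matches the doubled data volume, while each process exchanges data with only `group_size - 1` partners in the first step and `P // group_size - 1` partners in the second.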
10 Numerical Experiment
T2K Open Supercomputer (Todai Combined Cluster), The University of Tokyo
- Processor: AMD Opteron 8356 quad-core (2.3 GHz)
- Number of processors per node: 4 (16 cores)
- Network: Myrinet-10G link
- Bandwidth: 5 GB/s full-duplex
- Compiler: Intel Fortran Compiler 11.0
- Option: -O3 -ip
- Parallelization: flat MPI
11 Numerical Experiment
4x10-site Hubbard model, 19 up-spins, 19 down-spins, U/t=10
[Figure: total elapsed time (sec) versus number of states kept (m) on 64, 128, 256, 512, and 1024 cores; one panel for conventional all-to-all communication, one panel for 2-step communication.]
Slowdown on 1024 cores.
12 Reason for the slowdown
[Figure: communication and calculation time distribution for matrix-vector multiplication (m=200); elapsed time (sec) for conventional communication and 2-step communication at two core counts, the larger being 1024 cores; bars split into COM1, COM2, COM3, COM4, and calculation. Annotations: "All communication times decrease" on one panel, "COM2 and COM3 increase" on the other.]
In the transformations between $V_l$, $V_c$, and $V_r$:
- COM1, COM4: no problem
- COM2, COM3: factor in the slowdown
13 Reason for the slowdown
Ex. parallel computer with 8 dual-core processors (cores P0 to P15)
[Figure: two diagrams of processes P0 to P15 mapped onto dual-core processors, one showing the COM1/COM4 pattern and one showing the COM2/COM3 pattern.]
- COM1, COM4: the conflict hardly occurs, because the communication is local.
- COM2, COM3: the conflict may occur frequently, because the communication is global.
14 Scheduling for overlapping the calculation and the communication
[Figure: processes P0 to P15 divided into groups.]
- Execute the communication group by group, one at a time: communication conflict can be avoided.
- Some cores, which do not execute the communication, become idle.
- Execute the calculation on the idle cores: overlapping calculation and communication.
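The scheduling idea can be written down as a simple round-based plan. The sketch below is only an illustration: the round structure, the contiguous grouping of ranks, and the name `overlap_schedule` are assumptions, since the slide does not specify how the groups are formed or ordered.

```python
def overlap_schedule(num_processes, group_size):
    """Return one entry per round: which ranks communicate, which calculate."""
    groups = [list(range(start, start + group_size))
              for start in range(0, num_processes, group_size)]
    schedule = []
    for active, members in enumerate(groups):
        # Exactly one group communicates in this round.
        communicating = members
        # Every other rank would be idle, so it is given calculation work.
        calculating = [rank for g, other in enumerate(groups)
                       if g != active for rank in other]
        schedule.append({"communicate": communicating,
                         "calculate": calculating})
    return schedule
```

For 16 processes in groups of 4 this gives 4 rounds, and in each round 12 ranks calculate while 4 communicate.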
15 Effect of overlapping the calculation and the communication
4x10-site Hubbard model, 19 up-spins, 19 down-spins, U/t=
T2K Open Supercomputer (Todai Combined Cluster), parallelization: flat MPI
[Figure: total elapsed time (sec) versus number of states kept (m) for several core counts up to 1024 cores, and matrix-vector multiplication time (sec) for the conventional method, the 2-step communication method, and the overlap method, split into COM1, COM2, COM3, COM4, calculation, and calculation+communication.]
Speedup up to 1024 cores.
16 Targets of auto-tuning for parallel DMRG method
In our parallel strategy, the performance of two operations strongly depends on the computer architecture:
- Pattern of communication groups for the 2-step all-to-all communication
- Eigenvalue problem for the density matrix
17 Pattern of communication group for 2-step all-to-all communication
- The network architecture of a multi-core parallel computer system is complex and often heterogeneous.
- We can choose various patterns of communication groups for the 2-step all-to-all communication.
Example patterns of communication groups for COM1 and COM4 on a parallel computer with 8 dual-core processors:
- 4 groups of 4 processes
- 8 groups of 2 processes
[Figure: processes P0 to P15 grouped according to each of the two patterns.]
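The candidate patterns can be enumerated mechanically. The sketch below assumes, as a simplification not stated on the slide, that groups have equal size and consist of consecutive ranks; both function names are hypothetical.

```python
def group_patterns(num_processes):
    """List (number_of_groups, group_size) for every equal-sized grouping."""
    return [(num_processes // size, size)
            for size in range(1, num_processes + 1)
            if num_processes % size == 0]

def contiguous_groups(num_processes, group_size):
    """Rank lists for one pattern, grouping consecutive ranks together."""
    return [list(range(start, start + group_size))
            for start in range(0, num_processes, group_size)]
```

For 16 processes, `group_patterns(16)` contains both patterns from the example, `(4, 4)` and `(8, 2)`, along with the degenerate cases of a single group or 16 groups of one process.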
18 Elapsed time for patterns of communication group
4x10-site Hubbard model, 18 up-spins, 18 down-spins, U/t=10, number of states kept: 400
FUJITSU PRIMERGY BX900 (Japan Atomic Energy Agency), 1024 cores
[Figure: total elapsed time (sec) versus number of communication groups of COM1 and COM4, with the optimal case marked.]
- The performance strongly depends on the number of communication groups.
- We have to optimize the pattern by executing the DMRG method on various patterns.
- Auto-tuning is required.
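An empirical search of this kind can be sketched as a small tuning loop. Everything in the block is an assumption for illustration: the names `autotune` and `time_call`, the `repeats` parameter, and the idea that a caller supplies a `measure(candidate)` function returning elapsed seconds for one trial run with that number of communication groups.

```python
import time

def time_call(fn, *args):
    """Elapsed wall-clock seconds of one call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

def autotune(candidates, measure, repeats=3):
    """Try every candidate and keep the fastest.

    measure(candidate) must return the elapsed time of one trial run.
    The best of `repeats` trials is used to reduce timing noise.
    """
    timings = {}
    for candidate in candidates:
        timings[candidate] = min(measure(candidate) for _ in range(repeats))
    best = min(timings, key=timings.get)
    return best, timings
```

A caller might pass `measure=lambda g: time_call(run_short_dmrg_step, g)`, where `run_short_dmrg_step` is a hypothetical function that executes a few matrix-vector multiplications with `g` communication groups.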
19 Eigenvalue problem for density matrix
The density matrix is a block diagonal matrix, and the dimensions of the block matrices vary.
- Ex. Assign all processors to the large matrices (all PEs) and solve the small ones by serial computing (1 PE).
- Ex. Assign the optimal number of processors to each problem.
[Figure: block matrices A, B, C, D with processor assignments; labels: All PEs, Serial computing (1 PE), 1000 PEs, 100 PEs, 10 PEs, 1120 PEs.]
It is very difficult to estimate the optimal number of processors theoretically. Auto-tuning is demanded.
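The block structure means each block is an independent eigenvalue problem. The sketch below shows this serially with NumPy; it is not the parallel ScaLAPACK-based solver, and the function names are hypothetical. In a parallel code each call to the eigensolver would be handed to a separately sized group of processors.

```python
import numpy as np

def eigh_block_diagonal(blocks):
    """Solve each symmetric diagonal block on its own.

    Returns a list of (eigenvalues, eigenvectors) pairs, one per block.
    """
    return [np.linalg.eigh(block) for block in blocks]

def assemble_block_diagonal(blocks):
    """Build the full matrix, used here only to check the result."""
    n = sum(block.shape[0] for block in blocks)
    full = np.zeros((n, n))
    offset = 0
    for block in blocks:
        k = block.shape[0]
        full[offset:offset + k, offset:offset + k] = block
        offset += k
    return full
```

Because the cost of a dense eigensolve grows quickly with the block dimension, blocks of very different sizes benefit from very different processor counts, which is the quantity left to auto-tuning.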
20 Conclusion
We proposed a parallelization strategy of the DMRG method for quasi-2-dimensional quantum models.
Key points:
- Hamiltonian (sparse matrix)-vector multiplication: converted into sparse matrix-dense matrix multiplication using the property of the quantum model.
- Parallelization by decomposing the dense matrix: requires all-to-all communication.
- Strategy for avoiding conflict: 2-step communication, and overlapping of communication and calculation.
Our method maintains parallel efficiency up to 1024 cores.
Future work: we will develop auto-tuning schemes to optimize
- the dividing pattern of communication groups for the 2-step communication,
- the parallel eigenvalue solver for the density matrix.
More informationParallelization of the Dirac operator. Pushan Majumdar. Indian Association for the Cultivation of Sciences, Jadavpur, Kolkata
Parallelization of the Dirac operator Pushan Majumdar Indian Association for the Cultivation of Sciences, Jadavpur, Kolkata Outline Introduction Algorithms Parallelization Comparison of performances Conclusions
More informationBlock Iterative Eigensolvers for Sequences of Dense Correlated Eigenvalue Problems
Mitglied der Helmholtz-Gemeinschaft Block Iterative Eigensolvers for Sequences of Dense Correlated Eigenvalue Problems Birkbeck University, London, June the 29th 2012 Edoardo Di Napoli Motivation and Goals
More informationAlgorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method
Algorithm for Sparse Approximate Inverse Preconditioners in the Conjugate Gradient Method Ilya B. Labutin A.A. Trofimuk Institute of Petroleum Geology and Geophysics SB RAS, 3, acad. Koptyug Ave., Novosibirsk
More informationFROM NODAL LIQUID TO NODAL INSULATOR
FROM NODAL LIQUID TO NODAL INSULATOR Collaborators: Urs Ledermann and Maurice Rice John Hopkinson (Toronto) GORDON, 2004, Oxford Doped Mott insulator? Mott physics: U Antiferro fluctuations: J SC fluctuations
More informationTechniques for translationally invariant matrix product states
Techniques for translationally invariant matrix product states Ian McCulloch University of Queensland Centre for Engineered Quantum Systems (EQuS) 7 Dec 2017 Ian McCulloch (UQ) imps 7 Dec 2017 1 / 33 Outline
More informationPerformance Evaluation of MPI on Weather and Hydrological Models
NCAR/RAL Performance Evaluation of MPI on Weather and Hydrological Models Alessandro Fanfarillo elfanfa@ucar.edu August 8th 2018 Cheyenne - NCAR Supercomputer Cheyenne is a 5.34-petaflops, high-performance
More informationPorting a sphere optimization program from LAPACK to ScaLAPACK
Porting a sphere optimization program from LAPACK to ScaLAPACK Mathematical Sciences Institute, Australian National University. For presentation at Computational Techniques and Applications Conference
More informationIntroduction to Density Functional Theory
1 Introduction to Density Functional Theory 21 February 2011; V172 P.Ravindran, FME-course on Ab initio Modelling of solar cell Materials 21 February 2011 Introduction to DFT 2 3 4 Ab initio Computational
More informationSuperconductivity in Fe-based ladder compound BaFe 2 S 3
02/24/16 QMS2016 @ Incheon Superconductivity in Fe-based ladder compound BaFe 2 S 3 Tohoku University Kenya OHGUSHI Outline Introduction Fe-based ladder material BaFe 2 S 3 Basic physical properties High-pressure
More informationNew trends in density matrix renormalization
Advances in Physics, Vol. 55, Nos. 5 6, July October 2006, 477 526 New trends in density matrix renormalization KAREN A. HALLBERG Instituto Balseiro and Centro Ato mico Bariloche, Comisio n Nacional de
More informationAll-electron density functional theory on Intel MIC: Elk
All-electron density functional theory on Intel MIC: Elk W. Scott Thornton, R.J. Harrison Abstract We present the results of the porting of the full potential linear augmented plane-wave solver, Elk [1],
More informationMagnetic-field-tuned superconductor-insulator transition in underdoped La 2-x Sr x CuO 4
Magnetic-field-tuned superconductor-insulator transition in underdoped La 2-x Sr x CuO 4 Dragana Popović National High Magnetic Field Laboratory Florida State University, Tallahassee, FL, USA Collaborators
More informationarxiv:cond-mat/ v1 [cond-mat.str-el] 4 Sep 2006
Advances in Physics Vol. 00, No. 00, January-February 2005, 1 54 arxiv:cond-mat/0609039v1 [cond-mat.str-el] 4 Sep 2006 New Trends in Density Matrix Renormalization KAREN A. HALLBERG Instituto Balseiro
More information