Spin glass simulations on Janus
2 Spin glass simulations on Janus R. (lele) Tripiccione, Dipartimento di Fisica, Università di Ferrara. UCHPC, Rodos (Greece), Aug. 27th, 2012
3 Warning / Disclaimer / Fine print I'm an outsider here ---> a physicist's view on an application-specific architecture A flavor of physics-motivated, performance-paranoid, (hopefully) unconventional computer architecture However a few points of contact with mainstream CS may still exist...
4 On the menu today WHAT?: spin-glass simulations in short WHY?: computational challenges HOW?: the JANUS systems DID IT WORK?: measured and expected performance (and comparison with conventional systems) Take-away lessons / Conclusions
5 Our computational problem Bring a spin-glass (*) system of e.g. grid points to thermal equilibrium: - a challenge never attempted so far ---> - follow the system for Monte Carlo (*) steps - on ~100 independent system instances Back-of-envelope estimate: 1 high-end CPU for 10,000 years (which is not the same as 10,000 CPUs for 1 year...) (*) to be defined in the next slides
6 Statistical mechanics in brief... Statistical mechanics tries to describe the macroscopic behaviour of matter in terms of average values of its microscopic structure A (hopefully familiar) example: explain why magnets have a transition temperature beyond which they lose their magnetic state
7 The Ising model... The tiny little magnets are named spins; they take just two values A configuration is a specific value assignment for all spins in the system The macro-behavior is dictated by the energy function at the micro level: each spin interacts only with its nearest neighbours in a discrete D-dim mesh: $U\{S\} = -\sum_{\langle ij \rangle} J S_i S_j$, $J > 0$ Statistical physics bridges the gap from micro to macro...
8 The spin-glass model... Spin-glasses are a generalization of Ising systems. They are the reference theoretical model of glassy behavior Interesting per se A model of complexity Interesting for industrial applications An apparently trivial change in the energy functions makes spin-glasses much more complex than Ising systems Studying these systems is a computational nightmare...
9 Why are Spin Glasses so hard?? A very simple change in the energy function (defined on e.g. a discrete 3-D lattice) $U = -\sum_{\mathrm{NB}\,\langle ij \rangle} J_{ij} \sigma_i \sigma_j$, $\sigma = \{+1, -1\}$, $J = \{+1, -1\}$ hides tremendously complex dynamics, due to the extremely irregular energy landscape in the configuration space (frustration):
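Frustration can already be seen on a single square of four spins. The sketch below is an illustration only, not material from the talk: it assumes the usual sign convention $U = -\sum J_{ij} \sigma_i \sigma_j$, and the helper names `plaquette_energy` and `ground_state_energy` are made up for this example. It enumerates all 16 spin assignments for two choices of the four couplings and reports the lowest energy reachable.

```python
from itertools import product

def plaquette_energy(spins, couplings):
    """U = -sum_k J_k * s_k * s_(k+1) around a 4-site square (illustrative helper)."""
    return -sum(couplings[k] * spins[k] * spins[(k + 1) % 4] for k in range(4))

def ground_state_energy(couplings):
    """Minimum of U over all 2**4 spin assignments."""
    return min(plaquette_energy(s, couplings) for s in product((+1, -1), repeat=4))

unfrustrated = (+1, +1, +1, +1)   # product of the four couplings is +1
frustrated = (+1, +1, +1, -1)     # product of the four couplings is -1

print(ground_state_energy(unfrustrated))  # every bond satisfied: -4
print(ground_state_energy(frustrated))    # one bond is always broken: -2
```

When the product of the couplings around the square is negative, no assignment satisfies all four bonds at once. On a full lattice with random couplings this happens on a large fraction of the squares, which is one way to picture the irregular energy landscape.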
10 Monte Carlo algorithms These beasts are best studied numerically by Monte Carlo algorithms Monte Carlo algorithms navigate in configuration space in such a way that: ----> any configuration will show up according to its probability to be realized in the real world (at a given temperature) MC algorithms come in several versions most versions have remarkably similar requirements in terms of their algorithmic structure.
11 The Metropolis algorithm An endless loop... Pick one (or several) spin(s); compute the energy $U$; flip it/them; compute the new energy $U'$; compute $\Delta U = U' - U$; if $\Delta U \le 0$ accept the change unconditionally, else accept the change only with probability $e^{-\Delta U / kT}$; pick new spin(s) and do it again
12 ... just a few C lines
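The loop on the previous slide can be sketched compactly. What follows is a Python stand-in, not the C code the slide refers to: it assumes a 3-D lattice with periodic boundaries, couplings and spins equal to +1 or -1, the sign convention $U = -\sum J_{ij} \sigma_i \sigma_j$, and $\beta = 1/kT$. All function names are hypothetical.

```python
import math
import random

def make_lattice(L, rng):
    """Random +/-1 spins and +/-1 couplings on an L^3 periodic lattice."""
    n = L ** 3
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    # couplings[d][i] is the bond from site i to its forward neighbour along d.
    couplings = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(3)]
    return spins, couplings

def neighbours(i, L):
    """Yield (direction, forward_site, backward_site) with periodic wrap."""
    coords = [i % L, (i // L) % L, i // (L * L)]
    strides = [1, L, L * L]
    for d in range(3):
        fwd = i + (((coords[d] + 1) % L) - coords[d]) * strides[d]
        bwd = i + (((coords[d] - 1) % L) - coords[d]) * strides[d]
        yield d, fwd, bwd

def local_field(i, spins, couplings, L):
    """h_i = sum over the six neighbours j of J_ij * s_j."""
    h = 0
    for d, fwd, bwd in neighbours(i, L):
        h += couplings[d][i] * spins[fwd]     # bond stored at site i
        h += couplings[d][bwd] * spins[bwd]   # bond stored at the backward site
    return h

def total_energy(spins, couplings, L):
    """U = -sum over all bonds of J_ij * s_i * s_j (each bond counted once)."""
    return -sum(couplings[d][i] * spins[i] * spins[fwd]
                for i in range(L ** 3) for d, fwd, _ in neighbours(i, L))

def metropolis_sweep(spins, couplings, L, beta, rng):
    """Visit every site once; flipping s_i changes U by 2 * s_i * h_i."""
    for i in range(L ** 3):
        delta_u = 2 * spins[i] * local_field(i, spins, couplings, L)
        if delta_u <= 0 or rng.random() < math.exp(-beta * delta_u):
            spins[i] = -spins[i]

rng = random.Random(0)
L = 4
spins, couplings = make_lattice(L, rng)
energy_before = total_energy(spins, couplings, L)
for _ in range(10):
    metropolis_sweep(spins, couplings, L, beta=1.0, rng=rng)
print(energy_before, total_energy(spins, couplings, L))
```

Only the local field of the chosen spin is needed to decide a flip, so the full energy never has to be recomputed inside the loop. This is the property that makes the update so cheap in hardware.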
13 Monte Carlo algorithms Common features: bit-manipulation operations on spins (+ LUT access); (good-quality/long) random numbers; a huge degree of available parallelism; regular program flow (orderly loops on the grid sites); regular, predictable memory access pattern; information exchange (processor<->memory) is huge; however the size of the data set is tiny ---> many small (not too small) cores, hardwired control, on-chip memory
14 Compute intensive, you mean?? One Monte Carlo step is roughly the (real) time in which a (real) system flips one of its spins, roughly 1 pico-second If you want to understand what happens in just the first seconds of a real experiment you need $O(10^{12})$ time steps on ~ 100 replicas of a system ---> updates Clever programming on standard CPUs: 1 ns /spin-update ---> 3000 years
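The order of magnitude of this estimate can be checked with a few lines of arithmetic. In the sketch below the number of time steps, the number of replicas and the 1 ns update time are taken from the slide. The lattice size of 100^3 sites is a purely hypothetical value chosen for illustration, since the slide's own figure is not available here, and the function name is made up.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def wall_clock_years(n_spins, mc_steps, replicas, seconds_per_update):
    """Single-processor time, in years, for the whole simulation campaign."""
    updates = n_spins * mc_steps * replicas
    return updates * seconds_per_update / SECONDS_PER_YEAR

# n_spins is an assumed, illustrative value; the other numbers are from the slide.
years = wall_clock_years(n_spins=100 ** 3, mc_steps=10 ** 12,
                         replicas=100, seconds_per_update=1e-9)
print(f"{years:.0f} years")
```

With these inputs the result lands at a few thousand years, the same order as the figure quoted on the slide. A smaller or larger lattice scales the estimate linearly.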
15 Compute intensive, you mean?? The dynamics is dramatically slow (see picture) So even a simulated box whose size is a small multiple of the correlation length will give accurate physics results Good news: we're in business even if we simulate a very small box... However...
16 Hard scaling vs Weak Scaling Amdahl's law (strong scaling): $S_A = \frac{(1-p) + p}{(1-p) + p/N} = \frac{1}{(1-p) + p/N}$ vs... Gustafson's law (weak scaling): $S_G = \frac{(1-p) + Np}{(1-p) + p} = (1-p) + Np$ In our case enlarging system-size is meaningless, as we do not yet have the resources to study a small system ----> the ultimate quest for strong scaling...
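The difference between the two laws is easy to see numerically. The sketch below takes p as the parallel fraction of the work and N as the number of processors, which is the standard reading of both formulas. The value p = 0.999 is an arbitrary illustrative choice, not a measured property of the spin-glass code.

```python
def amdahl_speedup(p, n):
    """Strong scaling: fixed problem size spread over n processors."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(p, n):
    """Weak scaling: problem size grows in proportion to n."""
    return (1.0 - p) + n * p

for n in (16, 1024, 16384):
    print(n, round(amdahl_speedup(0.999, n), 1), round(gustafson_speedup(0.999, n), 1))
```

Under Amdahl's law the speedup saturates at 1/(1-p) however many processors are added, so a fixed-size problem only benefits from massive parallelism if the serial fraction is made very small.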
17 The JANUS project An attempt at developing, building and operating an application-driven compute engine for Monte Carlo simulations of spin glass systems A collaboration of: Universities of Rome (La Sapienza) and Ferrara; Universities of Madrid, Zaragoza, Badajoz; BIFI (Zaragoza); Eurotech Partially supported by Microsoft, Xilinx
18 The nature of the available parallelism Spin glass simulations have two levels of available parallelism 1) Embarrassingly trivial: need statistics on several replicas ---> farm it out to independent processors 2) Trivially identified: sweep order for Monte Carlo update is not specified ---> can update in parallel any set of non-mutually interacting spins; make it a black-white checkerboard: it opens the way to tens of thousands of independent threads... 1) & 2) do not commute
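The checkerboard argument in point 2) can be verified directly. The sketch below assumes a cubic lattice with periodic boundaries and nearest-neighbour interactions only. It colours each site by the parity of x + y + z and checks that no site has a neighbour of its own colour, which is the condition for updating a whole colour in parallel. The function names are illustrative.

```python
def colour(x, y, z):
    """0 = 'black', 1 = 'white' on a 3-D checkerboard."""
    return (x + y + z) % 2

def nearest_neighbours(x, y, z, L):
    """The six nearest neighbours of (x, y, z) with periodic wrap-around."""
    for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                       (0, -1, 0), (0, 0, 1), (0, 0, -1)):
        yield (x + dx) % L, (y + dy) % L, (z + dz) % L

def colours_are_independent(L):
    """True if no site has a nearest neighbour of its own colour."""
    return all(colour(*nb) != colour(x, y, z)
               for x in range(L) for y in range(L) for z in range(L)
               for nb in nearest_neighbours(x, y, z, L))

print(colours_are_independent(8))   # even L: True
print(colours_are_independent(7))   # odd L with periodic wrap: False
```

For even L each colour holds L^3 / 2 mutually non-interacting spins, so all of them can in principle be updated in the same step. With odd L and periodic boundaries the wrap-around links join sites of the same colour, so the simple two-colour scheme no longer applies.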
19 The ideal spin glass machine... A further question: what is the appropriate system scale at which this parallelism is best exploited? One update engine: computes the local contribution to $U$, $U = -\sum_{\mathrm{NB}\,\langle ij \rangle} \sigma_i J_{ij} \sigma_j$; addresses a probability table; compares with a freshly generated random number; assigns the new spin value
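The four steps of the update engine map naturally onto a table lookup. The sketch below is a software illustration under stated assumptions, not a description of the actual JANUS firmware. It uses the heat-bath rule (one of several Monte Carlo variants that fit this structure), six nearest neighbours, couplings and spins equal to +1 or -1, and hypothetical function names. The local field h can then only take the seven values -6, -4, ..., +6, so the probability table has seven entries.

```python
import math
import random

def build_heat_bath_table(beta):
    """P(new spin = +1) for each possible local field h in {-6, -4, ..., +6}."""
    return [1.0 / (1.0 + math.exp(-2.0 * beta * h)) for h in range(-6, 7, 2)]

def update_engine(neighbour_spins, neighbour_couplings, table, rng):
    """One spin update: local field -> table lookup -> compare -> new spin."""
    h = sum(j * s for j, s in zip(neighbour_couplings, neighbour_spins))
    prob_up = table[(h + 6) // 2]            # address the probability table
    return +1 if rng.random() < prob_up else -1

table = build_heat_bath_table(beta=1.0)
rng = random.Random(0)
new_spin = update_engine([+1, -1, +1, +1, -1, +1],
                         [+1, +1, -1, +1, -1, +1], table, rng)
print(new_spin)
```

The table depends only on the temperature, so it is computed once and shared by all engines. In a hardware version the floating-point comparison would typically be replaced by an integer comparison against precomputed thresholds.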
20 The ideal spin glass machine... All this is just a bunch (~1000) of gates And in spite of that a typical CPU core, with $O(10^{7}+)$ gates, can process perhaps 4 spins at each clock cycle If you can arrange your stock of gates the way it best suits the algorithm, you can easily expect ~1000 update engines on one chip ----> The best structure is a massively-many-core organization (or perhaps an application-driven GPU??)
21 The ideal spin glass machine... is an orderly structure (a 2D grid) of a large number of update engines; each update engine handles a subset of the physical mesh; its architectural structure is extremely simple; each data path processes one bit at a time; memory addressing is regular and predictable; SIMD processing is OK; however memory bandwidth requirements are huge (need 7 bits to process one bit..); however memory can be local to the processor Simple hardware structure ---> FPGAs are OK!
22 The JANUS machine A parallel system of (themselves) massively parallel processor chips The basic hardware element: A 2-D grid of 4 x 4 (FPGA based) processors (SP's) Data links among nearest neighbours on the grid One control processor on each board (IOP) with 2 Gbit Ethernet links to host st
23 JANUS: a picture gallery
24 Our large machine 256 (16 x 16) processors 8 host PCs --> ~ 90 TIPS for spin-glass simulation A typical simulation wall-clock time on this nice little machine goes down to a more manageable ~ 100 days.
25 JANUS as a spin-glass engine The 2008 implementation (XILINX Virtex4-LX200): 1024 update cores on each processor, pipelineable to one spin update per clock cycle ---> 88% of available logic resources; system clock at 62.5 MHz ---> 16 ps average spin update time; using a bandwidth of ~ read bits written bits per clock cycle ---> 47% of available on-chip memory
26 (Measured) Performances Let's use conventional units, first???? The data path of each Processing Element (PE) performs sustained pipelined ops per clock cycle (62.5 MHz) We have 1024 PEs ----> ~ 830 GIPS However 11 ops are on very short data words; more honestly: sustained conventional pipelined ops per clock cycle: We have 1024 PEs ----> ~ 300 GIPS ---> 10 GIPS/W Sustained by ~ 1 Tbyte/sec combined memory bandwidth
27 (Measured) Performances Physicists like a different figure of merit ----> the spin-flip rate R, typically measured in psecs per flip For each processor in the system: $R = \frac{1}{N f} = \frac{1}{1024 \times 62.5\ \mathrm{MHz}} \approx 16$ ps / flip For one complete element of the IANUS core (16 procs): $R = \frac{1}{N_p N f} = \frac{1}{16 \times 1024 \times 62.5\ \mathrm{MHz}} \approx 1$ ps / flip as fast as Nature...
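These two figures follow from the numbers quoted earlier in the talk: 1024 update cores per processor, one spin update per core per clock cycle, a 62.5 MHz clock, and 16 processors per board. A minimal check, with a hypothetical function name:

```python
def spin_update_time_ps(cores_per_chip, clock_hz, chips=1):
    """Average time per spin update in picoseconds, assuming one update
    per core per clock cycle."""
    return 1e12 / (chips * cores_per_chip * clock_hz)

print(spin_update_time_ps(1024, 62.5e6))             # one processor
print(spin_update_time_ps(1024, 62.5e6, chips=16))   # one 16-processor board
```

The results are close to 16 ps and 1 ps per flip respectively, which is where the comparison with the roughly 1 ps spin-flip time of a real system comes from.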
28 Physics results
29 Performance figures ( ) Spin-glass addicts like to quote the average spin-update time:
                        SUT        GUT
Janus module            16 ps      1 ps
PC (Intel Core Duo)     3000 ps    700 ps
IBM CBE (all cores)     -          65 ps
300x 700x!!
30 Performance figures ( ) In the last couple of years, multi/many core processors and GPUs have entered the arena... Still 10x 20x!!
31 What next?? 4+ years old Janus still has an edge on state-of-the-art commercial HPC computing architectures Reasonable to continue on the same line, surfing on technology developments Expected performance increase???? FPGA size x Clock frequency 4.0 x SUT parallel 16 x Grand total x Log(Grand Total) ~ 7.5
32 What next?? Janus 2 Exactly the same architecture as JANUS but... Xilinx Virtex-7 FPGAs (Virtex7-485); 2 DDR-3 memory banks on each SP; improved local 4x4 interconnection; tighter coupling with the HOST (on-box CPU + PCIe gen2) Prototypes in fall 2012 Physics in early ---> Simulate an Ising spin glass for $2^{42}$ time steps
33 Looking at the crystal ball... How long is the (predicted) opportunity window for Janus2?? A graphical answer (and some speculations on Moore's law)
34 Take-away lessons JANUS is an extremely rewarding example of (strongly application driven) on-chip multiprocessing: We designed a machine around an unconventional problem No wonder the machine turned out to be unconventional enough Results were rewarding... WHY????
35 Take-away lessons Results were rewarding...why?? there is a lot of parallelism available that is actually exploited; load is automatically balanced among the update engines; memory access is heavy, but patterns are predictable; processors (and their memories) are arranged on a regular grid; inter-node traffic is regular and not huge. IN SHORT Our machine tried to exploit all these features as well as possible
More informationEfficient random number generation on FPGA-s
Proceedings of the 9 th International Conference on Applied Informatics Eger, Hungary, January 29 February 1, 2014. Vol. 1. pp. 313 320 doi: 10.14794/ICAI.9.2014.1.313 Efficient random number generation
More informationSession 8C-5: Inductive Issues in Power Grids and Packages. Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs
ASP-DAC 2006 Session 8C-5: Inductive Issues in Power Grids and Packages Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs Authors: Brock J. LaMeres Agilent Technologies Kanupriya
More informationarxiv: v1 [hep-lat] 7 Oct 2010
arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA
More informationBranislav K. Nikolić
Interdisciplinary Topics in Complex Systems: Cellular Automata, Self-Organized Criticality, Neural Networks and Spin Glasses Branislav K. Nikolić Department of Physics and Astronomy, University of Delaware,
More informationPerformance evaluation of scalable optoelectronics application on large-scale Knights Landing cluster
Performance evaluation of scalable optoelectronics application on large-scale Knights Landing cluster Yuta Hirokawa Graduate School of Systems and Information Engineering, University of Tsukuba hirokawa@hpcs.cs.tsukuba.ac.jp
More informationListening for thunder beyond the clouds
Listening for thunder beyond the clouds Using the grid to analyse gravitational wave data Ra Inta The Australian National University Overview 1. Gravitational wave (GW) observatories 2. Analysis of continuous
More information2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51
2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 Star Joins A common structure for data mining of commercial data is the star join. For example, a chain store like Walmart keeps a fact table whose tuples each
More informationquantum mechanics is a hugely successful theory... QSIT08.V01 Page 1
1.0 Introduction to Quantum Systems for Information Technology 1.1 Motivation What is quantum mechanics good for? traditional historical perspective: beginning of 20th century: classical physics fails
More informationReview: Directed Models (Bayes Nets)
X Review: Directed Models (Bayes Nets) Lecture 3: Undirected Graphical Models Sam Roweis January 2, 24 Semantics: x y z if z d-separates x and y d-separation: z d-separates x from y if along every undirected
More informationLarge-scale Electronic Structure Simulations with MVAPICH2 on Intel Knights Landing Manycore Processors
Large-scale Electronic Structure Simulations with MVAPICH2 on Intel Knights Landing Manycore Processors Hoon Ryu, Ph.D. (E: elec1020@kisti.re.kr) Principal Researcher / Korea Institute of Science and Technology
More information1.0 Introduction to Quantum Systems for Information Technology 1.1 Motivation
QSIT09.V01 Page 1 1.0 Introduction to Quantum Systems for Information Technology 1.1 Motivation What is quantum mechanics good for? traditional historical perspective: beginning of 20th century: classical
More informationStochastic chemical kinetics on an FPGA: Bruce R Land. Introduction
Stochastic chemical kinetics on an FPGA: Bruce R Land Introduction As you read this, there are thousands of chemical reactions going on in your body. Some are very fast, for instance, the binding of neurotransmitters
More informationThe Last Survivor: a Spin Glass Phase in an External Magnetic Field.
The Last Survivor: a Spin Glass Phase in an External Magnetic Field. J. J. Ruiz-Lorenzo Dep. Física, Universidad de Extremadura Instituto de Biocomputación y Física de los Sistemas Complejos (UZ) http://www.eweb.unex.es/eweb/fisteor/juan
More information1 Brief Introduction to Quantum Mechanics
CMSC 33001: Novel Computing Architectures and Technologies Lecturer: Yongshan Ding Scribe: Jean Salac Lecture 02: From bits to qubits October 4, 2018 1 Brief Introduction to Quantum Mechanics 1.1 Quantum
More informationChapter 7. Sequential Circuits Registers, Counters, RAM
Chapter 7. Sequential Circuits Registers, Counters, RAM Register - a group of binary storage elements suitable for holding binary info A group of FFs constitutes a register Commonly used as temporary storage
More informationLecture 27: Hardware Acceleration. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 27: Hardware Acceleration James C. Hoe Department of ECE Carnegie Mellon niversity 18 447 S18 L27 S1, James C. Hoe, CM/ECE/CALCM, 2018 18 447 S18 L27 S2, James C. Hoe, CM/ECE/CALCM, 2018
More informationOHW2013 workshop. An open source PCIe device virtualization framework
OHW2013 workshop An open source PCIe device virtualization framework Plan Context and objectives Design and implementation Future directions Questions Context - ESRF and the ISDD electronic laboratory
More information