Shortest Lattice Vector Enumeration on Graphics Cards


1 Shortest Lattice Vector Enumeration on Graphics Cards
Jens Hermans (1), Michael Schneider (2), Fréderik Vercauteren (1), Johannes Buchmann (2), Bart Preneel (1)
(1) K.U.Leuven, (2) TU Darmstadt
SHARCS, 10 September 2009

2

3 Why GPU? (Source: MSI)

4 CUDA framework
Warning: sales talk
Your own personal supercomputer for < €500.
Nvidia CUDA Framework:
Run general programs on GPU
More complex operations, data types, branching...
Recent GPU required
Theory: 1 TFlop (practice: 200 GFlop)

5 Crypto on GPU
Current applications:
Ciphers: RSA [1], ECC [2], AES [3]
Cryptanalysis: Factoring [4], Brute force
Focus: high throughput, not latency
[1] Moss, Page, Smart / Szerwinski, Guneysu / Fleissner
[2] Szerwinski, Guneysu
[3] Manavski / Harrison, Waldron
[4] Bernstein, Chen, Cheng, Lange, Yang

6

7 Lattices
(Figure: lattice with basis vectors $b_1$, $b_2$)
Basis matrix $B = \{b_1, \ldots, b_n\}$ with $b_i \in \mathbb{R}^d$
Lattice: $L(B) = \{\sum_{i=1}^{n} x_i b_i,\ x_i \in \mathbb{Z}\}$
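The lattice definition can be made concrete with a few lines of Python. This is a minimal sketch: the function name `lattice_points`, the example basis and the coefficient range are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def lattice_points(basis, coeff_range):
    """Yield (x, v) with v = sum_i x_i * b_i for all integer x_i in coeff_range."""
    n = len(basis)       # number of basis vectors
    d = len(basis[0])    # dimension of the ambient space R^d
    for x in product(coeff_range, repeat=n):
        v = tuple(sum(x[i] * basis[i][k] for i in range(n)) for k in range(d))
        yield x, v

# Hypothetical example basis b_1, b_2 in R^2 (not from the talk).
basis = [(2, 0), (1, 3)]
points = dict(lattice_points(basis, range(-1, 2)))
```

Here `points[(1, 1)]` is the lattice vector $b_1 + b_2$. The full lattice is infinite; the sketch only lists the points whose coefficients lie in the chosen range.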

8 Shortest Vector Problem (SVP)
(Figure: the same lattice drawn with two different bases $b_1$, $b_2$)
Basis not unique
Idea: good basis $B$ and bad basis $B'$
Finding $\lambda_1(L)$ is hard with the bad basis $B'$

9 Algorithms for SVP
Shortest vector problem: compute $\min_{x \in \mathbb{Z}^n \setminus \{0\}} \|Bx\|_2$
SVP algorithms:
LLL (+variants): approximate solution, polynomial
BKZ...
Enum: exact solution, exponential
⇒ This talk: focus on enum.

10 Enumeration
(Figure: enumeration tree with levels $x_n = \ldots$, $x_{n-1} = \ldots$, ..., $x_2 = \ldots$, $x_1 = \ldots$)
Optimum $A = \|Bx\|_2^2$ and $x = [1, 0, \ldots, 0]$

11 Enumeration
Intermediate norms $l_i$ s.t. $l_i \geq l_{i+1}$ (with $l_1 = \|Bx\|_2^2$)

12 Enumeration
New optimum $A = \|Bx\|_2^2$

13 Enumeration
Cut off branch if $l_i > A$.
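The tree search described on these slides can be sketched in Python as follows. This is a plain recursive depth-first version under stated assumptions: the names `gram_schmidt` and `enum_svp` are hypothetical, basis vectors are given as rows, and the candidates for each coordinate are visited in increasing order, whereas a production enumeration would normally use an iterative zig-zag order. The partial norms $l_k$ are computed from the Gram-Schmidt coefficients, the initial bound is $A = \|b_1\|^2$ (that is, $x = [1, 0, \ldots, 0]$), and a branch is cut off as soon as $l_k > A$.

```python
import math

def gram_schmidt(B):
    """Return (mu, r): Gram-Schmidt coefficients mu[i][j] and squared norms r[i] = |b_i*|^2."""
    n = len(B)
    Bs, r = [], []
    mu = [[0.0] * n for _ in range(n)]
    for i in range(n):
        v = [float(c) for c in B[i]]
        for j in range(i):
            mu[i][j] = sum(a * b for a, b in zip(B[i], Bs[j])) / r[j]
            v = [a - mu[i][j] * b for a, b in zip(v, Bs[j])]
        Bs.append(v)
        r.append(sum(a * a for a in v))
    return mu, r

def enum_svp(B):
    """Depth-first enumeration from x_n down to x_1 with cut-off at l_k > A."""
    n = len(B)
    mu, r = gram_schmidt(B)
    # Initial optimum: x = [1, 0, ..., 0], A = |b_1|^2.
    best = {"A": float(sum(c * c for c in B[0])), "x": [1] + [0] * (n - 1)}
    x = [0] * n

    def descend(k, l_above):
        # k is the 0-based coordinate to fix next; l_above is the partial norm l_{k+1}.
        if k < 0:
            if any(x) and l_above < best["A"] - 1e-9:   # new optimum (nonzero x only)
                best["A"], best["x"] = l_above, x[:]
            return
        c = -sum(mu[i][k] * x[i] for i in range(k + 1, n))   # centre of the interval
        radius = math.sqrt(max(best["A"] - l_above, 0.0) / r[k])
        for xk in range(math.ceil(c - radius), math.floor(c + radius) + 1):
            x[k] = xk
            l_k = l_above + (xk - c) ** 2 * r[k]
            if l_k <= best["A"]:          # otherwise: cut off this branch
                descend(k - 1, l_k)
        x[k] = 0

    descend(n - 1, 0.0)
    return best["x"], best["A"]

# Hypothetical example: a skewed basis of Z^2 (determinant 1).
B = [(5, 8), (8, 13)]
x, A = enum_svp(B)
```

Because this example basis generates $\mathbb{Z}^2$, the shortest nonzero vector has squared length 1 even though $\|b_1\|^2 = 89$, so the search has to improve the bound $A$ several times before it terminates.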


17 Programming model Memory

18 Processor
Nvidia GTX280:
240 cores, scalar processors
30 multiprocessors (8 cores each)
1.3 GHz
1 GB Global Memory
32 & 64-bit integers, FP

19 Programming model (Figure; source: CUDA programming guide)

20 Memory types (Figure; source: CUDA programming guide)

21

22 Algorithm Flow
(Figure: enumeration tree with levels $x_n, \ldots, x_\alpha, \ldots, x_1$, split at level $\alpha$)

23 Basic idea
Input: $B$, $A$, $\alpha$, $n$
Compute the Gram-Schmidt decomposition of $b_i$
CPU: generate $x_i = [0, \ldots, 0, x_\alpha, \ldots, x_n]$
GPU thread: run a sub-enum on $x_i$; if new optimum, store in $x$
Output: $(x_1, \ldots, x_n)$ with $\|\sum_{i=1}^{n} x_i b_i\| = \lambda_1(L)$
⇒ horrible performance
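The CPU step of the basic idea, generating the start vectors $[0, \ldots, 0, x_\alpha, \ldots, x_n]$, can be sketched as a small "top enumeration". This is an illustration under assumptions: the name `top_enum` is hypothetical, `mu` and `r` stand for precomputed Gram-Schmidt coefficients and squared Gram-Schmidt norms, and $\alpha$ is 1-based as on the slides.

```python
import math

def top_enum(mu, r, A, alpha):
    """Yield start vectors [0, ..., 0, x_alpha, ..., x_n] whose partial norm l_alpha is <= A."""
    n = len(r)
    x = [0] * n

    def descend(k, l_above):
        if k < alpha - 1:                 # all of x_alpha..x_n are fixed: emit a start vector
            yield x[:]
            return
        c = -sum(mu[i][k] * x[i] for i in range(k + 1, n))
        radius = math.sqrt((A - l_above) / r[k])
        for xk in range(math.ceil(c - radius), math.floor(c + radius) + 1):
            l_k = l_above + (xk - c) ** 2 * r[k]
            if l_k <= A:                  # keep only prefixes that can still beat A
                x[k] = xk
                yield from descend(k - 1, l_k)
        x[k] = 0

    yield from descend(n - 1, 0.0)

# Hypothetical input: orthonormal basis in dimension 3, bound A = 1, split at alpha = 2.
mu = [[0.0] * 3 for _ in range(3)]
r = [1.0, 1.0, 1.0]
starts = list(top_enum(mu, r, A=1.0, alpha=2))
```

Each start vector would then be handed to one GPU thread, which enumerates the remaining coordinates $x_1, \ldots, x_{\alpha-1}$. The all-zero prefix is a valid start vector as well, since the lower coordinates may still be nonzero.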

27 Early termination...
Input: $B$, $A$, $\alpha$, $n$
Compute the Gram-Schmidt decomposition of $b_i$
CPU: generate $x_i = [0, \ldots, 0, x_\alpha, \ldots, x_n]$
GPU thread: while there are $x_i$ left do
  Start enum for a certain $x_i = [0, \ldots, 0, x_\alpha, \ldots, x_n]$
  Stop enum after $S$ steps, store the state $\{l_i, x_i, s_i = S\}$
end
CPU: Get enum state $x_i = [x_1, \ldots, x_{\alpha-1}, x_\alpha, \ldots, x_n]$
CPU: Continue enum if $S$ was reached
Output: $(x_1, \ldots, x_n)$ with $\|\sum_{i=1}^{n} x_i b_i\| = \lambda_1(L)$
⇒ solves length difference problem, still not so good

31 Iterating
Input: $B$, $A$, $\alpha$, $n$
Compute the Gram-Schmidt decomposition of $b_i$
while true do
  CPU: generate some $x_i = [0, \ldots, 0, x_\alpha, \ldots, x_n]$
  GPU thread: while there are $x_i$ left do
    Start enum for a certain $x_i$ or continue enum for $x_i$
    Stop enum after $S$ steps, store the state $\{l_i, x_i, s_i = S\}$
  end
  CPU: Get enum state $x_i = [x_1, \ldots, x_{\alpha-1}, x_\alpha, \ldots, x_n]$
end
Output: $(x_1, \ldots, x_n)$ with $\|\sum_{i=1}^{n} x_i b_i\| = \lambda_1(L)$
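The interplay of "stop after $S$ steps" and "continue in the next iteration" can be simulated on the CPU. The following is a sketch under stated assumptions, not the GPU implementation: Python generators stand in for the stored state $\{l_i, x_i, s_i\}$, the "threads" are run one after another, the shared dictionary `best` models the optimum shared among threads, and the names `sub_enum` and `iterate` as well as the example numbers are hypothetical.

```python
import math
from itertools import islice

def sub_enum(mu, r, start, alpha, best):
    """Resumable sub-enumeration below one start vector; yields once per visited node."""
    n = len(r)
    x = list(start)

    def center(k):
        return -sum(mu[i][k] * x[i] for i in range(k + 1, n))

    # Partial norm l_alpha contributed by the fixed coordinates x_alpha..x_n.
    l_fixed = 0.0
    for k in range(n - 1, alpha - 2, -1):
        l_fixed += (x[k] - center(k)) ** 2 * r[k]

    def descend(k, l_above):
        yield                               # one enumeration step; execution may pause here
        if l_above > best["A"]:             # bound may have shrunk while paused: cut off
            return
        if k < 0:
            if any(x) and l_above < best["A"] - 1e-9:
                best["A"], best["x"] = l_above, x[:]   # share the new optimum
            return
        c = center(k)
        radius = math.sqrt((best["A"] - l_above) / r[k])
        for xk in range(math.ceil(c - radius), math.floor(c + radius) + 1):
            x[k] = xk
            yield from descend(k - 1, l_above + (xk - c) ** 2 * r[k])
        x[k] = 0

    yield from descend(alpha - 2, l_fixed)

def iterate(mu, r, starts, alpha, best, S):
    """Run every sub-enumeration for at most S steps per round until all have finished."""
    tasks = [sub_enum(mu, r, s, alpha, best) for s in starts]
    rounds = 0
    while tasks:
        rounds += 1
        unfinished = []
        for t in tasks:
            steps = sum(1 for _ in islice(t, S))   # advance by at most S steps
            if steps == S:                          # S was reached: continue next round
                unfinished.append(t)
        tasks = unfinished
    return rounds

# Hypothetical example: Gram-Schmidt data of the basis (5, 8), (8, 13) of Z^2.
mu = [[0.0, 0.0], [144 / 89, 0.0]]
r = [89.0, 1 / 89]
best = {"A": 89.0, "x": [1, 0]}
starts = [[0, x2] for x2 in range(-89, 90)]
rounds = iterate(mu, r, starts, alpha=2, best=best, S=2)
```

Short sub-enumerations finish within their first round, while long ones are carried over to the next round, which is the effect the step limit $S$ is meant to achieve.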

33 GPU Enumeration
(Figure: enumeration tree with top levels $x_n, \ldots, x_\alpha$ and sub-trees down to $x_1$)

38 Implementation details
Some facts & figures:
Dimension 50, starting vectors upload & download 20 MB of data to GPU
CPU top enum: very fast (low dimension)
GPU runs for > 10 seconds per iteration, iteration overhead is limited
Share new optimal values among GPU threads

39

40 Throughput
Throughput:
CPU: around steps/s
GPU: up to steps/s
Throughput on GPU depends on:
Lattice dimension n
Length of sub-enumerations
Number of parallel threads, uploaded points...

41

42 Table: Average time needed for enumeration of lattices in each dimension n.
n
fplll          18.3s   139s   277s   2483s   6960s
CUDA           20.2s    92s   133s    959s   2599s
CUDA / fplll   110%     66%    48%     39%     37%
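The percentage row of the table is the CUDA time divided by the fplll time. A short check in Python, where the variable names are illustrative and the columns are addressed by position because the dimension values are not preserved in the transcript:

```python
# Timings in seconds, copied from the table (one entry per column).
fplll_seconds = [18.3, 139, 277, 2483, 6960]
cuda_seconds = [20.2, 92, 133, 959, 2599]

# CUDA time as a percentage of the fplll time, rounded as in the table.
ratios_percent = [round(100 * c / f) for c, f in zip(cuda_seconds, fplll_seconds)]

# Equivalent speedup factors (fplll time divided by CUDA time).
speedups = [round(f / c, 2) for c, f in zip(cuda_seconds, fplll_seconds)]
```

The first column is the only one where the GPU version is slower (ratio above 100%); in the last column the GPU version needs a bit more than a third of the CPU time.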

43 Ideas for the Future
Future:
Generalize ideas (not specific to GPUs... clusters?)
Use full power of CPU (now: idle during GPU time)
Gaussian heuristic

44 The end... Questions?

45 Algorithm
Algorithm 1: High-level GPU ENUM Algorithm
Input: $b_i$, $A$, $\alpha$, $n$
Compute the Gram-Schmidt decomposition of $b_i$
while true do
  $S = \{(\bar{x}_i, \Delta x_i, \Delta^2 x_i, l_i = \alpha, s_i = 0)\}_i \leftarrow$ Top enum: generate at most numstartpoints $-$ #T vectors
  $R = \{(\bar{x}_i, \Delta x_i, \Delta^2 x_i, l_i, s_i)\}_i \leftarrow$ GPU enumeration, starting from $S \cup T$
  $T \leftarrow \{R_i : s_i \geq S\}$
  if #T < cputhreshold then
    Enumerate the starting points in T on the CPU. Stop
  end
end
Output: $(x_1, \ldots, x_n)$ with $\|\sum_{i=1}^{n} x_i b_i\| = \lambda_1(L)$

Random Sampling for Short Lattice Vectors on Graphics Cards

Random Sampling for Short Lattice Vectors on Graphics Cards Random Sampling for Short Lattice Vectors on Graphics Cards Michael Schneider, Norman Göttert TU Darmstadt, Germany mischnei@cdc.informatik.tu-darmstadt.de CHES 2011, Nara September 2011 Michael Schneider

More information

CRYPTOGRAPHIC COMPUTING

CRYPTOGRAPHIC COMPUTING CRYPTOGRAPHIC COMPUTING ON GPU Chen Mou Cheng Dept. Electrical Engineering g National Taiwan University January 16, 2009 COLLABORATORS Daniel Bernstein, UIC, USA Tien Ren Chen, Army Tanja Lange, TU Eindhoven,

More information

Gauss Sieve on GPUs. Shang-Yi Yang 1, Po-Chun Kuo 1, Bo-Yin Yang 2, and Chen-Mou Cheng 1

Gauss Sieve on GPUs. Shang-Yi Yang 1, Po-Chun Kuo 1, Bo-Yin Yang 2, and Chen-Mou Cheng 1 Gauss Sieve on GPUs Shang-Yi Yang 1, Po-Chun Kuo 1, Bo-Yin Yang 2, and Chen-Mou Cheng 1 1 Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan {ilway25,kbj,doug}@crypto.tw 2

More information

Sieving for Shortest Vectors in Ideal Lattices:

Sieving for Shortest Vectors in Ideal Lattices: Sieving for Shortest Vectors in Ideal Lattices: a Practical Perspective Joppe W. Bos Microsoft Research LACAL@RISC Seminar on Cryptologic Algorithms CWI, Amsterdam, Netherlands Joint work with Michael

More information

Creating a Challenge for Ideal Lattices

Creating a Challenge for Ideal Lattices Creating a Challenge for Ideal Lattices Thomas Plantard 1 and Michael Schneider 2 1 University of Wollongong, Australia thomaspl@uow.edu.au 2 Technische Universität Darmstadt, Germany mischnei@cdc.informatik.tu-darmstadt.de

More information

DIMACS Workshop on Parallelism: A 2020 Vision Lattice Basis Reduction and Multi-Core

DIMACS Workshop on Parallelism: A 2020 Vision Lattice Basis Reduction and Multi-Core DIMACS Workshop on Parallelism: A 2020 Vision Lattice Basis Reduction and Multi-Core Werner Backes and Susanne Wetzel Stevens Institute of Technology 29th March 2011 Work supported through NSF Grant DUE

More information

GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic

GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic GPU acceleration of Newton s method for large systems of polynomial equations in double double and quad double arithmetic Jan Verschelde joint work with Xiangcheng Yu University of Illinois at Chicago

More information

Sieving for Shortest Vectors in Ideal Lattices: a Practical Perspective

Sieving for Shortest Vectors in Ideal Lattices: a Practical Perspective Sieving for Shortest Vectors in Ideal Lattices: a Practical Perspective Joppe W. Bos 1, Michael Naehrig 2, and Joop van de Pol 3 1 NXP Semiconductors, Leuven, Belgium joppe.bos@nxp.com 2 Microsoft Research,

More information

ECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA

ECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA ECM at Work Joppe W. Bos 1 and Thorsten Kleinjung 2 1 Microsoft Research, Redmond, USA 2 Laboratory for Cryptologic Algorithms, EPFL, Lausanne, Switzerland 1 / 18 Security assessment of public-key cryptography

More information

A CUDA Solver for Helmholtz Equation

A CUDA Solver for Helmholtz Equation Journal of Computational Information Systems 11: 24 (2015) 7805 7812 Available at http://www.jofcis.com A CUDA Solver for Helmholtz Equation Mingming REN 1,2,, Xiaoguang LIU 1,2, Gang WANG 1,2 1 College

More information

CSE 206A: Lattice Algorithms and Applications Spring Basis Reduction. Instructor: Daniele Micciancio

CSE 206A: Lattice Algorithms and Applications Spring Basis Reduction. Instructor: Daniele Micciancio CSE 206A: Lattice Algorithms and Applications Spring 2014 Basis Reduction Instructor: Daniele Micciancio UCSD CSE No efficient algorithm is known to find the shortest vector in a lattice (in arbitrary

More information

Introduction to numerical computations on the GPU

Introduction to numerical computations on the GPU Introduction to numerical computations on the GPU Lucian Covaci http://lucian.covaci.org/cuda.pdf Tuesday 1 November 11 1 2 Outline: NVIDIA Tesla and Geforce video cards: architecture CUDA - C: programming

More information

Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS

Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS Jorge González-Domínguez*, Bertil Schmidt*, Jan C. Kässens**, Lars Wienbrandt** *Parallel and Distributed Architectures

More information

Algorithmic Geometry of Numbers: LLL and BKZ

Algorithmic Geometry of Numbers: LLL and BKZ Algorithmic Geometry of Numbers: LLL and BKZ Léo Ducas CWI, Amsterdam, The Netherlands HEAT Summer-School on FHE and MLM Léo Ducas (CWI, Amsterdam) LLL and BKZ HEAT, October 2015 1 / 28 A gift from Johannes

More information

Measuring freeze-out parameters on the Bielefeld GPU cluster

Measuring freeze-out parameters on the Bielefeld GPU cluster Measuring freeze-out parameters on the Bielefeld GPU cluster Outline Fluctuations and the QCD phase diagram Fluctuations from Lattice QCD The Bielefeld hybrid GPU cluster Freeze-out conditions from QCD

More information

A Fast Phase-Based Enumeration Algorithm for SVP Challenge through y-sparse Representations of Short Lattice Vectors

A Fast Phase-Based Enumeration Algorithm for SVP Challenge through y-sparse Representations of Short Lattice Vectors A Fast Phase-Based Enumeration Algorithm for SVP Challenge through y-sparse Representations of Short Lattice Vectors Dan Ding 1, Guizhen Zhu 2, Yang Yu 1, Zhongxiang Zheng 1 1 Department of Computer Science

More information

Estimates for factoring 1024-bit integers. Thorsten Kleinjung, University of Bonn

Estimates for factoring 1024-bit integers. Thorsten Kleinjung, University of Bonn Estimates for factoring 1024-bit integers Thorsten Kleinjung, University of Bonn Contents GNFS Overview Polynomial selection, matrix construction, square root computation Sieving and cofactoring Strategies

More information

Practical Free-Start Collision Attacks on 76-step SHA-1

Practical Free-Start Collision Attacks on 76-step SHA-1 Practical Free-Start Collision Attacks on 76-step SHA-1 Inria and École polytechnique, France Nanyang Technological University, Singapore Joint work with Thomas Peyrin and Marc Stevens CWI, Amsterdam 2015

More information

Welcome to MCS 572. content and organization expectations of the course. definition and classification

Welcome to MCS 572. content and organization expectations of the course. definition and classification Welcome to MCS 572 1 About the Course content and organization expectations of the course 2 Supercomputing definition and classification 3 Measuring Performance speedup and efficiency Amdahl s Law Gustafson

More information

Practical Free-Start Collision Attacks on full SHA-1

Practical Free-Start Collision Attacks on full SHA-1 Practical Free-Start Collision Attacks on full SHA-1 Inria and École polytechnique, France Nanyang Technological University, Singapore Joint work with Thomas Peyrin and Marc Stevens Séminaire Cryptologie

More information

上海超级计算中心 Shanghai Supercomputer Center. Lei Xu Shanghai Supercomputer Center San Jose

上海超级计算中心 Shanghai Supercomputer Center. Lei Xu Shanghai Supercomputer Center San Jose 上海超级计算中心 Shanghai Supercomputer Center Lei Xu Shanghai Supercomputer Center 03/26/2014 @GTC, San Jose Overview Introduction Fundamentals of the FDTD method Implementation of 3D UPML-FDTD algorithm on GPU

More information

Orthogonalized Lattice Enumeration for Solving SVP

Orthogonalized Lattice Enumeration for Solving SVP Orthogonalized Lattice Enumeration for Solving SVP Zhongxiang Zheng 1, Xiaoyun Wang 2, Guangwu Xu 3, Yang Yu 1 1 Department of Computer Science and Technology,Tsinghua University, Beijing 100084, China,

More information

Estimation of the Success Probability of Random Sampling by the Gram-Charlier Approximation

Estimation of the Success Probability of Random Sampling by the Gram-Charlier Approximation Estimation of the Success Probability of Random Sampling by the Gram-Charlier Approximation Yoshitatsu Matsuda 1, Tadanori Teruya, and Kenji Kashiwabara 1 1 Department of General Systems Studies, Graduate

More information

Computational algebraic number theory tackles lattice-based cryptography

Computational algebraic number theory tackles lattice-based cryptography Computational algebraic number theory tackles lattice-based cryptography Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Moving to the left Moving to the right

More information

Computing Generator in Cyclotomic Integer Rings

Computing Generator in Cyclotomic Integer Rings A subfield algorithm for the Principal Ideal Problem in L 1 K 2 and application to the cryptanalysis of a FHE scheme Jean-François Biasse 1 Thomas Espitau 2 Pierre-Alain Fouque 3 Alexandre Gélin 2 Paul

More information

arxiv: v1 [hep-lat] 7 Oct 2010

arxiv: v1 [hep-lat] 7 Oct 2010 arxiv:.486v [hep-lat] 7 Oct 2 Nuno Cardoso CFTP, Instituto Superior Técnico E-mail: nunocardoso@cftp.ist.utl.pt Pedro Bicudo CFTP, Instituto Superior Técnico E-mail: bicudo@ist.utl.pt We discuss the CUDA

More information

Diophantine equations via weighted LLL algorithm

Diophantine equations via weighted LLL algorithm Cryptanalysis of a public key cryptosystem based on Diophantine equations via weighted LLL algorithm Momonari Kudo Graduate School of Mathematics, Kyushu University, JAPAN Kyushu University Number Theory

More information

Practical Analysis of Key Recovery Attack against Search-LWE Problem

Practical Analysis of Key Recovery Attack against Search-LWE Problem Practical Analysis of Key Recovery Attack against Search-LWE Problem The 11 th International Workshop on Security, Sep. 13 th 2016 Momonari Kudo, Junpei Yamaguchi, Yang Guo and Masaya Yasuda 1 Graduate

More information

GPU Computing Activities in KISTI

GPU Computing Activities in KISTI International Advanced Research Workshop on High Performance Computing, Grids and Clouds 2010 June 21~June 25 2010, Cetraro, Italy HPC Infrastructure and GPU Computing Activities in KISTI Hongsuk Yi hsyi@kisti.re.kr

More information

Lattice Reduction of Modular, Convolution, and NTRU Lattices

Lattice Reduction of Modular, Convolution, and NTRU Lattices Summer School on Computational Number Theory and Applications to Cryptography Laramie, Wyoming, June 19 July 7, 2006 Lattice Reduction of Modular, Convolution, and NTRU Lattices Project suggested by Joe

More information

Solving PDEs with CUDA Jonathan Cohen

Solving PDEs with CUDA Jonathan Cohen Solving PDEs with CUDA Jonathan Cohen jocohen@nvidia.com NVIDIA Research PDEs (Partial Differential Equations) Big topic Some common strategies Focus on one type of PDE in this talk Poisson Equation Linear

More information

UNCONDITIONAL CLASS GROUP TABULATION TO Anton Mosunov (University of Waterloo) Michael J. Jacobson, Jr. (University of Calgary) June 11th, 2015

UNCONDITIONAL CLASS GROUP TABULATION TO Anton Mosunov (University of Waterloo) Michael J. Jacobson, Jr. (University of Calgary) June 11th, 2015 UNCONDITIONAL CLASS GROUP TABULATION TO 2 40 Anton Mosunov (University of Waterloo) Michael J. Jacobson, Jr. (University of Calgary) June 11th, 2015 AGENDA Background Motivation Previous work Class number

More information

Parallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics)

Parallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics) Parallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics) Eftychios Sifakis CS758 Guest Lecture - 19 Sept 2012 Introduction Linear systems

More information

Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU

Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU Multiphase Flow Simulations in Inclined Tubes with Lattice Boltzmann Method on GPU Khramtsov D.P., Nekrasov D.A., Pokusaev B.G. Department of Thermodynamics, Thermal Engineering and Energy Saving Technologies,

More information

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Accelerating linear algebra computations with hybrid GPU-multicore systems. Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)

More information

COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD

COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD XVIII International Conference on Water Resources CMWR 2010 J. Carrera (Ed) c CIMNE, Barcelona, 2010 COMPARISON OF CPU AND GPU IMPLEMENTATIONS OF THE LATTICE BOLTZMANN METHOD James.E. McClure, Jan F. Prins

More information

1 Shortest Vector Problem

1 Shortest Vector Problem Lattices in Cryptography University of Michigan, Fall 25 Lecture 2 SVP, Gram-Schmidt, LLL Instructor: Chris Peikert Scribe: Hank Carter Shortest Vector Problem Last time we defined the minimum distance

More information

arxiv: v1 [hep-lat] 10 Jul 2012

arxiv: v1 [hep-lat] 10 Jul 2012 Hybrid Monte Carlo with Wilson Dirac operator on the Fermi GPU Abhijit Chakrabarty Electra Design Automation, SDF Building, SaltLake Sec-V, Kolkata - 700091. Pushan Majumdar Dept. of Theoretical Physics,

More information

Solving All Lattice Problems in Deterministic Single Exponential Time

Solving All Lattice Problems in Deterministic Single Exponential Time Solving All Lattice Problems in Deterministic Single Exponential Time (Joint work with P. Voulgaris, STOC 2010) UCSD March 22, 2011 Lattices Traditional area of mathematics Bridge between number theory

More information

Compact Ring LWE Cryptoprocessor

Compact Ring LWE Cryptoprocessor 1 Compact Ring LWE Cryptoprocessor CHES 2014 Sujoy Sinha Roy 1, Frederik Vercauteren 1, Nele Mentens 1, Donald Donglong Chen 2 and Ingrid Verbauwhede 1 1 ESAT/COSIC and iminds, KU Leuven 2 Electronic Engineering,

More information

Direct Self-Consistent Field Computations on GPU Clusters

Direct Self-Consistent Field Computations on GPU Clusters Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd

More information

The Lattice Boltzmann Method for Laminar and Turbulent Channel Flows

The Lattice Boltzmann Method for Laminar and Turbulent Channel Flows The Lattice Boltzmann Method for Laminar and Turbulent Channel Flows Vanja Zecevic, Michael Kirkpatrick and Steven Armfield Department of Aerospace Mechanical & Mechatronic Engineering The University of

More information

Fault Attacks Against Lattice-Based Signatures

Fault Attacks Against Lattice-Based Signatures Fault Attacks Against Lattice-Based Signatures T. Espitau P-A. Fouque B. Gérard M. Tibouchi Lip6, Sorbonne Universités, Paris August 12, 2016 SAC 16 1 Towards postquantum cryptography Quantum computers

More information

BKZ 2.0: Better Lattice Security Estimates

BKZ 2.0: Better Lattice Security Estimates BKZ 2.0: Better Lattice Security Estimates Yuanmi Chen and Phong Q. Nguyen 1 ENS, Dept. Informatique, 45 rue d Ulm, 75005 Paris, France. http://www.eleves.ens.fr/home/ychen/ 2 INRIA and ENS, Dept. Informatique,

More information

Solving Multivariate Polynomial Systems

Solving Multivariate Polynomial Systems Solving Multivariate Polynomial Systems Presented by: Bo-Yin Yang work with Lab of Yang and Cheng, and Charles Bouillaguet, ENS Institute of Information Science and TWISC, Academia Sinica Taipei, Taiwan

More information

Algebraic Cryptanalysis of MQQ Public Key Cryptosystem by MutantXL

Algebraic Cryptanalysis of MQQ Public Key Cryptosystem by MutantXL Algebraic Cryptanalysis of MQQ Public Key Cryptosystem by MutantXL Mohamed Saied Emam Mohamed 1, Jintai Ding 2, and Johannes Buchmann 1 1 TU Darmstadt, FB Informatik Hochschulstrasse 10, 64289 Darmstadt,

More information

Practical, Predictable Lattice Basis Reduction

Practical, Predictable Lattice Basis Reduction Practical, Predictable Lattice Basis Reduction Daniele Micciancio and Michael Walter University of California, San Diego {miwalter,daniele}@eng.ucsd.edu Abstract. Lattice reduction algorithms are notoriously

More information

CSE 206A: Lattice Algorithms and Applications Spring Basic Algorithms. Instructor: Daniele Micciancio

CSE 206A: Lattice Algorithms and Applications Spring Basic Algorithms. Instructor: Daniele Micciancio CSE 206A: Lattice Algorithms and Applications Spring 2014 Basic Algorithms Instructor: Daniele Micciancio UCSD CSE We have already seen an algorithm to compute the Gram-Schmidt orthogonalization of a lattice

More information

Real-time signal detection for pulsars and radio transients using GPUs

Real-time signal detection for pulsars and radio transients using GPUs Real-time signal detection for pulsars and radio transients using GPUs W. Armour, M. Giles, A. Karastergiou and C. Williams. University of Oxford. 15 th July 2013 1 Background of GPUs Why use GPUs? Influence

More information

Background. Another interests. Sieve method. Parallel Sieve Processing on Vector Processor and GPU. RSA Cryptography

Background. Another interests. Sieve method. Parallel Sieve Processing on Vector Processor and GPU. RSA Cryptography Background Parallel Sieve Processing on Vector Processor and GPU Yasunori Ushiro (Earth Simulator Center) Yoshinari Fukui (Earth Simulator Center) Hidehiko Hasegawa (Univ. of Tsukuba) () RSA Cryptography

More information

The quantum threat to cryptography

The quantum threat to cryptography The quantum threat to cryptography Ashley Montanaro School of Mathematics, University of Bristol 20 October 2016 Quantum computers University of Bristol IBM UCSB / Google University of Oxford Experimental

More information

Tuple Lattice Sieving

Tuple Lattice Sieving Tuple Lattice Sieving Shi Bai, Thijs Laarhoven and Damien Stehlé IBM Research Zurich and ENS de Lyon August 29th 2016 S. Bai, T. Laarhoven & D. Stehlé Tuple Lattice Sieving 29/08/2016 1/25 Main result

More information

GPU accelerated Monte Carlo simulations of lattice spin models

GPU accelerated Monte Carlo simulations of lattice spin models Available online at www.sciencedirect.com Physics Procedia 15 (2011) 92 96 GPU accelerated Monte Carlo simulations of lattice spin models M. Weigel, T. Yavors kii Institut für Physik, Johannes Gutenberg-Universität

More information

Parallel Rabin-Karp Algorithm Implementation on GPU (preliminary version)

Parallel Rabin-Karp Algorithm Implementation on GPU (preliminary version) Bulletin of Networking, Computing, Systems, and Software www.bncss.org, ISSN 2186-5140 Volume 7, Number 1, pages 28 32, January 2018 Parallel Rabin-Karp Algorithm Implementation on GPU (preliminary version)

More information

An Implementation of SPELT(31, 4, 96, 96, (32, 16, 8))

An Implementation of SPELT(31, 4, 96, 96, (32, 16, 8)) An Implementation of SPELT(31, 4, 96, 96, (32, 16, 8)) Tung Chou January 5, 2012 QUAD Stream cipher. Security relies on MQ (Multivariate Quadratics). QUAD The Provably-secure QUAD(q, n, r) Stream Cipher

More information

Daniel J. Bernstein University of Illinois at Chicago. means an algorithm that a quantum computer can run.

Daniel J. Bernstein University of Illinois at Chicago. means an algorithm that a quantum computer can run. Quantum algorithms 1 Daniel J. Bernstein University of Illinois at Chicago Quantum algorithm means an algorithm that a quantum computer can run. i.e. a sequence of instructions, where each instruction

More information

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique

Claude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique Claude Tadonki MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Monthly CRI Seminar MINES ParisTech - CRI June 06, 2016, Fontainebleau (France)

More information

arxiv: v1 [cs.dc] 4 Sep 2014

arxiv: v1 [cs.dc] 4 Sep 2014 and NVIDIA R GPUs arxiv:1409.1510v1 [cs.dc] 4 Sep 2014 O. Kaczmarek, C. Schmidt and P. Steinbrecher Fakultät für Physik, Universität Bielefeld, D-33615 Bielefeld, Germany E-mail: okacz, schmidt, p.steinbrecher@physik.uni-bielefeld.de

More information

Extended Lattice Reduction Experiments using the BKZ Algorithm

Extended Lattice Reduction Experiments using the BKZ Algorithm Extended Lattice Reduction Experiments using the BKZ Algorithm Michael Schneider Johannes Buchmann Technische Universität Darmstadt {mischnei,buchmann}@cdc.informatik.tu-darmstadt.de Abstract: We present

More information

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 1 / 23 Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 Maison de la Simulation Lille 1 University CNRS March 18, 2013

More information

The Shortest Vector Problem (Lattice Reduction Algorithms)

The Shortest Vector Problem (Lattice Reduction Algorithms) The Shortest Vector Problem (Lattice Reduction Algorithms) Approximation Algorithms by V. Vazirani, Chapter 27 - Problem statement, general discussion - Lattices: brief introduction - The Gauss algorithm

More information

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA Jacobi-Based Eigenvalue Solver on GPU Lung-Sheng Chien, NVIDIA lchien@nvidia.com Outline Symmetric eigenvalue solver Experiment Applications Conclusions Symmetric eigenvalue solver The standard form is

More information

Cosmology with Galaxy Clusters: Observations meet High-Performance-Computing

Cosmology with Galaxy Clusters: Observations meet High-Performance-Computing Cosmology with Galaxy Clusters: Observations meet High-Performance-Computing Julian Merten (ITA/ZAH) Clusters of galaxies GPU lensing codes Abell 2744 CLASH: A HST/MCT programme Clusters of galaxies DM

More information

M4. Lecture 3. THE LLL ALGORITHM AND COPPERSMITH S METHOD

M4. Lecture 3. THE LLL ALGORITHM AND COPPERSMITH S METHOD M4. Lecture 3. THE LLL ALGORITHM AND COPPERSMITH S METHOD Ha Tran, Dung H. Duong, Khuong A. Nguyen. SEAMS summer school 2015 HCM University of Science 1 / 31 1 The LLL algorithm History Applications of

More information

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters

Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters Jonathan Lifflander, G. Carl Evans, Anshu Arya, Laxmikant Kale University of Illinois Urbana-Champaign May 25, 2012 Work is overdecomposed

More information

Calculation of ground states of few-body nuclei using NVIDIA CUDA technology

Calculation of ground states of few-body nuclei using NVIDIA CUDA technology Calculation of ground states of few-body nuclei using NVIDIA CUDA technology M. A. Naumenko 1,a, V. V. Samarin 1, 1 Flerov Laboratory of Nuclear Reactions, Joint Institute for Nuclear Research, 6 Joliot-Curie

More information
