Background. Another interests. Sieve method. Parallel Sieve Processing on Vector Processor and GPU. RSA Cryptography
|
|
- Cory Knight
- 5 years ago
- Views:
Transcription
1 Background Parallel Sieve Processing on Vector Processor and GPU Yasunori Ushiro (Earth Simulator Center) Yoshinari Fukui (Earth Simulator Center) Hidehiko Hasegawa (Univ. of Tsukuba) () RSA Cryptography is the key technology for safe Internet use. () The safeness is based on the result that the factorization algorithm of a long-digit number n to P and Q has high computational complexity and consumes enormous computation time. () To guarantee, a decryption time of more than 0 years is necessary, even using the fastest computer. SIAM PP, Feb. 6, 0 Another interests Different Architecture for RSA (GPU and Vector Processor instead of PC) Non-Floating Point Number Operations (Almost 0 GFLOPS) Other Usage of GPU and Vector Processor RSA Cryptography Creation of keys () Choose prime numbers p, q, and e () Set n = p * q and f = (p-) * (q-) () Compute d = /e (mod f) Encryption Compute C = M e (mod n) Public keys (e, n) are used Decryption Compute M = C d (mod n); Euler s theorem M f (mod n) Secure keys (d, n) are used 4 Computation time of RSA-768 ( digits) CPU Year ratio(%) Sieve Processing Matrices Proc. 9 Exploration of polynomial 0 Algebraic Square root 0 Others 0 Cf. AMD64 (.GHz, Core) Sieve method Factorize N based on the relationship: A -B =(A-B)(A+B) 0 (mod N) () Gather numerous A i and B i such that A l A l A k lk B m B m B j mj (mod N) () Look for even l i and m i with factorization of 0- Matrices Dimension: 9,796,0 * 9,79,0 6
2 Steps of Sieve Processing Comp. Iteration Size Set Number Long 0^{00} Compute Base int 0^8 Choose prime number Q l and f l (x) int 0^ 4 Process for each l 0^ 4. Repeat Step 4.. (0^{}) /LP 4.. Compute PS[i] V[i]=0 Sieve Processing Double int int64 LP LP N*LP/p p (Harmonic mean of primes) LP: 0^6 for PC, 0^8 for GPU, 0^9 for ES LP LP LP LP LP p p p p N primes p LP/p Steps
3 4.. Kernel of Sieve for (k=0; k<n; k++) : # of primes in base { for (i=start[k]; i<lp; i+=prime[k]) { V[i] += LogP[k]; } LP : Size of Sieve } for (i=0; i<lp; i++) : Pick up sieved data { if(v[i] >= PS[i]) { Sive[No] = Pointer + i; No++; } } No : # of sieved data Update Start[0]~Start[N-] for next Sieve Tuning for Sieve Processing Base must be stored in fast memory PC (Cache) Shorter LP (LP is size of Sieve) 0^6 Vector Processor ES LP=0^9; 0^ times larger than that of PC Picking up of sieved data is slow ( if exists in a loop) GPU(GTX480) LP=0^8; 0^ times larger than that of PC Discontinuous memory access (stride varies) 4 Tuning for Vector Processor ES Picking up ratio is 0^{-6} ~0^{-0} Compute Maximum value in a block instead of picking up Modification: If the maximum number > value_b then perform Picking_up process Block size is 64K (6,6) This results in times faster Tuning for GPU Each thread has smaller LP (LP of PC * 000 / 0480 threads) Larger Prime[k] makes small hit ratio if(ii < LP) V[ii] += LogP[k]; N is used for Loop length instead LP Incorrect result for V[ii]+=LogP[k]; Sieve Processing can permit small error (Loss of pick up is OK up to 0^{-}) types of parallelization are used based on the size of prime numbers in the base 6 GPU program (before) no = gn*bn; bn=, gn=40 for (k=0; k<lp; k+= no) LP=0*04 { i = (bn * blockidx.x + threadidx.x) + k; V[i] = 0; } syncthreads(); for (k=0; k<n; k++) 0,480 threads in LP { for (i=start[k]; i<lp; i+=prime[k]*no) { ii = (bn*blockidx.x+threadidx.x)*prime[k]+i; if(ii < LP) V[ii] += LogP[k]; } syncthreads(); } 7 GPU program (after) for (k=0; k<n; k++) Sieve for ~0(K)-th primes { for (i=start[k]; i<lp; i+=prime[k]*gn*bn) { ii = (bn*blockidx.x + threadidx.x)*prime[k] + i; if(ii < LP) V[ii] += LogP[k]; syncthreads(); } } for (k=n; k<n; k+=gn) K+~40K-th primes { kk = blockidx.x + k; for (i=start[kk]; i<lp; i+=prime[kk]*bn) { ii = threadidx.x*prime[k] + i; if(ii < LP) V[ii] += LogP[kk]; syncthreads(); } } for (k=n; k<n; k+=gn*bn) Over (40K+)-th primes { kk =(bn*blockidx.x + threadidx.x) + k; for (i=start[kk]; i<lp; i+=prime[kk]) { if(i < LP) V[i] += LogP[kk]; syncthreads(); } } 8
4 Change of Result on Modified Program 60 digits, LP=0*04, Prime numbers N=0,000*04, 000 iterations M: Number of Sieved data () Original: M=4484 () After parallelization (times) M=[4488, 44867], mean=4480 largest difference = 87 (0.0%) 9 Conditions PC: Core of Dell Vostro 00 Intel Core,.GHz, GB Windows Vista, gcc, -O Vector Processor ES node (8 CPU).GHz: 89Gflops, 8GB SUPER-UX, Auto-parallel FORTRAN+MPI GPU NVIDIA GTX80.44GHz,.GB Unix, CUDA., -O 0 Time of Sieve Processing 60 digits Ratio of Sieve Processing 60 digits Computation time (hours) PC GPU ES ( node) ratio 00 0 PC/ES PC/GPU GPU/ES Prime numbers in Base (*0^6) Prime numbers in Base (0^6) ratio Dependency of Base size digits 0 digits PC/ES PC/GPU.E+04.E+0.E+06.E+07.E+08.E+09.E+0 Prime numbers in Base ES (Vector Processor) Parallelization is simple and easy, however a special treatment is needed for storing sieved data. GPU Fine grain Parallelization is needed, and that makes data dependency for sieved data. By omitting some data inconsistencies, we chose forced parallelization. 4
5 Summary of Sieve Processing Almost all ops. are addition of bits integer with different-stride 99.9 % are vectorizable, and easy MPI-parallelizable Speed depends on the size of fast memory One node of ES is times faster than PC (Check before pick up becomes times faster) GPU (GXT80) is 60 times faster than PC (Force-parallelization has 0.0% loss of sieved data; types of parallelization are used based on the magnitude of primes) High speed range of Base is ES >> GPU >> PC Forecast of RSA-768 ( digits) ratio Time(Node Year) (%) CPU GPU ES Sieve Processing Matrices polynomials Alg. SQRT Others Prediction Guess of RSA-04 (09 digits) ratio Time(Node Year) (%) CPU GPU ES Sieve Processing 8 6*0 0^ Matrices 9 4* Polynomials Alg. SQRT Others Guess is based on Sieve 0, 0- Mat. 0, Others 0
ENHANCING THE PERFORMANCE OF FACTORING ALGORITHMS
ENHANCING THE PERFORMANCE OF FACTORING ALGORITHMS GIVEN n FIND p 1,p 2,..,p k SUCH THAT n = p 1 d 1 p 2 d 2.. p k d k WHERE p i ARE PRIMES FACTORING IS CONSIDERED TO BE A VERY HARD. THE BEST KNOWN ALGORITHM
More informationCRYPTOGRAPHIC COMPUTING
CRYPTOGRAPHIC COMPUTING ON GPU Chen Mou Cheng Dept. Electrical Engineering g National Taiwan University January 16, 2009 COLLABORATORS Daniel Bernstein, UIC, USA Tien Ren Chen, Army Tanja Lange, TU Eindhoven,
More informationDynamic Scheduling for Work Agglomeration on Heterogeneous Clusters
Dynamic Scheduling for Work Agglomeration on Heterogeneous Clusters Jonathan Lifflander, G. Carl Evans, Anshu Arya, Laxmikant Kale University of Illinois Urbana-Champaign May 25, 2012 Work is overdecomposed
More informationBlock AIR Methods. For Multicore and GPU. Per Christian Hansen Hans Henrik B. Sørensen. Technical University of Denmark
Block AIR Methods For Multicore and GPU Per Christian Hansen Hans Henrik B. Sørensen Technical University of Denmark Model Problem and Notation Parallel-beam 3D tomography exact solution exact data noise
More informationSolving PDEs with CUDA Jonathan Cohen
Solving PDEs with CUDA Jonathan Cohen jocohen@nvidia.com NVIDIA Research PDEs (Partial Differential Equations) Big topic Some common strategies Focus on one type of PDE in this talk Poisson Equation Linear
More informationGurgen Khachatrian Martun Karapetyan
34 International Journal Information Theories and Applications, Vol. 23, Number 1, (c) 2016 On a public key encryption algorithm based on Permutation Polynomials and performance analyses Gurgen Khachatrian
More informationIntroduction to numerical computations on the GPU
Introduction to numerical computations on the GPU Lucian Covaci http://lucian.covaci.org/cuda.pdf Tuesday 1 November 11 1 2 Outline: NVIDIA Tesla and Geforce video cards: architecture CUDA - C: programming
More informationClaude Tadonki. MINES ParisTech PSL Research University Centre de Recherche Informatique
Claude Tadonki MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Monthly CRI Seminar MINES ParisTech - CRI June 06, 2016, Fontainebleau (France)
More informationTR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems
TR-0-07 A Comparison of the Performance of ::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems Ang Li, Omkar Deshmukh, Radu Serban, Dan Negrut May, 0 Abstract ::GPU is a
More informationBachet s equation and groups formed from solutions in Z p
Bachet s equation and groups formed from solutions in Z p Boise State University April 30, 2015 Elliptic Curves and Bachet s Equation Elliptic curves are of the form y 2 = x 3 + ax + b Bachet equations
More informationDirect Self-Consistent Field Computations on GPU Clusters
Direct Self-Consistent Field Computations on GPU Clusters Guochun Shi, Volodymyr Kindratenko National Center for Supercomputing Applications University of Illinois at UrbanaChampaign Ivan Ufimtsev, Todd
More informationDense Arithmetic over Finite Fields with CUMODP
Dense Arithmetic over Finite Fields with CUMODP Sardar Anisul Haque 1 Xin Li 2 Farnam Mansouri 1 Marc Moreno Maza 1 Wei Pan 3 Ning Xie 1 1 University of Western Ontario, Canada 2 Universidad Carlos III,
More informationPublic Key Encryption
Public Key Encryption 3/13/2012 Cryptography 1 Facts About Numbers Prime number p: p is an integer p 2 The only divisors of p are 1 and p s 2, 7, 19 are primes -3, 0, 1, 6 are not primes Prime decomposition
More informationS0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA
S0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA Date: 16th May 2012 Wed, 3pm to 3.25pm(Adv. Session) Sathyanarayana K., Manish Banga, and Ravi Kumar G. V. V. Engineering Services,
More informationTips Geared Towards R. Adam J. Suarez. Arpil 10, 2015
Tips Geared Towards R Departments of Statistics North Carolina State University Arpil 10, 2015 1 / 30 Advantages of R As an interpretive and interactive language, developing an algorithm in R can be done
More informationOpen-Source Parallel FE Software : FrontISTR -- Performance Considerations about B/F (Byte per Flop) of SpMV on K-Supercomputer and GPU-Clusters --
Parallel Processing for Energy Efficiency October 3, 2013 NTNU, Trondheim, Norway Open-Source Parallel FE Software : FrontISTR -- Performance Considerations about B/F (Byte per Flop) of SpMV on K-Supercomputer
More informationExperience in Factoring Large Integers Using Quadratic Sieve
Experience in Factoring Large Integers Using Quadratic Sieve D. J. Guan Department of Computer Science, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424 guan@cse.nsysu.edu.tw April 19, 2005 Abstract
More informationMulticore Parallelization of Determinant Quantum Monte Carlo Simulations
Multicore Parallelization of Determinant Quantum Monte Carlo Simulations Andrés Tomás, Che-Rung Lee, Zhaojun Bai, Richard Scalettar UC Davis SIAM Conference on Computation Science & Engineering Reno, March
More informationShortest Lattice Vector Enumeration on Graphics Cards
Shortest Lattice Vector Enumeration on Graphics Cards Jens Hermans 1 Michael Schneider 2 Fréderik Vercauteren 1 Johannes Buchmann 2 Bart Preneel 1 1 K.U.Leuven 2 TU Darmstadt SHARCS - 10 September 2009
More informationTi Secured communications
Ti5318800 Secured communications Pekka Jäppinen September 20, 2007 Pekka Jäppinen, Lappeenranta University of Technology: September 20, 2007 Relies on use of two keys: Public and private Sometimes called
More informationS XMP LIBRARY INTERNALS. Niall Emmart University of Massachusetts. Follow on to S6151 XMP: An NVIDIA CUDA Accelerated Big Integer Library
S6349 - XMP LIBRARY INTERNALS Niall Emmart University of Massachusetts Follow on to S6151 XMP: An NVIDIA CUDA Accelerated Big Integer Library High Performance Modular Exponentiation A^K mod P Where A,
More informationAccelerated Search for Gaussian Generator Based on Triple Prime Integers
Journal of Computer Science 5 (9): 614-618, 2009 ISSN 1549-3636 2009 Science Publications Accelerated Search for Gaussian Generator Based on Triple Prime Integers 1 Boris S. Verkhovsky and 2 Md Shiblee
More informationAccelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers
UT College of Engineering Tutorial Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers Stan Tomov 1, George Bosilca 1, and Cédric
More informationA model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization)
A model leading to self-consistent iteration computation with need for HP LA (e.g, diagonalization and orthogonalization) Schodinger equation: Hψ = Eψ Choose a basis set of wave functions Two cases: Orthonormal
More informationP214 Efficient Computation of Passive Seismic Interferometry
P214 Efficient Computation of Passive Seismic Interferometry J.W. Thorbecke* (Delft University of Technology) & G.G. Drijkoningen (Delft University of Technology) SUMMARY Seismic interferometry is from
More informationEfficient implementation of the overlap operator on multi-gpus
Efficient implementation of the overlap operator on multi-gpus Andrei Alexandru Mike Lujan, Craig Pelissier, Ben Gamari, Frank Lee SAAHPC 2011 - University of Tennessee Outline Motivation Overlap operator
More informationToward High Performance Matrix Multiplication for Exact Computation
Toward High Performance Matrix Multiplication for Exact Computation Pascal Giorgi Joint work with Romain Lebreton (U. Waterloo) Funded by the French ANR project HPAC Séminaire CASYS - LJK, April 2014 Motivations
More informationCongruence Classes. Number Theory Essentials. Modular Arithmetic Systems
Cryptography Introduction to Number Theory 1 Preview Integers Prime Numbers Modular Arithmetic Totient Function Euler's Theorem Fermat's Little Theorem Euclid's Algorithm 2 Introduction to Number Theory
More informationOverview. Background / Context. CSC 580 Cryptography and Computer Security. March 21, 2017
CSC 580 Cryptography and Computer Security Math for Public Key Crypto, RSA, and Diffie-Hellman (Sections 2.4-2.6, 2.8, 9.2, 10.1-10.2) March 21, 2017 Overview Today: Math needed for basic public-key crypto
More informationParallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2
1 / 23 Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption Langshi CHEN 1,2,3 Supervised by Serge PETITON 2 Maison de la Simulation Lille 1 University CNRS March 18, 2013
More informationAntti-Pekka Hynninen, 5/10/2017, GTC2017, San Jose CA
S7255: CUTT: A HIGH- PERFORMANCE TENSOR TRANSPOSE LIBRARY FOR GPUS Antti-Pekka Hynninen, 5/10/2017, GTC2017, San Jose CA MOTIVATION Tensor contractions are the most computationally intensive part of quantum
More informationIntroduction to Modern Cryptography. Lecture RSA Public Key CryptoSystem 2. One way Trapdoor Functions
Introduction to Modern Cryptography Lecture 7 1. RSA Public Key CryptoSystem 2. One way Trapdoor Functions Diffie and Hellman (76) New Directions in Cryptography Split the Bob s secret key K to two parts:
More informationDiscrete Logarithm Problem
Discrete Logarithm Problem Çetin Kaya Koç koc@cs.ucsb.edu (http://cs.ucsb.edu/~koc/ecc) Elliptic Curve Cryptography lect08 discrete log 1 / 46 Exponentiation and Logarithms in a General Group In a multiplicative
More informationSP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay
SP-CNN: A Scalable and Programmable CNN-based Accelerator Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay Motivation Power is a first-order design constraint, especially for embedded devices. Certain
More informationSPARSE SOLVERS POISSON EQUATION. Margreet Nool. November 9, 2015 FOR THE. CWI, Multiscale Dynamics
SPARSE SOLVERS FOR THE POISSON EQUATION Margreet Nool CWI, Multiscale Dynamics November 9, 2015 OUTLINE OF THIS TALK 1 FISHPACK, LAPACK, PARDISO 2 SYSTEM OVERVIEW OF CARTESIUS 3 POISSON EQUATION 4 SOLVERS
More informationLogic gates. Quantum logic gates. α β 0 1 X = 1 0. Quantum NOT gate (X gate) Classical NOT gate NOT A. Matrix form representation
Quantum logic gates Logic gates Classical NOT gate Quantum NOT gate (X gate) A NOT A α 0 + β 1 X α 1 + β 0 A N O T A 0 1 1 0 Matrix form representation 0 1 X = 1 0 The only non-trivial single bit gate
More informationSecurity II: Cryptography exercises
Security II: Cryptography exercises Markus Kuhn Lent 2015 Part II Some of the exercises require the implementation of short programs. The model answers use Perl (see Part IB Unix Tools course), but you
More informationCryptographic Hash Functions
Cryptographic Hash Functions Çetin Kaya Koç koc@ece.orst.edu Electrical & Computer Engineering Oregon State University Corvallis, Oregon 97331 Technical Report December 9, 2002 Version 1.5 1 1 Introduction
More informationClass invariants by the CRT method
Class invariants by the CRT method Andreas Enge Andrew V. Sutherland INRIA Bordeaux-Sud-Ouest Massachusetts Institute of Technology ANTS IX Andreas Enge and Andrew Sutherland Class invariants by the CRT
More informationSOLUTION of linear systems of equations of the form:
Proceedings of the Federated Conference on Computer Science and Information Systems pp. Mixed precision iterative refinement techniques for the WZ factorization Beata Bylina Jarosław Bylina Institute of
More informationEfficient and Cryptographically Secure Generation of Chaotic Pseudorandom Numbers on GPU
Efficient and Cryptographically Secure Generation of Chaotic Pseudorandom Numbers on GPU Jacques M. Bahi, Raphaël Couturier, Christophe Guyeux, and Pierre-Cyrille Héam October 30, 2018 arxiv:1112.5239v1
More informationWhat is The Continued Fraction Factoring Method
What is The Continued Fraction Factoring Method? Slide /7 What is The Continued Fraction Factoring Method Sohail Farhangi June 208 What is The Continued Fraction Factoring Method? Slide 2/7 Why is Factoring
More informationPerformance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures
Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures José I. Aliaga Performance and Energy Analysis of the Iterative Solution of Sparse
More informationSecurity Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography
Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography Peter Schwabe October 21 and 28, 2011 So far we assumed that Alice and Bob both have some key, which nobody else has. How
More informationAn Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor
An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor Christian Lessig Abstract The Algorithm of Multiple Relatively Robust Representations (MRRR) is one of the most efficient and accurate
More informationDiophantine equations via weighted LLL algorithm
Cryptanalysis of a public key cryptosystem based on Diophantine equations via weighted LLL algorithm Momonari Kudo Graduate School of Mathematics, Kyushu University, JAPAN Kyushu University Number Theory
More informationParallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics)
Parallel programming practices for the solution of Sparse Linear Systems (motivated by computational physics and graphics) Eftychios Sifakis CS758 Guest Lecture - 19 Sept 2012 Introduction Linear systems
More informationECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA
ECM at Work Joppe W. Bos 1 and Thorsten Kleinjung 2 1 Microsoft Research, Redmond, USA 2 Laboratory for Cryptologic Algorithms, EPFL, Lausanne, Switzerland 1 / 18 Security assessment of public-key cryptography
More informationElliptic Curve Computations
Elliptic Curve Computations New Mexico Supercomputing Challenge Final Report April 1st, 2009 Team 66 Manzano High School Team Members Kristin Cordwell Chen Zhao Teacher Stephen Schum Project Mentor William
More informationBeam dynamics calculation
September 6 Beam dynamics calculation S.B. Vorozhtsov, Е.Е. Perepelkin and V.L. Smirnov Dubna, JINR http://parallel-compute.com Outline Problem formulation Numerical methods OpenMP and CUDA realization
More informationA Massively Parallel Eigenvalue Solver for Small Matrices on Multicore and Manycore Architectures
A Massively Parallel Eigenvalue Solver for Small Matrices on Multicore and Manycore Architectures Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,
More informationOn the complexity of computing discrete logarithms in the field F
On the complexity of computing discrete logarithms in the field F 3 6 509 Francisco Rodríguez-Henríquez CINVESTAV-IPN Joint work with: Gora Adj Alfred Menezes Thomaz Oliveira CINVESTAV-IPN University of
More informationGauss Sieve on GPUs. Shang-Yi Yang 1, Po-Chun Kuo 1, Bo-Yin Yang 2, and Chen-Mou Cheng 1
Gauss Sieve on GPUs Shang-Yi Yang 1, Po-Chun Kuo 1, Bo-Yin Yang 2, and Chen-Mou Cheng 1 1 Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan {ilway25,kbj,doug}@crypto.tw 2
More informationCryptography and Security Midterm Exam
Cryptography and Security Midterm Exam Serge Vaudenay 23.11.2017 duration: 1h45 no documents allowed, except one 2-sided sheet of handwritten notes a pocket calculator is allowed communication devices
More informationCOMP424 Computer Security
COMP424 Computer Security Prof. Wiegley jeffw@csun.edu Rivest, Shamir & Adelman (RSA) Implementation 1 Relatively prime Prime: n, is prime if its only two factors are 1 and n. (and n 1). Relatively prime:
More information9 Knapsack Cryptography
9 Knapsack Cryptography In the past four weeks, we ve discussed public-key encryption systems that depend on various problems that we believe to be hard: prime factorization, the discrete logarithm, and
More informationAttacks on RSA & Using Asymmetric Crypto
Attacks on RSA & Using Asymmetric Crypto Luke Anderson luke@lukeanderson.com.au 7 th April 2017 University Of Sydney Overview 1. Crypto-Bulletin 2. Breaking RSA 2.1 Chinese Remainder Theorem 2.2 Common
More informationCS-206 Concurrency. Lecture 13. Wrap Up. Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/
CS-206 Concurrency Lecture 13 Wrap Up Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/ Created by Nooshin Mirzadeh, Georgios Psaropoulos and Babak Falsafi EPFL Copyright 2015 EPFL CS-206 Spring
More informationQuantum Computer Simulation Using CUDA (Quantum Fourier Transform Algorithm)
Quantum Computer Simulation Using CUDA (Quantum Fourier Transform Algorithm) Alexander Smith & Khashayar Khavari Department of Electrical and Computer Engineering University of Toronto April 15, 2009 Alexander
More informationScalable and Power-Efficient Data Mining Kernels
Scalable and Power-Efficient Data Mining Kernels Alok Choudhary, John G. Searle Professor Dept. of Electrical Engineering and Computer Science and Professor, Kellogg School of Management Director of the
More information8 Elliptic Curve Cryptography
8 Elliptic Curve Cryptography 8.1 Elliptic Curves over a Finite Field For the purposes of cryptography, we want to consider an elliptic curve defined over a finite field F p = Z/pZ for p a prime. Given
More informationElliptic Curve Cryptography
Elliptic Curve Cryptography Elliptic Curves An elliptic curve is a cubic equation of the form: y + axy + by = x 3 + cx + dx + e where a, b, c, d and e are real numbers. A special addition operation is
More informationEfficient Modular Exponentiation Based on Multiple Multiplications by a Common Operand
Efficient Modular Exponentiation Based on Multiple Multiplications by a Common Operand Christophe Negre, Thomas Plantard, Jean-Marc Robert Team DALI (UPVD) and LIRMM (UM2, CNRS), France CCISR, SCIT, (University
More informationHybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS
Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS Jorge González-Domínguez*, Bertil Schmidt*, Jan C. Kässens**, Lars Wienbrandt** *Parallel and Distributed Architectures
More informationPublic Key Algorithms
Public Key Algorithms Raj Jain Washington University in Saint Louis Saint Louis, MO 63130 Jain@cse.wustl.edu Audio/Video recordings of this lecture are available at: http://www.cse.wustl.edu/~jain/cse571-09/
More informationPolynomial Selection. Thorsten Kleinjung École Polytechnique Fédérale de Lausanne
Polynomial Selection Thorsten Kleinjung École Polytechnique Fédérale de Lausanne Contents Brief summary of polynomial selection (no root sieve) Motivation (lattice sieving, monic algebraic polynomial)
More informationRandomized Selection on the GPU. Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory
Randomized Selection on the GPU Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory High Performance Graphics 2011 August 6, 2011 Top k Selection on GPU Output the top k keys
More informationCryptography IV: Asymmetric Ciphers
Cryptography IV: Asymmetric Ciphers Computer Security Lecture 7 David Aspinall School of Informatics University of Edinburgh 31st January 2011 Outline Background RSA Diffie-Hellman ElGamal Summary Outline
More informationOn the design of parallel linear solvers for large scale problems
On the design of parallel linear solvers for large scale problems ICIAM - August 2015 - Mini-Symposium on Recent advances in matrix computations for extreme-scale computers M. Faverge, X. Lacoste, G. Pichon,
More informationPerformance evaluation of scalable optoelectronics application on large-scale Knights Landing cluster
Performance evaluation of scalable optoelectronics application on large-scale Knights Landing cluster Yuta Hirokawa Graduate School of Systems and Information Engineering, University of Tsukuba hirokawa@hpcs.cs.tsukuba.ac.jp
More informationParallel Sparse Tensor Decompositions using HiCOO Format
Figure sources: A brief survey of tensors by Berton Earnshaw and NVIDIA Tensor Cores Parallel Sparse Tensor Decompositions using HiCOO Format Jiajia Li, Jee Choi, Richard Vuduc May 8, 8 @ SIAM ALA 8 Outline
More informationMathematics of Public Key Cryptography
Mathematics of Public Key Cryptography Eric Baxter April 12, 2014 Overview Brief review of public-key cryptography Mathematics behind public-key cryptography algorithms What is Public-Key Cryptography?
More informationElliptic and Hyperelliptic Curves: a Practical Security Comparison"
Elliptic and Hyperelliptic Curves: a Practical Security Comparison Joppe W. Bos (Microsoft Research), Craig Costello (Microsoft Research),! Andrea Miele (EPFL) 1/13 Motivation and Goal(s)! Elliptic curves
More informationAn Efficient Heuristic Algorithm for Linear Decomposition of Index Generation Functions
An Efficient Heuristic Algorithm for Linear Decomposition of Index Generation Functions Shinobu Nagayama Tsutomu Sasao Jon T. Butler Dept. of Computer and Network Eng., Hiroshima City University, Hiroshima,
More informationFPGA Implementation of a Predictive Controller
FPGA Implementation of a Predictive Controller SIAM Conference on Optimization 2011, Darmstadt, Germany Minisymposium on embedded optimization Juan L. Jerez, George A. Constantinides and Eric C. Kerrigan
More informationHomomorphic Encryption. Liam Morris
Homomorphic Encryption Liam Morris Topics What Is Homomorphic Encryption? Partially Homomorphic Cryptosystems Fully Homomorphic Cryptosystems Benefits of Homomorphism Drawbacks of Homomorphism What Is
More informationThe RSA Cipher and its Algorithmic Foundations
Chapter 1 The RSA Cipher and its Algorithmic Foundations The most important that is, most applied and most analyzed asymmetric cipher is RSA, named after its inventors Ron Rivest, Adi Shamir, and Len Adleman.
More informationFast Scalar Multiplication on Elliptic Curves fo Sensor Nodes
Réseaux Grand Est Fast Scalar Multiplication on Elliptic Curves fo Sensor Nodes Youssou FAYE Hervé GUYENNET Yanbo SHOU Université de Franche-Comté Besançon le 24 octobre 2013 TABLE OF CONTENTS ❶ Introduction
More informationFurther improving security of Vector Stream Cipher
NOLTA, IEICE Paper Further improving security of Vector Stream Cipher Atsushi Iwasaki 1a) and Ken Umeno 2 1 Fukuoka Institute of Technology Wajiro-higashi, Higashiku, Fukuoka 811-0295, Japan 2 Graduate
More information1 Recommended Reading 1. 2 Public Key/Private Key Cryptography Overview RSA Algorithm... 2
Contents 1 Recommended Reading 1 2 Public Key/Private Key Cryptography 1 2.1 Overview............................................. 1 2.2 RSA Algorithm.......................................... 2 3 A Number
More informationInformation Security
SE 4472 / ECE 9064 Information Security Week 12: Random Number Generators and Picking Appropriate Key Lengths Fall 2015 Prof. Aleksander Essex Random Number Generation Where do keys come from? So far we
More informationTwo case studies of Monte Carlo simulation on GPU
Two case studies of Monte Carlo simulation on GPU National Institute for Computational Sciences University of Tennessee Seminar series on HPC, Feb. 27, 2014 Outline 1 Introduction 2 Discrete energy lattice
More informationElliptic Curves Cryptography and factorization. Part VIII. Elliptic curves cryptography and factorization. Historical Remarks.
Elliptic Curves Cryptography and factorization Part VIII Elliptic curves cryptography and factorization Cryptography based on manipulation of points of so called elliptic curves is getting momentum and
More informationThe factorization of RSA D. J. Bernstein University of Illinois at Chicago
The factorization of RSA-1024 D. J. Bernstein University of Illinois at Chicago Abstract: This talk discusses the most important tools for attackers breaking 1024-bit RSA keys today and tomorrow. The same
More informationPuReMD-GPU: A Reactive Molecular Dynamic Simulation Package for GPUs
Purdue University Purdue e-pubs Department of Computer Science Technical Reports Department of Computer Science 2012 PuReMD-GPU: A Reactive Molecular Dynamic Simulation Package for GPUs Sudhir B. Kylasa
More informationOn Portability, Performance and Scalability of a MPI OpenCL Lattice Boltzmann Code
On Portability, Performance and Scalability of a MPI OpenCL Lattice Boltzmann Code E Calore, S F Schifano, R Tripiccione Enrico Calore INFN Ferrara, Italy 7 th Workshop on UnConventional High Performance
More informationAnalysis of the RSA Encryption Algorithm
Analysis of the RSA Encryption Algorithm Betty Huang June 16, 2010 Abstract The RSA encryption algorithm is commonly used in public security due to the asymmetric nature of the cipher. The procedure is
More informationEfficient Leak Resistant Modular Exponentiation in RNS
Efficient Leak Resistant Modular Exponentiation in RNS Andrea Lesavourey (1), Christophe Negre (1) and Thomas Plantard (2) (1) DALI (UPVD) and LIRMM (Univ. of Montpellier, CNRS), Perpignan, France (2)
More informationAn Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor
An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor Christian Lessig Abstract The Algorithm of Multiple Relatively Robust Representations (MRRRR) is one of the most efficient and most
More informationverdex pro Series Performance and Power Technical Reference
verdex pro Series Performance and Power Technical Reference Operating Temperature PXA270-based verdex pro and verdex COMs PXA255-based basix and connex motherboards Summary Performance Benchmarks: Verdex
More informationCorrectness, Security and Efficiency of RSA
Correttezza di RSA Correctness, Security and Efficiency of RSA Ozalp Babaoglu! Bisogna dimostrare D(C(m)) = m ALMA MATER STUDIORUM UNIVERSITA DI BOLOGNA 2 Correttezza di RSA Correttezza di RSA! Risultati
More informationAn Efficient FETI Implementation on Distributed Shared Memory Machines with Independent Numbers of Subdomains and Processors
Contemporary Mathematics Volume 218, 1998 B 0-8218-0988-1-03024-7 An Efficient FETI Implementation on Distributed Shared Memory Machines with Independent Numbers of Subdomains and Processors Michel Lesoinne
More informationBeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power
BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power James C. Hoe Department of ECE Carnegie Mellon niversity Eric S. Chung, et al., Single chip Heterogeneous Computing:
More informationERLANGEN REGIONAL COMPUTING CENTER
ERLANGEN REGIONAL COMPUTING CENTER Making Sense of Performance Numbers Georg Hager Erlangen Regional Computing Center (RRZE) Friedrich-Alexander-Universität Erlangen-Nürnberg OpenMPCon 2018 Barcelona,
More informationThe Lattice Boltzmann Method for Laminar and Turbulent Channel Flows
The Lattice Boltzmann Method for Laminar and Turbulent Channel Flows Vanja Zecevic, Michael Kirkpatrick and Steven Armfield Department of Aerospace Mechanical & Mechatronic Engineering The University of
More informationCpt S 223. School of EECS, WSU
Algorithm Analysis 1 Purpose Why bother analyzing code; isn t getting it to work enough? Estimate time and memory in the average case and worst case Identify bottlenecks, i.e., where to reduce time Compare
More informationAUTHOR QUERY FORM. Fax: For correction or revision of any artwork, please consult
Our reference: YJCPH 52 P-authorquery-v7 AUTHOR QUERY FORM Journal: YJCPH Please e-mail or fax your responses and any corrections to: Article Number: 52 E-mail: corrections.esch@elsevier.vtex.lt Fax: +
More informationWelcome to MCS 572. content and organization expectations of the course. definition and classification
Welcome to MCS 572 1 About the Course content and organization expectations of the course 2 Supercomputing definition and classification 3 Measuring Performance speedup and efficiency Amdahl s Law Gustafson
More informationCPSC 467b: Cryptography and Computer Security
CPSC 467b: Cryptography and Computer Security Michael J. Fischer Lecture 10 February 19, 2013 CPSC 467b, Lecture 10 1/45 Primality Tests Strong primality tests Weak tests of compositeness Reformulation
More informationLevel-3 BLAS on a GPU
Level-3 BLAS on a GPU Picking the Low Hanging Fruit Francisco Igual 1 Gregorio Quintana-Ortí 1 Robert A. van de Geijn 2 1 Departamento de Ingeniería y Ciencia de los Computadores. University Jaume I. Castellón
More information