The tangent FFT. D. J. Bernstein University of Illinois at Chicago

Size: px
Start display at page:

Download "The tangent FFT. D. J. Bernstein University of Illinois at Chicago"

Transcription

1 The tangent FFT D. J. Bernstein University of Illinois at Chicago

2 Advertisement SPEED: Software Performance Enhancement for Encryption and Decryption A workshop on software speeds for secret-key cryptography and public-key cryptography. Amsterdam, June 11 12,

3 The convolution problem How quickly can we multiply polynomials in the ring R[Ü]? Answer depends on degrees, representation of polynomials, number of polynomials, etc. Answer also depends on definition of quickly. Many models of computation; many interesting cost measures.

4 Assume two inputs ¾ R[Ü], deg Ñ, deg Ò Ñ, so deg Ò. Assume represented as coeff sequences. How quickly can we compute the Ò coeffs of, given s Ñ coeffs and s Ò Ñ + 1 coeffs? Inputs ( 0 1 Ñ 1) ¾ R Ñ, ( 0 1 Ò Ñ) ¾ R Ò Ñ+1. Output ( 0 1 Ò 1) ¾ R Ò with Ü+ + Ò 1Ü Ò 1 = ( Ü + )( Ü + ).

5 Assume R-algebraic algorithms (without divs, branches): chains of binary R-adds Ù Ú Ù + Ú, binary R-subs Ù Ú Ù Ú, binary R-mults Ù Ú ÙÚ, starting from inputs and constants. Example (1963 Karatsuba):

6 Cost measure for this talk: total R-algebraic complexity. Cost 1 for binary R-add; cost 1 for binary R-sub; cost 1 for binary R-mult; cost 0 for constant in R. Many real-world computations use (e.g.) Pentium M s floating-point operations to approximate operations in R. Properly scheduled operations achieve Pentium M cycles total R-algebraic complexity.

7 Fast Fourier transforms Define Ò ¾ C as exp(2 Ò). Define Ì : C[Ü] (Ü Ò 1) C Ò as (1) ( Ò ) ( Ò 1 Ò ). Can very quickly compute Ì. First publication of fast algorithm: 1866 Gauss. Easy to see that Gauss s FFT uses Ç(Ò lg Ò) arithmetic operations if Ò ¾ Several subsequent reinventions, ending with 1965 Cooley Tukey.

8 Inverse map is also very fast. Multiplication in C Ò is very fast Sande, 1966 Stockham: Can very quickly multiply in C[Ü] (Ü Ò 1) or C[Ü] or R[Ü] by mapping C[Ü] (Ü Ò 1) to C Ò. Given ¾ C[Ü] (Ü Ò 1): compute as Ì 1 (Ì( )Ì( )). Given ¾ C[Ü] with deg Ò: compute from its image in C[Ü] (Ü Ò 1). R-algebraic complexity Ç(Ò lg Ò).

9 A closer look at costs More precise analysis of Gauss FFT (and Cooley-Tukey FFT): C[Ü] (Ü Ò 1) C Ò using (1 2)Ò lg Ò binary C-adds, (1 2)Ò lg Ò binary C-subs, (1 2)Ò lg Ò binary C-mults, if Ò ¾ ( ) ¾ R 2 represents + ¾ C. C-add, C-sub, C-mult cost 2 2 6: ( ) + ( ) = ( + + ), ( ) ( ) = ( ), ( )( ) = ( + ).

10 Total cost 5Ò lg Ò. Easily save some time: eliminate mults by 1; absorb mults by 1 into subsequent operations; simplify mults by Ô using, e.g., ( )(1 Ô 2 1 Ô 2) = (( ) Ô 2 ( + ) Ô 2). Cost 5Ò lg Ò 10Ò + 16 to map C[Ü] (Ü Ò 1) C Ò, if Ò ¾

11 What about cost of convolution? 5Ò lg Ò + Ç(Ò) to compute Ì( ), 5Ò lg Ò + Ç(Ò) to compute Ì( ), Ç(Ò) to multiply in C Ò, similar 5Ò lg Ò + Ç(Ò) for Ì 1. Total cost 15Ò lg Ò + Ç(Ò) to compute ¾ C[Ü] (Ü Ò 1) given ¾ C[Ü] (Ü Ò 1). Total cost (15 2)Ò lg Ò + Ç(Ò) to compute ¾ R[Ü] (Ü Ò 1) given ¾ R[Ü] (Ü Ò 1): map R[Ü] (Ü Ò 1) R 2 C Ò 2 1 (Gauss) to save half the time.

12 1968 R. Yavne: Can do better! Cost 4Ò lg Ò 6Ò + 8 to map C[Ü] (Ü Ò 1) C Ò, if Ò ¾

13 1968 R. Yavne: Can do better! Cost 4Ò lg Ò 6Ò + 8 to map C[Ü] (Ü Ò 1) C Ò, if Ò ¾ James Van Buskirk: Can do better! Cost (34 9)Ò lg Ò + Ç(Ò). Expositions of the new algorithm: Frigo, Johnson, in IEEE Trans. Signal Processing; Lundy, Van Buskirk, in Computing; Bernstein, this talk, expanding an old idea of Fiduccia.

14 Van Buskirk, comp.arch, January 2005: Have you ever considered changing djbfft to get better opcounts along the lines of home.comcast.net/~kmbtib?

15 Van Buskirk, comp.arch, January 2005: Have you ever considered changing djbfft to get better opcounts along the lines of home.comcast.net/~kmbtib? Bernstein, comp.arch: What do you mean, better opcounts? The algebraic complexity of a size-2 complex DFT has stood at (3 3)2 +4 additions and ( 3)2 +4 multiplications since 1968.

16 Van Buskirk, comp.arch, January 2005: Have you ever considered changing djbfft to get better opcounts along the lines of home.comcast.net/~kmbtib? Bernstein, comp.arch: What do you mean, better opcounts? The algebraic complexity of a size-2 complex DFT has stood at (3 3)2 +4 additions and ( 3)2 +4 multiplications since Van Buskirk, comp.arch: Oh, you re so 20th century, Dan. Read the handwriting on the wall.

17 Understanding the FFT The FFT trick: C[Ü] (Ü Ò 1) C[Ü] (Ü Ò 2 1) C[Ü] (Ü Ò 2 +1) by unique C[Ü]-algebra morphism. Cost 2Ò: Ò 2 C-adds, Ò 2 C-subs. e.g. Ò = 4: C[Ü] (Ü 4 1) C[Ü] (Ü 2 1) C[Ü] (Ü 2 + 1) by Ü + 2 Ü Ü 3 ( ) + ( )Ü ( 0 2 ) + ( 1 3 )Ü. Representation: ( ) (( ) ( )) (( 0 2 ) ( 1 3 )).

18 Recurse: C[Ü] (Ü Ò 2 1) C[Ü] (Ü Ò 4 1) C[Ü] (Ü Ò 4 +1); similarly C[Ü] (Ü Ò 2 + 1) C[Ü] (Ü Ò 4 ) C[Ü] (Ü Ò 4 + ); continue to C[Ü] (Ü 1). General case: C[Ü] (Ü Ò «2 ) C[Ü] (Ü Ò 2 «) C[Ü] (Ü Ò 2 +«) by Ò 2 Ü Ò 2 + ( 0 + «Ò 2 ) + ( 1 + «)Ü +, ( 0 «Ò 2 ) + ( 1 «)Ü +. Cost 5Ò: Ò 2 C-mults, Ò 2 C-adds, Ò 2 C-subs. Recurse, eliminate easy mults. Cost 5Ò lg Ò 10Ò + 16.

19 Alternative: the twisted FFT After previous C[Ü] (Ü Ò 1) C[Ü] (Ü Ò 2 1) C[Ü] (Ü Ò 2 +1), apply unique C-algebra morphism C[Ü] (Ü Ò 2 +1) C[Ý] (Ý Ò 2 1) that maps Ü to Ò Ý Ü+ + Ò 2 Ü Ò 2 + ( 0 + Ò 2 )+( 1 + Ò 2+1 )Ü+, ( 0 Ò 2 )+ Ò ( 1 Ò 2+1 )Ý+. Again cost 5Ò: Ò 2 C-mults, Ò 2 C-adds, Ò 2 C-subs. Eliminate easy mults, recurse. Cost 5Ò lg Ò 10Ò + 16.

20 The split-radix FFT FFT and twisted FFT end up with same number of mults by Ò, same number of mults by Ò 2, same number of mults by Ò 4, etc. Is this necessary? No! Split-radix FFT: more easy mults. Don t twist until you see the whites of their s. (Can use same idea to speed up Schönhage-Strassen algorithm for integer multiplication.)

21 Cost 2Ò: C[Ü] (Ü Ò 1) C[Ü] (Ü Ò 2 1) C[Ü] (Ü Ò 2 +1). Cost Ò: C[Ü] (Ü Ò 2 + 1) C[Ü] (Ü Ò 4 ) C[Ü] (Ü Ò 4 + ). Cost 6(Ò 4): C[Ü] (Ü Ò 4 ) C[Ý] (Ý Ò 4 1) by Ü Ò Ý. Cost 6(Ò 4): C[Ü] (Ü Ò 4 + ) C[Ý] (Ý Ò 4 1) by Ü 1 Ò Ý. Overall cost 6Ò to split into , entropy 1 5 bits. Eliminate easy mults, recurse. Cost 4Ò lg Ò 6Ò + 8, exactly as in 1968 Yavne.

22 The tangent FFT Several ways to achieve cost 6 for mult by. One approach: Factor as (1 + tan ) cos. Cost 2 for mult by cos. Cost 4 for mult by 1 + tan. For stability and symmetry, use max cos sin instead of cos. Surprise (Van Buskirk): Can merge some cost-2 mults!

23 Rethink basis of C[Ü] (Ü Ò 1). Instead of 1 Ü Ü Ò 1 use 1 Ò 0 Ü Ò 1 Ü Ò 1 Ò Ò 1 where Ò = max cos 2 Ò max cos 2 max. Ò 4 cos 2 Ò 16 sin 2 Ò sin 2 Ò 4 sin 2 Ò 16 Now ( 0 1 Ò 1) represents 0 Ò Ò 1Ü Ò 1 Ò Ò 1. Note that Ò = Ò +Ò 4. Note that Ò( Ò 4 Ò ) is (1 + tan ) or (cot + ).

24 Cost 2Ò: C[Ü] (Ü Ò 1), basis Ü Ò, C[Ü] (Ü Ò 2 1), basis Ü Ò, C[Ü] (Ü Ò 2 + 1), basis Ü Ò. Cost Ò: C[Ü] (Ü Ò 2 1), basis Ü Ò, C[Ü] (Ü Ò 4 1), basis Ü Ò, C[Ü] (Ü Ò 4 + 1), basis Ü Ò. Cost Ò: C[Ü] (Ü Ò 2 +1), basis Ü Ò, C[Ü] (Ü Ò 4 ), basis Ü Ò, C[Ü] (Ü Ò 4 + ), basis Ü Ò.

25 Cost Ò 2 2: C[Ü] (Ü Ò 4 1), basis Ü Ò, C[Ü] (Ü Ò 4 1), basis Ü Ò 4. Cost Ò 2 2: C[Ü] (Ü Ò 4 +1), basis Ü Ò, C[Ü] (Ü Ò 4 + 1), basis Ü Ò 2. Cost Ò 2: C[Ü] (Ü Ò 4 + 1), basis Ü Ò 2, C[Ü] (Ü Ò 8 ), basis Ü Ò 2, C[Ü] (Ü Ò 8 + ), basis Ü Ò 2.

26 Cost 4(Ò 4) 6: C[Ü] (Ü Ò 4 ), basis Ü Ò, C[Ü] (Ý Ò 4 1), basis Ý Ò 4. Cost 4(Ò 4) 6: C[Ü] (Ü Ò 4 + ), basis Ü Ò, C[Ü] (Ý Ò 4 1), basis Ý Ò 4. Cost 4(Ò 8) 6: C[Ü] (Ü Ò 8 ), basis Ü Ò 2, C[Ü] (Ý Ò 8 1), basis Ý Ò 8. Cost 4(Ò 8) 6: C[Ü] (Ü Ò 8 + ), basis Ü Ò 2, C[Ü] (Ý Ò 8 1), basis Ý Ò 8.

27 Overall cost 8 5Ò 28 to split into , entropy 9 4. Recurse: (34 9)Ò lg Ò + Ç(Ò). What if input is in C[Ü] (Ü Ò 1) with usual basis 1 Ü Ü Ò 1? Could scale immediately, but faster to scale upon twist. Cost (34 9)Ò lg Ò (124 27)Ò 2 lg Ò (2 9)( 1) lg Ò lg Ò + (16 27)( 1) lg Ò + 8, exactly as in 2004 Van Buskirk.

28 Easily handle R[Ü] (Ü Ò + 1) by mapping to C[Ü] (Ü Ò 2 ). Easily handle R[Ü] (Ü Ò 1) by mapping to R[Ü] (Ü Ò 2 1) R[Ü] (Ü Ò 2 +1). Cost (17 9)Ò lg Ò + Ç(Ò) for R[Ü] (Ü Ò 1) R 2 C Ò 2 1, so cost (17 3)Ò lg Ò + Ç(Ò) to compute ¾ R[Ü] (Ü Ò 1) given ¾ R[Ü] (Ü Ò 1). Cost (17 3)Ò lg Ò + Ç(Ò) for size-ò convolution. Open: Can 17 3 be improved?

The tangent FFT. Daniel J. Bernstein

The tangent FFT. Daniel J. Bernstein The tangent FFT Daniel J. Bernstein Department of Mathematics, Statistics, and Computer Science (M/C 249) University of Illinois at Chicago, Chicago, IL 60607 7045, USA djb@cr.yp.to Abstract. The split-radix

More information

Fast multiplication and its applications

Fast multiplication and its applications Algorithmic Number Theory MSRI Publications Volume 44, 2008 Fast multiplication and its applications DANIEL J. BERNSTEIN ABSTRACT. This survey explains how some useful arithmetic operations can be sped

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 14 Divide and Conquer Fast Fourier Transform Sofya Raskhodnikova 10/7/2016 S. Raskhodnikova; based on slides by K. Wayne. 5.6 Convolution and FFT Fast Fourier Transform:

More information

Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases. D. J. Bernstein University of Illinois at Chicago

Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases. D. J. Bernstein University of Illinois at Chicago Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases D. J. Bernstein University of Illinois at Chicago NSF ITR 0716498 Part I. Linear maps Consider computing 0

More information

Advanced code-based cryptography. Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven

Advanced code-based cryptography. Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Advanced code-based cryptography Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Lattice-basis reduction Define L = (0; 24)Z + (1; 17)Z = {(b; 24a + 17b) : a;

More information

Integer multiplication with generalized Fermat primes

Integer multiplication with generalized Fermat primes Integer multiplication with generalized Fermat primes CARAMEL Team, LORIA, University of Lorraine Supervised by: Emmanuel Thomé and Jérémie Detrey Journées nationales du Calcul Formel 2015 (Cluny) November

More information

Multiplying huge integers using Fourier transforms

Multiplying huge integers using Fourier transforms Fourier transforms October 25, 2007 820348901038490238478324 1739423249728934932894??? integers occurs in many fields of Computational Science: Cryptography Number theory... Traditional approaches to

More information

CSE 548: Analysis of Algorithms. Lecture 4 ( Divide-and-Conquer Algorithms: Polynomial Multiplication )

CSE 548: Analysis of Algorithms. Lecture 4 ( Divide-and-Conquer Algorithms: Polynomial Multiplication ) CSE 548: Analysis of Algorithms Lecture 4 ( Divide-and-Conquer Algorithms: Polynomial Multiplication ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2015 Coefficient Representation

More information

McBits: fast constant-time code-based cryptography. (to appear at CHES 2013)

McBits: fast constant-time code-based cryptography. (to appear at CHES 2013) McBits: fast constant-time code-based cryptography (to appear at CHES 2013) D. J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Joint work with: Tung Chou Technische Universiteit

More information

5.6 Convolution and FFT

5.6 Convolution and FFT 5.6 Convolution and FFT Fast Fourier Transform: Applications Applications. Optics, acoustics, quantum physics, telecommunications, control systems, signal processing, speech recognition, data compression,

More information

Distinguishing prime numbers from composite numbers: the state of the art. D. J. Bernstein University of Illinois at Chicago

Distinguishing prime numbers from composite numbers: the state of the art. D. J. Bernstein University of Illinois at Chicago Distinguishing prime numbers from composite numbers: the state of the art D. J. Bernstein University of Illinois at Chicago Is it easy to determine whether a given integer is prime? If easy means computable

More information

Finding small factors of integers. Speed of the number-field sieve. D. J. Bernstein University of Illinois at Chicago

Finding small factors of integers. Speed of the number-field sieve. D. J. Bernstein University of Illinois at Chicago The number-field sieve Finding small factors of integers Speed of the number-field sieve D. J. Bernstein University of Illinois at Chicago Prelude: finding denominators 87366 22322444 in R. Easily compute

More information

CPSC 518 Introduction to Computer Algebra Schönhage and Strassen s Algorithm for Integer Multiplication

CPSC 518 Introduction to Computer Algebra Schönhage and Strassen s Algorithm for Integer Multiplication CPSC 518 Introduction to Computer Algebra Schönhage and Strassen s Algorithm for Integer Multiplication March, 2006 1 Introduction We have now seen that the Fast Fourier Transform can be applied to perform

More information

Radix-4 Factorizations for the FFT with Ordered Input and Output

Radix-4 Factorizations for the FFT with Ordered Input and Output Radix-4 Factorizations for the FFT with Ordered Input and Output Vikrant 1, Ritesh Vyas 2, Sandeep Goyat 3, Jitender Kumar 4, Sandeep Kaushal 5 YMCA University of Science & Technology, Faridabad (Haryana),

More information

McBits: Fast code-based cryptography

McBits: Fast code-based cryptography McBits: Fast code-based cryptography Peter Schwabe Radboud University Nijmegen, The Netherlands Joint work with Daniel Bernstein, Tung Chou December 17, 2013 IMA International Conference on Cryptography

More information

How to Multiply. 5.5 Integer Multiplication. Complex Multiplication. Integer Arithmetic. Complex multiplication. (a + bi) (c + di) = x + yi.

How to Multiply. 5.5 Integer Multiplication. Complex Multiplication. Integer Arithmetic. Complex multiplication. (a + bi) (c + di) = x + yi. How to ultiply Slides by Kevin Wayne. Copyright 5 Pearson-Addison Wesley. All rights reserved. integers, matrices, and polynomials Complex ultiplication Complex multiplication. a + bi) c + di) = x + yi.

More information

CPSC 518 Introduction to Computer Algebra Asymptotically Fast Integer Multiplication

CPSC 518 Introduction to Computer Algebra Asymptotically Fast Integer Multiplication CPSC 518 Introduction to Computer Algebra Asymptotically Fast Integer Multiplication 1 Introduction We have now seen that the Fast Fourier Transform can be applied to perform polynomial multiplication

More information

Hyperelliptic-curve cryptography. D. J. Bernstein University of Illinois at Chicago

Hyperelliptic-curve cryptography. D. J. Bernstein University of Illinois at Chicago Hyperelliptic-curve cryptography D. J. Bernstein University of Illinois at Chicago Thanks to: NSF DMS 0140542 NSF ITR 0716498 Alfred P. Sloan Foundation Two parts to this talk: 1. Elliptic curves; modern

More information

Fast Polynomials Multiplication Using FFT

Fast Polynomials Multiplication Using FFT Li Chen lichen.xd at gmail.com Xidian University January 17, 2014 Outline 1 Discrete Fourier Transform (DFT) 2 Discrete Convolution 3 Fast Fourier Transform (FFT) 4 umber Theoretic Transform (TT) 5 More

More information

Curve41417: Karatsuba revisited

Curve41417: Karatsuba revisited Curve41417: Karatsuba revisited Chitchanok Chuengsatiansup Technische Universiteit Eindhoven September 25, 2014 Joint work with Daniel J. Bernstein and Tanja Lange Chitchanok Chuengsatiansup Curve41417:

More information

SCALED REMAINDER TREES

SCALED REMAINDER TREES Draft. Aimed at Math. Comp. SCALED REMAINDER TREES DANIEL J. BERNSTEIN Abstract. It is well known that one can compute U mod p 1, U mod p 2,... in time n(lg n) 2+o(1) where n is the number of bits in U,

More information

Chapter 1 Divide and Conquer Polynomial Multiplication Algorithm Theory WS 2015/16 Fabian Kuhn

Chapter 1 Divide and Conquer Polynomial Multiplication Algorithm Theory WS 2015/16 Fabian Kuhn Chapter 1 Divide and Conquer Polynomial Multiplication Algorithm Theory WS 2015/16 Fabian Kuhn Formulation of the D&C principle Divide-and-conquer method for solving a problem instance of size n: 1. Divide

More information

Speedy Maths. David McQuillan

Speedy Maths. David McQuillan Speedy Maths David McQuillan Basic Arithmetic What one needs to be able to do Addition and Subtraction Multiplication and Division Comparison For a number of order 2 n n ~ 100 is general multi precision

More information

Introduction to Algorithms

Introduction to Algorithms Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that

More information

Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography

Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography Selçuk Baktır, Berk Sunar {selcuk,sunar}@wpi.edu Department of Electrical & Computer Engineering Worcester Polytechnic Institute

More information

Arithmétique et Cryptographie Asymétrique

Arithmétique et Cryptographie Asymétrique Arithmétique et Cryptographie Asymétrique Laurent Imbert CNRS, LIRMM, Université Montpellier 2 Journée d inauguration groupe Sécurité 23 mars 2010 This talk is about public-key cryptography Why did mathematicians

More information

Faster integer multiplication using short lattice vectors

Faster integer multiplication using short lattice vectors Faster integer multiplication using short lattice vectors David Harvey and Joris van der Hoeven ANTS XIII, University of Wisconsin, Madison, July 2018 University of New South Wales / CNRS, École Polytechnique

More information

Even faster integer multiplication

Even faster integer multiplication Even faster integer multiplication DAVID HARVEY School of Mathematics and Statistics University of New South Wales Sydney NSW 2052 Australia Email: d.harvey@unsw.edu.au JORIS VAN DER HOEVEN a, GRÉGOIRE

More information

Integer factorization, part 1: the Q sieve. D. J. Bernstein

Integer factorization, part 1: the Q sieve. D. J. Bernstein Integer factorization, part 1: the Q sieve D. J. Bernstein Sieving small integers 0 using primes 3 5 7: 1 3 3 4 5 5 6 3 7 7 8 9 3 3 10 5 11 1 3 13 14 7 15 3 5 16 17 18 3 3 19 0 5 etc. Sieving and 611 +

More information

How to Write Fast Numerical Code Spring 2012 Lecture 19. Instructor: Markus Püschel TAs: Georg Ofenbeck & Daniele Spampinato

How to Write Fast Numerical Code Spring 2012 Lecture 19. Instructor: Markus Püschel TAs: Georg Ofenbeck & Daniele Spampinato How to Write Fast Numerical Code Spring 2012 Lecture 19 Instructor: Markus Püschel TAs: Georg Ofenbeck & Daniele Spampinato Miscellaneous Roofline tool Project report etc. online Midterm (solutions online)

More information

Integer multiplication and the truncated product problem

Integer multiplication and the truncated product problem Integer multiplication and the truncated product problem David Harvey Arithmetic Geometry, Number Theory, and Computation MIT, August 2018 University of New South Wales Political update from Australia

More information

CS S Lecture 5 January 29, 2019

CS S Lecture 5 January 29, 2019 CS 6363.005.19S Lecture 5 January 29, 2019 Main topics are #divide-and-conquer with #fast_fourier_transforms. Prelude Homework 1 is due Tuesday, February 5th. I hope you ve at least looked at it by now!

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Design and Analysis of Algorithms CSE 5311 Lecture 5 Divide and Conquer: Fast Fourier Transform Junzhou Huang, Ph.D. Department of Computer Science and Engineering CSE5311 Design and Analysis of Algorithms

More information

How to Write Fast Numerical Code

How to Write Fast Numerical Code How to Write Fast Numerical Code Lecture: Discrete Fourier transform, fast Fourier transforms Instructor: Markus Püschel TA: Georg Ofenbeck & Daniele Spampinato Rest of Semester Today Lecture Project meetings

More information

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication Integer multiplication Say we have 5 baskets with 8 apples in each How do we determine how many apples we have? Count them all? That would take

More information

Parallelism in Computer Arithmetic: A Historical Perspective

Parallelism in Computer Arithmetic: A Historical Perspective Parallelism in Computer Arithmetic: A Historical Perspective 21s 2s 199s 198s 197s 196s 195s Behrooz Parhami Aug. 218 Parallelism in Computer Arithmetic Slide 1 University of California, Santa Barbara

More information

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences CSE 421 Algorithms Richard Anderson Lecture 13 Divide and Conquer What you really need to know about recurrences Work per level changes geometrically with the level Geometrically increasing (x > 1) The

More information

Addition laws on elliptic curves. D. J. Bernstein University of Illinois at Chicago. Joint work with: Tanja Lange Technische Universiteit Eindhoven

Addition laws on elliptic curves. D. J. Bernstein University of Illinois at Chicago. Joint work with: Tanja Lange Technische Universiteit Eindhoven Addition laws on elliptic curves D. J. Bernstein University of Illinois at Chicago Joint work with: Tanja Lange Technische Universiteit Eindhoven 2007.01.10, 09:00 (yikes!), Leiden University, part of

More information

Optimizing MPC for robust and scalable integer and floating-point arithmetic

Optimizing MPC for robust and scalable integer and floating-point arithmetic Optimizing MPC for robust and scalable integer and floating-point arithmetic Liisi Kerik * Peeter Laud * Jaak Randmets * * Cybernetica AS University of Tartu, Institute of Computer Science January 30,

More information

E The Fast Fourier Transform

E The Fast Fourier Transform Fourier Transform Methods in Finance By Umberto Cherubini Giovanni Della Lunga Sabrina Mulinacci Pietro Rossi Copyright 2010 John Wiley & Sons Ltd E The Fast Fourier Transform E.1 DISCRETE FOURIER TRASFORM

More information

DIVIDE AND CONQUER II

DIVIDE AND CONQUER II DIVIDE AND CONQUER II master theorem integer multiplication matrix multiplication convolution and FFT Lecture slides by Kevin Wayne Copyright 2005 Pearson-Addison Wesley http://www.cs.princeton.edu/~wayne/kleinberg-tardos

More information

P vs NP & Computational Complexity

P vs NP & Computational Complexity P vs NP & Computational Complexity Miles Turpin MATH 89S Professor Hubert Bray P vs NP is one of the seven Clay Millennium Problems. The Clay Millenniums have been identified by the Clay Mathematics Institute

More information

Elliptic Curves Spring 2013 Lecture #3 02/12/2013

Elliptic Curves Spring 2013 Lecture #3 02/12/2013 18.783 Elliptic Curves Spring 2013 Lecture #3 02/12/2013 3.1 Arithmetic in finite fields To make explicit computations with elliptic curves over finite fields, we need to know how to perform arithmetic

More information

Old and new algorithms for computing Bernoulli numbers

Old and new algorithms for computing Bernoulli numbers Old and new algorithms for computing Bernoulli numbers University of New South Wales 25th September 2012, University of Ballarat Bernoulli numbers Rational numbers B 0, B 1,... defined by: x e x 1 = n

More information

Integer factorization, part 1: the Q sieve. part 2: detecting smoothness. D. J. Bernstein

Integer factorization, part 1: the Q sieve. part 2: detecting smoothness. D. J. Bernstein Integer factorization, part 1: the Q sieve Integer factorization, part 2: detecting smoothness D. J. Bernstein The Q sieve factors by combining enough -smooth congruences ( + ). Enough log. Plausible conjecture:

More information

Code-based post-quantum cryptography. D. J. Bernstein University of Illinois at Chicago

Code-based post-quantum cryptography. D. J. Bernstein University of Illinois at Chicago Code-based post-quantum cryptography D. J. Bernstein University of Illinois at Chicago Once the enormous energy boost that quantum computers are expected to provide hits the street, most encryption security

More information

1 Divide and Conquer (September 3)

1 Divide and Conquer (September 3) The control of a large force is the same principle as the control of a few men: it is merely a question of dividing up their numbers. Sun Zi, The Art of War (c. 400 C.E.), translated by Lionel Giles (1910)

More information

Fast reversion of formal power series

Fast reversion of formal power series Fast reversion of formal power series Fredrik Johansson LFANT, INRIA Bordeaux RAIM, 2016-06-29, Banyuls-sur-mer 1 / 30 Reversion of power series F = exp(x) 1 = x + x 2 2! + x 3 3! + x 4 G = log(1 + x)

More information

The Discrete Fourier transform

The Discrete Fourier transform 453.70 Linear Systems, S.M. Tan, The University of uckland 9- Chapter 9 The Discrete Fourier transform 9. DeÞnition When computing spectra on a computer it is not possible to carry out the integrals involved

More information

MA3232 Numerical Analysis Week 9. James Cooley (1926-)

MA3232 Numerical Analysis Week 9. James Cooley (1926-) MA umerical Analysis Week 9 James Cooley (96-) James Cooley is an American mathematician. His most significant contribution to the world of mathematics and digital signal processing is the Fast Fourier

More information

Implementation of the DKSS Algorithm for Multiplication of Large Numbers

Implementation of the DKSS Algorithm for Multiplication of Large Numbers Implementation of the DKSS Algorithm for Multiplication of Large Numbers Christoph Lüders Universität Bonn The International Symposium on Symbolic and Algebraic Computation, July 6 9, 2015, Bath, United

More information

Short generators without quantum computers: the case of multiquadratics

Short generators without quantum computers: the case of multiquadratics Short generators without quantum computers: the case of multiquadratics Daniel J. Bernstein University of Illinois at Chicago 31 July 2017 https://multiquad.cr.yp.to Joint work with: Jens Bauch & Henry

More information

A brief survey of post-quantum cryptography. D. J. Bernstein University of Illinois at Chicago

A brief survey of post-quantum cryptography. D. J. Bernstein University of Illinois at Chicago A brief survey of post-quantum cryptography D. J. Bernstein University of Illinois at Chicago Once the enormous energy boost that quantum computers are expected to provide hits the street, most encryption

More information

Short Division of Long Integers. (joint work with David Harvey)

Short Division of Long Integers. (joint work with David Harvey) Short Division of Long Integers (joint work with David Harvey) Paul Zimmermann October 6, 2011 The problem to be solved Divide efficiently a p-bit floating-point number by another p-bit f-p number in the

More information

PUTTING FÜRER ALGORITHM INTO PRACTICE WITH THE BPAS LIBRARY. (Thesis format: Monograph) Linxiao Wang. Graduate Program in Computer Science

PUTTING FÜRER ALGORITHM INTO PRACTICE WITH THE BPAS LIBRARY. (Thesis format: Monograph) Linxiao Wang. Graduate Program in Computer Science PUTTING FÜRER ALGORITHM INTO PRACTICE WITH THE BPAS LIBRARY. (Thesis format: Monograph) by Linxiao Wang Graduate Program in Computer Science A thesis submitted in partial fulfillment of the requirements

More information

High-speed cryptography, part 3: more cryptosystems. Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven

High-speed cryptography, part 3: more cryptosystems. Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven High-speed cryptography, part 3: more cryptosystems Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Cryptographers Working systems Cryptanalytic algorithm designers

More information

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101 A design paradigm Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/17 112 Multiplying complex numbers (from Jeff Edmonds slides) INPUT: Two pairs of integers, (a,b),

More information

RSA Implementation. Oregon State University

RSA Implementation. Oregon State University RSA Implementation Çetin Kaya Koç Oregon State University 1 Contents: Exponentiation heuristics Multiplication algorithms Computation of GCD and Inverse Chinese remainder algorithm Primality testing 2

More information

B. Encryption using quasigroup

B. Encryption using quasigroup Sequence Randomization Using Quasigroups and Number Theoretic s Vaignana Spoorthy Ella Department of Computer Science Oklahoma State University Stillwater, Oklahoma, USA spoorthyella@okstateedu Abstract

More information

Fast Polynomial Multiplication

Fast Polynomial Multiplication Fast Polynomial Multiplication Marc Moreno Maza CS 9652, October 4, 2017 Plan Primitive roots of unity The discrete Fourier transform Convolution of polynomials The fast Fourier transform Fast convolution

More information

Fall 2011, EE123 Digital Signal Processing

Fall 2011, EE123 Digital Signal Processing Lecture 6 Miki Lustig, UCB September 11, 2012 Miki Lustig, UCB DFT and Sampling the DTFT X (e jω ) = e j4ω sin2 (5ω/2) sin 2 (ω/2) 5 x[n] 25 X(e jω ) 4 20 3 15 2 1 0 10 5 1 0 5 10 15 n 0 0 2 4 6 ω 5 reconstructed

More information

The Fast Fourier Transform

The Fast Fourier Transform The Fast Fourier Transform A Mathematical Perspective Todd D. Mateer Contents Chapter 1. Introduction 3 Chapter 2. Polynomials 17 1. Basic operations 17 2. The Remainder Theorem and Synthetic Division

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 13 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Michael T. Heath Parallel Numerical Algorithms

More information

Subquadratic Space Complexity Multiplication over Binary Fields with Dickson Polynomial Representation

Subquadratic Space Complexity Multiplication over Binary Fields with Dickson Polynomial Representation Subquadratic Space Complexity Multiplication over Binary Fields with Dickson Polynomial Representation M A Hasan and C Negre Abstract We study Dickson bases for binary field representation Such representation

More information

CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S. Ant nine J aux

CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S. Ant nine J aux CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S Ant nine J aux (g) CRC Press Taylor 8* Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor &

More information

Distinguishing prime numbers from composite numbers: the state of the art. D. J. Bernstein University of Illinois at Chicago

Distinguishing prime numbers from composite numbers: the state of the art. D. J. Bernstein University of Illinois at Chicago Distinguishing prime numbers from composite numbers: the state of the art D. J. Bernstein University of Illinois at Chicago Is it easy to determine whether a given integer is prime? If easy means computable

More information

You separate binary numbers into columns in a similar fashion. 2 5 = 32

You separate binary numbers into columns in a similar fashion. 2 5 = 32 RSA Encryption 2 At the end of Part I of this article, we stated that RSA encryption works because it s impractical to factor n, which determines P 1 and P 2, which determines our private key, d, which

More information

Chapter 4 Number Representations

Chapter 4 Number Representations Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point

More information

Objectives. McBits: fast constant-time code-based cryptography. Set new speed records for public-key cryptography. (to appear at CHES 2013)

Objectives. McBits: fast constant-time code-based cryptography. Set new speed records for public-key cryptography. (to appear at CHES 2013) McBits: fast constant-time code-based cryptography (to appear at CHES 2013) Objectives Set new speed records for public-key cryptography. D. J. Bernstein University of Illinois at Chicago & Technische

More information

Computational Methods CMSC/AMSC/MAPL 460

Computational Methods CMSC/AMSC/MAPL 460 Computational Methods CMSC/AMSC/MAPL 460 Fourier transform Balaji Vasan Srinivasan Dept of Computer Science Several slides from Prof Healy s course at UMD Last time: Fourier analysis F(t) = A 0 /2 + A

More information

Arithmetic of split Kummer surfaces: Montgomery endomorphism of Edwards products

Arithmetic of split Kummer surfaces: Montgomery endomorphism of Edwards products 1 Arithmetic of split Kummer surfaces: Montgomery endomorphism of Edwards products David Kohel Institut de Mathématiques de Luminy International Workshop on Codes and Cryptography 2011 Qingdao, 2 June

More information

Even faster integer multiplication

Even faster integer multiplication Even faster integer multiplication DAVID HARVEY School of Mathematics and Statistics University of New South Wales Sydney NSW 2052 Australia Email: d.harvey@unsw.edu.au JORIS VAN DER HOEVEN a, GRÉGOIRE

More information

Optimizing Scientific Libraries for the Itanium

Optimizing Scientific Libraries for the Itanium 0 Optimizing Scientific Libraries for the Itanium John Harrison Intel Corporation Gelato Federation Meeting, HP Cupertino May 25, 2005 1 Quick summary Intel supplies drop-in replacement versions of common

More information

feb abhi shelat Matrix, FFT

feb abhi shelat Matrix, FFT L7 feb 11 2016 abhi shelat Matrix, FFT userid: = Using the standard method, how many multiplications does it take to multiply two NxN matrices? cos( /4) = cos( /2) = sin( /4) = sin( /2) = Mergesort Karatsuba

More information

Fast and Small: Multiplying Polynomials without Extra Space

Fast and Small: Multiplying Polynomials without Extra Space Fast and Small: Multiplying Polynomials without Extra Space Daniel S. Roche Symbolic Computation Group School of Computer Science University of Waterloo CECM Day SFU, Vancouver, 24 July 2009 Preliminaries

More information

Efficient encryption and decryption. ECE646 Lecture 10. RSA Implementation: Efficient Encryption & Decryption. Required Reading

Efficient encryption and decryption. ECE646 Lecture 10. RSA Implementation: Efficient Encryption & Decryption. Required Reading ECE646 Lecture 10 RSA Implementation: Efficient Encryption & Decryption Required Reading W. Stallings, "Cryptography and etwork-security, Chapter 9.2 The RSA Algorithm Chapter 8.4 The Chinese Remainder

More information

part 2: detecting smoothness part 3: the number-field sieve

part 2: detecting smoothness part 3: the number-field sieve Integer factorization, part 1: the Q sieve Integer factorization, part 2: detecting smoothness Integer factorization, part 3: the number-field sieve D. J. Bernstein Problem: Factor 611. The Q sieve forms

More information

Computer Vision, Convolutions, Complexity and Algebraic Geometry

Computer Vision, Convolutions, Complexity and Algebraic Geometry Computer Vision, Convolutions, Complexity and Algebraic Geometry D. V. Chudnovsky, G.V. Chudnovsky IMAS Polytechnic Institute of NYU 6 MetroTech Center Brooklyn, NY 11201 December 6, 2012 Fast Multiplication:

More information

The Class NP. NP is the problems that can be solved in polynomial time by a nondeterministic machine.

The Class NP. NP is the problems that can be solved in polynomial time by a nondeterministic machine. The Class NP NP is the problems that can be solved in polynomial time by a nondeterministic machine. NP The time taken by nondeterministic TM is the length of the longest branch. The collection of all

More information

Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn

Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn Chapter 1 Divide and Conquer Algorithm Theory WS 2016/17 Fabian Kuhn Formulation of the D&C principle Divide-and-conquer method for solving a problem instance of size n: 1. Divide n c: Solve the problem

More information

EDISP (NWL3) (English) Digital Signal Processing DFT Windowing, FFT. October 19, 2016

EDISP (NWL3) (English) Digital Signal Processing DFT Windowing, FFT. October 19, 2016 EDISP (NWL3) (English) Digital Signal Processing DFT Windowing, FFT October 19, 2016 DFT resolution 1 N-point DFT frequency sampled at θ k = 2πk N, so the resolution is f s/n If we want more, we use N

More information

The Fourier transform allows an arbitrary function to be represented in terms of simple sinusoids. The Fourier transform (FT) of a function f(t) is

The Fourier transform allows an arbitrary function to be represented in terms of simple sinusoids. The Fourier transform (FT) of a function f(t) is 1 Introduction Here is something I wrote many years ago while working on the design of anemometers for measuring shear stresses. Part of this work required modelling and compensating for the transfer function

More information

Large Integer Multiplication on Hypercubes. Barry S. Fagin Thayer School of Engineering Dartmouth College Hanover, NH

Large Integer Multiplication on Hypercubes. Barry S. Fagin Thayer School of Engineering Dartmouth College Hanover, NH Large Integer Multiplication on Hypercubes Barry S. Fagin Thayer School of Engineering Dartmouth College Hanover, NH 03755 barry.fagin@dartmouth.edu Large Integer Multiplication 1 B. Fagin ABSTRACT Previous

More information

1: Please compute the Jacobi symbol ( 99

1: Please compute the Jacobi symbol ( 99 SCORE/xx: Math 470 Communications Cryptography NAME: PRACTICE FINAL Please show your work write only in pen. Notes are forbidden. Calculators, all other electronic devices, are forbidden. Brains are encouraged,

More information

Cosc 412: Cryptography and complexity Lecture 7 (22/8/2018) Knapsacks and attacks

Cosc 412: Cryptography and complexity Lecture 7 (22/8/2018) Knapsacks and attacks 1 Cosc 412: Cryptography and complexity Lecture 7 (22/8/2018) Knapsacks and attacks Michael Albert michael.albert@cs.otago.ac.nz 2 This week Arithmetic Knapsack cryptosystems Attacks on knapsacks Some

More information

FPGA-based Niederreiter Cryptosystem using Binary Goppa Codes

FPGA-based Niederreiter Cryptosystem using Binary Goppa Codes FPGA-based Niederreiter Cryptosystem using Binary Goppa Codes Wen Wang 1, Jakub Szefer 1, and Ruben Niederhagen 2 1. Yale University, USA 2. Fraunhofer Institute SIT, Germany April 9, 2018 PQCrypto 2018

More information

Tutorial 2 - Learning about the Discrete Fourier Transform

Tutorial 2 - Learning about the Discrete Fourier Transform Tutorial - Learning about the Discrete Fourier Transform This tutorial will be about the Discrete Fourier Transform basis, or the DFT basis in short. What is a basis? If we google define basis, we get:

More information

Faster arithmetic for number-theoretic transforms

Faster arithmetic for number-theoretic transforms University of New South Wales 7th October 2011, Macquarie University Plan for talk 1. Review number-theoretic transform (NTT) 2. Discuss typical butterfly algorithm 3. Improvements to butterfly algorithm

More information

Three Ways to Test Irreducibility

Three Ways to Test Irreducibility Three Ways to Test Irreducibility Richard P. Brent Australian National University joint work with Paul Zimmermann INRIA, Nancy France 12 Feb 2009 Outline Polynomials over finite fields Irreducibility criteria

More information

Frequency-domain representation of discrete-time signals

Frequency-domain representation of discrete-time signals 4 Frequency-domain representation of discrete-time signals So far we have been looing at signals as a function of time or an index in time. Just lie continuous-time signals, we can view a time signal as

More information

Thanks to: University of Illinois at Chicago NSF DMS Alfred P. Sloan Foundation

Thanks to: University of Illinois at Chicago NSF DMS Alfred P. Sloan Foundation Building circuits for integer factorization D. J. Bernstein Thanks to: University of Illinois at Chicago NSF DMS 0140542 Alfred P. Sloan Foundation I want to work for NSA as an independent contractor.

More information

Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201

Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201 Divide and Conquer: Polynomial Multiplication Version of October 7, 2014 Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201 Outline Outline: Introduction The polynomial multiplication

More information

Divide and Conquer algorithms

Divide and Conquer algorithms Divide and Conquer algorithms Another general method for constructing algorithms is given by the Divide and Conquer strategy. We assume that we have a problem with input that can be split into parts in

More information

Transforms and Orthogonal Bases

Transforms and Orthogonal Bases Orthogonal Bases Transforms and Orthogonal Bases We now turn back to linear algebra to understand transforms, which map signals between different domains Recall that signals can be interpreted as vectors

More information

Exact Arithmetic on a Computer

Exact Arithmetic on a Computer Exact Arithmetic on a Computer Symbolic Computation and Computer Algebra William J. Turner Department of Mathematics & Computer Science Wabash College Crawfordsville, IN 47933 Tuesday 21 September 2010

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 12 Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted for noncommercial,

More information

Tunable Floating-Point for Energy Efficient Accelerators

Tunable Floating-Point for Energy Efficient Accelerators Tunable Floating-Point for Energy Efficient Accelerators Alberto Nannarelli DTU Compute, Technical University of Denmark 25 th IEEE Symposium on Computer Arithmetic A. Nannarelli (DTU Compute) Tunable

More information

What s the best data structure for multivariate polynomials in a world of 64 bit multicore computers?

What s the best data structure for multivariate polynomials in a world of 64 bit multicore computers? What s the best data structure for multivariate polynomials in a world of 64 bit multicore computers? Michael Monagan Center for Experimental and Constructive Mathematics Simon Fraser University British

More information

( ) is symmetric about the y - axis.

( ) is symmetric about the y - axis. (Section 1.5: Analyzing Graphs of Functions) 1.51 PART F: FUNCTIONS THAT ARE EVEN / ODD / NEITHER; SYMMETRY A function f is even f ( x) = f ( x) x Dom( f ) The graph of y = f x for every x in the domain

More information

17.1 Binary Codes Normal numbers we use are in base 10, which are called decimal numbers. Each digit can be 10 possible numbers: 0, 1, 2, 9.

17.1 Binary Codes Normal numbers we use are in base 10, which are called decimal numbers. Each digit can be 10 possible numbers: 0, 1, 2, 9. ( c ) E p s t e i n, C a r t e r, B o l l i n g e r, A u r i s p a C h a p t e r 17: I n f o r m a t i o n S c i e n c e P a g e 1 CHAPTER 17: Information Science 17.1 Binary Codes Normal numbers we use

More information