COMPARISON OF FPGA IMPLEMENTATION OF THE MOD M REDUCTION

Latin American Applied Research 37:93-97 (2007)

J-P. DESCHAMPS and G. SUTTER

Escola Tècnica Superior d'Enginyeria, Universitat Rovira i Virgili, Tarragona, Spain, jeanpierre.deschamps@urv.net; http://www.etse.urv.es

Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain, gustavo.sutter@ii.uam.es; http://www.ii.uam.es

Abstract. Several algorithms for computing $x \bmod m$ are presented, among others the reduction mod $B^k - a$, the pre-computation of $B^{i \cdot k} \bmod m$, a generalized version of the Barrett algorithm and a modified version of the same Barrett algorithm. The four mentioned algorithms, as well as the classical integer non-restoring division algorithm, have been synthesized and implemented within xc3s4000 components.

Keywords. Arithmetic in FPGA, Galois Field, Cryptography, modular operation.

I. INTRODUCTION

Arithmetic operations over the finite ring $Z_m = \{0, 1, \ldots, m-1\}$ are used as computation primitives for executing numerous cryptographic algorithms, especially those related to the use of public keys (asymmetric cryptography). Classical examples are ciphering / deciphering, authentication and digital signature protocols based on RSA-type or elliptic curve algorithms. One of the basic operations is the modulo $m$ reduction. Combined with operations over the set $Z$ of integers (sum, subtraction, product, and so on), it makes it possible to perform the same operations over $Z_m$. A straightforward solution consists of using an integer division algorithm. Nevertheless, more efficient algorithms have been proposed (Blake et al., 2002; Hankerson et al., 2004). In this paper several algorithms are described, namely the reduction mod $B^k - a$, the pre-computation of $B^{i \cdot k} \bmod m$, a generalized version of the Barrett algorithm and a modified version of the same Barrett algorithm. The four mentioned algorithms, as well as the classical integer non-restoring division algorithm, have been synthesized and implemented within xc3s4000 components.

II. ALGORITHMS

In this section the following problem is studied: given two naturals $x$ and $m$, compute $z = x \bmod m$.

A. Integer division

A straightforward method consists of performing the integer division of $x$ by $m$, that is, $x = q \cdot m + z$, $z < m$. For that purpose, any division algorithm can be used, for example the non-restoring division algorithm (Deschamps et al., 2006). In Algorithm 1, $x$ is an $n$-bit number and $m$ a $k$-bit number.

Algorithm 1 Non-restoring reduction

    y := m*(2**(n-k));                 -- divisor aligned with the n-bit dividend
    rems(1) := x - y;
    for i in 1 .. n-k loop
        if rems(i) < 0 then rems(i+1) := 2*rems(i) + y;
        else rems(i+1) := 2*rems(i) - y;
        end if;
    end loop;
    if rems(n-k+1) < 0 then z := rems(n-k+1)/(2**(n-k)) + m;   -- final correction
    else z := rems(n-k+1)/(2**(n-k));
    end if;

The core of the algorithm is an $(n-k)$-step iteration. If a ripple-carry $k$-bit adder-subtractor is used, the computation time is about

$$\text{time}(n,k) \approx (n-k) \cdot k \cdot T_{FA}, \tag{1}$$

where $T_{FA}$ is the delay of a full-adder.

B. Reduction mod $B^k - a$

Assume that $B^{k-1} \le m < B^k$, where $B$ is a natural number $\ge 2$. Then $m = B^k - a$ where $1 \le a \le B^k - B^{k-1}$. Compute the following quotients $q_i$ and remainders $r_i$:

$$\begin{aligned} x &= q_0 \cdot B^k + r_0, \\ q_0 \cdot a &= q_1 \cdot B^k + r_1, \\ q_1 \cdot a &= q_2 \cdot B^k + r_2, \\ &\;\;\vdots \\ q_{s-2} \cdot a &= q_{s-1} \cdot B^k + r_{s-1}. \end{aligned} \tag{2}$$

Multiply the second equation of (2) by $(B^k/a)$, the third one by $(B^k/a)^2$, ..., the last one by $(B^k/a)^{s-1}$, and sum up the $s$ equations; the result is

$$x = r_0 + r_1 (B^k/a) + r_2 (B^k/a)^2 + \ldots + r_{s-1} (B^k/a)^{s-1} + q_{s-1} \cdot B^k (B^k/a)^{s-1}. \tag{3}$$

As $a < B^k$, that is, $B^k/a > 1$, there exists a value of $s$ such that

$$x < B^k (B^k/a)^{s-1}, \tag{4}$$

and thus $q_{s-1} = 0$. Let $s$ be the minimum value of $s$ such that $q_{s-1} = 0$. Notice that if $r_{s-1} = 0$ then the last equation of (2), with $q_{s-1} = 0$, is $q_{s-2} \cdot a = 0$, that is, $q_{s-2} = 0$, so that $s$ is not the minimum value of $s$ such that $q_{s-1} = 0$. Thus

$$x = r_0 + r_1 (B^k/a) + r_2 (B^k/a)^2 + \ldots + r_{s-1} (B^k/a)^{s-1}, \text{ with } r_{s-1} > 0. \tag{5}$$

By summing up the $s$ equations of system (2), with $q_{s-1} = 0$, the following relation is obtained:

$$x = q_0 (B^k - a) + q_1 (B^k - a) + \ldots + q_{s-2} (B^k - a) + r_0 + r_1 + \ldots + r_{s-1}. \tag{6}$$

Define

$$r' = r_0 + r_1 + \ldots + r_{s-1}. \tag{7}$$

According to (6) and (7),

$$x \equiv r' \bmod m, \text{ with } m = B^k - a. \tag{8}$$

Comparing (5) with (7) it is obvious that if $s > 1$, that is, if $x \ge B^k$, then $r' < x$. If $r'$ is still greater than or equal to $B^k$, the same method can be used in order to get $r'' \equiv x \bmod m$, with $r'' < r'$. After a finite number of iterations, a number $r$ is obtained such that $r \equiv x \bmod m$ and $r < B^k$, so that $z = r - q \cdot m$ where $0 \le q \le B-1$. In particular, if $B = 2$ then $z$ is either $r$ or $r - m$.

To summarize, the mod $m$ reduction algorithm, with $m = B^k - a$, is the following:

Algorithm 2 mod m reduction algorithm, with m = B**k - a

    r := x mod B**k; q := x/B**k;
    loop
        loop                                   -- accumulate the remainders r_i of (2)
            r := r + (q*a mod B**k);
            q := q*a/B**k;
            if q = 0 then exit; end if;
        end loop;
        q := r/B**k; r := r mod B**k;          -- start again from r if r >= B**k
        if q = 0 then exit; end if;
    end loop;
    while r >= m loop r := r - m; end loop;    -- final reduction

If $B$ is the base (or a power of the base) of the chosen numeration system, then the division by $B^k$ and the mod $B^k$ reduction are trivial operations. The only non-trivial operations are multiplication by $a$, sums (remainder accumulation) and subtractions (final reduction).

The number of executions of the internal loop body can be estimated as follows: a sufficient condition for $q_{s-1}$ being equal to 0 is (4), which is equivalent to $s > (\log x - \log a) / (k \log B - \log a)$. Thus $s = \lceil (\log x - \log a) / (k \log B - \log a) \rceil$. In particular, if $x = B^n - 1$, that is, the greatest $n$-digit $B$-ary number, then

$$s = \lceil (n - \log_B a) / (k - \log_B a) \rceil, \tag{9}$$

and, assuming that $\log_B a$ is much smaller than $k$ and $n$,

$$s \approx n/k. \tag{10}$$

As regards the reduction rate of the algorithm, that is, the relation between an initial value $x$ and the value $r'$ obtained after a first execution of the internal loop, notice that $r'$ is smaller than $s \cdot B^k$, so that the number $d(r')$ of $B$-ary digits of $r'$ satisfies the condition

$$d(r') \le k + \log_B s, \tag{11}$$

where $s$ is approximately given by (10). Thus $d(r'_{max}) \le \log_B n + k - \log_B k < \log_B n + k$.

In order to define the size of the variable $r = r_0 + r_1 + \ldots + r_{s-1}$, the following values are previously calculated (see (9) and (11)): $s = \lceil (n - \log_2 a) / (k - \log_2 a) \rceil$, $t = \lceil \log_2 s \rceil$, so that $r$ can be represented as a $(k+t)$-bit number.

The core of the algorithm is an $(n/k)$-step iteration. Each step includes the multiplication of an $(n-k)$-bit number $q$ by a $k$-bit number $a$, and the sum of a $(k+t)$-bit number $r$ and a $k$-bit number. The computation time of the multiplier depends on the particular value of $a$. Nevertheless, in order to get an estimation of the computation time, it will be assumed that a parallel multiplier is used. Its computation time is about $((n-k) + 2k - 2) \cdot T_{FA} \approx (n+k) \cdot T_{FA}$ (Deschamps et al., 2006). The step duration is approximately equal to $(n+k) \cdot T_{FA} + (k+t) \cdot T_{FA}$. If $n + 2k \gg t$ then the computation time is approximately equal to

$$\text{time}(n,k) \approx (n/k) \cdot (n + 2k) \cdot T_{FA}. \tag{12}$$

C. Pre-computation of $B^{i \cdot k} \bmod m$

Assume again that $B^{k-1} \le m < B^k$, and that $x$ is represented in base $B^k$, i.e.

$$x = x_{s-1} B^{(s-1) k} + x_{s-2} B^{(s-2) k} + \ldots + x_1 B^k + x_0, \text{ where } x_{s-1} > 0. \tag{13}$$

The following values must have been previously computed: $b_0 = 1$, $b_1 = B^k \bmod m$, $b_2 = B^{2k} \bmod m$, ..., $b_{s-1} = B^{(s-1) k} \bmod m$. Then $x \equiv x_{s-1} b_{s-1} + x_{s-2} b_{s-2} + \ldots + x_1 b_1 + x_0 b_0 \bmod m$, and the problem is reduced to the computation of $r' \bmod m$ where

$$r' = x_{s-1} b_{s-1} + x_{s-2} b_{s-2} + \ldots + x_1 b_1 + x_0 b_0. \tag{14}$$

Observe that $b_i = (B^{i \cdot k} \bmod m) < m < B^k \le B^{i \cdot k}$, $\forall i > 0$. Comparing (14) with (13), it is obvious that if $s > 1$, that is, if $x \ge B^k$, then $r' < x$. If $r'$ is still greater than or equal to $B^k$, the same method can be used in order to get $r'' \equiv x \bmod m$ with $r'' < r'$.
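The folding step of relation (14) can be sketched in a few lines of Python. This is only an illustration, not the synthesized circuit: the function names `fold_once` and `reduce_by_folding` are hypothetical, and the constants $b_i$ are computed on the fly with the built-in three-argument `pow` instead of being read from a pre-computed table as the paper assumes.

```python
def fold_once(x: int, m: int, k: int, base: int = 2) -> int:
    """One pass of relation (14): each power B**(i*k) is replaced by b_i = B**(i*k) mod m."""
    radix = base ** k
    r, i = 0, 0
    while x > 0:
        x, digit = divmod(x, radix)      # digit x_i of x in base B**k
        r += digit * pow(radix, i, m)    # b_i, computed here instead of pre-stored (assumption)
        i += 1
    return r


def reduce_by_folding(x: int, m: int, k: int, base: int = 2) -> int:
    """Compute x mod m, assuming base**(k-1) <= m < base**k."""
    while x >= base ** k:                # more than one base-B**k digit left
        x = fold_once(x, m, k, base)     # strictly smaller value, same residue mod m
    while x >= m:                        # final reduction, at most B-1 subtractions
        x -= m
    return x
```

For instance, with $B = 2$, $k = 8$ and $m = 239$, `fold_once(256, 239, 8)` returns 17, which is both smaller than 256 and congruent to it modulo 239.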
After a finite number of iterations, a number $r$ is obtained such that $r \equiv x \bmod m$ and $r < B^k$, so that $z = r - q \cdot m$ where $0 \le q \le B-1$. In particular, if $B = 2$ then $z$ is either $r$ or $r - m$.

To summarize, the mod $m$ reduction algorithm, with pre-computation of $B^{i \cdot k} \bmod m$, is the following (it is assumed that the constants $b_i = B^{i \cdot k} \bmod m$ have been previously calculated):

Algorithm 3 mod m reduction, with pre-computation of B**(i*k) mod m

    main: loop
        -- represent x as an s-digit number:
        vector_x(0) := x mod base**k; q := x/base**k;
        for i in 1 .. s-1 loop
            vector_x(i) := q mod base**k; q := q/base**k;
        end loop;
        -- end of computation detection:
        one_digit := true;
        for i in 1 .. s-1 loop
            if vector_x(i) /= 0 then one_digit := false; exit; end if;
        end loop;
        -- main computation:
        if one_digit then exit main;
        else
            x := vector_x(0);
            internal: for i in 1 .. s-1 loop
                x := x + vector_x(i)*b(i);
            end loop internal;
        end if;
    end loop main;
    r := vector_x(0);
    while r >= m loop r := r - m; end loop;

The mod $B^k$ reduction and the integer division by $B^k$ are trivial operations. The only non-trivial operations are products of base-$B^k$ digits (vector_x(i)*b(i)) and sums, as well as the end of computation detection.

Let $n$ be the number of $B$-ary digits of $x$. According to (13), $x_{max} = B^{s \cdot k} - 1$, so that $n = s \cdot k$ and the number $s$ of executions of the internal loop body is

$$s = n/k. \tag{15}$$

As regards the reduction rate of the algorithm, notice that $r'$ is smaller than $s \cdot B^{2k}$, so that the number $d(r')$ of $B$-ary digits of $r'$ satisfies the condition $d(r') \le 2k + \log_B s$, where $s$ is equal to (15). Thus

$$d(r'_{max}) \le \log_B n + 2k - \log_B k < \log_B n + 2k. \tag{16}$$

The core of the algorithm is an $(n/k)$-step iteration. Each step includes the product of two $k$-bit numbers $x_i$ and $b_i$, and the sum of two $(\log_2 n + 2k)$-bit numbers. The total computing time is approximately equal to

$$\text{time}(n,k) \approx (n/k) \cdot (\log_2 n + 5k) \cdot T_{FA}. \tag{17}$$

D. Barrett reduction algorithms

A generalized version of the Barrett algorithm (Blake et al., 2002; Hankerson et al., 2004) is presented.

D.1 n-digit to (k+t)-digit reduction

Assume that $m$ belongs to the range $B^{k-1} < m < B^k$, where $B$ is the base (or a power of the base) of the chosen numeration system (if $m$ is a power of $B$ the computation of $x \bmod m$ is trivial). The value of $z = x \bmod m$ is the remainder of the integer division of $x$ by $m$, that is, $x = q \cdot m + z$, $z < m$. The Barrett algorithm starts with the computation of an approximation $q'$ of $q = \lfloor x/m \rfloor$ such that

$$q - a \le q' \le q. \tag{18}$$

Compute

$$r' = x - q' \cdot m. \tag{19}$$

Taking into account that $z = x - q \cdot m$, then, according to (18), $z \le r' \le z + a \cdot m$. Let $t$ be the integer such that

$$B^t \ge a + 1. \tag{20}$$

Then $r' \le z + a \cdot m < (a+1) \cdot m < B^{k+t}$. Thus $0 \le z \le r' < B^{k+t}$, so that

$$r' = r' \bmod B^{k+t} = (x - q' \cdot m) \bmod B^{k+t}. \tag{21}$$

Furthermore, according to (19),

$$r' \bmod m = x \bmod m = z. \tag{22}$$

The following algorithm, including a function approximation which generates an approximation $q'$ of $\lfloor x/m \rfloor$ (see relation (18)), computes a $(k+t)$-digit number $r'$ equivalent to $x \bmod m$:

Algorithm 4 n-digit to (k+t)-digit reduction

    q := approximation(x, m);
    r := ((x mod B**(k+t)) - (q*m mod B**(k+t))) mod B**(k+t);

If $a = 2$ and $B \ge 3$, the condition (20) is $B^t \ge 3$ and is satisfied if $t = 1$. Thus $x - q' \cdot m$ can be computed mod $B^{k+1}$. This case corresponds to the classical Barrett algorithm.

D.2 A first approximation of q

Let $x$ and $m$ be expressed in base $B$:

$$x = x_{n-1} B^{n-1} + x_{n-2} B^{n-2} + \ldots + x_0 B^0, \quad m = m_{k-1} B^{k-1} + m_{k-2} B^{k-2} + \ldots + m_0 B^0,$$

where $m_{k-1} > 0$. The approximation $q'$ of $q = \lfloor x/m \rfloor$ is

$$q' = \left\lfloor \lfloor x/B^{k-1} \rfloor \cdot \lfloor B^n/m \rfloor / B^{n-k+1} \right\rfloor.$$

It can be demonstrated (Hankerson et al., 2004) that $q \le q' + 2$, that is, $a = 2$. According to (20) the value of $t$ must be chosen in such a way that $B^t \ge 3$. Thus

if $B = 2$, then $t = 2$ (the computation is performed mod $B^{k+2}$),
if $B > 2$ (classical Barrett algorithm), then $t = 1$ (the computation is performed mod $B^{k+1}$).
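The first approximation and the truncated subtraction of relation (21) can be tried out with the following Python sketch for the binary case ($B = 2$, $t = 2$). It is a software illustration under stated assumptions rather than the circuit described in the paper: the name `barrett_reduce` is hypothetical, the constant $\lfloor 2^n/m \rfloor$ is recomputed on every call instead of being stored, and the inner `assert` that checks relation (18) uses an exact division purely for verification.

```python
def barrett_reduce(x: int, m: int, n: int) -> int:
    """Reduce an n-bit x modulo m with B = 2, using the first approximation of q."""
    k = m.bit_length()                      # 2**(k-1) <= m < 2**k
    assert 0 <= x < (1 << n) and n >= k
    t = 2                                   # B = 2 and a = 2 require 2**t >= 3
    c = (1 << n) // m                       # constant floor(2**n / m), pre-computed in practice
    y = x >> (k - 1)                        # floor(x / 2**(k-1))
    q_approx = (y * c) >> (n - k + 1)       # approximation of floor(x / m)
    assert x // m - 2 <= q_approx <= x // m     # relation (18) with a = 2 (check only)
    mask = (1 << (k + t)) - 1               # "mod 2**(k+t)" is a bit mask
    r = ((x & mask) - ((q_approx * m) & mask)) & mask
    while r >= m:                           # at most two corrections
        r -= m
    return r
```

As an example, `barrett_reduce(1000, 239, 16)` returns 44, in agreement with $1000 = 4 \cdot 239 + 44$.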

To summarize, the following algorithm computes $z = x \bmod m$. The constant

$$c = \lfloor B^n/m \rfloor \tag{23}$$

must have been previously calculated.

Algorithm 5 Generalized Barrett reduction

    y := x/B**(k-1);
    w := y*c;
    q := (w/B**(n-k+1)) mod B**(k+t);
    r := ((x mod B**(k+t)) - ((q*m) mod B**(k+t))) mod B**(k+t);
    while r >= m loop r := r - m; end loop;

The division by $B^{k-1}$ or $B^{n-k+1}$ and the mod $B^{k+t}$ reduction are trivial operations. The only non-trivial operations are the multiplications by $c$ and by $m$ and the subtractions.

Comment. In the classical Barrett algorithm (Blake et al., 2002; Hankerson et al., 2004), $n$ is assumed to be equal to $2k$, so that $c = \lfloor B^{2k}/m \rfloor$ and $q' = \left\lfloor \lfloor x/B^{k-1} \rfloor \cdot \lfloor B^{2k}/m \rfloor / B^{k+1} \right\rfloor$.

Assuming (best case approximation) that the first value of $r$ is already smaller than $m$, the computation time is the sum of the delays of an $(n-k+1)$-bit by $(n-k+1)$-bit multiplier (computation of $w$), a $(k+t)$-bit by $k$-bit multiplier (computation of $q' \cdot m$) and a $(k+t+1)$-bit subtractor. It is approximately equal to $((3(n-k+1) - 2) + (k+t+2k-2) + (k+t+1)) \cdot T_{FA}$. If $2t \ll 3n + k$, then

$$\text{time}(n,k) \approx (3n + k) \cdot T_{FA}. \tag{24}$$

A drawback of the Barrett algorithm is the high cost of the multipliers. The cost of an $n$-bit by $m$-bit multiplier is proportional to $n \cdot m$ (Deschamps et al., 2006). Thus, the total cost of both multipliers is proportional to $(n-k+1)^2 + (k+t) \cdot k \approx (n-k)^2 + k^2$, whose minimum value (for $k$ smaller than $n$) is $n^2/2$ (when $k = n/2$).

D.3 A second approximation of q

In order to reduce the computation complexity (basically the computation of $w$), a worse approximation of $q$ can be computed. First observe that $c = \lfloor B^n/m \rfloor$ is an at most $(n-k+1)$-digit number. Thus

$$w = y \cdot c = c_0 B^0 y + c_1 B^1 y + \ldots + c_{n-k} B^{n-k} y,$$

$$q' = \lfloor y \cdot c / B^{n-k+1} \rfloor = \lfloor c_0 B^{-n+k-1} y + c_1 B^{-n+k} y + \ldots + c_{n-k} B^{-1} y \rfloor. \tag{25}$$

Define $q'' = c_0 \lfloor B^{-n+k-1} y \rfloor + c_1 \lfloor B^{-n+k} y \rfloor + \ldots + c_{n-k} \lfloor B^{-1} y \rfloor$, that is,

$$q'' = c_0 v_0 + c_1 v_1 + \ldots + c_{n-k} v_{n-k}, \text{ with } v_i = \lfloor y / B^{n-k-i+1} \rfloor, \; i = 0, 1, \ldots, n-k. \tag{26}$$

Obviously $q'' \le q'$. Furthermore $q' \le q'' + c_0 + c_1 + \ldots + c_{n-k} = q'' + \text{weight}(c)$, where weight(c) is the sum of all digits of $c$.
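The two bounds just stated can be checked numerically with a small Python sketch for $B = 2$. The function name `digitwise_quotient` and its return convention are hypothetical and chosen only for this illustration; it computes the product-based approximation of (25), the digit-wise sum of (26) and weight(c), so that the inequalities can be tested on sample values.

```python
def digitwise_quotient(x: int, m: int, n: int):
    """Return (q_first, q_second, weight) for B = 2: relations (25) and (26)."""
    k = m.bit_length()
    assert m & (m - 1) != 0                 # m is not a power of 2, as assumed in D.1
    assert 0 <= x < (1 << n) and n >= k
    c = (1 << n) // m                       # constant floor(2**n / m), at most n-k+1 bits
    y = x >> (k - 1)                        # floor(x / 2**(k-1))
    q_first = (y * c) >> (n - k + 1)        # relation (25): one full multiplication
    q_second, weight = 0, 0
    for i in range(n - k + 1):              # binary digits c_0 .. c_(n-k) of c
        c_i = (c >> i) & 1
        v_i = y >> (n - k - i + 1)          # v_i = floor(y / 2**(n-k-i+1))
        q_second += c_i * v_i               # relation (26): additions only when B = 2
        weight += c_i
    return q_first, q_second, weight
```

With these three values, the inequalities read `q_second <= q_first <= q_second + weight`, which is what allows the digit-wise sum to replace the full product at the price of a larger correction range.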
Thus $q' - \text{weight}(c) \le q'' \le q'$ and $q - 2 - \text{weight}(c) \le q'' \le q$, that is, $q''$ is an approximation (18) of $q$ such that $a = 2 + \text{weight}(c)$. According to (20), $t$ must then satisfy $B^t \ge 3 + \text{weight}(c)$.

Algorithm 6 Modified Barrett reduction

    y := x/B**(k-1);
    for i in 0 .. n-k loop
        v(i) := (y/B**(n-k-i+1)) mod B**(k+t);
    end loop;
    q := (c(0)*v(0) + c(1)*v(1)) mod B**(k+t);
    for i in 2 .. n-k loop
        q := (q + c(i)*v(i)) mod B**(k+t);
    end loop;
    r := ((x mod B**(k+t)) - ((q*m) mod B**(k+t))) mod B**(k+t);
    while r >= m loop r := r - m; end loop;

The division by $B^{k-1}$ or $B^{n-k-i+1}$ and the mod $B^{k+t}$ reduction are trivial operations. The only non-trivial operations are multiplications by $B$-ary digits $c_i$ (a trivial operation if $B = 2$), multiplication by $m$, additions and subtractions.

The computation is divided into two parts. First, an $(n-k)$-step iteration computes $q''$. The corresponding time is approximately $(n-k) \cdot (k+t) \cdot T_{FA} \approx (n-k) \cdot k \cdot T_{FA}$. Assuming again (best case approximation) that the first value of $r$ is smaller than $m$, the second part consists of a $(k+t)$-bit by $k$-bit product ($q'' \cdot m$) and a $(k+t)$-bit subtraction, that is, a delay equal to $((3k + t - 2) + (k+t)) \cdot T_{FA} \approx 4k \cdot T_{FA}$. Thus, the total time is about

$$\text{time}(n,k) \approx (n-k+4) \cdot k \cdot T_{FA} \approx (n-k) \cdot k \cdot T_{FA}. \tag{27}$$

E. Summary

The main results are summarized in table 1. The approximate computation time, expressed in full-adder delays, is given for every reduction method. In particular, the values obtained when $n = 2k$ are computed: they correspond to the case where $x$ is the result of multiplying two elements of $Z_m$, that is, two $k$-bit numbers.

Table 1. Computation time, expressed in full-adder delays, for reducing an n-bit number modulo a k-bit number

| algorithm | time(n,k) | time(2.k,k) |
| non-restoring division | $(n-k) \cdot k$ | $k^2$ |
| mod $2^k - a$ | $(n/k) \cdot (n + 2k)$ | $8k$ |
| pre-comput. of $2^{i \cdot k} \bmod m$ | $(n/k) \cdot (\log_2 n + 5k)$ | $10k$ |
| Barrett | $3n + k$ | $7k$ |
| modified Barrett | $(n-k) \cdot k$ | $k^2$ |

As long as only the computation time is considered, and assuming that the approximations are reasonably good, the Barrett algorithm is the best choice. Nevertheless, as quoted above, its cost is $O(n^2)$ and could be prohibitively high for great values of $n$ (see next section).

III. FPGA IMPLEMENTATIONS

Reduction circuits, with $n = 2k$ = 16, 64, 256 and 1024, have been synthesized using ISE 6.3i (Xilinx, 2006). The results for a xc3s4000-5 device are given in tables 2 to 6. The cost is expressed in number of slices. Apart from the logic slices, both Barrett algorithms need a lot of 18-by-18-bit multipliers. The xc3s4000-5 device contains 96 such multipliers, an insufficient number for implementing Barrett algorithms for great values of $n$. This fact is marked within the cost column. Reduction circuits for $n = 64$ and $m = 239$, so that $k = 8$, have also been synthesized (table 7).

Table 2. Non-restoring division: cost and computation time (n = 2.k)

| n | cost (slices) | min. period (ns) | time (ns) |
| 16 | 49 | 7 | 60 |
| 64 | 133 | 9 | 300 |
| 256 | 430 | 14 | 1,800 |
| 1024 | 1619 | 36 | 19,000 |

Table 3. Reduction mod 2^k - a: cost and computation time (n = 2.k)

| n | cost (slices) | min. period (ns) | time (ns) |
| 16 | 25 | 6 | 25 |
| 64 | 72 | 8 | 35 |
| 256 | 240 | 13 | 55 |
| 1024 | 918 | 35 | 140 |

Table 4. Pre-computation of 2^(i.k) mod m: cost and computation time (n = 2.k)

| n | cost (slices) | min. period (ns) | time (ns) |
| 16 | 42 | 6 | 50 |
| 64 | 144 | 9 | 75 |
| 256 | 536 | 20 | 160 |
| 1024 | 2061 | 62 | 500 |

Table 5. Barrett algorithm: cost and computation time (n = 2.k)

| n | cost (slices) | min. period (ns) | time (ns) |
| 16 | 31 | 8 | 25 |
| 64 | 130 | 10 | 30 |
| 256 | (insufficient multipliers) | - | - |
| 1024 | (insufficient multipliers) | - | - |

Table 6. Modified Barrett algorithm: cost and computation time (n = 2.k)

| n | cost (slices) | min. period (ns) | time (ns) |
| 16 | 62 | 9 | 80 |
| 64 | 373 | 17 | 650 |
| 256 | 4,245 | 25 | 3,300 |
| 1024 | (insufficient multipliers) | - | - |

Table 7. Cost and computation time (n = 64 and m = 239)

| algorithm | cost (slices) | min. period (ns) | time (ns) |
| non-restoring division | 118 | 14 | 850 |
| mod $2^k - a$ | 101 | 14 | 300 |
| pre-comput. of $2^{i \cdot k} \bmod m$ | 116 | 20 | 600 |
| Barrett | 546 | 13 | 50 |
| modified Barrett | 215 | 19 | 1,600 |

IV. COMMENTS AND CONCLUSIONS

According to both the theoretical analysis (table 1) and the practical synthesis results (tables 2 to 7), the fastest circuits are obtained with the Barrett algorithm. Nevertheless, the corresponding costs are excessive for great values of $n$. The second best solution, as regards the computation time, is the reduction mod $2^k - a$.

Actually, these conclusions are valid as long as generic reduction circuits are considered. For specific values of $n$ and $m$, the pre-computation option could be an interesting alternative (chapter 8 of Deschamps et al., 2006). For small values of $n$, the best option is a block of ROM storing the $2^n$ pre-computed values of $x \bmod m$.

In the case where the reduction is part of an algorithm including a lot of multiplications, for example an exponentiation algorithm, an alternative solution is the Montgomery product (Montgomery, 1985). It has not been studied in this paper, which is dedicated to reduction circuits, but is one of the main topics of another (not yet published) work on finite ring and field operations.

REFERENCES

Blake, I.V., G. Seroussi and N. Smart, Elliptic Curves in Cryptography, Cambridge University Press (2002).
Hankerson, D., A.J. Menezes and S. Vanstone, Guide to Elliptic Curve Cryptography, Springer (2004).
Deschamps, J.-P., G.A. Bioul and G.D. Sutter, Synthesis of Arithmetic Circuits, Wiley (2006).
Montgomery, P., Modular Multiplication without Trial Division, Mathematics of Computation, 44, 519-521 (1985).
Xilinx Inc., http://www.xilinx.com (2006).

Received: April 14, 2006. Accepted: September 8, 2006. Recommended by Special Issue Editor Hilda Larrondo.