Elliptic Curve Scalar Point Multiplication Algorithm Using Radix-4 Booth's Algorithm

Sangmook Moon, Non-member

ABSTRACT

The main backbone operation in elliptic curve cryptosystems is scalar point multiplication. The most frequently used method of implementing scalar point multiplication, which is performed at the topmost level above multiplication and division, has been the double-and-add algorithm, which has recently been challenged by the NAF (Non-Adjacent Format) algorithm. In this paper, we propose a novel scalar multiplication method that is more efficient than double-and-add, obtained by applying redundant recoding that originates from the radix-4 modified Booth's algorithm. We call the novel algorithm quad-and-add. After deriving the algorithm, we created a new operation, named point quadruple, and verified it with calculations from a real-world application. The derived numerical expressions were verified using both C programs and HDL (Hardware Description Language). The proposed method of elliptic curve scalar point multiplication can be used in any elliptic curve security application that needs efficient and fast calculations.

Keywords: elliptic curve cryptosystem, scalar point multiplication, HDL, security

1 INTRODUCTION

As an indispensable component of information technologies, security applications, such as IC cards used for personal authentication and domestic network applications, play an important role. Such data security receives constant attention, since people tend to communicate with each other through various electronic devices over networks. Security applications are based upon intensive computations of cryptographic algorithms, which generally involve arithmetic operations in large Galois fields GF(2^m) [1][2]. The polynomial basis offers good solutions to most computational problems and is the easiest to use among the available representations. Therefore, we focus on the polynomial basis throughout this document [3].
04PSI05: Manuscript received on December 1, 2004; revised in June 2005. The author is with the Department of Information Security and Electronic Engineering, Mokwon University, Daejeon, Korea. E-mail: smoon@mokwon.ac.kr

The most important and time-consuming operation in elliptic curve cryptography (ECC) is the scalar point multiplication, which repeatedly performs the point addition operation as in expression (1). In expression (1), k is an arbitrary integer on a finite field GF(2^m) and P is an arbitrary point on an elliptic curve (EC) defined on the finite field GF(2^m).

$$kP = \sum_{i=1}^{k} P \quad (k \text{ times of point addition}) \tag{1}$$

Figure 1 shows the hierarchical structure of an ECC operation. In general, to perform one scalar point multiplication [4], we need a sequence of point addition operations (if the two points are different) and point double operations (if the two points are identical). The most important factor in a speed-effective implementation of scalar point multiplication is the proper handling of expression (1). The double-and-add algorithm has traditionally been prevalent in this area, and it has recently been challenged by the NAF algorithm [5]. In this paper, we propose a scalar point multiplication algorithm with a novel approach that applies radix-4 Booth's recoding, and we derive numerical expressions for the point quadruple operation [6]. We evaluated and verified the algorithms using real applications. The derived expressions were described in both C and HDL in order to prove them and to measure the performance improvement.

The outline of the paper is as follows. We start by introducing the concept of the elliptic curve scalar point multiplication operation in Chapter 2. In Chapter 3 we discuss the evaluation and validation of our proposed algorithms, and we conclude in Chapter 4.

2 ELLIPTIC CURVE SCALAR POINT MULTIPLICATION OPERATION ALGORITHMS

In this contribution, we propose a new approach to obtaining the scalar point multiplication product based on an EC group. First, in Section 2.1, we introduce the fundamental mathematics of the ECC-based cryptosystem, especially polynomial basis arithmetic. In Section 2.2, we discuss previous studies that aimed to improve the complex EC point multiplication calculation. After that, we propose the algorithm and a few complementary formulas in Section 2.3.
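The polynomial basis arithmetic that the following sections rely on can be illustrated with a minimal Python sketch. It is not the multiplier or divider used in this work: the toy field GF(2^4), the reduction polynomial x^4 + x + 1, and the function names gf_add, gf_mul, gf_inv and gf_div are all assumptions chosen for illustration. A field element is stored as an integer whose bit i is the coefficient of x^i.

```python
# Sketch of GF(2^m) arithmetic in polynomial basis.
# Assumptions: toy field with m = 4 and reduction polynomial x^4 + x + 1.
M = 4
POLY = 0b10011  # x^4 + x + 1, irreducible over GF(2)


def gf_add(a, b):
    """Field addition (identical to subtraction): coefficient-wise XOR."""
    return a ^ b


def gf_mul(a, b):
    """Shift-and-add multiplication with interleaved modular reduction."""
    r = 0
    while b:
        if b & 1:          # current coefficient of b is 1: accumulate a
            r ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if a >> M:         # degree reached m: subtract the reduction polynomial
            a ^= POLY
    return r


def gf_inv(a):
    """Multiplicative inverse of a non-zero element, as a^(2^m - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, a)
    return r


def gf_div(a, b):
    """Field division a / b for non-zero b."""
    return gf_mul(a, gf_inv(b))
```

Squaring is simply gf_mul(a, a) in this sketch, which mirrors the remark later in the paper that a multiplier can also serve as a squarer.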

ECTI TRANSACTIONS ON COMPUTER AND INFORMATION THEORY VOL.1, NO.1 MAY 2005

Fig. 1: Hierarchical structure of an elliptic curve operation (levels labeled Multiplication, Double, Multiplication/Squaring, and Division).

2.1 Mathematics of the ECC-based cryptosystem

Two main operations are required to multiply an EC group element by a constant when encrypting a message: the point addition and point double operations. We also include point negation (Neg) as a miscellaneous operation, and the point quadruple (Quad) operation, which is suggested below for a fast implementation of kP. The elliptic curve E is defined as the set of all solutions (x, y) to the equation $y^2 + xy = x^3 + ax^2 + b$ together with the point at infinity O, where $b \neq 0$. The extra point O is needed to represent the group identity. Rules for the above operation routines, except for the Quad operation, are presented below. Rules for the Quad operation are given in Section 2.3.

addition: Let $P(x_1, y_1)$ and $Q(x_2, y_2)$ be two different points on the curve. If either point is O, the result is the other point. If $P = Q$, use the point double routine. If $x_1 = x_2$ and $y_1 \neq y_2$, then $P + Q = O$. If $P \neq Q$, then $P + Q = R(x_3, y_3)$, where

$$x_3 = \lambda^2 + \lambda + x_1 + x_2 + a, \quad y_3 = \lambda(x_1 + x_3) + x_3 + y_1, \quad \lambda = \frac{y_1 + y_2}{x_1 + x_2} \tag{2}$$

double: Let $P(x_1, y_1) = Q(x_1, y_1)$ be a point on the curve. If $x_1 = 0$, the result $2P$ is O. If $x_1 \neq 0$, then $2(x_1, y_1) = R(x_3, y_3)$, where

$$x_3 = \lambda^2 + \lambda + a, \quad y_3 = x_1^2 + (\lambda + 1)x_3, \quad \lambda = x_1 + \frac{y_1}{x_1} \tag{3}$$

negation: Let $P(x_1, y_1)$ be a point on the curve. Then $-P = R(x_3, y_3)$, or

$$(x_3, y_3) = -(x_1, y_1) = (x_1, x_1 + y_1) \tag{4}$$

From the rules above, we can count the number of field operations required to carry out each routine. The point addition routine requires 8 additions, 1 multiplication, 1 division, and 1 squaring. We should check that the divisor of $\lambda$, $(x_1 + x_2)$, is not zero. The point double routine requires 4 additions, 1 multiplication, 2 squarings, and 1 division. Again, we should check that the divisor of $\lambda$, $x_1$, is not zero. The
point negation routine requires just one addition. This operation is needed when implementing the fast algorithm for the calculation of kP. As explained later in Section 2.3, the values of $(-P)$ and $(-2P)$ are needed in the algorithm we developed. Within the basic mathematics of the ECC-based cryptosystem, multiplication and division occupy indispensable positions, and they matter most in the scalar point multiplication operation, which is discussed below.

2.2 Recent studies

The double-and-add algorithm has been the leading algorithm for implementing scalar point multiplication in ECC [7]. Double-and-add is similar to the square-and-multiply algorithm with which modular exponentiation is implemented in the RSA cryptosystem [8]. The double-and-add algorithm is given in expression (5) below, where $k = \sum_{i=0}^{m-1} b_i 2^i$ with $b_i \in \{0, 1\}$. From this point on, the point addition operation is written add( ) and the point double operation is written double( ).

Double-and-add algorithm for computing kP

    kP: $k = \sum_{i=0}^{m-1} b_i 2^i$  ($b_i \in \{0, 1\}$, leading bit $b_{m-1} = 1$)
    P := P(x_1, y_1)
    Q := P
    for i from m-2 downto 0 do
        Q := double(Q)
        if b_i = 1 then
            Q := add(P, Q)
    end (Q = kP)                                        (5)

Note that we need as many add( ) operations as the Hamming weight of the binary representation of k, in addition to at least $m - 1$ double( ) operations. Several algorithms have been suggested to improve the performance of the algorithm above. One of them is NAF (Non-Adjacent Format), described in expression (6).

Binary NAF method for computing kP

    kP: $\mathrm{NAF}(k) = \sum_{i=0}^{t-1} k_i 2^i$
    Q := O
    for i from t-1 downto 0 do
        Q := 2Q
        if k_i = 1 then Q := Q + P
        if k_i = -1 then Q := Q - P
    end (Q = kP)                                        (6)

This method exploits the redundancy of the binary representation of k in calculating kP. Its weak point is that k must be converted into NAF format in advance. As an improved approach to the concept of redundancy, we propose the quad-and-add algorithm, which uses the point quadruple operation. Both are discussed in detail in the next section.

2.3 Quad-and-add algorithm

In order to obtain calculations twice as fast as the double-and-add algorithm, we applied radix-4 redundant recoding to the binary representation of the scalar k. Expression (7) shows the concept of using radix-4 redundancy in pseudocode. Owing to the characteristic of radix-4 redundant recoding, the total number of steps is halved, down to $\lfloor m/2 \rfloor + 1$. According to the result of the radix-4 recoding in each step, one of the addends $0P$, $\pm P$, $\pm 2P$ is chosen, so that we get the final scalar multiplication result in $\lfloor m/2 \rfloor + 1$ cycles, which is 2 times as fast as the double-and-add algorithm.

Quad-and-add algorithm using radix-4 redundancy

    kP: $k = \sum_{i=0}^{\lfloor m/2 \rfloor} r_i 4^i$  ($r_i$ is the value of the redundant recoding)
    P := P(x_1, y_1)
    2P := double(P)
    Q := one of {0P, +P, +2P, -P, -2P}, selected by the top digit $r_{\lfloor m/2 \rfloor}$
    for i from $\lfloor m/2 \rfloor - 1$ downto 0 do
        Q := quad(Q)
        if (r_i == +P) then
            Q := add(P, Q)
        if (r_i == +2P) then
            Q := add(2P, Q)
        if (r_i == -P) then
            tempP := neg(P)
            Q := add(tempP, Q)
        if (r_i == -2P) then
            tempP := neg(2P)
            Q := add(tempP, Q)
    end (Q = kP)                                        (7)

Here, in order to get the quadruple of a point P on the given EC without using the double( ) operation two consecutive times, we derived the point quadruple operation (hereafter quad( )) by combining the point addition and point double operations, as in expression (8). The hierarchy shown in Figure 1 then becomes slightly modified, as shown in Figure 2.

Fig. 2: Hierarchical structure of elliptic curve operations in the suggested algorithm (scalar multiplication at the top; Double and Quadruple operations below it; multiplication/squaring, division, and addition/negation at the field level).

quadruple operation (quad( )): Let $P(x_1, y_1) = Q(x_1, y_1)$ be a point on the EC. If $x_1 = 0$, the result $4P$ is O (the point at infinity). If $x_1 \neq 0$, the result is $4P(x_1, y_1) = R(x_3, y_3)$, where $x_3$ and $y_3$ are as follows:

$$
\begin{aligned}
x_3 &= \lambda_2^2 + \lambda_2 + a, \\
y_3 &= x_2^2 + (\lambda_2 + 1)x_3, \\
\lambda_2 &= x_2 + \lambda_1 + 1 + \frac{x_1^2}{x_2}, \\
x_2 &= \lambda_1^2 + \lambda_1 + a, \\
\lambda_1 &= x_1 + \frac{y_1}{x_1}
\end{aligned}
\tag{8}
$$

Here $x_2$ and $\lambda_1$ are the x-coordinate and slope of the intermediate point $2P$ from expression (3); substituting $y_2 = x_1^2 + (\lambda_1 + 1)x_2$ into $\lambda_2 = x_2 + y_2/x_2$ gives the third line. From this formula, we can determine the number of field operations: the quad( ) routine requires 10 additions, 1 multiplication, 2 divisions, and 4 squarings. Fig. 3 shows a simple example comparing the traditional double-and-add algorithm with our proposed algorithm using radix-4 redundant recoding.

Fig. 3: Comparison example of the two algorithms for $k = 1010_2$. Double-and-add: $P \to 2P \to 2(2P) + P \to 2(2(2P) + P) = 10P$. Quad-and-add: the scalar is first made unsigned ($01010$) and Booth-recoded by selecting among $0P$, $+1P$, $+2P$, $-1P$, $-2P$: $1P \to (1P) \cdot 4 - 1P = 3P \to (4P - 1P) \cdot 4 - 2P = 10P$.

The number of iterations decreases from $m$ to $\lfloor m/2 \rfloor + 1$ steps. Table 1 summarizes the improvement in the number of steps and required EC operations. The number of operations in Table 1 is calculated based on a probability that depends on the Hamming weight of the prime polynomial. The probability of a 1 appearing in the binary representation of k during the m steps of the double-and-add algorithm is 0.5, and the probability of a non-zero Booth's recoding term is 6/8. The new algorithm exhibits a reduction of about 15% in Add operations. Furthermore, the new algorithm also benefits from using Quad operations. The Quad routine is derived by manipulating the expressions in the Double routine, which removes 1 field multiplication, and the proposed algorithm could become far more efficient if the Quad routine is enhanced using higher mathematics in the future. The proposed algorithm additionally requires a Booth's recoding circuit and memory space for storing the values of $P$, $2P$, $-P$, and $-2P$. The number of field operations for calculating kP is given in detail in Table 2, which demonstrates the efficiency of our proposed algorithm. We achieved a performance improvement of about 19% in multiplication, considering that our multiplier can be used as a squarer,
and about 9% in division.

Table 1: Comparison of number of operations. Columns: # of steps, Add, Double, Neg, Quad; rows: double-and-add, proposed algorithm. Double-and-add takes $m$ steps; the proposed algorithm takes $\lfloor m/2 \rfloor + 1$ steps.

3 EVALUATION AND VALIDATION

We divided the entire scalar multiplication process into sub-blocks to verify the derived expressions of the quad( ) operation algorithm. The sub-blocks were roughly categorized into a multiplier block and a divider block. The entire prototype processor was simulated using a digital control unit designed from finite state machines, integrating temporary register blocks. The overall performance of the pilot EC processor was verified through simulation using both C test bench programs and HDL at the algorithmic level.

First, we implemented Mastrovito's serial multiplier [9], which is reliable enough to serve as a verification reference, in C and applied m-bit random binary numbers as test vectors. The serial multiplier described in C does not produce errors, owing to its algorithmic originality. The result was confirmed by checking the partial results at each step of the algorithmic process. We also used a divider that was verified in [1].

To verify the entire scalar point multiplication process, we used a special characteristic of EC operations [10]: if we multiply a point by the order of the given EC, we get the point at infinity (O). The upper part of Figure 4 shows the process of obtaining the point at infinity using the double-and-add algorithm at the 192nd level. In the lower part of Figure 4, we can see that we get the expected point at infinity at the 96th step using the quad( ) operation.

Evaluation was performed at the highest level of the hierarchy, that is, scalar point multiplication, in the HDL-described m-bit elliptic curve cryptoprocessor. Figure 5 shows the block diagram of the HDL-described m-bit elliptic curve processor. We evaluated the performance focusing on the suggested quad( ) operation. Table 2 shows the performance improvement. Measurement on our m-bit cryptoprocessor showed more than a 40% reduction in multiplication operations and a small amount of
increase in the number of division and addition operations, applying the sect193r2 EC parameters suggested by SECG [4] in m-bit scalar multiplication. The overall performance gain over the architecture using the double-and-add algorithm was about [?]0%, considering speed and area.
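The radix-4 Booth recoding that drives the quad-and-add loop of Section 2.3 can be sketched on plain integers. This is an illustrative sketch only, not the recoding circuit of the prototype: the function names booth_digit and booth_radix4 are hypothetical, and the digit rule $r_i = b_{2i-1} + b_{2i} - 2b_{2i+1}$ (with $b_{-1} = 0$) is the standard modified Booth rule assumed here. The sketch shows that an unsigned m-bit scalar yields $\lfloor m/2 \rfloor + 1$ digits in {-2, -1, 0, +1, +2}, and that 6 of the 8 possible bit triples give a non-zero digit, which is where the probability 6/8 comes from.

```python
def booth_digit(t):
    """Digit for the 3-bit window t = (b_{2i+1}, b_{2i}, b_{2i-1})."""
    return (t & 1) + ((t >> 1) & 1) - 2 * ((t >> 2) & 1)


def booth_radix4(k, m):
    """Radix-4 Booth digits r_0 .. r_{m//2} of an unsigned m-bit scalar k.

    The scalar is treated as unsigned (implicit leading zeros) and a zero
    is appended below the least significant bit before the windows are read.
    """
    assert 0 <= k < (1 << m)
    bits = k << 1                      # append b_{-1} = 0
    return [booth_digit((bits >> (2 * i)) & 0b111) for i in range(m // 2 + 1)]


def booth_value(digits):
    """Recombine the digits: sum of r_i * 4^i."""
    return sum(d * 4**i for i, d in enumerate(digits))


# The example of Fig. 3: k = 1010 (binary) recodes to +1, -1, -2
# (most significant digit first), i.e. 16 - 4 - 2 = 10.
example = booth_radix4(0b1010, 4)

# Number of bit triples that give a non-zero digit (only 000 and 111 give 0).
nonzero_triples = sum(1 for t in range(8) if booth_digit(t) != 0)
```

Reading example from the last element to the first reproduces the quad-and-add trace $1P \to 4(1P) - 1P \to 4(3P) - 2P = 10P$.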

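The validation idea described above (multiplying a point by the group order must give O, and quad-and-add must agree with double-and-add) can be reproduced at toy scale. The following Python sketch is an illustration under stated assumptions, not the C test bench or HDL of this work: the field GF(2^4) with reduction polynomial x^4 + x + 1, the curve coefficients A and B, the 6-bit scalar length, and all function names are hypothetical choices. The guard for $x_2 = 0$ in quad is an addition needed to avoid a division by zero when $2P$ has order 2.

```python
# Toy parameters (assumptions for illustration; far too small for real use).
M, POLY = 4, 0b10011      # GF(2^4), reduction polynomial x^4 + x + 1
A, B = 0b1000, 0b1001     # curve y^2 + xy = x^3 + A*x^2 + B, with B != 0
O = None                  # point at infinity

def mul(a, b):
    """Polynomial-basis product in GF(2^M)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def div(a, b):
    """a / b for non-zero b, using 1/b = b^(2^M - 2)."""
    for _ in range(2**M - 2):
        a = mul(a, b)
    return a

def neg(P):
    return O if P is O else (P[0], P[0] ^ P[1])             # expression (4)

def double(P):
    if P is O or P[0] == 0:
        return O
    x1, y1 = P
    lam = x1 ^ div(y1, x1)
    x3 = mul(lam, lam) ^ lam ^ A
    return (x3, mul(x1, x1) ^ mul(lam ^ 1, x3))             # expression (3)

def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    if P == Q:
        return double(P)
    if P[0] == Q[0]:                                        # Q = -P
        return O
    (x1, y1), (x2, y2) = P, Q
    lam = div(y1 ^ y2, x1 ^ x2)
    x3 = mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    return (x3, mul(lam, x1 ^ x3) ^ x3 ^ y1)                # expression (2)

def quad(P):
    """4P by expression (8): 1 multiplication, 2 divisions, 4 squarings."""
    if P is O or P[0] == 0:
        return O
    x1, y1 = P
    lam1 = x1 ^ div(y1, x1)
    x2 = mul(lam1, lam1) ^ lam1 ^ A
    if x2 == 0:                                             # added guard: 2P has order 2
        return O
    lam2 = x2 ^ lam1 ^ 1 ^ div(mul(x1, x1), x2)
    x3 = mul(lam2, lam2) ^ lam2 ^ A
    return (x3, mul(x2, x2) ^ mul(lam2 ^ 1, x3))

def booth(k, m):
    """Radix-4 Booth digits r_0 .. r_{m//2} of an unsigned m-bit k."""
    bits = k << 1
    return [((bits >> (2 * i)) & 1) + ((bits >> (2 * i + 1)) & 1)
            - 2 * ((bits >> (2 * i + 2)) & 1) for i in range(m // 2 + 1)]

def double_and_add(k, P, m):
    Q = O
    for i in range(m - 1, -1, -1):
        Q = double(Q)
        if (k >> i) & 1:
            Q = add(P, Q)
    return Q

def quad_and_add(k, P, m):
    P2 = double(P)
    table = {0: O, 1: P, 2: P2, -1: neg(P), -2: neg(P2)}
    r = booth(k, m)
    Q = table[r[-1]]                                        # top digit initialises Q
    for d in reversed(r[:-1]):
        Q = add(table[d], quad(Q))
    return Q

# All affine points of the toy curve, found by exhaustive search.
POINTS = [(x, y) for x in range(2**M) for y in range(2**M)
          if mul(y, y) ^ mul(x, y) == mul(mul(x, x), x ^ A) ^ B]
ORDER = len(POINTS) + 1                                     # plus the point at infinity

def validate(m=6):
    """Checks in the spirit of Section 3, on the toy curve."""
    ok = all(quad(P) == double(double(P)) for P in POINTS)
    ok &= all(quad_and_add(k, P, m) == double_and_add(k, P, m)
              for P in POINTS for k in range(1 << m))
    ok &= all(quad_and_add(ORDER, P, m) is O for P in POINTS)
    return ok
```

Calling validate() compares quad against two consecutive doubles, compares both scalar multiplication routines for every 6-bit scalar, and checks that multiplying any point by the group order returns the point at infinity.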
double-and-add
    step#  0: double-and-add
        0_D9B67D19_E067C8_0F9E1A_7E8CA14_A65150A_AE617E8F
        1_CE9456_07C04AC_9E7DEFB_D9CA01F5_96F97_4CDECF6C
    step#  1: double
        1_756FF0DC_810F7856_0C5F5C_B14481F_A66857B_B151DA
        1_07188B7_5B044A9_17ADAC_A9EF8CDC_89CDEBA_F9165
    step#  2: double
        1_1549FE4_A8980E6_C9AF6F_4C81D415_00B09840_85FB447
        1_C0DDD61E_0CD1960A_59F7FE6_A8660A5_4D9F41E_4BC989F
    ...
    step#190: double-and-add
        1_654EB57_65586DB_05FDEBC_511BC95F_D995691_E0E95F9F
        0_9CBCACD_87A6A81_97F978_D088E_179790E_589F97
    step#191: double
        1_5AE784C_9954F598_6475718C_069EE79_FAA9E_465F8E7
        1_BC551A_6D7AE79_4E5EDF9_FA6FB66_DB5D58D_1BC8CAA
    step#192: double-and-add

quad-and-add
    step#  0: quad-and-add(P)
        0_D9B67D19_E067C8_0F9E1A_7E8CA14_A65150A_AE617E8F
        1_CE9456_07C04AC_9E7DEFB_D9CA01F5_96F97_4CDECF6C
    step#  1: quad
        1_1549FE4_A8980E6_C9AF6F_4C81D415_00B09840_85FB447
        1_C0DDD61E_0CD1960A_59F7FE6_A8660A5_4D9F41E_4BC989F
    ...
    step# 94: quad-and-add(P)
        0_4771CC_8EF4A9_81965AC9_5FBC8DE_4A0FC90_608E77D
        0_905C0FE8_0E90B8D0_59C068_C561E98C_495E74_9A87F1D
    step# 95: quad-and-add(P)
        1_654EB57_65586DB_05FDEBC_511BC95F_D995691_E0E95F9F
        0_9CBCACD_87A6A81_97F978_D088E_179790E_589F97
    step# 96: quad-and-add(P)

Fig. 4: Scalar multiplication comparison using double-and-add and quad-and-add.

Table 2: Comparison of number of operations (columns: double-and-add, quad-and-add, reduction ratio). Reduction ratios: multiplication 0.58, division 0.91, squaring 0.95.

4 DISCUSSION AND CONCLUSION

We proposed an improved scalar point multiplication algorithm that uses the concept of radix-4 redundancy, together with a point quadruple operation that is useful in computing complex EC operations. We applied the test environment to an ElGamal EC cryptosystem using the proposed algorithm. The designed prototype was verified with both C programs and HDL. The evaluation showed a performance enhancement of more than [?]0% over the algorithm using the double-and-add method. Fast scalar point multiplication
operation can be used in various applications such as encryption/decryption using EC operations and electronic signature authentication, as well as secure key exchange, and the importance of its versatility cannot be emphasized too much. Also, by utilizing the point quadruple operation suggested in this paper, we can expect faster and more efficient computation in most finite field operations.

Fig. 5: m-bit EC cryptoprocessor prototype (blocks: serial input; parameter registers k, f, a, b, x_P, y_P, x_2P, y_2P; adder; multiplier with squaring register and input multiplexers; divider; shift/decode; temporary registers; control block with CTL_ROM; output encrypted_point (x_t, y_t)).

References

[1] E. R. Berlekamp, "Bit-Serial Reed-Solomon Encoders," IEEE Transactions on Information Theory, Vol. IT-28, No. 6, pp. 869-874, Nov. 1982.
[2] C. Paar, P. Fleischmann, and P. S-Rodriguez, "Fast Arithmetic for Public-Key Algorithms in Galois Fields with Composite Exponents," IEEE Transactions on Computers, Vol. 48, No. 10, pp. 1025-1034, Oct. 1999.
[3] E. Mastrovito, "VLSI Architectures for Computation in Galois Fields," PhD thesis, Dept. of Electrical Eng., Linkoping Univ., Sweden, 1991.
[4] N. Koblitz, "Elliptic Curve Cryptosystems," Mathematics of Computation, Vol. 48, No. 177, pp. 203-209, 1987.
[5] D. Hankerson, J. L. Hernandez, and A. Menezes, "Software Implementation of Elliptic Curve Cryptography over Binary Fields," Crypto95, 1995.
[6] I. Koren, Computer Arithmetic Algorithms, Chapter 6, pp. 99-106, Prentice Hall International, 1993.
[7] N. Koblitz, A Course in Number Theory and Cryptography, Springer-Verlag, 1991.
[8] R. L. Rivest, A. Shamir, and L. M. Adleman, "A Method for Obtaining Digital Signatures and Public-key Cryptosystems," Communications of the ACM, Vol. 21, pp. 120-126, Feb. 1978.
[9] E. D. Mastrovito, "VLSI Design for multiplication over finite fields GF(2^m)," Lecture Notes in Computer Science 357, Springer-Verlag, Berlin, pp. 297-309, Mar. 1989.
[10] Certicom Research, "SECG: Recommended Elliptic Curve Domain Parameters," Oct. 1999.

Sangmook Moon was born in Korea in 1971. He received the BS, MS, and PhD degrees in electronic engineering from Yonsei University, Korea, in 1995, 1997, and 200[?], respectively. In 2004, he joined the Department of Electronic Engineering at Mokwon University, where he is currently an assistant professor. His current research interests include VLSI, crypto-processors, microprocessors, computer arithmetic, and security-related SoC.