IEEE TRANSACTIONS ON COMPUTERS, VOL. X, NO. X, MONTH YYYY

A Karatsuba-based Algorithm for Polynomial Multiplication in Chebyshev Form

Juliano B. Lima, Student Member, IEEE, Daniel Panario, Member, IEEE, and Qiang Wang

Abstract: In this paper, we present a new method for multiplying polynomials in Chebyshev form. Our approach has two steps. First, the well-known Karatsuba's algorithm is applied to polynomials constructed by using Chebyshev coefficients. Then, from the obtained result, extra arithmetic operations are used to write the final result in Chebyshev form. The proposed algorithm has a quadratic computational complexity. We also compare our method to other approaches.

Index Terms: Theory of computation, analysis of algorithms and problem complexity, computations on polynomials.

I. INTRODUCTION

CHEBYSHEV polynomials have been an essential mathematical object in several fields of knowledge. In Electronics, for instance, such polynomials have an important role in the design of analog and digital filters with characteristics close to the ideal ones [1]. Recently, Chebyshev series, i.e., the approximation of a function in terms of Chebyshev polynomials, was proposed for analyzing circuit nonlinearities. This provides more accuracy than other expansions, such as Taylor series [2]. Interpolation techniques via Chebyshev polynomials have been part of numerical algorithms for calculating chromatic dispersion coefficients of optical fibers. This allows us to plot the dispersion curves that describe the behavior of those fibers [3]. Such techniques are also useful in direct digital frequency synthesis of arbitrary waveforms, resampling procedures for discrete multitone modems and many other scenarios [4], [5].

In general, the use of Chebyshev polynomials for approximating a function assures more stability than the monomial representation or the use of other bases. In particular, if a truncation is necessary, the rapid decrease of the Chebyshev expansion coefficients entails
relatively small rounding errors [6], [7]. This is the basic reason these polynomials are highly attractive in numerical analysis and, in particular, in approximation and interpolation techniques.

This paper deals with the important operation of multiplication of polynomials in Chebyshev form. That is, given two polynomials $a(x)$ and $b(x)$ in Chebyshev form, obtain the polynomial $c(x) = a(x)\,b(x)$, also written in Chebyshev form. This problem was previously addressed in [8], where two approaches were given. The first one is a direct multiplication of polynomials in Chebyshev form, while the second is based on the discrete cosine transform (DCT). In this work, we propose a method based on the well-known Karatsuba's algorithm [9], [10]. Our approach consists of the application of Karatsuba's algorithm to the ordinary polynomials $a'(x)$ and $b'(x)$ obtained from $a(x)$ and $b(x)$. The coefficients in the resulting product are denoted by $c'_i$. Then, we show that the Chebyshev coefficients of $c(x)$, denoted by $c_i$, can be computed from the coefficients $c'_i$. This procedure, which needs extra arithmetic operations, is derived and explained in detail. Although our method has quadratic computational complexity, the number of required multiplications is reduced by half when compared to the direct multiplication [8]. In this respect, for small-degree polynomials $a(x)$ and $b(x)$, which cover several practical applications of Chebyshev expansions [2], [3], [4], our method is also more efficient than the mentioned DCT approach. Moreover, our procedure seems to provide implementation advantages because it does not introduce rounding errors.

Manuscript received Month DD, YYYY; revised Month DD, YYYY. J. B. Lima is with the Department of Electronics and Systems, Federal University of Pernambuco, Recife, Brazil (e-mail: juliano_bandeira@ieee.org). D. Panario and Q. Wang are with the School of Mathematics and Statistics, Carleton University, Ottawa, Canada (e-mail: daniel@math.carleton.ca; wang@math.carleton.ca).

In
Section II, we review Chebyshev polynomials and the direct method for multiplying polynomials in Chebyshev form. In Section III, after introducing the main ideas of this paper, the standard Karatsuba's algorithm is briefly shown. Then, we use this algorithm to perform the Chebyshev basis polynomial multiplication and provide some examples. Furthermore, Theorem 2 gives a precise estimate for the cost of the algorithm. A comparison with other approaches and conclusions are given in Section IV.

II. MULTIPLICATION OF POLYNOMIALS IN CHEBYSHEV FORM

The classical definition of Chebyshev polynomials of the first kind is
$$T_i(x) := \cos(i \arccos x), \tag{1}$$
where $i \in \mathbb{N}$ and $x \in [-1, 1]$. From Equation (1), we obtain $T_0(x) = 1$, $T_1(x) = x$, and the recurrence relation $T_{i+1}(x) = 2x\,T_i(x) - T_{i-1}(x)$. Hence, Chebyshev polynomials of degree $i$ can be easily obtained. It can also be shown that every real polynomial $a(x)$ of degree $N-1$ can be written as a linear combination of Chebyshev polynomials of the first kind [6]. Usually, this is called Chebyshev expansion and it is given by
$$a(x) = \frac{a_0}{2} + \sum_{i=1}^{N-1} a_i T_i(x), \quad a_i \in \mathbb{R}. \tag{2}$$
From the relation
$$2\,T_i T_j = T_{i+j} + T_{|i-j|}, \quad i, j \in \mathbb{N},$$
which can be verified by using simple trigonometric identities, a multiplication rule for polynomials in Chebyshev form can be derived. It is described in the following proposition [8].

Proposition 1: Let $a(x)$ and $b(x)$ be polynomials of degree $N-1$ given in the Chebyshev form
$$a(x) = \frac{a_0}{2} + \sum_{i=1}^{N-1} a_i T_i(x)$$
and
$$b(x) = \frac{b_0}{2} + \sum_{i=1}^{N-1} b_i T_i(x),$$
where $a_i, b_i \in \mathbb{R}$. Then the product $c(x) = a(x)\,b(x)$ has the Chebyshev form
$$c(x) = \frac{c_0}{2} + \sum_{i=1}^{2N-2} c_i T_i(x)$$
with
$$2c_i = \begin{cases} a_0 b_0 + 2\sum_{l=1}^{N-1} a_l b_l, & i = 0; \\ \sum_{l=0}^{i} a_{i-l} b_l + \sum_{l=1}^{N-1-i} (a_l b_{l+i} + a_{l+i} b_l), & i = 1, \ldots, N-2; \\ \sum_{l=i-N+1}^{N-1} a_{i-l} b_l, & i = N-1, \ldots, 2N-2. \end{cases} \tag{3}$$

The computation of all coefficients $c_i$, $i = 0, \ldots, 2N-2$, directly from Equation (3) is referred to as the direct method and involves $O(N^2)$ real multiplications [8]. In this same equation, the number of all possible products $a_i b_j$, $i, j = 0, \ldots, N-1$, and the number of products by $1/2$ are counted. This gives $M_d(N)$, the exact number of multiplications for computing all coefficients $c_i$ using the direct method,
$$M_d(N) = N^2 + 2N - 1. \tag{4}$$
According to Equation (3), given integers $i_1$ and $i_2$ such that $1 \le i_1 \le N-2$ and $i_1 < i_2 \le 2N-2$, any term of the form $(a_l b_{l+i_1} + a_{l+i_1} b_l)$, $l = 1, \ldots, N-1-i_1$, is previously computed in the sum $\sum_{l=0}^{i_2} a_{i_2-l} b_l$ or $\sum_{l=i_2-N+1}^{N-1} a_{i_2-l} b_l$. Consequently, in the second row of that equation, the additions inside $(a_l b_{l+i} + a_{l+i} b_l)$ do not need to be counted. Therefore, $A_d(N)$, the exact number of additions for obtaining all coefficients $c_i$ using the direct method, is
$$A_d(N) = N - 1 + (N-1)(N-2) + \sum_{i=N-1}^{2N-2} (2N-2-i) = \frac{(N-1)(3N-2)}{2}. \tag{5}$$

III. KARATSUBA-BASED ALGORITHM FOR THE MULTIPLICATION OF POLYNOMIALS IN CHEBYSHEV FORM

In this section, we present our algorithm: we use Karatsuba's algorithm to compute the product of two polynomials whose Chebyshev coefficients are given. The intermediate results of Karatsuba's algorithm are kept and then used to obtain the Chebyshev coefficients $c_i$ of the product polynomial.

The key point of our approach is to apply Karatsuba's algorithm for performing an ordinary polynomial multiplication and recover the Chebyshev coefficients through some equations. More specifically, in order to use the algorithm for multiplying $a(x)$ and $b(x)$, coefficients $a_i$ and $b_i$ are associated to the term of degree $i$, $i = 0, \ldots, N-1$, in the monomial representation. This procedure gives polynomials $a'(x) = \sum_{i=0}^{N-1} a_i x^i$ and $b'(x) = \sum_{i=0}^{N-1} b_i x^i$. By running Karatsuba's algorithm, we obtain the polynomial $c'(x) = a'(x)\,b'(x) = \sum_{i=0}^{2N-2} c'_i x^i$. On the other hand, these coefficients $c'_i$ are given by
$$c'_i = \begin{cases} a_0 b_0, & i = 0; \\ \sum_{l=0}^{i} a_{i-l} b_l, & i = 1, \ldots, N-2; \\ \sum_{l=i-N+1}^{N-1} a_{i-l} b_l, & i = N-1, \ldots, 2N-2. \end{cases} \tag{6}$$
By substituting Equation (6) into Equation (3), we obtain
$$2c_i = \begin{cases} c'_i + 2\sum_{l=1}^{N-1} a_l b_l, & i = 0; \\ c'_i + \sum_{l=1}^{N-1-i} (a_l b_{l+i} + a_{l+i} b_l), & i = 1, \ldots, N-2; \\ c'_i, & i = N-1, \ldots, 2N-2. \end{cases} \tag{7}$$

We remark that Equation (6) can be obtained by running a classical divide and conquer method. It involves $N^2$ multiplications and $(N-1)^2$ additions. To obtain coefficients $c_i$ in Equation (7), we need extra operations (that is, $2N-1$ extra multiplications and $N(N-1)/2$ extra additions). The total numbers of multiplications and additions are then equal to the same numbers for the direct method; see Equations (4) and (5). That is why in this paper we concentrate on using Karatsuba's algorithm to obtain coefficients $c'_i$ in Equation (6).

Given coefficients $c'_i$ computed by Karatsuba's algorithm, coefficients $c_i$ could be obtained from Equation (7) with the following number of extra multiplications:
- $2N-1$ due to the scale factor $1/2$;
- $N-1$ for computing terms $a_l b_l$, $l = 1, \ldots, N-1$;
- $(N-2)(N-1)/2$ for computing terms $(a_l b_{l+i} + a_{l+i} b_l) = (a_l + a_{l+i})(b_l + b_{l+i}) - a_l b_l - a_{l+i} b_{l+i}$, $i = 1, \ldots, N-2$, $l = 1, \ldots, N-1-i$.

This implies a total number of extra multiplications given by $(N^2 + 3N - 2)/2$. The number of extra additions related to the first and second rows of Equation (7) would be $N-1$ and $5(N-2)(N-1)/2$, respectively. Then, the total number of extra additions would be $(5N^2 - 13N + 8)/2$. We show how these numbers of extra operations can be further reduced by using the intermediate results of the Karatsuba's algorithm previously applied. Our algorithm is given below.

Algorithm: Karatsuba-based algorithm for polynomial multiplication in Chebyshev form

Input: polynomials $a(x) = \frac{a_0}{2} + \sum_{i=1}^{N-1} a_i T_i(x)$ and $b(x) = \frac{b_0}{2} + \sum_{i=1}^{N-1} b_i T_i(x)$ of degree $N-1$ in Chebyshev form.

Output: polynomial $c(x) = a(x)\,b(x) = \frac{c_0}{2} + \sum_{i=1}^{2N-2} c_i T_i(x)$ of degree $2N-2$ in Chebyshev form.

Step 1: Apply Karatsuba's algorithm on polynomials $a'(x) = \sum_{i=0}^{N-1} a_i x^i$ and $b'(x) = \sum_{i=0}^{N-1} b_i x^i$, the product of which is denoted by $c'(x) = a'(x)\,b'(x) = \sum_{i=0}^{2N-2} c'_i x^i$, and store all intermediate computations.

Step 1.1: These $c'_i$ are obtained from Equation (6).

Step 1.2: Clearly, any intermediate computation related to the term of degree $d$ in the polynomial $c'(x)$ can be written in the form $\sum_{k=0}^{D} (a_{i_k} b_{j_k} + a_{j_k} b_{i_k})$, where $D \le N-1$ and $i_k + j_k = d$.

Step 2: Obtain terms $a_l b_l$, $l = 1, \ldots, N-1$, and $(a_l b_{l+i} + a_{l+i} b_l)$, $i = 1, \ldots, N-2$, $l = 1, \ldots, N-1-i$, from intermediate computations of the form presented in Step 1.2. This may require a separation procedure.

Step 2.1 (Separation): Separate each term of the form $(a_l b_{l+i} + a_{l+i} b_l)$ from the intermediate term $\sum_{k=0}^{D} (a_{i_k} b_{j_k} + a_{j_k} b_{i_k})$, $D > 0$, such that $i_k + j_k = 2l + i$, for $i = 1, \ldots, N-2$ and $l = 1, \ldots, N-1-i$.

Step 3: Add the terms obtained in Step 2 to coefficients $c'_i$, $i = 0, \ldots, N-2$, according to the first and second rows of Equation (7), to obtain $c(x) = a(x)\,b(x) = \frac{c_0}{2} + \sum_{i=1}^{2N-2} c_i T_i(x)$.

We provide details concerning the execution of Step 2 of the presented algorithm in Section III-B. The correctness of the algorithm is immediate from Equations (3), (6) and (7).

A. Karatsuba's Algorithm

Assume that we want to multiply two polynomials, $a'(x)$ and $b'(x)$, with degrees $N-1$. These polynomials are given in the monomial form and have coefficients $a_i$ and $b_i$, respectively. For the purpose of this paper, we consider $N = 2^n$, $n \in \mathbb{N}$. However, there are also efficient ways for dealing with polynomials with degrees different from $2^n - 1$ [10], [11]. We may write
$$a'(x) = A_1(x)\,x^{N/2} + A_0(x)$$
and
$$b'(x) = B_1(x)\,x^{N/2} + B_0(x),$$
where
$$A_1(x) = a_{N-1} x^{N/2-1} + \cdots + a_{N/2}, \qquad A_0(x) = a_{N/2-1} x^{N/2-1} + \cdots + a_0,$$
$$B_1(x) = b_{N-1} x^{N/2-1} + \cdots + b_{N/2}, \qquad B_0(x) = b_{N/2-1} x^{N/2-1} + \cdots + b_0.$$
We have $c'(x) = a'(x)\,b'(x)$ given by
$$c'(x) = [A_1(x) B_1(x)]\,x^{N} + [A_0(x) B_1(x) + A_1(x) B_0(x)]\,x^{N/2} + [A_0(x) B_0(x)]. \tag{8}$$
In the above equation, simplifying the notation and omitting $(x)$, the term multiplying $x^{N/2}$ may be rewritten as
$$A_0 B_1 + A_1 B_0 = (A_0 + A_1)(B_0 + B_1) - A_0 B_0 - A_1 B_1.$$
This saves one multiplication, because we have previously computed $A_0 B_0$ and $A_1 B_1$. Therefore, the product of polynomials with degree $N-1$ may be computed using three products of polynomials with degree $(N/2)-1$. As this procedure is recursive, it can be shown that Karatsuba's algorithm for multiplying polynomials of degree $N-1$ with $N = 2^n$, i.e., for obtaining coefficients $c'_i$, can be done with $N^{\log 3}$ multiplications and at most $6N^{\log 3} - 8N + 2$ additions, where $\log$ denotes the base-2 logarithm
[12]. It is important to notice that we are not applying Karatsuba's algorithm in a black-box manner. Instead, we store all intermediate results to be used later. We also remark that such an algorithm has a three-term structure based on the recursive computation of $A_1 B_1$, $A_0 B_0$ and $A_0 B_1 + A_1 B_0 = (A_0 + A_1)(B_0 + B_1) - A_0 B_0 - A_1 B_1$. Throughout this paper, intermediate terms involved in the computation of $A_1 B_1$, $A_0 B_0$ and $A_0 B_1 + A_1 B_0$ are respectively associated to symbols 11, 00 and 01.

B. Extra Operations for Karatsuba's Algorithm

According to Equation (7), in order to obtain the Chebyshev coefficients $c_i$ of polynomial $c(x)$ from coefficients $c'_i$, we need to consider the scaling factors $1/2$ and to compute terms $a_l b_l$, $l = 1, \ldots, N-1$, and $(a_l b_{l+i} + a_{l+i} b_l)$, $i = 1, \ldots, N-2$, $l = 1, \ldots, N-1-i$. Due to the recursive nature of Karatsuba's algorithm, some of these terms appear computed together with other terms. Therefore, extra arithmetic operations will be needed for computing them separately before conveniently adding them to coefficients $c'_i$. In this paper, this procedure is referred to as separation. Briefly, extra operations for obtaining coefficients $c_i$ from coefficients $c'_i$ are related to:
- operations for separating terms originally computed together with other terms;
- additions of terms $a_l b_l$ and $(a_l b_{l+i} + a_{l+i} b_l)$ respectively on the first and second rows of Equation (7);
- multiplications by the scale factor $1/2$.

The total number of required extra operations is stated in the following theorem.

Theorem 1: Let $a(x)$ and $b(x)$ be polynomials of degree $N-1$ whose Chebyshev coefficients $a_i$ and $b_i$, $i = 0, \ldots, N-1$, are given. Let $a'(x) = \sum_{i=0}^{N-1} a_i x^i$ and $b'(x) = \sum_{i=0}^{N-1} b_i x^i$ be polynomials whose product is denoted by $c'(x) = \sum_{i=0}^{2N-2} c'_i x^i$. If the polynomial $c'(x)$ is computed using Karatsuba's algorithm, then the Chebyshev coefficients $c_i$, $i = 0, \ldots, 2N-2$, of the polynomial $c(x) = a(x)\,b(x)$ are obtained from the coefficients $c'_i$, $i = 0, \ldots, 2N-2$, with
$$M_e(N) = \frac{N^2}{2} - N^{\log 3} + \frac{5N}{2} - 1 \tag{9}$$
extra multiplications and
$$A_e(N) \le \frac{5N^2 - 6N^{\log 3} + N(1 - \log N)}{2} \tag{10}$$
extra additions.

Before presenting the proof of Theorem 1, we introduce some notation and develop examples which make the derivation of Equations (9) and (10) easier to understand. Particularly, we are interested in observing which intermediate terms related to symbols 11, 00 and 01 are produced together. In what follows, terms with this characteristic are written between angle brackets $\langle\,\cdot\,\rangle$; we omit this notation for single terms $a_i b_i$.

Example 1: We want to multiply polynomials $a(x)$ and $b(x)$, $N = 2$, whose Chebyshev coefficients $a_i$ and $b_i$ are given. Using Karatsuba's algorithm for computing coefficients $c'_i$, we have
$$11:\ c'_2 = A_1 B_1 = a_1 b_1; \tag{11}$$
$$00:\ c'_0 = A_0 B_0 = a_0 b_0; \tag{12}$$
$$01:\ c'_1 = (A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0 = (a_1 + a_0)(b_1 + b_0) - a_1 b_1 - a_0 b_0 = \langle a_0 b_1 + a_1 b_0 \rangle. \tag{13}$$
From Equation (7), we directly obtain $c_2 = c'_2/2$, $c_1 = c'_1/2$ and $c_0 = c'_0/2 + c'_2$, because there are no terms to be separated. In this case, extra operations are exclusively due to the scale factor $1/2$ and the addition $c'_0/2 + c'_2$, which results in $M_e(2) = 3$ and $A_e(2) = 1$.

Example 2: A second example is to multiply $a(x)$ and $b(x)$ where $N = 4$. As Karatsuba's algorithm is recursive, in this case, the computation of $A_1 B_1$ and $A_0 B_0$ may be viewed as repetitions of the first example. Therefore, the intermediate terms related to symbols 11 and 00 are
$$11:\ c'_6 = a_3 b_3,\quad c'_5 = \langle a_2 b_3 + a_3 b_2 \rangle,\quad a_2 b_2; \tag{14}$$
$$00:\ a_1 b_1,\quad c'_1 = \langle a_0 b_1 + a_1 b_0 \rangle,\quad c'_0 = a_0 b_0. \tag{15}$$
The computation of $(A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0$ is similar, but special care is needed with terms produced together. More specifically, we have $(A_1 + A_0) = (a_3 + a_1)x + (a_2 + a_0)$ and $(B_1 + B_0) = (b_3 + b_1)x + (b_2 + b_0)$, the product of which produces the terms $(a_3 + a_1)(b_3 + b_1)$, $(a_3 + a_1)(b_2 + b_0) + (a_2 + a_0)(b_3 + b_1)$ and $(a_2 + a_0)(b_2 + b_0)$. The subtractions of $A_1 B_1$ and $A_0 B_0$ come from the intermediate terms related to symbols 11 and 00, respectively, in Equations (14) and (15). By subtracting $a_3 b_3$ and $a_1 b_1$ from $(a_3 + a_1)(b_3 + b_1)$, we obtain $a_1 b_3 + a_3 b_1$; by subtracting $a_2 b_2$ and $a_0 b_0$ from $(a_2 + a_0)(b_2 + b_0)$, we get $a_0 b_2 + a_2 b_0$; by subtracting $a_2 b_3 + a_3 b_2$ and $a_0 b_1 + a_1 b_0$ from $(a_3 + a_1)(b_2 + b_0) + (a_2 + a_0)(b_3 + b_1)$, we obtain $a_1 b_2 + a_2 b_1 + a_0 b_3 + a_3 b_0$. Therefore, the final result for the intermediate terms related to symbol 01 is
$$01:\ \langle a_1 b_3 + a_3 b_1 \rangle,\quad c'_3 = \langle a_1 b_2 + a_2 b_1 + a_0 b_3 + a_3 b_0 \rangle,\quad \langle a_0 b_2 + a_2 b_0 \rangle. \tag{16}$$
We recall that coefficients $c'_i$, $i = 0, \ldots, 6$, are obtained by running Karatsuba's algorithm after all other intermediate terms are computed. However, at this point, we just want to observe the terms that are produced together, which is sufficient to perform the first step of the algorithm. In this sense, from Equation (7), we particularly know that $2c_1 = c'_1 + (a_1 b_2 + a_2 b_1 + a_2 b_3 + a_3 b_2)$. Hence, in order to evaluate $c_1$, we need to compute $a_1 b_2 + a_2 b_1$ separately, because this term is originally produced together with $a_0 b_3 + a_3 b_0$, as shown in Equation (16). Since we know $a_1 b_1$ and $a_2 b_2$, this requires one multiplication and four additions because $a_1 b_2 + a_2 b_1 = (a_1 + a_2)(b_1 + b_2) - a_1 b_1 - a_2 b_2$. All other coefficients $c_i$ can be obtained in a similar way. Naturally, we still need to count the other extra operations mentioned before Theorem 1. The final result is $M_e(4) = 8$ and $A_e(4) = 11$.

Remark: In Example 2, we do not need to separate $a_0 b_3 + a_3 b_0$. However, this term could be obtained by using one more addition, namely
$$a_0 b_3 + a_3 b_0 = \langle a_1 b_2 + a_2 b_1 + a_0 b_3 + a_3 b_0 \rangle - (a_1 b_2 + a_2 b_1). \tag{17}$$
Although the last step of the separation procedure is not required in Example 2, we do need to use it in multiplications involving larger degree polynomials.

Example 3: In this example, we want to multiply $a(x)$ and $b(x)$ for $N = 8$. As in Example 2, the computation of $A_1 B_1$ and $A_0 B_0$ may be viewed as repetitions of the case $N = 4$. The terms obtained are
$$11:\ c'_{14} = a_7 b_7,\ c'_{13} = \langle a_6 b_7 + a_7 b_6 \rangle,\ a_6 b_6,\ \langle a_5 b_7 + a_7 b_5 \rangle,\ c'_{11} = \langle a_5 b_6 + a_6 b_5 + a_4 b_7 + a_7 b_4 \rangle,\ \langle a_4 b_6 + a_6 b_4 \rangle,\ a_5 b_5,\ \langle a_4 b_5 + a_5 b_4 \rangle,\ a_4 b_4; \tag{18}$$
$$00:\ a_3 b_3,\ \langle a_2 b_3 + a_3 b_2 \rangle,\ a_2 b_2,\ \langle a_1 b_3 + a_3 b_1 \rangle,\ c'_3 = \langle a_1 b_2 + a_2 b_1 + a_0 b_3 + a_3 b_0 \rangle,\ \langle a_0 b_2 + a_2 b_0 \rangle,\ a_1 b_1,\ c'_1 = \langle a_0 b_1 + a_1 b_0 \rangle,\ c'_0 = a_0 b_0. \tag{19}$$
The computation of $(A_1 + A_0)(B_1 + B_0) - A_1 B_1 - A_0 B_0$ is also analogous. We give only its final result, that is
$$01:\ \langle a_3 b_7 + a_7 b_3 \rangle,\ \langle a_2 b_7 + a_7 b_2 + a_3 b_6 + a_6 b_3 \rangle,\ \langle a_2 b_6 + a_6 b_2 \rangle,\ \langle a_1 b_7 + a_7 b_1 + a_3 b_5 + a_5 b_3 \rangle,\ c'_7 = \langle a_1 b_6 + a_6 b_1 + a_2 b_5 + a_5 b_2 + a_0 b_7 + a_7 b_0 + a_3 b_4 + a_4 b_3 \rangle,\ \langle a_0 b_6 + a_6 b_0 + a_2 b_4 + a_4 b_2 \rangle,\ \langle a_1 b_5 + a_5 b_1 \rangle,\ \langle a_0 b_5 + a_5 b_0 + a_1 b_4 + a_4 b_1 \rangle,\ \langle a_0 b_4 + a_4 b_0 \rangle. \tag{20}$$
Let us consider the term $c'_7 = \langle a_1 b_6 + a_6 b_1 + a_2 b_5 + a_5 b_2 + a_0 b_7 + a_7 b_0 + a_3 b_4 + a_4 b_3 \rangle$. Terms $a_1 b_6 + a_6 b_1$, $a_2 b_5 + a_5 b_2$ and $a_3 b_4 + a_4 b_3$ need to be separated from $c'_7$ because they must also be added to $c'_5$, $c'_3$ and $c'_1$, in order to compute $c_5$, $c_3$ and $c_1$, respectively. Similarly to the previous example, one multiplication and four additions are necessary for calculating each one of these terms. From the term $\langle a_2 b_7 + a_7 b_2 + a_3 b_6 + a_6 b_3 \rangle$, which is associated to $c'_9$, we need to separate $a_2 b_7 + a_7 b_2$ and $a_3 b_6 + a_6 b_3$, and respectively add them to $c'_5$ and $c'_3$, in order to compute $c_5$ and $c_3$. The same procedure is applied to all terms which are previously computed together. After this, other extra operations have to be counted for adding the separated terms to coefficients $c'_i$ and multiplying by the factor $1/2$. This results in $M_e(8) = 24$ and $A_e(8) = 71$.

With the previous examples in mind, we can derive a formula for the number of operations necessary to separate terms produced together in Karatsuba's algorithm. We start by observing the intermediate terms produced by the algorithm, i.e., before obtaining the final result for the coefficients of $c'(x)$. We associate terms of the form $a_i b_i$ to 0, $a_{i_1} b_{j_1} + a_{j_1} b_{i_1}$ to 1, $a_{i_1} b_{j_1} + a_{j_1} b_{i_1} + a_{i_2} b_{j_2} + a_{j_2} b_{i_2}$ to 2, $a_{i_1} b_{j_1} + a_{j_1} b_{i_1} + a_{i_2} b_{j_2} + a_{j_2} b_{i_2} + a_{i_3} b_{j_3} + a_{j_3} b_{i_3} + a_{i_4} b_{j_4} + a_{j_4} b_{i_4}$ to 4, etc. In general, a term of the form
$$\sum_{k=1}^{t} (a_{i_k} b_{j_k} + a_{j_k} b_{i_k}), \tag{21}$$
where $t \in \mathbb{N}$ and $i_k + j_k$ is a constant for $1 \le k \le t$, is associated to the number or status $s = t$. If we consider that all terms in the above expression need to be separated, $s - 1$ extra multiplications are required. Consequently, at most $4(s-1) + 1$ extra additions are necessary. The upper bound is justified by the possible presence of terms of the form $a_0 b_i + a_i b_0$, $i \ne 0$, produced together with other terms. They do not need to be separated and, in these cases, one addition is saved; see the remark after Example 2. After applying the separation procedure just explained, every term has status at most 1, i.e., has the form $a_i b_i$ or $a_i b_j + a_j b_i$. Such terms are then added to coefficients $c'_i$ according to Equation (7) in order to obtain coefficients $c_i$.

For $N = 1$, we have $a_0 b_0$ only, which has status 0 and does not represent any extra operation. Since this case is like an initial state, we associate it to 01. For $N = 2$, we have a repetition of the previous one on terms associated to 11 and 00; see Equations (11) and (12). The symbol 01 is also a repetition of the previous one, but with a status incremented from 0 to 1; see Equation (13). Due to the recursive nature of the algorithm, an analogous fact occurs for $N = 4, 8, \ldots$. This may be verified in Equations (14)-(16) and Equations (18)-(20). This allows us to construct Table I, which shows the status of all terms in Karatsuba's algorithm up to $N = 8$. The last row emphasizes that $m(n)$, the number of multiplications necessary for separating terms that Karatsuba's algorithm computes together, is obtained by summing contributions of terms associated to 11, 01 and 00. These contributions are respectively denoted by $m(n)_{11}$, $m(n)_{01}$ and $m(n)_{00}$. If $N = 4$, for instance, we have $m(n) = m(n)_{01} = 1$ because only the term with status 2 associated to 01 requires a separation procedure (see Table I). Specifically, this term corresponds to $\langle a_1 b_2 + a_2 b_1 + a_0 b_3 + a_3 b_0 \rangle$, presented in Example 2. If $N = 8$, we have $m(n)_{00} = 1$ (one term with status 2), $m(n)_{01} = 7$ (four terms with status 2 and one term with
status 4) and $m(n)_{11} = 1$ (one term with status 2). These terms may be distinguished in Equations (18)-(20). In this case, $m(n) = 1 + 7 + 1 = 9$. Moreover, by comparing the rows for $N = 4$ ($n = 2$) and $N = 8$ ($n = 3$) in Table I, we note that $m(3)_{11} = m(3)_{00} = m(2)$; $m(3)_{01}$ is given by $2m(2)_{01}$ plus the contribution of the terms related to $m(2)_{01}$, but with incremented (doubled) statuses. Due to the recursion of Karatsuba's algorithm, this situation is general, that is, $m(n)_{11} = m(n)_{00} = m(n-1)$ and $m(n)_{01}$ is given by $2m(n-1)_{01}$ plus the contribution of the terms associated to $m(n-1)_{01}$ with incremented statuses.

Proof of Theorem 1: By using the previous notation and remarks, the number of multiplications necessary for separating terms that Karatsuba's algorithm computes together, $m(n)$, is given by
$$m(n) = m(n)_{11} + m(n)_{01} + m(n)_{00}. \tag{22}$$
We know that
$$m(n)_{11} = m(n)_{00} = m(n-1). \tag{23}$$
From the above comments, $m(n)_{01}$ is given by $2m(n-1)_{01}$ plus the contribution of the terms related to $m(n-1)_{01}$ with incremented (doubled) statuses. A term with status $s_1 = 2^n$, $n \ge 0$, contributes with $m_{s_1} = 2^n - 1$ extra multiplications. Consequently, a term with status $s_2 = 2s_1 = 2^{n+1}$ contributes with $m_{s_2} = 2^{n+1} - 1 = 2(2^n - 1) + 1 = 2m_{s_1} + 1$ extra multiplications. Then, if a set with $t$ terms contributes with $m_t$ extra multiplications, a new set, obtained by doubling the status of each term in the previous set, contributes with $2m_t + t$ extra multiplications. We note that there are $3^{n-2}$ terms associated to $m(n-1)_{01}$ (see Table I for the cases $n = 1, 2, 3$). Therefore, by doubling the status of each one of these terms, the new contribution is $2m(n-1)_{01} + 3^{n-2}$. This allows us to write
$$m(n)_{01} = 2m(n-1)_{01} + 2m(n-1)_{01} + 3^{n-2} = 4m(n-1)_{01} + 3^{n-2}.$$
We also note that $m(n-1)_{01} = m(n-1) - 2m(n-2)$. Thus, the above equation may be rewritten as
$$m(n)_{01} = 4\,(m(n-1) - 2m(n-2)) + 3^{n-2}. \tag{24}$$
By substituting Equations (23) and (24) in Equation (22), we have
$$m(n) = 2m(n-1) + 4\,(m(n-1) - 2m(n-2)) + 3^{n-2} = 6m(n-1) - 8m(n-2) + 3^{n-2}. \tag{25}$$
Equation (25) is a recurrence relation$^{1}$ and it can be
solved by means of the z-transform. Denoting by $M(z)$ the z-transform of $m(n)$, Equation (25) is written in the z-transform domain as
$$M(z) = 6M(z)\,z^{-1} - 8M(z)\,z^{-2} + \frac{z^{-2}}{1 - 3z^{-1}}.$$
In the last equation, grouping the terms with $M(z)$, we have
$$M(z) = \frac{z^{-2}}{(1 - 6z^{-1} + 8z^{-2})(1 - 3z^{-1})} = \frac{1/2}{1 - 4z^{-1}} + \frac{1/2}{1 - 2z^{-1}} - \frac{1}{1 - 3z^{-1}}. \tag{26}$$
Applying the inverse z-transform to Equation (26), one obtains
$$m(n) = \frac{4^n}{2} + \frac{2^n}{2} - 3^n.$$
The above equation can be written as a function of $N$ as
$$m(N) = \frac{N^2}{2} + \frac{N}{2} - N^{\log 3}.$$
Adding to $m(N)$ the multiplications due to the scale factor $1/2$, we compute $M_e(N)$, the total number of extra multiplications for computing coefficients $c_i$ from coefficients $c'_i$, by
$$M_e(N) = \frac{N^2}{2} + \frac{N}{2} - N^{\log 3} + 2N - 1 = \frac{N^2}{2} - N^{\log 3} + \frac{5N}{2} - 1.$$

The extra additions come from two sources. The first one is related to the separation procedure. There are four additions per product and at most one more addition for each term with status 2 or greater; see the comments immediately after Equation (21). Given $n$, the total number of terms produced in the first step of Karatsuba's algorithm is $3^n$. Denoting respectively by $S_0(n)$ and $S_1(n)$ the number of terms with status 0 and 1 for such an $n$, we know that $S_2(n)$, the number of terms with status 2 or greater, is given by
$$S_2(n) = 3^n - S_0(n) - S_1(n). \tag{27}$$
We note that $S_0(n) = 2^n$ and
$$S_1(n) = 2S_1(n-1) + S_0(n-1) = 2S_1(n-1) + 2^{n-1}.$$

$^{1}$Curiously, this recurrence relation produces a sequence $m(n)$, $n = 0, 1, 2, \ldots$, which coincides with the number of monotone Boolean functions of $n$ variables with 2 mincuts. It also represents the number of Sperner systems with 2 blocks and some other sequences archived by the On-Line Encyclopedia of Integer Sequences [13].
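The closed form for $m(n)$ and the counts $S_0(n)$ and $S_1(n)$ can be checked numerically. The Python sketch below is an illustration only and is not part of the algorithm itself. It assumes the status rule read off from Table I: blocks 11 and 00 repeat the statuses of the previous size, while block 01 repeats them with status 0 raised to 1 and every other status doubled. The function names `statuses`, `separation_mults` and `closed_form` are hypothetical.

```python
def statuses(n):
    """Statuses of the 3**n intermediate terms for N = 2**n, ordered as blocks 11, 01, 00."""
    if n == 0:
        return [0]  # N = 1: the single term a_0 b_0
    prev = statuses(n - 1)
    # Block 01 (assumed rule from Table I): status 0 becomes 1, any other status doubles.
    bumped = [1 if s == 0 else 2 * s for s in prev]
    return prev + bumped + prev


def separation_mults(n):
    """m(n): a term of status s >= 2 needs s - 1 multiplications to be separated."""
    return sum(s - 1 for s in statuses(n) if s >= 2)


def closed_form(n):
    """m(n) = 4**n / 2 + 2**n / 2 - 3**n, kept in integer arithmetic."""
    return (4**n + 2**n) // 2 - 3**n


for n in range(7):
    st = statuses(n)
    assert len(st) == 3**n                   # total number of terms
    assert st.count(0) == 2**n               # S_0(n)
    assert st.count(1) == n * 2**n // 2      # S_1(n) = n * 2**(n - 1)
    assert separation_mults(n) == closed_form(n)
    if n >= 2:  # recurrence (25)
        assert closed_form(n) == (6 * closed_form(n - 1)
                                  - 8 * closed_form(n - 2) + 3**(n - 2))

print([closed_form(n) for n in range(5)])  # [0, 0, 1, 9, 55]
```

Adding the $2N - 1$ scalings by $1/2$ to these values gives 3, 8 and 24 for $N = 2, 4, 8$, in agreement with $M_e(2)$, $M_e(4)$ and $M_e(8)$ from Examples 1 to 3.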

TABLE I
STATUS OF ALL TERMS IN KARATSUBA'S ALGORITHM UP TO N = 8. THE NUMBER OF MULTIPLICATIONS m(n) NECESSARY FOR SEPARATING TERMS ORIGINALLY COMPUTED TOGETHER WITH OTHER TERMS IS ALSO PRESENTED. THE TERMS IN COLUMNS 11, 01 AND 00 ACCOUNT FOR $m_{11}^{(n)}$, $m_{01}^{(n)}$ AND $m_{00}^{(n)}$, RESPECTIVELY.

| $N = 2^n$ | 11 | 01 | 00 | $m(n)$ |
|---|---|---|---|---|
| 1 | - | 0 | - | 0 |
| 2 | 0 | 1 | 0 | 0 |
| 4 | 0, 1, 0 | 1, 2, 1 | 0, 1, 0 | 1 |
| 8 | 0, 1, 0, 1, 2, 1, 0, 1, 0 | 1, 2, 1, 2, 4, 2, 1, 2, 1 | 0, 1, 0, 1, 2, 1, 0, 1, 0 | 9 |

Since the recursion for $S_1(n)$ is also linear, it may be solved using the z-transform. The result is

$$S_1(n) = n \, 2^{n-1}.$$

Hence, Equation (27) may be written as

$$S_2(n) = 3^n - 2^n - n \, 2^{n-1}$$

and, consequently,

$$S_2(N) = N^{\log_2 3} - N - \frac{N \log_2 N}{2} = N^{\log_2 3} - N \left( 1 + \frac{\log_2 N}{2} \right).$$

Thus, the number of extra additions related to the separation procedure is at most

$$4 \left( \frac{N^2}{2} + \frac{N}{2} - N^{\log_2 3} \right) + N^{\log_2 3} - N \left( 1 + \frac{\log_2 N}{2} \right) = 2N^2 - 3 N^{\log_2 3} + N \left( 1 - \frac{\log_2 N}{2} \right). \tag{28}$$

The second source of extra additions is related to the operations needed to add the terms $a_l b_l$, $l = 1, \ldots, N-1$, and $a_l b_{l+i} + a_{l+i} b_l$, $i = 1, \ldots, N-2$, $l = 1, \ldots, N-1-i$, in Equation (7), which gives

$$N - 1 + \sum_{i=1}^{N-2} (N - 1 - i) = \frac{N(N-1)}{2}. \tag{29}$$

Thus, by summing Equations (28) and (29), we compute $A_e(N)$, the total number of extra additions for computing the Chebyshev coefficients $c_i$ from the intermediate coefficients. One obtains

$$A_e(N) \leq 2N^2 - 3 N^{\log_2 3} + N \left( 1 - \frac{\log_2 N}{2} \right) + \frac{N(N-1)}{2} = \frac{5N^2 - 6 N^{\log_2 3} + N (1 - \log_2 N)}{2}.$$

C. Total Arithmetic Complexity

By using our Karatsuba-based algorithm, the total arithmetic complexity for computing the Chebyshev coefficients of the product of two polynomials in Chebyshev form is given by the following theorem.

Theorem 2: Let $a(x)$ and $b(x)$ be polynomials of degree $N-1$ whose Chebyshev coefficients $a_i$ and $b_i$, $i = 0, \ldots, N-1$, are given. By means of the proposed Karatsuba-based algorithm, the Chebyshev coefficients $c_i$, $i = 0, \ldots, 2N-2$, of the polynomial $c(x) = a(x) b(x)$ are obtained with

$$M_k(N) = \frac{N^2 + 5N - 2}{2} \tag{30}$$

multiplications and

$$A_k(N) \leq \frac{5N^2 + 6 N^{\log_2 3} - N (15 + \log_2 N) + 4}{2} \tag{31}$$

additions.

The proof is immediate. Equations (30) and (31) are obtained by adding the number of operations necessary for computing the intermediate coefficients, presented in Section III-A, to the number of extra operations derived in the last subsection. We remark that the standard application of Karatsuba's algorithm for multiplying polynomials involves $O(N^{\log_2 3})$ arithmetic operations. Here, due to the extra operations, our method has a higher cost of $O(N^2)$.

IV. DISCUSSION AND CONCLUSIONS

From Equations (4), (5), (30) and (31), we construct Table II, which shows the total number of multiplications and additions for multiplying polynomials in Chebyshev form by the direct method ($M_d$ and $A_d$, respectively) and by our Karatsuba-based method ($M_k$ and $A_k$, respectively). All entries in Table II were checked by a MATLAB computer simulation. The program counted the number of operations for both the direct and our Karatsuba-based methods.

Although both the direct and our Karatsuba-based methods involve $O(N^2)$ multiplications, the division by 2 in Equation (30) makes a considerable difference. By asymptotically evaluating the ratio $M_k(N)/M_d(N)$, we conclude that half of the multiplications required by the direct method are saved if we use the Karatsuba-based algorithm. This tendency is observed in Table II. As expected, since one of the principles of Karatsuba's algorithm is to exchange multiplications for additions, $A_k(N)$ is larger than $A_d(N)$. More precisely, the ratio $A_k(N)/A_d(N)$ approaches 5/3 as $N$ increases.

Thus, a coherent comparison between the direct and the proposed methods strongly depends on the computational cost of one multiplication in terms of additions. If we consider that one multiplication costs $r$ additions, the following analysis can be done. Let $N = 2^n$. The total computational cost $T_d(N)$ for multiplying two polynomials of degree $N-1$ in Chebyshev form by the direct method is measured by

$$T_d(N) = r M_d(N) + A_d(N).$$

The total cost $T_k(N)$ using our Karatsuba-based method is

$$T_k(N) = r M_k(N) + A_k(N).$$
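To make this cost model concrete, the following Python sketch evaluates the operation counts and the finite-$N$ break-even value of $r$, that is, the value at which $T_d(N) = T_k(N)$. It is an illustration under stated assumptions, not the authors' simulation program. The Karatsuba-based counts use the expressions of Theorem 2, $M_k(N) = (N^2 + 5N - 2)/2$ and the upper bound $(5N^2 + 6N^{\log_2 3} - N(15 + \log_2 N) + 4)/2$ for $A_k(N)$. The direct-method counts $M_d(N) = N^2 + 2N - 1$ and $A_d(N) = (3N^2 - 5N + 2)/2$ are assumptions fitted to the entries of Table II, because Equations (4) and (5) are not reproduced in this section. All function names are hypothetical.

```python
from fractions import Fraction


def log2_exact(N):
    """Return n such that N == 2**n; N must be a power of two."""
    n = N.bit_length() - 1
    if N < 1 or (1 << n) != N:
        raise ValueError("N must be a power of two")
    return n


def karatsuba_counts(N):
    """(M_k, A_k) from Theorem 2, with A_k taken at its upper bound."""
    n = log2_exact(N)
    M_k = (N * N + 5 * N - 2) // 2
    A_k = (5 * N * N + 6 * 3 ** n - N * (15 + n) + 4) // 2
    return M_k, A_k


def direct_counts(N):
    """(M_d, A_d); ASSUMED closed forms fitted to the entries of Table II."""
    M_d = N * N + 2 * N - 1
    A_d = (3 * N * N - 5 * N + 2) // 2
    return M_d, A_d


def break_even_r(N):
    """Value of r for which r*M_d + A_d equals r*M_k + A_k (requires N >= 2)."""
    M_k, A_k = karatsuba_counts(N)
    M_d, A_d = direct_counts(N)
    return Fraction(A_k - A_d, M_d - M_k)


for n in range(1, 11):
    N = 2 ** n
    print(N, direct_counts(N), karatsuba_counts(N), float(break_even_r(N)))
```

Under these assumptions, the Karatsuba-based method is cheaper than the direct method for a given $N$ whenever $r$ exceeds `break_even_r(N)`. Exact rational arithmetic (`Fraction`) is used so that the threshold is not affected by rounding.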

TABLE II
TOTAL NUMBER OF MULTIPLICATIONS AND ADDITIONS FOR MULTIPLYING POLYNOMIALS IN CHEBYSHEV BASIS BY DIRECT ($M_d$ AND $A_d$, RESPECTIVELY) AND KARATSUBA-BASED ($M_k$ AND $A_k$, RESPECTIVELY) METHODS

| $N = 2^n$ | $M_d$ | $M_k$ | $A_d$ | $A_k$ |
|---|---|---|---|---|
| 1 | 2 | 2 | 0 | 0 |
| 2 | 7 | 6 | 2 | 5 |
| 4 | 23 | 17 | 15 | 35 |
| 8 | 79 | 51 | 77 | 171 |
| 16 | 287 | 167 | 345 | 733 |
| 32 | 1087 | 591 | 1457 | 2971 |
| 64 | 4223 | 2207 | 5985 | 11757 |
| 128 | 16639 | 8511 | 24257 | 46115 |

TABLE III
TOTAL NUMBER OF MULTIPLICATIONS AND ADDITIONS FOR MULTIPLYING POLYNOMIALS IN CHEBYSHEV BASIS BY DCT ($M_{DCT}$ AND $A_{DCT}$, RESPECTIVELY) AND KARATSUBA-BASED ($M_k$ AND $A_k$, RESPECTIVELY) METHODS

| $N = 2^n$ | $M_{DCT}$ | $M_k$ | $A_{DCT}$ | $A_k$ |
|---|---|---|---|---|
| 1 | 2 | 2 | 12 | 0 |
| 2 | 7 | 6 | 30 | 5 |
| 4 | 23 | 17 | 81 | 35 |
| 8 | 67 | 51 | 216 | 171 |
| 16 | 179 | 167 | 555 | 733 |
| 32 | 451 | 591 | 1374 | 2971 |
| 64 | 1091 | 2207 | 3297 | 11757 |
| 128 | 2563 | 8511 | 7716 | 46115 |

A general understanding of the ratio $T_d(N)/T_k(N)$ can be acquired by computing

$$\lim_{N \to \infty} \left[ \frac{T_d(N)}{T_k(N)} \right] = \lim_{N \to \infty} \left[ \frac{r M_d(N) + A_d(N)}{r M_k(N) + A_k(N)} \right].$$

In order to find the range of $r$ where the Karatsuba-based approach is faster than the direct approach, we substitute the previously derived formulas in the above equation and obtain

$$\frac{2r + 3}{r + 5} > 1,$$

whose solution is $r > 2$. Hence, the Karatsuba-based approach is asymptotically cheaper than the direct approach if one multiplication costs more than two additions. In most applications, one multiplication is significantly more expensive than two additions [10].

Another alternative for performing the operation discussed in this paper is to expand the polynomials in Chebyshev form and rewrite them in monomial form. Then, the product is computed by applying the standard Karatsuba's algorithm. As a final step, the obtained polynomial is written back in Chebyshev form. In this case, besides increasing the arithmetic complexity involved, the extra operations for converting polynomials from Chebyshev form to monomial form and vice versa also induce precision restrictions.

It is also pertinent to compare our approach with that proposed in [8], where the polynomial multiplication in Chebyshev form is computed in the discrete cosine transform (DCT) domain. In this case, the product of two polynomials of degree $N-1$ is carried out by computing DCTs of length $2N$. Although the authors of [8] only discuss asymptotic aspects of the arithmetic complexity involved in this method, it is possible to use the general formulas given in [14] to obtain a more precise number of multiplications and additions required by the DCT method. They are denoted by $M_{DCT}(N)$ and $A_{DCT}(N)$, respectively, and are given by

$$M_{DCT}(N) = 3N \log_2 2N - 4N + 3$$

and

$$A_{DCT}(N) = (9N + 3) \log_2 2N - 12N + 12.$$

By observing Table III, which compares the DCT and our Karatsuba-based methods, we note that the former uses fewer arithmetic operations for $N \geq 32$. For $N = 16$, a coherent comparison depends on the cost $r$ of one multiplication in terms of additions. Since a DCT implementation requires multiplications by cosines of arcs, precision restrictions must also be considered. On the other hand, in the Karatsuba-based method, besides products among the coefficients $a_i$ and $b_i$, only products by 1/2 are necessary, which makes this aspect less critical. Hence, for $N < 16$, which covers several practical applications of Chebyshev expansions, the Karatsuba-based method should be used. For instance, in [2], [3] and [4], Chebyshev expansions with $4 \leq N \leq 6$, $5 \leq N \leq 13$ and $3 \leq N \leq 5$ are used, respectively. For larger $N$, if precision is not a problem, the DCT method should be used.

We remark that the space required by our algorithm is a bit larger than that required by the other algorithms. However, our method should be employed for intermediate sizes, where this larger memory requirement is not a problem.

Although this paper is not focused on hardware implementations of the proposed method, there is a relevant remark concerning this aspect. Except for some multiplications by 1/2, all extra operations needed for computing the coefficients $c_i$ from the intermediate coefficients can be implemented in parallel with the standard Karatsuba's algorithm. Exploiting this parallelism, our method can be considerably sped up.

ACKNOWLEDGMENT

Juliano B. Lima performed this work
while at the School of Mathematics and Statistics, Carleton University. He was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) under Grant 0599-07-7. Both Daniel Panario and Qiang Wang are supported in part by NSERC of Canada.

REFERENCES

[1] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 2nd edition, 1999.
[2] I. Sarkas, D. Mavridis, M. Papamichail, and G. Papadopoulos, "Volterra analysis using Chebyshev series," in Proc. IEEE Int. Symposium on Circuits and Systems (ISCAS 2007), May 2007, pp. 1931-1934.
[3] P. J. Chiang, C. P. Yu, and H. C. Chang, "Robust calculation of chromatic dispersion coefficients of optical fibers from numerically determined effective indices using Chebyshev-Lagrange interpolation polynomials," Journal of Lightwave Technology, vol. 24, no. 11, pp. 4411-4416, Nov. 2006.
[4] A. Ashrafi, R. Adhami, L. Joiner, and P. Kaveh, "Arbitrary waveform DDFS utilizing Chebyshev polynomials interpolation," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 8, pp. 1468-1475, Aug. 2004.
[5] G. Cuypers, G. Ysebaert, M. Moonen, and F. Pisoni, "Chebyshev interpolation for DMT modems," in Proc. IEEE Int. Conference on Communications (ICC 2004), June 2004, pp. 2736-2740.
[6] J. C. Mason and D. C. Handscomb, Chebyshev Polynomials, Chapman & Hall/CRC, Boca Raton, FL, 1st edition, 2003.
[7] G. H. Rawitscher and I. Koltracht, "An efficient numerical spectral method for solving the Schrodinger equation," Computing in Science & Engineering, vol. 7, no. 6, pp. 58-66, Nov.-Dec. 2005.

[8] G. Baszenski and M. Tasche, "Fast polynomial multiplication and convolutions related to the discrete cosine transform," Linear Algebra Appl., vol. 252, no. 1-3, pp. 1-25, Feb. 1997.
[9] A. Karatsuba and Y. Ofman, "Multiplication of many-digital numbers by automatic computers," Doklady Akad. Nauk SSSR, vol. 145, pp. 293-294, 1962. Translation in Physics-Doklady, no. 7, pp. 595-596, 1963.
[10] J. von zur Gathen and J. Gerhard, Modern Computer Algebra, Cambridge University Press, Cambridge, United Kingdom, 2nd edition, 2003.
[11] P. L. Montgomery, "Five, six, and seven-term Karatsuba-like formulae," IEEE Transactions on Computers, vol. 54, no. 3, pp. 362-369, Mar. 2005.
[12] C. Paar, "A new architecture for a parallel finite field multiplier with low complexity based on composite fields," IEEE Transactions on Computers, vol. 45, no. 7, pp. 856-861, July 1996.
[13] N. J. A. Sloane, "The on-line encyclopedia of integer sequences," http://www.research.att.com/~njas/sequences/A016269
[14] S. C. Chan and K. L. Ho, "Direct method for computing sinusoidal transforms," IEEE Proceedings, vol. 137, no. 6, pp. 433-442, Dec. 1990.