A Newton Raphson Divider Based on Improved Reciprocal Approximation Algorithm


EE38N High Speed Computer Arithmetic, Fall 2006
Project Report
A Newton Raphson Divider Based on Improved Reciprocal Approximation Algorithm
Gaurav Agrawal, Ankit Khandelwal
Submitted on Dec 4, 2006

Abstract

Newton Raphson functional approximation is an attractive division strategy because it provides quadratic convergence and can be faster than digit recurrence methods if an accurate initial approximation is available. In this project, we study and simulate several table-lookup based initial approximation methods. Of particular interest is the Taylor series based reciprocal approximation method, which uses a table lookup followed by a multiplication and can provide a very accurate initial approximation with a very small ROM. We implemented and simulated a 24 bit divider based on various methods published in the literature and also propose an improvement that retains accuracy while using a much smaller ROM.

Table of Contents

1. Motivation
2. Problem Statement
3. Background
3.1 The Division Problem
3.2 Classification of Division Algorithms
3.2.1 Digit Recurrence Algorithms (Slow Division)
3.2.2 Functional Approximation Algorithms (Fast Division)
3.3 Initial Approximation Techniques
3.3.1 Linear Approximation
3.3.2 Direct Table Lookup
3.3.3 Table Lookup followed by Multiplication
4. Related Work
5. Design Implementation
6. Results
7. Conclusion
8. References
Appendix A: MATLAB Code
Appendix B: Table of ROM Values

1. Motivation

Floating point performance is a key denominator of performance for several applications, including those in the scientific, graphics and DSP domains. High speed floating point hardware is a requirement to meet the ever increasing computational demands of these applications. Modern applications comprise several floating point operations including addition, multiplication, and division. In recent FPUs, emphasis has been placed on designing ever faster adders and multipliers, with division receiving less attention. Typically, the range for addition latency is two to four cycles, and the range for multiplication is two to eight cycles. In contrast, the latency for double precision division in modern FPUs ranges from less than eight cycles to over 60 cycles. A common perception of division is that it is an infrequent operation whose implementation need not receive high priority. However, it has been argued that ignoring its implementation can result in significant system performance degradation for many applications.

2. Problem Statement

In Newton-Raphson functional approximation based division algorithms, an accurate initial reciprocal approximation is highly desirable as it enables quick convergence to the final result. The problem studied in this project is the determination of the reciprocal approximation with high accuracy while using less area.

3. Background

3.1 The Division Problem

The problem of arithmetic division can be formulated as below:

$$Q = \frac{N}{D}$$

Where:
Q = Quotient
N = Numerator (Dividend)
D = Denominator (Divisor)

In this project N and D are assumed to be of the form (as would be the case for the mantissa of a normalized floating point number)

$$N = 1.x_1 x_2 x_3 \ldots x_k, \qquad D = 1.y_1 y_2 y_3 \ldots y_k$$

3.2 Classification of Division Algorithms

The division techniques suitable for VLSI implementation can be divided into two broad categories.

3.2.1 Digit Recurrence Algorithms (Slow Division)

Digit recurrence algorithms use subtractive methods to calculate quotients one digit per iteration. The basic recurrence relation used in these algorithms is:

$$P_{j+1} = r P_j - q_{n-(j+1)} D$$

Where:
$P_j$ = the partial remainder of the division
$r$ = the radix
$q_{n-(j+1)}$ = the digit of the quotient in position $n-(j+1)$, where the digit positions are numbered from least significant (0) to most significant ($n-1$)
$n$ = number of digits in the quotient
$D$ = the denominator

Various techniques using digit recurrence algorithms can be classified as below:
(i) Restoring Division
    a. Performing Restoring Division
    b. Non-Performing Restoring Division
(ii) Non Restoring Division
(iii) Radix-r SRT Division

3.2.2 Functional Approximation Algorithms (Fast Division)

Unlike digit recurrence division, division by functional iteration utilizes multiplication as the fundamental operation. The primary difficulty with subtractive division is the linear convergence to the quotient. Multiplicative division algorithms are able to take advantage of high-speed multipliers to converge to a result quadratically. Rather than retiring a fixed number of quotient bits in every cycle, multiplication-based algorithms are able to double the number of correct quotient bits in every iteration. However, the tradeoff between the two classes is not only latency in terms of the number of iterations, but also the length of each iteration in cycles. Additionally, if the divider shares an existing multiplier, the performance ramifications on regular multiplication operations must be considered. It has been reported that in typical floating point applications, the performance degradation due to a shared multiplier is small. Accordingly, if area must be minimized, an existing multiplier may be shared with the division unit with only minimal system performance degradation.

(i) Goldschmidt's Algorithm: The Goldschmidt algorithm uses a series expansion to converge to the quotient. The strategy is to repeatedly multiply the dividend and the divisor by a factor $R_i$ so that the divisor converges to 1 as the dividend converges to the quotient Q.

$$Q = \frac{N \, R_0 R_1 R_2 \cdots R_K}{D \, R_0 R_1 R_2 \cdots R_K}$$

As $D \, R_0 R_1 R_2 \cdots R_K$ converges to 1, $N \, R_0 R_1 R_2 \cdots R_K$ converges to Q.

(ii) Newton Raphson Division: Newton Raphson iteration is a well-known iterative method to approximate the root of a non-linear function. Let $f(x)$ be a well behaved function and let $r$ be a root of the equation $f(x) = 0$. We start with $x_0$, which is a good estimate of $r$, and let $r = x_0 + h$. The number $h$ measures how far the estimate $x_0$ is from the truth. Since $h$ is small, the linear approximation can be used to conclude that

$$0 = f(r) = f(x_0 + h) \approx f(x_0) + h f'(x_0)$$

And therefore, unless $f'(x_0)$ is close to 0,

$$h \approx -\frac{f(x_0)}{f'(x_0)}$$

It follows that

$$r = x_0 + h \approx x_0 - \frac{f(x_0)}{f'(x_0)}$$

Our new improved estimate $x_1$ of $r$ is therefore given by

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

Continue in this way. If $x_i$ is the current estimate, then the next estimate $x_{i+1}$ is given by:

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (1)$$

The equation obtained above is called the Newton Raphson formula. In order to compute the reciprocal, the following function and its derivative are used:

$$f(x) = \frac{1}{x} - X \qquad (2)$$

$$f'(x) = -\frac{1}{x^2} \qquad (3)$$

Substituting equations (2) and (3) into (1) yields

$$x_{i+1} = 2 x_i - X x_i^2 \qquad (4)$$

It can also be written as:

$$x_{i+1} = x_i (2 - X x_i) \qquad (5)$$

The above equations can be implemented in hardware in order to double the accuracy in each iteration. Using the form in equation (4), one square, one multiplication, one shift and one subtraction are required for the computation of $x_{i+1}$.

Error Analysis: Let $\varepsilon_i = \frac{1}{X} - x_i$ be the error at the $i$-th iteration, then:

$$\varepsilon_{i+1} = \frac{1}{X} - x_{i+1} = \frac{1}{X} - x_i (2 - X x_i) = X \left( \frac{1}{X} - x_i \right)^2$$

This can also be expressed as:

$$\varepsilon_{i+1} = X \varepsilon_i^2$$

The above equation clearly shows that the absolute error decays quadratically in each iteration.

3.3 Initial Approximation Techniques

Quadratic convergence techniques like Newton-Raphson require an initial approximation on which they iterate to improve the accuracy of the final result. The number of iterations required depends upon the accuracy of the initial approximation. Reducing the number of iterations not only decreases the area of the design but also helps in reducing the delay and the power numbers. Thus it is good to have as accurate an initial approximation as possible with as little an area increment as possible. Various techniques are available to calculate the initial approximation and some of them are explained below.
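The quadratic error decay of the reciprocal iteration in equation (5) can be checked numerically. The following is a minimal sketch in Python (not the report's MATLAB), using exact rational arithmetic; the operand `X = 3/2`, the starting guess `x0 = 5/8` and the function name `newton_reciprocal` are illustrative assumptions, not values from the report.

```python
from fractions import Fraction

def newton_reciprocal(X, x0, iterations):
    """Return [x_0, x_1, ...] using x_{i+1} = x_i * (2 - X * x_i)."""
    estimates = [x0]
    for _ in range(iterations):
        x = estimates[-1]
        estimates.append(x * (2 - X * x))
    return estimates

X = Fraction(3, 2)    # illustrative operand in [1, 2)
x0 = Fraction(5, 8)   # rough starting guess for 1/X = 0.666...
estimates = newton_reciprocal(X, x0, 3)
errors = [1 / X - x for x in estimates]   # epsilon_i = 1/X - x_i

for i, err in enumerate(errors):
    print(f"iteration {i}: error = {float(err):.3e}")
```

Because the arithmetic is exact, each error equals X times the square of the previous one, so the number of correct bits roughly doubles per step.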

3.3.1 Linear Approximation

This method is one of the simplest approaches used for calculating the initial approximation. It uses the equation $X_0 = 2.9142 - 2D$, and can be easily implemented using an adder. But this approach does not provide a good initial approximation and hence is rarely used in real world designs.

3.3.2 Direct Table Look Up

This method uses a ROM, as a look up table, to calculate the initial approximation. The most significant bits of the mantissa, excluding the leading 1, are used as the address bits for the table look up. The values stored in the table are calculated using the equation

$$\text{stored} = \frac{1}{D' + 2^{-(M+1)}} + 2^{-(M+2)}$$

where $D' = [1.d_1 d_2 \ldots d_M]$ and $(M+1)$ is the accuracy in bits desired in the initial approximation. The ROM size required by this approach is $2^M \times M$ bits.

3.3.3 Table Look Up Followed By Multiplication

The ROM values are obtained by performing the Taylor series expansion of the reciprocal function. A Taylor series expansion for a general function $f(x)$ around a point $a$ is given by

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n$$

In order to obtain the Taylor series expansion of the reciprocal function, the operand $D = 1.d_1 d_2 \ldots d_k$ is split into two parts such that $D_1 = [1.d_1 d_2 \ldots d_m]$ and $D = D_1 + D_2$. $D_2$ can further be represented as $D_2 = 2^{-m} d$. Substituting this value of $D_2$ into $D = D_1 + D_2$ gives

$$D = D_1 + 2^{-m} d$$

Expanding the Taylor series for $1/D$ around $d = 2^{-1}$ (the midpoint of the interval addressed by $D_1$) and taking the first two terms gives the following equation:

$$\frac{1}{D} = \frac{1}{D_1 + 2^{-m} d} \approx \frac{1}{D_1 + 2^{-m-1}} - \frac{2^{-m} d - 2^{-m-1}}{(D_1 + 2^{-m-1})^2} = \frac{D_1 + 2^{-m} - 2^{-m} d}{(D_1 + 2^{-m-1})^2} = \frac{1}{(D_1 + 2^{-m-1})^2} \, (D_1 + 2^{-m} - D_2)$$

The first term, $\frac{1}{(D_1 + 2^{-m-1})^2}$, of this equation is read from the ROM using the $m$ address bits of $D$ excluding the leading 1. The second term, $D_1 + 2^{-m} - D_2$, which can be represented as $1.d_1 d_2 \ldots d_m \tilde{d}_{m+1} \tilde{d}_{m+2} \ldots \tilde{d}_k$, is obtained from the operand modifier. The operand modifier keeps the first $(m+1)$ bits (including the leading 1) intact and inverts the rest of the bits to obtain the final output. Multiplication of these terms provides an initial approximation of the inverse of the denominator, whose accuracy is $(2m+2)$ bits. The corresponding ROM size is $2^m (2m+2)$ bits. For example, if an accuracy of 14 bits is desired in the initial approximation, then a ROM of size $2^6 (2 \times 6 + 2) = 896$ bits needs to be designed. Since the ROM output is only $(2m+2)$ bits accurate, the output bit accuracy obtained from the operand modifier can be reduced to $(2m+2)$ bits. Finally, a multiplier of size $(2m+2) \times (2m+2)$ bits could be used and its output could be rounded off to $(2m+2)$ bits. This would help in reducing the area and the power consumption of the design. In order to keep the ROM size to a minimum, the value of $m$ needs to be determined carefully as the ROM size depends exponentially on $m$.
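The table lookup followed by multiplication described above can be sketched as follows. This is a Python illustration (not the report's MATLAB) using exact fractions and an unquantised ROM word; the function names `rom_entry`, `modify_operand` and `initial_reciprocal`, and the sample spacing of 4099, are assumptions made for the example.

```python
import math
from fractions import Fraction

def rom_entry(address, m):
    """Unquantised ROM word 1/(D1 + 2^-(m+1))^2 for D1 = 1 + address/2^m."""
    D1 = 1 + Fraction(address, 2**m)
    return 1 / (D1 + Fraction(1, 2**(m + 1))) ** 2

def modify_operand(frac, k, m):
    """Keep the leading 1 and the first m fraction bits, invert the other k-m bits."""
    low_mask = (1 << (k - m)) - 1
    return 1 + Fraction(frac ^ low_mask, 2**k)

def initial_reciprocal(frac, k, m):
    """Approximate 1/D for D = 1 + frac/2^k with one lookup and one multiplication."""
    address = frac >> (k - m)          # the m address bits after the leading 1
    return rom_entry(address, m) * modify_operand(frac, k, m)

K, M = 24, 6
worst = max(1 / (1 + Fraction(f, 2**K)) - initial_reciprocal(f, K, M)
            for f in range(0, 2**K, 4099))
print(f"worst error over the sample is about 2^{math.log2(float(worst)):.2f}")
```

Inverting the low bits yields $D_1 + 2^{-m} - D_2 - 2^{-k}$, so the modifier differs from the ideal second term only by one unit in the last place.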

4. Related Work

Rich literature exists describing various implementations of Newton Raphson reciprocal approximation based dividers. Most of these implementations differ in the method used to get the initial estimate of the reciprocal of the denominator and in the tradeoffs between speed and area. Described below are some of the interesting implementations that were considered as the baseline for this project.

The design proposed by Fowler et al. [3], shown in the figure below (called Design 3), is one of the earliest NR techniques utilizing the direct table look-up method for the initial approximation of the reciprocal. The first $(m-1)$ mantissa bits (excluding the leading 1) were used to index into the ROM and an $m$ bits accurate initial approximation was obtained, thus making the ROM size $2^{m-1} \times m$. Iteration steps were then used to improve upon this initial approximation. Each iteration step needed two multipliers and a bunch of inverters for calculating the two's complement (rather, one's complement).

Figure: Fowler et al.'s Design [3]

Kucukkabak et al. [2] used the Taylor series expansion method to calculate the initial approximation for the iteration steps. The initial approximation was obtained by multiplying the ROM output with the modifier output, as shown in the figure below (called Design 2). A ROM output and a modifier output of $2m$ bits were used to calculate the initial approximation of $2m \pm 3$ bits accuracy. The multiplier used for calculating the initial approximation is also utilized by the iteration steps (twice by each iteration) to improve upon the intermediate results.

Figure: Kucukkabak et al.'s Design [2]

Chen et al. [1] proposed an improvement to the table look-up read followed by a multiplication with the modifier output, as shown in the figure below (called Design 1), to obtain the initial approximation. The memory used in the design is $(2m+2)$ bits wide and the modifier output obtained is $2m+1$ bits wide. This provides an accuracy of $2m$ bits. Iteration steps improve upon the initial approximation thus obtained with the help of a squarer, a multiplier, a shifter and a subtractor.

Figure: Chen et al.'s Design [1] (block diagram: ROM and modifier feeding a multiplier, followed by a squarer, a multiplier, a shifter and a subtractor, with pipeline registers and multiplexers between stages)

The three designs described above have their own advantages and disadvantages. Design 3 provides the best throughput but in turn requires more hardware, thus increasing power and area. Design 2, on the other hand, uses the same multiplier for each multiplication step required. This reduces the throughput by a large amount but saves on the area of the design. Design 1 uses different multipliers for the initial approximation and the iteration steps. Hence, although it provides better throughput, it has more area than Design 2. In comparison to Design 3, it is better in terms of area but worse in terms of throughput. If an initial approximation of a particular accuracy is desired by all the designs, then Design 1 requires the minimum ROM size.

5. Design Implementation

We used an unsigned 24 bit divider as the design on which the various strategies were compared. The divider performs the following operation:

$$Q = \frac{N}{D}, \qquad N = 1.x_1 x_2 \ldots x_{23} x_{24}, \qquad D = 1.y_1 y_2 \ldots y_{23} y_{24}$$

N and D both have 24-bit significands and have been considered to be in the range $1 \le (N, D) < 2$, as is the case for the mantissa of normalized floating point numbers. As a design goal, we stipulated that the maximum error in Q should be $\varepsilon < 2^{-24}$, i.e. the error should be less than 1 LSB of Q.

The three designs described in the previous section were adapted to the above design specification. We also implemented our own improved design, the block diagram of which is shown in the figure below. This design is based on the implementation in [1]; however, it uses less than half the amount of ROM needed in [1] while still achieving the desired accuracy. This is a fully pipelined design, i.e. a new division can be started every cycle. The latency from the beginning of the division to the availability of Q is 5 cycles (for simplicity of latency calculation, we have assumed that all ROM lookup and multiplication operations require one cycle).

The parameters that needed to be selected for this design were:
M: The number of address bits indexing into the ROM
W: The width of a word in the ROM
WX: Width of the modified operand
WD: Width of the rest of the data path

For a particular choice of M and W, a ROM with size $(2^M \times W)$ bits will be required. One of the goals of this design was to minimize the amount of ROM required. The process used to come up with an optimum set of parameters is explained next.

Since it was required to refine the initial approximation to 24-bit accuracy using just one Newton Raphson iteration, we needed at least 12-bit accuracy in the initial approximation. To allow ourselves some margin for the inaccuracies introduced later by finite width multipliers, we selected an approximation method that gives 13 bits of accuracy in the absence of any finite word length effect.
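The 12-bit requirement follows from the quadratic error relation of the Newton Raphson step. A short worked derivation is given below; the relative error $\delta_i = X \varepsilon_i = 1 - X x_i$ is a symbol introduced here for illustration, with $X$ standing for the divisor as in the error analysis of Section 3.2.2, and the iteration is assumed to be computed exactly.

```latex
\varepsilon_{i+1} = X \varepsilon_i^2
\;\Longrightarrow\;
\delta_{i+1} = X \varepsilon_{i+1} = (X \varepsilon_i)^2 = \delta_i^2
\\[4pt]
|\delta_0| \le 2^{-12} \;\Longrightarrow\; |\delta_1| \le 2^{-24},
\qquad
|\delta_0| \le 2^{-13} \;\Longrightarrow\; |\delta_1| \le 2^{-26}
```

Under these assumptions a 12-bit initial estimate is the bare minimum for 24 bits after one iteration, while a 13-bit estimate leaves roughly two bits of headroom for the rounding and truncation in a finite width datapath.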

Minimization of M was key to area minimization as the size of the ROM depends exponentially on M. As the accuracy of the 2-term Taylor series approximation itself depends on the value of M, we first determined the minimum value of M for which the Taylor series approximation itself has sufficient accuracy.

Figure: Improved Design
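The dependence of the 2-term Taylor accuracy on M can be estimated with a small sweep. The sketch below is in Python (not the report's MATLAB) and measures the reciprocal error only, before any quantisation; the grid of 2^16 divisor values and the names `taylor_reciprocal` and `worst_case_bits` are assumptions of this example rather than the report's simulation setup.

```python
import math

def taylor_reciprocal(D, M):
    """First two Taylor terms of 1/D about the midpoint of the interval addressed by M bits."""
    D1 = math.floor(D * 2**M) / 2**M       # 1.d1 d2 ... dM
    D2 = D - D1                            # remaining low-order part
    a = D1 + 2.0 ** -(M + 1)               # expansion point
    return 1.0 / a - (D2 - 2.0 ** -(M + 1)) / a**2

def worst_case_bits(M, k=16):
    """Worst-case accuracy in bits over D = 1 + f/2^k, f = 0 .. 2^k - 1."""
    worst = max(1.0 / D - taylor_reciprocal(D, M)
                for D in (1 + f / 2**k for f in range(2**k)))
    return -math.log2(worst)

for M in range(3, 8):
    print(f"M = {M}: reciprocal accurate to about {worst_case_bits(M):.2f} bits")
```

The reciprocal alone comes out slightly better than 2M+2 bits. Multiplying by a numerator below 2 can cost up to one bit in the quotient.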

Key results from this analysis are shown in the figures below. They show the error in the Taylor series approximation for M = 3, 4, 5, 6, 7. The plots in the left column show the histogram of the error in bits, while those in the right column show the magnitude of the error. It can be seen that the error is always negative, which means that truncating the Taylor series always results in an approximation that is smaller than the true result. This insight is used later for selecting the ROM patterns.

Figure: Error in Quotient for M = 3, 4, 5 and 6. Left column: Probability vs Error (in Bit). Right column: Probability vs Error Magnitude.

Figure: Error in Quotient for M = 7. Left: Probability vs Error (in Bit). Right: Probability vs Error Magnitude.

Looking at the histograms, we can also see that accuracy improves as M increases (visible from the rightward shift of the histograms). The increase in accuracy of the approximation as M increases is shown in the plot below. The accuracy depends linearly on M, roughly as (2M+1).

Figure: 2-term Taylor Approximation accuracy vs M. Axes: Quotient Precision (Bits) vs M (Bits).

It can be seen that M=6 is enough to give us the desired accuracy of 13 bits in the initial approximation. M=5 does not provide enough accuracy while M=7 consumes unnecessary ROM area. Therefore, we selected M=6 in our implementation.

The next step was to select the width of the ROM word (W) and a strategy to fit the actual approximation to this finite word size. Again we simulated, for M=6, different values of W for 3 different strategies:
1. Truncating the approximation to W bits (used in [1])
2. Rounding the approximation to W bits
3. Ceiling the approximation to W bits

Figure: ROM Width vs Accuracy (M=6) for Truncation, Rounding and Ceiling. Axes: Quotient Precision vs ROM Width (Bits).

The results from the simulation are shown above. We find that rounding gives better accuracy for smaller W but ceiling gives superior results when W is sufficiently large. This can be understood by recalling that the Taylor series approximation by itself underestimates the result. Therefore, doing a ceiling operation tends to compensate for the error introduced by the Taylor series approximation. On the other hand, both rounding and, more so, truncation add to the Taylor series error and therefore do not perform as well as ceiling. This is visible in the error histograms below for (M=6, W=15), where the ceiling operation provides a symmetric behavior around 0. For this reason, we ceiled the approximation to 15 bits in our implementation. Appendix B lists the contents of the ROM as used in our design.

Figure: Error histograms for Floor (M=6, W=15), Round (M=6, W=15) and Ceiling (M=6, W=15). Axes: Probability vs Error in Quotient.

At small W, however, the error introduced by the finite word width dominates the total error and therefore rounding gives the best results due to its symmetric error around zero.
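The effect of the three quantisation strategies on the sign of the error can be reproduced with a short experiment. The following Python sketch (not the report's MATLAB) quantises the Taylor ROM entry for M=6, W=15 by floor, round or ceiling and reports the range of the signed error of the initial reciprocal; the 24-bit operand width, the sampling step and the helper names are assumptions of this example, and the operand modifier is left unquantised.

```python
import math
from fractions import Fraction

def exact_rom_value(address, M):
    """Unquantised table entry 1/(D1 + 2^-(M+1))^2 for D1 = 1 + address/2^M."""
    return 1 / (1 + Fraction(address, 2**M) + Fraction(1, 2**(M + 1))) ** 2

def rom_word(address, M, W, mode):
    """Integer ROM word holding W fraction bits of the entry."""
    scaled = exact_rom_value(address, M) * 2**W
    if mode == "floor":
        return math.floor(scaled)
    if mode == "round":
        return math.floor(scaled + Fraction(1, 2))
    return math.ceil(scaled)

def error_range(M, W, mode, K=24, step=2**12):
    """Smallest and largest signed error (approximation minus true reciprocal)."""
    errors = []
    for frac in range(0, 2**K, step):
        address = frac >> (K - M)
        modified = 1 + Fraction(frac ^ ((1 << (K - M)) - 1), 2**K)
        approx = Fraction(rom_word(address, M, W, mode), 2**W) * modified
        errors.append(approx - 1 / (1 + Fraction(frac, 2**K)))
    return min(errors), max(errors)

for mode in ("floor", "round", "ceil"):
    lo, hi = error_range(6, 15, mode)
    print(f"{mode:5s}: min {float(lo):+.2e}  max {float(hi):+.2e}")
```

With truncation every error is negative, because it adds to the Taylor underestimate. With ceiling the errors fall on both sides of zero, which is the compensation effect described above.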

The next parameter to decide was the width of the modified operand (WX). Given our previous selection of M=6 and W=15, we simulated various values of WX and the result is given below.

Figure: XP Width vs Accuracy (M=6, W=15). Axes: Quotient Precision (Bits) vs XP Width (Bits).

We find that WX=15 gives us more than 12.5 bits of accuracy, which still leaves us some margin to account for finite word effects and error amplification. WX=14 is marginal at 12 bits, while WX=16 would result in an unnecessarily large multiplier. Therefore, we chose to use WX=15.

The selection of parameters M=6, W=15, WX=15 gives us an accurate enough approximation. The next step was to decide the width of the data path (WD) in the Newton Raphson iteration. We determined that WD=27 was the smallest value that gave us 24 bits of accuracy in the final result. Therefore we chose WD=27.

For the other designs, we kept the various parameters as proposed in those designs and made some modifications to convert their scheme to a 24 bit wide multiplier.
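The complete datapath with the chosen parameters can be simulated bit-accurately. The sketch below is a Python rendering (not the authors' MATLAB from Appendix A) of the improved design as read from the text: a ceiled 15-bit ROM word, a 15-bit modified operand, a truncated initial product, and one Newton Raphson step on a 27-bit datapath. The width of the initial product (`WM`), the random seed and the sample size are assumptions of this example.

```python
import math
import random
from fractions import Fraction

M, W, WX, WM, WD, N_BITS = 6, 15, 15, 15, 27, 24   # widths assumed for the improved design

def truncate(x, bits):
    return Fraction(math.floor(x * 2**bits), 2**bits)

def round_to(x, bits):
    return Fraction(math.floor(x * 2**bits + Fraction(1, 2)), 2**bits)

def rom(address):
    """Ceiled W-bit table entry for 1/(D1 + 2^-(M+1))^2."""
    c = 1 / (1 + Fraction(address, 2**M) + Fraction(1, 2**(M + 1))) ** 2
    return Fraction(math.ceil(c * 2**W), 2**W)

def divide(n_frac, d_frac):
    """Approximate (1+n)/(1+d) for N_BITS-bit integer fractions n_frac and d_frac."""
    n = 1 + Fraction(n_frac, 2**N_BITS)
    d = 1 + Fraction(d_frac, 2**N_BITS)
    address = d_frac >> (N_BITS - M)
    inverted = d_frac ^ ((1 << (N_BITS - M)) - 1)      # operand modifier
    xp = truncate(1 + Fraction(inverted, 2**N_BITS), WX)
    p0 = truncate(xp * rom(address), WM)               # initial approximation
    p0_squared = truncate(p0 * p0, WD)
    p1 = 2 * p0 - round_to(p0_squared * d, WD)         # one Newton Raphson step
    return round_to(p1 * n, N_BITS)

rng = random.Random(0)
worst = Fraction(0)
for _ in range(2000):
    n_frac, d_frac = rng.randrange(2**N_BITS), rng.randrange(2**N_BITS)
    exact = (1 + Fraction(n_frac, 2**N_BITS)) / (1 + Fraction(d_frac, 2**N_BITS))
    worst = max(worst, abs(divide(n_frac, d_frac) - exact))
print(f"worst-case accuracy over the sample: {-math.log2(float(worst)):.2f} bits")
```

Exact fractions stand in for the fixed-point registers, so every `truncate` and `round_to` call corresponds to one finite width stage of the pipeline.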

6. Results

The tables below list a quantitative comparison of the three Newton-Raphson divider implementations published in the literature [1, 2, and 3] with our improved implementation.

Options Considered:
Design 1: The design based on the work of Chen et al. [1]
Design 2: The design based on the work of Kucukkabak et al. [2]
Design 3: The design based on the work of Fowler et al. [3]
Our Design: The design based on improved table lookup

Complexity:

Implementation   ROM Size     Logic Gates
Design 1         2 Kbits      953
Design 2         20 Kbits     438
Design 3         112 Kbits    3937
Our Design       0.94 Kbits   9

Accuracy:

Implementation   Worst Accuracy (Bits)
Design 1         24.5
Design 2         24.3
Design 3         24.4
Our Design       24.7

Speed (Latency and Throughput):

Implementation   Latency   Pipelining   Throughput
Design 1                                1/Tck
Design 2                                1/(3Tck)
Design 3                                1/(Tck)
Our Design                              1/(Tck)

As can be seen, our design uses the smallest amount of ROM while still meeting the desired accuracy. It has the same latency as [1] and can be fully pipelined.
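The ROM sizes in the complexity comparison follow directly from the number of address bits and the word width. A small Python sketch of that arithmetic is given below; the (address bits, word width) pairs are assumptions read from the hardware summaries in the Appendix A code headers, and `rom_kbits` is a helper name introduced here.

```python
def rom_kbits(address_bits, word_bits):
    """ROM capacity in Kbits for 2^address_bits words of word_bits bits each."""
    return (2 ** address_bits) * word_bits / 1024

# (address bits, word width) as assumed from the Appendix A headers
designs = {
    "Design 1": (7, 16),
    "Design 2": (10, 20),
    "Design 3": (13, 14),
    "Our Design": (6, 15),
}
for name, (m, w) in designs.items():
    print(f"{name}: {rom_kbits(m, w):.2f} Kbits")
```

For the improved design this gives 64 words of 15 bits, i.e. 960 bits or about 0.94 Kbits, which is less than half of the 2 Kbits computed for Design 1.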

The benefit that we achieved by using ceiling instead of truncation can be seen in the error histograms below. The first histogram is for our implementation while the second is the histogram that would be obtained if we had used truncation instead. It can be seen that the truncated ROM does not meet the accuracy requirement of 24 bits.

Figure: Improved Design (M=6, W=15, 27 bit datapath). Axes: Probability vs Error (in bit).

Figure: Effect of truncated ROM values (M=6, W=15, 27 bit datapath). Axes: Probability vs Error (in bit).

7. Conclusions

In this project we studied, simulated and compared three divider implementations based on Newton Raphson reciprocal division. We also proposed an improved implementation that provides better accuracy while using a smaller ROM than the published methods.

8. References

[1] Dongdong Chen; Bintian Zhou; Zhan Guo; Nilsson, P., "Design and implementation of reciprocal unit," Circuits and Systems, 2005. 48th Midwest Symposium on, 7-10 Aug. 2005
[2] Kucukkabak, U. and Akkas, A. 2004. Design and Implementation of Reciprocal Unit Using Table Look-up and Newton-Raphson Iteration. In Proceedings of the Digital System Design, EUROMICRO Systems on (DSD'04) (August 31 - September 3, 2004). IEEE Computer Society, Washington, DC
[3] Fowler, D.L.; Smith, J.E., "An accurate, high speed implementation of division by reciprocal approximation," Computer Arithmetic, 1989., Proceedings of 9th Symposium on, pp. 60-67, 6-8 Sep 1989
[4] Oberman, S.F.; Flynn, M.J., "Division algorithms and implementations," Computers, IEEE Transactions on, vol. 46, no. 8, Aug 1997
[5] Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, October 1999

Appendix A (MATLAB Code)

% 24bit / 24bit = 24 bit Newton Raphson Divider
% MATLAB code implementing the strategy as published in
% Dongdong Chen; Bintian Zhou; Zhan Guo; Nilsson, P., "Design and
% implementation of reciprocal unit," Circuits and Systems, 2005.
% 48th Midwest Symposium on, 7-10 Aug. 2005
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.5 bits
% Hardware Needed:
%   ROM Size    : 2^7 x 16 bits
%   Multipliers : 4
%       16 x 15 = 16 truncated multiplier
%       16 x 16 = 27 truncated squarer
%       27 x 24 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
%   Adder       : 1
%       27 - 27 = 27 subtractor
% Performance: 5 cycles (1+1+1+1+1)
function Design1()
M = 7; W = 2*M + 2; WX = 2*M; WM = 2*M + 2; WD = 27; N = 24;
NUM_SAMPLES = 7;
for i = 1:NUM_SAMPLES
    n = round(rand*(2^N))/2^N;      % fraction of the numerator
    d = round(rand*(2^N))/2^N;      % fraction of the denominator
    Q = (1+n)/(1+d);
    Q_scale = 1;
    if (Q < 1.0)
        Q_scale = 2;
        Q = Q*2;
    end
    Q = Q-1;
    % Approximation by rom lookup followed by multiplier generates
    % WM bit wide approximation p0
    % read rom - get fraction
    romvalue = rom(floor(d*2^M), M, W);
    % determine 1.x1x2x3..xm xm+1' xm+2' ...
    xp = truncate((1 + (floor(d*2^M))/2^M + (1 - ((d - ((floor(d*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX);
    % get initial approx
    p0 = truncate(xp*romvalue, WM);         % 16 x 15 = 16 truncated multiplier
    % Iteration of Newton Raphson Method
    p0_squared = truncate(p0*p0, WD);       % 16 x 16 = 27 truncated squarer
    i1 = roundto(p0_squared*(1+d), WD);     % 27 x 24 = 27 rounded multiplier
    p1 = 2*p0 - i1;                         % 27 - 27 = 27 subtractor
    % Final Multiplication
    QNR = roundto(p1*(1+n), 24);            % 27 x 24 = 24 rounded multiplier
    Err_NR(i) = QNR - (Q+1)/Q_scale;
end
Err_NR = -log2(abs(Err_NR) + 1e-15);        % accuracy in bits
minNR = min(Err_NR);
X = 9.5:35.5;
counts = hist(Err_NR, X);
counts = counts/length(Err_NR);
bar(X, counts);
title('Design1'); xlabel('Error (in bit)'); ylabel('Probability');
fprintf(1, 'Max Error is %f\n', minNR);

% Function: rom
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf(1, 'Error: Invalid address to rom: %d\n', add);
    error('quiting');
end
x = 1 + add/2^M;
c = 1/(x + 2^(-M-1))^2;
value = floor(c*2^(W))/2^(W);

% Function: truncate (keep N fraction bits)
function value = truncate(x, N)
value = floor(x*(2^N))/(2^N);

% Function: roundto (round to N fraction bits; renamed from round so that
% the built-in one-argument round used above is not shadowed)
function value = roundto(x, N)
value = floor(x*(2^N)+.5)/(2^N);

% 24bit / 24bit = 24 bit Newton Raphson Divider
% MATLAB code implementing the strategy as published in
% Kucukkabak, U. and Akkas, A. 2004. Design and Implementation of
% Reciprocal Unit Using Table Look-up and Newton-Raphson Iteration.
% In Proceedings of the Digital System Design, EUROMICRO Systems on
% (DSD'04) (August 31 - September 3, 2004). IEEE Computer Society,
% Washington, DC
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.3 bits
% Hardware Needed:
%   ROM Size    : 2^10 x 20 bits
%   Multipliers : 2
%       27 x 27 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
% Performance: 5 cycles (1+3+1)
function Design2()
M = 10; W = 2*M; WX = 2*M; WD = 27; N = 24;
NUM_SAMPLES = 7;
for i = 1:NUM_SAMPLES
    n = round(rand*(2^N))/2^N;      % fraction of the numerator
    d = round(rand*(2^N))/2^N;      % fraction of the denominator
    Q = (1+n)/(1+d);
    Q_scale = 1;
    if (Q < 1.0)
        Q_scale = 2;
        Q = Q*2;
    end
    Q = Q-1;
    % Approximation by rom lookup followed by multiplier generates
    % the initial approximation p0
    % read rom - get fraction
    romvalue = rom(floor(d*2^M), M, W);
    % determine 1.x1x2x3..xm xm+1' xm+2' ...
    xp = truncate((1 + (floor(d*2^M))/2^M + (1 - ((d - ((floor(d*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX);
    % get initial approx
    p0 = roundto(xp*romvalue, WD);          % 27 x 27 = 27 rounded multiplier
    % Iteration of Newton Raphson Method
    p1 = roundto(p0*(1+d), WD);             % same multiplier
    p1b = (2 - p1 - (2^(-WD-1)));
    p2 = roundto(p1b*p0, WD);               % same multiplier
    % Final Multiplication
    QNR = roundto(p2*(1+n), 24);            % 27 x 24 = 27 rounded multiplier
    Err_NR(i) = QNR - (Q+1)/Q_scale;
end
Err_NR = -log2(abs(Err_NR) + 1e-15);        % accuracy in bits
minNR = min(Err_NR);
X = 9.5:35.5;
counts = hist(Err_NR, X);
counts = counts/length(Err_NR);
bar(X, counts);
title('Design2'); xlabel('Error (in bit)'); ylabel('Probability');
fprintf(1, 'Max Error is %f\n', minNR);

% Function: rom
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf(1, 'Error: Invalid address to rom: %d\n', add);
    error('quiting');
end
x = 1 + add/2^M;
c = 1/(x + 2^(-M-1))^2;
value = floor(c*2^(W))/2^(W);

% Function: truncate (keep N fraction bits)
function value = truncate(x, N)
value = floor(x*(2^N))/(2^N);

% Function: roundto (round to N fraction bits)
function value = roundto(x, N)
value = floor(x*(2^N)+.5)/(2^N);
% EOF

% 24bit / 24bit = 24 bit Newton Raphson Divider
% MATLAB code implementing the strategy as published in
% Fowler, D.L.; Smith, J.E., "An accurate, high speed implementation of
% division by reciprocal approximation," Computer Arithmetic, 1989.,
% Proceedings of 9th Symposium on, pp. 60-67, 6-8 Sep 1989
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.4 bits
% Hardware Needed:
%   ROM Size    : 2^13 x 14 bits
%   Multipliers : 3
%       14 x 24 = 27 rounded multiplier
%       14 x 27 = 27 rounded multiplier
%       27 x 24 = 27 rounded multiplier
% Performance: 3 cycles (1+1+1)
function Design3()
M = 13; W = 14; WD = 27; N = 24;
NUM_SAMPLES = 7;
for i = 1:NUM_SAMPLES
    n = round(rand*(2^N))/2^N;      % fraction of the numerator
    d = round(rand*(2^N))/2^N;      % fraction of the denominator
    Q = (1+n)/(1+d);
    Q_scale = 1;
    if (Q < 1.0)
        Q_scale = 2;
        Q = Q*2;
    end
    Q = Q-1;
    % Approximation by direct rom lookup generates the initial approximation p0
    % read rom - get fraction
    p0 = rom(floor(d*2^M), M, W);
    % Iteration of Newton Raphson Method
    p1 = roundto(p0*(1+d), WD);             % 14 x 24 = 27 rounded multiplier
    p1b = (2 - p1 - (2^(-WD-1)));
    p2 = roundto(p1b*p0, WD);               % 14 x 27 = 27 rounded multiplier
    % Final Multiplication
    QNR = roundto(p2*(1+n), 24);            % 27 x 24 = 27 rounded multiplier
    Err_NR(i) = QNR - (Q+1)/Q_scale;
end
Err_NR = -log2(abs(Err_NR) + 1e-15);        % accuracy in bits
minNR = min(Err_NR);
X = 9.5:35.5;
counts = hist(Err_NR, X);
counts = counts/length(Err_NR);
bar(X, counts);
title('Design3'); xlabel('Error (in bit)'); ylabel('Probability');
fprintf(1, 'Max Error is %f\n', minNR);

% Function: rom
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf(1, 'Error: Invalid address to rom: %d\n', add);
    error('quiting');
end
x = 1 + add/2^M;
c = (1/(x + 2^(-M-1))) + (2^(-M-2));
value = floor(c*2^(W))/2^(W);

% Function: truncate (keep N fraction bits)
function value = truncate(x, N)
value = floor(x*(2^N))/(2^N);

% Function: roundto (round to N fraction bits)
function value = roundto(x, N)
value = floor(x*(2^N)+.5)/(2^N);
% EOF

% 24bit / 24bit = 24 bit Newton Raphson Divider
% MATLAB code implementing our Newton Raphson based divider which uses
% a smaller ROM and gets better accuracy in less area
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.7 bits
% Hardware Needed:
%   ROM Size    : 2^6 x 15 bits
%   Multipliers : 4
%       16 x 15 = 15 truncated multiplier
%       15 x 15 = 27 truncated squarer
%       27 x 24 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
%   Adder       : 1
%       27 - 27 = 27 subtractor
% Performance: 5 cycles (1+1+1+1+1)
function Divider()
M = 6; W = 2*M + 3; WX = 2*M + 3; WM = 2*M + 3; WD = 27; N = 24;
NUM_SAMPLES = 7;
for i = 1:NUM_SAMPLES
    n = round(rand*(2^N))/2^N;      % fraction of the numerator
    d = round(rand*(2^N))/2^N;      % fraction of the denominator
    Q = (1+n)/(1+d);
    Q_scale = 1;
    if (Q < 1.0)
        Q_scale = 2;
        Q = Q*2;
    end
    Q = Q-1;
    % Approximation by rom lookup followed by multiplier generates
    % WM bit wide approximation p0
    % read rom - get fraction
    romvalue = rom(floor(d*2^M), M, W);
    % determine 1.x1x2x3..xm xm+1' xm+2' ...
    xp = truncate((1 + (floor(d*2^M))/2^M + (1 - ((d - ((floor(d*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX);   % 9 Inv
    % get initial approx
    p0 = truncate(xp*romvalue, WM);         % 16 x 15 = 15 truncated multiplier
    % Iteration of Newton Raphson Method
    p0_squared = truncate(p0*p0, WD);       % 15 x 15 = 27 truncated squarer
    i1 = roundto(p0_squared*(1+d), WD);     % 27 x 24 = 27 rounded multiplier
    p1 = 2*p0 - i1;                         % 27 - 27 = 27 subtractor
    % Final Multiplication
    QNR = roundto(p1*(1+n), 24);            % 27 x 24 = 24 rounded multiplier
    Err_NR(i) = QNR - (Q+1)/Q_scale;
end
Err_NR = -log2(abs(Err_NR) + 1e-15);        % accuracy in bits
minNR = min(Err_NR);
X = 9.5:35.5;
counts = hist(Err_NR, X);
counts = counts/length(Err_NR);
bar(X, counts);
title('Improved Design (M=6, W=15, 27 bit datapath)');
xlabel('Error (in bit)'); ylabel('Probability');
fprintf(1, 'Max Error is %f\n', minNR);

% Function: rom (ceiled entries compensate the Taylor underestimate)
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf(1, 'Error: Invalid address to rom: %d\n', add);
    error('quiting');
end
x = 1 + add/2^M;
c = 1/(x + 2^(-M-1))^2;
value = ceil(c*2^(W))/2^(W);

% Function: truncate (keep N fraction bits)
function value = truncate(x, N)
value = floor(x*(2^N))/(2^N);

% Function: roundto (round to N fraction bits)
function value = roundto(x, N)
value = floor(x*(2^N)+.5)/(2^N);
% EOF

32 Matlab code to deterine error inherent in Taylor series expansion M = 7; N = 5; practically infinite precision NUM_SAMPLES = 5; for i=:num_samples Nr = round((+rand)*(^n))/^n; r = round((+rand)*(^n))/^n; precise Quotient Q = (Nr)/(r); Floating point division based on taylor series x = floor((r)*^m)/^m; x = r - floor(r*^m)/^m; Quotient obtained fro first two ters of Taylor series expansion Q_Taylor = /(x + ^(-M-)) - (/(x + ^(-M-))^)*(x - ^(-M-)); Error in Taylor series approxiation Err(i) = Nr*Q_Taylor - Q; figure(); plot(err); [N, X] = hist(err, ); N = N/length(Err); plot(x, N, 'o-'); axis([.7*in(x).5*ax(n)]); title('error in Quotient M=7'); xlabel('error Magnitude'); ylabel('probability'); grid on; Err = -log(-err+e-3); avg = ean(err); var = std(err)^; ax = ax(err); in = in(err); fprintf(, ' M = d\n', M); fprintf(, ' avg =.4f\n var =.4f\n ax =.4f\n in =.4f\n\n', avg, var, ax, in); figure(); X = 6.5:5.5; [N] = hist(err, X); N = N/length(Err); bar(x, N); axis([5 5.5*ax(N)]); title('error in Quotient M=7'); xlabel('error (in Bit)'); ylabe('probability'); 3

MATLAB code to understand accuracy trade-offs with ROM size

function ROMAccuracy()
M = 6;
N = 5;
LSB = *M+;
NUM_SAMPLES = ;
for j = 1:
    W = 2*M+j-3;
    for i = 1:NUM_SAMPLES
        % fraction parts of numerator and divisor; named Nf and Df so the precision N is not overwritten
        Nf = round(rand*(2^N))/2^N;
        Df = round(rand*(2^N))/2^N;
        Q = (1+Nf)/(1+Df);
        Q_scale = 1;
        if (Q < 1.0)
            Q_scale = 2;
            Q = Q*2;
        end
        Q = Q-1;
        % division based on ROM followed by a multiplier
        % read rom - get fraction
        romvalue = rom(floor(Df*2^M), M, W);
        % determine 1.x1x2x3..xm x'm+1 x'm+2 ... (low-order bits complemented)
        xp = 1 + (floor(Df*2^M))/2^M + (1 - ((Df - ((floor(Df*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M;
        xp = floor((1 + (floor(Df*2^M))/2^M + (1 - ((Df - ((floor(Df*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M)*2^(W+3))/2^(W+3);
        % get initial approximation
        p = xp*romvalue;
        % only W bits after the binary point are kept
        p = floor(p*2^W)/(2^W);
        Q_rom = p*(1+Nf);
        Err_ROMf(i) = (Q_rom(1) - (1+Q)/Q_scale);
        Err_ROMr(i) = (Q_rom(2) - (1+Q)/Q_scale);
        Err_ROMc(i) = (Q_rom(3) - (1+Q)/Q_scale);
    end
    Err_ROMf = -log2(abs(Err_ROMf) + e-4);
    Err_ROMr = -log2(abs(Err_ROMr) + e-4);
    Err_ROMc = -log2(abs(Err_ROMc) + e-4);
    minf(j) = min(Err_ROMf);
    minr(j) = min(Err_ROMr);
    minc(j) = min(Err_ROMc);
end
X = 2*M + (1:) - 3;
plot(X, minf, 'ro-');
hold on;
plot(X, minr, 'gx-');
hold on;
plot(X, minc, 'b^-');
title('ROM Width vs Accuracy (M=6)');
xlabel('ROM Width (Bits)');
ylabel('Quotient Precision');
legend('Truncation', 'Rounding', 'Ceiling');
grid on;

Function: rom

function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf(1, 'Error: Invalid address to rom: %d\n', add);
    error('quitting');
end
x = 1 + add/2^M;
c = 1/(x + 2^(-M-1))^2;
% truncated, rounded and ceiling versions of the W-bit ROM word
value(1) = floor(c*2^(W))/2^(W);
value(2) = round(c*2^(W))/2^(W);
value(3) = ceil(c*2^(W))/2^(W);

Appendix B (ROM Pattern)

Address ROM Contents (5 bits) (6 bits) Binary Hex 7E6 7A35 768F FB 5 6C8B F 8 63BF 9 6 5E77 5BFA B F5 6 5E8 7 4EEF B33 496E 47B B 4 4F EA BF9 3 3AB E A 35 34F E9 37 3E5 38 3E8 39 3F
4 FB 4 E3A 43 5F 44 C8A 45 BBA 46 AF 47 AC B3 5 7FE 5 74E 5 6A 53 5FA B7 56 4C F A 6 C
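As a cross-check on the table above, the ROM words can be regenerated from the expression used in the rom function. The sketch below is not the authors' script. It assumes M = 6 address bits, 15 fractional bits per word, and the ceiling variant; these choices are assumptions, picked because they reproduce entries that survive in the table (7A35, 768F, 6C8B). The names addr, x1, word and show are illustrative.

```matlab
% Sketch: regenerate the reciprocal-square ROM words (assumed M = 6, W = 15, ceiling).
M = 6;                          % address width, as in the ROMAccuracy listing
W = 15;                         % assumed number of fractional bits per ROM word
addr = (0:2^M - 1)';            % every ROM address
x1 = 1 + addr / 2^M;            % divisor truncated to its leading M fraction bits
c = 1 ./ (x1 + 2^(-M-1)).^2;    % same expression as in rom()
word = ceil(c * 2^W);           % ceiling variant, i.e. value(3) in rom(), as an integer

show = [0 1 2 5];               % a few addresses to print
for a = show
    fprintf('%2d  %s\n', a, dec2hex(word(a + 1), 4));
end
% prints 7E06, 7A35, 768F and 6C8B for addresses 0, 1, 2 and 5
```

Switching ceil to floor or round gives the other two variants compared in the ROM-width experiment.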


More information

Principal Components Analysis

Principal Components Analysis Principal Coponents Analysis Cheng Li, Bingyu Wang Noveber 3, 204 What s PCA Principal coponent analysis (PCA) is a statistical procedure that uses an orthogonal transforation to convert a set of observations

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

The Methods of Solution for Constrained Nonlinear Programming

The Methods of Solution for Constrained Nonlinear Programming Research Inventy: International Journal Of Engineering And Science Vol.4, Issue 3(March 2014), PP 01-06 Issn (e): 2278-4721, Issn (p):2319-6483, www.researchinventy.co The Methods of Solution for Constrained

More information

UCSD Spring School lecture notes: Continuous-time quantum computing

UCSD Spring School lecture notes: Continuous-time quantum computing UCSD Spring School lecture notes: Continuous-tie quantu coputing David Gosset 1 Efficient siulation of quantu dynaics Quantu echanics is described atheatically using linear algebra, so at soe level is

More information

Genetic Quantum Algorithm and its Application to Combinatorial Optimization Problem

Genetic Quantum Algorithm and its Application to Combinatorial Optimization Problem Genetic Quantu Algorith and its Application to Cobinatorial Optiization Proble Kuk-Hyun Han Dept. of Electrical Engineering, KAIST, 373-, Kusong-dong Yusong-gu Taejon, 305-70, Republic of Korea khhan@vivaldi.kaist.ac.kr

More information

An Approximate Model for the Theoretical Prediction of the Velocity Increase in the Intermediate Ballistics Period

An Approximate Model for the Theoretical Prediction of the Velocity Increase in the Intermediate Ballistics Period An Approxiate Model for the Theoretical Prediction of the Velocity... 77 Central European Journal of Energetic Materials, 205, 2(), 77-88 ISSN 2353-843 An Approxiate Model for the Theoretical Prediction

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

About the definition of parameters and regimes of active two-port networks with variable loads on the basis of projective geometry

About the definition of parameters and regimes of active two-port networks with variable loads on the basis of projective geometry About the definition of paraeters and regies of active two-port networks with variable loads on the basis of projective geoetry PENN ALEXANDR nstitute of Electronic Engineering and Nanotechnologies "D

More information