a fpga ieee-754-2008 decimal64 floating-point adder-subtractor

6
A FPGA IEEE-754-2008 DECIMAL64 FLOATING-POINT ADDER/SUBTRACTOR Carlos Minchola 1 , Martín Vazquez 2 and Gustavo Sutter 1 1 School of Engineering, Universidad Autónoma de Madrid, Madrid, Spain 2 INTIA Institute, Universidad Nacional del Centro de la Prov. de Bs. As., Argentina e-mail: [email protected]; [email protected]; [email protected] ABSTRACT This paper describes the FPGA implementation of a Decimal Floating Point (DFP) adder/subtractor. The design performs addition and subtraction on 64-bit operands that use the IEEE 754-2008 decimal encoding of DFP numbers and is based on a fully pipelined circuit. The design presents a novel hardware for pre-signal generation stage and an enhanced version of previously published leading zero stage. The design can operate at a frequency of 200 MHZ on a Virtex-5 with a latency of 8 cycles. The presented DFP adder/subtractor supports operations on the decimal64 format and it is easily extendable for the decimal128 format. To our knowledge, this is the first hardware FPGA design for adding and subtracting IEEE 754-2008 using decimal64 encoding. 1. INTRODUCTION Decimal arithmetic plays a key role in many commercial and financial applications, which process decimal values and perform decimal rounding. However, current software implementations are prohibitively slow [1], prompting hardware manufacturers such as IBM to add decimal floating point (DFP) arithmetic support to their microprocessors [2]. Furthermore, the IEEE has developed the newly IEEE 754-2008 [3] standard for Floating-Point Arithmetic adding the decimal representation to the IEEE 754-1985 standard. There are several works focused on fixed-point addition [4, 5]. For example, in [4, 5], Busaba, and Haller propose combined decimal and binary adders using pre-sum and pre-selection logic. Only a few previous recent papers focused on decimal floating-point addition [6-8]. The proposed adder by Cohen et al. [8] and Bohlender et al. [6] have long latencies and produces one result digit each cycle. In [7] presents the first IEEE 754 decimal floating point adder. Nevertheless, recently appears the first publications in decimal arithmetic applications on FPGA [9]. Several hardware designs were synthesized using other platforms than FPGA [10], this is why it is believed that it can be one of the first implementation on FPGA for addition/subtraction using decimal64 encoding. A recent work [9] presents a DFP adder for FPGA but using Binary Integer Decimal (BID) encoding, whose results will be used in our comparisons. Because of the BID adder occupies less area [9] than our proposed design, it may be argued that the BID format is well suited for hardware implementation but considering the latency-area tradeoff. The outline of the paper is as follows. In the next Section, it is described the background information on decimal floating-point. Section 3 describes the challenge of adding decimal64 encoded numbers, and presents the technique and theory for addition and subtraction. Section 4 presents synthesis results for our proposed adder and comparisons with the adder from [9]. Finally, Section 5 presents our conclusions. In the rest of this paper, AX and BX are the significands and EAX, EBX and EX are the exponents respectively. X is a digit that denotes the outputs of different units. The symbol (N) Z T refers to T th bit of the Z th digit in a number N, where the least significant bit and the least significant digit have index 0. For example, (A1) 2 5 is the fifth bit of the second BCD digit in A1. 2. DECIMAL FLOATING POINT IN IEEE 754-2008 The new IEEE 754-2008 [3] standard specifies formats for both binary floating-point (BFP) and decimal floating-point (DFP) numbers. The primary difference between two formats, besides the radix, is the normalization of the significands (coefficient or mantissa). BFP significands are normalized with the radix point to the right of the most significant bit (MSB), while DFP mantissa are not required to be normalized and are represented as integers. The IEEE 754-2008 standard specifies DFP formats of 32, 64, and 128 bits. A DFP number contains a sign bit, an integer significand with a precision of p digits, and a biased exponent q. The value of a finite DFP number is: D = -1 s x C x 10 q , q = E bias (1) Where s is the sign bit, C is the non-negative integer significand, and q the exponent. The exponent q is obtained as a function of biased non-negative integer exponent E. 978-1-4244-8848-3/11/$26.00 ©2011 IEEE 251

Upload: lavanyams

Post on 14-Apr-2015

257 views

Category:

Documents


2 download

DESCRIPTION

Decimal floating point adder

TRANSCRIPT

Page 1: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

A FPGA IEEE-754-2008 DECIMAL64 FLOATING-POINT ADDER/SUBTRACTOR

Carlos Minchola1, Martín Vazquez2 and Gustavo Sutter1

1School of Engineering, Universidad Autónoma de Madrid, Madrid, Spain 2INTIA Institute, Universidad Nacional del Centro de la Prov. de Bs. As., Argentina

e-mail: [email protected]; [email protected]; [email protected]

ABSTRACT

This paper describes the FPGA implementation of a Decimal Floating Point (DFP) adder/subtractor. The designperforms addition and subtraction on 64-bit operands thatuse the IEEE 754-2008 decimal encoding of DFP numbersand is based on a fully pipelined circuit. The designpresents a novel hardware for pre-signal generation stage and an enhanced version of previously published leadingzero stage. The design can operate at a frequency of 200 MHZ on a Virtex-5 with a latency of 8 cycles. The presented DFP adder/subtractor supports operations on the decimal64 format and it is easily extendable for the decimal128 format. To our knowledge, this is the first hardware FPGA design for adding and subtracting IEEE754-2008 using decimal64 encoding.

1. INTRODUCTION

Decimal arithmetic plays a key role in many commercial and financial applications, which process decimal values and perform decimal rounding. However, current software implementations are prohibitively slow [1], promptinghardware manufacturers such as IBM to add decimal floating point (DFP) arithmetic support to their microprocessors [2]. Furthermore, the IEEE has developedthe newly IEEE 754-2008 [3] standard for Floating-PointArithmetic adding the decimal representation to the IEEE 754-1985 standard.

There are several works focused on fixed-point addition[4, 5]. For example, in [4, 5], Busaba, and Haller propose combined decimal and binary adders using pre-sum and pre-selection logic. Only a few previous recent papers focused on decimal floating-point addition [6-8]. Theproposed adder by Cohen et al. [8] and Bohlender et al. [6]have long latencies and produces one result digit each cycle. In [7] presents the first IEEE 754 decimal floating point adder. Nevertheless, recently appears the first publications in decimal arithmetic applications on FPGA[9].

Several hardware designs were synthesized using otherplatforms than FPGA [10], this is why it is believed that it

can be one of the first implementation on FPGA foraddition/subtraction using decimal64 encoding. A recent work [9] presents a DFP adder for FPGA but using Binary Integer Decimal (BID) encoding, whose results will be usedin our comparisons. Because of the BID adder occupies lessarea [9] than our proposed design, it may be argued that the BID format is well suited for hardware implementation butconsidering the latency-area tradeoff.

The outline of the paper is as follows. In the nextSection, it is described the background information ondecimal floating-point. Section 3 describes the challenge ofadding decimal64 encoded numbers, and presents thetechnique and theory for addition and subtraction. Section 4presents synthesis results for our proposed adder and comparisons with the adder from [9]. Finally, Section 5presents our conclusions.

In the rest of this paper, AX and BX are the significands and EAX, EBX and EX are the exponents respectively. X isa digit that denotes the outputs of different units. Thesymbol �(N)Z

T� refers to Tth bit of the Zth digit in a number N, where the least significant bit and the least significant digit have index 0. For example, (A1)2

5 is the fifth bit of the second BCD digit in A1.

2. DECIMAL FLOATING POINT IN IEEE 754-2008

The new IEEE 754-2008 [3] standard specifies formats for both binary floating-point (BFP) and decimal floating-point (DFP) numbers. The primary difference between two formats, besides the radix, is the normalization of the significands (coefficient or mantissa). BFP significands arenormalized with the radix point to the right of the mostsignificant bit (MSB), while DFP mantissa are not requiredto be normalized and are represented as integers.

The IEEE 754-2008 standard specifies DFP formats of 32, 64, and 128 bits. A DFP number contains a sign bit, aninteger significand with a precision of p digits, and a biased exponent q. The value of a finite DFP number is:

D = -1s x C x 10q, q = E � bias (1)

Where s is the sign bit, C is the non-negative integer significand, and q the exponent. The exponent q is obtainedas a function of biased non-negative integer exponent E.

978-1-4244-8848-3/11/$26.00 ©2011 IEEE

251

Page 2: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

The mantissa is encoded in densely packed decimal [3],the exponent must be in the range [emin, emax], when biased by bias. Representations for infinity and not-a-number (NaN) are also provided. Representations offloating-point numbers in the decimal interchange formatsare encoded in k bits in the following three fields (Fig. 1):

Fig. 1. Decimal interchange floating-point format

a) 1-bit sign s. b) A w + 5 bit combination field G encoding

classification and, if the encoded datum is a finite number,the exponent q and four significand bits (1 or 3 of which areimplied). The biased exponent E is a w + 2 bit quantity q + bias, where the value of the first two bits of the biasedexponent taken together is either 0, 1, or 2.

c) A t-bit trailing significand field T that contains J × 10 bits and contains the bulk of the significand. J representsthe number of declets.

When this field is combined with the leading significand bits from the combination field, the format encodes a totalof p = 3 × J + 1 decimal digits. The values of k, p, t, w, andbias for decimal64 interchange formats are 16, 50, 12, and 398 respectively. That means that number has p=16 decimal digits of precision in the significand, an unbiased exponentrange of [�383, 384], and a bias of 398.

3. DECIMAL FLOATING-POINT ADDER/SUBTRACTOR IMPLEMENTATION

A general overview of proposed adder/subtractor isdescribed below. For the best performance, the designpresents eight pipelined stages as is exhibited in the Fig. 2.Arrows are used to show the direction of data flow, thedashed blocks indicate the main stages of the design, and the dotted line indicates the pipeline.

This architecture was proposed for the IEEE 754-2008 decimal64 format and can be extended for the decimal128 format. The adder/subtractor on decimal64 is carried out asfollows: The decoder unit takes the two 64-bit IEEE 754-2008 operands (OP1, OP2) to generate the sign bits (SA,SB), 16-digit BCD significands (A0, B0), 10-bit biasedexponents (EA, EB), the effective operation (EOP) and flags for specials values of NaN or infinity. The signalEOP defines the effective operation (EOP = 0 for effectiveaddition and EOP = 1 for effective subtraction), this signal is calculated as:

EOP = SA xor SB xor OP (2)

As soon as possible the decoded significands becomeavailable, the leading zero detection unit (LZD) takes theseresults and computes the temporary exponents (EA1, EB1) and the normalized coefficients (A1, B1). The swapping

unit swaps the operands (A1, B1) if EA1 < EB1 and generates the BCD coefficients A2 (with higher exponent,max(EA1, EB1)) and B2 (with lower exponent, min(EA1,EB1)). In parallel with the above mentioned, this unit generates an exponent difference (Ed = |EA1 - EB1|), theexponent E2 = max(EA1, EB1), the SWAP flag if a swapping process is carried out, and the right shift amount(RSA) which indicates how many digits B2 should be rightshifted in order to guarantee that both coefficients (A2, B2)have the same exponent.

Fig. 2. High-level Decimal Floating-Point Adder Diagram

The RSA is computed as follows:

if (Ed <= p_max) RSA = Edelse RSA = p_max

(3)(4)

The value p_max = 18 digits, RSA is limited to this value since B2 contains 16 digits plus two digits whichwill be processed to compute the guard and round digit.

Next, the Shifting unit receives as inputs the RSA, and the significand B2 generating a shifted B2 (B3) and a 2-bitsignal called predicted sticky-bit (PSB) that will predict two initials sticky bits. PSB and B3 will be utilized as inputs in the decimal addition, control signals generation and post-correction units, respectively.

252

Page 3: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

The outputs above mentioned plus two signals,significand A2 and EOP, are taken as inputs in the controlsignals generation unit and generates the signals necessaryto perform an addition or subtraction operation, these signals are described in the Sub-section 3.4 and are made up of a prior guard digit (GD1), a prior round digit (RD1),an extra digit (ED), a signal which verifies if A2 > B3(AGTB) and a carry into (CIN). The significands BCD (A2, B3) and the CIN are inputs the decimal addition unit generating the partial sum of magnitude |S1| = |A2 + (-1)EOP B3| and a carry out (COUT), respectively. At once, the 16-digit decimal addition unit takes the A2, B3, EOP and CIN and computes S1 as follows: S1 = A2 + B3 ifEOP = 0, S1 = A2 + cmp9(B3) if EOP = 1 and A2 >= B3,and S1 = cmp9(A2+cmp9(B3)) if EOP = 1 and A2 < B3.The symbol cmp9 means the 9`s complement.

The post-correction unit uses as inputs the PSB, the exponent E2, GD1, RD1, ED, the partial sum S1 and COUT to verify, correct and compute the inputs signals if only the following two cases occur: 1) COUT=1 andEOP=0, and 2) (S1)15=0 and (GD1 > 0) and (EOP=1). The analysis is explained in the Sub-section 3.5. This unitgenerates the final sticky bit (FSB), the corrected guard digit (GD2) and round digit (RD2), the final partial sum(S2) and the corrected exponent (E3).

Next, the Rounding unit takes the outputs of the prior unit and rounds S2 to produce the result´s significand S3and adjusts the exponent E3 to calculate the final exponent E4. Simultaneously the overflow, underflow and sign bitsignals are generated. The final sign bit is computed as:

FS = (SA � ~EOP) � (EOP � (AGTB � SA � SWAP)) (5)

Finally, the FS, E4 and S3 are processed to generate theIEEE 754-2008 result in the Encoder unit. This stage alsohandles special cases as not-a-number (NaN) or infinity.

Fig. 3. Alignment and swapping unit

3.1. IEEE 754-2008 Decoder/Coder

A DFP number represented into the IEEE 754-2008 interchange format (Fig.1) is unpacked into the three fields: sign, exponent and significands. A decimal64number (64 bit interchange format) is decomposed into a 1

bit sign, a 10-bit exponent (E - bias), and 16 digitsrepresented in BCD, i.e. 64 bits.

Then the decoder process is a combinational circuit thatunpacks 64 bits into 75 bits (1+10+64). Into the decoder function the special cases (infinity and NANs) are detected and signaled. The coder stage transforms the 75-bitrepresentation of the result into the 64-bit interchange format. The combinational circuit takes into account the special cases that can be detected by the special case detection circuitry. Additionally in this stage the overflowand underflow conditions are detected and signaled.

3.2. Alignment operation: leading zero detection andswapping

The alignment operation is made up of the leading zero detection (LZD) and swapping stages.

This unit reads the unpacked signals from the prior one, at once a LZD process is applied on both operands toestimate and remove the significant zeros (LZDA, LZDB) in order to obtain normalized significant digits in the results (A1, B1). The exponents are also normalized and set to EA1 = EA - LZDA and EB1 = EB - LZDB. The two significands results are swapped if EB1 > EB2 and a newexponent E2 is determined, simultaneously a SWAP flag isproduced to indentify the swapping (if swapping processoccurs then SWAP = 1 otherwise SWAP = 0). The LZD and swapping process generate outputs as the significandsA2 and B2, RSA and E2 which were explained the beginning of Section 3.

The implementation of the LZD and swapping units areshown in the Fig. 3.

The LZD design is based on low level combinationalcircuit which improves period of the data path, the proposed circuit is slighter than the proposed one in [11].The SUB-ABS process calculates |EA1-EB1| that uses a binary absolute value unit; this grants a reduced delay [12].

3.3. Shifting right

The B2 is right shifted by the RSA amount in order to bothsignificands (A2, B2), the results is stored in a 19-digit register (B3) in which the last three digits represent the predicted guard and round digit (GD1, RD1), and the extradigit (ED) respectively. The ED digit is used when a subtraction operation occurs. In the same way this module generates a 2-bit predicted sticky bit signal PSB(1:0). Actually this predicts two sticky-bit signals which are made up of the following way: SB1 = PSB(1) or PSB(0),SB2 = PSB(0) which will be selected by means ofconditions described in the post-correction stage. PSB is produced concatenating the least significant digit (LSD) ofB3 ((B3)0) and the or-operation of the digits of B2 that are not considered in B3.

253

Page 4: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

3.4. Control signal generation

After processing the previous stage, this one takes asinputs: ISB, EOP, and the significands B3, A2. The adder/subtractor operation is controlled by the extra signalsgenerated in this unit, these ones are: an initial carry (CIN), previous guard and round digit (GD1, RD1), an additionaldigit (ED) and a signal that verifies if A2 >= B3.

In parallel in the adder unit the A1 operand and the B3�s 16 most significant digits (MSD) are registered, this process will be described in the next sub-section. The goalis to compute the mentioned signals: guard digit (GD1),round digit (RD1) and extra digit for subtraction (ED).

The final alignment is exhibited in Fig. 4, both operandsare placed starting from the MSD thus the first of them is equal to A2 and the second one equal to (B3)18 down to 1 , the initial value of the extra digit (IED) is set to (B3)0 .

As a matter of fact this stage just predicts signals outputs, the corrected and definitive values are generatedin the post-correction unit. The implementation of this stage is fully combinational.

3.5. Decimal Addition/Subtraction design

Because both operands (A2, (B3)18 down to 3) are computed itis proposed a 16-digit decimal adder/subtractor. First, it isnecessary to correct B3: if subtraction operation is selected(EOP = 1) then (B3corrected)i = 9 - (B3)i, otherwise (B3)i.Having as inputs signals the two operands above mentioned (A2, B3corrected), EOP and AGTB, the goal of the implementation is to calculate the corresponding partial sum (S1 = A2 + (B3 corrected)18 down to 3).

The proposed algorithm is based on 10´s complement BCD numbers, and in the carry-chains techniques to carryout the fully design. The carry-chain adding consists in computing beforehand all carry propagating and carrygenerating conditions, in order to reduce the overall execution time. It has been considered the high-performance adder proposed in [13].

3.6. Post-Correction

The temporary result generated from the adder unitrequires a post-correction unit to convert the uncorrectedresult. As it is indicated at the beginning of Section 3, thisstage uses as inputs the PSB, the exponent E2, GD1, RD1, ED, the partial sum S1 and COUT. The outputs (correctedinputs) are designated as final sticky-bit (FSB), final guard digit (GD2), final round digit (RD2), corrected partial sumS1 (S2), and updated exponent E2 (E3). The conditions for performing this correction are defined below:

1.- If EOP = 0 and COUT = 1.Condition enforced when performing an effective

addition operation and COUT = 1, therefore a correction is applied to S1. In order to fit 16 digits in S1 then COUT is

considered as MSD of S1 plus 15 digits MSD of one, thisresult is stored in S2. As well the guard digit final (GD2) isdetermined as the LSD of S1. Likewise the final round digit (RD2) is updated to GD1, the final exponent E3 is corrected to E2 + 1 and the new value of the final sticky bit (FSB) is equal to bit generated by the or-operation: PSB(0)or PSB(1) or RD1.

2.- If EOP = 1 and MSD of S1 = 0 and GD1 > 0.Condition necessary when performing an effective

subtraction operation, the MSD of S1 is equal to 0 and guard digit input is greater than zero. The result S2 is madeup of 15 digits LSD of S1 plus GD1 as LSD of S2. The final guard digit GD2 is considered equal to RD1, the finalround digit RD2 equal to extra digit (ED), final exponent(E3) is update to E2 - 1 and the new sticky bit (FSB) is set to PSB(0).

3.- If EOP = 1 and MSD of S1 = 0 and GD1 = 0.Condition enforced when performing an effective

subtraction operation, the MSD of S1 is 0 and guard digitinput is equal to 0. The result S2 is equal to S1. The finalguard digit GD2 is equal to GD1, the final round digit RD2equal to RD1, final exponent (E3) is update to E2 and thenew sticky bit (FSB) is set to bit generated by the or-operation: PSB(0) or PSB(1).

A 2 + B 3 C o m p o n e n t

(A 2 ) 1 5 (A 2 )1 4 (A 2 )1 3 � . (A 2 ) 3 (A 2 ) 2 (A 2 )1 (A 2 ) 0

1 6 D ig its (A 2 , B 3 )

R D 1 = P rio r R o u n d D ig itG D 1 = P rio r G u a rd D ig itIE D = In itia l E x tra D ig it = (B 3 )oA G T B = �0 � if A > B , �1 � o th e rw ise

E O P

A d d itio n (e o p = 0 ) / S u b tra c tio n (e o p = 1 )

(B 3 )1 8 (B 3 )1 7 (B 3 ) 1 6 � . (B 3 )6 (B 3 ) 5 (B 3 )4 (B 3 )3 (B 3 )2 (B 3 )1

A G T BC IN

A 2

B 3

M S D G D 1L S D R D 1

2 D ig its

Fig. 4. Example of Alignment for A2 and B3 registers.

3.7. Rounding

This unit takes as inputs the outputs of the previous one, isa simple stage that receives verified signals beforehand. In the rounding process, when no overflow or underflow ispresent (the stage detects the underflow and overflow casesin decimal adder) the rounding must be performed when allof the significant digits of the S2 do not fit within the pdigits (p=16) of the result´s significands. The roundingexamines the inputs: Sticky bit (FSB), Guard Digit (GD2) and Round Digit (RD2) in accordance with the chosenrounding. The decision to increment or not is made by therounding module depending on the rounding strategy, theconditional plus one decimal adders adds one to the S2 ornot depending on rounding add_one. The rounding decision add_one is made by the following algorithm:

254

Page 5: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

Algorithm 2. Round ties to even rounding strategyGD = GD2 --guard digit SB = FSB --sticky bit. if GD < 5 then add_one := �0�; elsif GD >5 then add_one := �1�; else --GD = 5 if SB = �0� then add_one := �0;� else add_one := �1�; end if end if;

An additional extra operation is potentially carried outat this point. If the sum S2 is 99�99 and the add_one signal is asserted, the resulted decimal significand is 10�0 but using one extra digit. The solution in that situation is that the conditional adder returns 10�0 with p digits and gives an additional signal in order to add one to theintermediate exponent, E3, resulting the final exponent E4.

4. FPGA IMPLEMENTATION OF DFP ADDITION/SUBTRACTION

All circuits were described in VHDL. Some parts ofadder/subtractor use low level component instantiation.The circuits have been implemented on Xilinx Virtex-5,device XC5VLX220T, with speed grade �2 using timingconstraints [14]. For synthesis and implementation XST [15] and Xilinx ISE 12.1 tool [16] have been usedrespectively.

An area breakdown is exhibited in Table 1 and showsthe approximate contribution of each component. Thedesign uses 863 slices (2390 LUTS, 935 FF), and can operate at a frequency of 200 MHz with a latency of 8cycles.

Table 1. Area breakdown of DFP Adder/Subtractor

Components cycles area

LUTs % Decoder 1 128 5.6%

BC

D A

dder

Leading Zero Detector 1 18 0.8%Swapping 1 16 0.7%

Pre-signal generation 1 97 4.2%Carry-chain Adder 1 375 16.4%

Post-Correction 1 91 4.0%Rounding (overflow, underflow) 1 110 4.8%Further combinational circuits 0 1416 59.9%

Special Cases 0 4 0.2%Further combinational circuits 0 48 1.0%

Coder 1 83 3.6%Entire Design 8 2390 100%

The performance of our 16-Digit Decimal adder/subtractor is based on carry-chain techniques described in[13] and is the key point in the performance. The presented result uses the nearest ties to even as rounding, the special cases detection and signaling and overflow/underflowdetection.

4.1. Results verification

To validate the designs, large numbers of random vectors were applied to an automated testbench that tests thebehavioral model using ModelSim. For the floating pointadder/subtractor, over 40,000 test cases were used. All rounding modes and exception are supported in thisversion. Additionally the core was implemented as aMicroblaze coprocessor and again 40 thousand random testcases were used to validate results.

4.2. Results Comparisons

Since no previous results were found for decimal64addition/subtraction in FPGA, as a first comparison we match up our approach against a BID adder/subtractor [9]and the double precision Binary Floating Pointadder/subtractor [17] provided by the Xilinx CoreGenerator implemented in Virtex-5. [9] describes the FPGAimplementation of a decimal 64-bit adder using BinaryInteger Decimal (BID) encoding, in BID the 16 decimal digits are represented as 54 bits number. The results areshown at table 2, the area is expressed in LUTs, FF, DSPblocks and Block Rams (BR), and the timing informationincludes maximum operating frequency (MHz), the numberof cycles (#cy), the latency (T), and the productivity inmega-operations per second. The results depicted in table 2shows that our approach outperforms in terms of latencyand throughput the BID approach of [9]. Even the decimal adder/subtractor proposed has a performance similar of the binary floating point representation, but doubling the area. Table 2. Implementation result comparison for BID adder,

a Binary64 and the proposed circuit in Virtex-5Adder /

SubtractorLUT FF DSP BR MHz #Cy

T (ns)

Mop /sec

BID [9] 2171 1392 12 1 163.9 13/18 79.3/ 109.8

12.6/ 9.1

Binary64 [17] 734 960 3 - 333 14 42.0 333 Proposed 2390 935 - - 200 8 40.0 200

As a second comparison we present at table 3 severalsoftware and hardware achievement for decimal adder/subtractors. The reports [18] and [19] are based on anIntel�s software DFP library, meanwhile Cowlishaw reportsfor decNumber software [20] a worst case performance of848 cycles [21]. The hardware implementation includes the IBM Power6 and Z10 processors who�s has hardwareacceleration for decimal floating point [22][23], and theBinary Integer Decimal (BID) implementation in 65nm and Virtex 5.[9, 10]

The proposed FPGA implementation is more than oneorder of magnitude faster than the software implementation. The hardware acceleration present in Z10 and Power 6 is faster than a single Virtex 5 core. However, processors have

255

Page 6: A Fpga Ieee-754-2008 Decimal64 Floating-point Adder-subtractor

a single DFP unit, meanwhile in a single Virtex-5 FPGA (LX330) we can fit near 60 adder/subtractor cores.

Table 3. Latency of the DFP adder/subtractor implemented on different platforms.

TechnologyClk

(GHz) Cycles

Delay (ns)

Mops/ sec

SW

Itanium2 [18] 1.4 219 156.4 6.4Xeon5100 [19] 3.0 133 44.3 22.6Xeon [18] 3.2 249 77.8 12.9Pentium M [21] 1.5 848 565.3 1.8

HW

Power6 [22] 5.0 17 3.4 294.1Z10 [23] 4.4 12 2.7 366.7BID 65nm [10] 1.3 3-13 10.0 100.0BID Virtex 5 [9] 0.16 13-18 109.8 9.1Proposed Virtex5 0.2 8 40 200.0

5. SUMMARY

To the author´s knowledge, this is the first hardware FPGA implementation of a DFP adder/subtractor using denselypacked decimal, and compliant with the standard IEEE 754-2008. The design is fully pipelined with 8 cycle�s latency and can operate at 200 MHz in a Virtex 5 device. The proposed architecture pre-normalizes the BCD significant improving the operating frequency. Thesignificant BCD addition/subtraction is based on a carrysave structure previously published.

The comparison with software libraries running in a general purpose processor is at least one order ofmagnitude faster. Comparing against the hardwareaccelerator presented in modern IBM processors is the same order of magnitude in productivity. Nevertheless,processors have a single DFP unit, meanwhile it is possibleto arrange tens DFP cores in a single FPGA device.

6. REFERENCES

[1] M. F. Cowlishaw, �Decimal floating-point: algorism forcomputers,� in Proc. 16th IEEE Symp. Computer Arithmetic, 2003, pp. 104�111.

[2] E. M. Schwarz, J. S. Kapernick, and M. F. Cowlishaw, �Decimal floating-point support on the IBM System z10processor,� 2009, iBM Journal of Research andDevelopment.

[3] �IEEE Standard for Floating-Point Arithmetic,� pp. 1�58, 2008, iEEE Std 754-2008.

[4] F. Y. Busaba, C. A. Krygowski, W. H. Li, E. M. Schwarz,and S. R. Carlough, �The IBM z900 decimal arithmetic unit,� in Proc. Conf Signals, Systems and Computers Record of the Thirty-Fifth Asilomar Conf, vol. 2, 2001, pp. 1335�1339.

[5] W. Haller, K. Ulrich, L. Thomas, and H. Wetter,�Combined binary/decimal adder unit,� in International Business Machines Corporation (Armonk, NY), 1999.

[6] G. Bohlender and T. Teufel, �BAP-SC: A Decimal Floating-Point Processors for Optimal Arithmetic,� in Computerarithmetic: Scientific Computation and

Programming Languages, E. Kaucher, U. Kulisch, andC. Ullrich, Eds. B.G Teubner Verlag, 1987, pp. 31�58.

[7] J. Thompson, N. Karra, and M. J. Schulte, �A 64-bit decimal floating-point adder,� in Proc. IEEE Computersociety Annual Symp. VLSI, 2004, pp. 297�298.

[8] M. S. Cohen, T. E. Hull, and V. C. Hamacher, �CADAC: AControlled-Precision Decimal Arithmetic Unit,� no. 4, pp.370�377, 1983.

[9] A. Farmahini-Farahani, C. Tsen, and K. Compton, �FPGA implementation of a 64-Bit BID-based decimal floating-point adder/subtractor,� in Proc. Int. Conf. Field-Programmable Technology FPT 2009, 2009, pp. 518�521.

[10] C. Tsen, S. Gonzalez-Navarro, and M. Schulte, �Hardware design of a Binary Integer Decimal-based floating-pointadder,� pp. 288�295, 2007, computer Design, 2007. ICCD2007. 25th International Conference on.

[11] C. Minchola and G. Sutter, �A FPGA IEEE-754-2008 Decimal64 Floating-Point Multiplier,� in Proc. Int. Conf. Reconfigurable Computing and FPGAs ReConFig �09, 2009, pp. 59�64.

[12] L.-K. Wang and M. J. Schulte, �Decimal Floating-Point Adder and Multifunction Unit with Injection-BasedRounding,� in Proc. 18th IEEE Symp. Computer Arithmetic ARITH �07, 2007, pp. 56�68.

[13] M. Vazquez, G. Sutter, G. Bioul, and J. P. Deschamps, �Decimal Adders/Subtractors in FPGA: Efficient 6-input LUT Implementations,� in Proc. Int. Conf. Reconfigurable Computing and FPGAs ReConFig �09, 2009, pp. 42�47.

[14] Xilinx Inc. XST User Guide 12.1, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com

[15] Xilinx Inc. Xilinx ISE Design Suite 12.1 Software Manuals, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com

[16] Xilinx Inc. Virtex-5 Libraries Guide for VHDL design, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com

[17] Xilinx Inc, DS335: Floating-Point Operator v5.0, June 2009.

[18] M. Cornea, J. Harrison, C. Anderson, P. Tang, E. Schneider, and E. Gvozdev, �A software implementationof the ieee 754r decimal floating-point arithmetic using the binary encoding format,� pp. 148�162, 2009, computers, IEEE Transactions on.

[19] M. Cornea, C. Anderson, J. Harrison, P. T. P. Tang, E. Schneider, and C. Tsen, �A software implementation ofthe ieee 754r decimal floating-point arithmetic using the binary encoding format,� pp. 29�37, 2007, computerArithmetic, 2007. ARITH �07. 18th IEEE Symposium on.

[20] The decNumber C library, v3.68 ed., IBM UK Laboratories, January 2010. [Online]. Available: http://-speleotrove.com/decimal/decnumber.pdf

[21] M. Cowlishaw. (2009) Decimal library performance. [Online]. Available: http://speleotrove.com/decimal/-decperf.pdf

[22] L. Eisen, J. W. Ward, H.-W. Tast, N. Mading, J. Leenstra,S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, andS. R. Carlough, �Ibm power6 accelerators: Vmx and dfu,� pp. 1�21, 2007, iBM Journal of Research andDevelopment.

[23] C. F. Webb, �IBM z10: The Next-Generation MainframeMicroprocessor,� vol. 28, no. 2, pp. 19�29, 2008.

256