Improved Goldschmidt division method using mapping of divisors

Post on 23-Dec-2016

<ul><li><p>BRIEF REPORT</p><p>SCIENCE CHINA Information Sciences</p><p>September 2013, Vol. 56 099101:1–099101:6</p><p>doi: 10.1007/s11432-013-4996-1</p><p>© Science China Press and Springer-Verlag Berlin Heidelberg 2013 info.scichina.com www.springerlink.com</p><p>Improved Goldschmidt division method using mapping of divisors</p><p>YAN Wen<sup>1</sup>, QU XiuJie<sup>1</sup>, CHEN He<sup>1</sup>, YU JiYang<sup>2</sup> &amp; LONG Teng<sup>1</sup></p><p><sup>1</sup>School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China; <sup>2</sup>The Electronic Engineering Technology Research Center of China Academy of Space Tech, Beijing 100094, China</p><p>Received May 23, 2013; accepted July 18, 2013</p><p>Abstract To achieve high precision with fewer storage resources, an improved Goldschmidt division method using the mapping of divisors is presented. The improved division method does not need an initial approximation, so the look-up table can be eliminated. A mapping method is then proposed that reduces the relative errors of the iteration results by multiplying the dividends and divisors by the same mapping coefficients. Since the mapping coefficients are all fixed factors, the mapping method applies CSD coding to the fixed-factor multiplications to reduce the hardware resources. As a result, the proposed method achieves smaller relative errors while using fewer hardware resources.</p><p>Keywords Goldschmidt division method, mapping of divisors, CSD coding, multiplication with fixed factors, fewer hardware resources</p><p>Citation Yan W, Qu X J, Chen H, et al. Improved Goldschmidt division method using mapping of divisors. Sci China Inf Sci, 2013, 56: 099101(6), doi: 10.1007/s11432-013-4996-1</p><p>1 Introduction</p><p>Floating-point division is increasingly common in scientific and engineering applications. Although division is used less frequently than addition and multiplication, it strongly affects the overall performance of the computation. 
Division algorithms are usually based on iterated multiplication. Newton-Raphson [1,2] and Goldschmidt [3–5] division are the most popular iterative algorithms. The Goldschmidt algorithm is usually preferred over the Newton-Raphson algorithm because of its higher intrinsic parallelism [6], which leads to a shorter execution time.</p><p>Many researchers have tried to reduce the computation time of convergence-based division. The most popular approach is to reduce the number of iterations by increasing the precision of the initial approximation to the reciprocal, which requires more bits in the look-up table, so the storage resources grow exponentially. Since the storage resources of a system are usually limited, it is important to save table area. Several methods have been suggested to reduce the area of the reciprocal table. For example, a Goldschmidt division method with faster than quadratic convergence is proposed in [5], which converges faster with a smaller table area.</p><p>Corresponding author (email: quxiujie@bit.edu.cn)</p></li><li><p>Although the method proposed in [5] can reduce the table area, it still needs more storage resources when the precision requirement is very high. To achieve high precision using fewer storage resources, an improved Goldschmidt division method using mapping of divisors is proposed. This method does not need an initial approximation, so the look-up table is eliminated. The proposed method has smaller relative errors than the conventional Goldschmidt method and the DFQC method in [5]. Besides, the mapping of divisors is equivalent to multiplication with fixed factors. 
CSD coding is applied to reduce the multiplication resources [7].</p><p>The main contribution of our method is that it eliminates the initial approximation and achieves high precision with fewer storage resources. In Section 2, the improved Goldschmidt division method using mapping of divisors is explained in detail. The errors and resources are analyzed in Sections 3 and 4, where the results of the new design are compared with other methods to demonstrate its better performance. Our conclusion is given in Section 5.</p><p>2 Divisor mapping GLD method</p><p>Division can be written as $Q = N/D$, where $N$ is the numerator, $D$ is the denominator and $Q$ is the quotient. The principle of the Goldschmidt division method is to multiply the numerator and the denominator by a group of factors $F_i$ at the same time, so that the denominator approaches 1 and the numerator therefore approaches the quotient [3,8]. Details of the conventional Goldschmidt division method can be found in [3,5]. The results of the conventional GLD division method are obtained through the following equations:</p><p>$$F_{i+1} = 2 - D_{i+1}, \quad N_{i+1} = N_i F_i, \quad D_{i+1} = D_i F_i, \tag{1}$$</p><p>where $i = 0, 1, 2, \ldots, M-1$, $M$ represents the number of iterations and $M \ge 1$. $N_i$ and $D_i$ represent the dividend and divisor after the $i$th iteration respectively. Here $N_0 = N$, $D_0 = D$, and $F_0$ is produced by a reciprocal table with limited precision, which has an initial error s.</p><p>The conventional Goldschmidt division method needs to store the initial approximations in a look-up table before implementation. However, improving the precision of the initial approximation requires more bits in the look-up table. 
The storage resources then grow exponentially.</p><p>2.1 Improved GLD method</p><p>In this paper, an improved division method named the Divisor Mapping GLD (Goldschmidt Division) method is proposed, which doesn't need the initial approximation. Instead, it preprocesses the dividends and divisors using the mapping method before the iterations.</p><p>We reformulate the iteration equations as follows:</p><p>$$F_i = 2 - D_i, \quad N_{i+1} = N_i F_i, \quad D_{i+1} = D_i F_i, \tag{2}$$</p><p>where $i = 0, 1, \ldots, M-1$, $M$ represents the number of iterations and $M \ge 1$. Here, $N_0 = N$ and $D_0 = D$. We find that $F_0 = 2 - D$, so the Divisor Mapping GLD method does not need the initial approximation.</p><p>With the first and the third equations in (2), we get $D_{i+1} = D_i(2 - D_i)$, that is, $D_{i+1} - 1 = -(D_i - 1)^2$, and therefore</p><p>$$D_{i+1} - 1 = -(D - 1)^{2^{i+1}}. \tag{3}$$</p><p>Since every iteration preserves the ratio $N_i/D_i = N/D$, (3) gives</p><p>$$N_{i+1} = \frac{N}{D} D_{i+1} = \frac{N}{D}\left[1 - (D - 1)^{2^{i+1}}\right]. \tag{4}$$</p></li><li><p>When the divisor is close to 1, the quotient is approximately equal to the dividend. After the $M$th iteration, the quotient $Q_M$ can be expressed as</p><p>$$Q_M \approx N_M = \frac{N}{D}\left[1 - (D - 1)^{2^M}\right]. \tag{5}$$</p><p>Define the relative error as</p><p>$$e_{i+1} = \mathrm{abs}\left(\frac{Q_{i+1} - N/D}{N/D}\right). \tag{6}$$</p><p>Combining (5) and (6), the relative error of the result after the $M$th iteration is</p><p>$$e_M = (D - 1)^{2^M}. \tag{7}$$</p><p>In a floating-point system, $N$ and $D$ can be expressed in scientific notation as $1.F \times 2^G$, where $F$ and $G$ stand for the mantissa and the exponent respectively. The division result is obtained by subtracting the exponents and dividing the mantissas, so the normalized $N$ and $D$ are both in $[1, 2)$.</p><p>From (7), we can see that the improved GLD method still achieves quadratic convergence, but the relative error of the result is independent of any initial approximation. When the number of iterations $M$ is fixed, the relative error increases with the divisor. 
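</p><p>The iteration in (2) and the error expression in (7) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names <code>divisor_mapping_gld</code> and <code>relative_error</code> are hypothetical, and exact rational arithmetic (<code>fractions.Fraction</code>) is assumed in place of hardware fixed-point arithmetic so that rounding does not mask the result.</p>

```python
from fractions import Fraction


def divisor_mapping_gld(n, d, iterations):
    """Run the table-free iteration (2): F_i = 2 - D_i (illustrative sketch)."""
    n, d = Fraction(n), Fraction(d)
    for _ in range(iterations):
        f = 2 - d   # F_i = 2 - D_i, no reciprocal table needed
        n = n * f   # N_{i+1} = N_i * F_i
        d = d * f   # D_{i+1} = D_i * F_i
    return n        # N_M approximates the quotient N / D


def relative_error(n, d, iterations):
    """Relative error (6) of the approximate quotient after the given iterations."""
    exact = Fraction(n) / Fraction(d)
    return abs((divisor_mapping_gld(n, d, iterations) - exact) / exact)


if __name__ == "__main__":
    M = 3
    for d in (Fraction(9, 8), Fraction(3, 2), Fraction(15, 8)):
        e = relative_error(Fraction(3, 2), d, M)
        # With exact arithmetic the measured error matches (7).
        assert e == (d - 1) ** (2 ** M)
        print(float(d), float(e))
```

<p>Under exact arithmetic the measured error equals (D - 1)^(2^M) for every divisor tried, and it grows quickly as the divisor moves away from 1.</p><p>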
So the closer $D$ is to 2, the larger the relative error. We therefore propose a divisor mapping method that reduces the relative errors of the improved GLD method.</p><p>2.2 Mapping of divisors</p><p>Assume the precision requirement is $E$. To meet the precision requirement, the divisors should satisfy $e_M \le E$. Then we get</p><p>$$D \le 10^{\lg E / 2^M} + 1 = p. \tag{8}$$</p><p>Here, $p$ is the largest divisor that can meet the precision requirement, and $p \in [1, 2]$. From the above analysis, the precision requirement is met only when the divisor lies in $[1, p]$. The question is then how to accommodate the divisors in the interval $(p, 2)$. Our approach is to map the divisors in $(p, 2)$ onto $[1, p]$.</p><p>When $p \ge 1.5$, only one mapping is needed: the divisors in $(p, 2)$ can be mapped onto $[1, p]$ by multiplying by a fixed factor. When $D \in (p, 2)$, we define a mapping coefficient $\lambda \in (0.5, 1)$ to get $D' = \lambda D$ and $N' = \lambda N$, where $N'$ and $D'$ represent the dividend and the divisor after mapping respectively. Here, $D' \in [1, p]$. Since $N'/D' = N/D$, converting $N/D$ into $N'/D'$ guarantees that the results for all divisors meet the precision requirement.</p><p>But when $p &lt; 1.5$, one mapping is not enough, and several mappings are needed. Let $T$ denote the number of mappings; that is, the interval $(p, 2)$ is divided into $T$ sections. The interval of the $i$th mapping is denoted $(\mathrm{Range}_{(i,\min)}, \mathrm{Range}_{(i,\max)}]$, where $i = 1, \ldots, T$. The intervals should satisfy</p><p>$$\mathrm{Range}_{(0,\min)} = p, \quad \mathrm{Range}_{(i-1,\max)} = \mathrm{Range}_{(i,\min)}, \quad \mathrm{Range}_{(T-1,\max)} = 2. \tag{9}$$</p><p>Assume the coefficients of the mappings are $C = \{c_{(1)}, c_{(2)}, \ldots, c_{(T)}\}$. 
Then we get</p><p>$$c_{(i)}\,\mathrm{Range}_{(i,\max)} = p, \quad c_{(i)}\,\mathrm{Range}_{(i,\min)} = 1. \tag{10}$$</p><p>Combining (9) and (10), we have the following mapping coefficients:</p><p>$$\mathrm{Range}_{(i,\max)} = \frac{2}{p^{T-1-i}}, \quad \mathrm{Range}_{(i,\min)} = \frac{2}{p^{T-i}}, \quad c_{(i)} = \frac{p^{T-i}}{2}. \tag{11}$$</p></li><li><p>Figure 1 Relative errors before and after the mapping method (horizontal axis: $D$; vertical axis: relative errors in dB; curves: before mapping and after mapping; reference level: $E$).</p><p>To make sure that all the divisors in $(p, 2)$ can be mapped onto $[1, p]$, we usually keep a moderate margin, i.e., neighboring sections are allowed to overlap. Then we have</p><p>$$c_{(i)}\,\mathrm{Range}_{(i,\max)} \le p, \quad c_{(i)}\,\mathrm{Range}_{(i,\min)} \ge 1. \tag{12}$$</p><p>With (11) and (12), we get</p><p>$$\log_p 2 - 1 \le T \le \log_p 2. \tag{13}$$</p><p>Considering that $T$ is an integer, we have</p><p>$$T = \mathrm{ceil}(\log_p 2 - 1), \tag{14}$$</p><p>where ceil represents rounding up.</p><p>Through the mapping method, all divisors can be mapped onto $[1, p]$, so the results for all divisors meet the precision requirement. We choose 100 different numbers within the interval $[1, 2]$ as divisors and simulate the improved GLD method using mapping of divisors with these numbers. The relative errors before and after the mapping method are shown in Figure 1.</p><p>Figure 1 shows that without the mapping method the relative errors increase with the divisor, and that the mapping method effectively reduces them. With the precision requirement $E$, the results for all divisors meet the requirement after the mapping method is applied.</p><p>3 Error analysis</p><p>We use Matlab to compute the relative errors of the Divisor Mapping GLD method and the conventional GLD method. 
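</p><p>The Matlab code is not part of this report. As a rough stand-in, the experiment behind Figure 1 can be sketched in Python using (8), (11) and (14). Everything named here (<code>mapping_parameters</code>, <code>map_operands</code>, <code>gld</code>, <code>max_relative_error</code>) is hypothetical, and the setting E = 2^-20 with M = 3 is an illustrative assumption chosen so that double precision can resolve the errors; it is not the setting used in the paper.</p>

```python
import math


def mapping_parameters(E, M):
    """Return (p, T, coeffs) following (8), (14) and (11); sketch only."""
    p = E ** (1.0 / 2 ** M) + 1.0                      # (8): largest admissible divisor
    T = max(0, math.ceil(math.log(2.0, p) - 1.0))      # (14): number of sections
    coeffs = [p ** (T - i) / 2.0 for i in range(T)]    # (11): c(i) for i = 0..T-1
    return p, T, coeffs


def map_operands(n, d, p, T, coeffs):
    """Scale N and D by the same c(i) so that the divisor lands in [1, p]."""
    if d <= p:
        return n, d                                    # already precise enough
    for i, c in enumerate(coeffs):
        if d <= 2.0 / p ** (T - 1 - i):                # d <= Range(i, max)
            return n * c, d * c
    raise ValueError("divisor must lie in [1, 2]")


def gld(n, d, M):
    """Table-free Goldschmidt iteration (2) in floating point."""
    for _ in range(M):
        f = 2.0 - d
        n, d = n * f, d * f
    return n


def max_relative_error(E, M, use_mapping, samples=100):
    """Worst relative error over `samples` divisors spread across [1, 2)."""
    p, T, coeffs = mapping_parameters(E, M)
    worst = 0.0
    for k in range(samples):
        n, d = 1.5, 1.0 + k / samples
        exact = n / d
        if use_mapping:
            n, d = map_operands(n, d, p, T, coeffs)
        worst = max(worst, abs(gld(n, d, M) - exact) / exact)
    return worst


if __name__ == "__main__":
    E, M = 2.0 ** -20, 3
    print(mapping_parameters(E, M)[:2])
    print(max_relative_error(E, M, use_mapping=False))
    print(max_relative_error(E, M, use_mapping=True))
```

<p>In this sketch the first section whose upper bound is not below the divisor supplies the coefficient, so overlapping sections from (12) are handled without extra logic.</p><p>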
To see the comparison more clearly, we subtract the relative errors of the two methods. The results show that the difference between the relative errors of the two methods is smaller than $1 \times 10^{-14}$. So the Divisor Mapping GLD method achieves approximately the same precision as the conventional division method.</p><p>The performance of the Divisor Mapping GLD method is evaluated by determining the deviation from the true quotient at each step. The relative error is used as the performance metric to evaluate the convergence speed of the improved division method. We then compare the performance of our method with the conventional division method and the DFQC method proposed in [5].</p><p>Based on the previous analysis, all divisors can be transformed into the interval $[1, p]$, namely $1 \le D \le p$. By (7) and (8), the maximum relative error of the improved method after the $i$th iteration is</p><p>$$E_{\max,\mathrm{improved}} = (p - 1)^{2^i} = \exp\left(\frac{\ln E}{2^{M-i}}\right). \tag{15}$$</p></li><li><table><caption>Table 1 Comparison of maximum relative errors after the $i$th iteration</caption><tr><th>Algorithms</th><th>$i = 0$</th><th>$i = 1$</th><th>$i = 2$</th><th>$i = 3$</th></tr><tr><td>Conventional method</td><td>$2^{-5.4}$</td><td>$2^{-10.8}$</td><td>$2^{-21.6}$</td><td>$2^{-43.2}$</td></tr><tr><td>DFQC method in [5]</td><td>$2^{-5.4}$</td><td>$2^{-13.2}$</td><td>$2^{-29.6}$</td><td>$2^{-58.4}$</td></tr><tr><td>Our method</td><td>$2^{-7.4}$</td><td>$2^{-14.7}$</td><td>$2^{-29.5}$</td><td>$2^{-58.9}$</td></tr></table><p>High-precision floating-point arithmetic units play an important role in high-performance computation. For the IEEE-754 rounding modes, the final quotient error must be less than $2^{-55}$ when $Q$ is in $[0.5, 1)$ [5]. In high-performance processor design, which requires a low error rate, we set $E = 2^{-59}$ [9].</p><p>In [5], the maximum relative error of the conventional method after the $i$th iteration is</p><p>$$E_{\max,\mathrm{conventional}} = \varepsilon_0^{2^i}. \tag{16}$$</p><p>Our method does not need the look-up table. The conventional method and the DFQC method in [5] use a 5-bit-in 5-bit-out reciprocal ROM table. 
The initial approximation error $\varepsilon_0 = 2^{-5.4}$ is calculated using the error of an optimized reciprocal table [5].</p><p>The errors of a three-iteration divider after the $i$th iteration using the three division methods are shown in Table 1. The relative errors of the conventional method and the DFQC method are taken from [5]. Although the convergence of the Divisor Mapping GLD method is still quadratic, the initial errors of the improved method are smaller than those of the other two division methods. From the above analysis, we can see that the Divisor Mapping GLD method has smaller relative errors than the conventional division method and the DFQC method in [5].</p><p>4 Resource analysis</p><p>For the $M$-iteration Divisor Mapping GLD method, we can deduce from (2) that every iteration needs two multiplications and one addition (here we consider the resources of additions and subtractions to be equal). However, in a hardware implementation of a floating-point system, the resources of additions are negligible compared with those of multiplications, so we only need to consider the multiplications.</p><p>In the Divisor Mapping GLD method, we also need to consider the multiplications by the mapping coefficients. Since one mapping needs two multiplications, the mapping method needs $2T$ multiplications when the number of mappings is $T$. However, because the multiplications in the mapping use fixed factors, the hardware resources can be reduced by coding the fixed factors. Define $\alpha$ ($\alpha \in [0, 1]$) as the ratio of the resources of a multiplication with a fixed factor to those of a regular multiplication.</p><p>After the $M$th iteration, the required number of multiplications can be defined as</p><p>$$R(M) = 2M + 2\alpha T. \tag{17}$$</p><p>Together with (8) and (14), we get</p><p>$$R(M) = 2M + 2\alpha\,\mathrm{ceil}\left[\log_{10^{\lg E / 2^M} + 1} 2 - 1\right]. \tag{18}$$</p><p>When CSD coding is applied to the multiplication with fixed factors, we know from [10–12] that the average value of $\alpha$ is 0.3. A multiplication method with fixed factors using the least resources is proposed in [7], which can reduce the resources by 20%. Through this method, the average value of $\alpha$ is 0.2.</p><p>The Divisor Mapping GLD method thus saves storage resources at the cost of more multipliers. However, the mapping of divisors is performed before the iteration process, so the multipliers used for the mapping can be reused in the iteration process and need no extra resources. Besides, a processing system usually contains many dividers, so the mapping module can be shared, which also saves a lot of hardware resources.</p></li><li><p>5 Conclusion</p><p>The Divisor Mapping GLD method presented in this paper does not need the initial approximation, and its relative error depends only on the value of the divisor and the number of iterations. To improve the precision of the iteration results, a mapping method is proposed, which can facilita...</p></li></ul>
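<p>As a supplement to the resource analysis in Section 4, the idea of CSD coding for a fixed factor and the multiplication count in (17) can be sketched in Python. This is a generic non-adjacent-form recoding given for illustration; it is an assumption that it resembles the coding used in the cited works, and the names <code>csd_digits</code>, <code>multiply_by_fixed_factor</code> and <code>multiplication_cost</code>, as well as the 12-bit example coefficient, are hypothetical.</p>

```python
def csd_digits(value):
    """Signed digits in {-1, 0, +1}, least significant first, no two adjacent nonzeros."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while value:
        if value & 1:
            digit = 2 - (value & 3)   # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def multiply_by_fixed_factor(x, digits):
    """Shift-and-add product of integer x and the coded constant.

    Each nonzero digit costs one shifted add or subtract.
    """
    return sum(digit * (x << shift) for shift, digit in enumerate(digits) if digit)


def multiplication_cost(M, T, alpha):
    """R(M) = 2M + 2*alpha*T as in (17); alpha is the fixed-factor cost ratio."""
    return 2 * M + 2 * alpha * T


if __name__ == "__main__":
    coeff = round(0.588 * 2 ** 12)    # hypothetical 12-bit fixed-point mapping factor
    digits = csd_digits(coeff)
    print(bin(coeff).count("1"), "binary ones,", sum(d != 0 for d in digits), "CSD digits")
    assert multiply_by_fixed_factor(1000, digits) == 1000 * coeff
    print(multiplication_cost(M=3, T=4, alpha=0.2))
```

<p>Fewer nonzero digits mean fewer adders in a fixed-factor multiplier, which is the effect the ratio in (17) summarizes.</p>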