
Employing Alternative Number Systems to Reduce Power Dissipation
in Portable Devices and High-Performance Systems

Power dissipation has evolved into an instrumental design optimization objective due to the growing demand for portable electronics equipment as well as due to excessive heat generation in high-performance systems. In the former case, low-power techniques are employed to prolong battery life, while in the latter case, low-power techniques are required to mitigate the reliability problems that may arise. The dominant component of power dissipation for well-designed CMOS circuits is dynamic power dissipation, given as [1]

$P = a C_L f V_{dd}^2$, (1)

where a is the activity factor, C_L is the switching capacitance, f is the clock frequency, and V_dd is the supply voltage. A variety of

design techniques are commonly employed to reduce the factors of Eq. (1), without degrading system performance. As slower circuits tend to dissipate less power, the low-power design problem can be seen as an attempt to achieve a specified system performance by employing slow components.
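To make the voltage dependence of Eq. (1) concrete, the following sketch evaluates the dynamic-power expression for illustrative parameter values (the numbers are assumptions, not figures from the article):

```python
# Dynamic power per Eq. (1): P = a * C_L * f * Vdd^2.
# All parameter values below are illustrative assumptions.
def dynamic_power(a, c_load, f, vdd):
    """Average dynamic power dissipation in watts."""
    return a * c_load * f * vdd ** 2

p_high = dynamic_power(a=0.2, c_load=50e-12, f=100e6, vdd=3.3)
p_low = dynamic_power(a=0.2, c_load=50e-12, f=100e6, vdd=1.65)

# Halving the supply voltage at the same frequency quarters the power.
print(p_high / p_low)  # 4.0
```

Because power falls quadratically with V_dd while circuit delay grows more slowly, slower-but-parallel architectures can trade speed for a lower supply voltage, a lever exploited by the residue arithmetic discussed later in this article.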

The reduction of the various factors that determine power dissipation is sought at all levels of the design abstraction. In particular, techniques for power dissipation reduction at higher design abstraction levels aim to reduce the computational load and the number of memory accesses required to perform a certain task as well as to introduce parallelism and pipelining in the system [2]. At the circuit and process levels, minimal feature size circuits are preferred, capable of operating at minimal supply voltages, while leakage currents and device threshold voltages are minimized.

The choice of the number system—i.e., the way numbers are represented in a digital system—can reduce power dissipation, since the number system has an effect on several levels of the design abstraction. In particular, the appropriate selection of the number system can reduce power dissipation, because it can reduce:

1. the number of the operations;
2. the strength of the operators; and
3. the activity of the data.

A particular choice of number system can reduce the number of the actual operations required to accomplish certain computational tasks; therefore, it can reduce the computational load of an application. Furthermore, both data activity and the strength of the operators are influenced by the choice of the number system. Finally, power dissipation can be reduced by using low-power arithmetic circuit architectures. Again, the possible architectures are determined by the number system.

Several authors address the issue of low-power arithmetic; the bulk of the work in this field is on the definition of new low-power circuit-level architectures [3] or the identification, among existing architectures, of those that dissipate minimal power for the basic operations, such as addition and multiplication [4]. Also, comparisons of number representations, such as sign-magnitude and two's-complement systems [5], in terms of underlying bit activity have been reported [6].

In this article we focus on two alternative number systems that are quite different from the conventional linear number representations, namely the logarithmic number system (LNS) and the residue number system (RNS). Both have recently attracted the interest of researchers for their low-power properties. We address aspects of the conventional arithmetic representations and the impact of logarithmic arithmetic on power dissipation, and discuss the low-power aspects of residue arithmetic.

T. Stouraitis and V. Paliouras

CIRCUITS & DEVICES JULY 2001 8755-3996/01/$10.00 ©2001 IEEE 23

© 1999 Artville, LLC.

Conventional Arithmetic Representations

Parhami [5] offers an overview of low-power techniques for arithmetic circuits. Common techniques for low-power logic design can be applied to arithmetic circuits as well [5]. Such techniques are based on the following guidelines:

1. avoid wasted power: glitching minimization, not clocking idle modules;
2. barely meet performance requirements, since slower circuits dissipate less power; and
3. minimize signal activity by properly encoding data.

In some cases, wasted power can be reduced by several times by minimizing the computational load of a particular task. The appropriate selection of the number system will be shown below to reduce the computational load in certain tasks.

Callaway and Swartzlander [7] have focused on low-power arithmetic at the gate level; they have characterized several adder and multiplier architectures in terms of power dissipation. They offer area, time, and power dissipation measures for various architectures and word lengths. In terms of minimal power dissipation for 16-bit adders, the constant-width carry-skip adder emerges as the optimal choice. However, minimal absolute power dissipation may not be the optimization objective in a design. In most cases, a more complex criterion, the power-delay product, is more applicable, because it describes the combined effect of reducing power dissipation at the cost of increasing circuit delay. Returning to the 16-bit adder example, the utilization of the power-delay product criterion points out a different topology as an optimal solution, namely the variable-width carry-skip adder [7].

This example demonstrates that there is not an optimal choice of architecture applicable to every design situation. Instead, the design specifications (expressed as area, time, and power dissipation constraints) should be met, while minimizing an appropriate cost function. A similar discussion for multipliers of word sizes between 8 and 32 bits reveals that Wallace and Dadda architectures outperform array multipliers for low-power operation [7].

Bit activity is another factor that affects power dissipation and depends on the number system selection. It has been shown that the probabilistic distribution of the input signals largely affects the performance of the number representation in terms of bit activity. Landman and Rabaey demonstrate this effect by introducing the dual-bit type (DBT) method for modeling the bit activity in a data word [6], assuming two's-complement and sign-magnitude representations. While the sign-magnitude representation is found to exhibit less bit activity than two's-complement coding, a general conclusion on the power dissipation behavior cannot be drawn, since the complexity of the corresponding processing circuitry is different. Since sign-magnitude arithmetic requires more complicated adders and subtractors than two's-complement arithmetic, the increased activity of the latter can be compensated for from a power dissipation viewpoint.

The Logarithmic Number System

The LNS [8] has been employed in the design of low-power DSP devices, such as a digital hearing aid by Morley et al. [9]. More recently, Sacha and Irwin report that LNS can reduce power dissipation in adaptive filtering processors [10].

LNS Basics

The LNS maps a linear number X to a triplet as follows:

$X \xrightarrow{\mathrm{LNS}} (z, s, x = \log_b |X|)$, (2)

where z is a single-bit flag which, when asserted, denotes that X is zero; s is the sign of X; and b is the base of the logarithmic representation. The organization of an LNS word is shown in Fig. 1. The mapping of Eq. (2) is of practical interest because it can simplify certain arithmetic operations; i.e., reduce the strength of the operators. For example, due to the properties of the logarithm function, the multiplication of two linear numbers X = b^x and Y = b^y is reduced to the addition of their logarithmic images, x and y. The basic arithmetic operations and their LNS counterparts are summarized in Table 1.

In order to utilize the benefits of LNS, a conversion overhead is incurred in most cases. Conversion circuitry is required to perform the forward LNS mapping [Eq. (2)] and the inverse mapping of the logarithmic results to linear numbers, defined as

$X = (1 - z)(-1)^s b^x$. (3)

It is noted that mappings (2) and (3) are required in the case that an LNS processor receives as input and transmits as output linear data in digital format. Since all arithmetic operations can be performed in the logarithmic domain, only an initial conversion is imposed; therefore, as the amount of processing implemented in LNS grows, the conversion overhead contribution to power dissipation becomes negligible, since it remains constant.
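The mappings of Eqs. (2) and (3) can be sketched in a few lines. The following Python toy model (an illustration using floating-point logarithms, not the fixed-point LNS words a hardware implementation would use) shows multiplication reducing to addition of the logarithmic images:

```python
import math

def to_lns(X, b=2.0):
    """Forward LNS mapping, Eq. (2): X -> (z, s, log_b |X|)."""
    if X == 0:
        return (1, 0, 0.0)          # zero flag asserted; s and x are don't-cares
    s = 0 if X > 0 else 1           # sign bit
    return (0, s, math.log(abs(X), b))

def from_lns(triplet, b=2.0):
    """Inverse mapping, Eq. (3): X = (1 - z) * (-1)^s * b^x."""
    z, s, x = triplet
    return (1 - z) * (-1) ** s * b ** x

# Multiplication reduces to addition of the logarithmic images:
zx, sx, x = to_lns(-6.0)
zy, sy, y = to_lns(2.5)
product = from_lns((zx | zy, sx ^ sy, x + y))  # zero flags OR, sign bits XOR
print(round(product, 6))  # -15.0
```

Note how the sign and zero flags are handled by single-gate logic (OR, XOR), while the magnitude datapath performs only an addition; this is the operator strength reduction referred to in the text.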

In stand-alone DSP systems a different approach is possible. The LNS forward and inverse mapping overhead can be mitigated by employing logarithmic A/D and D/A converters, instead of linear converters, followed by corresponding digital conversion circuitry. Such an approach has been adopted by Morley et al. in the design of a digital hearing-aid processor [9].

1. The organization of an (n + 1)-bit LNS digital word: a zero flag z, a sign bit s, and the magnitude x = log_b |X| in the remaining bits.

LNS and Power Dissipation

LNS is applicable for low-power design because it reduces

1. the strength of certain arithmetic operators; and
2. the bit activity.

The operator strength reduction by LNS reduces the switching capacitance; i.e., it reduces the C_L factor of Eq. (1). Sacha and Irwin have studied the impact of the number system choice on the QRD-RLS algorithm [10]. They have compared the amount of switched capacitance per algorithm iteration for several implementations of QRD-RLS, each using a particular arithmetic, namely CORDIC, floating-point, fixed-point, and LNS. A performance comparison of the various implementations reveals that LNS offers accuracy comparable to floating-point, but at only a fraction of the switched capacitance per iteration of the algorithm.

The reduction of average switched capacitance due to LNS stems from the simplification of basic arithmetic operations, shown in Table 1. However, LNS can affect power dissipation in an additional way: the bit activity; i.e., the a factor of Eq. (1). A design parameter that is often neglected, despite playing a key role in the performance of an LNS-based processor, is the base of the logarithm, b [11, 12], as demonstrated in Fig. 2. The choice of base has a substantial impact on the average bit activity. Figure 2 shows activity per bit position; i.e., the probability of a transition from "low" to "high" in a particular bit position, for a two's-complement word and several LNS words, each of a different base b. It can be seen that departing from the traditional choice b = 2 can substantially reduce the signal activity in comparison to the two's-complement representation. The input data are sampled from a zero-mean Gaussian process with a correlation factor ρ = −0.99, similar to the derivation of the DBT model.
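The kind of per-bit activity measurement behind Fig. 2 can be approximated in software. The sketch below is an assumption-laden simplification (an AR(1) model for the correlated Gaussian source, with arbitrary standard deviation and word length), not the authors' methodology; it counts low-to-high transitions per bit position for two's-complement data:

```python
import math, random

def gaussian_ar1(n, rho, sigma=2000.0, seed=1):
    """Correlated Gaussian samples generated by an AR(1) process."""
    rng = random.Random(seed)
    x, out = 0.0, []
    scale = sigma * math.sqrt(1 - rho * rho)
    for _ in range(n):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def low_to_high_counts(samples, bits=16):
    """Count 0 -> 1 transitions in each bit of the two's-complement words."""
    mask = (1 << bits) - 1
    counts = [0] * bits
    prev = 0
    for v in samples:
        word = int(round(v)) & mask     # two's-complement encoding
        rising = ~prev & word           # bits that switch from 0 to 1
        for i in range(bits):
            counts[i] += (rising >> i) & 1
        prev = word
    return counts

counts = low_to_high_counts(gaussian_ar1(10000, rho=-0.99))
# Anti-correlated data keeps the sign-extension bits toggling on almost every
# sample, the high-activity plateau visible for two's complement in Fig. 2.
print(counts[15] / 10000)
```

Replacing the two's-complement quantizer with an LNS word assembler (zero flag, sign, quantized log_b |X|) in the same harness is how the curves for different bases b could be compared.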

Since multiplication-additions are important in DSP applications, the power requirements of an LNS and a linear fixed-point adder-multiplier have been compared. Paliouras and Stouraitis report that approximately a two-times reduction in power dissipation is possible for operations with word sizes of 8 to 14 bits. Given a sufficient number of multiplication-additions, the LNS implementation becomes more efficient from the low-power dissipation viewpoint, even when a constant conversion overhead is taken into consideration.

The Residue Number System

The RNS [13] has recently been shown to offer significant power-dissipation savings in the design of signal processing architectures for FIR filters [14] and frequency synthesizers [15]. It is shown that RNS can even reduce the computation load in complex-number processing, thus providing savings at the algorithmic level.

RNS Basics

The RNS maps an integer X to an M-tuple of residues $x_i$,

$X \xrightarrow{\mathrm{RNS}} \{x_1, x_2, \ldots, x_M\}$, (4)

where

$x_i = \langle X \rangle_{m_i}$, (5)

$\langle \cdot \rangle_{m_i}$ denotes the mod $m_i$ operation, and $m_i$ is a member of the set of co-prime integers $\{m_1, m_2, \ldots, m_M\}$, called moduli. Co-prime integers have the property that $\gcd(m_i, m_j) = 1$, $i \neq j$. The modulo operation $\langle X \rangle_m$ returns the integer remainder of the integer division X div m; i.e., a number k such that $X = m \cdot l + k$, where l is an integer.


Table 1. Basic Linear Arithmetic Operations and Their LNS Counterparts

Linear Operation                                | Logarithmic Operation
$Z = XY = b^x b^y = b^{x+y}$                    | $z = \log_b Z = x + y$
$Z = X / Y = b^x / b^y = b^{x-y}$               | $z = x - y$
$Z = \sqrt[m]{X} = \sqrt[m]{b^x} = b^{x/m}$     | $z = x / m$, m integer
$Z = X^m = (b^x)^m = b^{mx}$                    | $z = mx$, m integer
$Z = X + Y = b^x + b^y = b^x(1 + b^{y-x})$      | $z = x + \log_b(1 + b^{y-x})$
$Z = X - Y = b^x - b^y = b^x(1 - b^{y-x})$      | $z = x + \log_b(1 - b^{y-x})$


2. Probability $p_{0 \to 1}$ per bit position (bits 15 down to 0) for two's-complement and LNS encodings with bases b = 1.1, 1.3, and 1.5, for ρ = −0.99.

RNS is of interest because basic arithmetic operations can be performed in a digit-parallel, carry-free manner; i.e.,

$z_i = \langle x_i \circ y_i \rangle_{m_i}$, (6)

where $i = 1, 2, \ldots, M$, and the symbol $\circ$ stands for addition, subtraction, or multiplication. Every integer in the range $0 \le X < \prod_{i=1}^{M} m_i$ has a unique RNS representation. Inverse conversion is accomplished by means of the Chinese Remainder Theorem (CRT) or mixed-radix conversion [16].

The basic architecture of an RNS processor in comparison to a binary counterpart is depicted in Fig. 3. Figure 3 shows that the word length n of the binary counterpart is partitioned into M subwords, the residues, which can be processed independently and are of word length significantly smaller than n. The i-th residue channel performs arithmetic modulo $m_i$. Conceptually, RNS introduces a subword-level parallelism into an algorithm; therefore, its hardware implementation can enjoy the low-power benefits of parallel architectures [2].
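The forward mapping, the digit-parallel channel operations, and the CRT-based inverse conversion of Eqs. (4)-(6) can be sketched as follows; the moduli set {7, 9, 11, 13} is an arbitrary illustrative choice:

```python
from math import prod

MODULI = (7, 9, 11, 13)          # pairwise co-prime; dynamic range prod = 9009
M = prod(MODULI)

def to_rns(x):
    """Forward mapping, Eqs. (4)-(5): one residue per modulus."""
    return tuple(x % m for m in MODULI)

def rns_op(xs, ys, op):
    """Eq. (6): digit-parallel, carry-free channel operations."""
    return tuple(op(x, y) % m for x, y, m in zip(xs, ys, MODULI))

def from_rns(res):
    """Inverse conversion via the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(res, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(Mi, -1, m): modular inverse
    return total % M

a, b = 123, 45                    # product 5535 fits in the dynamic range
print(from_rns(rns_op(to_rns(a), to_rns(b), lambda p, q: p * q)))  # 5535
```

Each channel touches only a 3- or 4-bit residue; in hardware these narrow, independent channels are what allow shorter carry chains and, as discussed below, a lower supply voltage.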

RNS and Power Dissipation

Freking and Parhi have studied the power dissipation of FIR filter architectures that employ RNS. They report that RNS can reduce power dissipation since it reduces [14]:

1. the hardware cost;
2. the switching activity; and
3. the supply voltage.

By employing the binary-like RNS filter structures by Ibrahim [17], Freking and Parhi report that RNS reduces the bit activity by up to 38% in (4 × 4)-bit multipliers. As the critical path in an RNS architecture increases logarithmically with the equivalent binary word length, RNS can tolerate a larger reduction in the supply voltage than the corresponding binary architecture while achieving a particular delay specification. To demonstrate the overall impact of the RNS on the power budget of an FIR filter, Freking and Parhi report that a filter unit with 16-bit coefficients and 32-bit dynamic range, operating at 50 MHz, dissipates 26.2 mW on average for a two's-complement implementation, while

the RNS equivalent architecture dissipates 3.8 mW. Hence, power dissipation reduction becomes more significant as the number of filter taps increases, and a three-times reduction is possible for filters with more than 100 taps.

A different approach to low-power RNS is proposed by Chren, who suggests one-hot encoding the residues in an RNS-based architecture, thus defining the one-hot RNS (OHR) [15]. Instead of encoding a residue value $x_i$ in a conventional positional notation, an (m − 1)-bit word is employed, in which the assertion of the i-th bit denotes the residue value $x_i$. The one-hot approach allows for further reducing bit activity and power-delay products using residue arithmetic. OHR is found to require simple circuits for processing. The power reduction is rendered possible since all basic operations (i.e., addition/subtraction and multiplication) and the RNS-specific operations of scaling (i.e., division by a constant), modulus conversion, and index computation are performed using transpositions of bit lines and barrel shifters. The performance of the obtained residue architectures is demonstrated through the design of a direct digital frequency synthesizer, which exhibits a power-delay product reduction of 85% over the conventional approach [15].
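The role of barrel shifters in OHR can be illustrated with a toy sketch. This is an interpretation of the one-hot idea rather than Chren's circuitry, and it uses an m-bit one-hot word (one bit per residue value) for clarity: modulo-m addition then becomes a cyclic rotation of the bit vector.

```python
def one_hot(value, m):
    """One-hot encode a residue: bit `value` of an m-bit word is asserted."""
    return 1 << value

def ohr_add(x_oh, y_value, m):
    """Modulo-m addition of y_value as a cyclic left rotation by y_value
    positions, which is exactly what a barrel shifter implements."""
    mask = (1 << m) - 1
    return ((x_oh << y_value) | (x_oh >> (m - y_value))) & mask

m = 5
x = one_hot(3, m)           # residue 3 -> 0b01000
z = ohr_add(x, 4, m)        # (3 + 4) mod 5 = 2
print(bin(z))  # 0b100
```

Only one bit of the word is ever asserted, so at most two bit lines toggle per operation, which is one intuition for the reduced activity reported for OHR.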

RNS Signal Activity for Gaussian Input

In the following, the bit activity in an RNS architecture with positionally encoded residues is experimentally studied for the encoding of 8-bit data using the base {2, 151}, which provides a linear FXP dynamic range of approximately 8.24 bits. Assuming data sampled from a Gaussian process, the bit assertion activities of the particular RNS, an 8-bit sign-magnitude, and an 8-bit two's-complement system are measured and compared. The results are depicted in Figs. 4-6 for 100 Monte Carlo runs. It is observed that RNS performs better than the two's-complement representation for anti-correlated data and slightly worse than the sign-magnitude and two's-complement representations for uncorrelated and correlated sequences.

The Quadratic RNS

Residue arithmetic can be exploited to reduce the number of real operations required to perform complex-number multiplication. This is achieved by employing an extension of RNS, the quadratic RNS (QRNS) [16]. The direct complex-number multiplication can be performed as

$p = (a + jb)(c + jd)$ (7)


3. Structure of a binary architecture (a) and the corresponding RNS processor (b): a forward converter partitions the n-bit word into residues of width about n/M, which are processed in parallel mod m_1, mod m_2, …, mod m_M channels, before an inverse converter reassembles the n-bit result.


$= (ac - bd) + j(bc + ad)$, (8)

where j is the imaginary unit (i.e., $\sqrt{-1}$), and a, b, c, and d are real numbers. Parhami [5] shows a different technique to reduce the number of multiplications to three, by performing five additions or subtractions with an extra computational step. According to this technique, the complex product is computed as

$p = [c(a + b) - b(c + d)] + j[c(a + b) - a(c - d)]$, (9)

where the common term c(a + b) is initially computed.

In case the moduli are primes of the form $m_i = 4k + 1$, a QRNS mapping can be established, such that the residue pair of the real and imaginary parts modulo $m_i$ can be mapped to a quadratic residue as

$(a_i, b_i) \xrightarrow{\mathrm{QRNS}} (q_i, q_i^*)$, (10)

where $q_i$ and $q_i^*$ are the quadratic images of $a_i$ and $b_i$, respectively. The quadratic images are obtained as

$q_i = \langle a_i + j b_i \rangle_{m_i}$ (11)

$q_i^* = \langle a_i - j b_i \rangle_{m_i}$, (12)

while the mapping is inverted as

$a_i = \langle 2^{-1} (q_i + q_i^*) \rangle_{m_i}$ (13)

$b_i = \langle 2^{-1} j^{-1} (q_i - q_i^*) \rangle_{m_i}$, (14)

where j is the solution of

$\langle j^2 + 1 \rangle_{m_i} = 0$. (15)

The quadratic mapping is of practical importance because it alleviates the dependency of the real and imaginary parts of a complex product on both the real and imaginary parts of both the operands, as shown by Eq. (8). In other words, it eliminates the cross-product terms. Therefore, by exploiting the QRNS, the complex product $\{(qp_i, qp_i^*) \mid i = 1, 2, \ldots, M\}$ of two QRNS-encoded complex numbers, $\{(qa_i, qa_i^*) \mid i = 1, 2, \ldots, M\}$ and $\{(qb_i, qb_i^*) \mid i = 1, 2, \ldots, M\}$, can be evaluated as the direct product of the corresponding quadratic images; i.e.,

$qp_i = \langle qa_i \cdot qb_i \rangle_{m_i}$ (16)

$qp_i^* = \langle qa_i^* \cdot qb_i^* \rangle_{m_i}$. (17)

Equations (16) and (17) show that a complex multiplication requires only two residue multiplications, instead of four multiplications, an addition, and a subtraction. Therefore, by paying an initial cost for conversion, a significant computational complexity reduction can be achieved by the QRNS mapping, which is directly translated to power savings.
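The QRNS mechanics of Eqs. (10)-(17) can be checked numerically. The sketch below uses a single modulus m = 13 (a prime of the form 4k + 1, chosen for illustration) with j = 5, since 5² ≡ −1 (mod 13):

```python
M, J = 13, 5                     # prime m = 4k + 1; J * J % M == M - 1
INV2 = pow(2, -1, M)             # 2^{-1} mod m
INVJ = pow(J, -1, M)             # j^{-1} mod m

def to_qrns(a, b):
    """Eqs. (11)-(12): (a, b) -> (q, q*)."""
    return ((a + J * b) % M, (a - J * b) % M)

def from_qrns(q, qs):
    """Eqs. (13)-(14): recover (a, b) from the quadratic images."""
    a = (INV2 * (q + qs)) % M
    b = (INV2 * INVJ * (q - qs)) % M
    return a, b

def qrns_mul(x, y):
    """Eqs. (16)-(17): componentwise products, no cross terms."""
    return ((x[0] * y[0]) % M, (x[1] * y[1]) % M)

a, b, c, d = 3, 7, 2, 9
got = from_qrns(*qrns_mul(to_qrns(a, b), to_qrns(c, d)))
want = ((a * c - b * d) % M, (b * c + a * d) % M)   # Eq. (8), reduced mod m
print(got == want)  # True
```

A full QRNS processor would run one such channel per modulus $m_i$, exactly as in the plain RNS architecture of Fig. 3.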


4. Number of low-to-high transitions, assuming strongly anti-correlated (ρ = −0.99) Gaussian data, for two's-complement (TC), RNS, and sign-magnitude (SM) number systems for 100 Monte Carlo runs.

5. Number of low-to-high transitions, assuming uncorrelated (ρ = 0) Gaussian data.

6. Number of low-to-high transitions, assuming strongly correlated (ρ = 0.99) Gaussian data.

Consider the Monte Carlo runs of the following experiment. Assuming that the real and imaginary parts of the factors of a complex product are taken from two Gaussian random processes, the total bit activity in the intermediate results is measured for the complex product evaluation. Specifically, 10-bit sign-magnitude and 10-bit two's-complement operations are compared to QRNS operations that cover a dynamic range in excess of 20 bits. Ten Monte Carlo runs, each of 1000 samples, compose the experiment, which is repeated for uncorrelated (ρ = 0), correlated (ρ = 0.99), and anti-correlated (ρ = −0.99) Gaussian data; results are shown in Figs. 7-9, respectively. Even in the case that QRNS provides a significantly larger dynamic range, it can be seen that the bit activity is reduced approximately two times.

D'Amora et al. have compared the implementation of a direct-form complex FIR filter with its QRNS counterpart [18]. They report that, for a particular throughput rate, the QRNS-based implementation requires half the area and a third of the power dissipation of the conventional implementation. The conventional implementation is assumed to utilize the four-multiplication scheme for complex-number multiplication, while the QRNS implementation exploits the index transform.

The index transform reduces a modulo-m multiplication to a modulo-(m − 1) addition, for m prime, resembling the reduction of multiplication to addition by LNS. An integer root ρ can be determined, such that the residues $r \in [1, m)$ can be written as

$r = \langle \rho^n \rangle_m$, (18)

and the multiplication of the residues can be reduced to addition modulo (m − 1) of the indices which correspond to the residues to be multiplied. Therefore, the modulo product p of two residues, $r_1$ and $r_2$, is

$p = \langle r_1 r_2 \rangle_m = \langle \rho^{n_1} \rho^{n_2} \rangle_m = \langle \rho^{n_1 + n_2} \rangle_m = \langle \rho^{n} \rangle_m$, (19)

where

$n = \langle n_1 + n_2 \rangle_{m-1}$. (20)

Hence, modulo multiplication can be performed as residue addition, preceded and followed by a mapping of the operands to their indices and of the result to the residue. These mappings are commonly implemented as table look-ups [16].
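A toy sketch of the index transform, with m = 7 and the primitive root ρ = 3 as illustrative choices: residue multiplication becomes index addition modulo m − 1, with look-up tables at both ends.

```python
M = 7                 # prime modulus
RHO = 3               # a primitive root of 7: its powers generate 1..6

# Eq. (18): build the residue <-> index look-up tables.
index_of = {pow(RHO, n, M): n for n in range(M - 1)}
residue_of = {n: pow(RHO, n, M) for n in range(M - 1)}

def index_mul(r1, r2):
    """Eqs. (19)-(20): modulo-m multiplication as modulo-(m-1) index addition."""
    if r1 == 0 or r2 == 0:        # zero has no index; handle it separately
        return 0
    n = (index_of[r1] + index_of[r2]) % (M - 1)
    return residue_of[n]

print(index_mul(4, 5), (4 * 5) % M)  # 6 6
```

In hardware the two dictionaries correspond to the index and inverse-index look-up tables, and the datapath between them is a single modulo-(m − 1) adder.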

The QRNS can exploit the index transform because the utilized moduli need to be prime. Hence, in the case of DSP architectures such as FIR filters, the coefficients can be directly stored in index-residue form; thus the strength of each multiplication can be further reduced, since the determination of the corresponding indices is not repeated for every residue multiplication. The significant power dissipation savings reported by D'Amora et al. assume the utilization of the index transform for residue multiplication [18].

7. Number of low-to-high transitions for complex-number multiplication, assuming uncorrelated (ρ = 0) Gaussian operands.

8. Number of low-to-high transitions for complex-number multiplication, assuming strongly correlated (ρ = 0.99) Gaussian operands.

9. Number of low-to-high transitions for complex-number multiplication, assuming strongly anti-correlated (ρ = −0.99) Gaussian operands.

Conclusions

Recent advances in computer arithmetic offer interesting alternative solutions for low-power design. Depending on an assortment of factors that need to be considered, such as signal statistics, computational load, type of arithmetic operations, accuracy, and dynamic range, it is worth evaluating the LNS or the RNS for hardware implementations of computationally intensive tasks.

The choice of arithmetic can lead to substantial power savings. It affects several levels of the design abstraction, since it can reduce the number of operations, the signal activity, and the strength of the operators. The impact of the arithmetic in a digital system is not limited to the definition of the architecture of arithmetic circuits.

Thanos Stouraitis received a B.S. in physics and an M.S. in electronic automation from the University of Athens, Greece, in 1979 and 1981, respectively; an M.S. in electrical engineering from the University of Cincinnati in 1983; and the Ph.D. degree from the University of Florida in 1986. He was awarded the Outstanding Ph.D. Dissertation award of the University of Florida and a Certificate of Appreciation by the IEEE Circuits and Systems Society in 1997. He is a professor of electrical and computer engineering at the University of Patras, Greece. He has served on the faculty of the University of Florida and the Ohio State University. He has published two books, several book chapters, and approximately 30 journal and 70 conference papers in the areas of computer architecture, computer arithmetic, VLSI signal and image processing, and low-power processing. He serves on the IEEE Circuits and Systems Society's technical committee on VLSI Systems and Applications and the digital signal processing and multimedia systems committees (e-mail: thanos@ee.upatras.gr).

Vassilis Paliouras received the Diploma in electrical engineering in 1992 and the Ph.D. degree in electrical engineering in 1999 from the Electrical and Computer Engineering Department, University of Patras, Greece. He works as a researcher at the VLSI Design Laboratory, ECE Dept., while teaching microprocessor-based system design at the Computer Engineering and Informatics Department, both at the University of Patras. His research interests include computer arithmetic algorithms and circuits, microprocessor architecture, and VLSI signal processing, areas in which he has published more than 30 conference and journal articles. Dr. Paliouras received the MEDCHIP VLSI Design Award in 1997. He is also the recipient of the 2000 IEEE Circuits and Systems Society Guillemin-Cauer Award. He is a member of ACM, SIAM, and the Technical Chamber of Greece.

References

[1] A.P. Chandrakasan, S. Sheng, and R. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, vol. 27, pp. 473-484, Apr. 1992.

[2] J.M. Rabaey and M. Pedram, Low Power Design Methodologies. Boston, MA: Kluwer, 1996.

[3] K.K. Parhi, "Low-energy CSMT carry generators and binary adders," IEEE Trans. VLSI Syst., vol. 7, pp. 450-462, Dec. 1999.

[4] T.K. Callaway and E.E. Swartzlander, Jr., "Power-delay characteristics of CMOS multipliers," in Proc. 13th Symp. Computer Arithmetic (ARITH13), Asilomar, CA, USA, July 1997, pp. 26-32.

[5] B. Parhami, Computer Arithmetic—Algorithms and Hardware Designs. New York: Oxford Univ. Press, 2000.

[6] P.E. Landman and J.M. Rabaey, "Architectural power analysis: The dual bit type method," IEEE Trans. VLSI Syst., vol. 3, pp. 173-187, June 1995.

[7] T.K. Callaway and E.E. Swartzlander, "Low power arithmetic components," in Low Power Design Methodologies, J.M. Rabaey and M. Pedram, Eds. Boston, MA: Kluwer, 1996.

[8] E. Swartzlander and A. Alexopoulos, "The sign/logarithm number system," IEEE Trans. Computers, vol. 24, pp. 1238-1242, Dec. 1975.

[9] R.E. Morley, Jr., G.L. Engel, T.J. Sullivan, and S.M. Natarajan, "VLSI based design of a battery-operated digital hearing aid," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, 1988, pp. 2512-2515.

[10] J.R. Sacha and M.J. Irwin, "The logarithmic number system for strength reduction in adaptive filtering," in Proc. Int. Symp. Low-Power Electronics and Design (ISLPED'98), Monterey, CA, 1998, pp. 256-261.

[11] V. Paliouras and T. Stouraitis, "Signal activity and power consumption reduction using the Logarithmic Number System," in Proc. 2001 IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 2, 2001, pp. II-653-II-656.

[12] V. Paliouras and T. Stouraitis, "Low-power properties of the Logarithmic Number System," in Proc. 15th Symp. Computer Arithmetic (ARITH15), 2001.

[13] N. Szabó and R. Tanaka, Residue Arithmetic and its Applications to Computer Technology. New York: McGraw-Hill, 1967.

[14] W.L. Freking and K.K. Parhi, "Low-power FIR digital filters using residue arithmetic," in Proc. 31st Asilomar Conference on Signals, Systems, and Computers, vol. 1, 1997, pp. 739-743.

[15] W.A. Chren, Jr., "One-hot residue coding for low delay-power product CMOS design," IEEE Trans. Circuits Syst. II, vol. 45, pp. 303-313, Mar. 1998.

[16] M.A. Soderstrand, W.K. Jenkins, G.A. Jullien, and F.J. Taylor, Residue Number Arithmetic: Modern Applications in Digital Signal Processing. Piscataway, NJ: IEEE Press, 1986.

[17] M.K. Ibrahim, "Novel digital filter implementations using hybrid RNS-binary arithmetic," Signal Processing, vol. 40, no. 2-3, pp. 287-294, 1994.

[18] A. D'Amora, A. Nannarelli, M. Re, and G.C. Cardarilli, "Reducing power dissipation in complex digital filters by using the Quadratic Residue Number System," in Proc. 34th Asilomar Conference on Signals, Systems, and Computers, 2000.
