implementation of wallace tree multiplier using adders.docx

7/25/2019 Implementation of Wallace Tree Multiplier Using adders.docx

1/7

International Journal of Emerging Engineering Research and Technology V3 I8 August 201 110110 International Journal of Emerging Engineering Research and Technology V3 I8 August 201

International Journal of Emerging Engineering Research and Technology

Volume 3! Issue 8! August 201! "" 110#11$

I%%& 23'(#'3( )"rint* + I%%& 23'(#''0( ),nline*

-esign and Im.lementation of /allace Tree ulti.lier sing

ogge %tone Adder and rent ung Adder

45 %hireesha1! -r5 45 ana6a -urga

2

1M.E, Department of Electronics and Communication Engineering, M.V.S.R Engineering College, Hyderabad

2rofessor ! H"D,Department of #nformation$ec%nologyEngineering,M.V.S.REngineeringCollege,Hyderabad

A%TRA7T

A fixed point Wallace tree multiplier architecture is used to perform multiple multiplications on different data

paths. Wallace tree multiplier using Kogge Stone and Brent Kung adders perform more number of

multiplications in parallel with fewer extra carry save adder stages than existing multiplier. he modified n!bit

Wallace tree multiplier structure is used to perform four "n#$%&"n#$%!bit multiplications' two n&"n#$%!bit

multiplications and one n&n!bit multiplication in parallel. (n the existing Wallace tree multiplier design )arryloo*ahed adder ")+A% is used. o further improve the speed and to reduce the area parallel prefix adders are

used in the modified Wallace tree multiplier. he Kogge Stone adder "KSA% is used for high!speed and Brent

Kung adder "BKA% is used to reduce the area. Wallace tree multiplier using KSA and BKA are implemented

using ,ilinx 1-.$

eyords9 Kogge stone adder' Brent Kung adder' Wallace tree multiplier.

I&TR,-7TI,&

he performance of the embedded system' microprocessor and many modern S/ applications are

mainly dependent on the performance of the multipliers as it is the *ey element. An efficient system

can be build by ma*ing suitable multiplier for the system. A multiplier has two stages' partial product

generation and partial product addition. An n bit multiplier produce n number of partial products andto reduce this number of partial products modified Booth algorithm 1 was used. But booth encoder

in this multiplier increases the circuit depth. he n bit Baugh Wooly array multiplier $ has the

circuit depth 2"n%. (n modern technology' vector processors - are playing a ma3or role to achieve

data level parallelism "+/%. 4ere multiple operands are following the multiple data paths in the same

hardware. So only one instruction is used to perform the operations on the vector of data. he twin

precision based array multiplier is explained in 5 is used for data level parallelism. Where the full

precision multiplier is used to implement two half precision multiplications with circuit depth of 2"n%.

hat is the 6!bit multiplier is used to perform two 5!bit multiplications or one 6 bit multiplication at a

time. he depth of )arry Save Adder ")SA% is 2"1%. he depth of the carry save addition tree is

2"log$ n% for Wallace tree 7 based multiplier and 2"n% for Braun based7 multiplier' where n is the

number of bits. he 8uarter precision Wallace tree multiplier 9 is used to produce the multiple

multiplications to improve the speed with increased circuit depth as trade off. o further improve thespeed and reduce the area Wallace tree multiplier using Kogge Stone adder and Brent Kung adder :!

6 are used.

32#IT /A::A7E TREE :TI":IER

he modern digital signal processor re8uires large multiplier to compute complex signal processing

operations. he S/ processor shows the need for 95!bit ;ultiplier' where four -$&-$!bit

multiplications or sixteen 19&19!bitmultiplications or four19&-$!bit multiplications or four6&6!bit

multiplications are performed using one 95!bit ;ultiplier in parallel. (n the same way' the Wallace

tree multiplier architecture is allowed to perform more than one multiplication in parallel to achieve

data level parallelism in vector processors. he -$!bit Wallace tree multiplier is having 6 carry save

stages and 75!bit final adder to get the product. Where the modified -$!bit Wallace structure is having

*Address for correspondence:

sirihanc y < g m a il.com
mailto:[email protected]:[email protected]


2/7

45 %hireesha+ -r5 45 ana6a -urga ;-esign and Im.lementation of /allace Tree ulti.lier sing

ogge %tone Adder and rent ung Adderach of the Bloc* ( will

act as19!bit Wallace tree multipliers carry reduction tree and they are getting the partial products

from each of the 8uarters he Bloc* ( is having 9carry save stages. he final carry and sum from

)SA15 of Bloc* ( is sent to $7!bit )+A to get four19!bit multiplication results at a time.

=ig15 &allace tree multiplier

=ig25artial product arrangement

Bloc* ((( respectively. >ach of the Bloc*s ( will act as19!bit Wallace tree multipliers carry reduction

tree and they are getting the partial products from each of the 8uarters. he Bloc* ( is having six carry

save stages. he final carry and sum from )SA15 of Bloc* ( is sent to $7!bit )+A to get four19!bit

multiplication results at a time.

loc6s of 32#>it /allace Treeulti.lier

loc6 II

=ig35'loc( # of &allace tree multiplier

he inputs to Bloc* (( are from the output of )SA15 of two of the Bloc* (' which tends to produce

two19&-$!bit multiplication results in parallel. he Bloc* (( is having two carry save stages and one

50!bit recursive doubling based )+A. Similarly the inputs to bloc* ((( are from the output of )SA15

of the entire Bloc* (' which tends to produce one -$&-$!bit multiplication result. he Bloc* ((( is


3/7

having four carry save stages and one75!bit recursive doubling based )+A. And all the Bloc* (('

Bloc* ((( and $7!bit )+As are in parallel. herefore the critical path of the modified structure includes

Bloc* ( and Bloc* (((. So the total critical depth of the -$!bit Wallace tree multiplier is e8ual to the

addition of number of carry stages of Bloc* (' number of carry save stage of Bloc* ((( and depth of

75!bit recursive doubling based )+A. So the critical path of Wallace structure includes 10 carry save

stages and one 75!bit recursive doubling based )+A. So the Wallace structure re8uires two extra

carry save stages than existing -$!bit Wallace structure and these causes slightly increase in the worstpath delay of modified system than the existing multiplier. he -$!bit Wallace structure has -0 carry

save adders and one 75!bit recursive doubling based )+A. he Bloc* ( has15 )arry Save adders

")SA%' Bloc* (( has $ )arry Save adders and cell ((( has six )arry Save adders. So the -$!bit Wallacestructure has"515%?"$$%?"19% @ 99 )arry Save adders' four $7!bit recursive doubling based

)+As'

two 50!bit recursive doubling based )+As and one75!bit recursive doubling based )+A. And hence'this huge difference causes increase in total cell area' total number of cells and net power than

conventional structure.

loc6 III

loc6 III

=ig'5'loc( ## of &allace tree multiplier

=ig5'loc( ### of &allace tree multiplier

,-I=IE- /A::A7E TREE :TI":IER

(n Wallace tree multiplier 9 )+A is replaced with Kogge Stone adder "KSA% and Brent Kung

adder"BKA% to further improve the speed and reduce the area. Kogge Stone adder and Brent Kung

adders are the parallel prefix adders. /arallel!prefix structures are found to be common in high


4/7

performance adders because the delay is logarithmically proportional to the adder width .he parallel

prefix adders are more flexible and are used to speed up the binary additions. /arallel prefix adders

are obtained from )arry +oo* Ahead ")+A% structure. he construction of parallel prefix adder

involves three stages

1. /re! processing stage

$. )arry generation networ*-. /ost processing

"re#.ossessing stage

(n this stage we compute' generate and propagate signals to each pair of inputs A and B. hese signals

are given by the logic e8uations 1$

/i@Ai xor Bi "1%

i@Ai and Bi "$%

7arry generation netor6

(n this stage we compute carries corresponding to each bit. >xecution of these operations is carried

out in parallel. After the computation of carries in parallel they are segmented into smaller pieces. (tuses carry propagate and generate as intermediate signals which are given by the logic e8uations - and5

)/iC3@/iC*?1 and /*C3 "-%

)iC3@iC*?1 or "/iC*?1 and *C3% "5%

"ost .rocessing

his is the final step to compute the summation of input bits. (t is common for all adders and the sumbits are computed by logic e8uation 7 and 9C

)i!1@ "/i and )in% or i "7%

Si@/i xor )i!1 "9%

hese parallel prefix adders contains gray cells and bloc* cells. he blac* cell "B)% generates theordered pair' the gray cell ")% generates only left signal :.

=ig$5'lac( cell and gray cell

32 IT /A::A7E TREE :TI":IER %I&4 %A

=ig?5Modified &allace tree multiplier using )S*


5/7

Kogge!Stone adder is a parallel prefix form of carry loo*!ahead adder. A parallel prefix adder can be

represented as a parallel prefix graph consisting of carry operator nodes. he time re8uired to generate

carry signals in this prefix adder is 2"log n%. (t is a fastest adder design and common design for high

performance adders in industry .(n pre computation stage propagation and generation operations are

performed. Stage1 used 15 blac* cells and one gray cell. Stage$ used 1$ blac* cells and two gray

cells. Stage- used 6 blac* cells and 5 gray cells. Stage5 used 6 gray cells. (n final computation stage

sum and carry generated.1$#>it ogge %tone adder

loc6 II

loc6 III

=ig85 1+bit )ogge Stone adder

=ig(5'loc( ## of modified &allace tree multiplier using )S*

=ig105'loc( ### of modified &allace tree multiplier using )S*

32#IT /A::A7E TREE :TI":IER %I&4 A

=ig115Modified &allace tree multiplier using ')*


6/7

he cost and wiring complexity is greatly reduced using Brent Kung adders .(n pre computation stage

we are doing propagation and generation operations. After that stage 1 used seven blac* cells and one

gray cell. Stage $ used three blac* cells and one gray cell. Stage - used one blac* cells and one gray

cell. Stage 5 used one gray cell.stage7 used one gray cell. Stage 9 used three gray cells. Stage : used

seven gray cells. (n final computation stage sum and carry are generated.

1$#it rent ung Adder

loc6 II

loc6 III

=ig125 1+bit 'rent )ung adder

=ig135'loc( ## of modified &allace tree multiplier using ')*

=ig1'5'loc( ### of modified &allace tree multiplier using ')*

%I:ATI,& /AVE=,R%


7/7

he simulation is done using ,ilinx (S> 1-.$ tool. (n the above waveform a and b are -$!bit numbers.

he final output is 95!bit multiplication result. (n parallel four 19D19 bit and two 19D-$ bit results

also produced.

7,"ARI%,& ,= AREA A&- -E:A@ ,= /A::A7E TREE :TI":IER

%I&4 A--ER%

he delay of Wallace tree multiplier using Kogge Stone adder is improved from 60.E::ns to-$.519ns. Wallace tree multiplier using Brent Kung adder delay is improved from 60.E::ns to

70.E19ns and also the area is also reduced by three slices.

7,&7:%I,&

A -$!bit Wallace tree multiplier is modified and redesigned. he modified Wallace tree multiplier

achieves data level parallelism in vector processors. (n addition part )SA' )+A adders used to

enhance to prefix adders KSA' BKA. KSA adder having high!speed and BKA adder ta*e less area so

three adders using -$ bit Wallace tree multiplier is implemented by ,ilinx 1-.$ and hardware *it

=/A spartan-e.

RE=ERE&7E%

1 /.>. ;adrid' B. ;illar' and >.>. SwartFlander' G;odified booth algorithm for high radixmultiplicationH' (>>> (nternational )onference on )omputer esignC I+S( in )omputers and

/rocessors' pp. 116!1$1' 2ct.1EE$.

$ ).>. KoFyra*is and .A. /atterson' GScalable' vector processors for embedded systemsH' ;icro'

(>>> Journals and magaFines' vol. $-' no.9' pp. -9!57' $00-.

- S3alander ; and +arsson!>defors /' G4igh!Speed and +ow!/ower ;ultipliers sing the Baugh!

Wooley Algorithm and 4/; Leduction reeH' (>>> (nternational )onference on >lectronics'

)ircuits and Systems' page"s% --!-9' Sep. $006.

5 ;. S3alander and /. +arsson!>defors' G;ultiplication acceleration through twin precisionH' (>>>

ransactions on Iery +arge Scale (ntegration "I+S(% Systems' vol. 1:' no. E' pp. 1$--!1$59'

Sept. $00E.

7 ). S. Wallace' MA suggestion for a fast multiplier' (>>> ransactions on >lectronic )omputers'

vol. >)!1-' no. 1' pp. 151:' =eb. 1E95

9 ;ohamed Asan Basiri' ;. Samaresh )handra Naya* and Noor ;ahammad S*' M;ultiplication

Acceleration hrough Ouarter /recision Wallace ree ;ultiplier'(>>> $015.

: Sudheer *umar PeFerla' B La3endra Nai* '.esign and estimation of delay' power' and area forparallel prefix adders' (>>> $015.

6 Adila*shmi Siliveru' ;.Bharathi Mesign of Kogge stone and Brent *ung adders using

degenerate pass transistor logic' (nternational 3ournal of emerging science and engineering$01-.

implementation of wallace tree multiplier using adders.docx

Documents