Complementary Based Logic Design for Arithmetic Building Blocks By Nikhil Patil, B.Tech A Thesis In Electrical Engineering Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCES Approved Dr. Tooraj Nikoubin Chair of Committee Dr. Brian Nutter Member of Committee Dr. Stephen Bayne Member of Committee Mark Sheridan Dean of the Graduate School May 2016


Complementary Based Logic Design for Arithmetic Building Blocks

By

Nikhil Patil, B.Tech

A Thesis

In

Electrical Engineering

Submitted to the Graduate Faculty

of Texas Tech University in

Partial Fulfillment of

the Requirements for

the Degree of

MASTER OF SCIENCES

Approved

Dr. Tooraj Nikoubin

Chair of Committee

Dr. Brian Nutter

Member of Committee

Dr. Stephen Bayne

Member of Committee

Mark Sheridan

Dean of the Graduate School

May 2016


Copyright 2016, Nikhil Patil


Texas Tech University, Nikhil Patil, May 2016


ACKNOWLEDGMENTS

I would like to thank Dr. Tooraj Nikoubin for all his help, advice, and patience, and for giving me the opportunity to work on this thesis. His guidance helped me throughout the research and the writing of this thesis. I could not imagine a better advisor for my research.

Besides my advisor, I would like to thank my thesis committee members, Dr. Brian Nutter and Dr. Bayne, for their insightful comments, encouragement, and support.

I thank my fellow labmates Ashish Joshi, Nima Eskandari, Swetha Rapollu, Karthikeya Challa, Abhilash Reddy, and Rathan for the stimulating discussions and for the sleepless nights we spent working together.

Last but not least, I would like to thank my parents and my sister for their support and encouragement.


Table of Contents

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
COMPLEMENTARY BASED LOGIC DESIGN
OPTIMIZATION OF ARITHMETIC BUILDING BLOCKS
    2.1 Traditional Add/Sub circuit for BCD
        2.1.1 Nine's complement circuit (Block A in Fig. 7)
        2.1.2 BCD adder (Block B in Fig. 7)
        2.1.3 Nine's complement or add by one or buffer (Block C)
    2.2 Reduced Delay Adder Optimization Using CBLD
    2.3 High-Speed Parallel Decimal Multiplication with Redundant Internal Encoding circuit optimization with CBLD
DESIGN OF CARRY SELECT ADDER USING CBLD
    3.1 Regular CSLA
    3.2 BEC-CSLA
    3.3 CS block based CSLA
PROPOSED ADDER DESIGN
    4.2 Half Adder Design
    4.3 Determination of stage size
        4.3.1 SQRT structure
        4.3.2 Uniform stage size Cascade Structure
ASIC DESIGN FLOW
SIMULATION AND RESULTS
    6.1 Optimization of arithmetic building blocks using CBLD
        6.1.1 Add/Sub module
        6.1.2 Reduced delay adder
        6.1.3 High Speed parallel multiplier
    6.2 Design of CBEC-RCA
CONCLUSION
REFERENCES


ABSTRACT

High-performance, low-power arithmetic circuits with reduced area are essential for advanced arithmetic processing. This thesis proposes a modification of arithmetic units using a new design technique called Complementary Based Logic Design (CBLD). CBLD minimizes the number of gates and hence improves the area efficiency of the overall circuit. The approach can be used in an arithmetic unit in which multiple blocks perform conditional operations in parallel at one stage, or in series with a dependency. When multiple modules operate in parallel, only one module does useful work while the others stay idle. CBLD eliminates such idle modules by compressing all the modules into a single module. The functionality of CBLD is verified by implementing it on an optimized BCD adder module and on a BCD adder/subtractor module. Comparison of each CBLD design with its conventional counterpart shows significant gains in area, power, and delay efficiency.

The second part of this thesis proposes a novel carry select adder design that implements CBLD. The proposed conditional BEC-CSLA, or modified ripple carry adder, is compared with recently published area-, power-, and time-efficient carry select adders.


LIST OF TABLES

1: Output in terms of input complement
2: Min Terms representation
3: Multi-input CBLD consideration
4: Nine's Complement table
5: Condition function for nine's complement
6: CBLD final Function for Nine's complement
7: Function for add by six
8: Condition for add by one and nine's complement from truth table
9: Control signal Conditions
10: CBLD Functions for the 9's complement add by one module
11: Conditions in the BCD adder
12: Add by six and add by one
13: Condition for add by nine / add by one / add by 10
14: Control signal condition
15: CBLD condition for Add by One
16: Result of comparison for Add/sub to CBLD counterpart
17: Reduced delay adder comparison
18: ADP PDP comparison for Reduced delay adder
19: Comparison of High speed parallel multiplier with CBLD
20: Comparison Result for CBEC-RCA and standard adders


LIST OF FIGURES

1: Typical multistage system
2: Complementary based Logic Design General Block Presentation
3: Final 2-bit Add by one Function
4: General XOR gate representation
5: (A) Proposed class-A XOR gate [14] (B) 3-input XOR [19]
6: (a) Time delay result for 2-input OR gate (b) Time delay result of 2-input XOR
7: Traditional BCD add/Sub circuit
8: Nine's complement module using CBLD
9: Waveform for nine's complement module with CBLD methodology
10: Add by six / buffer module using CBLD methodology
11: Functionality verification for BCD adder using CBLD methodology
12: 9's Complement / Add by one / Buffer Unit by CBLD methodology
13: Functionality Waveform for stage 3 (c_e is borrow or subtraction correction / a_s is add/Sub decision)
14: Reduced Delay Adder Block Diagram [1]
15: Multistage CBLD
16: BCD correction module for reduced delay adder using CBLD methodology [18]
17: High-Speed Parallel Decimal Multiplication module [6]
18: Optimized Circuit for High-Speed Parallel Decimal Multiplication circuit using CBLD
19: Multistage Approach for High-Speed Parallel Decimal Multiplication circuit [18]
20: Waveform for Multistage CBLD for multiplier module
21: Typical CSLA structure [18]
22: BEC unit proposed in BEC CSLA [4]
23: Typical Square Root Structure with increasing stage size [4]
24: SQRT Structure for BEC CSLA [4]
25: Carry select adder with conditional carry [18]
26: Carry skip based BEC-adder [22]


27: Add by one realization
28: Ripple carry adder CBEC
29: Proposed CBEC RCA design with half adder [18]
30: Waveform for 8-bit CSLA
31: SQRT structure delay path for last multiplexer
32: Critical Path for SQRT RCA-CBEC
33: 4-4 Structure for RCA-CBEC
34: The 2 bit RCA-CBEC unit
35: 2-bit cascade structure for CBEC-RCA
36: VLSI IC Design cycle [16]
37: Design Compiler Functional Diagram [17]
38: Typical Floor plan in IC compiler
39: Final IC layout using IC compiler
40: ADP and PDP comparison for add/sub
41: ADP PDP comparison for High speed parallel multiplier
42: Area Comparison
43: Power comparison


CHAPTER I

INTRODUCTION

In arithmetic circuits, multiple modules are often assigned to one stage and work in parallel. In many cases, however, only one of these parallel modules is active at a time. Complementary Based Logic Design (CBLD) is an approach that compresses the redundant modules using a standard reduction method, which provides better area, power, and delay performance for complicated circuitry. In a conventional circuit, multiple modules process the input differently, but depending on the condition only one or a few of them are actually used. CBLD considers all the modules together with a control signal and compresses them into one module.

The CBLD design method also works when a constant is to be added to a variable number. Adding a constant to a variable number is general practice in digital design; in computer architecture, "increment memory address by 1" is a very typical operation. Traditionally, the constant is added to the variable number (the address of a register in this case) by an n-bit binary adder. This method of adding a constant is area-consuming and slow. Various efficient circuits exist; however, these circuits are case-specific and are generated using the AOI or OAI method. CBLD, on the other hand, gives a general methodology that works in any add-by-constant case and gives comparable results in a standard form.

In CBLD, XOR gates are used as controlled buffer or inverter modules to control the output of each module. When one input of an XOR gate is high, the gate outputs the inverse of the second input, so it acts as an inverter. When that input is low, the other input is passed without any change and the XOR gate acts as a buffer. This attribute of the XOR gate is harnessed to develop the CBLD methodology.
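The controlled buffer/inverter behavior described above can be illustrated with a minimal Python sketch. The function name `xor_gate` is an illustrative assumption and does not come from the thesis; the sketch only tabulates the standard XOR truth table with one input treated as the control.

```python
def xor_gate(control: int, data: int) -> int:
    """Single-bit XOR; `control` selects buffer (0) or inverter (1)."""
    return control ^ data


# Tabulate both roles of the gate.
for control in (0, 1):
    role = "inverter" if control else "buffer"
    for data in (0, 1):
        out = xor_gate(control, data)
        print(f"control={control} data={data} out={out} ({role})")
```

With `control = 0` the output equals `data`, and with `control = 1` the output is its inverse, which is the property CBLD builds on.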

CBLD can be used to optimize an existing circuit that has multiple parallel modules with conditional inputs, and it can also be used to design a new circuit in which a conditional constant addition is used. The second part of the thesis presents such a design example. A conditional carry select adder is proposed and compared with state-of-the-art adders. Carry select adders are used because they offer better speed than simple carry skip or carry save adders and consume less area than complicated adders such as carry look-ahead adders. A novel structure of a conditional binary to excess-1 adder based on carry select adder logic is proposed.

A regular carry select adder (CSLA) is area-consuming because of its dual ripple carry adder (RCA) structure. To reduce area, the CSLA can be implemented using a single RCA and a binary to excess-1 (BEC) circuit instead of the dual RCA. A conditional BEC (CBEC) block is proposed that can be used in place of the regular BEC block and multiplexer, either to add one to the output of the modified RCA or to pass the output without any change. This approach removes the need for multiplexers in the CSLA and hence reduces delay, area, and power consumption compared to the BEC-CSLA. The characteristics of the modified RCA are compared with the regular CSLA and two recent CSLA designs with the SQRT structure. Results and analysis show that the modified CBEC-CSLA has better area and power consumption than the regular dual RCA-CSLA, CBEC-MRCA, BEC based carry select adder and CS-based CSLA.


CHAPTER II

COMPLEMENTARY BASED LOGIC DESIGN

A typical multistage system is shown in Fig. 1. Each block in the figure represents a module, and each module generates an output that is used by the next stage. The advantage of a binary system is that the output of any stage can take only two values, 1 or 0. Hence, for every bit, the output is either the inverted input or the input passed without any change. This principle is used to develop the CBLD methodology.

Figure 1: Typical multistage system (inputs I0 to I3, signals C0 to C3, outputs O0 to O3)

As mentioned earlier, CBLD relies heavily on a property of XOR gates. Consider one input of an XOR gate as a control signal and the second input as the variable. If the control input is 1, the variable input is inverted; if the control input is 0, the variable input is passed without any change.

For a given circuit, a function of the inputs can be formed to represent the outputs, i.e. Outputs = f(Inputs). Similarly, Complementary Based Logic Design considers a function of the inputs that tells when an input bit must be complemented to produce the corresponding output bit, i.e. $O_k = I_k \oplus f_k(\text{Inputs})$.

Figure 2: Complementary based Logic Design General Block Presentation (blocks CBLD F1 to CBLD F4, inputs I[0] to I[3], outputs O[0] to O[3])


This gives us the ability to use the control signal directly inside the module, either to bypass the entire stage or to complement the input according to the function.

Let us consider a case in which one is added to the input if the control signal X is high; if X is zero, the input is passed to the output without any change.

Step 1: Represent the output in terms of the input complement.

Table 1: Output in terms of input complement

Input      Output     Output in terms of inputs
I1  I0     O1  O0     O1    O0
0   0      0   1      NC    C
0   1      1   0      C     C
1   0      1   1      NC    C
1   1      0   0      C     C

* C = Complement, NC = No Complement

Step 2: Find the minterms:

Now we consider all the complement terms as 1 and the non-complement terms as 0. Consider I1 and O1, with the rows numbered 0 to 3 in the order of Table 1:

Table 2: Minterm representation

I1   O1   Complement   Minterm
0    0    NC           0
0    1    C            1
1    1    NC           0
1    0    C            1

The minterms for which $O_1 = \overline{I_1}$ are (1, 3).

Step 3: Function realization:

The function can be realized using SOP, POS or the Quine-McCluskey method. A simple SOP of the minterms above gives the complement function for O1 as $f_1 = I_0$.

Following steps 2 and 3 for O0, which is complemented in every row of Table 1, gives the constant function $f_0 = 1$.
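Steps 1 to 3 can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not the thesis's own tooling: the helper name `complement_minterms` is hypothetical, and the truth table is encoded as a dictionary from the 2-bit input value to the 2-bit output value (I1 is bit 1, I0 is bit 0).

```python
def complement_minterms(truth_table: dict, bit: int) -> list:
    """Return the input values (minterms) for which output bit `bit`
    is the complement of input bit `bit`."""
    return [i for i, o in sorted(truth_table.items()) if ((i ^ o) >> bit) & 1]


# 2-bit add-by-one, as in Table 1: 00->01, 01->10, 10->11, 11->00.
add_one_2bit = {i: (i + 1) % 4 for i in range(4)}

print(complement_minterms(add_one_2bit, 1))  # O1 versus I1: [1, 3]
print(complement_minterms(add_one_2bit, 0))  # O0 versus I0: [0, 1, 2, 3]
```

Minterms (1, 3) over (I1, I0) are exactly the rows with I0 = 1, so the complement function for O1 reduces to I0. For O0 every row is listed, which matches the all-C column of Table 1: that bit is complemented unconditionally.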


Step 4: Control signal consideration:

The control signals are ANDed with the functions, so each function is applied only when its control signal is high. In this example there is only one complement case, governed by the control signal X; hence X is multiplied with these functions.

The circuit realized from the resulting equations is shown in Fig. 3.

Figure 3: Final 2-bit Add by one Function (inputs I1, I0 and control X; outputs O1, O0)

The proposed circuit performs the add-by-one or pass-through operation with only 3 gates.
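A gate-level model of the 2-bit conditional add-by-one can be sketched as follows. The equations used, O0 = I0 XOR X and O1 = I1 XOR (X AND I0), are an assumption: they are one realization that is consistent with Table 1 and with the stated count of 3 gates (two XOR, one AND), and are not read off Fig. 3. The function name is hypothetical.

```python
def cbld_add_one_2bit(i1: int, i0: int, x: int) -> tuple:
    """2-bit conditional incrementer: adds 1 when x = 1, passes input when x = 0."""
    o0 = i0 ^ x           # gate 1: XOR, complement function for O0 is 1, ANDed with X
    o1 = i1 ^ (x & i0)    # gates 2 and 3: AND then XOR, complement function is I0
    return o1, o0


# Exhaustive check against ordinary modulo-4 arithmetic.
for value in range(4):
    for x in (0, 1):
        o1, o0 = cbld_add_one_2bit((value >> 1) & 1, value & 1, x)
        assert (o1 << 1) | o0 == (value + x) % 4
print("2-bit conditional add-by-one verified for all 8 input combinations")
```

The exhaustive loop plays the same role as a simulation test bench: it confirms that the single compressed module replaces both the incrementer and the bypass path.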

CBLD can be expanded further if the output is represented in terms of every input in order to find the complement function.

Consider the following example, in which O1 of an arbitrary circuit is represented in terms of I1 and I0.

Table 3: Multi-input CBLD consideration

I1   I0   O1   O1 in terms of I1    O1 in terms of I0
0    1    1    $\overline{I_1}$     $I_0$
0    0    1    $\overline{I_1}$     $\overline{I_0}$
1    1    0    $\overline{I_1}$     $\overline{I_0}$
1    0    0    $\overline{I_1}$     $I_0$

The function for O1 can now be written in terms of I1, where O1 is simply $\overline{I_1}$ in every row, or in terms of I0, where O1 is complemented only in the rows with I1 = I0 and therefore needs an additional XOR-type term involving I1. We can choose whichever function is less complicated and easier to realize, instead of always expressing O1 in terms of I1 directly. In this example O1 in terms of I1 is clearly the best case; however, this is not always so.
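The choice of reference input can be explored with a short sketch. The rows below are copied from Table 3 as (I1, I0, O1); the helper name `complement_pattern` is a hypothetical label. For each candidate reference input the sketch lists, row by row, whether O1 is the complement (1) or a copy (0) of that input.

```python
# Rows of Table 3 as (I1, I0, O1).
rows = [(0, 1, 1), (0, 0, 1), (1, 1, 0), (1, 0, 0)]


def complement_pattern(table_rows: list, ref_index: int) -> list:
    """1 where O1 differs from the reference input, 0 where it matches."""
    return [row[ref_index] ^ row[2] for row in table_rows]


print(complement_pattern(rows, 0))  # relative to I1: [1, 1, 1, 1]
print(complement_pattern(rows, 1))  # relative to I0: [0, 1, 1, 0]
```

A constant pattern, as obtained for I1, needs no condition logic at all, whereas the mixed pattern for I0 needs a condition that depends on both inputs. This is why I1 is the cheaper reference in this example.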

1.1 XOR gate consideration:

The XOR gate is a bottleneck in modern digital design. The logical representation of an XOR gate is:

$A \oplus B = A\,\overline{B} + \overline{A}\,B$

which can be represented as shown in Fig. 4.

Figure 4: General XOR gate representation (inputs A and B, output O)

This is a 3-stage structure.

However, for design simulation, modern compilers use highly optimized XOR gate designs. Various research shows that the performance of the XOR gate can be improved with a 4-transistor design. The paper "New 4-Transistor XOR XNOR designs" [14] presents a novel XOR gate design based on cell design methodology.

Figure 5: (A) Proposed class-A XOR gate [14] (B) 3-input XOR [19]

This design is improved using transmission gates in [15], which presents various designs of XOR-XNOR circuits. The gates developed in that paper are balanced XOR-XNOR circuits. The author suggests various methods to implement an XOR-XNOR circuit. This group of XOR designs uses a feedback network mechanism to correct high-impedance states and obtain the XOR-XNOR circuit from basic cells.

A systematic cell design methodology for designing XOR gates is suggested in [19]. This design provides a full-swing and fairly balanced XOR circuit with 3 inputs, and its critical path contains only 2 transistors. The elementary basic cell (EBC) for this circuit is shown in Fig. 5(B).

We performed a design test on the gates in Design Vision to analyze XOR gate performance. The following figure gives the delay results for the XOR gate and the OR gate.

Figure 6: (a) Time delay result for 2-input OR gate (b) Time delay result of 2-input XOR

In Fig. 6 the highlighted box represents the data arrival time for the gates. The result shows that the standard cell library (saed90_typ_ht.db) suggested by Synopsys uses a modified XOR gate whose time delay is comparable to that of the OR gate. This indicates that the XOR gate can be treated as a standard gate with nearly the same parameters as the AND and OR gates.


CHAPTER II

OPTIMIZATION OF ARITHMETIC BUILDING BLOCKS

2.1 Traditional Add/Sub circuit for BCD

The first arithmetic block optimized is a basic illustration of the traditional BCD add/sub unit. This example shows how far a traditional design can be optimized. The design contains an add-by-constant operation, a nine's complement operation and a conditional addition operation, and all of these modules can be optimized efficiently with the CBLD methodology. The optimization significantly reduces the number of gates required, which lowers the area consumption of the circuit. The reduction in the number of gates in the critical path lowers the delay, and the reduced number of elements together with the elimination of the multiplexer unit lowers the power consumption.

A. BCD Addition:

In computer arithmetic, numbers are represented in binary format. Binary Coded Decimal (BCD) is a method of representing a decimal number in binary format. A decimal digit takes one of 10 values, from 0 to 9, and a group of 4 bits is used to represent each decimal digit. BCD therefore uses only the binary numbers 0000 to 1001, which are the binary equivalents of 0 to 9. If a number has a single decimal digit, its BCD equivalent is the four binary digits of that digit. If the number contains two decimal digits, its BCD equivalent is eight binary digits: four for the first decimal digit and the next four for the second decimal digit. Hence 64 is represented as 0110 0100.
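The digit-by-digit encoding described above can be sketched in a few lines of Python. The function name `to_bcd` is an illustrative assumption; the sketch simply converts each decimal digit to its own 4-bit group.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative integer as BCD, one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(64))  # 0110 0100
print(to_bcd(9))   # 1001
```

Note that the BCD form of 64 differs from its pure binary form (1000000), because each decimal digit is encoded separately.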

BCD addition Steps


1) Add the two numbers using binary rules:

Case I      1010        Case II     0010
          + 0100                  + 0101
          = 1110                  = 0111

2) Judge the result of the addition. The two cases above illustrate the rules of BCD addition. In Case I the binary sum is 14, which is greater than 9 and therefore not a valid BCD digit. In Case II the sum is 7, which is less than 9 and therefore a valid BCD digit.

3) The result in Case I is not a valid BCD digit, and hence 6 has to be added to obtain the BCD result:

            1110
          + 0110
          = 0100 (with a carry of 1)

Hence the result of the addition is 4 with a carry of 1.

B. Nine's complement method for BCD subtraction

The 9's complement of a number is obtained by subtracting the number from $10^n - 1$, where n is the number of digits in the number. For example, the 9's complement of 333 is obtained by subtracting it from $10^3 - 1$:

            999
          - 333
          = 666

Steps for subtraction by 9's complement:

Consider BCD subtraction by 9's complement with A = 8, B = 6.


1) Take the 9's complement of the number to be subtracted (B = 6, 0110). Logically this means subtracting the digit from 9; in hardware it is done in two steps:

a) Complement the number: 1001

b) Add 1010 to the result and discard the carry: 0011

2) Add the other BCD operand (A = 8) to this number. This is a plain binary addition that does not consider the BCD aspect, so the result is in hexadecimal format and can be greater than 9.

  1000
+ 0011
= 1011

3) As the addition in the previous stage gives a result in hexadecimal format, the number should be converted back to BCD. The "add 6 if the number is > 9" condition converts the number to BCD.

  1011
+ 0110
= 0001 (with a carry of 1)

4) Add one if a carry was generated. If no carry was generated, the answer is in 9's complement format and needs to be converted back to BCD by taking the 9's complement of the number, as in step 1.

  0001
+ 0001
= 0010

Hence, 8 - 6 = 2 (0010).
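The four steps above can be followed in a small behavioral model of a single BCD digit. This is a sketch under stated assumptions, not the thesis's circuit: the function names are hypothetical, and the model works on integers rather than gates. The nine's complement is computed exactly as in step 1 (invert the 4 bits, add 1010, drop the carry).

```python
def bcd_digit_add(a: int, b: int, carry_in: int = 0) -> tuple:
    """Binary add, then add 6 if the sum exceeds 9. Returns (digit, carry)."""
    total = a + b + carry_in
    if total > 9:
        total += 6                      # BCD correction
    return total & 0xF, total >> 4


def nines_complement(digit: int) -> int:
    """Invert the 4 bits, add 1010, discard the carry (step 1)."""
    return ((~digit & 0xF) + 0b1010) & 0xF


def bcd_digit_sub(a: int, b: int) -> tuple:
    """Compute a - b for single BCD digits. Returns (magnitude, no_carry_flag)."""
    digit, carry = bcd_digit_add(a, nines_complement(b))   # steps 2 and 3
    if carry:
        digit, _ = bcd_digit_add(digit, 1)                 # step 4: add one
        return digit, False
    return nines_complement(digit), True                   # step 4: re-complement


print(bcd_digit_sub(8, 6))  # (2, False)
```

When no carry is generated the returned flag is `True` and the magnitude is b - a; this includes the case a = b, where the magnitude is 0.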


Figure 7: Traditional BCD add/Sub circuit. Block A: 9's complement / buffer. Block B: BCD adder with add by six / buffer. Block C: BCD subtraction correction / add by one / 9's complement. The circuit is built from 4-bit binary adders and a 10X5 MUX, with operand inputs A3 to A0 and B3 to B0 and the Add/Sub control.

The figure above shows the traditional add/sub circuit. The first stage of the circuit is the 9's complement unit, the second stage is the BCD addition, and the third stage is the BCD subtraction adjustment. Add/Sub is the control signal.

In the first stage, the Add/Sub control is an input to XOR gates together with the first operand. If the Add/Sub signal is high the operand is complemented; otherwise the operand remains the same. The result of this operation is given to one input of a binary ripple carry adder whose second input, formed from the Add/Sub decision, is 10 (1010) or 0. Hence either binary 10 is added to the complemented number or the operand is passed without any change.


The second stage adds the two operands with a binary adder and checks the result. If the result is greater than nine, six is added to the result and a carry is generated.

The third stage either takes the nine's complement of the number or adds one to the result, depending on the Add/Sub decision and the carry/borrow generation.

2.1.1 Nine's complement circuit (Block A in Fig. 7):

In this stage the input has to be complemented and ten added to it. Traditionally this is a two-stage approach; however, as the inputs and outputs are well defined, we can map them directly to obtain the CBLD functions. Consider the nine's complement truth table:

Table 4: Nine's complement table

Inputs (A3 A2 A1 A0)    Outputs (B3 B2 B1 B0)
0 0 0 0                 1 0 0 1
0 0 0 1                 1 0 0 0
0 0 1 0                 0 1 1 1
0 0 1 1                 0 1 1 0
0 1 0 0                 0 1 0 1
0 1 0 1                 0 1 0 0
0 1 1 0                 0 0 1 1
0 1 1 1                 0 0 1 0
1 0 0 0                 0 0 0 1
1 0 0 1                 0 0 0 0

Solving the above table and following the steps for function generation, the conditions listed in Table 5 are obtained.

Table 5: Condition function for nine's complement

Bit   Condition for complement
B0    Always
B1    Never
B2    $A_1$
B3    $\overline{A_1 + A_2}$


Here A[3:0] is the input and B[3:0] is the output. Again there is only one control signal, the Add/Sub decision X; hence this signal is ANDed with the given conditions:

Table 6: CBLD final function for nine's complement

Bit   CBLD function
B0    $X$
B1    No operation
B2    $X \cdot A_1$
B3    $X \cdot \overline{A_1 + A_2}$

The above equations can be used to realize a CBLD circuit for the 9's complement. Fig. 8 shows this circuit.

Figure 8: Nine's complement module using CBLD (inputs I3 to I0 and control X; outputs O3 to O0)

Number of gates: 6

Number of gates in critical path: 3

The above circuit was verified in Verilog using Xilinx, and the simulation result confirms its functionality.

The test bench used for the verification is as follows, where count is a temporary variable:

    for (count = 0; count < 9; count = count + 1)
    begin
        b = b + 1;   // apply the next input value
        a_s = 1;     // keep the Add/Sub control high
        #100;        // wait 100 time units
    end

The above experiment gives the following waveform:

Figure 9: Waveform for nine's complement module with CBLD methodology

Here O[3:0] is the output and i[3:0] is the input.
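The nine's complement module can also be checked with a bit-level Python model. This is a sketch under stated assumptions: the function name is hypothetical, and the complement condition for the most significant bit is taken as the NOR of A1 and A2, which is what the truth table in Table 4 requires (that bit flips only for inputs 0, 1, 8 and 9). The model uses 6 gates, in line with the gate count above.

```python
def cbld_nines_complement(a: int, x: int) -> int:
    """Return 9 - a when x = 1 (for a in 0..9), or a unchanged when x = 0."""
    a0, a1, a2, a3 = ((a >> k) & 1 for k in range(4))
    b0 = a0 ^ x                          # always complemented when x = 1
    b1 = a1                              # never complemented
    b2 = a2 ^ (x & a1)                   # complemented when A1 = 1
    b3 = a3 ^ (x & (1 - (a1 | a2)))      # complemented when A1 = A2 = 0 (NOR)
    return (b3 << 3) | (b2 << 2) | (b1 << 1) | b0


# Exhaustive check over all valid BCD digits, for both control values.
for digit in range(10):
    assert cbld_nines_complement(digit, 1) == 9 - digit
    assert cbld_nines_complement(digit, 0) == digit
print("nine's complement module verified for digits 0 to 9")
```

The loop mirrors the Verilog test bench: it sweeps the input through the BCD digits with the Add/Sub control held high, and additionally checks the pass-through case.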

2.1.2 BCD adder (Block B in Fig. 7):

The second stage is the BCD adder. It first adds the two BCD digits using binary rules and then, depending on the result, adds 0 or 6. Because the second part of this module adds either 6 or 0, it is a conditional block and CBLD can be applied.

This is an add-by-constant case in which the constant is six. The inputs and outputs are therefore well defined, and the CBLD functions can be found. From the truth table we get the following conditions:

Table 7: Function for add by six

Bit   Condition for complement
B0    No change
B1    Always
B2    $\overline{I_1}$
B3    $I_2 + I_1$

Here the condition signal is generated by the first part of the module and represents the BCD correction. Denoting this condition signal by X, we obtain the following circuit:


Figure 10: Add by six / buffer module using CBLD methodology (inputs I3 to I0 and control X; outputs O3 to O0)

The functionality of the BCD module was verified using Verilog, and the following figure confirms it:

Figure 11: Functionality verification for BCD adder using CBLD methodology
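As a cross-check of the add-by-six conditions, the sketch below models the conditional stage in Python and wraps it in a one-digit BCD adder. The names `add_six_cbld` and `bcd_digit_add` are hypothetical. Two assumptions are made: the bit-2 condition is taken as the complement of I1 (which the add-by-six truth table requires), and the correction signal X is the usual textbook condition "carry out, or binary sum greater than 9", which the text does not spell out.

```python
def add_six_cbld(i: int, x: int) -> int:
    """Conditionally add 6 (mod 16) to a 4-bit value by complementing bits."""
    i0, i1, i2, i3 = ((i >> k) & 1 for k in range(4))
    flips = [
        0,                  # B0: no change
        x,                  # B1: always (when X = 1)
        x & (1 - i1),       # B2: I1'  (assumed complemented)
        x & (i2 | i1),      # B3: I2 + I1
    ]
    bits = [i0 ^ flips[0], i1 ^ flips[1], i2 ^ flips[2], i3 ^ flips[3]]
    return sum(bit << k for k, bit in enumerate(bits))


def bcd_digit_add(a: int, b: int):
    """Add two BCD digits: binary addition followed by the conditional +6 stage."""
    total = a + b                                   # first part: binary rules
    low, carry = total & 0xF, total >> 4
    s1, s2, s3 = (low >> 1) & 1, (low >> 2) & 1, (low >> 3) & 1
    x = carry | (s3 & (s2 | s1))                    # assumed correction condition
    return x, add_six_cbld(low, x)                  # (decimal carry, BCD digit)


print(bcd_digit_add(9, 9))
print(bcd_digit_add(4, 5))
```

The first call needs the correction stage (binary sum above 9) and the second does not, so the two printed pairs exercise both the add-by-six path and the buffer path.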

2.1.3 Nine's complement / add by one / buffer (Block C):

In subtraction mode this module adds one if a borrow is generated and otherwise takes the nine's complement; in addition mode it leaves the input unchanged. Both results are given to a multiplexer, which outputs the result of the addition or subtraction.

In the CBLD approach there are three conditions (add by one, nine's complement and buffer) and two control signals (the Add/Sub decision and the borrow). The functions for the nine's complement are defined in 2.1.1, and the functions for add by one can be derived similarly. Both sets of conditions are as follows:

Table 8: Conditions for add by one and nine's complement from the truth table

Bit    Add by one      Nine's complement

O0     Always          Always
O1     I0              Never
O2     (I1. I0)        I1
O3     (I2. I1. I0)    (I1 + I2)'


Control signal consideration: the control signal that selects between these two conditions is the borrow (B), and both operations are active only in subtraction.

Table 9: Control signal conditions

Condition         Control signal

9's complement    (A_S).(B')
Add by one        (A_S).B

Combining the above two tables, we get the following functions (' denotes the complement):

Table 10: CBLD functions for the 9's complement / add by one module

Bit    Final function for Add/Sub (X1)

O0     A_S
O1     A_S. B. I0
O2     A_S. (B. I0. I1 + B'. I1)
O3     A_S. (B. I2. I1. I0 + B'. (I1 + I2)')

Hence the circuit realization is as follows:

Figure 12: 9's complement / add by one / buffer unit using CBLD methodology (inputs I3-I0, Borrow and Add/Sub; outputs O3-O0)

Number of gates: 15

Number of gates in critical path: 5

The functionality of the above circuit was verified using Verilog, and the following waveform confirms it:

Figure 13: Functionality waveform for stage 3 (c_e is the borrow or subtraction correction; a_s is the Add/Sub decision)
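The combined functions of this block can be exercised with a short behavioral model. The Python sketch below is only an illustration; the name `stage3_cbld` is hypothetical. It assumes that the nine's complement terms are gated by the complement of the borrow, that the O2 term for the nine's complement is I1, and that the O3 term is the complement of (I1 + I2), as the truth tables of 9 - I and I + 1 require.

```python
def stage3_cbld(i: int, a_s: int, b: int) -> int:
    """9's complement (a_s=1, b=0), add by one (a_s=1, b=1) or buffer (a_s=0)."""
    i0, i1, i2, i3 = ((i >> k) & 1 for k in range(4))
    nb = 1 - b                                              # complement of borrow
    flips = [
        a_s,                                                # O0
        a_s & b & i0,                                       # O1
        a_s & ((b & i1 & i0) | (nb & i1)),                  # O2
        a_s & ((b & i2 & i1 & i0) | (nb & (1 - (i1 | i2)))),  # O3
    ]
    bits = [i0 ^ flips[0], i1 ^ flips[1], i2 ^ flips[2], i3 ^ flips[3]]
    return sum(bit << k for k, bit in enumerate(bits))


# Show the three behaviors for one sample digit.
print(stage3_cbld(3, 0, 0), stage3_cbld(3, 1, 0), stage3_cbld(3, 1, 1))
```

A single function covers the three behaviors, which is the point of the CBLD formulation: no separate adder, complementer and multiplexer are modeled.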

2.2 Reduced Delay Adder Optimization Using CBLD

The reduced delay adder proposed by Bayrakci and Akkas [1] is a novel design for a BCD adder. The circuit broadly has two modules. The first module adds the two BCD numbers using binary rules. The second module does the BCD correction.

Figure 14: Reduced Delay Adder Block Diagram [1]

The inputs to the second module are the carry input to the entire circuit (Cin), the carry output from the first module (Cout) and the result of the binary summation (Sum). Cin and Cout are arranged so that if Cin is high and Cout is low, 1 is added to the sum; if Cout is high and Cin is low, 6 is added; if both Cin and Cout are high, 7 is added; and if both are low, the sum is passed through without any change.

Tabulating these conditions:

Table 11: Conditions in the BCD adder

Input                 Operation

Cin = 1, Cout = 0     Add by one
Cout = 1, Cin = 0     Add by six
Cin = 1, Cout = 1     Add by seven

Here we notice that performing condition 1 and condition 2 together satisfies condition 3, since 1 + 6 = 7. Hence, we use a multistage approach in the CBLD, as shown in the following figure:

Figure 15: Multistage CBLD (two rows of CBLD blocks, F01-F04 and F11-F14, with outputs O[0]-O[3])

Instead of combining the two stages into one function, we apply an add-by-constant function in each stage. The CBLD conditions for add by six and add by one can be obtained from the truth table and are as follows:


Table 12: Add by six and add by one

Bit    Condition for complement,    Condition for complement,
       add by six                   add by one

O0     No change                    Always
O1     Always                       I0
O2     I1'                          I1.I0
O3     (I2 + I1)                    I2.I1.I0

Considering the above conditions, we obtain the following circuit:

Figure 16: BCD correction module for the reduced delay adder using CBLD methodology [18] (inputs Cin, Cout and A3-A0; outputs S3-S0)

Number of gates with CBLD design: 15

Number of gates in critical path with CBLD design: 6

Number of gates in former design: 24

Number of gates in critical path of former design: 8
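The multistage idea can be sketched as two conditional complement stages in series. The Python model below is illustrative only; the function names are hypothetical, the order of the two stages (add by six first, then add by one) is an assumption, and the bit-2 condition of the add-by-six stage is taken as the complement of I1.

```python
def flip_bits(value: int, conditions) -> int:
    """Complement bit k of value wherever conditions[k] is 1."""
    return value ^ sum(c << k for k, c in enumerate(conditions))


def add_six_stage(v: int, enable: int) -> int:
    i1, i2 = (v >> 1) & 1, (v >> 2) & 1
    return flip_bits(v, [0, enable, enable & (1 - i1), enable & (i2 | i1)])


def add_one_stage(v: int, enable: int) -> int:
    i0, i1, i2 = v & 1, (v >> 1) & 1, (v >> 2) & 1
    return flip_bits(v, [enable, enable & i0, enable & i1 & i0, enable & i2 & i1 & i0])


def bcd_correction(sum4: int, cin: int, cout: int) -> int:
    """Cout enables the +6 stage, Cin enables the +1 stage; both give +7."""
    return add_one_stage(add_six_stage(sum4, cout), cin)


# Walk through the four control combinations for one sample sum.
for cin in (0, 1):
    for cout in (0, 1):
        print(cin, cout, bcd_correction(0b0010, cin, cout))
```

Because each stage is an exact modulo-16 addition, enabling both stages adds 7 without a third set of conditions.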

2.3 High-Speed Parallel Decimal Multiplication with Redundant Internal Encoding circuit optimization with CBLD

The high-speed parallel decimal multiplier [6] is another good example of a conditional operation. The full operation of the multiplier is beyond the scope of this work and is not explained here. The authors of the paper suggest a modified block in the final stage which adds 15, 10 or 9 to the sum generated in the previous modules. For this purpose, they use a modified add-by-constant circuit for each of the constants. The results of the summations are then given to a multiplexer. This module can be optimized by the CBLD methodology.


Figure 17: High-Speed Parallel Decimal Multiplication module [6]

Here the control signals are Ci+1 and Ci. The following table gives the conditions for add by constant in the CBLD approach, obtained from the truth table.

Table 13: Conditions for add by 10 / add by 15 / add by 9

Bit number    Add by 10         Add by 15      Add by 9

0             Never             Always         Always
1             Always            𝐴0             Never
2             A1                𝐴0 . 𝐴1        A0A1
3             𝐴2 + 𝐴1 + 𝐴0      𝐴0. 𝐴3. 𝐴2     𝐴2 + 𝐴1 + 𝐴0

Conditions for control signals:

Table 14: Control signal condition

Input Operation

C0 = 1 Add by fifteen

C1 = 1 Add by ten

C0 & C1 =1 Add By nine

C0 & C1 = 0 Buffer Input to output

Hence the optimized circuit:

Figure 18: Optimized circuit for the High-Speed Parallel Decimal Multiplication circuit using CBLD (controls C0 and C1; inputs I0-I3; outputs O0-O3)

Number of gates in CBLD design: 23

Number of gates in critical path in CBLD design: 7

Number of gates in former design: 36

Number of gates in critical path in former design: 7

This circuit uses fewer gates than the original circuit. However, as in the optimization of Section 2.2, if two of the conditions are applied together the third condition is satisfied: adding 15 and then 10 to a 4-bit value is the same as adding 9, since 25 mod 16 = 9. Hence, we can use the multistage CBLD approach. The following diagram shows the multistage approach of CBLD:

Figure 19: Multistage approach for the High-Speed Parallel Decimal Multiplication circuit [18] (controls C0 and C1; inputs I0-I3; outputs O3-O0)


Number of gates with CBLD design: 21

Number of gates in critical path with CBLD design: 7

Number of gates in former design: 36

Number of gates in critical path in former design: 7

The Verilog verification waveform confirms the functionality:

Figure 20: Waveform for Multistage CBLD for multiplier module
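The step "obtain the conditions from the truth table" can be mechanized for any constant. The Python sketch below is a generic illustration, not the thesis's circuit: the function names are hypothetical, the arithmetic is assumed to be plain 4-bit binary (modulo 16), and the stage order (add by 15 under C0, then add by 10 under C1) is an assumption.

```python
def complement_conditions(constant: int, width: int = 4):
    """For each output bit, list the input values for which adding
    `constant` (mod 2**width) complements that bit."""
    mask = (1 << width) - 1
    conditions = []
    for bit in range(width):
        minterms = [v for v in range(1 << width)
                    if ((v ^ ((v + constant) & mask)) >> bit) & 1]
        conditions.append(minterms)
    return conditions


def conditional_add(value: int, constant: int, enable: int, width: int = 4) -> int:
    """CBLD-style stage: complement exactly the bits whose condition holds."""
    if not enable:
        return value
    conds = complement_conditions(constant, width)
    flip_mask = sum(1 << bit for bit, minterms in enumerate(conds) if value in minterms)
    return value ^ flip_mask


def multistage(value: int, c0: int, c1: int) -> int:
    """C0 enables add by 15, C1 enables add by 10; both together add 9 (mod 16)."""
    return conditional_add(conditional_add(value, 15, c0), 10, c1)


# Bit 0 is complemented for every input when adding 15 and for none when adding 10.
print(len(complement_conditions(15)[0]), len(complement_conditions(10)[0]))
```

Each list of minterms returned by `complement_conditions` is the raw truth-table condition for one bit; minimizing it (for example with a Karnaugh map) gives a gate-level condition.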


CHAPTER III

DESIGN OF CARRY SELECT ADDER USING CBLD

The Carry Select Adder (CSLA) is used in arithmetic operations for better speed at the expense of area and power. We present a novel structure of a conditional binary to excess-1 adder, based on the carry select adder design at gate level, using the CBLD methodology. The regular CSLA is area-consuming because of its dual Ripple-Carry Adder (RCA) structure. To reduce area, the CSLA can be implemented with a single RCA and a Binary to Excess-1 (BEC) circuit instead of a dual RCA. A conditional BEC block is proposed which can replace the regular BEC block and multiplexer: it either adds one to the output of the modified RCA or passes the output without any change. This approach removes the need for multiplexers in the CSLA and hence reduces delay, area and power consumption.

In digital design, area and time delay trade off against each other: improving the time delay usually costs more area and power. The carry select adder is a significant adder because it balances area and time delay. Efforts have been made to optimize the basic design of the carry select adder. A CSLA based on common Boolean logic is suggested in [20]. This adder uses only one XOR gate and one inverter for the sum and one AND gate and one inverter for the carry, which reduces the number of gates significantly. However, the results show that the delay obtained is similar to that of an RCA. An improvement is suggested in [21], whose authors propose a more systematic arrangement using the SQRT structure, which gives better speed. Comparison of these adders with the binary to excess-1 adder [4] suggests that the BEC-CSLA is the best design.

Because the carry select adder relies on a conditional operation, either adding one or passing the input to the output without any change, CBLD can be used to optimize its performance. With CBLD a new block is generated which is more efficient than the traditional CSLA and the recently proposed state-of-the-art CSLA designs.


3.1 Regular CSLA:

A carry select adder is typically designed with two RCA blocks. One RCA block adds the inputs with a carry input of 0 and the second adds them with a carry input of 1; a multiplexer then selects one of the two sum words depending on the actual carry in. The two binary adders work in parallel, which increases the number of redundant units sitting idle. The following figure shows the typical CSLA block diagram.

Figure 21: Typical CSLA structure [18] (blocks: RCA-1, RCA-2 and the sum and carry selection unit; n-bit inputs A and B, the constants 0 and 1, and Cin; outputs Sum and Cout)
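The dual-RCA arrangement can be illustrated with a short behavioral model. This Python sketch is an assumption-laden illustration rather than the referenced design: the function names and the uniform 4-bit stage size are hypothetical choices.

```python
def ripple_carry_add(a: int, b: int, cin: int, width: int):
    """Chain of full adders; returns (sum, carry_out)."""
    total, carry = 0, cin
    for k in range(width):
        x, y = (a >> k) & 1, (b >> k) & 1
        total |= (x ^ y ^ carry) << k
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def csla_stage(a: int, b: int, cin: int, width: int):
    sum0, cout0 = ripple_carry_add(a, b, 0, width)    # RCA with carry input 0
    sum1, cout1 = ripple_carry_add(a, b, 1, width)    # RCA with carry input 1
    return (sum1, cout1) if cin else (sum0, cout0)    # selection unit (multiplexer)


def csla_add(a: int, b: int, cin: int, width: int = 8, stage: int = 4):
    """Cascade of carry select stages; each stage's carry selects the next one."""
    result, carry = 0, cin
    mask = (1 << stage) - 1
    for lo in range(0, width, stage):
        s, carry = csla_stage((a >> lo) & mask, (b >> lo) & mask, carry, stage)
        result |= s << lo
    return result, carry


print(csla_add(200, 100, 1))
```

Both candidate results of every stage are computed regardless of the carry, which is the redundancy that the BEC and CBEC variants try to remove.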

3.2 BEC-CSLA:

A low-power and area-efficient carry select adder was proposed by Ramkumar and Kittur [7] (shown in fig. 16). This design uses binary to excess-1 addition, referred to as BEC addition. The carry bit and sum word for carry input 0 (C0, S0) are produced using a conventional RCA, while the carry and sum word for carry input 1 (C1, S1) are produced using an excess-one block instead of a second RCA. Let S0[i] and C0[i] be the outputs when Cin = 0 and S1[i] and C1[i] the outputs when Cin = 1, where i is the bit index. The operation in the Ramkumar and Kittur paper can then be formulated as follows.


The BEC block is

S1[0] = S0[0]'                              (1)
C1[0] = Cin                                 (2)
S1[1] = S0[0] ⊕ S0[1]                       (3)
C1[1] = S0[0].S0[1]                         (4)
S1[2] = S0[2] ⊕ (S0[0].S0[1]), and so on    (5)

Figure 22: BEC unit proposed in BEC CSLA [4] (inputs B0-B3, outputs O0-O3)

The final sum and carry are

Sum = S0.Cin' + S1.Cin                      (6)
Carry = C0.Cin' + C1.Cin                    (7)

Expression (4) shows that C1[1] depends on S0[0]. This dependency forms a small ripple in the BEC which travels from S1[1] to S1[n] for an n-bit adder. Hence a BEC adder has more delay than the regular CSLA. Efforts have been made to reduce this dependency and speed up the addition.
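The BEC equations and the final selection can be traced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, bits are listed LSB first, and the carry for Cin = 1 is modeled as C0 OR the carry out of the excess-1 chain, a detail the equations above do not spell out.

```python
def bec(s0_bits):
    """Binary to excess-1 conversion; s0_bits is LSB first."""
    s1_bits, ripple = [], 1
    for bit in s0_bits:
        s1_bits.append(bit ^ ripple)   # XOR with the AND of all lower bits
        ripple &= bit                  # AND chain that ripples upward
    return s1_bits, ripple


def bec_select(s0_bits, c0: int, cin: int):
    """Final multiplexing: pick (S0, C0) when Cin = 0 and (S1, C1) when Cin = 1."""
    s1_bits, ripple = bec(s0_bits)
    c1 = c0 | ripple                   # assumed carry for the Cin = 1 case
    sum_bits = [(s0 & (1 - cin)) | (s1 & cin) for s0, s1 in zip(s0_bits, s1_bits)]
    carry = (c0 & (1 - cin)) | (c1 & cin)
    return sum_bits, carry


print(bec([1, 1, 0, 0]))
```

The running AND in `bec` is the ripple discussed above: bit n of S1 cannot settle before all lower bits of S0 are known.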


The authors suggest a square root (SQRT) structure for the adder. Consider first a traditional adder, as shown in the following diagram:

Figure 23: Typical Square Root Structure with increasing stage size [4]

In the above diagram all the highlighted blocks work in parallel and are independent of the previous stages. Hence, the delay of the circuit comes from the MUX units. The MUX units in the standard form are all the same size, so if the delay of one unit is n then the total delay is n x (number of stages). This delay can be reduced using the SQRT structure; consider the following diagram.

Figure 24: SQRT Structure for BEC CSLA [4] (stages numbered 1 to 5)

In the above diagram the stages have variable size. Suppose block 1 and block 2 each require 1 ns for execution; then both inputs of the first multiplexer arrive at the same time. Similarly, block 3 has a delay equal to the delay of block 1 plus the delay of multiplexer one, so the inputs of the third-stage multiplexer also arrive at the same time, and so on. This approach reduces the delay by a significant amount.


3.3 CS block based CSLA:

The most effective effort to reduce this dependency is given by Mohanty et al. [8]. The area-delay-power efficient carry select adder suggested by Mohanty et al. performs the addition of one in the carry block instead of the sum block. The following figure shows the carry select adder module they propose. In this design two carry words are generated for the individual bits instead of two separate sum words. The proper carry word is selected depending on Cin with a modified multiplexer and added to the previously generated sum word. The ripple in sum generation is thereby eliminated, and hence a fast adder is obtained.

However, two separate units are required to generate the two carry words, so this design is area-consuming. Our effort is to minimize the area occupancy and power consumption of the same BEC-based adders.

Figure 25: Carry select adder with conditional carry [18] (blocks: HSG, CG0, CG1, CS and FSG; n-bit inputs A and B and Cin; carry words C0 and C1; outputs Sum and Cout)
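The description above can be read as the following behavioral sketch in Python. It is a loose interpretation built only from the block names in Figure 25 and the surrounding text, not a reproduction of the referenced design; the function name `cs_block_add` and the internal structure of each block are assumptions.

```python
def cs_block_add(a: int, b: int, cin: int, width: int = 4):
    """Select between two precomputed carry words, then form the final sum."""
    a_bits = [(a >> k) & 1 for k in range(width)]
    b_bits = [(b >> k) & 1 for k in range(width)]
    half_sum = [x ^ y for x, y in zip(a_bits, b_bits)]       # HSG: half-sum word
    half_carry = [x & y for x, y in zip(a_bits, b_bits)]     # HSG: half-carry word

    def carry_word(carry_in: int):
        """Carry out of every bit position for a fixed carry input."""
        word, carry = [], carry_in
        for s, c in zip(half_sum, half_carry):
            carry = c | (s & carry)
            word.append(carry)
        return word

    word0, word1 = carry_word(0), carry_word(1)              # CG0 and CG1
    selected = word1 if cin else word0                       # CS: carry selection
    carries_in = [cin] + selected[:-1]                       # carry into each bit
    sum_bits = [s ^ c for s, c in zip(half_sum, carries_in)] # FSG: final sum
    total = sum(bit << k for k, bit in enumerate(sum_bits))
    return total, selected[-1]


print(cs_block_add(9, 7, 1))
```

Only one half-sum word is computed; the carry input affects the result through the selected carry word, which matches the idea of adding one in the carry block rather than in the sum block.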

3.4 Carry- Skip based BEC-CSLA

The BEC-CSLA is one of the most efficient adders. However, this adder has more delay. This delay can be reduced by the carry-skip logic proposed in [22]. This circuit uses a modified BEC block and alternating AOI-OAI logic to skip the carry. The proposed design is shown in the following diagram:

Figure 26: Carry skip based BEC-adder [22] (blocks: RCA without Cin, BEC unit and carry skip; n-bit inputs A and B and Cin; outputs Sum and Cout)


CHAPTER IV

PROPOSED ADDER DESIGN

Like the arithmetic circuits analyzed in Chapter II, the CSLA has a conditional operation block. This block either adds one or passes the input without change. Considering this condition, we can form a CBLD circuit for 4 bits. After representing the input in terms of the output and obtaining the minterms, we get the following functions for add by one:

Step 3: Function realization for CBLD

Table 15: CBLD condition for add by one

Bit    Condition for complement

B0     Always
B1     A0
B2     A0A1
B3     A0A1A2

Continuing to more bits, the observed function is

f(Bn) = A(n-1) . A(n-2) . ... . A0

This gives a definite form for the module which is easy to implement.

Figure 27: Add by one realization (control Cin; inputs I0-I3)

Compared to the traditional BEC described in Chapter III, this unit does not require a multiplexer unit.

The first module consists of a ripple carry adder and the second module consists of the add by one unit.
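The two-module arrangement (ripple carry adder followed by a conditional add-by-one unit) can be modeled behaviorally as below. This Python sketch is illustrative: the function names are hypothetical, the RCA is replaced by integer addition, and combining the two carries with an OR is an assumption of the sketch rather than a detail taken from the figures.

```python
def rca_no_cin(a: int, b: int, width: int):
    """Behavioral stand-in for the ripple carry adder without Cin."""
    total = a + b
    return total & ((1 << width) - 1), total >> width


def conditional_add_one(value: int, cin: int, width: int):
    """Complement bit n when Cin . A(n-1) ... A0 holds; returns (result, carry_out)."""
    cond, result = cin, value
    for n in range(width):
        bit = (value >> n) & 1
        result ^= cond << n      # complement this bit if the condition holds
        cond &= bit              # extend the product term to the next bit
    return result, cond


def cbec_rca_stage(a: int, b: int, cin: int, width: int = 2):
    s, c_rca = rca_no_cin(a, b, width)
    s, c_inc = conditional_add_one(s, cin, width)
    return s, c_rca | c_inc      # assumed carry combination


def cbec_rca_cascade(a: int, b: int, cin: int, width: int = 8, stage: int = 2):
    result, carry = 0, cin
    mask = (1 << stage) - 1
    for lo in range(0, width, stage):
        s, carry = cbec_rca_stage((a >> lo) & mask, (b >> lo) & mask, carry, stage)
        result |= s << lo
    return result, carry


print(cbec_rca_cascade(100, 27, 1))
```

With Cin = 0 the condition chain is all zeros and the RCA result passes through unchanged, so one unit plays both the "add one" and the "buffer" role without a multiplexer.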

Figure 28: Ripple carry adder CBEC (inputs a0b0-a3b3 and Cin; outputs S0-S3 and Co)

4.2 Half Adder Design for RCA-CBEC:

The initial design comprises a ripple carry adder with a conditional BEC (CBEC) unit. The carry from the first bit is given to the CBEC chain. In this approach the propagation delay comes from the RCA chain, leaving the CBEC chain idle. Efforts are therefore made to decrease the delay of the RCA chain. The first approach is to replace the first-bit full adder with a half adder. The proposed design is shown in the following figure:

Figure 29: Proposed CBEC RCA design with half adder [18] (inputs a0b0-a3b3 and Cin; outputs S0-S3 and Co)

The above circuit was tested using Verilog, and the result confirms its functionality:


Figure 30: Waveform for 8-bit CSLA

4.3 Determination of stage size:

4.3.1 SQRT structure:

The BEC paper discussed in Chapter III suggests a square root structure. This stage grouping is the preferred stage size grouping because of its delay reduction. The SQRT structure is efficient because it balances the time delay of the first-stage ripple carry adder against that of the multiplexers. In the previous examples of CSLA [7], as all the RCAs work in parallel, only the first RCA and all the carry select multiplexers are in the critical path.

The multiplexers used here are n x (n/2), hence the delay of any multiplexer is the same and is equal to 3 logic gates. Consider the following diagram for the critical path. If we assume that an n-bit ripple carry adder produces n ns of delay and a carry select multiplexer produces 1 ns of delay, the two inputs of the last CY mux, one from the multiplexer chain and one from RCA [15:11], both arrive at 5 ns. Consider Figure 31.

Figure 31: SQRT structure delay path for last multiplexer (RCA [1:0]: 2 ns; CY Mux [3:2]: 3 ns; CY Mux [6:4]: 4 ns; CY Mux [10:7]: 5 ns; RCA [15:11]: 5 ns)
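The arrival-time argument can be reproduced with a small calculation. The Python sketch below uses the delay assumptions stated above (n ns for an n-bit RCA, 1 ns per carry select multiplexer); the function name and the stage-size list [2, 2, 3, 4, 5], read from the bit ranges in Figure 31, are assumptions of the sketch.

```python
def sqrt_csla_arrival_times(stage_sizes, rca_ns_per_bit=1.0, mux_ns=1.0):
    """Return, for every stage after the first, the tuple
    (stage size, time its RCA result is ready, time its select carry arrives),
    plus the time at which the final carry is available."""
    carry_ready = stage_sizes[0] * rca_ns_per_bit    # first stage ripples its carry
    rows = []
    for size in stage_sizes[1:]:
        rca_ready = size * rca_ns_per_bit            # all RCAs start at time 0
        rows.append((size, rca_ready, carry_ready))
        carry_ready = max(rca_ready, carry_ready) + mux_ns
    return rows, carry_ready


rows, total = sqrt_csla_arrival_times([2, 2, 3, 4, 5])
for size, rca_ready, select_ready in rows:
    print(size, rca_ready, select_ready)
print(total)
```

In every row the RCA result and the select carry become available at the same time, so no multiplexer waits on its data inputs; for the last stage both arrive at 5 ns.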

In the case of the CBEC-RCA the critical path is produced in the CBEC module, and it is not the same for each stage size. Consider Figure 32. Here the critical path is along the CBEC unit. Hence increasing the size of the CBEC units increases the delay, whereas the multiplexer delay in the CSLA-BEC is constant. Test results show that the SQRT structure of RCA-CBEC is better than the normal RCA-CBEC. However, the results can be improved using a uniform stage size, as described in Section 4.3.2.

Figure 32: Critical Path for SQRT RCA-CBEC

4.3.2 Uniform stage size Cascade Structure:

The critical path of the CBEC-RCA restricts the use of SQRT and other equivalent structures, which manipulate the RCA delay to match the second-stage delay. Hence, a uniform structure has to be considered. A 4-4 structure is shown in the following diagram:

Figure 33: 4-4 Structure for RCA-CBEC

The critical path of this structure is along the first-stage RCA, and hence the delay of this adder is equal to that of the RCA.

Figure 34: The 2-bit RCA-CBEC unit

The above diagram shows the 2-bit adder. In this case the delay is along the CBEC unit, unlike in the 4-bit cascade structure. This 2-bit structure uses 12 gates, so a 4-bit adder formed from two 2-bit adders has 24 gates. A single 4-bit adder, on the other hand, requires 26 gates. The difference comes from the half adder in bit 0. The same holds for any n-bit cascade structure; e.g. an 8-bit structure requires 54 gates, compared to the 48 gates required by the 2-bit cascade structure. Hence the proposed network structure for the CBEC-RCA is the 2-bit cascade, as shown in the following diagram.

Figure 35: 2-bit cascade structure for CBEC-RCA (four cascaded 2-bit units, each with inputs a0b0, a1b1 and Cin and outputs S0, S1 and Co)
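The gate-count comparison is simple arithmetic and can be tabulated as follows. The per-stage counts (12, 26 and 54 gates for 2-, 4- and 8-bit units) are the figures quoted above; the helper name `cascade_gate_count` is hypothetical, and the closed form 7n - 2 is only an observation that happens to fit those three quoted counts, not a formula given in the thesis.

```python
# Gate counts quoted in the text for a single n-bit RCA-CBEC unit.
STAGE_GATES = {2: 12, 4: 26, 8: 54}


def cascade_gate_count(width: int, stage_bits: int) -> int:
    """Total gates when a width-bit adder is built from equal stage_bits units."""
    return (width // stage_bits) * STAGE_GATES[stage_bits]


for width in (4, 8):
    print(width, cascade_gate_count(width, 2), STAGE_GATES[width])
```

Each unit saves gates through the half adder in its bit 0, so a cascade of many small units repeats that saving once per unit, which is why the 2-bit cascade comes out smallest.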


CHAPTER V

ASIC DESIGN FLOW

Before discussing the simulation results, we review how the digital IC design flow works and how digital ICs are developed. During its development, a digital design goes through a series of stages and undergoes multiple transformations from the original set of specifications. Each of these transformations corresponds, roughly, to a different description of the system; each stage is more detailed than the previous one and has its own set of primitives.

Figure 36: VLSI IC Design cycle [16] (stages: design specification, RTL design, RTL verification, synthesis and optimization, RTL vs. gate verification, tech mapping, fabrication, tech and packaging; descriptions shown alongside: high-level function description in C, register-level Verilog module, gate-level description, IC layout, silicon die)

The above figure shows the top-down design flow in a simplified format. The industrial design flow is much more complex and involves many more iterations. The first stage in this process is the design specification: a document of the specifications that the desired module needs to fulfil. In our case the desired module is a CSLA with low area and power consumption. The design is first realized in software, usually in C, to verify the functional behavior of the concept. This software serves as the reference point for checking the functionality of the module.

The next phase is RTL design and verification. During this phase the architectural description is further refined: the memory elements and functional components of each model are designed using a Hardware Description Language (HDL). RTL design is the last stage of functional digital design. Verification makes sure that the design functions properly, assuming there are no manufacturing errors. Various tests are performed to check that the module is functionally correct. If any errors exist, the HDL design needs to be modified to remove them. In the design flow, RTL verification runs in parallel with RTL design.

In the next stage, synthesis and optimization, a more detailed model of the design is generated and optimized according to the design constraints. This stage gives an initial view of the actual IC's area, power and delay. The design produced in this stage is at gate level. If the generated result does not satisfy the area and power requirements, we need to go back to the RTL design phase to make the necessary changes. We use Design Vision by Synopsys.

Figure 37: Design Compiler Functional Diagram [17] (inputs: Verilog HDL via HDL Compiler (Xilinx), constraints, IP DesignWare, tech file and symbol library; Design Compiler steps: timing, data path, power and area optimization, test synthesis and timing closure; outputs: optimized netlist to place and route, SDF/PDEF, timing and power analysis, formal verification)

Design Compiler uses technology libraries, synthetic or DesignWare libraries, and symbol libraries to implement synthesis and to display synthesis results graphically. Design Compiler contains two library files: the generic technology library (GTECH), which contains all the logic gates and flip-flops, and DesignWare, which contains more complex circuits such as adders and comparators. Both files are technology independent. The HDL design is extracted using these files. At this point the design is not yet mapped to a specific technology. The symbol library is used to generate the schematic of the design.

After the extraction, Design Compiler maps the design to a specific technology using a technology file called the target library. This process is constraint driven. The constraints, such as the environmental restrictions under which the synthesis is done, are specified by the user.

After the design is optimized, the next stage is test synthesis. This is the process by which designers can integrate test logic into a design during logic synthesis. Test synthesis enables designers to ensure that a design is testable and to resolve any test issues early in the design cycle. The result of the logic synthesis process is an optimized gate-level netlist, which is a list of circuit elements and their interconnections.

After test synthesis, the design is ready for the place and route tools, which place and interconnect the cells in the design. Based on the physical routing, the designer can back-annotate the design with actual interconnect delays; Design Compiler can then resynthesize the design for more accurate timing analysis. The synthesized file is then used by IC Compiler.

The next stage in the flow is tech mapping and place and route, also called physical implementation. The tool used for physical implementation is IC Compiler. Physical implementation consists of

1) Floor planning

2) Placement

3) Routing


Floor planning: At this stage we have the netlist, which is the logical description of the design. In floor planning we map the netlist design onto the chip. Floor planning is the most important stage in the physical implementation: the timing, area and power consumption of the real IC depend on it. Floor planning includes the following steps:

1) Estimation of the size of the chip

2) Arrangement of the various blocks of the design on the chip

3) Pin assignment

4) IO and power planning

5) Clock distribution

Figure 38: typical Floor plan in IC compiler

Placement: Once floor planning is done, placement is initiated. In this stage standard cells are assigned to the rows defined in the floor plan. Space is left for the connections between these cells. This lets us see the capacitive load each cell will need to drive. The placement is done internally by the tool's algorithm; several algorithms exist, e.g. constructive, min-cut and iterative algorithms.

Routing: Until now the cells have only been placed; in this stage they are connected. Routing is split into two steps. Global routing: this step only plans the interconnections and optimizes the path delay and critical path delay. The chip is divided into small cells called cell bins or gcells, whose size depends on the algorithm used. Detailed routing: in this step the actual connections are made. Each layer has its own routing grid and rules.

Once routing is done, parasitic extraction is performed with extraction tools, which gives the parasitic resistance and capacitance in the circuit. Timing is then calculated with static timing analysis tools. Once the timing requirements are satisfied and LVS matching is done, the design can be sent to the foundry for manufacturing.

Figure 39: Final IC layout using IC compiler


CHAPTER VI

SIMULATION AND RESULTS

In this chapter we discuss the ASIC simulation results based on the “saed90nm_typ_ht.db” tech file. We compared (a) the traditional Add/Sub module with its CBLD counterpart, (b) the conditional adder of the reduced delay adder with its CBLD counterpart, (c) the high-speed parallel multiplier with its CBLD counterpart, (d) the binary to excess-1 CSLA SQRT structure with the CBEC-RCA, (e) the conditional carry adder with the CBEC-RCA and (f) the regular CSLA with the CBEC-RCA.

For the comparison we follow the procedure discussed in the previous chapter. All modules are designed in Verilog using Xilinx ISE, and their functionality is tested on Xilinx for all the corner cases. The designs are then compared in Design Vision to obtain more accurate results.

6.1 Optimization of arithmetic building blocks using CBLD

6.1.1 Add/Sub module

The following table compares the Add/Sub module with its CBLD counterpart. Only one digit is considered for the comparison: because the structure is uniformly cascaded, the results grow linearly with the number of digits.

Table 16: Result of comparison for Add/Sub and its CBLD counterpart

Name       Area          Power        Delay      ADP       PDP

Add/sub    637.74 µm2    368 µW       4.12 ns    2627      1516
CBLD       356.65 µm2    146.73 µW    2.65 ns    945.12    388.8

The above result shows significant improvement in all three parameters (area, power and delay).
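The ADP and PDP columns are the products of area and delay and of power and delay. The short Python sketch below recomputes them from the Table 16 figures; the dictionary layout and variable names are choices of this sketch.

```python
# (area in um^2, power in uW, delay in ns) taken from Table 16.
table16 = {
    "Add/sub": (637.74, 368.0, 4.12),
    "CBLD": (356.65, 146.73, 2.65),
}

metrics = {}
for name, (area, power, delay) in table16.items():
    metrics[name] = {
        "ADP": area * delay,     # area-delay product
        "PDP": power * delay,    # power-delay product
    }
    print(name, round(metrics[name]["ADP"], 2), round(metrics[name]["PDP"], 2))

# Relative power saving of the CBLD design, in percent.
power_saving = 100 * (368.0 - 146.73) / 368.0
print(round(power_saving, 1))
```

The recomputed products agree with the tabulated ADP and PDP values to the precision shown in the table.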

Figure 40: ADP and PDP comparison for add/sub (series: Add/sub and CBLD)

6.1.2 Reduced delay adder

For the comparison, the adder block of the design is considered.

Table 17: Reduced delay adder comparison

Name    Delay    Area       Power       ADP        PDP

MRDA    2.61     1816.47    227.73      4740.98    594.37
CBLD    2.55     1511.42    182.0558    3854.12    464.2

As discussed earlier, the CBLD design of the reduced delay adder requires 15 gates compared to 20 gates required by the former design. This is reflected in lower area and power consumption, and the reduction in the number of gates in the critical path improves the time delay marginally. The result is a better power-delay and area-delay product. The following graph shows the ADP and PDP improvement.

Table 18: ADP PDP comparison for Reduced delay adder

6.1.3 High Speed parallel multiplier

As in the previous cases, only the module which requires modification is considered. The result of the comparison is as follows.

Table 19: Comparison of High speed parallel multiplier with CBLD

Name                Delay      Area       Power       ADP       PDP

Multiplier block    1.02 ns    215.65     44 µW       219.3     44.88
CBLD                0.94 ns    139.161    32.71 µW    130.81    30.74

In the previous two examples the authors used a binary adder for add by constant. In this example modified add-by-constant modules were used. The following graph shows the ability of CBLD to improve the area, delay and power.

Figure 41: ADP PDP comparison for High speed parallel multiplier

6.2 Design of CBEC-RCA

In this section a 2-bit cascade structure of CBEC-RCA is compared with state-of-the-art adders. Adders are built for 8, 16, 32 and 64 bits to compare the performance as the adder width increases.

The following table gives the comparison of area, delay and power. From the table:

1) Compared to the regular CSLA, CBEC-RCA consumes 57.08% less area, is 9.8% faster and is 57.55% more power efficient for the 8-bit structure. At 64 bits the area improvement is 56.63% and the power improvement 57.37%, with a marginal improvement in delay. This shows that the CBEC-RCA unit remains better than the regular structure as the width grows.

2) Compared to the conditional carry adder, CBEC consumes 52.53% less area and 22.02% less power at 8 bits; at 64 bits it consumes 37.36% less area and 22.6% less power.

3) Compared with the BEC adder, we observe 43.27% less area and 55.39% less power consumption with 38.54% better speed at 8 bits, and 43.59% less area and 59.45% less power with 15.96% better speed at 64 bits.

The following table shows the result of the comparison.


Table 20: Comparison Result for CBEC-RCA and standard adders

Width    Design    Area (µm2)    Delay (ns)    Power (µW)    PDP (10^-15)    APP (10^-15)

8        ADP       499.1         0.78          145.32        113.34          72.532
         CBEC      326.8         1.65          113.32        186.97          37.039
         BEC       576.3         2.71          254.06        688.50          146.42
         Reg       759.4         1.83          267.8         490.07          203.36

16       ADP       1015.2        1.78          295.36        525.74          299.87
         CBEC      653.5         3.07          230.90        708.86          150.91
         BEC       1154.1        5.24          537.97        2818.9          620.89
         Reg       1509.6        3.22          542.83        1747.9          819.45

32       ADP       2063.3        2.47          601.82        1486.4          1241.7
         CBEC      1307.0        5.9           467.50        2758.2          611.04
         BEC       2311.6        9.40          1113.8        10469           2574.6
         Reg       3015.3        6.01          1093.3        6570.7          3296.6

64       ADP       4173.5        3.67          1206.7        4428.5          5036.2
         CBEC      2613.9        11.57         933.70        10802           2440.6
         BEC       4632.7        16.66         2303.3        38372           10670
         Reg       6027.2        11.59         2191.4        25398           13208
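The percentage figures quoted for the comparison follow from the table entries as relative reductions. The Python sketch below shows the calculation for the 64-bit CBEC and regular CSLA rows; the helper name `reduction_percent` is hypothetical and only the two rows needed are copied from Table 20.

```python
def reduction_percent(reference: float, proposed: float) -> float:
    """Relative saving of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


# 64-bit rows of Table 20: (area in um^2, delay in ns, power in uW).
reg_64 = (6027.2, 11.59, 2191.4)
cbec_64 = (2613.9, 11.57, 933.70)

area_saving = reduction_percent(reg_64[0], cbec_64[0])
delay_saving = reduction_percent(reg_64[1], cbec_64[1])
power_saving = reduction_percent(reg_64[2], cbec_64[2])
print(round(area_saving, 2), round(delay_saving, 2), round(power_saving, 2))
```

The same helper can be applied to any pair of rows to check the other percentages against the tabulated values.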

The following graphs show the area and power comparison for 8, 16, 32 and 64 bits for all the adders.

Figure 42: Area Comparison (x-axis: 8-bit, 16-bit, 32-bit, 64-bit; series: CBEC, ADP, Reg, BEC)

Figure 43: Power comparison (x-axis: 8-bit, 16-bit, 32-bit, 64-bit; y-axis: Power (µW); series: CBEC, ADP, Reg, BEC)

CHAPTER VII

CONCLUSION

A new view of logic design named Complementary Based Logic Design (CBLD) has been used to model various arithmetic circuits in the field of VLSI. Comparison results obtained from Design Vision for the CBLD designs and their former counterparts show that the CBLD version of the BCD adder/subtractor consumes 60% less power than its counterpart, and that the CBLD design consumes 19% less power than the modified reduced delay adder.

Further, a new design approach is proposed in this thesis to reduce the area and power of the conventional CSLA architecture. This approach eliminates the multiplexer and hence reduces the number of gates required, giving a very large reduction in area and in total power. The compared results show that the modified CBEC-CSLA has a slightly larger delay, but the area and power of the 64-bit modified MRCA are significantly reduced, by 51.5%, 77.1% and 81% compared to the state-of-the-art structures mentioned earlier. The modified RCA architecture therefore occupies very little area and consumes very little power.

