mirror adder

Post on 07-Apr-2015


Lecture 16, ECE 124A, VLSI Principles Kaustav Banerjee

ECE 124A: VLSI Principles

Lecture 16

Prof. Kaustav Banerjee, Electrical and Computer Engineering

E-mail: kaustav@ece.ucsb.edu

A Generic Digital Processor

[Figure: block diagram of a generic digital processor with MEMORY, DATAPATH, CONTROL, and INPUT/OUTPUT blocks connected by interconnect.]

Lecture 15: sequential circuits (static and dynamic memory; latch and register; timing and stability)

Today: datapath

Datapath

The core of a digital processor. A datapath consists of:

Logic blocks
– Combinational logic functions (AND, OR, XOR, ...)

Arithmetic blocks
– Addition
– Multiplication
– Comparison
– Shift

An Intel Microprocessor

[Figure: Intel Itanium® integer datapath (Fetzer, Orton, ISSCC'02), showing 9-1, 5-1, and 2-1 multiplexers, carry generation (CARRYGEN), sum generation plus logical unit (SUMGEN + LU), sum select (SUM SEL), and a register (REG) feeding the cache; a 1000 µm dimension is marked. LU: logical unit.]

Itanium has 6 integer execution units like this.

Bit-Sliced Design

[Figure: bit-sliced datapath with register, adder, shifter, and multiplexer columns, bit slices labeled Bit 0 through Bit 32, data in on one side, data out on the other, and control lines running across the slices.]

Design a single-bit datapath and repeat it for all bits.

Adders

Design an Adder

Fundamental arithmetic building block

Performance

Logic-level optimization
– Optimize Boolean functions
– Carry lookahead

Circuit-level optimization
– Transistor sizing

Power consumption

Half-Adder Implementations

[Figure: half-adder block with inputs x and y and outputs SUM (s) and Carry (c), together with gate-level implementations.]

Full-Adder

[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout, and its construction from two half adders (HA).]

$S = A \oplus B \oplus C_{in} = A\bar{B}\bar{C}_{in} + \bar{A}B\bar{C}_{in} + \bar{A}\bar{B}C_{in} + ABC_{in}$

$C_{out} = AB + BC_{in} + AC_{in}$
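The two-half-adder construction can be checked against the sum-of-products equations with a short sketch. The function names here are illustrative and do not come from the slides.

```python
from itertools import product

def half_adder(x, y):
    """Return (sum, carry) for two 1-bit inputs."""
    return x ^ y, x & y

def full_adder(a, b, c_in):
    """Full adder built from two half adders plus an OR gate."""
    s1, c1 = half_adder(a, b)        # first HA adds A and B
    s, c2 = half_adder(s1, c_in)     # second HA adds the incoming carry
    return s, c1 | c2                # the two HA carries are never both 1

def full_adder_sop(a, b, c_in):
    """Reference model: the sum-of-products equations for S and Cout."""
    na, nb, nc = 1 - a, 1 - b, 1 - c_in
    s = (a & nb & nc) | (na & b & nc) | (na & nb & c_in) | (a & b & c_in)
    c_out = (a & b) | (b & c_in) | (a & c_in)
    return s, c_out

# Both models agree on all eight input combinations.
for bits in product((0, 1), repeat=3):
    assert full_adder(*bits) == full_adder_sop(*bits)
```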

Express Sum and Carry

Generate (G) = $AB$

Propagate (P) = $A \oplus B$

Delete (D) = $\bar{A}\bar{B}$

$C_o = G + PC_i$

$S = P \oplus C_i$

Truth Table for Full Adder

A   B   Ci  |  S   Co  |  status
0   0   0   |  0   0   |  D
0   0   1   |  1   0   |  D
0   1   0   |  1   0   |  P
0   1   1   |  0   1   |  P
1   0   0   |  1   0   |  P
1   0   1   |  0   1   |  P
1   1   0   |  0   1   |  G
1   1   1   |  1   1   |  G
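As a minimal sketch, the generate/propagate/delete formulation can be evaluated for every input combination to reproduce the truth table. The helper names are assumptions made for illustration.

```python
from itertools import product

def gpd(a, b):
    """Return the generate, propagate, and delete signals for one bit."""
    g = a & b
    p = a ^ b
    d = (1 - a) & (1 - b)
    return g, p, d

def full_adder_gpd(a, b, c_i):
    """Compute (S, Co, status) using Co = G + P*Ci and S = P xor Ci."""
    g, p, d = gpd(a, b)
    c_o = g | (p & c_i)
    s = p ^ c_i
    status = "G" if g else "P" if p else "D"
    return s, c_o, status

# Print the truth table in the order A, B, Ci, S, Co, status.
for a, b, c_i in product((0, 1), repeat=3):
    s, c_o, status = full_adder_gpd(a, b, c_i)
    print(a, b, c_i, s, c_o, status)
```

Exactly one of G, P, D is 1 for any pair (A, B), which is why a single status column is enough.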

Complementary Static CMOS Full Adder

28 transistors

[Figure: transistor-level schematic of the complementary static CMOS full adder with inputs A, B, Ci, internal node X, and outputs Co and S.]

$S = A \oplus B \oplus C_i = ABC_i + \bar{C}_o(A + B + C_i)$

$C_o = AB + BC_i + AC_i$
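The sum expression that reuses the inverted carry is not obviously equal to the three-input XOR, so an exhaustive check helps. This is only a logic-level sketch with illustrative names; it says nothing about the transistor netlist itself.

```python
from itertools import product

def carry(a, b, c_i):
    """Co = AB + BCi + ACi."""
    return (a & b) | (b & c_i) | (a & c_i)

def sum_from_carry(a, b, c_i):
    """S = ABCi + not(Co) * (A + B + Ci), reusing the carry output."""
    not_co = 1 - carry(a, b, c_i)
    return (a & b & c_i) | (not_co & (a | b | c_i))

# Collect any input combination where the shared-carry form differs from XOR.
mismatches = [
    (a, b, c_i)
    for a, b, c_i in product((0, 1), repeat=3)
    if sum_from_carry(a, b, c_i) != (a ^ b ^ c_i)
]
print(mismatches)
```

The list is empty: with one input high the carry is 0 and the OR term sets S, with two inputs high the inverted carry masks the OR term, and with three inputs high the ABCi term sets S.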

The Ripple-Carry Adder

Propagation delay is linearly proportional to N

$t_{carry}$ dominates the propagation delay

Worst-case delay: linear in the number of bits, $t_d = O(N)$

$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
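A behavioral sketch of the ripple-carry adder, together with the delay expression, is given below. The unit delay values are placeholders chosen for illustration, not figures from the lecture.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (b & c_in) | (a & c_in)
    return s, c_out

def to_bits(value, n):
    """Split an integer into n bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Reassemble an integer from bits given least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Chain full adders so each carry-out feeds the next carry-in."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry

def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """t_adder = (N-1) * t_carry + t_sum, with assumed unit delays."""
    return (n - 1) * t_carry + t_sum

sum_bits, c_out = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(sum_bits), c_out, ripple_delay(4))
```

The worst case occurs when a carry generated at bit 0 propagates through every later stage, which is the path the delay formula counts.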

Inverting Property

$\bar{S}(A, B, C_i) = S(\bar{A}, \bar{B}, \bar{C}_i)$

$\bar{C}_o(A, B, C_i) = C_o(\bar{A}, \bar{B}, \bar{C}_i)$

Inverting all inputs to a full adder results in inverted values for all outputs.
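The inverting property is easy to confirm by brute force over the eight input combinations. The sketch below uses a plain Boolean model of the full adder; the function names are illustrative.

```python
from itertools import product

def full_adder(a, b, c_i):
    """One-bit full adder: returns (S, Co)."""
    s = a ^ b ^ c_i
    c_o = (a & b) | (b & c_i) | (a & c_i)
    return s, c_o

def inverting_property_holds():
    """Check that inverting every input inverts both outputs."""
    for a, b, c_i in product((0, 1), repeat=3):
        s, c_o = full_adder(a, b, c_i)
        s_inv, c_o_inv = full_adder(1 - a, 1 - b, 1 - c_i)
        if (s_inv, c_o_inv) != (1 - s, 1 - c_o):
            return False
    return True

print(inverting_property_holds())
```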

Minimize Critical Path by Reducing Inverting Stages

Exploit the inversion property

[Figure: 4-bit ripple-carry chain of full adders (FA) with inputs A0 B0 through A3 B3, outputs S0 through S3, and carries Ci,0, Co,0, Co,1, Co,2, Co,3; the cells alternate between even cells and odd cells.]

Mirror Adder (1)

[Figure: mirror adder schematic with inputs A, B, Ci and outputs Co and S, with the carry stage highlighted, shown next to the truth table for the full adder (A, B, Ci, S, Co, status).]

Mirror Adder (2)

[Figure: the same mirror adder schematic with the sum stage highlighted, shown next to the truth table for the full adder (A, B, Ci, S, Co, status).]

Mirror Adder Implementation

[Figure: stick diagram of the mirror adder between the VDD and GND rails, with input columns for A, B, and Ci and outputs Co and S.]

Mirror Adder Summary

24 transistors

The NMOS and PMOS chains are completely symmetrical

A maximum of 2 series transistors in carry-generation circuitry

The transistors connected to Ci should be closest to the output

Carry-Stage transistors have to be optimized for speed

Sum-Stage transistors can be optimized for area

The most critical issue is to minimize the capacitance at Co

The capacitance at Co is composed of 4 diffusion capacitances, 2 internal gate capacitances, and 6 gate capacitances in the connecting adder cell

Transmission Gate Full Adder

[Figure: transmission-gate full adder schematic with a setup stage that derives the propagate signal P from A and B, a sum-generation stage producing S, and a carry-generation stage producing Co from Ci.]

This implementation has similar sum and carry output delays.

Manchester Carry-Chain

Manchester carry gates

[Figure: static implementation (signals Ci, Co, Pi, Gi, Di) and dynamic implementation (signals Ci, Co, Pi, Gi, and clock φ).]

4-Bit Manchester Carry-Chain

[Figure: dynamic 4-bit Manchester carry chain with clock φ, input carry Ci,0, propagate signals P0 to P3, generate signals G0 to G3, and carries C0 to C3.]

Manchester Carry-Chain Implementation

[Figure: stick diagram of two adjacent bit slices between the VDD and GND rails, with a propagate/generate row (Pi, Gi, φ and Pi+1, Gi+1, φ), an inverter/sum row, and carries Ci-1, Ci, Ci+1.]

Design for Long Word Length

Propagation delay of chain style design is quadratic in the number of bits

Chain style design is NOT practical for long word length (e.g. 32 bits)

Carry-Bypass Adder (Carry-Skip Adder)

[Figure: top, four full adders (FA) with inputs P0 G0 through P3 G3 rippling the carry from Ci,0 through Co,0, Co,1, Co,2 to Co,3; bottom, the same chain followed by a multiplexer controlled by BP = P0 P1 P2 P3 that selects between the rippled Co,3 and Ci,0.]

Idea: if (P0 and P1 and P2 and P3) = 1, then Co,3 = Ci,0; otherwise the carry is either killed or generated inside the block.
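A behavioral sketch of one bypass block follows. It models only the logic: the multiplexer picks the incoming carry when every bit propagates and the rippled carry otherwise. Function and variable names are illustrative.

```python
def bypass_block(a_bits, b_bits, c_in):
    """One carry-bypass block (bits given least significant first).

    Returns (sum_bits, c_out, bypassed).
    """
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate signals
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate signals

    # Ordinary ripple path through the block.
    carry = c_in
    sum_bits = []
    for p_k, g_k in zip(p, g):
        sum_bits.append(p_k ^ carry)
        carry = g_k | (p_k & carry)

    # Bypass path: BP = P0 P1 P2 P3 steers the output multiplexer.
    bp = all(p)
    c_out = c_in if bp else carry
    return sum_bits, c_out, bp

print(bypass_block([1, 0, 1, 0], [0, 1, 0, 1], 1))
print(bypass_block([1, 1, 0, 0], [1, 0, 0, 0], 0))
```

Logically the two multiplexer inputs agree whenever BP = 1, so the bypass changes only the delay, not the result: the incoming carry reaches the block output without rippling through all four stages.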

16-bit Carry-Bypass Adder

N = 16, M = 4

[Figure: four 4-bit blocks (bits 0–3, 4–7, 8–11, 12–15), each with setup, carry propagation, and sum stages; the critical path is annotated with tsetup, tbypass, and tsum, and the block width is marked as M bits.]

$t_{adder} = t_{setup} + M\,t_{carry} + (N/M - 1)\,t_{bypass} + (M-1)\,t_{carry} + t_{sum}$

Carry Ripple versus Carry Bypass

For small N, the bypass adder is not preferred because of the overhead of the bypass multiplexer.

[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder; the curves cross at N ≈ 4 to 8, depending on technology.]
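The crossover can be explored numerically with the two delay expressions from this lecture. In the sketch below every component delay is set to one unit, which is purely an assumption; real values depend on the technology, so the crossover point it reports is illustrative only.

```python
def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """Ripple-carry adder: (N-1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum

def bypass_delay(n, m, t_setup=1.0, t_carry=1.0, t_bypass=1.0, t_sum=1.0):
    """Carry-bypass adder with N bits in blocks of M bits."""
    return (t_setup + m * t_carry + (n / m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)

def first_crossover(m, max_n=128):
    """Smallest multiple of M at which the bypass adder is faster, or None."""
    for n in range(m, max_n + 1, m):
        if bypass_delay(n, m) < ripple_delay(n):
            return n
    return None

for n in (4, 8, 12, 16, 32):
    print(n, ripple_delay(n), bypass_delay(n, 4))
print("crossover:", first_crossover(4))
```

With these placeholder delays and M = 4 the bypass adder first wins at N = 12. Changing the assumed values of t_bypass or t_setup moves that point, which is the technology dependence the slide refers to.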

Linear Carry-Select Adder

[Figure: one carry-select block: a setup stage produces P and G, two carry chains compute the "0" carry propagation and the "1" carry propagation in parallel, a multiplexer driven by Co,k-1 selects the carry vector and produces Co,k+3, and a sum-generation stage follows.]

Both situations are evaluated

~30% hardware overhead

16-bit Linear Carry-Select Adder

N = 16, M = 4

[Figure: four 4-bit carry-select blocks (bits 0–3, 4–7, 8–11, 12–15), each with setup, 0-carry and 1-carry chains, a multiplexer, and sum generation, producing S0–3, S4–7, S8–11, S12–15; the block carries run from Ci,0 through Co,3, Co,7, Co,11 to Co,15.]

$t_{adder} = t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$
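A behavioral sketch of the linear carry-select idea: each block computes its result for both possible incoming carries, and the actual carry only drives a multiplexer. Names and the integer interface are assumptions for illustration.

```python
def ripple_block(a_bits, b_bits, c_in):
    """Ripple through one block; returns (sum_bits, carry_out)."""
    carry = c_in
    sums = []
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | ((a ^ b) & carry)
    return sums, carry

def carry_select_add(a, b, n=16, m=4, c_in=0):
    """Add two n-bit integers using m-bit carry-select blocks."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    carry = c_in
    result = []
    for lo in range(0, n, m):
        blk_a, blk_b = a_bits[lo:lo + m], b_bits[lo:lo + m]
        # Both situations are evaluated before the real carry is known.
        sums0, c0 = ripple_block(blk_a, blk_b, 0)
        sums1, c1 = ripple_block(blk_a, blk_b, 1)
        # The incoming carry only selects between the two candidates.
        sums, carry = (sums1, c1) if carry else (sums0, c0)
        result.extend(sums)
    value = sum(bit << i for i, bit in enumerate(result))
    return value, carry

print(carry_select_add(12345, 23456))
print(carry_select_add(0xFFFF, 1))
```

In hardware the two chains of every block run in parallel, so after the first block only one multiplexer delay is added per block; the duplicated chains are the source of the hardware overhead.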

Square-Root Carry-Select Adder

[Figure: carry-select adder with stages of increasing width: bits 0–1, 2–4, 5–8, 9–13, and 14–19, producing S0-1, S2-4, S5-8, S9-13, and S14-19 from Ci,0. Each stage has setup, "0" carry, "1" carry, multiplexer, and sum generation, annotated with parenthesized labels (1) to (9).]

N bits, P stages; the first stage has M bits. For example, with M = 2:

$N = 2 + 3 + \dots + (P + 1) \cong \frac{P^2}{2}$, so $P \cong \sqrt{2N}$

$t_{adder} = t_{setup} + M\,t_{carry} + P\,t_{mux} + t_{sum}$

This has the same form as the linear carry-select delay, $t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$, with the number of stages $P$ in place of $N/M$.

Adder Delays: Comparison

[Figure: propagation delay tp (in unit delays, 0 to 50) versus number of bits N (0 to 60) for the ripple adder, the linear select adder, and the square-root select adder.]
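The comparison can be approximated from the three delay expressions given in this lecture. The sketch below sets every component delay to one unit and fixes M = 4 for the linear select adder and M = 2 for the square-root adder; all of these values are assumptions, so the numbers indicate trends rather than reproduce the plotted curves.

```python
import math

def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """Ripple adder: (N-1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum

def linear_select_delay(n, m=4, t_setup=1.0, t_carry=1.0, t_mux=1.0, t_sum=1.0):
    """Linear carry-select: t_setup + M*t_carry + (N/M)*t_mux + t_sum."""
    return t_setup + m * t_carry + (n / m) * t_mux + t_sum

def sqrt_select_delay(n, m=2, t_setup=1.0, t_carry=1.0, t_mux=1.0, t_sum=1.0):
    """Square-root carry-select: P is approximated by sqrt(2N) stages."""
    p = math.sqrt(2 * n)
    return t_setup + m * t_carry + p * t_mux + t_sum

for n in (8, 16, 32, 64):
    print(n, ripple_delay(n), linear_select_delay(n), round(sqrt_select_delay(n), 2))
```

The ripple and linear-select delays both grow linearly in N, with different slopes, while the square-root design grows as the square root of N, which is why it pulls ahead for long word lengths.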

Look-Ahead: Basic Idea

[Figure: N-bit adder in which bit k takes Ak, Bk and an input carry Ci,k and produces Pk and Sk, shown for bits 0, 1, ..., N-1.]

$C_{o,k} = G_k + P_k C_{o,k-1}$

Expanding the look-ahead equations:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} C_{o,k-2})$

All the way:

$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1}(\dots + P_1 (G_0 + P_0 C_{i,0})))$

Carry Determination

$C_{o,0} = G_0 + P_0 C_{i,0}$

$C_{o,1} = G_1 + P_1 C_{o,0} = G_1 + P_1 (G_0 + P_0 C_{i,0}) = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$

$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$

$C_{o,3} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{i,0}$
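The expanded equations can be checked against the recursive form with a small exhaustive test. This is a sketch with illustrative function names; it verifies the Boolean expansion only.

```python
from itertools import product

def lookahead_carries(p, g, c_in):
    """Expanded 4-bit look-ahead: every carry depends only on P, G, and Ci,0."""
    p0, p1, p2, p3 = p
    g0, g1, g2, g3 = g
    c0 = g0 | (p0 & c_in)
    c1 = g1 | (p1 & g0) | (p1 & p0 & c_in)
    c2 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & c_in)
    c3 = (g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
          | (p3 & p2 & p1 & p0 & c_in))
    return [c0, c1, c2, c3]

def ripple_carries(p, g, c_in):
    """Recursive form: Co,k = Gk + Pk * Co,k-1."""
    carries = []
    carry = c_in
    for p_k, g_k in zip(p, g):
        carry = g_k | (p_k & carry)
        carries.append(carry)
    return carries

# Compare both forms for every 4-bit operand pair and carry-in.
all_match = True
for a_bits in product((0, 1), repeat=4):
    for b_bits in product((0, 1), repeat=4):
        p = [a ^ b for a, b in zip(a_bits, b_bits)]
        g = [a & b for a, b in zip(a_bits, b_bits)]
        for c_in in (0, 1):
            if lookahead_carries(p, g, c_in) != ripple_carries(p, g, c_in):
                all_match = False
print(all_match)
```

The number of product terms and the widest AND gate both grow with the bit position, which is what limits a flat look-ahead to a few bits.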

4-bit Look-Ahead

[Figure: single-gate implementation of Co,3 from Ci,0, P0 to P3, and G0 to G3, with a tall transistor stack connected to VDD.]

Large stack

Poor performance

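Hierarchical Decomposition

$C_{o,0} = G_0 + P_0 C_{i,0}$

$C_{o,1} = G_1 + P_1 C_{o,0}$

$C_{o,2} = G_2 + P_2 C_{o,1}$

$C_{o,3} = G_3 + P_3 C_{o,2}$

[Figure: a function F of eight inputs A0 to A7 built as a serial chain, with $t_p \sim N$, and as a balanced tree, with $t_p \sim \log_2(N)$.]

Combining a more significant group (g'', p'') with a less significant group (g', p') gives the pair (g, p):

$g = g'' + g'p''$

$p = p'p''$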

Logarithmic Look-Ahead Adder

$t_p \propto \log_2 N$

[Figure: 16-bit Kogge-Stone tree with inputs (A0, B0) through (A15, B15) and outputs S0 through S15.]
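A behavioral sketch of a Kogge-Stone style prefix adder is shown below. It applies the group-combining rule g = g'' + g'p'', p = p'p'' at doubling distances, so the number of levels is log2 N. The integer interface and names are assumptions, and the sketch makes no attempt to mirror the wiring of the figure.

```python
def dot(high, low):
    """Combine (g, p) of a more significant group with a less significant one."""
    g_hi, p_hi = high
    g_lo, p_lo = low
    return g_hi | (p_hi & g_lo), p_hi & p_lo

def kogge_stone_add(a, b, n=16, c_in=0):
    """Add two n-bit integers; returns (sum, carry_out, prefix_levels)."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    gp = list(zip(g, p))

    dist, levels = 1, 0
    while dist < n:
        # Every position combines with the group 'dist' bits below it.
        gp = [dot(gp[i], gp[i - dist]) if i >= dist else gp[i]
              for i in range(n)]
        dist *= 2
        levels += 1

    # gp[i] now spans bits i..0, so the carry out of bit i follows directly.
    carries = [g_i | (p_i & c_in) for g_i, p_i in gp]
    carry_into = [c_in] + carries[:-1]
    total = sum((p[i] ^ carry_into[i]) << i for i in range(n))
    return total, carries[-1], levels

print(kogge_stone_add(12345, 23456))
print(kogge_stone_add(0xFFFF, 1))
```

For 16 bits the loop runs four times (distances 1, 2, 4, 8), compared with up to 15 sequential carry steps in a ripple adder.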

Multipliers

The Binary Multiplication

Multiplicand                1 0 1 0 1 0
Multiplier          x           1 0 1 1

Partial products            1 0 1 0 1 0
                          1 0 1 0 1 0
                        0 0 0 0 0 0
                    + 1 0 1 0 1 0

Result                1 1 1 0 0 1 1 1 0
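The worked example (101010 times 1011) can be reproduced with a short shift-and-add sketch. The function name is illustrative.

```python
def partial_products(multiplicand, multiplier):
    """Return one shifted partial product per multiplier bit, LSB first."""
    rows = []
    shift = 0
    while multiplier >> shift:
        bit = (multiplier >> shift) & 1
        # A 1 bit contributes the shifted multiplicand, a 0 bit contributes zero.
        rows.append((multiplicand if bit else 0) << shift)
        shift += 1
    return rows

rows = partial_products(0b101010, 0b1011)
result = sum(rows)
for row in rows:
    print(format(row, "09b"))
print(format(result, "09b"))
```

The four rows are 42, 84, 0, and 336 in decimal; their sum is 462, which is 111001110 in binary.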

Reduce Partial Products

Partial products can be reduced by multiplier transformation (Booth's recoding)

Reducing the number of partial products is equivalent to reducing the number of additions
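The slide does not say which recoding variant is meant. As one common form, the sketch below uses radix-4 (modified) Booth recoding, which turns an n-bit unsigned multiplier into roughly n/2 digits from the set {-2, -1, 0, 1, 2}. Treating the multiplier as unsigned with zero padding, and all function names, are assumptions made for this illustration.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit unsigned multiplier into radix-4 digits, LSB first.

    Digit k is y[2k-1] + y[2k] - 2*y[2k+1], with y[-1] = 0 and zero
    padding above bit n-1 so the value is treated as non-negative.
    """
    digits = []
    prev = 0                      # y[-1]
    for i in range(0, n + 1, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x, y, n):
    """Multiply using one partial product per radix-4 digit."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, n)))

digits = booth_radix4_digits(0b1011, 4)
print(digits, booth_multiply(0b101010, 0b1011, 4))
```

Each digit selects 0, ±1, or ±2 times the multiplicand, which needs only a shift and an optional negation, so an 8-bit multiplier needs 5 partial products instead of 8.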

4x4 Bit-Array Multiplier

[Figure: 4x4 array multiplier with inputs X0 to X3 and Y0 to Y3 and outputs Z0 to Z7; three rows of adders, each built from half adders (HA) and full adders (FA), accumulate the partial products for Y1, Y2, and Y3.]

Critical Path

[Figure: the 4x4 array multiplier (inputs X0 to X3, Y0 to Y3, outputs Z0 to Z7, rows of HA and FA cells) with its critical path marked; the path is annotated with tand and with the dimensions M and N-1.]

For an M x N bit-array multiplier:

$t_{multiplier} = [(M-1) + (N-2)]\,t_{carry} + (N-1)\,t_{sum} + t_{and}$

Carry-Save Multiplier

[Figure: carry-save multiplier array: rows of HA and FA cells followed by a vector merging adder.]

$t_{multiplier} = (N-1)\,t_{carry} + t_{and} + t_{merge}$

Carry is saved for the next adder stage
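The two multiplier delay expressions from this lecture can be compared directly. The unit delays and the value of t_merge below are placeholders picked for illustration; the merge delay in particular depends on which adder is used for vector merging.

```python
def array_multiplier_delay(m, n, t_carry=1.0, t_sum=1.0, t_and=1.0):
    """Array multiplier: [(M-1) + (N-2)] * t_carry + (N-1) * t_sum + t_and."""
    return ((m - 1) + (n - 2)) * t_carry + (n - 1) * t_sum + t_and

def carry_save_multiplier_delay(n, t_carry=1.0, t_and=1.0, t_merge=4.0):
    """Carry-save multiplier: (N-1) * t_carry + t_and + t_merge."""
    return (n - 1) * t_carry + t_and + t_merge

for size in (4, 8, 16):
    print(size,
          array_multiplier_delay(size, size),
          carry_save_multiplier_delay(size))
```

Under these assumptions the carry-save form removes the sum-path term and the extra carry ripple from the array, leaving a single pass down the rows plus the final merging adder.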

Multiplier Floorplan

[Figure: floorplan of the 4x4 carry-save multiplier with inputs X0 to X3 and Y0 to Y3 and outputs Z0 to Z7; the array is tiled from HA multiplier cells, FA multiplier cells, and vector merging cells, each with S and C terminals.]

X and Y signals are broadcast through the complete array.

The rectangular shape is easy to integrate.

Partial Products Transformation

[Figure: dot diagrams over bit positions 6 to 0 showing (a) the partial products, (b) the first stage, (c) the second stage, and (d) the final adder; groups of dots are compressed by FA and HA cells.]

Wallace-Tree Multiplier

[Figure: 4x4 Wallace-tree multiplier: the partial products x0y0 through x3y3 are reduced by a first stage and a second stage of FA and HA cells and then summed in the final adder to produce z0 through z7.]

Wallace Tree Multiplier

Pros:
– Substantial hardware saving
– Reduced propagation delay

Cons:
– Irregular structure
– Customized layout

Shifters

The Binary Shifter

[Figure: bit-slice i of a binary shifter connecting inputs Ai and Ai-1 to outputs Bi and Bi-1 under the control signals Right, Left, and nop.]

Widely used for floating-point units and multiplications by constant numbers

This binary shifter is extremely slow for multi-bit applications

The Barrel Shifter

[Figure: 4-bit barrel shifter with data inputs A0 to A3, outputs B0 to B3, and control signals Sh0 to Sh3; the legend distinguishes control wires from data wires.]

Area dominated by control wiring

Decoder control signal

4x4 Barrel Shifter

[Figure: layout of a 4x4 barrel shifter with inputs A0 to A3, control lines Sh0 to Sh3, and an output buffer.]

$Width_{barrel} \sim 2\,p_m\,M$

Logarithmic Shifter

[Figure: logarithmic shifter with inputs A0 to A3, outputs B0 to B3, and three stages controlled by the signal pairs Sh1, Sh2, and Sh4.]

0-7 bit Logarithmic Shifter

[Figure: layout of a 0-7 bit logarithmic shifter with inputs A0 to A3 and outputs Out0 to Out3.]
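A behavioral sketch of a 0-7 bit logarithmic shifter follows: three stages shift by 1, 2, and 4 positions, each enabled by one bit of the shift amount. The shift direction (right), the 8-bit width, and the names are assumptions chosen for illustration.

```python
def logarithmic_shift_right(value, amount, width=8):
    """Shift right by 0..width-1 using stages of 1, 2, 4, ... positions."""
    value &= (1 << width) - 1
    stage = 1
    while stage < width:
        # Stage control (Sh1, Sh2, Sh4, ...) is one bit of the shift amount.
        if amount & stage:
            value >>= stage
        stage *= 2
    return value

for amount in range(8):
    print(amount, format(logarithmic_shift_right(0b10110000, amount), "08b"))
```

Any shift from 0 to 7 is a sum of the stage shifts 1, 2, and 4, so the number of stages grows as log2 of the maximum shift, whereas a barrel shifter needs one control line per shift amount.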
