Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design

Ajay K. Verma, Philip Brisk and Paolo Ienne, Processor Architecture Laboratory (LAP) & Centre for Advanced Digital Systems (CSDA), Ecole Polytechnique Fédérale de Lausanne (EPFL)


Post on 19-Mar-2016



TRANSCRIPT

Page 1: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Ajay K. Verma, Philip Brisk and Paolo Ienne

Processor Architecture Laboratory (LAP) & Centre for Advanced Digital Systems (CSDA)

Ecole Polytechnique Fédérale de Lausanne (EPFL)

Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design

Page 2: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Do We Always Need 100% Accuracy

Ariane 5 explosion (1996), Patriot missile failure (1991)

Cryptography attacks

Page 3: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Ciphertext-Only Attacks (1 of 2)

[Flowchart: the ciphertext and a guessed key feed Decryption, followed by Frequency analysis, with Yes / No outcomes]

Page 4: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Ciphertext-Only Attacks (2 of 2)

Speeding up the decryption process will allow
  A large amount of ciphertext to be deciphered
  More key guesses
Errors in the decryption of a few blocks will NOT
  Affect the frequencies of characters significantly
  Reduce the efficacy of the attack
Use of extremely fast, almost correct arithmetic components is desirable

Page 5: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Our Contribution

Almost Correct Adder (ACA)
  Exponentially faster compared to the fastest reliable adder
  Produces the correct result in 99.99% of cases
  Trade-off between delay and error-precision
Variable Latency Speculative Adder (VLSA)
  For a processor which allows variable latency instructions
  Uses the ACA as a component
  Always produces the correct result
  Extremely fast in more than 99.99% of cases

Page 6: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Outline

Related work
Main idea
  Limited carry propagation occurs in most cases
Design of the ACA
  Delay-optimal design with minimal area
Design of the VLSA
  Error detection and recovery of ACA
Results
Extension to other arithmetic components
  Parallel counters, multipliers etc.
Conclusions

Page 7: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Related Work

Design of optimal adders with respect to different metrics
  Delay and area: Ripple carry adder, Carry lookahead adder, Prefix adder etc.
  Maximum fanout, wire tracks: Kogge-Stone adder, Brent-Kung adder, Knowles adders
  Generation of all Pareto-optimal prefix adders [Liu07]
Probabilistic arithmetic component
  Probabilistic arithmetic component to save energy [George06]
  Razor: circuit level correction for low power operations [Ernst05]
  Error detection and correction due to reduction in power supply voltage [Hegde01]
  Asynchronous speculative adder [Nowick96, Nowick97]

Page 8: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

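Recurrence for A Typical Adder

Inputs a15 a14 a13 a12 … a1 a0 and b15 b14 b13 b12 … b1 b0, sum s15 s14 s13 s12 … s1 s0

$g_i = a_i \wedge b_i$ (generate)

$p_i = a_i \oplus b_i$ (propagate)

$k_i = \overline{a_i + b_i}$ (kill)

$s_i = a_i \oplus b_i \oplus c_{i-1}$

$c_i = \begin{cases} 0 & \text{if } k_i = 1 \\ 1 & \text{if } g_i = 1 \\ c_{i-1} & \text{if } p_i = 1 \end{cases}$

[Figure: carry cells with input c_{i-1} and output c_i, for the gen/kill case and the prop case]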
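The recurrence above can be sketched in a few lines of Python. This is only an illustrative software model, not the circuit from the slides; the function name `ripple_add` and the integer encoding of the operands are assumptions made for the example.

```python
def ripple_add(a, b, n):
    """Add two n-bit integers bit by bit using generate/propagate/kill.

    Returns (sum modulo 2**n, carry out). Hypothetical helper for illustration.
    """
    carry = 0  # carry into bit 0
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        generate = ai & bi
        propagate = ai ^ bi
        kill = 1 - (ai | bi)
        # s_i = a_i xor b_i xor c_{i-1}
        total |= (propagate ^ carry) << i
        # c_i = 1 on generate, 0 on kill, c_{i-1} on propagate
        if generate:
            carry = 1
        elif kill:
            carry = 0
    return total, carry
```

Only the propagate case makes c_i depend on c_{i-1}, which is why the length of propagate chains determines how far a carry can travel.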

Page 9: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Main Idea: Limited Carry Propagation

[Figure: carry chains over bit positions labeled gen, prop, prop, prop, kill]

Page 10: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Longest Sequence of Propagates

Longest sequence of propagates
  = Longest run of 1's in the XOR of the input integers ($A \oplus B$)
  = Longest run of heads in tossing a coin n times

$T_k = T_{k-1}$ + average number of steps to advance from k-1 to k

$T_k = T_{k-1} + \frac{1 + (1 + T_k)}{2}$

$T_k = 2^{k+1} - 2$
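As a small check of the two statements above, the sketch below measures the longest propagate run of a concrete input pair and iterates the coin-tossing recurrence. Solving the recurrence for T_k gives T_k = 2 T_{k-1} + 2 with T_0 = 0, which is what the loop uses. Both function names are hypothetical and chosen only for this example.

```python
def longest_propagate_run(a, b):
    """Length of the longest run of 1s in a XOR b."""
    p = a ^ b
    longest = 0
    while p:
        # Each step shortens every run of 1s by one bit.
        p &= p << 1
        longest += 1
    return longest


def expected_tosses(k):
    """Expected number of fair coin tosses until k heads in a row.

    Uses T_k = 2 * T_{k-1} + 2, i.e. the slide's recurrence solved for T_k.
    """
    t = 0
    for _ in range(k):
        t = 2 * t + 2
    return t
```

For instance, `expected_tosses(k)` can be compared against the closed form `2 ** (k + 1) - 2` for small k.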

Page 11: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Probabilistic Bounds on The Longest Sequence of Propagates

$A_n(x)$ = number of instances in n-bit addition where the longest sequence of propagates is bounded by x

$A_n(x) = \begin{cases} 2^{2n} & \text{if } n \le x \\ 2 A_{n-1}(x) + 2^2 A_{n-2}(x) + \dots + 2^{x+1} A_{n-x-1}(x) & \text{otherwise} \end{cases}$

Bitwidth   Longest sequence of propagates   Longest sequence of propagates
           with 99% probability             with 99.99% probability
64         11                               17
128        12                               18
256        13                               20
512        14                               21
1024       15                               22
2048       16                               23
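The counting argument can be explored with a short script. The sketch below counts propagate patterns (n-bit strings whose longest run of 1s is at most x) rather than input pairs; since every propagate or non-propagate position can be realized by exactly two input bit pairs, the number of input pairs is 2^n times this count, and the probability is the same either way. The function names are assumptions made for this example, and the output is meant to be compared with the table rather than taken as a reproduction of the authors' computation.

```python
from fractions import Fraction


def count_bounded_runs(n, x):
    """Number of n-bit strings whose longest run of 1s is at most x."""
    counts = []
    for m in range(n + 1):
        if m <= x:
            counts.append(2 ** m)  # every string qualifies
        else:
            # Split on the first 0, which follows j leading 1s (j = 0..x).
            counts.append(sum(counts[m - j - 1] for j in range(x + 1)))
    return counts[n]


def prob_run_at_most(n, x):
    """Exact probability that the longest propagate run is at most x."""
    return Fraction(count_bounded_runs(n, x), 2 ** n)


def smallest_bound(n, confidence):
    """Smallest x such that the longest run is at most x with the given probability."""
    x = 0
    while prob_run_at_most(n, x) < confidence:
        x += 1
    return x
```

A call such as `smallest_bound(64, Fraction(99, 100))` gives the bound for 64-bit operands at 99% confidence; exact rational arithmetic is used so that values close to the threshold are not blurred by floating-point rounding.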

Page 12: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

A Primitive Design of ACA (1 of 2)

Page 13: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

A Primitive Design of ACA (2 of 2)

ADD: A[5, 0], B[5, 0] → S[0] … S[5]
ADD: A[6, 1], B[6, 1] → S[6]
ADD: A[7, 2], B[7, 2] → S[7]
…
ADD: A[19, 14], B[19, 14] → S[19]

Large area overhead due to the multitude of small adders
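A behavioural model of this primitive design might look as follows. It is a sketch under assumptions: the defaults (20-bit operands, 6-bit windows) are read off the figure labels, the function name `aca_add` is hypothetical, and the carry out of the top bit is simply dropped.

```python
def aca_add(a, b, n=20, window=6):
    """Speculative sum modulo 2**n: sum bit i only sees bits i-window+1 .. i."""
    mask_n = (1 << n) - 1
    mask_w = (1 << window) - 1
    a &= mask_n
    b &= mask_n
    # First small adder: A[window-1, 0] + B[window-1, 0] gives S[0] .. S[window-1].
    result = (a + b) & mask_w
    # Every further adder works on a window shifted up by one bit and
    # contributes only its top sum bit.
    for i in range(window, n):
        lo = i - window + 1
        part = ((a >> lo) & mask_w) + ((b >> lo) & mask_w)
        result |= ((part >> (window - 1)) & 1) << i
    return result & mask_n
```

With these defaults, `aca_add(3, 5)` agrees with the exact sum, while an input such as `a = (1 << 19) - 1`, `b = 1` has a propagate chain longer than the window and yields a wrong result. In this model the result is exact whenever the longest propagate run is shorter than `window - 1`.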

Page 14: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Area Overhead in ACA (1 of 2)

[Figure: prefix network over inputs a15 a14 a13 a12 … a1 a0 and b15 b14 b13 b12 … b1 b0, arranged by bit position, with outputs p, g (15, 0), p, g (14, 0), …]

Page 15: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Area Overhead in ACA (2 of 2)

Step 1: compute the (p, g) for any group of two consecutive bit positions
Step 2: compute the (p, g) for any group of four consecutive bit positions
Final step: combine the computed (p, g)'s to compute the (p, g) for any group of k consecutive bit positions

A slightly more complicated design can be used to further reduce the hardware area
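The doubling steps can be modelled with bit-parallel integer operations. The sketch below is one possible reading of the steps, not the authors' netlist: it uses the standard prefix operator (P, G) = (P_hi AND P_lo, G_hi OR (P_hi AND G_lo)), treats positions below bit 0 as kill, and the name `group_pg` is hypothetical.

```python
def group_pg(a, b, n, k):
    """Return (P, G) as n-bit integers.

    Bit i of P / G is the group propagate / generate of the k consecutive
    bit positions ending at position i (k >= 1).
    """
    mask = (1 << n) - 1
    p = (a ^ b) & mask
    g = (a & b) & mask
    # levels[t] holds (P, G) for groups of 2**t bits ending at each position.
    levels = [(p, g)]
    width = 1
    while 2 * width <= k:
        P, G = levels[-1]
        levels.append((P & (P << width) & mask,
                       (G | (P & (G << width))) & mask))
        width *= 2
    # Final step: stitch power-of-two groups together to cover k bits.
    P_acc, G_acc, covered = mask, 0, 0
    for t in reversed(range(len(levels))):
        if k & (1 << t):
            P_t, G_t = levels[t]
            G_acc = (G_acc | (P_acc & (G_t << covered))) & mask
            P_acc = (P_acc & (P_t << covered)) & mask
            covered += 1 << t
    return P_acc, G_acc
```

Sharing the power-of-two groups across all bit positions is what avoids building a separate small adder per output bit; the number of combining levels grows with log k rather than with k.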

Page 16: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Outline

Related work
Main idea
  Limited carry propagation occurs in most cases
Design of the ACA
  Delay-optimal design with minimal area
Design of the VLSA
  Error detection and recovery of ACA
Results
Extension to other arithmetic components
  Parallel counters, multipliers etc.
Conclusions

Page 17: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Error Detection

An error can occur only if there is a long chain of propagates

$ER = \sum_{i} p_i \, p_{i+1} \cdots p_{i+k}$, where the sum denotes OR and the product AND

Delay of error detection
  Higher than the delay of an ACA
  Smaller than the delay of a traditional adder
  Experimentally 2/3 of the delay of a traditional adder
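The error signal can be modelled directly from the formula: it is raised when some k+1 consecutive positions are all propagate. The sketch below is a software illustration with a hypothetical function name, not the gate-level detector.

```python
def error_flag(a, b, n, k):
    """True if positions i .. i+k are all propagate for some i (n-bit operands)."""
    run = (a ^ b) & ((1 << n) - 1)
    for _ in range(k):
        # After j iterations, bit i is set iff p_i .. p_{i+j} are all 1.
        run &= run >> 1
    return run != 0
```

The flag is conservative: a long propagate chain does not always corrupt the speculative sum, but a sum can only be wrong when such a chain exists.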

Page 18: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Error Recovery

A significant amount of the ACA computation can be reused for the computation of the correct sum in error recovery

Page 19: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Variable Latency Speculative Adder
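How speculation, detection and recovery fit together can be sketched behaviourally. Everything specific in this model is an assumption for illustration: the function names, the `lookback` parameter (how many lower bits each speculative carry may see), the cycle counts of 1 and 2, and the recovery step, which here simply recomputes the exact sum instead of reusing the ACA's intermediate results.

```python
def speculative_sum(a, b, n, lookback):
    """ACA-style sum: the carry into bit i sees at most `lookback` lower bits."""
    total = 0
    for i in range(n):
        lo = max(0, i - lookback)
        seg = (1 << (i - lo)) - 1
        carry = (((a >> lo) & seg) + ((b >> lo) & seg)) >> (i - lo)
        total |= ((((a ^ b) >> i) & 1) ^ carry) << i
    return total


def error_detected(a, b, n, lookback):
    """True if `lookback` consecutive propagates exist (lookback >= 1)."""
    run = (a ^ b) & ((1 << n) - 1)
    for _ in range(lookback - 1):
        run &= run >> 1
    return run != 0


def vlsa_add(a, b, n, lookback):
    """Return (sum modulo 2**n, cycles): 1 cycle if speculation holds, else 2."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    if not error_detected(a, b, n, lookback):
        return speculative_sum(a, b, n, lookback), 1
    return (a + b) & mask, 2  # recovery path (assumed cost: one extra cycle)
```

The result is always exact; only the latency varies, and the slow path is taken only for inputs with a long propagate chain.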

Page 20: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Example of VLSA Computation

Page 21: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Experimental Setup

Input: N (bitwidth)
Designs
  Traditional fast adder (Prefix adder)
  Almost correct adder (ACA)
  Error detection
  ACA + error recovery (VLSA)
Logic synthesis
  Synopsys Design Compiler, compile_ultra, minimize delay
  Artisan Standard Cells, UMC (0.18 µm)

Page 22: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Results

Average delay of VLSA = 0.70 × delay of traditional adder
Delay of ACA = 0.52 × delay of traditional adder

Page 23: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Conclusions

We have presented an exponentially fast adder that works correctly in more than 99.99% of cases
We have also presented a reliable version of the above adder that
  Works correctly in all cases
  Is extremely fast in more than 99.99% of cases
  Has almost the same delay as a traditional adder in other cases
An extension of a similar approach to other arithmetic components is desirable

Page 24: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Future Work: Can We Have A Fast Almost Correct (Counter/Multiplier)

[Figure: parallel counter example with input bits 00011001110111011101010110011001; labels "Ex [path number] = sum of bits", "Output = path number = 1001", "1001000" and "Var [path number] = high"]

Since each output bit depends on each input bit equally, one cannot discard some input bits in the computation of an output bit

Page 25: Variable Latency Speculative Addition:  A New Paradigm for Arithmetic Circuit Design

Future Work: Few Most Significant Bits in Multiplier

1001 0110 × 1101 1001 = 0111 1111 0010 0110

1001 × 1101 = 0111 0101

Even if we ignore the lower half of the bits of the two inputs, the most significant (log n) bits of the output will remain the same with high probability
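The multiplier observation can be tried out on the example above. The helper below is a hypothetical sketch: it assumes n is even, multiplies only the upper halves of the operands, and compares the top `keep` bits of that product with the top `keep` bits of the full 2n-bit product.

```python
def msb_agreement(a, b, n, keep):
    """True if the top `keep` product bits survive dropping the lower input halves.

    a and b are n-bit unsigned integers, n is assumed even, keep <= n.
    """
    half = n // 2
    full = a * b                        # exact 2n-bit product
    approx = (a >> half) * (b >> half)  # n-bit product of the upper halves
    return (full >> (2 * n - keep)) == (approx >> (n - keep))
```

For the slide's operands, `msb_agreement(0b10010110, 0b11011001, 8, 4)` holds, since both products start with 0111. Agreement is probabilistic rather than guaranteed: for `a = b = 0b11111111` the leading four bits differ.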