arithmetic for computers mehran rezaeiengold.ui.ac.ir/~m.rezaei/architecture/calendar/... ·...

Post on 19-Jul-2020


Arithmetic For Computers

Mehran Rezaei

Introduction

• What happens if an operation generates a number too big to be represented in the space originally given?
• How does hardware multiply and divide numbers?
• What about fractions, floating-point numbers, and real numbers? How does a computer deal with them?

Addition and subtraction

[Figure: 32-bit adder/subtractor built from one-bit adders; inputs A0..A31 and B0..B31, control input op (a/s), outputs R0..R31, and a zero flag z]

Have you thought of performance?

[Figure: one-bit full adder with inputs a, b, cin and outputs s, cout]

Big-O notation

• $f(n)$ is $O(g(n))$ if two constants $n_0$ and $c$ can be found that satisfy $f(n) < c \cdot g(n)$ for any $n > n_0$
• $g(n)$ is a simple function: $1$, $n$, $\log_2 n$, $n^2$, $n^3$, $2^n$
• The following are $O(n^2)$:

[Examples appeared as an image on the original slide]

Big-O notation (cont’d)

Have you thought of performance?

[Figure: one-bit full adder with inputs a, b, cin and outputs s, cout]

$T_{pd}(\mathrm{result}_3) = 3 \cdot (T_{pd}(\mathrm{nand3}) + T_{pd}(\mathrm{nand2})) + T_{pd}(\mathrm{xor})$

$T_{pd}(\mathrm{result}_{n-1}) = (n-1) \cdot (T_{pd}(\mathrm{nand3}) + T_{pd}(\mathrm{nand2})) + T_{pd}(\mathrm{xor})$

$T_{pd}(\mathrm{result}_{n-1}) = (n-1) \cdot \mathrm{constant}_1 + \mathrm{constant}_2 = O(n)$
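As a rough illustration of the linear delay, the following Python sketch models a ripple-carry adder bit by bit and evaluates the delay formula. The function names and the default gate delays (arbitrary units) are assumptions made for this example, not values from the slides.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_carry_add(a, b, n=32, cin=0):
    """Add two n-bit values, passing the carry from bit 0 up to bit n-1."""
    result, carry = 0, cin
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

def result_delay(n, t_nand3=1.0, t_nand2=1.0, t_xor=2.0):
    """Tpd(result_{n-1}) = (n-1)*(Tpd(nand3)+Tpd(nand2)) + Tpd(xor).
    The default gate delays are placeholders, not measured values."""
    return (n - 1) * (t_nand3 + t_nand2) + t_xor
```

Doubling `n` roughly doubles `result_delay(n)`, which is the O(n) behavior the last formula states.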

Ripple carry adder

• What seems to be the problem?
• N-bit (32-bit or 64-bit) ripple carry adder

Delay = O(n)

Area = O(n)

Carry select adder

$T_{pd}(\text{32-bit CSA}) = T_{pd}(\text{16-bit RCA}) + T_{pd}(\text{multiplexer})$

Slide courtesy of Chris Terman, Computation Structures, MIT
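The idea behind the delay formula can be sketched in Python: the upper 16 bits are added twice in parallel, once for each possible carry-in, and a multiplexer picks the right result once the lower half's carry is known. The function name is hypothetical, and Python's `+` stands in for the 16-bit ripple-carry adders.

```python
def carry_select_add32(a, b, cin=0):
    """32-bit carry-select addition built from 16-bit halves.
    Returns (32-bit sum, carry out)."""
    mask16 = 0xFFFF
    lo = (a & mask16) + (b & mask16) + cin      # lower 16-bit adder
    hi_a, hi_b = (a >> 16) & mask16, (b >> 16) & mask16
    hi0 = hi_a + hi_b                           # upper half assuming carry-in 0
    hi1 = hi_a + hi_b + 1                       # upper half assuming carry-in 1
    hi = hi1 if (lo >> 16) else hi0             # multiplexer driven by the low carry
    return ((hi & mask16) << 16) | (lo & mask16), hi >> 16
```

The two upper-half sums are the reason the area roughly doubles for that half of the adder.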

Slide courtesy of Chris Terman, Computation Structures, MIT

Speedup: 2.5 times faster than a 32-bit ripple carry adder (at the cost of twice as much area)

Different flavors of adders

• Ripple Carry Adder (small waves are called ripples)
  – RCA
• Carry Select Adder
  – Delay: $O(\log_2 n)$
  – Area: $O(n)$
• Carry Lookahead Adder (CLA)
  – Delay: $O(\log_2 n)$
  – Area: $O(n \log_2 n)$
• Carry skip adder
  – Delay: $O(n^{1/2})$
  – Area: $O(n)$

Carry Lookahead Adder

$C_1 = A_0 B_0 + A_0 C_0 + B_0 C_0 = A_0 B_0 + C_0 (A_0 + B_0)$

$C_2 = G_1 + C_1 P_1 = G_1 + (G_0 + C_0 P_0) P_1$

$C_3 = G_2 + C_2 P_2 = G_2 + (G_1 + G_0 P_1 + C_0 P_0 P_1) P_2$

$C_4 = G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3 + C_0 P_0 P_1 P_2 P_3$
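A minimal Python sketch of the expanded carry equations for a 4-bit adder, assuming the usual definitions $G_i = A_i \cdot B_i$ (generate) and $P_i = A_i + B_i$ (propagate). The function name is made up for this example.

```python
def cla4_carries(a, b, c0=0):
    """Return [C1, C2, C3, C4] for 4-bit a and b, each computed directly
    from G, P and C0 rather than from the previous carry."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [A[i] & B[i] for i in range(4)]   # generate
    P = [A[i] | B[i] for i in range(4)]   # propagate
    c1 = G[0] | (c0 & P[0])
    c2 = G[1] | (G[0] & P[1]) | (c0 & P[0] & P[1])
    c3 = G[2] | (G[1] & P[2]) | (G[0] & P[1] & P[2]) | (c0 & P[0] & P[1] & P[2])
    c4 = (G[3] | (G[2] & P[3]) | (G[1] & P[2] & P[3])
          | (G[0] & P[1] & P[2] & P[3]) | (c0 & P[0] & P[1] & P[2] & P[3]))
    return [c1, c2, c3, c4]
```

Every carry depends only on the inputs, so no carry has to wait for the one below it; the price is the growing number of product terms.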

Carry Lookahead Adder (cont’d)

$C_4 = G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3 + C_0 P_0 P_1 P_2 P_3$

Delay: $O(\log_2 n)$

Area: $O(n \log_2 n)$

Hybrid (CLA & CRA)

[Figure: four 8-bit CLA blocks, each with two 8-bit inputs and one 8-bit output; carries C0, C8, C16, C24, C32; Group Generate / Group Propagate signals]

Final note on CLA

• Could I change $P_i = A_i + B_i$ to $P_i = A_i \oplus B_i$?

If Cin = 0 then “carry is generated”
Else “carry is propagated”
End if

$C_{i+1} = G_i + C_i \cdot P_i$ where $G_i = A_i \cdot B_i$ and $P_i = A_i + B_i$

$P_i = A_i \oplus B_i \Rightarrow$

$C_{i+1} = G_i + C_i \cdot P_i$

$= A_i B_i + C_i (A_i B_i' + A_i' B_i)$

$= A_i B_i + C_i A_i B_i' + C_i A_i' B_i$

$= A_i B_i + C_i A_i B_i' + A_i B_i + C_i A_i' B_i$

$= A_i (B_i + C_i B_i') + B_i (A_i + C_i A_i')$

$= A_i (B_i + B_i')(B_i + C_i) + B_i (A_i + A_i')(A_i + C_i)$

$= A_i (B_i + C_i) + B_i (A_i + C_i)$

$= A_i B_i + C_i (A_i + B_i)$
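The algebra can also be confirmed by brute force. This short Python check, with function names chosen for this example, compares the carry computed with an OR-based propagate against the carry computed with an XOR-based propagate for all eight input combinations.

```python
from itertools import product

def carry_with_or(a, b, c):
    """C_{i+1} with P_i = A_i + B_i."""
    return (a & b) | (c & (a | b))

def carry_with_xor(a, b, c):
    """C_{i+1} with P_i = A_i xor B_i."""
    return (a & b) | (c & (a ^ b))

same = all(carry_with_or(a, b, c) == carry_with_xor(a, b, c)
           for a, b, c in product((0, 1), repeat=3))
print(same)  # True
```

The two forms differ only when $A_i = B_i = 1$, and in that case $G_i$ already forces the carry to 1.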

Final note on CLA (cont’d)

[Figure: full adder (FA) with inputs Ai, Bi, Ci and outputs Pi, Si, Gi]

$C_{oH} = G_H + C_{iH} P_H = G_H + (G_L + C_{iL} P_L) P_H$

$= G_H + G_L P_H + C_{iL} P_L P_H$

$= G_{HL} + C_{iL} P_{HL}$

where $G_{HL} = G_H + G_L P_H$ and $P_{HL} = P_L P_H$
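The combination rule for a high (H) and a low (L) block can be applied repeatedly in a tree. The Python sketch below, with hypothetical function names, reduces the per-bit (G, P) pairs of an n-bit addition (n a power of two) to one group pair in $\log_2 n$ levels and derives the carry out from it.

```python
def combine(high, low):
    """Merge (G, P) of a high block with (G, P) of the adjacent low block:
    G_HL = G_H + G_L.P_H and P_HL = P_L.P_H."""
    g_h, p_h = high
    g_l, p_l = low
    return g_h | (g_l & p_h), p_l & p_h

def group_gp(a, b, n):
    """Reduce the per-bit (G, P) pairs pairwise until one group pair remains."""
    level = [(((a >> i) & (b >> i)) & 1, ((a >> i) | (b >> i)) & 1)
             for i in range(n)]
    while len(level) > 1:   # each pass halves the list: log2(n) levels
        level = [combine(level[i + 1], level[i]) for i in range(0, len(level), 2)]
    return level[0]

def carry_out(a, b, n, cin=0):
    """Carry out of the whole n-bit block: G_group + Cin.P_group."""
    g, p = group_gp(a, b, n)
    return g | (cin & p)
```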

8-bit CLA (with Generate and Propagate)

[Figure: 8-bit CLA with generate and propagate signals, annotated O(log N)]

Addition/Subtraction and overflow

• Examples
0x4E + 0x1F
0x4E – 0x1F

• Overflow conditions

Operation   Operand A   Operand B   Result indicating overflow
A + B       ≥ 0         ≥ 0         < 0
A + B       < 0         < 0         ≥ 0
A – B       ≥ 0         < 0         < 0
A – B       < 0         ≥ 0         ≥ 0
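A small Python sketch of the overflow conditions, applied to the two examples. The 8-bit word width and the function names are assumptions of this example; the slide does not state a width.

```python
def to_signed(x, n=8):
    """Interpret the low n bits of x as a two's-complement number."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x

def add_overflows(a, b, n=8):
    """A + B overflows when A and B share a sign and the result does not."""
    r = to_signed(a + b, n)
    a, b = to_signed(a, n), to_signed(b, n)
    return (a >= 0) == (b >= 0) and (r >= 0) != (a >= 0)

def sub_overflows(a, b, n=8):
    """A - B overflows when A and B differ in sign and the result's sign differs from A's."""
    a, b = to_signed(a, n), to_signed(b, n)
    r = to_signed(a - b, n)
    return (a >= 0) != (b >= 0) and (r >= 0) != (a >= 0)

print(hex((0x4E + 0x1F) & 0xFF), add_overflows(0x4E, 0x1F))  # 0x6d False
print(hex((0x4E - 0x1F) & 0xFF), sub_overflows(0x4E, 0x1F))  # 0x2f False
```

Neither example overflows in 8 bits; `add_overflows(0x4E, 0x4E)` does, because 78 + 78 exceeds 127.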

What does the ALU (hardware) do when overflow happens?

• Ignore it
  – The programmer is responsible for handling it
• Leave it to the OS
  – Either the OS takes care of it completely
  – Or it signals the application
• What does MIPS do?
  – For signed operations, it throws an exception if overflow occurs
  – It ignores overflow in unsigned operations

Signed addition with status

• Adder with
  – Carry-in: needs an extra bit (LSB)
  – Carry-out: needs an extra bit (MSB)
  – Overflow: the two operands have the same sign but the sum has a different sign
  – Zero: 1 if the result is zero and there is no overflow, 0 otherwise
  – Sign (of the addition result): if there is no overflow, the MSB of the result; else its complement, (MSB result)’
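The status outputs listed above can be sketched in Python as follows. The function name and the dictionary keys are choices made for this example only.

```python
def add_with_status(a, b, cin=0, n=32):
    """n-bit addition returning the result plus carry, overflow, zero and sign flags."""
    mask = (1 << n) - 1
    a, b = a & mask, b & mask
    total = a + b + cin
    result = total & mask
    msb_a, msb_b, msb_r = a >> (n - 1), b >> (n - 1), result >> (n - 1)
    overflow = int(msb_a == msb_b and msb_r != msb_a)
    return {
        "result": result,
        "carry": total >> n,                        # carry out of the MSB
        "overflow": overflow,
        "zero": int(result == 0 and not overflow),
        "sign": msb_r ^ overflow,                   # MSB, complemented on overflow
    }
```

For example, `add_with_status(0x7F, 0x01, n=8)` reports overflow with sign 0: the true sum (+128) is positive even though the stored MSB is 1.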

Array Multiplier

$y = \sum_{i=0}^{n-1} a \cdot b_i \cdot 2^i$
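The sum can be read as "add a shifted copy of a for every 1 bit of b". A minimal Python sketch, with a function name chosen for this example:

```python
def multiply_by_partial_products(a, b, n):
    """y = sum over i of a * b_i * 2^i, where b_i is bit i of the n-bit multiplier b."""
    y = 0
    for i in range(n):
        b_i = (b >> i) & 1
        y += (a * b_i) << i      # partial product i, weighted by 2^i
    return y
```

An array multiplier computes all of these partial products in hardware at once and adds them with rows of full adders.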

Array Multiplier

[Figure, built up over four slides: rows of three full adders (FA), one row added per slide, up to four rows]

Add and Shift Multiplier

Add and Shift Multiplier (Cont’d)

Multiplicand 1101, multiplier 1011

Step           Cout  Product
initial        0     0000 0000
add 1101       0     1101 0000
shift right    0     0110 1000
add 1101       1     0011 1000
shift right    0     1001 1100
add 0000       0     1001 1100
shift right    0     0100 1110
add 1101       1     0001 1110
shift right    0     1000 1111


Add and Shift Multiplier (Cont’d)

[Figure: registers Cout, Product, X, Y]

for (i = 0; i < 4; i++) {
    if (Product[0] == 1)                              // current multiplier bit is 1
        (Cout, Product[7-4]) <-- Product[7-4] + Y;    // add Y to the upper half, keep the carry
    ShiftRight(Cout, Product);                        // shift Cout and Product right by one bit
}
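The loop can be simulated in Python. This sketch assumes that the lower half of Product initially holds the multiplier X (which is what testing Product[0] suggests) and that Y is the multiplicand; the function name is hypothetical.

```python
def add_and_shift_multiply(x, y, n=4):
    """Unsigned n-bit multiply: x is the multiplier, y the multiplicand."""
    mask = (1 << n) - 1
    product = x & mask                  # upper half 0, lower half = multiplier (assumed)
    for _ in range(n):
        cout = 0
        if product & 1:                                  # Product[0] == 1
            upper = (product >> n) + (y & mask)          # upper half + Y
            cout = upper >> n                            # carry out of the n-bit add
            product = ((upper & mask) << n) | (product & mask)
        product = ((cout << (2 * n)) | product) >> 1     # ShiftRight(Cout, Product)
    return product

print(format(add_and_shift_multiply(0b1011, 0b1101), "08b"))  # 10001111
```

Each shift discards one used multiplier bit and makes room for one more product bit, so a single 2n-bit register is enough.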

Data path and control unit

Data path

• Definition:
  – A collection of computational components (e.g., adders/subtractors, multipliers/dividers, FP computational units, …) and memory elements (e.g., flip-flops, registers, shift registers, …), connected to each other via routing networks (buses), that carries out all the requirements defined in the system’s specification

Control Unit

• Definition:
  – A combinational or sequential circuit that controls the flow of data in the data path so that it carries out all the requirements defined in the system’s specification

Synchronous vs Asynchronous circuits

• Globally synchronous circuit: all memory elements (D FFs) are controlled (synchronized) by a common global clock signal
• Globally asynchronous but locally synchronous circuit (GALS)
• Globally asynchronous circuit
  – Uses D FFs, but not with a global clock
  – Or uses no clock signal at all

Synchronous Circuit

• The big idea: synchronous methodology
  – Group all D FFs together under a single clock
  – Only need to deal with the timing constraint of one memory element

Routing Networks (Buses)

• Tri-state buffer:
  – Output with “high-impedance”

Routing Networks (Bi-directional)

Routing Networks (Multiplexers)

What are the differences between tri-state buffers and multiplexers?

Routing Networks (Cont’d)

[Figure: two groups of blocks labeled S and D]

Routing Networks (Multiplexers)

What are the differences between tri-state buffers and multiplexers?

[Figure: inputs i1 and i2, select signals sel1 and sel2, output o]

Side note wrap up

• Methodology
  – Separate data path from control unit
    • Defined data path and control unit
  – Synchronous circuit design
• Routing networks and buses
  – Tri-state buffers
  – Multiplexers as routing networks

Add and Shift Multiplier (Cont’d)

[Figure: registers Cout, Product, X, Y, with an additional label A]

for (i = 0; i < 4; i++) {
    if (Product[0] == 1)                              // current multiplier bit is 1
        (Cout, Product[7-4]) <-- Product[7-4] + Y;    // add Y to the upper half, keep the carry
    ShiftRight(Cout, Product);                        // shift Cout and Product right by one bit
}

A&S multiplier – data path

• Requirements
  – 4-bit registers
    • Shift register A (clear, load, and shift right)
    • Shift register X (load and shift right): multiplier
    • Register Y (load): multiplicand
  – A flip-flop (load and clear)
  – 4-bit adder
  – A 2-to-1 multiplexer (if we have a 4-bit-wide output bus)

A&S multiplier – data path (cont’d)

[Figure, shown on two slides: registers A, X, Y and flip-flop Cout around an adder (outputs: result, carry out), connected to the Input Bus and the Output Bus; the second slide adds the control signals sel, clear, load, and shift]

A&S multiplier – control unit

Examples

A catch

[Figure: two state diagrams with a counter update (n, n - 1) and transitions on n = 0 and n != 0; the second diagram adds a wait state]

A&S multiplier – control unit

[Figure: control-unit state diagram. Transition labels: start, X[0] = 1, X[0] = 0. State actions: load X, clear Cout, clear A; load Y; add (load A, load Cout); shift (shift A, shift X); done = 1 with sel = 0; done = 1 with sel = 1. An entity block is also shown.]

Add & Shift Multiplier

[Figure: Multiplicand register (32 bits), 32-bit ALU, combined Product/Multiplier register (64 bits), Control]

What happens on signed multiplications?

Signed Multiplications

• Consider again our 4-bit word multiplication, where X is the multiplicand and Y is the multiplier
  – If x3 = y3 = 0
    • Unsigned multiplication
  – If x3 = 0 and y3 = 1
    • For the first 3 steps, do the normal add and shift; in the final step, P = P – X
  – If x3 = 1 and y3 = 0
    • Do the normal shift until the first 1 is reached (in Y); from this point on, shift in 1 instead of 0
  – If x3 = 1 and y3 = 1
    • Think about this as a homework problem

Signed Multiplication X Positive and Y negative

Multiplicand 0110, multiplier 1001

Step  Multiplier  Result                            Product     Result + Product  After shift
1     1001        0110                              0000 0000   0110 0000         0011 0000
2     0100        0000                              0011 0000   0011 0000         0001 1000
3     0010        0000                              0001 1000   0001 1000         0000 1100
4     0001        0110 -> 1010 (two’s complement)   0000 1100   1010 1100         1101 0110

Booth’s algorithm (1951)

• In all the approaches we have seen, the multiplier was examined bit by bit
• Can we take advantage of both addition and subtraction?
• In Booth’s algorithm, every two bits of the multiplier indicate the action

Booth’s algorithm (Cont’d)

[Figure: multiplier bits 0 0 0 1 1 1 1 0 with the beginning, middle, and end of the run of 1s marked; the current and previous bit positions are highlighted]

Observation: 011110 = 100000 – 000010

Booth’s algorithm (example)

Multiplicand 0110, multiplier 1001

Action        Product     Extra bit
Prd (init)    0000 1001   0
Sub (+1010)   1010 1001   0
Shift         1101 0100   1
Add (+0110)   0011 0100   1
Shift         0001 1010   0
Shift         0000 1101   0
Sub (+1010)   1010 1101   0
Shift         1101 0110
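A Python sketch of Booth's algorithm for n-bit two's-complement operands, written to follow the register layout of the example (product register plus one extra bit on the right). The function name is hypothetical. With an n-bit upper half, the sketch is only intended for multiplicands other than the most negative value $-2^{n-1}$, whose negation does not fit in n bits.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Multiply two n-bit two's-complement integers with Booth's algorithm.
    Not intended for multiplicand == -2**(n-1) (its negation overflows n bits)."""
    mask = (1 << n) - 1
    m = multiplicand & mask
    upper = 0                     # upper half of the product register
    lower = multiplier & mask     # lower half starts as the multiplier
    prev = 0                      # extra bit to the right of the register
    for _ in range(n):
        cur = lower & 1
        if (cur, prev) == (1, 0):         # beginning of a run of 1s: subtract
            upper = (upper - m) & mask
        elif (cur, prev) == (0, 1):       # end of a run of 1s: add
            upper = (upper + m) & mask
        # arithmetic right shift of (upper, lower, extra bit)
        prev = cur
        lower = ((upper & 1) << (n - 1)) | (lower >> 1)
        upper = (upper >> 1) | (upper & (1 << (n - 1)))
    product = (upper << n) | lower
    # interpret the 2n-bit register as a signed value
    return product - (1 << (2 * n)) if product >> (2 * n - 1) else product

print(booth_multiply(6, -7))  # -42
```

For the example operands the register ends as 1101 0110, which is -42 in 8-bit two's complement.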

Divide: Paper & Pencil

          1001       Quotient
1000 ) 1001010       Divisor ) Dividend
      -1000
      -----
          10
          101
          1010
         -1000
         -----
            10       Remainder

Dividend : Divisor = Quotient

Divisor * Quotient + Remainder = Dividend

Divide: Paper & Pencil (Cont’d)

Example: 7 : 2, or 0000 0111 : 0010

• Initial values: Divisor Rg = 0010 0000, Remainder Rg = 0000 0111
• Rg = Rg – Div (add 1110 0000, the two’s complement of Div)
• Rg < 0 -> Q0 = 0 and Rg = Rg + Div (add 0010 0000)
• Rg >= 0 -> Q0 = 1

[Figure: Divisor, Remainder, and Quotient registers connected to an adder/subtractor]

Divide: Paper & Pencil (Cont’d)

• Shift Div to the right: Divisor Rg = 0001 0000, Remainder Rg = 0000 0111

[Figure: Divisor, Remainder, and Quotient registers connected to an adder/subtractor]

Divide: Paper & Pencil (Cont’d)

• Check whether iteration N+1 has been reached (yes: done; no: repeat): Divisor Rg = 0001 0000, Remainder Rg = 0000 0111

[Figure: Divisor, Remainder, and Quotient registers connected to an adder/subtractor, with a Done test]

The rest of the example

Page 268, figure 4.38

V1 algorithm

Start: Place the Dividend in the Remainder register.

1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.

Test Remainder:

2a. Remainder ≥ 0: Shift the Quotient register to the left, setting the new rightmost bit to 1.

2b. Remainder < 0: Restore the original value by adding the Divisor register to the Remainder register, and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.

3. Shift the Divisor register right 1 bit.

n+1 repetitions?
No (< n+1 repetitions): go back to step 1.
Yes (n+1 repetitions; n = 4 here): Done.
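The flowchart translates almost line by line into Python. In this sketch the function name is hypothetical, the dividend is assumed to fit in n bits, and the divisor starts in the upper half of a 2n-bit register.

```python
def divide_v1(dividend, divisor, n=4):
    """Restoring division, version 1: returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    remainder = dividend          # Start: place Dividend in Remainder
    div = divisor << n            # Divisor sits in the upper half of its register
    quotient = 0
    for _ in range(n + 1):        # n+1 repetitions
        remainder -= div                            # step 1
        if remainder >= 0:
            quotient = ((quotient << 1) | 1) & mask # step 2a
        else:
            remainder += div                        # step 2b: restore
            quotient = (quotient << 1) & mask
        div >>= 1                                   # step 3
    return quotient, remainder

print(divide_v1(7, 2))  # (3, 1)
```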

DIVIDE HARDWARE Version 1

[Figure: Divisor register (64 bits, shift right), 64-bit ALU, Remainder register (64 bits, write), Quotient register (32 bits, shift left), Control]

Observations on Divide Version 1

• Half of the bits in the divisor register are always 0
  => half of the 64-bit adder is wasted
  => half of the divisor register is wasted
  – Cut the divisor register and the ALU in half
• Instead of shifting the divisor to the right, shift the remainder to the left?

DIVIDE HARDWARE Version 2

[Figure: Divisor register (32 bits), 32-bit ALU, Remainder register (64 bits, shift left, write), Quotient register (32 bits, shift left), Control]

Observations on Divide Version 2

• If the quotient received a 1 in the first iteration, the quotient register would not be long enough to hold the value (one bit for every iteration!)
• Eliminate the Quotient register by combining it with the Remainder register, which is shifted left
  – Start by shifting the Remainder left as before.
  – Thereafter the loop contains only two steps, because shifting the Remainder register shifts both the remainder in the left half and the quotient in the right half.
  – The consequence of combining the two registers and of the new order of the operations in the loop is that the remainder will be shifted left one time too many.
  – Thus the final correction step must shift back only the remainder in the left half of the register.
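A Python sketch of the combined-register scheme described above, assuming the loop runs n times after the initial shift and that the quotient bit is inserted during the shift. The function name is hypothetical, and Python's unbounded integers stand in for the register plus the bit that can be shifted out of its left end.

```python
def divide_v3(dividend, divisor, n=4):
    """Restoring division with a single combined Remainder/Quotient register.
    Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    rem = dividend << 1                  # start by shifting the Remainder left
    for _ in range(n):
        left = (rem >> n) - divisor      # subtract Divisor from the left half
        if left >= 0:
            rem = (left << n) | (rem & mask)
            rem = (rem << 1) | 1         # shift left, new rightmost bit = 1
        else:
            rem = rem << 1               # left half unchanged (restored), new bit = 0
    quotient = rem & mask                # right half holds the quotient
    remainder = (rem >> n) >> 1          # correction: shift the left half back once
    return quotient, remainder

print(divide_v3(7, 2))  # (3, 1)
```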

DIVIDE HARDWARE Version 3

[Figure: Divisor register (32 bits), 32-bit ALU, combined Remainder (Quotient) register (64 bits, shift left, write) with halves labeled “HI” and “LO”, Control]

V3 example

Page 271, figure 4.42

Observations on Divide Version 3

• Same hardware as multiply: just need an ALU to add or subtract, and a 64-bit register to shift left or shift right
• Hi and Lo registers in MIPS combine to act as a 64-bit register for multiply and divide
• Signed divides: simplest is to remember the signs, make the operands positive, and complement the quotient and remainder if necessary
  – Note: Dividend and Remainder must have the same sign
  – Note: Quotient is negated if the Divisor sign and the Dividend sign disagree, e.g., –7 ÷ 2 = –3, remainder = –1
• Possible for the quotient to be too large: if you divide a 64-bit integer by 1, the quotient is 64 bits (called “saturation”)
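The sign rules for signed division can be sketched in Python as follows; the function name is chosen for this example. Python's own `//` and `%` round toward negative infinity (`-7 // 2` is `-4`), so the sketch divides the magnitudes and fixes the signs afterwards, as the slide suggests.

```python
def signed_divide(dividend, divisor):
    """Divide magnitudes, then apply the sign rules:
    the remainder takes the dividend's sign, and the quotient is negated
    when the operand signs disagree."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    q, r = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        q = -q
    if dividend < 0:
        r = -r
    return q, r

print(signed_divide(-7, 2))  # (-3, -1)
```

In every case `divisor * q + r == dividend` still holds.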

Multiplication and division instructions

Page 274, figure 4.43

Review of Numbers

• Computers are made to deal with numbers
• What can we represent in N bits?
  – Unsigned integers: $0$ to $2^N - 1$
  – Signed integers (two’s complement): $-2^{(N-1)}$ to $2^{(N-1)} - 1$

Other Numbers

• What about other numbers?
  – Very large numbers? (seconds/century) $3{,}155{,}760{,}000_{10}$ ($3.15576_{10} \times 10^{9}$)
  – Very small numbers? (atomic diameter) $0.00000001_{10}$ ($1.0_{10} \times 10^{-8}$)
  – Rationals (repeating pattern): 2/3 (0.666666666...)
  – Irrationals: $2^{1/2}$ (1.414213562373...)
  – Transcendentals: $e$ (2.718...), $\pi$ (3.141...)
• All represented in scientific notation

Scientific Notation Review

$6.02 \times 10^{23}$: mantissa 6.02 (with its decimal point), radix (base) 10, exponent 23

• Normalized form: no leading 0s (exactly one digit to the left of the decimal point)
• Alternatives for representing 1/1,000,000,000
  – Normalized: $1.0 \times 10^{-9}$
  – Not normalized: $0.1 \times 10^{-8}$, $10.0 \times 10^{-10}$

Scientific Notation for Binary Numbers

$1.0_{two} \times 2^{-1}$: mantissa 1.0 (with its “binary point”), radix (base) 2, exponent –1

• Computer arithmetic that supports it is called floating point, because it represents numbers where the binary point is not fixed, as it is for integers
  – Declare such a variable in C as float

Floating Point Representation (1/2)

• Normal format: $+1.xxxxxxxxxx_{two} \times 2^{yyyy_{two}}$
• Multiple of Word Size (32 bits)

Bits     31      30-23       22-0
Field    S       Exponent    Significand
Width    1 bit   8 bits      23 bits

• S represents the Sign, Exponent represents the y’s, Significand represents the x’s
• Represent numbers as small as $2.0 \times 10^{-38}$ to as large as $2.0 \times 10^{38}$

Floating Point Representation (2/2)

• What if the result is too large? ($> 2.0 \times 10^{38}$)
  – Overflow!
  – Overflow => Exponent larger than can be represented in the 8-bit Exponent field
• What if the result is too small? ($> 0$, $< 2.0 \times 10^{-38}$)
  – Underflow!
  – Underflow => Negative exponent larger than can be represented in the 8-bit Exponent field
• How to reduce the chances of overflow or underflow?

Double Precision Fl. Pt. Representation

Next Multiple of Word Size (64 bits)

Bits     31      30-20       19-0
Field    S       Exponent    Significand
Width    1 bit   11 bits     20 bits

Second word: Significand (cont’d), 32 bits

• Double Precision (vs. Single Precision)
  – C variable declared as double
  – Represent numbers almost as small as $2.0 \times 10^{-308}$ to almost as large as $2.0 \times 10^{308}$
  – But the primary advantage is greater accuracy due to the larger significand

IEEE 754 Floating Point Standard

• Single Precision, DP similar
• Sign bit: 1 means negative, 0 means positive
• Significand:
  – To pack more bits, the leading 1 is implicit for normalized numbers
  – 1 + 23 bits single, 1 + 52 bits double
  – Always true: Significand < 1 (for normalized numbers)
• Note: 0 has no leading 1, so reserve exponent value 0 just for the number 0

IEEE 754 Floating Point Standard

• Kahan wanted FP numbers to be usable even without FP hardware; e.g., sort records with FP numbers using integer compares
• Could break an FP number into 3 parts: compare signs, then compare exponents, then compare significands
• Wanted it to be faster: a single compare if possible, especially for positive numbers
• Then want this order:
  – Highest-order bit is the sign (negative < positive)
  – Exponent next, so bigger exponent => bigger #
  – Significand last: if the exponents are the same, bigger significand => bigger #
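The effect of this field order can be observed with Python's standard `struct` module: for non-negative values, comparing the 32-bit patterns as unsigned integers gives the same order as comparing the floats. The helper name and the sample values are assumptions of this example.

```python
import struct

def float_bits(x):
    """32-bit IEEE 754 single-precision pattern of x, as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# Non-negative sample values, already in increasing order.
values = [0.0, 1.5e-10, 0.75, 1.0, 1.0000001, 3.5, 6.02e23]
patterns = [float_bits(v) for v in values]
print(patterns == sorted(patterns))  # True
```

Negative numbers need extra care, because the format is sign-magnitude rather than two's complement.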

IEEE 754 Floating Point Standard

• Called Biased Notation, where the bias is the number subtracted to get the real number
  – IEEE 754 uses a bias of 127 for single precision
  – Subtract 127 from the Exponent field to get the actual value of the exponent
  – 1023 is the bias for double precision

Bits     31      30-23       22-0
Field    S       Exponent    Significand
Width    1 bit   8 bits      23 bits

• $(-1)^S \times (1 + \text{Significand}) \times 2^{(\text{Exponent} - 127)}$
  – Double precision is identical, except with an exponent bias of 1023

“Father” of the Floating point standard

IEEE Standard 754 for Binary Floating-Point Arithmetic.

Prof. Kahan, 1989 ACM Turing Award Winner

www.cs.berkeley.edu/~wkahan/

…/ieee754status/754story.html

Converting Decimal to FP

• Simple case: if the denominator is a power of 2 (2, 4, 8, 16, etc.), then it’s easy.
• Show the MIPS representation of -0.75

$-0.75 = -3/4$

$-11_{two}/100_{two} = -0.11_{two}$

Normalized to $-1.1_{two} \times 2^{-1}$

$(-1)^S \times (1 + \text{Significand}) \times 2^{(\text{Exponent}-127)}$

$(-1)^1 \times (1 + .100\ 0000 \ldots 0000) \times 2^{(126-127)}$

1 0111 1110 100 0000 0000 0000 0000 0000
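To check the bit pattern, this Python sketch decodes a normalized single-precision word with the formula $(-1)^S \times (1 + \text{Significand}) \times 2^{(\text{Exponent}-127)}$. The function name is hypothetical, and special cases (zero, denormalized numbers, infinities, NaN) are deliberately left out.

```python
def decode_single(bits):
    """Value of a normalized single-precision pattern (special cases not handled)."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = (bits & 0x7FFFFF) / 2**23      # Significand field as a fraction < 1
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)

# Sign 1, Exponent 0111 1110, Significand 100 0000 ... 0000
pattern = int("1" + "01111110" + "1" + "0" * 22, 2)
print(hex(pattern), decode_single(pattern))  # 0xbf400000 -0.75
```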

Hairy Example

• How to represent 1/3 in MIPS?
• 1/3

$= 0.33333\ldots_{10}$

$= 0.25 + 0.0625 + 0.015625 + 0.00390625 + 0.0009765625 + \ldots$

$= 1/4 + 1/16 + 1/64 + 1/256 + 1/1024 + \ldots$

$= 2^{-2} + 2^{-4} + 2^{-6} + 2^{-8} + 2^{-10} + \ldots$

$= 0.0101010101\ldots_{2} \times 2^{0}$

$= 1.0101010101\ldots_{2} \times 2^{-2}$

Hairy Example

• Sign: 0
• Exponent $= -2 + 127 = 125_{10} = 01111101_{2}$
• Significand = 0101010101…

0 0111 1101 0101 0101 0101 0101 0101 010
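The pattern above cuts the repeating significand off after 23 bits. The Python sketch below builds that truncated word from its three fields and compares it with the pattern that the standard `struct` module produces for 1/3, which uses round-to-nearest and therefore differs in the last bit. The variable names are chosen for this example.

```python
import struct

sign, exponent = 0, -2 + 127
# 23-bit significand 0101...010, truncated as on the slide
truncated = (sign << 31) | (exponent << 23) | int("01" * 11 + "0", 2)
# Pattern produced by converting 1/3 to single precision (round to nearest)
rounded = struct.unpack(">I", struct.pack(">f", 1 / 3))[0]
print(hex(truncated), hex(rounded))  # 0x3eaaaaaa 0x3eaaaaab
```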

Floating point instructions

Page 291, figure 4.47