lecture 9 sept 28

24
Lecture 9 Sept 28 Chapter 3 Arithmetic for Computers

Upload: tibor

Post on 06-Jan-2016

45 views

Category:

Documents


6 download

DESCRIPTION

Lecture 9 Sept 28. Chapter 3 Arithmetic for Computers. Arithmetic for Computers. §3.1 Introduction. Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture 9                                      Sept 28

Lecture 9 Sept 28

Chapter 3 Arithmetic for Computers

Page 2: Lecture 9                                      Sept 28

Arithmetic for Computers

Operations on integers Addition and subtraction Multiplication and division Dealing with overflow

Floating-point real numbers Representation and operations

§3.1

Intro

ductio

n

Page 3: Lecture 9                                      Sept 28

Integer Addition

Example: 7 + 6

§3.2

Add

ition

and

Subtra

ction

Overflow if result out of range Adding +ve and –ve operands, no overflow Adding two +ve operands

Overflow if result sign is 1 Adding two –ve operands

Overflow if result sign is 0

Page 4: Lecture 9                                      Sept 28

Integer Subtraction

Add negation of second operand Example: 7 – 6 = 7 + (–6)

+7: 0000 0000 … 0000 0111–6: 1111 1111 … 1111 1010+1: 0000 0000 … 0000 0001

Overflow if result out of range Subtracting two +ve or two –ve operands, no

overflow Subtracting +ve from –ve operand

Overflow if result sign is 0 Subtracting –ve from +ve operand

Overflow if result sign is 1

Page 5: Lecture 9                                      Sept 28

Dealing with Overflow

Some languages (e.g., C) ignore overflow Use MIPS addu, addui, subu instructions

Other languages (e.g., Ada, Fortran) require raising an exception Use MIPS add, addi, sub instructions On overflow, invoke exception handler

Save PC in exception program counter (EPC) register

Jump to predefined handler address mfc0 (move from coprocessor reg)

instruction can retrieve EPC value, to return after corrective action

Page 6: Lecture 9                                      Sept 28

Two’s-Complement Addition and Subtraction

Binary adder used as 2’s-complement adder/subtractor.

AddSub

x y

y

x

k /

k /

k /

y or y

Adder

c out

c in

k /

Page 7: Lecture 9                                      Sept 28

Simple Adders

Binary half-adder (HA) and full-adder (FA).

x y c s 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 0

Inputs Outputs

HA

x y

c

s

x y c c s 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1

Inputs Outputs

c out c in

out in x

y

s

FA

Digit-set interpretation:{0, 1} + {0, 1} = {0, 2} + {0, 1}

Digit-set interpretation:{0, 1} + {0, 1} + {0, 1} = {0, 2} + {0, 1}

Page 8: Lecture 9                                      Sept 28

Full-Adder Implementations

Full adder implemented with two half-adders, by means of two 4-input multiplexers, and as two-level gate network.

(a) FA built of two HAs

(c) Two-level AND-OR FA (b) CMOS mux-based FA

1

0

3

2

HA

HA

1

0

3

2

0

1

x y

x y

x y

s

s s

c out

c out

c out

c in

c in

c in

Page 9: Lecture 9                                      Sept 28

Ripple-Carry Adder: Slow But Simple

Ripple-carry binary adder with 32-bit inputs and output.

x

s

y

c c

x

s

y

c

x

s

y

c

c out c in

0 0

0

c 0

1 1

1

1 2

31

31

31

31

FA FA FA 32 . . .

Critical path

Page 10: Lecture 9                                      Sept 28

Carry Propagation Networks

The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum

bits.

Carry network

. . . . . .

x i y i

g p

s

i i

i

c i c i+1

c k 1

c k

c k 2 c 1

c 0

g p 1 1 g p 0 0

g p k 2 k 2 g p i+1 i+1 g p k 1 k 1

c 0 . . . . . .

0 0 0 1 1 0 1 1

annihilated or killed propagated generated (impossible)

Carry is: g i p i

gi = xi yi pi = xi yi

Page 11: Lecture 9                                      Sept 28

Carry-Lookahead Logic with 4-Bit Block

Blocks needed in the design of carry-lookahead adders with four-way grouping of bits.

Blo

ck s

ign

al g

en

era

tion

p [i, i+3]

c i

Inte

rme

idte

ca

rrie

s

c i+1 c i+2 c i+3 g [i, i+3]

p i+3 g i+3 p i+2 g i+2 p i+1 g i+1 p i g i

Page 12: Lecture 9                                      Sept 28

Another Speed-Up Method: Carry Select

Carry-select addition circuit

c out c in Adder

Version 1 of sum bits 1

0

x [a, b ]

c out c in Adder

Version 0 of sum bits

y [a, b]

s [a, b]

c a

0 1

Allows doubling of adder width with a single-mux additional delay

Page 13: Lecture 9                                      Sept 28

Arithmetic for Multimedia

Graphics and media processing operates on vectors of 8-bit and 16-bit data Use 64-bit adder, with partitioned carry chain

Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors

SIMD (single-instruction, multiple-data)

Saturating operations On overflow, result is largest representable

value c.f. 2’s-complement modulo arithmetic

E.g., clipping in audio, saturation in video

Page 14: Lecture 9                                      Sept 28

Shift-Add Multiplication

Multiplication of 4-bit numbers in dot notation

Multiplicand

Partial products bit-matrix

x y

z

y x 2 0 0

y x 2 1 1

y x 2 2 2

y x 2 3 3

Multiplier

Product

z(j+1) = (z(j) + yj x 2k) 2–1 with z(0) = 0 and z(k) = z |––– add –––|

|–– shift right ––|

Page 15: Lecture 9                                      Sept 28

Multiplication

Start with long-multiplication approach

1000× 1001 1000 0000 0000 1000 1001000

Length of product is the

sum of operand lengths

multiplicand

multiplier

product

§3.3

Mu

ltiplica

tion

Page 16: Lecture 9                                      Sept 28

Multiplication Hardware

Initially 0

Page 17: Lecture 9                                      Sept 28

Optimized Multiplier

Perform steps in parallel: add/shift

One cycle per partial-product additionThat’s ok, if frequency of multiplications is low

Page 18: Lecture 9                                      Sept 28

Faster Multiplier

Uses multiple adders Cost/performance tradeoff

Can be pipelined Several multiplications performed in parallel

Page 19: Lecture 9                                      Sept 28

MIPS Multiplication

Two 32-bit registers for product HI: most-significant 32 bits LO: least-significant 32-bits

Instructions mult rs, rt / multu rs, rt

64-bit product in HI/LO mfhi rd / mflo rd

Move from HI/LO to rd Can test HI value to see if product overflows 32 bits

mul rd, rs, rt Least-significant 32 bits of product –> rd

Page 20: Lecture 9                                      Sept 28

Division Check for 0 divisor Long division approach

If divisor ≤ dividend bits 1 bit in quotient, subtract

Otherwise 0 bit in quotient, bring down next

dividend bit Restoring division

Do the subtract, and if remainder goes < 0, add divisor back

Signed division Divide using absolute values Adjust sign of quotient and

remainder as required

10011000 1001010 -1000 10 101 1010 -1000 10

n-bit operands yield n-bitquotient and remainder

quotient

dividend

remainder

divisor

Page 21: Lecture 9                                      Sept 28

Division Hardware

Initially dividend

Initially divisor in left

half

Page 22: Lecture 9                                      Sept 28

Optimized Divider

One cycle per partial-remainder subtraction Looks a lot like a multiplier!

Same hardware can be used for both

Page 23: Lecture 9                                      Sept 28

MIPS Division

Use HI/LO registers for result HI: 32-bit remainder LO: 32-bit quotient

Instructions div rs, rt / divu rs, rt No overflow or divide-by-0 checking

Software must perform checks if required Use mfhi, mflo to access result

Page 24: Lecture 9                                      Sept 28

Binary and Decimal Multiplication

Step-by-step multiplication examples for 4-digit unsigned numbers.

Position 7 6 5 4 3 2 1 0 Position 7 6 5 4 3 2 1 0========================= =========================x24 1 0 1 0 x104 3 5 2 8y 0 0 1 1 y 4 0 6 7========================= =========================z

(0) 0 0 0 0 z (0) 0 0 0 0

+y0x24 1 0 1 0 +y0x104 2 4 6 9 6–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(1) 0 1 0 1 0 10z (1) 2 4 6 9 6

z (1) 0 1 0 1 0 z

(1) 0 2 4 6 9 6+y1x24 1 0 1 0 +y1x104 2 1 1 6 8–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(2) 0 1 1 1 1 0 10z (2) 2 3 6 3 7 6

z (2) 0 1 1 1 1 0 z

(2) 2 3 6 3 7 6+y2x24 0 0 0 0 +y2x104 0 0 0 0 0–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(3) 0 0 1 1 1 1 0 10z (3) 0 2 3 6 3 7 6

z (3) 0 0 1 1 1 1 0 z

(3) 0 2 3 6 3 7 6+y3x24 0 0 0 0 +y3x104 1 4 1 1 2–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(4) 0 0 0 1 1 1 1 0 10z (4) 1 4 3 4 8 3 7 6

z (4) 0 0 0 1 1 1 1 0 z

(4) 1 4 3 4 8 3 7 6========================= =========================