lecture 9 sept 28

Lecture 9 Sept 28

Chapter 3 Arithmetic for Computers

Arithmetic for Computers

Operations on integers Addition and subtraction Multiplication and division Dealing with overflow

Floating-point real numbers Representation and operations

§3.1

Intro

ductio

n

Integer Addition

Example: 7 + 6

§3.2

Add

ition

and

Subtra

ction

Overflow if result out of range Adding +ve and –ve operands, no overflow Adding two +ve operands

Overflow if result sign is 1 Adding two –ve operands

Overflow if result sign is 0

Integer Subtraction

Add negation of second operand Example: 7 – 6 = 7 + (–6)

+7: 0000 0000 … 0000 0111–6: 1111 1111 … 1111 1010+1: 0000 0000 … 0000 0001

Overflow if result out of range Subtracting two +ve or two –ve operands, no

overflow Subtracting +ve from –ve operand

Overflow if result sign is 0 Subtracting –ve from +ve operand

Overflow if result sign is 1

Dealing with Overflow

Some languages (e.g., C) ignore overflow Use MIPS addu, addui, subu instructions

Other languages (e.g., Ada, Fortran) require raising an exception Use MIPS add, addi, sub instructions On overflow, invoke exception handler

Save PC in exception program counter (EPC) register

Jump to predefined handler address mfc0 (move from coprocessor reg)

instruction can retrieve EPC value, to return after corrective action

Two’s-Complement Addition and Subtraction

Binary adder used as 2’s-complement adder/subtractor.

AddSub

x y

y

x

k /

k /

k /

y or y

Adder

c out

c in

k /

Simple Adders

Binary half-adder (HA) and full-adder (FA).

x y c s 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 0

Inputs Outputs

HA

x y

c

s

x y c c s 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1

Inputs Outputs

c out c in

out in x

y

s

FA

Digit-set interpretation:{0, 1} + {0, 1} = {0, 2} + {0, 1}

Digit-set interpretation:{0, 1} + {0, 1} + {0, 1} = {0, 2} + {0, 1}

Full-Adder Implementations

Full adder implemented with two half-adders, by means of two 4-input multiplexers, and as two-level gate network.

(a) FA built of two HAs

(c) Two-level AND-OR FA (b) CMOS mux-based FA

1

0

3

2

HA

HA

1

0

3

2

0

1

x y

x y

x y

s

s s

c out

c out

c out

c in

c in

c in

Ripple-Carry Adder: Slow But Simple

Ripple-carry binary adder with 32-bit inputs and output.

x

s

y

c c

x

s

y

c

x

s

y

c

c out c in

0 0

0

c 0

1 1

1

1 2

31

31

31

31

FA FA FA 32 . . .

Critical path

Carry Propagation Networks

The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum

bits.

Carry network

. . . . . .

x i y i

g p

s

i i

i

c i c i+1

c k 1

c k

c k 2 c 1

c 0

g p 1 1 g p 0 0

g p k 2 k 2 g p i+1 i+1 g p k 1 k 1

c 0 . . . . . .

0 0 0 1 1 0 1 1

annihilated or killed propagated generated (impossible)

Carry is: g i p i

gi = xi yi pi = xi yi

Carry-Lookahead Logic with 4-Bit Block

Blocks needed in the design of carry-lookahead adders with four-way grouping of bits.

Blo

ck s

ign

al g

en

era

tion

p [i, i+3]

c i

Inte

rme

idte

ca

rrie

s

c i+1 c i+2 c i+3 g [i, i+3]

p i+3 g i+3 p i+2 g i+2 p i+1 g i+1 p i g i

Another Speed-Up Method: Carry Select

Carry-select addition circuit

c out c in Adder

Version 1 of sum bits 1

0

x [a, b ]

c out c in Adder

Version 0 of sum bits

y [a, b]

s [a, b]

c a

0 1

Allows doubling of adder width with a single-mux additional delay

Arithmetic for Multimedia

Graphics and media processing operates on vectors of 8-bit and 16-bit data Use 64-bit adder, with partitioned carry chain

Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors

SIMD (single-instruction, multiple-data)

Saturating operations On overflow, result is largest representable

value c.f. 2’s-complement modulo arithmetic

E.g., clipping in audio, saturation in video

Shift-Add Multiplication

Multiplication of 4-bit numbers in dot notation

Multiplicand

Partial products bit-matrix

x y

z

y x 2 0 0

y x 2 1 1

y x 2 2 2

y x 2 3 3

Multiplier

Product

z(j+1) = (z(j) + yj x 2k) 2–1 with z(0) = 0 and z(k) = z |––– add –––|

|–– shift right ––|

Multiplication

Start with long-multiplication approach

1000× 1001 1000 0000 0000 1000 1001000

Length of product is the

sum of operand lengths

multiplicand

multiplier

product

§3.3

Mu

ltiplica

tion

Multiplication Hardware

Initially 0

Optimized Multiplier

Perform steps in parallel: add/shift

One cycle per partial-product additionThat’s ok, if frequency of multiplications is low

Faster Multiplier

Uses multiple adders Cost/performance tradeoff

Can be pipelined Several multiplications performed in parallel

MIPS Multiplication

Two 32-bit registers for product HI: most-significant 32 bits LO: least-significant 32-bits

Instructions mult rs, rt / multu rs, rt

64-bit product in HI/LO mfhi rd / mflo rd

Move from HI/LO to rd Can test HI value to see if product overflows 32 bits

mul rd, rs, rt Least-significant 32 bits of product –> rd

Division Check for 0 divisor Long division approach

If divisor ≤ dividend bits 1 bit in quotient, subtract

Otherwise 0 bit in quotient, bring down next

dividend bit Restoring division

Do the subtract, and if remainder goes < 0, add divisor back

Signed division Divide using absolute values Adjust sign of quotient and

remainder as required

10011000 1001010 -1000 10 101 1010 -1000 10

n-bit operands yield n-bitquotient and remainder

quotient

dividend

remainder

divisor

Division Hardware

Initially dividend

Initially divisor in left

half

Optimized Divider

One cycle per partial-remainder subtraction Looks a lot like a multiplier!

Same hardware can be used for both

MIPS Division

Use HI/LO registers for result HI: 32-bit remainder LO: 32-bit quotient

Instructions div rs, rt / divu rs, rt No overflow or divide-by-0 checking

Software must perform checks if required Use mfhi, mflo to access result

Binary and Decimal Multiplication

Step-by-step multiplication examples for 4-digit unsigned numbers.

Position 7 6 5 4 3 2 1 0 Position 7 6 5 4 3 2 1 0========================= =========================x24 1 0 1 0 x104 3 5 2 8y 0 0 1 1 y 4 0 6 7========================= =========================z

(0) 0 0 0 0 z (0) 0 0 0 0

+y0x24 1 0 1 0 +y0x104 2 4 6 9 6–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(1) 0 1 0 1 0 10z (1) 2 4 6 9 6

z (1) 0 1 0 1 0 z

(1) 0 2 4 6 9 6+y1x24 1 0 1 0 +y1x104 2 1 1 6 8–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(2) 0 1 1 1 1 0 10z (2) 2 3 6 3 7 6

z (2) 0 1 1 1 1 0 z

(2) 2 3 6 3 7 6+y2x24 0 0 0 0 +y2x104 0 0 0 0 0–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(3) 0 0 1 1 1 1 0 10z (3) 0 2 3 6 3 7 6

z (3) 0 0 1 1 1 1 0 z

(3) 0 2 3 6 3 7 6+y3x24 0 0 0 0 +y3x104 1 4 1 1 2–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z

(4) 0 0 0 1 1 1 1 0 10z (4) 1 4 3 4 8 3 7 6

z (4) 0 0 0 1 1 1 1 0 z

(4) 1 4 3 4 8 3 7 6========================= =========================

lecture 9 sept 28

Documents