
EECC550 - Shaaban, Lec # 5, Winter 2000, 12-20-2000

CPU Design Steps

1. Analyze instruction set operations using independent RTN => datapath requirements.

2. Select set of datapath components & establish clock methodology.

3. Assemble datapath meeting the requirements.

4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.

5. Assemble the control logic.

CPU Design & Implementation Process

• Bottom-up Design:
– Assemble components in target technology to establish critical timing.

• Top-down Design:
– Specify component behavior from high-level requirements.

• Iterative refinement:
– Establish a partial solution, expand and improve.

[Figure: design hierarchy. Instruction Set Architecture => processor (datapath, control); components: Reg. File, Mux, ALU, Reg, Mem, Decoder, Sequencer; lowest level: Cells, Gates.]

Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle

[Figure: single cycle MIPS datapath. Units: PC, two adders, PC Ext, instruction memory, 32 x 32-bit register file (ports Rw, Ra, Rb; buses busA, busB, busW), extender (imm16 to 32 bits), ALU, data memory (Data In, Adr, WrEn), muxes. Instruction<31:0> fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Status signal: Equal.]

Drawback of Single Cycle Processor

• Long cycle time.

• All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other instructions.

• Real memory is not as well-behaved as idealized memory:
– Cannot always complete data access in one (short) cycle.

Abstract View of Single Cycle CPU

[Figure: abstract single cycle CPU. Stages: PC / Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt (Result Store). Main Control (inputs op, fun) and ALU control drive ALUctr, RegDst, ALUSrc, ExtOp, MemWr, MemRd, nPC_sel, RegWr. Status signal: Equal.]

Single Cycle Instruction Timing

Arithmetic & Logical: PC | Inst Memory | Reg File | mux | ALU | mux | setup
Load: PC | Inst Memory | Reg File | mux | ALU | Data Mem | mux | setup (Critical Path)
Store: PC | Inst Memory | Reg File | mux | ALU | Data Mem
Branch: PC | Inst Memory | Reg File | cmp | mux
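To make the timing picture concrete, here is a minimal Python sketch that sums the unit delays along each instruction's path and takes the largest sum as the single cycle clock period. The delay values (in ns) and the names `DELAY_NS`, `PATHS`, and `path_delay` are illustrative assumptions; the lecture gives no numeric delays.

```python
# Hypothetical unit delays in ns (assumed for illustration only).
DELAY_NS = {
    "PC": 1, "Inst Memory": 10, "Reg File": 5, "mux": 1,
    "ALU": 8, "Data Mem": 10, "cmp": 3, "setup": 1,
}

# Units traversed by each instruction class, following the timing figure.
PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}

def path_delay(units):
    """Total delay of one instruction's path through the datapath."""
    return sum(DELAY_NS[u] for u in units)

delays = {name: path_delay(units) for name, units in PATHS.items()}

# In a single cycle design every instruction uses the same clock period,
# so the period is set by the slowest instruction.
critical = max(delays, key=delays.get)
cycle_time = delays[critical]

for name, d in delays.items():
    print(f"{name:22s} {d:3d} ns")
print("critical path:", critical, "-> cycle time", cycle_time, "ns")
```

With these assumed delays the load path is the longest, so every other instruction is clocked at the load's period even though it would finish sooner.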

Reducing Cycle Time: Multi-Cycle Design

• Cut combinational dependency graph by inserting registers / latches.
• The same work is done in two or more fast cycles, rather than one slow cycle.

[Figure: storage element, Acyclic Combinational Logic, storage element => storage element, Acyclic Combinational Logic (A), storage element, Acyclic Combinational Logic (B), storage element.]

Clock Cycle Time & Critical Path

• Critical path: the slowest path between any two storage devices.

• Cycle time is a function of the critical path.

• Cycle time must be greater than:
– Clock-to-Q + Longest Path through the Combinational Logic + Setup
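The bound above can be sketched in a few lines of Python. The function name `min_cycle_time` and all delay values are assumptions chosen for illustration; the second call shows the effect of cutting one long combinational block into two blocks separated by a register.

```python
def min_cycle_time(clk_to_q, logic_paths, setup):
    """Lower bound on the clock period: clock-to-Q + longest logic path + setup."""
    return clk_to_q + max(logic_paths) + setup

# Hypothetical delays in ns (not taken from the lecture).
CLK_TO_Q, SETUP = 1, 1

# One long combinational block between two storage elements.
single_stage = min_cycle_time(CLK_TO_Q, [30], SETUP)

# The same logic cut into blocks (A) and (B) by an inserted register.
split_stages = min_cycle_time(CLK_TO_Q, [16, 14], SETUP)

print("single stage period:", single_stage, "ns")
print("split stage period: ", split_stages, "ns")
print("latency of two split cycles:", 2 * split_stages, "ns")
```

Under these assumptions the clock period drops from 32 ns to 18 ns, while the total latency of the two shorter cycles (36 ns) is slightly higher because the register overhead is paid twice. The benefit comes from instructions that do not need every cycle.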

Instruction Processing Cycles

Instruction Fetch: Obtain instruction from program storage.
Instruction Decode: Determine instruction type; obtain operands from registers.
Execute: Compute result value or status.
Result Store: Store result in register/memory if needed (usually called Write Back).
Next Instruction: Update program counter to address of next instruction.

Instruction Fetch and Instruction Decode are common steps for all instructions.

Partitioning The Single Cycle Datapath

Add registers between smallest steps.

[Figure: single cycle datapath partitioned into PC / Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access (Data Mem), and Result Store (Reg. File), with control signals ALUctr, RegDst, ALUSrc, ExtOp, MemWr, MemRd, nPC_sel, RegWr.]

Example Multi-cycle Datapath

[Figure: multi-cycle datapath with stages PC / Next PC, Instruction Fetch, Operand Fetch, Ext, ALU, Mem Access (Data Mem), and Result Store (Reg. File); registers IR, A, B, R, M between stages; control signals ALUctr, RegDst, ALUSrc, ExtOp, nPC_sel, RegWr, MemWr, MemRd, MemToReg; status signal Equal.]

Registers added:

IR: Instruction register
A, B: Two registers to hold operands read from register file.
R: or ALUOut, holds the output of the ALU
M: or Memory data register (MDR) to hold data read from data memory

Operations In Each Cycle

R-Type
  Instruction Fetch: IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution: R ← A + B
  Write Back: R[rd] ← R; PC ← PC + 4

Logic Immediate
  Instruction Fetch: IR ← Mem[PC]
  Instruction Decode: A ← R[rs]
  Execution: R ← A OR ZeroExt(imm16)
  Write Back: R[rt] ← R; PC ← PC + 4

Load
  Instruction Fetch: IR ← Mem[PC]
  Instruction Decode: A ← R[rs]
  Execution: R ← A + SignExt(imm16)
  Memory: M ← Mem[R]
  Write Back: R[rt] ← M; PC ← PC + 4

Store
  Instruction Fetch: IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution: R ← A + SignExt(imm16)
  Memory: Mem[R] ← B; PC ← PC + 4

Branch
  Instruction Fetch: IR ← Mem[PC]
  Instruction Decode: A ← R[rs]; B ← R[rt]
  Execution: If Equal = 1, PC ← PC + 4 + (SignExt(imm16) x 4); else PC ← PC + 4
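As an illustration of how these register transfers chain together, the following Python sketch steps through the five cycles of a load. It is a simplified model under stated assumptions: the register file is a list, data memory is a dict keyed by address, the instruction fields are passed in directly instead of being fetched, and the names `sign_ext16` and `run_load` are hypothetical. The load writes R[rt], as in the FSM specification later in the lecture.

```python
def sign_ext16(imm16):
    """Sign-extend a 16-bit field to a Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def run_load(pc, regs, mem, rs, rt, imm16):
    """Apply the register transfers of a load, one cycle per step; return the new PC."""
    # Instruction Fetch: IR <- Mem[PC] (fields rs, rt, imm16 are given directly here)
    # Instruction Decode: A <- R[rs]
    A = regs[rs]
    # Execution: R <- A + SignExt(imm16)
    R = A + sign_ext16(imm16)
    # Memory: M <- Mem[R]
    M = mem[R]
    # Write Back: R[rt] <- M; PC <- PC + 4
    regs[rt] = M
    return pc + 4

# Usage: load the word at address R[2] - 4 into register 3.
regs = [0] * 32
regs[2] = 100
mem = {96: 42}
new_pc = run_load(0, regs, mem, rs=2, rt=3, imm16=0xFFFC)
print(regs[3], new_pc)
```

Each local variable (A, R, M) plays the role of one of the registers added between stages, which is why the value computed in one cycle is still available in the next.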

Finite State Machine (FSM) Control Model

• State specifies control points for Register Transfer.

• Transfer occurs upon exiting state (same falling edge).

[Figure: State X is labeled with its Register Transfer and Control Points; the transition out of it depends on input. Controller structure: inputs (conditions), Next State Logic, Control State, Output Logic, outputs (control points).]

Control Specification For Multi-cycle CPU: Finite State Machine (FSM)

“instruction fetch”: IR ← MEM[PC]
“decode / operand fetch”: A ← R[rs]; B ← R[rt]

R-type
  Execute: R ← A fun B
  Write-back: R[rd] ← R; PC ← PC + 4
ORi
  Execute: R ← A or ZX
  Write-back: R[rt] ← R; PC ← PC + 4
LW
  Execute: R ← A + SX
  Memory: M ← MEM[R]
  Write-back: R[rt] ← M; PC ← PC + 4
SW
  Execute: R ← A + SX
  Memory: MEM[R] ← B; PC ← PC + 4
BEQ & ~Equal
  Execute: PC ← PC + 4
BEQ & Equal
  Execute: PC ← PC + SX || 00

The last state of each instruction returns to instruction fetch.

Traditional FSM Controller

[Figure: a 4-bit State register, the 6-bit op field, and Equal form 11 inputs to a truth (transition) table, which produces nextState and the control points sent to the datapath. Table columns: state, op, cond => next state, control points.]

Traditional FSM Controller

datapath + state diagram => control

• Translate RTN statements into control points.

• Assign states.

• Implement the controller.

Mapping RTNs To Control Points Examples & State Assignments

“instruction fetch”
  0000: IR ← MEM[PC] (control points: imem_rd, IRen)
“decode / operand fetch”
  0001: A ← R[rs]; B ← R[rt] (control points: Aen, Ben)
R-type
  Execute 0100: R ← A fun B (control points: ALUfun, Sen; Sen enables the ALU output register)
  Write-back 0101: R[rd] ← R; PC ← PC + 4 (control points: RegDst, RegWr, PCen)
ORi
  Execute 0110: R ← A or ZX
  Write-back 0111: R[rt] ← R; PC ← PC + 4
LW
  Execute 1000: R ← A + SX
  Memory 1001: M ← MEM[R]
  Write-back 1010: R[rt] ← M; PC ← PC + 4
SW
  Execute 1011: R ← A + SX
  Memory 1100: MEM[R] ← B; PC ← PC + 4
BEQ & ~Equal
  Execute 0011: PC ← PC + 4
BEQ & Equal
  Execute 0010: PC ← PC + SX || 00

States 0010, 0011, 0101, 0111, 1010, and 1100 return to instruction fetch (state 0000).

Detailed Control Specification

Control point columns: IR; PC (en, sel); Ops (A, B); Exec (Ex, Sr, ALU, S); Mem (R, W, M); Write-Back (M-R, Wr, Dst).

Group  State  Op field  Eq  Next  Control point values
       0000   ??????    ?   0001  IR=1
       0001   BEQ       0   0011  A=1, B=1
       0001   BEQ       1   0010  A=1, B=1
       0001   R-type    x   0100  A=1, B=1
       0001   orI       x   0110  A=1, B=1
       0001   LW        x   1000  A=1, B=1
       0001   SW        x   1011  A=1, B=1
BEQ    0010   xxxxxx    x   0000  PC en=1, PC sel=1
BEQ    0011   xxxxxx    x   0000  PC en=1, PC sel=0
R      0100   xxxxxx    x   0101  Ex=0, Sr=1, ALU=fun, S=1
R      0101   xxxxxx    x   0000  PC en=1, PC sel=0, M-R=0, Wr=1, Dst=1
ORI    0110   xxxxxx    x   0111  Ex=0, Sr=0, ALU=or, S=1
ORI    0111   xxxxxx    x   0000  PC en=1, PC sel=0, M-R=0, Wr=1, Dst=0
LW     1000   xxxxxx    x   1001  Ex=1, Sr=0, ALU=add, S=1
LW     1001   xxxxxx    x   1010  R=1, W=0, M=0
LW     1010   xxxxxx    x   0000  PC en=1, PC sel=0, M-R=1, Wr=1, Dst=0
SW     1011   xxxxxx    x   1100  Ex=1, Sr=0, ALU=add, S=1
SW     1100   xxxxxx    x   0000  PC en=1, PC sel=0, R=0, W=1
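The State, Op field, Eq, and Next columns amount to a next-state function. A minimal Python sketch of that function follows, using the 4-bit state codes from the state assignment slide; the dict layout and the names `DISPATCH`, `SEQUENTIAL`, `next_state`, and `cycles` are illustrative assumptions, and the control point outputs are left out.

```python
FETCH, DECODE = "0000", "0001"

# First execute state for each opcode, chosen in the decode state.
DISPATCH = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States whose successor does not depend on the opcode or on Equal.
SEQUENTIAL = {
    "0100": "0101", "0101": FETCH,                  # R-type: execute, write-back
    "0110": "0111", "0111": FETCH,                  # ORi: execute, write-back
    "1000": "1001", "1001": "1010", "1010": FETCH,  # LW: execute, memory, write-back
    "1011": "1100", "1100": FETCH,                  # SW: execute, memory
    "0010": FETCH, "0011": FETCH,                   # BEQ taken / not taken
}

def next_state(state, op=None, equal=False):
    """Return the next state code given the current state, opcode, and Equal."""
    if state == FETCH:
        return DECODE
    if state == DECODE:
        if op == "BEQ":
            return "0010" if equal else "0011"
        return DISPATCH[op]
    return SEQUENTIAL[state]

def cycles(op, equal=False):
    """Count the states visited from fetch until control returns to fetch."""
    state, n = FETCH, 0
    while True:
        state = next_state(state, op, equal)
        n += 1
        if state == FETCH:
            return n

for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
    print(op, cycles(op))
```

Counting states per instruction this way reproduces the per-type cycle counts used in the performance evaluation at the end of the lecture (4 for arithmetic/logic, 5 for load, 4 for store, 3 for branch).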

Alternative Multiple Cycle Datapath (In Textbook)

• Minimizes Hardware: 1 memory, 1 adder

[Figure: textbook multi-cycle datapath. Units: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU, ALU Control, ALU Out, Target, muxes. Control signals: PCWr, PCWrCond, PCSrc, BrWr, IorD, MemWr, IRWr, RegDst, RegWr, MemtoReg, ExtOp, ALUSelA, ALUSelB, ALUOp. Status signal: Zero.]

Alternative Multiple Cycle Datapath (In Textbook)

• Shared instruction/data memory unit
• A single ALU shared among instructions
• Shared units require additional or widened multiplexors
• Temporary registers to hold data between clock cycles of the instruction:
– Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut

Operations In Each Cycle

R-Type
  Instruction Fetch: IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution: ALUout ← A + B
  Write Back: R[rd] ← ALUout

Logic Immediate
  Instruction Fetch: IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution: ALUout ← A OR ZeroExt(imm16)
  Write Back: R[rt] ← ALUout

Load
  Instruction Fetch: IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution: ALUout ← A + SignExt(imm16)
  Memory: M ← Mem[ALUout]
  Write Back: R[rt] ← M

Store
  Instruction Fetch: IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution: ALUout ← A + SignExt(imm16)
  Memory: Mem[ALUout] ← B

Branch
  Instruction Fetch: IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution: If Equal = 1, PC ← ALUout
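The decode-cycle transfer ALUout ← PC + (SignExt(imm16) x 4) can be checked with a short Python sketch. Because PC was already incremented during instruction fetch, the target is relative to PC + 4. The helper names `sign_ext16` and `branch_target`, the 32-bit wraparound mask, and the example addresses are assumptions for illustration.

```python
def sign_ext16(imm16):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def branch_target(pc_after_fetch, imm16):
    """ALUout <- PC + (SignExt(imm16) x 4), wrapped to 32 bits."""
    return (pc_after_fetch + sign_ext16(imm16) * 4) & 0xFFFFFFFF

# A branch located at 0x00400000: PC holds 0x00400004 after fetch.
pc_after_fetch = 0x00400000 + 4

forward = branch_target(pc_after_fetch, 3)        # skip three instructions
backward = branch_target(pc_after_fetch, 0xFFFF)  # imm16 = -1, branch to itself

print(hex(forward), hex(backward))
```

Computing the target speculatively in the decode cycle is what lets the branch finish in a single execute state: if Equal is 1, ALUout is simply copied into PC.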

High-Level View of Finite State Machine Control

• First steps are independent of the instruction class.
• Then a series of sequences that depend on the instruction opcode.
• Then the control returns to fetch a new instruction.
• Each box in the diagram represents one or several states.

Instruction Fetch and Decode FSM States

Load/Store Instructions FSM States

R-Type Instructions FSM States

Jump Instruction Single State

Branch Instruction Single State

Finite State Machine (FSM) Specification

“instruction fetch”
  0000: IR ← MEM[PC]; PC ← PC + 4
“decode”
  0001: A ← R[rs]; B ← R[rt]; ALUout ← PC + SX
R-type
  Execute 0100: ALUout ← A fun B
  Write-back 0101: R[rd] ← ALUout
ORi
  Execute 0110: ALUout ← A op ZX
  Write-back 0111: R[rt] ← ALUout
LW
  Execute 1000: ALUout ← A + SX
  Memory 1001: M ← MEM[ALUout]
  Write-back 1010: R[rt] ← M
SW
  Execute 1011: ALUout ← A + SX
  Memory 1100: MEM[ALUout] ← B
BEQ
  Execute 0010: If A = B then PC ← ALUout

The last state of each instruction returns to instruction fetch.

MIPS Multi-cycle Datapath Performance Evaluation

• What is the average CPI?
– State diagram gives CPI for each instruction type
– Workload below gives frequency of each type

Type          CPIi for type   Frequency   CPIi x freqIi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6

Average CPI: 4.1

Better than CPI = 5 if all instructions took the same number of clock cycles (5).
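The average CPI is the frequency-weighted sum of the per-type CPIs. A minimal Python sketch of that calculation, using the workload figures from the table (the variable names are illustrative):

```python
# (instruction type, CPI for that type, fraction of the workload)
workload = [
    ("Arith/Logic", 4, 0.40),
    ("Load", 5, 0.30),
    ("Store", 4, 0.10),
    ("Branch", 3, 0.20),
]

# Average CPI = sum over types of CPI_i x frequency_i.
average_cpi = sum(cpi * freq for _, cpi, freq in workload)

# Ratio against a design in which every instruction takes 5 cycles.
ratio_vs_fixed_5 = 5 / average_cpi

print(f"Average CPI: {average_cpi:.1f}")
print(f"Cycle count ratio vs. fixed CPI of 5: {ratio_vs_fixed_5:.2f}")
```

The first line printed is `Average CPI: 4.1`, matching the table. The ratio compares cycle counts only; it assumes both designs run at the same clock period.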