A Pipelined CPU (source: cseweb.ucsd.edu/classes/su01/cse141/lect9.pdf)

CSE 141 - Carro

A Pipelined CPU

The beauty of parallel operations

Review -- Single Cycle CPU

Review -- Multiple Cycle CPU

Review -- Instruction Latencies

[Figure: instruction latencies over Cycle 1 to Cycle 5. Load passes through Ifetch, Reg/Dec, Exec, Mem, Wr; Add passes through Ifetch, Reg/Dec, Exec, Wr. Rows are shown for the Single-Cycle CPU and the Multiple Cycle CPU.]

Instruction Latencies and Throughput

[Figure: Load instructions (Ifetch, Reg/Dec, Exec, Mem, Wr) on the Single-Cycle CPU, the Multiple Cycle CPU, and the Pipelined CPU. One diagram spans Cycle 1 to Cycle 5 with two Loads; another spans Cycle 1 to Cycle 8 with four Loads.]

Pipelining Advantages

Higher maximum throughput

Higher utilization of CPU resources

Ideal speedup is the number of stages in the pipeline. Do we achieve this?

What makes it easy:

But, more complicated datapath, more complex control
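
To make the ideal-speedup claim concrete, here is a minimal sketch under simplifying assumptions that are not from the slides: every stage takes one clock cycle, an unpipelined machine spends one cycle per stage on each instruction, and the pipeline never stalls. The function names are illustrative only.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when a new instruction enters the pipeline every cycle (no stalls)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one adds one cycle.
    return n_stages + n_instructions - 1


def speedup(n_instructions: int, n_stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )


if __name__ == "__main__":
    for n in (1, 4, 100, 10_000):
        print(f"{n:>6} instructions: speedup = {speedup(n, 5):.3f}")
```

Under these assumptions the speedup approaches the stage count only for long instruction streams: four instructions on five stages give 20 / 8 = 2.5, because the pipeline has to fill and drain.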

Pipelining Advantages

CPU Design Technology   Control Logic         Peak Throughput (instructions per cycle)
Single-Cycle CPU        Combinational Logic   1, but slow clock
Multiple-Cycle CPU      FSM or Microprogram   1/3, but fast clock
Pipelined CPU           Combinational Logic   1, with fast clock

Pipelining in Modern CPUs

CPU Datapath

Arithmetic Units

System Buses

Software (at multiple levels)

etc...

A Pipelined Datapath

IF: Instruction fetch

ID: Instruction decode and register fetch

EX: Execution and effective address calculation

MEM: Memory access

WB: Write back

Basic Question -> Basic Idea

Each HW resource has a specific task. What do we need to add to actually split the datapath into stages?

Pipelined Datapath

[Figure: pipelined datapath.]

A word of advice: there is a hidden problem here! Can you find it?

Corrected datapath

[Figure: corrected pipelined datapath.]

Graphically Representing Pipelines

[Figure: pipeline diagram with time (in clock cycles, CC 1 to CC 6) on the horizontal axis and the program on the vertical axis. Each instruction, such as lw $10, 20($1), is drawn as a row of the resources it uses: IM, Reg, ALU, DM, Reg.]

The physical components are not there; it is just a representation. It can help with answering questions like:

how many cycles does it take to execute this code?
what is the ALU doing during cycle 4?

Use this representation to help understand datapaths.
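
The two questions above can be answered mechanically once the diagram is treated as data. The following is a minimal sketch, assuming one instruction enters the pipeline per cycle with no stalls. The second instruction in the example program is hypothetical (the slide's second instruction did not survive extraction), and all function names are illustrative.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]  # IF, ID, EX, MEM, WB as drawn in the diagram


def build_diagram(program):
    """Return {instruction index: {cycle: resource}} for a stall-free pipeline."""
    diagram = {}
    for i, _ in enumerate(program):
        # Instruction i enters the first stage in cycle i + 1 (cycles are 1-based).
        diagram[i] = {i + 1 + s: name for s, name in enumerate(STAGES)}
    return diagram


def total_cycles(program):
    """Cycles until the last instruction leaves the final stage."""
    return len(program) + len(STAGES) - 1 if program else 0


def alu_user(program, cycle):
    """Which instruction, if any, is using the ALU during the given cycle."""
    i = cycle - 1 - STAGES.index("ALU")
    return program[i] if 0 <= i < len(program) else None


def render(program):
    """Render the diagram as text, one row per instruction."""
    if not program:
        return ""
    n = total_cycles(program)
    width = max(len(text) for text in program)
    diagram = build_diagram(program)
    lines = [(" " * width + "".join(f" CC{c:<3}" for c in range(1, n + 1))).rstrip()]
    for i, text in enumerate(program):
        cells = "".join(f" {diagram[i].get(c, ''):<5}" for c in range(1, n + 1))
        lines.append((text.ljust(width) + cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # The second instruction is a made-up placeholder, not taken from the slide.
    program = ["lw $10, 20($1)", "sub $11, $2, $3"]
    print(render(program))
    print("cycles:", total_cycles(program))
    print("ALU in cycle 4:", alu_user(program, 4))
```

With two instructions the sketch reports six cycles, which agrees with the CC 1 to CC 6 axis of the figure.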

Execution in a Pipelined Datapath

[Figure: five lw instructions over CC1 to CC9. Each passes through IF (IM), ID (Reg), EX (ALU), MEM (DM), WB (Reg), starting one cycle after the previous instruction. The steady state is marked.]

Mixed Instructions in the Pipeline

[Figure: lw and add over CC1 to CC6. lw is drawn as IM, Reg, ALU, DM, Reg; add is drawn as IM, Reg, ALU, Reg, with no DM access.]

Pipeline Principles

All instructions that share a pipeline must have the same stages in the same order.

[Figure: add and sw drawn with the same five stages, IF (IM), ID (Reg), EX (ALU), MEM (DM), WB (Reg).]

All intermediate values must be latched each cycle.

There is no functional block reuse (in the same instruction).

Because of this, the hardware resembles that of the single-cycle CPU.

Pipelined Datapath

Instruction Fetch | Instruction Decode/Register Fetch | Execute/Address Calculation | Memory Access | Write Back

[Figure: pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB between the stages, called out on the slide as "registers!". Components shown: PC, instruction memory, adders (with the constant 4 and a shift left 2), register file, 16-to-32-bit sign extend, ALU with Zero output, data memory, and multiplexers.]

The Pipeline in Execution

add $10, $1, $2 | Instruction Decode/Register Fetch | Execute/Address Calculation | Memory Access | Write Back

[Figure: pipelined datapath with add $10, $1, $2 in Instruction Fetch.]

The Pipeline in Execution

lw $12, 1000($4) | add $10, $1, $2 | Execute/Address Calculation | Memory Access | Write Back

[Figure: pipelined datapath with lw $12, 1000($4) in Instruction Fetch and add $10, $1, $2 in Instruction Decode/Register Fetch.]

The Pipeline in Execution

sub $15, $4, $1 | lw $12, 1000($4) | add $10, $1, $2 | Memory Access | Write Back

[Figure: pipelined datapath with sub $15, $4, $1 in Instruction Fetch, lw $12, 1000($4) in Instruction Decode/Register Fetch, and add $10, $1, $2 in Execute/Address Calculation.]

The Pipeline in Execution

Instruction Fetch | sub $15, $4, $1 | lw $12, 1000($4) | add $10, $1, $2 | Write Back

[Figure: pipelined datapath with sub $15, $4, $1 in Instruction Decode/Register Fetch, lw $12, 1000($4) in Execute/Address Calculation, and add $10, $1, $2 in Memory Access.]

The Pipeline in Execution

Instruction Fetch | Instruction Decode/Register Fetch | sub $15, $4, $1 | lw $12, 1000($4) | add $10, $1, $2

[Figure: pipelined datapath with sub $15, $4, $1 in Execute/Address Calculation, lw $12, 1000($4) in Memory Access, and add $10, $1, $2 in Write Back.]

The Pipeline in Execution

Instruction Fetch | Instruction Decode/Register Fetch | Execute/Address Calculation | sub $15, $4, $1 | lw $12, 1000($4)

[Figure: pipelined datapath with sub $15, $4, $1 in Memory Access and lw $12, 1000($4) in Write Back.]
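
The sequence of snapshots above can be reproduced with a few lines of code. This is a minimal sketch, assuming the three instructions issue in consecutive cycles with no stalls. The short stage names and the function name are illustrative, and a value of None only means that no instruction from this three-instruction list occupies the stage, not that the hardware is idle.

```python
STAGE_NAMES = ["IF", "ID", "EX", "MEM", "WB"]

# The three instructions followed through the pipeline on the slides.
PROGRAM = ["add $10, $1, $2", "lw $12, 1000($4)", "sub $15, $4, $1"]


def snapshot(program, cycle):
    """Return {stage: instruction or None} for a 1-based clock cycle, assuming no stalls."""
    view = {}
    for depth, stage in enumerate(STAGE_NAMES):
        # The instruction sitting `depth` stages into the pipeline was fetched
        # `depth` cycles ago.
        i = cycle - 1 - depth
        view[stage] = program[i] if 0 <= i < len(program) else None
    return view


if __name__ == "__main__":
    for cycle in range(1, len(PROGRAM) + len(STAGE_NAMES)):
        cells = " | ".join(
            f"{stage}: {instr or '-'}" for stage, instr in snapshot(PROGRAM, cycle).items()
        )
        print(f"cycle {cycle}: {cells}")
```

Cycle 3 of this sketch matches the slide that shows sub, lw, and add in the first three stages at once.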

Pipeline Control

[Figure: pipelined datapath annotated with control signals. Labels recoverable from the slide include ALUOp, RegDst, ALUSrc, and MemRead.]

Pipelined Control

FSM not really appropriate (many details to remember)

Combinational Logic, at the right time!

[Figure: a control block driven by the instruction, drawn alongside the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

Pipelined Control Signals

             Execution Stage Control Lines    Memory Stage Control Lines   Write Back Stage Control Lines
Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc   Branch  MemRead  MemWrite    RegWrite  MemtoReg
R-Format     1       1       0       0        0       0        0           1         0
lw           0       0       0       1        0       1        0           1         1
sw           x       0       0       1        0       0        1           0         x
beq          x       0       1       0        1       0        0           0         x

[Figure: the Control block decodes the instruction held in IF/ID into three groups, EX, M, and WB. ID/EX carries WB, M, and EX; EX/MEM carries WB and M; MEM/WB carries WB.]
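
The table and figure above suggest one way to model pipelined control in software: decode once, then carry only the groups that later stages still need. The following is a minimal sketch of that reading. The dictionary layout and function names are illustrative assumptions, and "x" (don't care) entries are stored as None.

```python
# Control values transcribed from the table; None stands for "x" (don't care).
CONTROL = {
    "R-Format": dict(RegDst=1, ALUOp1=1, ALUOp0=0, ALUSrc=0,
                     Branch=0, MemRead=0, MemWrite=0, RegWrite=1, MemtoReg=0),
    "lw": dict(RegDst=0, ALUOp1=0, ALUOp0=0, ALUSrc=1,
               Branch=0, MemRead=1, MemWrite=0, RegWrite=1, MemtoReg=1),
    "sw": dict(RegDst=None, ALUOp1=0, ALUOp0=0, ALUSrc=1,
               Branch=0, MemRead=0, MemWrite=1, RegWrite=0, MemtoReg=None),
    "beq": dict(RegDst=None, ALUOp1=0, ALUOp0=1, ALUSrc=0,
                Branch=1, MemRead=0, MemWrite=0, RegWrite=0, MemtoReg=None),
}

GROUPS = {
    "EX": ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc"),
    "M": ("Branch", "MemRead", "MemWrite"),
    "WB": ("RegWrite", "MemtoReg"),
}


def decode(kind):
    """Split one table row into the EX, M, and WB groups produced during decode."""
    row = CONTROL[kind]
    return {group: {name: row[name] for name in names} for group, names in GROUPS.items()}


def pipeline_register_contents(kind):
    """Control groups held in each pipeline register as the instruction advances."""
    groups = decode(kind)
    return {
        "ID/EX": {g: groups[g] for g in ("WB", "M", "EX")},
        "EX/MEM": {g: groups[g] for g in ("WB", "M")},  # EX group already consumed
        "MEM/WB": {g: groups[g] for g in ("WB",)},      # M group already consumed
    }


if __name__ == "__main__":
    for register, held in pipeline_register_contents("lw").items():
        print(register, held)
```

Each group is dropped once its stage has used it, which is why the control can stay purely combinational: no state machine has to remember where an instruction is.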

The Pipeline with Control Logic

[Figure: pipelined datapath with control logic. Labels recoverable from the slide include PC, MemtoReg, RegWrite, and MemWrite.]

Is it really that easy?

What happens when...

add $3, $10, $11
lw $8, 1000($3)
sub $11, $8, $7

This is the typical problem of starting something before the previous task has finished.
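
The problem in the three instructions above is that each one reads a register written by the instruction just before it. A minimal sketch that finds such read-after-write dependences follows. The parser handles only the instruction forms that appear on the slides, and the `window` parameter (how many earlier instructions to inspect) is an assumption chosen for illustration.

```python
import re


def parse(instr):
    """Return (destination, sources) for the instruction forms used on the slides."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"\$\d+", rest)
    if op == "sw":
        # sw rt, offset(rs) reads both registers and writes none.
        return None, regs
    # R-format (op rd, rs, rt) and lw (lw rt, offset(rs)) write the first register.
    return regs[0], regs[1:]


def raw_dependences(program, window=3):
    """List (producer index, consumer index, register) for nearby read-after-write pairs."""
    found = []
    for j, consumer in enumerate(program):
        _, sources = parse(consumer)
        for reg in dict.fromkeys(sources):  # each source register once, in order
            # Walk backwards so the nearest earlier writer is the one reported.
            for i in range(j - 1, max(-1, j - window - 1), -1):
                dest, _ = parse(program[i])
                if dest == reg:
                    found.append((i, j, reg))
                    break
    return found


if __name__ == "__main__":
    program = ["add $3, $10, $11", "lw $8, 1000($3)", "sub $11, $8, $7"]
    for producer, consumer, reg in raw_dependences(program):
        print(f"{program[consumer]!r} needs {reg} from {program[producer]!r}")
```

For this program the sketch reports that lw needs $3 from add and that sub needs $8 from lw. Whether each dependence actually causes a wrong result depends on timing, which this check does not model.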

The Pipeline in Execution

lw $8, 1000($3) | add $3, $10, $11 | Execute/Address Calculation | Memory Access | Write Back

[Figure: pipelined datapath with lw $8, 1000($3) in Instruction Fetch and add $3, $10, $11 in Instruction Decode/Register Fetch.]

The Pipeline in Execution

sub $11, $8, $7 | lw $8, 1000($3) | add $3, $10, $11 | Memory Access | Write Back

[Figure: pipelined datapath with sub $11, $8, $7 in Instruction Fetch, lw $8, 1000($3) in Instruction Decode/Register Fetch, and add $3, $10, $11 in Execute/Address Calculation.]

The Pipeline in Execution

add $10, $1, $2 | sub $11, $8, $7 | lw $8, 1000($3) | add $3, $10, $11 | Write Back

[Figure: pipelined datapath with add $10, $1, $2 in Instruction Fetch, sub $11, $8, $7 in Instruction Decode/Register Fetch, lw $8, 1000($3) in Execute/Address Calculation, and add $3, $10, $11 in Memory Access.]

Data Hazards

When a result is needed in the pipeline before it is available, a data hazard occurs.

sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

[Figure: pipeline diagram of the five instructions above over CC1 to CC8, each drawn as IM, Reg, ALU, DM, Reg, marking the cycle in which R2 is available and the cycles in which R2 is needed.]
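
The "available" and "needed" cycles can be computed for the five instructions above. This is a minimal sketch under assumptions that the slide does not state: there is no forwarding, a register is read in the ID stage and written in the WB stage, one instruction issues per cycle, and (when `same_cycle_ok` is true) the register file writes early enough in a cycle that a read in the same cycle sees the new value.

```python
ID_STAGE, WB_STAGE = 2, 5  # 1-based positions within IF, ID, EX, MEM, WB

# (text, destination register or None, source registers), transcribed from the slide.
PROGRAM = [
    ("sub $2, $1, $3", "$2", ("$1", "$3")),
    ("and $12, $2, $5", "$12", ("$2", "$5")),
    ("or $13, $6, $2", "$13", ("$6", "$2")),
    ("add $14, $2, $2", "$14", ("$2", "$2")),
    ("sw $15, 100($2)", None, ("$15", "$2")),
]


def stage_cycle(index, stage):
    """Cycle (1-based) in which instruction `index` occupies `stage`, with no stalls."""
    return index + stage


def hazards(program, same_cycle_ok=True):
    """Return (consumer, register, cycle needed, cycle available) for each too-early read."""
    found = []
    last_writer = {}  # register -> index of the most recent instruction writing it
    for j, (text, dest, sources) in enumerate(program):
        needed = stage_cycle(j, ID_STAGE)
        for reg in dict.fromkeys(sources):  # each source register once
            if reg in last_writer:
                available = stage_cycle(last_writer[reg], WB_STAGE)
                too_early = needed < available if same_cycle_ok else needed <= available
                if too_early:
                    found.append((text, reg, needed, available))
        if dest is not None:
            last_writer[dest] = j
    return found


if __name__ == "__main__":
    for text, reg, needed, available in hazards(PROGRAM):
        print(f"{text}: needs {reg} in CC{needed}, available in CC{available}")
```

Under these assumptions sub makes $2 available in CC5, so and (CC3) and or (CC4) read it too early, while add (CC5) and sw (CC6) do not. Passing `same_cycle_ok=False` adds the add instruction to the list.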