University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell

CS352H: Computer Systems Architecture

Topic 9: MIPS Pipeline - Hazards

October 1, 2009

Data Hazards in ALU Instructions

Consider this sequence:
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

We can resolve hazards with forwarding
    How do we detect when to forward?
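As a rough illustration of why this sequence is interesting, the sketch below lists every later instruction that reads the $2 written by sub. The tuple encoding of the instructions and the function name raw_dependencies are assumptions made for this example, not something the slides define.

```python
# Each instruction is (text, destination register or None, source registers).
# sw writes no register; it reads both $15 (the data) and $2 (the base).
program = [
    ("sub $2, $1, $3",  2,    (1, 3)),
    ("and $12, $2, $5", 12,   (2, 5)),
    ("or $13, $6, $2",  13,   (6, 2)),
    ("add $14, $2, $2", 14,   (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]

def raw_dependencies(program):
    """Return (producer index, consumer index, register) triples."""
    deps = []
    for i, (_, dest, _) in enumerate(program):
        if dest is None or dest == 0:   # no register write, or write to $zero
            continue
        for j in range(i + 1, len(program)):
            _, later_dest, sources = program[j]
            if dest in sources:
                deps.append((i, j, dest))
            if later_dest == dest:      # a later write hides the earlier one
                break
    return deps

deps = raw_dependencies(program)
for producer, consumer, reg in deps:
    distance = consumer - producer
    print(f"{program[consumer][0]!r} reads ${reg} written {distance} instruction(s) earlier")
```

All four later instructions depend on sub. In the usual five-stage design only the consumers at distance 1 and 2 (and, or) need forwarding, assuming the register file is written in the first half of a cycle and read in the second half.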

Dependencies & Forwarding

Detecting the Need to Forward

Pass register numbers along pipeline
    e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register

ALU operand register numbers in EX stage are given by
    ID/EX.RegisterRs, ID/EX.RegisterRt

Data hazards when
    1a. EX/MEM.RegisterRd = ID/EX.RegisterRs    (fwd from EX/MEM pipeline reg)
    1b. EX/MEM.RegisterRd = ID/EX.RegisterRt    (fwd from EX/MEM pipeline reg)
    2a. MEM/WB.RegisterRd = ID/EX.RegisterRs    (fwd from MEM/WB pipeline reg)
    2b. MEM/WB.RegisterRd = ID/EX.RegisterRt    (fwd from MEM/WB pipeline reg)

Detecting the Need to Forward

But only if forwarding instruction will write to a register!
    EX/MEM.RegWrite, MEM/WB.RegWrite

And only if Rd for that instruction is not $zero
    EX/MEM.RegisterRd ≠ 0, MEM/WB.RegisterRd ≠ 0

Forwarding Paths

Forwarding Conditions

EX hazard
    if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10
    if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10

MEM hazard
    if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
    if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01

Double Data Hazard

Consider the sequence:
    add $1, $1, $2
    add $1, $1, $3
    add $1, $1, $4

Both hazards occur
    Want to use the most recent

Revise MEM hazard condition
    Only forward if EX hazard condition isn’t true

Revised Forwarding Condition

MEM hazard
    if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01

    if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01
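The EX and revised MEM conditions can be folded into one small function. This is only a sketch: the dict-based encoding of the pipeline registers and the name forwarding_unit are assumptions, and the value 00 for "no forwarding, use the register file" does not appear on the slides and is assumed here.

```python
def forwarding_unit(ex_mem, mem_wb, id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) as 2-bit strings: '00', '10' or '01'.

    ex_mem and mem_wb are dicts with 'RegWrite' (bool) and 'RegisterRd' (int).
    """
    def select(operand_reg):
        ex_hazard = (ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0
                     and ex_mem["RegisterRd"] == operand_reg)
        mem_hazard = (mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0
                      and mem_wb["RegisterRd"] == operand_reg)
        if ex_hazard:
            return "10"      # take the ALU result held in EX/MEM
        if mem_hazard:       # reached only when there is no EX hazard
            return "01"      # take the value held in MEM/WB
        return "00"          # assumed encoding: no forwarding
    return select(id_ex_rs), select(id_ex_rt)

# Third add of the double-hazard sequence (add $1, $1, $4) in EX:
# both older adds write $1, so the EX/MEM copy must win for Rs.
older = {"RegWrite": True, "RegisterRd": 1}
print(forwarding_unit(ex_mem=older, mem_wb=older, id_ex_rs=1, id_ex_rt=4))
```

Checking the EX hazard first is equivalent to the "and not (EX hazard)" term in the revised MEM condition.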

Datapath with Forwarding

Load-Use Data Hazard

Need to stall for one cycle

Load-Use Hazard Detection

Check when using instruction is decoded in ID stage

ALU operand register numbers in ID stage are given by
    IF/ID.RegisterRs, IF/ID.RegisterRt

Load-use hazard when
    ID/EX.MemRead and
      ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
       (ID/EX.RegisterRt = IF/ID.RegisterRt))

If detected, stall and insert bubble
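The detection condition translates almost directly into code. In this sketch the function name and argument names are assumptions, and the lw/and pair used to exercise it is a made-up example rather than one taken from the slides.

```python
def load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID needs the result of a load now in EX."""
    return bool(id_ex_mem_read and
                (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))

# Hypothetical pair: lw $2, 20($1) is in EX, and $4, $2, $5 is in ID.
print(load_use_hazard(id_ex_mem_read=True, id_ex_rt=2, if_id_rs=2, if_id_rt=5))

# Same registers, but the older instruction is not a load: forwarding suffices.
print(load_use_hazard(id_ex_mem_read=False, id_ex_rt=2, if_id_rs=2, if_id_rt=5))
```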

How to Stall the Pipeline

Force control values in ID/EX register to 0
    EX, MEM and WB do nop (no-operation)

Prevent update of PC and IF/ID register
    Using instruction is decoded again
    Following instruction is fetched again
    1-cycle stall allows MEM to read data for lw
        Can subsequently forward to EX stage
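A minimal model of what one clock edge does to the front of the pipeline when a stall is requested might look like the following. The function name, the three control bits in NOP_CONTROL, and the string placeholders for instructions are all assumptions chosen for illustration.

```python
# Control bits carried in ID/EX; all zero means the stage does nothing (a bubble).
NOP_CONTROL = {"RegWrite": False, "MemRead": False, "MemWrite": False}

def advance_front_end(pc, if_id, decoded_control, fetched, stall):
    """One clock edge for PC, IF/ID, and the control part of ID/EX.

    Returns (next_pc, next_if_id, next_id_ex_control).
    """
    if stall:
        # PC and IF/ID keep their values, so the using instruction is decoded
        # again and the following instruction is fetched again; a bubble
        # (all-zero control) moves into EX.
        return pc, if_id, dict(NOP_CONTROL)
    return pc + 4, fetched, decoded_control

and_control = {"RegWrite": True, "MemRead": False, "MemWrite": False}
state = advance_front_end(pc=48, if_id="and $4, $2, $5",
                          decoded_control=and_control,
                          fetched="or $8, $2, $6", stall=True)
print(state)
```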

Stall/Bubble in the Pipeline

Stall inserted here

Stall/Bubble in the Pipeline

Datapath with Hazard Detection

Stalls and Performance

Stalls reduce performance
    But are required to get correct results

Compiler can arrange code to avoid hazards and stalls
    Requires knowledge of the pipeline structure
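To make the effect of reordering concrete, the sketch below counts load-use stalls for two orderings of the same work. The instruction sequence, the tuple encoding, and the function names are illustrative assumptions and are not taken from the slides; the cycle count assumes a five-stage pipeline with full forwarding and no other hazards.

```python
# Each instruction is (opcode, destination or None, source registers).
original = [
    ("lw",  "t1", ("t0",)),
    ("lw",  "t2", ("t0",)),
    ("add", "t3", ("t1", "t2")),   # uses t2 right after its load
    ("sw",  None, ("t3", "t0")),
    ("lw",  "t4", ("t0",)),
    ("add", "t5", ("t1", "t4")),   # uses t4 right after its load
    ("sw",  None, ("t5", "t0")),
]

# Same instructions with the third load hoisted above the first add.
reordered = [original[0], original[1], original[4],
             original[2], original[3], original[5], original[6]]

def load_use_stalls(program):
    """Count instructions that immediately follow a load they depend on."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        opcode, dest, _ = prev
        if opcode == "lw" and dest in cur[2]:
            stalls += 1
    return stalls

def total_cycles(program, stages=5):
    """Cycles to drain the pipeline: fill time plus one per instruction and stall."""
    return len(program) + (stages - 1) + load_use_stalls(program)

print(load_use_stalls(original), total_cycles(original))
print(load_use_stalls(reordered), total_cycles(reordered))
```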

Branch Hazards

If branch outcome determined in MEM

[Figure: pipeline diagram. Labels: PC; Flush these instructions (set control values to 0)]

Reducing Branch Delay

Move hardware to determine outcome to ID stage
    Target address adder
    Register comparator

Example: branch taken
    36: sub $10, $4, $8
    40: beq $1, $3, 7
    44: and $12, $2, $5
    48: or  $13, $2, $6
    52: add $14, $4, $2
    56: slt $15, $6, $7
        ...
    72: lw  $4, 50($7)
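The target address in this example follows from the MIPS rule that the branch offset counts words relative to the instruction after the branch. The sketch below mirrors the two pieces of hardware moved into ID, the target adder and the register comparator; the function names are assumptions.

```python
def branch_target(branch_pc, offset_words):
    """Target adder: (PC + 4) plus the word offset scaled to bytes."""
    return branch_pc + 4 + offset_words * 4

def next_pc_for_beq(branch_pc, rs_value, rt_value, offset_words):
    """Register comparator plus mux: take the target only if the registers match."""
    if rs_value == rt_value:
        return branch_target(branch_pc, offset_words)
    return branch_pc + 4

# beq $1, $3, 7 at address 40
print(branch_target(40, 7))            # address of the lw in the example
print(next_pc_for_beq(40, 5, 5, 7))    # taken
print(next_pc_for_beq(40, 5, 6, 7))    # not taken: falls through to 44
```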

Example: Branch Taken

Example: Branch Taken

Data Hazards for Branches

If a comparison register is a destination of 2nd or 3rd preceding ALU instruction

    add $1, $2, $3         IF  ID  EX  MEM WB
    add $4, $5, $6             IF  ID  EX  MEM WB
    ...                            IF  ID  EX  MEM WB
    beq $1, $4, target                 IF  ID  EX  MEM WB

Can resolve using forwarding

Data Hazards for Branches

If a comparison register is a destination of preceding ALU instruction or 2nd preceding load instruction
    Need 1 stall cycle

    lw  $1, addr           IF  ID  EX  MEM WB
    add $4, $5, $6             IF  ID  EX  MEM WB
    beq stalled                    IF  ID
    beq $1, $4, target                 ID  EX  MEM WB

Data Hazards for Branches

If a comparison register is a destination of immediately preceding load instruction
    Need 2 stall cycles

    lw  $1, addr           IF  ID  EX  MEM WB
    beq stalled                IF  ID
    beq stalled                    ID
    beq $1, $0, target                 ID  EX  MEM WB
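The three branch cases (forwarding only, one stall, two stalls) can be summarized by how far ahead of the branch the producing instruction sits. This is a sketch under the assumption that the branch compares its registers in ID, that an ALU result can be forwarded once the producer has left EX, and that loaded data can be forwarded once the producer has left MEM; the function name and the 'alu'/'load' labels are made up for the example.

```python
def branch_stall_cycles(producer_kind, distance):
    """Stall cycles for a branch resolved in ID.

    producer_kind: 'alu' or 'load'
    distance: 1 means the producer immediately precedes the branch.
    """
    # Smallest distance at which the value can be forwarded to ID in time.
    ready_distance = {"alu": 2, "load": 3}[producer_kind]
    return max(0, ready_distance - distance)

for kind, distance in [("alu", 3), ("alu", 2), ("alu", 1), ("load", 2), ("load", 1)]:
    print(kind, distance, branch_stall_cycles(kind, distance))
```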

Dynamic Branch Prediction

In deeper and superscalar pipelines, branch penalty is more significant

Use dynamic prediction
    Branch prediction buffer (aka branch history table)
    Indexed by recent branch instruction addresses
    Stores outcome (taken/not taken)
    To execute a branch
        Check table, expect the same outcome
        Start fetching from fall-through or target
        If wrong, flush pipeline and flip prediction

1-Bit Predictor: Shortcoming

Inner loop branches mispredicted twice!

    outer: …
           …
    inner: …
           …
           beq …, …, inner
           …
           beq …, …, outer

Mispredict as taken on last iteration of inner loop
Then mispredict as not taken on first iteration of inner loop next time around
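A short simulation shows the double misprediction. The loop sizes (4 inner iterations, 3 outer iterations), the initial prediction of taken, and the function names are assumptions chosen for the example.

```python
def inner_branch_outcomes(inner_iterations, outer_iterations):
    """Outcomes of the inner-loop beq: taken on every iteration but the last."""
    one_pass = [True] * (inner_iterations - 1) + [False]
    return one_pass * outer_iterations

def one_bit_mispredictions(outcomes, initial_prediction=True):
    """1-bit predictor: always predict whatever the branch did last time."""
    prediction, misses = initial_prediction, 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken      # flip (or keep) the single history bit
    return misses

outcomes = inner_branch_outcomes(inner_iterations=4, outer_iterations=3)
print(one_bit_mispredictions(outcomes))
```

With these numbers the first pass through the inner loop mispredicts once (the exit), and every later pass mispredicts twice (the first iteration and the exit).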

2-Bit Predictor

Only change prediction on two successive mispredictions

Calculating the Branch Target

Even with predictor, still need to calculate the target address
    1-cycle penalty for a taken branch

Branch target buffer
    Cache of target addresses
    Indexed by PC when instruction fetched
        If hit and instruction is branch predicted taken, can fetch target immediately
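A branch target buffer can be modeled as a small lookup table keyed by the fetch PC. The class below is a simplified sketch: it uses an unbounded dict instead of a fixed-size tagged cache, and the class and method names are assumptions.

```python
class BranchTargetBuffer:
    """Maps a branch's PC to (predicted_taken, target_address)."""

    def __init__(self):
        self.entries = {}

    def next_fetch(self, pc):
        """Address to fetch after pc, decided during instruction fetch."""
        entry = self.entries.get(pc)
        if entry is not None and entry[0]:
            return entry[1]     # hit and predicted taken: fetch the target
        return pc + 4           # miss or predicted not taken: fall through

    def update(self, pc, taken, target):
        """Record the outcome once the branch has been resolved."""
        self.entries[pc] = (taken, target)

btb = BranchTargetBuffer()
print(btb.next_fetch(40))       # first encounter: no entry, fall through
btb.update(40, taken=True, target=72)
print(btb.next_fetch(40))       # now the target is fetched immediately
```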

Exceptions and Interrupts

“Unexpected” events requiring change in flow of control
    Different ISAs use the terms differently

Exception
    Arises within the CPU
        e.g., undefined opcode, overflow, syscall, …

Interrupt
    From an external I/O controller

Dealing with them without sacrificing performance is hard

Handling Exceptions

In MIPS, exceptions managed by a System Control Coprocessor (CP0)

Save PC of offending (or interrupted) instruction
    In MIPS: Exception Program Counter (EPC)

Save indication of the problem
    In MIPS: Cause register
    We’ll assume 1-bit
        0 for undefined opcode, 1 for overflow

Jump to handler at 8000 0180

An Alternate Mechanism

Vectored Interrupts
    Handler address determined by the cause

Example:
    Undefined opcode: C000 0000
    Overflow:         C000 0020
    …:                C000 0040

Instructions either
    Deal with the interrupt, or
    Jump to real handler
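With vectoring, the handler address is a simple function of the cause. The sketch below assumes the three listed addresses continue at a uniform spacing of 0x20 bytes (room for eight instructions) and that causes are numbered 0 for undefined opcode and 1 for overflow; the constant and function names are made up for the example.

```python
VECTOR_BASE = 0xC0000000     # address listed for the first cause
VECTOR_SPACING = 0x20        # assumed uniform gap between handler slots

def vector_address(cause_code):
    """Handler entry point selected directly by the cause code."""
    return VECTOR_BASE + cause_code * VECTOR_SPACING

for cause_code, name in enumerate(["undefined opcode", "overflow"]):
    print(f"{name}: {vector_address(cause_code):08X}")
```

Each slot is small, which is why the code placed there either handles the event directly or jumps to the real handler.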

Handler Actions

Read cause, and transfer to relevant handler
Determine action required
If restartable
    Take corrective action
    Use EPC to return to program
Otherwise
    Terminate program
    Report error using EPC, cause, …

Exceptions in a Pipeline

Another form of control hazard

Consider overflow on add in EX stage
    add $1, $2, $1
    Prevent $1 from being clobbered
    Complete previous instructions
    Flush add and subsequent instructions
    Set Cause and EPC register values
    Transfer control to handler

Similar to mispredicted branch
    Use much of the same hardware

Pipeline with Exceptions

Exception Properties

Restartable exceptions
    Pipeline can flush the instruction
    Handler executes, then returns to the instruction
        Refetched and executed from scratch

PC saved in EPC register
    Identifies causing instruction
    Actually PC + 4 is saved
        Handler must adjust

Exception Example

Exception on add in
    40  sub $11, $2, $4
    44  and $12, $2, $5
    48  or  $13, $2, $6
    4C  add $1, $2, $1
    50  slt $15, $6, $7
    54  lw  $16, 50($7)
    …

Handler
    80000180  sw $25, 1000($0)
    80000184  sw $26, 1004($0)
    …
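The bookkeeping for this example can be traced in a few lines. The sketch assumes a five-stage pipeline, so that when the add is in EX the two younger instructions occupy ID and IF; the function name and the returned dict layout are assumptions.

```python
HANDLER_ADDRESS = 0x80000180

program = {
    0x40: "sub $11, $2, $4",
    0x44: "and $12, $2, $5",
    0x48: "or $13, $2, $6",
    0x4C: "add $1, $2, $1",
    0x50: "slt $15, $6, $7",
    0x54: "lw $16, 50($7)",
}

def overflow_in_ex(program, faulting_pc):
    """Summarize what happens when the instruction at faulting_pc overflows in EX."""
    older = [pc for pc in sorted(program) if pc < faulting_pc]
    in_pipeline = (faulting_pc, faulting_pc + 4, faulting_pc + 8)   # EX, ID, IF
    flushed = [pc for pc in in_pipeline if pc in program]
    return {
        "completed": older,              # previous instructions finish normally
        "flushed": flushed,              # the add and the two younger instructions
        "EPC": faulting_pc + 4,          # PC + 4 is what gets saved
        "next_fetch": HANDLER_ADDRESS,
    }

result = overflow_in_ex(program, 0x4C)
restart_pc = result["EPC"] - 4           # the handler's adjustment
print(f"EPC = {result['EPC']:X}, restart at {restart_pc:X}")
print("flushed:", [program[pc] for pc in result["flushed"]])
```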

Exception Example

Exception Example

Multiple Exceptions

Pipelining overlaps multiple instructions
    Could have multiple exceptions at once

Simple approach: deal with exception from earliest instruction
    Flush subsequent instructions
    “Precise” exceptions

In complex pipelines
    Multiple instructions issued per cycle
    Out-of-order completion
    Maintaining precise exceptions is difficult!

Imprecise Exceptions

Just stop pipeline and save state
    Including exception cause(s)

Let the handler work out
    Which instruction(s) had exceptions
    Which to complete or flush
        May require “manual” completion

Simplifies hardware, but more complex handler software
Not feasible for complex multiple-issue out-of-order pipelines

Fallacies

Pipelining is easy (!)
    The basic idea is easy
    The devil is in the details
        e.g., detecting data hazards

Pipelining is independent of technology
    So why haven’t we always done pipelining?
    More transistors make more advanced techniques feasible
    Pipeline-related ISA design needs to take account of technology trends
        e.g., predicated instructions

Pitfalls

Poor ISA design can make pipelining harder
    e.g., complex instruction sets (VAX, IA-32)
        Significant overhead to make pipelining work
        IA-32 micro-op approach
    e.g., complex addressing modes
        Register update side effects, memory indirection
    e.g., delayed branches
        Advanced pipelines have long delay slots