cs 61c: great ideas in computer architecture control and...

35
CS 61C: Great Ideas in Computer Architecture Control and Pipelining 1 Instructors: Vladimir Stojanovic and Nicholas Weaver http://inst.eecs.Berkeley.edu/~cs61c/sp16

Upload: others

Post on 25-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

CS61C:GreatIdeasinComputerArchitecture

ControlandPipelining

1

Instructors:VladimirStojanovicandNicholasWeaverhttp://inst.eecs.Berkeley.edu/~cs61c/sp16

Page 2: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

Datapath ControlSignals• ExtOp: “zero”,“sign”• ALUsrc: 0⇒ regB;

1⇒ immed• ALUctr: “ADD”,“SUB”,“OR”

• MemWr: 1⇒writememory• MemtoReg:0⇒ ALU;1⇒Mem• RegDst: 0⇒ “rt”;1⇒ “rd”• RegWr: 1⇒writeregister

32

ALUctr

clk

busW

RegWr

32

32busA

32

busB

5 5

Rw Ra Rb

RegFile

Rs

Rt

Rt

RdRegDst

Extender 3216Imm16

ALUSrcExtOp

MemtoReg

clk

DataIn32

MemWr01

0

1

ALU 0

1WrEn Adr

DataMemory

5

Imm16

clk

PC

00

4nPC_sel &Equal

PCExt

AdderAdder

Mux

InstAddress

0

1

2

Page 3: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

SummaryoftheControlSignals(1/2)inst Register Transfer

add R[rd] ← R[rs] + R[rt]; PC ← PC + 4

ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”

sub R[rd] ← R[rs] – R[rt]; PC ← PC + 4

ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”

ori R[rt] ← R[rs] + zero_ext(Imm16); PC ← PC + 4

ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt,RegWr, nPC_sel=“+4”

lw R[rt] ← MEM[ R[rs] + sign_ext(Imm16)]; PC ← PC + 4

ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel = “+4”

sw MEM[ R[rs] + sign_ext(Imm16)] ← R[rs]; PC ← PC + 4

ALUsrc=Im, Extop=“sn”, ALUctr = “ADD”, MemWr, nPC_sel = “+4”

beq if (R[rs] == R[rt]) then PC ← PC + sign_ext(Imm16)] || 00else PC ← PC + 4

nPC_sel = “br”, ALUctr = “SUB”

3

Page 4: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

SummaryoftheControlSignals(2/2)

add sub ori lw sw beq jumpRegDstALUSrcMemtoRegRegWriteMemWritenPCselJumpExtOpALUctr<2:0>

1001000xAdd

1001000x

Subtract

01010000Or

01110001Add

x1x01001Add

x0x0010x

Subtract

xxx00?1xx

op targetaddress

op rs rt rd shamt funct061116212631

op rs rt immediate

R-type

I-type

J-type

add,sub

ori,lw,sw,beq

jump

funcop 000000 000000 001101 100011 101011 000100 000010AppendixA

100000See 100010 WeDon’tCare:-)

4

Page 5: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

BooleanExpressionsforControllerRegDst = add + subALUSrc = ori + lw + swMemtoReg = lwRegWrite = add + sub + ori + lw MemWrite = swnPCsel = beqJump = jump ExtOp = lw + swALUctr[0] = sub + beq (assume ALUctr is 00 ADD, 01 SUB, 10 OR)ALUctr[1] = or

Where:

rtype = ~op5 • ~op4 • ~op3 • ~op2 • ~op1 • ~op0, ori = ~op5 • ~op4 • op3 • op2 • ~op1 • op0lw = op5 • ~op4 • ~op3 • ~op2 • op1 • op0sw = op5 • ~op4 • op3 • ~op2 • op1 • op0beq = ~op5 • ~op4 • ~op3 • op2 • ~op1 • ~op0jump = ~op5 • ~op4 • ~op3 • ~op2 • op1 • ~op0

add = rtype • func5 • ~func4 • ~func3 • ~func2 • ~func1 • ~func0sub = rtype • func5 • ~func4 • ~func3 • ~func2 • func1 • ~func0

How do we implement this in

gates?

5

Page 6: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

ControllerImplementation

addsuborilwswbeqjump

RegDstALUSrcMemtoRegRegWriteMemWritenPCselJumpExtOpALUctr[0]ALUctr[1]

“AND” logic “OR” logic

opcode func

6

Page 7: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

P&HFigure4.17

7

Page 8: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

Summary:Single-cycleProcessor• Fivestepstodesignaprocessor:

1.Analyzeinstructionsetàdatapathrequirements

2.Selectsetofdatapathcomponents&establishclockmethodology

3.Assembledatapathmeetingtherequirements

4.Analyzeimplementationofeachinstructiontodeterminesettingofcontrolpointsthateffectstheregistertransfer.

5.Assemblethecontrollogic• FormulateLogicEquations• DesignCircuits

Control

Datapath

Memory

ProcessorInput

Output

8

Page 9: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

SingleCyclePerformance• Assumetimefor actionsare

– 100psforregisterreadorwrite;200psforother events

• Clockperiodis?Instr Instr fetch Register

readALU op Memory

accessRegister write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

• Clock rate (cycles/second = Hz) = 1/Period (seconds/cycle)

9

Page 10: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

SingleCyclePerformance• Assumetimefor actionsare

– 100psforregisterreadorwrite;200psforother events

• Clockperiodis?Instr Instr fetch Register

readALU op Memory

accessRegister write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

• What can we do to improve clock rate?• Will this improve performance as well?

Want increased clock rate to mean faster programs10

Page 11: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

LevelsofRepresentation/Interpretation

lw $t0,0($2)lw $t1,4($2)sw $t1,0($2)sw $t0,4($2)

HighLevelLanguageProgram(e.g.,C)

AssemblyLanguageProgram(e.g.,MIPS)

MachineLanguageProgram(MIPS)

HardwareArchitectureDescription(e.g.,blockdiagrams)

Compiler

Assembler

MachineInterpretation

temp=v[k];v[k]=v[k+1];v[k+1]=temp;

0000 1001 1100 0110 1010 1111 0101 10001010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111

LogicCircuitDescription(CircuitSchematicDiagrams)

ArchitectureImplementation

Anythingcanberepresentedasanumber,

i.e.,dataorinstructions

11

Page 12: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

NoMoreMagic!

12

CS61ACS61BCS61C ✔CS61C ✔CS61C ✔CS61C çCS61C ✔EE40Phys 7B

I/O systemProcessor

CompilerOperatingSystem(Mac OSX)

Application (ex: browser)

Digital DesignCircuit Design

Instruction Set Architecture

Datapath & Control

transistors

MemoryHardware

Software Assembler

Page 13: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

Administrivia

• Project2-2due3/8@23:59:59(Tue)• GuerrillaSessions:MIPSCPU

– Wed3/093- 5PM@241Cory– Sat3/121- 3PM@651@611Soda

13

Page 14: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

GottaDoLaundry• Ann,Brian,Cathy,Daveeachhaveoneloadofclothestowash,dry,fold,andputaway– Washertakes30minutes

– Dryertakes30minutes

– “Folder”takes30minutes

– “Stasher”takes30minutestoputclothesintodrawers

A B C D

14

Page 15: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

SequentialLaundry

• Sequentiallaundrytakes8hoursfor4loads

Task

Order

B

CD

A30Time

30 30 3030 30 3030 30 30 3030 30 30 3030

6 PM 7 8 9 10 11 12 1 2 AM

15

Page 16: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PipelinedLaundry

• Pipelinedlaundrytakes3.5hoursfor4loads!

Task

Order

BC

D

A

12 2 AM6 PM 7 8 9 10 11 1

Time3030 30 3030 30 30

16

Page 17: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

• Pipeliningdoesn’thelplatencyofsingletask,ithelpsthroughputofentireworkload

• Multiple tasksoperatingsimultaneouslyusingdifferentresources

• Potentialspeedup=Numberpipestages

• Timeto“fill”pipelineandtimeto“drain”itreducesspeedup:2.3x(8/3.5)v.4x(8/2)inthisexample

6 PM 7 8 9Time

BC

D

A3030 30 3030 30 30

Task

Order

PipeliningLessons(1/2)

17

Page 18: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

• SupposenewWashertakes20minutes,newStashertakes20minutes.Howmuchfasterispipeline?

• Pipelineratelimitedbyslowest pipelinestage

• Unbalancedlengthsofpipestagesreducesspeedup

6 PM 7 8 9Time

BC

D

A3030 30 3030 30 30

Task

Order

PipeliningLessons(2/2)

18

Page 19: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

1)IFtch:InstructionFetch,IncrementPC2)Dcd:InstructionDecode,ReadRegisters3)Exec:

Mem-ref: CalculateAddressArith-log:PerformOperation

4)Mem:Load:ReadDatafromMemoryStore:WriteDatatoMemory

5)WB:WriteDataBacktoRegister

ExecutionStepsinMIPSDatapath

19

Page 20: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PC

inst

ruct

ion

mem

ory

+4

rtrsrd

regi

ster

s

ALU

Dat

am

emor

y

imm

1. InstructionFetch

2. Decode/Register Read

3. Execute 4. Memory 5. WriteBack

SingleCycleDatapath

20

Page 21: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PC

inst

ruct

ion

mem

ory

+4

rtrsrd

regi

ster

s

ALU

Dat

am

emor

y

imm

1. InstructionFetch

2. Decode/Register Read

3. Execute 4. Memory 5. WriteBack

Pipelineregisters

• Needregistersbetweenstages– Toholdinformationproducedinpreviouscycle

21

Page 22: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

MoreDetailedPipeline

22

Page 23: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

IFforLoad,Store,…

23

Page 24: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

IDforLoad,Store,…

24

Page 25: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

EXforLoad

25

Page 26: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

MEMforLoad

26

Page 27: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

WBforLoad– Oops!

Wrongregisternumber!

27

Page 28: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

CorrectedDatapathforLoad

28

Page 29: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PipelinedExecutionRepresentation

• Everyinstructionmusttakesamenumberofsteps,sosomestageswillidle– e.g.MEMstageforanyarithmeticinstruction

IF ID EX MEM WBIF ID EX MEM WB

IF ID EX MEM WBIF ID EX MEM WB

IF ID EX MEM WBIF ID EX MEM WB

Time

29

Page 30: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

GraphicalPipelineDiagrams

• Usedatapath figurebelowtorepresentpipeline:IF ID EX Mem WB

ALUI$ Reg D$ Reg

1.InstructionFetch

2.Decode/RegisterRead

3.Execute 4.Memory 5.WriteBack

PC

inst

ruct

ion

mem

ory

+4

RegisterFilert

rsrd

ALU

Dat

am

emor

y

imm

MU

X

30

Page 31: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

Instr

Order

Load

Add

Store

Sub

Or

I$

Time (clock cycles)

I$

ALU

Reg

Reg

I$

D$

ALU

ALU

Reg

D$

Reg

I$

D$

Reg

ALU

Reg Reg

Reg

D$

Reg

D$

ALU

• RegFile: left half is write, right half is read

Reg

I$

GraphicalPipelineRepresentation

31

Page 32: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PipeliningPerformance(1/3)

• UseTc (“timebetweencompletionofinstructions”)tomeasurespeedup

– Equalityonlyachievedifstagesarebalanced(i.e.takethesameamountoftime)

• Ifnotbalanced,speedupisreduced• Speedupduetoincreasedthroughput

– Latency foreachinstructiondoesnotdecrease

32

Page 33: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PipeliningPerformance(2/3)

• Assumetimeforstagesis– 100psforregisterreadorwrite– 200psforotherstages

• Whatispipelinedclockrate?– Comparepipelineddatapath withsingle-cycledatapath

Instr Instrfetch

Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

33

Page 34: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

PipeliningPerformance(3/3)

Single-cycleTc = 800 psf = 1.25GHz

PipelinedTc = 200 ps

f = 5GHz

34

Page 35: CS 61C: Great Ideas in Computer Architecture Control and ...gamescrafters.berkeley.edu/.../19/2016Sp-CS61C-L19... · Summary of the Control Signals (2/2) add sub ori lw sw beq jump

Clicker/PeerInstructionLogicinsomestagestakes200psandinsome100ps.Clk-Qdelayis30psandsetup-timeis20ps.Whatisthemaximumclockfrequencyatwhichapipelineddesigncanoperate?• A:10GHz• B:5GHz• C:6.7GHz• D:4.35GHz• E:4GHz

35