the risc-v berkeley out-of-order machine · icache uncore lsu rename table fpu rob free list issue...

34
UC Berkeley The RISC-V Berkeley Out-of-Order Machine: An update on BOOM and the wider ecosystem (Chisel, FIRRTL, & Rocket-chip) Christopher Celio ORCONF 2016 October [email protected] http://ucb-bar.github.io/riscv-boom

Upload: others

Post on 14-Oct-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley

TheRISC-VBerkeleyOut-of-OrderMachine:AnupdateonBOOMandthewiderecosystem

(Chisel,FIRRTL,&Rocket-chip)ChristopherCelio

ORCONF2016October

[email protected]

http://ucb-bar.github.io/riscv-boom

Page 2: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley

AnUpdateontheBerkeleyArchitectureResearchInfrastructure

(Oh,andBOOMiscooltoo.)ChristopherCelio

ORCONF2016October

[email protected]

http://ucb-bar.github.io/riscv-boom

Page 3: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley WhatisBOOM?◾ superscalar,out-of-orderRISC-VprocessorwriGeninBerkeley’sChiselhardwareconstrucIonlanguage(HCL)

◾ Itissynthesizable◾ Itisparameterizable◾ itisopen-source

3

RegFile

ICache

Uncore

LSU

RenameTable

FPU

ROB

Free List

Issue Window

Branch PredictorALUs

Fetch Buffer

DCache

DCacheControl

IDIVIMUL Busy Table

Bypasses

2-wide BOOM (16kB/16kB) 1.2mm2 @ 45nm

Fetch

Decode &Rename

Issue Window

PhysicalRegisterFile (5x3)

(PRF)

ALU

ROB

Commitwakeup

uop tags

inst

2

2

2

2

2

FPUALU LSU

Datacache

Page 4: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley HowdoyoumakeanOoOprocessor?

◾ StartwithanewhardwareconstrucIonlanguage- Verilogisawfulandwilljustgetintheway.

◾ Startwithaworkingprocessor.-WayeasierthanwriIngeverythingfromscratch!- PTWs,FPUs,uncore,devices,off-chipIOsareunglamorous.

4

Page 5: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Ittakesavillage.

◾ RISC-VISA- veryout-of-orderfriendly!

◾ ChiselhardwareconstrucIonlanguage- object-oriented,funcIonalprogramming

◾ FIRRTL(brandnew!)- exposedRTLintermediaterepresentaIon(IR)

◾ Rocket-chip- AfullworkingSoCplaVormbuiltaroundtheRocketin-ordercore

◾ Thanksto:- KrsteAsanović,RimasAvizienis,JonathanBachrach,ScoGBeamer,DavidBiancolin,ChristopherCelio,HenryCook,PalmerDabbelt,JohnHauser,AdamIzraelevitz,SagarKarandikar,BenKeller,DonggyuKim,JackKoenig,JimLawson,YunsupLee,RichardLin,EricLove,MarInMaas,ChickMarkley,AlbertMagyar,HowardMao,MiquelMoreto,QuanNguyen,AlbertOu,DavidA.PaGerson,BrianRichards,ColinSchmidt,WenyuTang,StephenTwigg,HuyVo,AndrewWaterman,AngieWang,andmore...

5

Page 6: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley TheRocket-ChipSoCGeneratorEcosystem

◾ BOOMisapieceoftheRocket-chipecosystem◾ Startedin2011◾ tapedout10(12?)ImesbyBerkeley◾ runsat1.65GHzinIBM45nm◾MMUsupportspage-basedvirtualmemory◾ IEEE754-2008compliantFPU

- supportsSP,DPFMAwithhwsupportforsubnormals

◾ cachecoherent,non-blockingL1datacache,L2cache,andmore

6https://github.com/ucb-bar/rocket-chip

Rocket5-stagepipeline

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-17.html

Page 7: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BOOMfitsintoRocket-chipSoC

7

BOOM

Rocket

Rocket

Page 8: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

◾ Thedevsgraduated!- startedSiFivestart-uptosupportRISC-V,rocket-chipanddesigncustomchipsaroundthem

◾ workinprogressonanewRocket-chipFoundaIon◾ newgeneraIonofBerkeleystudentdevs!

- BerkeleyiscommiGedtoopen-sourceRocket-chip- (ourresearchdependsonit!)

◾ 37contributors- SiFive,Berkeley,LowRISC,BostonU.,andmore

UC Berkeley Rocket-chipUpdates

8

the vision...

https://dev.sifive.com/documentation/freedom-u500-platform-guide/

Rocket-chip

Page 9: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Rocket-chipUpdates◾ commiGedtoopen-sourcesinceOct2011

- 3588commits!!!- 310commitslastfourweeks!

◾BeGerdocumentaIon- hGps://dev.sifive.com/documentaIon/u5-coreplex-series-manual/

◾ removedgitsubmodules- speedupdevelopment(reducemergeheadaches)

◾ RISC-VPrivilegedSpecv1.9◾uncachedloads/stores(memory-mappedIO)◾ RISC-VExternalDebug

- breakpoints- stand-aloneboot!- non-standardHTIF(host/targetinterface)hasbeenremoved!

◾ MoreconfiguraIons- RV32support+M/A/FasopIons- RISC-VCompressedISAsupport- blockingdatacache(MSHRs=0)

◾ UpdatedtoChisel3- NomoreC++backend,onlyVerilogisemiGed.- weuseVerilatorinthebuildsystemforfreeandfastsimulaIon.

◾ Andalotmore...- Ilelinkupdates,mulI-clockdomains,I/Os,devices,...- I'mhavingtroublekeepingup! 9

Page 10: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Chisel

◾ HardwareConstrucIonLanguageembeddedinScala◾ notahigh-levelsynthesislanguage◾ hardwaremoduleisadatastructureinScala◾ FullpowerofScalaforwriInggenerators

- object-orientedprogramming- factoryobjects,traits,overloading

- funcIonalprogramming- high-orderfuns,anonymousfuncs,currying

10

Chisel Program

Scala/JVM

magic!

Verilator Verilog

FPGA Verilog

ASIC Verilog

C++ Compiler/Verilator FPGA Tools ASIC Tools

Verilator/Software Simulator

FPGA Emulation

GDS Layout

Page 11: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Chisel3(frontend)andFIRRTL(backend)◾ Goal

- turnresearch-wareintoaqualitycompilerplaVorm- openuptheIRforhardwaredesignerstowritetheirownIRtransformpasses

◾ Chisel3- embeddedinScala- generatesFIRRTLRTLcode

◾ FIRRTL(FlexibleIRforRTL)- servesasanIRforhardware(i.e.,LLVMforhw!)- generatesVerilog

◾ Success!- canaddyourowntransformaIons- youcanthrowawayChiselandwriteyourownfront-end- hGps://github.com/ucb-bar/chisel3(alphaversion)- hGps://github.com/ucb-bar/firrtl(alphaversion)- Spec:hGps://github.com/ucb-bar/firrtl/tree/master/spec

11

Page 12: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley FIRRTLIRPasses

◾ Codecoverage- howmuchofmycircuitisbeingexercised?

◾ ScanchaininserIon◾ SRAM- FPGAsvsASICmemories- re-sizing(transformingskinny/talltorectangular)

◾ earlystaIsIcgathering(e.g.,gatecount)- synthesistoolstakealongIme...

◾ DecouplingtargetImefromhostIme- e.g.,addinga"stop-the-world"buGon

◾ asserIonsupport(TBA)- turningsimulatorasserIonsintoarealHWasserIon

12

Page 13: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley New&UpcomingChiselFeatures◾ parameterizedVerilogblackboxsupport◾ Analogtypes

- representsoutsidewiresthatare"notdigital"- e.g.,connecInginoutI/Ostoblackboxes

◾mulI-clockdomainsupport- fixed-raIohasfirst-classciIzensupport- youprovideyourownclockdomaincrossings*- e.g.,asyncFIFOs

◾ *asyncreset- TBD,butwithaneyetowardsmulI-clocksupport

◾ DSPsupport- FixedPointtype- canprovidevalueranges,notbitwidths

◾ AnnotaIons- allowFIRRTLback-endtogetaddiIonalinformaIonfromChisel

13

Page 14: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Chisel2

◾ providescompaIbilitywarningstohelpmigraIontoChisel3

◾migraIondocumentaIonisavailable◾ Endoflife...?◾ ifyouuseChisel2anddependonitletusknow!

14

Page 15: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Ques[onsonBAR?

15

Page 16: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley WhatisBOOM?

◾ superscalar,out-of-orderRISC-VprocessorwriGeninBerkeley’sChiselRTL

◾ Itissynthesizable◾ Itisparameterizable◾ itisopen-source◾ startedin2012◾ 10klinesofcode◾ implementsRV64G

16

RegFile

ICache

Uncore

LSU

RenameTable

FPU

ROB

Free List

Issue Window

Branch PredictorALUs

Fetch Buffer

DCache

DCacheControl

IDIVIMUL Busy Table

Bypasses

2-wide BOOM (16kB/16kB) 1.2mm2 @ 45nm

http://ucb-bar.github.io/riscv-boom

Page 17: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BOOM

◾ PRF- explicitrenaming- holdsspeculaIveandcommiGeddata- holdsbothx-regs,f-regs

◾ UnifiedIssueWindow- holdsallinstrucIons

◾ splitROB/issuewindowdesign 17

Fetch Decode &Rename

Issue Window Unified

PhysicalRegister

File (PRF)

FPU

ALU

Rename Map Tables & Freelist

ROB

Commit

Page 18: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BenefitsofusingChisel&Rocket-chip◾ ~10,000locinBOOMgithubrepo◾ +addiIonal~12,000locinstanIatedfromotherlibraries

- ~5,000locfromRocketcorerepository- 90(integerALU)- 150(unpipelinedmul/div)- 550(floaIngpointunits)- 1,000(non-blockingdatacache)- 300(icache)- 300(nextlinepredictor/BTB/RAS)- 200(decoderminimizaIonlogic)- 200(page-tablewalker)- 200(TLB)- 400(control/statusregisterfile)- 300(instrucIondefiniIons+constants)

- ~4,500locfromuncore- coherencehubs,L2caches,networks

- ~2000locfromhardfloat- floaIngpointhardunits

18

Page 19: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley ParameterizedSuperscalar

19

OR

valexe_units=ArrayBuffer[ExecutionUnit]()exe_units+=Module(newALUExeUnit(is_branch_unit=true,has_fpu=true,has_mul=true))exe_units+=Module(newALUMemExeUnit(fp_mem_support=true,has_div=true))Issue

SelectRegfileWriteback

dual-issue (5r,3w)

bypassing

ALU

div

LSUAgen D$

bypassing

ALU

FPU

bypassnetwork

RegfileRead

imul

exe_units+=Module(newALUExeUnit(is_branch_unit=true))exe_units+=Module(newALUExeUnit(has_fpu=true,has_mul=true))exe_units+=Module(newALUExeUnit(has_div=true))exe_units+=Module(newMemExeUnit())

Issue Select

RegfileWriteback

Quad-issue (9r,4w)

ALU

div

LSUAgen D$

ALU

imul

FPU

ALU

bypassing

bypassnetwork

RegfileRead

Page 20: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley ARMCortex-A9vs.RISC-VBOOM

20

Category ARM Cortex-A9 RISC-V BOOM-2w

ISA 32-bit ARM v7 64-bit RISC-V v2 (RV64G)

Architecture 2 wide, 3+1 issue Out-of-Order 8-stage

2 wide, 3 issue Out-of-Order 6-stage

Performance 3.59 CoreMarks/MHz 4.61 CoreMarks/MHz

Process TSMC 40GPLUS TSMC 40GPLUSArea with 32K caches 2.5 mm2 1.00 mm2

Area efficiency

1.4 CoreMarks/MHz/mm2

4.6 CoreMarks/MHz/mm2

Frequency 1.4 GHz 1.5 GHz

Caveats:A9includesNEON;BOOMis64-bit,hasIEEE-2008fusedmul-add

Page 21: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BOOMUpdates◾ opensourcedinJan2016- hGp://ucb-bar.github.io/riscv-boom/

◾ ~60pageBOOMDesignSpecificaIon- hGps://ccelio.github.io/riscv-boom-doc/

◾ usedforcasestudyinISCA2016paper(strober.org)◾ firstexternalcontribuIon(visualizer)◾ portedtoChisel3/FIRRTL◾ supportsuncachedloads,stores(allowsformemory-mappedIO)

◾ updatedtoPrivilegeSpecv1.9◾ supportsRISC-VExternalDebugSpec◾ addedHighPerformanceMonitorcounters(HPM)◾ branchpredictorimprovements◾ beginningtape-out 21

Page 22: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley ZynqFPGARepository

◾ BOOMrunsonaXilinxZynqzc706◾ hGps://github.com/ucb-bar/fpga-zynq◾UpdatedtohandlelatestPrivilegedSpecv1.9,ExternalDebugspec,self-booIng

◾was"tethered"viatheDebugTransportModule,but...◾ ...justfinishedupa"TetherSerialInterface"update

22

Page 23: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Visualiza[on

◾ Aboveshows:- SD-to-LBUhazardindhrystone- instrucIonsdependentonLBUwaitforLBUtoexecute- instrucIonsnot-dependentissueout-of-order,beforeLBUexecutes

23

Page 24: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Iacceptcontribu[ons!

◾ VisualizerisBOOM'sfirstexternalcontribuIon◾ Happytoacceptmore!- codethatcrashes- fixestocodethatcrashes- performanceanalysistools- debuggingorvisualizaIontools- performanceimprovements- newfeatures

24

Page 25: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BranchPredictors

25

Fetch Decode &Rename

Issue Window

Unified PhysicalRegister

File

Functional Unit

0x100:%add%r1,%r2,%r30x104:%add%r3,%r0,%r40x108:%lw%%r2%0(r3)0x10c:%beq%r1,%r2,%0x200

Instructions

0x110:%ld%%r2,%0(r1)0x114:%ld%%r3,%4(r1)0x118:%ld%%r4,%8(r1)0x11c:%ld%%r5,%12(r1)0x120:%...

0x200:%add%r1,%r2,%r30x204:%add%r3,%r2,%r10x208:%sw%%r1,%9(r1)0x20c:%sw%%r3,%0(r1)...

?

Page 26: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley GSharePredictor

◾ globalhistory- trackoutcomeoflastNbranches

◾ 2-bitsaturaIngcountertable

26

1 11 00 10 0

predict taken

predictnot taken

110110

global history

inst address0x1000

hash

11

0 1

predict taken?

hysteresis

Page 27: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BOOM'sBranchPredic[on

◾ next-linepredictor(NLP)- BTB,BHT,RAS- combinaIonal

◾ backingpredictor(BPD)- globalhistorypredictor- SRAM(1r/wport)

27

Branch Prediction

I$

Fetch Buffer

Fetch1

 μDecPC1

Fetch2

NLP

ExeBrTarget

NPC

Front-end

BPD

PC2

Front-end

TakePC

BHT Target >>

Page 28: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley GSharePredictor-FiengitinSRAM

◾ change"2-bitcounter"statemachine- onmispredicIon,jumpfromweaktostrong- 00->01->11

- thiswillallowustoreducetheread/writerequirements- describedinHennessy&PaGersoncomputerarchitecturetextbook

28

1 11 00 10 0

predict taken

predictnot taken

Page 29: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley GSharePredictor-FiengitinSRAM

◾ p-bit(predicIon)- readeverycycle- writeonmispredict(valueofh-bit)

◾h-bit(hysteresis)- writeeverybranch- readonmispredict

29

1 11 00 10 0

predict taken

predictnot taken

110110

global history

inst address0x1000

hash

11

0 1

predict taken?

hysteresis

Page 30: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley GShareinsingle-portedSRAM

- delayedghistoryupdate- super-scalarpredicIons- ghistoryisfetchpacketgranularity- bankedp-table- resetghistoryonmisspeculaIons- updateduringcommit 30

npc

hash

paddr

NPC (BP0)

uDec

predictions

BTB

validmaskinsts

IC-Access (BP1) IC-Resp (BP2)

FetchBuffer

I$

p-table

ghistory

wpredict

h-tablebr_unit_pc+ ghistory

S1 S2

write-buffer

h_out

addr

data

Page 31: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

holds both rename-table snapshots, and bpd

snapshot information that can be released once a

branch is resolved.

npc

hash

paddr

NPC (BP0)

uDec

predictions

BTB

validmaskinsts

IC-Access (BP1) IC-Resp (BP2)

FetchBuffer

I$

ghistory

wpredict

S1S2

global history register* update at end of BP1 stage - it contains the branch prediction history of the fetch unit (including BTB decisions) - compresses entire fetch-packet decision - only includes branches, not JAL/JALRs* reset on fetch unit redirect (misprediction) - new prediction must use the new history

prediction tables

addr

data

allocateresources

B-ROB

Commit,update predictor

branch snapshots

B-ROB holds all inflight branches, and can bypass updates that haven't

been committed to incoming predictions.

imem

req

pred_resp(1 per packet)

predictions (1 per inst)ras_update

predictor pipeline

br_unit

update

ROB

kill

predictor

Decode/Rename/Dispatch

req_pc

resp

predictor (base)

ghist

brUnitResp

commit

UC Berkeley

AbstractBranchPredictors

31

Page 32: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley BOOM'sBranchPredictors◾ Null

- predictsnot-taken◾ Random

- servesasthebaselineworst-casepredictor- usefulfortesIngthepipeline

◾ SimpleGshare- demonstrateshowtointerfacewiththebranchpredictorframework- notsynthesizable

◾ GShare- targeIng1r/1wSRAM(dualported)...- ...or1rwSRAM(banked)

◾ 2bc-GSkew- basedontheEV8(Alpha21464)predictor- 1rw(or1r/1w)SRAM- took12hourstoimplement

◾ TAGE- asuperawesomepredictor- sIllinprototyping;WIPtomakeitsynthesizableandscalable

32

Page 33: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley Thankyou!

◾ Source- hGps://ucb-bar.github.io/riscv-boom

◾ DocumentaIon- hGps://ccelio.github.io/riscv-boom-doc

◾ Googlegroup- hGps://groups.google.com/forum/#!forum/riscv-boom

◾ TechReport- hGps://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-167.html

◾ TwiGer- hGps://twiGer.com/boom_cpu

33

Page 34: The RISC-V Berkeley Out-of-Order Machine · ICache Uncore LSU Rename Table FPU ROB Free List Issue Window Branch Predictor ALUs Fetch Buffer DCache DCache Control IMUL Busy IDIV Table

UC Berkeley FundingAcknowledgements

34

◾ Researchpar*allyfundedbyDARPAAwardNumberHR0011-12-2-0016,theCenterforFutureArchitectureResearch,amemberofSTARnet,aSemiconductorResearchCorpora*onprogramsponsoredbyMARCOandDARPA,andASPIRELabindustrialsponsorsandaffiliatesIntel,Google,Huawei,Nokia,NVIDIA,Oracle,andSamsung.

◾ Approvedforpublicrelease;distribu*onisunlimited.Thecontentofthispresenta*ondoesnotnecessarilyreflecttheposi*onorthepolicyoftheUSgovernmentandnoofficialendorsementshouldbeinferred.

◾ Anyopinions,findings,conclusions,orrecommenda*onsinthispaperaresolelythoseoftheauthorsanddoesnotnecessarilyreflecttheposi*onorthepolicyofthesponsors.