![Page 1: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/1.jpg)
Fine Grain Incremental Rescheduling via Architectural Retiming
Soha Hassoun
Tufts University
Medford, MA
Thanks to: Carl Ebeling
University of Washington
Seattle, WA
![Page 2: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/2.jpg)
Example
Problem: clock period is too large
[Figure: RAM with Write Address, Read Address, and Offset]
![Page 3: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/3.jpg)
Pipelining
Problems with consecutive dependent operations
[Figure: RAM with Write Address, Read Address, and Offset]
![Page 4: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/4.jpg)
Performance Bottleneck
Latency-constrained paths
Latency = n
![Page 5: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/5.jpg)
Performance Bottleneck
Latency-constrained paths
Latency = n
Approach: apply architectural retiming at the RT level
![Page 6: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/6.jpg)
Architectural Retiming
Problem: too much work, too little time
[Figure: circuit with signal y_k]
![Page 7: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/7.jpg)
Architectural Retiming
Problem: too much work, too little time
[Figure: circuit with a pipeline register; labels D, y_k]
![Page 8: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/8.jpg)
Architectural Retiming
Problem: too much work, too little time
[Figure: circuit with negative register N and a pipeline register; labels D, C, y_k]
![Page 9: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/9.jpg)
Architectural Retiming
Problem: too much work, too little time
[Figure: circuit with negative register N and a pipeline register; labels D, C, y_k]
Precomputation or prediction
![Page 10: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/10.jpg)
Outline
• Precomputation: incremental rescheduling without resource constraints
• Prediction: incremental rescheduling with resource constraints
• Results
![Page 11: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/11.jpg)
Precomputation Function
$D^{t} = C^{t+1}$
[Figure: circuit with blocks f, g, h; signals y_k, x_i, C, D; negative register N]
![Page 12: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/12.jpg)
Precomputation Function
$D^{t} = C^{t+1} = f(\ldots, x_i^{t+1}, \ldots)$
[Figure: circuit with blocks f, g, h; signals y_k, x_i, C, D; negative register N]
![Page 13: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/13.jpg)
Precomputation Function
$D^{t} = C^{t+1} = f(\ldots, x_i^{t+1}, \ldots)$
$x_i^{t+1} = {x'_i}^{t} = g(\ldots, y_k^{t}, \ldots)$
[Figure: circuit with blocks f, g, h; signals y_k, x_i, C, D; negative register N]
![Page 14: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/14.jpg)
Precomputation Function
$D^{t} = C^{t+1} = f(\ldots, x_i^{t+1}, \ldots)$
$x_i^{t+1} = {x'_i}^{t} = g(\ldots, y_k^{t}, \ldots)$
$D^{t} = f(\ldots, g(\ldots, y_k^{t}, \ldots), \ldots) = f'(\ldots, y_k^{t}, \ldots)$
[Figure: circuit with blocks f, g, h and new block f'; signals y_k, x_i, C, D; negative register N]
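The substitution on this slide can be checked with a small behavioral sketch. The bodies of `f` and `g` below are arbitrary placeholders, not taken from the talk, and the register holding x_i is given an arbitrary reset value; only the composition f' = f(g(...)) and the one-cycle shift D^t = C^{t+1} follow the equations above.

```python
def g(y_k):
    """Placeholder for the logic that produces x'_i from y_k (assumed)."""
    return 3 * y_k + 1


def f(x_i):
    """Placeholder for the logic that produces C from x_i (assumed)."""
    return (x_i * x_i) % 256


def f_prime(y_k):
    """Precomputation function: f' = f composed with g."""
    return f(g(y_k))


def simulate_original(ys, reset=0):
    """C^t = f(x_i^t), where the register gives x_i^{t+1} = g(y_k^t)."""
    x_i = reset
    cs = []
    for y in ys:
        cs.append(f(x_i))   # value of C in this cycle
        x_i = g(y)          # register update, visible next cycle
    return cs


def simulate_precomputed(ys):
    """D^t = f'(y_k^t), available one cycle before C carries the same value."""
    return [f_prime(y) for y in ys]


ys = [5, 2, 9, 4]
C = simulate_original(ys)
D = simulate_precomputed(ys)
# D at cycle t matches C at cycle t+1.
assert D[:-1] == C[1:]
```

In schedule terms, f' does in cycle n what g (cycle n) and f (cycle n+1) did together, which is the rescheduling shown on the next slides.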
![Page 15: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/15.jpg)
Incremental Rescheduling
Time n: g
Time n+1: f, h
[Figure: circuit with blocks f, g, h; signal y_k; negative register N]
![Page 16: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/16.jpg)
Incremental Rescheduling
Before: Time n: g; Time n+1: f, h
After: Time n: f'; Time n+1: h
[Figure: circuit with blocks f, g, h and new block f'; signal y_k; negative register N]
![Page 17: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/17.jpg)
Precomputing with Register Arrays
[Figure: register array with Write Address, Write Data, Read Address, Read Data]
![Page 18: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/18.jpg)
Precomputing with Register Arrays
[Figure: register array with Write Address, Write Data, Read Address, Read Data; labels Out, F; negative register N]
![Page 19: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/19.jpg)
Precomputing with Register Arrays
$F^{t} = \mathrm{Out}^{t+1}$
[Figure: register array with Write Address, Write Data, Read Address, Read Data; labels Out, F; negative register N]
![Page 20: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/20.jpg)
Precomputing with Register Arrays
$F^{t} = \mathrm{Out}^{t+1} = \mathrm{Array}^{t+1}[\text{Read Address}^{t+1}]$
[Figure: register array with Write Address, Write Data, Read Address, Read Data; labels Out, F; negative register N]
![Page 21: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/21.jpg)
Synthesizing Bypass Paths
[Figure: two register-array diagrams labeled Write Address, Read Address, Write Data, Read Data; one adds a Precomputed Read Address input and an =? comparator]
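A minimal sketch of how the bypass realizes F^t = Array^{t+1}[Read Address^{t+1}], under assumptions not stated on the slides: exactly one write per cycle, the write at cycle t becomes visible in Array^{t+1}, and the next read address is already known at cycle t. The function names are hypothetical.

```python
def original_outputs(init, writes, reads):
    """Out^t = Array^t[Read Address^t]; the write at cycle t lands in Array^{t+1}."""
    array = list(init)
    outs = []
    for (wa, wd), ra in zip(writes, reads):
        outs.append(array[ra])   # read sees the array before this cycle's write
        array[wa] = wd
    return outs


def precomputed_outputs(init, writes, reads):
    """F^t = Array^{t+1}[Read Address^{t+1}], produced at cycle t via a bypass."""
    array = list(init)
    fs = []
    for t in range(len(reads) - 1):
        wa, wd = writes[t]
        next_ra = reads[t + 1]         # precomputed read address (assumed available)
        if wa == next_ra:              # the "=?" comparator
            fs.append(wd)              # bypass: forward the data being written
        else:
            fs.append(array[next_ra])  # otherwise the stored value is still current
        array[wa] = wd
    return fs


init = [10, 20, 30, 40]
writes = [(1, 99), (2, 77), (0, 55), (3, 11)]   # (write address, write data) per cycle
reads = [0, 1, 3, 0]                            # read address per cycle
outs = original_outputs(init, writes, reads)
fs = precomputed_outputs(init, writes, reads)
# F at cycle t equals Out at cycle t+1.
assert fs == outs[1:]
```

The comparator is needed only because the array contents at t+1 differ from those at t in the single location being written.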
![Page 22: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/22.jpg)
Precomputing RAM Output
[Figure: RAM diagrams with negative register N]
![Page 23: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/23.jpg)
Prediction
What if?
• can't precompute
• too many additional resources
• performance is unsatisfactory
[Figure: circuit labeled D, C, f, g, i, Z, with negative register N]
![Page 24: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/24.jpg)
Prediction
What if?
• can't precompute
• too many additional resources
• performance is unsatisfactory
Predict C one cycle before its arrival
[Figure: circuit labeled D, C, f, g, i, Z, with negative register N]
![Page 25: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/25.jpg)
Schedule with Mispredictions
[Figure: blocks C and H with registers R1 and R2; schedule over cycles t-1, t, t+1 with row C: c1, c2 and row H: h1, h2]
![Page 26: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/26.jpg)
Schedule with Mispredictions
[Figure: blocks C and H with registers R1 and R2, plus Verify and Negative Register; schedule over cycles t-1, t, t+1 with row C: c1, c2 and row H: h1, h2]
![Page 27: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/27.jpg)
Schedule with Mispredictions
[Figure: blocks C and H with registers R1 and R2, plus Verify and Negative Register; schedule over cycles t-1, t, t+1 with row C: c1 and row H empty]
![Page 28: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/28.jpg)
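Schedule with Mispredictions
[Figure: blocks C and H with registers R1 and R2, plus Verify and Negative Register; schedule over cycles t-1, t, t+1 with actual values c1, c2, predicted values c1*, c2*, verification checks c1* =? c1 and c2* =? c2, and H operations h1, h2]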
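The predict, verify, and correct loop suggested by this schedule can be sketched behaviorally. Everything specific here is an assumption for illustration: a "repeat the last value" predictor stands in for the predicting FSM, `h` is an arbitrary placeholder, and a misprediction is charged exactly one extra cycle to redo H with the actual value.

```python
def run_with_prediction(actual_c, predict, h):
    """Run H on predicted values of C, verifying each prediction (c* =? c).

    Returns (results, cycles, mispredictions).
    """
    results = []
    cycles = 0
    mispredictions = 0
    last = None
    for c in actual_c:
        guess = predict(last)      # c*: value supplied by the negative register
        value = h(guess)           # H proceeds speculatively on the prediction
        cycles += 1
        if guess != c:             # Verify: c* =? c
            mispredictions += 1
            value = h(c)           # nullify the speculative result and redo H
            cycles += 1            # assumed one-cycle penalty
        results.append(value)
        last = c
    return results, cycles, mispredictions


def last_value_predictor(last):
    """Hypothetical predictor: repeat the previous actual value (0 at start)."""
    return 0 if last is None else last


def h(c):
    """Placeholder for the downstream logic H (assumed)."""
    return c + 100


actual = [1, 1, 2, 2, 2, 3]
results, cycles, misses = run_with_prediction(actual, last_value_predictor, h)
# Functionality is preserved regardless of prediction accuracy.
assert results == [h(c) for c in actual]
# Each misprediction adds one cycle under the assumed penalty.
assert cycles == len(actual) + misses
```

The sketch shows why prediction accuracy affects only throughput: results always match the non-speculative computation, and the cycle count grows with the number of mispredictions.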
![Page 29: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/29.jpg)
Synthesis Issues in Prediction
• Negative register as predicting FSM
  • use signal transition probabilities
  • incorporate don't-care conditions
• Nullifying mispredictions
  • Two correction strategies
    • As-Soon-As-Possible restoration
    • As-Late-As-Possible correction
• Add handshaking signals to coordinate with interface
![Page 30: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/30.jpg)
Related Work
• Precomputation
  • Bypass synthesis
  • Lookahead [Kogge '81, …..]
• Prediction / Speculative Execution
  • Most likely path, arbitrarily deep [Holtmann & Ernst '93, '95]
  • Pre-execution [Radivojevic & Brewer '94]
  • Possible multiple paths & arbitrarily deep [Lakshminarayana et al. '98]
  • Percolation scheduling [Potasman et al. '90]
![Page 31: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/31.jpg)
Results
[Bar chart: Speed up and Area Increase (vertical axis 0 to 2.5) for Seq, QC, GCD-prec, FA1, FA2, MIM, MIM-pred, GCD-pred]
![Page 32: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/32.jpg)
Architectural Retiming
• Improves throughput while preserving functionality and sometimes latency
• Bridges gap between HLS and logic optimizations
• Unifies several sequential optimizations
  • bypass synthesis
  • lookahead transformation
  • branch prediction
  • fine-grain cross-register optimizations
![Page 33: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/33.jpg)
Ph.D. Forum at DAC '99
• Goal
  • increase interaction between academia and industry
• Format
  • students present work at poster session at DAC
  • researchers give feedback
• Who's eligible?
  • Students within 1 or 2 years of finishing Ph.D. thesis
www.cs.washington.edu/homes/soha/forum
![Page 34: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/34.jpg)
The End
![Page 35: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/35.jpg)
Precomputing in Single-Register Cycles
Original Circuit
[Figure: cycle through blocks A and B]
![Page 36: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/36.jpg)
Precomputing in Single-Register Cycles
Original Circuit
[Figure: cycle through blocks A and B with negative register N]
![Page 37: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/37.jpg)
Precomputing in Single-Register Cycles
Lookahead: A(n) is a function of B(n-2)
[Figure: cycle through blocks A and B with negative register N; transformed circuit with blocks A', B, A, B']
[Kogge '81], [Parhi & Messerschmitt '89]
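As an illustration of the lookahead idea only, the sketch below uses a first-order linear recurrence x(n) = a·x(n-1) + u(n), a standard textbook stand-in rather than the A/B circuit on this slide. Substituting the recurrence into itself once gives x(n) = a²·x(n-2) + a·u(n-1) + u(n), so each value depends on the state two steps back and the loop contains two registers instead of one.

```python
def direct(a, u, x0):
    """x(n) = a*x(n-1) + u(n) for n >= 1, with x(0) = x0 (u[0] is unused)."""
    xs = [x0]
    for n in range(1, len(u)):
        xs.append(a * xs[n - 1] + u[n])
    return xs


def lookahead(a, u, x0):
    """Same sequence from x(n) = a^2*x(n-2) + a*u(n-1) + u(n) for n >= 2."""
    xs = [x0]
    if len(u) > 1:
        xs.append(a * x0 + u[1])   # x(1) still needs the original recurrence
    for n in range(2, len(u)):
        xs.append(a * a * xs[n - 2] + a * u[n - 1] + u[n])
    return xs


u = [0, 1, 2, 3, 4, 5]
# Integer inputs keep the comparison exact.
assert direct(3, u, 7) == lookahead(3, u, 7)
```

The extra terms (a² and a·u(n-1)) are the added logic that lookahead trades for the relaxed cycle.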
![Page 38: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/38.jpg)
Precomputing RAM Output
[Figure: RAM]
![Page 39: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/39.jpg)
Precomputing RAM Output
[Figure: RAM]
![Page 40: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/40.jpg)
Speculative Execution
Scope and Depth
[Figure: operations c1, c2, c3, c4, c5, c6]
![Page 41: Soha Hassoun Tufts University Medford, MA Thanks to: Carl Ebeling University of Washington](https://reader036.vdocuments.site/reader036/viewer/2022062314/5681313a550346895d97afa8/html5/thumbnails/41.jpg)
Speculative Execution
Scope and Depth