5/17/07 6.375: Bichler, Carli, Yamhure 1
A Parametrizable Processor
Olivier Bichler
Roberto Carli
Alessandro Yamhure
Motivation
• Efficient system-on-a-chip solutions
• Wide spectrum of requirements
  – Performance
  – Area / Power
• Tuning without engineering
Overview
• One-Rule Synchronous Processor
• Combinational stages packaged as functions
• Parameter-controlled configuration
• Optimization (performance/area):
  – Compiler macros
  – Aggressive compiler
• Results / tradeoffs
One Rule to rule them all
• Explicit guards (WILL_FIRE)
  – FIFOF Methods
  – Data Memory
  – Writeback stage guard
• Customized SFIFO
  – First, find, find2, notFull, notEmpty, deq < enq < clear
  – Guards < Actions
  – Writeback < Execute
  – No Bypassing
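The method ordering listed on this slide can be illustrated in software. The following is a minimal Python sketch of one possible reading of it, not the authors' Bluespec code: the class name `SFifo`, the `depth` parameter, the `cycle_end` method, and the choice to store destination register numbers are all assumptions made for illustration. Action calls made during a cycle are only recorded, then applied at the end of the cycle in the order deq, enq, clear. The read methods see only the state from the start of the cycle, which models "Guards < Actions" and "No Bypassing".

```python
from collections import deque


class SFifo:
    """Searchable FIFO with end-of-cycle updates (illustrative model only)."""

    def __init__(self, depth=2):
        self.depth = depth
        self.items = deque()   # state visible during the current cycle
        self._deq = False      # requests recorded during this cycle
        self._enq = []
        self._clear = False

    # Read methods: they see only the state from the start of the cycle.
    def not_empty(self):
        return len(self.items) > 0

    def not_full(self):
        return len(self.items) < self.depth

    def first(self):
        return self.items[0]

    def find(self, reg):
        return any(dst == reg for dst in self.items)

    def find2(self, reg_a, reg_b):
        return self.find(reg_a) or self.find(reg_b)

    # Action methods: they check an explicit guard and record a request.
    def deq(self):
        assert self.not_empty(), "explicit guard: deq on empty FIFO"
        self._deq = True

    def enq(self, item):
        assert self.not_full() and not self._enq, "explicit guard: enq blocked"
        self._enq.append(item)

    def clear(self):
        self._clear = True

    def cycle_end(self):
        """Apply the recorded requests in the order deq < enq < clear."""
        if self._deq:
            self.items.popleft()
        self.items.extend(self._enq)
        if self._clear:
            self.items.clear()
        self._deq, self._enq, self._clear = False, [], False
```

In this model, an `enq(5)` followed by `find(5)` in the same cycle reports no match; the entry becomes visible only after `cycle_end()`.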
Packaging for Abstraction
• Achieve highly modular parameterization
• Combinational workhorse unchanged
  – Can be packaged/abstracted into functions
  – ActionValue functions
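As a rough illustration of packaging the combinational work into functions, here is a Python sketch (the slides do not show the Bluespec source, so this is not the authors' code). The toy instruction format, the function names `fetch`, `execute`, `writeback`, `build_stages`, and `run_one`, and the assumption that pcQ sits between fetch and execute while wbQ sits between execute and writeback are all hypothetical. The point it tries to show is that the same three functions are reused in every configuration; the two parameters only decide where a stage boundary falls.

```python
def fetch(ctx):
    """Read the instruction at the current pc."""
    return {**ctx, "inst": ctx["imem"][ctx["pc"]]}


def execute(ctx):
    """Compute the result of a toy (op, dst, src_a, src_b) instruction."""
    op, dst, a, b = ctx["inst"]
    regs = ctx["regs"]
    value = regs[a] + regs[b] if op == "add" else regs[a] - regs[b]
    return {**ctx, "result": (dst, value), "pc": ctx["pc"] + 1}


def writeback(ctx):
    """Commit the computed result to the register file."""
    dst, value = ctx["result"]
    return {**ctx, "regs": {**ctx["regs"], dst: value}}


def build_stages(with_pcq, with_wbq):
    """Group the stage functions into clocked stages; a queue ends a stage."""
    stages, current = [], [fetch]
    if with_pcq:
        stages.append(current)
        current = []
    current.append(execute)
    if with_wbq:
        stages.append(current)
        current = []
    current.append(writeback)
    stages.append(current)
    return stages


def run_one(ctx, stages):
    """Push one instruction through every stage in order."""
    for stage in stages:
        for fn in stage:
            ctx = fn(ctx)
    return ctx
```

Calling `build_stages(True, True)` yields three stages, either mixed setting yields two, and `build_stages(False, False)` yields one, while `run_one` produces the same register state in all four cases.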
3-stage packaged version
2-stage version (no pcQ)
2-stage version (no wbQ)
1-stage version
High-level schematics
2-stage Exploration
• Instruction memory
  – Never skipped
  – High latency
• Data memory
  – Only used on memory LD/ST instructions
  – Pipeline causes stalls
  – On branches/jumps no need for writeback
• However, specific SOC tasks can benefit from alternative solutions
Test Strategy
• Custom Makefile
  – Scans and changes parameters automatically
  – Synthesizes various configurations
  – Reports IPC, IPS, clock period, area
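The Makefile itself is not shown in the slides. As a hedged sketch of the same idea in Python, the code below enumerates the four pcQ/wbQ configurations and derives IPS from IPC and clock period using IPS = IPC / clock period. The function names, the `measure` callback, and the assumption that the clock period is expressed in nanoseconds are all hypothetical; a real flow would obtain IPC from simulation and the period from the synthesis or place-and-route reports.

```python
from itertools import product


def config_name(with_pcq, with_wbq):
    """Label a configuration the way the result charts do."""
    pcq = "w pcQ" if with_pcq else "w/o pcQ"
    wbq = "w wbQ" if with_wbq else "w/o wbQ"
    return f"{pcq}, {wbq}"


def ips(ipc, clock_period_ns):
    """Instructions per second = IPC * clock frequency (period assumed in ns)."""
    return ipc / (clock_period_ns * 1e-9)


def sweep(measure):
    """Call measure(with_pcq, with_wbq) -> (ipc, period_ns) for each config."""
    report = {}
    for with_pcq, with_wbq in product([False, True], repeat=2):
        ipc, period_ns = measure(with_pcq, with_wbq)
        report[config_name(with_pcq, with_wbq)] = {
            "ipc": ipc,
            "period_ns": period_ns,
            "ips": ips(ipc, period_ns),
        }
    return report
```

For example, `sweep(lambda pcq, wbq: (0.5, 5.0))` uses placeholder measurements (not results from the slides) and returns one report entry per configuration.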
Results: Clock Period
[Figure: bar chart of post-place+route effective clock period (axis 0.00 to 8.00) for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ]
Expectation:
Effective clock period is expected to decrease with the number of pipelined stages, as we make the critical path shorter
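The reasoning behind this expectation can be sketched with a simple timing model: the clock period is set by the slowest stage, and blocks that share a stage add their delays. The Python below is an illustration only. The function name `clock_period`, the `register_overhead` parameter, and the three block delays are hypothetical values in arbitrary units, not measurements from this project, and the placement of pcQ and wbQ between blocks is an assumption.

```python
def clock_period(stages, register_overhead=0.0):
    """Return the longest per-stage combinational delay plus register overhead.

    `stages` is a list of stages; each stage is a list of block delays that
    sit between two pipeline registers and therefore add up.
    """
    return max(sum(stage) for stage in stages) + register_overhead


# Hypothetical block delays in arbitrary units (not measured values).
FETCH, EXECUTE, WRITEBACK = 2.0, 3.0, 1.0

configs = {
    "w/o pcQ, w/o wbQ": [[FETCH, EXECUTE, WRITEBACK]],      # 1 stage
    "w/o pcQ, w wbQ": [[FETCH, EXECUTE], [WRITEBACK]],      # 2 stages
    "w pcQ, w/o wbQ": [[FETCH], [EXECUTE, WRITEBACK]],      # 2 stages
    "w pcQ, w wbQ": [[FETCH], [EXECUTE], [WRITEBACK]],      # 3 stages
}
periods = {name: clock_period(stages) for name, stages in configs.items()}
```

Under these made-up delays the one-stage design has the longest period and the three-stage design the shortest; the two two-stage variants differ because the cut falls at different points in the critical path.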
Results: IPS
[Figure: two bar charts of IPS on the benchmarks median, qsort, towers, vvadd, and multiply for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ. One axis runs from 0.00 to 90000000.00, the other from 0.00 to 100000000.00. Panel labels: non-EHR RegFile, EHR RegFile. Annotation: Design exploration]
Results: Area
[Figure: bar chart of post-synthesis total area (axis 18000.00 to 24000.00) for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ]
[Figure: two bar charts of post-place+route total area for the same four configurations, one with axis 300000.00 to 370000.00 and one with axis 325000.00 to 375000.00. Panel labels: non-EHR RegFile, EHR RegFile]
Expectation:
Area should increase with the number of pipelined stages
Performance/Area Tradeoffs
[Figure: bar chart of IPS / Post-synthesis area (axis 0 to 4000) for four configurations: w/o pcQ, w/o wbQ; w/o pcQ, w wbQ; w pcQ, w/o wbQ; w pcQ, w wbQ]
[Figure: bar chart of FOM = SQRT(IPS) / Post-synthesis area (axis 0 to 0.45) for the same four configurations]
FOM ∝ IPS / (Power × Area)
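The two metrics plotted on this slide, IPS / Post-synthesis area and FOM = SQRT(IPS) / Post-synthesis area, are simple to compute once IPS and area are known. The Python sketch below shows the arithmetic; the function names and the sample numbers are placeholders chosen for illustration, not values read from the charts.

```python
import math


def ips_per_area(ips, area):
    """Throughput per unit of post-synthesis area."""
    return ips / area


def fom(ips, area):
    """Figure of merit as titled on the slide: SQRT(IPS) / post-synthesis area."""
    return math.sqrt(ips) / area


# Placeholder inputs (hypothetical, not measured results).
sample_ips = 64e6
sample_area = 20000.0

sample_ratio = ips_per_area(sample_ips, sample_area)
sample_fom = fom(sample_ips, sample_area)
```

Taking the square root of IPS makes the figure of merit less sensitive to raw throughput than the plain ratio, so a configuration has to gain substantially more IPS to offset a given area increase.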
Conclusions
• Task-specific, customizable and flexible processor design
• User-defined parameter controls degree of parallelism / number of stages
• Each configuration optimized for area and performance
• Balanced tradeoff between configurations