![Page 1: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/1.jpg)
1D and 2D Bitstream Relocation for Partially
Dynamically Reconfigurable Architecture
BY
Marco Novati
Thesis committee:
Shantanu Dutt (chair), Marco Domenico Santambrogio, Piotr Gmytrasiewicz
UIC Thesis Defense: May 8, 2008
![Page 2: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/2.jpg)
Aims
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D – 2D
• HW – SW
![Page 3: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/3.jpg)
Outline
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
![Page 4: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/4.jpg)
What’s Next…
Introduction
– Reconfiguration
– Xilinx FPGAs
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
![Page 5: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/5.jpg)
Reconfigurable Computing
“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially
much higher performance than software, while maintaining a higher level of flexibility than hardware”
(K. Compton and S. Hauck, Reconfigurable Computing: a Survey of Systems and Software, 2002)
![Page 6: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/6.jpg)
5 W
who controls the reconfiguration
where the reconfigurator is located
when the configurations are generated
which is the granularity of the reconfiguration
in what dimension the reconfiguration operates
![Page 7: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/7.jpg)
Reconfiguration in everyday life
[Figure: sports analogy for reconfiguration. Labels: Hockey, Football, Soccer; (Complete – Static), (Partial – Dynamic), (Partial – Static)]
![Page 8: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/8.jpg)
Reconfigurable architecture
A basic reconfigurable architecture consists of:
– a static area: a basic Harvard architecture
– a reconfigurable area: a device area composed of several reconfigurable regions
![Page 9: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/9.jpg)
Basic Definitions
Core: a specific representation of a functionality. A core can be described, for example, in VHDL, in C, or in an intermediate representation (e.g. a DFG)
IP-Core: a core described in a hardware description language, combined with its communication infrastructure (i.e. the bus interface)
Reconfigurable Functional Unit (RFU): an IP-Core that can be plugged into and/or unplugged from an already working architecture at runtime
Reconfigurable Region (RR): a portion of the device area used to implement a reconfigurable core
![Page 10: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/10.jpg)
Xilinx FPGAs and Configuration Memory
![Page 11: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/11.jpg)
Frame Addressing: Virtex, Virtex-E
* Inspired by the Virtex Series Configuration Architecture User Guide
![Page 12: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/12.jpg)
Frame Addressing: Virtex2pro
* Taken from the Virtex-II Pro and Virtex-II Pro X FPGA User Guide
![Page 13: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/13.jpg)
Frame Addressing: Virtex 4-5 (1/2)
New frame addressing: both rows and columns can be addressed
* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide
![Page 14: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/14.jpg)
Frame Addressing: Virtex 4-5 (2/2)
* Inspired by the Virtex 4 & 5 Configuration Architecture User Guide
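A minimal sketch of what row and column addressing makes possible: if a frame address carries separate row and column fields, moving a core in 2D amounts to rewriting those two fields. The field widths, the `FrameAddress` class and the `relocate` helper are hypothetical and chosen for illustration only; the real bit layout is device specific and is given in the user guide cited on the slide.

```python
from dataclasses import dataclass

# Hypothetical field widths in bits; NOT the actual Virtex-4/5 layout.
ROW_BITS, COL_BITS, MINOR_BITS = 5, 8, 6


@dataclass(frozen=True)
class FrameAddress:
    row: int     # row of the frame
    column: int  # column of the frame
    minor: int   # frame index inside the column

    def pack(self) -> int:
        """Pack the fields into one address word, checking each range."""
        for value, bits in ((self.row, ROW_BITS),
                            (self.column, COL_BITS),
                            (self.minor, MINOR_BITS)):
            if not 0 <= value < (1 << bits):
                raise ValueError("frame address field out of range")
        return ((self.row << (COL_BITS + MINOR_BITS))
                | (self.column << MINOR_BITS)
                | self.minor)

    @staticmethod
    def unpack(word: int) -> "FrameAddress":
        """Split an address word back into its three fields."""
        minor = word & ((1 << MINOR_BITS) - 1)
        column = (word >> MINOR_BITS) & ((1 << COL_BITS) - 1)
        row = (word >> (COL_BITS + MINOR_BITS)) & ((1 << ROW_BITS) - 1)
        return FrameAddress(row, column, minor)


def relocate(word: int, d_row: int, d_col: int) -> int:
    """Shift a frame address by (d_row, d_col), keeping the minor index."""
    fa = FrameAddress.unpack(word)
    return FrameAddress(fa.row + d_row, fa.column + d_col, fa.minor).pack()
```

With `d_row = 0` this reduces to the 1D case, where only the column field changes.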
![Page 15: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/15.jpg)
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
![Page 16: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/16.jpg)
Relocation: Rationale
Bitstream relocation is used to:
– speed up the overall system execution
– reduce the amount of memory used to store partial bitstreams
– achieve preemptive core execution
– assign the bitstream placement at runtime
![Page 17: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/17.jpg)
Relocation: The Problem
[Figure: “People Demanding for Functionalities” and “Set of Available Functionalities”; an FPGA with three reconfigurable regions (RR1, RR2, RR3) and the RFU implementations of A, B, C, D, E and F over those regions]
Legend (Fi: Area/Time): A 2/1, B 1/2, C 2/2, D 1/1, E 1/1, F 2/2
![Page 18: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/18.jpg)
Relocation: Scenario
[Figure: a possible scenario. Area/Time schedule showing A, B, Rec. F, F, Rec. E, E, Rec. C, C, Rec. D, D, together with the RFU implementations of A to F over RR1, RR2 and RR3 and the Area/Time legend for each functionality Fi]
![Page 19: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/19.jpg)
Relocation: Motivation
[Figure: a possible scenario, with the RFU implementations of A to F over RR1, RR2 and RR3 and the Area/Time legend for each functionality Fi. Two Area/Time schedules are compared: one showing A, B, Rec. C, C, Rec. F, F, Rec. E, E, Rec. D, D, and one showing A, B, Rec. C, C, R2 F, F, R2 E, E, R2 D, D]
![Page 20: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/20.jpg)
What’s Next…
Introduction
Relocation
State of the Art
– PARBIT
– BITPOS
– BAnMaT
– REPLICA
Proposed Solutions
Results
Concluding Remarks and Future Work
![Page 21: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/21.jpg)
PARBIT
[E. Horta and John W. Lockwood, “PARBIT: A Tool to Transform Bitfiles to Implement Partial Reconfiguration of Field Programmable Gate Arrays (FPGAs)”, Washington University, Technical Report, July 2001.]
Features:
– Pure C software
– Enables the generation of the partial bitstream file
– Small modifications, altering only the parts related to the location on the device
Cons:
– Only offline
– Only 1D reconfiguration
![Page 22: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/22.jpg)
BITPOS
[Yana E. Krasteva, Eduardo de la Torre, Teresa Riesgo and Didier Joly, “Virtex II FPGA Bitstream Manipulation: Application to Reconfiguration Control Systems”, 2006 International Conference on Field Programmable Logic and Applications, August 2006.]
Features:
– Extracts an area from a configuration file
– Generates the new relocated bitstream
Cons:
– Only offline
– Only Virtex II, Virtex II Pro [1D]
![Page 23: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/23.jpg)
BAnMaT
[D. Deori, “BAnMaT: un Framework per l’Analisi e la Manipolazione di un Bitstream Orientato alla Riconfigurabilita Parziale”, DEI, Milano, Politecnico di Milano, 2006]
Features:
– Bitstream correctness check
– Performs modifications on a configuration bitstream
– Allows the synthesis process from VHDL to be bypassed
Cons:
– Only offline manipulation
![Page 24: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/24.jpg)
REPLICA
[H. Kalte, G. Lee, M. Porrmann and U. Rückert, “REPLICA: A Bitstream Manipulation Filter for Module Relocation in Partial Reconfigurable Systems”, The 12th Reconfigurable Architectures Workshop (RAW 2005), 2005.]
Features:
– Hardware filter that exploits relocation
– Necessary manipulations are performed during the download process
– Relocation hiding
Cons:
– Only for externally reconfigurable systems
– Only 1D relocation
– Maximum frequency of 50 MHz
![Page 25: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/25.jpg)
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
– Polaris
– Target Architecture
– Proposed Relocation Solutions
Results
Concluding Remarks and Future Work
![Page 26: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/26.jpg)
Polaris: Motivations
Complete workflow to generate a self dynamically reconfigurable architecture that:
– Supports 1D and 2D reconfiguration
– Has “good” area constraints for cores
– Performs runtime task placement decisions
– Exploits fast internal core relocation
Starting from the specification of:
– Target application
– Target device info
– Reconfiguration model
– Communication infrastructure
![Page 27: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/27.jpg)
Polaris Overview
Workflow to manage allocation and relocation of tasks in self dynamically reconfigurable architectures
Final goal: complete architecture (bitstreams and code) generation
![Page 28: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/28.jpg)
Target Architecture: YaRA
![Page 29: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/29.jpg)
PPC Based YaRA
[Figure: PPC-based YaRA architecture, with the static area labelled]
![Page 30: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/30.jpg)
Proposed Relocation Solutions
Runtime support for self dynamic 1D and 2D reconfiguration
– Xilinx Virtex, Virtex-E, Virtex2pro [1D]
– Xilinx Virtex-4 and Virtex-5 [2D]
Relocation, different solutions:
– Software:
• BAnMaT Lite
– Hardware:
• BiRF [1D]
• BiRF Square [2D]
![Page 31: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/31.jpg)
Configuration Bitstream
![Page 32: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/32.jpg)
BiRF & BiRF Square Block Diagram
![Page 33: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/33.jpg)
The Parser
![Page 34: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/34.jpg)
CRC Calculation
Particular CRC value, used by Xilinx tools
Two versions of BiRF and BiRF Square:
– Using the “predefined” values
– With actual CRC calculation
CRC polynomials:
$x^{16} + x^{15} + x^{2} + 1$ [1D]
$x^{32} + x^{28} + x^{27} + x^{26} + x^{25} + x^{23} + x^{22} + x^{20} + x^{19} + x^{18} + x^{14} + x^{13} + x^{11} + x^{10} + x^{9} + x^{8} + x^{6} + 1$ [2D]
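As an illustration of the "actual CRC calculation" option, the sketch below turns the two polynomials into bit masks and runs a plain bit-serial, MSB-first CRC over a byte string. It is a generic textbook CRC, not the exact procedure of the Xilinx tools or of BiRF: the initial value of zero, the bit ordering and the choice of which configuration words are fed to the CRC are all assumptions, and the function names are hypothetical.

```python
def poly_from_exponents(exponents):
    """Return (width, mask) for a polynomial given by its exponents.

    The leading term x^width is implicit and is left out of the mask.
    """
    width = max(exponents)
    mask = 0
    for e in exponents:
        if e != width:
            mask |= 1 << e
    return width, mask


def crc_bitwise(data: bytes, width: int, poly: int, init: int = 0) -> int:
    """Bit-serial CRC, most significant bit first, no final XOR."""
    reg_mask = (1 << width) - 1
    top_bit = 1 << (width - 1)
    reg = init & reg_mask
    for byte in data:
        reg ^= byte << (width - 8)      # bring the next byte into the top
        for _ in range(8):
            if reg & top_bit:
                reg = ((reg << 1) ^ poly) & reg_mask
            else:
                reg = (reg << 1) & reg_mask
    return reg


# Polynomials from the slide: 16-bit for 1D devices, 32-bit for 2D devices.
CRC16_1D = poly_from_exponents([16, 15, 2, 0])
CRC32_2D = poly_from_exponents(
    [32, 28, 27, 26, 25, 23, 22, 20, 19, 18, 14, 13, 11, 10, 9, 8, 6, 0])

if __name__ == "__main__":
    width, poly = CRC16_1D
    print(hex(poly), hex(crc_bitwise(b"example frame data", width, poly)))
```

A relocation filter that rewrites frame addresses changes the data covered by the CRC, which is why the checksum either has to be recomputed along these lines or replaced by a predefined value.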
![Page 35: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/35.jpg)
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
– Synthesis Results
– Relocation Solutions Results
Concluding Remarks and Future Work
![Page 36: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/36.jpg)
Results
Relocation solutions:
– Small area usage (slide 37)
– High time performance (slide 38)
Relocation results:
– Internal memory saving (slides 39–40)
– Time saving (slides 41–44)
![Page 37: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/37.jpg)
Synthesis Results: Area

| FPGA family | Model | BiRF, generic version | BiRF, optimized version | BiRF Square, generic version | BiRF Square, optimized version |
|---|---|---|---|---|---|
| Virtex II Pro | vp7 | 11.6 % | 3.6 % | − | − |
| Virtex II Pro | vp20 | 5.8 % | 1.8 % | − | − |
| Virtex II Pro | vp30 | 4.2 % | 1.3 % | − | − |
| Virtex 4 | vlx40 | − | − | 2.2 % | 0.9 % |
| Virtex 4 | vlx60 | − | − | 1.5 % | 0.6 % |
| Virtex 4 | vlx100 | − | − | 0.8 % | 0.3 % |
| Virtex 5 | vlx50 | − | − | 1.1 % | 0.8 % |
| Virtex 5 | vlx85 | − | − | 0.6 % | 0.4 % |
| Virtex 5 | vlx110 | − | − | 0.5 % | 0.3 % |
![Page 38: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/38.jpg)
Synthesis Results: Time Performance
BiRF:
– On a Virtex2pro with speed grade -5
• General purpose version: max frequency of 101 MHz
• Specific version: max frequency of 136 MHz
BiRF Square:
– On a Virtex-4 with speed grade -12
• General purpose version: max frequency of 160 MHz
• Specific version: max frequency of 290 MHz
– On a Virtex-5 with speed grade -3
• General purpose version: max frequency of 226 MHz
• Specific version: max frequency of 304 MHz
![Page 39: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/39.jpg)
Relocation Solutions Results (1/2)
BiRF, BiRF Square, BAnMaT Lite
– Support relocation in a self partially and dynamically reconfigurable system (1D or 2D)
– The occupation ratio is relatively small
– The achievable frequency is more than acceptable
– Reduction of internal memory requirements
Throughput:
– BiRF: 6 MB/s
– BiRF Square: 7.3 MB/s
– BAnMaT Lite: 2.6 MB/s
![Page 40: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/40.jpg)
Relocation Solutions Results (2/2)
The total configuration file size is about 1 MB
Considering an architecture with:
– 1/3 of the area as fixed part
– 2/3 as reconfigurable part with 6 slots
Under this hypothesis:
– The size of a partial bitstream will be about 110 KB
– The relocation time will be about:
• 18 ms with BiRF
• 15 ms with BiRF Square
• 42 ms with BAnMaT Lite
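The figures on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 1,000,000 bytes), a bitstream size proportional to the occupied area, and six slots of equal size; the function and constant names are illustrative only.

```python
TOTAL_BITSTREAM_BYTES = 1_000_000   # "about 1 MB" (decimal MB assumed)
RECONFIGURABLE_FRACTION = 2 / 3     # 2/3 of the area is reconfigurable
SLOTS = 6                           # equally sized slots (assumption)

# Throughput of each relocation solution, in MB/s, as reported on the slides.
THROUGHPUT_MB_S = {"BiRF": 6.0, "BiRF Square": 7.3, "BAnMaT Lite": 2.6}


def partial_bitstream_bytes(total_bytes, fraction, slots):
    """Size of one slot's partial bitstream, assuming size scales with area."""
    return total_bytes * fraction / slots


def relocation_time_ms(size_bytes, throughput_mb_s):
    """Time to stream size_bytes through a relocator of the given throughput."""
    return size_bytes / (throughput_mb_s * 1_000_000) * 1000


size = partial_bitstream_bytes(
    TOTAL_BITSTREAM_BYTES, RECONFIGURABLE_FRACTION, SLOTS)

# The slide rounds the partial bitstream size to 110 KB before dividing.
times = {name: relocation_time_ms(110_000, rate)
         for name, rate in THROUGHPUT_MB_S.items()}

if __name__ == "__main__":
    print(f"partial bitstream: {size / 1000:.0f} KB")
    for name, ms in times.items():
        print(f"{name}: {ms:.1f} ms")
```

The computed size is roughly 111 KB, which the slide rounds to 110 KB, and the three times round to the 18 ms, 15 ms and 42 ms quoted above.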
![Page 41: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/41.jpg)
Relocation Time Results (1/4)
![Page 42: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/42.jpg)
Relocation Time Results (2/4)
FPU1: clock time 0.01 ms, required for 3.65 s (7 add, 3 sub, 10 mul, 1 square root and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
JPEG: a complete JPEG hardware compressor, compression rate 24 img (352x288)/s, required for 3 seconds (72 img 352x288)
– Feasible RR assignment: (0,0), (0,1), (6,0) and (6,1)
FPU2: clock time 0.01 ms, required for 3.13 s (6 add, 5 sub, 8 mul and 4 div)
– Feasible RR assignment: (0,0) and (6,0)
3DES: a Triple-DES 64-bit block cipher, required for 1 second, in order to process a file of 72 MB
– Feasible RR assignment: (0,0), (1,0), (3,0) and (3,1)
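The feasible assignments above can be held in a small table and searched for a conflict-free placement. This is only a sketch under strong assumptions that the slide does not state: all four cores are taken to be resident at the same time, and two cores are taken to conflict only when they share the same origin (a real overlap test would need the footprint of each core). The `assign` function is hypothetical and is not the Polaris placement policy.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]

# Feasible RR assignments, copied from the slide.
FEASIBLE: Dict[str, List[Position]] = {
    "FPU1": [(0, 0), (6, 0)],
    "JPEG": [(0, 0), (0, 1), (6, 0), (6, 1)],
    "FPU2": [(0, 0), (6, 0)],
    "3DES": [(0, 0), (1, 0), (3, 0), (3, 1)],
}


def assign(tasks: List[str],
           feasible: Dict[str, List[Position]],
           used: Optional[Dict[str, Position]] = None
           ) -> Optional[Dict[str, Position]]:
    """Backtracking search giving each task a distinct feasible origin."""
    if used is None:
        used = {}
    if not tasks:
        return dict(used)               # every task has been placed
    first, rest = tasks[0], tasks[1:]
    for pos in feasible[first]:
        if pos in used.values():
            continue                    # origin already taken (assumed conflict)
        used[first] = pos
        result = assign(rest, feasible, used)
        if result is not None:
            return result
        del used[first]                 # undo and try the next origin
    return None                         # no conflict-free placement exists


if __name__ == "__main__":
    print(assign(list(FEASIBLE), FEASIBLE))
```

Because relocation lets a single bitstream be moved to any of its feasible origins, a search of this kind can be run at the moment a core is requested rather than fixed at design time.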
![Page 43: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/43.jpg)
Relocation Time Results (3/4)
![Page 44: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/44.jpg)
Relocation Time Results (4/4)
![Page 45: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/45.jpg)
What’s Next…
Introduction
Relocation
State of the Art
Proposed Solutions
Results
Concluding Remarks and Future Work
![Page 46: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/46.jpg)
Concluding Remarks
Architectural support for relocation:
– Create an integrated HW/SW system to manage online relocation (1D and 2D) in a reconfigurable architecture
– Create efficient bitstream relocation solutions suitable for the target system:
• 1D – 2D
• HW – SW
Publications:
– International conferences:
• M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Core allocation and relocation management for a self dynamically reconfigurable architecture, ISVLSI 2008, IEEE Computer Society Annual Symposium on VLSI
• S. Corbetta, F. Ferrandi, M. Morandi, M. Novati, M. D. Santambrogio, D. Sciuto, Two Novel Approaches to Online Partial Bitstream Relocation in a Dynamically Reconfigurable System, ISVLSI 2007, IEEE Computer Society Annual Symposium on VLSI
– IEEE Transactions on VLSI (second review phase):
• M. Morandi, M. Novati, M. D. Santambrogio, P. Spoletini, D. Sciuto, Internal and External Bitstream Relocation for Partial Dynamic Reconfiguration, TSVLSI, IEEE Transactions on Very Large Scale Integration Systems
![Page 47: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/47.jpg)
Future Work
Validation tool for the chosen
– Reconfiguration model
– Communication infrastructure
Simulation framework
– Monitor the reconfigurable system evolution
– Evaluate different placement policies and area constraints definitions
![Page 48: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/48.jpg)
General Information
Webpage
– www.dresd.org/?q=polaris
Mailing List
– [email protected]
Contact
– For more information regarding Polaris:
• [email protected]
– For a complete list of information on how to contact us:
• www.dresd.org/?q=contact_polaris
![Page 49: UIC Thesis Novati](https://reader033.vdocuments.site/reader033/viewer/2022052506/55703c83d8b42a611e8b4ffa/html5/thumbnails/49.jpg)
Questions