![Page 1: A Flexible Interconnection Structure for Reconfigurable FPGA Dataflow Applications Gianluca Durelli, Alessandro A. Nacci, Riccardo Cattaneo, Christian](https://reader036.vdocuments.site/reader036/viewer/2022062518/56649e4d5503460f94b43a0c/html5/thumbnails/1.jpg)
A Flexible Interconnection Structure
for Reconfigurable FPGA Dataflow Applications
Gianluca Durelli, Alessandro A. Nacci, Riccardo Cattaneo, Christian Pilato, Donatella Sciuto and Marco Domenico Santambrogio
Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria
Milano, IT
[durelli, nacci, rcattaneo, pilato, sciuto]@[email protected]
20th Reconfigurable Architectures Workshop May 20-21, 2013, Boston, USA
Rationale
• Strive for performance in computing intensive applications
• Reconfigurable HW well suited for certain classes of applications
– Multimedia, computational biology, physical simulation
• FPGA used in HPC systems
• High maintenance costs
– Need to share resources among users
• Need to dynamically share and reuse components on FPGA among different users
Outline
• Goals
• State of the Art
• Proposed Solution
• Design and Evaluation
• Case Study
• Conclusions and Future work
Goals
• Design an interconnection able to:
– Create different pipelines reusing available components on the FPGA
– Share the resources between different applications
– Not insert any stall in the pipeline
• Target FPGA for HPC scenario
State of the Art
• BUS interconnection
– Congestion problem
– Does not scale
• Network on Chip
– Possible congestion problem
– Good scalability
• Both introduce unexpected delays in computation
– Can’t assure performance when sharing the device between different users
Proposed Solution
• Switch-based interconnection
– Core inputs connected to interconnection outputs
– Core outputs connected to interconnection inputs
– Fully pipelined point-to-point communication
• Data read/written only when all the inputs are available
• Can be configured by setting, for each input and output channel:
– Switching configuration:
  • Multiplexer configuration to route information
– From which clock cycle the channel is active
– How much data has to be read/written through that channel
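The three per-channel parameters listed above can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions, not the authors' hardware design: `ChannelConfig`, `simulate`, and the one-item-per-cycle stream model are hypothetical names and choices introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChannelConfig:
    """Hypothetical settings for one interconnection output channel."""
    source: int       # multiplexer select: which interconnection input is routed here
    start_cycle: int  # first clock cycle in which the channel is active
    count: int        # number of data items to transfer through the channel


def simulate(inputs, configs, cycles):
    """Behavioral model: an active channel forwards one item per clock cycle.

    inputs[i] is the item sequence available on interconnection input i.
    Returns one list of forwarded items per configured output channel.
    """
    outputs = [[] for _ in configs]
    for t in range(cycles):
        for out, cfg in zip(outputs, configs):
            k = t - cfg.start_cycle  # index of the item due in this cycle
            if 0 <= k < cfg.count:
                out.append(inputs[cfg.source][k])
    return outputs


inputs = [[1, 2, 3, 4], [10, 20, 30, 40]]
configs = [
    ChannelConfig(source=1, start_cycle=0, count=4),
    ChannelConfig(source=0, start_cycle=2, count=2),
]
print(simulate(inputs, configs, cycles=6))  # [[10, 20, 30, 40], [1, 2]]
```

In this model the second channel stays idle for two cycles and then forwards exactly two items, which is how the start cycle and the data count together bound a channel's activity.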
Proposed Solution
• Suited for Dataflow/Pipelined applications
• Parameters can be extracted from a high-level description of the application and pipeline structure:
– Possibility to automate the parameter extraction and interconnection design
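As a rough illustration of what such an extraction could look like, the sketch below derives per-channel parameters for a linear pipeline. Everything in it is an assumption made for illustration: the function `extract_parameters`, the port numbering (port 0 is the host, each core has its own port), the core latencies, and the rule that a channel becomes active once the cumulative latency of the upstream cores has elapsed. The slides do not describe the actual extraction procedure.

```python
def extract_parameters(pipeline, cores, n_items):
    """Derive hypothetical channel settings for a linear pipeline.

    pipeline: ordered list of core names, e.g. ["GS", "ED"]
    cores:    name -> (port index, latency in cycles); both values are assumed
    n_items:  number of data items streamed through the pipeline
    Returns {destination port: {"source", "start_cycle", "count"}}.
    """
    configs = {}
    source, start = 0, 0  # the stream enters from the host (port 0) at cycle 0
    for name in pipeline:
        port, latency = cores[name]
        # Feed this core from the previous stage once its first result is ready.
        configs[port] = {"source": source, "start_cycle": start, "count": n_items}
        source, start = port, start + latency
    # The last core's output is routed back to the host.
    configs[0] = {"source": source, "start_cycle": start, "count": n_items}
    return configs


# Latencies below are made-up numbers, used only to show the arithmetic.
cores = {"GS": (1, 3), "GB": (2, 5), "ED": (3, 4)}
for port, cfg in extract_parameters(["GS", "ED"], cores, n_items=100).items():
    print(port, cfg)
```

For the pipeline GS then ED, the ED input channel is fed from port 1 starting at cycle 3, and the host output is fed from port 3 starting at cycle 7.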
Implementation
• Solution implemented with HLS:
– HLS well suited for dataflow/stencil loop synthesis
– Simplify HW development
– Generation of compatible interfaces
• Maxeler Technologies:
– HPC Dataflow computing exploiting FPGA
– Proprietary HLS starting from Java-like description:
  • Proposed interconnection solution easily described in Java
• MaxWorkstation 3A:
– Intel i7 quad-core
– Xilinx Virtex6 XC6VSX475T
– PCIe communication:
  • Maximum 8 channels/streams
Evaluation: Area Occupation
• Area increment (10-30%) due to increase in switching logic
• The interconnection consumes up to 6% of the FPGA:
– Most of the device remains available for user cores
Evaluation: Frequency
• Tested with pass-through cores to evaluate the maximum working frequency of the interconnection (300 MHz)
• In real-life applications (Brain network with cores working at 200 MHz) the interconnection does not affect the critical path
Case Study
• Application:
– Image processing pipeline (up to 4 stages):
  • Gray scale (GS), Gaussian blur (GB), Edge detection (ED) filters
  • Their combinations
• Tested architectures: (A), (B), (C), (D) [figure not reproduced]
• Experiments:
– Single execution of an N-stage pipeline
– Batch execution of a workload of 100 random applications
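A batch workload of this kind could be generated along the following lines. This is a hedged sketch only: the slides do not say how the 100 random applications were drawn, so the uniform choice of pipeline length and filters, the fixed seed, and the name `random_workload` are all assumptions.

```python
import random

FILTERS = ("GS", "GB", "ED")  # gray scale, Gaussian blur, edge detection


def random_workload(n_apps=100, max_stages=4, seed=0):
    """Return n_apps random pipelines, each a list of 1..max_stages filter names."""
    rng = random.Random(seed)  # private generator, so results are reproducible
    return [
        [rng.choice(FILTERS) for _ in range(rng.randint(1, max_stages))]
        for _ in range(n_apps)
    ]


workload = random_workload()
print(len(workload), workload[:3])
```

Seeding the generator keeps the workload identical across the compared architectures, so differences in execution time come from the architecture and not from the sampled applications.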
Case Study: Single execution
[Figures: plots for architectures (A), (B), (C), (D), spread over two slides; not reproduced]
Case Study: Batch execution
• Proposed solution (D) does not introduce overhead in the overall execution time w.r.t. the other two architectures
• Low system load:
– Up to 30% reduction in the overall workload execution time
Case Study: Batch execution
• Low system load (1-2 applications):
– Proposed solution (D) does not introduce delays in the execution of a single application of the workload
• Higher system loads (more than 2 applications):
– 10%-30% reduction in single application execution time
Conclusions and Future work
• Conclusion:
– Design of an interconnection to support HW resource sharing in multi-application scenarios
– Solution suited for dataflow/pipelined systems
– Possibility to realize different pipeline configurations at run-time
• Future work:
– Design of a mapping/reconfiguration strategy to allocate user cores and configure new core instances at run-time