2020/04/02
The Open Source Way
Programming FPGAs
Ahmed Sanaullah, Senior Data Scientist, Office of the CTO
Ulrich Drepper, Distinguished Engineer, Office of the CTO
Hugh Brock, Research Director, Office of the CTO
Productivity for FPGAs: A Simplified Model
Simply put, being productive means getting all required functions on the FPGA with low effort and high performance
Stages Affecting Productivity in an FPGA Toolchain
[Diagram: FPGA toolchain stages]
Inputs: High Level Language Code, Verilog/VHDL Code, IP Block Library, Custom HDL Library
Tools: High Level Synthesis Compiler, Synthesis, Logic Optimizer, RTL Simulation, Place and Route, Programmer, FPGA Database (Layout)
Intermediate files: Netlist File, Bitstream File
Target and runtime: FPGA, Software Runtime, RHOS / Shell
FPGA Toolchains have largely been proprietary -> Reduced productivity
Reduced Productivity due to Proprietary Tooling
Lack of Customizability
Cost of Individual Licenses
Rigidity of Algorithms
Security
Overview of Some Open Source Efforts
High Level Language Code -> Verilog/VHDL Code -> Netlist File -> Bitstream File -> FPGA
BU RH Collab (HLS Compiler)
Open Cores (Custom HDL Library)
Verilator, Icarus Verilog (Simulation)
Yosys (Synthesis)
Berkeley ABC (Logic Optimizer)
Nextpnr (Place & Route)
Project Icestorm, Trellis, Xray, RapidWright (FPGA Database)
OpenOCD (Programmer)
OPAE (Software Runtime)
Morpheus (RHOS)
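The stage-to-tool mapping on this slide can be written down as a small lookup table. The following is only a sketch: the tool and stage names come from the slide, while the ordering of the main flow, the variable names, and the helper functions `tools_for` and `describe_flow` are assumptions made for illustration.

```python
# Main HDL-to-FPGA flow, paired with the open source tool the slide names
# for each stage. The ordering is an assumption read off the slide layout.
MAIN_FLOW = [
    ("synthesis", ["Yosys"]),
    ("logic optimizer", ["Berkeley ABC"]),
    ("place and route", ["Nextpnr"]),
    ("programmer", ["OpenOCD"]),
]

# Components that sit beside the main flow on the slide.
SUPPORTING = {
    "hls compiler": ["BU RH Collab"],
    "custom hdl library": ["Open Cores"],
    "simulation": ["Verilator", "Icarus Verilog"],
    "fpga database": ["Project Icestorm", "Trellis", "Xray", "RapidWright"],
    "software runtime": ["OPAE"],
    "rhos": ["Morpheus"],
}


def tools_for(stage):
    """Return the open source tools listed for a stage (hypothetical helper)."""
    key = stage.lower()
    for name, tools in MAIN_FLOW:
        if name == key:
            return tools
    return SUPPORTING.get(key, [])


def describe_flow():
    """Render the main flow as one line, from HDL source to the device."""
    steps = ["Verilog/VHDL code"]
    steps += [tools[0] for _, tools in MAIN_FLOW]
    steps.append("FPGA")
    return " -> ".join(steps)


if __name__ == "__main__":
    print(describe_flow())
    print(tools_for("Simulation"))
```

Keeping the main flow as an ordered list and the side components as a dictionary mirrors the slide, where only the central column has a direction.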
Example: Hacking the Intel OpenCL SDK for FPGAs
Out-of-box OpenCL performance is really bad!
Using documented best practices
Experiments for nearest neighbor computations
6x - 64x worse performance than Verilog
Up to 7x more resource usage than Verilog
Yang, Chen, et al. "OpenCL for HPC with FPGAs: Case study in molecular electrostatics." 2017 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2017.
https://llvm-hpc3-workshop.github.io/slides/Denisenko.pdf
[Diagram: compilation flow. Phases: Once Per Compiler; For Every Compilation. Components: Preprocessor, Code Transform, Static Profiler, Front-End Compiler, Full Compiler. Artifacts: Probes, IR, Report, App HLL, HDL.]
Sanaullah, A. (2019). Towards hardware as a reconfigurable, elastic, and specialized service (Doctoral dissertation).
Performance Evaluation for Packet Processing Workloads
Bojie Li et al. "Flexible and High Performance Network Processing with Reconfigurable Hardware." In Proceedings of the 2016 ACM SIGCOMM Conference, pages 1-14. ACM, 2016.
[Figures: AES-256 Comparison; SHA-1 Comparison]
Performance Evaluation for Parallel Computing Dwarfs
Sanaullah, Ahmed, Rushi Patel, and Martin Herbordt. "An Empirically Guided Optimization Framework for FPGA OpenCL." 2018 International Conference on Field-Programmable Technology (FPT). IEEE, 2018.
Thank you