enabling multi-fpga clusters as an openmp acceleration device · year multi-fpga compiler toolkit...

59
Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device Ramon Nepomuceno Renan Sterle Guido Araujo

Upload: others

Post on 07-Sep-2020

33 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device

Ramon Nepomuceno Renan Sterle Guido Araujo

Page 2: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

  2

Motivation: Seismic Imaging

image taken from the internet:https://www.saltambiental.com.br/geofisica/apresentacao/

Page 3: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

  3

Motivation: Seismic Imaging

image taken from the internet:https://www.saltambiental.com.br/geofisica/apresentacao/

Page 4: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

  4

Motivation: Seismic Imaging

image taken from the internet:https://www.saltambiental.com.br/geofisica/apresentacao/

Page 5: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

Heterogeneous Systems

5

Page 6: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

Heterogeneous Systems

6

Page 7: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

Heterogeneous Systems

7

Page 8: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

Heterogeneous Systems

8

Page 9: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

Heterogeneous Systems

9

Page 10: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

• An FPGA is a powerful device that can speedup pipelined applications

10

Page 11: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

11

• But even a powerful FPGA cannot handle very large problems like Seismic Imaging

• Need Multi-FPGA architectures

Page 12: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

• Moreover....

• FPGA toolchains are hardware oriented and unfriendly to software designers

Page 13: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Introduction

• Moreover....

• FPGA toolchains are hardware oriented and unfriendly to software designers

• Need a simpler and transparent approach to embed FPGA accelerators into software applications

Page 14: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Proposal

How to program such systems?

• We propose using OpenMP task-based programming model for Multi-FPGA architectures. We are extending it to:

• Enable automatic bitstream/data offloading to FPGAs

• Transparently map data dependencies to IPs/boards interconnects

• Allow a seamless flow of data to/from FPGA.

14

Page 15: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Background

15

Page 16: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

OpenMP Task Directive

16

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a) } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

Page 17: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

OpenMP Task Directive

17

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a); } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

Create a pool of worker threads

Page 18: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

OpenMP Task Directive

18

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a); } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

Select the control thread

Page 19: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a) } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

OpenMP Task Directive

foo

19

Page 20: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a) } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

OpenMP Task Directive

foo

bar bar

20

Page 21: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp parallel #pragma omp single #pragma omp task depend (out:A) A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp task depend(in:A)\\ depend(out: B[i])

B[i]=bar(a) } for(int i = 0; i < 4; i ++){ #pragma omp task depend(in:B[i/2])

fun(B[i/2]); }

OpenMP Task Directive

foo

fun

bar

fun fun fun

bar

21

Page 22: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

control thread

OpenMP Task Runtime

22

Page 23: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

OpenMP Runtime

OpenMP Task Runtime

23

dependency graph

Page 24: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

worker threads

ready queue

OpenMP Task Runtime

24

Page 25: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp target device(FPGA) map(to: X) map(from: Y)

CPU FPGA

X

Y

Shared Memory

OpenMP Target Directive

25

Page 26: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp target device(FPGA) map(to: X) map(from: Y)

CPU FPGA

X

Y

Shared Memory

1

1

OpenMP Target Directive

26

Page 27: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp target device(FPGA) map(to: X) map(from: Y)

CPU FPGA

X

Y

Shared Memory

2

2

OpenMP Target Directive

27

Page 28: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp target device(FPGA) map(to: X) map(from: Y)

CPU FPGA

X

Y

Shared Memory

3

3

OpenMP Target Directive

28

Page 29: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

#pragma omp parallel #pragma omp single #pragma omp target device(GPU)\\ map(from: A) depend (out: A) nowait A = foo(); for(int i = 0; i < 2; i ++){ #pragma omp target device(GPU)\\ map(to: A) map(from: B[i]) \\ depend (in: A) depend (out: B[i]) nowait

B[i]=bar(a) } for(int i = 0; i < 4; i ++){ #pragma omp target device(GPU)\\ map(to: B[i/2]) \\ depend (in: B[i/2]) nowait

fun(B[i/2]); }

OpenMP Target Depend Directive

29

Page 30: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

But LLVM/OpenMP cannot handle FPGAs

LLVM/OpenMP FPGA Limitations:

● Does not support for bitstream offloading;● Does not support for mapping data dependencies to

IP/boards interconnects; ● Does not allow a seamless flow of data to/from

multiple-FPGA.

30

Page 31: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

The Software

31

Page 32: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

LLVMgeneratedHost Code libomptarget.so

Data GPU Plugin

DSP Plugin

X86_64 Plugin

GPU

DSP

X86_64 core

Host Devices

GPU Code

DSP Code

X86_64 Code

VC709 Code

VC709 Plugin

VC709 Board

Our modules: LLVM/OpenMP VC709 Plugin

32

fat binary

Page 33: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

The Hardware

Single Node

33

Page 34: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

The FPGA

● Field Programmable Gate Array.○ Xilinx VC709 Board

■ 2x 4GB DDR3 SODIMM memories;■ 8-lane PCIe;■ 4x SFP;

34

Page 35: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

The Board

35

4 x 10Gb/s optical links

Page 36: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

VFI

FO

DM

A IP

PC

Ie IP

256-bit @250 MHz

256-bit @250 MHz

Configuration Registers

64-bit @156.25 MHz

NET

64-bit @156.25 MHz

PC

I Exp

ress

Inte

rface

sfp+

VC709-board

Virtex-7

NET

NET

NET

sfp+

sfp+

sfp+

2x DDR3

VC709 Target Reference Design

36

Page 37: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

VFI

FO

PC

IE/D

MA

net

AX

IS-IC

IP.0

Configuration registers

netIP.1

IP.2

IP.3

AXI-L AXI-L

Virtex-7

VC709-board

sfp+

sfp+

net

net

sfp+

2x DDR3

sfp+

2x DDR3

Our modules

37

Page 38: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

Single node: How it works

Create pool of threads

38

Page 39: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

Single node: How it works

Select the control thread

39

Page 40: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

Single node: How it works

create the tasks

40

Page 41: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

Single node: How it works

+1 +2 +3 +4

41

Page 42: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Single node: How it works

+1 +2 +3 +4

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

42

Page 43: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Single node: How it works

+1 +2 +3 +4

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

43

Page 44: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Single node: How it works

+1 +2 +3 +4

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

44

Page 45: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Single node: How it works

+1 +2 +3 +4

PC

IE/D

MA

A

XIS-

IC

Virtex-7

VC709-board

+1+2+3+4

45

Page 46: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

The Hardware

Multiple Nodes

46

Page 47: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple Nodes

47

Page 48: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multi-Nodes

48

• Need to forward dependency data across optical links○ Designed a MAC Frame Handler (MFH)○ Mount and unmount a MAC frame.

Page 49: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Adding MFH

VFI

FO

PC

IE/D

MA

net

AXI-S AXI-SAXI-S

AXI-L: AXI-LITEAXI-S: AXI-STREAM

AX

IS-IC

IP.0

Configuration registers

netIP.1

IP.2

IP.3

AXI-L AXI-L

AXI-L

Virtex-7

VC709-board

sfp+

sfp+

net

net

sfp+

2x DDR3

sfp+

2x DDR3

MFH

49

Page 50: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple nodes: How it works

Virtex-7

VC709-board A

+4

+1

A-S

WT

configuration registers

MFH

Virtex-7

VC709-board B

+3

+2A-S

WT

configuration registers

MFH

NODE-A NODE-B

50

Page 51: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple nodes: How it works

Virtex-7

VC709-board A

+4

+1

A-S

WT

configuration registers

MFH

Virtex-7

VC709-board B

+3

+2A-S

WT

configuration registers

MFH

NODE-A NODE-B

51

Page 52: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple nodes: How it works

Virtex-7

VC709-board A

+4

+1

A-S

WT

configuration registers

MFH

Virtex-7

VC709-board B

+3

+2A-S

WT

configuration registers

MFH

NODE-A NODE-B

52

Page 53: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple nodes: How it works

Virtex-7

VC709-board A

+4

+1

A-S

WT

configuration registers

MFH

Virtex-7

VC709-board B

+3

+2A-S

WT

configuration registers

MFH

NODE-A NODE-B

53

Page 54: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Multiple nodes: How it works

Virtex-7

VC709-board A

+4

+1

A-S

WT

configuration registers

MFH

Virtex-7

VC709-board B

+3

+2A-S

WT

configuration registers

MFH

NODE-A NODE-B

54

Page 55: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Ongoing Experiments

VC709-board A

laplace 3

laplace 0

A-S

WT

configuration registers

MFH

NODE-A

laplace 1

laplace 2

VC709-board A

laplace 3

laplace 0

A-S

WT

configuration registers

MFH

NODE-B

laplace 1

laplace 2

Page 56: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Related Works

56

Page 57: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Year Multi-FPGA Compiler ToolKit Target System

Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan

Cabrera 2009 no Mercurium, GCC, SGI RASClib SGI RASC 2.2 Board-Virtex 4

Cilardo 2013 no Custom OpenMP, Xilinx EDK Xilinx EDK Boards

Choi 2013 no LLVM GCC Altera Board with Stratix IV

Filgueras 2014 no Mercurium, Nanos++ Xilinx Board with Zynq-700

Podobas 2014 no Custom C89 Compiler Altera ED5 Stratix V

Sommer 2017 no Clang, LLVM, libomptarget, TPC Xilinx VC709 Board

Ceissler(FCCM) 2018 no Clang, LLVM, libomptarget Intel HARP2 and Amazon AWS

Bosh 2018 no Mercurium, Nanos++ Zynq Ultrascale+

Knaust 2019 no Clang LLVM Intel Board Arria 10 GX

Our work 2021 yes Clang LLVM, libomptarget Cluster of Xilinx VC709 Boards

OpenMP for FPGAs

57

Page 58: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Year Multi-FPGA Compiler ToolKit Target System

Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan

Cabrera 2009 no Mercurium, GCC, SGI RASClib SGI RASC 2.2 Board-Virtex 4

Cilardo 2013 no Custom OpenMP, Xilinx EDK Xilinx EDK Boards

Choi 2013 no LLVM GCC Altera Board with Stratix IV

Filgueras 2014 no Mercurium, Nanos++ Xilinx Board with Zynq-700

Podobas 2014 no Custom C89 Compiler Altera ED5 Stratix V

Sommer 2017 no Clang, LLVM, libomptarget, TPC Xilinx VC709 Board

Ceissler(FCCM) 2018 no Clang, LLVM, libomptarget Intel HARP2 and Amazon AWS

Bosh 2018 no Mercurium, Nanos++ Zynq Ultrascale+

Knaust 2019 no Clang LLVM Intel Board Arria 10 GX

Our work 2021 yes Clang LLVM, libomptarget Cluster of Xilinx VC709 Boards

OpenMP for FPGAs

58

Page 59: Enabling Multi-FPGA Clusters as an OpenMP Acceleration Device · Year Multi-FPGA Compiler ToolKit Target System Leow 2006 no C-Breeze Celoxica RC100 Board-Spartan Cabrera 2009 no

 

Q&A

Thank you!

59

email: [email protected]