1
Hardwired networks on chip for FPGAs
Kees Goossens (TUD, NXP), Muhammad Aqeel Wahlah (TUD)
Kees Goossens 2009-06-02 Tubs.CITY
2
overview
applications
network on chip
FPGA
key ideas
  – hardwired NOC
  – unified interconnect
  – data coercion / type casting
dynamic partial reconfiguration
  – multiple applications
  – multiplex sub-applications (“hardware tasks”)
example
conclusions
3
applications
[figure: example task graphs with tasks T1, T2, T3; C1, C2, C3; A1, A2, BA; and BAC]
task / function mapped on IP
  – includes storage / buffering
application: set of communicating IPs / tasks / ...
  – data, control, code
  – communication via connections
use case: set of concurrent applications
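The terminology of the applications slide (task, application, use case) can be made concrete with a minimal sketch. The class names, the `kind` field, and the T1 to T2 to T3 topology are assumptions for illustration only; the slide does not give the actual task graphs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    """A directed communication channel between two tasks."""
    src: str
    dst: str
    kind: str = "data"  # the slide lists data, control, code


@dataclass
class Application:
    """A set of communicating tasks plus the connections between them."""
    name: str
    tasks: set = field(default_factory=set)
    connections: list = field(default_factory=list)

    def connect(self, src, dst, kind="data"):
        # Connecting two tasks also registers them with the application.
        self.tasks.update({src, dst})
        self.connections.append(Connection(src, dst, kind))


@dataclass
class UseCase:
    """A set of applications that run concurrently."""
    applications: list

    def tasks(self):
        # All tasks that must be mapped at the same time.
        return set().union(*(app.tasks for app in self.applications))


# Hypothetical topology: the slide only names the tasks.
app_t = Application("T")
app_t.connect("T1", "T2")
app_t.connect("T2", "T3")
app_a = Application("A")
app_a.connect("A1", "A2")
use_case = UseCase([app_t, app_a])
```

A use case is then simply the union of what its applications need, which is the quantity a mapping onto IPs has to accommodate.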
4
network on chip (NOC)
connects ports on hardware blocks (IP)
  – data, control
connections: virtual wires
programmable at run-time
  – set up & destroy connections by programming control registers in the NOC
styles of communication
  – address-based / memory-mapped
  – streaming
real-time / quality of service
[figure: NOC of routers (R) and network interfaces (NI) connecting IPs, labelled with tasks T1, T2, T3, BAC, A1, A2, BA]
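Setting up and destroying connections by programming control registers, as described on the NOC slide, could look roughly like the toy model below. The register layout (one entry per local channel holding a remote NI and channel) and all names are assumptions; the slide does not describe the real register map.

```python
class NocModel:
    """Toy NOC: every network interface (NI) has a small register file that
    maps a local channel id to a (remote NI, remote channel) pair."""

    def __init__(self, ni_names):
        self.regs = {ni: {} for ni in ni_names}

    def write_reg(self, ni, channel, value):
        # Writing None clears the register, which disables the channel.
        if value is None:
            self.regs[ni].pop(channel, None)
        else:
            self.regs[ni][channel] = value

    def setup(self, src, src_ch, dst, dst_ch):
        """Create a virtual wire from (src, src_ch) to (dst, dst_ch)."""
        self.write_reg(src, src_ch, (dst, dst_ch))

    def destroy(self, src, src_ch):
        """Remove the virtual wire that starts at (src, src_ch)."""
        self.write_reg(src, src_ch, None)

    def route(self, src, src_ch):
        """Return the destination of a channel, or None if it is not set up."""
        return self.regs[src].get(src_ch)


noc = NocModel(["NI0", "NI1"])
noc.setup("NI0", 0, "NI1", 2)
destination = noc.route("NI0", 0)
noc.destroy("NI0", 0)
```

The point of the sketch is that a connection exists only as register contents, so changing the communication pattern is a matter of register writes rather than reconfiguring fabric.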
5
FPGA fabric
[figure: FPGA with regions of LUTs, hard IP (IO processor, CPU, two on-chip memories, off-chip memory, de/encrypt accelerator), and ICAP]
soft IP are configured in
  – configurable elements (LUT)
  – and switch boxes (not shown)
with a given configuration granularity (frame) using the configuration interconnect (ICAP)
hard IP
  – CPU
  – on-chip memories (BRAM, ...)
  – off-chip memory interfaces
  – decryption IP
  – etc.
6
application on FPGA
[figure: FPGA (LUT regions, IO processor, CPU, on-chip and off-chip memory, de/encrypt accelerator, ICAP) with A1, A2, BA, BAC mapped on it, connected by a soft data interconnect and a soft control interconnect]
map application (IPs + interconnect + storage) on soft + hard IP
traditionally data and control interconnects are separate
could also use NOC for both
7
multiple applications on FPGA
[figure: FPGA (LUT regions, IO processor, CPU, on-chip and off-chip memory, de/encrypt accelerator, ICAP) with A1, A2, BA, BAC, T1, T2, T3 mapped on it, sharing a soft data interconnect and a soft control interconnect]
interconnects and IPs of different applications share reconfiguration regions (frames)
  – dynamic reconfiguration is global, not partial
  – applications interfere
9
1. hardwired interconnect
[figure: FPGA with CFRs and hard IP (IO processor, CPU, on-chip and off-chip memory, de/encrypt accelerator, ICAP) connected by hard interconnect(s); A1, A2, BA, BAC, T1, T2, T3 mapped on it]
replace soft interconnect(s) by hard interconnect(s)
interconnect regions of LUTs (CFR)
~35x smaller area
~5x higher speed
  – program, don’t configure
bit-level (CFR) vs. transaction-level (NOC) reconfigurability
  – memory mapped
  – streaming
10
1. hardwired interconnect
[figure: FPGA with CFRs and hard IP connected by hard interconnect(s); BAC, T1, T2, T3, C1, C2, C3 mapped on it]
dynamic partial reconfiguration
  – no constraints on soft IP placement
loss of flexibility
  – fewer LUTs
11
2. unified interconnect
[figure: FPGA with CFRs and hard IP connected by a single hard interconnect; A1, A2, BA, BAC, T1, T2, T3 mapped on it]
one interconnect (e.g. NOC) for
  – data for functional mode
  – control for programming
  – bitstreams for configuration
dynamic partitioning of different interconnects
12
3. data coercion
[figure: FPGA with CFRs and hard IP on a single hard interconnect; connections labelled bitstream and data]
data = control = bitstream = …
connect a data port to a configuration port
  – decrypt bitstreams
13
3. data coercion
[figure: FPGA with CFRs and hard IP on a single hard interconnect; blocks labelled PH and IP; connection labelled bitstream]
data = control = bitstream
connect a data port to a configuration port
  – decrypt bitstreams
  – run-time compute / optimise bitstreams
    • JIT, peephole
14
3. data coercion
[figure: FPGA with CFRs and hard IP on a single hard interconnect; blocks labelled TR, TV and DUT; connections labelled data and test data]
data = control = bitstream = test
connect a data port to a configuration port
  – decrypt bitstreams
  – run-time compute / optimise bitstreams
connect a data port to a test port
  – run-time structural test
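The data coercion idea, where the interconnect moves words without caring whether they are data or a bitstream, can be illustrated with a small sketch that routes an encrypted bitstream through a decryption stage into a configuration port. Everything here is hypothetical: `connect`, `ConfigPort` and the XOR "cipher" are stand-ins chosen for brevity, not the accelerator or port interface of the talk.

```python
def connect(*stages):
    """Chain stages into one path. The path only forwards words and never
    inspects whether they are data, control, or bitstream."""
    def run(words):
        for stage in stages:
            words = stage(words)
        return words
    return run


def xor_decrypt(key):
    """Stand-in for the de/encrypt accelerator (XOR is not real crypto)."""
    return lambda words: [w ^ key for w in words]


class ConfigPort:
    """Stand-in for a configuration port: records the words written to it."""

    def __init__(self):
        self.frames = []

    def write(self, words):
        self.frames.extend(words)
        return words


KEY = 0x5A  # assumed key, for illustration only
encrypted_bitstream = [w ^ KEY for w in (0x01, 0x02, 0x03)]

config_port = ConfigPort()
# The accelerator's data output is connected to the configuration port.
load = connect(xor_decrypt(KEY), config_port.write)
load(encrypted_bitstream)
```

Because the path is untyped, replacing `config_port.write` with a test port or an ordinary memory would need no change to `connect`, which is the sense in which data, control, bitstream and test data are treated alike.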
16
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
  – independent applications on own virtual platform
    • no communication, no interference
  – activation given by user, environment, etc.
[figure: applications scheduled over time; labels include app T (T1, T2, T3), app D, A, app AC, BAC, C1, C2, C3, A1, A2, BA]
17
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
2. parts of single applications (soft IP, “hardware tasks”)
  – multiplex resources of a single application
[figure: parts of an application scheduled over time; labels include app T, app D, A, C, BAC, C1, C2, C3, A1, A2, BA]
18
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
2. parts of single applications (soft IP, “hardware tasks”)
  – multiplex resources of a single application
  – internal state
[figure: parts of an application scheduled over time, with state carried between them; labels include app T, app D, A, C, BAC, C1, C2, C3, A1, A2, BA]
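Multiplexing hardware tasks on the same resources means the internal state of a swapped-out task has to survive until it is configured again. A minimal sketch of that bookkeeping follows; the `Region` class and the use of a simple counter as "state" are assumptions made for illustration.

```python
class Region:
    """One reconfigurable region that holds at most one hardware task."""

    def __init__(self):
        self.loaded = None  # name of the task currently configured
        self.state = None   # internal state of that task
        self.saved = {}     # saved state of swapped-out tasks
        self.log = []       # configuration events, in order

    def switch_to(self, task):
        if self.loaded == task:
            return  # already configured, nothing to do
        if self.loaded is not None:
            # Save the outgoing task's state before it is overwritten.
            self.saved[self.loaded] = self.state
        self.log.append(f"configure {task}")
        self.loaded = task
        # Restore earlier state if there is any, otherwise start from 0.
        self.state = self.saved.pop(task, 0)

    def run(self, task, steps):
        self.switch_to(task)
        self.state += steps
        return self.state


region = Region()
region.run("A", 3)           # A advances its state to 3
region.run("C", 5)           # A is swapped out, C runs
result = region.run("A", 2)  # A resumes from its saved state
```

Without the `saved` dictionary, the second activation of A would start from scratch, which is the problem the "internal state" bullet points at.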
19
dynamic partial reconfiguration
1. system manager
  – resource management (CFR, NOC, …)
    • inter-application virtual platforms
[figure: timeline with a system manager and two application managers; labels A, C, BAC, T]
20
dynamic partial reconfiguration
1. system manager
  – resource management (CFR, NOC, …)
    • inter-application virtual platforms
    • intra-application phases
  – NOC programming
  – soft IP / (sub)-application configuration
[figure: timeline with a system manager and an application manager; labels A, C, BAC]
21
dynamic partial reconfiguration
1. system manager
2. application manager
  – application programming
[figure: timeline with a system manager and two application managers; labels A, C, BAC, T]
22
dynamic partial reconfiguration
1. system manager
2. application manager
  – application programming
  – intra-application persistent data management
[figure: timeline with a system manager and an application manager; labels A, C, BAC, C1, C2, C3, A1, A2, BA, state]
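The system manager's resource management role, giving each application its own virtual platform so that applications do not interfere, might be sketched as a simple allocator of disjoint region sets. The class below, the region names, and the first-come admission policy are all assumptions; the slides do not specify how resources are allocated.

```python
class SystemManager:
    """Hands out disjoint sets of regions so applications cannot interfere."""

    def __init__(self, regions):
        self.free = list(regions)
        self.owned = {}  # application name -> regions it currently holds

    def admit(self, app, count):
        """Reserve `count` regions for `app`, or return None if impossible."""
        if count > len(self.free):
            return None  # not enough free resources: reject the application
        self.owned[app] = [self.free.pop(0) for _ in range(count)]
        return self.owned[app]

    def release(self, app):
        """Return all regions held by `app` to the free pool."""
        self.free.extend(self.owned.pop(app, []))


mgr = SystemManager(["CFR0", "CFR1", "CFR2", "CFR3"])
platform_t = mgr.admit("T", 3)
rejected = mgr.admit("AC", 2)     # only one region is free at this point
mgr.release("T")
platform_ac = mgr.admit("AC", 2)  # succeeds once T has stopped
```

An application manager would then work only inside the regions it was granted, which is one way to read the split between system and application managers.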
24
modelling
SystemC
  – bit & cycle accurate NOC model
  – behavioural CFR models
  – accurate bitstream structure
  – behavioural hard IP models
model
  – starting / stopping of applications
    • dynamic, based on user input
  – starting / stopping of sub-applications
    • dynamic, based on flow of data
  – configuration: loading of bitstreams for soft IP; clock & reset
  – programming: of NOC, system & sub-application managers
  – management of persistent state
25
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager]
system manager
  – program NOC for configuration
26
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager; connection legend: bitstream, programming, data]
system manager
  – program NOC for configuration
  – configure: load bitstreams
    • including bitstream syntax, etc.
27
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager; connection legend: bitstream, programming, data]
system manager
  – program NOC for configuration
  – configure: load bitstreams
  – program NOC for (sub)-application A
28
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager; connection legend: bitstream, programming, data]
system manager
  – program NOC for configuration
  – configure: load bitstreams
  – program NOC for (sub)-application A
  – program & start application manager
    • including clocking & reset
29
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager; connection legend: bitstream, programming, data]
system manager
  – program NOC for configuration
  – configure: load bitstreams
  – program NOC for (sub)-application A
  – program & start application manager
application manager
  – programs & starts sub-app A
    • soft IP function is modelled by CFR
30
example
[figure: FPGA with CFRs and hard IP on a single hard interconnect; A1, A2, BA, BAC mapped on it; system manager and application manager; connection legend: bitstream, programming, data]
system manager
  – program NOC for configuration
  – configure: load bitstreams
  – program NOC for (sub)-application A
  – program & start application manager
application manager
  – programs & starts sub-app A
sub-application A runs
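The start-up sequence built up over the example slides can be written down as an ordered trace. This is only a sketch of the ordering: the `Platform` class, its method names, and the choice of A1, A2 and BA as the soft IPs of sub-application A are assumptions, and nothing here reflects the actual SystemC model.

```python
class Platform:
    """Records the actions performed on the platform, in order."""

    def __init__(self):
        self.trace = []

    def program_noc(self, purpose):
        self.trace.append(f"program NOC for {purpose}")

    def load_bitstream(self, soft_ip):
        self.trace.append(f"configure {soft_ip}")

    def start(self, what):
        self.trace.append(f"start {what}")


def application_manager(platform, sub_app):
    # The application manager programs and starts its own sub-application.
    platform.trace.append(f"program sub-application {sub_app}")
    platform.start(f"sub-application {sub_app}")


def system_manager(platform, sub_app, soft_ips):
    # 1. Set up the connections that will carry the bitstreams.
    platform.program_noc("configuration")
    # 2. Configure: load one bitstream per soft IP.
    for soft_ip in soft_ips:
        platform.load_bitstream(soft_ip)
    # 3. Set up the functional connections of the sub-application.
    platform.program_noc(f"sub-application {sub_app}")
    # 4. Program and start the application manager, which takes over.
    platform.start("application manager")
    application_manager(platform, sub_app)


platform = Platform()
system_manager(platform, "A", ["A1", "A2", "BA"])  # hypothetical soft IP set
```

The NOC is programmed twice in this ordering: once to carry configuration traffic and once for the functional connections of sub-application A.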
31
conclusions
ideas:
  – hardwire NOC
  – unified interconnects
  – data coercion / type casting
very detailed model
many simplifications & restrictions
many open issues
  – design flow: soft IP placement, binding, relocation, etc.
  – application model:
    • extend use-case model with intra-application dynamism
    • more general notions of persistent state
  – implementation: separation of system & application managers