High-Level Interconnect Architectures for FPGAs Nick Barrow-Williams

Post on 13-Jan-2016

Page 1: High-Level Interconnect Architectures for FPGAs

High-Level Interconnect Architectures for FPGAs

Nick Barrow-Williams

Page 2: High-Level Interconnect Architectures for FPGAs

Introduction

Continued shrinking of device dimensions introduces new design challenges

Moving data around a chip can now be the limiting factor of performance

Existing interconnection solutions do not scale well

Page 3: High-Level Interconnect Architectures for FPGAs

Why do existing solutions not scale?

Global connections are longer

Wire depth increased to counter width decrease

Parasitic capacitive effects increase and cause slow signal propagation

Page 4: High-Level Interconnect Architectures for FPGAs

Why do existing solutions not scale?

Existing system-level connection uses buses

Buses increase resource efficiency and decrease wiring congestion

Not suitable for a large number of modules

A network based alternative would offer higher aggregate bandwidth

Page 5: High-Level Interconnect Architectures for FPGAs

Why design for FPGA systems?

FPGA silicon area already dominated by wiring

Global wires are limited in number

Increasing gate count only increases wiring congestion

Page 6: High-Level Interconnect Architectures for FPGAs

The Solution: Network-on-Chip

Use technologies from network systems

Replace inefficient global wiring with high-level interconnection network

Create scalable systems to handle large numbers of modules

Page 7: High-Level Interconnect Architectures for FPGAs

Existing Solutions

Most existing systems are for ASIC designs: Stanford Interconnect, RAW, SCALE, SPIN

PNoC: a solution for FPGAs, but complex and with a high hardware cost

Other simulated solutions exist but few are implemented

Page 8: High-Level Interconnect Architectures for FPGAs

Proposal: Two network systems

Existing solutions use either packet switching or circuit switching techniques

Design, implement, test and synthesise one of each to compare performance and hardware cost

Map solutions to an FPGA platform to evaluate hardware cost in current generation systems

Page 9: High-Level Interconnect Architectures for FPGAs

Network Architecture Design

Topology requirements: simple, scalable, two-dimensional

Solution: 2D mesh topology
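A 2D mesh can be described by giving each router an (x, y) coordinate and linking it to its immediate neighbours. The sketch below is a behavioural Python model, not the project's VHDL. The function names, the port names and the choice of north as increasing y are assumptions made for illustration.

```python
def mesh_neighbours(x, y, width, height):
    """Return {port_name: (nx, ny)} for the routers adjacent to (x, y)."""
    candidates = {
        "east": (x + 1, y),
        "west": (x - 1, y),
        "north": (x, y + 1),   # assumed orientation: north means y + 1
        "south": (x, y - 1),
    }
    # Routers on an edge or corner of the mesh simply have fewer neighbours.
    return {port: (nx, ny) for port, (nx, ny) in candidates.items()
            if 0 <= nx < width and 0 <= ny < height}


def link_count(width, height):
    """Number of neighbour-to-neighbour links in a width x height mesh."""
    return width * (height - 1) + height * (width - 1)
```

The link count grows linearly with the number of routers, which is one reason a mesh is usually regarded as a scalable topology.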

Page 10: High-Level Interconnect Architectures for FPGAs

Network Architecture Design

Routing Algorithm

Deterministic: data always follows the same path through the network; simple hardware; sensitive to congestion

Adaptive: paths through the network can change according to load; complex hardware; avoids congestion

Page 11: High-Level Interconnect Architectures for FPGAs

Network Architecture Design

When choosing a routing algorithm we must avoid:

Deadlock. Solution: use unidirectional wiring and allow each node to make two connections

Livelock. Solution: use deterministic routing
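The slides do not name the deterministic algorithm that was used. A common deterministic choice for a 2D mesh is dimension-order (XY) routing, so the sketch below assumes it purely for illustration. The function name `xy_route` is hypothetical.

```python
def xy_route(src, dest):
    """Return the routers visited from src to dest, moving in X first, then Y."""
    x, y = src
    path = [(x, y)]
    # Step along the X dimension until the destination column is reached.
    while x != dest[0]:
        x += 1 if dest[0] > x else -1
        path.append((x, y))
    # Then step along the Y dimension until the destination row is reached.
    while y != dest[1]:
        y += 1 if dest[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 3), (1, 0))
print(route)           # every call with these arguments gives the same path
print(len(route) - 1)  # number of hops
```

Because the path depends only on the source and destination, every step moves the data closer to its destination, so it cannot wander indefinitely, which is the property that rules out livelock.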

Page 12: High-Level Interconnect Architectures for FPGAs

Network Architecture Design

Flow control methods

Circuit switched: a circuit request propagates through the network; a path is reserved to the destination; a grant signal propagates back; data is sent, then the circuit is deallocated

Packet switched: packets use a header, body and tail

Wormhole routing: forward header and body without waiting for the tail; buffers are needed to store stalled packets
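To make the wormhole idea concrete, here is a minimal behavioural sketch in Python. It is not the project's VHDL: the flit tuple format, the class `WormholePort` and the FIFO depth are all assumptions. A header flit claims the port, body flits follow without waiting for the tail, and the tail flit releases the port. When the buffer is full the sender has to stall.

```python
from collections import deque


def make_packet(dest, payload):
    """Split a non-empty payload into header, body and tail flits."""
    flits = [("header", dest)]
    flits += [("body", word) for word in payload[:-1]]
    flits.append(("tail", payload[-1]))
    return flits


class WormholePort:
    """One output port: claimed by a header flit, released by the tail flit."""

    def __init__(self, depth):
        self.fifo = deque()
        self.depth = depth   # assumed buffer depth, in flits
        self.owner = None    # id of the packet currently holding the port

    def full(self):
        return len(self.fifo) >= self.depth

    def accept(self, packet_id, flit):
        """Try to accept one flit; return False if the sender must stall."""
        kind, _ = flit
        if self.full():
            return False
        if kind == "header":
            if self.owner is not None:
                return False          # another packet holds the port
            self.owner = packet_id
        elif self.owner != packet_id:
            return False              # body/tail flit without a claimed port
        self.fifo.append(flit)
        if kind == "tail":
            self.owner = None         # tail releases the port
        return True
```

A second packet whose header arrives while the port is held is refused until the first packet's tail has passed, which is why stalled flits need buffer space upstream.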

Page 13: High-Level Interconnect Architectures for FPGAs

Router Design

Each router contains a number of modules:

FIFOs (only present in packet switched router)

Address to port-request decoder

Arbiter

Control finite state machines

Crossbar
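Two of the modules listed above lend themselves to a short behavioural sketch. The slides do not give their internals, so everything here is assumed for illustration: the decoder uses X-then-Y ordering with north as increasing y, and the arbiter uses a round-robin policy. Both function names are hypothetical.

```python
def decode_port(here, dest):
    """Address to port-request decoder: choose the output port for dest."""
    if dest[0] > here[0]:
        return "east"
    if dest[0] < here[0]:
        return "west"
    if dest[1] > here[1]:
        return "north"
    if dest[1] < here[1]:
        return "south"
    return "local"   # the data has arrived at its destination router


def round_robin_arbiter(requests, last_granted):
    """Grant one requesting input, searching from the one after last_granted.

    requests     -- list of booleans, one per input port
    last_granted -- index of the input that won the previous arbitration
    Returns the index of the granted input, or None if nobody is requesting.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if requests[candidate]:
            return candidate
    return None
```

In this sketch the arbiter's grant would drive the crossbar select for the output port chosen by the decoder.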

Page 14: High-Level Interconnect Architectures for FPGAs

Circuit Switched Router Structure

[Figure: circuit switched router block diagram. Blocks: Crossbar, FSM, Arbiter, Address to Port Decoder. Signals at the In & Out Ports: Request In, Request Out, Grant In, Grant Out, Data In, Data Out.]
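The request, grant, data and deallocate sequence of the circuit switched network can be sketched as link reservation along a path. This is a behavioural Python model under stated assumptions, not the project's VHDL: links are modelled as directed (from, to) pairs, and a blocked request releases whatever it had already claimed, which the slides do not specify.

```python
def setup_circuit(path, reserved):
    """Reserve every link on path; return True if the circuit is granted.

    path     -- list of router coordinates from source to destination
    reserved -- set of (from, to) links currently held by circuits
    """
    claimed = []
    for link in zip(path, path[1:]):   # the request propagates hop by hop
        if link in reserved:           # blocked: undo the partial reservation
            for held in claimed:
                reserved.discard(held)
            return False
        reserved.add(link)
        claimed.append(link)
    return True                        # grant propagates back to the source


def teardown_circuit(path, reserved):
    """Deallocate the circuit once the data has been sent."""
    for link in zip(path, path[1:]):
        reserved.discard(link)
```

While a circuit is held, any other request that needs one of its links is refused, which illustrates why circuit switching gives an uninterrupted path once set up but can block competing traffic.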

Page 15: High-Level Interconnect Architectures for FPGAs

Packet Switched Router Structure

[Figure: packet switched router block diagram. Blocks: FIFO, FSM, Crossbar, Control, Arbiter, Address to Port Decoder. Signal labels: Request From FIFOs, Request In, Write Out, Full In, Grant Out, Data From FIFOs, Data Out, Data In, Full, Write, Grant, Req, Data.]

Page 16: High-Level Interconnect Architectures for FPGAs

Router Implementation and Testing

Both routers were coded using VHDL

Simulation and testing used a combination of ModelSim and Xilinx ISE 9.1

Ad-hoc tests used for individual modules

VHDL testbench used for system verification

Page 17: High-Level Interconnect Architectures for FPGAs

Testbench Structure

[Figure: testbench block diagram. Labels: TESTBENCH, Command File, ReadInput, Input Tables, Test Table, Source, Mesh Network, Sink, Output Table, Compare, Output File, Clock Gen, Reset Gen, Cycle Count.]

Example output file:

Success: ID: 1 Source : (0,3) Dest : (1,0) Hops : 4 Latency: 34
Success: ID: 2 Source : (0,2) Dest : (1,0) Hops : 3 Latency: 27
Success: ID: 3 Source : (3,2) Dest : (1,1) Hops : 3 Latency: 22
Success: ID: 4 Source : (1,3) Dest : (0,1) Hops : 3 Latency: 22
Success: ID: 5 Source : (3,0) Dest : (3,1) Hops : 1 Latency: 12

Example command file:

#START  SOURCE  DEST  SIZE  ID
# ----------------------------
2       3 0     0 1   8     1
3       2 0     0 1   2     2
3       2 3     1 1   2     3
4       3 1     1 0   8     4
5       0 3     1 3   7     5
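The command file shown on this slide can be read back with a few lines of Python. This is a sketch, not the VHDL testbench itself. The seven-integer row layout (start, two source coordinates, two destination coordinates, size, id) is inferred from the column header, and counting hops as the Manhattan distance between source and destination is an assumption. It does reproduce the hop counts reported in the output file on the slide.

```python
COMMAND_FILE = """\
#START SOURCE DEST SIZE ID
# ------------------------
2 3 0 0 1 8 1
3 2 0 0 1 2 2
3 2 3 1 1 2 3
4 3 1 1 0 8 4
5 0 3 1 3 7 5
"""


def parse_commands(text):
    """Yield one dict per non-comment line of the command file."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue   # skip blank lines and '#' comment lines
        start, sa, sb, da, db, size, ident = map(int, line.split())
        yield {"start": start, "source": (sa, sb), "dest": (da, db),
               "size": size, "id": ident}


def hops(source, dest):
    """Manhattan distance between two mesh coordinates."""
    return abs(source[0] - dest[0]) + abs(source[1] - dest[1])


for cmd in parse_commands(COMMAND_FILE):
    print(cmd["id"], hops(cmd["source"], cmd["dest"]))
```

The printed hop counts for IDs 1 to 5 are 4, 3, 3, 3 and 1, matching the Hops field of the corresponding Success lines.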

Page 18: High-Level Interconnect Architectures for FPGAs

Synthesis

Each router was synthesised for a Virtex-4 LX platform

Post-synthesis verification

Resource usage

Timing

Page 19: High-Level Interconnect Architectures for FPGAs

Circuit Switched Resource Usage

[Charts: LUTs and Flip-Flops]

Total of 586 4-input LUTs

~0.1% of a Virtex 5

Total of 202 flip-flops

Page 20: High-Level Interconnect Architectures for FPGAs

Packet Switched Resource Usage

[Charts: LUTs and Flip-Flops]

Total of 786 4-input LUTs

+34% compared to circuit switched

Total of 237 flip-flops
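The +34% figure follows directly from the two LUT totals, and the same arithmetic can be applied to the flip-flop totals. The short Python check below uses only the numbers quoted on the two resource usage slides. The helper name `overhead_percent` is hypothetical.

```python
circuit = {"luts": 586, "flip_flops": 202}
packet = {"luts": 786, "flip_flops": 237}


def overhead_percent(base, other):
    """Relative increase of other over base, in percent."""
    return 100.0 * (other - base) / base


for key in circuit:
    print(key, round(overhead_percent(circuit[key], packet[key]), 1))
```

This gives roughly 34.1% more LUTs and roughly 17.3% more flip-flops for the packet switched router.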

Page 21: High-Level Interconnect Architectures for FPGAs

Timing Results

              Circuit Switched    Packet Switched
Max Freq      126.330 MHz         144.533 MHz
Setup time    5.308 ns            6.125 ns
Hold time     0.272 ns            0.272 ns

Critical path is through Arbiter in both designs
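A maximum frequency is easier to compare with setup and hold times once it is converted to a minimum clock period. The sketch below does only that conversion, using the two frequencies in the order they are listed on the slide. The helper name `period_ns` is hypothetical.

```python
max_freq_mhz = [126.330, 144.533]   # in the order listed on the slide


def period_ns(freq_mhz):
    """Minimum clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


for freq in max_freq_mhz:
    print(f"{freq:.3f} MHz -> {period_ns(freq):.3f} ns")
```

The periods come out at about 7.916 ns and 6.919 ns respectively.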

Page 22: High-Level Interconnect Architectures for FPGAs

Project Appraisal

Maintaining an accurate software simulation proved difficult

A great deal was learnt during the implementation of the circuit switched network

HDL implementations are only prototypes

Testbench provides a good framework but more time is needed to gather performance data

Page 23: High-Level Interconnect Architectures for FPGAs

Conclusions

Possible to make low complexity network-on-chip systems suitable for FPGAs

Latency has to be traded for throughput

Hard to collect performance data without application driven benchmarks

Both networks are viable so why not use both?

Page 24: High-Level Interconnect Architectures for FPGAs

Future Work

Cycle accurate software simulations

Application driven benchmarking

Serial transmission

Power efficiency

Industry standard solution
