An FPGA Design And Implementation Framework Combined With Commercial VLSI CADs. ReCoSoC 2013. Qian Zhao, Motoki Amagasaki, Masahiro Iida, Morihiro Kuga, Toshinori Sueyoshi (Kumamoto University, Japan)


Page 1: An FPGA Design And Implementation Framework Combined With

An FPGA Design And Implementation Framework Combined With Commercial VLSI CADs            

ReCoSoC 2013

Qian Zhao, Motoki Amagasaki, Masahiro Iida, Morihiro Kuga, Toshinori Sueyoshi

(Kumamoto University, Japan)

Page 2: An FPGA Design And Implementation Framework Combined With

Background

•  FPGA IP core development
   –  Makes the SoC programmable
   –  Requirements:
      •  Synthesizable
      •  Architecture exploration
      •  Scalability
      •  Other features (3D, dependability, etc.)
•  An FPGA IP design framework is required
   –  Efficiently evaluate & implement various architectures
   –  Connect to the commercial VLSI design flow
•  Traditional academic flows are not applicable

[Figure: System-on-a-Chip containing an MCU, memory, an ADC, and an FPGA IP]

Page 3: An FPGA Design And Implementation Framework Combined With

Goal of this research

•  Goal: Develop an FPGA IP design framework
•  Method: Develop a novel routing tool, EasyRouter
•  New features:
   –  Implement new architectures easily
      •  C# (Mono) platform (Linux/Windows)
      •  C# script-based architecture definition
   –  Architecture exploration -> HDL & bitstream generation
      •  Same architecture exploration functions as traditional tools
      •  Generates HDL & bitstream with simple templates
      •  Performance analysis with the commercial VLSI flow

Page 4: An FPGA Design And Implementation Framework Combined With

Compare with VPR

•  VPR is the main academic placement & routing tool

VPR 6.0 (current version)
•  Only supports island-style FPGAs
•  Architecture is defined with XML & C
•  Hard to implement new architectures
•  Not for synthesizable FPGAs
•  Delay model is not suitable for standard-cell-based design

Novel EasyRouter
•  C# language
•  Script-based architecture definition
•  HDL & bitstream generation
•  High-accuracy analysis with VLSI CADs

Page 5: An FPGA Design And Implementation Framework Combined With

VPR: XML-based Arch. definition

•  Only supports "island-style" FPGAs
•  Parameters can be changed in the XML-based architecture file
   •  Only small changes can be made to the supported architectures
   •  For unsupported architectures, the source code must be modified

Example of making a "switch box":

Arch. file (XML based):   type="Wilton"  fs="3"
VPR:                      void MakeSB(type, fs) { }

Page 6: An FPGA Design And Implementation Framework Combined With

VPR: Delay model

Critical path delay = routing delay + logic delay (composed with different delay models)

•  Logic delay: Logic block Verilog -> Synthesis -> HDL (gate) & .sdc -> Layout -> HDL (gate) & .spef -> Static Timing Analysis -> Write delay info. to Arch. file
•  Routing delay: Elmore delay model (R, C) on top of a wire model

Wire model: $R = \rho L / (H \cdot W)$
   L: calculated from the LB area (length of a segment wire)
   W, ρ, H: determined by the process

Capacitances in the Elmore delay model: $C_{\mathrm{cb\_to\_LB}}$, $C_{\mathrm{cb\_from\_LB}}$, $C_{\mathrm{to\_switch}}$, $C_{\mathrm{wire}}$, $C_{\mathrm{from\_switch}}$
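The wire model and the Elmore delay model on this slide can be illustrated with a short Python sketch. It assumes a simple RC ladder (one switch followed by one wire segment); the function names and all numeric values are hypothetical and are not taken from VPR or from the slides.

```python
def wire_resistance(rho, length, height, width):
    """R = rho * L / (H * W) for a wire with a rectangular cross-section."""
    return rho * length / (height * width)


def elmore_delay(stages):
    """Elmore delay of an RC ladder.

    stages is a list of (R, C) pairs ordered from driver to sink; each
    resistance charges every capacitance at or after its own node.
    """
    delay = 0.0
    downstream_c = sum(c for _, c in stages)
    for r, c in stages:
        delay += r * downstream_c
        downstream_c -= c
    return delay


# Hypothetical numbers, for illustration only (not taken from the slides).
RHO = 2.0e-8        # resistivity in ohm*m
LENGTH = 100e-6     # segment length L in m
HEIGHT = 0.2e-6     # wire thickness H in m
WIDTH = 0.1e-6      # wire width W in m

r_wire = wire_resistance(RHO, LENGTH, HEIGHT, WIDTH)
# A switch (R, C) followed by one wire segment (R, C).
path = [(1000.0, 2e-15), (r_wire, 10e-15)]
print(f"R_wire = {r_wire:.1f} ohm, Elmore delay = {elmore_delay(path) * 1e12:.2f} ps")
```

In a real flow, L would come from the logic block area and the per-node capacitances from the process, as the slide notes.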

Page 7: An FPGA Design And Implementation Framework Combined With

EasyRouter blocks

[Figure: EasyRouter block diagram]

Inputs: clustered netlist, placement result, mapped netlist, FPGA architecture script file, HDL templates

Blocks:
•  Netlist & placement files import
•  Arch. & physical parameters setup
•  RRGraph building block: RRGraph build
•  Routing block: Breadth_first & Timing_driven routing algorithms (same as VPR)
•  HDL code generation block and BitStream generation block: HDL & bitstream generation from routed info.
•  Report generation block

Outputs: FPGA HDL codes, bitstream, reports

RRGraph: Routing-resource graph
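As a rough illustration of breadth-first routing on a routing-resource graph, the sketch below finds a path for one net with a plain breadth-first search. It is a simplification under stated assumptions: the graph, node names, and `route_net` are hypothetical, and the congestion negotiation used by real routers is left out.

```python
from collections import deque


def route_net(rr_graph, source, sink, occupied):
    """Breadth-first search from source to sink over unoccupied nodes.

    rr_graph maps a node name to the list of nodes it can drive.
    Returns the path as a list of nodes, or None if the sink is unreachable.
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in rr_graph.get(node, []):
            if nxt not in parent and nxt not in occupied:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical toy graph: an output pin, two tracks, and an input pin.
rr_graph = {
    "opin": ["track0", "track1"],
    "track0": ["ipin"],
    "track1": ["ipin"],
}
first = route_net(rr_graph, "opin", "ipin", occupied=set())
second = route_net(rr_graph, "opin", "ipin", occupied={"track0"})
print(first, second)
```

Marking `track0` as occupied forces the second search onto `track1`, which is the basic effect a router relies on when several nets compete for tracks.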

Page 8: An FPGA Design And Implementation Framework Combined With

Script-based Arch. definition

•  Implement the FPGA architecture in a script file
   –  Dynamically compiled during execution
   –  Contains both the architecture implementation code and its parameters
   –  Any architecture can be implemented at low cost

Example of making a "switch box":

Arch. script:   int Fs=3; public switchbox makeSB() { … … }
EasyRouter:     loads the switch block object from the C# script

The structure can be modified freely.
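The idea of an architecture script that is compiled at run time can be sketched in Python (EasyRouter itself uses C# scripts). Everything here is an assumption for illustration: `load_arch`, `make_sb`, and `FS` are hypothetical names, and the switch box uses the simple disjoint pattern, not the Wilton pattern from the slides.

```python
ARCH_SCRIPT = '''
FS = 3  # each track end connects to one track on each of the other 3 sides

def make_sb(channel_width):
    """Disjoint switch box: track i connects to track i on every other side."""
    sides = ["N", "E", "S", "W"]
    edges = []
    for a in range(len(sides)):
        for b in range(a + 1, len(sides)):
            for t in range(channel_width):
                edges.append(((sides[a], t), (sides[b], t)))
    return edges
'''


def load_arch(script_text):
    """Compile and run an architecture script, returning its namespace."""
    namespace = {}
    exec(compile(script_text, "<arch_script>", "exec"), namespace)
    return namespace


arch = load_arch(ARCH_SCRIPT)
switch_box = arch["make_sb"](channel_width=4)
print(arch["FS"], len(switch_box))
```

Because the parameters and the construction code live in the same script, changing the switch box topology means editing the script only, not the tool.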

Page 9: An FPGA Design And Implementation Framework Combined With

Verilog & BitStream generation

Verilog HDL
•  Logic block
   –  Prepared in a template
•  Routing
   –  Generated according to channel width and CB/SB topology
•  Chip: top-level module generated according to array size

BitStream
•  Routing & local connection block
   –  Generated according to routed info.
•  Logic block
   –  Technology mapping results

CB: Connection Block, SB: Switch Box
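A minimal sketch of template-driven generation of a top-level module sized by the array is shown below. The module name `fpga_top`, the `tile` ports, and the configuration chain are all hypothetical, and the routing connections between tiles are omitted; the actual EasyRouter templates are not shown in the slides.

```python
from string import Template

# Hypothetical tile instance template; port names are illustrative only.
TILE_INSTANCE = Template(
    "  tile tile_${x}_${y} (.clk(clk), .cfg_in(cfg[$idx]), .cfg_out(cfg[$nxt]));"
)


def make_top(array_size):
    """Emit a top-level module that instantiates array_size x array_size tiles."""
    n = array_size * array_size
    lines = [
        "module fpga_top (input clk, input cfg_i, output cfg_o);",
        f"  wire [{n}:0] cfg;",
        "  assign cfg[0] = cfg_i;",
        f"  assign cfg_o = cfg[{n}];",
    ]
    idx = 0
    for y in range(array_size):
        for x in range(array_size):
            lines.append(TILE_INSTANCE.substitute(x=x, y=y, idx=idx, nxt=idx + 1))
            idx += 1
    lines.append("endmodule")
    return "\n".join(lines)


print(make_top(2))
```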

Page 10: An FPGA Design And Implementation Framework Combined With

FPGA design flow based on EasyRouter

[Figure: design flow combining FPGA CADs and VLSI CADs]

FPGA CADs
•  HDL or BLIF file -> VTR Project (ODIN II, ABC, VPR) -> clustered netlists, placement results
•  Device scale exploration: clustered netlist, placement result -> EasyRouter (CW exploring mode) -> minimum channel width, array size
•  Device implementation: clustered netlist, placement result, fixed channel width & array size, HDL template -> EasyRouter (evaluation mode) -> HDL codes (RTL), BitStream
•  Function verification

VLSI CADs
•  Standard cell library, HDL codes (RTL) -> Synthesis -> HDL (gate) & .sdc -> Layout -> HDL (gate) & .spef -> Static Timing Analysis -> timing report
•  Layout -> GDS
•  Formal verification (appears twice in the diagram)
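The channel width exploring step can be pictured as a search for the smallest width at which routing succeeds. The sketch below uses a binary search, which is an assumption (the slides do not say how EasyRouter searches); `is_routable` stands in for a full routing run and `fake_router` is a hypothetical placeholder.

```python
def minimum_channel_width(is_routable, low=1, high=128):
    """Binary search for the smallest width in [low, high] that routes.

    Assumes routability is monotonic: if width w routes, so does w + 1.
    Returns None when even `high` tracks are not enough.
    """
    if not is_routable(high):
        return None
    while low < high:
        mid = (low + high) // 2
        if is_routable(mid):
            high = mid
        else:
            low = mid + 1
    return low


# Stand-in for a real routing run: pretend the circuit needs 37 tracks.
attempts = []


def fake_router(width):
    attempts.append(width)
    return width >= 37


print(minimum_channel_width(fake_router), len(attempts))
```

Each probe is a complete routing attempt, so keeping the number of probes logarithmic in the search range matters for run time.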

Page 11: An FPGA Design And Implementation Framework Combined With

Fast evaluation with EasyRouter

(a) Tile info.: Tile HDL code -> Synthesis -> Layout -> STA -> tile physical information

(b) Fast evaluation with EasyRouter: clustered netlists (.blif), placement results, tile physical information, fixed channel width & array size -> EasyRouter (fast performance analysis) -> chip area & timing report
   EasyRouter blocks used: netlist & placement reading block, RRGraph building block, routing block, report generation block

•  The fast evaluation with EasyRouter can be used when
   •  VLSI analysis is too slow (big chip)
   •  No VLSI flow is available (3D-FPGA)

Page 12: An FPGA Design And Implementation Framework Combined With

Evaluation: EasyRouter

•  Device structure
   –  Island-style FPGA
•  MCNC benchmarks
•  Evaluation items
   –  Minimum channel width for routability evaluation
   –  Execution time for tool efficiency evaluation

Item              Value
Logic cluster     4-LUT×4
# of LB inputs    10
SB                Wilton (Fs=3)
CB                Fc_in=0.5, Fc_out=1.0
Channel           Single lines

Page 13: An FPGA Design And Implementation Framework Combined With

Minimum channel width (Routability)

[Figure: bar chart of channel width (y-axis, 0 to 50) per benchmark (x-axis); series: VPR, EasyRouter]

•  EasyRouter has almost the same routability as VPR
•  Minor differences come from different routing orders

Page 14: An FPGA Design And Implementation Framework Combined With

Execution time (CPU time)

[Figure: execution time normalized by VPR (y-axis, 0 to 14) per benchmark (x-axis)]

•  EasyRouter is 8 times slower than VPR on average
•  Big circuits are about 5 times slower

Page 15: An FPGA Design And Implementation Framework Combined With

Evaluation for proposed design flow

•  Process: e-Shuttle 65nm
•  Device parameters: see the table
•  MCNC benchmarks
•  Evaluation items
   –  Average critical path delay (10 placement seeds)

Item                                 Value
Array size                           15×15
Logic cluster                        4-LUT×4
# of LB inputs                       10
SB                                   Wilton (Fs=3)
CB                                   Fc_in=0.5, Fc_out=1.0
IOB                                  Fc=1.0
# of routing tracks (single line)    50/channel
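To make the Fc values concrete, this short sketch converts each fractional Fc into a number of tracks per pin for the 50-track channel in the table. The helper name and the round-up rule are assumptions; for these particular values the products are whole numbers, so rounding does not change the result.

```python
import math


def tracks_per_pin(fc, channel_width):
    """Number of tracks a pin can connect to for a fractional Fc (rounded up)."""
    return math.ceil(fc * channel_width)


CHANNEL_WIDTH = 50  # routing tracks per channel, from the table
for name, fc in [("Fc_in", 0.5), ("Fc_out", 1.0), ("IOB Fc", 1.0)]:
    print(name, tracks_per_pin(fc, CHANNEL_WIDTH))
```

With these settings a logic block input pin reaches half of the tracks in its channel, while output pins and IOB pins reach all of them.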

Page 16: An FPGA Design And Implementation Framework Combined With

Device architecture for evaluation

[Figure: homogeneous FPGA, an array of tiles surrounded by IOBs]
[Figure: tile structure, consisting of an LB (BLE_0 to BLE_3 with local connection), CB_X, CB_Y, and SB]

Page 17: An FPGA Design And Implementation Framework Combined With

Critical path delay results

[Figure: critical path delay (ns), y-axis 0 to 250, for benchmarks s298, elliptic, diffeq, alu4, misex3, apex4; series: Full FPGA STA, EasyRouter, VPR]

•  Delay from VPR is about 10x worse than the STA results
•  Delay from EasyRouter is about 1.5x worse than the STA results

Page 18: An FPGA Design And Implementation Framework Combined With

Conclusions

EasyRouter
•  Simple: new architectures can be implemented at low cost
•  High accuracy: links academic FPGA tools with commercial VLSI CADs

Other works with the proposed flow
•  3D-FPGA (VLSI-SoC 2013)
   –  "Three-Dimensional Stacking FPGA Architecture Using Face-to-Face Integration"
•  Hierarchy routing FPGA (FPL 2013)
   –  "Defect-Robust FPGA Architectures For Intellectual Property Cores In System LSI"

Page 19: An FPGA Design And Implementation Framework Combined With


Thank you for your attention.

Page 20: An FPGA Design And Implementation Framework Combined With

VPR (MIN-MAX-AVG. values from the results of 10 seeds)

[Figure: critical path delay (ns), y-axis 0.0 to 300.0, for benchmarks misex3, alu4, apex4, diffeq, elliptic, s298; annotated value: 76.19ns]

•  The range of the VPR delay results is large when the placement seed value changes

Page 21: An FPGA Design And Implementation Framework Combined With

EasyRouter (MIN-MAX-AVG. values from the results of 10 seeds)

[Figure: critical path delay (ns), y-axis 0.0 to 25.0, for benchmarks misex3, alu4, apex4, diffeq, elliptic, s298; annotated value: 4.17ns]

•  The range of the STA delay results is smaller than that of VPR when the placement seed changes