Reconfigurable Computing: HPC Network Aspects

Mitch Sukalski (8961), David Thompson (8963), Craig Ulmer (8963) [email protected]
Pete Dean R&D Seminar, December 11, 2003


Page 1: Reconfigurable Computing:  HPC Network Aspects

Reconfigurable Computing: HPC Network Aspects

Mitch Sukalski (8961)

David Thompson (8963)

Craig Ulmer (8963) [email protected]

Pete Dean R&D Seminar, December 11, 2003

Page 2: Reconfigurable Computing:  HPC Network Aspects

FPGAs are promising…

But what’s the catch?

Three main challenges must be addressed before FPGAs can be applied to practical scientific computing.

Page 3: Reconfigurable Computing:  HPC Network Aspects

RC Challenge #1: Floating Point

• Most FPGAs are fine grained
• Floating point units are large
– 32b FP occupies ~1,000 CLBs
– Commercial capacity improving
  • 2000: 6,000 CLBs
  • 2003: 40,000 CLBs (Max: 220,000)
• Keith Underwood at Sandia/NM
– LDRD: Working on high-speed 64b floating-point cores

[Figure: 32b FP in Xilinx V2P7]
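As a back-of-the-envelope reading of the figures on this slide, the sketch below divides each capacity figure by the ~1,000 CLB cost of one 32b unit. It assumes the entire device is given over to floating point, which no real design would do, so the results are loose upper bounds. The function and variable names are illustrative only.

```python
# Upper bound on 32b floating-point units per device, using the slide's figures.
CLBS_PER_FP32_UNIT = 1_000  # "~1,000 CLBs" per 32b FP unit (approximate)

capacity_clbs = {
    "2000": 6_000,
    "2003": 40_000,
    "2003 (max)": 220_000,
}


def max_fp32_units(clbs, clbs_per_unit=CLBS_PER_FP32_UNIT):
    """Ideal upper bound: ignores routing, control logic, and I/O."""
    return clbs // clbs_per_unit


fp_units = {label: max_fp32_units(clbs) for label, clbs in capacity_clbs.items()}

for label, units in fp_units.items():
    print(f"{label}: at most {units} 32b FP units")
```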

Page 4: Reconfigurable Computing:  HPC Network Aspects

RC Challenge #2: Design Tools

• Hardware design is non-trivial
– Micromanage computations, clock-by-clock
– Not appropriate for most scientists
– Need languages, APIs that are easy to use
• Maya Gokhale at LANL
– Streams-C: C-like language for HW design
– Pipeline/unroll loops
– Schedules access to external memory
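To illustrate why pipelining and unrolling loops matter in hardware, here is a minimal cycle-count model in Python. It is not Streams-C and does not describe that compiler's actual behavior. The function names, the 8-stage depth, and the assumption that a full pipeline retires one iteration per cycle per unrolled copy are all hypothetical.

```python
import math


def cycles_sequential(n_iters, stage_depth):
    """Each iteration runs start to finish before the next one begins."""
    return n_iters * stage_depth


def cycles_pipelined(n_iters, stage_depth, unroll=1):
    """Fill the pipeline once, then retire `unroll` iterations per cycle."""
    if n_iters == 0:
        return 0
    return stage_depth + math.ceil(n_iters / unroll) - 1


n, depth = 1000, 8  # hypothetical loop length and pipeline depth
print(cycles_sequential(n, depth))           # no overlap between iterations
print(cycles_pipelined(n, depth))            # pipelined
print(cycles_pipelined(n, depth, unroll=4))  # pipelined and unrolled by 4
```

In this idealized model the loop body cost is paid once rather than per iteration, which is the kind of clock-by-clock scheduling a scientist would rather leave to a tool.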

Page 5: Reconfigurable Computing:  HPC Network Aspects

RC Challenge #3: High-speed I/O

• FPGAs have large internal computational power
– How do we get data into/out of FPGA?
– How do we connect to our existing HPC machines?
• Mitch Sukalski, David Thompson, Craig Ulmer
– LDRD: Connect FPGAs to high-performance SANs


Page 6: Reconfigurable Computing:  HPC Network Aspects

Outline

• Where we have been
– Networking FPGAs using external NI cards
• Where we are going
– Networking FPGAs using internal transceivers
• Project status
– Early details

Page 7: Reconfigurable Computing:  HPC Network Aspects

Previous Work

Where we’ve been…

Page 8: Reconfigurable Computing:  HPC Network Aspects

Networking Earlier FPGAs

• Previous generations of FPGAs were like blank ASICs
– Configurable logic and pins
• Attach a network card to an FPGA card
– Communication over PCI
• Examples:
– Virginia Tech: Myrinet
– Washington U. in St. Louis: ATM (inline)
– Clemson University: Gigabit Ethernet
– Georgia Tech: Myrinet

[Diagram: CPU, FPGA, and NIC attached to the PCI bus]

Page 9: Reconfigurable Computing:  HPC Network Aspects

GRIM Project at Georgia Tech

• Add multimedia devices to cluster
– Message layer connects CPUs, memory, and peripherals
– Myrinet between hosts, PCI within hosts
• Celoxica RC-1000 FPGA
– Virtex FPGA (1M logic gates)
– Four SRAM banks
– PCI w/ PMC

[Diagram: RC-1000 card with SRAM banks 0 to 3, the FPGA, PCI, and control & switching logic]

[Diagram: GRIM cluster linking CPUs, FPGAs, RAID, and Ethernet]

Page 10: Reconfigurable Computing:  HPC Network Aspects

FPGA Organization

[Diagram: FPGA organization. Labeled components:]
• Frame
• Incoming Message Queues
• Outgoing Message Queues
• Communication Library API
• Application Data
• Memory API
• FPGA Card Memory
• FPGA Circuit Canvas
• User Circuit API
• User Circuit 1 … User Circuit n

Page 11: Reconfigurable Computing:  HPC Network Aspects

Lessons Learned

• Frame provides simple OS
– Isolates users from board
– Portability
• Dynamically manage resources
– Card memory
– Computational circuits
• PCI bottleneck
– Distance between NI and FPGA
– PCI difficult to work with

[Diagram: host CPU, NIC, and FPGA card with SRAM 1 and SRAM 2. Labels: Pages A, B, and C; Circuits X, Y, E, F, and G. An incoming message, "Use Circuit F on $C0000000", is marked with a function fault and a page fault.]
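The "dynamically manage resources" lesson can be sketched as a small simulation. This is an illustration only, not the GRIM frame's actual design: the `ResourcePool` and `Frame` classes, the two-slot capacities, and the least-recently-used replacement policy are all assumptions made for the example. A request for a circuit or memory page that is not resident on the card counts as a function fault or a page fault, respectively.

```python
from collections import OrderedDict


class ResourcePool:
    """Fixed number of slots with least-recently-used replacement (assumed policy)."""

    def __init__(self, slots):
        self.slots = slots
        self.resident = OrderedDict()  # name -> True, least recently used first
        self.faults = 0

    def touch(self, name):
        """Use `name`; return True if it had to be loaded first (a fault)."""
        if name in self.resident:
            self.resident.move_to_end(name)
            return False
        if len(self.resident) >= self.slots:
            self.resident.popitem(last=False)  # evict the least recently used
        self.resident[name] = True
        self.faults += 1
        return True


class Frame:
    """Hypothetical frame tracking resident circuits and card-memory pages."""

    def __init__(self, circuit_slots=2, page_slots=2):
        self.circuits = ResourcePool(circuit_slots)
        self.pages = ResourcePool(page_slots)

    def handle(self, circuit, page):
        """Process a message of the form 'use <circuit> on <page>'."""
        function_fault = self.circuits.touch(circuit)
        page_fault = self.pages.touch(page)
        return function_fault, page_fault


frame = Frame()
frame.handle("X", "A")
frame.handle("Y", "B")
result = frame.handle("F", "C")  # neither Circuit F nor Page C is resident
repeat = frame.handle("F", "C")  # both are resident now
print(result, repeat)
```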

Page 12: Reconfigurable Computing:  HPC Network Aspects

Network Features of Recent FPGAs

Where we’re going…

Page 13: Reconfigurable Computing:  HPC Network Aspects

FPGA Network Improvements

• Recent FPGAs have special, built-in cores
– High-speed transceivers, dedicated processors
• Idea: Build our NI inside the FPGA
– FPGA becomes a networked compute resource
– Removes the PCI bottleneck

[Diagram: FPGA containing two NIs (each with Tx and Rx) and user-defined computational circuits, attached to a System Area Network alongside CPU/NIC hosts]

Page 14: Reconfigurable Computing:  HPC Network Aspects

Xilinx Virtex-II/Pro FPGA

• Up to 4 PowerPC405 cores
– Embedded version of PPC
– 300-400MHz
• Multiple gigabit transceivers
– Run at 600Mbps to 3.125Gbps
– Up to twenty-four transceivers
• Additional cores
– Distributed internal memory
– Arrays of 18b multipliers
– Digital clock multipliers, PLLs

[Figure: Xilinx V2P20]

Page 15: Reconfigurable Computing:  HPC Network Aspects

Multi-Gigabit Transceivers: Rocket I/O

• Flexible, high-speed transceivers
– Can be configured to connect with different physical layers
– InfiniBand, GigE, FC, 10GigE, Aurora
– Note: low-level interface (commas, disparity, clock mismatches)

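[Diagram: Rocket I/O transceiver between the FPGA fabric and differential pin pairs (PIN+ / PIN-). Transmit-side blocks: Tx FIFO, CRC, 8B/10B encoder, serializer. Receive-side blocks: deserializer, clock recovery, 8B/10B decoder, Rx elastic buffer, CRC check.]

[Diagram: FPGA fabric with several Rocket I/O blocks, each driving a pair of pins]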
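Because the transceiver diagram includes an 8B/10B encoder, the usable data rate is lower than the line rate: 8B/10B sends 10 line bits for every 8 data bits. The sketch below applies that ratio to the 600 Mbps and 3.125 Gbps line rates quoted for the Virtex-II Pro. It assumes 8B/10B encoding is enabled and ignores any further framing or protocol overhead. The function name is illustrative.

```python
def payload_gbps(line_rate_gbps, data_bits=8, code_bits=10):
    """Data rate left after line coding (8B/10B by default)."""
    return line_rate_gbps * data_bits / code_bits


# Line rates from the Virtex-II Pro slide, in Gbps.
for line_rate in (0.6, 3.125):
    print(f"{line_rate} Gbps line rate -> {payload_gbps(line_rate):.3f} Gbps of data")
```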

Page 16: Reconfigurable Computing:  HPC Network Aspects

Why MGTs are Important

• Direct connection to networks
– Same chip, different network
– Remove PCI from equation
• Fast connections between FPGAs
– Reduces analog design issues
– Chain FPGAs together
– Reduce pin count
• Update: Virtex II/ProX
– Now 2.488 Gbps to 10.3125 Gbps
– Chips have either 8 or 20 transceivers

[Figure: 3.125 Gbps over 44” FR4 *]

* From Xilinx, http://www.xilinx.com/products/virtex2pro/mgtcharacter.htm
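For a sense of scale, the sketch below compares raw aggregate transceiver line rate against peak PCI bandwidth. The slides do not say which PCI variant the earlier boards used, so the 32-bit/33 MHz and 64-bit/66 MHz figures are assumptions. The transceiver numbers take the largest counts and rates quoted in these slides, per direction and before any line-coding overhead, so they are upper bounds rather than measured throughput.

```python
def pci_peak_gbps(width_bits, clock_mhz):
    """Theoretical peak of a parallel PCI bus: one transfer per clock."""
    return width_bits * clock_mhz / 1000.0


def mgt_aggregate_gbps(count, line_rate_gbps):
    """Raw line rate summed over all transceivers, one direction."""
    return count * line_rate_gbps


pci_32_33 = pci_peak_gbps(32, 33)         # assumed conventional PCI
pci_64_66 = pci_peak_gbps(64, 66)         # assumed wide, fast PCI
v2p = mgt_aggregate_gbps(24, 3.125)       # Virtex-II Pro maximum from the slides
v2px = mgt_aggregate_gbps(20, 10.3125)    # Virtex II/ProX maximum from the slides

print(f"PCI 32b/33MHz: {pci_32_33:.3f} Gbps")
print(f"PCI 64b/66MHz: {pci_64_66:.3f} Gbps")
print(f"24 MGTs at 3.125 Gbps: {v2p:.2f} Gbps")
print(f"20 MGTs at 10.3125 Gbps: {v2px:.2f} Gbps")
```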

Page 17: Reconfigurable Computing:  HPC Network Aspects

Hard PowerPC Core

• PowerPC 405
– 16KB Instruction / 16KB Data caches
– Real and Virtual memory modes
– GCC is available
• Multiple memory ports for core
– On-chip memory (OCM)
– Processor Local Bus (PLB)
• User-defined memory map
– Connect memory blocks or cores
– External memory cores available

[Diagram: PowerPC core with I-Cache and D-Cache, connected to the Processor Local Bus (PLB) and the On-Chip Memory (OCM) interface]
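A user-defined memory map amounts to a table that tells the address decoder which block or core owns each address range. The sketch below shows the idea in Python. The region names, base addresses, and sizes are invented for illustration and do not come from the slides or from any Xilinx reference design.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    base: int
    size: int


# Hypothetical memory map: every entry here is an assumption.
MEMORY_MAP = [
    Region("ocm_bram", 0x0000_0000, 0x0000_4000),
    Region("external_sdram", 0x1000_0000, 0x0400_0000),
    Region("network_interface", 0xC000_0000, 0x0001_0000),
]


def decode(addr, memory_map=MEMORY_MAP):
    """Return (region name, offset within region), or None if unmapped."""
    for region in memory_map:
        if region.base <= addr < region.base + region.size:
            return region.name, addr - region.base
    return None


print(decode(0xC000_0010))
print(decode(0x2000_0000))
```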

Page 18: Reconfigurable Computing:  HPC Network Aspects

System on a Chip (SoC)

• Commercial SoC
– Designing with cores
– Customize system
• New tools
– Rapidly connect cores
– Library of cores & buses
– Saves on wiring legwork

[Figure: Xilinx Platform Studio]

Page 19: Reconfigurable Computing:  HPC Network Aspects

Current Status

• Exploring V2P
– New architecture, new tools
• Two reference boards
– ML300 (V2P7-6)
– Avnet (V2P20-6)
• Transceiver work
– Raw transmission over fiber
– Working towards IB

http://cdulmer.ran.sandia.gov