bill camp, jim tomkins & rob leland. how much commodity is enough? the red storm architecture...

Bill Camp, Jim Tomkins & Rob Leland


Post on 18-Dec-2015


Page 1: Bill Camp, Jim Tomkins & Rob Leland. How Much Commodity is Enough? the Red Storm Architecture William J. Camp, James L. Tomkins & Rob Leland CCIM, Sandia

Bill Camp, Jim Tomkins & Rob Leland


How Much Commodity is Enough? The Red Storm Architecture

William J. Camp, James L. Tomkins & Rob Leland

CCIM, Sandia National Laboratories

Albuquerque, NM

[email protected]


Sandia MPPs (since 1987)

1987: 1024-processor nCUBE10 [512 Mflops]

1990--1992++: two 1024-processor nCUBE-2 machines [2 @ 2 Gflops]

1988--1990: 16384-processor CM-200

1991: 64-processor Intel iPSC/860

1993--1996: ~3700-processor Intel Paragon [180 Gflops]

1996--present: 9400-processor Intel TFLOPS (ASCI Red) [3.2 Tflops]

1997--present: 400 --> 2800 processors in Cplant Linux Cluster [~3 Tflops]

2003: 1280-processor IA32 Linux cluster [~7 Tflops]

2004: Red Storm: ~11600 processor Opteron-based MPP [>40 Tflops]


Our rubric (since 1987)

• Complex, mission-critical, engineering & science applications
• Large systems (1000's of PEs) with a few processors per node
• Message passing paradigm
• Balanced architecture
• Use commodity wherever possible
• Efficient systems software
• Emphasis on scalability & reliability in all aspects
• Critical advances in parallel algorithms
• Vertical integration of technologies


A partitioned, scalable computing architecture

[Figure: system partitions labeled Net I/O, Service, Users, File I/O, Compute, and /home]


Computing domains at Sandia

Red Storm is targeting the highest-end market but has real advantages for the mid-range market (from 1 cabinet on up)

Domain | # Procs: 1, 10^1, 10^2, 10^3, 10^4
Red Storm | X X X
Cplant Linux Supercluster | X X X
Beowulf clusters | X X X
Desktop | X
Market segments: Volume, Mid-Range, Peak


Red Storm Architecture

• True MPP, designed to be a single system, not a cluster
• Distributed memory MIMD parallel supercomputer
• Fully connected 3D mesh interconnect. Each compute node processor has a bi-directional connection to the primary communication network
• 108 compute node cabinets and 10,368 compute node processors (AMD Sledgehammer @ 2.0--2.4 GHz)
• ~10 or 20 TB of DDR memory @ 333 MHz
• Red/Black switching: ~1/4, ~1/2, ~1/4 (for data security)
• 12 Service, Visualization, and I/O cabinets on each end (640 S, V & I processors for each color)
• 240 TB of disk storage (120 TB per color) initially


Red Storm Architecture

• Functional hardware partitioning: service and I/O nodes, compute nodes, visualization nodes, and RAS nodes
• Partitioned Operating System (OS): LINUX on service, visualization, and I/O nodes; LWK (Catamount) on compute nodes; LINUX on RAS nodes
• Separate RAS and system management network (Ethernet)
• Router table-based routing in the interconnect
• Less than 2 MW total power and cooling
• Less than 3,000 ft² of floor space


Usage Model

[Figure: Unix (Linux) login node with Unix environment; batch processing or compute resource; I/O]

User sees a coherent, single system


Thor’s Hammer Topology

• 3D-mesh compute node topology: 27 x 16 x 24 (x, y, z); Red/Black split: 2,688 -- 4,992 -- 2,688
• Service, Visualization, and I/O partitions
  • 3 cabinets on each end of each row
    • 384 full bandwidth links to Compute Node Mesh
    • Not all nodes have a processor; all have routers
  • 256 PEs in each Visualization Partition: 2 per board
  • 256 PEs in each I/O Partition: 2 per board
  • 128 PEs in each Service Partition: 4 per board
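The mesh dimensions and the Red/Black split quoted on this slide can be cross-checked with a few lines of Python. This is only a sanity check on the slide's numbers. The reading that each section is a whole number of Y-Z planes stacked along X is an inference from the arithmetic, not something the slide states.

```python
# Sanity check of the compute-mesh figures quoted on the slide.
X, Y, Z = 27, 16, 24                   # mesh dimensions (x, y, z)
compute_nodes = X * Y * Z              # total compute node processors

red_black_split = (2688, 4992, 2688)   # the three sections of the Red/Black split
yz_plane = Y * Z                       # nodes in one Y-Z plane of the mesh

# Assumption (not from the slide): each section is a whole number
# of Y-Z planes along the X direction.
planes_per_section = [n // yz_plane for n in red_black_split]

print(compute_nodes)         # 10368
print(sum(red_black_split))  # 10368
print(yz_plane)              # 384
print(planes_per_section)    # [7, 13, 7]
```

Under that assumption the three sections are 7, 13, and 7 planes thick, which adds up to X = 27, and one Y-Z plane has 384 positions, the same number as the full bandwidth links to the compute mesh.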


3-D Mesh topology (Z direction is a torus)

[Figure: 10,368-node compute mesh with X=27, Y=16, Z=24 and a torus interconnect in Z; a block of 640 Visualization, Service & I/O nodes is attached at each end]
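A torus in one direction shortens the worst-case path through the network. The sketch below counts the maximum number of hops between two compute nodes, assuming shortest-path routing and counting only the compute mesh (the service and I/O nodes at the ends are ignored). The function name `max_hops` is hypothetical.

```python
def max_hops(dims, torus):
    """Worst-case hop count between two nodes of a mesh.

    dims  -- number of nodes along each axis
    torus -- True for each axis that has wraparound links
    """
    total = 0
    for n, wraps in zip(dims, torus):
        # On a torus axis traffic can go either way around, so the worst
        # case is n // 2 hops; on an open axis it is n - 1 hops.
        total += n // 2 if wraps else n - 1
    return total

# 27 x 16 x 24 mesh with wraparound in Z only, as on the slide.
print(max_hops((27, 16, 24), (False, False, True)))   # 53
# Same dimensions with no wraparound at all, for comparison.
print(max_hops((27, 16, 24), (False, False, False)))  # 64
```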


Thor’s Hammer Network Chips

• 3D-mesh is created by the SEASTAR ASIC: HyperTransport interface and 6 network router ports on each chip
• In compute partitions each processor has its own SEASTAR
• In the service partition, some boards are configured like the compute partition (4 PEs per board)
• Others have only 2 PEs per board, but still have 4 SEASTARs
  • So, network topology is uniform
• SEASTAR designed by Cray to our specs, fabricated by IBM
• The only truly custom part in Red Storm; complies with the HT open standard


Node architecture

[Figure: AMD Opteron CPU with 1 (or 2) Gbyte or more of DRAM, attached to an ASIC (NIC + router) with six links to other nodes in X, Y, and Z]

ASIC = Application Specific Integrated Circuit, or a “custom chip”


System Layout (27 x 16 x 24 mesh)

[Figure: machine floor layout with sections labeled Normally Unclassified, Switchable Nodes, and Normally Classified, separated by Disconnect Cabinets]


Thor’s Hammer Cabinet Layout

Compute Node Partition
• 3 Card Cages per Cabinet
• 8 Boards per Card Cage
• 4 Processors per Board
• 4 NIC/Router Chips per Board
• N + 1 Power Supplies
• Passive Backplane

Service, Viz, and I/O Node Partition
• 2 (or 3) Card Cages per Cabinet
• 8 Boards per Card Cage
• 2 (or 4) Processors per Board
• 4 NIC/Router Chips per Board
• 2-PE I/O Boards have 4 PCI-X busses
• N + 1 Power Supplies
• Passive Backplane

[Figure: compute node cabinet, front and side views, showing CPU boards, fans, power supply, and cables; dimensions marked 2 ft and 4 ft; 96 PEs per cabinet]
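The 96 PEs per compute cabinet follow from the cage, board, and processor counts listed for the compute node partition. A short check, using the 108 compute cabinets quoted on the architecture slide:

```python
# Packaging arithmetic for the compute node partition.
card_cages_per_cabinet = 3
boards_per_card_cage = 8
processors_per_board = 4
compute_cabinets = 108  # from the Red Storm Architecture slide

pes_per_cabinet = (card_cages_per_cabinet
                   * boards_per_card_cage
                   * processors_per_board)
total_compute_pes = compute_cabinets * pes_per_cabinet

print(pes_per_cabinet)    # 96
print(total_compute_pes)  # 10368
```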


Performance

• Peak of 41.4 (46.6) TF based on 2 floating point instruction issues per clock at 2.0 GHz
• We required a 7-fold speedup versus ASCI Red, but based on our benchmarks we expect performance to be 8--10 times faster than ASCI Red
• Expected MP-Linpack performance: ~30--35 TF
• Aggregate system memory bandwidth: ~55 TB/s
• Interconnect performance:
  • Latency <2 µs (neighbor), <5 µs (full machine)
  • Link bandwidth ~6.0 GB/s bi-directional
  • Minimal XC bi-section bandwidth ~2.3 TB/s
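Two of these figures can be reproduced from numbers given elsewhere in the deck. The sketch below computes the compute-partition peak and a bisection bandwidth estimate. The bisection calculation assumes the minimal cut runs across the X direction, the longest one without wraparound, and severs one link per Y-Z position. That is a guess at how the slide's figure was derived, not something the slide states.

```python
# Peak floating point rate of the compute partition.
compute_processors = 10_368
flops_per_clock = 2          # floating point instruction issues per clock
clock_hz = 2.0e9             # 2.0 GHz

peak_tf = compute_processors * flops_per_clock * clock_hz / 1e12
print(round(peak_tf, 2))     # 41.47

# Bisection bandwidth estimate (assumption: cut across X, one link
# severed for every position in the 16 x 24 Y-Z plane).
Y, Z = 16, 24
link_bw_gb_s = 6.0           # bi-directional link bandwidth

bisection_tb_s = Y * Z * link_bw_gb_s / 1000
print(round(bisection_tb_s, 2))  # 2.3
```

Both results agree with the slide's 41.4 TF peak and ~2.3 TB/s bisection bandwidth to the precision quoted.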


Performance

• I/O system performance
  • Sustained file system bandwidth of 50 GB/s for each color
  • Sustained external network bandwidth of 25 GB/s for each color
• Node memory system
  • Page miss latency to local memory is ~80 ns
  • Peak bandwidth of ~5.4 GB/s for each processor


Red Storm System Software

• Operating Systems
  • LINUX on service and I/O nodes
  • Sandia’s LWK (Catamount) on compute nodes
  • LINUX on RAS nodes
• Run-Time System
  • Logarithmic loader
    • Fast, efficient
  • Node allocator
  • Batch system: PBS
  • Libraries: MPI, I/O, Math
• File systems being considered include
  • PVFS: interim file system
  • Lustre: design intent
  • Panasas: possible alternative
  • …


Red Storm System Software

• Tools
  • All IA32 compilers, all AMD 64-bit compilers: Fortran, C, C++
  • Debugger: Totalview (also examining alternatives)
  • Performance tools (was going to be Vampir until Intel bought Pallas; now?)
• System Management and Administration
  • Accounting
  • RAS GUI interface


Comparison of ASCI Red and Red Storm

Item | ASCI Red | Red Storm
Full System Operational Time Frame | June 1997 (processor and memory upgrade in 1999) | August 2004
Theoretical Peak (TF), compute partition alone | 3.15 | 41.47
MP-Linpack Performance (TF) | 2.38 | >30 (estimated)
Architecture | Distributed Memory MIMD | Distributed Memory MIMD
Number of Compute Node Processors | 9,460 | 10,368
Processor | Intel P II @ 333 MHz | AMD Opteron @ 2 GHz
Total Memory | 1.2 TB | 10.4 TB (up to 80 TB)
System Memory Bandwidth | 2.5 TB/s | 55 TB/s
Disk Storage | 12.5 TB | 240 TB
Parallel File System Bandwidth | 1.0 GB/s each color | 50.0 GB/s each color
External Network Bandwidth | 0.2 GB/s each color | 25 GB/s each color


Comparison of ASCI Red and Red Storm

Item | ASCI Red | Red Storm
Interconnect Topology | 3D Mesh (x, y, z), 38 x 32 x 2 | 3D Mesh (x, y, z), 27 x 16 x 24
Interconnect Performance: MPI Latency | 15 µs 1 hop, 20 µs max | 2.0 µs 1 hop, 5 µs max
Interconnect Performance: Bi-Directional Bandwidth | 800 MB/s | 6.0 GB/s
Interconnect Performance: Minimum Bi-section Bandwidth | 51.2 GB/s | 2.3 TB/s
Full System RAS: RAS Network | 10 Mbit Ethernet | 100 Mbit Ethernet
Full System RAS: RAS Processors | 1 for each 32 CPUs | 1 for each 4 CPUs
Operating System: Compute Nodes | Cougar | Catamount
Operating System: Service and I/O Nodes | TOS (OSF1 UNIX) | LINUX
Operating System: RAS Nodes | VX-Works | LINUX
Red/Black Switching | 2260 -- 4940 -- 2260 | 2688 -- 4992 -- 2688
System Foot Print | ~2500 ft² | ~3000 ft²
Power Requirement | 850 KW | 1.7 MW


Red Storm Project

• 23 months, design to First Product Shipment!
• System software is a joint project between Cray and Sandia
  • Sandia is supplying Catamount LWK and the service node run-time system
  • Cray is responsible for Linux, NIC software interface, RAS software, file system software, and Totalview port
  • Initial software development was done on a cluster of workstations with a commodity interconnect. Second stage involves an FPGA implementation of SEASTAR NIC/Router (Starfish). Final checkout on real SEASTAR-based system
• System design is going on now
  • Cabinets: exist
  • SEASTAR NIC/Router: released to fabrication at IBM earlier this month
• Full system to be installed and turned over to Sandia in stages culminating in August--September 2004


New Building for Thor’s Hammer


Designing for scalable supercomputing

Challenges in:
- Design
- Integration
- Management
- Use


SUREty for Very Large Parallel Computer Systems

Scalability - Full System Hardware and System Software

Usability - Required Functionality Only

Reliability - Hardware and System Software

Expense minimization - use commodity, high-volume parts

SURE poses Computer System Requirements:


SURE Architectural tradeoffs:
• Processor and memory sub-system balance
• Compute vs interconnect balance
• Topology choices
• Software choices
• RAS
• Commodity vs. custom technology
• Geometry and mechanical design


Sandia Strategies:
- Build on commodity
- Leverage Open Source (e.g., Linux)
- Add to commodity selectively (in RS there is basically one truly custom part!)
- Leverage experience with previous scalable supercomputers


System Scalability Driven Requirements

Overall System Scalability - Complex scientific applications such as molecular dynamics, hydrodynamics & radiation transport should achieve scaled parallel efficiencies greater than 50% on the full system (~20,000 processors).
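The slide does not define scaled parallel efficiency. The sketch below assumes the usual weak-scaling definition: the work per processor is held fixed as processors are added, so an ideal machine keeps the run time constant. The function name and the timings are hypothetical, chosen only to illustrate the 50% threshold.

```python
def scaled_parallel_efficiency(t_one, t_n):
    """Weak-scaling efficiency.

    t_one -- run time on 1 processor for one unit of work
    t_n   -- run time on N processors for N units of work
    An ideal machine gives t_n == t_one, i.e. efficiency 1.0.
    """
    return t_one / t_n

# Hypothetical timings: 100 s on one processor, 160 s at full scale.
efficiency = scaled_parallel_efficiency(100.0, 160.0)
print(efficiency)         # 0.625
print(efficiency > 0.5)   # True: meets the >50% requirement
```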



Scalability: System Software

System software performance scales nearly perfectly with the number of processors to the full size of the computer (~30,000 processors). This means that system software time (overhead) remains nearly constant with the size of the system or scales at most logarithmically with the system size.

- Full re-boot time scales logarithmically with the system size.
- Job loading is logarithmic with the number of processors.
- Parallel I/O performance is not sensitive to # of PEs doing I/O.
- Communication network software must be scalable.
  - No connection-based protocols among compute nodes.
  - Message buffer space independent of # of processors.
- Compute node OS gets out of the way of the application.
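One way to see why job loading can be logarithmic is a tree-style fan-out, in which every node that already holds the executable forwards it to more nodes in each round. The sketch below counts the rounds needed. It is an illustration of the scaling argument only and does not describe how the actual Red Storm loader works. The name `load_rounds` and the `fanout` parameter are hypothetical.

```python
def load_rounds(nodes, fanout=1):
    """Rounds needed to deliver an executable to `nodes` nodes.

    One node holds the executable at the start. In each round every
    node that holds it forwards it to `fanout` nodes that do not, so
    the number of holders grows by a factor of (fanout + 1) per round.
    """
    holders = 1
    rounds = 0
    while holders < nodes:
        holders *= fanout + 1
        rounds += 1
    return rounds

print(load_rounds(10_368))            # 14 rounds when holders double each round
print(load_rounds(10_368, fanout=3))  # 7 rounds when holders quadruple
print(10_368 - 1)                     # 10367 sends if one node did all the work
```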


Hardware scalability

• Balance in the node hardware:
  • Memory BW must match CPU speed
    Ideally 24 Bytes/flop (never yet done)
  • Communications speed must match CPU speed
  • I/O must match CPU speeds
• Scalable System SW (OS and Libraries)
• Scalable Applications


Usability

Application Code Support:
• Software that supports scalability of the computer system
  • Math libraries
  • MPI support for full system size
  • Parallel I/O library
  • Compilers
• Tools that scale to the full size of the computer system
  • Debuggers
  • Performance monitors
• Full-featured LINUX OS support at the user interface


Reliability

• Light Weight Kernel (LWK) O.S. on compute partition
  • Much less code fails much less often
• Monitoring of correctable errors
  • Fix soft errors before they become hard
• Hot swapping of components
  • Overall system keeps running during maintenance
• Redundant power supplies & memories
• Completely independent RAS system monitors virtually every component in system


Economy

1. Use high-volume parts where possible
2. Minimize power requirements
   • Cuts operating costs
   • Reduces need for new capital investment
3. Minimize system volume
   • Reduces need for large new capital facilities
4. Use standard manufacturing processes where possible; minimize customization
5. Maximize reliability and availability/dollar
6. Maximize scalability/dollar
7. Design for integrability


Economy

• Red Storm leverages economies of scale
  • AMD Opteron microprocessor & standard memory
  • Air cooled
  • Electrical interconnect based on Infiniband physical devices
  • Linux operating system
• Selected use of custom components
  • System chip ASIC
    • Critical for communication intensive applications
  • Light Weight Kernel
    • Truly custom, but we already have it (4th generation)


Cplant on a slide

[Figure: Cplant system diagram showing compute nodes, service nodes, and I/O nodes; partitions labeled Net I/O, File I/O, /home, other, Users, System Support, and Sys Admin; Ethernet, ATM, and HiPPI connections; operator(s); ASCI Red]

Goal: MPP “look and feel”

• Start ~1997, upgrade ~1999--2001
• Alpha & Myrinet, mesh topology
• ~3000 procs (3 Tf) in 7 systems
• Configurable to ~1700 procs
• Red/Black switching
• Linux w/ custom runtime & mgmt.
• Production operation for several yrs.


IA-32 Cplant on a slide

[Figure: IA-32 Cplant system diagram showing compute nodes, service nodes, and I/O nodes; partitions labeled Net I/O, File I/O, /home, other, Users, System Support, and Sys Admin; Ethernet, ATM, and HiPPI connections; operator(s); ASCI Red]

Goal: Mid-range capacity

• Started 2003, upgrade annually
• Pentium-4 & Myrinet, Clos network
• 1280 procs (~7 Tf) in 3 systems
• Currently configurable to 512 procs
• Linux w/ custom runtime & mgmt.
• Production operation for several yrs.


Observation: For most large scientific and engineering applications, performance is determined more by parallel scalability and less by the speed of individual CPUs.

There must be balance between processor, interconnect, and I/O performance to achieve overall performance.

To date, only a few tightly-coupled, parallel computer systems have been able to demonstrate a high level of scalability on a broad set of scientific and engineering applications.


Let’s Compare Balance In Parallel Systems

Machine | Node Speed Rating (MFlops) | Network Link BW (Mbytes/s) | Communications Balance (Bytes/flop)
ASCI RED | 400 | 800 (533) | 2 (1.33)
T3E | 1200 | 1200 | 1
ASCI RED** | 666 | 800 (533) | (1.2) 0.67
Cplant | 1000 | 140 | 0.14
Blue Mtn* | 500 | 800 | 1.6
BlueMtn** | 64000 | 1200 (9600*) | 0.02 (0.16*)
Blue Pacific | 2650 | 300 (132) | 0.11 (0.05)
White | 24000 | 2000 | 0.083
Q* | 2500 | 650 | 0.26
Q** | 10000 | 400 | 0.04
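The communications balance column appears to be the network link bandwidth divided by the node speed rating. The sketch below recomputes it for several rows, pairing each machine with the node speed that reproduces its listed balance. That pairing is an inference from the arithmetic, and the function name `comm_balance` is hypothetical.

```python
def comm_balance(link_mbytes_s, node_mflops):
    """Communications balance in bytes per flop."""
    return link_mbytes_s / node_mflops

# (network link BW in Mbytes/s, node speed rating in MFlops), from the slide
machines = {
    "ASCI RED":  (800, 400),
    "T3E":       (1200, 1200),
    "Cplant":    (140, 1000),
    "Blue Mtn*": (800, 500),
    "White":     (2000, 24000),
    "Q*":        (650, 2500),
    "Q**":       (400, 10000),
}

for name, (link_bw, node_speed) in machines.items():
    print(f"{name:10s} {comm_balance(link_bw, node_speed):.3f} bytes/flop")
```

The recomputed values agree with the listed balances to the precision shown on the slide.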


Comparing Red Storm and BGL

Item | Blue Gene Light** | Red Storm*
Node Speed | 5.6 GF | 5.6 GF (1x)
Node Memory | 0.25--0.5 GB | 2 (1--8) GB (4x nom.)
Network latency | 7 µsecs | 2 µsecs (2/7 x)
Network BW | 0.28 GB/s | 6.0 GB/s (22x)
BW Bytes/Flops | 0.05 | 1.1 (22x)
Bi-Section B/F | 0.0016 | 0.038 (24x)
#nodes/problem | 40,000 | 10,000 (1/4 x)

* 100 TF version of Red Storm
** 360 TF version of BGL


Fixed problem performance

Molecular dynamics problem (LJ liquid)


Parallel Sn Neutronics (provided by LANL)


Scalable computing works

[Figure: ASCI Red efficiencies for major codes. Scaled parallel efficiency (%), 0 to 100, versus processors, 1 to 10,000. Series: QS-Particles, QS-Fields-Only, QS-1B Cells, Rad x-port-1B Cells, Rad x-port - 17M, Rad x-port - 80M, Rad x-port - 168M, Rad x-port - 532M, Finite Element, Zapotec, Reactive Fluid Flow, Salinas, CTH]


Basic Parallel Efficiency Model

[Figure: parallel efficiency, 0.00 to 1.20, versus communication/computation load, 0.0 to 1.0, with regions labeled Peak, Linpack, and Scientific & eng. codes. Curves: Red Storm (B=1.5), ASCI Red (B=1.2), Ref. Machine (B=1.0), Earth Sim. (B=.4), Cplant (B=.25), Blue Gene Light (B=.05), Std. Linux Cluster (B=.04)]

Balance is critical to scalability
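The slide does not give the formula behind its curves. The sketch below uses one simple model that is consistent with the axes and the B values in the legend, and it is an assumption, not the authors' model: if an application spends `load` units of time communicating for every unit computing on a reference machine with balance B = 1, a machine with balance B communicates B times faster, giving efficiency 1 / (1 + load / B).

```python
def parallel_efficiency(load, balance):
    """Assumed model: E = 1 / (1 + load / balance).

    load    -- communication/computation ratio on the reference machine (B = 1)
    balance -- communications balance B of the machine, in bytes per flop
    """
    return 1.0 / (1.0 + load / balance)

# Balance values taken from the slide's legend.
machines = {
    "Red Storm": 1.5,
    "ASCI Red": 1.2,
    "Ref. Machine": 1.0,
    "Earth Sim.": 0.4,
    "Cplant": 0.25,
    "Blue Gene Light": 0.05,
    "Std. Linux Cluster": 0.04,
}

load = 0.5  # hypothetical communication-heavy application
for name, b in machines.items():
    print(f"{name:20s} {parallel_efficiency(load, b):.2f}")
```

Under this model a well-balanced machine stays efficient as the load grows, while a machine with B far below 1 loses most of its efficiency even at modest loads, which is the qualitative message of the slide.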


Relating scalability and cost

[Figure: efficiency ratio (Red/Cplant), 0.00 to 6.00, versus processors, 1 to 4096, with an extrapolation of the efficiency ratio. A line marks efficiency ratio = cost ratio = 1.8; the regions are labeled MPP more cost effective and Cluster more cost effective]

Average efficiency ratio over the five codes that consume >80% of Sandia’s cycles
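The crossover on this slide follows from a simple comparison: if the MPP costs 1.8 times as much per processor but delivers more than 1.8 times the efficiency, it is cheaper per unit of useful work. The sketch below expresses that rule. The function names and the sample efficiency ratios are hypothetical; only the cost ratio of 1.8 comes from the slide.

```python
def relative_cost_per_useful_work(efficiency_ratio, cost_ratio=1.8):
    """MPP cost per unit of useful work, relative to the cluster.

    efficiency_ratio -- MPP efficiency divided by cluster efficiency
    cost_ratio       -- MPP cost divided by cluster cost (1.8 on the slide)
    A value below 1.0 means the MPP is the more cost effective choice.
    """
    return cost_ratio / efficiency_ratio

def mpp_is_more_cost_effective(efficiency_ratio, cost_ratio=1.8):
    return efficiency_ratio > cost_ratio

# Hypothetical efficiency ratios at small and large processor counts.
for ratio in (1.2, 3.0):
    cost = relative_cost_per_useful_work(ratio)
    print(f"{ratio:.1f} {cost:.2f} {mpp_is_more_cost_effective(ratio)}")
```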


Scalability determines cost effectiveness

Sandia’s top priority computing workload:

[Figure: total node-hours of jobs, 0 to 80,000,000, versus number of nodes, 1 to 10,000. Annotations: 256; 380M node-hrs; 55M node-hrs; MPP more cost effective; Cluster more cost effective]


Scalability also limits capability

[Figure: ITS speedup curves. Speedup, 0 to 1200, versus processors, 0 to 1408. Series: Red Speedup and Cplant Speedup, each with a polynomial fit. Annotation: ~3x processors]


Commodity nearly everywhere-- Customization drives cost

• Earth Simulator and Cray X-1 are fully custom vector systems with good balance
  • This drives their high cost (and their high performance)
• Clusters are nearly entirely high-volume with no truly custom parts
  • Which drives their low cost (and their low scalability)
• Red Storm uses custom parts only where they are critical to performance and reliability
  • High scalability at minimal cost/performance


Scaling data for some key engineering codes

[Figure: Performance on Engineering Codes. Scaled parallel efficiency, 0.00 to 1.20, versus processors, 1 to 1024. Series: ITS, Red; ITS, Cplant; ACME, Red; ACME, Cplant. Annotations: random variation at small proc. counts; large differential in efficiency at large proc. counts]


Scaling data for some key physics codes

Los Alamos’ radiation transport code

[Figure: PARTISN Diffusion Solver Sizeup Study, S6P2, 12 Groups, 13,800 cells/PE. Parallel efficiency, 0% to 120%, versus number of processor elements, 1 to 2048. Series: ASCI Red, Blue Mountain, White, QSC]

[Figure: PARTISN Transport Solver Sizeup Study, S6P2, 12 Groups, 13,800 cells/PE. Parallel efficiency, 0% to 120%, versus number of processor elements, 1 to 2048. Series: ASCI Red, Blue Mountain, White, QSC]