TRANSCRIPT
Supercomputing the Next Century
• Talk to the Max-Planck-Institut für Gravitationsphysik (Albert-Einstein-Institut), Potsdam, Germany
• June 15, 1998
NCSA is the Leading Edge Site for the National Computational Science Alliance
www.ncsa.uiuc.edu
The Alliance Team Structure
• Leading Edge Center
• Enabling Technology
– Parallel Computing
– Distributed Computing
– Data and Collab. Computing
• Partners for Advanced Computational Services
– Communities
– Training
– Technology Deployment
– Comp. Resources & Services
• Strategic Industrial and Technology Partners
• Application Technologies
– Cosmology
– Environmental Hydrology
– Chemical Engineering
– Nanomaterials
– Bioinformatics
– Scientific Instruments
• EOT
– Education
– Evaluation
– Universal Access
– Government
Alliance ‘98 Hosts 1000 Attendees With Hundreds On-Line!
The “Experts” Are Not Always Right! Seek a New Vision and Stick to It
We do not believe that workstation-class systems (even if they offer more processing power than the old Cray Y-MPs) can become the mainstay of a National center. At $500K purchase prices, departments should be able to afford the SGI Power Challenge systems on their own. For these reasons, we wonder whether the role of the proposed SGI systems in NCSA’s plan might be different from that of a “production machine”.
– Program Plan Review Panel, February 1994
The SGI Power Challenge Array as NCSA’s Production Facility for Four Years
[Chart: Number of Users Per Month (0 to 500), Sep 1994 through May 1998, by system: SGI Power Challenge Array (PCA), CM-5 (retired 1/97), Convex C3880 (retired 10/95), HP/Convex SPP-1200, Cray Y-MP (retired 12/94), SGI Origin, HP/Convex SPP-2000]
The NCSA Origin Array Doubles Again This Month
Let’s Blow This Up!
The Growth Rate of the National Capacity is Slowing Down Again
[Chart: Normalized CPU Hours (log scale, 10,000 to 1,000,000,000) vs. Fiscal Year, 1986-2002: Total NU; 70% Annual Growth This Year]
Source: Quantum Research; Lex Lane, NCSA
Major Gap Developing in National Usage at NSF Supercomputer Centers
70% Annual Growth Rate is the Historical Rate of National Usage Growth.
It is Also Slightly Greater Than the Rate of Moore’s Law, So Slower Growth Means Desktops Gain on Supers
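(For reference: Moore’s Law doubling every 18 months works out to 2^(12/18) ≈ 1.59x, or about 59% growth per year, so a 70% annual rate does sit slightly above it.)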
[Chart: NUs Used Per Year, Oct 1995 through Oct 1999 (0 to 3,500,000): Grand Total vs. a 70% Growth Rate trend, with projection]
[Chart: Monthly National Usage at NSF Supercomputer Centers - NUs Per Month, Oct 1995 through Oct 1999 (0 to 2,500,000), FY96-FY99 with FY99 Projection; Capacity Level NSF Proposed at 3/97 NSB Meeting; Transition Period marked]
Accelerated Strategic Computing Initiative is Coupling DOE DP Labs to Universities
• Access to ASCI Leading Edge Supercomputers
• Academic Strategic Alliances Program
• Data and Visualization Corridors
http://www.llnl.gov/asci-alliances/centers.html
Comparison of the DoE ASCI and the NSF PACI Origin Array Scale Through FY99
www.lanl.gov/projects/asci/bluemtn/Hardware/schedule.html
Los Alamos Origin System FY99: 5,000-6,000 processors
NCSA Proposed System FY99: 6x128 and 4x64 = 1,024 processors
• NEC SX-5
– 32 x 16 vector processor SMP
– 512 Processors
– 8 Gigaflop Peak Processor
• IBM SP
– 256 x 16 RISC Processor SMP
– 4,096 Processors
– 1 Gigaflop Peak Processor
• SGI Origin Follow-on
– 32 x 128 RISC Processor DSM
– 4,096 Processors
– 1 Gigaflop Peak Processor
High-End Architecture 2000 - Scalable Clusters of Shared Memory Modules
Each is 4 Teraflops Peak
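In each case the peak follows from processors times per-processor peak: 512 x 8 Gigaflops and 4,096 x 1 Gigaflop both give 4,096 Gigaflops, i.e. about 4 Teraflops.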
Emerging Portable Computing Standards
• HPF
• MPI
• OpenMP
• Hybrids of MPI and OpenMP (see the sketch below)
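To make the hybrid style concrete, here is a minimal sketch (not from the talk) of the pattern these standards enable: MPI distributes blocks of an index space across SMP nodes, and OpenMP threads share each node's block. The problem size and the workload (a simple harmonic sum) are illustrative assumptions.

/* Minimal hybrid MPI + OpenMP sketch: MPI splits a global sum across
   nodes, OpenMP threads split each node's block across its processors. */
#include <mpi.h>
#include <stdio.h>

#define N 1000000   /* global problem size (assumed for the example) */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each MPI rank owns a contiguous block of indices [lo, hi). */
    int lo = (int)((long)rank * N / size);
    int hi = (int)((long)(rank + 1) * N / size);

    double local = 0.0;

    /* Within the SMP node, OpenMP threads share the block. */
    #pragma omp parallel for reduction(+:local)
    for (int i = lo; i < hi; i++)
        local += 1.0 / (i + 1.0);

    /* Combine the per-node partial sums across the cluster. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %.6f\n", global);

    MPI_Finalize();
    return 0;
}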
Top500 Shared Memory Systems
Vector Processors vs. Microprocessors
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
[Chart: PVP Systems in the TOP500 - Number of Systems (0 to 300), Jun 1993 through Jun 1998: Europe, Japan, USA]
[Chart: SMP + DSM Systems in the TOP500 - Number of Systems (0 to 300), Jun 1993 through Jun 1998: USA]
The Exponential Growth of NCSA’s SGI Shared Memory Supercomputers
[Chart: SGI Processors (log scale, 1 to 10,000), Jan 1994 through Jan 2001, across the Challenge, Power Challenge, Origin, and SN1 generations]
Doubling Every Nine Months!
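A nine-month doubling time compounds to 2^(12/9) ≈ 2.5x per year, roughly twice the pace of Moore’s Law’s 18-month doubling.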
[Chart: CPU-Hours Burned (log scale, 1 to 1,000,000) by PI rank (1 to 181), in bands: 1 to 10, 10 to 100, 100 to 1k, 1k to 10k, 10k to 100k, 100k to 1M]
Extreme and Large PIs Dominate Usage of NCSA Origin
January through April, 1998
Disciplines Using the NCSA Origin 2000 - CPU-Hours in March 1998
Particle Physics
Chemistry
Materials Sciences
Engineering CFD
Astronomy
Physics
Industry
Molecular Biology
Other
[Chart: Speedup vs. Processors (0 to 120) for QMC, Matrix-Vector, ZEUS-MP Blast, RIEMANN-HPF, Laplace-DSM, ZEUS-MP Radn., Freeman-Elec. Struc., and Woodward-PPM, against Perfect Scaling]
Source: Mitas, Hayes, Tafti, Saied, Balsara, NCSA; Wilkins, OSU; Woodward, UMinn; Freeman, NW. NCSA 128-processor Origin, IRIX 6.5
Users, NCSA, SGI, and Alliance Parallel Team Working to Make Better Scaling Routine
[Chart: Gigaflops (0 to 7) vs. Processors (0 to 60) for the systems below]
Origin-DSM
Origin-MPI
NT-MPI
SP2-MPI
T3E-MPI
SPP2000-DSM
Solving 2D Navier-Stokes Kernel - Performance of Scalable Systems
Source: Danesh Tafti, NCSA
Preconditioned Conjugate Gradient Method With Multi-Level Additive Schwarz Richardson Preconditioner
(2D 1024x1024)
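As a concrete illustration of the kernel named above, here is a minimal sketch of preconditioned conjugate gradient on a 2D 5-point Laplacian. It is not the Alliance code: a simple Jacobi (diagonal) preconditioner stands in for the multi-level additive Schwarz Richardson preconditioner, and a 64x64 grid stands in for 1024x1024 so the example stays self-contained.

/* Preconditioned conjugate gradient on the 2D 5-point Laplacian,
   with a Jacobi (diagonal) preconditioner as a stand-in. */
#include <stdio.h>
#include <math.h>

#define NG 64                /* grid is NG x NG (the talk used 1024 x 1024) */
#define N  (NG * NG)

/* y = A x for the 5-point Laplacian with Dirichlet boundaries */
static void apply_A(const double *x, double *y)
{
    for (int j = 0; j < NG; j++)
        for (int i = 0; i < NG; i++) {
            int k = j * NG + i;
            double v = 4.0 * x[k];
            if (i > 0)      v -= x[k - 1];
            if (i < NG - 1) v -= x[k + 1];
            if (j > 0)      v -= x[k - NG];
            if (j < NG - 1) v -= x[k + NG];
            y[k] = v;
        }
}

static double dot(const double *a, const double *b)
{
    double s = 0.0;
    for (int k = 0; k < N; k++) s += a[k] * b[k];
    return s;
}

int main(void)
{
    static double x[N], b[N], r[N], z[N], p[N], q[N];

    for (int k = 0; k < N; k++) { x[k] = 0.0; b[k] = 1.0; }

    /* r = b - A x reduces to r = b because x = 0 */
    for (int k = 0; k < N; k++) r[k] = b[k];
    for (int k = 0; k < N; k++) z[k] = r[k] / 4.0;  /* Jacobi: z = D^{-1} r */
    for (int k = 0; k < N; k++) p[k] = z[k];

    double rz = dot(r, z);
    for (int it = 0; it < 1000; it++) {
        apply_A(p, q);
        double alpha = rz / dot(p, q);
        for (int k = 0; k < N; k++) { x[k] += alpha * p[k]; r[k] -= alpha * q[k]; }
        if (sqrt(dot(r, r)) < 1e-8) {
            printf("converged in %d iterations\n", it + 1);
            break;
        }
        for (int k = 0; k < N; k++) z[k] = r[k] / 4.0;
        double rz_new = dot(r, z);
        double beta = rz_new / rz;
        rz = rz_new;
        for (int k = 0; k < N; k++) p[k] = z[k] + beta * p[k];
    }
    return 0;
}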
A Variety of Discipline Codes - Single Processor Performance, Origin vs. T3E
[Chart: Single Processor MFLOPS (0 to 160), Origin vs. T3E, for the codes below]
QMC
RIEMANN
Laplace
QCD
PPM
PIMC
ZEUS
Alliance PACS Origin2000 Repository
http://scv.bu.edu/SCV/Origin2000/
Kadin Tseng, BU; Gary Jensen, NCSA; Chuck Swanson, SGI
John Connolly, U Kentucky, Developing Repository for HP Exemplar
Simulation of the Evolution of the Universe on a Massively Parallel Supercomputer
12 Billion Light Years 4 Billion Light Years
Virgo Project - Evolving a Billion Pieces of Cold Dark Matter in a Hubble Volume - 688-processor CRAY T3E at Garching Computing Centre of the Max-Planck-Society
http://www.mpg.de/universe.htm
Limitations of Uniform Grids for Complex Scientific and Engineering Problems
Source: Greg Bryan, Mike Norman, NCSA
512x512x512 Run on 512-node CM-5
Gravitation Causes Continuous Increase in Density Until There is a Large Mass in a Single Grid Zone
Use of Shared Memory Adaptive Grids To Achieve Dynamic Load Balancing
Source: Greg Bryan, Mike Norman, John Shalf, NCSA
64x64x64 Run with Seven Levels of Adaption on SGI Power Challenge, Locally Equivalent to 8192x8192x8192 Resolution
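Each level of adaption doubles the local resolution, so seven levels on the 64³ base grid yield 64 x 2^7 = 8,192 zones per dimension, hence the 8192³ equivalence.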
NCSA Visualization - VRML Viewers
John Shalf on Greg Bryan Cosmology AMR Data
http://infinite-entropy.ncsa.uiuc.edu/Projects/AmrWireframe/
NT Workstation Shipments Rapidly Surpassing UNIX
[Chart: Workstations Shipped (Millions, 0 to 1.4), 1995-1997: UNIX vs. NT]
Source: IDC, Wall Street Journal, 3/6/98
Current Alliance LES NT Cluster Testbed - Compaq Computer and Hewlett-Packard
• Schedule of NT Supercluster Goals
– 1998 Deploy First Production Clusters
– Scientific and Engineering Tuned Cluster
– Andrew Chien, Alliance Parallel Computing Team
– Rob Pennington, NCSA C&C
– Currently 256 Processors of HP and Compaq Pentium II SMPs
– Data Intensive Tuned Cluster
– 1999 Enlarge to 512 Processors in Cluster
– 2000 Move to Merced
– 2002-2005 Achieve Teraflop Performance
• UNIX/RISC & NT/Intel Will Co-exist for 5 Years
– 1998-2000 Move Applications to NT/Intel
– 2000-2005 Convergence Toward NT/Merced
First Scaling Testing of ZEUS-MP on CRAY T3E and Origin vs. NT Supercluster
“Supercomputer performance at mail-order prices” -- Jim Gray, Microsoft
access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html
ZEUS-MP Hydro Code Running Under MPI
• Alliance Cosmology Team
• Andrew Chien, UIUC
• Rob Pennington, NCSA
[Chart: Single Processor Speed on ZEUS-MP (MFLOPS, 0 to 140): T3E, Origin, NT]
[Chart: GFLOPS (0 to 8) vs. Processors (0 to 200) for the systems below]
T3E
Origin
NT/Intel
NCSA NT Supercluster Solving Navier-Stokes Kernel
Preconditioned Conjugate Gradient Method With Multi-level Additive Schwarz Richardson Pre-conditioner
(2D 1024x1024)
Single Processor Performance: MIPS R10k 117 MFLOPS, Intel Pentium II 80 MFLOPS
Danesh Tafti, Rob Pennington, Andrew Chien, NCSA
[Chart: Speedup vs. Processors (0 to 60): NT MPI, Origin MPI, Origin SM, against Perfect scaling]
[Chart: Gigaflops (0 to 7) vs. Processors (0 to 70): NT MPI, Origin MPI, Origin SM]
Near Perfect Scaling of Cactus - 3D Dynamic Solver for the Einstein GR Equations
[Chart: Scaling vs. Processors (0 to 120): Origin vs. NT SC]
Ratio of GFLOPs: Origin = 2.5x NT SC
Danesh Tafti, Rob Pennington, Andrew Chien, NCSA
Cactus was Developed by Paul Walker, MPI-Potsdam; UIUC; NCSA
NCSA Symbio - A Distributed Object Framework Bringing Scalable Computing to NT Desktops
http://access.ncsa.uiuc.edu/Features/Symbio/Symbio.html
• Parallel Computing on NT Clusters
– Briand Sanderson, NCSA
– Microsoft Co-Funds Development
• Features
– Based on Microsoft DCOM
– Batch or Interactive Modes
– Application Development Wizards
• Current Status & Future Plans
– Symbio Developer Preview 2 Released
– Princeton University Testbed
The Road to Merced
http://developer.intel.com/solutions/archive/issue5/focus.htm#FOUR
vBNS Connected Alliance Site
vBNS Alliance Site Scheduled for Connection
NCSA
FY98 Assembling the Links in the Grid with NSF’s vBNS Connections Program
StarTAP
27 Alliance sites running...
…16 more in progress.
vBNS Backbone Node
1999: Expansion via Abilene - vBNS & Abilene at 2.4 Gbit/s
Source: Charlie Catlett, Randy Butler, NCSA
NCSA Distributed Applications Support Team for vBNS
Globus Ubiquitous Supercomputing Testbed (GUSTO)
• Alliance Middleware for the Grid - Distributed Computing Team
• GII Next Generation Winner
• SF Express -- NPACI / Alliance DoD Mod. Demonstration
– Largest Distributed Interactive Simulation Ever Performed
• The Grid: Blueprint for a New Computing Infrastructure
– Edited by Ian Foster and Carl Kesselman, July 1998
• IEEE Symposium on High Performance Distributed Computing
– July 29-31, 1998, Chicago, Illinois
• NASA IPG Most Recent Funding Addition
Alliance National Technology Grid Workshop and Training Facilities
Powered by Silicon Graphics, Linked by the NSF vBNS
Jason Leigh and Tom DeFanti, EVL; Rick Stevens, ANL
Using NCSA Virtual Director to Explore Structure of Density Isosurfaces of 256³ MHD Star Formation
Simulation by Dinshaw Balsara, NCSA, Alliance Cosmology Team
Visualization by Bob Patterson, NCSA
Red Iso = 4x Mean Density
Yellow Iso = 8x Mean Density
Red Iso = 12x Mean Density
Isosurface models generated by Vis5d
Choreographed with Cave5D/VirDir
Rendered with Wavefront on SGI Onyx
Linking CAVE to Vis5D = CAVE5D Then Use Virtual Director to Analyze Simulations
Donna Cox, Robert Patterson, Stuart Levy, NCSA Virtual Director Team
Java 3D API HPC Application: VisAD
Environ. Hydrology Team (Bill Hibbard, Wisconsin)
Steve Pietrowicz, NCSA Java Team
Standalone or CAVE-to-Laptop Collaborative
Environmental Hydrology Collaboration: From CAVE to Desktop
NASA IPG is Adding Funding To Collaborative Java3D
Caterpillar’s Collaborative Virtual Prototyping Environment
Data courtesy of Valerie Lehner, NCSA
Real Time Linked VR and Audio-Video Between NCSA and Germany
Using SGI Indy/Onyx and HP Workstations