Cray ORNL Climate Workshop 2009 Talk

TRANSCRIPT

Page 1: Cray ORNL Climate Workshop 2009 Talk

Page 2:

• General Cray Update

• Cray Technology Directions

• Perspectives on Petascale Computing


Page 3:

General Cray Update

Page 4:

• Used by leading HPC centers worldwide.

• Significant R&D centered around all aspects of high-end HPC:

• From scalable operating systems,

• To advanced cooling technologies.

• Cray XT MPP architecture has gained mindshare as a prototype for highly scalable computing:

• From tens of teraflops to a petaflop.


Page 5:

• >$700M in contracts in 2006–2008.

• Continued strong international momentum.

• We shipped our 1000th Cray XT cabinet on 4 September 2008 (and 100% are still in service).

• First Petaflop system was delivered to ORNL and accepted by end of 2008.

• 607 TF NSF University of Tennessee system accepted in February 2009.

• $250M DARPA HPCS Phase III contract award.

• R&D agreement with Intel to create custom compute processors with microprocessor technology as part of HPCS Cascade program.


Page 6:

• Over 50 upgrades or systems built:
• >11K compute blades
• >70K sockets (280K cores)
• >390 XT cabinets

• Introduced ECOphlex liquid cooling.

• Lustre file system at ORNL breaks 100 GB/s (some tests break 150 GB/s).
• Plan to hit 240 GB/s.

• ORNL JaguarPF designed, built and delivered on schedule!

• 607 TF NSF University of Tennessee system accepted in February 2009.


Page 7:

• Officially entered full production on 2 February 2009.

• Next scheduled upgrade is in late 2009.

• NSF’s largest supercomputer and the world’s fastest university-managed supercomputer.

• NSF Track 2 award to the University of Tennessee.

• Housed at the University of Tennessee – Oak Ridge National Laboratory Joint Institute for Computational Sciences.


Page 8:

• Installed at the National Center for Computational Sciences (NCCS) at ORNL.
• XT5 with ECOphlex liquid cooling.

• Will enable petascale simulations of:
• Climate science
• High-temperature superconductors
• Fusion reaction for the 100-million-degree ITER reactor

• Not just more of the same, but unprecedented simulations.

• The only open science petascale system in the world.


Page 9:

• A milestone capability that has been in demand by the climate community for years.

• Capabilities highlighted by Raymond Orbach at the November 2008 DoE-sponsored “Challenges in climate change science and role of computing at the extreme scale”.

• For modelers – either you have run at 150,000 cores, or you have not.

• For scientists – unprecedented results in the near‐term and a capability to model unknown phenomena.


Page 10:

Breaks the standing Earth Simulator record for sustained performance on a meteorological application:

• > 50 TF sustained
• 5 km grid spacing over hemispheric domain
• 6-second time step
• 4486 x 4486 x 101 grid (2 billion cells)
• 30-minute forecast took 69 seconds
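As a quick sanity check on these figures (not part of the original slide), the short Python sketch below recomputes the grid size and the implied sustained rate; the 3.5-quadrillion-operation total is taken from slide 28 of this deck.

```python
# Sanity check on the WRF Nature Run numbers quoted above.
cells = 4486 * 4486 * 101        # grid points: ~2.03e9, i.e. "2 billion cells"
steps = (30 * 60) // 6           # 30-minute forecast at a 6-second time step = 300 steps
total_flops = 3.5e15             # "nearly 3.5 quadrillion" operations (slide 28)
wall_seconds = 69                # wall-clock time for the 30-minute forecast

sustained_tf = total_flops / wall_seconds / 1e12
print(f"cells = {cells:,}, steps = {steps}")
print(f"sustained ~ {sustained_tf:.1f} TF")   # ~50.7 TF, consistent with "> 50 TF"
```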


Page 11:

• DoE / NSF Climate End Station (CES):

• An interagency collaboration of NSF and DOE in developing the Community Climate System Model (CCSM) for IPCC AR5.

• A collaboration with NASA in carbon data assimilation.

• A collaboration with university partners with expertise in computational climate research.

• DoE / NOAA MoU:

• DoE to provide NOAA/GFDL with millions of CPU hours.

• Climate change and near real-time high-impact NWP research.

• Prototyping of advanced high-resolution climate models.


Page 12:


Page 13:

Cray Technology Directions

Page 14:

Supercomputing is increasingly about managing massive parallelism:

• Key to overall system performance, not only individual application performance.

• Demands investment in technologies to effectively harness parallelism (hardware, software, I/O, management, etc.).

[Chart: Average Number of Processors Per Supercomputer (Top 20 of Top 500), 1994–2007 (Nov). Source: www.top500.org]


Page 15:

Realizing Our Adaptive Supercomputing Vision

[Roadmap diagram: scalar (Cray XT4, Cray XT5 & XT5h), vector, multithreaded (Cray XMT) and deskside (CX1) product lines converging toward the future “Baker”, “Baker”+, “Marble” and “Granite” systems.]

Page 16:

• Scalable System Performance: enabling scalability through technologies to manage and harness parallelism.

• In particular, interconnect and O/S technologies.

• Cray is a pioneer and leader in lightweight kernel design.

• System performance as a whole to support a petascale infrastructure.

• Green Computing: packaging, power consumption and ECOphlex cooling.

• Simple expansion and in-cabinet upgrade to the latest technologies, preserving investment in existing infrastructure.

• Lowest possible total electrical footprint.

• Provide investment protection and a growth path through R&D into future technologies: the HPCS Cascade program.


Page 17:

[Diagram: Unified system architecture designed for performance and management at scale. Microprocessor-based compute (C) and service (S) nodes sit on a Cray custom interconnect arranged as a 3-D torus; service nodes act as I/O, login, network and infrastructure nodes.]
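To make the 3-D torus topology in the diagram concrete, here is a minimal, hypothetical Python sketch (not Cray software) that computes a node's six nearest neighbors with wraparound links; the node coordinates and dimensions are illustrative only.

```python
# Illustrative only: neighbor calculation on a 3-D torus interconnect.
# Each node has exactly six nearest neighbors; links wrap around at the
# edges of the mesh, which is what distinguishes a torus from a plain mesh.
def torus_neighbors(node, dims):
    """Return the six neighbor coordinates of `node` on a torus of size `dims`."""
    x, y, z = node
    nx, ny, nz = dims
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return [((x + dx) % nx, (y + dy) % ny, (z + dz) % nz) for dx, dy, dz in offsets]

# Example: even a corner node of a toy 4 x 4 x 4 torus has six neighbors.
print(torus_neighbors((0, 0, 0), (4, 4, 4)))
```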


Page 18:

Cray XT5 node characteristics:

• Number of cores: 8
• Peak performance: 73–86 Gflops/s
• Memory size: 8–32 GB per node
• Memory bandwidth: 25.6 GB/sec (direct-connect memory)
• Interconnect: Cray SeaStar2+, 6.4 GB/sec direct-connect HyperTransport
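A rough check of the peak-performance row above, as a hedged Python sketch: the 73–86 Gflops/s range follows from 8 cores at 4 double-precision FLOPs per core per cycle. The specific clock rates below are assumptions about typical Opteron parts of that era, not values stated on the slide.

```python
# Approximate peak node performance for the table above.
# Assumption: 4 double-precision FLOPs per core per cycle (128-bit SSE,
# two adds + two multiplies); the clock rates below are illustrative guesses.
CORES_PER_NODE = 8
FLOPS_PER_CYCLE = 4

def peak_gflops(clock_ghz):
    """Peak node performance in Gflops/s at the given core clock."""
    return clock_ghz * FLOPS_PER_CYCLE * CORES_PER_NODE

for ghz in (2.3, 2.7):   # assumed low/high clock bins
    print(f"{ghz} GHz -> {peak_gflops(ghz):.1f} Gflops/s")
# 2.3 GHz -> 73.6 and 2.7 GHz -> 86.4, bracketing the quoted 73-86 Gflops/s
```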


Page 19:


Page 20:

• The Cray XT architecture is uniquely positioned to provide both:

• Single-model performance at scale – deterministic forecasts.

• Throughput performance – ensemble modeling.

• In a single, unified system architecture, with no degradation in system efficiency.

• This maximizes:

• Model development and execution.
• System administration.
• Facilities and total cost of ownership.


Page 21:

[Chart: parallel scaling on the Cray custom SeaStar2+ interconnect at 4x the number of cores. From the paper “Characterizing Parallel Scaling of Scientific Applications using IPM” by Nicholas J. Wright, Wayne Pfeiffer, and Allan Snavely, Performance Modeling and Characterization Lab, San Diego Supercomputer Center.]

Page 22:


Page 23:

Perspectives on Petascale Computing

Page 24:

Science:

• Improved scientific representation; increased scientific complexity.
• Increased observational data and data assimilation.
• Increased resolution and integration lengths.
• Use of ensembles and inter-disciplinary coupled modeling.

Today:

• Cost per grid point is increasing due to physics.
• Code complexity is increasing.
• Not all components run at the same efficiency.
• Greater degree of dependence on the overall computing, data and support environment.

High Performance Computing and Modeling:

• Algorithmic improvements (applied math & domain specific).
• Computer science improvements; raw processing speed.
• In the past, speed came from exponential growth in single-CPU performance; today it comes from exponential growth in parallelism.
• Efficient scalability is the challenge; every model and system aspect must be addressed.
• Developer access to scalable systems is essential.
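To make the scalability point concrete, here is a small illustrative Python sketch (not from the talk) applying Amdahl's law: even a tiny serial fraction caps the achievable speedup long before core counts like 150,000 pay off. The serial fractions used are hypothetical.

```python
# Amdahl's law: illustrative numbers only, chosen to show why efficient
# scalability, not raw core count, is the challenge at the petascale.
def amdahl_speedup(serial_fraction, cores):
    """Ideal speedup on `cores` processors when `serial_fraction` of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for s in (0.01, 0.001, 0.0001):          # hypothetical serial fractions
    print(f"serial fraction {s:.2%}: speedup on 150,000 cores ~ "
          f"{amdahl_speedup(s, 150_000):,.0f}x")
# ~100x at 1% serial work, ~993x at 0.1%, ~9,375x at 0.01% --
# far below the 150,000x an embarrassingly parallel code could reach.
```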


Page 25:

• There remains a tremendous number of models and application areas that have not yet reached even the terascale level.

• Those that can scale have benefited from a focused, iterative multi‐year algorithmic optimization effort:

• Optimization strategies do not remain stagnant and must take advantage of evolving hardware and software technologies.

• Ongoing access to scalable, leadership-class systems and support is essential:

• Either you have actually run on 5K, 10K, 20K, 50K… processors, or you have not! Theory vs. reality.


Page 26:

• Three basic paths are available to users today:
• Standards-based MPP
• Low power
• Accelerators

• Cray Cascade program is intended to bring these together to provide greater flexibility within a single architecture.

• Will be accomplished through:
• Continued investment in core MPP and emerging technologies.
• Continued investment in customer relationships.

• Cray’s wealth of experience in pushing the boundaries of scalability will continue to positively impact the entire HPC community.


Page 27:

• HPC is Cray’s only business.

• Cray’s MPP technologies are playing a key role in supporting the HPC community in preparing for and using Petascale capabilities.

• Cray’s wealth of experience in pushing the boundaries of scalability will continue to positively impact the entire HPC community.


Page 28:

1950 NWP Forecasts on ENIAC:

• 736 km grid spacing over U.S.
• 15 x 18 x 1 grid
• 1,000,000 computations
• 24-hour forecast took 24 hours

2008 WRF Nature Run on Cray XT:

• 5 km grid spacing over hemispheric domain
• 4486 x 4486 x 101 grid (2 billion cells)
• Nearly 3.5 quadrillion floating-point operations
• 30-minute forecast took 69 seconds
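A back-of-the-envelope Python comparison of the two runs above (added here for illustration, using only the figures on this slide) shows roughly how far sustained NWP throughput has come.

```python
# Rough throughput comparison using the figures quoted on this slide.
eniac_ops = 1_000_000            # "1,000,000 computations"
eniac_seconds = 24 * 3600        # 24-hour forecast took 24 hours of wall-clock time
wrf_flops = 3.5e15               # "nearly 3.5 quadrillion floating-point operations"
wrf_seconds = 69                 # 30-minute forecast took 69 seconds

eniac_rate = eniac_ops / eniac_seconds   # ~11.6 operations/s
wrf_rate = wrf_flops / wrf_seconds       # ~5.1e13 FLOP/s, i.e. ~50 TF sustained
print(f"ENIAC: {eniac_rate:.1f} ops/s   WRF on Cray XT: {wrf_rate:.2e} FLOP/s")
print(f"ratio: ~{wrf_rate / eniac_rate:.1e}")   # roughly 4e12 times the rate
```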


Page 29:

Thank you for your attention