Climate Research at the National Energy Research Scientific Computing Center (NERSC)
Bill Kramer, Deputy Director and Head of High Performance Computing
CAS 2001, October 30, 2001


TRANSCRIPT

Page 1: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer

Climate Research at the National Energy Research Scientific Computing Center (NERSC)

Bill Kramer, Deputy Director and Head of High Performance Computing

CAS 2001, October 30, 2001

Page 2: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Vision

NERSC strives to be a world leader in accelerating scientific discovery through computation. Our vision is to provide high-performance computing tools and expertise to tackle science's biggest and most challenging problems, and to play a major role in advancing large-scale computational science and computer science.

Page 3: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Outline

• NERSC-3: Successfully fielding the world’s most powerful unclassified computing resource

• The NERSC Strategic Proposal: An Aggressive Vision for the Future of the Flagship Computing Facility of the Office of Science

• Scientific Discovery through Advanced Computing (SciDAC) at NERSC

• Support for Climate Computing at NERSC: Ensuring Success for the National Program

Page 4: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


FY00 MPP Users/Usage by Scientific Discipline

NERSC FY00 MPP Users by Discipline

NERSC FY00 MPP Usage by Discipline

Page 5: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC FY00 Usage by Site

MPP Usage

PVP Usage

Page 6: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


FY00 Users/Usage by Institution Type

Page 7: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Computing Highlights for FY 01

• NERSC 3 is in full and final production – exceeding original capability by more than 30% and with much larger memory.

• Increased total FY 02 allocations of computer time by 450% over FY01.

• Activated the new Oakland Scientific Facility
• Upgraded the NERSC network connection to 655 Mbit/s (OC12) – ~4 times the previous bandwidth
• Increased archive storage capacity with 33% more tape slots and double the number of tape drives
• PDSF, T3E, SV1s, and other systems all continue operating very well

Page 8: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Oakland Scientific Facility

• 20,000 sf computer room; 7,000 sf office space
  — 16,000 sf computer space built out
  — NERSC occupying 12,000 sf
• Ten-year lease with 3 five-year options
• $10.5M computer room construction costs
• Option for additional 20,000+ sf computer room

Page 9: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


HPSS Archive Storage

[Charts: Monthly I/O by Month and System (TB), File Counts by Date and System (millions of files), and Cumulative Storage by Month and System (TB), from October 1998 through April 2001, for the Archive, User/Regent, and Backup systems]

• 190 Terabytes of data in the storage systems
• 9 million files in the storage systems
• Average of 600-800 GB of data transferred per day; peak 1.5 TB
• Average of 18,000 files transferred per day; peak 60,000
• 500-600 tape mounts per day; peak 2,000 (12 per system)

Page 10: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC-3 Vital Statistics

• 5 Teraflop/s peak performance; 3.05 Teraflop/s with Linpack (a quick consistency check on these figures follows the list)
  — 208 nodes, 16 CPUs per node at 1.5 Gflop/s per CPU
  — “Worst case” Sustained System Performance measure: 0.358 Tflop/s (7.2%)
  — “Best case” Gordon Bell submission: 2.46 Tflop/s on 134 nodes (77%)
• 4.5 TB of main memory
  — 140 nodes with 16 GB each, 64 nodes with 32 GB, and 4 nodes with 64 GB
• 40 TB total disk space
  — 20 TB formatted shared, global, parallel file space; 15 TB local disk for system use
• Unique 512-way double/single switch configuration
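As a quick consistency check on the figures above (using only numbers from this slide, and rounding as the slide does):

  Peak: 208 nodes x 16 CPUs/node x 1.5 Gflop/s per CPU = 4,992 Gflop/s, i.e. roughly 5 Tflop/s
  Memory: 140 x 16 GB + 64 x 32 GB + 4 x 64 GB = 4,544 GB, i.e. roughly 4.5 TB across all 208 nodes
  SSP fraction of peak: 0.358 Tflop/s / 4.992 Tflop/s ≈ 7.2%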

Page 11: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Two Gordon Bell-Prize Finalists Are Using NERSC-3

• Materials Science -- 2016-atom supercell models for spin dynamics simulations of the magnetic structure of an iron-manganese/cobalt interface. Using 2,176 processors of NERSC-3, the code sustained 2.46 teraflop/s. – M. Stocks and team at ORNL and U. Pittsburgh, with A. Canning at NERSC

• Climate Modeling -- Shallow Water Climate Model sustained 361 Gflop/s (12%) – S. Thomas et al., NCAR.

Section of an FeMn/Co interface shows a new magnetic structure that is different from the magnetic structure of pure FeMn.

Page 12: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC System Architecture

[System architecture diagram. Components shown: IBM SP NERSC-3 Phase 2 (2,532 processors, 1,824 GB of memory, 32 TB of disk); CRI T3E900 644/256; CRI SV1; HPSS archive storage with IBM and STK robots; DPSS; PDSF; the LBNL Millennium cluster, a research cluster, and the LBNL cluster; SGI; MAX STRAT; the visualization lab, remote visualization server, and symbolic manipulation server; interconnected by FDDI/Ethernet (10/100/Gigabit) and HIPPI networks, with a connection to ESnet]

Page 13: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Strategic Proposal

An Aggressive Vision for the Future of the Flagship Computing Facility of the Office of Science

Page 14: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


The NERSC Strategic Proposal

• Requested in February 2001 by the Office of Science as a proposal for the next five years of the NERSC Center and Program

• Proposal and Implementation Plan delivered to OASCR at the end of May, 2001

• The proposal plays to NERSC’s strengths, but anticipates rapid and broad changes in scientific computing.

• Results of the DOE review are expected in late November or December 2001

Page 15: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Page 16: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


High-End Systems: A Carefully Researched Plan for Growth

A three-year procurement cycle for leading-edge computing platforms

Balanced Systems, with appropriate data storage and networking

Page 17: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Page 18: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Support for the DOE Scientific Discovery through Advanced Computing (SciDAC)

Page 19: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Scientific Discovery Through Advanced Computing

DOE science programs need dramatic advances in simulation capabilities to meet their mission goals.

[Diagram of application areas: Subsurface Transport; Global Systems; Health Effects and Bioremediation; Fusion Energy; Combustion; Materials]

Page 20: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


LBNL/NERSC SciDAC Portfolio – Project Leadership

Project name; principal investigator; partner institutions; annual funding:

• Scientific Data Mgmt Center (ISIC). PI: Shoshani. Partners: ANL, LLNL, ORNL, UC San Diego, Georgia Institute of Tech, Northwestern Univ, North Carolina State Univ. Annual funding: $624,000
• Applied Partial Differential Equations Center (ISIC). PI: Colella. Partners: LLNL, Univ of Washington, North Carolina, Wisconsin, UC Davis, NYU. Annual funding: $1,700,000
• Performance Evaluation Research Center (ISIC). PI: Bailey. Partners: ORNL, ANL, LLNL, Univ of Maryland, Tennessee, Illinois at Urbana-Champaign, UC San Diego. Annual funding: $276,000
• DOE Science Grid: Enabling and Deploying the SciDAC Collaboratory Software Environment. PI: Johnston. Partners: ORNL, ANL, NERSC, PNNL. Annual funding: $510,000
• Advanced Computing for the 21st Century Accelerator Science Technology. PI: Ryne. Partners: NERSC, SLAC. Annual funding: $650,000

Page 21: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Applied Partial Differential Equations ISIC

• New algorithmic capabilities with high-performance implementations on high-end computers:

— Adaptive mesh refinement
— Cartesian grid embedded boundary methods for complex geometries
— Fast adaptive particle methods

• Close collaboration with applications scientists

• Common mathematical and software framework for multiple applications

Participants: LBNL (J. Bell, P. Colella), LLNL, Courant Institute, Univ. of Washington, Univ. of North Carolina, Univ. of California, Davis, Univ. of Wisconsin.

Developing a new algorithmic and software framework for solving partial differential equations in core mission areas.

Page 22: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Scientific Data Management ISIC

Participants: ANL, LBNL, LLNL, ORNL, Georgia Tech, NCSU, NWU, SDSC

[Diagram: scientific simulations and experiments feed petabytes to tape and terabytes to disk for scientific analysis and discovery; today roughly 80% of the time goes to data manipulation and roughly 20% to analysis, while with SDM-ISIC technology the split shifts to roughly 20% data manipulation and 80% analysis]

Data manipulation includes:
• Getting files from the tape archive
• Extracting a subset of data from files (a small illustrative sketch follows below)
• Reformatting data
• Getting data from heterogeneous, distributed systems
• Moving data over the network

SDM-ISIC technology:
• Optimizing shared access from mass storage systems
• Metadata and knowledge-based federations
• API for Grid I/O
• High-dimensional cluster analysis
• High-dimensional indexing
• Adaptive file caching
• Agents

Goals: optimize and simplify
• Access to very large data sets
• Access to distributed data
• Access of heterogeneous data
• Data mining of very large data sets
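The "extracting a subset of data from files" task above is the kind of operation the SDM tools aim to streamline. As a minimal illustration, using only the standard netCDF C API rather than any SDM-ISIC software, and with a hypothetical file name, variable name, and array shape, a hyperslab read pulls one time level of a 3-D field instead of the whole dataset:

/* Sketch of the "extract a subset of data from a file" step in the SDM
 * pipeline.  File name, variable name, and grid shape are assumptions;
 * only the standard netCDF-3 C API is used. */
#include <netcdf.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int ncid, varid;
    if (nc_open("simulation_output.nc", NC_NOWRITE, &ncid) != NC_NOERR) {
        fprintf(stderr, "cannot open file\n");
        return 1;
    }
    nc_inq_varid(ncid, "temperature", &varid);

    /* Hyperslab: time step 10, full 180 x 360 lat-lon slice (assumed shape). */
    size_t start[3] = {10, 0, 0};
    size_t count[3] = {1, 180, 360};
    float *slice = malloc(180 * 360 * sizeof(float));
    nc_get_vara_float(ncid, varid, start, count, slice);

    printf("first value of the extracted slice: %f\n", slice[0]);
    free(slice);
    nc_close(ncid);
    return 0;
}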

Page 23: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


SciDAC Portfolio – NERSC as a Collaborator

Project name; NERSC co-principal investigator; lead PI and institution; annual funding:

• DOE Science Grid: Enabling and Deploying the SciDAC Collaboratory Software Environment. Co-PI: Kramer. Lead PI: Johnston, LBNL. Annual funding: $225,000
• Scalable Systems Software Enabling Technology Center. Co-PI: Hargrove. Lead PI: Al Geist, ORNL. Annual funding: $198,000
• Advanced Computing for the 21st Century Accelerator Science Technology. Co-PI: Ng. Lead PI: Robert Ryne, LBNL. Annual funding: $200,000
• Terascale Optimal PDE Simulations Center (TOPS Center). Co-PI: Ng. Lead PIs: Barry Smith and Jorge More, ANL. Annual funding: $516,000
• Earth Sys Grid: The Next Generation – Turning Climate Datasets into Community Resources. Co-PI: Shoshani. Lead PI: Ian Foster, ANL. Annual funding: $255,000
• Particle Physics Data Grid Collaboratory Pilot. Co-PI: Shoshani. Lead PI: Richard Mount, SLAC. Annual funding: $405,000
• Collaborative Design and Development of the Community Climate System Model for Terascale Computers. Co-PI: Ding. Lead PIs: Malone and Drake (LANL, ORNL). Annual funding: $400,000

Page 24: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Strategic Project Support

• Specialized Consulting Support
  — Project Facilitator Assigned
    • Help defining project requirements
    • Help with getting resources
    • Code tuning and optimization
  — Special Service Coordination
    • Queues, throughput, increased limits, etc.
• Specialized Algorithmic Support
  — Project Facilitator Assigned
    • Develop and improve algorithms
    • Performance enhancement
  — Coordination with ISICs to represent work and activities

Page 25: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Strategic Project Support

• Special Software Support
  — Projects can request support for packages and software that are special to their work and not as applicable to the general community
• Visualization Support
  — Apply NERSC visualization software to projects
  — Develop and improve methods specific to the projects
  — Support any project visitors who use the local LBNL visualization lab
• SciDAC Conference and Workshop Support
  — NERSC staff will provide content and presentations at project events
  — Provide custom training at project events
  — NERSC staff attend and participate at project events

Page 26: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Strategic Project Support

• Web Services for interested projects
  — Provide areas on NERSC web servers for interested projects
    • Password-protected areas as well
    • Safe “sandbox” area for dynamic script development
  — Provide web infrastructure
    • Templates, structure, tools, forms, dynamic data scripts (cgi-bin)
  — Archive for mailing lists
  — Provide consulting support to help projects organize and manage web content
• CVS Support
  — Provide a server area for interested projects
    • Backup, administration, access control
  — Provide access to code repositories
  — Help projects set up and manage code repositories

Page 27: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Strategic Project Area Facilitators

Area / User Services Facilitator / Scientific Computing Facilitator
• Fusion: David Turner / Dr Jodi Lamoureux
• QCD: Dr Majdi Baddourah / Dr Jodi Lamoureux
• Experimental Physics: Dr Iwona Sakrejda / Dr Jodi Lamoureux
• Astrophysics: Dr Richard Gerber / Dr Peter Nugent
• Accelerator Physics: Dr Richard Gerber / Dr Esmond Ng
• Chemistry: Dr David Skinner / Dr Lin Wang
• Life Science: Dr Jonathan Carter / Dr Chris Ding
• Climate: Dr Harsh Anand Passi / Dr Chris Ding
• Computer Science: Thomas Deboni / Dr Parry Husbands and Dr Osni Marques (for CCA)
• Applied Math: Dr Majdi Baddourah / Dr Chao Yang

Page 28: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Support for Climate Research

Ensuring Success for the National Program

Page 29: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Climate Projects at NERSC

• 20+ projects from the base MPP allocations, with about 6% of the entire base resource
• Two Strategic Climate Projects
  — High Resolution Global Coupled Ocean/Sea Ice Modeling – Matt Maltrud @ LANL
    • 5% of total SP hours (920,000 wall clock hours)
    • “Couple high resolution ocean general circulation model with high resolution dynamic thermodynamic sea ice model in a global context.”
    • 1/10th degree (3 to 5 km in polar regions)
  — Warren Washington, Tom Bettge, Tony Craig, et al.
    • PCM coupler

Page 30: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Early Scientific Results Using NERSC-3

• Climate Modeling – 50 km resolution global climate simulation run in a 3-year test. Proved that the model is robust to a large increase in spatial resolution. Highest spatial resolution ever used: 32 times more grid cells than ~300 km grids, taking about 200 times as long (see the scaling note below). – P. Duffy, LLNL

Reaching Regional Climate Resolution
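A rough scaling note, not from the slide but consistent with its numbers: going from ~300 km to 50 km grid spacing is a factor of 6 refinement in each horizontal direction, so the cell count grows by about 6 x 6 = 36 (close to the quoted factor of 32, which depends on the exact grid layout), and because the time step must also shrink roughly in proportion to the grid spacing, the total cost grows by about 6 x 6 x 6 = 216, in line with the quoted factor of ~200.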

Page 31: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Some other Climate Projects NERSC staff have helped with

• Richard Loft, Stephen Thomas, and John Dennis, NCAR – Using 2,048 processors on NERSC-3, demonstrated that the dynamical core of an atmospheric general circulation model (GCM) can be integrated at a rate of 130 years per day

• Inez Fung (UCB) – Using CSM to build a carbon-climate simulation package on the SV1

• Mike Wehner – Using CCM for large-scale ensemble simulations on the T3E

• Doug Rotman – Atmospheric chemistry/aerosol simulations

• Tim Barnett and Detlef Stammer – PCM runs on the T3E and SP

Page 32: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


ACPI/Avantgarde/SciDAC

• Work done by Chris Ding and team
  — Comprehensive performance analysis of GPFS on the IBM SP (supported by Avant Garde)
  — I/O performance analysis; see http://www.nersc.gov/research/SCG/acpi/IO/
  — Numerical reproducibility and stability
  — MPH: a library for a distributed multi-component environment (a conceptual sketch follows)
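MPH's own API is not reproduced here. The sketch below is only a minimal illustration, with an assumed two-component layout (atmosphere and ocean) and an assumed rank split, of the core idea behind a distributed multi-component environment: each model component gets its own MPI communicator carved out of MPI_COMM_WORLD, so it can run its own solver on its own processor pool while inter-component exchanges still go through the world communicator.

/* Minimal sketch (not the MPH API): splitting MPI_COMM_WORLD into
 * per-component communicators, the kind of bookkeeping MPH packages up
 * for multi-component climate runs.  The rank layout is illustrative. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int world_rank, world_size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Assumed layout: first half of the ranks run the atmosphere component,
     * second half run the ocean component. */
    int color = (world_rank < world_size / 2) ? 0 : 1;

    MPI_Comm comp_comm;                /* communicator local to this component */
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &comp_comm);

    int comp_rank;
    MPI_Comm_rank(comp_comm, &comp_rank);
    printf("world rank %d -> %s rank %d\n",
           world_rank, color == 0 ? "atmosphere" : "ocean", comp_rank);

    /* Each component now runs its own solver on comp_comm; inter-component
     * exchanges (e.g., surface fluxes) still go through MPI_COMM_WORLD. */
    MPI_Comm_free(&comp_comm);
    MPI_Finalize();
    return 0;
}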

Page 33: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Special Support for Climate Computing

NCAR CSM version 1.2
• NERSC was the first site to port NCAR CSM to a non-NCAR Cray PVP machine
• Main users: Inez Fung (UCB) and Mike Wehner (LLNL)

NCAR CCM3.6.6
• Independent of CSM, NERSC ported NCAR CCM3.6.6 to the NERSC Cray PVP cluster
• See http://hpcf.nersc.gov/software/apps/climate/ccm3/

Page 34: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Special Support for Climate Computing – cont.

• T3E netCDF parallelization
  — NERSC solicited user input to define parallel I/O requirements for the MOM3, LAN, and CAMILLE climate models (Ron Pacanowski, Venkatramani Balaji, Michael Wehner, Doug Rotman, and John Tannahill)
  — Development of netCDF parallelization on the T3E was done by Dr. R.K. Owen at NERSC/USG based on the modelers’ requirements:
    • Better I/O performance
    • Master/slave read/write capability
    • Support for a variable unlimited dimension
    • Allow a subset of PEs to open/close a netCDF dataset
    • User-friendly API
    • Etc.
  — Demonstrated netCDF parallel I/O usage by building model-specific I/O test cases (MOM3, CAMILLE)
  — The netCDF 3.5 official UNIDATA release includes “added support provided by NERSC for multiprocessing on Cray T3E.” http://www.unidata.ucar.edu/packages/netcdf/release-notes-3.5.0.html
• Parallel netCDF for the IBM SP is under development by Dr. Majdi Baddourah of NERSC/USG (a generic master/slave write sketch follows)
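The NERSC T3E netCDF extensions themselves are not shown here; the sketch below is only a generic illustration of the "master/slave write" pattern the requirements above describe, using the standard serial netCDF-3 C API plus MPI (the field name, sizes, and file name are assumptions): every PE computes its slab, the slabs are gathered to PE 0, and PE 0 writes a single netCDF file.

/* Generic master/slave netCDF write sketch (error checks omitted for brevity). */
#include <mpi.h>
#include <netcdf.h>
#include <stdlib.h>

#define NLOCAL 64                 /* points owned by each PE (assumed) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each PE fills its local slab of the field. */
    float local[NLOCAL];
    for (int i = 0; i < NLOCAL; i++)
        local[i] = (float)(rank * NLOCAL + i);

    /* Slaves send to the master; the master assembles the global array. */
    float *global = NULL;
    if (rank == 0)
        global = malloc((size_t)nprocs * NLOCAL * sizeof(float));
    MPI_Gather(local, NLOCAL, MPI_FLOAT,
               global, NLOCAL, MPI_FLOAT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        int ncid, dimid, varid;
        nc_create("field.nc", NC_CLOBBER, &ncid);
        nc_def_dim(ncid, "x", (size_t)nprocs * NLOCAL, &dimid);
        nc_def_var(ncid, "field", NC_FLOAT, 1, &dimid, &varid);
        nc_enddef(ncid);

        size_t start = 0, count = (size_t)nprocs * NLOCAL;
        nc_put_vara_float(ncid, varid, &start, &count, global);
        nc_close(ncid);
        free(global);
    }

    MPI_Finalize();
    return 0;
}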

Page 35: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Additional Support for Climate

• The Scientific Computing and User Services Groups have staff with a special climate focus
• Received funding for a new climate support person at NERSC, who will:
  — Provide software, consulting, and documentation support for climate researchers at NERSC
  — Port the second generation of NCAR's Community Climate System Model (CCSM-2) to NERSC's IBM SP
  — Put the modified source code under CVS control so that individual investigators at NERSC can access the NERSC version, and modify and manipulate their own source without affecting others
  — Provide necessary support and consultation on operational issues
  — Develop enhancements to netCDF on NERSC machines that benefit NERSC's climate researchers
  — Respond in a timely, complete, and courteous manner to NERSC user clients, and provide an interface between NERSC users and staff

Page 36: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Systems Utilization

[Chart: MPP Charging and Usage, FY 1998-2000. 30-day moving averages of CPU hours by category (Lost Time, Pierre Free, Pierre, GC0, Mcurie, Overhead) plotted from October 1997 through September 2000 against the maximum available CPU hours and 80%, 85%, and 90% goal lines. Annotations mark the point where the systems were merged, two periods of allocation starvation, checkpoint/restart and the start of capability jobs, and full scheduling functionality, with roughly a 4.4% improvement per month.]

• T3E: 95% gross utilization
• IBM SP: 80-85% gross utilization

Page 37: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Systems Run “large” Jobs

[Charts: Mcurie (T3E) MPP time by job size, 30-day moving average of hours from late April through November 2000, and the corresponding IBM SP chart, broken out by job-size bins of <16, 17-32, 33-64, 65-96, 97-128, 129-256, and 257-512 processors]

Page 38: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Balancing Utilization and Turnaround

• NERSC consistently delivers high utilization on MPP systems, while running large applications.

• We are now working with our users to establish methods to provide improved services:
  — Guaranteed throughput for at least a selected group of projects
  — More interactive and debugging resources for parallel applications
  — Longer application runs
  — More options in resource requests

• Because of the special turnaround requirements of the large climate users:
  — NERSC established a queue working group (T. Bettge, Vince Wayland at NCAR)
  — Set up special queue scheduling procedures that provide an agreed-upon amount of turnaround per day when there is work in the queue (Sept. ’01)
  — Will present a plan on job scheduling at the NERSC User Group Meeting, November 12, 2001, in Denver

Page 39: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


Wait times in the “regular” queue

[Chart comparing queue wait times for climate jobs versus all other jobs]

Page 40: Climate Research at the National Energy Research Scientific Computing Center (NERSC) Bill Kramer


NERSC Is Delivering on Its Commitment to Make the Entire DOE Scientific Computing Enterprise Successful

• NERSC sets the standard for effective supercomputing resources
• NERSC is a major player in SciDAC and will coordinate its projects and collaborations
• NERSC is providing targeted support to SciDAC projects
• NERSC continues to provide targeted support for the climate community and is acting on the input and needs of the climate community