Information Technology Infrastructure Committee (ITIC) Report to the NAC
March 8, 2012
Larry Smarr, Chair, ITIC


TRANSCRIPT

Page 1: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Information Technology Infrastructure Committee (ITIC)

Report to the NAC

March 8, 2012

Larry Smarr, Chair, ITIC

Page 2: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Committee Members

Membership:
Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology
Mr. Alan Paller, Research Director, SANS Institute
Dr. Robert Grossman, Professor, University of Chicago
Dr. Charles Holmes, Retired, NASA
Dr. Alexander Szalay, Professor, Johns Hopkins University
Ms. Karen Harper (Exec. Sec.), IT Workforce Manager, Office of the Chief Information Officer, NASA

Committee Met March 6-7, 2012:
Fact Finding at Johns Hopkins
FACA Meeting at NASA HQ

Page 3: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Finding #1

To enable new scientific discoveries in a fiscally constrained environment, NASA must develop more productive IT infrastructure through "frugal innovation" and "agile development":

• As easy to use as Flickr

• Elastic to demand

• Continuous improvement

• More capacity for fixed investment

• Adaptable to changing requirements of multiple missions

• Built-in security that doesn't hinder deployment

Page 4: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NASA is Falling Behind Federal and Non-Federal Institutions

Columns: Big Data CI | 10G/100G Networking | GPU Clusters | Hybrid HPC

Non-Federal: Google/MS/Amazon | GLIF/I2/CENIC | Japan TSUBAME2 (4,224 GPUs, 2.4 PF) | China #2 Fastest (MC/GPU, 5 PF)

NSF: Gordon | GENI Next-Gen Internet | TACC (512 GPUs) | Blue Waters* (MC/GPU, 12 PF)

DOE: Magellan | ANI (ARRA, 100 Gb) | ANL (256 GPUs) | NG Jaguar* (MC/GPU, 20 PF)

NASA: Nebula Testbed | Goddard to Ames 10G | Ames (136 GPUs; 2 x 64 at Ames & GSFC) | Pleiades (MC, 1 PF)

* Later in 2012

Page 5: Information Technology Infrastructure Committee (ITIC) Report to the NAC

SMD is a Growing NASA HPC User Community

[Chart: projected growth in SMD HPC usage]

Source: Tsengdar Lee, Mike Little, NASA

Page 6: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Leading Edge is Moving to Hybrid Processors: Requiring Major Software Innovations

“With Titan’s arrival, fundamental changes to computer architectures will challenge researchers from every scientific discipline.”

Source: SC11 press release (Marketwire), Seattle, WA, 11/14/2011

Page 7: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Partnering Opportunities with NSF: SDSC's Gordon, Dedicated Dec. 5, 2011

Data-Intensive Supercomputer Based on SSD Flash Memory and Virtual Shared Memory SW
Emphasizes MEM and IOPS over FLOPS
Supernode has Virtual Shared Memory:
  2 TB RAM Aggregate
  8 TB SSD Aggregate
Total Machine = 32 Supernodes
4 PB Disk Parallel File System, >100 GB/s I/O

System Designed to Accelerate Access to Massive Datasets being Generated in Many Fields of Science, Engineering, Medicine, and Social Science

Source: Mike Norman, Allan Snavely SDSC
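
As a rough sanity check of the figures above, a minimal Python sketch (the per-supernode values are taken from the slide; the machine-wide totals are derived arithmetic, not stated in the source):

# Gordon configuration as stated on the slide (per-supernode aggregates).
supernodes = 32
ram_per_supernode_tb = 2      # TB of virtual shared RAM per supernode
ssd_per_supernode_tb = 8      # TB of flash SSD per supernode

# Derived machine-wide totals (our arithmetic, not stated on the slide).
total_ram_tb = supernodes * ram_per_supernode_tb    # 64 TB
total_ssd_tb = supernodes * ssd_per_supernode_tb    # 256 TB

print(f"Aggregate RAM: {total_ram_tb} TB, aggregate SSD: {total_ssd_tb} TB")
print("Disk parallel file system: 4 PB at >100 GB/s I/O")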

Page 8: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Gordon Bests Previous Mega I/O per Second by 25x

Page 9: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10Gbps CI Affordable

2005: $80K/port, Chiaro (60 max)
2007: $5K/port, Force 10 (40 max)
2009: $500/port, Arista (48 ports)
2010: ~$1,000/port (300+ max); $400/port, Arista (48 ports)

• Port Pricing is Falling
• Density is Rising Dramatically
• Cost of 10GbE Approaching Cluster HPC Interconnects

Source: Philip Papadopoulos, SDSC/Calit2
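
To make the trend concrete, a small Python sketch using the per-port prices quoted above (the 48-port switch cost at each price point is derived arithmetic; the slide quotes 48-port Arista hardware only for 2009/2010):

# Approximate 10GbE per-port prices from the slide.
price_per_port = {2005: 80_000, 2007: 5_000, 2009: 500, 2010: 400}

# Cost of outfitting a 48-port 10GbE switch at each price point (derived).
for year, price in sorted(price_per_port.items()):
    print(f"{year}: ${price:>6,}/port -> 48 ports ~ ${48 * price:,}")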

Page 10: Information Technology Infrastructure Committee (ITIC) Report to the NAC

10G Optical Switch is the Center of a Switched Big Data Analysis Resource

[Diagram: Arista 7508 10G switch (384 10G-capable ports) at the center of a switched big-data analysis resource at UCSD, interconnecting OptIPuter, co-location facilities, UCSD RCI, CENIC/NLR, Trestles (100 TF), Dash, Gordon, Triton, existing commodity storage (1/3 PB), and 2,000 TB of Oasis storage at >50 GB/s, over 10 Gbps links]

Radical Change Enabled by Arista 7508 10G Switch: 384 10G-Capable Ports

Oasis Procurement (RFP):
• Phase 0: >8 GB/s Sustained Today
• Phase I: >50 GB/s for Lustre (May 2011)
• Phase II: >100 GB/s (Feb 2012)

Source: Philip Papadopoulos, SDSC/Calit2

Page 11: Information Technology Infrastructure Committee (ITIC) Report to the NAC

The Next Step for Data-Intensive Science: Pioneering the HPC Cloud

Page 12: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Maturing Cloud Computing Capabilities: Beyond Nebula

Nebula Testing Conclusions:
• Good first step to evaluate needs for science-driven cloud applications
• AWS performance is better than Nebula
• Nebula cannot achieve the economy of scale of AWS
• NASA cannot invest sufficient funding to compete with AWS's ability to move up the learning curve
• JPL is carrying out tests of AWS for mission data analysis

Next Step: a NASA Cloud Test-bed
• Evaluate improvements to cloud software stacks for S&E applications
• Provide assistance to NASA cloud software developers
• Assist appropriate NASA S&E users in migrating to clouds
• Operate cloud instances at ARC and GSFC
• Limited funding under the HEC Program as a technology development
• Liaison with other NASA centers and external S&E cloud users
• Partnership between OCIO and SMD

Source: Tsengdar Lee, Mike Little, NASA

Page 13: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Partnering Opportunities with Universities: Johns Hopkins University DataScope

Private Science Cloud for Sustained Analysis of PB Data Sets, Built for Under $1M
6.5 PB of Storage, 500 GB/s Sequential BW (Disk I/O + SSDs)
Streaming Data into an Array of GPUs
Connected to StarLight at 100G (May 2012)

Some Form of a Scalable Cloud Solution Is Inevitable:
Who Will Operate It, Under What Business Model, and at What Scale? How Does the On/Off Ramp Work?

Science Has Different Tradeoffs than eCommerce:
Astronomy, Space Science, Turbulence, Earth Science, Genomics, Large HPC Simulation Analysis

Source: Alex Szalay, JHU
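
A back-of-the-envelope Python sketch of what the quoted DataScope numbers imply for full-archive scans (the scan-time figure is derived and assumes perfectly sustained sequential I/O, which is not stated on the slide):

# JHU DataScope figures quoted on the slide.
capacity_pb = 6.5              # petabytes of storage
seq_bandwidth_gb_s = 500       # GB/s sequential disk + SSD bandwidth

# Time to stream the entire archive once at full sequential bandwidth (derived).
capacity_gb = capacity_pb * 1_000_000
scan_hours = capacity_gb / seq_bandwidth_gb_s / 3600
print(f"Full scan of {capacity_pb} PB at {seq_bandwidth_gb_s} GB/s ~ {scan_hours:.1f} hours")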

Page 14: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NASA Corporate Wide Area Network Backbone (Dedicated 10G between ARC/GSFC Supercomputing Centers)

[Map: NASA corporate WAN backbone linking HQ and the centers (GRC, ARC, JSC, GSFC, MSFC, LaRC, SSC, MAF, KSC, WSTF, WSC, JPL, DFRC) through hubs in Chicago, the Bay Area, Dallas, Washington DC, and Atlanta over SONET OC-3, OC-12, OC-48, and OC-192 links, with a dedicated 10 Gbps HEC-only connection between the ARC and GSFC supercomputing centers]

Source: Tsengdar Lee, Mike Little, NASA

Page 15: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Partnering Opportunities with DOE: ARRA Stimulus Investment for DOE ESnet

Source: Presentation to ESnet Policy Board

National-Scale 100Gbps Network Backbone

Page 16: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Aims of DOE ESnet 100G National Backbone

To stimulate the market for 100G

To build a pre-production 100G network linking DOE supercomputing facilities with international peering points

To build a national-scale test bed for disruptive research

ESnet is interested in Federal partnerships that accelerate scientific discovery and leverage our particular capabilities

Source: Presentation to ESnet Policy Board
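
To illustrate why the backbone difference matters for data-intensive missions, a simple Python comparison of transfer times at 10 Gbps (the dedicated ARC-GSFC link) versus 100 Gbps (the ESnet backbone rate); the 1 PB example dataset and the 80% efficiency factor are illustrative assumptions, not figures from the slides:

# Rough time to move a dataset over a dedicated link at a given line rate.
def transfer_days(dataset_tb, rate_gbps, efficiency=0.8):
    # efficiency approximates protocol and tuning overheads (assumption).
    bits = dataset_tb * 8e12
    seconds = bits / (rate_gbps * 1e9 * efficiency)
    return seconds / 86400

for rate in (10, 100):
    print(f"1 PB at {rate} Gbps: ~{transfer_days(1000, rate):.1f} days")
# ~11.6 days at 10 Gbps vs ~1.2 days at 100 Gbps under these assumptions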

Page 17: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Global Partnering Opportunities: The Global Lambda Integrated Facility

Research Innovation Labs Linked by 10 Gbps Dedicated Networks

www.glif.is/publications/maps/GLIF_5-11_World_2k.jpg

Page 18: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NAC Committee on IT Infrastructure Recommendation #1

Recommendation: To enable NASA to gain experience on emerging leading-edge IT technologies such as:

Data-Intensive Cyberinfrastructure, 100 Gbps Networking, GPU Clusters, and Hybrid HPC Architectures,

we recommend that NASA aggressively pursue partnerships with other Federal agencies, specifically NSF and DOE, as well as public/private opportunities.

We believe joint agency program calls for end users to develop innovative applications will help keep NASA at the leading edge of capabilities and enable training of NASA staff to support NASA researchers as these technologies become mainstream.

Page 19: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NAC Committee on IT Infrastructure Recommendation #1 (Continued)

Major Reasons for the Recommendation: NASA has fallen behind the leading edge, compared to other Federal agencies and international centers, in key emerging information and networking technologies. In a budget-constrained fiscal environment, it is unlikely that NASA will be able to catch up by internal efforts. Partnering, as was historically done in HPCC, seems an attractive option.

Consequences of No Action on the Recommendation: Within a few more years, the gap between NASA's internally driven efforts and the U.S. and global best-of-breed will become too large to bridge. This will severely undercut NASA's ability to make progress in a number of critical application arenas.

Page 20: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Finding #2

SMD Data Resides on Highly Distributed Servers
• Many Data Storage and Analysis Sites Are Outside NASA Centers
• Access by the Entire Research Community Is Essential

Over Half of Science Publications Come From Using Data Archives
• Secondary Storage Needed in the Cloud, with High Bandwidth and a User Portal

Education and Public Outreach Use of Data Is Rapidly Expanding
• Images for Public Relations
• Apps for Smart Phones
• Crowdsourcing

[Diagram: data flow from the PI to the research community, education, and public outreach]

Page 21: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Majority of Hubble Space Telescope Scientific Publications Come From Data Archives

In 2011 there were over 1,060 papers written using data archived at MAST.

Page 22: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Multi-Mission Data Archives at STScI Will Continue to Grow, Doubling by 2018

[Chart: cumulative archive volume in terabytes (0-1,000 TB axis), 1994-2018, showing JWST (projected), S&IT, and Other contributions; roughly a cumulative petabyte over 20 years]

Page 23: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Solar Dynamics Observatory 4096x4096 AIA Camera: 57,600 Images/Day

March 6, 2012 X5.4 Flare from Sunspot AR1429 Captured by

the Solar Dynamics Observatory (SDO)

in the 171 Angstrom Wavelength

Credit: NASA/SDO/AIA

JSOC is Archiving ~5 TB/day From 6 Cameras, Which Leads to over 1 Petabyte per Year!
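
As a quick Python check on the quoted rates (a minimal sketch; the per-image cadence and yearly total are derived from the slide's daily figures, not stated on it):

# SDO AIA figures from the slide.
images_per_day = 57_600            # 4096x4096 images per day
archive_tb_per_day = 5             # ~5 TB/day archived at JSOC

seconds_per_image = 86_400 / images_per_day            # one image every ~1.5 s
archive_pb_per_year = archive_tb_per_day * 365 / 1000  # ~1.8 PB/year

print(f"One AIA image every {seconds_per_image:.1f} s")
print(f"~{archive_pb_per_year:.1f} PB archived per year")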

Page 24: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Public Outreach Through Multiple Modes, Simultaneous with Data Delivery to Research Scientists

3D Sun App: Tens of Thousands of Installs on Android in the Last 30 Days

Page 25: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NASA Space Images Are Widely Viewed by Public

http://news.nationalgeographic.com/news/2012/03/120306-dark-matter-galaxies-mystery-space-science/

Page 26: Information Technology Infrastructure Committee (ITIC) Report to the NAC

32 of the 200+ Apps in the Apple App Store Returned by a Search on "NASA"

Page 27: Information Technology Infrastructure Committee (ITIC) Report to the NAC

Crowdsourcing Science: Galaxy Zoo and Moon Zoo Bring the Public into Scientific Discovery

More than 250,000 people have taken part in Galaxy Zoo so far. In the 14 months the site was up, Galaxy Zoo 2 users helped make over 60,000,000 classifications. Over the past year, volunteers from the original Galaxy Zoo project created the world's largest database of galaxy shapes.

www.galaxyzoo.org

Page 28: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NASA Earth Sciences Images Are Brought Together by Earth Observatory Web Site

http://earthobservatory.nasa.gov/

Page 29: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NASA Earth Observatory Integrates Global Variables: Time Evolution Over the Last Decade

http://earthobservatory.nasa.gov

Page 30: Information Technology Infrastructure Committee (ITIC) Report to the NAC

EOSDIS Data Products Distribution Approaching ½ Billion/Year!

Page 31: Information Technology Infrastructure Committee (ITIC) Report to the NAC

5 March 2012, Robert Hanisch, NAC ITIC @ JHU


The Virtual Observatory

The VO is foremost a data discovery, access, and integration facility.

International collaboration on metadata standards, data models, and protocols:
• Image, spectrum, and time series data
• Catalogs, databases
• Transient event notices
• Software and services
• Distributed computing (authentication, authorization, process management)
• Application inter-communication

The International Virtual Observatory Alliance was established in 2001, patterned on the World Wide Web Consortium (W3C).

Page 32: Information Technology Infrastructure Committee (ITIC) Report to the NAC

5 March 2012, Robert Hanisch, NAC ITIC @ JHU


SAMP (Simple Application Messaging Protocol)
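
The slide names SAMP, the IVOA protocol for inter-application messaging. As an illustrative Python sketch only (the slide shows no code), this is roughly how a client could broadcast a table to other SAMP-aware tools using astropy's SAMP client, assuming a SAMP hub (e.g., one started by TOPCAT or Aladin) is already running; the table URL and name are placeholders, not from the presentation:

from astropy.samp import SAMPIntegratedClient

# Connect to a running SAMP hub.
client = SAMPIntegratedClient()
client.connect()

# Broadcast a "load this VOTable" notification to all connected applications.
client.notify_all({
    "samp.mtype": "table.load.votable",
    "samp.params": {
        "url": "http://example.org/catalog.xml",   # placeholder URL
        "name": "example catalog",                 # placeholder name
    },
})

client.disconnect()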

Page 33: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NSF's Ocean Observatories Initiative Cyberinfrastructure Supports Science, Education, and Public Outreach

Source: Matthew Arrott, Calit2 Program Manager for OOI CI

Page 34: Information Technology Infrastructure Committee (ITIC) Report to the NAC

OOI CI is Built on Dedicated Optical Networks and Federal Agency & Commercial Clouds

Source: John Orcutt, Matthew Arrott, SIO/Calit2

Page 35: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NAC Committee on IT Infrastructure DRAFT* Recommendation #2

Recommendation: NASA should formally review the existing national data cyberinfrastructure supporting access to data repositories for NASA SMD missions. A comparison with best-of-breed practices within NASA and at other Federal agencies should be made.

We request a briefing on this review to a joint meeting of the NAC IT Infrastructure, Science, and Education committees within one year of this recommendation. The briefing should contain recommendations for a NASA data-intensive cyberinfrastructure that supports science discovery by both mission teams and remote researchers, as well as education and public outreach, appropriate to the growth driven by current and future SMD missions.

* To be completed after a joint meeting of the ITIC, Science, and Education Committees in July 2012, with the final recommendation submitted to the July 2012 NAC meeting.

Page 36: Information Technology Infrastructure Committee (ITIC) Report to the NAC

NAC Committee on IT Infrastructure Recommendation #2 (continued)

Major Reasons for the Recommendation: NASA data repository and analysis facilities for SMD missions are distributed across NASA centers and throughout U.S. universities and research facilities. There is considerable variation in the sophistication of the integrated cyberinfrastructure supporting scientific discovery, educational reuse, and public outreach across SMD subdivisions.

The rapid rise in the last decade of "mining data archives" by groups other than those funded by specific missions implies a need for a national-scale cyberinfrastructure architecture that allows the free flow of data to where it is needed.

Other agencies' programs, specifically NSF's Ocean Observatories Initiative Cyberinfrastructure program, should be used as a benchmark for NASA's data-intensive architecture.

Consequences of No Action on the Recommendation: The science, education, and public outreach potential of NASA's investment in SMD space missions will not be realized.