National Science Foundation Blue Ribbon Panel on Cyberinfrastructure Introduction, Context & Charge Dan Atkins, Chair [email protected] University of Michigan April 19, 2002


National Science Foundation Blue Ribbon Panel on Cyberinfrastructure

Introduction, Context & Charge

Dan Atkins, [email protected]

University of Michigan

April 19, 2002

Panel Members

• Daniel E. Atkins, Chair, Univ. of Michigan, EECS and SI, [email protected]
• Kelvin K. Droegemeier, Center for Analysis and Prediction of Storms, University of Oklahoma, [email protected]
• Stuart I. Feldman, IBM Research, [email protected]
• Hector Garcia-Molina, CS Dept., Stanford University, [email protected]
• Michael Klein, Center for Molecular Modeling, University of Pennsylvania, [email protected]
• Paul Messina, Cal Tech, [email protected]
• David G. Messerschmitt, UC-Berkeley, EECS & SIMS, [email protected]
• Jeremiah P. Ostriker, Princeton University, [email protected]
• Margaret H. Wright, Computer Science Department, Courant Institute of Mathematical Sciences, New York University, [email protected]

Meeting Agenda: April 19, 2002, NSF, 1-4 pm

• 1. Review of status of the panel's activities and goals for this meeting.
• 2. Reports from the authoring sub-committees.
• 3. Review and discussion of the working draft of the report.
• 4. Discussion of primary recommendations.
• 5. Stewardship and additional use of the material gathered by the Panel.
• 6. Summary of additional activities to create final version of report.
• 7. Matters arising.

Historical Schematic

CISE Directorate
CSE research elsewhere in NSF
Provision of advanced scientific computing
5 Supercomputer Centers, NSFnet
Support for an array of small, medium, and large CISE basic research projects
Hayes Report
1984 Lax -> Curtis/Bardon Reports
1995 Computational Science init.; Expanded equip. program.
1993 BRP: “Desktop to Teraflop”
PACI: NCSA & NPACI
Terascale Computing Initiatives
OUR REPORT

Charge

OUR REPORT

• A) Evaluate the current PACI programs.

With respect to meeting the needs of the scientific and engineering research community:

• B) Recommend new areas of emphasis for the CISE Directorate.
• C) Recommend an implementation plan to enact recommended changes.

“Cyber-infrastructure”

Process

• Web survey
• Hearings
• Reviewing prior reports
• Random input
• Knowledge and expertise of the Panel members.

Epigraph

• Cyberinfrastructure is the sine qua non for true progress in much of the mathematical and physical sciences
– And progress in CI is often driven by real-world problems.
– Robert Eisenstein, AD for MPS, 11/30/01

Revolutionizing Science and Engineering through Cyberinfrastructure: Table of Contents

• 1. The Vision

• 2. Background and Charge

• 3. Challenges and Opportunities for the Scientific Research Community

• 4. The New Cyberinfrastructure: What Changed in Computing

• 5. The Landscape of Related Activities

• 6. Partnerships for Advanced Computational Infrastructure: Past and Future Roles

• 7. Achieving the Vision

• 8. Scope and Budget Estimates

Draft Report Available in pdf at worktools.si.umich.edu/workspaces/datkins/001.nsf

Please send comments by May 1, 2002 to [email protected]

Revolutionizing Science and Engineering through Cyberinfrastructure: Table of Contents

• 1. The Vision 2. Feldman

• 2. Background and Charge 1. Atkins

• 3. Challenges and Opportunities for the Scientific Research Community 3. Droegemeier

• 4. The New Cyberinfrastructure: What Changed in Computing 2. Feldman

• 5. The Landscape of Related Activities 2. Feldman

• 6. Partnerships for Advanced Computational Infrastructure: Past and Future Roles 6. Wright

• 7. Achieving the Vision 4. Messerschmitt

• 8. Scope and Budget Estimates 5. Messina

• Summary and Discussion - Atkins

Blue Ribbon Panel on Cyberinfrastructure

Vision

Stuart I. Feldman
IBM

April 19, 2002

Recommendations

• New INITIATIVE to revolutionize science and engineering research at NSF and worldwide to capitalize on new computing and communications opportunities
– 21st Century Cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources

– Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE

– Budget estimate: incremental $650 M/year (continuing)

• An INITIATIVE OFFICE with a highly placed, credible leader empowered to

– Initiate competitive, discipline-driven path-breaking applications within NSF of cyberinfrastructure which contribute to the shared goals of the INITIATIVE

– Coordinate policy and allocations across fields and projects. Participants across NSF directorates, Federal agencies, and international e-science

– Develop high quality middleware and other software that is essential and special to scientific research

– Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide.

Science and Engineering Research Depends on Computing and Communications

• Online fast publication (and archives too)
• New collections accessible
• Raw data and digital libraries
• Collaboration (Collaboratories, Access Grid, etc.)
• In silico science

Furthering the Revolution

• Saving raw data
• Cross-disciplinary collections
• Richer publications
• Grander simulations (cells and organisms; entire earth system)
• Breadth and depth of collaborations, routinely international

Thresholds and Opportunities

• Internet and Web use almost universal
– Activity would stop without e-mail and WWW
• Expectations rising with generations and for all disciplines
• Supercomputers and terabytes in the lab
• Simulation required to do new science
• Standardized formats, software

Risks and Costs

• Inconsistent formats across fields and sites
• Data loss
• Field boundaries
• Duplicative moderate quality software
• Falling behind on computing technologies

Proposals for the INITIATIVE

• Large incremental budget
• Drive applications that revolutionize the way that research is done
– Fund competitive discipline-driven projects
– With cyberinfrastructure contribution and standards and participation by computing experts
• Supply shared resources
– Supercomputers and data farms that provide 100-1000x what can be found locally
– New shared middleware, content standards, basic applications
– New research (emphasizing computation, social science,
– New education and outreach
• Central organization with authority


The New Cyberinfrastructure

Hardware Trends

• Processor speeds and memory increasing with Moore’s Law
• Cluster sizes – now 1000s, soon even larger
– Largest sites at 10TF, moving toward PF
• Disk capacity increasing with areal density (60%-100%/year)
– Terabytes typical, petabytes coming
• Wide area networking moving to Gb/s
• Large and high-resolution displays
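As a rough illustration of the disk-capacity bullet above, the sketch below estimates how long a move from terabytes to petabytes (a factor of 1000) would take at the quoted 60%-100% annual growth. It assumes steady compound growth, which the slide does not state, and the function name `years_to_grow` is hypothetical.

```python
import math


def years_to_grow(factor, annual_rate):
    """Years of compound growth at annual_rate (0.6 means 60%/year) needed to grow by factor."""
    return math.log(factor) / math.log(1.0 + annual_rate)


# Assumption: terabytes to petabytes is taken as a decimal factor of 1000.
TB_TO_PB = 1000

slow = years_to_grow(TB_TO_PB, 0.60)  # low end of the quoted growth range
fast = years_to_grow(TB_TO_PB, 1.00)  # high end of the quoted growth range
print(f"{fast:.1f} to {slow:.1f} years")
```

Under these assumptions the step from typical terabytes to typical petabytes takes on the order of 10 to 15 years.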

Software

• Information networking – applications, messages, self-describing content, not just bit streams
– The Grids
• Content management – metadata, searches, persistence
• Collaboration
• Middleware

Ecology of Scientific Computing

• Computing industry
– Commercial requirements drive basic hardware and software
– Important additional needs for scientific computing
• Computing Research
• Other sciences
• Other federal agencies
• Non-US activities

Blue Ribbon Panel on Cyber Infrastructure

Science & Engineering Community Needs and Challenges

Kelvin K. Droegemeier
University of Oklahoma

April 19, 2002

Goals

• Engage the broadest elements of the science and engineering communities as a means for critically assessing needs and challenges
– Scientific
– Technological
– Sociological

• Identify barriers and opportunities

The Communities

• Domestic and International
• Academia
• Private Industry
• Government Agencies
• Laboratories
• State, Regional, and National Centers

Methodology

• Community-wide web survey
– Widely publicized
– >700 responses
– Quantitative comparisons with the Hayes Report
• Oral public testimony (3 sessions)
– 62 participants selected from: research scientists and engineers; computer and computational scientists; center directors; agency and corporate leaders; system administrators; educators; students and young scientists; technicians and consultants
– Emphasis given to traditionally underrepresented groups and the physically challenged
– Written transcripts and A/V materials assembled
• Existing reports and planning documents
• Ad hoc communications
• Personal experiences and expertise

Analysis

• Results from all 5 methodologies have been synthesized
• Remarkable consistency among individual responses and within and among disciplines
• No prioritization of findings: all summary issues are viewed as critically important
• Categorization
– Philosophy and Process
– Current Resources
– Future Infrastructure
– Emerging Paradigms and Activities

Philosophy and Process

• Cyber infrastructure lies at the heart of revolutionary science and engineering
• NSF should take the lead in charting a national course for cyber infrastructure
• NSF should consider human capital and software as co-equals with traditional physical infrastructure

• Cyber infrastructure requires continuity, consistency, and sufficient funding; NSF should consider the consequences of periodic full re-competition of CI centers

Philosophy and Process

• NSF needs to

– Provide a framework, motivation, and clear direction for building and sustaining linkages between academia and industry

– Give attention to the sociological, economic, and cultural issues associated with cyber infrastructure

– Continue supporting open source software strategies

Current Resources

• The entry barrier into high performance computing continues to be high
• Effective use of parallel computers is becoming increasingly complex
• Greater investments are needed in
– Software development
– Training and support

Current Resources

• The PACI centers have successfully
– brought high performance computing to the masses;
– broadened the spectrum of users; and
– responded to dramatic changes in the user base, technology, and applications
• However, the PACI centers remain a largely batch oriented environment and are not configured or funded to deliver significant resources in novel ways (dedicated, on-demand) to large numbers of users

Current Resources

• The NRAC allocation process no longer is effective
– Double jeopardy
– Yearly resource allocations not congruent with multi-year agency grants
– Proposal development process is time-consuming
– Reviewer base insufficiently broad
– Need flexibility to accommodate future resources (e.g., data repositories)


Current Resources

• The PACI centers have been highly successful in developing visionary, innovative technologies and prototype tools

• However, insufficient funding and the lack of selective investment have hampered transition to full deployment

Future Infrastructure

• The “last mile problem” continues and is especially serious for HBCUs, Tribal Colleges and Universities, and Hispanic institutions

• Research-group and departmental-scale facilities (100 to 1000x less powerful than national centers) are becoming increasingly important; thus, national centers need to be a factor of 100 to 1000x more capable

• High speed networks with high quality of service continue to be foundational to research and education at all levels

• On-demand (not pre-scheduled) and instantaneous access is becoming increasingly important (computers, data bases, networks)

Future Infrastructure

• Comprehensive environments are needed for linking models from multiple disciplines and for synthesizing results in interoperable frameworks

• The Grid represents an important opportunity for the future and should receive high priority for support

• Inexpensive and reliable tools are needed to support distance collaborations

• Higher levels of security are needed

Emerging Paradigms and Activities

• Cyber infrastructure is becoming the essential lynchpin for research at the boundaries among disciplines and should be driven by user needs
• The need for a new information technology professional is emerging
– Expertise in one or more disciplines plus computer science
– They will develop, maintain, and integrate complex hardware and software systems
– They are an important bridge to users
– Educational institutions must develop strategies for creating this computational science workforce

Emerging Paradigms and Activities

• Scientific and engineering applications are becoming more multi-scale (both space and time) and compute-intensive; thus, the need for high-end resources continues to grow. However, cyber infrastructure research needs to span the spectrum from small grants to large centers

Emerging Paradigms and Activities

• Significant need exists for access to long-term, distributed, stable data and meta data repositories and digital libraries

• Legacy data likewise are important and must be digitized and preserved


Knowledge Frontiers

• Several new projects provide a glimpse of the future

Blue Ribbon Panel on Cyberinfrastructure

Organization

David G. Messerschmitt
University of California at Berkeley

April 19, 2002


Layered structure of the INITIATIVE

Applications of information technology to science and engineering research

Conduct of science and engineering research

Cyberinfrastructure supporting applications

Core technologies incorporated into cyberinfrastructure

Some roles of cyberinfrastructure

• Processing, storage, connectivity
– Performance, sharing, integration, etc
• Make it easy to develop and deploy new applications
– Tools, services, application commonality
• Interoperability enables future collaboration across disciplines
• Best practices, assistance, expertise
• Greatest need is software and experienced people

Classes of activities

Operations in support of end users
Development or acquisition

Research in technologies, systems, and applications

Applications of information technology to science and engineering research

Cyberinfrastructure supporting applications

Core technologies incorporated into cyberinfrastructure


Defining applications

• Only domain science and engineering researchers can create a vision and implement the methodology and process changes

• Information technologists need to be deeply involved
– What technology can be, not what it is
– Conduct research to advance the supporting technologies and systems
– Applications inform research

• Shared responsibility

Mapping onto disciplines

Core information technologies (CISE, E)
Technological (CISE) and social systems (CISE, SBE)
Applications (multi-disciplinary)
Applications (discipline specific)
All science (natural and social) and engineering disciplines


Who delivers

Research in technologies, systems, and applications

Operations in support of end users

Long-term and applied researchers (applications, systems, core technologies)

Development or acquisition

Commercial suppliers, development centers, community development, integrators

End-user staff support, operational centers, service providers

Evaluation and assessment

Research in technologies, systems, and applications
Operations in support of end users
Ideas: outcomes
Development or acquisition
Plans: impact and use
Users: impact and satisfaction

Responsibility for applications

Applications (discipline specific)
All science (natural and social) and engineering disciplines
Other Directorates
Applications (multi-disciplinary) CISE
Close coordination and collaboration (matrix organization)

Responsibility for cyberinfrastructure

Applications (multi-disciplinary)
Applications (discipline specific)
All science (natural and social) and engineering disciplines
Other Directorates
Close coordination and collaboration (matrix organization)
Technological systems: CISE
Social systems: CISE and SBE
CISE

OFFICE for the INITIATIVE

• Headed by a leader with experience, credibility, commitment, persuasiveness, accountability
• Complex matrix organization spanning all Directorates needs central direction
• Vision and coordination
• Manage INITIATIVE budget (competitive and community input)
• Outreach to agencies, international

Blue Ribbon Panel on Cyberinfrastructure

Scope and Budget

Paul Messina
California Institute of Technology

April 19, 2002

To achieve its goals, the INITIATIVE should include funding for software and people

• Long-term research in IT and CI
• Applied research in IT and CI, with deep involvement by applications projects
• Developing new applications enabled by IT and CI
• Enhancing existing applications to take advantage of the new facilities and capabilities
• Transforming research software into robust products

To achieve its goals, the INITIATIVE should include funding for data

• Creating and operating data repositories in many disciplines
– taking existing data collections and making them conveniently accessible

• Establishing discipline-specific coordination centers to guide and coordinate software and data format choices for the repositories

• Establishing STCs for addressing common issues that arise in creation and use of data collections, especially across disciplines

To achieve its goals, the INITIATIVE should include funding for physical infrastructure and its operation

• Acquiring and operating high-end computers, visualization facilities, data archives, and networks of much greater power and in substantially greater quantity
– in particular, multiple computers that are among the world’s most powerful
• Establishing production data libraries

Basis for budget estimates

• Our estimates are based on
– current and previous NSF activities
– testimonies
– other agencies’ programs in related areas
– activities in other countries

Preliminary Budget Overview (Incremental)

                                                              Funding Level in millions
Research in IT and its applications and social context        20
Applications of IT in science and engineering                 100
Cyberinfrastructure supporting applications
  High-end general-purpose centers                            280
  Networks                                                    50
  Data repositories                                           120
  Coord center for data repositories (discipline specific)    20
  STCs for data collections                                   20
  Electronic Service Centers                                  TBD
  Digital Libraries                                           TBD
Core technologies incorporated into cyberinfrastructure       40
  System software and tools research and development
INITIATIVE Total                                              650
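The arithmetic behind the total can be checked with a short sketch. The figures are copied from the budget lines above; the two TBD items are left out, and the variable names are illustrative only.

```python
# Budget lines (in millions) as listed on the slide; items marked TBD are omitted.
budget_millions = {
    "Research in IT and its applications and social context": 20,
    "Applications of IT in science and engineering": 100,
    "High-end general-purpose centers": 280,
    "Networks": 50,
    "Data repositories": 120,
    "Coord center for data repositories (discipline specific)": 20,
    "STCs for data collections": 20,
    "Core technologies incorporated into cyberinfrastructure": 40,
}

stated_total = 650  # "INITIATIVE Total" as given on the slide
computed_total = sum(budget_millions.values())

# Report whether the listed lines add up to the stated total.
print(computed_total, computed_total == stated_total)
```

The listed lines account for the full stated total, so any funding later assigned to the TBD items would be additional or would have to come out of the other lines.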

Is this enough to support a revolution?

• Not by itself
• However, there are activities in CISE, in other parts of NSF, and in the world at large that will complement the funding we recommend for this INITIATIVE

Ongoing NSF CISE-funded activities that would be folded into the INITIATIVE

Activity                                Funding relevant to INITIATIVE (FY2002 level)
ACIR                                    $85M
ANIR                                    $70M
ITR (principally the large projects)    $60M – 120M (out of $180M)
Terascale MRE                           $35M
Total                                   $250M - $310M
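The quoted total is a range because only the ITR line is a range. A minimal sketch of that sum, using the figures from the table above and hypothetical variable names:

```python
# (low, high) funding in $M at the FY2002 level, as listed on the slide.
activities = {
    "ACIR": (85, 85),
    "ANIR": (70, 70),
    "ITR (principally the large projects)": (60, 120),
    "Terascale MRE": (35, 35),
}

# Sum the low ends and the high ends separately to get the total range.
low_total = sum(low for low, _ in activities.values())
high_total = sum(high for _, high in activities.values())
print(f"${low_total}M - ${high_total}M")
```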

There are other NSF activities that would contribute to and benefit from the INITIATIVE

• NCAR
• Network for Earthquake Engineering Simulation (NEES)
• National Ecological Observatory Network (NEON)
• and others

Related activities supported by other governmental entities

• NASA IPG
• NIH BIRN
• DOE Science Grid
• DOE SciDAC
• DOE/NNSA ASCI
• UK e-Science
• EU Grid projects (9)
• All of the above (and others) support Research, Development, and Deployment activities that will bolster the NSF INITIATIVE

And the private sector is also making investments

• Most high-end computer manufacturers have announced substantial efforts in grid software
– and are participating in Global Grid Forum

• Twelve companies announced support of Globus last November

• End-user companies in aerospace, pharmaceuticals are using or investigating grid approaches

Open issues

• Is the funding level high enough for the system software and tools R&D?
– Taking into consideration the number of people who could and would engage in those activities
• Is the funding level high enough for the development of production-quality software?
– With same consideration, but note that work not necessarily done in universities
• Funding level for production digital libraries

Blue Ribbon Panel on Cyberinfrastructure

PACIs: Past and Future Roles

Margaret H. Wright
New York University

April 19, 2002


The PAST

• NSF Supercomputer Centers (1986-87)

• Multiple reports (Branscomb, Brooks-Sutherland, Hayes) -> PACI program (1997)

• Two PACI partnerships (NCSA, NPACI)


The PRESENT

Multiple functions within PACI program

• Provision of high-end resources (cycles, networking, data, …)

• Discipline-specific codes and infrastructure

• Generic tools and infrastructure for users of high-end computing

• Education, outreach, and training

Part A of our charge: Assessment of PACI program

• Our interpretation: the potential roles for the PACIs and PSC in a GREATLY expanded context
• Annual evaluations of PACIs: positive overall
• Repeated concerns: effectiveness of enabling and application technology projects in serving the science, engineering, and computer science communities who use high-end computing

Rationale for the Future

• Insatiable demand for highest-end cycles, networking, data (quantity, speed)
• Need for sustained work on industrial-strength discipline-specific codes and infrastructure, generic software tools and infrastructure
– Effort at least one order of magnitude greater than high-quality prototypes


Within the INITIATIVE

• Disaggregation of PACI functions

• Augmented centralized high-end resources

• Enabling/application infrastructure projects peer-reviewed

• Expanded, peer-reviewed education, outreach, and training

Future of PACI within the INITIATIVE

• Two-year extension of current PACI program requested
• Until 2007, PACIs and PSC should receive stable funding to provide high-end resources and associated operations
• 2004: INITIATIVE funding begins
– Important to retain skilled PACI staff and successful collaborations
– PACIs can compete for all aspects of the larger INITIATIVE funding
– Separate peer-reviewed enabling and application infrastructure projects

Blue Ribbon Panel on Cyberinfrastructure

Summary recommendations

April 19, 2002


Recommendations

• New INITIATIVE to revolutionize science and engineering research at NSF and worldwide to capitalize on new computing and communications opportunities

– 21st Century Cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources

– Current centers (NCSA, SDSC, PSC) are a key resource
– Budget estimate: incremental $650M/year (continuing)
• INITIATIVE OFFICE with a highly placed, credible leader empowered to
– Initiate competitive, discipline-driven path-breaking applications within NSF of cyberinfrastructure which contribute to the shared goals of the INITIATIVE

– Coordinate policy and allocations across fields and projects. Participants across NSF directorates, Federal agencies, and international e-science

– Develop high quality middleware and other software that is essential and special to scientific research

– Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide.