CSIG 10
Survey of Emerging IT Trends and Technologies
Chaitan Baru
SDSC
Cyberinfrastructure
• The “cyberinfrastructure” initiative is an attempt to provide explicit investments in IT for science & engineering research and education
• From NSF’s Cyberinfrastructure Vision for 21st Century Discovery, www.nsf.gov/od/oci/ci-v7.pdf, July 20, 2006
– “The comprehensive infrastructure needed to capitalize on dramatic advances in information technology has been termed cyberinfrastructure.”
– “…integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities…”
– “…an interoperable suite of software and middleware services and tools...”
– “Investments in interdisciplinary teams and cyberinfrastructure professionals with expertise in algorithm development, system operations, and applications development are also essential…”
– “In 1999, the PITAC released the seminal report ITR-Investing in our Future, prompting new and complementary NSF investments in CI projects, such as the Grid Physics Network (GriPhyN) and international Virtual Data Grid Laboratory (iVDGL) and the Geosciences Network, known as GEON.”
Geoinformatics
• A vision for Geoinformatics, from the NSF Workshop on Envisioning a National Geoinformatics System for the United States, Denver, March 2007
– “…a future in which someone can sit at a terminal and have easy access to vast stores of data of almost any kind, with the easy ability to visualize, analyze and model those data.”
Geoinformatics
From David Lambert, NSF EAR/GEO
Presentation at GEON Annual Meeting, 2005
Geoinformatics: Cyberinfrastructure for the Solid Earth Sciences
Objectives
• Make data, tools, applications …and communities… easily accessible online
• Provide an integration environment for 3D and 4D geoscience data integration
Book to be published this year by Cambridge University Press. Co-editors: Randy Keller and Chaitan Baru
A Use Case for Geoinformatics
• A user request of the form:
“For a given region (i.e. lat/long extent, plus depth), return a 3D structural model with accompanying physical parameters of density, seismic velocities, geochemistry, and geologic ages, using a cell size of 10km”
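The request above can be sketched as a small data structure. This is only an illustration: the class name `ModelRequest`, its field names, and the rough 111 km-per-degree conversion are assumptions made here, not part of GEON or any real API. The sketch shows how a region, a depth, and a cell size translate into an approximate grid shape.

```python
import math
from dataclasses import dataclass

# Rough average length of one degree of latitude; an assumption for this sketch.
KM_PER_DEGREE_LAT = 111.0


@dataclass(frozen=True)
class ModelRequest:
    """Hypothetical request for a gridded 3D structural model."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    depth_km: float
    cell_size_km: float = 10.0
    parameters: tuple = (
        "density",
        "seismic_velocities",
        "geochemistry",
        "geologic_ages",
    )

    def grid_shape(self):
        """Approximate cell counts along (north-south, east-west, depth)."""
        mid_lat = math.radians((self.lat_min + self.lat_max) / 2)
        ns_km = (self.lat_max - self.lat_min) * KM_PER_DEGREE_LAT
        # East-west distance per degree shrinks with the cosine of latitude.
        ew_km = (self.lon_max - self.lon_min) * KM_PER_DEGREE_LAT * math.cos(mid_lat)
        return tuple(
            max(1, math.ceil(extent / self.cell_size_km))
            for extent in (ns_km, ew_km, self.depth_km)
        )


request = ModelRequest(lat_min=30.0, lat_max=32.0, lon_min=-110.0, lon_max=-108.0, depth_km=100.0)
print(request.grid_shape(), request.parameters)
```

Even a 2° by 2° region to 100 km depth yields several thousand cells, each needing values for every requested parameter, which is why the request implies access to many data sources at once.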
Portal-based Science Environments
Support for resource sharing and collaborations
EarthScope Data Portal
- SDSC San Diego
- IRIS Seattle
- UNAVCO Boulder
- ICDP Potsdam
portal.earthscope.org
CUAHSI Hydrologic Information System, HIS (http://his.cuahsi.org)
– Data Discovery, Data Access, Data Publication
GEON: Geosciences Network*
• Funded by NSF IT Research program
• Multi-institution collaboration between IT and Earth Science researchers
• GEON Cyberinfrastructure provides:
– Authenticated access to data and Web services
– Registration of data sets, tools, and services with metadata
– Search for data, tools, and services, using ontologies
– Scientific workflow environment and access to HPC
– Data and map integration capability
– Scientific data visualization and GIS mapping
* The network / grid concept has been evolving over the past several years
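Search "using ontologies" can be pictured as query expansion: a broad term is expanded to its narrower terms before matching registered metadata. The sketch below is a toy under stated assumptions; the `NARROWER` ontology, the `REGISTRY` contents, and the function names are all hypothetical and do not describe GEON's actual implementation.

```python
# Hypothetical mini-ontology: each term maps to its narrower terms.
NARROWER = {
    "rock": ["igneous rock", "sedimentary rock"],
    "igneous rock": ["granite", "basalt"],
    "sedimentary rock": ["sandstone"],
}

# Hypothetical registry: data set id -> metadata keywords supplied at registration.
REGISTRY = {
    "ds1": {"granite"},
    "ds2": {"sandstone"},
    "ds3": {"gravity"},
}


def expand(term):
    """Return the term plus every narrower term reachable in the ontology."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(NARROWER.get(current, []))
    return seen


def search(term):
    """Return ids of data sets whose keywords match the expanded term set."""
    terms = expand(term)
    return sorted(ds_id for ds_id, keywords in REGISTRY.items() if keywords & terms)


print(search("igneous rock"))
print(search("rock"))
```

A plain keyword match for "rock" would find nothing in this registry; the expansion is what lets a data set tagged only "granite" be discovered.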
GEON: The Geosciences Network
• GEON is a coalition among IT and Earth Science researchers with the goal of developing advanced information technologies to enable new modes of geosciences research
• GEON is developing technologies for information integration and knowledge discovery
• Project participants: 14 PI institutions, and partners including other projects, agencies, and industry
• GEON has deployed a Web services-based, distributed computing infrastructure, called the GEONgrid, across PI and partner sites
• GEONgrid provides access to data collections, tools, and applications that support geosciences research
• Project funding: $11.25M, 2002-2007
www.geongrid.org
RESEARCH AND EDUCATION PRODUCTS AND RESULTS
Technologies for Ontology-Based Data Registration, GIS Map Integration, Distributed Portals, and 4D Visualization
Research on 3D Lithospheric Structure, Gravity Modeling, Remote Sensing, Data Integration
Cyberinfrastructure Summer Institute for Geoscientists and graduate courses in Geoinformatics
• 14 PI institutions
• Over 20 other partners including universities, industry, government agencies/labs
GEON Partners
PI Institutions
• Arizona State University
• Bryn Mawr College
• Penn State University
• Rice University
• San Diego State University
• San Diego Supercomputer Center/UCSD
• University of Arizona
• University of Idaho
• University of Missouri, Columbia
• University of Texas at El Paso
• University of Utah
• Virginia Tech
• UNAVCO
• Digital Library for Earth System Education (DLESE)
Partners
• Chronos
• CUAHSI-HIS
• ESRI
• Calit2
• Georgia State University
• Geological Survey of Canada
• Georeference Online
• HP
• IBM
• Lawrence Livermore Natl Laboratory
• NASA Goddard, Earth System Division
• SCEC
• U.S. Geological Survey (USGS)
• Purdue University
Affiliated Projects
• EarthScope, IRIS
Key Informatics Areas
• Portals
– Authenticated, role-based access to cyber resources: data, tools, models, model outputs, collaboration spaces, …
• Data Integration
– Search, discovery and integration of data from heterogeneous information sources (“mediation” and “semantic integration”)
• Use of workflow systems, and access to HPC
– Ability to “program” at a higher level of abstraction
– Sharing of models, along with “provenance” information
– Gateways to HPC environments
• Management of Geospatial Information
– Using GIS capabilities, map services, geospatial data integration
• Visualization of 3D, 4D geospatial data and information
GEON Portal
portal.geongrid.org
• Generic Capabilities:
– Search
– Workbench
– Dynamic map services, map integration
• Applications:
– Paleo database integration
– LiDAR data access and data processing
– SYNSEIS: Online access to computational modeling system
– Gravity and Magnetic database for US
GEON and Related Portals
EarthScope
CUAHSI Hydrologic Information System
Chesapeake Bay Environmental Observatory
National Ecological Observatory Network Prototype
Tropical Ecology Assessment and Monitoring Network
Data Search and Integration
GEON LiDAR Workflow (GLW) Portlet
GEON Project and Funding Structure
• Projects: GEON, GEON Portal, OpenTopography, OpenEarth Framework, CluE
• Funding sources:
– NSF ITR
– NSF Geoinformatics
– NSF EAR/IF Facility (GEO, OCI, CISE)
– OCI Software Development for Cyberinfrastructure (SDCI)
– NSF CluE (GEO, CISE)
Integrated Cyberinfrastructure System
Source: Dr. Deborah Crawford, Chair, NSF CI Working Committee
• Hardware
• Middleware Services
• Development Tools & Libraries
• Application Domains: Geosciences, Engineering, Environmental Sciences, Physics, Astronomy, Archaeology, Neurosciences, Biomedicine, …
• Domain-specific Cybertools (software)
• Shared Cybertools (software)
• Distributed Resources (computation, storage, communication, etc.)
• Education and Training
• Discovery & Innovation
• Friendly Work-Facilitating Portals
– Authentication, Authorization, Auditing, Resource Discovery, Workflows, Visualization, Analysis
Community Cyberinfrastructure Projects
Source: Prof. Mark Ellisman, UC San Diego
• Middleware Services
• Development Tools & Libraries
• Distributed Computing, Instruments and Data Resources
• Science Domains
– Biomedical Informatics (BIRN)
– High Energy Physics (GriPhyN)
– Geosciences (GEON)
– Ecological Observatories (NEON)
– Earthquake Engineering (NEES)
– Ocean Observing (ORION)
• Hardware
• Shared Tools
• Your Specific Tools & User Apps.
Services implied by the Geoinformatics use case
“For a given region (i.e. lat/long extent, plus depth), return a 3D structural model with accompanying physical parameters of density, seismic velocities, geochemistry, and geologic ages, using a cell size of 10km”
Services implied by the use case
1. Search and discovery
2. Data access
3. Data integration, including transformations, model execution, and visualization
4. Result publication (and preservation—so that results can be searched and discovered)
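The four services can be chained into a single pipeline. The sketch below uses in-memory dictionaries as stand-ins for distributed services; every name in it (`CATALOG`, `PUBLISHED`, `search`, `access`, `integrate`, `publish`, `run`) and every data value is hypothetical, and the "integration" step is reduced to a simple mean.

```python
# Toy in-memory stand-ins for distributed services (all names and values hypothetical).
CATALOG = {
    "gravity-us": {"keywords": {"gravity", "density"}, "records": [2.67, 2.71, 2.80]},
    "seismic-west": {"keywords": {"seismic", "velocity"}, "records": [6.1, 6.4]},
}
PUBLISHED = {}


def search(keyword):
    """1. Search and discovery: ids of data sets tagged with the keyword."""
    return sorted(ds_id for ds_id, ds in CATALOG.items() if keyword in ds["keywords"])


def access(ds_id):
    """2. Data access: fetch the records of one data set."""
    return list(CATALOG[ds_id]["records"])


def integrate(datasets):
    """3. Data integration: a per-data-set mean stands in for transformations and model execution."""
    return {ds_id: sum(values) / len(values) for ds_id, values in datasets.items()}


def publish(result_id, result, keywords):
    """4. Result publication: register the result so later searches can discover it."""
    PUBLISHED[result_id] = result
    CATALOG[result_id] = {"keywords": set(keywords), "records": list(result.values())}
    return result_id


def run(keyword, result_id):
    """Chain the four services for one keyword."""
    found = search(keyword)
    data = {ds_id: access(ds_id) for ds_id in found}
    summary = integrate(data)
    return publish(result_id, summary, {keyword, "derived"})
```

Calling `run("gravity", "gravity-summary")` publishes a summary that a later `search("derived")` can find, which is the point of item 4: published results re-enter the search and discovery step.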
All in a distributed environment
• google, bing, ..?
• Grid computing
• Supercomputers
• Some database technologies
• Some scientific visualization
• Digital libraries and archives
Data “integration”
• A priori integration
– Consistent metadata and data standards and data “schema”/structure, and semantics are pre-defined across a set of data resources
– User simply issues a query and receives a result
versus
• Ad hoc integration
– Consistent standards for discovery and data access, but retrieved data are visualized in a common environment and user interactively integrates the data
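The contrast between the two styles can be sketched in a few lines. In this toy, the sources, field names, and units are all invented for illustration: two sources already share a schema, so one query covers both, while a third source must first be mapped by the user.

```python
# Two hypothetical sources that already share one schema (a priori integration).
SOURCE_A = [{"site": "A1", "density_g_cm3": 2.67}]
SOURCE_B = [{"site": "B1", "density_g_cm3": 2.80}]

# A third hypothetical source with its own field names and units (kg/m^3).
SOURCE_C = [{"station": "C1", "rho": 2750.0}]


def query_a_priori(sources, min_density):
    """One query runs unchanged over every source because the schema is pre-agreed."""
    return [row for src in sources for row in src if row["density_g_cm3"] >= min_density]


def integrate_ad_hoc(rows, field_map, converters):
    """The user supplies field mappings and unit conversions after inspecting the data."""
    mapped = []
    for row in rows:
        new_row = {}
        for src_field, dst_field in field_map.items():
            convert = converters.get(src_field, lambda value: value)
            new_row[dst_field] = convert(row[src_field])
        mapped.append(new_row)
    return mapped


# The mapping below is the "interactive" step a user would perform by hand.
mapped_c = integrate_ad_hoc(
    SOURCE_C,
    field_map={"station": "site", "rho": "density_g_cm3"},
    converters={"rho": lambda value: value / 1000.0},
)
print(query_a_priori([SOURCE_A, SOURCE_B, mapped_c], min_density=2.7))
```

The cost of a priori integration is paid up front, in agreeing on the schema; ad hoc integration defers that cost to each user and each query.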
Evolution of distributed environments
• Mainframes – with distributed “synchronous” terminals
• Networked minicomputers – with proprietary computer networking protocols
• The Web
– Engineering workstations with open communications protocols
Evolution of distributed environments
• The Grid
– Distributed computational and storage resources owned by organizations, orchestrated together to form “metacomputers”
• The Cloud
– On-demand computational and storage resources provided as a service over the Internet, with incremental cost models
Clients in a distributed environment
• “Dumb” terminals
– IBM 3270, vt100
• “Thick” clients
– Workstations as clients in a client-server system
• “Thin” clients
– Original PC desktops
• Thick clients
– Modern PCs with powerful capabilities (64-bit, multicore, large memory)
• Thin clients
– Mobile devices
Distributed environments…contd.
• Service-oriented architecture, SOA
– A programming style for distributed computing
– Services may be distributed in wide area (Internet scale) – or local area (within a datacenter)
• Data inertia
– Moving data to computation vs
– Computation to data
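Data inertia comes down to simple arithmetic: transfer time is roughly size divided by bandwidth. The sketch below compares shipping a data set with shipping the code that processes it. The function names and the example sizes are assumptions for illustration, and the model ignores latency, protocol overhead, result size, and whether the data's host can actually run the computation.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Idealized transfer time: bits to send divided by link bandwidth."""
    return size_bytes * 8 / bandwidth_bits_per_s


def cheaper_to_move(data_bytes, code_bytes, bandwidth_bits_per_s):
    """Return 'computation' or 'data', whichever takes less time to ship."""
    data_time = transfer_seconds(data_bytes, bandwidth_bits_per_s)
    code_time = transfer_seconds(code_bytes, bandwidth_bits_per_s)
    return "computation" if code_time < data_time else "data"


ONE_TERABYTE = 10**12   # example data set size (assumed)
TEN_MEGABYTES = 10**7   # example size of the analysis code (assumed)
ONE_GIGABIT = 10**9     # example link speed in bits per second (assumed)

print(transfer_seconds(ONE_TERABYTE, ONE_GIGABIT))
print(cheaper_to_move(ONE_TERABYTE, TEN_MEGABYTES, ONE_GIGABIT))
```

Under these assumptions, 1 TB over a 1 Gbit/s link takes 8,000 seconds (over two hours), while 10 MB of code takes a fraction of a second, so large data sets tend to stay put and computation moves to them.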
Virtual Organizations (VOs)
• A socio-technical concept
• A distributed collection of entities and resources that come together to solve a specific problem
– Multiple participants
– Distributed sites
– Participants are from different “administrative domains”
– Policies, rules, systems of the VO may be different than those of the participating organizations
• Requires agreement on basic standards and protocols to enable resource and data sharing
Other Geoinformatics Efforts
• OneGeology.org
– International initiative of geological surveys to make dynamic geological map data available via the web.
• USGS initiative
– Presentation by Dr. Linda Gundersen, at Geoinformatics 2007, San Diego.
USGS: 1000s of National and Regional Databases
• The National Map – topographic, elevation, orthoimagery, transportation, hydrography, etc.
• Geospatial One Stop portal
• MRDATA – Mineral Resources and Related Data
• The National Geologic Map Database – standardized community collection of geologic mapping
• National Water Information System – NWISWeb
• National Geochemical Survey Database (PLUTO, NURE)
• National Geophysical Database (aeromag, gravity, aerorad)
• Earthquake Catalogs
• North American Breeding Bird Survey
• National Vegetation/speciation maps
• National Oil and Gas Assessment
• National Coal Quality Inventory
Source: Presentation by Dr. Linda Gundersen, USGS, at Geoinformatics 2007, San Diego, CA.
USGIN: Geoscience Information Network