6-7 May 2002 CEOS GRID Workshop 1
European DataGRID for EO
[email protected] - [email protected]
ESRIN, 6-7 May 2002
CEOS Workshop on GRID
Summary
• EO applications and GRID requirements
• ESA EO participation in European GRID projects – DataGrid
• Ideas for CEOS
Earth Observation Community GRID interactive scenario
• Common access to EO missions catalogues
• Acquisition plan, order, delivery
• Parametric data fusion and models integration
• Collaborative publishing of results
• On demand high level products generation
EO and Networking Computing – which data models?
• Distributed Computing
  – Integration of data from various instruments and missions
• High-Throughput Computing
  – Interferometry …
• On-Demand Computing
  – Generation of EO user products…
• Data-Intensive Computing
  – Archive data re-processing, climate modeling…
• Collaborative Computing
  – Scientists application interactions, Instrument cal/val …
Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999
High demanding computing
Pomona (Cal): subsidence velocity fields
40 ERS1/2 images (92-99), Ambiguity: 28 mm
Digital Elevation Model
GRID requirements:
• large data files (10+ GB)
• stages with intensive processing
• science driven value adding
Science collaborative environment: El Niño
November 1997: El Niño
January 1999: La Niña
[Figure panels: SST and SST anomaly]
Global fire atlas - ATSR: 1997
Global fire atlas - ATSR: 1998
• Provide a single access point to space systems to emergency & rescue organisations in case of disasters
• Participating Space Agencies: CNES, CSA, ESA, ISRO, NOAA, …
• Missions: RADARSAT; ERS, (Envisat); SPOT; IRS; NOAA, …
Earthnet Facilities real time Infrastructure
[Diagram labels]
• Facilities: ESRIN, MATERA (I), NEUSTRELITZ (D), KIRUNA (S) - ESRANGE, MASPALOMAS (E), TROMSO (N)
• Missions: SEAWIFS, SPOT, IRS-P3, LANDSAT 7, TERRA/MODIS, AVHRR
• Metadata, browse, web
• Standard production chains
• Multimission databases for remote access and user services
• Historical archives
• Products, users
ENVISAT FACILITIES ORGANISATION
Decentralised architecture, central co-ordination and supervision.
National facilities put at ESA’s disposal via MOUs and contracts.
Direct dealing with scientific users (outside ESA operational remit)
Co-operation with value added industry in E.O. promotion and in technology transfer from research to applications.
[Diagram labels: ESRIN, I-PAC, D-PAC, F-PAC, UK-PAC, E-PAC, LRAC/S-PAC, F-PAF, FIN Co-PAC, ESOC]
AOs: Stimulating scientific research world-wide
[Map: P.I. geographic distribution by country. Legend: No Projects, 1-25 Projects, 26-50, 51-100, 100+ Projects]
[Bar chart: Nr of projects (axis 0 to 700) for AO-1 (1986), AO-2 (1994), AO-3 (1998), AO-ENVISAT (2000)]
• 3500+ science Users of ESA data
• 120 New Cat-1 Projects in 2001
• 700 Envisat AOs to start in 2002
Stimulating new research
Why GRID in EO? (1)
• EO Community: Progressive refinement of data from many sources to produce higher quality products
• Product generation chain involving distributed organisations and users
• Collaborative: distributed users and data – large international cooperation
• Discovery: large numbers of products & resources
• Interoperability of catalogue and metadata already in operation
• Web based data services
Why GRID in EO? (2)
• Massive, non-stop data volumes
• New instruments, sensors & product types
• Distributed archives
• Historical dataset reprocessing
• Complex numerical processing algorithms
• Near real-time turnover
The Grid from a Services View
[Layered diagram labels]
• Applications: Space Science, S/C modelling, Cosmology, Space weather, Environment, EO, ...
• Application Toolkits: Distributed Computing Toolkit, Data-Intensive Applications Toolkit, Collaborative Applications Toolkit, Remote Visualization Applications Toolkit, Problem Solving Applications Toolkit, Remote Instrumentation Applications Toolkit
• Grid Services (GRID Middleware): Resource-independent and application-independent services. E.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
• Grid Fabric (Resources): Resource-specific implementations of basic services. E.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
Needed GRID technologies
• Resource-independent and application-independent services (middleware)
  – authentication, authorization, resource location, resource allocation, remote data access,
  – accounting, security, quality of services, fault detection, real time services, …
• Specialized protocols, procedures, data standards, operational environments, interfaces to EO legacy systems…
• EO dedicated portal and user access…
Participation in GRID initiatives
Participation in European GRID projects
EU funded
• DataGRID – Earth Observation application
• EGSO – Solar radiance
• DataTAG – access to Trans Atlantic Connectivity
• …
ESA funded
• SpaceGRID – vision of GRID systems for space
• ESA internal GRID initiative
• …
DataGrid EO application objectives
• Specification of EO requirements
• Bringing Grid-aware application concepts into the Earth Science environment
• Adaptation of existing systems and selected EO applications to use the DataGrid infrastructure
• Testbed validation through prototyping activity
• Activities handled in coordination and synchronisation with other related and relevant work packages
Key partners: ESA-ESRIN, KNMI (NL), IPSL (F)
Associated partners: ENEA (I), BADC (UK)
GOME Instrument (1 day coverage)
GOME’s Ground track
Application of DataGrid in EO
• One Use Case being studied in detail (GOME)
• Develop generic components
• Feedback to DataGrid developers and Architecture Group
• Re-use components to add new applications
• Testing in “controlled” GRID environment (ESRIN-ENEA) and in “wide-European” environment
Why Grid in EO?
An Example: GOME Use Case
Process 1 Year of data
• Regulated Access to Grid processing power
• Secure access to Grid-registered high-volume data storage
[Diagram labels: RAW, L1, L2, L3, VAL; ESA, ESA / KNMI, IPSL; End User, Science Application]
L1: 4724 files = 66 Gb
L2: 9,448,000 files = 108 Gb
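As a rough sense of scale for the GOME figures, the average file size per product level can be worked out directly. This is only a back-of-envelope sketch. It assumes that "Gb" on the slide means gigabytes in decimal units (10^9 bytes) and that the first file count belongs to L1 and the second to L2; the slide states neither explicitly.

```python
# Back-of-envelope average file sizes for the GOME use case.
# Assumptions (not stated on the slide): "Gb" = 10**9 bytes, and the
# two file counts pair with L1 and L2 in the order they appear.
GB = 10**9

datasets = {
    "L1": {"files": 4724, "gigabytes": 66},
    "L2": {"files": 9_448_000, "gigabytes": 108},
}


def average_file_size_bytes(files: int, gigabytes: float) -> float:
    """Return the mean size of one file in bytes."""
    return gigabytes * GB / files


for level, d in datasets.items():
    size = average_file_size_bytes(d["files"], d["gigabytes"])
    print(f"{level}: about {size / 1e6:.3f} MB per file")
```

Under these assumptions the L1 files average roughly 14 MB each and the L2 files roughly 11 kB each, so the two levels differ far more in file count than in total volume.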
DataGrid Overview (1/5): Organization
[Diagram labels]
• Architecture Group
• Middleware Developers: WP1, WP2, WP3, WP4, WP5, WP7
• Middleware Packages: Information Index, Resource Broker, User Interface, Computing Element, Storage Element, Information & Monitoring, Replica Management, Installation Management, Network Monitoring
• Integration (WP6): Integration Testing
• CVS Repository, Certificate Authorities, EDG Rules, EDG Membership Registration, Documentation
• Applications (WP8-9-10): Requirements, Evaluation & Prototyping, Application Environments
• Sites: Sites Installation, Installation Management; Sites A to H (SE shown at Sites B to H)
DataGrid Overview (2/5): VO registration and information publishing
1. Obtain certificate
2. Join VO
3. Sites subscribe to one or more VOs
4. Publish details
5. Submit Jobs
[Diagram labels: Users, Certificate Authorities, VO LDAP Server, Information Index, Grid Resource Broker, User Interface, Search; Grid fabric resources: Sites A to H (CE and SE shown at Sites B to H)]
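The five numbered steps on this slide can be illustrated with a small in-memory model. This is a sketch only: the class and function names (`User`, `Site`, `InformationIndex`, `join_vo`, `sites_for_job`) and the VO names are hypothetical and do not correspond to any DataGrid API; the real system uses certificates, an LDAP server and middleware services rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    has_certificate: bool = False          # set once step 1 is done
    vos: set = field(default_factory=set)  # VOs the user has joined


@dataclass
class Site:
    name: str
    supported_vos: set  # step 3: VOs this site subscribes to


class InformationIndex:
    """Hypothetical stand-in for the Information Index on the slide."""

    def __init__(self):
        self._sites = {}

    def publish(self, site: Site) -> None:
        # Step 4: a site publishes its details.
        self._sites[site.name] = site

    def search(self, vo: str) -> list:
        # Names of sites that accept jobs from the given VO.
        return sorted(
            name for name, site in self._sites.items()
            if vo in site.supported_vos
        )


def join_vo(user: User, vo: str) -> None:
    # Step 2 depends on step 1: no certificate, no VO membership.
    if not user.has_certificate:
        raise PermissionError("a certificate is required before joining a VO")
    user.vos.add(vo)


def sites_for_job(user: User, vo: str, index: InformationIndex) -> list:
    # Step 5: only VO members may submit; the broker searches the index.
    if vo not in user.vos:
        raise PermissionError(f"{user.name} is not a member of VO {vo!r}")
    return index.search(vo)


index = InformationIndex()
index.publish(Site("Site B", {"eo"}))
index.publish(Site("Site C", {"hep"}))
index.publish(Site("Site D", {"eo", "hep"}))

alice = User("alice")
alice.has_certificate = True  # step 1: obtain certificate
join_vo(alice, "eo")          # step 2: join VO
print(sites_for_job(alice, "eo", index))
```

The point of the model is the ordering constraint: a job can only be matched to sites once the user holds a certificate, belongs to a VO, and the sites have subscribed to that VO and published their details.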
DataGrid Overview (3/5): Job submission with local data
[Diagram labels]
• User Interface: My job (JDL script, Executable, input data)
• Actions: Submit job, Check certificate, Search, Request status, Retrieve result
• Components: Resource Broker, Information Index, Certificate Authorities
• Sites B to H, each with CE and SE
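The "JDL script" that accompanies the executable and input data is a short text file of attribute assignments. The sketch below renders one from a Python dictionary. The attribute names used (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are given from memory of the DataGrid JDL and should be checked against the JDL documentation for the middleware release in use; the script and file names are hypothetical, and `render_jdl` is an illustrative helper, not part of any DataGrid tool.

```python
def render_jdl(attributes: dict) -> str:
    """Render a flat dict as 'Name = value;' lines.

    Strings are quoted; lists and tuples become brace-delimited lists.
    """
    lines = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = "{" + ", ".join(f'"{v}"' for v in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines)


# Hypothetical job: the script and data file travel with the job
# ("local data"), and the output files are retrieved afterwards.
job = {
    "Executable": "gome_process.sh",   # hypothetical script name
    "Arguments": "orbit_0001.lv1",     # hypothetical input file
    "StdOutput": "std.out",
    "StdError": "std.err",
    "InputSandbox": ["gome_process.sh", "orbit_0001.lv1"],
    "OutputSandbox": ["std.out", "std.err"],
}

print(render_jdl(job))
```

Shipping the input file in the sandbox matches the "local data" case on this slide; it only suits small inputs, which is why the following slides introduce replication.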
DataGrid Overview (4/5): Data replication
[Diagram labels]
• User Interface: My data (input data)
• Actions: Submit job, Replicate
• Components: Replica Manager, Replica Catalog
• Sites B to H, each with CE and SE
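The replication step can be pictured as "copy the data to one or more Storage Elements, then record each copy in the catalogue". The toy model below shows that bookkeeping. Everything in it is an assumption made for illustration: the class names, the `replicate` method and the `site:/name` location format are hypothetical, and no actual file transfer takes place.

```python
class ReplicaCatalog:
    """Toy in-memory catalogue: one logical name maps to many locations."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name: str, location: str) -> None:
        self._entries.setdefault(logical_name, []).append(location)

    def lookup(self, logical_name: str) -> list:
        # Return a copy so callers cannot mutate the catalogue.
        return list(self._entries.get(logical_name, []))


class ReplicaManager:
    """Toy manager: 'copies' data to Storage Elements and registers it."""

    def __init__(self, catalog: ReplicaCatalog):
        self.catalog = catalog
        self.storage = {}  # SE name -> set of locations held there

    def replicate(self, logical_name: str, storage_elements: list) -> list:
        for se in storage_elements:
            location = f"{se}:/{logical_name}"  # hypothetical naming scheme
            # Stand-in for the real file transfer to the SE.
            self.storage.setdefault(se, set()).add(location)
            # Record the new copy so later jobs can find it.
            self.catalog.register(logical_name, location)
        return self.catalog.lookup(logical_name)


catalog = ReplicaCatalog()
manager = ReplicaManager(catalog)
replicas = manager.replicate("gome_l1_orbit_0001", ["site-b", "site-e"])
print(replicas)
```

Keeping the copy and the catalogue update inside one call is a deliberate choice in this sketch: a replica that exists on disk but is not registered cannot be found by later jobs.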
DataGrid Overview (5/5): Job submission using replicated data
[Diagram labels]
• User Interface: My job (JDL script, Executable); input data referenced by LFN
• Actions: Submit job, Check certificate, Search (Information Index), Search (Replica Catalog), Request status, Retrieve result
• Components: Resource Broker, Information Index, Replica Catalog, Certificate Authorities
• Replica Catalog entries: LFN :: PFN
• LFN = Logical filename, PFN = Physical filename
• Sites B to H, each with CE and SE
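The LFN :: PFN entries describe a lookup: the job names its input by logical filename, and the broker resolves that name to physical copies before choosing where to run. A minimal sketch of that matching follows. The catalogue contents, the `site/path` PFN format and the "run where all inputs already reside" rule are assumptions for illustration; the actual Resource Broker's ranking is more elaborate and is not described on the slide.

```python
# Hypothetical catalogue: each LFN maps to one or more PFNs.
# The "site/path" PFN format is a made-up convention for this sketch.
replica_catalog = {
    "lfn:gome_l1_orbit_0001": [
        "site-b/data/orbit_0001.lv1",
        "site-e/archive/orbit_0001.lv1",
    ],
    "lfn:gome_l1_orbit_0002": [
        "site-e/archive/orbit_0002.lv1",
    ],
}


def sites_holding(lfn: str, catalog: dict) -> set:
    """Return the sites that hold at least one replica of the LFN."""
    return {pfn.split("/", 1)[0] for pfn in catalog.get(lfn, [])}


def choose_site(input_lfns: list, catalog: dict, available_sites: list):
    """Pick an available site that holds replicas of every input LFN.

    Returns None when no single site has all the inputs. Ties are
    broken alphabetically so the result is deterministic.
    """
    candidates = set(available_sites)
    for lfn in input_lfns:
        candidates &= sites_holding(lfn, catalog)
    return min(candidates) if candidates else None


available = ["site-b", "site-c", "site-e"]
print(choose_site(["lfn:gome_l1_orbit_0001"], replica_catalog, available))
print(choose_site(
    ["lfn:gome_l1_orbit_0001", "lfn:gome_l1_orbit_0002"],
    replica_catalog,
    available,
))
```

Because the job refers only to LFNs, it does not need to change when replicas are added or moved; only the catalogue entries do.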
DataGrid Activities
• Testbed validation
  – writing scripts to test and validate Testbed1 services
• Develop Use Cases for end-to-end GOME processing and validation demonstration (across three sites in Holland, France and Italy)
• Develop EO Grid Application Interfacing Components
  – for generic application interfacing
• High-speed connection to ENEA HPC network
• Installation of ESRIN DataGrid site
  – using DataGrid installation tools
  – installation of 2 CEs:
    • ESRIN cluster using PBS
    • ENEA using LSF/AFS
  – and 1 SE 0.5TB RAID array on ESRIN cluster populated from ESA AMS MSS archive
DataGrid Issues
• Very-large-scale, complex system with large numbers of participants
  – Dealing with new concepts and technology
  – Communication and coordination in large, distributed, multi-cultural, multi-institutional development group
  – Aggressive deployment of middleware releases
• Driven by needs of HEP
  – With EO & Biology contributions
  – Reliant on HEP making the right choices
• Testbed stability, usability, performance and scalability
  – Application Grid interfacing layer needs to be developed
  – After CLIs, need APIs
• Ongoing rapid prototyping and development
  – Keeping step with code & documentation
• Architecture will evolve according to findings
  – Will take time to make fair assessment
Future Directions
• In general
  – OGSA and integration of Web Services
  – Wider uptake of Grid computing concepts
• In EO
  – Matrix of common application requirements
  – Development of Generic Grid platform interface components
  – Portals-based
  – Application Frameworks
Considerations for CEOS “involvement” in GRID
• “gridding” of EO emerging technologies and services
  – Interoperability
  – EO data format handling
  – Web-mapping
  – Archive management
• Demonstrate GRID applications
  – International project dimension
  – collaborative environment
  – relation with IGOS, WGISS Test Facilities …
• Support “CEOS standardisation” approach to metadata and data access