Page 1: A Grid Portal for Earth Observation Community G. Aloisio, M. Cafaro, G. Cartenì, I. Epicoco, G. Quarta 8° WORKSHOP SUL CALCOLO AD ALTE PRESTAZIONI IN ITALIA

A Grid Portal for Earth Observation Community

G. Aloisio, M. Cafaro, G. Cartenì, I. Epicoco, G. Quarta

8° WORKSHOP SUL CALCOLO AD ALTE PRESTAZIONI IN ITALIA (8th Workshop on High Performance Computing in Italy), CAPI’04

Milan, 24-25 November 2004

Center for Advanced Computational Technologies, University of Lecce - Italy

Page 2

OUTLINE

• The GRID.IT project

• Remote sensing background

• Remote sensing ISSUES

• Grid computing paradigm for remote sensing data processing and management

• The Italian Grid For Earth Observation (I-GEO) project: a prototype of grid infrastructure for the remote sensing community

• SPACI Project: the testbed infrastructure

Page 3

THE GRID.IT Project

This project, coordinated by the National Research Council (CNR), is defined within the scientific and technological context of new ICT platforms and large-scale distributed systems. The goal is to study and experiment with systems and software tools that are innovative at all levels, and to demonstrate their capabilities through specific applications.

National Coordinator: Prof. Marco Vanneschi, University of Pisa and ISTI-CNR, Pisa

Research Units

• CNR, ISTI, Pisa (D. Laforenza)
• CNR, ISTM, Perugia (M. Rosi)
• CNR, ICAR, Napoli (A. Murli)
• INFN, Padova (M. Mazzucato)
• CNIT, Pisa (G. Prati)
• ASI, Matera (G. Milillo)

Page 4

THE GRID.IT Project

Work Packages

WP1. Grid Oriented Optical Switching Paradigms (Piero Castoldi, CNIT, Pisa)
WP2. High Performance Photonic Testbeds (Stefano Giordano, CNIT, Pisa)
WP3. Grid Deployment (Cristina Vistoli, INFN, Padova)
WP4. Security (Maurizio Talamo, Univ. of Roma "Tor Vergata")
WP5. Data Intensive Core Services (Cristina Vistoli, INFN, Padova)
WP6. Knowledge Services for Intensive Data Analysis (Franco Turini, Univ. of Pisa)
WP7. Grid Portals (Giovanni Aloisio, Univ. of Lecce, ISUFI)
WP8. High-performance Component-based Programming Environment (Marco Danelutto, Univ. of Pisa)
WP9. Grid-enabled Scientific Libraries (Almerico Murli, Univ. Napoli & ICAR-CNR, Napoli)
WP10. Grid Applications for Astrophysics (Leopoldo Benacchio, Inaf, Padova)
WP11. Grid Applications for Earth Observation Systems Applications (Giovanni Milillo, ASI, Matera)
WP12. Grid Applications for Biology (Alberto Apostolico, Univ. of Padova)
WP13. Grid Applications for Molecular Virtual Reality (Antonio Laganà, Univ. Perugia & ISTM-CNR)
WP14. Grid Applications for Geophysics (Antonio Navarra, INGV, Bologna)
WP15. Project Management (Marco Vanneschi, Univ. of Pisa)

Page 5

What is Remote Sensing?

[Diagram labels: energy sources; variables to be detected; observation systems and platforms; acquisition and archiving systems (satellite dish); pre-processing and post-processing systems; final users.]

Page 6

What is Remote Sensing?

Applications:

• Earth studies (mapping ocean features, land use, environmental change, emergency search and rescue, natural disaster monitoring and response, etc.)

• Urban development planning (unauthorized building, traffic planning and monitoring, etc.)

• Military intelligence (target detection and recognition, spying, etc.)

• …

[Images: Izmit earthquake, 1999 (SAR interferogram); Maastricht flooding, 1995 (SAR imagery); Straits of Messina, 1994 (SAR imagery); Ikonos meter-resolution satellite imagery; Sicily, 2003 (ENVISAT MERIS imagery).]

Page 7

Remote Sensing ISSUES

• Terabytes of data per month are produced by several Earth observation missions: high-performance, high-capacity storage systems are needed,

• Data mining over geographically spread archiving centers: efficient and secure mechanisms are needed for searching, accessing, manipulating and integrating data,

• Complex, computationally intensive processing: several applications must run on high performance computing resources to achieve a single result,

• The high performance computing resources must be used efficiently.

Page 8

Remote Sensing ISSUES

Terabytes of data per month produced by several Earth observation missions

[Images: ENVISAT, ERS-1/2, JERS, SRTM, LANDSAT, …]

Page 9

Remote Sensing ISSUES

Data mining over geographically spread archiving centers / Multi-source data integration

[Diagram: a user at a laptop, marked with a question mark.]

Page 10

Remote Sensing ISSUES

Complex, computationally intensive processing

Page 11

GRID computing point of view

Today, GRID COMPUTING is an emerging technology for solving problems in dynamic, multi-institutional Virtual Organizations (VOs) that coordinate the sharing of resources such as high-performance computers, observation devices, data and databases over high-speed networks.

[Diagram: three Virtual Organizations, VO1, VO2 and VO3.]

Page 12

The Italian GRID for EARTH Observation project, I-GEO

Page 13

The Italian GRID for EARTH Observation project, I-GEO

[Diagram labels: five Primary Nodes; Portal Node; Configuration Repository; http/https and GSI links; laptop clients for user, administrator and contributor.]

Page 14

The Italian GRID for EARTH Observation project, I-GEO

[Diagram labels: users (laptops); Information System; Distributed Data Management System; Services Discovery Module; Workflow Interface; Scheduler and Monitoring Module; Configuration Repository; primary nodes; data.]

Page 15

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Information System

The Information System is an information source that contains an ontology of the applications usually employed in the remote sensing field. The information related to an application includes:

(a) input and output data formats;
(b) application capabilities (we distinguish applications with pre-processing, post-processing or data format conversion capability);
(c) the information needed to launch the application, i.e. hostname, pathname, shared libraries on which the application depends, environment variables, application arguments and so on.

[Diagram labels: software tools characterization; hardware resources characterization; data formats characterization.]
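The three kinds of per-application information listed on this slide can be pictured as a small record type. The following is only a minimal sketch: the field names, the example hosts and paths, and the `find_converters` helper are assumptions made for illustration and do not come from the I-GEO implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Capability(Enum):
    """The three capability classes distinguished on the slide."""
    PRE_PROCESSING = "pre-processing"
    POST_PROCESSING = "post-processing"
    FORMAT_CONVERSION = "data format conversion"


@dataclass
class ApplicationEntry:
    """One application record; all field names are hypothetical."""
    name: str
    input_formats: list          # (a) accepted input data formats
    output_formats: list         # (a) produced output data formats
    capability: Capability       # (b) what the application can do
    hostname: str                # (c) where it is installed
    pathname: str                # (c) path of the executable
    shared_libraries: list = field(default_factory=list)
    environment: dict = field(default_factory=dict)
    arguments: list = field(default_factory=list)

    def command_line(self):
        """Return the argv list needed to launch the application."""
        return [self.pathname, *self.arguments]


def find_converters(entries, source_format, target_format):
    """Select format converters able to turn source_format into target_format."""
    return [
        e for e in entries
        if e.capability is Capability.FORMAT_CONVERSION
        and source_format in e.input_formats
        and target_format in e.output_formats
    ]


# Hypothetical entries, invented for this example only.
EXAMPLE_ENTRIES = [
    ApplicationEntry("sar_focus", ["RAW"], ["SLC"], Capability.PRE_PROCESSING,
                     "node1.example.org", "/opt/eo/bin/sar_focus",
                     arguments=["--looks", "4"]),
    ApplicationEntry("slc2geotiff", ["SLC"], ["GEOTIFF"],
                     Capability.FORMAT_CONVERSION,
                     "node2.example.org", "/opt/eo/bin/slc2geotiff"),
]
```

A lookup such as `find_converters(EXAMPLE_ENTRIES, "SLC", "GEOTIFF")` is the kind of query a workflow engine could issue when two consecutive applications disagree on data format.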

Page 16

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Information System

• Centralized approach,
• Web-based management,
• XML-based,
• Several levels of access (user, contributor, administrator)
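An XML-based repository with three access levels could look roughly like the sketch below. The slide names the levels but does not say what each one may do, so the XML layout and the permission table are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository layout; the real I-GEO schema is not given in the slides.
REPOSITORY_XML = """
<repository>
  <application name="sar_focus" capability="pre-processing">
    <input format="RAW"/>
    <output format="SLC"/>
  </application>
</repository>
"""

# Assumed mapping from access level to allowed actions.
PERMISSIONS = {
    "user": {"read"},
    "contributor": {"read", "add"},
    "administrator": {"read", "add", "delete"},
}


def can(role, action):
    """Return True if the given access level allows the action."""
    return action in PERMISSIONS.get(role, set())


def list_applications(root):
    """Return the names of all applications stored in the repository."""
    return [app.get("name") for app in root.findall("application")]


def add_application(root, role, name, capability):
    """Append a new application element, if the role is allowed to add."""
    if not can(role, "add"):
        raise PermissionError(f"{role} may not add applications")
    return ET.SubElement(root, "application", name=name, capability=capability)
```

Under these assumptions a plain user can only browse the repository, while contributors and administrators can extend it.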

Page 17

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Distributed Data Management System

Based on a common metadata schema for describing EO and geospatial products, it uses modules from the Grid Relational Catalog (GRelC) project to provide transparent, efficient and secure integration of heterogeneous data:

• Based on GRelC project libraries,
• Peer-to-peer approach,
• Management of heterogeneous data sources,
• Secure access to the distributed catalogues,
• Efficiency obtained through the GridFTP data transfer protocol and data compression algorithms (Lempel-Ziv 77 and Huffman coding),
• Distributed and geographic queries are allowed,
• Common schema for describing remote sensing data
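The combination of Lempel-Ziv 77 and Huffman coding mentioned above is the same pairing used by the DEFLATE format, which Python exposes through the standard `zlib` module. The slides do not say which implementation GRelC uses, so the sketch below only illustrates the idea of compressing a query result set before transfer; the JSON encoding and the function names are assumptions.

```python
import json
import zlib


def pack_result_set(rows):
    """Serialize a list of metadata rows and compress it with DEFLATE.

    DEFLATE applies LZ77 matching followed by Huffman coding.
    """
    raw = json.dumps(rows).encode("utf-8")
    return zlib.compress(raw, 9)  # 9 = strongest compression level


def unpack_result_set(blob):
    """Reverse pack_result_set: decompress and parse the rows."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Catalogue result sets tend to repeat the same field names and values in every row, which is the kind of redundancy LZ77 removes well.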

Page 18

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Distributed Data Management System

• The GRelC Service (GS) modules provide a robust and uniform access interface to each remote sensing data repository (basic primitives to transparently interact with heterogeneous data sources),
• The Enhanced GRelC Gather Service (EGGS) modules gather partial result sets coming from several GSs and other EGGSs, submit queries to the connected GSs, and forward queries to the other EGGSs connected to them
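The division of labour between GS and EGGS modules can be sketched as follows. This is an in-memory illustration, not the GRelC API: the class names, the predicate-based query and the `visited` set (used here to stop a query from circulating forever among mutually connected peers) are all assumptions.

```python
class GRelCService:
    """Stand-in for a GS: uniform query access to one data repository."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def query(self, predicate):
        """Return the records of this repository that match the predicate."""
        return [r for r in self.records if predicate(r)]


class GatherService:
    """Stand-in for an EGGS: queries its GSs, forwards to peers, merges results."""

    def __init__(self, name):
        self.name = name
        self.services = []  # GS modules connected to this EGGS
        self.peers = []     # other EGGS modules connected to this one

    def query(self, predicate, visited=None):
        visited = set() if visited is None else visited
        if self.name in visited:
            return []  # this EGGS has already answered the query
        visited.add(self.name)
        results = []
        for gs in self.services:   # submit the query to the connected GSs
            results.extend(gs.query(predicate))
        for peer in self.peers:    # forward the query to the other EGGSs
            results.extend(peer.query(predicate, visited))
        return results
```

If the same GS were attached to more than one EGGS, this sketch would return its records twice; a real gather service would need some form of duplicate handling.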

Page 19

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Workflow Management System

An integrated workflow management system for EO applications that includes a web-based user interface and a resource manager optimized for EO applications; it interacts with underlying services such as the I-GEO Information System and the monitoring systems provided by the grid middleware:

• a grid-workflow composition interface is provided,
• the submitted workflows are verified and mapped onto grid resources as real jobs and data transfers,
• the workflow engine automatically starts data format conversion, if needed, in order to ensure maximum compatibility among data formats and applications
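The automatic data format conversion described here can be illustrated on a linear workflow: whenever one step produces a format that the next step does not accept, a converter step is inserted between them. This is a minimal sketch under simplifying assumptions (a linear chain, one format per step, single-hop conversions); the step and format names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One workflow step with a single input and output format."""
    name: str
    input_format: str
    output_format: str


def insert_conversions(steps, converters):
    """Return the workflow with converter steps added where formats differ.

    converters maps (source_format, target_format) to a converter name.
    """
    if not steps:
        return []
    result = [steps[0]]
    for nxt in steps[1:]:
        prev = result[-1]
        if prev.output_format != nxt.input_format:
            key = (prev.output_format, nxt.input_format)
            if key not in converters:
                raise ValueError(f"no converter from {key[0]} to {key[1]}")
            # Insert the conversion between the two incompatible steps.
            result.append(Step(converters[key], key[0], key[1]))
        result.append(nxt)
    return result
```

A workflow that fails here with `ValueError` would correspond to a submitted workflow that does not pass verification.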

Page 20

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Workflow Management System

Page 21

The Italian GRID for EARTH Observation project, I-GEO

I-GEO Scheduler and Monitoring Module

This is the component responsible for job scheduling and file transfer. It takes into account the available computational resources, the locations where the datasets are stored and where the services are installed, and several performance parameters provided both by the Network Weather Service and by a historical archive.

The software scheduling modules that operate in these two layers are:

• Workflow Controller

• Job Scheduler

They are implemented as web services, so they communicate via SOAP messages (APACHE-AXIS is the web services container).

The execution of a workflow produces the so-called Concrete Workflow: the physical translation of the user-defined workflow, with the details of the executions and data transfers of its set of jobs.
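The scheduling criteria listed on this slide (where the service is installed, where the dataset is stored, and forecast performance) can be combined into a simple cost estimate. The sketch below is an assumption about how such a choice might be made, not the I-GEO Job Scheduler: the cost model, the field names and the units are invented for illustration, with the CPU and bandwidth figures standing in for values of the kind the Network Weather Service forecasts.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A candidate computational resource (all fields are hypothetical)."""
    name: str
    has_service: bool        # is the required application installed here?
    holds_dataset: bool      # is the input dataset already stored here?
    cpu_availability: float  # forecast fraction of CPU available, 0..1
    bandwidth_mbps: float    # forecast bandwidth from the data location


def estimate_seconds(res, dataset_mb, cpu_seconds):
    """Estimated completion time: data transfer plus slowed-down compute."""
    if res.holds_dataset:
        transfer = 0.0
    else:
        transfer = dataset_mb * 8 / res.bandwidth_mbps  # MB to Mbit
    return transfer + cpu_seconds / res.cpu_availability


def choose_resource(resources, dataset_mb, cpu_seconds):
    """Pick the usable resource with the lowest estimated completion time."""
    candidates = [
        r for r in resources
        if r.has_service and r.cpu_availability > 0
        and (r.holds_dataset or r.bandwidth_mbps > 0)
    ]
    if not candidates:
        raise LookupError("no resource offers the requested service")
    return min(candidates,
               key=lambda r: estimate_seconds(r, dataset_mb, cpu_seconds))
```

With this model a small dataset is moved to the faster machine, while a large one is processed where it already resides, which is the trade-off the slide alludes to.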

Page 22

The Italian GRID for EARTH Observation project, I-GEO

Video …

Page 23

The Italian GRID for EARTH Observation project, I-GEO

Some considerations about the employed technologies:

• As grid middleware we have used the GLOBUS TOOLKIT and the GRB / GRB-GSIFTP libraries developed at the CACT/ISUFI of the University of Lecce:

using the Globus Toolkit together with the GRB and GRB-GSIFTP libraries not only simplifies access to remote computational resources but also guarantees security, data integrity and confidential communications

• For the development of the distributed catalogue access mechanism, we have used the Grid Relational Catalogue project (GRelC) libraries developed at the CACT/ISUFI of the University of Lecce:

using the GRelC libraries not only simplifies access to the distributed catalogues and the management of heterogeneous data sources but also guarantees the needed efficiency

Page 24

The Italian GRID for EARTH Observation project, I-GEO

Some considerations about the employed technologies:

The web application has been developed using:

• CGI programs written in C (for the implementation of the distributed catalogue access and the authentication functionalities),

• Java Server Pages (for the implementation of the configuration repository management functionalities and the user interface),

• a Java Applet (for the implementation of the workflow composition interface).

The web application is hosted on a Linux server with Apache 2 as web server and Jakarta-Tomcat as application server.

Page 25

The Southern Partnership for Advanced Computational Infrastructures: the testbed scenario

Southern Partnership for Advanced Computational Infrastructures

A grid infrastructure based on three geographically spread High Performance Computing Centers located in Southern Italy:

• MIUR/HPCC: Center of Excellence for High Performance Computing, University of Calabria. Director: Prof. Lucio Grandinetti

• CPS/CNR: Center for Research on Parallel Computing and Supercomputing (now Section of Naples of ICAR/CNR). Director: Prof. Almerico Murli

• ISUFI/CACT: Center for Advanced Computing Technologies, University of Lecce. Director: Prof. Giovanni Aloisio

Page 26

Conclusions

• In order to build an efficient environment, several aspects have been considered: the description of the resources, their efficient usage, the efficient composition of the user’s processing applications, and so on.

• Each aspect has been investigated, and the architecture of the modules that have been developed has been described.

• The environment is currently being tested on a national grid testbed that uses the geographically spread resources of SPACI (Southern Partnership for Advanced Computational Infrastructures), which together provide a computational power of about 1800 Gflops.

• The people involved in the testbed are computer scientists, physicists, experts in the Earth Observation domain and so on. This heterogeneity ensures that the required feedback is collected and that all of the aspects involved are properly considered and improved if needed.

• As future work, we plan to migrate towards a Grid Services architecture (OGSA compliant, based on the emerging WSRF standard).

Page 28

More information can be found at the URL:

http://leonardo.unile.it/igeo

Director: Prof. Giovanni ALOISIO
Project P. I.: Gianvito QUARTA