CyberGIS Gateway for Enabling Data-Rich
Geospatial Research and Education
Yan Liu, Anand Padmanabhan, and Shaowen Wang
CyberInfrastructure and Geospatial Information Laboratory (CIGI)
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
SGIW2013
Indianapolis, IN September 27, 2013
Outline
• Introduction
• Gateway user environment
• Open service framework
• Spatial data and analytics
o Data
o Analytics
o Integration experience
• Education and outreach
• Future work and concluding discussion
Six Major Goals of the NSF CyberGIS Project
1. Engage multidisciplinary communities through a participatory approach to evolving CyberGIS software requirements;
2. Integrate and sustain a core set of composable, interoperable, manageable, and reusable CyberGIS software elements based on community-driven and open source strategies;
3. Empower high-performance and scalable CyberGIS by exploiting spatial characteristics of data and analytical operations for achieving unprecedented capabilities for geospatial scientific discoveries;
4. Enhance an online geospatial problem solving environment to allow for the contribution, sharing and learning of CyberGIS software by numerous users, which will foster the development of crosscutting education, outreach and training programs with significant broad impacts;
5. Deploy and test CyberGIS software by linking with national and international cyberinfrastructure to achieve scalability to significant sizes of geospatial problems, amounts of cyberinfrastructure resources, and number of users; and
6. Evaluate and improve the CyberGIS framework through domain science applications and vibrant partnerships to gain better understanding of the complexity of coupled human-natural systems.
[Slide callouts: Gateway approach; Gateway applications]
CyberGIS Software Environment
• CyberGIS Gateway
• CyberGIS Toolkit
• GISolve Middleware
CyberGIS Science Communities
• Domain Sciences
o Advanced cyberinfrastructure
o Climate change impact assessment
o Emergency management
o Geographic information science
o Geography and spatial sciences
o Geosciences
• User Communities
o Biologists
o Geographers
o Geoscientists
o Social scientists
o General public
o Broad geographic information systems (GIS) users
Cyberinfrastructure Resources
• XSEDE
o Allocated resources
• Stampede@TACC: large-scale parallel spatial analysis, optimization, and modeling
• Lonestar@TACC: production runs of Gateway applications
• Trestles@SDSC: data-intensive applications
• Gordon@SDSC: dedicated I/O node for data- and compute-intensive services
• Blacklight@PSC: large-scale shared memory runs (Python and C libraries)
• Keeneland@NICS: GPU-based spatial analysis methods
o Software environment configuration on XSEDE
• Environment Modules
• Adaptive CyberGIS Toolkit software module loading/unloading (user-mode)
• Computation and resource management plugins and wrappers
• Open Science Grid (OSG)
o Within OSG: virtual organization (VO) - CIGI
o Through XSEDE: XSEDE-OSG cluster
o Use: high-throughput analysis and modeling methods
• Local clusters
• Cloud resources
[Figure: CyberGIS Software Environment architecture diagram. Labels: Web/Mobile Browser; External Applications; Gateway Portal; Visualization; Data; Analysis; Open Service Framework; Computational Intensity Analysis; CI Resource Management; Computation Management; Integration Services; Build & Test; Service Infrastructure; High Performance; High Throughput; Clouds; Cyberinfrastructure; CyberGIS Gateway; GISolve Middleware; CyberGIS Toolkit]
CyberGIS Gateway Objectives
• Empower an online, high-performance, and collaborative geospatial problem-solving environment
• Enable easy access to cyberGIS analytics and data sources
• Seamlessly integrate advanced cyberinfrastructure, GIS, and spatial analysis and modeling
o Provide Web-based user interactions for handling huge volumes of data, complex analysis, and visualization without exposing the complexities of cyberinfrastructure
• Gain a fundamental understanding of scalable and sustainable geospatial software ecosystems through an online integration platform
Important Characteristics
• Usability
o Highly interactive online user interface
o Transparent access to backend service infrastructure and cyberinfrastructure
• Scalability
o Number of users
o Amount of CI resources
o Number of gateway applications
• Interoperability
o User environment
o Services
o Cyberinfrastructure
• Sustainability
o Consistent interfaces
o Based on open service APIs
Design Approach
• Rich-client Web portal
o Geographic Information Science (GIS) applications require highly interactive user interfaces
o HTML5-based Web technology advances make Web interface usability comparable to desktop GUIs
• Modular user interface component development
o Customizable portal components
o Modular gateway application packages and reusable components
• Open API approach for service environment interactions
o Open service framework
o Standard-based implementation for service interoperability
• Scalable application integration
o Streamlined service integration
o Agile user interface development
• Transparent access to cyberinfrastructure
o Hide cyberinfrastructure details from users
Gateway User Environment
Web Portal
http://gateway.cybergis.org/
Technologies
• Portal
o Rich client: Ext JS (Web) or Sencha Touch (Mobile)
• A rich set of ready-to-use interface components
o Server: PHP + Open Service API
o Client-server communication
• AJAX + JSON
• Application development
o Client-side model-view-controller (MVC)
• Mapping
o Rich client: OpenLayers
• Customized to work with Ext JS 4+
o Backend services
• Base maps
o Google, OpenStreetMap, Bing, ArcGIS
• GeoServer for large data and raster visualization
• Service infrastructure
o GISolve middleware
Application User Interface Development
• MVC code structure
• Mashup library creation
o Dependent libraries
• Web vs. Mobile
o Code compilation (compression)
o Packaging
• Mashup API documentation
o JSDuck-based Ext JS/Sencha Touch documentation
• Integration with portal
o Template-based configuration
• Entry points (menu, app panel)
• Portal environment setting
o User information, token, data servers
• Coherent layout and Web element placement
• app.js
• app/
  • view/
    • map.js
  • model/
    • mapConfig.js
  • store/
    • maps.js
  • controller/
    • mapControl.js
• css/
• resources/
• server-scripts/
  • getMap.php
Reusable Components
• Portal
o Layout
o Menu
o Header
o Footer
• User interface components
o Map panel
o Map layer loading utilities
o Legend configuration
o Visualization panel
o Analysis list
o Data selection
o Upload panel
o Etc.
Open Service Framework
Objectives
• Manage the complexity of cyberGIS application integration and cyberinfrastructure-based computation with a standard service Application Programming Interface (API)
o Bridge the GISolve service infrastructure with the frontend Gateway, external web services, and application developers
• Provide an easy-to-learn and easy-to-use API to support broader access to integrated cyberGIS applications, data, and services
• Streamline the software integration process
• Provide a unified and reliable interface for the daily use of integrated spatial analysis and modeling methods by community users
Open Service Framework Capabilities
• Usability
o Easy to use: Web-based
o Support for multiple programming languages
• PHP, Java, Python, Perl, JavaScript
o Documentation and tutorial resources
• Standard programming interface
o Standard REST Web service interface
o Provide a programmatic way to integrate gateway applications
o Consistent interface to computation management
• Scalable software integration
o Easy to use for cyberGIS application/software contributors within or outside of the project
• Interoperability with Gateway user environment
o Coupled with Gateway identity management service
o API for Gateway application development for accessing underlying service infrastructure
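A minimal sketch of what calling a REST-style service API with a Gateway token could look like from Python. The base URL, the `/jobs` endpoint, the payload fields, and the bearer-token header are all assumptions made for illustration; the slides do not give the actual API. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical values for illustration only: the real service URL, endpoint
# names, and authentication scheme are not given in the slides.
BASE_URL = "https://example.org/api"


def build_job_request(token, app_name, params):
    """Build (but do not send) a JSON POST request that submits an analysis job."""
    body = json.dumps({"application": app_name, "parameters": params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/jobs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_job_request("demo-token", "viewshed", {"dem": "ned10m", "radius_m": 5000})
print(request.get_method(), request.full_url)
```

Because the interface is plain HTTP plus JSON, the same request could be built from any of the languages listed above.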
Open Service Framework
Wang, S., Anselin, L., Bhaduri, B., Crosby, C., Goodchild, M. F., Liu, Y., and Nyerges, T. L. “CyberGIS Software: A Synthetic Review and Integration Roadmap.” International Journal of Geographical Information Science, DOI:10.1080/13658816.2013.776049.
Service Interoperability
• Web services
o Protocols
• SOAP, REST, Open Geospatial Consortium (OGC)
o Data and mapping
• Raster: OGC Web Coverage Service (WCS)
• Vector: OGC Web Feature Service (WFS)
• Mapping: OGC Web Map Service (WMS)
o Metadata
• OGC Catalogue Service for the Web (CSW)
• Service registration
o OGC Web Services
• E.g., Web Processing Service (WPS): supports the registration of REST and SOAP services
• Service interactions in CyberGIS software environment
o GISolve service infrastructure
o ESRI REST API
o OpenTopography Opal2 service toolkit, CSW metadata services, and WPS
o PyWPS for spatial econometric services
• Service integration research
o Extensions of WPS toward cyberinfrastructure-based computation management
• QoS-based resource management and process scheduling
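As an illustration of the OGC mapping protocol listed above, the following sketch assembles a WMS 1.1.1 GetMap request URL from its standard key-value parameters. The endpoint and layer name are placeholders, not actual Gateway services.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wms_getmap_url(endpoint, layer, bbox, width, height,
                   srs="EPSG:4326", fmt="image/png"):
    """Return a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    query = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(query)}"


# Placeholder endpoint and layer name, used only to show the URL shape.
url = wms_getmap_url("https://example.org/geoserver/wms", "demo:elevation",
                     (-89.0, 40.0, -88.0, 41.0), 512, 512)
params = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(params["REQUEST"], params["BBOX"])
```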
Spatial Data and Analytics
Interoperability with ArcGIS Online
• Objectives
o Enable broad data access within Gateway through integrated access to community data sources hosted on ArcGIS Online
o Verify the interoperability between two major cyberGIS software environments: Gateway and ArcGIS Online
• Integration
o Service-oriented chaining
o ArcGIS Online access
• Data: ESRI REST API
• Visualization: OGC WMS/WCS/WFS
o Data access and visualization mashups within Gateway
• Next step
o Workflow support for connecting ArcGIS Online data and CyberGIS analytics
Data Integration - USGS
• National Elevation Dataset
o Digital Elevation Model (DEM)
o 1/3 arc-second (10 m) resolution
o 0.5 TB
o Accessibility issues
• Integration
o A copy of the NED 10 m dataset at UIUC
o Online visualization
• Pyramid building
• Established as OGC WMS
o Streamlined data processing
• Data clipping
• Reprojection
• Integration in application workflow
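The clipping and pyramid-building steps reduce to simple grid arithmetic at 1/3 arc-second resolution. The sketch below assumes a north-up raster with a known upper-left corner and a 256-pixel tile size; both are illustrative assumptions, not details of the actual NED processing service.

```python
import math

CELLS_PER_DEGREE = 3 * 3600  # 1/3 arc-second cells: 10,800 per degree


def clip_window(origin_lon, origin_lat, bbox):
    """Return (col_off, row_off, n_cols, n_rows) covering bbox in a north-up grid.

    origin_lon/origin_lat is the raster's upper-left corner;
    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    col_off = math.floor((min_lon - origin_lon) * CELLS_PER_DEGREE)
    row_off = math.floor((origin_lat - max_lat) * CELLS_PER_DEGREE)
    col_end = math.ceil((max_lon - origin_lon) * CELLS_PER_DEGREE)
    row_end = math.ceil((origin_lat - min_lat) * CELLS_PER_DEGREE)
    return col_off, row_off, col_end - col_off, row_end - row_off


def pyramid_levels(n_cols, n_rows, tile=256):
    """Count 2x downsampling levels needed until the raster fits in one tile."""
    levels = 0
    while max(n_cols, n_rows) > tile:
        n_cols = math.ceil(n_cols / 2)
        n_rows = math.ceil(n_rows / 2)
        levels += 1
    return levels


window = clip_window(-89.0, 41.0, (-88.5, 40.25, -88.25, 40.5))
print(window, pyramid_levels(window[2], window[3]))
```

A quarter-degree clip already spans 2,700 cells per side, which is why pre-built pyramids matter for online visualization of the full dataset.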
Interoperability with OpenTopography
• Objective
o Enable LiDAR-derived digital elevation model (DEM) data access within Gateway through interoperable access between CyberGIS Gateway and the NSF OpenTopography data facility
• Integration
o Service-oriented chaining
o Metadata services
• OGC CSW (Catalogue Service for the Web)
o DEM access and processing
• Opal2 SOAP web services
• Established pre-processing service in CyberGIS service infrastructure
• Seamless integration in application workflow
o User interface
• Mashup library use in Gateway for LiDAR data selection
o Google Fusion Table
o OpenTopography data selection mashup
Padmanabhan, A., Youn, C., Hwang, M., Liu, Y., Wang, S., Wilkins-Diehr, N., and Crosby, C. 2013. Integration of Science Gateways: A Case Study with CyberGIS and OpenTopography. In Proceedings of XSEDE 2013, Jul 22-25 2013, San Diego, CA, USA.
Gateway Applications and Integration
• FluMapper
• CGPySAL spatial regression
• Viewshed analysis
• BioScope
o Decision support for biomass/bioenergy supply chain study
FluMapper: Objectives
• Provide an exploratory tool for public health researchers to detect influenza-like illness (ILI) activity early, as indicated in social media data
• Efficiently manage large volumes of social media data
• Capture the dynamics of flu risk across multiple spatiotemporal scales
o Provide ILI risk maps from national scale to very fine local scales
• Identify movement patterns of population
o Aggregate individual user trajectories across different spatial and temporal scales to generate flows
o Capture travel patterns across national, regional, state, city, and local levels
• Facilitate interactive and exploratory analysis of massive location-based social media data
o Update ILI activity in near real time for the conterminous United States
o Provide an integrated view of flu risk indicators and mobility of users
Padmanabhan, A., Wang, S., Cao, G., Hwang, M., Zhao, Y., Zhang, Z., and Gao, Y. 2013. FluMapper: An Interactive CyberGIS Environment for Massive Location-based Social Media Data Analysis. In Proceedings of XSEDE 2013, Jul 22-25 2013, San Diego, CA, USA.
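The trajectory-to-flow aggregation named in the objectives can be sketched as counting moves between grid cells, with the cell size setting the spatial scale. The regular longitude/latitude grid, the function names, and the sample points are assumptions for illustration; this is not the published FluMapper method.

```python
from collections import Counter


def cell_of(lon, lat, cell_deg):
    """Map a coordinate to an integer grid cell index at the given cell size."""
    return (int(lon // cell_deg), int(lat // cell_deg))


def aggregate_flows(trajectories, cell_deg):
    """Count moves between distinct grid cells over all users' ordered points.

    trajectories maps a user id to a time-ordered list of (lon, lat) points.
    """
    flows = Counter()
    for points in trajectories.values():
        cells = [cell_of(lon, lat, cell_deg) for lon, lat in points]
        for origin, dest in zip(cells, cells[1:]):
            if origin != dest:  # ignore movement within the same cell
                flows[(origin, dest)] += 1
    return flows


# Made-up sample points for two users.
trajectories = {
    "user_a": [(-88.2, 40.1), (-87.6, 41.9), (-87.7, 41.8)],
    "user_b": [(-88.7, 40.2), (-87.9, 41.5)],
}
coarse = aggregate_flows(trajectories, cell_deg=1.0)
fine = aggregate_flows(trajectories, cell_deg=0.5)
print(len(coarse), len(fine))
```

At the coarse scale both users contribute to the same flow; at the finer scale their origins fall into different cells and two separate flows appear.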
FluMapper: Architecture
FluMapper: Integration Strategies
• Data handling
o Four major components
• Implemented as distributed web services
o Cloud-based resource management
• Accommodate multi-scale spatial and temporal data growth
• Construct the space-time cube
• Exploratory spatial data analysis
o GPU-based spatial analysis algorithm (kernel density estimation)
o Integrated as an application service
• Service integration
o Each service: service plugin + trigger
o Workflow controlled by a central service controller
• User interface
o Web and Mobile user interface
• Different view and controller implementation
• Same models
o Real-time multi-scale rendering of risk map and flow vectors
• Risk map: customized WMS
• Flows: OpenLayers-based rendering within browser
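For readers unfamiliar with kernel density estimation, here is a plain CPU sketch with a Gaussian kernel evaluated on a regular grid. It is not the GPU implementation referred to above; the kernel choice, bandwidth, and sample points are assumptions for illustration.

```python
import math


def kernel_density(points, grid_x, grid_y, bandwidth):
    """Evaluate a 2-D Gaussian kernel density estimate at every grid node.

    Returns a list of rows; row j corresponds to grid_y[j].
    """
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)
    surface = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        surface.append(row)
    return surface


# Made-up sample points clustered near the origin.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
grid = [0.0, 0.5, 1.0, 3.0]
density = kernel_density(points, grid, grid, bandwidth=0.5)
print(density[0][0] > density[3][3])
```

Each grid node is computed independently of the others, which is the property that makes this kind of estimate a natural fit for GPU parallelization.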
FluMapper User Interface
[Figure panels: Mobile interface – flow mapping; Multiple sources; Web interface – flow mapping; Single source]
CGPySAL Spatial Regression
• Popular application in spatial econometrics domain
• Interactive computing
o Results should return in 15 seconds
• Computing model: Cloud
o On-demand deployment of spreg service VM instances
• Benefits
o Collaborative online analysis
• Results sharing
• Analysis template sharing
o Online data usage
o User interface as powerful as desktop GUI
Anselin, L., Amaral, P. V., and Arribas-Bel, D. 2012. Technical Aspects of Implementing GMM Estimation of the Spatial Error Model in PySAL and GeoDaSpace
Viewshed Analysis
• Objective
o Enable large-scale DEM-based visibility analysis on cyberinfrastructure
o Provide easy-to-use web interface
• Integration
o DEM data sources
• USGS: National Elevation Dataset
• OpenTopography: LiDAR-derived DEM
• User uploads
o Computing
• GPU-based high-performance visibility analysis on XSEDE
• Application workflow integrates DEM data selection, fetching, processing, computing, and result visualization
Zhao, Y., Padmanabhan, A., and Wang, S. 2013. “A Parallel Computing Approach to Viewshed Analysis of Large Terrain Data Using Graphics Processing Units.” International Journal of Geographical Information Science, 27 (2): 363-384.
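The core idea of DEM-based visibility analysis is a line-of-sight test between an observer cell and each target cell. The sketch below is a naive CPU version under simplifying assumptions (nearest-cell sampling along the sight line, a flat earth, and an arbitrary 1.7 m observer height); it is not the parallel GPU algorithm from the cited paper.

```python
def is_visible(dem, observer, target, observer_height=1.7):
    """Return True if the target cell can be seen from the observer cell.

    dem is a list of rows of elevations; observer and target are (row, col).
    """
    (r0, c0), (r1, c1) = observer, target
    steps = max(abs(r1 - r0), abs(c1 - c0))
    if steps == 0:
        return True
    eye = dem[r0][c0] + observer_height
    target_z = dem[r1][c1]
    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_z = eye + t * (target_z - eye)  # sight-line height at this step
        if dem[r][c] > sight_z:
            return False  # terrain rises above the sight line
    return True


def viewshed(dem, observer, observer_height=1.7):
    """Compute a boolean visibility grid for every cell of the DEM."""
    return [[is_visible(dem, observer, (r, c), observer_height)
             for c in range(len(dem[0]))]
            for r in range(len(dem))]


# Flat terrain at 10 m with a single 30 m ridge cell in the middle.
dem = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 30, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
vs = viewshed(dem, observer=(2, 0))
print(vs[2][2], vs[2][4])
```

The ridge cell itself is visible, while the cell directly behind it on the same row is hidden. Since every target cell is tested independently, the work grows with the number of cells times the sight-line length, which motivates parallel execution for large terrain data.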
BioScope – Biomass/Bioenergy Supply Chain Optimization
• Objectives
o Provide a GIS-based decision support platform for biomass/bioenergy supply chain study
o Facilitate complex optimization model solving by leveraging high-performance computing resources
• Integration
o User interface developed by reusing Gateway user components for layout, job management, and mapping
o Open service API used to manage local clusters for model computation
• Status
o Prototype established on UIUC campus
Demo
• CyberGIS Gateway
o FluMapper
o Viewshed analysis
o Spatial regression
o BioScope
Education and Outreach
Gateway User Map
• 805 users (as of 09/23/2013)
• ~200 organizations
Education and Training
• CyberGIS teaching
o 2012-2013
• 6 courses on 3 campuses
• One online course
• >150 students
• Science gateway development training
o SimpleGrid Toolkit – the learning toolkit developed from GISolve
o Hands-on tutorial on gateway application development
• XSEDE/TeraGrid conferences: 2007-2012
• Other conference venues
o e.g., SciDAC, GIS Day@ISU, CyberGIS conferences
Collaboration within Science Gateway Community
• Open Gateway Computing Environment (OGCE)
o Interoperability between OGCE user environment and SimpleGrid service infrastructure
o Testing Apache Rave and Airavata
• CyberSecurity Collaboration
o NSF Distributed Web Security for Science Gateways Project
• PI: Dr. Jim Basney
• CyberGIS security requirements gathered from 4 project institutions, taken as a use case by this project
o Center for Trustworthy Scientific Cyberinfrastructure (CTSC)
• http://trustedci.org
• Cybersecurity risk assessment on the CyberGIS gateways
• Engagement report produced
Concluding Discussion
• Gateway for sustainable development of the cyberGIS software environment for geospatial discovery and innovation
o One of the three primary components in CyberGIS software environment
• Gateway user environment supports modular integration of cyberGIS applications
• The open service framework bridges service infrastructure and service users
• Gateway provides easy access to a wide spectrum of advanced spatial data and analytics for science communities
• Gateway continues to enrich the cyberGIS software environment for effectively enabling data-rich geospatial research and education
Acknowledgements
• NSF Software Infrastructure for Sustained Innovation (SI2) Program
• This material is based in part upon work supported by NSF under Grant Numbers 0846655 and 1047916. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Thanks!