OGCE Briefing to NSF OCI
An overview of the NMI-funded OGCE project
Why Are We Here?
- Discuss OGCE successes and influence on the Web portal and science gateway community.
  - OGCE is funded to build standards-compliant portal components and services to support user interactions with CI middleware.
- Discuss portal and gateway research opportunities.
  - What are the future directions of portal and gateway technologies?
  - What is the role of portals within cyberinfrastructure?
- Discuss portal software future.
  - Proposed Software Service Provider: TeraGrid Science Gateway Software Center
Summary of Successes
- Supporting Science
  - RENCI, TeraGrid User Portal, LEAD, DES/LSST, CIMA, and other portals represent a significant amount of NSF and other funding.
  - Collectively supporting hundreds of codes (from RENCI alone), potentially thousands of users (from TUP alone), and access to terabytes of data (just from CIMA and DES).
- Software
  - Over 1800 IP-unique downloads of the OGCE portal software.
  - COG Kit downloads: over 4000 in the last year.
  - Developed a comprehensive set of portlets and science gateway services.
- Outreach, Leadership, Convergence
  - Annual GCE Portal Workshop with a special issue in the journal Concurrency.
  - Over 80 presentations, tutorials, and classes.
  - 9 book chapters, 19 journal articles, 38 peer-reviewed conference papers, 1 book in preparation.
OGCE Success Stories
Exemplary portal projects supported by OGCE
LEAD Gateway Portal
- NSF Large ITR and TeraGrid Gateway
- Adaptive response to mesoscale weather events
- Supports data exploration, Grid workflow
LEAD Gateway Architecture
- Portal server composed of portlets and supported by scalable, persistent web services.
  - Typical of gateways
- NCAR tests with 2 groups of 25 concurrent users, each launching forecast workflows and visualizing results.
  - Goal is to support hundreds of users.
- 10+ applications in various workflow combinations
- Services and portlets flow between LEAD and OGCE.
  - GFAC, PURSe, Proxy Management, etc.
[Architecture diagram. Labels:
 User’s Desktop: User’s Browser, Workflow Composer.
 Gateway Services: Grid Portal Server, Security Services, Workflow/Application Execution Engine, Application Resource Catalogs, User Data & Metadata Catalogs.
 Globus-TeraGrid “OGSA-Like” Services: Data Services, Information Services, Job Management, Resource Broker and Scheduling Services, Security Services.]
TeraGrid User Portal
User Portal Sharable Portlets
- Account Management
  - view projects and allocation usage
  - view system account usernames
  - view DNs registered for account
  - add users to projects
  - supports >3500 users
- Resource
  - view comprehensive list of TG resources and their attributes
  - view job queues, load, status of resources
- Documentation
  - current User Info documentation
  - contextual help for all interfaces
- Consulting
  - TG help desk information
  - portal feedback channel
- Allocation
  - info about how to apply for/renew allocations
North Carolina Bioportal
- Principal collaborators: John McGee and Lavanya Ramakrishnan
- Features
  - access to common bioinformatics tools
  - extensible toolkit and infrastructure
    - OGCE and National Middleware Initiative (NMI)
    - leverages emerging international standards
  - remotely accessible or locally deployable
  - packaged and distributed with documentation
- National reach and community
  - TeraGrid deployment
  - Portals hosted at RENCI and NCSA
- Education and training
  - hands-on workshops across North Carolina
  - clusters, Grids, portals and bioinformatics
PittGrid: Portal
- PittGrid Portal is built using the OGCE Portal Toolkit.
- Supports PittGrid’s Globus 4 and Condor services.
- PittGrid users can log in to the portal to submit and monitor their jobs.
- The job submission portlet and the Condor job submission portlet allow users to submit jobs online to Globus and Condor, respectively.
- GPIR is used to provide information services.
- OGCE has worked closely with Senthil Natarajan (Pitt) and Matt Farrellee (UW) on enhancements to Condor portlets and BirdBath.
UNC-Charlotte Visual Grid Portal
- Project Lead: Prof. Barry Wilkinson
- Portal Developer: Jeremy Villalobos
Summary of OGCE Collaborations
- Mixture of
  - Portal builders for hire (TUP, TIGRE)
  - Direct collaboration, consulting with existing projects (OGCE person on-site)
  - Portal developer off the street
- OGCE software used includes portlet components, libraries, and services.
- Wide variety of projects, personnel, funding levels, and expectations
  - Some have many full-time developers for all project aspects: from portal cosmetics to grid system admin
  - Some have one tech person: a grad student or system admin
- Interesting requirements:
  - Expected: TeraGrid access, support for SRB, Condor, and traditional schedulers. 1-100 codes, 1-1000 users, 1-1,000,000,000,000 bytes of data
  - Unexpected: AJAX, virtual workspaces, integration of multiple semi-independent portals (portal federation)
OGCE Portal Software and Services
OGCE Software Development Overview
- Portlets are our central technology.
  - JSR 168 standard.
  - Standard-compliant portlets allow reuse of portal code between projects.
  - This should definitely be used in the TeraGrid Science Gateway community.
- OGCE devotes significant effort to services, tools, and libraries.
[Software stack diagram. Labels: Portal Container, Portlets, Grid Libraries, Services, Grid Infrastructure.]
Why Portlets? Use of Standards
- These are standard components for building (Java) portals out of reusable parts.
- We work within a larger community.
  - Commercial efforts
  - Open Source: GridSphere, uPortal/JA-SIG, Pluto, StringBeans, Jetspeed2, eXo
  - Supporting Apache portals efforts (Jetspeed2, Portlet Bridges efforts)
- We participate in standards development.
  - JSR 286 expert committee, GGF/OGF
- But Grids have their own problems and requirements that we must solve.
Grid Portal Problem | OGCE Solution
Users don’t like to manage grid credentials. | Proxy manager portlet, PURSe portlets, SSO portal module.
Must interact with Globus Toolkit services. | GridFTP, MyProxy, and GRAM portlets support both GT2 and GT4.
Must support multiple versions of the toolkit. | Java COG portlet API allows dynamic binding to different versions of Globus.
We must support other Grid middleware pieces. | Developed SRB and Condor portlets.
Must support user collaboration. | OGCE-Sakai portlets allow access to Sakai collaboration services.
Grid portlets must be easier to develop. | We developed support for Velocity and Grid programming tag libraries.
Users need to monitor resources. | Developed GPIR Portlets to support GPIR service instances.
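
The "dynamic binding to different versions of Globus" idea can be illustrated with a small provider registry. This is only a sketch of the general pattern, written in Python for brevity; it is not the Java COG portlet API. The names `JobSubmitter`, `Gt2Submitter`, `Gt4Submitter`, and `get_submitter` are hypothetical, and the submit methods return placeholder strings instead of contacting any Grid service.

```python
from abc import ABC, abstractmethod


class JobSubmitter(ABC):
    """Hypothetical version-neutral interface that portlet code would call."""

    @abstractmethod
    def submit(self, executable: str) -> str:
        """Return a description of the submitted job."""


class Gt2Submitter(JobSubmitter):
    """Stand-in for a GT2-specific provider (no real Globus calls)."""

    def submit(self, executable: str) -> str:
        return f"GT2 job request for {executable}"


class Gt4Submitter(JobSubmitter):
    """Stand-in for a GT4-specific provider (no real Globus calls)."""

    def submit(self, executable: str) -> str:
        return f"GT4 job request for {executable}"


# Registry that maps a toolkit version label to its provider class.
_PROVIDERS = {"GT2": Gt2Submitter, "GT4": Gt4Submitter}


def get_submitter(version: str) -> JobSubmitter:
    """Pick the provider at run time from a configured version label."""
    try:
        return _PROVIDERS[version]()
    except KeyError:
        raise ValueError(f"no provider registered for {version!r}") from None


if __name__ == "__main__":
    # The calling code is identical for both toolkit versions.
    for version in ("GT2", "GT4"):
        print(get_submitter(version).submit("forecast.sh"))
```

The portlet depends only on the neutral interface, so supporting another toolkit version means registering one more provider rather than changing portlet code.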
Problem | OGCE Service or Library
Need programming libraries to support diverse file-like systems including mass storage systems. | NCSA Trebuchet libraries can be used to build both portlets and services.
Need to support semantic portal metadata. | Tupelo metadata service developed.
Need to provide persistent storage for Grid resource information; information must be accessible programmatically. | Developed GPIR Web service.
Science applications must be easier to deploy as a Grid service with a portlet interface. | Developed GFAC application factory service.
Need to support coupled job execution. | Java COG workflow service developed.
Community Leadership, Outreach, and Participation
GCE Workshops at Supercomputing
- Open calls to the portal community
- GCE05 held in Seattle, November 18th 2005
  - http://pipeline0.acel.sdsu.edu/mtgs/gce05/
  - 5 invited talks (judged by tech committee)
  - 11 accepted posters
  - 50+ participants
  - Expanded, re-reviewed papers to appear in Concurrency and Computation
- GCE06 scheduled Nov 12-13, 2006 in Tampa
  - Selected as part of SC06 workshop peer-review process
  - Call for papers just went out: http://www.cogkit.org/GCE06/
[Photos: Plenary Session, Poster Session]
Science Portals in 2010 and Beyond
What we want to make happen and how
Virtualizing Grid Access
- As TeraGrid expands it will become a “utility” that extends our desktop with huge resources.
- Portals and Gateways will provide access to:
  - Virtualized Storage:
    - An “infinite capacity” data and replica management service.
    - Users will not manage data but have access through personal metadata catalogs.
  - Virtualized Computation:
    - Portal is a front-end to services that automatically allocate and schedule computational cycles as needed.
    - User focuses on science … not resource management.
Knowledge Discovery and Delivery
- Agents for Search
  - Portals support data discovery.
  - We will be able to pose queries for future discovery.
    - “I am interested in all new data relating to chemical structures of the following form … When you find them, run the following analysis workflow against them and notify me if the result is interesting.”
- “Mash Ups” show how to integrate data from multiple sources.
  - Combine “big” data (Google) and “little” data (my GPS data)
[Diagram. Labels: Search Agent, ChemInfo Crawler, candidate event, analysis workflow, discovery event.]
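
The "query for future discovery" described on this slide can be sketched as a standing query that an agent evaluates whenever new data arrives. The Python below is a minimal illustration under stated assumptions, not an OGCE component: `StandingQuery`, `SearchAgent`, the record fields, and the toy scoring workflow are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class StandingQuery:
    """A query posed now and evaluated against data that arrives later."""
    matches: Callable[[Record], bool]         # "data of the following form"
    workflow: Callable[[Record], float]       # analysis to run on a match
    is_interesting: Callable[[float], bool]   # when to notify the user


@dataclass
class SearchAgent:
    queries: List[StandingQuery] = field(default_factory=list)
    notifications: List[str] = field(default_factory=list)

    def register(self, query: StandingQuery) -> None:
        self.queries.append(query)

    def on_new_data(self, record: Record) -> None:
        """Called (for example by a crawler) for each newly found record."""
        for query in self.queries:
            if not query.matches(record):
                continue  # not a candidate for this query
            result = query.workflow(record)
            if query.is_interesting(result):
                self.notifications.append(f"{record['id']}: result {result}")


if __name__ == "__main__":
    agent = SearchAgent()
    agent.register(StandingQuery(
        matches=lambda r: r.get("kind") == "chemical_structure",
        workflow=lambda r: float(r["score"]) * 2.0,  # toy analysis
        is_interesting=lambda value: value > 1.0,
    ))
    agent.on_new_data({"id": "s1", "kind": "chemical_structure", "score": 0.9})
    agent.on_new_data({"id": "s2", "kind": "image", "score": 5.0})
    print(agent.notifications)
```

A match on a new record corresponds to a candidate event, and a notification corresponds to a discovery event.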
Validating Scientific Discovery
- The portal is an integral part of the process of computational science.
  - Serves as an active repository of data provenance
- The portal records each computational experiment that a user initiates.
  - Disks are cheap, so why not record everything?
  - Provides a complete audit trail of the experiment or computation
- Published results will include a link to provenance information for repeatability and transparency.
- Many portals have done this on a smaller scale.
  - CIMA, PubChem + NIH cancer screening centers, LEAD, SERVO/Quakesim, ...
- But this should be standard practice.
  - Should be persistently stored in journal catalogs
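
One way to picture "record everything" is an append-only log with one entry per experiment a user launches. The Python below is a hedged sketch, not a description of how CIMA, LEAD, or any OGCE service stores provenance: `ExperimentRecord`, `ProvenanceLog`, and the field names are hypothetical, and the log lives in memory rather than in persistent storage.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class ExperimentRecord:
    """One computational experiment initiated through the portal."""
    record_id: int
    user: str
    application: str
    parameters: Dict[str, str]
    outputs: List[str]
    started_at: str  # ISO 8601 timestamp in UTC


class ProvenanceLog:
    """Append-only, in-memory log; entries are never edited or removed."""

    def __init__(self) -> None:
        self._records: List[ExperimentRecord] = []

    def record(self, user: str, application: str,
               parameters: Dict[str, str],
               outputs: List[str]) -> ExperimentRecord:
        entry = ExperimentRecord(
            record_id=len(self._records) + 1,
            user=user,
            application=application,
            parameters=dict(parameters),  # copy so later edits do not leak in
            outputs=list(outputs),
            started_at=datetime.now(timezone.utc).isoformat(),
        )
        self._records.append(entry)
        return entry

    def audit_trail(self, user: str) -> List[ExperimentRecord]:
        """All experiments a user has run, in submission order."""
        return [r for r in self._records if r.user == user]

    def export(self, record_id: int) -> str:
        """JSON form of one record, e.g. to link from a published result."""
        return json.dumps(asdict(self._records[record_id - 1]), sort_keys=True)


if __name__ == "__main__":
    log = ProvenanceLog()
    run = log.record("alice", "forecast", {"grid": "4km"}, ["out.nc"])
    print(log.export(run.record_id))
```

The exported JSON is the kind of artifact a published result could link to for repeatability.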
Grid Portal Software Development
- Science portal and gateway research offers exciting opportunities.
- We must balance these research opportunities with nuts-and-bolts software development.
- We can make an accurate short-term forecast for the next generation of portals.
Opportunity | Approach | Task
Current portlet standard needs enhancements to support JavaScript, inter-portlet communication, etc. | JSR 286 should address these shortcomings. | Upgrade current portal containers to support the new standard.
Portlets need better interactivity; need to support science mash-ups. | Encapsulate AJAX techniques in libraries. | Build high quality AJAX tag libraries for portlets; support JSR 286.
Portals need to bind to and share externally running portlets. | WSRP 2.0 standard serves this purpose. | Build a high quality WSRP 2.0 implementation.
PHP, Python, Ruby, and other popular languages are used to develop portals. | Both WSRP and Apache Portal Bridges projects allow language independence. | Build Ruby Grid programming libraries and portlet bridges.
Need portlet metadata standards for provenance. | Build from current community standards. | Build and release.
Need to move components seamlessly between desktops and portals. | Examine approaches such as WSRP desktops, JSF support for XUL, etc. Portlet and container APIs will be generalized. | Develop this within the GGF/OGF community.
Portal Software Center
Directly supporting science gateways as a software service provider
Supporting TeraGrid Gateways: a No-Cost Extension Activity
- The NSF has a significant investment in the success of the TeraGrid Science Gateways.
- Current Gateway efforts focus on integration of gateways with the TeraGrid.
  - This is currently in a heroic phase and uses on-site staff people.
  - This assumes there is an on-site staff person at the Gateway.
  - True for large, well-funded projects but maybe not true for others.
  - This has to be a potential success story: small colleges, MSIs, etc., need TeraGrid resources.
- We think the Gateways effort should be expanded to include software support as well as integration.
  - Directly support the common software base of many of the Gateways.
  - Respond to gateway requirements, bugs, feature requests with priority.
  - Provide depth of support for smaller gateways.
    - Portal hosting, custom development, training.
A TeraGrid Science Gateway Software Center
- The center would focus on a common (but not required) software stack for gateways.
  - Represents common practice and the “eigen” portal, at least for Java.
  - Possible future eigen-portals for Python, PHP, Ruby, etc., and linear combinations thereof.
- Center’s board of directors would consist of current TG Gateway leadership and representatives from active gateway projects.
  - Those in charge now would still be in charge.
  - We would give them more power.
How Is This Different from Now?
- Current efforts focus on integration. Two concerns:
  - How do we support smaller groups?
    - The Gateway bar is getting higher, not lower, as we think through the requirements.
  - How do we help bridge between campus Grids and the TeraGrid?
- A Gateway SSP will support integration through both common and Gateway-specific software.
  - General portal/portlet software
  - AND services (such as logging, auditing, accounting, shutdown) that all gateways need based on gateway requirements.
  - AND hosting services to help smaller groups
  - AND training on specific base software for new developers.
- We are NOT the portal police.
  - Existing gateways can maintain their own autonomous software bases.
  - Not everything goes in the gateway stack.
Looking Forward
- We obviously are positioning the OGCE project to be a Software Service Provider for the science gateways.
- As we envision it, the Gateways SSP would
  - Develop portlets and services to support gateways generally.
    - “Tactical to strategic” approach as we ramp up.
  - Collaborate with large gateways (RENCI) on specific problems.
  - Package and integrate tools into a simple Gateway download.
  - Support these tools through help desks.
- How does this compare to the NSF vision for SSPs?
OGCE Project Participants
PI/Co-I | Institution | Major Contributions
Marlon Pierce, Dennis Gannon, Beth Plale, Geoffrey Fox | Indiana University | Packaging, Grid portlet development, GFAC, PURSe, GGF Leadership
Mary Thomas, Jay Boisseau (Eric Roberts) | San Diego State University/Texas Advanced Computing Center | Packaging, Grid portlet development, GPIR, CFT, OGCE Web Site Development; GCE05 organization; SRB Portlets
Jay Alameda, Joe Futrelle | National Center for Supercomputing Applications | Grid tool development (Trebuchet, OGRE), Tupelo metadata development, portlet development
Charles Severance, Joseph Hardin | University of Michigan | Sakai collaboration services and portlets, JSR 286 participation
Gregor von Laszewski | University of Chicago/Argonne National Lab | Grid portlet API and library development; GCE06 organization; GlobDev liaison
Additional Slides
GPIR Deployment and TIGRE Portal
VLAB Computational Chemistry Portal
DES and LSST Portals
Monitor workflows
Set up and launch pipelines