Overall Architectural Design of the Earth System Grid
Post on 21-Jan-2016
1
Overall Architectural Design of the Earth System Grid
2
Architecture of the Production Earth System Grid
Centralized portal provides all user interactions, most system services
Data may be co-located with gateway or at remote sites
Data nodes respond to gateway requests for specific files
Users access gateway via web browser or Data Mover Lite (DML)
Users do not talk to data nodes directly
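The request path on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ESG code: the class names `Gateway` and `DataNode`, their methods, the user name, and the sample file path are all hypothetical.

```python
class DataNode:
    """Hypothetical data node: holds files and answers gateway requests only."""

    def __init__(self, name, files):
        self.name = name
        self._files = dict(files)  # file path -> file contents

    def serve(self, path):
        # In this sketch only the gateway ever calls this method.
        return self._files[path]


class Gateway:
    """Hypothetical central gateway: the single point of contact for users."""

    def __init__(self):
        self._catalog = {}  # file path -> data node holding that file

    def register_node(self, node, paths):
        # Record which data node (co-located or remote) holds each file.
        for path in paths:
            self._catalog[path] = node

    def download(self, user, path):
        # The user asks the gateway; the gateway asks the data node.
        node = self._catalog.get(path)
        if node is None:
            raise KeyError(f"{path} is not in the gateway catalog")
        return node.serve(path)


node = DataNode("remote-site", {"model/run1/tas.nc": b"netcdf-bytes"})
gateway = Gateway()
gateway.register_node(node, ["model/run1/tas.nc"])
data = gateway.download("alice", "model/run1/tas.nc")
```

The user-facing code never holds a reference to a data node, which mirrors the rule that users do not talk to data nodes directly.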
3
Technologies Underlying the Production ESG
Climate Data
• Metadata Catalog
• NcML (metadata schema)
• OPeNDAP-g (aggregation and subsetting)
Data Management
• Storage Resource Mgr
Data Transfer
• Globus Security Infrastructure
• Data Mover Lite
• GridFTP
• Monitoring and Discovery Services
• Replica Location Service
Security
• Access Control
• MyProxy
• User Registration
4
Current Production Deployments
Holdings: CCSM, POP, CISM, CLM, NARCCAP, PCM
• Gateway: NCAR
• Data nodes: LANL, NCAR, NERSC, ORNL
Holdings: CMIP3 (IPCC AR4)
• Gateway: LLNL
• Data node: LLNL
Holdings: C-LAMP
• Gateway: ORNL
• Data node: ORNL
5
Key Requirements for Next Generation ESG
CMIP5 drives most requirements for the scale and global reach of ESG. We are expecting…
• 30+ contributing sites in 17+ countries
• Data volumes 600+ TB “core”, 6+ PB total
• Collect and replicate core to ~4 sites
Surveyed initial testbed sites for details of setup, plans, expectations
Keep data (close to) where it is generated
• Server-side analysis and processing to minimize delivered data volumes
• Deliver to users from archive/processing location, not gateway
Give contributors significant autonomy to ease participation
• ESG team does not own or operate all (most) nodes
• Flexibility on hardware, personnel commitments
• Nodes can come & go without taking down ESG
Interface with local data, identity management where appropriate
Support topical & institutional gateways as needed
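As a rough worked example of the volume figures on this slide, the arithmetic below uses the stated lower bounds (600 TB core, 6 PB total, about 4 replica sites). Two assumptions are not from the slide: decimal units (1 PB = 1000 TB), and that the four sites together hold four full copies of the core.

```python
TB_PER_PB = 1000     # assumption: decimal units; the slide does not say

core_tb = 600        # "600+ TB core" (lower bound from the slide)
total_pb = 6         # "6+ PB total" (lower bound from the slide)
replica_sites = 4    # "replicate core to ~4 sites" (assumed: 4 full copies)

# Storage consumed by all copies of the core collection, in PB.
replicated_core_pb = core_tb * replica_sites / TB_PER_PB

# Fraction of the total holdings that the core collection represents.
core_share = core_tb / (total_pb * TB_PER_PB)

print(f"Core replicas across sites: {replicated_core_pb:.1f} PB")
print(f"Core as a share of total holdings: {core_share:.0%}")
```

Under these assumptions only about a tenth of the holdings is replicated, which is consistent with the requirement to keep most data close to where it is generated.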
6
The Next-Generation ESG: A Federated Global Enterprise
Independent gateways federating metadata, users
Any user can discover any data from any gateway
Each data node publishes to one or more gateways
Specific data collections are managed through specific gateways
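A minimal sketch of the federation idea on this slide, assuming each gateway can read the metadata records its peers hold. The class `FederatedGateway`, the helper `federate`, and the gateway, node, and dataset names are hypothetical and are not part of the ESG software.

```python
class FederatedGateway:
    """Hypothetical gateway that shares metadata with peer gateways."""

    def __init__(self, name):
        self.name = name
        self.local = {}   # dataset id -> data node name, as published here
        self.peers = []   # other gateways in the federation

    def publish(self, dataset_id, node_name):
        # A data node publishes a dataset record to this gateway.
        self.local[dataset_id] = node_name

    def federated_catalog(self):
        # Merge this gateway's records with the records of every peer.
        merged = dict(self.local)
        for peer in self.peers:
            for dataset_id, node_name in peer.local.items():
                merged.setdefault(dataset_id, node_name)
        return merged

    def search(self, text):
        # Discovery works over the merged catalog, not just local records.
        return sorted(d for d in self.federated_catalog() if text in d)


def federate(gateways):
    """Make every gateway a peer of every other gateway."""
    for gw in gateways:
        gw.peers = [other for other in gateways if other is not gw]


a = FederatedGateway("gateway-a")
b = FederatedGateway("gateway-b")
federate([a, b])

a.publish("cmip5.modelX.tas", "node-1")
b.publish("cmip5.modelY.pr", "node-2")
b.publish("cmip5.modelX.tas", "node-1")  # one node publishing to two gateways

hits_from_b = b.search("cmip5")
hits_from_a = a.search("modelY")
```

A search at either gateway returns datasets published at the other, which is the sense in which any user can discover any data from any gateway.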
7
Gateways and Data Nodes
Federated architecture
Federation is a virtual trust relationship among independent management domains that have their own set of services. Users authenticate once to gain access to data across multiple systems and organizations.
Gateways
• Where data is discovered, requested
• Portals, search capability, distributed metadata, registration and user management
• May be customized to an institution’s requirements, topical focus
• More complex architecture than nodes, fewer sites
• Initially PCMDI, NCAR, ORNL, eventually GFDL
Nodes
• Where data is stored and published
• Data may be on disk or tertiary mass store
• Each data node can publish to any gateway (facilitates topical gateways)
• Data reduction/analysis
• Less complex architecture, including possible minimalist deployment w/o services
• Anticipate ~20 data nodes for CMIP5, many others have expressed interest
Sites
• A site can be both a gateway and a data node
8
Next-Generation ESG Architectural Details
New architectural features
• “Global services” layer
• Gateway adds data products UI, metadata harvesting
• Data node adds subsetting and analysis capabilities
More details about next-gen software stack throughout the day…
9
OpenID for Accessing Federated Data Systems
ESG-CET invested a lot of effort in examining security/identity approaches
Relatively open data access for thousands of users around the world
More in common with social networking than high-value computational environments
OpenID provides a user-centric federated identity
• Estimates are upwards of a billion OpenIDs, 40+K sites accepting
• IBM, Microsoft, Google, Verisign, PayPal, Facebook as corporate board members (BBC, Orange, SourceForge adoption)
10
Federated Registration and Authentication
All users must register their credentials with ESG
• OpenID identities might be managed outside of ESG
Data “owners” manage authorizations to access their collections
• Groups may have special requirements
User searching for data is redirected to authenticate or apply for authorizations as needed
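The decision flow on this slide can be sketched as follows. This is an illustration only: the `Federation` and `Collection` classes, the group name, the OpenID URL, and the returned strings are hypothetical, and no real OpenID protocol exchange is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    required_group: str  # group whose members may access this collection


@dataclass
class Federation:
    registered: set = field(default_factory=set)  # OpenIDs registered with ESG
    groups: dict = field(default_factory=dict)    # group -> set of OpenIDs

    def register(self, openid):
        # The identity itself may be managed by a provider outside ESG.
        self.registered.add(openid)

    def grant(self, group, openid):
        # Called on behalf of the data owner who manages this group.
        self.groups.setdefault(group, set()).add(openid)

    def request(self, openid, collection):
        # Step 1: unknown or anonymous users are sent to authenticate.
        if openid is None or openid not in self.registered:
            return "redirect: authenticate"
        # Step 2: known users without the group are sent to apply for it.
        if openid not in self.groups.get(collection.required_group, set()):
            return "redirect: apply for authorization"
        return "allow"


fed = Federation()
cmip5 = Collection("CMIP5", "cmip5-research")
user = "https://openid.example.org/alice"

steps = [fed.request(None, cmip5)]       # not yet authenticated
fed.register(user)
steps.append(fed.request(user, cmip5))   # registered, not yet authorized
fed.grant("cmip5-research", user)
steps.append(fed.request(user, cmip5))   # registered and authorized
```

Keeping registration and group membership as separate checks reflects the split on the slide: ESG registers users, while data owners decide who may access their collections.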