ICS-TCS Integration Guidelines
Handbook for TCS Integration: Level-2

Prepared by EPOS-IP WP6 & WP7 teams

Project: EPOS-IP (Horizon2020 – Project number: 676564)
Work Package(s): WP6, WP7 and TCS WPs
Task(s): Tasks regarding ICS-TCS integration and interoperability
Document type: ICS TCS Integration Guidelines – Level-2
Document file name: EPOS-ICS-TCS-Integration-Guidelines-Level-2-20150917.doc
Document title: ICS-TCS Integration: Handbook for TCS Integration: Level-2
Purpose: For internal use between Transverse WPs and the TCS WPs
Author(s): EPOS-IP WP6 & WP7 Teams
Version: 1.0
Date: 20150917
Contact: http://www.epos-eu.org/ http://eposwiki.bo.ingv.it/ [email protected]

September 2015


Table of contents

1 Why This Document
2 EPOS e-Architecture
   Organization
   Technical architecture
      1) Metadata catalogue
      3) Workflow engines and provenance
      4) IAAA to data and computational resources (cloud, grid, HPC)
      5) ICS-D
      6) Web services/APIs
3 Principles
   Co-development (ICS and TCS)
   Do not reinvent the wheel
   Microservices approach
   Clear long-term technical goals, but iterative short-term approach
4 Recommendations to TCS
5 TCS Integration
   TCS generic architecture
   ICS-TCS communication
   Expected work-cycle
   Timeline
   ICS-TCS common work
6 ICS-D Integration
Appendix 1 - Acronyms
Appendix 2 - References
   Metadata
   e-Infrastructures


1 Why This Document

The purpose of this short document is to provide the TCS and other partners with the technical details and procedures required to achieve the integration between ICS and TCS. The ICS-TCS Integration Guidelines are intended as a handbook for TCS integration. A general introductory version (Level-1) has already been distributed within the EPOS community. This document is the second version (Level-2) in a series of follow-up documents that will be distributed at later stages.

2 EPOS e-Architecture

Organization

The EPOS architecture has been designed to organize and manage the interactions among different EPOS actors and assets. To make it possible for the EPOS enterprise to work as a single, but distributed, sustainable research infrastructure, its architecture takes into account technical, governance, legal and financial issues. Four complementary elements form the infrastructure:

1. The National Research Infrastructures (NRIs) contribute to EPOS while being owned and managed at a national level, and represent the basic EPOS data providers. These require significant economic resources, both in terms of construction and yearly operational costs, which are typically covered by national investments that must continue during EPOS implementation, construction and operation.

2. The Thematic Core Services (TCS) enable integration across specific scientific communities. They represent a governance framework where data and services are provided and where each community discusses its implementation and sustainability strategies as well as legal and ethical issues.

3. The Integrated Core Services (ICS) represent the e-infrastructure consisting of services that will allow access to multidisciplinary resources provided by the NRIs and TCS. These will include data and data products as well as synthetic data from simulations, processing, and visualization tools. The ICS will be composed of the ICS-Central Hub (ICS-C) and distributed computational resources, including processing and visualisation services (ICS-D). ICS is the place where integration occurs.

4. The Executive and Coordination Office (ECO) is the EPOS Headquarters and the legal seat (ERIC) of the distributed infrastructure, governing the construction and operation of the ICS and coordinating the implementation of the TCS.

The European Research Infrastructure Consortium (ERIC) has been chosen by the Board of Governmental Representatives as the legal model for EPOS and is used in designing the Governance Model. This includes a General Assembly of members and an Executive Director, supported by a Coordination Office. A funding model has been designed that will support the sustainable construction and operation of the whole EPOS enterprise. The model includes complementary funding sources for each of the key EPOS elements.

Figure 1 describes the EPOS technical architecture organised in three layers.

Figure 1: EPOS technical architecture. The diagram shows the three layers in which the EPOS components (institutions and services) have been organized: National Layer, Community Layer, Integration Layer including also an Interoperability Layer.

The main concept is that the EPOS TCS data and services are provided to the ICS (see Fig. 1) by means of a communication layer called the interoperability layer, as shown in the functional architecture (Fig. 2). This layer contains all the technology to integrate data, data products, services and software (DDSS) from many scientific, thematic communities into the single integrated environment of the Integrated Core Services (ICS). The ICS represents the “core” of the whole e-infrastructure, and those responsible for its implementation will provide the specification of the “interoperability layer”. The ICS is conceptually a single, centralized facility but in practice is likely to be replicated (for resilience and performance) and localized for particular natural language groupings or legal jurisdictions.

Technical architecture

The ICS is made up of several modular, interoperable building blocks (Fig. 2). The three-layer structure adopted in the technical architecture of EPOS starts with the National Layer, where the National Research Infrastructures provide the DDSS. Data providers in this layer are independent national institutions or organizations which have their own technical solutions that may (or may not) follow international standards in providing data and data products to the community. The second layer, Thematic Core Services (TCS), is the (European) Community Layer, where community standards are applied to DDSS that are relevant to the specific thematic area of concern. The third (top) layer represents the integration of the DDSS that come from the TCSs, where high-level international standards are applied. At this level, metadata describing all DDSS need to be harmonized into a single metadata catalogue which is based on international standards.

During the Preparatory Phase of the EPOS Project, a European metadata catalogue standard, CERIF (Common European Research Information Format), was tested and used for the prototype development. It has subsequently been adopted in the EPOS Implementation Phase. In order that the DDSS from the various TCSs can be converted into the chosen metadata catalogue standard (i.e. CERIF), there is a need for an additional layer where TCS data sets will be mapped and converted to CERIF. This is referred to as the “Interoperability layer”.

The various components of the ICS are now explained in more detail.

Figure 2: EPOS functional architecture, describing the technical functional components of EPOS. It specifies for each layer the ICT modules and their function. At the ICS layer it describes the design of the integrating e-Infrastructure.

1) Metadata catalogue

Metadata describing the TCS DDSS are stored using the CERIF data model, which differs from most metadata standards in that it (1) separates base entities from linking entities, thus providing a fully connected graph structure; and (2) using the same syntax, stores the semantics associated with values of attributes both for base entities and (for the role of the relationship) for linking entities, which also store the temporal duration of the validity of the linkage. This provides great power and flexibility. CERIF also (as a superset) interoperates with widely adopted metadata formats such as DC (Dublin Core), DCAT (Data Catalogue Vocabulary), CKAN (Comprehensive Knowledge Archive Network), INSPIRE (the EC version of ISO 19115 for geospatial data) and others.

The metadata catalogue will also manage the semantics, in order to provide the meaning of the attribute values. CERIF stores the semantics in a ‘semantic layer’ referenced from the syntactic layer, thus providing a single integrated semantic environment which is efficient because it uses standard IT (usually relational, but all other database/processing environments may be used). CERIF interoperates with OWL (Web Ontology Language), SKOS (Simple Knowledge Organization System) and other semantic representation languages.

Metadata from the communities will be mapped to the metadata catalogue in order to create appropriate links between common concepts in different disciplines. This process involves the harmonization and interoperability of the various DDSS from the different TCSs through dedicated software modules. It requires TCS APIs for converting DDSS to the TCS-specific metadata standard. It also requires ICS APIs (wrappers) to map and store this in the ICS metadata catalogue (i.e. CERIF). These TCS APIs and the corresponding ICS APIs collectively form the “interoperability layer”, which is the link between the TCSs and the ICS.
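The separation of base entities from time-bounded linking entities can be illustrated with a small sketch. This is not the CERIF schema itself: the class names, field names and sample records below are assumptions chosen for illustration, and a real catalogue would normally live in a database rather than in Python objects.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class BaseEntity:
    """A base entity, e.g. a dataset, a person or a service (hypothetical fields)."""
    entity_id: str
    kind: str
    name: str


@dataclass(frozen=True)
class LinkEntity:
    """A linking entity: a typed, time-bounded relation between two base entities."""
    source_id: str
    target_id: str
    role: str                    # semantics of the relationship
    start: date                  # start of validity
    end: Optional[date] = None   # None means the link is still valid

    def valid_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


def links_valid_on(links: List[LinkEntity], entity_id: str, day: date) -> List[LinkEntity]:
    """Return all links touching `entity_id` that are valid on `day`."""
    return [
        link for link in links
        if entity_id in (link.source_id, link.target_id) and link.valid_on(day)
    ]


# Illustrative records only.
dataset = BaseEntity("ds-1", "dataset", "Example waveform archive")
alice = BaseEntity("p-1", "person", "Alice")
bob = BaseEntity("p-2", "person", "Bob")

links = [
    LinkEntity("p-1", "ds-1", "curator", date(2014, 1, 1), date(2015, 6, 30)),
    LinkEntity("p-2", "ds-1", "curator", date(2015, 7, 1)),
]

current = links_valid_on(links, "ds-1", date(2015, 9, 17))
print([(link.source_id, link.role) for link in current])
```

Because the role and the validity period sit on the link rather than on either base entity, the same dataset can have different curators over time without the dataset record itself changing.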

2) System Managing Software

The system managing software will manage the metadata catalogue and all other modules (e.g. the workflow engine and, more generally, all the resources involved in satisfying user requests).

3) Workflow engines and provenance

A key aspect of the semi-automatic composition of software to meet a user request is the provision of a workflow to link together the software services as they access appropriate data. Many workflow engines are available, and each fits different use cases and architectures. We will take into account the computational models to be supported (i.e. from the Computational Earth Science community) and the communities’ requirements. We can anticipate that, following the experience gained by the EPOS partners in initiatives such as VERCE [1] and ongoing work in RDA (Research Data Alliance), particular attention will be dedicated to cross-platform streaming libraries.

CERIF can also provide provenance information, since the linking entities associated with the role have, as attributes, both date/time start and date/time end. This handles versioning and, via the linking entity record, the relationship of one base entity instance (e.g. a dataset) to another. On the other hand, for a comprehensive traceability of the processes and agents that contributed to the generation of the research product, we foresee the integration in CERIF of the W3C-PROV ontology. This will guarantee interoperability with other institutional data archives, fostering data preservation and curation across domains.

[1] Virtual Earthquake and seismology Research Community in Europe e-science environment, http://www.verce.eu/

4) IAAA to data and computational resources (cloud, grid, HPC)

This module will manage and interoperate with all the major ‘common’ IAAA (Identification, Authentication, Authorisation, Accounting) services and standards from AAAI (Authentication, Authorisation, Accounting Infrastructure) such as SAML, OAuth, OpenID and X.509, with related products such as EduGAIN, Shibboleth, Kerberos and others, and also with user directory services such as Microsoft Active Directory and LDAP. Addressing IAAA in a satisfactory way is a challenge at the present stage, and is also being faced in other projects and initiatives following AAAI (e.g. AARC [2], EGI-Engage [3]), with which EPOS is collaborating. The goal of this collaboration is to implement a smart IAAA mechanism which is able to hide from the user all the complexity of delegation-based AAAI mechanisms.

5) ICS-D

As already described, Integrated Core Services – Distributed (ICS-D) will include services from external computing facilities. These will include HPC (High Performance Computing) machines for modelling and simulation according to the requirements of the Computational Earth Science community, and HTC (High Throughput Computing) clusters for data-intensive applications such as data mining. The data workflow will be managed by EPOS ICS-C in order to provide the end user with appropriate computational services, even though actual computations will be provided by ICS-D. Additional ICS-D services will provide visualization and processing capabilities. ICS-C will have to develop provisions for communicating with these external services in a seamless manner.

6) Web services/APIs

EPOS-IP, wherever possible, will use web services [4] as the main vehicle for software services, defining APIs, and implementing the best practices for a sound microservices architecture, so that workflows can be composed semi-automatically. Web services and APIs, together with appropriate mapping of metadata which will drive data convertors, will also be the driving technology for the “connection” of TCS with “ICS”.

The detailed description of the whole EPOS e-infrastructure is out of the scope of this document. For further details and information follow the link to the EPOS-ICT summary at: http://www.epos-eu.org/assets/documents/WG7/EPOS-ICT-summary.pdf

3 Principles

In the following, the high-level principles for the ICS development and the ICS-TCS interactions are explained.

[2] Authentication and Authorization for Research and Collaboration, https://aarc-project.eu/
[3] EGI-Engage project (Engaging the Research Community towards an Open Science Commons), https://www.egi.eu/about/egi-engage/
[4] Web services also act as GRID services or CLOUD services

Co-development (ICS and TCS)

The development of the ICS depends on end-user requirements and the DDSS provided by the TCS. The TCS are at different stages of maturity. For some, with little infrastructure to date, the adoption of the EPOS architecture is straightforward. For others, with several years (or decades) of infrastructural investments already in place, a jointly agreed evolutionary plan to converge to interoperability with ICS will be adopted. The EPOS approach in this context is neither top-down nor bottom-up: the main idea is to use the general architecture and follow a cooperative approach in the design and development of the software to build the compatibility layer, which is the place where harmonisation and communication are achieved.

Such an approach also encourages EPOS to focus more on an architectural design for the system that is capable of being engineered and implemented collaboratively, rather than using a top-down reference model approach. This would facilitate cross-disciplinary access and utilization of the data for scientific purposes.

Do not reinvent the wheel

a. Reuse local technologies

The co-development philosophy maximises re-use of existing software services, data availability and resources.

b. Do not build a supercomputer: build an integration with an ICS-D service provider

EPOS will have some of its own computing resources in order to provide the ‘uniform view’ over EPOS entities. However, to facilitate intensive processing, EPOS will provide, subject to authorisation, access to appropriate computing facilities including HPC (High Performance Computing) machines for modelling and simulation, and HTC (High Throughput Computing) clusters for data-intensive applications such as data mining. Other ICS-D services will include visualization and processing services. These facilities are known collectively as ICS-D (Integrated Core Services – Distributed) and will be provided by external service providers.

Microservices approach

The microservices architecture approach envisages small “micro” services, each dedicated to the execution of a specific class of tasks and each highly reliable. EPOS will either take existing software services and ‘wrap’ them for EPOS use or build new services complying with EPOS architectural standards. The aim is to have microservices with defined interfaces which can be composed together to form a software stack able to address unpredictable user requests.
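As a rough illustration of composing small services with a shared, well-defined interface into a stack, the sketch below chains three toy in-process functions. Everything here is an assumption for illustration (the `Service` signature, the helper names and the sample records); real EPOS microservices would be network services behind web APIs, not local functions.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
# Assumed common interface: every service maps a list of records to a list of records.
Service = Callable[[List[Record]], List[Record]]


def keep_at_least(field: str, minimum: float) -> Service:
    """Service factory: keep records whose `field` is >= `minimum`."""
    def service(records: List[Record]) -> List[Record]:
        return [r for r in records if r[field] >= minimum]
    return service


def sort_by(field: str) -> Service:
    """Service factory: order records by `field`, ascending."""
    def service(records: List[Record]) -> List[Record]:
        return sorted(records, key=lambda r: r[field])
    return service


def take(count: int) -> Service:
    """Service factory: keep only the first `count` records."""
    def service(records: List[Record]) -> List[Record]:
        return records[:count]
    return service


def compose(*services: Service) -> Service:
    """Chain services left to right into a single service with the same interface."""
    def pipeline(records: List[Record]) -> List[Record]:
        for service in services:
            records = service(records)
        return records
    return pipeline


# Illustrative input only.
events = [
    {"magnitude": 3.2, "depth_km": 10.0},
    {"magnitude": 5.1, "depth_km": 33.0},
    {"magnitude": 4.4, "depth_km": 8.0},
    {"magnitude": 6.0, "depth_km": 15.0},
]

stack = compose(keep_at_least("magnitude", 4.0), sort_by("depth_km"), take(2))
result = stack(events)
print(result)
```

Because the composed stack has the same interface as each individual service, it can itself be reused as a building block in a larger workflow.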

Clear long-term technical goals, but iterative short-term approach

The overall architecture of EPOS was clearly defined and agreed through EPOS-PP. However, its implementation in EPOS-IP requires a step-by-step approach to build a reliable system environment that meets the requirements of end-users and their communities. For this purpose, iterative work cycles have been set up both for ICS developments and for ICS-TCS communication.

4 Recommendations to TCS

R.1 Each type of DDSS delivered must be associated with descriptive metadata that enables efficient data discovery and contextualisation (which enables a user to determine relevance and quality). The metadata should be, when possible, an approved (e.g. OSI) or de facto standard and ideally already in CERIF.

R.2 Each type of DDSS delivered (and its associated metadata) should be accessible via web services and/or APIs.

R.3 Each web service or API to access DDSS should be, when possible, based on approved (e.g. OSI) or de facto standards. Before building any new web service, a preliminary communication with the ICS team should be carried out in order to verify that it is suitable. The ICS development team will happily help and provide advice. In order to optimize efforts, TCS should also carry out a preliminary check with other communities to ensure the proposed web service has not already been implemented.

R.4 If a TCS needs to develop a metadata standard for its community, a preliminary check should be carried out with the ICS team and other communities, in order to verify that elements of its data have not already been described by another community.

R.5 The integration among TCS should be as deep as possible: ideally, every single community should adopt (intra-community) a common strategy to share tools, data formats, APIs, data management rules, and data analysis and processing frameworks, converging on a common cross-EPOS standard. As a consequence, (a) efforts to build new services or to improve old ones will be minimized and (b) it will be simpler to integrate diverse data products at ICS level.

R.6 Recommending IAAA standards is premature at this stage; recommendations will be communicated when appropriate solutions are found. It is important, however, that all TCSs reflect upon their needs in terms of IAAA.

5 TCS Integration

TCS generic architecture

The different Thematic Core Services (TCS) have varying degrees of maturity in their development. Therefore, it is not possible to deal with TCS as if they were all equal and homogeneous. Some TCS have a very specific services architecture based on years of experience in that specific domain. Other TCS do not have any history of developing services. Some TCS rely on a federated architecture, others on a single-sited data centre.

Figure 3: Generic TCS architecture. Independently of the community technical choices to federate and make the data accessible, all TCSs should have web services/APIs which enable ICS to access the data.

Some TCS have already made the effort of defining metadata standards and web services to disseminate the data. Others are still in the process of undertaking such work. As a consequence, the scenario is very heterogeneous and includes many services with different levels of “technical” maturity.

Nevertheless, it is still possible to define a very generic architecture which, independent of the TCS level of maturity, can be applicable to any TCS. The architecture is very “general purpose”, simple, and presents the following elements:

a) National Layer Research Infrastructures: these can vary in different countries within a specific scientific community. They represent the existing DDSS.

b) TCS system: this represents the e-Infrastructure for a specific scientific community. It may include the software used to federate National RIs, or the software to present results on the web (web portal).

c) Metadata Catalogue: this is usually a database where each data object (e.g. file, in case of non-streamed data) is referenced and described by the metadata. It can be used to drive the Web Services.

d) Web Services/API: this is the entry point where users can access the data.

EPOS ICS has no particular recommendation with respect to the software used to build the TCS system. What matters is the way ICS and other stakeholders can access data. API-based Web Services ensure data and metadata are accessible and discoverable by humans and machines (e.g. the ICS system).

With respect to the metadata and related web services, there are two possible strategies to follow in order to achieve ICS-TCS interoperability:

1. Metadata dump: the metadata from TCS is fully copied to the ICS metadata catalogue. This guarantees that the metadata is fully managed by the ICS, and it relieves the TCS of the burden of providing a highly efficient and robust system for access to the metadata. However, it requires periodic (e.g. daily) polling/copying procedures, and synchronization mechanisms must be put in place to guarantee consistency.

2. Metadata runtime access: the metadata is accessed at runtime by querying web services with the defined APIs. The API specification must be stored in the ICS metadata catalogue (to enable ICS to access the system in an autonomic way). This avoids the error-prone procedure of dumping the metadata. However, it requires that TCS build very reliable and robust systems, able to manage the high number of concurrent queries generated by users of the ICS.

In either case the conversion to CERIF needs to be done. Since this could take some computing resources, strategy (1) is likely to be more efficient unless the data structures (not the data themselves) of a particular TCS are very fast changing. It is however likely that both strategies described above will be needed, due to the high degree of variability in the TCS DDSS.
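The trade-off between the two strategies above can be sketched with an in-memory stand-in for a TCS metadata service. All class and method names here (`FakeTcsService`, `DumpCatalogue`, `RuntimeCatalogue`, `list_all`, `query`) are hypothetical; a real deployment would call TCS web services over the network and convert the records to CERIF.

```python
from typing import Dict, List

Record = Dict[str, str]


class FakeTcsService:
    """Stand-in for a TCS metadata web service (hypothetical, in-memory)."""

    def __init__(self, records: List[Record]) -> None:
        self._records = {r["id"]: dict(r) for r in records}

    def list_all(self) -> List[Record]:
        return [dict(r) for r in self._records.values()]

    def query(self, keyword: str) -> List[Record]:
        return [dict(r) for r in self._records.values()
                if keyword in r["title"].lower()]

    def add(self, record: Record) -> None:
        self._records[record["id"]] = dict(record)


class DumpCatalogue:
    """Strategy 1: ICS keeps a full copy, refreshed by a periodic sync."""

    def __init__(self, tcs: FakeTcsService) -> None:
        self._tcs = tcs
        self._copy: Dict[str, Record] = {}

    def sync(self) -> None:
        self._copy = {r["id"]: r for r in self._tcs.list_all()}

    def search(self, keyword: str) -> List[Record]:
        return [r for r in self._copy.values() if keyword in r["title"].lower()]


class RuntimeCatalogue:
    """Strategy 2: ICS stores only how to reach the TCS and queries it on demand."""

    def __init__(self, tcs: FakeTcsService) -> None:
        self._tcs = tcs

    def search(self, keyword: str) -> List[Record]:
        return self._tcs.query(keyword)


tcs = FakeTcsService([{"id": "a", "title": "GPS time series"}])
dump, runtime = DumpCatalogue(tcs), RuntimeCatalogue(tcs)

dump.sync()
tcs.add({"id": "b", "title": "GPS velocity field"})   # TCS changes after the dump

stale_hits = len(dump.search("gps"))      # copy is out of date until the next sync
live_hits = len(runtime.search("gps"))    # always reflects the TCS, at query cost
dump.sync()
fresh_hits = len(dump.search("gps"))
print(stale_hits, live_hits, fresh_hits)
```

The dumped copy answers searches without touching the TCS but lags behind until the next synchronization, while runtime access is always current and places every user query on the TCS service.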

ICS-TCS communication

The main requirements of each TCS were collected during EPOS-PP. There will be an additional requirements elicitation process during the initial phase of the EPOS-IP project in order to ensure that the requirements from the TCS communities are properly taken into account. In EPOS-IP, interactions between ICS and TCS will be handled through periodic teleconferences (every 3-6 months) with each TCS. In such meetings the current status of the ICS and its implementation will be summarized, and feedback from the TCS community will be provided. EPOS meetings and assemblies will also be used to discuss ICT issues, plan the work and improve synergies.

Expected work-cycle

Communication and collaboration between the ICS and TCS teams will be key to building a common understanding of the work we are required to undertake. To initiate this process, the ICS development team will collect requirements and use cases from TCSs in order to start to outline and develop the system for the integration layer.

Based on the information collected during the requirements elicitation process, the ICS development team will prepare a suggestion for the harmonisation process between the TCSs. Both the requirements elicitation process and the harmonisation of the requirements between the various TCSs will take place in an iterative manner, allowing both the ICS development team and the TCS communities to contribute in the preparation phase and also later in the implementation and validation phases.

Timeline

The detailed timeline for the work-cycle described above will be developed and presented during the kick-off meeting and will be included in the next version (Level-3) of the ICS-TCS Integration Guidelines document.

ICS-TCS common work

The joint work is carried out in the “interoperability layer”, a software/technical layer that enables communication between ICS and TCS. Once the TCS-ICS communication is established, data, metadata and services coming from the various communities must be managed.

This is accomplished in the ICS Central Hub (ICS-C) by means of a metadata catalogue, namely the facility which, together with system manager software, manages and orchestrates all resources required to satisfy a user request. By using metadata, the ICS-C can discover data requested by a user, gain access to them, send them to a processing facility (or move the computation to the data), and perform other complex tasks. The catalogue contains: (i) technical specifications to enable autonomic ICS access to TCS discovery and access services; (ii) a metadata description of the digital object (DO) with a direct link to the DO; (iii) information about users, resources, software, and services other than data services (e.g. rock mechanics, geochemical analysis, visualization, processing).

In the roadmap to enable data integration and ICS-TCS communication, three steps are required (Figure 4):

(i) Metadata definition, (ii) web services/APIs definition, (iii) match and map with ICS

Figure 4: The three steps required in TCS-ICS common work and interaction through the interoperability layer.

Their description follows:

STEP 1: Metadata

As anticipated in the recommendations, this step includes the definition and usage of metadata describing the community data (e.g. seismic waveforms, GPS time-series, geological maps), software (e.g. an analysis or visualisation application), services (e.g. use of specialist equipment), resources (e.g. computers, instrumentation, detectors) and users (with their roles, responsibilities and authorities used for AAI).

STEP 2: Web services for access

The heterogeneity of the EPOS community implies that TCS have different maturity levels. Some communities may only give access to metadata attached to raw Digital Objects (e.g. an ftp repository containing files with metadata in the header). However, the best option is to provide a software layer which implements services for (a) “Data discovery”; (b) “Data Access and Retrieval” (e.g. APIs, RESTful); (c) “data analysis/visualisation/modelling/mining”, possibly based on existing standards.

a. Data Discovery services are used to discover, through metadata, the data of interest in remote repositories. EPOS ICS is indifferent with respect to topology (single-sited, distributed or federated repository) and access points (single or multiple), as long as clear definitions of how to access the services are provided.

b. Data Access and Retrieval services are used to access a digital object (i.e. a file, a dataset, a document) and use it. The term access as used here can also be thought of as a download process. These services may also include IAAA.

For both services the use of already existing, international, community-accepted standards is highly recommended. OGC services standards, INSPIRE-based standards, Dublin Core and other international standards can be relatively easily integrated because a number of tools to manage data and metadata already exist. Reusing existing software tools will save resources for software maintenance in the long term on both the TCS and ICS side.

c. Data analysis/visualisation/modelling/mining services are used to extract meaning from the data using the software services provided. EPOS ICS will discover appropriate software services for the kinds of data requested by the user and (semi-automatically) compose the software stack to process the dataset(s) as required.
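The discovery and access services described in items a and b can be sketched as two small functions over a list of metadata records. This is a minimal illustration under stated assumptions: the `MetadataRecord` fields, the function names and the `example.org` URLs are all hypothetical, and a real TCS would expose these operations as web services, ideally following an existing standard.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed bounding-box convention: (min_lon, min_lat, max_lon, max_lat).
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class MetadataRecord:
    identifier: str
    title: str
    lon: float
    lat: float
    access_url: str   # where the digital object itself can be retrieved


def discover(records: List[MetadataRecord],
             keyword: Optional[str] = None,
             bbox: Optional[BBox] = None) -> List[MetadataRecord]:
    """Data discovery: filter metadata by keyword and/or bounding box."""
    hits = []
    for record in records:
        if keyword is not None and keyword.lower() not in record.title.lower():
            continue
        if bbox is not None:
            min_lon, min_lat, max_lon, max_lat = bbox
            if not (min_lon <= record.lon <= max_lon
                    and min_lat <= record.lat <= max_lat):
                continue
        hits.append(record)
    return hits


def resolve_access(records: List[MetadataRecord], identifier: str) -> str:
    """Data access: map a discovered identifier to its retrieval URL."""
    for record in records:
        if record.identifier == identifier:
            return record.access_url
    raise KeyError(identifier)


# Illustrative catalogue only.
catalogue = [
    MetadataRecord("seis-001", "Seismic waveforms, station A", 11.3, 44.5,
                   "https://example.org/data/seis-001"),
    MetadataRecord("gps-001", "GPS time series, station B", 12.5, 41.9,
                   "https://example.org/data/gps-001"),
    MetadataRecord("seis-002", "Seismic waveforms, station C", 23.7, 38.0,
                   "https://example.org/data/seis-002"),
]

found = discover(catalogue, keyword="seismic", bbox=(10.0, 40.0, 15.0, 46.0))
url = resolve_access(catalogue, found[0].identifier)
print([r.identifier for r in found], url)
```

Keeping discovery (metadata only) separate from access (the object itself) means the ICS can search across many repositories without moving any data until the user actually requests it.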

STEP 3: Metadata match & map

For each metadata standard from a single TCS, a metadata matching/mapping procedure is required. This is also called metadata-assisted canonical brokering.

The purpose is to match the metadata elements of each dataset (and also of software services, users and resources such as computers and detectors) at TCS level with metadata entities in the ICS metadata catalogue, and to generate mappings between matched elements so that any data instances (values of attributes within a record) under one schema can be converted to instances under the other schema.

Matching involves finding corresponding attributes in the two schemas even if they have different names (especially a problem in multilingual environments) or, in the worst case, even different data types. As an example, in two different datasets both concerning observations on the earth surface, there may be attributes describing geospatial positions having different names (Latitude/Longitude versus Northing/Easting) and different accuracy and precision attributes. In order to properly match metadata elements, shared work between ICS and TCS technical staff is required. Such work implies that:

• A subset of the most relevant community metadata elements is selected by TCS technical staff in cooperation with TCS scientists.

• This subset is then mapped into the EPOS catalogue by the ICS technical staff.
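The matching/mapping idea described in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the EPOS implementation: the TCS attribute names (`Latitudine`, `Longitudine`, `Nome`, `Sensore`), the canonical names (`latitude`, `longitude`, `title`) and the helper `map_record` are all hypothetical. Each matched attribute gets a target name and a converter to the canonical data type; attributes without a match are reported so that ICS and TCS staff can review them together.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class FieldMapping:
    target: str                    # attribute name in the canonical (ICS-side) schema
    convert: Callable[[Any], Any]  # converts the TCS value to the canonical data type


# Hypothetical mapping for one TCS schema with non-English attribute names.
STATION_MAPPING: Dict[str, FieldMapping] = {
    "Latitudine": FieldMapping("latitude", float),
    "Longitudine": FieldMapping("longitude", float),
    "Nome": FieldMapping("title", str.strip),
}


def map_record(
    record: Dict[str, Any], mapping: Dict[str, FieldMapping]
) -> Tuple[Dict[str, Any], List[str]]:
    """Convert one TCS record to the canonical schema.

    Returns the mapped record and the list of TCS attributes
    for which no match has been defined.
    """
    mapped: Dict[str, Any] = {}
    unmatched: List[str] = []
    for name, value in record.items():
        rule = mapping.get(name)
        if rule is None:
            unmatched.append(name)
        else:
            mapped[rule.target] = rule.convert(value)
    return mapped, unmatched


# Values arrive as strings in this hypothetical TCS record.
tcs_record = {"Latitudine": "42.35", "Longitudine": "13.40", "Nome": " AQU ", "Sensore": "STS-2"}
canonical, unmatched = map_record(tcs_record, STATION_MAPPING)
print(canonical)
print(unmatched)
```

A case such as Northing/Easting versus Latitude/Longitude would need more than a rename and a type cast: the converter would have to apply a coordinate transformation between reference systems, which is deliberately left out of this sketch.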

NEXT STEPS: The following issues will be addressed in the next steps:

• Suggested technologies/standards.


• ICS will collect requirements, use cases and assets from the TCS (requirements elicitation process).

• Details of the timeline and plans showing how to share the work between ICS and TCS. It will be up to the TCSs to determine what they expose to ICS.

6 ICS-D Integration

This short remark is intended to organize and provide mechanisms for interaction between two groups: on the one hand, those who are either interested in following the developments with regard to the Integrated Core Services Central Hub (ICS-C) and the Integrated Core Services Distributed (ICS-D), or who have a future interest in hosting the ICS-C or declaring an official in-kind contribution for any of the future ICS-Ds; on the other hand, those who are directly involved in developing ICS-C and ICS-D in the EPOS Implementation Phase (EPOS-IP) (i.e. WP6: ICS-TCS interactions and interoperability, and WP7: ICS design and development).

For the sake of simplicity the first group of actors is called “interest groups” and the second group of actors is called “WP6 and WP7 developers” throughout the following text. The difference between the interest groups and the WP6 and WP7 developers is that the latter are the beneficiaries of WP6 and WP7 as developers, whereas the first group represents those that may be involved in other WPs of EPOS-IP as beneficiaries, may be involved in one or several of the TCSs without being a beneficiary of EPOS-IP, or may be third parties completely independent from EPOS-IP.

In order to provide a forum for interactions between the interest groups and the WP6 and WP7 developers, we propose the following mechanisms:

1. During the EPOS-IP Proposal writing stage (i.e. until January 14, 2015):

We have established an open interaction and discussion forum in the EPOS Collaborative area (ICS discussion area), which was used by both the interest groups and the WP6 and WP7 developers. Based on the above, the WP6 and WP7 developers wrote up the individual tasks and the descriptions in WP6 and WP7, taking into account the requests and declarations provided by the interest groups.

2. During the EPOS-IP Project execution stage (i.e. 2015-2018) there are mechanisms provided and provisions given in the various tasks of WP6 and WP7 to integrate declarations made by interest groups during the EPOS-IP project duration. This includes:

a. A formal procedure for selecting the ICS-C hosting country during 2015.

b. A formal application procedure for integration of the various ICS-Ds will be prepared through a template. Their further evaluation and subsequent selection process will be made available through an open, inclusive and transparent approach.


In addition we should mention here the continuously evolving nature of EPOS through time beyond EPOS-IP and the possibility of adding further relevant and interesting ICS-Ds during the EPOS Operational Phase (EPOS-OP) (i.e. 2019-....).

Appendix 1 - Acronyms

AAAI                 Authentication, Authorisation, Accounting, Infrastructure (in the context of architecture)
AARC                 Authentication and Authorization for Research and Collaboration, https://aarc-project.eu/
APIs                 Application Programming Interfaces
CERIF                Common European Research Information Format
CES                  Computational Earth Science
CLOUD                Cloud computing is a model for enabling ubiquitous network access to a shared pool of configurable computing resources. Remote computer servers providing storage and computational services
CKAN                 Comprehensive Knowledge Archive Network
DC                   Dublin Core Metadata Standard
DCAT                 Data Catalogue Vocabulary
DDSS                 Data, data products, services and software
DO                   Digital object
ECO                  Executive and Coordination Office
EduGAIN              International interfederation service interconnecting research and education identity federations
EGI-Engage           Engaging the Research Community towards an Open Science Commons, https://www.egi.eu/about/egi-engage/
EPOS-IP              EPOS Implementation Phase
EPOS-OP              EPOS Operational Phase
ERIC                 European Research Infrastructure Consortium
GRID                 A network of interconnected large-scale high performance computational facilities
HPC                  High Performance Computing
HTC                  High Throughput Computing
IAAA                 Identification, Authentication, Authorisation, Accounting (in the context of services)
ICS                  Integrated Core Services
ICS-D                Distributed Integrated Core Services
INSPIRE              INSPIRE is an EU Directive (May 2007), establishing an infrastructure for spatial information in Europe to support Community environmental policies, and policies or activities which may have an impact on the environment
ISO-19115            International standard for geospatial data
Kerberos             A computer network authentication protocol
LDAP                 Lightweight Directory Access Protocol
NRIs                 National Research Infrastructures
OAuth                An open protocol to allow secure authorization in a simple and standard method from web, mobile and desktop applications
OGC                  Open Geospatial Consortium
OpenID               OpenID is an open standard that allows users to authenticate to websites without having to create a new password. AAAI standard
OSI                  Approved API or Web service standard
OWL                  Web Ontology Language
SAML                 Security Assertion Markup Language. AAAI standard
Shibboleth           A computer network authentication protocol
SKOS                 Simple Knowledge Organization System
RESTful web service  Web Services based on the REST architecture (web)
TCS                  Thematic Core Services
VERCE                Virtual Earthquake and seismology Research Community in Europe e-science environment, http://www.verce.eu/
W3C-PROV             World Wide Web Consortium standards on provenance information
X.509                The X.509 specification defines a standard for managing public keys through a Public Key Infrastructure (PKI). AAAI standard

