Cloud Governance Mechanisms – Early Release
Deliverable D5.1

Editors: Demetris Trihinas, Zacharias Georgiou
Reviewers: Manos Papoutsakis (FORTH), Sotiris Koussouris (Suite5)
Date: 26 March 2018
Classification: Public




Contributing Authors

Name                 Partner
Demetris Trihinas    UCY
Zacharias Georgiou   UCY
George Pallis        UCY

Version History

#    Description
1    Table of Contents (ToC) and partner contribution assignment.
2    Introduction, Document Purpose and Relation to other WPs.
3    Merged content for Monitoring State-of-the-Art.
4    Merged content for Decision-Making State-of-the-Art and Architecture.
5    Monitoring reference architecture and implementation.
6    Merged Decision Making Implementation and Requirements.
7    Merged Monitoring Requirements, API and updated implementation details.
8    Updated Requirements and Exposed Functionality for Monitoring and Decision Making.
9    Merged Updated Content for Decision Making implementation.
10   Executive Summary and Conclusions. Submitted for internal review.
11   Addressed all received feedback from internal review. Final version.


Table of Contents

1 INTRODUCTION
  1.1 Document Purpose and Scope
  1.2 Document Relationship with other Project Work Packages
  1.3 Document Structure

2 STATE OF THE ART AND KEY TECHNOLOGY AXES CHALLENGES
  2.1 Micro-Service and Container Monitoring
  2.2 Auto-Scaling

3 MONITORING AND ANALYSIS SERVICE
  3.1 Requirements and Exposed Functionality
    3.1.1 Functional Requirements
    3.1.2 Non-Functional Requirements
  3.2 Reference Architecture and Implementation
  3.3 Interaction with other Unicorn Services and Components

4 DECISION MAKING AND AUTO-SCALING SERVICE
  4.1 Requirements and Exposed Functionality
    4.1.1 Functional Requirements
    4.1.2 Non-Functional Requirements
  4.2 Reference Architecture and Implementation
    4.2.1 Elasticity Manager
    4.2.2 Elasticity Controller
  4.3 Interaction with other Unicorn Services and Components

5 CONCLUSIONS

6 REFERENCES

7 APPENDIX
  7.1 Monitoring and Analysis Service API Documentation
    7.1.1 Monitoring API Keys
    7.1.2 Monitored Applications
    7.1.3 Monitoring Agents
    7.1.4 Monitoring Metrics
  7.2 Decision Making and Auto-scaling Service API Documentation
    7.2.1 Elasticity API Keys
    7.2.2 Elastic Application
    7.2.3 Elasticity Policies

List of Figures

Figure 1: Unicorn Reference Architecture
Figure 2: High-Level and Abstract Overview of Unicorn Monitoring and Analysis Service
Figure 3: High-Level and Abstract Overview of Monitoring Agent Interfaces
Figure 4: Example of a Newly Defined Monitoring Probe
Figure 5: Dynamic Monitoring Agent Discovery
Figure 6: Decision Making & Auto-scaling Service Reference Architecture
Figure 7: Elasticity Policy Example
Figure 8: Elasticity Policy State Diagram
Figure 9: Elasticity Manager
Figure 10: Elasticity Controller Reference Architecture

List of Tables

Table 1: Monitoring Metric Handlers
Table 2: Monitoring Probes Currently Available in Unicorn Probe Repository
Table 3: CREATE Monitoring API Key Context
Table 4: DELETE Monitoring API Key Context
Table 5: GET Monitoring API Key Context
Table 6: UPDATE Monitoring API Key Context
Table 7: CREATE Monitoring Application Context
Table 8: GET Monitored Application Context Associated with Monitoring Key
Table 9: GET Monitored Application Context
Table 10: DELETE Monitored Application Context
Table 11: UPDATE Monitored Application Context
Table 12: CREATE Monitoring Agent Context
Table 13: GET Monitoring Agent contexts associated with Monitored Application
Table 14: GET Monitoring Agent Context
Table 15: DELETE Monitoring Agent Context
Table 16: UPDATE Monitoring Agent Context
Table 17: CREATE Monitoring Metric Value context
Table 18: GET Monitoring Metric value contexts associated with Monitored Agent
Table 19: GET Monitoring Metric Context
Table 20: CREATE Elasticity API Key Context
Table 21: DELETE Elasticity API Key Context
Table 22: GET Elasticity API Key Context
Table 23: Create Elastic Application’s Context
Table 24: GET Elastic Applications’ Context
Table 25: GET Elastic Application’s Context
Table 26: Delete Elastic Application Context
Table 27: UPDATE Elastic Application’s context
Table 28: CREATE a new Elasticity Policy
Table 29: GET all Elasticity Policies of a specific deployment
Table 30: GET an Elasticity Policy
Table 31: UPDATE an existing Elasticity Policy
Table 32: DELETE an existing Elasticity Policy

Executive Summary

The aim of this Deliverable is to provide a comprehensive overview and documentation report for the early release of the Unicorn Governance Mechanisms. The Unicorn Governance Mechanisms contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform and are developed within the scope of Work Package 5 (WP5).

This Deliverable begins by presenting the challenges of monitoring and managing micro-service deployments across multi-cloud and containerized execution environments. It then derives the requirements and exposed functionality for the components comprising the Unicorn Cloud Governance Mechanisms.

From the identified set of requirements, the reference architecture and public APIs of both the Monitoring and Analysis Service and the Decision-Making and Auto-Scaling Service are derived and reported. This gives an overview of the early implementation of these components, which constitute key parts of the runtime management stack in the first prototype release of the Unicorn Platform.

Finally, the Deliverable concludes by outlining the work to be conducted towards D5.2, which will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Platform Cloud Governance Mechanisms.


Table of Abbreviations

API      Application Programming Interface
CI/CD    Continuous Integration and Continuous Delivery
CRUD     Create Read Update Delete
DRL      Drools Rule Language
EBNF     Extended Backus-Naur Form
IDE      Integrated Development Environment
J2EE     Java 2 Platform Enterprise Edition
JSON     JavaScript Object Notation
JVM      Java Virtual Machine
MAPE-K   Monitor Analyze Plan Execute - Knowledge
MaaS     Monitoring as a Service
NoSQL    Non-Relational Database
OS       Operating System
QoS      Quality of Service
REST     Representational State Transfer
TCP      Transmission Control Protocol
VM       Virtual Machine


1 Introduction

The aim of the Unicorn project is to empower the European digital SME eco-system by delivering a novel and unified framework that simplifies the design, deployment and management of secure and elastic-by-design cloud applications that follow the micro-service architectural paradigm and can be deployed over multi-cloud containerized execution environments. To this end, Deliverable D5.1, henceforth simply referred to as D5.1, provides a comprehensive overview and documentation report for the early release of the Unicorn Governance Mechanisms. The Unicorn Governance Mechanisms contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform and are developed within the scope of Work Package 5 (WP5).

The Unicorn Governance Mechanisms include three vital software components:

• The Monitoring and Analysis Service: The role of this component is to provide real-time monitoring data storage and analysis in order to detect and promptly notify cloud consumers and platform operators of potential performance inefficiencies, security risks and recurring customer and resource behaviour patterns.

• The Execution Environment Agent¹: The role of this component is to collect monitoring data in a non-intrusive and interoperable manner. The data covers resource utilization of the underlying containerized execution environment (e.g., compute, memory, network) and the behaviour of the deployed cloud application, captured through tailored application-level metrics (e.g., throughput, active users).

• The Decision Making and Auto-Scaling Service: The role of this component is to decide the most efficient configuration for the execution of the cloud application by continuously evaluating application and service tier behaviour, the underlying multi-cloud provisioned infrastructure and user-defined requirements, policies and constraints.

Figure 1 depicts the current version of the Unicorn Reference Architecture with the components comprising the Unicorn Governance Mechanisms highlighted. A Monitoring Agent is bundled by the Unicorn Platform within each containerized execution environment upon deployment and is configured to adhere to the constraints (e.g., collection periodicity) defined at design-time in the application service description. The Unicorn Monitoring Agent is particularly tailored for containerized environments (e.g., absence of file-system) such as Docker and any other container format adopting the Open Container Specification [1]. After deployment, the set of metrics of interest, comprising both resource utilization metrics from the containerized environment and application-specific metrics extracted from the annotated source code, is automatically published by the Monitoring Agent to the Monitoring and Analysis Service, which is part of the Runtime Enforcement Layer of the Unicorn Platform.

At this point, the Monitoring and Analysis Service processes incoming monitoring data and stores metric updates to the respective Data Store for historic reference. In turn, real-time metric updates are fed for further analysis to construct high-level analytic insights and aggregated data before being streamed to the intelligent Decision-Making and Auto-Scaling Service. This service, as part of a MAPE-K control loop², then proceeds to assess whether adaptation of the underlying virtual execution environment is required. Adaptation is based on semi-supervised decision algorithms for the optimal placement of virtual machines and containers across multiple availability zones and/or cloud sites, while taking into account the heterogeneity among cloud providers and their capabilities. At the same time, the Unicorn decision-making process still adheres to the conditions and high-level policy constraints defined by the Application Administrator.

¹ As the terms Execution Environment Agent and Monitoring Agent are interchangeable for the Unicorn Framework, we will simply refer to the Execution Environment Agent as the Monitoring Agent.
² Monitor-Analyse-Plan-Execute with Knowledge.

Through the Unicorn Dashboard, application administrators can view, in an intuitive graphical manner, real-time monitoring data capturing the application behaviour and the performance of the underlying platform, an assessment of the application’s elastic behaviour, and potential security incidents. In turn, through the Unicorn Dashboard, users can formulate (continuous) monitoring queries to access and trawl historical and/or aggregated monitoring data extracted from the Monitoring and Analysis Service.

Figure 1: Unicorn Reference Architecture


1.1 Document Purpose and Scope

The purpose of this deliverable is to provide a comprehensive overview and documentation report of the early release of the Unicorn Governance Mechanisms, which contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform. In this respect, D5.1 aims to give a clear overview of the early design and development of the three components that comprise the Unicorn Governance Mechanisms and are developed under the umbrella of WP5: (i) the Monitoring and Analysis Service; (ii) the Monitoring Agent; and (iii) the Decision Making and Auto-Scaling Service. We note that, as the Monitoring Agent is tightly coupled with the Monitoring and Analysis Service, the documentation for these two components is introduced together. To this end, D5.1 documents, for each component of the Unicorn Governance Mechanisms, the requirements that must be satisfied to overcome the challenges introduced when monitoring, managing and scaling multi-cloud containerized execution environments, its functionality, how it operates, and the first version of its exposed API, which is used to interact with other Unicorn components, users and/or third-party services. Finally, we note that parts of D5.1 are based on a number of scientific papers [2][3], which introduce core concepts of the components that are part of the Unicorn Governance Mechanisms and WP5.

1.2 Document Relationship with other Project Work Packages

This deliverable is built on the foundation of D1.2, which documents the current version of the reference architecture and the key technologies supported by Unicorn, and provides an initial description of the components comprising the Unicorn framework. To this end, D5.1 extends the Unicorn documentation by providing a comprehensive report for the Unicorn Governance Mechanisms. What is more, D5.1 serves as a guide for D5.2, the Unicorn Governance Mechanisms - Final Release, which will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Governance Mechanisms.

1.3 Document Structure

The rest of this deliverable is structured as follows:

Section 2 updates the comprehensive report introduced in D1.2 on the State-of-the-Art landscape in monitoring, managing and scaling multi-cloud applications and containerized environments.

Sections 3 and 4 present a comprehensive documentation report introducing the reference architecture, exposed functionality and implementation details of the Monitoring and Analysis Service and the Decision Making and Auto-Scaling Service, respectively.

Section 5 concludes this Deliverable and outlines the work to be conducted towards D5.2.

The Appendix provides comprehensive documentation of the current API exposed by both the Monitoring and Auto-Scaling Services.


2 State of the Art and Key Technology Axes Challenges

In the cloud era, as applications grow by adding more services, real-time monitoring and dynamic resource allocation become significant challenges. At scale, these challenges can be addressed with autonomicity. Through automation, microservices are equipped with the ability to continuously control the underlying infrastructure, thus turning into services that can be harnessed programmatically at runtime. However, traditional monitoring is ineffective for ephemeral, decomposed and highly dynamic microservices deployed over shared execution environments. In addition, finer service granularity means more moving parts and hence an increased complexity of auto-scaling, potentially more points of failure, and more possibilities for serious security violations and privacy leaks. This set of challenges creates the need to devise new solutions that can design, run and monitor micro-services at scale while also achieving the anticipated autonomicity of the application runtime on top of the underlying programmable cloud infrastructure.

In this Section, we update the State-of-the-Art presented in D1.2. In particular, we present the challenges of monitoring and managing micro-service deployments across multi-cloud and containerized execution environments.

2.1 Micro-Service and Container Monitoring

Traditional monitoring tools, even in the cloud era, are designed for slowly evolving execution environments where application instances resemble one another and reside on either physical or virtual machines. Containerized execution environments ease application development and deployment, as the user is abstracted from the complexity of configuring the underlying offerings, including the (virtual) infrastructure, network, storage and accompanying services [4]. However, monitoring containerized environments is a complex and open challenge.

Specifically, in a containerized environment there is no guest OS or file-system in which to deploy a monitoring agent alongside running services. An agent must either run through the container engine or be part of the application itself, which means that monitoring should be an integral part of application design and cannot be decided after deployment. To overcome this challenge, Docker Stats [5], the native Docker monitoring tool, and cAdvisor [6], an open-source monitoring tool developed by Google for Docker, assume users have access to host machines, hooking themselves to the container engine daemon to access and collect monitoring data for the containers deployed on the particular host.

Although accessing monitoring data through the host seems a viable solution, one must consider the difficulty of providing portable and interoperable monitoring. This is the case when supporting multi-cloud deployments where instances span multiple and heterogeneous cloud offerings. To this end, monitoring tools such as MonPaaS [7], Tower4Clouds [8] and JCatascopia [9] offer portable and multi-cloud monitoring for cloud applications, although these tools are not tailored for containerized environments, as their agents or metric collectors require a file-system for deployment. Most importantly, access to host machines is not possible when the underlying infrastructure is provided through an intermediate cloud broker, such as in the case of the Unicorn Framework when provided as a DevOps-as-a-Service eco-system.

What is more, granularly slicing an application into (micro-)services inherently introduces heterogeneity, which requires full customization of the monitoring process to access runtime application behaviour and perform diagnostics that yield helpful insights [10]. In particular, unless monitoring is limited to capacity utilization, different insights are required for different services of a micro-service deployment. However, customization must be automated and become part of the continuous delivery process for immutable containerized execution environments. This means that for Docker applications, customization of the monitoring process must be part of the container bundling to be consistent with the principles of the micro-service architecture paradigm [11]. However, current monitoring solutions that support application monitoring, such as New Relic APM [12] and DataDog [13], require that users download and utilize bundled and pre-configured metric collectors that are available for popular programming languages (e.g., Java, Ruby, Python) and frameworks (e.g., Tomcat, Django, SQL).

This leads to another challenge: ignoring the unique characteristics of the containerized environment leads to the use of multiple share-nothing metric collectors, because customization is absent and metrics cannot be compiled from already collected data. This increases container sizes and introduces high runtime footprints. Thus, there are significant costs and actual runtime overheads when monitoring ephemeral, decomposed and highly dynamic applications over virtualized and shared execution environments [14].

2.2 Auto-Scaling

One of the most valuable characteristics of microservice-based cloud applications is their capability to efficiently scale horizontally and independently in response to changes in the workload, due to their inherently decoupled nature. Traditional scaling solutions by cloud providers such as AWS [15] and Microsoft Azure [16] rely on events triggered by threshold violations on monitoring data. These are expressed through simple rules of the form “IF-THEN-ACTION” and are manually pre-determined by the user. However, finding the most appropriate rules for scaling is not a trivial task, as it requires that the user has a priori knowledge of the optimal threshold values. Even so, thresholds should be constantly adjusted to reflect changes in the workload or the application’s behaviour in such a dynamic and complex environment.

Another critical parameter in traditional rule-based scaling mechanisms is the time window, which indicates that the condition of a rule must evaluate to true for a pre-determined period of time (e.g., cpu_usage > 60% for 5 minutes). This helps in determining whether a scaling alert is issued due to an actual change in the demand of an application, or due to sudden and short-lived spikes in highly sensitive monitoring data (e.g., CPU usage). This approach can improve the accuracy of scaling; however, when not configured properly it can lead to undesirable results. Specifically, when the chosen period is relatively small, the system may reach an unstable state where resources are provisioned and de-provisioned rapidly and, most importantly, are billed even though real demand does not exist. This phenomenon is known as the “ping-pong” effect [2]. On the other hand, when the period is too long, detection of an actual change in the application load is delayed, which can cause a severe performance degradation that affects the application’s overall QoS. A third and equally important factor that impacts the application’s overall performance is the cooldown period. This parameter introduces an artificial delay that gives the system additional time to absorb the effects of previous scaling actions (e.g., provisioning and registering new services to the load balancer) and return to a stable state before another action is triggered. These difficulties clearly impose a significant challenge for the user in finding optimal parameter configurations, especially for microservice-based cloud applications, which are inherently decomposed into multiple and heterogeneous services.
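The interplay of threshold, time window and cooldown period can be sketched as follows. This is a minimal illustration under our own assumptions, not the rule engine of any cloud provider: the class name `ThresholdRule`, its fields and the use of plain seconds for timestamps are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical IF-THEN-ACTION rule: fire when a metric stays above a threshold."""
    threshold: float   # e.g. 60.0 for "cpu_usage > 60%"
    window_s: float    # the condition must hold continuously for this long
    cooldown_s: float  # minimum time between two scaling actions
    _violated_since: Optional[float] = field(default=None, init=False)
    _last_action_at: Optional[float] = field(default=None, init=False)

    def observe(self, now: float, value: float) -> bool:
        """Feed one sample; return True when a scaling action should fire."""
        if value <= self.threshold:
            # A short-lived spike that ends resets the time window.
            self._violated_since = None
            return False
        if self._violated_since is None:
            self._violated_since = now
        in_cooldown = (
            self._last_action_at is not None
            and now - self._last_action_at < self.cooldown_s
        )
        if now - self._violated_since >= self.window_s and not in_cooldown:
            self._last_action_at = now
            self._violated_since = None
            return True
        return False
```

With a small `window_s` the rule fires on brief spikes (the ping-pong effect), while a large `window_s` or `cooldown_s` delays the reaction to a real load change, which is the trade-off described above.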

A scaling mechanism that relieves the user of this error-prone configuration is target-based scaling. This mechanism, offered by Google Cloud Platform [17] and recently by AWS [18], allows the user to specify a target value for a metric, with the Auto-Scaling mechanism introducing the appropriate adjustments to keep the metric close to the defined target value. However, this method imposes the significant challenge of finding a representative metric by which to scale the application. The work in [19] uses a clustering technique to reduce and group related metrics and, using a service call-graph that is constructed by applying application-specific load, identifies the most relevant metrics for scaling. This approach improves the orchestration of auto-scaling mechanisms; however, it does not capture changes in application or workload behaviour that can make the representative metrics irrelevant.
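To make the idea of target-based scaling concrete, the sketch below uses a simple proportional rule: the instance count is scaled by the ratio of the observed metric to its target. This rule is an assumption chosen for illustration; the text above does not state which adjustment policy the cited providers apply, and the function name and the `min_n`/`max_n` bounds are hypothetical.

```python
import math


def desired_instances(current: int, observed: float, target: float,
                      min_n: int = 1, max_n: int = 10) -> int:
    """Return the instance count that would bring the metric back to its target.

    Assumes the metric (e.g. average CPU utilization per instance) scales
    inversely with the number of instances.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * observed / target)
    # Clamp to the user-defined bounds so a noisy metric cannot scale without limit.
    return max(min_n, min(max_n, raw))
```

For example, four instances observed at 90% utilization against a 60% target would be scaled out to six. The quality of the result depends entirely on whether the chosen metric is representative of application load, which is the challenge noted above.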

Concerning multi-cloud scaling, while it offers greater flexibility to enterprises, not all applications inherently benefit from it simply by applying existing scaling rules in a multi-cloud fashion. For example, services that constantly exchange data may incur excessive charges and suffer significant performance degradation (e.g., low bandwidth across cloud providers). Furthermore, the lack of standardized APIs for accessing cloud offerings across multiple providers remains a significant challenge. In particular, cloud providers and platforms use their own technology stacks, making it difficult for clients to exploit the advantages of truly multi-cloud deployments. Finally, given the diverse resource requirements of each microservice of an application and the resource heterogeneity introduced by different cloud providers, finding an optimal placement that improves the quality and performance of the application while minimizing costs becomes a significant challenge for an enterprise.


3 Monitoring and Analysis Service

In this Section, we present a comprehensive documentation report introducing the reference architecture, exposed functionality and implementation details of the Monitoring and Analysis Service.

3.1 Requirements and Exposed Functionality

The users interacting with the Unicorn Monitoring and Analysis Service include:

• Cloud Application Developer: interacts with Unicorn Monitoring by defining the scope and intensity of monitoring metrics to assess application behavior with source code annotations via the Monitoring Design Library.

• Cloud Application Owner: interacts with Unicorn Monitoring by defining the scope and intensity of monitoring metrics to assess application behavior by using the tools available from the Unicorn Dashboard Monitoring Facet for service description enrichment.

• Unicorn Developer: interacts with Unicorn Monitoring by developing custom metric collectors for frameworks and programming languages to increase the reach of the Monitoring and Analysis Service.

• Cloud Provider: interacts with Unicorn Monitoring by assessing monitoring data regarding cloud offering utilization to optimize the provisioning and quality of service of the underlying resource offerings.

To satisfy Use-Case UC.8 (Monitor application behaviour and performance), documented in D1.2, while adhering to system requirement FR.7 (Access application behaviour and performance monitoring data), documented in D1.1, the following functionality must be exposed by the Monitoring and Analysis Service.

3.1.1 Functional Requirements

ID: FR.MON.1

Title: Monitor cloud and container level utilization

Description: Unicorn Monitoring must provide the means to monitor underlying cloud offerings and containerized execution environments that are provisioned for applications deployed by the Unicorn Platform. Monitoring includes exposing provisioned resource capabilities, current consumption, service (un-)availability and faults.

Exposed Functionality: Metric collectors, denoted as Monitoring Probes, are developed to monitor and extract metrics from the underlying cloud element they reside on (e.g., VM, container). The code-base of Monitoring Agents, the entities responsible for managing the metric collection process, is decoupled from the actual metric collection (Monitoring Probes), which allows for the selection and dynamic instantiation of only the metric collectors required to satisfy the monitoring task at hand. To this end, Unicorn provides a number of Monitoring Probes capable of monitoring Docker containers, JVM and J2EE containers, as well as Linux VMs deployed on provisioned cloud offerings.


ID: FR.MON.2

Title: Monitor cloud application behavior and performance

Description: Unicorn Monitoring must provide the means to monitor the behavior, quality of service and current performance of deployed cloud applications. This must be achieved by providing a monitoring library for application instrumentation via either or both of source code annotations and the service description.

Exposed Functionality: The Monitoring Library is developed to expose source code annotation decorators for developers to define the scope and configuration of application-level metrics at design time. At runtime, the Monitoring Library instantiates the metric handlers responsible for instrumentation and metric update extraction from the executed source code. Metric updates are then pushed to the respective Monitoring Agent for parsing and dissemination to the Unicorn Monitoring and Analysis Service. Unicorn currently provides the Monitoring Library for cloud applications developed in Java, with the library featuring additional enhancements such as high-level metric handlers for monitoring and measuring the completion of application tasks (timers), intercepting and filtering communication among interacting components (interceptors), and measuring the rate of completed tasks in a user-defined timeframe (meters).
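The timer handler described above can be illustrated with a decorator. The Unicorn Monitoring Library targets Java annotations, so the following is only a Python analogue under our own assumptions: the `timed` decorator, the `METRIC_UPDATES` buffer (standing in for the hand-off to a Monitoring Agent) and the example metric name are all hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process buffer standing in for the push to a Monitoring Agent.
METRIC_UPDATES = defaultdict(list)


def timed(metric_name):
    """Record the wall-clock duration of each call under metric_name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the task raises, so failed tasks are measured too.
                METRIC_UPDATES[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("checkout.duration_s")
def checkout(items):
    """Example application task instrumented at design time."""
    return sum(items)
```

The developer only declares the metric at design time; the instrumentation and the extraction of metric updates happen around the call at runtime.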

ID: FR.MON.3

Title: Metric collector development toolkit

Description: Unicorn Monitoring must expose a toolkit for users to develop custom metric collectors tailored to their application requirements. Developed metric collectors that adhere to the Unicorn metric paradigm must be seamlessly integrated into the Unicorn Monitoring process at either deployment or runtime.

Exposed Functionality: Metric collectors, denoted as Monitoring Probes, must adhere to the Unicorn probe interface and metric abstractions, which allow Monitoring Probes to be dynamically plugged into Monitoring Agents and metric updates to be parsed and disseminated to the Monitoring and Analysis Service. The referenced toolkit eases Monitoring Probe development by hiding the complexity of the Probe functionality and asking developers to define only default values for the Probe periodicity and name, a short description of the offered functionality, and a concrete implementation of how metric values are updated.
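A probe contract of this kind can be sketched as follows. This is not the Unicorn probe interface; `Probe`, `Agent`, `QueueLengthProbe` and all method names are hypothetical and only mirror the four items a developer is asked to supply (periodicity, name, description and the metric update logic).

```python
from abc import ABC, abstractmethod
from typing import Dict


class Probe(ABC):
    """Hypothetical probe base class: developers supply only the essentials."""

    def __init__(self, name: str, description: str, period_s: float = 10.0):
        self.name = name
        self.description = description
        self.period_s = period_s  # default collection periodicity

    @abstractmethod
    def collect(self) -> Dict[str, float]:
        """Return the latest metric values keyed by metric name."""


class Agent:
    """Hypothetical agent that manages whichever probes are currently attached."""

    def __init__(self) -> None:
        self._probes: Dict[str, Probe] = {}

    def attach(self, probe: Probe) -> None:
        self._probes[probe.name] = probe

    def detach(self, name: str) -> None:
        self._probes.pop(name, None)

    def poll(self) -> Dict[str, float]:
        """Gather one round of updates, prefixing each metric with its probe name."""
        updates: Dict[str, float] = {}
        for probe in self._probes.values():
            for metric, value in probe.collect().items():
                updates[f"{probe.name}.{metric}"] = value
        return updates


class QueueLengthProbe(Probe):
    """Example custom probe reporting the length of an in-memory work queue."""

    def __init__(self, queue: list):
        super().__init__("queue", "Length of an in-memory work queue", period_s=5.0)
        self._queue = queue

    def collect(self) -> Dict[str, float]:
        return {"length": float(len(self._queue))}
```

Because the agent depends only on the abstract `collect` contract, probes can be attached or detached at runtime without changing agent code, which is the decoupling the requirement asks for.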

ID: FR.MON.4

Title: Access to historical and real-time monitoring metric data

Description: Unicorn Monitoring must provide the means for users to access both historical and real-time monitoring data, with the data storage backend restricting access to authorized entities only. Access to monitoring data must support both a push and a pull metric delivery mechanism to reduce the overhead of exposing monitoring data, depending on the requirements of the interested entity and the Unicorn platform.

Exposed Functionality: A thin API layer on top of the Monitoring Data Store is provided to abstract and manage secure and authorized access for storing and extracting monitoring data. The Monitoring Data Store is implemented as a distributed NoSQL database so that it can scale depending on the load imposed by access to historical monitoring data. To reduce the overhead of continuous monitoring queries from entities requesting real-time data, a high-performance queueing service is exposed by the Analysis Service, which receives and manages subscription requests to metric topics of interest. After subscription, metric updates are pushed immediately to the related topics, so interested entities do not need to continuously issue requests through the Monitoring API, and consequently to the Monitoring Data Store.
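The push delivery described above can be sketched with an in-memory topic broker. This is a minimal illustration under our own assumptions, not the Analysis Service queueing implementation: `MetricBroker`, its methods and the topic naming are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber callback receives the topic and the new metric value.
Callback = Callable[[str, float], None]


class MetricBroker:
    """Hypothetical topic-based broker that pushes metric updates to subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register interest in a metric topic once, instead of polling."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, value: float) -> int:
        """Push an update to every subscriber of the topic; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, value)
        return len(callbacks)
```

An interested entity subscribes once and then receives each update as it is published, so no request reaches the data store for real-time values.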

ID: FR.MON.5

Title: Runtime monitoring topology adaptation acknowledgement

Description: Unicorn Monitoring must provide the means to acknowledge runtime adaptation of application service topologies, including the (de-)provisioning of service instances, the alteration of cloud and container offering capabilities and the attachment of additional metric collectors. Adaptation must be acknowledged in a timely manner without the need to restart the entire monitoring process or part of it.

Exposed Functionality: The Unicorn Monitoring and Analysis Service is able to acknowledge dynamic changes to the underlying monitoring topology of an application embracing the micro-service paradigm, where service decomposition and elastic scaling are inherent runtime features. This is accomplished by embracing a variation of the pub/sub message protocol to develop the communication plane between Monitoring Agents and the Monitoring and Analysis Service, which allows for rapid propagation of changes to the underlying cloud offerings, the containerized environment and the cardinality of instances per service layer.

ID: FR.MON.6

Title: Monitoring rule language for metric composition, aggregation and grouping

Description: Unicorn Monitoring must provide users with the means to compose, aggregate and group monitoring data in order to derive high-level analytic insights. Metric rules should be validated and, once accepted, must provide timely answers to users via either or both of a push and a pull delivery mechanism.

Exposed Functionality: The underlying monitoring metric model, documented in D2.1, provides the means for both cloud application developers and administrators to specify monitoring rules for metric composition, aggregation and grouping. These rules allow high-level analytic insights to be derived from low-level monitoring data and from the correlation of collected data. To this end, metric rules are supported by Unicorn as enhancements to an application’s service description, at either deployment or runtime, through the Management Perspective of the Unicorn Dashboard, and in particular the Monitoring facet. Metric rules are validated, and their assessment is provided, by the Analysis Service.
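The three operations named in this requirement can be illustrated on plain data. The sketch below does not reproduce the Unicorn rule language or the D2.1 metric model; the sample shape `(service tier, metric name, value)` and the functions `aggregate` and `compose_ratio` are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed sample shape: (service tier, metric name, value).
Sample = Tuple[str, str, float]


def aggregate(samples: Iterable[Sample], metric: str,
              group_fn: Callable[[List[float]], float] = mean) -> Dict[str, float]:
    """Group the samples of one metric by service tier and aggregate each group."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for tier, name, value in samples:
        if name == metric:
            groups[tier].append(value)
    return {tier: group_fn(values) for tier, values in groups.items()}


def compose_ratio(numerators: Dict[str, float],
                  denominators: Dict[str, float]) -> Dict[str, float]:
    """Compose a higher-level metric (e.g. error rate) from two aggregated metrics.

    Tiers with a missing or zero denominator are skipped.
    """
    return {
        tier: numerators[tier] / denominators[tier]
        for tier in numerators
        if denominators.get(tier)
    }
```

For instance, grouping CPU samples by tier and averaging them yields one insight per service tier, while dividing aggregated error counts by request counts composes an error-rate metric from two low-level ones.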

3.1.2 Non-Functional Requirements

ID: NFR.MON.1

Title: Scalability

Description: Unicorn Monitoring must be scalable in order to handle a large number of metric producers on different cloud levels while simultaneously being able to handle a large number of metric consumers. Thus, Unicorn Monitoring should not be fragmented by the number of running monitoring instances or the number of metric collectors deployed on each running instance.

ID: NFR.MON.2

Title: Non-Intrusiveness

Description: Unicorn Monitoring must not interfere with the system or application(s) monitored, and must not consume excessive resources from either the underlying cloud offerings or the containerized execution environment. Thus, Unicorn Monitoring should have minimal runtime impact on provisioned resources (compute, memory, network) in order not to affect behavior and performance.

ID: NFR.MON.3

Title: Customization and Extensibility

Description: Unicorn Monitoring must give users the means to customize the monitoring process to suit the needs of their deployed applications. In this respect, the number and type of metrics, along with the intensity of the monitoring process, must be customizable. In turn, Unicorn Monitoring must be extensible by providing the ability to include new functionality and metrics. In this respect, Unicorn Monitoring must be adaptive and flexible in order to utilize expanded functionality.

3.2 Reference Architecture and Implementation

To address the aforementioned challenges and adhere to the documented requirements, Unicorn introduces a complete monitoring stack for automating the monitoring of cloud applications deployed through containerized execution environments. Figure 2 depicts a high-level and abstract overview of the Unicorn Monitoring and Analysis Service in a multi-cloud containerized execution environment.


The architecture of the Unicorn Monitoring and Analysis Service follows an agent-based architecture thatembracestheproducer-consumercommunicationparadigm.Thisapproachprovidesinteroperable,scalableandreal-timecloudmonitoringforextractingbothplatformandapplicationbehaviourdatafromdeployedcloudapplications.TheUnicornMonitoringandAnalysisServicerunsinanon-intrusiveandtransparentmannertoanyunderlyingcloudasneitherthemetriccollectionprocessnormetricdistributionandstoragearedependentto theunderlyingplatformAPIsandcommunicationmechanisms. In turn, theMonitoringService takes intoconsideration the rapid changes that occur due to the enforcement of elastic actions to the underlyinginfrastructureandtheapplicationtopology.

Figure 2: High-Level and Abstract Overview of Unicorn Monitoring and Analysis Service

The main components that comprise the Unicorn Monitoring and Analysis Service are the following:

• Monitoring Agents: lightweight entities deployable on any cloud element to be monitored, such as containerized execution environments or virtual machines residing on public or private cloud offerings. Monitoring Agents are the entities responsible for coordinating and managing the metric collection process on the respective cloud element (e.g., container, VM), which includes aggregation and dissemination of monitoring data to the Monitoring Service over a secure control plane. Additional functionality of a Monitoring Agent includes adapting the intensity of the monitoring process by adjusting the periodicity of both metric collection and dissemination based on the current evolution of the metric data stream.

• Monitoring Probes: the actual metric collectors that adhere to a common metric collection interface, with specific Monitoring Probe implementations gathering metrics from the underlying infrastructure, the containerized execution environments or performance metrics from deployed cloud applications. Monitoring Probes feature both a push and a pull metric delivery mechanism, with Monitoring Agents benefiting from the push mechanism to avoid the overhead of constantly checking for metric updates. Monitoring Probes logically group multiple metrics together in order to reduce the monitoring overhead when accessing common and shared resources.

• Monitoring Library³: the source code annotation design library supporting application instrumentation for Unicorn-compliant cloud applications. By adding source code annotations for monitoring, Developers can define the scope and target of metric handlers to enable and alter the metric collection, aggregation and dissemination process. Enabled metric handlers allow developers to define metric counters, timers, and traffic interceptors to gather application performance metrics in order to assess and evaluate application behaviour and the quality of service of the underlying cloud offerings.

• Monitoring Service: the entity easing the management of the monitoring infrastructure by providing scalable and multi-tenant monitoring alongside the Unicorn platform. In particular, the Monitoring Service is responsible for receiving, processing and storing monitoring metrics to the Monitoring Data Store. Internally, the Monitoring Service comprises a distributed and horizontally scalable tier of Monitoring Servers that coordinates and handles metric and configuration requests from both users and active Monitoring Agents. The communication between Monitoring Agents and the Monitoring Service is accomplished by utilizing a variation of the traditional publish and subscribe (pub/sub) message paradigm, which reduces the related network communication overhead.

• Analysis Service: the entity deployed on top of the Monitoring Service that is responsible for aggregating and compiling, at runtime, high-level analytic insights from collected monitoring data based on metric rules defined by users either at deployment or through the service graph editor of the Unicorn Dashboard. After the validity and correctness of the received metric rules are assessed, real-time analytic insights are served through a high-performance queueing service to either the Unicorn Dashboard or external entities that subscribe via the Monitoring REST API to topics of interest and receive streamed monitoring data.

• Monitoring Data Store: a distributed and scalable data store with a high-performance indexing scheme for storing and extracting monitoring updates.

• Monitoring REST API: the entity responsible for managing and authorizing access to monitoring data stored in the Monitoring Data Store.

Monitoring Agents are integral to the Unicorn monitoring process as they are the lightweight monitoring instances responsible for the coordination of the metric collection process. Specifically, a Monitoring Agent is instantiated as a lightweight user process in the deployed cloud element (e.g., container environment), decoupling the metric collection process from the metric dissemination to the Monitoring Service. This allows the Monitoring Agent code base to be reused on various cloud layers and elements, with only the actual metric collectors differing. The enablement of the Monitoring Agent is part of the Unicorn Continuous Integration and Delivery (CI/CD) cycle where, for each instantiated containerized environment that hosts a service of a cloud application, the Unicorn Smart Orchestrator will configure, bundle and deploy a Monitoring Agent alongside the application. In regard to configuration, users are free to set the periodicity of metrics to be collected, enable logging and set the level of reporting, and also give a name and tags to the Monitoring Agent to ease readability and association when performing monitoring queries via the Unicorn Dashboard.

³ In this section, for coherence purposes, we simply refer to the Monitoring and Elasticity Library as the Monitoring Library, since we are interested in the functionality of monitoring.


Figure 3 depicts a high-level and abstract overview of the interfaces that map to the functionality provided by a Monitoring Agent, which are explained in the content that follows.

Figure 3: High-Level and Abstract Overview of Monitoring Agent Interfaces

Metric collection and configuration is achieved by Unicorn Users through the Monitoring Library and Monitoring Probes. The Monitoring Library provides users with the toolset for application instrumentation to define the scope and target of metric handlers that enable and alter the metric collection, aggregation and dissemination process. To use the Monitoring Library, users simply download it from the Unicorn Artefact Repository and embed it in their application. For simplicity, Unicorn offers the Monitoring Library as a Maven dependency for Java applications, which automatically downloads and integrates the Library with the application, as depicted in the following code snippet. Detailed instructions and guidelines are additionally provided through the Unicorn Framework Developer page. It must also be noted that, in the case of Developers taking advantage of the Unicorn Cloud IDE plugin for Eclipse Che, the latest version of the Monitoring Library is made available through the Unicorn Runtime Stack (see D2.1) with no additional steps required. The current prototype of the Monitoring Agent is implemented and offered for Java applications. However, as it features no external dependencies on other frameworks or libraries, it can be ported to other programming languages and frameworks. In turn, users with applications developed in other languages are still able to use the Unicorn Monitoring and Analysis Service by taking advantage of the Monitoring Service REST API to define and stream monitoring data without any limitations, as documented in Section 7.1.

<dependency>
  <groupId>eu.unicornH2020</groupId>
  <artifactId>UnicornMonitoring</artifactId>
  <version>LATEST</version>
</dependency>

Code Snippet 1: Monitoring Library Maven Dependency

The code snippet that follows is a metric definition in Java for a hypothetical page view counter for an item listing in an e-commerce platform. In this example, the Developer imports the Unicorn source code annotation library for monitoring into the item class definition and denotes the metric properties (e.g., metric name, value type, measurement units, etc.) for configuring a counter metric (CounterMetric) based on the Unicorn monitoring metric model⁴.

import eu.unicornH2020.annotations.monitoring;
...
@UnicornMetric(name="views",
    handler=MetricHandlerType.CounterMetric,
    units="",
    valType=MetricValueType.Integer,
    initVal=0,
    minValue=0,
    maxVal=Integer.MAX_VALUE,
    higherIsBetter=true,
    desc="number of views for specific page"
)
public class ItemController {
    ...
    private int views;
    ...
    // metric handler value extraction method
    public int getViews() {
        return views;
    }
}

Code Snippet 2: Exemplary Monitoring Library Metric Annotation

In brief, a metric, in its simplest form (SimpleMetric), is comprised of a name, measurement units, a value type and a short description. In addition, a metric may include other properties, including an initial/min/max value and a definition of whether a greater or lower value is better, as a reference for optimization such as in the case of elastic scaling. A metric may take other advanced forms, denoted as metric handlers. Table 1 introduces the metric handlers made available, to date, to users through the Unicorn Monitoring Library.

Table 1: Monitoring Metric Handlers

Metric Handler      Description
SimpleMetric        Emits a value for a referenced metric periodically.
TriggerMetric       Emits a value for a referenced metric but only when called upon.
TimerMetric         Emits the time consumed for the completion of a referenced task (e.g., API call).
MeterMetric         Emits the rate of measured events within a determined timeframe (e.g., throughput per minute).
InterceptorMetric   Emits the output of a filter function applied to the intercepted traffic among two events (e.g., TCP traffic exchanged between two API calls).

In addition to the Monitoring Library, which provides application instrumentation through annotations, Unicorn supports metric extraction through Monitoring Probes. In particular, a Monitoring Probe is a lightweight monitoring thread adhering to a commonly defined Monitoring Probe interface, with the implementation tailored to the monitoring task to be achieved. For instance, the Docker Probe is responsible for extracting and updating monitoring metrics from the Docker container it is deployed upon. Monitoring Probes run independently from each other and can be deployed dynamically without the need to restart the entire monitoring process for alteration. If a Probe encounters a problem (e.g., unexpected termination), the metric collection process of other Probes and of the respective Monitoring Agent is not affected.

⁴ To reduce repetition with D2.1, we avoid introducing segments of the monitoring metric model definition and request that interested readers use D2.1 as their model reference guide, which introduces all the models and service graph enhancements that are provided by the Unicorn Framework.

With regard to deployment, Monitoring Probes are dynamically pluggable to Monitoring Agents via the Agent's probe loader, which embraces the class reflection paradigm to dynamically link, configure and instantiate Monitoring Probes at runtime in immutable execution environments, which is a requirement for containerized offerings. This provides flexibility to the monitoring process and keeps Monitoring Agents compact, as the number and type of Probes utilized by Monitoring Agents need not be pre-bundled and can vary depending on the monitoring task that must be achieved. To achieve this, users are only required to specify at deployment time which Monitoring Probes are needed for their application services and, upon deployment, these Monitoring Probes will be fetched from the respective Probe repository and dynamically plugged into the Monitoring Agent at runtime. In the Unicorn Monitoring Probe Repository⁵ there exist a number of publicly available Monitoring Probes that can be used by users. To date, the Unicorn Monitoring Probe Repository hosts a JVM, a J2EE and a Docker Probe, with the metrics exposed depicted in Table 2.
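The reflection-based probe loading described above can be sketched as follows. This is a minimal illustration and not the actual Agent probe loader: the `PluggableProbe` interface, the `UptimeProbe` class and the `ProbeLoader` helper are hypothetical names introduced here, and fetching the probe from a repository is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical probe contract; the real Monitoring Probe interface may differ.
interface PluggableProbe {
    String name();
    Map<String, Number> collect();
}

// Toy probe used only to exercise the loader.
class UptimeProbe implements PluggableProbe {
    private final long startNs = System.nanoTime();

    public String name() { return "UptimeProbe"; }

    public Map<String, Number> collect() {
        Map<String, Number> values = new LinkedHashMap<>();
        values.put("uptimeNs", System.nanoTime() - startNs);
        return values;
    }
}

class ProbeLoader {
    // Resolve a probe class by name at runtime and instantiate it
    // through its no-argument constructor.
    static PluggableProbe load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            if (!PluggableProbe.class.isAssignableFrom(cls)) {
                throw new IllegalArgumentException(className + " is not a probe");
            }
            return (PluggableProbe) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load probe " + className, e);
        }
    }
}
```

Because the loader depends only on a class name supplied at deployment time, the agent binary does not need to bundle any concrete probe, which is the property the text attributes to the probe loader.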

Table 2: Monitoring Probes Currently Available in Unicorn Probe Repository

Monitoring Probe   Metric                             Units    Type
JVM                CPU Load                           %        Double
JVM                Average Garbage Collecting Time    ms       Double
JVM                Initial Heap Space                 KB       Long
JVM                Max Heap Space                     KB       Long
JVM                Current Heap Space Used            KB       Long
JVM                Initial Non Heap Space             KB       Long
JVM                Max Non Heap Space                 KB       Long
JVM                Current Non Heap Space Used        KB       Long
JVM                Current Allocated Memory           %        Double
J2EE               Average Response Time              ms       Double
J2EE               Throughput                         ops/s    Double
Docker             CPU Load                           %        Double
Docker             Cgroup Periodicity                 μs       Long
Docker             Cgroup Quota                       μs       Long
Docker             Number of Allocated Threads        #        Integer
Docker             CPU Cores                          #        Integer
Docker             CPU User Time                      ns       Long
Docker             CPU System Time                    ns       Long
Docker             Current Memory Utilization         %        Double
Docker             Current Memory Cache               MB       Long
Docker             Current Memory RSS                 MB       Long
Docker             Total Allocated Memory             MB       Long
Docker             Container architecture             -        String
Docker             Container OS                       -        String
Docker             Container Boot Time                ns       Long
Docker             Ingress Packets per second         pckt/s   Long
Docker             Egress Packets per second          pckt/s   Long
Docker             Ingress KBytes per second          KB/s     Long
Docker             Egress KBytes per second           KB/s     Long

⁵ https://gitlab.com/unicorn-project/uCatascopia-Probe-Repo

Nonetheless, Developers are free to create their own Monitoring Probes and Metrics by adhering to the properties defined in the Monitoring Probe API, which provides a common API interface and abstractions hiding the complexity of the underlying Probe functionality. Figure 4 depicts the implementation of an ExampleProbe which includes the definition of two SimpleMetrics, denoted as Metric1 and Metric2, that periodically report random integer and double values respectively. In this figure we also observe that, to develop a Monitoring Probe, a user must only provide default values for the Probe periodicity and name, a short description of the offered functionality and a concrete implementation of the collect() method which, as denoted by the name, defines how metric values are updated.

Figure 4: Example of a Newly Defined Monitoring Probe
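As the figure itself is not reproduced in this text, the following is a rough sketch of what such a probe could look like, based only on the description above. The `ProbeSketch` base class, its constructor arguments and the `latest()` accessor are assumptions made for illustration and are not the actual Monitoring Probe API; only the names ExampleProbe, Metric1, Metric2 and collect() come from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for the Monitoring Probe base class (not the actual Unicorn API).
abstract class ProbeSketch {
    private final String name;
    private final String description;
    private final int periodSeconds;
    // Latest value per metric name, refreshed by collect().
    protected final Map<String, Number> values = new LinkedHashMap<>();

    protected ProbeSketch(String name, String description, int periodSeconds) {
        this.name = name;
        this.description = description;
        this.periodSeconds = periodSeconds;
    }

    String getName() { return name; }
    String getDescription() { return description; }
    int getPeriodSeconds() { return periodSeconds; }
    Map<String, Number> latest() { return values; }

    // Called by the agent once per period to refresh metric values.
    abstract void collect();
}

class ExampleProbe extends ProbeSketch {
    private final Random random = new Random();

    ExampleProbe() {
        // Default name, short description and periodicity (10 s is an arbitrary choice).
        super("ExampleProbe", "Reports two random values", 10);
    }

    @Override
    void collect() {
        values.put("Metric1", random.nextInt(100));   // random integer in [0, 100)
        values.put("Metric2", random.nextDouble());   // random double in [0.0, 1.0)
    }
}
```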

To reduce the intrusiveness of the monitoring process, the load imposed on the monitoring service, data movement across multiple cloud sites and monitoring costs, which are both noticeable and billable in large-scale and distributed cloud environments, Unicorn Monitoring embraces adaptiveness of the monitoring process. In particular, Unicorn provides low-cost approximate and adaptive monitoring by adopting and extending the algorithmic process proposed in [3], [20] to be suitable for micro-service monitoring through the Adaptive Interface exposed by Unicorn Monitoring Agents. Specifically, users are free to enable adaptiveness of the monitoring process simply by setting the minimum acceptable accuracy, denoted as a confidence metric (e.g., 90%), for a metric stream during the configuration process. Having defined the acceptable accuracy, the respective Monitoring Agent will employ low-cost approximate and adaptive monitoring techniques to adapt at runtime the rate at which metrics are both collected and disseminated to the Monitoring Service. This is achieved by utilizing a low-cost estimation model that captures at runtime the current evolution of the metric data stream; if stable phases in the evolution exist, the collection period is increased to computationally offload the Monitoring Agent. In turn, filtering is applied to reduce metric dissemination when consecutive metric updates do not differ and the user-defined accuracy guarantees hold.
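To make the two mechanisms concrete (stretching the collection period in stable phases, and filtering near-identical updates), a deliberately simplified sketch follows. It is not the algorithm of [3], [20]: the moving-average estimate, the period-doubling rule and the mapping of a 90% confidence to a 10% relative tolerance are all assumptions chosen for brevity, and `AdaptiveSampler` is a hypothetical name.

```java
class AdaptiveSampler {
    private final long minPeriodMs;
    private final long maxPeriodMs;
    private final double tolerance;   // e.g. 0.10 for a 90% accuracy target (assumed mapping)
    private long periodMs;
    private double estimate;          // moving-average estimate of the stream
    private double lastSent;
    private boolean first = true;

    AdaptiveSampler(long minPeriodMs, long maxPeriodMs, double tolerance) {
        this.minPeriodMs = minPeriodMs;
        this.maxPeriodMs = maxPeriodMs;
        this.tolerance = tolerance;
        this.periodMs = minPeriodMs;
    }

    private static double relativeDiff(double a, double b) {
        return Math.abs(a - b) / Math.max(Math.abs(b), 1e-9);
    }

    // Feed one collected sample; returns true when it should be disseminated.
    boolean observe(double value) {
        if (first) {
            first = false;
            estimate = value;
            lastSent = value;
            return true;
        }
        boolean stable = relativeDiff(value, estimate) <= tolerance;
        estimate = 0.5 * estimate + 0.5 * value;
        // Stable phase: back off. Otherwise return to the fastest rate.
        periodMs = stable ? Math.min(maxPeriodMs, periodMs * 2) : minPeriodMs;
        // Filter: suppress updates within tolerance of the last value sent.
        if (relativeDiff(value, lastSent) > tolerance) {
            lastSent = value;
            return true;
        }
        return false;
    }

    long currentPeriodMs() { return periodMs; }
}
```

With a tolerance of 0.10, a stream that hovers around 50 is sampled progressively less often and produces no further updates, while a jump to 100 both resets the period to its minimum and is disseminated immediately.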

The Monitoring and Analysis Service runs alongside the Unicorn Platform as part of the Runtime Enforcement Layer. The Monitoring Service is in charge of receiving, processing and storing monitoring data to the Monitoring Data Store. Monitoring is provided to Unicorn Users as a service (MaaS), thus removing from users the overhead of deploying and maintaining in-house monitoring infrastructure. This allows the monitoring process to be decoupled from cloud provider dependencies, so that monitoring is not disrupted and does not require a significant amount of configuration when a cloud service must span multiple availability zones and/or cloud sites. Although centrally accessible by multiple tenants, through either the Unicorn Dashboard or the Monitoring REST API, the Monitoring and Analysis Service internally receives, processes and stores monitoring data in a distributed fashion. Specifically, the Monitoring Service embraces in situ monitoring to horizontally and elastically scale to multiple instances, denoted as Monitoring Servers, which intercommunicate to monitor the health of the entire Monitoring Service and recover from network faults and/or unexpected downtime.

To ease feature development, testing and code releases by decomposing its functionality, the Monitoring and Analysis Service adheres to the micro-service architectural paradigm. The Monitoring Service is implemented using Java 8 and the Spring Boot framework (v2.0), which simplifies the bootstrapping, configuration and management of the embedded web service and the development of the data abstractions required for accessing and managing the Monitoring Data Store. The Unicorn Monitoring Data Store interface abstracts the implementation of the underlying storage backend, thus supporting flexibility in the selection of the backend implementation of choice. In turn, Spring Boot is utilized for serving the Monitoring Service REST API upon secure token-based authorization. The Monitoring Service REST API implements the CRUD operations for Monitoring Agent and Metric update access, storage and management, with responses encoded in JSON when exchanged over the network and requests mapped to objects when received, using Spring Boot (de-)serialization. Moreover, the Monitoring Service also embraces Spring Cloud for externalized configuration propagation upon Monitoring Server bootstrapping, which also allows for zero downtime and no recompilation when changes must be propagated to the entire tier comprising the Monitoring Service.

In regard to metric stream dissemination, the overlay communication plane established between Monitoring Agents and the Monitoring Service adheres to the open-source JCatascopia Communication Protocol [21], developed to increase autonomicity and fault-tolerance in elastic and distributed monitoring topologies while reducing network traffic and the communication overhead. In brief, this protocol uses a variation of the publish-and-subscribe message communication pattern, where the metric publisher (e.g., Monitoring Agent) initiates the subscription process with the metric consumer (e.g., Monitoring Service), instead of the other way around, as depicted in Figure 5. This significantly reduces the overhead in both establishing and decommissioning a dedicated monitoring stream between the publisher and consumer, by removing the need for an intermediate broker and a directory service tracking the network location of both the Monitoring Agents and the Servers that are part of the Monitoring Service. To utilize this protocol to establish the communication plane between Monitoring Agents and the Monitoring Service in multi-cloud and containerized execution environments, we extend the protocol to support communication over asynchronous HTTP connections via the Monitoring REST API and adapt the metric stream interface to support event-based metric dissemination instead of periodic dissemination, in order to support adaptive monitoring at the metric collection level.

Figure 5: Dynamic Monitoring Agent Discovery
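The publisher-initiated subscription can be illustrated with a small in-memory sketch. It only models the direction of the handshake; the real protocol runs over the network and is specified in [21]. The class names `ServerSketch` and `AgentSketch`, their methods and the string-encoded updates are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Metric consumer: keeps one stream per agent that announced itself.
class ServerSketch {
    private final Map<String, List<String>> streams = new HashMap<>();

    // Invoked by the agent; the server never has to discover agents.
    void subscribe(String agentId) {
        streams.putIfAbsent(agentId, new ArrayList<>());
    }

    // Updates from agents that never subscribed are rejected.
    boolean publish(String agentId, String update) {
        List<String> stream = streams.get(agentId);
        if (stream == null) {
            return false;
        }
        stream.add(update);
        return true;
    }

    // Decommissioning is equally direct: drop the agent's stream.
    void unsubscribe(String agentId) {
        streams.remove(agentId);
    }

    int received(String agentId) {
        List<String> stream = streams.get(agentId);
        return stream == null ? 0 : stream.size();
    }
}

// Metric publisher: only needs to know the server it reports to.
class AgentSketch {
    private final String id;
    private final ServerSketch server;

    AgentSketch(String id, ServerSketch server) {
        this.id = id;
        this.server = server;
        server.subscribe(id);   // the publisher initiates the subscription
    }

    boolean emit(String update) {
        return server.publish(id, update);
    }
}
```

Since the agent announces itself, neither side needs a broker or a directory service to locate the other, which is the saving the protocol description emphasizes.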

Providing a Monitoring Service that is horizontally scalable in order to accommodate a dynamic and large number of both monitored applications and Monitoring Agents is only one part of monitoring scalability. To offer a completely scalable Monitoring Service, metric storage and extraction must be scalable as well, and also capable of handling a dynamic and high volume of traffic, since monitoring metrics are collected and disseminated at the rate of seconds. To accommodate this, the Monitoring Data Store utilizes Cassandra DB, a distributed and scalable NoSQL database backend. The selection of Cassandra DB was prompted by the need to support (i) fast writes for metric producers (Monitoring Agents), and (ii) fast reads on recent monitoring metrics that are requested by metric consumers. Specifically, the database schema developed for the Monitoring Data Store supports fast writes for Monitoring Agents with a stable time complexity. In regard to fast reads, we have developed a high-performance indexing scheme for monitoring data that supports stable-time metric extraction for recent data, while range queries for a particular time window are supported in linear time.

The Analysis Service is responsible for aggregating and compiling at runtime high-level analytic insights from collected monitoring data based on compiled metric rules which adhere to the Metric Model, defined in D2.1. With the Metric Model, users are able to use the Unicorn Dashboard Monitoring View to compose queries that create new high-level metrics and aggregates. For example, one can compile an overall database throughput metric by aggregating both read and write operations per second: dbThroughput = readps + writeps. To reduce the metric extraction overhead even more and to provide real-time and push-based metric updates, the Analysis Service serves real-time streamed monitoring data through a high-performance queueing service to both interested entities and Unicorn components that subscribe via the Monitoring REST API to topics of interest. At this point, we note that further details regarding the Analysis Service will be documented in D5.2, with the Analysis Service implementation being part of the second prototype release of the Unicorn eco-system.
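The dbThroughput rule above can be sketched as a derived metric evaluated over the latest raw values. This is only an illustration of the composition idea; metric rules in Unicorn are expressed with the Metric Model of D2.1, and the `DerivedMetric` and `MetricRuleDemo` classes below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// A derived metric: a name plus a rule evaluated over the latest raw values.
class DerivedMetric {
    final String name;
    private final ToDoubleFunction<Map<String, Double>> rule;

    DerivedMetric(String name, ToDoubleFunction<Map<String, Double>> rule) {
        this.name = name;
        this.rule = rule;
    }

    double evaluate(Map<String, Double> latest) {
        return rule.applyAsDouble(latest);
    }
}

class MetricRuleDemo {
    // dbThroughput = readps + writeps
    static DerivedMetric dbThroughput() {
        return new DerivedMetric("dbThroughput",
                m -> m.get("readps") + m.get("writeps"));
    }

    // Convenience helper building a snapshot of the two raw metrics.
    static Map<String, Double> snapshot(double readps, double writeps) {
        Map<String, Double> latest = new HashMap<>();
        latest.put("readps", readps);
        latest.put("writeps", writeps);
        return latest;
    }
}
```

For a snapshot with 120 reads and 30 writes per second, `MetricRuleDemo.dbThroughput().evaluate(MetricRuleDemo.snapshot(120.0, 30.0))` yields 150.0.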

3.3 Interaction with other Unicorn Services and Components

The Monitoring and Analysis Service interacts with three Unicorn Components at the Platform and user level, while its exposed REST API allows third-party services and developers to access historic and real-time monitoring data, provided that authorization is obtained.

In particular, the Monitoring and Analysis Service interacts with the following Unicorn Components:


• The Decision Making and Auto-Scaling Service: This component utilizes real-time and historic monitoring data to derive whether deployed applications and the underlying virtual and containerized infrastructure must expand or contract in order to meet current demand, achieve targeted performance and efficiently utilize provisioned resources. Historic monitoring data is accessed by this component via the Monitoring Service API, while real-time aggregated and processed data is fed by the Analysis Service to the Decision Making Service via a high-performance queueing schema in order to reduce communication overhead and expose real-time data in a timely manner.

• The Security Enforcement Service: This component utilizes the high-performance indexing scheme of the Distributed Monitoring Data Store, and consequently the Monitoring Service API, to store and access monitoring data. This data is obtained from the interception of cloud application network traffic, both with outside services and in internal communication, to assess and report the risk of attack that applications are exposed to and whether vulnerabilities exist in the configuration of the network.

• The Unicorn Dashboard: The Management Perspective of the Unicorn Dashboard graphically visualizes real-time and historic monitoring data, accessed through the Monitoring Service API, that are of interest to Dashboard users in order to understand their deployed application's behaviour and the performance of the underlying platform. In turn, along with metric graphs, Dashboard users also receive notifications of alerts placed to report when certain conditions are violated (e.g., a particular metric exceeds a certain threshold). Moreover, through the Unicorn Dashboard, users can formulate (continuous) monitoring queries to access and trawl aggregated and processed monitoring data fed from the Analysis Service.


4 Decision Making and Auto-Scaling Service

In this Section, we present a comprehensive report introducing the reference architecture, exposed functionality and implementation details of the Decision Making and Auto-Scaling Service.

4.1 Requirements and Exposed Functionality

The following user roles, identified in D1.1 [22], are related to the Decision Making and Auto-Scaling Service.

• Cloud Application Owner: Follows an optimization strategy concerning the runtime execution of his/her cloud application that is aligned with the business aspects of the application, such as quality, performance, and cost.

• Cloud Application Administrator: Defines the elasticity policies required to realize the optimization strategy followed by the Cloud Application Owner. These policies can be defined both at runtime using the Service Graph and at design time using the Elasticity Library.

• Cloud Application Developer: Defines the elasticity policies at design time via the Elasticity Library.

• Unicorn Developer: Designs the Elasticity Library and extends the interface of the Elasticity Enabler to support different reactive and proactive algorithms for optimizing the elasticity policies.

• Cloud Provider: Provides cloud offerings in the form of programmable infrastructure and hosts UNICORN-compliant cloud applications.

4.1.1 Functional Requirements

To satisfy Use-Case UC.9 (Adapt Deployed Cloud Applications in real time), documented in D1.2 [23], while adhering to system requirement FR.9 (Autonomic management of deployed cloud applications and real-time adaptation based on intelligent decision-making mechanisms), documented in D1.1 [22], the following functionality must be exposed by the Decision Making and Auto-Scaling Service.

ID FR.DM.1

Title Define and manage elasticity policies for cost, quality and performance optimization of a UNICORN-enabled application

Description The Decision Making & Auto-Scaling service should offer the Cloud Application Administrator & Developer the ability to express an optimization strategy for the performance and quality of the application in response to application demand (workload), while also acknowledging any budget constraints. The elasticity policies should be defined and managed both at design time and during runtime, with syntactic and semantic validation.

Exposed Functionality The Monitoring & Elasticity Design Library allows the user to express his/her scaling requirements via Elasticity Policies. An Elasticity Policy contains a set of scaling configurations that are carried out when a scaling alert is issued. A scaling alert is triggered when a set of conditions is satisfied. The user can specify scaling alerts based on application insights (e.g., latency, throughput), resource runtime information (e.g., number of running services) and cost constraints (e.g., <20 credits/hour). Through the metric definition, the user can annotate whether a higher value of a custom metric is better or not, which is important for the Decision Making & Auto-Scaling service in the optimization process. The Elasticity policies can be created and managed via the Service Graph both at design time and at runtime by the Unicorn Application Administrator. Also, at design time the Unicorn Developer can create elasticity policies within the scope of a service using elasticity annotations from the Monitoring & Elasticity Library. Finally, the elasticity policies are validated both syntactically and semantically at design time and during runtime by the Elasticity Validation module.

ID FR.DM.2

Title Autonomous runtime monitoring & enforcement of Elasticity policies

Description The Decision Making & Auto-Scaling service should constantly and autonomously monitor the elasticity policies at runtime for any violations. When an elasticity policy is triggered, the Auto-Scaling service should perform the scaling action specified by the policy.

Exposed Functionality The Decision Making & Auto-Scaling service offers the functionality to autonomously monitor and react to scaling alerts specified by the elasticity policies, through the Elasticity Controller. The Elasticity Controller, as part of the MAPE-K loop (Monitor-Analyse-Plan-Execute over shared Knowledge), observes the application behaviour at regular time intervals through the application and infrastructure metrics specified in the elasticity conditions. When the conditions of an elasticity policy are satisfied, the Auto-Scaling service provides the Resource Manager with the scaling re-configurations specified by the elasticity action of the policy.

ID FR.DM.3

Title Resource-Aware & Transparent Multi-Cloud Elasticity Control

Description The Decision Making & Auto-Scaling service should provide a transparent scaling mechanism over multiple public cloud providers and private deployments. This means that there shouldn't be any additional effort and configuration on the user side for scaling an application to multiple cloud providers. Also, the different offerings, price schemes and resource heterogeneity of subscribed cloud providers should be acknowledged by the Decision Making & Auto-Scaling service for optimizing the cost, quality and performance of an application.

Exposed Functionality The Monitoring & Elasticity Design Library allows the Cloud Application Administrator & Developer to specify elasticity policies that concern different cloud providers, zones or regions, by using the notion of a cluster. A cluster is a group of resources in the same network; therefore, a service can be placed on multiple clusters that can be located in different geographical locations or zones. Infrastructural resources are transparent to the user as they are handled by the Decision Making & Auto-Scaling service.

ID FR.DM.4

Title Continuous assessment of elasticity policies and adaptation

Description In order to provide optimal or near-optimal scaling decisions, the Decision Making & Auto-Scaling service should constantly assess the effectiveness of scaling configurations. Specifically, during the application runtime, it should evaluate the effects of the elasticity policies on the performance, quality and cost of the application and adjust them accordingly.

Exposed Functionality The Decision Making & Auto-Scaling service offers the functionality to assess the effectiveness of the elasticity policies through the Elasticity Enabler component. This component, based on historical monitoring data, profiles the application and constructs new or modifies existing elasticity policies to adapt to any changes in the application behaviour or workload, in order to improve the elasticity control. This modification includes the adjustment of parameters such as cooldown and warmup periods, thresholds and time windows. The Elasticity Enabler component is highly extensible, allowing the adoption of different reactive and proactive algorithms by the Unicorn Developer for improving elasticity control.

4.1.2 Non-Functional Requirements

ID NFR.DM.1

Title Robustness

Description The Decision Making & Auto-Scaling service must cope with any potential errors from unexpected inputs and faults during execution, while also continuing to work as usual after an interruption from unexpected crashes by restoring its last valid state.


ID NFR.DM.2

Title High-Availability

Description The Decision Making & Auto-Scaling service should remain highly available in cases of increased load, while also minimizing downtime by adding redundancy to critical components to avoid single points of failure.

ID NFR.DM.3

Title Near Real-time Adaptability

Description The Decision Making & Auto-Scaling service must be able to adapt itself to the demands of the workload and the requirements of the various optimization algorithms without degrading its performance, by adding or removing the necessary components to remain functional and produce scaling decisions in real time or at least near real time.

4.2 Reference Architecture and Implementation

To address the preceding challenges and support the documented requirements, Unicorn introduces the Decision Making & Auto-Scaling service for adapting applications deployed on multiple cloud providers based on user optimization strategies. A high-level reference architecture diagram of the Decision Making & Auto-Scaling service is depicted in Figure 6. The component is part of the UNICORN platform and communicates with the Monitoring and Analysis service and the Resource Manager.

The communication with the Monitoring and Analysis service is bidirectional, as the component subscribes to relevant metric streams and also fetches historical data to support the decision-making process of the Elasticity Controller. The communication with the Resource Manager is also bidirectional. The inbound interface is used for collecting runtime information (e.g., running instances) and the elasticity capabilities. These capabilities denote the available scaling actions with their associated cost information, provided by the cloud offerings (e.g., add vm-small $0.001/hour). The outbound interface of the Decision Making & Auto-Scaling service is used to provide the Resource Manager with the scaling actions for an application that are issued by the Elasticity Controller.

The Decision Making & Auto-Scaling service is composed of two main components. The first component is the Elasticity Manager, which is responsible for storing, fetching and modifying elasticity policies for multiple deployments. The second component is the Elasticity Controller, which is instantiated when an application is deployed and whose purpose is to enforce the elasticity policies of the application, retrieved from the Elasticity Manager.


Figure 6: Decision Making & Auto-Scaling Service Reference Architecture

Elasticity Policy is an important concept of the Decision-Making & Auto-Scaling service, thus to betterunderstand,anexampleofsuchpolicyisprovidedinFigure7.Inthisexample,thepolicyspecifiesthatonemoreserviceoftypesvc-s(streamingservice)shouldbeaddedwhenthereisi)asufficientincrease(10requests/secinthatlast5minutes)onthenumberofrequestsintheEuropeanregion,ii)theaggregatedcostoftheseservicesdonotexceedthebudgetconstraint(1.5credits/hour),iii)andalsotherearelessthan10servicesrunning.Thishigh-levelpolicyfollowstheIF-THEN-ACTIONrule-basedapproachandiswell-definedinEBNFlanguage[24]anddescribedthoroughlyinDeliverable2.1[25].

Figure7:ElasticityPolicyExample
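The three conditions of the example policy can be sketched as a single predicate. This is only an illustration in Python, not the Unicorn Elasticity Language: the names `RuntimeSnapshot` and `should_add_streaming_service` are hypothetical, and reading "sufficient increase" as "at least 10 requests/sec" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RuntimeSnapshot:
    """Hypothetical view of the runtime facts the example policy refers to."""
    request_rate_increase: float  # requests/sec gained over the last 5 minutes (EU region)
    aggregated_cost: float        # credits/hour spent on the svc-s services
    running_services: int         # number of svc-s services currently running


def should_add_streaming_service(snapshot, min_increase=10.0, budget=1.5, max_services=10):
    """IF all three conditions hold THEN the action 'add one svc-s' applies."""
    return (
        snapshot.request_rate_increase >= min_increase  # (i) sufficient increase (assumed >=)
        and snapshot.aggregated_cost <= budget          # (ii) budget not exceeded
        and snapshot.running_services < max_services    # (iii) fewer than 10 services
    )
```

The policy fires only when all conditions hold together, so a single violated constraint (for example, the budget) is enough to suppress the scaling action.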


The Decision Making & Auto-Scaling service obtains the Elasticity Policy, performs the necessary validations and enforces the specified actions at runtime. Figure 8 shows all the possible states of an Elasticity Policy. When loaded, the policy remains inactive until it is validated. If it is successfully validated, it can be activated for runtime enforcement. When the conditions of the policy are all satisfied, the policy is triggered and remains in this state until the action is completed (e.g., a service has been added to eu-zone-1), after which it goes back to the active state. The policy can be modified or deleted at runtime only when it is in an inactive state.

Figure 8: Elasticity Policy State Diagram
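The lifecycle described above can be sketched as a small state machine. The sketch covers only the transitions stated in the text; the class and method names are hypothetical, and any transition shown in Figure 8 but not mentioned in the prose (such as deactivation) is deliberately left out.

```python
from enum import Enum


class State(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"
    TRIGGERED = "triggered"


class PolicyLifecycle:
    """Hypothetical model of the policy states described in the text."""

    def __init__(self):
        self.state = State.INACTIVE  # a freshly loaded policy is inactive
        self.validated = False

    def _require(self, expected):
        if self.state is not expected:
            raise ValueError(f"expected state {expected.value}, got {self.state.value}")

    def validate(self, ok):
        self._require(State.INACTIVE)
        self.validated = ok

    def activate(self):
        self._require(State.INACTIVE)
        if not self.validated:
            raise ValueError("policy must be validated before activation")
        self.state = State.ACTIVE

    def trigger(self):
        # All conditions of the policy are satisfied.
        self._require(State.ACTIVE)
        self.state = State.TRIGGERED

    def complete_action(self):
        # The scaling action finished, so the policy returns to the active state.
        self._require(State.TRIGGERED)
        self.state = State.ACTIVE

    def can_modify(self):
        # Modification and deletion are allowed only while inactive.
        return self.state is State.INACTIVE
```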

4.2.1 Elasticity Manager

The reference architecture of the Elasticity Manager component is presented in Figure 9. This component allows the UNICORN Policies Manager to create, modify and remove elasticity policies of a UNICORN-enabled application. This is achieved through its exposed API, which is described extensively in Section 7.2.

Figure 9: Elasticity Manager

When an elasticity policy is created or modified, it goes through the Validation module. This module is responsible for analysing and validating the elasticity policy both syntactically and semantically. This is performed using the formal schema definition of UNICORN's Elasticity Language, described in Deliverable 2.1 [25]. If the validation fails, a descriptive error message is returned to the Unicorn Policies Manager. When the policy is successfully validated, it is stored in the Policies Repository of the Elasticity Manager. The responsibility of the Policies Repository is to store the elasticity policies along with useful information about their state (e.g., active, triggered, etc.). Concerning the implementation, the Policies Repository is implemented using Couchbase Server 5.1.0 Community Edition [26], as it is an open source NoSQL database that offers high availability and scaling capabilities.

The second component that communicates with the Elasticity Manager is the Elasticity Controller. The Elasticity Manager allows the Elasticity Controller to retrieve the elasticity policies of its deployment, and to create new policies or modify existing ones. When elasticity policies are created or modified after the initial deployment of the application, the Elasticity Controller of the particular deployment retrieves the new or modified policies. This is achieved by subscribing to the Elasticity Manager through its API.

The Elasticity Manager is implemented in Java 8 and uses the Spring Boot 2.0 [27] framework for implementing the web server. Spring Boot is a micro-framework that simplifies the bootstrapping and development of Spring web applications and supports embeddable servers such as Tomcat [28] or Jetty [29]. The Elasticity Manager follows the microservices architectural pattern, exposing a secure token-based RESTful API. The API implements the common CRUD operations for an Elasticity Policy through the HTTP methods (GET, POST, etc.). An instance of an Elasticity Policy is encoded in its JSON representation when it is exchanged over the network and is mapped back to Java classes using Spring Boot's serialization and de-serialization functionality. The syntax validation of an Elasticity Policy is performed during the serialization process. If serialization fails, the corresponding error message is returned by the Validation module. The semantic validation is performed using the schema definition of UNICORN's Elasticity Language. It should be noted that in the final release the schema will be enriched and will probably change; the Validation module is therefore developed in such a way as to support multiple versions of the schema.
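The two validation stages and the support for several schema versions can be illustrated with a short sketch. It is written in Python for brevity rather than with the Java/Spring Boot stack named above, and the schema contents, the `schemaVersion` field and the required field names are all hypothetical placeholders for the actual Elasticity Language schema.

```python
import json

# Hypothetical schema versions: each lists the top-level fields a policy must carry.
SCHEMAS = {
    "1": {"name", "conditions", "action"},
    "2": {"name", "conditions", "action", "priority"},
}


def validate_policy(raw):
    """Return (ok, message) after a syntactic and then a semantic check."""
    # Syntactic validation: the payload must de-serialize into a JSON object.
    try:
        policy = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"syntax error: {exc.msg}"
    if not isinstance(policy, dict):
        return False, "syntax error: policy must be a JSON object"

    # Semantic validation: pick the schema version the policy declares.
    version = str(policy.get("schemaVersion", "1"))
    required = SCHEMAS.get(version)
    if required is None:
        return False, f"unsupported schema version: {version}"
    missing = sorted(required - set(policy))
    if missing:
        return False, "semantic error: missing " + ", ".join(missing)
    return True, "ok"
```

Keeping the schemas in a version-keyed table means a newer schema can be added without touching the validation logic for policies written against an older one.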

4.2.2 Elasticity Controller

The Elasticity Controller reference architecture is depicted in Figure 10. This component contains the decision-making process for the scaling capabilities and adaptation of a UNICORN-enabled cloud application. In the diagram, the green arrows show the components that are directly related to the flow of an elasticity policy. The blue arrows show the components that are part of the MAPE-K control loop, describing how a policy is enforced, and the red arrows show important configuration actions between the components.


Figure 10: Elasticity Controller Reference Architecture

Concerning the implementation, the derived architecture is implemented following the microservices approach, with the main sub-components developed as independent microservices. In the following paragraphs, the functionality of each component and details of the early implementation are provided. For better understanding, the description follows the flow of an elasticity policy (green arrows), from its creation until its enforcement.

Policy Loader

When an application is deployed, the Elasticity Controller is instantiated. The first step is to load the elasticity policies of its current deployment. For this, the Policy Loader component is responsible for fetching and loading the policies from the Elasticity Manager through its API. The Unicorn Application Administrator can create new policies or modify existing ones during runtime. The Policy Loader therefore uses a pub/sub subscription mechanism to communicate with the Elasticity Manager and retrieve any updates to the policies.
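A minimal in-process sketch of this load-then-subscribe behaviour is given below. The document does not specify the transport or message format, so the `PolicyUpdateBus` class, the callback signature and the policy `id` field are assumptions made purely for illustration.

```python
class PolicyUpdateBus:
    """Hypothetical stand-in for the Elasticity Manager's pub/sub endpoint."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, app_id, callback):
        self._subscribers.setdefault(app_id, []).append(callback)

    def publish(self, app_id, policy):
        # Deliver the new or modified policy to every subscriber of this deployment.
        for callback in self._subscribers.get(app_id, []):
            callback(policy)


class PolicyLoader:
    """Loads the initial policies, then keeps them current through the subscription."""

    def __init__(self, app_id, bus, initial_policies):
        self.policies = {p["id"]: p for p in initial_policies}
        bus.subscribe(app_id, self._on_update)

    def _on_update(self, policy):
        # A created policy is added; a modified one replaces the stored copy.
        self.policies[policy["id"]] = policy
```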

Runtime Validation

The next task after loading the policy is to perform a runtime validation to check whether the references to metrics, services and scaling actions specified by the policy are available. This is the responsibility of the Runtime Validation component. The Monitoring and Analysis service is used to verify the existence of a metric (analytic insight) that is specified in the elasticity conditions of the policy, while the Resource Manager component is used for retrieving the elasticity capabilities to check whether the specified scaling actions, types of services and resources are available. If the validation fails and references cannot be found, an error message is propagated back to the Elasticity Manager and the policy becomes inactive.
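The referential check can be sketched as follows, assuming the known metrics and the elasticity capabilities have already been fetched. The dictionary keys (`metrics`, `action`, `service_type`, `actions`, `service_types`) are hypothetical and stand in for whatever structure the two services actually return.

```python
def runtime_validate(policy, known_metrics, capabilities):
    """Return a list of error messages; an empty list means the policy is valid."""
    errors = []
    # Metrics must exist in the Monitoring and Analysis service.
    for metric in policy["metrics"]:
        if metric not in known_metrics:
            errors.append(f"unknown metric: {metric}")
    # Scaling actions and service types must be offered by the Resource Manager.
    if policy["action"] not in capabilities.get("actions", ()):
        errors.append(f"unavailable scaling action: {policy['action']}")
    if policy["service_type"] not in capabilities.get("service_types", ()):
        errors.append(f"unknown service type: {policy['service_type']}")
    return errors
```

A non-empty result corresponds to the failure case in the text: the errors would be propagated back to the Elasticity Manager and the policy would stay inactive.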

Elasticity Enabler

After the runtime validation, the elasticity policy goes through the Elasticity Enabler. The first task of this component is to activate the necessary runtime information listeners to obtain the real-time information specified by the elasticity policy by creating (i) metric streams between the Analysis Service and the Elasticity Runtime Information component and (ii) runtime information streams between the Resource Manager and the Elasticity Runtime Information component. The main functionalities of the Elasticity Enabler are (i) to assess the performance and effectiveness of the elasticity policies, (ii) to create new policies and (iii) to modify existing ones, so as to continuously improve the elasticity control. For this purpose, it provides a generic interface for the implementation of several semi-supervised decision algorithms by the UNICORN developer, based on historical monitoring data of the application and optimization strategies.

Policy Translator

From the Elasticity Enabler, the elasticity policy is then forwarded to the Policy Translator component. This module takes as input an elasticity policy and performs the necessary transformation to produce rules that are fed to the Expert System component. Concerning its implementation, this component translates the policies to the Drools Rule Language (DRL) [30].
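To give a feel for the translation step, the sketch below renders a policy as DRL text using the standard `rule ... when ... then ... end` layout. It is not the Unicorn translator: the fact types (`RequestRate`, `Cost`, `ServiceCount`), their fields and the `scaler.add(...)` consequence are hypothetical, and the real mapping from the Elasticity Language to DRL is not described in this document.

```python
import json


def to_drl(name, priority, conditions, action):
    """Render one IF-THEN policy as DRL text.

    conditions: list of (fact_type, field, operator, value) tuples.
    action: one line of consequence code, copied verbatim into the 'then' part.
    """
    # json.dumps gives DRL-compatible literals: bare numbers, double-quoted strings.
    patterns = [
        f"    {fact}( {field} {op} {json.dumps(value)} )"
        for fact, field, op, value in conditions
    ]
    lines = [
        f"rule {json.dumps(name)}",
        f"    salience {priority}",  # salience is DRL's rule priority attribute
        "when",
        *patterns,
        "then",
        f"    {action}",
        "end",
    ]
    return "\n".join(lines)


example = to_drl(
    "add-svc-s",
    5,
    [
        ("RequestRate", "increase", ">=", 10),
        ("Cost", "creditsPerHour", "<=", 1.5),
        ("ServiceCount", "running", "<", 10),
    ],
    'scaler.add("svc-s");',
)
```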

Expert System

The Expert System is responsible for the enforcement of the scaling rules. These rules are stored in the Production Memory. A rule has two parts: the precondition (IF) and the action (THEN). The Inference Engine matches the facts from the Working Memory against the production rules to infer conclusions which result in scaling actions that are sent out to the Resource Manager. In the case where multiple rules are simultaneously satisfied, a conflict resolution strategy (e.g., based on priority) is used to manage the execution order of these conflicting rules. Concerning the implementation of this component, UNICORN uses the Drools Business Rules Engine [31] for policy enforcement, which supports reasoning and conflict resolution over the provided set of facts and rules as well as triggering of the appropriate actions.
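The match-then-resolve cycle can be illustrated with a toy inference step. This is a simplified sketch, not how Drools works internally: the `Rule` class, the fact dictionary and the priority-based ordering are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    priority: int                                  # higher value fires first
    condition: Callable[[Dict[str, float]], bool]  # the IF part
    action: str                                    # the THEN part (a scaling action)


def infer(production_memory: List[Rule], working_memory: Dict[str, float]) -> List[str]:
    """Match facts against rules and order the satisfied rules by priority."""
    matched = [rule for rule in production_memory if rule.condition(working_memory)]
    # Conflict resolution: when several rules are satisfied, run by descending priority.
    matched.sort(key=lambda rule: rule.priority, reverse=True)
    return [rule.action for rule in matched]
```

The returned list is the ordered set of scaling actions that would be sent to the Resource Manager for the current facts.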

Elasticity Runtime Information

The Elasticity Runtime Information (ERI) component exposes two inbound interfaces for collecting real-time information from the Resource Manager and the Analysis service. The Resource Manager feeds the ERI component with runtime information about the services and underlying infrastructure resources, while the Monitoring & Analysis service feeds the ERI component with aggregated metric information (analytic insights) about services and underlying infrastructure resources. The runtime metrics and information are translated to facts and fed to the Working Memory of the Expert System. In addition, the runtime metrics and information are fed to the Elasticity Enabler for analysis purposes and policy assessment.


4.3 Interaction with other Unicorn Services and Components

The Decision-Making and Auto-Scaling service interacts with the following UNICORN Platform components:

• Monitoring & Analytics service: This service is used mainly for retrieving real-time monitoring data (analytic insights) via its high-performance streaming queue and for collecting aggregated historic monitoring data of the application and the underlying virtual and containerized infrastructure. It is also used in the elasticity policy validation process to check the referential integrity of the elasticity metrics.

• Resource Manager: Used for collecting the available elasticity capabilities in order to validate the elasticity policies, and for analysing the offerings and resource heterogeneity of subscribed cloud providers in order to optimize the cost, quality and performance of an application. It is also used to retrieve the runtime information of the application's services and resources (e.g., number of service instances).


5 Conclusions

The scope of this deliverable was to provide a comprehensive overview and documentation report of the early release of the Unicorn Governance Mechanisms, which contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform. In particular, D5.1 provides a clear overview of the early design and development of the three components comprising the Unicorn Governance Mechanisms that are developed under the umbrella of WP5: (i) the Monitoring and Analysis Service; (ii) the Execution Environment Agent⁶; and (iii) the Decision-Making and Auto-Scaling Service. The Monitoring and Analysis Service and the Execution Environment Agent represent WP5 Tasks 4.1 and 4.2 respectively, and started as planned in Month 7 of the Unicorn Project with no deviations or alterations to the documented work plan. In turn, the Decision-Making and Auto-Scaling Service represents WP5 Task 4.3, which started as planned in Month 10 with no deviations or alterations to the documented work plan. For each of these components we have documented the requirements that must be satisfied to overcome the challenges introduced when monitoring, managing and scaling multi-cloud containerized execution environments, their functionalities, how they operate, and the API used to interact with other Unicorn components, users and/or third-party services.

Specifically, for the Monitoring and Analysis service, six functional requirements and three non-functional requirements were identified. These requirements, along with their exposed functionality, led to the development of a complete, interoperable, scalable and real-time cloud monitoring stack for automating the monitoring of cloud applications deployed through containerized execution environments. In the first period of the project, the focus was on: (i) designing and implementing the foundational mechanisms of the Monitoring Agent for non-intrusive metric extraction in immutable containerized environments deployed on heterogeneous multi-cloud offerings; and (ii) designing and implementing the first version of the Monitoring and Analysis Service, with emphasis on creating the metric model and data abstractions for effective and scalable metric storage and extraction. As the project progresses, effort will be devoted to: (i) increasing the number of Monitoring Probes available in the Unicorn Probe Repository; (ii) implementing the scalable Analysis Service for metric compilation, aggregation and grouping, which will serve real-time metric updates through subscription topics extracted from a high-performance queueing service; and (iii) introducing auto-configuration to the Monitoring Service tier to allow feature introduction and reconfiguration through the Unicorn CI/CD cycle without downtime.

For the Decision Making and Auto-Scaling Service, the exposed functionality of four functional requirements and three non-functional requirements led to the development of a language and a runtime environment capable of adapting a cloud application based on semi-supervised decision algorithms for the optimal placement of virtual machines and containers across multiple availability zones and/or cloud sites, while accounting for the heterogeneity among cloud providers and their capabilities and adhering to high-level policy constraints defined by the Application Administrator. In the first period of the project, the focus was on: (i) creating the necessary model and data abstractions to handle the functionality exposed by the scaling mechanisms based on the user's optimization strategies expressed via the elasticity policies; and (ii) designing and implementing the first version of the Decision-Making & Auto-Scaling Service, with emphasis on deriving an architecture that is highly extensible, allowing the development of algorithms for improving the elasticity control while also allowing existing state-of-the-art reactive scaling mechanisms to benefit from these techniques. As the project progresses, effort will be devoted to: (i) improving the Decision-Making process by developing optimization algorithms (reactive and proactive) for virtualized and containerized resource placement based on the monitored behaviour of the cloud application and the user's policies; and (ii) enriching the Elasticity Library with pre-defined optimization strategies for improving quality, cost and performance, avoiding the process of defining and finding optimal low-level scaling policies for the application.

⁶ Denoted throughout the document as the Monitoring Agent

Finally, the forthcoming Deliverable 5.2 – Unicorn Governance Mechanisms will document the final version of the Monitoring and Analysis Service and the Decision-Making and Auto-Scaling Service. That work will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Governance Mechanisms.


6 References

[1] Open Container Specification - Linux Foundation, “https://www.opencontainers.org/about”.

[2] D. Trihinas, Z. Georgiou, G. Pallis, and M. D. Dikaiakos, “Improving Rule-Based Elasticity Control by Adapting the Sensitivity of the Auto-Scaling Decision Timeframe,” in Third International Workshop on Algorithmic Aspects of Cloud Computing (ALGOCLOUD 2017), in conjunction with the ALGO 2017 Conference, 2017.

[3] D. Trihinas, G. Pallis, and M. Dikaiakos, “ADMin: Adaptive Monitoring Dissemination for the Internet of Things,” in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications (INFOCOM 2017), 2017.

[4] R. Morabito, V. Cozzolino, A. Y. Ding, N. Beijar, and J. Ott, “Consolidate IoT Edge Computing with Lightweight Virtualization,” IEEE Netw., vol. 32, no. 1, pp. 102–111, Jan. 2018.

[5] Docker Stats, “https://docs.docker.com/engine/reference/commandline/stats/”.

[6] cAdvisor, “https://github.com/google/cadvisor”.

[7] J. M. Alcaraz Calero and J. Gutierrez Aguado, “MonPaaS: An Adaptive Monitoring Platform as a Service for Cloud Computing Infrastructures and Services,” IEEE Trans. Serv. Comput., pp. 1–1, 2014.

[8] Tower4Clouds Multi-Cloud Monitoring, “http://deib-polimi.github.io/tower4clouds/”.

[9] D. Trihinas, G. Pallis, and M. D. Dikaiakos, “Monitoring Elastically Adaptive Multi-Cloud Services,” IEEE Trans. Cloud Comput., vol. 4, 2016.

[10] M. R. López and J. Spillner, “Towards Quantifiable Boundaries for Elastic Horizontal Scaling of Microservices,” in Companion Proceedings of the 10th International Conference on Utility and Cloud Computing, 2017, pp. 35–40.

[11] J. Thones, “Microservices,” IEEE Softw., vol. 32, no. 1, p. 116, Jan. 2015.

[12] AppDynamics, “https://www.appdynamics.com/”.

[13] DataDog, “https://www.datadoghq.com”.

[14] S. Meng and L. Liu, “Enhanced monitoring-as-a-service for effective cloud management,” IEEE Trans. Comput., vol. 62, no. 9, pp. 1705–1720, 2013.

[15] Amazon’s Auto Scaling, “https://aws.amazon.com/ec2/autoscaling”.

[16] Microsoft Azure Auto Scaling, “https://azure.microsoft.com/en-us/features/autoscale/”.

[17] Google Cloud Autoscaler, “https://cloud.google.com/compute/docs/autoscaler/”.

[18] AWS Target-tracking scaling, “https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-target-tracking.html”.

[19] J. Thalheim, A. Rodrigues, I. E. Akkus, P. Bhatotia, R. Chen, B. Viswanath, L. Jiao, and C. Fetzer, “Sieve: Actionable Insights from Monitored Metrics in Microservices,” 2017.

[20] D. Trihinas, G. Pallis, and M. D. Dikaiakos, “AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices,” in IEEE International Conference on Big Data, 2015.

[21] D. Trihinas, G. Pallis, and M. D. Dikaiakos, “Monitoring Elastically Adaptive Multi-Cloud Services,” IEEE Trans. Cloud Comput., vol. 4, no. X, pp. 1–14, 2016.

[22] Unicorn, “Unicorn Deliverable D1.1 Stakeholders Requirements Analysis,” 2017.

[23] Unicorn, “Unicorn Reference Architecture Deliverable 1.2,” 2017.

[24] EBNF, “https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form”.

[25] Unicorn, “Deliverable 2.1 - UNICORN Libraries, IDE Plugin, Container Packaging and Deployment Toolset Early Release,” 2018.

[26] Couchbase, “https://www.couchbase.com/”.

[27] Spring Boot, “https://projects.spring.io/spring-boot/”.

[28] Apache Tomcat, “http://tomcat.apache.org/”.

[29] Jetty, “https://www.eclipse.org/jetty/”.

[30] Drools Rule Language, “https://docs.jboss.org/drools/release/5.2.0.Final/drools-expert-docs/html/ch05.html”.

[31] Drools Business Rules Engine, “https://www.drools.org/”.


7 Appendix

7.1 Monitoring and Analysis Service API Documentation

The Unicorn Monitoring API is a REST API, documented in what follows, that comprises the following resources:

• Monitoring API Keys
• Monitored Applications
• Monitoring Agents comprising a Monitored Application
• Monitoring Metrics collected by a Monitoring Agent

To reduce repetition, we note that ALL requests must be issued with the Monitoring API key authorization header. If this header is missing or invalid, an UNAUTHORIZED response is returned. When requesting a Monitoring API Key for the first time, the user must perform this request through the Unicorn Dashboard.
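As a rough illustration of attaching the key to every call, the sketch below builds (but does not send) a request with Python's standard library. The document does not name the header or the host, so the header name `X-API-Key` and the base URL are assumptions, and the body reuses the sample payload of the CREATE Monitored Application call.

```python
import json
import urllib.request

BASE_URL = "https://monitoring.example.org"  # hypothetical host
API_KEY_HEADER = "X-API-Key"                 # assumed header name, not given in the document


def build_request(method, path, api_key, body=None):
    """Build a request carrying the API key header; sending it is left to the caller."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {API_KEY_HEADER: api_key, "Content-Type": "application/json"}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


request = build_request(
    "POST",
    "/apps",
    "27ff56dac859400bac07e30f50f1f0d0",
    {"name": "my-app-1", "availability_zone": "eu"},
)
```

Passing the built object to `urllib.request.urlopen` would issue the call. A missing or invalid key is expected to yield the 401 – UNAUTHORIZED status listed in the tables.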

7.1.1 Monitoring API Keys

Table 3: CREATE Monitoring API Key Context

Description CREATE API Key which provides authorized access to Unicorn Monitoring

URI /apikey

Method POST

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN

Sample Request POST /apikey

Request Body { "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "eu" }

Response Body { "apiKey": "27ff56dac859400bac07e30f50f1f0d0" }

Response Code 201 – CREATED

Table 4: DELETE Monitoring API Key Context

Description REVOKE API Key which provides authorized access to Unicorn Monitoring

URI /apikey/{apikey}

Method DELETE

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request DELETE /apikey/27ff56dac859400bac07e30f50f1f0d0

Request Body -

Response Body -

Response Code 200 – OK

Table 5: GET Monitoring API Key Context

Description GET API Key associated user metadata

URI /apikey/{apikey}

Method GET

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apikey/27ff56dac859400bac07e30f50f1f0d0

Request Body -

Response Body { "created_at": "1521458234", "last_modified": "1521458234", "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "eu" }

Response Code 200 – OK

Table 6: UPDATE Monitoring API Key Context

Description UPDATE API Key associated user metadata

URI /apikey/{apikey}

Method PUT

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request PUT /apikey/27ff56dac859400bac07e30f50f1f0d0

Request Body { "availability_zone": "us-east" }

Response Body { "created_at": "1521458234", "last_modified": "15214598351", "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "us-east" }

Response Code 200 – OK

7.1.2 Monitored Applications

Table 7: CREATE Monitored Application Context

Description CREATE Monitored Application context

URI /apps

Method POST

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN

Sample Request POST /apps

Request Body { "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ] }

Response Body { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "created_at": "1521459234", "last_modified": "1521459234" }

Response Code 201 – CREATED


Table 8: GET Monitored Application Contexts Associated with Monitoring Key

Description GET Monitored Application contexts associated with provided Monitoring Key

URI /apps

Method GET

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/

Request Body -

Response Body [ { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }, { . . . } ]

Response Code 200 – OK

Table 9: GET Monitored Application Context

Description GET Monitored Application context

URI /apps/{appID}

Method GET

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7

Request Body -

Response Body { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }

Response Code 200 – OK

Table 10: DELETE Monitored Application Context

Description DELETE Monitored Application context

URI /apps/{appID}

Method DELETE

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7

Request Body -

Response Body -

Response Code 200 – OK

Table 11: UPDATE Monitored Application Context

Description UPDATE Monitored Application context

URI /apps/{appID}

Method PUT

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request PUT /apps/95a01dfe4667414f9336b7d7495cc7a7

Request Body { "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ] }

Response Body { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521467357" }

Response Code 200 – OK

7.1.3 Monitoring Agents

Table 12: CREATE Monitoring Agent Context

Description CREATE Monitoring Agent context

URI /apps/{appID}/agents

Method POST

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN

Sample Request POST /apps/95a01dfe4667414f9336b7d7495cc7a7/agents

Request Body { "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "name": "cpu", "unit": "%", "type": "double" }, { "name": "memory", "unit": "%", "type": "double" } ] }

Response Body { "agentID": "70910177071c4bc68419fb63e72b7cbc", "created_at": "1521459234", "last_modified": "1521459234" }

Response Code 201 – CREATED

Table 13: GET Monitoring Agent Contexts Associated with Monitored Application

Description GET Monitoring Agent contexts associated with Monitored Application

URI /apps/{appID}/agents

Method GET

Parameters status={UP, DOWN, TERMINATED}, tag

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents?status=UP

Request Body -

Response Body [ { "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" } ], "created_at": "1521459234", "last_modified": "1521459234", "status": "UP" }, { "agentID": "b8a8804c4a927458da5e5570707f9b54c2a3", "name": "my-agent-42", "host": "aws-m3-small-linux-1", "tags": [ "data-miner", "python3" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:throughput", "name": "throughput", "unit": "ops/s", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" } ], "created_at": "15218321343", "last_modified": "15218362356", "status": "UP" }, { . . . } ]

Response Code 200 – OK

Table 14: GET Monitoring Agent Context

Description GET Monitoring Agent context

URI /apps/{appID}/agents/{agentID}

Method GET

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc

Request Body -

Response Body { "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "created_at": "1521459234", "last_modified": "1521459234", "status": "UP" }

Response Code 200 – OK

Table 15: DELETE Monitoring Agent Context

Description DELETE Monitoring Agent context

URI /apps/{appID}/agents/{agentID}

Method DELETE

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc

Request Body -

Response Body -

Response Code 200 – OK

Table 16: UPDATE Monitoring Agent Context

Description UPDATE Monitoring Agent context

URI /apps/{appID}/agents/{agentID}

Method PUT

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request PUT /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc

Request Body { "metrics": [ { "name": "ingress_pcts", "unit": "#", "type": "long" } ] }

Response Body { "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long" } ], "created_at": "1521459234", "last_modified": "1521673128", "status": "UP" }

Response Code 200 – OK


7.1.4 Monitoring Metrics

Table 17: CREATE Monitoring Metric Value Context

Description CREATE Monitoring Metric Value context

URI /apps/{appID}/agents/{agentID}/metrics/

Method POST

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN

Sample Request POST /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/

Request Body [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double", "value": "48", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long", "value": "3504", "timestamp": "1521542378" } ]

Response Body { "created_at": "1521542389" }

Response Code 201 – CREATED

Table 18: GET Monitoring Metric Value Contexts Associated with Monitoring Agent

Description GET Monitoring Metric Value contexts associated with Monitoring Agent

URI /apps/{appID}/agents/{agentID}/metrics

Method GET

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/

Request Body -

Response Body { "created_at": "1521542389", "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double", "value": "48", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long", "value": "3504", "timestamp": "1521542378" } ] }

Response Code 200 – OK

Table 19: GET Monitoring Metric Context

Description GET Monitoring Metric context

URI /apps/{appID}/agents/{agentID}/metrics/{metricID}

Method GET

Parameters interval, tstart, tend

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/70910177071c4bc68419fb63e72b7cbc:cpu?interval=600

Request Body -

Response Body [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "27", "timestamp": "1521543355" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "48", "timestamp": "1521544355" }, . . . ]

Response Code 200 – OK
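A client for the metric endpoint might assemble the path and the optional query parameters as sketched below, and convert the string-encoded samples into numbers. The function names are hypothetical, and the exact semantics and units of `interval`, `tstart` and `tend` are not specified in this document, so the sketch only passes them through.

```python
from urllib.parse import quote, urlencode


def metric_path(app_id, agent_id, metric_id, interval=None, tstart=None, tend=None):
    """Build the request path for one metric, adding only the parameters that are set."""
    # Keep ':' unescaped so the metric ID appears as in the sample request.
    path = f"/apps/{quote(app_id)}/agents/{quote(agent_id)}/metrics/{quote(metric_id, safe=':')}"
    params = {"interval": interval, "tstart": tstart, "tend": tend}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


def parse_samples(samples):
    """Convert the string-encoded value and timestamp fields into numbers."""
    return [(int(s["timestamp"]), float(s["value"])) for s in samples]
```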

7.2 Decision Making and Auto-Scaling Service API Documentation

The Elasticity Manager API is a RESTful API exposing the standard CRUD operations for the following resources:

• Elasticity API Keys
• Elastic Applications
• Elasticity Policies

To reduce repetition, we note that ALL requests must be issued with the Elasticity API key authorization header. If this header is missing or invalid, an UNAUTHORIZED response is returned. When requesting an Elasticity API Key for the first time, the interested entity must request the API key from the Elasticity Manager.


7.2.1 Elasticity API Keys

Table 20: CREATE Elasticity API Key Context

Description CREATE API Key which provides authorized access to Elasticity Manager

URI /apikey

Method POST

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN

Sample Request POST /apikey

Request Body { "userID": "0320013", "username": "zgeorg03", "country": "Cyprus", "availability_zone": "eu" }

Response Body { "apiKey": "a7ba56adc85941deaf47630f50c240e1" }

Response Code 201 – CREATED

Table 21: DELETE Elasticity API Key Context

Description REVOKE API Key which provides authorized access to Elasticity Manager

URI /apikey/{apikey}

Method DELETE

Parameters -

Request/Response Format application/json

Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND

Sample Request DELETE /apikey/a7ba56adc85941deaf47630f50c240e1

Request Body -

Response Body -

Response Code 200 – OK

Table 22: GET Elasticity API Key Context

Description              GET API key associated user metadata
URI                      /apikey/{apikey}
Method                   GET
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           GET /apikey/a7ba56adc85941deaf47630f50c240e1
Request Body             -
Response Body            { "created_at": "1521468211", "last_modified": "1521468211", "userID": "0320013", "username": "zgeorg03", "country": "Cyprus", "availability_zone": "eu" }
Response Code            200 – OK
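The `created_at` and `last_modified` fields in the sample response are strings of digits. Assuming they are Unix timestamps in seconds (the deliverable does not state the unit or time zone), a client could convert them as sketched below; `parse_epoch` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_epoch(value):
    """Convert a string of epoch seconds (assumed unit) to an aware UTC datetime."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


# Metadata taken from the sample response of the GET API key call.
metadata = {
    "created_at": "1521468211",
    "last_modified": "1521468211",
    "userID": "0320013",
}

created = parse_epoch(metadata["created_at"])
print(created.isoformat())  # 2018-03-19T14:03:31+00:00
```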

7.2.2 Elastic Application

Table 23: Create Elastic Application’s Context

Description              CREATE the context of an Elastic Application
URI                      /apps
Method                   POST
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request           POST /apps
Request Body             { "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ] }
Response Body            { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "created_at": "1521459234", "last_modified": "1521459234" }
Response Code            201 – CREATED


Table 24: GET Elastic Applications’ Context

Description              GET the context of the Elastic Applications
URI                      /apps
Method                   GET
Parameters               - offset: Integer
                             required: False
                             description: The number of elastic applications to skip
                         - limit: Integer
                             required: False
                             description: The number of elastic applications to return
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           GET /apps/
Request Body             -
Response Body            [ { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }, { . . . } ]
Response Code            200 – OK
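The optional `offset` and `limit` parameters suggest a paged listing. The sketch below assumes they are passed as query-string parameters, which the deliverable does not state explicitly; `apps_listing_path` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def apps_listing_path(offset=None, limit=None):
    """Return the /apps path, adding only the paging parameters that were given."""
    params = {}
    for name, value in (("offset", offset), ("limit", limit)):
        if value is None:
            continue  # both parameters are optional (required: False)
        if value < 0:
            raise ValueError(f"{name} must not be negative")
        params[name] = value
    query = urlencode(params)
    return "/apps" + ("?" + query if query else "")


# Third page of ten applications: skip the first 20, return the next 10.
print(apps_listing_path(offset=20, limit=10))  # /apps?offset=20&limit=10
```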

Table 25: GET Elastic Application’s Context

Description              GET the context of an Elastic Application
URI                      /apps/{appID}
Method                   GET
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           GET /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body             -
Response Body            { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }
Response Code            200 – OK

Table 26: Delete Elastic Application Context

Description              DELETE the context of an Elastic Application
URI                      /apps/{appID}
Method                   DELETE
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body             -
Response Body            -
Response Code            200 – OK

Table 27: UPDATE Elastic Application’s Context

Description              UPDATE the context of an Elastic Application
URI                      /apps/{appID}
Method                   PUT
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           PUT /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body             { "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ] }
Response Body            { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521467357" }
Response Code            200 – OK
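In the sample above, the request omits `providers`, yet the response still contains it, and only `last_modified` changes among the timestamps. One reading of this, which the deliverable does not confirm, is that the update merges the supplied fields into the stored context. The sketch below models that reading; `apply_update` and the set of read-only fields are assumptions.

```python
# Fields assumed to be managed by the service rather than by the caller.
READ_ONLY = {"appID", "created_at", "last_modified"}


def apply_update(stored, changes, now):
    """Merge caller-supplied fields into a copy of the stored context."""
    updated = dict(stored)
    for key, value in changes.items():
        if key not in READ_ONLY:
            updated[key] = value
    updated["last_modified"] = str(now)  # timestamps are strings in the samples
    return updated


stored = {
    "appID": "95a01dfe4667414f9336b7d7495cc7a7",
    "name": "my-app-1",
    "availability_zone": "eu",
    "tags": ["spring-boot", "mysqldb"],
    "providers": ["aws", "google-ce"],
    "created_at": "1521459234",
    "last_modified": "1521459234",
}
changes = {
    "name": "my-new-appname-1",
    "availability_zone": "us-west",
    "tags": ["spring-boot", "mysqldb", "java"],
}
result = apply_update(stored, changes, now=1521467357)
```

Under these assumptions, `result` matches the sample response body: `providers` and `created_at` are carried over unchanged.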

7.2.3 Elasticity Policies

Table 28: CREATE a new Elasticity Policy

Description              CREATE a new elasticity policy
URI                      /{appID}/elasticityPolicy/
Method                   POST
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request           POST /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/
Request Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 10
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": { "resource": { "name": "svc_streaming" }, "clusters": [ "cluster_eu_1" ] }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 1,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Body            { "msg": "Policy successfully created", "elasticityPolicyID": "abf10-fb19-2ccf" }
Response Code            200 – OK
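To make the metric trigger in the sample policy more concrete, the sketch below shows one plausible way to evaluate it: aggregate the samples that fall inside the time window, then compare the result with the threshold. This is an assumed interpretation and not the Elasticity Manager's implementation. Only the operators, group function and time unit that appear in the samples are covered, and `metric_trigger_fires` is a hypothetical name.

```python
import operator

# Only the values that appear in the sample policies are modelled here.
REL_OPS = {"LTE": operator.le, "GTE": operator.ge}
GROUP_FUNCTIONS = {"AVERAGE": lambda values: sum(values) / len(values)}
UNIT_SECONDS = {"MINUTES": 60}


def window_seconds(time_window):
    """Length of a timeWindow object in seconds."""
    return time_window["duration"] * UNIT_SECONDS[time_window["unit"]]


def metric_trigger_fires(trigger, samples, now):
    """Evaluate one metric trigger over (timestamp_seconds, value) samples."""
    start = now - window_seconds(trigger["timeWindow"])
    values = [value for ts, value in samples if start <= ts <= now]
    if not values:
        return False  # assumption: no data in the window means no action
    aggregate = GROUP_FUNCTIONS[trigger["groupFunction"]](values)
    return REL_OPS[trigger["relOp"]](aggregate, trigger["value"])


# The second trigger of the sample policy (member selection omitted).
trigger = {
    "groupFunction": "AVERAGE",
    "metric": {"name": "requests_per_sec"},
    "relOp": "GTE",
    "value": 10,
    "timeWindow": {"duration": 5, "unit": "MINUTES"},
}
busy = [(100, 100.0), (700, 8.0), (800, 12.0), (900, 13.0)]
quiet = [(800, 5.0), (900, 9.0)]
print(metric_trigger_fires(trigger, busy, now=1000))   # True: average of 8, 12, 13 is 11
print(metric_trigger_fires(trigger, quiet, now=1000))  # False: average is 7
```

The sample at timestamp 100 lies outside the five-minute window and is ignored, which is why the first average is 11 rather than being skewed by the old value.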

Table 29: GET all Elasticity Policies of a specific deployment

Description              GET all the Elasticity Policies of a specific deployment
Endpoint                 /{appID}/elasticityPolicy/
Method                   GET
Parameters               - offset: Integer
                             required: False
                             description: The number of elasticity policies to skip
                         - limit: Integer
                             required: False
                             description: The number of elasticity policies to return
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           GET /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/
Request Body             -
Response Body
[
  { "name": "policy_scale_out_streaming_cluster_eu_1", "trigger": [ … ], "action": { … }, "priority": 1 },
  { "name": "policy_scale_out_streaming_cluster_usa_1", "trigger": [ … ], "action": { … }, "priority": 2 },
  { "name": "policy_scale_in_streaming_cluster_eu_1", "trigger": [ … ], "action": { … }, "priority": 1 },
  { "name": "policy_scale_in_streaming_cluster_usa_1", "trigger": [ … ], "action": { … }, "priority": 2 }
]
Response Code            200 – OK

Table 30: GET an Elasticity Policy

Description              GET an elasticity policy for the specified deployment
Endpoint                 /{appID}/elasticityPolicy/{elasticityPolicyID}
Method                   GET
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           GET /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body             -
Response Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 10
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": { "resource": { "name": "svc_streaming" }, "clusters": [ "cluster_eu_1" ] }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 1,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Code            200 – OK

Table 31: UPDATE an existing Elasticity Policy

Description              UPDATE an existing elasticity policy
Endpoint                 /{appID}/elasticityPolicy/{elasticityPolicyID}
Method                   PUT
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           PUT /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 20
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": { "resource": { "name": "svc_streaming" }, "clusters": [ "cluster_eu_1" ] }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 4,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Body            { "msg": "Policy successfully updated" }
Response Code            200 – OK

Table 32: DELETE an existing Elasticity Policy

Description              DELETE an existing elasticity policy
Endpoint                 /{appID}/elasticityPolicy/{elasticityPolicyID}
HTTP Method              DELETE
Parameters               -
Request/Response Format  application/json
Response Status Codes    200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request           DELETE /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body             -
Response Body            { "msg": "Policy successfully deleted" }
Response Code            200 – OK
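The five policy operations documented in Tables 28 to 32 share two path shapes: the collection path for an application and the path of a single policy. The sketch below summarizes this mapping; `policy_path`, `request_line` and the operation names are hypothetical and used only for illustration.

```python
# Operation name -> (HTTP method, whether a policy ID is part of the path).
OPERATIONS = {
    "create": ("POST", False),
    "list": ("GET", False),
    "get": ("GET", True),
    "update": ("PUT", True),
    "delete": ("DELETE", True),
}


def policy_path(app_id, policy_id=None):
    """Collection path for an application, or the path of one of its policies."""
    collection = f"/{app_id}/elasticityPolicy/"
    return collection if policy_id is None else collection + policy_id


def request_line(operation, app_id, policy_id=None):
    """Return 'METHOD path' for one of the documented policy operations."""
    method, needs_id = OPERATIONS[operation]
    if needs_id and policy_id is None:
        raise ValueError(f"{operation} requires an elasticityPolicyID")
    return f"{method} {policy_path(app_id, policy_id if needs_id else None)}"


APP_ID = "95a01dfe4667414f9336b7d7495cc7a7"
print(request_line("create", APP_ID))
print(request_line("delete", APP_ID, "abf10-fb19-2ccf"))
```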