D5.1 Cloud Governance Mechanisms – Early Release
Cloud Governance Mechanisms Early Release
Deliverable D5.1
Editors: Demetris Trihinas, Zacharias Georgiou
Reviewers: Manos Papoutsakis (FORTH), Sotiris Koussouris (Suite5)
Date: 26 March 2018
Classification: Public
Contributing Authors
Name Partner
Demetris Trihinas UCY
Zacharias Georgiou UCY
George Pallis UCY
Version History
# Description
1 Table of Contents (ToC) and partner contribution assignment.
2 Introduction, Document Purpose and Relation to other WPs.
3 Merged content for Monitoring State-of-the-Art.
4 Merged content for Decision-Making State-of-the-Art and Architecture.
5 Monitoring reference architecture and implementation.
6 Merged Decision Making Implementation and Requirements.
7 Merged Monitoring Requirements, API and updated implementation details.
8 Updated Requirements and Exposed Functionality for Monitoring and Decision Making.
9 Merged Updated Content for Decision Making implementation.
10 Executive Summary and Conclusions. Submitted for internal review.
11 Addressed all received feedback from internal review. Final version.
1 INTRODUCTION
1.1 Document Purpose and Scope
1.2 Document Relationship with other Project Work Packages
1.3 Document Structure
2 STATE OF THE ART AND KEY TECHNOLOGY AXES CHALLENGES
2.1 Micro-Service and Container Monitoring
2.2 Auto-Scaling
3 MONITORING AND ANALYSIS SERVICE
3.1 Requirements and Exposed Functionality
3.1.1 Functional Requirements
3.1.2 Non-Functional Requirements
3.2 Reference Architecture and Implementation
3.3 Interaction with other Unicorn Services and Components
4 DECISION MAKING AND AUTO-SCALING SERVICE
4.1 Requirements and Exposed Functionality
4.1.1 Functional Requirements
4.1.2 Non-Functional Requirements
4.2 Reference Architecture and Implementation
4.2.1 Elasticity Manager
4.2.2 Elasticity Controller
4.3 Interaction with other Unicorn Services and Components
5 CONCLUSIONS
6 REFERENCES
7 APPENDIX
7.1 Monitoring and Analysis Service API Documentation
7.1.1 Monitoring API Keys
7.1.2 Monitored Applications
7.1.3 Monitoring Agents
7.1.4 Monitoring Metrics
7.2 Decision Making and Auto-scaling Service API Documentation
7.2.1 Elasticity API Keys
7.2.2 Elastic Application
7.2.3 Elasticity Policies
List of Figures
Figure 1: Unicorn Reference Architecture
Figure 2: High-Level and Abstract Overview of Unicorn Monitoring and Analysis Service
Figure 3: High-Level and Abstract Overview of Monitoring Agent Interfaces
Figure 4: Example of a Newly Defined Monitoring Probe
Figure 5: Dynamic Monitoring Agent Discovery
Figure 6: Decision Making & Auto-scaling Service Reference Architecture
Figure 7: Elasticity Policy Example
Figure 8: Elasticity Policy State Diagram
Figure 9: Elasticity Manager
Figure 10: Elasticity Controller Reference Architecture
List of Tables
Table 1: Monitoring Metric Handlers
Table 2: Monitoring Probes Currently Available in Unicorn Probe Repository
Table 3: CREATE Monitoring API Key Context
Table 4: DELETE Monitoring API Key Context
Table 5: GET Monitoring API Key Context
Table 6: UPDATE Monitoring API Key Context
Table 7: CREATE Monitoring Application Context
Table 8: GET Monitored Application Context Associated with Monitoring Key
Table 9: GET Monitored Application Context
Table 10: DELETE Monitored Application Context
Table 11: UPDATE Monitored Application Context
Table 12: CREATE Monitoring Agent Context
Table 13: GET Monitoring Agent contexts associated with Monitored Application
Table 14: GET Monitoring Agent Context
Table 15: DELETE Monitoring Agent Context
Table 16: UPDATE Monitoring Agent Context
Table 17: CREATE Monitoring Metric Value context
Table 18: GET Monitoring Metric value contexts associated with Monitored Agent
Table 19: GET Monitoring Metric Context
Table 20: CREATE Elasticity API Key Context
Table 21: DELETE Elasticity API Key Context
Table 22: GET Elasticity API Key Context
Table 23: Create Elastic Application's Context
Table 24: GET Elastic Applications' Context
Table 25: GET Elastic Application's Context
Table 26: Delete Elastic Application Context
Table 27: UPDATE Elastic Application's context
Table 28: CREATE a new Elasticity Policy
Table 29: GET all Elasticity Policies of a specific deployment
Table 30: GET an Elasticity Policy
Table 31: UPDATE an existing Elasticity Policy
Table 32: DELETE an existing Elasticity Policy
Executive Summary
The aim of this Deliverable is to provide a comprehensive overview and documentation report for the early release of the Unicorn Governance Mechanisms. The Unicorn Governance Mechanisms contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform and are developed within the scope of Work Package 5 (WP5).
This Deliverable begins by presenting the challenges of monitoring and managing micro-service deployments across multi-cloud and containerized execution environments. It then derives the requirements and exposed functionality for the components comprising the Unicorn Cloud Governance Mechanisms.
From the identified set of requirements, the reference architecture and public APIs of both the Monitoring and Analysis Service and the Decision-Making and Auto-Scaling Service are derived and reported. This gives an overview of the early implementation of these components, which constitute key aspects of the runtime management stack that is part of the first prototype release of the Unicorn Platform.
Finally, the Deliverable concludes by outlining the work to be conducted towards D5.2, which will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Platform Cloud Governance Mechanisms.
Table of Abbreviations
API Application Programming Interface
CI/CD Continuous Integration and Continuous Delivery
CRUD Create Read Update Delete
DRL Drools Rule Language
EBNF Extended Backus-Naur Form
IDE Integrated Development Environment
J2EE Java 2 Platform Enterprise Edition
JSON JavaScript Object Notation
JVM Java Virtual Machine
MAPE-K Monitor Analyze Plan Execute - Knowledge
MaaS Monitoring as a Service
NoSQL Non-Relational Database
OS Operating System
QoS Quality of Service
REST Representational State Transfer
TCP Transmission Control Protocol
VM Virtual Machine
1 Introduction
The aim of the Unicorn project is to empower the European digital SME eco-system by delivering a novel and unified framework that simplifies the design, deployment and management of secure and elastic-by-design cloud applications that follow the micro-service architectural paradigm and can be deployed over multi-cloud containerized execution environments. To this end, Deliverable D5.1, henceforth simply referred to as D5.1, provides a comprehensive overview and documentation report for the early release of the Unicorn Governance Mechanisms. The Unicorn Governance Mechanisms contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform and are developed within the scope of Work Package 5 (WP5).
The Unicorn Governance Mechanisms include three vital software components:
• The Monitoring and Analysis Service: The role of this component is to provide real-time monitoring data storage and analysis in order to detect and promptly notify cloud consumers and platform operators of potential performance inefficiencies, security risks and recurring customer and resource behaviour patterns.
• The Execution Environment Agent¹: The role of this component is to collect monitoring data in a non-intrusive and interoperable manner, covering both resource utilization from the underlying containerized execution environment (e.g., compute, memory, network) and deployed cloud application behaviour from tailored application-level metrics (e.g., throughput, active users).
• The Decision Making and Auto-Scaling Service: The role of this component is to decide the most efficient configuration for the execution of the cloud application by continuously evaluating application and service tier behaviour, the underlying multi-cloud provisioned infrastructure and user-defined requirements, policies and constraints.
Figure 1 depicts the current version of the Unicorn Reference Architecture with the components comprising the Unicorn Governance Mechanisms highlighted. A Monitoring Agent is bundled by the Unicorn Platform within each containerized execution environment upon deployment and is configured to adhere to the constraints (e.g., collection periodicity) defined at design-time in the application service description. The Unicorn Monitoring Agent is particularly tailored for containerized environments (e.g., absence of a file-system) such as Docker and any other container format adopting the Open Container Specification [1]. After deployment, the set of metrics of interest, comprising both resource utilization metrics from the containerized environment and application-specific metrics extracted from the annotated source code, is automatically published by the Monitoring Agent to the Monitoring and Analysis Service, which is part of the Runtime Enforcement Layer of the Unicorn Platform.
At this time, the Monitoring and Analysis Service processes incoming monitoring data and stores metric updates to the respective Data Store for historic reference. In turn, real-time metric updates are fed for further analysis to construct high-level analytic insights and aggregated data before being streamed to the intelligent Decision-Making and Auto-Scaling Service. This service, as part of a MAPE-K control loop², then proceeds to assess if adaptation of the underlying virtual execution environment is required. Adaptation is based on semi-supervised decision algorithms for the optimal placement of virtual machines and containers across multiple availability zones and/or cloud sites, while taking into account the heterogeneity among cloud providers and their capabilities. At the same time, the Unicorn decision-making process still adheres to the conditions and high-level policy constraints defined by the Application Administrator.
¹ As the terms Execution Environment Agent and Monitoring Agent are interchangeable for the Unicorn Framework, we will simply refer to the Execution Environment Agent as the Monitoring Agent.
² Monitor-Analyse-Plan-Execute with Knowledge.
Through the Unicorn Dashboard, application administrators have the ability to view, in an intuitive graphical manner, real-time monitoring data capturing the application behaviour and performance of the underlying platform, an assessment of the application's elastic behaviour and potential security incidents. In turn, through the Unicorn Dashboard, users can formulate (continuous) monitoring queries to access and trawl historical and/or aggregated monitoring data extracted from the Monitoring and Analysis Service.
Figure 1: Unicorn Reference Architecture
1.1 Document Purpose and Scope
The purpose of this deliverable is to provide a comprehensive overview and documentation report of the early release of the Unicorn Governance Mechanisms, which contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform. In respect to this, D5.1 aims to give a clear overview of the early design and development of the three components that comprise the Unicorn Governance Mechanisms and are developed under the umbrella of WP5: (i) the Monitoring and Analysis Service; (ii) the Monitoring Agent; and (iii) the Decision Making and Auto-Scaling Service. We note that, as the Monitoring Agent is tightly coupled with the Monitoring and Analysis Service, the documentation report for these two components will be introduced together. To this end, D5.1 documents, for each component of the Unicorn Governance Mechanisms, the requirements that must be satisfied to overcome the challenges introduced when monitoring, managing and scaling multi-cloud containerized execution environments, their functionalities, how they operate and the first version of their exposed API, which is used to interact with other Unicorn components, users and/or third-party services. Finally, we note that parts of D5.1 are based on a number of scientific papers [2][3], which introduce core concepts of the components that are part of the Unicorn Governance Mechanisms and WP5.
1.2 Document Relationship with other Project Work Packages
This deliverable is built on the foundation of D1.2, which documents the current version of the reference architecture and key technologies supported by Unicorn and provides an initial description of the components comprising the Unicorn framework. To this end, D5.1 extends the Unicorn documentation by providing a comprehensive report for the Unicorn Governance Mechanisms. What is more, D5.1 serves as a guide for D5.2, the Unicorn Governance Mechanisms - Final Release, which will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Governance Mechanisms.
1.3 Document Structure
The rest of this deliverable is structured as follows:
Section 2 updates the comprehensive report introduced in D1.2 on the State-of-the-Art landscape in monitoring, managing and scaling multi-cloud applications and containerized environments.
Sections 3 and 4 present a comprehensive documentation report introducing the reference architecture, exposed functionality and implementation details of the Monitoring and Analysis Service and the Decision Making and Auto-Scaling Service, respectively.
Section 5 concludes this Deliverable and outlines the work to be conducted towards D5.2.
The Appendix provides comprehensive documentation of the current API exposed by both the Monitoring and Auto-Scaling Services.
2 State of the Art and Key Technology Axes Challenges
In the cloud era, as applications grow by adding more services, real-time monitoring and dynamic resource allocation become significant challenges. At scale, these challenges can be addressed with autonomicity. Through automation, microservices are equipped with the ability to continuously control the underlying infrastructure, thus turning into services that can be harnessed programmatically at runtime. However, traditional monitoring is ineffective for ephemeral, decomposed and highly dynamic microservices deployed over shared execution environments. On the other hand, finer service granularity means more moving parts and hence an increased complexity of auto-scaling, potentially more points of failure, and more possibilities for serious security violations and privacy leaks. This set of challenges creates the need to devise new solutions with the ability to design, run and monitor micro-services at scale while also achieving the anticipated autonomicity of the application runtime on top of the underlying programmable cloud infrastructure.
In this Section, we update the State-of-the-Art presented in D1.2. Particularly, we present the challenges of monitoring and managing micro-service deployments across multi-cloud and containerized execution environments.
2.1 Micro-Service and Container Monitoring
Traditional monitoring tools, even in the cloud era, are designed for slowly evolving execution environments where application instances resemble one another and reside on either physical or virtual machines. Containerized execution environments ease application development and deployment, as the user is abstracted from the complexity of configuring the underlying offerings, including the (virtual) infrastructure, network, storage and accompanying services [4]. However, monitoring containerized environments is a complex and open challenge.
Specifically, in a containerized environment there is no guest OS or file-system on which to deploy a monitoring agent alongside running services. An agent must either run through the container engine or be part of the application itself, which means that monitoring should be an integral part of application design and cannot be decided after deployment. To overcome this challenge, Docker Stats [5], the native Docker monitoring tool, and cAdvisor [6], an open-source monitoring tool developed by Google for Docker, assume users have access to host machines and hook themselves to the container engine daemon to access and collect monitoring data for the containers deployed on the particular host.
Although accessing monitoring data through the host seems a viable solution, one must consider the difficulty of providing portable and interoperable monitoring. This is the case when supporting multi-cloud deployments where instances span multiple heterogeneous cloud offerings. To this end, monitoring tools such as MonPaaS [7], Tower4Clouds [8] and JCatascopia [9] offer portable and multi-cloud monitoring for cloud applications, although these tools are not tailored for containerized environments, as their agents or metric collectors require a file-system for deployment. Most importantly, access to host machines is not possible when the underlying infrastructure is provided through an intermediate cloud broker, as in the case of the Unicorn Framework when provided as a DevOps-as-a-Service eco-system.
What is more, granularly slicing an application into (micro-)services inherently introduces heterogeneity, which requires full customization of the monitoring process to access runtime application behaviour and perform diagnostics that yield helpful insights [10]. Particularly, unless monitoring is limited to capacity utilization, different insights are required for different services of a micro-service deployment. However, customization must be automated and become part of the continuous delivery process for immutable containerized execution environments. This means that, for Docker applications, customization of the monitoring process must be part of the container bundling to be consistent with the principles of the micro-service architecture paradigm [11]. However, current monitoring solutions that support application monitoring, such as New Relic APM [12] and DataDog [13], require that users download and utilize bundled and pre-configured metric collectors that are available for popular programming languages (e.g., Java, Ruby, Python) and frameworks (e.g., Tomcat, Django, SQL).
This leads to another challenge: ignoring the unique characteristics of the containerized environment results in the use of multiple share-nothing metric collectors, because customization and the compilation of metrics from already collected data are absent. This increases container sizes and introduces high runtime footprints. Thus, there are significant costs and actual runtime overheads when monitoring ephemeral, decomposed and highly dynamic applications over virtualized and shared execution environments [14].
2.2 Auto-Scaling
One of the most valuable characteristics of microservice-based cloud applications is their capability to efficiently scale horizontally and independently in response to changes in the workload, due to their inherently decoupled nature. Traditional scaling solutions by cloud providers such as AWS [15] and Microsoft Azure [16] rely on events triggered by threshold violations on monitoring data that are expressed through simple rules in the form "IF-THEN-ACTION" and are manually pre-determined by the user. However, finding the most appropriate rules for scaling is not a trivial task, as it requires that the user has a priori knowledge of the optimal threshold values. Even so, thresholds should be constantly adjusted to reflect any changes in workload or application behaviour in such a dynamic and complex environment.
Another critical parameter in traditional rule-based scaling mechanisms is the time window, which indicates that the condition of a rule must evaluate to true for a pre-determined period of time (e.g., cpu_usage > 60% for 5 minutes). This helps in determining whether a scaling alert is issued due to an actual change in the demand of an application, or due to sudden and short-lived spikes in highly sensitive monitoring data (e.g., CPU usage). This approach can improve the accuracy of scaling; however, when not configured properly, it can lead to undesirable results. Specifically, when the chosen period is relatively small, the system may reach an unstable state where resources are provisioned and de-provisioned rapidly and, most importantly, are billed even though real demand does not exist. This phenomenon is known as the "ping-pong" effect [2]. On the other hand, delaying the detection of an actual change in the application load risks a severe performance degradation that affects the application's overall QoS. A third and equally important factor that impacts the application's overall performance is the cooldown period. This parameter introduces an artificial delay that gives the system additional time to absorb the effects of previous scaling actions (e.g., provisioning new services and registering them with the load balancer) and return to a stable state before another action is triggered. The above difficulties clearly impose a significant challenge for the user in finding optimal parameter configurations, especially for microservice-based cloud applications, which are inherently decomposed into multiple heterogeneous services.
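To make the interplay of threshold, time window and cooldown concrete, the following is a minimal Python sketch of a single scale-out rule. It is an illustration under stated assumptions, not the mechanism of any particular cloud provider or of Unicorn: the class name `ThresholdRule`, its fields and the `observe` method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical rule: IF metric > threshold FOR window_s THEN scale out."""
    threshold: float   # e.g. 60.0 for cpu_usage > 60%
    window_s: float    # time window: the condition must hold this long
    cooldown_s: float  # minimum gap between two scaling actions
    _breach_start: Optional[float] = field(default=None, init=False, repr=False)
    _last_action: Optional[float] = field(default=None, init=False, repr=False)

    def observe(self, t: float, value: float) -> bool:
        """Feed one sample taken at time t (seconds); True means scale out now."""
        if value <= self.threshold:
            # A short-lived spike that ends early resets the time window.
            self._breach_start = None
            return False
        if self._breach_start is None:
            self._breach_start = t
        held_long_enough = t - self._breach_start >= self.window_s
        cooled_down = (self._last_action is None
                       or t - self._last_action >= self.cooldown_s)
        if held_long_enough and cooled_down:
            self._last_action = t
            self._breach_start = None  # start a fresh window after acting
            return True
        return False


if __name__ == "__main__":
    rule = ThresholdRule(threshold=60.0, window_s=300.0, cooldown_s=600.0)
    for t in range(0, 961, 60):
        if rule.observe(float(t), 80.0):
            print(f"scale out at t={t}s")
```

A small `window_s` makes the rule react to spikes (the "ping-pong" effect), while a large one delays the reaction to real load changes; `cooldown_s` suppresses a second action until the previous one has had time to take effect.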
A scaling mechanism that relieves the user of this error-prone configuration is target-based scaling. This mechanism, offered by Google Cloud Platform [17] and recently by AWS [18], allows the user to specify a target value for a metric, with the Auto-Scaling mechanism introducing the appropriate adjustments to keep the metric close to the defined target value. However, this method imposes the significant challenge of finding a representative metric with which to scale the application. The work in [19] uses a clustering technique to reduce and group related metrics and, using a service call-graph constructed by applying application-specific load, identifies the most relevant metrics for scaling. This approach improves the orchestration of auto-scaling mechanisms; however, it does not capture changes in application or workload behaviour that can make the representative metrics irrelevant.
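As an illustration of target-based scaling, the sketch below uses one common proportional formulation: the instance count is scaled by the ratio of the observed metric to its target and then clamped to configured bounds. This is an assumption made for illustration; the source does not describe the exact algorithms used by the providers, and the function name and parameters are hypothetical.

```python
import math


def desired_instances(current: int, observed: float, target: float,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Proportional target tracking (illustrative, not a provider's algorithm).

    If the observed per-instance metric is above the target, more instances
    are requested; if it is below, fewer. The result is clamped to the
    [min_instances, max_instances] range.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * observed / target)
    return max(min_instances, min(max_instances, raw))


if __name__ == "__main__":
    # 4 instances at 90% average CPU with a 60% target.
    print(desired_instances(current=4, observed=90.0, target=60.0))
```

The sketch also shows why the choice of metric matters: the formula only behaves well if the metric actually falls as instances are added.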
Concerning multi-cloud scaling, while it offers greater flexibility to enterprises, not all applications inherently benefit from simply applying existing scaling rules in a multi-cloud fashion. For example, services that constantly exchange data may incur excessive charges and suffer significant performance degradation (e.g., low bandwidth across cloud providers). Furthermore, the lack of standardized APIs when accessing cloud offerings across multiple providers remains a significant challenge. In particular, cloud providers and platforms use their own technology stacks, making it difficult for clients to exploit the advantages of truly multi-cloud deployments. Finally, given the diversity of resource requirements of each microservice of an application and the resource heterogeneity introduced by different cloud providers, the process of finding an optimal placement that improves the quality and performance of the application and minimizes costs becomes a significant challenge for an enterprise.
3 Monitoring and Analysis Service
In this Section, we present a comprehensive documentation report introducing the reference architecture, exposed functionality and implementation details of the Monitoring and Analysis Service.
3.1 Requirements and Exposed Functionality
The Users interacting with the Unicorn Monitoring and Analysis Service include:
• Cloud Application Developer: interacts with Unicorn Monitoring by defining the scope and intensity of monitoring metrics to assess application behavior with source code annotations via the Monitoring Design Library.
• Cloud Application Owner: interacts with Unicorn Monitoring by defining the scope and intensity of monitoring metrics to assess application behavior by using the tools available from the Unicorn Dashboard Monitoring Facet for service description enrichment.
• Unicorn Developer: interacts with Unicorn Monitoring by developing custom metric collectors for frameworks and programming languages to increase the reach of the Monitoring and Analysis Service.
• Cloud Provider: interacts with Unicorn Monitoring by assessing monitoring data regarding cloud offering utilization to optimize the provisioning and quality of service of the underlying resource offerings.
To satisfy Use-Case UC.8 (Monitor application behaviour and performance), documented in D1.2, while adhering to system requirement FR.7 (Access application behaviour and performance monitoring data), documented in D1.1, the following functionality must be exposed by the Monitoring and Analysis Service.
3.1.1 Functional Requirements
ID FR.MON.1
Title Monitor cloud and container level utilization
Description Unicorn Monitoring must provide the means to monitor underlying cloud offerings and containerized execution environments that are provisioned for applications deployed by the Unicorn Platform. Monitoring includes exposing the capabilities of provisioned resources, current consumption, service (un-)availability and faults.
Exposed Functionality
Metric collectors, denoted as Monitoring Probes, are developed to monitor and extract metrics from the underlying cloud element they reside on (e.g., VM, container). The code-base of Monitoring Agents, the entities responsible for managing the metric collection process, is decoupled from the actual metric collection (Monitoring Probes), which allows for the selection and dynamic instantiation of only the metric collectors required to satisfy the monitoring task at hand. To this end, Unicorn provides a number of Monitoring Probes capable of monitoring Docker containers, JVM and J2EE containers, as well as Linux VMs deployed on provisioned cloud offerings.
ID FR.MON.2
Title Monitor cloud application behavior and performance
Description Unicorn Monitoring must provide the means to monitor the behavior, quality of service and current performance of deployed cloud applications. This must be achieved by providing a monitoring library for application instrumentation via either or both of source code annotations and the service description.
Exposed Functionality
The Monitoring Library is developed to expose source code annotation decorators for developers to define the scope and configuration of application-level metrics at design time. At runtime, the Monitoring Library instantiates the metric handlers responsible for instrumentation and metric update extraction from the executed source code. Metric updates are then pushed to the respective Monitoring Agent for parsing and dissemination to the Unicorn Monitoring and Analysis Service. Unicorn currently provides the Monitoring Library for cloud applications developed in Java, with the library featuring additional enhancements such as high-level metric handlers for measuring the completion time of application tasks (timers), intercepting and filtering communication among interacting components (interceptors) and measuring the rate of completed tasks in a user-defined timeframe (meters).
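The idea of a timer metric handler attached through a source code annotation can be sketched as follows. Note that the Unicorn Monitoring Library targets Java; this is only a Python analogue using a decorator, and the names `timed` and `METRIC_UPDATES` are hypothetical. The in-memory dictionary stands in for the push of metric updates to a Monitoring Agent.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Callable, DefaultDict, List

# Hypothetical in-memory sink; in the described design, updates would be
# pushed to the Monitoring Agent instead.
METRIC_UPDATES: DefaultDict[str, List[float]] = defaultdict(list)


def timed(metric_name: str) -> Callable:
    """Decorator that records the duration of each call under metric_name."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even when the task raises, so failures are timed too.
                METRIC_UPDATES[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("checkout.duration_s")
def checkout(items):
    """Example application task being instrumented."""
    return sum(items)


if __name__ == "__main__":
    checkout([1, 2, 3])
    print(dict(METRIC_UPDATES))
```

The application code (`checkout`) stays free of monitoring logic; only the annotation changes, which is the property the requirement above asks for.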
ID FR.MON.3
Title Metric collector development toolkit
Description Unicorn Monitoring must expose a toolkit for users to develop custom metric collectors tailored to their application requirements. Developed metric collectors that adhere to the Unicorn metric paradigm must be seamlessly integrated into the Unicorn Monitoring process at either deployment or runtime.
Exposed Functionality
Metric collectors, denoted as Monitoring Probes, must adhere to the Unicorn probe interface and metric abstractions that allow for Monitoring Probes to be dynamically plugged into Monitoring Agents and for metric updates to be parsed and disseminated to the Monitoring and Analysis Service. The referenced toolkit eases Monitoring Probe development by hiding the complexity of the Probe functionality and requiring developers to define only default values for the Probe periodicity and a name, a short description of the offered functionality and a concrete implementation of how metric values are updated.
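The separation between Agents and Probes, and the small amount a Probe developer has to supply (name, description, default periodicity and a collection routine), can be sketched in Python as below. This is not the Unicorn probe interface itself, which is documented elsewhere in the deliverable; `Probe`, `DiskProbe` and `Agent` are hypothetical names chosen for illustration.

```python
import shutil
from abc import ABC, abstractmethod
from typing import Dict


class Probe(ABC):
    """Hypothetical probe interface: metadata plus one collection method."""

    def __init__(self, name: str, description: str, period_s: float) -> None:
        self.name = name
        self.description = description
        self.period_s = period_s  # default collection periodicity

    @abstractmethod
    def collect(self) -> Dict[str, float]:
        """Return the current values of the metrics this probe groups."""


class DiskProbe(Probe):
    """Example probe: disk usage of the current working directory's volume."""

    def __init__(self) -> None:
        super().__init__("DiskProbe", "disk utilisation", period_s=30.0)

    def collect(self) -> Dict[str, float]:
        usage = shutil.disk_usage(".")
        pct = 100.0 * usage.used / usage.total if usage.total else 0.0
        return {"disk_used_pct": pct}


class Agent:
    """Hypothetical agent that manages probes without knowing their internals."""

    def __init__(self) -> None:
        self._probes: Dict[str, Probe] = {}

    def attach(self, probe: Probe) -> None:
        self._probes[probe.name] = probe

    def detach(self, name: str) -> None:
        self._probes.pop(name, None)

    def collect_all(self) -> Dict[str, Dict[str, float]]:
        return {name: p.collect() for name, p in self._probes.items()}


if __name__ == "__main__":
    agent = Agent()
    agent.attach(DiskProbe())
    print(agent.collect_all())
```

Because the `Agent` depends only on the abstract `Probe`, probes can be attached or detached at runtime, which mirrors the dynamic plugging described above.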
ID FR.MON.4
Title Access to historical and real-time monitoring metric data
Description Unicorn Monitoring must provide the means for users to access both historical and real-time monitoring data, with the data storage backend restricting access to authorized entities only. Access to monitoring data must support both a push and a pull metric delivery mechanism to reduce the overhead of exposing monitoring data, depending on the requirements of the interested entity and the Unicorn platform.
Exposed Functionality
A thin API layer on top of the Monitoring Data Store is provided to abstract and manage secure and authorized access for storing and extracting monitoring data. The Monitoring Data Store is implemented as a distributed NoSQL database so that it can scale depending on the load imposed by access to historical monitoring data. To reduce the overhead of continuous monitoring queries from entities requesting real-time data, a high-performance queueing service is exposed by the Analysis Service, which receives and manages subscription requests to metric topics of interest. After subscription, metric updates are pushed immediately to the related topics, without interested entities having to continuously issue requests through the Monitoring API and, consequently, the Monitoring Data Store.
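The push-based delivery described above can be sketched with a minimal in-process publish/subscribe broker. This is a toy illustration only: the actual queueing service, its transport and its topic naming are not specified in this section, and `MetricBroker` and the topic strings are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, float], None]


class MetricBroker:
    """Toy pub/sub broker: subscribers register once, updates are pushed."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, value: float) -> int:
        """Push value to every subscriber of topic; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, value)
        return len(callbacks)


if __name__ == "__main__":
    broker = MetricBroker()
    broker.subscribe("app1.cpu", lambda t, v: print(f"{t} -> {v}"))
    broker.publish("app1.cpu", 0.7)
```

Compared with polling the Monitoring API, a subscriber issues one request (the subscription) and then does no work until an update actually arrives.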
ID FR.MON.5
Title Runtime monitoring topology adaptation acknowledgement
Description Unicorn Monitoring must provide the means to acknowledge runtime adaptation of application service topologies, including the (de-)provisioning of service instances, the alteration of cloud and container offering capabilities and the attachment of additional metric collectors. Adaptation must be acknowledged in a timely manner without the need to restart all or part of the monitoring process.
Exposed Functionality
The Unicorn Monitoring and Analysis Service is able to acknowledge dynamic changes to the underlying monitoring topology of an application embracing the micro-service paradigm, where service decomposition and elastic scaling are inherent runtime features. This is accomplished by adopting a variation of the pub/sub message protocol for the communication plane between Monitoring Agents and the Monitoring and Analysis Service, which allows for rapid propagation of changes to the underlying cloud offerings, the containerized environment and the cardinality of instances per service layer.
ID FR.MON.6
Title Monitoring rule language for metric composition, aggregation and grouping
Description Unicorn Monitoring must provide users with the means to compose, aggregate and group monitoring data in order to derive high-level analytic insights. Metric rules should be validated and, once accepted, must provide timely answers to users via either or both of a push and a pull delivery mechanism.
Exposed Functionality
The underlying monitoring metric model, documented in D2.1, provides the means for both cloud application developers and administrators to specify monitoring rules for metric composition, aggregation and grouping that allow high-level analytic insights to be derived from low-level monitoring data and from the correlation of collected data. To this end, metric rules are supported by Unicorn as enhancements to an application's service description, at either deployment or runtime, through the Management Perspective of the Unicorn Dashboard and, in particular, the Monitoring facet. Metric rules are validated, and their assessment is provided, by the Analysis Service.
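What aggregation and grouping of low-level samples means in practice can be sketched as follows. The metric rule language itself is defined in D2.1 and is not reproduced here; the sample layout (dictionaries with `service`, `metric` and `value` keys) and the function `aggregate` are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Mapping


def aggregate(samples: Iterable[Mapping], group_by: str, metric: str,
              func: Callable[[List[float]], float] = mean) -> Dict[str, float]:
    """Group samples of one metric by a key and reduce each group with func."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for sample in samples:
        if sample["metric"] == metric:
            groups[sample[group_by]].append(sample["value"])
    return {key: func(values) for key, values in groups.items()}


if __name__ == "__main__":
    samples = [
        {"service": "web", "metric": "cpu", "value": 40.0},
        {"service": "web", "metric": "cpu", "value": 60.0},
        {"service": "db", "metric": "cpu", "value": 20.0},
        {"service": "db", "metric": "mem", "value": 70.0},
    ]
    # Average CPU per service tier, a typical high-level insight.
    print(aggregate(samples, group_by="service", metric="cpu"))
```

Passing a different reducer (for example `max`) yields a different rule over the same low-level data.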
3.1.2 Non-Functional Requirements
ID NFR.MON.1
Title Scalability
Description Unicorn Monitoring must be scalable in order to handle a large number of metric producers on different cloud levels while simultaneously being able to handle a large number of metric consumers. Thus, Unicorn Monitoring should not be fragmented by the number of running monitoring instances or the number of metric collectors deployed on each running instance.
ID NFR.MON.2
Title Non-Intrusiveness
Description Unicorn Monitoring must not interfere with the system or application(s) monitored, and must not consume excessive resources from either the underlying cloud offerings or the containerized execution environment. Thus, Unicorn Monitoring should have minimal runtime impact on provisioned resources (compute, memory, network) in order not to affect behavior and performance.
ID NFR.MON.3
Title Customization and Extensibility
Description Unicorn Monitoring must provide users with the means to customize the monitoring process to suit the needs of their deployed applications. In respect to this, the number and type of metrics, along with the intensity of the monitoring process, must be customizable. In turn, Unicorn Monitoring must be extensible by providing the ability to include new functionality and metrics. In respect to this, Unicorn Monitoring must be adaptive and flexible in order to utilize expanded functionality.
3.2 Reference Architecture and Implementation
To address the aforementioned challenges and adhere to the documented requirements, Unicorn introduces a complete monitoring stack for automating the monitoring of cloud applications deployed through containerized execution environments. Figure 2 depicts a high-level and abstract overview of the Unicorn Monitoring and Analysis Service in a multi-cloud containerized execution environment.
The architecture of the Unicorn Monitoring and Analysis Service follows an agent-based architecture that embraces the producer-consumer communication paradigm. This approach provides interoperable, scalable and real-time cloud monitoring for extracting both platform and application behaviour data from deployed cloud applications. The Unicorn Monitoring and Analysis Service runs in a manner that is non-intrusive and transparent to any underlying cloud, as neither the metric collection process nor metric distribution and storage depend on the underlying platform APIs and communication mechanisms. In turn, the Monitoring Service takes into consideration the rapid changes that occur due to the enforcement of elastic actions on the underlying infrastructure and the application topology.
Figure 2: High-Level and Abstract Overview of Unicorn Monitoring and Analysis Service
The main components that comprise the Unicorn Monitoring and Analysis Service are the following:
• Monitoring Agents: lightweight entities deployable on any cloud element to be monitored, such as containerized execution environments or virtual machines residing on public or private cloud offerings. Monitoring Agents are the entities responsible for coordinating and managing the metric collection process on the respective cloud element (e.g., container, VM), which includes aggregation and dissemination of monitoring data to the Monitoring Service over a secure control plane. Additional functionality of a Monitoring Agent includes adapting the intensity of the monitoring process by adjusting the periodicity of both the metric collection and dissemination based on the current evolution of the metric data stream.
• Monitoring Probes: the actual metric collectors that adhere to a common metric collection interface, with specific Monitoring Probe implementations gathering metrics from the underlying infrastructure, the containerized execution environments or performance metrics from deployed cloud applications. Monitoring Probes feature both a push and pull metric delivery mechanism, with Monitoring Agents benefiting from the push mechanism to avoid the overhead of constantly checking for metric updates.
Monitoring Probes logically group multiple metrics together, in order to reduce the monitoring overhead when accessing common and shared resources.
• Monitoring Library (see footnote 3): the source code annotation design library supporting application instrumentation for Unicorn compliant cloud applications. By adding source code annotations for monitoring, Developers can define the scope and target of metric handlers to enable and alter the metric collection, aggregation and dissemination process. Enabled metric handlers allow developers to define metric counters, timers, and traffic interceptors to gather application performance metrics in order to assess and evaluate application behaviour and the quality of service of the underlying cloud offerings.
• Monitoring Service: the entity easing the management of the monitoring infrastructure by providing scalable and multi-tenant monitoring alongside the Unicorn platform. In particular, the Monitoring Service is responsible for receiving, processing and storing monitoring metrics to the Monitoring Data Store. The Monitoring Service internally comprises a distributed and horizontally scalable tier of Monitoring Servers that coordinates and handles metric and configuration requests from both users and active Monitoring Agents. The communication between Monitoring Agents and the Monitoring Service is accomplished by utilizing a variation of the traditional publish and subscribe (pub/sub) message paradigm which reduces the related network communication overhead.
• Analysis Service: the entity deployed on top of the Monitoring Service that is responsible for aggregating and compiling at runtime high-level analytic insights from collected monitoring data, based on metric rules defined by users at either deployment or through the service graph editor from the Unicorn Dashboard. After the validity and correctness of the received metric rules are assessed, real-time analytic insights are served through a high-performance queueing service to either the Unicorn Dashboard or to external entities that subscribe via the Monitoring REST API to topics of interest and receive streamed monitoring data.
• Monitoring Data Store: a distributed and scalable data store with a high-performance indexing scheme for storing and extracting monitoring updates.
• Monitoring REST API: the entity responsible for managing and authorizing access to monitoring data stored in the Monitoring Data Store.
Monitoring Agents are integral to the Unicorn monitoring process as they are the lightweight monitoring instances responsible for the coordination of the metric collection process. Specifically, a Monitoring Agent is instantiated as a lightweight user process in the deployed cloud element (e.g., container environment), decoupling the metric collection process from the metric dissemination to the Monitoring Service. This allows for the Monitoring Agent code-base to be reused on various cloud layers and elements with only the actual metric collectors differing. The enablement of the Monitoring Agent is part of the Unicorn Continuous Integration and Delivery (CI/CD) cycle where, for each instantiated containerized environment that hosts a service of a cloud application, the Unicorn Smart Orchestrator will configure, bundle and deploy a Monitoring Agent alongside the application. In regard to configuration, users are free to set the periodicity of metrics to be collected, enable logging and the level of reporting, and also give a name and tags to the Monitoring Agent to ease readability and association when performing monitoring queries via the Unicorn Dashboard.
3 In this section, for coherence purposes, we simply refer to the Monitoring and Elasticity library as the Monitoring Library since we are interested in the functionality of monitoring.
Figure 3 depicts a high-level and abstract overview of the interfaces that map to the functionality provided by a Monitoring Agent and are explained in the content that follows.
Figure 3: High-Level and Abstract Overview of Monitoring Agent Interfaces
Metric collection and configuration is achieved by Unicorn Users through the Monitoring Library and Monitoring Probes. The Monitoring Library provides users with the toolset for application instrumentation to define the scope and target of metric handlers that enable and alter the metric collection, aggregation and dissemination process. To use the Monitoring Library, users simply download it from the Unicorn Artefact Repository and embed it in their application. For simplicity, Unicorn offers the Monitoring Library as a Maven Dependency for Java applications, which automatically downloads and integrates the Library with the application, as depicted in the following code snippet. Detailed instructions and guidelines are additionally provided through the Unicorn Framework Developer page. It must also be noted that, in the case of Developers taking advantage of the Unicorn Cloud IDE plugin for Eclipse Che, the latest version of the Monitoring Library is made available through the Unicorn Runtime Stack (see D2.1) with no additional steps required. The current prototype of the Monitoring Agent is implemented and offered for Java applications. However, as it features no external dependencies on other frameworks or libraries, it can be ported to other programming languages and frameworks. In turn, users with applications developed in other languages are still able to use the Unicorn Monitoring and Analysis Service by taking advantage of the Monitoring Service REST API to define and stream monitoring data without any limitations, as documented in Section 7.1.
<dependency>
  <groupId>eu.unicornH2020</groupId>
  <artifactId>UnicornMonitoring</artifactId>
  <version>LATEST</version>
</dependency>
Code Snippet 1: Monitoring Library Maven Dependency
The code snippet that follows is a metric definition in Java for a hypothetical page view counter for an item listing in an e-commerce platform. In this example, the Developer imports in the item class definition the Unicorn source code annotation library for monitoring and denotes the metric properties (e.g., metric name, value type, measurement units, etc.) for configuring a counter metric (CounterMetric) based on the Unicorn monitoring metric model (see footnote 4).
import eu.unicornH2020.annotations.monitoring.*;
...
@UnicornMetric(name="views",
    handler=MetricHandlerType.CounterMetric,
    units="",
    valType=MetricValueType.Integer,
    initVal=0,
    minValue=0,
    maxVal=Integer.MAX_VALUE,
    higherIsBetter=true,
    desc="number of views for specific page"
)
public class ItemController {
    ...
    private int views;
    ...
    // metric handler value extraction method
    public int getViews() {
        return views;
    }
}
Code Snippet 2: Exemplary Monitoring Library Metric Annotation
In brief, a metric, in its simplest form (SimpleMetric), comprises a name, measurement units, a value type and a short description. In addition, a metric may include other properties, including an initial/min/max value and an indication of whether a greater or lower value is better, which serves as a reference for optimization such as in the case of elastic scaling. A metric may take other advanced forms, denoted as metric handlers. Table 1 introduces the metric handlers made available to users, to date, through the Unicorn Monitoring Library.
Table 1: Monitoring Metric Handlers
Metric Handler       Description
SimpleMetric         Emits a value for a referenced metric periodically.
TriggerMetric        Emits a value for a referenced metric but only when called upon.
TimerMetric          Emits the time consumed for the completion of a referenced task (e.g., API call).
MeterMetric          Emits the rate of measured events within a determined timeframe (e.g., throughput per minute).
InterceptorMetric    Emits the output of a filter function applied to the intercepted traffic among two events (e.g., TCP traffic exchanged between two API calls).
In addition to the Monitoring Library, which provides application instrumentation through annotations, Unicorn supports metric extraction through Monitoring Probes. In particular, a Monitoring Probe is a lightweight monitoring thread adhering to a commonly defined Monitoring Probe interface, with the implementation tailored to the monitoring task to be achieved. For instance, the Docker Probe is responsible for extracting and updating monitoring metrics from the Docker container it is deployed upon. Monitoring Probes run independently from each other and can be deployed dynamically without the need to restart the entire monitoring process for alteration. If a Probe encounters a problem (e.g., unexpected termination), the metric collection process of other Probes and the respective Monitoring Agent are not affected.
4 To reduce repetition with D2.1, we will avoid introducing segments of the monitoring metric model definition and request that interested readers use D2.1 as their model reference guide, which introduces all the models and service graph enhancements that are provided by the Unicorn Framework.
With regard to deployment, Monitoring Probes are dynamically pluggable to Monitoring Agents via the Agent’s probe loader, which embraces the class reflection paradigm to dynamically link, configure and instantiate Monitoring Probes at runtime in immutable execution environments, which is a requirement for containerized offerings. This keeps Monitoring Agents compact, as the number and type of Probes utilized by a Monitoring Agent need not be pre-bundled and can vary depending on the monitoring task that must be achieved. To achieve this, users are only required to specify at deployment time which Monitoring Probes are needed for their application services and, upon deployment, these Monitoring Probes will be fetched from the respective Probe repository and dynamically plugged to the Monitoring Agent at runtime. In the Unicorn Monitoring Probe Repository (see footnote 5) there exist a number of publicly available Monitoring Probes that can be used by users. To date, the Unicorn Monitoring Probe Repository hosts a JVM, J2EE and Docker Probe, with the metrics exposed depicted in Table 2.
Table 2: Monitoring Probes Currently Available in Unicorn Probe Repository
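The reflection-based probe loading described above can be sketched roughly as follows. This is a minimal illustration and not the Unicorn probe loader itself: the Probe interface, the HeartbeatProbe class and the ProbeLoader class are hypothetical names introduced here, and fetching the probe binaries from the repository is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe contract; the real Monitoring Probe API may differ.
interface Probe {
    String name();
    void collect();
}

// Hypothetical probe that a deployment descriptor could reference by class name.
class HeartbeatProbe implements Probe {
    int collected = 0;
    public String name() { return "HeartbeatProbe"; }
    public void collect() { collected++; }
}

// Links and instantiates probes by class name at runtime, so the agent
// does not need to bundle every probe at build time.
class ProbeLoader {
    private final List<Probe> active = new ArrayList<>();

    Probe load(String className) {
        try {
            Class<?> type = Class.forName(className);
            if (!Probe.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(className + " is not a Probe");
            }
            Probe probe = (Probe) type.getDeclaredConstructor().newInstance();
            active.add(probe);
            return probe;
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing no-arg constructor and instantiation failures.
            throw new IllegalStateException("cannot load probe " + className, e);
        }
    }

    List<Probe> activeProbes() { return active; }
}
```

In this sketch a probe only needs a no-argument constructor to be loadable, which is one simple way to keep the loader independent of concrete probe types.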
Monitoring Probe   Metric                            Units    Type
JVM                CPU Load                          %        Double
JVM                Average Garbage Collecting Time   ms       Double
JVM                Initial Heap Space                KB       Long
JVM                Max Heap Space                    KB       Long
JVM                Current Heap Space Used           KB       Long
JVM                Initial Non Heap Space            KB       Long
JVM                Max Non Heap Space                KB       Long
JVM                Current Non Heap Space Used       KB       Long
JVM                Current Allocated Memory          %        Double
J2EE               Average Response Time             ms       Double
J2EE               Throughput                        ops/s    Double
Docker             CPU Load                          %        Double
Docker             Cgroup Periodicity                μs       Long
Docker             Cgroup Quota                      μs       Long
Docker             Number of Allocated Threads       #        Integer
Docker             CPU Cores                         #        Integer
Docker             CPU User Time                     ns       Long
Docker             CPU System Time                   ns       Long
Docker             Current Memory Utilization        %        Double
Docker             Current Memory Cache              MB       Long
Docker             Current Memory RSS                MB       Long
Docker             Total Allocated Memory            MB       Long
Docker             Container architecture            -        String
Docker             Container OS                      -        String
Docker             Container Boot Time               ns       Long
Docker             Ingress Packets per second        pckt/s   Long
Docker             Egress Packets per second         pckt/s   Long
Docker             Ingress KBytes per second         KB/s     Long
Docker             Egress KBytes per second          KB/s     Long
5 https://gitlab.com/unicorn-project/uCatascopia-Probe-Repo
Nonetheless, Developers are free to create their own Monitoring Probes and Metrics by adhering to the properties defined in the Monitoring Probe API, which provides a common API interface and abstractions hiding the complexity of the underlying Probe functionality. Figure 4 depicts the implementation of an ExampleProbe which includes the definition of two SimpleMetrics, denoted as Metric1 and Metric2, that periodically report random integer and double values respectively. In this figure we also observe that, for a user to develop a Monitoring Probe, she must only provide default values for the Probe periodicity and a name, a short description of the offered functionality and a concrete implementation of the collect() method which, as denoted by the name, defines how metric values are updated.
Figure 4: Example of a Newly Defined Monitoring Probe
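As the figure is not reproduced in this text, a probe of the kind it describes might look roughly like the sketch below. The SketchProbe base class, its method names and the 10-second default period are assumptions made for illustration and stand in for the actual Monitoring Probe API; only the probe name, the two metric names and the collect() method come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical base class standing in for the Monitoring Probe API.
abstract class SketchProbe {
    private final String name;
    private final long periodSeconds;
    private final Map<String, Number> latest = new LinkedHashMap<>();

    SketchProbe(String name, long periodSeconds) {
        this.name = name;
        this.periodSeconds = periodSeconds;
    }

    String getName() { return name; }
    long getPeriodSeconds() { return periodSeconds; }
    Map<String, Number> getLatestValues() { return latest; }

    // Stores the most recent value of a metric owned by this probe.
    protected void update(String metric, Number value) { latest.put(metric, value); }

    abstract String getDescription();

    // Defines how metric values are refreshed on every collection period.
    abstract void collect();
}

// Probe with two simple metrics reporting a random integer and a random double.
class ExampleProbe extends SketchProbe {
    private final Random random = new Random();

    ExampleProbe() { super("ExampleProbe", 10); } // assumed default period of 10 seconds

    String getDescription() { return "Reports a random integer and a random double"; }

    void collect() {
        update("Metric1", random.nextInt(100));
        update("Metric2", random.nextDouble());
    }
}
```

An agent would be expected to call collect() once per period and then disseminate the values returned by getLatestValues().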
To reduce the intrusiveness of the monitoring process, the load imposed on the monitoring service, data movement across multiple cloud sites and monitoring costs, which are both noticeable and billable in large-scale and distributed cloud environments, Unicorn Monitoring embraces adaptiveness of the monitoring process. In particular, Unicorn provides low-cost approximate and adaptive monitoring by adopting and extending the algorithmic process proposed in [3], [20] to be suitable for micro-service monitoring through the Adaptive Interface exposed by Unicorn Monitoring Agents. Specifically, users are free to enable adaptiveness of the monitoring process simply by setting the minimum acceptable accuracy, denoted as a confidence metric (e.g., 90%), for a metric stream during the configuration process. Having defined the acceptable accuracy, the respective Monitoring Agent will employ low-cost approximate and adaptive monitoring techniques to adapt at runtime both the rate at which metrics are collected and the rate at which they are disseminated to the Monitoring Service. This is achieved by utilizing a low-cost estimation model that captures at runtime the current evolution of the metric data stream; if stable phases in the evolution exist, the collection period is increased to computationally offload the Monitoring Agent. In turn, filtering is applied to reduce the metric dissemination when consecutive metric updates do not differ and the user-defined accuracy guarantees hold.
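To make the idea concrete, the sketch below shows one simple way such adaptation could work. It is not the algorithm of [3], [20]: it assumes a moving-average estimate, a doubling of the collection period while the estimate stays within the tolerated error, and a relative-change filter for dissemination. The AdaptiveSampler class and all of its parameters are hypothetical.

```java
// Illustrative adaptive sampler: widens the collection period in stable
// phases and suppresses updates that stay within the tolerated error.
class AdaptiveSampler {
    private final double tolerance;   // 1 - minimum acceptable accuracy, e.g. 0.10 for 90%
    private final long minPeriodMs;
    private final long maxPeriodMs;
    private long periodMs;
    private double estimate;
    private double lastSent;
    private boolean primed = false;

    AdaptiveSampler(double minAccuracy, long minPeriodMs, long maxPeriodMs) {
        this.tolerance = 1.0 - minAccuracy;
        this.minPeriodMs = minPeriodMs;
        this.maxPeriodMs = maxPeriodMs;
        this.periodMs = minPeriodMs;
    }

    // Feeds a newly collected value; returns true if it should be disseminated.
    boolean observe(double value) {
        if (!primed) {
            primed = true;
            estimate = value;
            lastSent = value;
            return true;
        }
        double estimationError = relativeError(value, estimate);
        estimate = 0.5 * estimate + 0.5 * value; // simple moving estimate of the stream
        if (estimationError <= tolerance) {
            periodMs = Math.min(maxPeriodMs, periodMs * 2); // stable phase: sample less often
        } else {
            periodMs = minPeriodMs;                         // volatile phase: sample at full rate
        }
        if (relativeError(value, lastSent) <= tolerance) {
            return false; // filtered: the consumer's last value is still accurate enough
        }
        lastSent = value;
        return true;
    }

    long currentPeriodMs() { return periodMs; }

    private static double relativeError(double actual, double reference) {
        double scale = Math.max(Math.abs(reference), 1e-9);
        return Math.abs(actual - reference) / scale;
    }
}
```

With a 90% accuracy setting, a stream hovering around 50 would gradually be sampled less often and rarely disseminated, while a jump to 100 would reset the period to its minimum and be sent immediately.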
The Monitoring and Analysis Service runs alongside the Unicorn Platform as part of the Runtime Enforcement Layer. The Monitoring Service is in charge of receiving, processing and storing monitoring data to the Monitoring Data Store. Monitoring is provided to Unicorn Users as a service (MaaS), thus removing from users the overhead of deploying and maintaining in-house monitoring infrastructure. This allows the monitoring process to be decoupled from cloud provider dependencies, so that monitoring is not disrupted and does not require a significant amount of configuration when a cloud service must span multiple availability zones and/or cloud sites. Although centrally accessible by multiple tenants, through either the Unicorn Dashboard or the Monitoring REST API, the Monitoring and Analysis Service internally receives, processes and stores monitoring data in a distributed fashion. Specifically, the Monitoring Service embraces in situ monitoring to horizontally and elastically scale to multiple instances, denoted as Monitoring Servers, which intercommunicate to monitor the health of the entire Monitoring Service and recover from network faults and/or unexpected downtime.
To ease feature development, testing and code releases by decomposing its functionality, the Monitoring and Analysis Service adheres to the micro-service architectural paradigm. The Monitoring Service is implemented with Java 8 and the Spring Boot framework (v2.0), which simplifies the bootstrapping, configuration and management of the embedded web service and the development of the data abstractions required for accessing and managing the Monitoring Data Store. The Unicorn Monitoring Data Store interface abstracts the implementation of the underlying storage backend, thus supporting flexibility in the selection of the backend implementation of choice. In turn, Spring Boot is utilized for serving the Monitoring Service REST API upon secure token-based authorization. The Monitoring Service REST API implements the CRUD operations for Monitoring Agent and Metric updates access, storage and management, with responses encoded in JSON when exchanged over the network and mapped to objects when received as requests, using Spring Boot (de-)serialization. Moreover, the Monitoring Service also embraces Spring Cloud for externalized configuration propagation upon Monitoring Server bootstrapping, which also allows for zero downtime and no recompilation when changes must be propagated to the entire tier comprising the Monitoring Service.
In regard to metric stream dissemination, the overlay communication plane established between Monitoring Agents and the Monitoring Service adheres to the open-source JCatascopia Communication Protocol [21], developed to increase autonomicity and fault-tolerance in elastic and distributed monitoring topologies while reducing network traffic and the communication overhead. In brief, this protocol uses a variation of the publish-and-subscribe message communication pattern, where the metric publisher (e.g., Monitoring Agent) initiates the subscription process with the metric consumer (e.g., Monitoring Service), instead of the other way around, as depicted in Figure 5. This significantly reduces the overhead in both establishing and decommissioning a dedicated monitoring stream between the publisher and consumer, by removing the need of an intermediate broker and directory service tracking the network location of both Monitoring Agents and Servers part of the Monitoring Service. To utilize this protocol to establish the communication plane between Monitoring Agents and the Monitoring Service in multi-cloud and containerized execution environments, we extend the protocol to support communication over HTTP asynchronous connections via the Monitoring REST API and adapt the metric stream interface to support event-based metric dissemination instead of periodic dissemination to support adaptive monitoring at the metric collection level.
Figure 5: Dynamic Monitoring Agent Discovery
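The publisher-initiated subscription can be illustrated with the in-process sketch below. It is not the JCatascopia protocol and involves no networking: the SketchMonitoringServer and SketchMonitoringAgent classes and their methods are hypothetical, and the only point shown is that the agent announces itself to a known server, so no broker or directory service is needed to locate agents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Consumer side: learns about agents only when they announce themselves.
class SketchMonitoringServer {
    private final Map<String, List<String>> streams = new HashMap<>();

    // Called by the publisher (agent) to open its own metric stream.
    void acceptSubscription(String agentId) {
        streams.computeIfAbsent(agentId, id -> new ArrayList<>());
    }

    void receive(String agentId, String metricUpdate) {
        List<String> stream = streams.get(agentId);
        if (stream == null) {
            throw new IllegalStateException("unknown agent " + agentId);
        }
        stream.add(metricUpdate);
    }

    // Tears down the stream when an agent is removed, e.g. after scale-in.
    void decommission(String agentId) { streams.remove(agentId); }

    int knownAgents() { return streams.size(); }

    List<String> updatesFrom(String agentId) { return streams.get(agentId); }
}

// Publisher side: initiates the subscription, then pushes metric updates.
class SketchMonitoringAgent {
    private final String id;
    private final SketchMonitoringServer server;
    private boolean connected = false;

    SketchMonitoringAgent(String id, SketchMonitoringServer server) {
        this.id = id;
        this.server = server;
    }

    void connect() {
        server.acceptSubscription(id);
        connected = true;
    }

    void publish(String metric, double value) {
        if (!connected) {
            throw new IllegalStateException("agent has not subscribed yet");
        }
        server.receive(id, metric + "=" + value);
    }
}
```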
Providing a Monitoring Service that is horizontally scalable in order to accommodate a dynamic and large number of both monitored applications and Monitoring Agents is only one part of monitoring scalability. To offer a completely scalable Monitoring Service, metric storage and extraction must be scalable as well, and also capable of handling a dynamic and high volume of traffic, since monitoring metrics are collected and disseminated at the rate of seconds. To accommodate this, the Monitoring Data Store utilizes CassandraDB, a distributed and scalable NoSQL database backend. The selection of CassandraDB was prompted by the need to support (i) fast writes for metric producers (Monitoring Agents), and (ii) fast reads on recent monitoring metrics that are requested from metric consumers. Specifically, the database schema developed for the Monitoring Data Store supports fast writes for Monitoring Agents with a stable time complexity. In regard to fast reads, we have developed a high-performance indexing scheme for monitoring data that supports stable time metric extraction for recent data, while range queries for a particular time-window are supported in linear time.
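The access pattern described above can be illustrated with an in-memory sketch. It is not the Unicorn schema and does not use Cassandra: the MetricStoreSketch class, the one-minute bucket size and the key layout are assumptions, chosen to mirror a common time-series layout in which samples are appended to a per-metric time bucket and the newest sample is kept separately for fast lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of time-bucketed metric storage.
class MetricStoreSketch {
    static final long BUCKET_MS = 60_000L; // assumed bucket width of one minute

    static final class Sample {
        final long timestamp;
        final double value;
        Sample(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final Map<String, List<Sample>> buckets = new HashMap<>();
    private final Map<String, Sample> latest = new HashMap<>();

    // Append-only write: one bucket append plus one update of the latest sample.
    void write(String metricId, long timestamp, double value) {
        Sample sample = new Sample(timestamp, value);
        buckets.computeIfAbsent(key(metricId, timestamp / BUCKET_MS), k -> new ArrayList<>())
               .add(sample);
        Sample current = latest.get(metricId);
        if (current == null || timestamp >= current.timestamp) {
            latest.put(metricId, sample);
        }
    }

    // Most recent sample of a metric, found with a single map lookup.
    Sample latest(String metricId) { return latest.get(metricId); }

    // Range query: cost grows linearly with the buckets and samples in the window.
    List<Sample> range(String metricId, long from, long to) {
        List<Sample> result = new ArrayList<>();
        for (long b = from / BUCKET_MS; b <= to / BUCKET_MS; b++) {
            List<Sample> bucket = buckets.get(key(metricId, b));
            if (bucket == null) {
                continue;
            }
            for (Sample s : bucket) {
                if (s.timestamp >= from && s.timestamp <= to) {
                    result.add(s);
                }
            }
        }
        return result;
    }

    private static String key(String metricId, long bucket) {
        return metricId + "#" + bucket;
    }
}
```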
The Analysis Service is responsible for aggregating and compiling at runtime high-level analytic insights from collected monitoring data based on compiled metric rules which adhere to the Metric Model, defined in D2.1. With the Metric Model, users are able to use the Unicorn Dashboard Monitoring View to compose queries that create new high-level metrics and aggregates. For example, one can compile a database overall throughput metric by aggregating both read and write operations per second: dbThroughput = readps + writeps. To reduce the metric extraction overhead even more and to provide real-time and push-based metric updates, the Analysis Service serves real-time streamed monitoring data through a high-performance queueing service to both interested entities and Unicorn components that subscribe via the Monitoring REST API to topics of interest. At this point, we note that further details on the Analysis Service will be documented in D5.2, with the Analysis Service implementation being part of the second prototype release of the Unicorn eco-system.
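The dbThroughput example can be sketched as a rule evaluated over the latest raw metric values. This is only an illustration of the idea of a composed metric: the MetricRuleSketch class and its methods are hypothetical and do not reflect the Metric Model syntax defined in D2.1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Keeps the latest raw metric values and evaluates named rules over them.
class MetricRuleSketch {
    private final Map<String, Double> raw = new HashMap<>();
    private final Map<String, ToDoubleFunction<Map<String, Double>>> rules = new HashMap<>();

    // Registers a high-level metric computed from raw metrics.
    void define(String name, ToDoubleFunction<Map<String, Double>> rule) {
        rules.put(name, rule);
    }

    // Records the most recent value of a raw metric.
    void update(String metric, double value) {
        raw.put(metric, value);
    }

    double evaluate(String name) {
        ToDoubleFunction<Map<String, Double>> rule = rules.get(name);
        if (rule == null) {
            throw new IllegalArgumentException("no rule named " + name);
        }
        return rule.applyAsDouble(raw);
    }
}
```

A rule for the example above would be registered as define("dbThroughput", m -> m.get("readps") + m.get("writeps")) and re-evaluated whenever either raw metric is updated.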
3.3 Interaction with other Unicorn Services and Components
The Monitoring and Analysis Service interacts with three Unicorn Components at the Platform and user level, while its exposed REST API allows third-party services and developers to access historic and real-time monitoring data, provided authorization is obtained.
In particular, the Monitoring and Analysis Service interacts with the following Unicorn Components:
• The Decision Making and Auto-Scaling Service: This component utilizes real-time and historic monitoring data to derive whether deployed applications and the underlying virtual and containerized infrastructure must expand or contract in order to meet current demand, achieve targeted performance and efficiently utilize provisioned resources. Historic monitoring data is accessed by this component via the Monitoring Service API, while real-time aggregated and processed data is fed by the Analysis Service to the Decision Making Service via a high-performance queueing schema in order to reduce communication overhead and expose real-time data in a timely manner.
• The Security Enforcement Service: This component utilizes the high-performance indexing scheme of the Distributed Monitoring Data Store, and consequently the Monitoring Service API, to store and access monitoring data. This data is obtained from the interception of cloud application network traffic, covering both communication with outside services and internal communication, in order to assess and report the risk of attacks that applications are exposed to and whether vulnerabilities exist in the configuration of the network.
• The Unicorn Dashboard: The Management Perspective of the Unicorn Dashboard visualizes graphically the real-time and historic monitoring data accessed through the Monitoring Service API that are of interest to Dashboard users, in order for them to understand the behaviour of their deployed applications and the performance of the underlying platform. In turn, along with metric graphs, Dashboard users also receive notifications of alerts set to report when certain conditions are violated (e.g., a particular metric exceeds a certain threshold). Moreover, through the Unicorn Dashboard, users can formulate (continuous) monitoring queries to access and trawl aggregated and processed monitoring data fed from the Analysis Service.
4 Decision Making and Auto-Scaling Service
In this Section, we document the reference architecture, exposed functionality and implementation details of the Decision Making and Auto-Scaling Service.
4.1 Requirements and Exposed Functionality
The following user roles, identified in D1.1 [22], are related to the Decision Making and Auto-Scaling Service.
• Cloud Application Owner: Follows an optimization strategy concerning the runtime execution of his/her cloud application that is aligned with the business aspects of the application, such as quality, performance, and cost.
• Cloud Application Administrator: Defines the elasticity policies required to realize the optimization strategy followed by the Cloud Application Owner. These policies can be defined both at runtime using the Service Graph and at design time using the Elasticity Library.
• Cloud Application Developer: Defines the elasticity policies at design-time via the Elasticity Library.
• Unicorn Developer: Designs the Elasticity Library and extends the interface of the Elasticity Enabler to support different reactive and proactive algorithms for optimizing the elasticity policies.
• Cloud Provider: Provides cloud offerings in the form of programmable infrastructure and hosts UNICORN-compliant cloud applications.
4.1.1 Functional Requirements
To satisfy Use-Case UC.9 (Adapt Deployed Cloud Applications in real time), documented in D1.2 [23], while adhering to system requirement FR.9 (Autonomic management of deployed cloud applications and real-time adaptation based on intelligent decision-making mechanisms), documented in D1.1 [22], the following functionality must be exposed by the Decision Making and Auto-Scaling Service.
ID FR.DM.1
Title Define and manage elasticity policies for cost, quality and performance optimization of a UNICORN-enabled application
Description The Decision Making & Auto-Scaling service should offer the ability to the Cloud Application Administrator & Developer to express an optimization strategy for the performance and quality of the application in response to application demand (workload), while also acknowledging any budget constraints. The elasticity policies should be defined and managed both at design-time and during runtime, with syntactic and semantic validation.
Exposed Functionality
The Monitoring & Elasticity Design Library offers the functionality to the user to express his/her scaling requirements via the use of Elasticity Policies. An Elasticity Policy contains a set of scaling configurations that are carried out when a scaling alert is issued. A scaling alert is triggered when a set of conditions is satisfied. The user can specify scaling alerts based on application insights (e.g., latency, throughput), resource runtime information (e.g., number of running services) and cost constraints (e.g., <20 credits/hour). Through the metric definition, the user can annotate whether a higher value of a custom metric is better or not, which is important for the Decision Making & Auto-Scaling service in the optimization process. The Elasticity policies can be created and managed via the Service Graph both at design-time and at runtime by the Unicorn Application Administrator. Also, during design-time the Unicorn Developer can create elasticity policies within the scope of a service using elasticity annotations from the Monitoring & Elasticity Library. Finally, the elasticity policies are validated both syntactically and semantically at design-time and during runtime by the Elasticity Validation module.
ID FR.DM.2
Title Autonomous runtime monitoring & enforcement of Elasticity policies
Description The Decision Making & Auto-Scaling service should constantly monitor the elasticity policies at runtime for any violations in an autonomous way. When an elasticity policy is triggered, the Auto-scaling service should perform the scaling action specified by the policy.
Exposed Functionality
The Decision Making & Auto-scaling service offers the functionality to autonomously monitor and react to scaling alerts specified by the elasticity policies, through the Elasticity Controller. The Elasticity Controller, as part of the MAPE-K loop (Monitor-Analyse-Plan-Execute), observes the application behaviour at regular time intervals through the application and infrastructure metrics specified in the elasticity conditions. When the conditions of an elasticity policy are satisfied, the Auto-scaling service provides the Resource Manager with the scaling re-configurations specified by the elasticity action of the policy.
ID FR.DM.3
Title Resource-Aware & Transparent Multi-Cloud Elasticity Control
Description The Decision Making & Auto-Scaling service should provide a transparent scaling mechanism over multiple public cloud providers and private deployments. This means that there shouldn't be any additional effort and configuration from the user-side for scaling an application to multiple cloud providers. Also, the different offerings, price schemes and resource heterogeneity of subscribed cloud providers should be acknowledged by the Decision Making & Auto-Scaling service for optimizing the cost, quality and performance of an application.
Exposed Functionality
The Monitoring & Elasticity Design Library offers the functionality to the Cloud Application Administrator & Developer to specify elasticity policies that concern different cloud providers, zones or regions, by using the notion of a cluster. A cluster is a group of resources in the same network; therefore, a service can be placed in multiple clusters that can be located in different geographical locations or zones. Infrastructural resources are transparent to the user as they are handled by the Decision Making & Auto-Scaling service.
ID FR.DM.4
Title Continuous assessment of elasticity policies and adaptation
Description In order to provide optimal or near-optimal scaling decisions, the Decision Making & Auto-Scaling service should constantly assess the effectiveness of scaling configurations. Specifically, during the application runtime, it should evaluate the effects of the elasticity policies on the performance, quality and cost of the application and adjust them accordingly.
Exposed Functionality
The Decision Making & Auto-Scaling service offers the functionality to assess the effectiveness of the elasticity policies through the Elasticity Enabler component. This component, based on historical monitoring data, profiles the application and constructs new or modifies existing elasticity policies to adapt to any changes in the application behaviour or workload, in order to improve the elasticity control. This modification includes the adjustment of parameters such as cooldown and warmup periods, thresholds and time-windows. The Elasticity Enabler component is highly extensible, allowing the adoption of different reactive and proactive algorithms by the Unicorn Developer for improving elasticity control.
4.1.2 Non-Functional Requirements
ID NFR.DM.1
Title Robustness
Description The Decision Making & Auto-Scaling service must cope with any potential errors from unexpected inputs and faults during the execution, while also continuing to work as usual after an interruption from unexpected crashes by restoring its last valid state.
ID NFR.DM.2
Title High-Availability
Description The Decision Making & Auto-Scaling service should remain highly available in cases of increased load, while also minimizing downtime by adding redundancy to critical components to avoid single points of failure.
ID NFR.DM.3
Title Near Real-time Adaptability
Description The Decision Making & Auto-Scaling service must be able to adapt itself to the demands of the workload and the requirements of the various optimization algorithms without degrading its performance, by adding or removing the necessary components to remain functional and produce scaling decisions in real-time or at least near real-time.
4.2 Reference Architecture and Implementation
To address the preceding challenges and support the documented requirements, Unicorn introduces the Decision Making & Auto-Scaling service for adapting applications deployed on multiple cloud providers based on user optimization strategies. A high-level reference architecture diagram of the Decision Making & Auto-scaling service is depicted in Figure 6. The component is part of the UNICORN platform and communicates with the Monitoring and Analysis service and the Resource Manager.
The communication with the Monitoring and Analysis service is bidirectional, as the Decision Making & Auto-Scaling service subscribes to relevant metric streams and also fetches historical data to support the decision-making process of the Elasticity Controller. The communication with the Resource Manager is also bidirectional. The inbound interface is used for collecting runtime information (e.g., running instances) and the elasticity capabilities. These capabilities denote the available scaling actions with their associated cost information, provided by the cloud offerings (e.g., add vm-small $0.001/hour). The outbound interface of the Decision Making & Auto-Scaling service is used to provide the Resource Manager with the scaling actions for an application that are issued by the Elasticity Controller.
The Decision Making & Auto-scaling service is composed of two main components. The first component is the Elasticity Manager, which is responsible for storing, fetching and modifying elasticity policies for multiple deployments. The second component is the Elasticity Controller, which is instantiated when an application is deployed and whose purpose is to enforce the elasticity policies of the application, retrieved from the Elasticity Manager.
Figure 6: Decision Making & Auto-scaling Service Reference Architecture
The Elasticity Policy is an important concept of the Decision Making & Auto-Scaling service; to make it easier to understand, an example of such a policy is provided in Figure 7. In this example, the policy specifies that one more service of type svc-s (streaming service) should be added when i) there is a sufficient increase (10 requests/sec in the last 5 minutes) in the number of requests in the European region, ii) the aggregated cost of these services does not exceed the budget constraint (1.5 credits/hour), and iii) there are fewer than 10 services running. This high-level policy follows the IF-THEN-ACTION rule-based approach, is well-defined in EBNF [24] and is described thoroughly in Deliverable 2.1 [25].
Figure 7: Elasticity Policy Example
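As the figure is not reproduced in this text, the example policy can be approximated in code as a set of conditions that must all hold before the action is issued. This sketch does not use the Unicorn Elasticity Language defined in D2.1: the ElasticityPolicySketch class, the metric names in the snapshot map and the action string are hypothetical, and only the thresholds come from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// IF all conditions hold on the current metric snapshot THEN issue the action.
class ElasticityPolicySketch {
    private final String action;
    private final List<Predicate<Map<String, Double>>> conditions = new ArrayList<>();

    ElasticityPolicySketch(String action) {
        this.action = action;
    }

    ElasticityPolicySketch when(Predicate<Map<String, Double>> condition) {
        conditions.add(condition);
        return this;
    }

    boolean isTriggered(Map<String, Double> snapshot) {
        for (Predicate<Map<String, Double>> condition : conditions) {
            if (!condition.test(snapshot)) {
                return false;
            }
        }
        return true;
    }

    String action() { return action; }

    // The streaming-service example, with assumed metric names.
    static ElasticityPolicySketch streamingExample() {
        return new ElasticityPolicySketch("add 1 svc-s in Europe")
            .when(m -> m.get("requestIncreasePerSecLast5Min") >= 10.0) // sufficient increase
            .when(m -> m.get("costCreditsPerHour") <= 1.5)             // budget constraint
            .when(m -> m.get("runningServices") < 10.0);               // fewer than 10 services
    }
}
```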
The Decision Making & Auto-scaling Service obtains the Elasticity Policy, performs the necessary validations and enforces the specified actions at runtime. Figure 8 shows all the possible states of an Elasticity Policy. When loaded, the policy remains inactive until it is validated. If it is successfully validated, it can be activated for runtime enforcement. When the conditions of the policy are all satisfied, the policy is triggered and remains in this state until the action is completed (e.g., a service has been added to eu-zone-1), and finally goes back to an active state. The policy can be modified or deleted at runtime only when it is in an inactive state.
Figure 8: Elasticity Policy State Diagram
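The lifecycle described above can be sketched as a small state machine. The state diagram itself is not reproduced here, so the sketch relies on assumptions: it models three states plus a validation flag, adds an explicit deactivate() transition (implied by the rule that modification requires an inactive policy), and assumes that a modified policy must be validated again. The PolicyLifecycle class is hypothetical.

```java
// Minimal state machine for the lifecycle of an elasticity policy.
class PolicyLifecycle {
    enum State { INACTIVE, ACTIVE, TRIGGERED }

    private State state = State.INACTIVE;
    private boolean validated = false;

    State state() { return state; }

    // A loaded policy stays inactive until validation succeeds.
    void markValidated() {
        require(State.INACTIVE);
        validated = true;
    }

    void activate() {
        require(State.INACTIVE);
        if (!validated) {
            throw new IllegalStateException("policy has not been validated");
        }
        state = State.ACTIVE;
    }

    // All conditions of the policy are satisfied.
    void trigger() {
        require(State.ACTIVE);
        state = State.TRIGGERED;
    }

    // The scaling action has finished, so the policy is enforced again.
    void actionCompleted() {
        require(State.TRIGGERED);
        state = State.ACTIVE;
    }

    // Assumed transition that allows later modification or deletion.
    void deactivate() {
        require(State.ACTIVE);
        state = State.INACTIVE;
    }

    // Modification is only allowed while inactive; assumed to require re-validation.
    void modify() {
        require(State.INACTIVE);
        validated = false;
    }

    private void require(State expected) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
    }
}
```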
4.2.1 ElasticityManagerThereferencearchitectureoftheElasticityManagercomponentispresentedinFigure9.Thiscomponentallowsthe UNICORN Policies Manager to create, modify and remove elasticity policies of a UNICORN-enabledapplication.ThisisachievedthroughitsexposedAPI,thatisextensivelydescribedinSection7.2.
Figure 9: Elasticity Manager
When an elasticity policy is created or modified, it goes through the Validation module. This module is responsible for analysing and validating the elasticity policy both syntactically and semantically. This is performed using the formal schema definition of UNICORN's Elasticity Language, described in Deliverable 2.1 [25]. If the validation fails, a descriptive error message is returned to the Unicorn Policies Manager. When the policy is successfully validated, it is stored in the Policies Repository of the Elasticity Manager. The Policies Repository stores the elasticity policies along with useful information about their state (e.g., active, triggered, etc.). It is implemented using Couchbase Server 5.1.0 Community Edition [26], an open source NoSQL database that offers high availability and scaling capabilities.
The second component that communicates with the Elasticity Manager is the Elasticity Controller. The Elasticity Manager allows the Elasticity Controller to retrieve the elasticity policies of its deployment, and to create new policies or modify existing ones. When elasticity policies are created or modified after the initial deployment of the application, the Elasticity Controller of the particular deployment retrieves the new or modified policies. This is achieved by subscribing to the Elasticity Manager through its API.
The Elasticity Manager is implemented in Java 8 and uses the Spring Boot 2.0 [27] framework for implementing the web server. Spring Boot is a micro-framework that simplifies the bootstrapping and development of Spring web applications and supports embeddable servers such as Tomcat [28] or Jetty [29]. The Elasticity Manager follows the microservices architectural pattern, exposing a secure token-based RESTful API. The API implements the common CRUD operations for an Elasticity Policy through the HTTP methods (GET, POST, etc.). An instance of an Elasticity Policy is encoded in its JSON representation when it is exchanged over the network and is mapped back to Java classes using Spring Boot's serialization and de-serialization functionality. The syntax validation of an Elasticity Policy is performed during de-serialization. If de-serialization fails, the corresponding error message is returned by the Validation module. The semantic validation is performed using the schema definition of UNICORN's Elasticity Language. It should be noted that in the final release the schema will be enriched and will probably change; the Validation module is therefore developed in such a way as to support multiple versions of the schema.
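The two-stage validation and the support for multiple schema versions can be illustrated with a short sketch. It is written in Python for brevity rather than in the Java/Spring Boot stack named above, and everything in it is hypothetical: the `schemaVersion` field, the required-field sets and the error wording are assumptions, since the actual schema is defined in Deliverable 2.1 [25].

```python
import json

# Hypothetical required fields per schema version (illustration only).
REQUIRED_FIELDS = {
    "1": {"name", "conditions", "action"},
    "2": {"name", "conditions", "action", "priority"},
}


def validate_policy(raw: str) -> dict:
    """Parse a JSON policy (syntax check), then check it against its schema version."""
    try:
        policy = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"syntax error: {exc.msg}") from exc
    if not isinstance(policy, dict):
        raise ValueError("syntax error: policy must be a JSON object")

    # Pick the rule set that matches the version declared by the policy.
    version = str(policy.get("schemaVersion", "1"))
    required = REQUIRED_FIELDS.get(version)
    if required is None:
        raise ValueError(f"unsupported schema version: {version}")

    missing = sorted(required - set(policy))
    if missing:
        raise ValueError(f"semantic error: missing fields {missing}")
    return policy
```

Keeping the per-version rules in a lookup table means a new schema version can be added without touching the validation flow.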
4.2.2 Elasticity Controller
The Elasticity Controller reference architecture is depicted in Figure 10. This component contains the decision-making process for the scaling capabilities and adaptation of a UNICORN-enabled cloud application. In the diagram, the green arrows show the components that are directly related to the flow of an elasticity policy. The blue arrows show the components that are part of the MAPE-K control loop, describing how a policy is enforced, and the red arrows show important configuration actions between the components.
Figure 10: Elasticity Controller Reference Architecture
Concerning the implementation, the derived architecture is implemented following the microservices approach, with the main sub-components developed as independent microservices. The following paragraphs describe the functionality of each component and give details of the early implementation. For clarity, the description follows the flow of an elasticity policy (green arrows), from its creation until its enforcement.
Policy Loader
When an application is deployed, the Elasticity Controller is instantiated. The first step is to load the elasticity policies of its current deployment. The Policy Loader component is responsible for fetching and loading the policies from the Elasticity Manager through its API. The Unicorn Application Administrator can create new policies or modify existing ones during runtime; the Policy Loader therefore uses a pub/sub subscription mechanism to communicate with the Elasticity Manager and retrieve any updates to the policies.
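The subscription mechanism can be sketched with an in-memory broker. This is only an illustration of the pub/sub pattern: `PolicyBroker` stands in for the Elasticity Manager's subscription endpoint, and all class, method and field names are hypothetical.

```python
from typing import Callable, Dict, List


class PolicyBroker:
    """In-memory stand-in for the Elasticity Manager's subscription endpoint."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, app_id: str, callback: Callable[[dict], None]) -> None:
        self._subscribers.setdefault(app_id, []).append(callback)

    def publish(self, app_id: str, policy: dict) -> None:
        # Deliver the new or modified policy to every subscriber of this deployment.
        for callback in self._subscribers.get(app_id, []):
            callback(policy)


class PolicyLoader:
    """Keeps the latest version of each policy for one deployment."""

    def __init__(self, broker: PolicyBroker, app_id: str) -> None:
        self.policies: Dict[str, dict] = {}
        broker.subscribe(app_id, self._on_update)

    def _on_update(self, policy: dict) -> None:
        # A later message with the same name replaces the earlier policy.
        self.policies[policy["name"]] = policy
```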
Runtime Validation
The next task after loading the policy is to perform a runtime validation that checks whether the metrics, services and scaling actions referenced by the policy are available. This is the responsibility of the Runtime Validation component. The Monitoring and Analysis service is used to verify the existence of a metric (analytic insight) that is specified in the elasticity conditions of the policy, while the Resource Manager component is used for retrieving the elasticity capabilities to check whether the specified scaling actions, types of services and resources are available. If the validation fails because references cannot be found, an error message is propagated back to the Elasticity Manager and the policy becomes inactive.
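The referential check described above amounts to looking up each reference of the policy in the sets reported by the other services. A minimal sketch follows, assuming a simplified policy layout; the dictionary keys and the `runtime_validate` helper are hypothetical and do not reflect the actual UNICORN data model.

```python
from typing import Dict, List, Set


def runtime_validate(policy: dict,
                     available_metrics: Set[str],
                     capabilities: Dict[str, Set[str]]) -> List[str]:
    """Return the unresolved references; an empty list means the policy passes."""
    errors = []
    # Metrics would be confirmed by the Monitoring and Analysis service.
    for metric in policy["metrics"]:
        if metric not in available_metrics:
            errors.append(f"unknown metric: {metric}")
    # Service types and scaling actions would be reported by the Resource Manager.
    if policy["service_type"] not in capabilities["service_types"]:
        errors.append(f"unknown service type: {policy['service_type']}")
    if policy["action"] not in capabilities["actions"]:
        errors.append(f"unsupported action: {policy['action']}")
    return errors
```

A non-empty result corresponds to the failure case in which the error is propagated back and the policy becomes inactive.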
Elasticity Enabler
After the runtime validation, the elasticity policy goes through the Elasticity Enabler. The first task of this component is to activate the necessary runtime information listeners to obtain the real-time information specified by the elasticity policy by creating i) metric streams between the Analysis Service and the Elasticity Runtime Information component and ii) runtime information streams between the Resource Manager and the Elasticity Runtime Information component. The main functionalities of the Elasticity Enabler are i) to assess the performance and effectiveness of the elasticity policies, ii) to create new policies and iii) to modify existing ones, so as to continuously improve the elasticity control. For this purpose, it provides a generic interface for the implementation of several semi-supervised decision algorithms by the UNICORN developer, based on historical monitoring data of the application and optimization strategies.
Policy Translator
From the Elasticity Enabler, the elasticity policy is forwarded to the Policy Translator component. This module takes an elasticity policy as input and performs the necessary transformation to produce rules that are fed to the Expert System component. In the current implementation, this component translates the policies to the Drools Rule Language (DRL) [30].
Expert System
The Expert System is responsible for the enforcement of the scaling rules. These rules are stored in the Production Memory. A rule has two parts: the precondition (IF) and the action (THEN). The Inference Engine matches the facts from the Working Memory against the production rules to infer conclusions, which result in scaling actions that are sent to the Resource Manager. When multiple rules are simultaneously satisfied, a conflict resolution strategy (e.g., based on priority) is used to manage the execution order of the conflicting rules. For the implementation of this component, UNICORN uses the Drools Business Rules Engine [31] for policy enforcement, which supports reasoning and conflict resolution over the provided set of facts and rules, as well as triggering of the appropriate actions.
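The matching and conflict resolution steps can be illustrated with a toy rule engine. This sketch does not use Drools and does not reproduce its algorithm; it only shows rules with an IF part and a THEN part being matched against facts and ordered by priority. The `Rule` type, the fact names and the actions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]  # stand-in for the Working Memory


@dataclass
class Rule:
    name: str
    priority: int                       # higher value fires first
    condition: Callable[[Facts], bool]  # precondition (IF)
    action: str                         # scaling action (THEN)


def fire_rules(rules: List[Rule], facts: Facts) -> List[str]:
    """Match every rule against the facts, then order the matches by priority."""
    agenda = [rule for rule in rules if rule.condition(facts)]
    agenda.sort(key=lambda rule: rule.priority, reverse=True)
    return [rule.action for rule in agenda]


RULES = [
    Rule("scale_out", 10, lambda f: f["requests_per_sec"] > 100, "add one svc-s"),
    Rule("cap_cost", 20, lambda f: f["cost_per_hour"] > 1.5, "remove one svc-s"),
]
```

When both rules match, the cost rule is listed first because of its higher priority, which is one possible priority-based resolution strategy.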
Elasticity Runtime Information
The Elasticity Runtime Information (ERI) component exposes two inbound interfaces for collecting real-time information from the Resource Manager and the Analysis service. The Resource Manager feeds the ERI component with runtime information about the services and underlying infrastructure resources, while the Monitoring & Analysis service feeds the ERI component with aggregated metric information (analytic insights) about services and underlying infrastructure resources. The runtime metrics and information are translated to facts and fed to the Working Memory of the Expert System. In addition, the runtime metrics and information are fed to the Elasticity Enabler for analysis purposes and policy assessment.
4.3 Interaction with other Unicorn Services and Components
The Decision-Making and Auto-scaling service interacts with the following UNICORN Platform components:
• Monitoring & Analytics service: This service is used mainly for retrieving real-time monitoring data (analytic insights) via its high-performance streaming queue and for collecting aggregated historic monitoring data of the application and the underlying virtual and containerized infrastructure. It is also used in the elasticity policy validation process to check the referential integrity of the elasticity metrics.
• Resource Manager: Used for collecting the available elasticity capabilities in order to validate the elasticity policies, and for analysing the offerings and resource heterogeneity of subscribed cloud providers in order to optimize the cost, quality and performance of an application. It is also used to retrieve the runtime information of the application's services and resources (e.g., number of service instances).
5 Conclusions
The scope of this deliverable was to provide a comprehensive overview and documentation report of the early release of the Unicorn Governance Mechanisms, which contribute to the monitoring and management of the runtime aspects of the underlying multi-cloud execution environments targeted by the Unicorn Platform. In particular, D5.1 gives a clear overview of the early design and development of the three components comprising the Unicorn Governance Mechanisms that are developed under the umbrella of WP5: (i) the Monitoring and Analysis Service; (ii) the Execution Environment Agent⁶; and (iii) the Decision-Making and Auto-Scaling Service. The Monitoring and Analysis Service and the Execution Environment Agent represent WP5 Tasks 4.1 and 4.2 respectively, and started as planned on Month 7 of the Unicorn Project with no deviations or alterations to the documented work plan. In turn, the Decision-Making and Auto-Scaling Service represents WP5 Task 4.3, which started as planned on Month 10 with no deviations or alterations to the documented work plan. For each of these components we have documented the requirements that must be satisfied to overcome the challenges introduced when monitoring, managing and scaling multi-cloud containerized execution environments, their functionalities, how they operate, and the API used to interact with other Unicorn components, users and/or third-party services.
Specifically, for the Monitoring and Analysis service, six functional requirements and three non-functional requirements were identified. These requirements, along with their exposed functionality, led to the development of a complete, interoperable, scalable and real-time cloud monitoring stack for automating the monitoring of cloud applications deployed through containerized execution environments. In the first period of the project, the focus was on: (i) designing and implementing the foundational mechanisms of the Monitoring Agent for non-intrusive metric extraction in immutable containerized environments deployed on heterogeneous multi-cloud offerings; and (ii) designing and implementing the first version of the Monitoring and Analysis Service, with emphasis on creating the metric model and data abstractions for effective and scalable metric storage and extraction. As the project progresses, effort will be devoted to: (i) increasing the number of Monitoring Probes available in the Unicorn Probe Repository; (ii) implementing the scalable Analysis Service for metric compilation, aggregation and grouping, which will serve real-time metric updates through subscription topics extracted from a high-performance queueing service; and (iii) introducing auto-configuration to the Monitoring Service tier to allow feature introduction and reconfiguration through the Unicorn CI/CD cycle without downtime.
For the Decision Making and Auto-Scaling Service, the exposed functionality of four functional requirements and three non-functional requirements led to the development of a language and a runtime environment that is capable of adapting a cloud application based on semi-supervised decision algorithms for the optimal placement of virtual machines and containers across multiple availability zones and/or cloud sites, while accounting for the heterogeneity among cloud providers and their capabilities and adhering to high-level policy constraints defined by the Application Administrator. In the first period of the project, the focus was on: (i) creating the necessary model and data abstractions to handle the functionality exposed by the scaling mechanisms based on the user's optimization strategies expressed via the elasticity policies; and (ii) designing and implementing the first version of the Decision-Making & Auto-Scaling Service, with emphasis on deriving an architecture that is highly extensible, allowing the development of algorithms for improving the elasticity control while also allowing existing state-of-the-art reactive scaling mechanisms to benefit from these techniques. As the project progresses, effort will be devoted to: (i) improving the Decision-Making process by developing optimization algorithms (reactive and proactive) for virtualized and containerized resource placement based on the monitored behaviour of the cloud application and the user's policies; and (ii) enriching the Elasticity Library with pre-defined optimization strategies for improving quality, cost and performance, avoiding the process of defining and finding optimal low-level scaling policies for the application.
⁶ Denoted throughout the document as the Monitoring Agent.
Finally, the forthcoming Deliverable 5.2 – Unicorn Governance Mechanisms will document the final version of the Monitoring and Analysis Service and the Decision-Making and Auto-Scaling Service. That work will assess the accomplishment of the requirements, features and toolsets introduced in this deliverable and will provide the final documentation of the Unicorn Governance Mechanisms.
6 References
[1] Open Container Specification - Linux Foundation, https://www.opencontainers.org/about
[2] D. Trihinas, Z. Georgiou, G. Pallis, and M. D. Dikaiakos, "Improving Rule-Based Elasticity Control by Adapting the Sensitivity of the Auto-Scaling Decision Timeframe," in Third International Workshop on Algorithmic Aspects of Cloud Computing (ALGOCLOUD 2017), in conjunction with the ALGO 2017 Conference, 2017.
[3] D. Trihinas, G. Pallis, and M. Dikaiakos, "ADMin: Adaptive Monitoring Dissemination for the Internet of Things," in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications (INFOCOM 2017), 2017.
[4] R. Morabito, V. Cozzolino, A. Y. Ding, N. Beijar, and J. Ott, "Consolidate IoT Edge Computing with Lightweight Virtualization," IEEE Netw., vol. 32, no. 1, pp. 102–111, Jan. 2018.
[5] Docker Stats, https://docs.docker.com/engine/reference/commandline/stats/
[6] cAdvisor, https://github.com/google/cadvisor
[7] J. M. Alcaraz Calero and J. Gutierrez Aguado, "MonPaaS: An Adaptive Monitoring Platform as a Service for Cloud Computing Infrastructures and Services," IEEE Trans. Serv. Comput., pp. 1–1, 2014.
[8] Tower4Clouds Multi-Cloud Monitoring, http://deib-polimi.github.io/tower4clouds/
[9] D. Trihinas, G. Pallis, and M. D. Dikaiakos, "Monitoring Elastically Adaptive Multi-Cloud Services," IEEE Trans. Cloud Comput., vol. 4, 2016.
[10] M. R. López and J. Spillner, "Towards Quantifiable Boundaries for Elastic Horizontal Scaling of Microservices," in Companion Proceedings of the 10th International Conference on Utility and Cloud Computing, 2017, pp. 35–40.
[11] J. Thones, "Microservices," IEEE Softw., vol. 32, no. 1, p. 116, Jan. 2015.
[12] AppDynamics, https://www.appdynamics.com/
[13] DataDog, https://www.datadoghq.com
[14] S. Meng and L. Liu, "Enhanced monitoring-as-a-service for effective cloud management," IEEE Trans. Comput., vol. 62, no. 9, pp. 1705–1720, 2013.
[15] Amazon's Auto Scaling, https://aws.amazon.com/ec2/autoscaling
[16] Microsoft Azure Auto Scaling, https://azure.microsoft.com/en-us/features/autoscale/
[17] Google Cloud Autoscaler, https://cloud.google.com/compute/docs/autoscaler/
[18] AWS Target-tracking scaling, https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-target-tracking.html
[19] J. Thalheim, A. Rodrigues, I. E. Akkus, P. Bhatotia, R. Chen, B. Viswanath, L. Jiao, and C. Fetzer, "Sieve: Actionable Insights from Monitored Metrics in Microservices," 2017.
[20] D. Trihinas, G. Pallis, and M. D. Dikaiakos, "AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices," in IEEE International Conference on Big Data, 2015.
[21] D. Trihinas, G. Pallis, and M. D. Dikaiakos, "Monitoring Elastically Adaptive Multi-Cloud Services," IEEE Trans. Cloud Comput., vol. 4, no. X, pp. 1–14, 2016.
[22] Unicorn, "Unicorn Deliverable D1.1 Stakeholders Requirements Analysis," 2017.
[23] Unicorn, "Unicorn Reference Architecture Deliverable 1.2," 2017.
[24] EBNF, https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
[25] Unicorn, "Deliverable 2.1 - UNICORN Libraries, IDE Plugin, Container Packaging and Deployment Toolset Early Release," 2018.
[26] Couchbase, https://www.couchbase.com/
[27] Spring Boot, https://projects.spring.io/spring-boot/
[28] Apache Tomcat, http://tomcat.apache.org/
[29] Jetty, https://www.eclipse.org/jetty/
[30] Drools Rule Language, https://docs.jboss.org/drools/release/5.2.0.Final/drools-expert-docs/html/ch05.html
[31] Drools Business Rules Engine, https://www.drools.org/
7 Appendix
7.1 Monitoring and Analysis Service API Documentation
The Unicorn Monitoring API is a REST API, documented in what follows, that comprises the following resources:
• Monitoring API Keys
• Monitored Applications
• Monitoring Agents comprising a Monitored Application
• Monitoring Metrics collected by a Monitoring Agent
To reduce repetition, we note that ALL requests must be issued with the Monitoring API key authorization header. If this header is missing or invalid, an UNAUTHORIZED response is returned. To obtain a Monitoring API Key for the first time, the user must perform the request through the Unicorn Dashboard.
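As an illustration of attaching the API key to every request, the sketch below builds (but does not send) a request with Python's standard library. The header name `X-Api-Key` and the base URL are assumptions made for the example; the deliverable does not state the actual header name or host.

```python
from urllib.request import Request

BASE_URL = "https://monitoring.example.org"  # placeholder host (assumption)
API_KEY_HEADER = "X-Api-Key"                 # hypothetical header name (assumption)


def build_request(method: str, path: str, api_key: str) -> Request:
    """Build a Monitoring API request that carries the API key header."""
    req = Request(BASE_URL + path, method=method)
    req.add_header(API_KEY_HEADER, api_key)
    req.add_header("Accept", "application/json")
    return req


# urllib stores header names in capitalized form, e.g. "X-api-key".
request = build_request("GET", "/apps", "27ff56dac859400bac07e30f50f1f0d0")
```

Passing `request` to `urllib.request.urlopen` would send it; a missing or invalid key would then be expected to yield the UNAUTHORIZED response described above.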
7.1.1 Monitoring API Keys
Table 3: CREATE Monitoring API Key Context
Description: CREATE API Key which provides authorized access to Unicorn Monitoring
URI: /apikey
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apikey
Request Body: { "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "eu" }
Response Body: { "apiKey": "27ff56dac859400bac07e30f50f1f0d0" }
Response Code: 201 – CREATED
Table 4: DELETE Monitoring API Key Context
Description: REVOKE API Key which provides authorized access to Unicorn Monitoring
URI: /apikey/{apikey}
Method: DELETE
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: DELETE /apikey/27ff56dac859400bac07e30f50f1f0d0
Request Body: -
Response Body: -
Response Code: 200 – OK
Table 5: GET Monitoring API Key Context
Description: GET the user metadata associated with an API Key
URI: /apikey/{apikey}
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apikey/27ff56dac859400bac07e30f50f1f0d0
Request Body: -
Response Body: { "created_at": "1521458234", "last_modified": "1521458234", "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "eu" }
Response Code: 200 – OK
Table 6: UPDATE Monitoring API Key Context
Description: UPDATE the user metadata associated with an API Key
URI: /apikey/{apikey}
Method: PUT
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: PUT /apikey/27ff56dac859400bac07e30f50f1f0d0
Request Body: { "availability_zone": "us-east" }
Response Body: { "created_at": "1521458234", "last_modified": "15214598351", "userID": "123456789", "username": "dtrihinas", "country": "Cyprus", "availability_zone": "us-east" }
Response Code: 200 – OK
7.1.2 Monitored Applications
Table 7: CREATE Monitored Application Context
Description: CREATE Monitored Application context
URI: /apps
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apps
Request Body: { "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ] }
Response Body: { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "created_at": "1521459234", "last_modified": "1521459234" }
Response Code: 201 – CREATED
Table 8: GET Monitored Application Contexts Associated with Monitoring Key
Description: GET Monitored Application contexts associated with the provided Monitoring Key
URI: /apps
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/
Request Body: -
Response Body: [ { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }, { . . . } ]
Response Code: 200 – OK
Table 9: GET Monitored Application Context
Description: GET Monitored Application context
URI: /apps/{appID}
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body: -
Response Body: { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521459234" }
Response Code: 200 – OK
Table 10: DELETE Monitored Application Context
Description: DELETE Monitored Application context
URI: /apps/{appID}
Method: DELETE
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body: -
Response Body: -
Response Code: 200 – OK
Table 11: UPDATE Monitored Application Context
Description: UPDATE Monitored Application context
URI: /apps/{appID}
Method: PUT
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: PUT /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body: { "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ] }
Response Body: { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "name": "my-new-appname-1", "availability_zone": "us-west", "tags": [ "spring-boot", "mysqldb", "java" ], "providers": [ "aws", "google-ce" ], "created_at": "1521459234", "last_modified": "1521467357" }
Response Code: 200 – OK
7.1.3 Monitoring Agents
Table 12: CREATE Monitoring Agent Context
Description: CREATE Monitoring Agent context
URI: /apps/{appID}/agents
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apps/95a01dfe4667414f9336b7d7495cc7a7/agents
Request Body: { "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "name": "cpu", "unit": "%", "type": "double" }, { "name": "memory", "unit": "%", "type": "double" } ] }
Response Body: { "agentID": "70910177071c4bc68419fb63e72b7cbc", "created_at": "1521459234", "last_modified": "1521459234" }
Response Code: 201 – CREATED
Table 13: GET Monitoring Agent Contexts Associated with Monitored Application
Description: GET Monitoring Agent contexts associated with a Monitored Application
URI: /apps/{appID}/agents
Method: GET
Parameters: status={UP, DOWN, TERMINATED}, tag
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents?status=UP
Request Body: -
Response Body: [
{ "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" } ], "created_at": "1521459234", "last_modified": "1521459234", "status": "UP" },
{ "agentID": "b8a8804c4a927458da5e5570707f9b54c2a3", "name": "my-agent-42", "host": "aws-m3-small-linux-1", "tags": [ "data-miner", "python3" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:throughput", "name": "throughput", "unit": "ops/s", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" } ], "created_at": "15218321343", "last_modified": "15218362356", "status": "UP" },
{ . . . } ]
Response Code: 200 – OK
Table 14: GET Monitoring Agent Context
Description: GET Monitoring Agent context
URI: /apps/{appID}/agents/{agentID}
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc
Request Body: -
Response Body: { "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "created_at": "1521459234", "last_modified": "1521459234", "status": "UP" }
Response Code: 200 – OK
Table 15: DELETE Monitoring Agent Context
Description: DELETE Monitoring Agent context
URI: /apps/{appID}/agents/{agentID}
Method: DELETE
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc
Request Body: -
Response Body: -
Response Code: 200 – OK
Table 16: UPDATE Monitoring Agent Context
Description: UPDATE Monitoring Agent context
URI: /apps/{appID}/agents/{agentID}
Method: PUT
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: PUT /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc
Request Body: { "metrics": [ { "name": "ingress_pcts", "unit": "#", "type": "long" } ] }
Response Body: { "agentID": "70910177071c4bc68419fb63e72b7cbc", "name": "my-agent-1", "host": "docker-engine-1", "tags": [ "item-catalog", "java8", "spring-boot" ], "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long" } ], "created_at": "1521459234", "last_modified": "1521673128", "status": "UP" }
Response Code: 200 – OK
7.1.4 Monitoring Metrics
Table 17: CREATE Monitoring Metric Value Context
Description: CREATE Monitoring Metric Value context
URI: /apps/{appID}/agents/{agentID}/metrics/
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/
Request Body: [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double", "value": "48", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long", "value": "3504", "timestamp": "1521542378" } ]
Response Body: { "created_at": "1521542389" }
Response Code: 201 – CREATED
Table 18: GET Monitoring Metric Value Contexts Associated with Monitoring Agent
Description: GET Monitoring Metric Value contexts associated with a Monitoring Agent
URI: /apps/{appID}/agents/{agentID}/metrics
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/
Request Body: -
Response Body: { "created_at": "1521542389", "metrics": [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:memory", "name": "memory", "unit": "%", "type": "double", "value": "48", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:ingress_pcts", "name": "ingress_pcts", "unit": "#", "type": "long", "value": "3504", "timestamp": "1521542378" } ] }
Response Code: 200 – OK
Table 19: GET Monitoring Metric Context
Description: GET Monitoring Metric context
URI: /apps/{appID}/agents/{agentID}/metrics/{metricID}
Method: GET
Parameters: interval, tstart, tend
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apps/95a01dfe4667414f9336b7d7495cc7a7/agents/70910177071c4bc68419fb63e72b7cbc/metrics/70910177071c4bc68419fb63e72b7cbc:cpu?interval=600
Request Body: -
Response Body: [ { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "25", "timestamp": "1521542354" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "27", "timestamp": "1521543355" }, { "metricID": "70910177071c4bc68419fb63e72b7cbc:cpu", "name": "cpu", "unit": "%", "type": "double", "value": "48", "timestamp": "1521544355" }, . . . ]
Response Code: 200 – OK
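A client-side sketch for this endpoint is given below: it composes the request path with the optional query parameters and averages the returned samples. The helper names are hypothetical, the parameters are passed through unchanged because their units are not stated in this section, and the sample data is copied from the response body of Table 19.

```python
from urllib.parse import quote, urlencode


def metric_url(app_id: str, agent_id: str, metric_id: str, **params) -> str:
    """Compose the path of the metric endpoint with optional interval/tstart/tend."""
    path = f"/apps/{app_id}/agents/{agent_id}/metrics/{quote(metric_id, safe=':')}"
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


def mean_value(samples: list) -> float:
    """Average the 'value' field, which the API returns as a string."""
    values = [float(sample["value"]) for sample in samples]
    return sum(values) / len(values)


# The three samples shown in the response body of Table 19.
SAMPLES = [
    {"value": "25", "timestamp": "1521542354"},
    {"value": "27", "timestamp": "1521543355"},
    {"value": "48", "timestamp": "1521544355"},
]
```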
7.2 Decision Making and Auto-scaling Service API Documentation
The Elasticity Manager API is a RESTful API exposing the standard CRUD operations on the following resources:
• Elasticity API Keys
• Elastic Applications
• Elasticity Policies
To reduce repetition, we note that ALL requests must be issued with the Elasticity API key authorization header. If this header is missing or invalid, an UNAUTHORIZED response is returned. To obtain an Elasticity API Key for the first time, the interested entity must request the API key from the Elasticity Manager.
7.2.1 Elasticity API Keys
Table 20: CREATE Elasticity API Key Context
Description: CREATE API Key which provides authorized access to the Elasticity Manager
URI: /apikey
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apikey
Request Body: { "userID": "0320013", "username": "zgeorg03", "country": "Cyprus", "availability_zone": "eu" }
Response Body: { "apiKey": "a7ba56adc85941deaf47630f50c240e1" }
Response Code: 201 – CREATED
Table 21: DELETE Elasticity API Key Context
Description: REVOKE API Key which provides authorized access to the Elasticity Manager
URI: /apikey/{apikey}
Method: DELETE
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: DELETE /apikey/a7ba56adc85941deaf47630f50c240e1
Request Body: -
Response Body: -
Response Code: 200 – OK
Table 22: GET Elasticity API Key Context
Description: GET the user metadata associated with an API Key
URI: /apikey/{apikey}
Method: GET
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request: GET /apikey/a7ba56adc85941deaf47630f50c240e1
Request Body: -
Response Body: { "created_at": "1521468211", "last_modified": "1521468211", "userID": "0320013", "username": "zgeorg03", "country": "Cyprus", "availability_zone": "eu" }
Response Code: 200 – OK
7.2.2 Elastic Application
Table 23: CREATE Elastic Application's Context
Description: CREATE the context of an Elastic Application
URI: /apps
Method: POST
Parameters: -
Request/Response Format: application/json
Response Status Codes: 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request: POST /apps
Request Body: { "name": "my-app-1", "availability_zone": "eu", "tags": [ "spring-boot", "mysqldb" ], "providers": [ "aws", "google-ce" ] }
Response Body: { "appID": "95a01dfe4667414f9336b7d7495cc7a7", "created_at": "1521459234", "last_modified": "1521459234" }
Response Code: 201 – CREATED
Table 24: GET Elastic Applications’ Context
Description GET the context of the Elastic Applications
URI /apps
Method GET
Parameters
- offset: Integer, required: False, description: The number of elastic applications to skip
- limit: required: False, description: The number of elastic applications to return
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request GET /apps/
Request Body -
Response Body
[
  {
    "appID": "95a01dfe4667414f9336b7d7495cc7a7",
    "name": "my-app-1",
    "availability_zone": "eu",
    "tags": [ "spring-boot", "mysqldb" ],
    "providers": [ "aws", "google-ce" ],
    "created_at": "1521459234",
    "last_modified": "1521459234"
  },
  { . . . }
]
Response Code 200 – OK
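The offset and limit parameters allow a client to walk through a long application list page by page. The sketch below illustrates one way to do so. It rests on two assumptions that the table does not confirm: that the parameters are passed as a URL query string, and that a page shorter than `limit` marks the end of the list. The names `apps_path`, `iter_apps` and `fetch_page` are hypothetical; `fetch_page` stands in for whatever HTTP call the client uses.

```python
from typing import Callable, Iterator, List, Optional
from urllib.parse import urlencode


def apps_path(offset: Optional[int] = None, limit: Optional[int] = None) -> str:
    """Build the request path for GET /apps with optional paging parameters."""
    params = {}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    # Assumption: paging parameters travel in the query string.
    return "/apps" + ("?" + urlencode(params) if params else "")


def iter_apps(fetch_page: Callable[[str], List[dict]], limit: int = 50) -> Iterator[dict]:
    """Yield every application context, requesting one page at a time.

    fetch_page takes a request path and returns the decoded JSON array.
    """
    offset = 0
    while True:
        page = fetch_page(apps_path(offset, limit))
        yield from page
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < limit:
            return
        offset += len(page)
```

Passing the fetch function in as an argument keeps the paging logic independent of any particular HTTP library and makes it easy to exercise with a stub.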
Table 25: GET Elastic Application’s Context
Description GET the context of an Elastic Application
URI /apps/{appID}
Method GET
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request GET /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body -
Response Body
{
  "appID": "95a01dfe4667414f9336b7d7495cc7a7",
  "name": "my-app-1",
  "availability_zone": "eu",
  "tags": [ "spring-boot", "mysqldb" ],
  "providers": [ "aws", "google-ce" ],
  "created_at": "1521459234",
  "last_modified": "1521459234"
}
Response Code 200 – OK
Table 26: Delete Elastic Application Context
Description DELETE the context of an Elastic Application
URI /apps/{appID}
Method DELETE
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request DELETE /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body -
Response Body -
Response Code 200 – OK
Table 27: UPDATE Elastic Application’s Context
Description UPDATE the context of an Elastic Application
URI /apps/{appID}
Method PUT
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request PUT /apps/95a01dfe4667414f9336b7d7495cc7a7
Request Body
{
  "name": "my-new-appname-1",
  "availability_zone": "us-west",
  "tags": [ "spring-boot", "mysqldb", "java" ]
}
Response Body
{
  "appID": "95a01dfe4667414f9336b7d7495cc7a7",
  "name": "my-new-appname-1",
  "availability_zone": "us-west",
  "tags": [ "spring-boot", "mysqldb", "java" ],
  "providers": [ "aws", "google-ce" ],
  "created_at": "1521459234",
  "last_modified": "1521467357"
}
Response Code 200 – OK
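In the sample above the request body omits `providers`, yet the response still carries the original providers and a newer `last_modified`. This suggests that fields missing from the request are left unchanged, although the table does not state so explicitly. The sketch below models that inferred behaviour. The function name, the set of protected fields, and the rejection of attempts to change them are all assumptions made for illustration.

```python
from typing import Dict

# Assumption: these fields are managed by the service and cannot be set by clients.
SERVER_MANAGED = {"appID", "created_at", "last_modified"}


def apply_update(current: Dict, changes: Dict, now: int) -> Dict:
    """Return the application context after applying a PUT body.

    Fields absent from `changes` keep their current value (inferred from the
    sample request/response pair, not confirmed by the specification).
    """
    for field in changes:
        if field in SERVER_MANAGED:
            raise ValueError(f"field {field!r} cannot be updated by the client")
    updated = dict(current)
    updated.update(changes)
    updated["last_modified"] = str(now)  # timestamps are strings in the samples
    return updated


current = {
    "appID": "95a01dfe4667414f9336b7d7495cc7a7",
    "name": "my-app-1",
    "availability_zone": "eu",
    "tags": ["spring-boot", "mysqldb"],
    "providers": ["aws", "google-ce"],
    "created_at": "1521459234",
    "last_modified": "1521459234",
}
changes = {
    "name": "my-new-appname-1",
    "availability_zone": "us-west",
    "tags": ["spring-boot", "mysqldb", "java"],
}
result = apply_update(current, changes, now=1521467357)
```

With the sample inputs, `result` matches the response body shown in the table, including the untouched `providers` list.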
7.2.3 Elasticity Policies
Table 28: CREATE a new Elasticity Policy
Description CREATE a new elasticity policy
URI /{appID}/elasticityPolicy/
Method POST
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 201 – CREATED, 401 – UNAUTHORIZED, 403 – FORBIDDEN
Sample Request POST /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/
Request Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 10
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": {
          "resource": { "name": "svc_streaming" },
          "clusters": [ "cluster_eu_1" ]
        }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 1,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Body { "msg": "Policy successfully created", "elasticityPolicyID": "abf10-fb19-2ccf" }
Response Code 200 – OK
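The sample policy pairs a cost condition (`HOURLY_COST` with `LTE`) with a metric condition (`AVERAGE` of `requests_per_sec` with `GTE`). One plausible reading of such a trigger is sketched below. It is an illustration only and does not describe the Elasticity Controller's actual evaluation logic. It assumes that `LTE` and `GTE` mean "less than or equal" and "greater than or equal", that all conditions in the `trigger` list must hold at once, and that the caller has already collected the metric samples falling inside the time window. The function and argument names are hypothetical.

```python
import operator
from statistics import mean
from typing import Dict, List

# Only the operators and group functions that appear in the sample policy.
REL_OPS = {"LTE": operator.le, "GTE": operator.ge}
GROUP_FUNCTIONS = {"AVERAGE": mean}


def trigger_holds(
    trigger: List[dict],
    info_values: Dict[str, float],
    metric_samples: Dict[str, List[float]],
) -> bool:
    """Return True when every condition of the trigger is satisfied.

    info_values maps an info option (e.g. "HOURLY_COST") to its current value.
    metric_samples maps a metric name to the non-empty list of samples that
    fall inside the condition's time window (window filtering is not shown).
    """
    for condition in trigger:
        if "info" in condition:
            observed = info_values[condition["info"]["option"]]
        else:
            samples = metric_samples[condition["metric"]["name"]]
            observed = GROUP_FUNCTIONS[condition["groupFunction"]](samples)
        if not REL_OPS[condition["relOp"]](observed, condition["value"]):
            return False
    return True


# Trimmed version of the sample trigger (member lists omitted for brevity).
trigger = [
    {"info": {"option": "HOURLY_COST"}, "relOp": "LTE", "value": 10},
    {
        "groupFunction": "AVERAGE",
        "metric": {"name": "requests_per_sec"},
        "relOp": "GTE",
        "value": 10,
        "timeWindow": {"duration": 5, "unit": "MINUTES"},
    },
]
```

Under these assumptions, the scale-out action would be considered only while the hourly cost stays at or below 10 and the average request rate over the window is at least 10.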
Table 29: GET all Elasticity Policies of a specific deployment
Description GET all the Elasticity Policies of a specific deployment
Endpoint /{appID}/elasticityPolicy/
Method GET
Parameters
- offset: Integer, required: False, description: The number of elasticity policies to skip
- limit: required: False, description: The number of elasticity policies to return
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request GET /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/
Request Body -
Response Body
[
  {
    "name": "policy_scale_out_streaming_cluster_eu_1",
    "trigger": [ … ],
    "action": { … },
    "priority": 1
  },
  {
    "name": "policy_scale_out_streaming_cluster_usa_1",
    "trigger": [ … ],
    "action": { … },
    "priority": 2
  },
  {
    "name": "policy_scale_in_streaming_cluster_eu_1",
    "trigger": [ … ],
    "action": { … },
    "priority": 1
  },
  {
    "name": "policy_scale_in_streaming_cluster_usa_1",
    "trigger": [ … ],
    "action": { … },
    "priority": 2
  }
]
Response Code 200 – OK
Table 30: GET an Elasticity Policy
Description GET an elasticity policy for the specified deployment
Endpoint /{appID}/elasticityPolicy/{elasticityPolicyID}
Method GET
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request GET /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body -
Response Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 10
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": {
          "resource": { "name": "svc_streaming" },
          "clusters": [ "cluster_eu_1" ]
        }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 1,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Code 200 – OK
Table 31: UPDATE an existing Elasticity Policy
Description UPDATE an existing elasticity policy
Endpoint /{appID}/elasticityPolicy/{elasticityPolicyID}
Method PUT
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request PUT /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body
{
  "name": "policy_scale_out_streaming_cluster_eu_1",
  "trigger": [
    {
      "info": {
        "option": "HOURLY_COST",
        "members": [
          { "resource": { "name": "svc_streaming" } },
          { "resource": { "name": "svc_analytics" } },
          { "resource": { "name": "svc_front" } },
          { "resource": { "name": "svc_data_store" } }
        ]
      },
      "relOp": "LTE",
      "value": 20
    },
    {
      "groupFunction": "AVERAGE",
      "metric": {
        "name": "requests_per_sec",
        "members": {
          "resource": { "name": "svc_streaming" },
          "clusters": [ "cluster_eu_1" ]
        }
      },
      "relOp": "GTE",
      "value": 10,
      "timeWindow": { "duration": 5, "unit": "MINUTES" }
    }
  ],
  "action": {
    "scaleOut": [
      {
        "count": 4,
        "resource": { "name": "svc_streaming" },
        "cluster": "cluster_eu_1",
        "cooldown": { "duration": 1, "unit": "MINUTES" },
        "warmup": { "duration": 1, "unit": "MINUTES" }
      }
    ]
  },
  "priority": 1
}
Response Body { "msg": "Policy successfully updated" }
Response Code 200 – OK
Table 32: DELETE an existing Elasticity Policy
Description DELETE an existing elasticity policy
Endpoint /{appID}/elasticityPolicy/{elasticityPolicyID}
HTTP Method DELETE
Parameters -
Request/Response Format application/json
Response Status Codes 200 – OK, 401 – UNAUTHORIZED, 403 – FORBIDDEN, 404 – NOT FOUND
Sample Request DELETE /95a01dfe4667414f9336b7d7495cc7a7/elasticityPolicy/abf10-fb19-2ccf
Request Body -
Response Body { "msg": "Policy successfully deleted" }
Response Code 200 – OK
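The four policy operations share one endpoint pattern and differ only in HTTP method, optional policy identifier and optional JSON body. The sketch below builds the corresponding requests with Python's standard library without sending them. The base URL and the helper name are hypothetical. How the API key is attached to a request is not shown in these tables, so the sketch deliberately leaves authentication out.

```python
import json
from typing import Optional
from urllib.request import Request

# Hypothetical host; replace with the actual Elasticity Manager address.
BASE_URL = "http://elasticity-manager.example"


def policy_request(
    method: str,
    app_id: str,
    policy_id: Optional[str] = None,
    body: Optional[dict] = None,
) -> Request:
    """Build (but do not send) a request for the elasticity policy endpoints."""
    path = f"/{app_id}/elasticityPolicy/"
    if policy_id is not None:
        path += policy_id
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


APP_ID = "95a01dfe4667414f9336b7d7495cc7a7"
POLICY_ID = "abf10-fb19-2ccf"

list_req = policy_request("GET", APP_ID)
delete_req = policy_request("DELETE", APP_ID, POLICY_ID)
create_req = policy_request("POST", APP_ID, body={"name": "my-policy", "priority": 1})
```

A request built this way could then be passed to `urllib.request.urlopen`, once the authentication details have been added.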