
Distributed Strategies for Elastic Data Stream Processing in the Fog

Valeria Cardellini, Francesco Lo Presti, Matteo Nardelli, Gabriele Russo Russo
University of Rome Tor Vergata, Italy

ICT COST Action IC1304

Autonomous Control for a Reliable Internet of Services (ACROSS)

Data Stream Processing

Data Stream Processing (DSP) applications:
•  processing of data streams generated by distributed sources
•  extract information in a (near) real-time manner
•  in-memory processing

Twitter sentiment analysis based on DSP Framework Apache Storm 'til 2014
Well suited for Smart City Applications

Exploit fog and near-edge computation (distributed cloud and Fog computing)
–  To increase scalability and availability, reduce latency, network traffic, etc.

Old and New Challenges

Distributed Environment
•  Geographic distribution, network latencies are not negligible
•  Data cannot be quickly moved among computing nodes

DSP Applications are long running
•  Subject to varying load, network variability

Reconfigure the application deployment
•  has a non-negligible cost!
•  can negatively affect application performance in the short term
–  Application freezing times, especially for stateful operators

State of the art

Centralized approaches:
•  most of the proposed approaches are designed for clusters
•  Architecture (and control algorithms) do not scale well in a distributed environment

Decentralized approaches:
•  several proposals
•  their inherent lack of coordination might result in frequent reconfigurations

MAPE (Monitor, Analyze, Plan and Execute)

Decentralized MAPE

•  Many Patterns, each with pros and cons

D. Weyns, B. Schmerl, V. Grassi, S. Malek, et al. On patterns for decentralized control in self-adaptive systems. In Software Engineering for Self-Adaptive Systems II, vol. 7475 of LNCS, Springer, 2013.

Master-worker, Regional pattern, Hierarchical control pattern
Coordinated control pattern, Information sharing pattern

Goals

•  Design a hierarchical distributed approach to the autonomous control of DSP applications

•  Support run-time adaptation
–  Elasticity: automatically scale in/out the number of operator instances
–  Stateful Migration: relocate operators without compromising application integrity

•  Design a control policy
•  Integration of our solution in Storm

Hierarchical MAPEs in Storm

•  New components in Apache Storm to realize a Hierarchical MAPE pattern

•  Operator Manager vs Application Manager
–  Concerns and time scale separation

Hierarchical MAPEs in Storm

Operator Manager
•  Monitors operator and local resources
–  e.g., thread CPU utilization
•  Determines whether a Migration and/or Scale operation is needed
•  Executes the reconfiguration
–  If it gets the permission to do so

Application Manager
•  Monitors Application Performance
–  SLA enforcement
•  Coordinates operator reconfigurations
–  Grants permission to enact reconfigurations
–  Controls reconfiguration frequencies
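
The division of responsibilities between the two managers can be sketched as one control round. This is a minimal illustration, not code from the Storm integration: the class names, the method names (`plan`, `grant`, `execute`, `control_round`), and the grant-the-first-requests placeholder policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReconfigurationRequest:
    operator: str  # operator asking to be reconfigured
    action: str    # "migrate" or "scale" (labels assumed for illustration)


class ApplicationManager:
    """Application-level MAPE loop: coordinates the Operator Managers."""

    def __init__(self, max_grants_per_round: int):
        self.max_grants_per_round = max_grants_per_round

    def grant(self, requests: List[ReconfigurationRequest]) -> List[ReconfigurationRequest]:
        # Placeholder coordination policy (assumption): grant requests in
        # arrival order, up to a fixed cap per round.
        return requests[: self.max_grants_per_round]


class OperatorManager:
    """Operator-level MAPE loop: watches one operator and its local resources."""

    def __init__(self, operator: str, cpu_threshold: float):
        self.operator = operator
        self.cpu_threshold = cpu_threshold
        self.executed: List[str] = []

    def plan(self, cpu_utilization: float) -> Optional[ReconfigurationRequest]:
        # Monitor, Analyze, Plan: request a reconfiguration only when overloaded.
        if cpu_utilization > self.cpu_threshold:
            return ReconfigurationRequest(self.operator, "scale")
        return None

    def execute(self, request: ReconfigurationRequest) -> None:
        # Execute: called only for requests the Application Manager granted.
        self.executed.append(request.action)


def control_round(app_manager: ApplicationManager,
                  operator_managers: List[OperatorManager],
                  utilizations: Dict[str, float]) -> List[ReconfigurationRequest]:
    """Collect local plans, ask for permission, enact only what was granted."""
    pending = {}
    for om in operator_managers:
        request = om.plan(utilizations[om.operator])
        if request is not None:
            pending[om.operator] = (om, request)
    granted = app_manager.grant([req for _, req in pending.values()])
    for req in granted:
        pending[req.operator][0].execute(req)
    return granted
```

The point of the sketch is that an Operator Manager never reconfigures on its own: it only proposes, and the Application Manager decides how many proposals go through in each round.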

General Framework for Distributed Optimization

Simple Distributed Heuristic: Operator Manager

•  issues reconfiguration plans: action, gain, cost

•  action: migrates an operator replica
–  threshold-based policy on CPU utilization
–  new location: probabilistic selection from the neighborhood
–  cost: estimated stateful migration time

•  action: operator scaling
–  threshold-based: U_th
1.  scale out: replicate replica i if U_i > U_th
2.  scale in: remove one of the n replicas if removing it does not overload the others -> Σ U_i / (n-1) < 0.75 U_th
–  cost: estimated time to relocate the operator state (if any)

•  gain function: scale-out > migration > scale-in
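
The scaling rule above can be written down in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: the `Plan` type, the function name `plan_scaling`, and the numeric gain values are hypothetical (the slide only gives the ordering scale-out > migration > scale-in), and the migration action is left out because its threshold and neighbor-selection details are not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional

SCALE_IN_FACTOR = 0.75  # from the slide: scale in only if sum(U_i)/(n-1) < 0.75 * U_th

# Assumed numeric encoding of the ordering scale-out > migration > scale-in.
GAIN = {"scale-out": 3.0, "migration": 2.0, "scale-in": 1.0}


@dataclass
class Plan:
    action: str
    gain: float
    cost: float  # estimated time to relocate operator state (0 if stateless)


def plan_scaling(utilizations: List[float], u_th: float,
                 state_move_time: float) -> Optional[Plan]:
    """Return a scaling plan for one operator, or None if no change is needed.

    utilizations holds the CPU utilization U_i of each of the n replicas.
    """
    n = len(utilizations)
    # Scale out: some replica i exceeds the threshold (U_i > U_th).
    if any(u > u_th for u in utilizations):
        return Plan("scale-out", GAIN["scale-out"], state_move_time)
    # Scale in: the remaining n-1 replicas could absorb the total load
    # while staying below 75% of the threshold.
    if n > 1 and sum(utilizations) / (n - 1) < SCALE_IN_FACTOR * u_th:
        return Plan("scale-in", GAIN["scale-in"], state_move_time)
    return None


# With U_th = 0.7: one hot replica triggers scale-out, three lightly loaded
# replicas trigger scale-in, and two replicas at 0.5 are left untouched.
print(plan_scaling([0.9, 0.4], 0.7, 2.0))
print(plan_scaling([0.2, 0.2, 0.2], 0.7, 2.0))
print(plan_scaling([0.5, 0.5], 0.7, 2.0))
```

The 0.75 factor leaves a gap between the scale-in and scale-out conditions, so an operator that has just scaled in does not immediately cross the threshold again and scale back out.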

Simple Distributed Heuristic: Application Manager

Token-based policy
•  Considers time divided in control intervals
•  Generates reconfiguration tokens based on application performance
•  Grants as many reconfigurations as available tokens
–  Prioritizing by gain to cost ratio

[Figure: token bucket; reconfiguration requests and granted reconfigurations per control interval, over time]
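
The token-based policy can be sketched as follows. This is an illustration under assumptions, not the authors' code: the class and field names are hypothetical, and the token generation rule (one token per control interval whenever the measured response time exceeds a fraction of Rmax) is modeled on the "1 token/min if response time > 50% Rmax" settings used in the evaluation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    operator: str
    gain: float
    cost: float  # estimated reconfiguration time; 0 for stateless operators


def gain_to_cost(request: Request) -> float:
    # A zero-cost request is ranked first instead of dividing by zero.
    return float("inf") if request.cost == 0 else request.gain / request.cost


class TokenBucketPolicy:
    def __init__(self, capacity: int, r_max_ms: float, trigger_fraction: float):
        self.capacity = capacity                  # maximum number of stored tokens
        self.r_max_ms = r_max_ms                  # response-time bound Rmax
        self.trigger_fraction = trigger_fraction  # e.g. 0.5 for "50% of Rmax"
        self.tokens = 0

    def on_control_interval(self, response_time_ms: float,
                            requests: List[Request]) -> List[Request]:
        """Run once per control interval; returns the granted requests."""
        # Generate a token only when performance is degrading (assumed rule).
        if response_time_ms > self.trigger_fraction * self.r_max_ms:
            self.tokens = min(self.capacity, self.tokens + 1)
        # Grant as many requests as there are tokens, best gain/cost first.
        ranked = sorted(requests, key=gain_to_cost, reverse=True)
        granted = ranked[: self.tokens]
        self.tokens -= len(granted)
        return granted


policy = TokenBucketPolicy(capacity=1, r_max_ms=200.0, trigger_fraction=0.5)
requests = [Request("op-a", gain=3.0, cost=10.0), Request("op-b", gain=2.0, cost=1.0)]
# 120 ms > 100 ms, so a token is generated and the best-ratio request is granted.
print([r.operator for r in policy.on_control_interval(120.0, requests)])
# 80 ms is below the trigger and the bucket is empty, so nothing is granted.
print(policy.on_control_interval(80.0, requests))
```

With a bucket capacity of 1, at most one reconfiguration is enacted per control interval, which is how the Application Manager bounds the reconfiguration frequency.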

Evaluation

Infrastructure
•  5 worker nodes + 1 host for Nimbus and ZooKeeper
•  each node: Intel Xeon 8 cores @ 2 GHz, 16 GB RAM

Application
•  DEBS 2015 Grand Challenge: top 10 frequent routes of NYC taxis in the last 30 min
•  Requires: max Response Time Rmax = 200 ms

Policy parameters
•  Operator Manager policy: thresholds on utilization set to 70%
•  Application Manager policy: token bucket capacity = 1 token

Evaluation
Application Manager policy: grants all reconfiguration requests

up and running 93.7% of time
median of response time: 130.6 ms

Evaluation
Application Manager policy: 1 token/min if response time > 50% Rmax

median of response time: 117.0 ms

Evaluation
Application Manager policy: 1 token/min if response time > 75% Rmax

median of response time: 80.4 ms

Conclusions
•  We designed a hierarchical distributed architecture for the autonomous control of DSP applications
•  We developed a simple control policy
•  We integrated our solution in Storm
•  We evaluated the effectiveness of our solution

Current Works
•  Use MDP, Reinforcement Learning to design Control Policy
–  Team Markov Games
–  Multiple Agent Reinforcement Learning
