Distributed Strategies for Elastic Data Stream Processing in the Fog
Valeria Cardellini, Francesco Lo Presti, Matteo Nardelli, Gabriele Russo Russo
University of Rome Tor Vergata, Italy
ICT COST Action IC1304
Autonomous Control for a Reliable
Internet of Services (ACROSS)
Data Stream Processing
Data Stream Processing (DSP) applications:
• processing of data streams generated by distributed sources
• extract information in a (near) real-time manner
• in-memory processing
Twitter sentiment analysis based on DSP framework Apache Storm 'til 2014
Well suited for Smart City applications
Exploit fog and near-edge computation (distributed Cloud and Fog computing)
– To increase scalability and availability, reduce latency, network traffic, etc.
Old and New Challenges
Distributed environment
• Geographic distribution: network latencies are not negligible
• Data cannot be quickly moved among computing nodes
DSP applications are long running
• Subject to varying load, network variability
Reconfigure the application deployment
• has a non-negligible cost!
• can negatively affect application performance in the short term
– Application freezing times, especially for stateful operators
State of the art
Centralized approaches:
• most of the proposed approaches are designed for clusters
• architecture (and control algorithms) do not scale well in a distributed environment
Decentralized approaches:
• several proposals
• their inherent lack of coordination might result in frequent reconfigurations
MAPE (Monitor, Analyze, Plan and Execute)
Decentralized MAPE
• Many patterns, each with pros and cons
D. Weyns, B. Schmerl, V. Grassi, S. Malek, et al. On patterns for decentralized control in self-adaptive systems. In Software Engineering for Self-Adaptive Systems II, vol. 7475 of LNCS, Springer, 2013.
Patterns: Master-worker, Regional pattern, Hierarchical control pattern, Coordinated control pattern, Information sharing pattern
Goals
• Design a hierarchical distributed approach to the autonomous control of DSP applications
• Support run-time adaptation
– Elasticity: automatically scale in/out the number of operator instances
– Stateful migration: relocate operators without compromising application integrity
• Design a control policy
• Integration of our solution in Storm
Hierarchical MAPEs in Storm
• New components in Apache Storm to realize a Hierarchical MAPE pattern
• Operator Manager vs Application Manager
– Concerns and time scale separation
Hierarchical MAPEs in Storm
Operator Manager
• Monitors operator and local resources
– e.g., thread CPU utilization
• Determines whether a Migration and/or Scale operation is needed
• Executes the reconfiguration
– If it gets the permission to do so
Application Manager
• Monitors application performance
– SLA enforcement
• Coordinates operator reconfigurations
– Grants permission to enact reconfigurations
– Controls reconfiguration frequencies
General Framework for Distributed Optimization
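The two-level control loop described above can be sketched as follows. This is a minimal illustration, not the Storm integration itself: the class interfaces, the `read_cpu` callback, and the `max_grants_per_round` limit are all hypothetical names introduced here, and the fixed per-round limit is a simplification standing in for whatever policy the Application Manager applies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:
    operator: str
    action: str  # e.g. "scale-out" or "migrate"


class OperatorManager:
    """Local MAPE loop for one operator (hypothetical interface)."""

    def __init__(self, operator: str, read_cpu: Callable[[], float], threshold: float):
        self.operator = operator
        self.read_cpu = read_cpu      # Monitor: returns CPU utilization in [0, 1]
        self.threshold = threshold    # assumed utilization threshold
        self.executed = []            # actions actually enacted

    def plan(self) -> Optional[Request]:
        # Analyze + Plan: propose a reconfiguration only when overloaded.
        if self.read_cpu() > self.threshold:
            return Request(self.operator, "scale-out")
        return None

    def execute(self, request: Request) -> None:
        # Execute: called only after permission has been granted.
        self.executed.append(request.action)


class ApplicationManager:
    """Global MAPE loop: collects requests and grants permission."""

    def __init__(self, max_grants_per_round: int):
        # Assumption: a fixed cap per round limits reconfiguration frequency.
        self.max_grants_per_round = max_grants_per_round

    def control_round(self, managers: List[OperatorManager]) -> int:
        granted = 0
        for manager in managers:
            request = manager.plan()
            if request is not None and granted < self.max_grants_per_round:
                manager.execute(request)
                granted += 1
        return granted
```

With a cap of one grant per round, two overloaded operators would both ask for a reconfiguration but only the first would be allowed to proceed, which is the coordination that purely decentralized approaches lack.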
Simple Distributed Heuristic: Operator Manager
• issues reconfiguration plans: action, gain, cost
• action: migrate an operator replica
– threshold-based policy on CPU utilization
– new location: probabilistic selection from the neighborhood
– cost: estimated stateful migration time
• action: operator scaling
– threshold based: U_th
1. scale out: replicate replica i if U_i > U_th
2. scale in: remove one of the n replicas if removing it does not overload the others -> Σ U_i / (n-1) < 0.75 U_th
– cost: estimated time to relocate the operator state (if any)
• gain function: scale-out > migration > scale-in
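The threshold-based scaling rule above can be written out as a short function. This is a sketch under assumptions: the function name and return values are illustrative, the default threshold of 0.70 is taken from the evaluation setup, and checking scale-out before scale-in mirrors the stated gain ordering rather than a documented implementation detail.

```python
from typing import List, Optional

U_TH = 0.70             # utilization threshold (70% in the evaluation setup)
SCALE_IN_FACTOR = 0.75  # safety factor from the scale-in condition


def scaling_action(utilizations: List[float], u_th: float = U_TH) -> Optional[str]:
    """Decide a scaling action from the CPU utilization of each replica.

    utilizations[i] is U_i for replica i of a single operator.
    """
    n = len(utilizations)
    # Scale out: some replica i has U_i > U_th.
    if any(u > u_th for u in utilizations):
        return "scale-out"
    # Scale in: the remaining n-1 replicas would stay well below the
    # threshold, i.e. sum(U_i) / (n - 1) < 0.75 * U_th.
    if n > 1 and sum(utilizations) / (n - 1) < SCALE_IN_FACTOR * u_th:
        return "scale-in"
    return None
```

For example, three replicas at 20% each give 0.6 / 2 = 0.3, which is below 0.75 × 0.7 = 0.525, so one replica can be removed; two replicas at 40% each give 0.8 / 1 = 0.8, so no scale-in is proposed.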
Simple Distributed Heuristic: Application Manager
Token-based policy
• Considers time divided into control intervals
• Generates reconfiguration tokens based on application performance
• Grants as many reconfigurations as available tokens
– Prioritizing by gain-to-cost ratio
[Figure: token bucket over time. Labels: control interval, reconfiguration requests, granted reconfiguration. Axis: time]
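A minimal sketch of the token-based policy, assuming one control round per control interval. The `Plan` and `TokenBucket` names, the numeric gain values, and the `trigger_fraction` parameter are hypothetical; the refill rule (one token when the response time exceeds a fraction of Rmax) follows the form of the policies used in the evaluation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    operator: str
    action: str
    gain: float   # assumed numeric score; higher means more beneficial
    cost: float   # e.g. estimated state relocation time; must be > 0


class TokenBucket:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.tokens = 0

    def refill(self, response_time_ms: float, r_max_ms: float,
               trigger_fraction: float) -> None:
        # Generate one token per control interval, but only when the
        # application performance is degraded enough to justify adapting.
        if response_time_ms > trigger_fraction * r_max_ms:
            self.tokens = min(self.capacity, self.tokens + 1)

    def grant(self, requests: List[Plan]) -> List[Plan]:
        # Grant as many requests as there are tokens, best gain/cost first.
        ranked = sorted(requests, key=lambda p: p.gain / p.cost, reverse=True)
        granted = ranked[: self.tokens]
        self.tokens -= len(granted)
        return granted
```

With a bucket capacity of 1, at most one reconfiguration is enacted per control interval, and none at all while the response time stays below the trigger level.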
Evaluation
Infrastructure
• 5 worker nodes + 1 host for Nimbus and ZooKeeper
• each node: Intel Xeon, 8 cores @ 2 GHz, 16 GB RAM
Application
• DEBS 2015 Grand Challenge: top 10 frequent routes of NYC taxis in the last 30 min
• Requires: max response time Rmax = 200 ms
Policy parameters
• Operator Manager policy: thresholds on utilization set to 70%
• Application Manager policy: token bucket capacity = 1 token
Evaluation
Application Manager policy: grants all reconfiguration requests
up and running 93.7% of time
median of response time: 130.6 ms
Evaluation
Application Manager policy: 1 token/min if response time > 50% Rmax
up and running 93.4% of time
median of response time: 117.0 ms
Evaluation
Application Manager policy: 1 token/min if response time > 75% Rmax
up and running 98.3% of time
median of response time: 80.4 ms
Conclusions
• We designed a hierarchical distributed architecture for the autonomous control of DSP applications
• We developed a simple control policy
• We integrated our solution in Storm
• We evaluated the effectiveness of our solution
Current Work
• Use MDP, Reinforcement Learning to design the control policy
– Team Markov Games
– Multiple Agent Reinforcement Learning