Toward large-scale distributed stream processing: models, systems and challenges
Valeria Cardellini and Francesco Lo Presti
University of Rome Tor Vergata, Italy
ICT COST Action IC1304
Autonomous Control for a Reliable
Internet of Services (ACROSS)
2nd Int’l Summer School on Autonomous Control for Reliable Future Networks and Services, 30 May 2016, Opatija, Croatia
Who are we?
Valeria Cardellini, Associate professor @ Univ. of Rome Tor Vergata
Francesco Lo Presti, Associate professor @ Univ. of Rome Tor Vergata
• Joint research work with Vincenzo Grassi and Matteo Nardelli
V. Cardellini - ACROSS 2nd Summer School
The data deluge
• Some well-known numbers related to Big Data:
  – Every day in 2014 we created 2.5 Exabytes
  – 40 Zettabytes of data will be created by 2020
• Proliferation of new sources of data
  – Sensors, mobile devices, cameras
  – Social networks
  – Scientific instruments
  – Vehicles
• How can we make sense of all these data?
  – Process data to extract valuable insights
Why data stream processing?
• Applications such as:
  – Sentiment analysis on multiple tweet streams @Twitter
  – User profiling @Yahoo!
  – Tracking of query trend evolution @Google
  – Fraud detection
  – Bus routing management @city of Dublin [Art14]
• Require:
  – Continuous processing of unbounded data streams generated by multiple, distributed sources
  – In (near) real-time fashion
Why data stream processing?
• In the past years data stream processing (DSP) was considered a solution for very specific problems (e.g., financial tickers)
• But now we have (and will have) more general settings
  – E.g., Internet of Things
Why data stream processing?
• Decrease the overall latency to obtain results
  – No data persistence on stable storage (see “Latency numbers every programmer should know”)
  – No periodic batch analysis
• Simplify the data infrastructure
• Make the time dimension of data explicit
Traditional DSP challenges
• Stream data rates can be high and data arrive in large volumes
  – High resource requirements for processing (clusters, data centers, distributed Clouds)
• Processing stream data has real-time aspects
  – Stream processing applications have QoS requirements, e.g., end-to-end latency
  – Must be able to react to events as they occur
Why large-scale stream processing?
• Goals: increase scalability and reduce latency
• How? Rely on distributed and near-edge computation
Goals of the lectures
• Give a flavor of large-scale distributed stream processing and related research challenges
• Part I (V. Cardellini)
  – Focus on system issues
  – These slides
• Part II (F. Lo Presti)
  – Focus on models and algorithms
• Request
  – If you get either bored or lost, ask questions…
  – If you like to ask questions, ask questions…
Data stream definitions
Data stream
• “A data stream is a real-time, continuous, ordered (implicitly by arrival time or explicitly by timestamp) sequence of items. It is impossible to control the order in which items arrive, nor is it feasible to locally store a stream in its entirety. Queries over streams run continuously over a period of time and incrementally return new results as new data arrive.” [Gol03]
Sliding windows
• How many data items should we process each time?
  – Process items in window-sized batches
• Count-based window, e.g., last n items
• Time-based window, e.g., from [t-T] to [t]
[Figure: stream items s1 to s6 along the time axis, with a count-based window of n = 5]
Sliding windows
• How often should we evaluate the window?
  – Eager approach: output new result items as soon as available (but can be difficult to implement efficiently)
  – Lazy approach: slide window by s seconds (or m items)
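To make the window parameters concrete, here is a minimal Python sketch of a count-based window of the last n items, evaluated lazily every m new items. The function name `count_based_windows` and the sample stream are illustrative assumptions, not part of any streaming framework.

```python
from collections import deque
from typing import Iterable, Iterator, List


def count_based_windows(stream: Iterable[int], n: int, m: int) -> Iterator[List[int]]:
    """Yield the last n items each time the window has slid by m items."""
    window = deque(maxlen=n)  # keeps only the n most recent items
    since_last = 0            # items received since the last evaluation
    for item in stream:
        window.append(item)
        since_last += 1
        # Lazy approach: evaluate only when the window is full and has slid by m.
        if len(window) == n and since_last >= m:
            yield list(window)
            since_last = 0


if __name__ == "__main__":
    for w in count_based_windows(range(1, 9), n=5, m=2):
        print(w, "average =", sum(w) / len(w))
```

A time-based window would follow the same pattern, evicting items whose timestamp is older than t-T instead of relying on `maxlen`.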
DSP application model
• A DSP application is made of a network of operators (processing elements) connected by streams, at least one data source and at least one data sink
• Represented by a directed graph
  – Graph vertices: operators
  – Graph edges: streams
• Graph can be cyclic
  – Some systems only support directed acyclic graphs (DAG)
• Graph topology rarely changes
DSP operator
• A self-contained processing element that:
  – transforms one or more input streams into another stream
  – can execute generic user-defined code
    • Algebraic operation (filter, aggregate, join, ...)
    • User-defined (more complex) operation (POS-tagging, …)
  – can execute in parallel with other operators
• Can be stateless or stateful
  – Stateless: knows nothing about the state (e.g., filter, map)
  – Stateful: keeps some sort of state
    • E.g., some aggregation or summary of processed elements, or a state machine for detecting patterns of fraudulent financial transactions
• State might be shared between operators
“Hello World”: Word Count
[Figure: Words source → (word) → Words counter → (word, counter) → Intermediate sorter → (ranks) → Final sorter → (final rank)]
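The Word Count pipeline can be sketched in plain Python, with generators standing in for streams. This is only an illustration under simplifying assumptions: the input is finite, everything runs in one process, and the function names mirror the operator names on the slide rather than any framework API.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[str, int]


def words_source(lines: Iterable[str]) -> Iterator[str]:
    """Data source: emits one (word) tuple per word."""
    for line in lines:
        yield from line.lower().split()


def words_counter(words: Iterable[str]) -> Iterator[Pair]:
    """Stateful operator: emits (word, counter) after every update."""
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def intermediate_sorter(pairs: Iterable[Pair], k: int) -> List[Pair]:
    """Keeps the latest count per word and returns a local top-k ranking."""
    latest = dict(pairs)  # later (word, counter) tuples overwrite earlier ones
    return sorted(latest.items(), key=lambda wc: (-wc[1], wc[0]))[:k]


def final_sorter(rankings: Iterable[List[Pair]], k: int) -> List[Pair]:
    """Merges the partial rankings into the final rank."""
    merged = [pair for ranking in rankings for pair in ranking]
    return sorted(merged, key=lambda wc: (-wc[1], wc[0]))[:k]


if __name__ == "__main__":
    lines = ["to be or not to be", "to stream or to batch"]
    ranks = intermediate_sorter(words_counter(words_source(lines)), k=3)
    print(final_sorter([ranks], k=2))
```

On an unbounded stream the sorters would emit a ranking per window instead of once at the end of the input.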
Some DSP application: DEBS’14 GC
• Real-time analytics over high volume sensor data: analysis of energy consumption measurements [DEBS14GC]
  – Smart plugs deployed in households and equipped with sensors that measure values related to power consumption
• Input data stream:
  2967740693, 1379879533, 82.042, 0, 1, 0, 12
• Query 1: make load forecasts based on current load measurements and historical data
  – Output data stream:
    ts, house_id, predicted_load
• Query 2: find the outliers concerning energy consumption
  – Output data stream:
    ts_start, ts_stop, household_id, percentage
Some DSP application: DEBS’15 GC
• Real-time analytics over high volume spatio-temporal data streams: analysis of taxi trips based on data streams originating from New York City taxis [DEBS15GC]
• Query 1: identify recent frequent routes
• Query 2: identify regions with the highest profit
• Both queries rely on a sliding window operator
  – Continuously evaluate the query results
• Use geo-spatial grids to define the events of interest
Some DSP application: DEBS’16 GC
• Real-time analytics for a dynamic (evolving) social-network graph [DEBS16GC]
• Query 1: identify the posts that currently trigger the most activity in the social network
• Query 2: identify large communities that are currently involved in a topic
• Require continuous analysis of a dynamic graph considering multiple streams that reflect graph updates
Data stream systems
Streaming system
• Distributed system that executes stream graphs
  – continuously calculates results for long-standing queries
  – over potentially infinite data streams
  – using operators
    • that can be stateless or stateful
• System nodes may be heterogeneous
• Must be highly optimized and have minimal overhead so as to deliver real-time response for high-volume DSP applications
Operator placement
• Determine, within a set of available distributed computing nodes, the nodes that should host and execute each operator of a DSP application
[Figure: a DSP application graph with operators 1 to 6 and streams (1,2), (2,3), (2,4), (3,5), (4,5), (4,6), mapped onto computing nodes]
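As a rough illustration of the placement problem, the sketch below enumerates every assignment of operators to nodes and keeps the one with the lowest total inter-node latency over the streams, subject to a per-node operator limit. The cost model, the capacities and the name `best_placement` are assumptions made for this example; exhaustive search is only viable for tiny graphs and is not the method discussed in Part II.

```python
from itertools import product
from typing import Dict, List, Tuple

# Streams of the example graph with operators 1 to 6.
EDGES: List[Tuple[int, int]] = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (4, 6)]


def best_placement(edges, nodes, latency, capacity):
    """Return (placement, cost) minimizing the summed latency over all streams."""
    operators = sorted({op for edge in edges for op in edge})
    best: Dict[int, str] = {}
    best_cost = float("inf")
    for assignment in product(nodes, repeat=len(operators)):
        # Skip assignments that put too many operators on one node.
        if any(assignment.count(node) > capacity[node] for node in nodes):
            continue
        placement = dict(zip(operators, assignment))
        cost = sum(latency[placement[u]][placement[v]] for u, v in edges)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


if __name__ == "__main__":
    nodes = ["n1", "n2"]
    latency = {"n1": {"n1": 0, "n2": 10}, "n2": {"n1": 10, "n2": 0}}
    capacity = {"n1": 3, "n2": 3}  # hypothetical limit: 3 operators per node
    print(best_placement(EDGES, nodes, latency, capacity))
```

With two nodes of three slots each, the cheapest split keeps operators 1, 2 and 3 together, so only streams (2,4) and (3,5) cross the network.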
Big data centers
• Which frameworks for data stream processing?
• Usually run in locally distributed clusters within large data centers
• Assumptions:
  – Scale out and not scale up
    • Commodity servers
    • Data-parallelism is king
  – Software designed for failure
    • See [Dea09]
(Image source: Google)
Apache Storm
• Apache Storm
  – Open-source, real-time, scalable streaming system
  – Provides an abstraction layer to execute DSP applications
• Topology (streaming graph)
  – Spouts (data sources) and bolts (operators and data sinks)
[Figure: example topology of spouts and bolts connected by streams]
Storm entities
• Task: operator instance
• Executor: smallest schedulable entity
  – Executes one or more tasks related to the same operator
• Worker process: Java process running a subset of executors
• Worker node: computing resource, a container for worker processes
[Figure: a worker process (Java process) containing executors (threads), each running one or more tasks]
Storm architecture
Other frameworks (partial list)
• Cloud-based frameworks
  – Amazon Kinesis
  – Google Cloud Dataflow
  – Microsoft
• Apache Spark
  – Improve MapReduce (batch processing)
  – Spark Streaming: reduce the size of each stream and process streams of data (micro-batch processing)
[Figure labels: (e.g., Apache Storm), (e.g., Apache Spark)]
A new breadth of frameworks
• Lambda architecture
  – Data-processing design pattern to handle massive quantities of data and integrate batch and real-time processing within a single framework
Source: https://voltdb.com/products/alternatives/lambda-architecture
Challenges in data stream processing
Challenge 1: Optimize the DSP application
• Apply some transformation to the streaming graph
  – At design time or run-time
• Operator reordering [Hir14]
  – To avoid unnecessary data transfers
• Redundancy elimination [Hir14]
[Figure: operator reordering, A B becomes B A; redundancy elimination, a graph over operators A, B, B, C, D becomes a graph over A, B, C, D]
Challenge 1: Optimize the DSP application
• Operator separation [Hir14]
• Fusion [Hir14]
[Figure: operator separation, A becomes A1 A2; fusion, A B becomes AB]
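Fusion can be pictured as function composition: two operators that would otherwise exchange tuples over a stream are merged into one operator AB. The sketch below is a minimal illustration; `fuse`, `op_a` and `op_b` are hypothetical names, and real systems also have to consider state and threading when fusing.

```python
from typing import Callable, Iterable, List


def fuse(*operators: Callable) -> Callable:
    """Return a single operator that applies the given operators in order."""
    def fused(item):
        for operator in operators:
            item = operator(item)
        return item
    return fused


def op_a(x: int) -> int:
    """Hypothetical operator A: increments its input."""
    return x + 1


def op_b(x: int) -> int:
    """Hypothetical operator B: doubles its input."""
    return x * 2


def run(operator: Callable, stream: Iterable[int]) -> List[int]:
    """Apply one operator to every item of a finite stream."""
    return [operator(item) for item in stream]


if __name__ == "__main__":
    op_ab = fuse(op_a, op_b)  # A followed by B, without an intermediate stream
    print(run(op_ab, [1, 2, 3]))
```

Operator separation is the inverse transformation: an operator is split into smaller stages (A1, A2) that can be placed or scaled independently.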
Challenge 2: Place the operators
• Operator placement decision: a complex problem
  – Trade communication cost against resource utilization
• When
  – Initial (static) operator placement
    • Can be more expensive and comprehensive
  – Can also be at run-time
    • Move only relocatable operators
    • Require operator migration
• See Part II
Challenge 3: Manage load variations
• Typical stream processing workloads are:
  – with high volume and high rates
  – bursty and with workload spikes not known in advance
    • Twitter in 2013: rate of tweets per second = 5700…
    • but significant peak of 144,000 tweets per second
Challenge 3: Manage load variations
• Possible approaches:
  – Admission control
  – Static reservation
    • Reserve specific resources in advance
    • Cons: over-provisioning and cost increase
  – Apply dynamic techniques such as load shedding
    • Selectively drop tuples at strategic points (e.g., when CPU usage exceeds a specific limit)
    • Cons: sacrifice accuracy and completeness
[Figure: a Shedder operator placed in front of operator A]
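A load shedder of the kind described above can be sketched as a small admission filter placed in front of an operator. This is an illustrative random-drop policy under assumed parameters (`cpu_limit`, `drop_probability`); real shedders often pick tuples by their estimated utility rather than at random.

```python
import random
from typing import Optional


class Shedder:
    """Drops tuples at random while the monitored CPU usage exceeds a limit."""

    def __init__(self, cpu_limit: float, drop_probability: float,
                 seed: Optional[int] = None) -> None:
        self.cpu_limit = cpu_limit                # e.g., 0.8 for 80% CPU usage
        self.drop_probability = drop_probability  # fraction dropped under overload
        self._rng = random.Random(seed)

    def admit(self, cpu_usage: float) -> bool:
        """Return True if the current tuple should be forwarded downstream."""
        if cpu_usage <= self.cpu_limit:
            return True  # no overload: never drop
        return self._rng.random() >= self.drop_probability


if __name__ == "__main__":
    shedder = Shedder(cpu_limit=0.8, drop_probability=0.5, seed=42)
    forwarded = sum(shedder.admit(cpu_usage=0.95) for _ in range(1000))
    print("forwarded under overload:", forwarded, "of 1000")
```

The cost noted on the slide shows up directly: every dropped tuple is missing from the results, so accuracy and completeness degrade with the drop probability.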
Challenge 3: Manage load variations
• Possible approaches (continued):
  – Use adaptive rate allocation [Bou12]
  – Redistribute load, e.g., determine a new operator placement and relocate operators on computing nodes
    • Cons: available resources could be insufficient
Exploit data parallelism
• Alternative solution:
  – Detect bottleneck
  – Use data-parallelism (aka operator fission [Hir14])
    • Apply SIMD paradigm: concurrent execution of multiple replicas of the same operator on different data portions
    • By hand: possible, but cumbersome
[Figure: operator A replaced by parallel replicas of A between a Split and a Merge operator]
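Operator fission can be sketched as a Split that routes tuples by key, several replicas of the same operator, and a Merge that recombines their outputs. The example below runs the replicas sequentially for clarity; the CRC32-based routing and the per-key sum are assumptions chosen to keep the sketch deterministic and short.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Item = Tuple[str, int]


def split(items: Iterable[Item], num_replicas: int) -> List[List[Item]]:
    """Split: route each (key, value) tuple to a replica chosen by key hash."""
    partitions: List[List[Item]] = [[] for _ in range(num_replicas)]
    for key, value in items:
        index = zlib.crc32(key.encode("utf-8")) % num_replicas
        partitions[index].append((key, value))
    return partitions


def replica_sum(partition: Iterable[Item]) -> Dict[str, int]:
    """One replica of operator A: sums the values per key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def merge(partials: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Merge: key sets are disjoint across replicas, so a union is enough."""
    merged: Dict[str, int] = {}
    for partial in partials:
        merged.update(partial)
    return merged


if __name__ == "__main__":
    items = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
    print(merge(replica_sum(p) for p in split(items, num_replicas=3)))
```

Routing by key keeps all tuples of one key on the same replica, which is what makes the merge a plain union for a keyed aggregation.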
Elastic stream processing
• Exploit elasticity: acquire and release resources when needed
  – At application layer (i.e., data parallelism)
    • Scale out (or scale in) operators
    • Activate (or deactivate) replicated operators [Bel14]
  – At infrastructure layer
    • Scale out (or scale in) computing nodes
Elastic stream processing
• When and how to scale?
  – See Part II
• But elasticity overhead is not zero!
  – In most streaming systems: run a new placement decision to take the new resources into account
  – Dynamic scaling impacts stateful operators
Challenge 4: Self-adapt at run-time
• To cope with a highly dynamic operating environment
  – Unpredictable workload
  – Computational characteristics of operators not known a priori
  – Need to sustain the load during long provisioning times
  – Node availability, network congestion, …
• Exploit run-time adaptation capabilities of streaming systems
• What adaptation actions?
  – Scale the number of operator instances, relocate the operators, …
Self-adaptation framework
• MAPE: Monitor, Analyze, Plan and Execute
• Software reference framework for self-adaptation
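One iteration of a MAPE loop can be sketched as follows, using a simple threshold-based scaling rule as the planning step. The thresholds, the `OperatorStatus` record and the function names are illustrative assumptions; in a real system the Monitor and Execute phases would talk to the streaming framework.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class OperatorStatus:
    name: str
    replicas: int
    utilization: float  # monitored average utilization per replica, in [0, 1]


def analyze(status: OperatorStatus, high: float, low: float) -> int:
    """Analyze: +1 if overloaded, -1 if underloaded, 0 otherwise."""
    if status.utilization > high:
        return 1
    if status.utilization < low:
        return -1
    return 0


def plan(status: OperatorStatus, change: int, max_replicas: int) -> int:
    """Plan: target number of replicas, clamped to [1, max_replicas]."""
    return max(1, min(max_replicas, status.replicas + change))


def mape_iteration(statuses: List[OperatorStatus], high: float = 0.8,
                   low: float = 0.3, max_replicas: int = 8) -> Dict[str, Tuple[int, int]]:
    """Run one Monitor-Analyze-Plan-Execute pass; return the applied actions."""
    actions: Dict[str, Tuple[int, int]] = {}
    for status in statuses:                          # Monitor: one sample per operator
        change = analyze(status, high, low)          # Analyze
        target = plan(status, change, max_replicas)  # Plan
        if target != status.replicas:
            actions[status.name] = (status.replicas, target)
            status.replicas = target                 # Execute (here: update the record)
    return actions


if __name__ == "__main__":
    statuses = [OperatorStatus("counter", 2, 0.9), OperatorStatus("filter", 3, 0.1)]
    print(mape_iteration(statuses))
```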
Distributed Storm
• We developed an extension of Storm [Car15]
• Goals: to provide
  – distributed monitoring
  – distributed placement (see Part II)
  – and adaptation capabilities
• Where: large-scale environment
• Code available on GitHub
  matnar.github.io/uniroma2-storm/
Distributed Storm architecture
Distributed Storm: monitoring
• QoSMonitor (for each worker node)
  – Estimate network latencies
    • Use a network coordinate system
    • Vivaldi’s algorithm [Dab04]: decentralized and gossip-based
  – Monitor QoS attributes
    • Node utilization and availability
• WorkerMonitor (for each worker process)
  – Monitor exchanged data rate among the operators
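The core idea of a Vivaldi-style network coordinate system is that each node nudges its own coordinate so that the Euclidean distance to a peer approaches the measured round-trip time. The sketch below shows a single update step with a constant step size `delta`; this is a simplification (the full algorithm in [Dab04] adapts the step using error estimates), and the handling of coincident coordinates is an assumption of this example.

```python
import math
from typing import List


def vivaldi_update(xi: List[float], xj: List[float], rtt: float,
                   delta: float = 0.25) -> List[float]:
    """Move coordinate xi so that its distance to xj gets closer to rtt."""
    diff = [a - b for a, b in zip(xi, xj)]
    distance = math.sqrt(sum(d * d for d in diff))
    if distance == 0.0:
        # Assumption: coincident nodes are pushed apart along the first axis.
        direction = [1.0] + [0.0] * (len(xi) - 1)
    else:
        direction = [d / distance for d in diff]  # unit vector from xj to xi
    error = rtt - distance  # positive: coordinates are too close
    return [a + delta * error * u for a, u in zip(xi, direction)]


def estimated_latency(xi: List[float], xj: List[float]) -> float:
    """Latency estimate between two nodes: distance between coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


if __name__ == "__main__":
    mine, peer = [0.0, 0.0], [3.0, 4.0]
    for _ in range(20):
        mine = vivaldi_update(mine, peer, rtt=10.0)
    print(mine, estimated_latency(mine, peer))
```

Because each node only needs the coordinates and measured latency of the peers it talks to, the scheme stays decentralized, which matches the gossip-based use described above.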
Distributed Storm: performance
[Figure: performance under a load spike on a subset of nodes; annotation: ~50%]
Self-adaptation challenges
• Adaptation has a non-negligible cost!
  – Run-time reconfigurations can increase latency and reduce application availability
    • Perform adaptation only when needed
  – Costs of operator migrations cannot be neglected
    • Freeze times caused by operator migration
    • How to migrate stateful operators?
Challenge 5: stateful operators
• State complicates things…
  1. Dynamic scaling
  2. Operator re-placement
  3. Recovery from failure
[Slide annotations: “impact state”, “Loss of state!”]
Approaches for stateful migration
• Most streaming systems do not support stateful processing and migration (e.g., Storm)
  – Developers manage state
  – Typically combine with an external system to store state
  – Design complexity
• Requirements for stateful operator migration
  – Safety (i.e., to preserve the consistency of the operations)
  – Application transparency
  – Minimal footprint
Stateful operator migration
• Parallel track approach [Hei14]
• Pause-and-resume approach
  1. Stop migrating task
  2. Save state
  3. Terminate migrating task and start it on new node
  4. Restore state
  5. Resume stream processing
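The pause-and-resume steps can be walked through with a toy stateful task. This is a single-process sketch under stated assumptions: `CountingTask`, `migrate` and the `pending` buffer (tuples that arrived while the task was paused) are hypothetical names, and the state is handed over in memory rather than through external storage.

```python
from typing import Dict, Iterable


class CountingTask:
    """Toy stateful task: counts the words it processes."""

    def __init__(self, node: str) -> None:
        self.node = node
        self.running = True
        self.counts: Dict[str, int] = {}

    def process(self, word: str) -> None:
        if not self.running:
            raise RuntimeError("task is stopped")
        self.counts[word] = self.counts.get(word, 0) + 1

    def save_state(self) -> Dict[str, int]:
        return dict(self.counts)

    def restore_state(self, state: Dict[str, int]) -> None:
        self.counts = dict(state)


def migrate(task: CountingTask, new_node: str, pending: Iterable[str]) -> CountingTask:
    """Move a task to new_node following the pause-and-resume steps."""
    task.running = False               # 1. stop the migrating task
    snapshot = task.save_state()       # 2. save state
    new_task = CountingTask(new_node)  # 3. terminate the old task, start it on the new node
    new_task.restore_state(snapshot)   # 4. restore state
    for word in pending:               # 5. resume: drain tuples buffered during the pause
        new_task.process(word)
    return new_task


if __name__ == "__main__":
    old = CountingTask("node1")
    for w in ["a", "b", "a"]:
        old.process(w)
    new = migrate(old, "node2", pending=["a"])
    print(new.node, new.counts)
```

The time between steps 1 and 5 is the freeze time during which the operator processes nothing, which is why the amount of state to move matters.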
Approaches for stateful migration
• How to identify the portion of state to migrate?
  – Expose an API to let the user manually manage the state [Fer13]
  – Support only partitioned stateful operators [Ged14]
    • Partitioned stateful operators store independent state for each sub-stream identified by a partitioning key
    • Automatically determine, on the basis of a partitioning key, the optimal number of state partitions to be used and migrate
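With partitioned state, only the partitions whose keys are re-assigned need to move. The sketch below keeps one independent counter per partitioning key and migrates a subset of keys between two replicas; the class and function names are illustrative and not taken from [Ged14].

```python
from typing import Callable, Dict, Iterable


class PartitionedCounter:
    """Stateful operator replica holding independent state per partitioning key."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {}

    def process(self, key: str) -> None:
        self.state[key] = self.state.get(key, 0) + 1

    def extract(self, keys: Iterable[str]) -> Dict[str, int]:
        """Remove and return the state partitions of the given keys."""
        return {key: self.state.pop(key) for key in list(keys) if key in self.state}

    def install(self, partitions: Dict[str, int]) -> None:
        """Adopt state partitions received from another replica."""
        self.state.update(partitions)


def rebalance(source: PartitionedCounter, target: PartitionedCounter,
              should_move: Callable[[str], bool]) -> int:
    """Migrate the partitions selected by should_move; return how many moved."""
    moving = [key for key in source.state if should_move(key)]
    partitions = source.extract(moving)
    target.install(partitions)
    return len(partitions)


if __name__ == "__main__":
    first, second = PartitionedCounter(), PartitionedCounter()
    for key in ["a", "b", "a", "c"]:
        first.process(key)
    rebalance(first, second, should_move=lambda key: key in {"b", "c"})
    print(first.state, second.state)
```

After the move, the upstream routing must send tuples of the migrated keys to the new replica, otherwise the per-key state would be split across replicas.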
Elastic stateful migration in Storm
• We developed mechanisms for elastic stateful migration in Storm [Car16]
• Code on GitHub
  matnar.github.io/elastic-storm/
[Figure: architecture. Labels: Supervisor (x4), worker process, worker slot, DDS (x4), Network, scheduler, MigrationNotifier, ElasticityManager, Nimbus, ZooKeeper]
Elastic stateful migration in Storm
• Scaling decisions at the framework level
  – Adapt the number of parallel instances for each application operator
  – Simple threshold-based scaling policy (see Part II)
• Relocate the operator internal state on a different node and enable Storm to change the application deployment at run-time
[Figure: migration timeline. Migrating task: MIGRATION NOTIFIED, MIGRATION MODE, SAVE STATE, first synchronization barrier, the migrating task can be terminated. New task: MIGRATION MODE, RESTORE STATE (if any), second synchronization barrier, streams are resumed, OPERATIONAL MODE. Other labels: DDS, time]
Performance results
• DSP application: frequent pattern detection
• Elastic scaling and stateful migration improve the application latency
[Figure: application latency (ms) over time (s), with and without E+SM, for data rates of 120, 250, 350 and 900 tweets/s; markers for data rate, scaling and scheduling events]
[Figure: number of executors over time (s) with E+SM, for the same data rates; markers for data rate, scaling and scheduling events]
Challenge 6: guarantee fault tolerance
• DSP applications run for long time intervals: failures are unavoidable
• Possible solutions:
  – Active replication [Bri09]
  – Check-pointing [Seb11]
  – Replay logs [Bal08]
  – Hybrid solutions [Zha10]
• These solutions offer different trade-offs between run-time cost in absence of failures and recovery cost
• Large-scale complicates things…
  – Network partitions and CAP theorem
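Check-pointing and replay logs are often combined: the operator periodically saves its state together with its position in the input log, and after a failure it restores the last checkpoint and replays only the tuples received since then. The sketch below illustrates the idea with an in-memory checkpoint; a real system would write it to stable or replicated storage, and `CheckpointedSum` is a hypothetical name.

```python
from typing import List, Tuple


class CheckpointedSum:
    """Running sum that can recover from its last checkpoint plus a replay log."""

    def __init__(self) -> None:
        self.total = 0
        self.offset = 0  # number of log entries already applied
        self._checkpoint: Tuple[int, int] = (0, 0)

    def process(self, value: int) -> None:
        self.total += value
        self.offset += 1

    def checkpoint(self) -> None:
        """Save the state together with the log position it corresponds to."""
        self._checkpoint = (self.total, self.offset)

    def recover(self, log: List[int]) -> None:
        """Restore the last checkpoint, then replay the tuples logged after it."""
        self.total, self.offset = self._checkpoint
        for value in log[self.offset:]:
            self.process(value)


if __name__ == "__main__":
    log = [1, 2, 3, 4, 5]
    op = CheckpointedSum()
    for value in log[:3]:
        op.process(value)
    op.checkpoint()
    op.process(log[3])
    op.total = -1    # simulate state lost in a failure
    op.recover(log)  # restore the checkpoint, replay entries 4 and 5
    print(op.total, op.offset)
```

The trade-off named above is visible here: frequent checkpoints cost more at run-time but shorten the replay, and so the recovery time.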
Challenge 7: Manage multiple concurrent DSP applications
• Consider multiple competing DSP applications
• How should the streaming system allocate resources?
  – Fairness
  – Resource utilization
  – Profitability, …
Apache Mesos
• Run concurrent frameworks on the same cluster and dynamically share the cluster resources
• Mesos: a cluster “operating system” [Hin11]
  – Efficient resource isolation and sharing across distributed frameworks
Apache Mesos
• Two-level scheduling based on Dominant Resource Fairness (DRF) algorithm
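The intuition behind Dominant Resource Fairness is to repeatedly give the next task to the user whose dominant share (its largest share of any single resource) is currently the smallest. The following offline sketch illustrates that rule only; the function name, the cluster size and the per-task demands are assumptions for the example, and it does not model how Mesos offers resources to frameworks.

```python
from fractions import Fraction
from typing import Dict, Sequence


def drf_allocate(total: Sequence[int], demands: Dict[str, Sequence[int]]) -> Dict[str, int]:
    """Return the number of tasks each user gets under a simple DRF loop.

    total holds the cluster capacity per resource; demands maps each user to
    its per-task demand vector (assumed to have at least one positive entry).
    """
    used = [0] * len(total)
    tasks = {user: 0 for user in demands}

    def dominant_share(user: str) -> Fraction:
        # Largest fraction of any resource currently held by this user.
        return max(Fraction(tasks[user] * d, t) for d, t in zip(demands[user], total))

    while True:
        # Try users in order of increasing dominant share.
        for user in sorted(demands, key=dominant_share):
            demand = demands[user]
            if all(u + d <= t for u, d, t in zip(used, demand, total)):
                used = [u + d for u, d in zip(used, demand)]
                tasks[user] += 1
                break
        else:
            return tasks  # no user's next task fits: the cluster is full


if __name__ == "__main__":
    # Hypothetical cluster with 9 CPUs and 18 GB of memory.
    print(drf_allocate((9, 18), {"A": (1, 4), "B": (3, 1)}))
```

In this example user A is memory-bound and user B is CPU-bound, and the loop ends with both holding the same dominant share of 2/3.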
GMesos: distributed Mesos
• We are currently developing GMesos for large-scale environments… stay tuned!
Some new challenges and research opportunities
• Integrate data stream processing with SDN
  – With SDN, bring the network into the control loop
• Study cross-layer optimization
• Address security and privacy issues in data stream processing
References
[And14] H. C. M. Andrade, B. Gedik, D. S. Turaga, “Fundamentals of Stream Processing: Application Design, Systems, and Analytics”, Cambridge University Press, 2014.
[Art14] A. Artikis et al., “Heterogeneous stream processing and crowdsourcing for urban traffic management”, In Proc. of EDBT ’14, 2014.
[Bal08] M. Balazinska, H. Balakrishnan, S. Madden, M. Stonebraker, “Fault-tolerance in the Borealis distributed stream processing system”, ACM Trans. Database Syst. 33, 1, 2008.
[Bel14] P. Bellavista, A. Corradi, S. Kotoulas, A. Reale, “Adaptive Fault-Tolerance for Dynamic Resource Provisioning in Distributed Stream Processing Systems”, In Proc. of EDBT ’14, 2014.
[Bou12] I. Boutsis, V. Kalogeraki, “RADAR: Adaptive rate allocation in distributed stream processing systems under bursty workloads”, Proc. of SRDS ’12, 2012.
[Bri09] A. Brito, C. Fetzer and P. Felber, “Multithreading-enabled active replication for event stream processing operators”, In Proc. of SRDS ’09, 2009.
[Car15] V. Cardellini, V. Grassi, F. Lo Presti, M. Nardelli, “Distributed QoS-aware scheduling in Storm”, Proc. of ACM DEBS ’15, 2015.
[Car16] V. Cardellini, M. Nardelli, D. Luzi, “Elastic stateful stream processing in Storm”, Proc. of HPCS ’16, 2016.
[Dab04] F. Dabek, R. Cox, F. Kaashoek, R. Morris, “Vivaldi: A decentralized network coordinate system”, SIGCOMM Comput. Commun. Rev. 34, 4, 2004.
[Dea09] J. Dean, “Design, Lessons and Advice from Building Large Distributed Systems”, In LADIS ’09, 2009.
[DEBS14GC] Z. Jerzak, H. Ziekow, “The DEBS 2014 grand challenge”, In Proc. of ACM DEBS ’14, 2014.
[DEBS15GC] Z. Jerzak, H. Ziekow, “The DEBS 2015 grand challenge”, In Proc. of ACM DEBS ’15.
[DEBS16GC] V. Gulisano, Z. Jerzak, S. Voulgaris, H. Ziekow, “The DEBS 2016 grand challenge”, In Proc. of ACM DEBS ’16, 2016.
[Fer13] R. Fernandez, M. Migliavacca, E. Kalyvianaki, and P. Pietzuch, “Integrating scale out and fault tolerance in stream processing using operator state management”, In Proc. of ACM SIGMOD ’13, 2013.
[Ged14] B. Gedik, S. Schneider, M. Hirzel, and K.-L. Wu, “Elastic scaling for data stream processing”, IEEE Trans. Parallel Distrib. Syst. 25, 6, 2014.
[Gol03] L. Golab, M. T. Özsu, “Issues in data stream management”, ACM SIGMOD Rec. 32, 2, 2003.
[Hei14] T. Heinze, L. Aniello, L. Querzoni, and Z. Jerzak, “Cloud-based data stream processing”, In Proc. of ACM DEBS ’14, 2014.
[Hin11] B. Hindman et al., “Mesos: a platform for fine-grained resource sharing in the data center”, In Proc. of OSDI ’11, 2011.
[Hir14] M. Hirzel, R. Soulé, S. Schneider, B. Gedik, R. Grimm, “A catalog of stream processing optimizations”, ACM Comput. Surv. 46, 4, 2014.
[Seb11] Z. Sebepou, K. Magoutis, “CEC: Continuous eventual checkpointing for data stream processing operators”, In Proc. of DSN ’11, 2011.
[Zha10] Z. Zhang et al., “A hybrid approach to high availability in stream processing systems”, In Proc. of ICDCS ’10, 2010.
Thank you! Any questions?
www.ce.uniroma2.it/~valeria