
Jockey: Guaranteed Job Latency in Data Parallel Clusters

Andrew D. Ferguson ([email protected])
Peter Bodík ([email protected])
Srikanth Kandula ([email protected])
Eric Boutin ([email protected])
Rodrigo Fonseca ([email protected])

The Cosmos Environment

Cosmos is Microsoft's data parallel processing environment
• It primarily supports Bing, Microsoft AdCenter, and MSN
• Cosmos clusters contain 1000s of commodity servers, each running multiple tasks for many jobs
• Resources are managed by granting tokens to tasks
• Tokens are de-normalized weights in the scheduler and guarantee a fixed slice of CPU and memory
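As a rough sketch of the token abstraction above (the class name and the per-token CPU and memory figures are hypothetical, not Cosmos's actual values), a token can be modeled as a fixed resource slice whose guarantee scales linearly with the number of tokens granted:

```python
from dataclasses import dataclass

# Hypothetical per-token slice; the poster does not state Cosmos's real values.
CPU_CORES_PER_TOKEN = 1.0
MEMORY_GB_PER_TOKEN = 2.0

@dataclass
class TokenGrant:
    """A de-normalized scheduler weight: each token guarantees a fixed CPU/memory slice."""
    tokens: int

    def guaranteed_cpu_cores(self) -> float:
        return self.tokens * CPU_CORES_PER_TOKEN

    def guaranteed_memory_gb(self) -> float:
        return self.tokens * MEMORY_GB_PER_TOKEN

# Example: a job granted 30 tokens is guaranteed 30 cores and 60 GB of memory
# under the assumed per-token slice.
grant = TokenGrant(tokens=30)
print(grant.guaranteed_cpu_cores(), grant.guaranteed_memory_gb())
```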

Deadlines and Varying Latency

• Users of data parallel clusters now demand predictable latency
• Predictable latency can be required for deadlines with business partners; missing a deadline can have financial consequences

It would be easy to provide deadlines if job latency had low variance; unfortunately, it does not.

Expressing Performance Targets

Users provide utility curves to express performance targets
• For single jobs, scale doesn't matter
• For multiple jobs, use financial penalty

[Figure: example utility curve as a function of job completion time, with the deadline marked.]
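A minimal sketch of such utility curves in Python, assuming a simple step shape at the deadline; the function names and the reward/penalty amounts are illustrative, not taken from the poster:

```python
# Deadline-based utility curves (illustrative sketch).

def single_job_utility(completion_minutes: float, deadline_minutes: float) -> float:
    """For a single job the absolute scale is irrelevant: any curve that is
    higher before the deadline than after expresses the same preference."""
    return 1.0 if completion_minutes <= deadline_minutes else 0.0

def multi_job_utility(completion_minutes: float, deadline_minutes: float,
                      reward: float, penalty: float) -> float:
    """When several jobs compete, expressing utility as money (a reward for an
    on-time finish, a financial penalty for a miss) makes their curves comparable."""
    return reward if completion_minutes <= deadline_minutes else -penalty

# Example: a job that finishes in 65 minutes against a 60-minute deadline.
print(single_job_utility(65, 60))                              # 0.0
print(multi_job_utility(65, 60, reward=100.0, penalty=500.0))  # -500.0
```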

[Figure: Runtime of Recurring Jobs: CDF of job latency in minutes (0 to 40), showing the wide variance in completion times of recurring jobs.]

Cosmos Components

• CosmosStore: distributed storage layer
• Dryad: data-parallel execution engine
• SCOPE: SQL-like query language for Dryad

[Figure: a Cosmos cluster runs pipelines; each pipeline is made of jobs and each job of stages, with team boundaries marked.]

Why does latency vary?

• Pipeline complexity: users develop multi-stage pipelines of dependent jobs; variance in earlier jobs impacts later ones
• Noisy environment: simultaneous data parallel jobs compete for highly utilized shared resources, which can also fail

Evaluation

How Jockey Managed a Real Job in a Production Cluster

[Figure: the job's resource allocation over time. The "oracle" allocation is the total allocation-hours divided by the deadline. Annotations mark allocation above the oracle, resources released due to excess pessimism, and the deadline being lowered from 140 min. to 70 min.]
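A quick worked example of the oracle allocation defined in the figure caption above; the job size and deadline are hypothetical numbers chosen only to make the arithmetic concrete:

```python
# Hypothetical job: 70 allocation-hours of work, 140-minute deadline.
total_allocation_hours = 70.0
deadline_hours = 140.0 / 60.0   # 140 minutes expressed in hours

# "Oracle" allocation: the constant allocation that finishes exactly at the deadline.
oracle_allocation = total_allocation_hours / deadline_hours
print(oracle_allocation)  # 30.0 -> a steady allocation of 30 units (e.g., 30 tokens)
```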

Jockey's Performance Relative to Other Allocation Schemes

[Figure: CDF of job completion time relative to the deadline (10% to 130%) for max allocation, Jockey, Jockey w/o adapting, and Jockey w/o simulator, with the deadline marked at 100%.]

Figure annotations:
• Allocated too many resources
• Simulator made good predictions
• Only missed 1 of 94 deadlines
• Control loop is stable and successful

[Figure: fraction of deadlines missed (0% to 20%) versus fraction of allocation above the oracle (0% to 100%), comparing Jockey, Jockey w/o adaptation, Jockey w/o simulator, and max allocation.]

• Compared with a naive max allocation scheme, and with the simulator and the control loop evaluated independently
• 21 jobs in a production cluster, CPU utilization ~80%
• Two metrics: did jobs complete before their deadline, and was the impact on the rest of the cluster minimized?

Jockey

Problem → Solution
• Pipeline complexity → Use a simulator
• Noisy environment → Dynamic control

Conceptually, Jockey is:
1) a function from progress and allocation to remaining run time
2) a control loop which dynamically adjusts the resource allocation

[Diagram: offline, the simulator processes the job profile into latency predictions; during job runtime, the control loop combines job stats, the latency predictions, and the utility function to set the running job's resource allocation.]

Our Goal: maximize utility, while minimizing resources, by dynamically adjusting the allocation
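A minimal sketch of that control loop in Python. The simulator is abstracted as a remaining-time function, the `job` interface is hypothetical, and the policy shown (pick the smallest allocation predicted to meet the deadline) is a simplification of how a deadline utility would be maximized, not Jockey's exact algorithm:

```python
import time
from typing import Callable

# remaining_time(progress, allocation) -> predicted minutes left. In Jockey this
# role is played by the offline simulator's predictions; here it is an abstract
# callable supplied by the caller.
RemainingTimeFn = Callable[[float, int], float]

def choose_allocation(remaining_time: RemainingTimeFn,
                      progress: float,
                      elapsed_minutes: float,
                      deadline_minutes: float,
                      allocations: range) -> int:
    """Pick the smallest allocation (e.g., token count) whose predicted completion
    time still meets the deadline; fall back to the maximum if none does."""
    for alloc in sorted(allocations):
        predicted_completion = elapsed_minutes + remaining_time(progress, alloc)
        if predicted_completion <= deadline_minutes:
            return alloc
    return max(allocations)

def control_loop(job, remaining_time: RemainingTimeFn,
                 deadline_minutes: float, interval_seconds: float = 30.0) -> None:
    """Periodically re-estimate remaining run time from current progress and
    adjust the job's resource allocation (hypothetical `job` interface)."""
    start = time.time()
    while not job.is_complete():
        elapsed = (time.time() - start) / 60.0
        progress = job.progress()          # fraction of work done, 0.0..1.0
        alloc = choose_allocation(remaining_time, progress, elapsed,
                                  deadline_minutes, range(1, 101))
        job.set_allocation(alloc)          # e.g., number of tokens
        time.sleep(interval_seconds)
```

In Jockey itself, the remaining-time predictions come from the simulator run offline over the job profile, as shown in the diagram above.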

Conclusion

• Jockey works without requiring latency guarantees from individual cluster components
• When a shared environment is underloaded, guaranteed latency brings predictability to the user experience
• When a shared environment is overloaded, utility-based resource allocation ensures jobs are completed by importance

Our paper provides details and evaluation of each component.