# The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

Post on 17-Jul-2015


Load Balancing in Stream Processing Systems

The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano, Nicolas Kourtellis, Marco Serafini

International Conference on Data Engineering (ICDE 2015)

Hey everyone,

I am Anis, a PhD student at KTH Royal Institute of Technology.

The work I will present today is the result of my internship at Yahoo Labs Barcelona.

I did this work together with the people listed at the top: Gianmarco, David, Nicolas and Marco.

In our work, we propose a very simple and practical approach to load balancing for stream processing engines.

Stream Processing Engines

Streaming applications: online machine learning, real-time query processing, continuous computation

Streaming frameworks: Storm, Borealis, S4, Samza, Spark Streaming

Stream processing systems are specialized for low-latency, high-throughput processing of real-time data.

A few common domains of streaming applications are online machine learning, real-time query processing and continuous computation.

Due to the need for real-time processing, various frameworks have been proposed in the last decade.

Stream Processing Model

Streaming applications are represented by directed acyclic graphs (DAGs)

[Diagram: two sources send a data stream over data channels to three workers (operators)]

Streaming applications are represented as DAGs. In a DAG, a vertex is a set of operators that are distributed across the cluster and apply various lightweight transformations to the incoming stream. For example, filter, join, union and aggregate are common stream transformations. Edges are data channels that are used to route the data from one operator to another. In stream processing systems, there are various stream grouping strategies for routing the keys from one level of operators to the next level of operators. In today's talk, I will concentrate on load balancing between the groups of operators at each level of the DAG.

Stream Grouping

Key or Fields Grouping: hash-based assignment; stateful operations, e.g., page rank, degree count

Shuffle Grouping: round-robin assignment; stateless operations, e.g., data logging, OLTP

Key grouping uses hash-based assignment: it applies a hash function to the incoming data and assigns the message to a worker by taking the hash modulo the number of workers. Key grouping is a good choice for stateful operators such as aggregates. Shuffle grouping is a simple round-robin assignment scheme and is a good choice for stateless operators.

Key Grouping

Key grouping: scalable, low memory, load imbalance

To understand stream grouping strategies, let's take word count as an example. Suppose we want to count the words in tweets from Twitter.

To implement Twitter word count, we need two sets of operators: a first level of operators to split the tweets into words, and a next level of operators to count the words.

As we know, many real workloads are highly skewed.

Shuffle Grouping

Shuffle grouping: load balance, memory O(W), aggregation O(W)
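The word count example can be sketched in a few lines to show the trade-off between the two groupings. This is only an illustration under assumptions: the tiny skewed stream, the use of `zlib.crc32` as the hash function, and the names `key_grouping`, `make_shuffle_grouping` and `run` are all made up here and are not taken from any stream processing engine.

```python
import zlib
from collections import Counter
from itertools import cycle

NUM_WORKERS = 3  # assumed cluster size for the illustration


def key_grouping(word: str, num_workers: int) -> int:
    """Hash-based assignment: the same word always goes to the same worker."""
    # crc32 is used only because it is deterministic across runs (an assumption;
    # real engines use their own hash functions).
    return zlib.crc32(word.encode("utf-8")) % num_workers


def make_shuffle_grouping(num_workers: int):
    """Round-robin assignment: workers are picked in turn, ignoring the key."""
    workers = cycle(range(num_workers))
    return lambda _word: next(workers)


def run(words, route, num_workers):
    """Route every word and return one partial word count per worker."""
    counts = [Counter() for _ in range(num_workers)]
    for word in words:
        counts[route(word)][word] += 1
    return counts


# A small, skewed stream: "the" dominates, as frequent words do in tweets.
stream = ["the"] * 6 + ["cat", "sat", "mat"]

kg = run(stream, lambda w: key_grouping(w, NUM_WORKERS), NUM_WORKERS)
sg = run(stream, make_shuffle_grouping(NUM_WORKERS), NUM_WORKERS)

kg_loads = [sum(c.values()) for c in kg]
sg_loads = [sum(c.values()) for c in sg]
print("key grouping loads:    ", kg_loads)
print("shuffle grouping loads:", sg_loads)
print("workers holding 'the': ",
      sum("the" in c for c in kg), "(KG) vs", sum("the" in c for c in sg), "(SG)")
```

With key grouping, one worker receives every occurrence of the frequent word, so its load is at least 6 of the 9 messages. With shuffle grouping the loads are even, but every worker ends up holding a partial counter for the same word, which is where the O(W) memory and aggregation cost comes from.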

Problem Formulation

Input is an unbounded sequence of messages from a key distribution

Each message is assigned to a worker for processing

Load balance properties: memory load balance, network load balance, processing load balance

Metric: Load Imbalance

Talk about different load balance schemes: memory, network, processing.

In particular, we are interested in balancing the processing workload, that is, the total number of messages processed.

Power of two choices

Balls-and-bins problem

Algorithm: for each ball, pick two bins uniformly at random; assign the ball to the least loaded of the two bins

Issues: distributed consensus on keys, skewed distribution, continuous data, load information

Img source: http://s17.postimg.org/qqctbpftr/Galton_prime_box.jpg

An elegant solution is to use the power of two choices.

In the past, it was introduced as a balls-and-bins problem.

Given a ball (a key in our setup) and a set of bins (workers), pick two random bins, check which one holds fewer balls, and send the new ball to that one.

Surprisingly, this simple strategy leads to an imbalance that is independent of the number of balls thrown at the bins.

So this looks like a good solution. However, when applied in DSPEs, it has some complications.

It is better than hashing: the power of two choices picks two bins at random and puts the ball in the less loaded one, which gives better load balance.
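The balls-and-bins procedure described above can be tried out directly. The sketch below compares a single random choice with two choices. The numbers of balls and bins, the seed, and the helper names are arbitrary picks for illustration.

```python
import random


def single_choice(num_balls: int, num_bins: int, rng: random.Random) -> list[int]:
    """Baseline: each ball goes to one bin picked uniformly at random."""
    bins = [0] * num_bins
    for _ in range(num_balls):
        bins[rng.randrange(num_bins)] += 1
    return bins


def two_choices(num_balls: int, num_bins: int, rng: random.Random) -> list[int]:
    """For each ball, pick two bins uniformly at random and use the emptier one."""
    bins = [0] * num_bins
    for _ in range(num_balls):
        a, b = rng.randrange(num_bins), rng.randrange(num_bins)
        target = a if bins[a] <= bins[b] else b
        bins[target] += 1
    return bins


def imbalance(bins: list[int]) -> float:
    """Maximum load minus average load."""
    return max(bins) - sum(bins) / len(bins)


rng = random.Random(42)  # fixed seed so the run is repeatable
one = single_choice(100_000, 50, rng)
two = two_choices(100_000, 50, rng)
print("single choice imbalance:", imbalance(one))
print("two choices imbalance:  ", imbalance(two))
```

Note that this version needs the current load of both bins at every step, which is exactly the load information that is hard to obtain in a distributed setting.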

Partial Key Grouping (PKG)

Key splitting: split each key across two servers; assign each instance using the power of two choices

Local load estimation: each source estimates the load on the workers using its local routing history

[Diagram: two sources routing messages to three workers]

Partial Key Grouping (PKG)

Key splitting: distributed, stateless, handles skew

Local load estimation: no coordination among sources, no communication with workers

Partial Key Grouping

PKG: load balance, memory O(1), aggregation O(1)

It uses at most twice the memory of key grouping.
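A minimal sketch of how key splitting and local load estimation fit together is shown below. It is not the Apache Storm implementation. The class name `PKGSource`, the salted `zlib.crc32` hash pair, the Zipf-like workload and the two-source setup are all assumptions chosen to keep the example small.

```python
import random
import zlib


def two_candidates(key: str, num_workers: int) -> tuple[int, int]:
    """Key splitting: two hash functions give each key two candidate workers."""
    data = key.encode("utf-8")
    # Salting crc32 with a prefix is an assumption made to get two hash functions.
    h1 = zlib.crc32(b"h1:" + data) % num_workers
    h2 = zlib.crc32(b"h2:" + data) % num_workers
    return h1, h2


class PKGSource:
    """One source that routes with PKG using only its own routing history."""

    def __init__(self, num_workers: int):
        self.num_workers = num_workers
        self.local_load = [0] * num_workers  # messages this source sent per worker

    def route(self, key: str) -> int:
        a, b = two_candidates(key, self.num_workers)
        target = a if self.local_load[a] <= self.local_load[b] else b
        self.local_load[target] += 1  # local estimate, no worker feedback
        return target


NUM_WORKERS = 5
sources = [PKGSource(NUM_WORKERS), PKGSource(NUM_WORKERS)]
true_load = [0] * NUM_WORKERS        # what the workers actually receive
workers_per_key: dict[str, set[int]] = {}

rng = random.Random(7)
keys = [f"k{i}" for i in range(100)]
weights = [1.0 / (i + 1) for i in range(100)]  # Zipf-like skew (assumed workload)
for key in rng.choices(keys, weights=weights, k=20_000):
    source = rng.choice(sources)     # messages arrive at either source
    worker = source.route(key)
    true_load[worker] += 1
    workers_per_key.setdefault(key, set()).add(worker)

print("true load per worker:", true_load)
print("max workers per key: ", max(len(w) for w in workers_per_key.values()))
```

Because both sources use the same pair of hash functions, they agree on the two candidates for every key without exchanging any messages, and no key is ever handled by more than two workers.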

Analysis: Chromatic Balls and Bins

Problem Formulation

Messages are drawn from a key distribution where the probabilities of the keys are p1, p2, p3, ..., pn

Each key has d choices out of n workers

Minimize the difference between maximum and average workload

Analysis

Necessary condition: if pi represents the probability of occurrence of a key i

Bounds:
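A sketch of why some condition on the key probabilities is necessary, using only the setup above. The symbol $m$ (the number of messages seen so far) is introduced here, and this is an expected-value argument rather than the statement from the slide:

```latex
\begin{align*}
\text{messages carrying key } i &\approx p_i\, m
  && \text{(in expectation, out of } m \text{ messages)} \\
\text{load of its busiest candidate worker} &\ge \frac{p_i\, m}{d}
  && \text{(the key is spread over at most } d \text{ workers)} \\
\text{average load} &= \frac{m}{n} \\
\text{imbalance} &\ge \frac{p_i\, m}{d} - \frac{m}{n}
  = \left(\frac{p_i}{d} - \frac{1}{n}\right) m
\end{align*}
```

So unless $p_i \le d/n$ for every key, the imbalance grows linearly with the number of messages. With two choices per key this reads $p_i \le 2/n$.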

Streaming Applications

Most algorithms that use Shuffle Grouping can be expressed using Partial Key Grouping to reduce memory footprint and aggregation overhead

Algorithms that use Key Grouping can be rewritten to achieve load balance

Streaming Examples

Naïve Bayes Classifier

Streaming Parallel Decision Trees

Heavy Hitters and Space Saving
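As a rough illustration of what "rewritten" means for a key-grouped algorithm, here is word count with its per-word state split over two workers. It is a single-source sketch. The class `PKGWordCount`, the salted `zlib.crc32` hashes and the sample words are assumptions, and the examples listed above are more involved than this.

```python
import zlib
from collections import Counter


def candidates(key: str, num_workers: int) -> tuple[int, int]:
    """Two candidate workers per key (salted crc32 is an illustrative choice)."""
    data = key.encode("utf-8")
    return (zlib.crc32(b"h1:" + data) % num_workers,
            zlib.crc32(b"h2:" + data) % num_workers)


class PKGWordCount:
    """Word count where each word's state is split over at most two workers."""

    def __init__(self, num_workers: int):
        self.num_workers = num_workers
        self.load = [0] * num_workers
        self.partial = [Counter() for _ in range(num_workers)]

    def process(self, word: str) -> None:
        a, b = candidates(word, self.num_workers)
        w = a if self.load[a] <= self.load[b] else b
        self.load[w] += 1
        self.partial[w][word] += 1      # worker-local partial count

    def count(self, word: str) -> int:
        # Aggregation touches only the word's two candidates, not all W workers.
        return sum(self.partial[w][word]
                   for w in set(candidates(word, self.num_workers)))


wc = PKGWordCount(num_workers=4)
for word in ["the"] * 6 + ["cat", "sat", "the", "mat"]:
    wc.process(word)

print(wc.count("the"), wc.count("cat"), wc.count("dog"))  # prints: 7 1 0
```

The only change compared with a key-grouped counter is the final merge of at most two partial counts per word, which is the constant aggregation overhead.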

Motivate the people to read the paper.

Stream Grouping: Summary

| Grouping | Pros | Cons |
| --- | --- | --- |
| Key Grouping | Scalable, Memory | Load Imbalance |
| Shuffle Grouping | Load Balance | Memory O(W), Aggregation O(W) |
| Partial Key Grouping | Scalable, Load Balance | Memory O(1), Aggregation O(1) |

Experiments

What is the effect of key splitting on PoTC?

How does local estimation compare to a global oracle?

How does PKG perform on a real deployment on Apache Storm?

We performed various experiments to assess the performance of our technique.

We compare with pure PoTC to study the effect of key splitting.

We study how close local load estimation gets to the balance achieved with global information.

We study how robust PKG is to shifting skew.

We implemented PKG on Apache Storm to see how well it does on a real DSPE.

Experimental Setup

Metric: the difference between the maximum and the average load of the workers at time t

Datasets:
Twitter, 1.2G tweets (crawled July 2012)
Wikipedia, 22M access logs
Twitter, 690K cashtags (crawled Nov 2013)
Social Networks, 69M edges
Synthetic, 10M keys

We measured imbalance in the system: the maximum load observed across workers minus the average load across workers.
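Written as a formula, with $L_i(t)$ denoting the load of worker $i$ at time $t$ and $W$ the number of workers (this notation is assumed here, since the metric is only stated in words):

```latex
I(t) = \max_{i} L_i(t) - \frac{1}{W} \sum_{i=1}^{W} L_i(t)
```

A perfectly balanced system has $I(t) = 0$, and the value grows as one worker pulls ahead of the rest.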

We compared global load information, local estimation, and a version that probes the load on the workers at regular intervals.

We used different real and synthetic datasets: tweets, Wikipedia page accesses, etc.

Effect of Key Splitting

Here, we compare PKG with regular PoTC, an online and an offline greedy algorithm, and hashing.

We show the average imbalance across time for different numbers of workers, for Wikipedia and Twitter.

Online greedy picks the least loaded worker to handle a new key.

Offline greedy first sorts the keys by frequency and then executes online greedy.

Hashing just applies a hash function to the keys (so it's the KG version).
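The two greedy baselines can be sketched as follows. This follows the one-line descriptions above and is not the evaluation code. The function names and the toy stream are assumptions.

```python
from collections import Counter


def online_greedy(stream, num_workers):
    """Send each new key to the currently least loaded worker, then stick to it."""
    load = [0] * num_workers
    assignment = {}
    for key in stream:
        if key not in assignment:
            assignment[key] = load.index(min(load))
        load[assignment[key]] += 1
    return load


def offline_greedy(stream, num_workers):
    """Sort keys by frequency (most frequent first), then place them greedily."""
    load = [0] * num_workers
    for _key, freq in Counter(stream).most_common():
        load[load.index(min(load))] += freq
    return load


def imbalance(load):
    """Maximum load minus average load."""
    return max(load) - sum(load) / len(load)


stream = ["a", "b", "c", "c", "c", "c", "a", "a"]
online = online_greedy(stream, 2)
offline = offline_greedy(stream, 2)
print("online greedy: ", online, imbalance(online))
print("offline greedy:", offline, imbalance(offline))
```

On this toy stream online greedy ends with loads `[7, 1]`, because it commits to a worker before it knows how frequent a key will become, while offline greedy ends with `[4, 4]`. Offline greedy needs the key frequencies in advance, so it serves as a reference point rather than a practical scheme.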

-> Hashing performs the worst.

-> PKG performs very well and similar to the Greedy algorithms

-> Adding workers increases imbalance.

Hashing is the single choice paradigm

Local Load Estimation

This is the average imbalance across the system and time, for Twitter, Wikipedia, Cashtags and a synthetic lognormal distribution.

We study how well the local load estimation compares with global information.

It is always very close, regardless of how many sources we allow the system to have.

Some different patterns for the various datasets, but the trend is the same.

Hashing is worst.

Hashing is just for reference

Real deployment: Apache Storm

These are some results from a real deployment on Storm.

We tried to simulate different processing times per key. For example, reading 400 KB from memory takes 0.1 ms, and 1/10th of a disk seek takes 1 ms.

In the left plot, there is no aggregation phase; we apply different CPU loads per key and measure the throughput the system supports.

In the right plot, we keep the CPU load constant at 0.4 ms and vary how often we aggregate. 0.4 ms is close to the saturation point of the system. The less frequent the aggregation, the higher the memory cost.

We see that PKG can offer similar or better throughput than SG for the same or smaller memory cost.

Conclusion

Partial Key Grouping (PKG) reduces the load imbalance by up to seven orders of magnitude compared to Key Grouping

PKG imposes constant memory and aggregation overhead, i.e., O(1), compared to Shuffle Grouping that is O(W)

Apache Storm: 60% improvement in throughput, 45% improvement in latency

PKG has been integrated in Apache Storm ver 0.10.

Future Work

Load Balancing for Stateful Operators using key migration

Adaptive Load Balancing for highly skewed data

Load Balancing for graph processing systems