A Model of Computation for MapReduce
Karloff, Suri and Vassilvitskii (SODA'10)
Presented by Ning Xie
Why MapReduce
Tera- and petabyte data sets (search engines, internet traffic, bioinformatics, etc.)
Need parallel computing
Requirements: easy to program, reliable, distributed
What is MapReduce
A new framework for parallel computing, originally developed at Google (before '04)
Widely adopted; it has become a standard for large-scale data analysis
Hadoop (the open-source version) is used by Yahoo, Facebook, Adobe, IBM, Amazon, and many institutions in academia
What is MapReduce (cont.)
Three-stage operations:
• Map stage: a mapper operates on a single pair <key, value> and outputs any number of new pairs <key', value'>
• Shuffle stage: all values associated with an individual key are sent to a single machine (done by the system)
• Reduce stage: a reducer operates on all the values for a key and outputs a multiset of <key, value> pairs
What is MapReduce (cont.)What is MapReduce (cont.)
MapMap operation is stateless (parallel) operation is stateless (parallel) ShuffleShuffle stage is done automatically stage is done automatically
by the underlying systemby the underlying system ReduceReduce stage can only start when all stage can only start when all
Map operations are done Map operations are done (interleaving between sequential and (interleaving between sequential and parallel)parallel)
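A minimal single-round simulation of the three stages in Python, with word count as a stand-in workload (the workload and all names here are illustrative, not from the slides):

```python
from collections import defaultdict

def run_round(pairs, mapper, reducer):
    """Simulate one MapReduce round on a list of <key, value> pairs."""
    # Map stage: each mapper call is stateless, so the calls could run in parallel.
    mapped = [out for kv in pairs for out in mapper(kv)]
    # Shuffle stage: the system groups all values by key.
    groups = defaultdict(list)
    for k, v in mapped:
        groups[k].append(v)
    # Reduce stage: starts only after every map call has finished.
    return [out for k, vs in groups.items() for out in reducer((k, vs))]

def mapper(kv):
    _, line = kv
    return [(w, 1) for w in line.split()]

def reducer(kv):
    word, counts = kv
    return [(word, sum(counts))]

pairs = [(0, "to be or not"), (1, "to be")]
print(sorted(run_round(pairs, mapper, reducer)))
# → [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```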
An example: kth frequency moment of a large data set
Input: x ∈ Σ^n, where Σ is a finite set of symbols
Let f(σ) be the frequency of symbol σ in x; note that ∑_σ f(σ) = n
Want to compute ∑_σ f(σ)^k
An example (cont.)
Input to each mapper: <i, x_i>
• M_1(<i, x_i>) = <x_i, i> (i is the index)
Input to each reducer: <x_i, {i_1, i_2, …, i_m}>
• R_1(<x_i, {i_1, i_2, …, i_m}>) = <x_i, m^k>
Each mapper: M_2(<x_i, v>) = <$, v>
A single reducer: R_2(<$, {v_1, …, v_l}>) = <$, ∑_i v_i>
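The two rounds above can be simulated sequentially; a minimal Python sketch, where the `shuffle` helper stands in for the system's shuffle stage:

```python
from collections import defaultdict

def shuffle(pairs):
    """Stand-in for the system's shuffle: group values by key."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups.items()

def kth_frequency_moment(x, k):
    # Round 1: M_1(<i, x_i>) = <x_i, i>; R_1 turns the m indices of a
    # symbol into f(sigma)^k.
    round1 = [(xi, i) for i, xi in enumerate(x)]
    partial = [(sym, len(idxs) ** k) for sym, idxs in shuffle(round1)]
    # Round 2: M_2 routes every partial result to the single key '$';
    # one R_2 instance adds them up.
    round2 = [("$", v) for _, v in partial]
    (_, total), = [("$", sum(vs)) for _, vs in shuffle(round2)]
    return total

print(kth_frequency_moment("abracadabra", 2))
# f: a=5, b=2, r=2, c=1, d=1 → 25 + 4 + 4 + 1 + 1 = 35
```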
Formal Definitions
A MapReduce program consists of a sequence <M_1, R_1, M_2, R_2, …, M_l, R_l> of mappers and reducers
The input is U_0, a multiset of <key, value> pairs
Formal Definitions (cont.)
Execution of the program: for r = 1, 2, …, l:
1. Feed each <k, v> in U_{r-1} to mapper M_r; let the output be U'_r
2. For each key k, construct the multiset V_{k,r} of all values v_i with <k, v_i> ∈ U'_r
3. For each key k, feed k and some arbitrary permutation of V_{k,r} to a separate instance of R_r; let U_r be the multiset of <key, value> pairs generated by R_r
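The three steps above can be written as a small driver loop; a sketch in Python (the parity-sum program at the end is a made-up example, not from the slides):

```python
from collections import defaultdict

def execute(program, u0):
    """Run a MapReduce program <M_1, R_1, ..., M_l, R_l> per the definition."""
    u = list(u0)
    for mapper, reducer in program:
        # Step 1: feed each <k, v> in U_{r-1} to M_r; collect U'_r.
        u_prime = [out for kv in u for out in mapper(kv)]
        # Step 2: group U'_r by key into the multisets V_{k,r}.
        v = defaultdict(list)
        for k, val in u_prime:
            v[k].append(val)
        # Step 3: feed each key and its values to a separate instance of R_r;
        # U_r is the union of the reducer outputs.
        u = [out for k, vals in v.items() for out in reducer((k, vals))]
    return u

# One-round program: group integers by parity, then sum each group.
program = [(lambda kv: [(kv[1] % 2, kv[1])],      # map: key = parity of value
            lambda kv: [(kv[0], sum(kv[1]))])]    # reduce: sum each group
print(sorted(execute(program, enumerate([1, 2, 3, 4, 5]))))
# → [(0, 6), (1, 9)]
```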
The MapReduce Class (MRC)
On an input {<key, value>} of total size n:
• Memory: each mapper/reducer uses O(n^{1-ε}) space
• Machines: there are Θ(n^{1-ε}) machines available
• Time: each machine runs in time polynomial in n, not just polynomial in the length of the input it receives
• Randomized algorithms are allowed for map and reduce
• Rounds: shuffling is expensive, so the number of rounds is the resource to minimize
MRC^i: the number of rounds is O(log^i n)
DMRC: the deterministic variant
Comparing MRC with PRAM
The most relevant classical model of computation is the PRAM (Parallel Random Access Machine)
The corresponding complexity class is NC
Easy relation: MRC ⊆ P
Lemma: if NC ≠ P, then MRC ⊄ NC
Open question: show that DMRC ≠ P
Comparing with PRAM (cont.)
Simulation lemma: any CREW (concurrent-read, exclusive-write) PRAM algorithm that uses O(n^{2-2ε}) total memory and O(n^{2-2ε}) processors and runs in time t(n) can be simulated by an algorithm in DMRC that runs in O(t(n)) rounds
Example: Finding an MST
Problem: find a minimum spanning tree of a dense graph
The algorithm:
• Randomly partition the vertices into k parts
• For each pair of vertex sets, find the MST of the subgraph induced by the two sets
• Take the union of the edges of all these MSTs; call the resulting graph H
• Compute an MST of H
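A sequential Python sketch of the algorithm above (the pairwise MST calls are the parts that would run on separate reducers in MRC; Kruskal is used here only as a convenient MST subroutine):

```python
import random
from itertools import combinations

def mst(nodes, edges):
    """Kruskal's algorithm with union-find; edges are (weight, u, v) tuples.
    Returns the MST (a spanning forest if the subgraph is disconnected)."""
    parent = {v: v for v in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):  # sorting full tuples breaks weight ties consistently
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

def partition_mst(vertices, edges, k, seed=0):
    """Partition-based MST: each pairwise MST would be one reducer's job."""
    rng = random.Random(seed)
    parts = [set() for _ in range(k)]
    for v in vertices:                       # randomly partition the vertices
        parts[rng.randrange(k)].add(v)
    h = set()                                # H = union of the pairwise MSTs
    for a, b in combinations(range(k), 2):
        s = parts[a] | parts[b]
        sub = [e for e in edges if e[1] in s and e[2] in s]
        h.update(mst(s, sub))
    return mst(vertices, sorted(h))          # one final MST computation on H
```

With distinct edge weights the MST is unique, so the result matches a direct MST computation on the whole graph.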
Finding an MST (cont.)
The algorithm is easy to parallelize:
• The MST of each subgraph can be computed in parallel
Why does it work?
• Theorem: an MST of H is an MST of G
• Proof idea: we never discard a relevant edge when sparsifying the input graph G; an edge dropped from some subgraph is the heaviest edge on a cycle there, hence on a cycle in G, so by the cycle rule no MST of G uses it
Finding an MST (cont.)
Why is the algorithm in MRC?
• Let N = |V| and m = |E| = N^{1+c}
• So the input size n satisfies N = n^{1/(1+c)}
• Pick k = N^{c/2}
• Lemma: with high probability, every pairwise subgraph has size at most N^{1+c/2}
• So the input to any reducer has size N^{1+c/2} = n^{(2+c)/(2+2c)} = n^{1-ε}, with ε = c/(2+2c)
• The size of H is also n^{1-ε}
Functions Lemma
A very useful building block for designing MapReduce algorithms
Definition [MRC-parallelizable function]: Let S be a finite set. We say a function f on S is MRC-parallelizable if there are functions g and h such that the following hold:
• For any partition S = T_1 ∪ T_2 ∪ … ∪ T_k, f can be written as f(S) = h(g(T_1), g(T_2), …, g(T_k))
• g and h can each be represented in O(log n) bits
• g and h can be computed in time polynomial in |S|, and all possible outputs of g can be expressed in O(log n) bits
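A concrete instance of the definition: f(S) = sum over S is MRC-parallelizable, with g producing a partial sum per part and h combining the partial sums (a toy example, not from the slides; partial sums of poly-size integers fit in O(log n) bits):

```python
# f(S) = sum(S), decomposed as f(S) = h(g(T_1), ..., g(T_k)).
def g(part):
    """Per-part summary: a partial sum, representable in O(log n) bits."""
    return sum(part)

def h(*summaries):
    """Combine the k partial summaries into f(S)."""
    return sum(summaries)

s = [3, 1, 4, 1, 5, 9, 2, 6]
t1, t2, t3 = s[:3], s[3:5], s[5:]            # one arbitrary partition of S
assert h(g(t1), g(t2), g(t3)) == sum(s)      # f(S) is partition-independent
print(h(g(t1), g(t2), g(t3)))  # → 31
```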
Functions Lemma (cont.)
Lemma (Functions Lemma): Let U be a universe of size n and let S = {S_1, …, S_k} be a collection of subsets of U, where k ≤ n^{2-3ε} and ∑_{i=1}^{k} |S_i| ≤ n^{2-2ε}. Let F = {f_1, …, f_k} be a collection of MRC-parallelizable functions. Then the outputs f_1(S_1), …, f_k(S_k) can be computed using O(n^{1-ε}) reducers, each with O(n^{1-ε}) space.
Functions Lemma (cont.)
The power of the lemma:
• The algorithm designer may focus only on the structure of the subproblem and its input
• Distributing the input across reducers is handled by the lemma (an existence theorem)
The proof of the lemma is not easy:
• It uses universal hashing
• It uses Chernoff bounds, etc.
Application of the Functions Lemma: s-t connectivity
Problem: given a graph G and two nodes s and t, are they connected in G?
Dense graphs: easy; repeatedly square the adjacency matrix
Sparse graphs?
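For the dense case, a small sketch of the matrix-powering idea: add self-loops and square the boolean adjacency matrix O(log n) times, so that entry (s, t) becomes true exactly when a path exists (plain Python lists here; a real MRC implementation would distribute each matrix multiplication):

```python
def st_connected_dense(adj, s, t):
    """Reachability by repeated boolean squaring of the adjacency matrix."""
    n = len(adj)
    # Add self-loops so that "path of length <= 2^i" is monotone under squaring.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    steps = 1
    while steps < n:                          # O(log n) squarings suffice
        reach = [[any(reach[i][k] and reach[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        steps *= 2
    return reach[s][t]

# Path 0-1-2, with node 3 isolated.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0]]
print(st_connected_dense(adj, 0, 2), st_connected_dense(adj, 0, 3))
# → True False
```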
A log n-round MapReduce algorithm for s-t connectivity
Initially every node is active
For i = 1, 2, …, O(log n) do:
• Each active node becomes a leader with probability 1/2
• For each non-leader active node u, find a node v adjacent to u's current connected component
• If v's component is a leader component, then u becomes passive and every node in u's component is relabeled with v's label
Output TRUE if s and t have the same label, FALSE otherwise
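A sequential Python simulation of the rounds above; the structure follows the slides, but the details (tracking components by label, the fixed round budget) are my interpretation. O(log n) rounds suffice with high probability; a generous constant budget is used here for safety:

```python
import random

def st_connected(n, edges, s, t, seed=0, rounds=64):
    """Leader-election label propagation: same label at the end
    means s and t are in the same connected component (whp)."""
    rng = random.Random(seed)
    label = list(range(n))        # each node starts as its own component
    active = set(label)           # labels of still-active components
    for _ in range(rounds):
        # Each active component becomes a leader with probability 1/2.
        leader = {c: rng.random() < 0.5 for c in active}
        merge = {}                # non-leader component -> leader neighbor
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                ca, cb = label[a], label[b]
                if ca != cb and not leader[ca] and leader[cb]:
                    merge[ca] = cb
        for x in range(n):        # u's whole component takes v's label
            if label[x] in merge:
                label[x] = merge[label[x]]
        active -= set(merge)      # merged components become passive
    return label[s] == label[t]

print(st_connected(5, [(0, 1), (1, 2), (3, 4)], 0, 2))
```

Merges only happen along edges, so a FALSE answer is always correct; TRUE answers hold with high probability once enough rounds have run.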
Conclusions
A rigorous model for MapReduce
Very mild requirements on the hardware
A call for more research in this direction