concurrent algorithms. summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

13
Concurrent Algorithms

Upload: martin-higgins

Post on 03-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Concurrent Algorithms

Page 2: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Summing the elements of an array

2

7 3 15 10 13 18 6 4

10 25 31 10

35 41

76

Page 3: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Parallel sum and parallel prefix sum

It’s relatively easy to see how to sum up the elements of an array in a parallel fashion

This is a special case of a reduce operation—combining a number of values into a single value

It’s harder to see how to do a prefix (cumulative) sum For example, the list [3, 1, 4, 1, 6] to [3, 4, 8, 9, 15] This is a special case of what is sometimes called a scan operation

An example is shown on the next slide The algorithm is done in two passes:

The first pass is “up” the tree, retaining the summands The second pass is “down” the tree

Note: These two examples are from Principles of Parallel Programming by Calvin Lin and Lawrence Snyder

3

Page 4: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Summing the elements of an array

4

7 3 15 10 13 18 6 4

10 = 7 + 3 25 = 15 + 10 31 = 13 + 18 10 = 6 + 4

35 = 10 + 25 41 = 31 + 10

76 = 35 + 41035 (0+35)

0

03510 (0+10) 66 (35+31)

035

25 (10+15) 7

10 6648 (41+13) 72

7 10 25 35 48 66 72 76

Page 5: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Using parallel prefix sum to filter

Apply the filter operation to each element of the sequence (in parallel), yielding 1’s and 0’s

Starting with -1, perform prefix sum on the resultant list The first element for each sum is to be kept Example: Selecting only even numbers

3 1 4 1 5 9 2 6 5 3 6 0 0 1 0 0 0 1 1 0 0 1 -1 -1 -1 0 0 0 0 1 2 2 2 3 a[0]=4, a[1]=2, a[2]=6, a[3]=6

5

Page 6: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Batcher’s Bitonic sort

Batcher’s bitonic sort is a sorting algorithm with the following characteristics: It’s a variation of MergeSort It’s designed for 2n processors It fully occupies all 2n processors

Unlike array sum, which uses fewer processors on each pass

I’m not going to go through this algorithm—I just want you to be able to say you’ve heard of it

6

Page 7: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

MapReduce

MapReduce is a patented technique perfected by Google to deal with huge data sets on clusters of computers

From Wikipedia: "Map" step: The master node takes the input, chops it up into smaller

sub-problems, and distributes those to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes that smaller problem, and passes the answer back to its master node.

"Reduce" step: The master node then takes the answers to all the sub-problems and combines them in a way to get the output - the answer to the problem it was originally trying to solve.

Hadoop is a free Apache version of MapReduce

7

Page 8: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Basic idea of MapReduce

In MapReduce, the programmer has to write only two functions, and the framework takes care of everything else The Map function is applied (in parallel) to each item of data,

producing a list of key-value pairs The framework collects all the lists, and groups the key-value

pairs by key The Reduce function is applied (in parallel) to each group,

returning either a single value, or nothing The framework collects all the returns

8

Page 9: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

MapReduce picture

Source: http://www.google.com/imgres?imgurl=http://people.apache.org/~rdonkin/hadoop-talk/diagrams/map-reduce.png&imgrefurl=http://people.apache.org/~rdonkin/hadoop-talk/hadoop.html&usg=__-

o9xqKbO4FaSEXpfViPX2cgesJo=&h=393&w=504&sz=12&hl=en&start=0&sig2=m4ExSHfMsoQUbWTbGTHwwA&zoom=1&tbnid=xi4TPuXkbg5f-M:&tbnh=150&tbnw=193&ei=voOwTe6mM6Xt0gGJtd2SCQ&prev=/images%3Fq%3Dmapreduce%26hl%3Den%26safe%3Doff%26biw%3D981%26bih%3D666%26gbv%3D2%26tbm%3Disch&itbs=1&iact=rc&dur=505&page=1&ndsp=12&ved=1t:429,r:11,s:0&tx=48&ty=53

9

Page 10: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Example: Counting words (Python)

The following Python program counts how many times each word occurs in a set of data, and returns the list of words and their counts def mapper(key, value):

words=key.split() for word in words: Wmr.emit(word, '1')

def reducer(key, iter): sum = 0 for s in iter: sum = sum + int(s) Wmr.emit(key, str(sum))

10

Page 11: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Example: Counting words (Java) * Mapper for word count */

class Mapper { public void mapper(String key, String value) { String words[] = key.split(" "); int i = 0; for (i = 0; i < words.length; i++) Wmr.emit(words[i], "1"); }}

/* Reducer for word count */

class Reducer { public void reducer(String key, WmrIterator iter) { int sum = 0; while (iter.hasNext()) { sum += Integer.parseInt(iter.next()); } Wmr.emit(key, Integer.valueOf(sum).toString()); }}

11

Page 12: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

Example: Average movie ratings #!/usr/bin/env python

def mapper(key, value): avgRating = float(value) binRating = 0.0 if (0 < avgRating < 1.25): binRating = 1.0 elif (1.25 <= avgRating < 1.75): binRating = 1.5 elif (1.75 <= avgRating < 2.25): binRating = 2.0 elif (2.25 <= avgRating < 2.75): binRating = 2.5 elif (2.75 <= avgRating < 3.25): binRating = 3.0 elif (3.25 <= avgRating < 3.75): binRating = 3.5 elif (3.75 <= avgRating < 4.25): binRating = 4.0 elif (4.25 <= avgRating < 4.75): binRating = 4.5 elif (4.75 <= avgRating < 5.0): binRating = 5.0 else: binRating = 99.0 Wmr.emit(str(binRating), key)

#!/usr/bin/env python

def reducer(key, iter): count = 0 for s in iter: count = count + 1

Wmr.emit(key, str(count))

12

Page 13: Concurrent Algorithms. Summing the elements of an array 2 7 3 15 10 13 18 6 4 10 25 31 10 35 41 76

The End

13