Page 1: An Accumulative Computation Framework on MapReduce, PPL 2013

Examples of Accumulative Computation Benchmarks on MapReduce Clusters

• Programmability: The parallel accumulate programming interface simplifies many problems that have data dependencies.

• Efficiency and Scalability: The experiments show that the framework can process large data sets in reasonable time, and that it achieves near-linear speed-up as the number of CPUs increases.

Conclusions

Line-of-sight
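Line-of-sight is a classic accumulative benchmark: a terrain point is visible from the origin iff its viewing angle exceeds the angles of all points before it, so the accumulated value is a running maximum. A minimal sequential sketch (the heights and the angle-as-height-over-distance formula are illustrative assumptions, not the poster's actual benchmark data):

```python
def line_of_sight(heights):
    """A point is visible iff its viewing angle exceeds the running
    maximum angle of all earlier points (the accumulated value)."""
    visible = []
    max_angle = float("-inf")
    for i, h in enumerate(heights, start=1):
        angle = h / i  # angle ~ tan = height / horizontal distance
        visible.append(angle > max_angle)
        max_angle = max(max_angle, angle)
    return visible

print(line_of_sight([1, 3, 2, 9, 4]))  # [True, True, False, True, False]
```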

Hideya Iwasaki and Zhenjiang Hu. A new parallel skeleton for general accumulative computations. International Journal of Parallel Programming, 2004.

Yu Liu, Kento Emoto, Kiminori Matsuzaki, and Zhenjiang Hu. Accumulative Computation on MapReduce (submitted to Euro-Par 2013).

Computations that have data dependencies are usually hard to parallelize using MapReduce or other parallel programming models. For example, given an input list [ x1, x2, x3, x4 ] and a binary operator ⊙, compute:
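The dependency can be sketched sequentially: each result is built from the previous one, so no element can be processed before its predecessor. A hypothetical instance (using + as a stand-in for ⊙):

```python
def accumulate_seq(op, xs):
    """Fold xs left-to-right with op, keeping every intermediate result.
    Each step depends on the previous one, which is exactly what makes
    naive MapReduce parallelization of such computations hard."""
    ys, acc = [], None
    for x in xs:
        acc = x if acc is None else op(acc, x)
        ys.append(acc)
    return ys

print(accumulate_seq(lambda a, b: a + b, [1, 2, 3, 4]))  # [1, 3, 6, 10]
```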

An Accumulative Computation Framework on MapReduce

Yu Liu1,4  Kento Emoto2  Kiminori Matsuzaki3  Zhenjiang Hu1,4

1The Graduate University for Advanced Studies (SOKENDAI)  2The University of Tokyo  3Kochi University of Technology  4National Institute of Informatics

Parallel Accumulative Computation MapReduce-Accumulation

Contact: Yu Liu / National Institute of Informatics, Information Systems Architecture Science Research Division, Hu Laboratory

TEL : 03-4212-2611 FAX : 03-4212-2533 Email : [email protected]

EduBaseCloud of National Institute of Informatics

Eliminate Smallers

Tag Match

Here are four accumulative computation examples

[ <, <, />, <, />, <, /> ]
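Tag Match checks whether a sequence of opening and closing tags nests properly; the accumulated value is the count of still-open tags. A sequential sketch (this counter formulation is our illustrative reading of the benchmark):

```python
def tag_match(tags):
    """Accumulate a counter of unclosed '<' tags; the sequence matches
    iff the counter never goes negative and ends at zero."""
    depth = 0
    for t in tags:
        depth += 1 if t == "<" else -1
        if depth < 0:
            return False
    return depth == 0

print(tag_match(["<", "/>", "<", "/>"]))                  # True: balanced
print(tag_match(["<", "<", "/>", "<", "/>", "<", "/>"]))  # False: one tag left open
```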

[ 1, 3, 2, 9, 4, 6, 7, 12, 10 ]
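Eliminate Smallers keeps only the elements that are larger than every element before them, so the accumulated value is a running maximum. A sequential sketch on the sample list above:

```python
def eliminate_smallers(xs):
    """Keep x only if it exceeds the running maximum of all earlier
    elements (the accumulated value); drop the 'smallers'."""
    out = []
    running_max = float("-inf")
    for x in xs:
        if x > running_max:
            out.append(x)
            running_max = x
    return out

print(eliminate_smallers([1, 3, 2, 9, 4, 6, 7, 12, 10]))  # [1, 3, 9, 12]
```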

A communication-efficient MapReduce algorithm

To simplify problems like the above, we propose an accumulative computation framework on MapReduce. We provide a general pattern, accumulate, to encode many parallel computations in this framework.
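A simplified rendering of the accumulate skeleton (following the spirit of Iwasaki and Hu's formulation; this Python form and the parameter names g, ⊕, p, ⊗ as used here are our illustrative assumption) threads an accumulating parameter e through the list while combining per-element results:

```python
def accumulate(g, oplus, p, otimes, xs, e):
    """Simplified accumulate skeleton:
         acc []     e = g e
         acc (x:xs) e = p(x, e) (+) acc xs (e (*) x)
    where (+) (oplus) combines results and (*) (otimes) updates
    the accumulating parameter e."""
    if not xs:
        return g(e)
    x, rest = xs[0], xs[1:]
    return oplus(p(x, e), accumulate(g, oplus, p, otimes, rest, otimes(e, x)))

# Example: prefix sums expressed as an instance of accumulate
# (p emits the updated accumulator as a singleton list, g closes with []).
psums = accumulate(lambda e: [],            # g
                   lambda a, b: a + b,      # (+): list concatenation
                   lambda x, e: [e + x],    # p
                   lambda e, x: e + x,      # (*): add element into e
                   [1, 2, 3, 4], 0)
print(psums)  # [1, 3, 6, 10]
```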

The above definition can be rewritten in the following form:

Programs written in terms of accumulate can be automatically transformed to efficient MapReduce programs by our framework.
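The essence of this transformation can be sketched in a shared-memory analogue (an illustrative assumption on our part, not the framework's actual MapReduce code): when (⊕) and (⊗) are associative, each chunk is summarized independently (the "map" phase), chunk summaries are scanned to recover each chunk's starting accumulator, and per-chunk results are combined (the "reduce" phase):

```python
def accumulate_sequential(g, oplus, p, otimes, xs, e):
    """Reference sequential semantics of the simplified accumulate."""
    res = None
    for x in xs:
        r = p(x, e)
        res = r if res is None else oplus(res, r)
        e = otimes(e, x)
    tail = g(e)
    return tail if res is None else oplus(res, tail)

def accumulate_chunked(g, oplus, p, otimes, xs, e, nchunks=4):
    """Chunked evaluation: agrees with the sequential version when
    oplus and otimes are associative."""
    # "Map": split the input into chunks.
    size = max(1, -(-len(xs) // nchunks))
    chunks = [xs[i:i + size] for i in range(0, len(xs), size)]
    # Scan the chunk summaries to get each chunk's starting accumulator.
    starts, acc = [], e
    for c in chunks:
        starts.append(acc)
        for x in c:
            acc = otimes(acc, x)
    # Per-chunk local results (independent of each other, hence parallelizable).
    partials = []
    for c, s in zip(chunks, starts):
        cur, local = s, None
        for x in c:
            r = p(x, cur)
            local = r if local is None else oplus(local, r)
            cur = otimes(cur, x)
        if local is not None:
            partials.append(local)
    # "Reduce": combine the chunk results, then close with g on the final e.
    res = None
    for part in partials:
        res = part if res is None else oplus(res, part)
    tail = g(acc)
    return tail if res is None else oplus(res, tail)

# Sanity check with prefix sums: both evaluations must agree.
args = (lambda e: [], lambda a, b: a + b,
        lambda x, e: [e + x], lambda e, x: e + x,
        list(range(1, 10)), 0)
print(accumulate_chunked(*args))  # [1, 3, 6, 10, 15, 21, 28, 36, 45]
```

The chunked form only coincides with the sequential one because list concatenation (⊕) and addition (⊗) are associative; that requirement is what lets the framework distribute the work.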

Here are six accumulate programs.

Input data size is about 5 × 10⁹ items for each program.
