An Accumulative Computation Framework on MapReduce (PPL 2013)


Examples of the Accumulative Computation

Line-of-sight

Benchmarks on the MapReduce Clusters

Conclusions

• Programmability: The parallel accumulate programming interface simplifies many problems that involve data dependencies.

• Efficiency and Scalability: The experiments show that the framework can process large data sets in reasonable time, and it achieves near-linear speed-up as the number of CPUs increases.

References

Hideya Iwasaki and Zhenjiang Hu. A New Parallel Skeleton for General Accumulative Computations. International Journal of Parallel Programming, 2004.

Yu Liu, Kento Emoto, Kiminori Matsuzaki, and Zhenjiang Hu. Accumulative Computation on MapReduce (submitted to Euro-Par 2013).

Computations that have data dependencies are usually hard to parallelize with MapReduce or other parallel programming models. For example, given an input list [ x1, x2, x3, x4 ] and a binary operator ⊙, combining the elements from left to right makes every step depend on the result of the previous one, so the list cannot simply be split among independent tasks.
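A minimal sketch of this kind of dependency in Haskell; the operator name odot and the left-fold formulation are illustrative assumptions, not code from the framework:

```haskell
-- Left-to-right accumulation with a binary operator `odot` (the ⊙ above).
-- Each step uses the result of the previous step, which is exactly the
-- data dependency that a naive split of the list across tasks would break.
compute :: (a -> a -> a) -> [a] -> a
compute _    []     = error "compute: empty input"
compute odot (x:xs) = foldl odot x xs

-- compute odot [x1, x2, x3, x4]  ==  ((x1 `odot` x2) `odot` x3) `odot` x4
```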

An Accumulative Computation Framework on MapReduce

Yu Liu 1,4, Kento Emoto 2, Kiminori Matsuzaki 3, Zhenjiang Hu 1,4

1 The Graduate University for Advanced Studies (SOKENDAI)  2 The University of Tokyo  3 Kochi University of Technology  4 National Institute of Informatics

Parallel Accumulative Computation

MapReduce-Accumulation

Contact: Yu Liu / National Institute of Informatics, Information Systems Architecture Science Research Division, Hu Laboratory

TEL: 03-4212-2611 / FAX: 03-4212-2533 / Email: [email protected]

EduBaseCloud of the National Institute of Informatics

Here are four accumulative computation examples:

Eliminate Smallers

[ 1, 3, 2, 9, 4, 6, 7, 12, 10 ]

Tag Match

[ <, <, />, <, />, <, /> ]
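A sketch of how two of these examples read as sequential accumulations, in Haskell; the function names, the Int element type, and the exact problem readings are our assumptions:

```haskell
-- Eliminate Smallers: drop every element that is smaller than some element
-- occurring before it.  The accumulating parameter m is the running maximum.
elimSmallers :: [Int] -> Int -> [Int]
elimSmallers []     _ = []
elimSmallers (x:xs) m = (if x < m then [] else [x]) ++ elimSmallers xs (max m x)
-- elimSmallers [1,3,2,9,4,6,7,12,10] minBound  ==  [1,3,9,12]

-- Tag Match: check that every close tag "/>" has an unmatched open tag "<"
-- before it.  The accumulating parameter n counts currently open tags.
data Tag = Open | Close   -- "<" and "/>"

tagMatch :: [Tag] -> Int -> Bool
tagMatch []     _ = True
tagMatch (t:ts) n = ok t n && tagMatch ts (n + delta t)
  where
    ok Close m  = m > 0   -- a close tag needs a pending open tag
    ok Open  _  = True
    delta Open  = 1
    delta Close = -1
-- tagMatch [Open,Open,Close,Open,Close,Open,Close] 0  ==  True
```

Both definitions have the same shape: a per-element contribution is combined (with ++ or &&) with a recursive call whose accumulating parameter has been updated (with max or a counter); this is exactly the shape the accumulate pattern below captures.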

A communication-efficient MapReduce algorithm

To simplify problems like the one above, we propose an accumulative computation framework on MapReduce. We provide a general pattern, accumulate, to encode many parallel computations in this framework.

The above definition can be rewritten in the following form:
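A Haskell sketch of this accumulate form, following the recursive definition in the Iwasaki and Hu paper cited above; the type signature and parameter names here are ours, and the framework's actual programming interface may differ:

```haskell
-- accumulate g p (⊕) q (⊗) applied to a list and an initial accumulator:
--   g  finishes the computation from the final accumulator,
--   p  gives each element's contribution, given the accumulator seen so far,
--   ⊕  (oplus) combines contributions and is assumed associative,
--   q, ⊗ (otimes) describe how each element updates the accumulator,
--      with ⊗ also assumed associative.
accumulate :: (c -> r) -> (a -> c -> r) -> (r -> r -> r)
           -> (a -> c) -> (c -> c -> c)
           -> [a] -> c -> r
accumulate g p oplus q otimes = go
  where
    go []     c = g c
    go (x:xs) c = p x c `oplus` go xs (c `otimes` q x)
```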

Programs written in terms of accumulate can be automatically transformed to efficient MapReduce programs by our framework.
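One standard way such a transformation can work, under the assumption that ⊕ and ⊗ are associative, is to turn the element-wise work into a map, the accumulating parameter into a prefix scan, and the combination of contributions into a reduction, all of which parallelize well. The Haskell sketch below only illustrates this idea; it is not the framework's actual algorithm, which is described in the cited paper:

```haskell
-- Equivalent to `accumulate` above whenever oplus and otimes are associative.
-- map, scan, and reduce each have well-known parallel / MapReduce
-- implementations, which is what makes the pattern parallelizable.
accumulatePar :: (c -> r) -> (a -> c -> r) -> (r -> r -> r)
              -> (a -> c) -> (c -> c -> c)
              -> [a] -> c -> r
accumulatePar g _ _     _ _      [] e = g e
accumulatePar g p oplus q otimes xs e =
  foldr1 oplus (zipWith p xs cs) `oplus` g cLast
  where
    accs  = scanl otimes e (map q xs)   -- e, e⊗q x1, e⊗q x1⊗q x2, ...
    cs    = init accs                   -- accumulator value seen by each element
    cLast = last accs                   -- accumulator value after the whole list
```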

Here are six accumulate programs. The input data size is about 5 × 10⁹ items for each program.