
Vivek Kumar1, Stephen M Blackburn1, David Grove2, Daniel Frampton1, 3

1 The Australian National University 2 IBM T.J. Watson Research 3 Microsoft

High Performance Runtime for Next Generation Parallel Programming Languages


Hardware and Software Today

Background

High Performance Runtime for Next Generation Parallel Programming Languages | Kumar 2


The Challenge

Background

•  Productivity
    –  Language-based features to expose parallelism: X10, Cilk, Habanero, etc.
•  Performance
    –  Work–stealing scheduling
•  Portability
    –  Managed runtime to hide the hardware complexities


Options?

•  Productivity
    –  Language-based features to expose parallelism: X10, Cilk, Habanero-Java, etc.
•  Performance
    –  Work–stealing scheduling
•  Portability
    –  Managed runtime to hide the hardware complexities

Background


Thesis Statement

High-performance languages are using managed platforms for productivity and portability, but performance is inadequate. By exploiting and extending the underlying mechanisms of managed runtimes, implementations of these languages will be able to deliver scalability and performance at the levels necessary for widespread uptake.


Contributions

•  High Productivity
•  High Performance
•  Highly Competitive

Understanding Work–Stealing


Methodology

•  Hardware Platform
    –  2 × 8-core Intel Xeon E5-2450
•  Software Platform
    –  Jikes RVM (3.1.3)
•  Benchmarks
    –  To evaluate performance: UTS, BarnesHut, FFT, Jacobi, LUDecomposition, JGF_SeriesTest, HeatDiffusion, PointCorrelation, NQueens, Matmul, CilkSort and Fibonacci
    –  To evaluate the productivity of our system: JMetal (SourceForge project with 327 Java files)

Big… But How Big?

Motivating Analysis

Sequential Overhead

Motivating Analysis

[Figure: bar chart of geomean sequential overhead (y-axis ticks 1, 2, 4) for Habanero-Java, ManagedX10 and Fork-Join. Bar labels: 3.7x, 2.5x, 1.6x.]

Steal to Task Ratio

Motivating Analysis

[Figure: steal-to-task ratio on a log scale (0.0000001 to 1) against number of threads (2 to 16).]
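A steal-to-task ratio of this kind can be approximated outside Jikes RVM with the standard java.util.concurrent Fork-Join pool. The sketch below is not the instrumentation used in the talk: the class name StealRatioDemo, the Fibonacci workload and the decision to count every fork() as one task are assumptions made for illustration. ForkJoinPool.getStealCount() is documented as an estimate, so the printed ratio is only indicative.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo class (not from the talk): counts forked tasks and steals.
final class StealRatioDemo {
    // Number of tasks pushed onto worker deques via fork().
    static final LongAdder FORKS = new LongAdder();

    static final class Fib extends RecursiveTask<Long> {
        private final int n;
        Fib(int n) { this.n = n; }

        @Override
        protected Long compute() {
            if (n < 2) return (long) n;
            Fib left = new Fib(n - 1);
            FORKS.increment();
            left.fork();                            // stealable by idle workers
            long right = new Fib(n - 2).compute();  // run the other half inline
            return left.join() + right;
        }
    }

    // Runs fib(n) on the given number of workers; returns {result, forks, steals}.
    static long[] run(int n, int threads) {
        FORKS.reset();
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            long result = pool.invoke(new Fib(n));
            return new long[] { result, FORKS.sum(), pool.getStealCount() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = run(27, 4);
        System.out.printf("fib=%d forks=%d steals=%d ratio=%.8f%n",
                r[0], r[1], r[2], (double) r[2] / r[1]);
    }
}
```

The number of forks is fixed by the workload, while the number of steals varies from run to run and with the thread count.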


Insights

•  Move the overheads from common case to the rare case

•  Re-use existing mechanisms inside modern managed runtimes

Motivating Analysis


foo() { finish { async X = S1(); Y = S2(); } }

Implementation

[Figure: the VICTIM stack holds frames A, B, C, foo and S1, with the stack growth direction marked. A THIEF steals through the yieldpoint mechanism and ends up with frames A, B, C, foo and S2.]

•  Yieldpoint mechanism
•  On-stack replacement
•  Java try/catch exceptions
•  Dynamic code patching
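For readers unfamiliar with the finish/async notation, the foo() example can be approximated with the standard Java Fork-Join library, one of the systems compared in this talk. This is only a sketch of the semantics, not the mechanism the slide describes: the class FinishAsyncSketch and the bodies of s1() and s2() are placeholders, and the pool size of 2 is arbitrary. In this library version it is the forked S1 that another worker may steal, whereas the figure appears to show the thief picking up S2 while the victim runs S1.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch: finish { async X = S1(); Y = S2(); } with Fork-Join.
final class FinishAsyncSketch {
    // Placeholder bodies; the slide does not say what S1 and S2 compute.
    static int s1() { return 1; }
    static int s2() { return 2; }

    static final class Foo extends RecursiveTask<int[]> {
        @Override
        protected int[] compute() {
            RecursiveTask<Integer> asyncS1 = new RecursiveTask<Integer>() {
                @Override
                protected Integer compute() { return s1(); }
            };
            asyncS1.fork();          // "async": S1 becomes stealable work
            int y = s2();            // the current worker carries on with S2
            int x = asyncS1.join();  // end of "finish": wait for S1
            return new int[] { x, y };
        }
    }

    // Runs foo() on a small private pool and returns {X, Y}.
    static int[] foo() {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            return pool.invoke(new Foo());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] xy = foo();
        System.out.println("X=" + xy[0] + " Y=" + xy[1]);
    }
}
```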

Sequential Overhead

Evaluation

[Figure: bar chart of geomean sequential overhead (y-axis ticks 1, 2, 4) for Habanero-Java, ManagedX10, Fork-Join and TryCatchWS. Bar labels: 7%, 3.7x, 2.5x, 1.6x.]

Steal Rate

Motivating Analysis

[Figure: steals per millisecond (0 to 40) against number of threads (2 to 16).]

Dynamic Overhead

Motivating Analysis

[Figure: two panels plotted against number of threads (2 to 16), with y-axis ticks 0 to 40 and 0 to 14. Axis label: dynamic overhead (%).]

Insights

•  Still the same
    –  Re-use existing mechanisms inside modern managed runtimes

Motivating Analysis


Return Barrier

Hijack a return and bridge to some other method

[Figure: stack with frames A, B, C, D and E, with the stack growth direction marked.]

Implementation
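The idea of hijacking a return can be illustrated with a toy model in plain Java. This is not how a VM implements it: a real return barrier overwrites the return address saved in a stack frame, whereas here each frame simply carries a Runnable that stands in for that address. The class ReturnBarrierToy, its methods and the log messages are all invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: frames are objects, and "returning" runs a callback.
final class ReturnBarrierToy {
    static final class Frame {
        final String name;
        Runnable onReturn;  // stands in for the frame's saved return address
        Frame(String name, Runnable onReturn) {
            this.name = name;
            this.onReturn = onReturn;
        }
    }

    final Deque<Frame> stack = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    // Push a frame whose normal return just records itself in the log.
    void call(String name) {
        stack.push(new Frame(name, () -> log.add("return from " + name)));
    }

    // Hijack the return of the named frame: run the bridge, then the original return.
    void installBarrier(String name, Runnable bridge) {
        for (Frame f : stack) {
            if (f.name.equals(name)) {
                Runnable original = f.onReturn;
                f.onReturn = () -> { bridge.run(); original.run(); };
                return;
            }
        }
        throw new IllegalArgumentException("no frame named " + name);
    }

    // Pop the top frame and follow its (possibly hijacked) return path.
    void ret() {
        stack.pop().onReturn.run();
    }

    // Builds the stack A..E from the slide, puts a barrier on C, then unwinds.
    static List<String> demo() {
        ReturnBarrierToy t = new ReturnBarrierToy();
        for (String f : new String[] { "A", "B", "C", "D", "E" }) t.call(f);
        t.installBarrier("C", () -> t.log.add("barrier at C"));
        while (!t.stack.isEmpty()) t.ret();
        return t.log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Frames above the barrier (D and E) return without any extra cost. The bridge runs only when control returns through the frame that carries the barrier, which matches the earlier insight of moving overhead from the common case to the rare case.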


Dynamic Overhead

Evaluation

[Figure: bar chart of geomean dynamic overhead (%), y-axis 0 to 2, comparing Old and New for threads=16.]


Productivity in a Large Code Base

•  Project with several hundred files
•  Multiple dependencies (inheritance…)
•  Achieving parallelism
    –  Minimal changes
    –  Track fields with atomic updates
    –  Avoid deadlocks

Motivating Analysis


Java Language Annotations

Implementation

•  Annotate and leave the rest to the compiler
•  Parallelism
    –  syncsteal {…}
    –  steal {…}
•  Data-centric concurrency control (Dolby et al. 2012)
    –  @Atomicsets(X)
    –  @Atomic(X)
    –  @AliasAtomic(Y=this.X)

So Where Do We Stand …?


Work–Stealing Performance

Evaluation

[Figure: Jacobi. Speedup over sequential (0 to 7) against number of threads (1 to 16) for Habanero-Java, ManagedX10, Fork-Join and TryCatchWS.]

Work–Stealing Performance

Evaluation

[Figure: UTS. Speedup over sequential (0 to 7) against number of threads (1 to 16) for Habanero-Java, ManagedX10, Fork-Join and TryCatchWS.]

Summary and Conclusion

•  Work–stealing overheads: sequential and dynamic
•  Reused existing mechanisms inside modern managed runtimes
    –  Yieldpoint mechanism
    –  On-stack replacement
    –  Java try/catch exception handling
    –  Dynamic code patching
    –  Return barrier
•  Effectively eliminated sequential overhead (only 7%)
•  Halved the dynamic overhead
•  Annotations in Java to generate work-stealing calls and synchronization blocks

Summary