adaptive multi-level compilation in a trace-based java jit ... · recom piled cyclic trace recom...

21
IBM Research - Tokyo October 23, 2012 | SPLASH/OOPSLA 2012 at Tucson, Arizona © 2012 IBM Corporation Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler Hiroshi Inoue , Hiroshige Hayashizaki , Peng Wu and Toshio Nakatani IBM Research – Tokyo IBM Research – T.J. Watson Research Center

Upload: others

Post on 12-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

IBM Research - Tokyo

October 23, 2012 | SPLASH/OOPSLA 2012 at Tucson, Arizona © 2012 IBM Corporation

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

Hiroshi Inoue†, Hiroshige Hayashizaki†, Peng Wu‡ and Toshio Nakatani†† IBM Research – Tokyo‡ IBM Research – T.J. Watson Research Center

Page 2: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

2

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Goal

Balancing steady-state performance and startup performancehas been a challenging task for JIT compilers

to improve the steady-state performance without hurting startup performance in a trace-based Java JIT compiler

Our Goal

Our contributions:1. We show that adaptive multi-level compilation is a practical way to

balance the startup time and steady-state performance in a trace-JIT2. We develop a new technique to efficiently identify hot paths to

recompile in a higher optimization level

Page 3: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

3

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Background: Trace-based Compilation

Using a Trace, a hot path identified at runtime, as a basic unit of compilation

method fmethod entry

return

if (x != 0)

rarelyexecuted

while (!end)

do something

frequentlyexecuted

Page 4: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

4

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Background: Trace-based Compilation

Using a Trace, a hot path identified at runtime, as a basic unit of compilation

method fmethod entry

a hot pathreturn

if (x != 0)

frequentlyexecuted

while (!end)

trace exit

trace exit(side exit)

trace entry trace A

include only a hot path

if (x != 0)

rarelyexecuted

while (!end)

do something

frequentlyexecuted

Trace selection: how to form good compilation scope

AA

BB

exit

exit

linear trace

AA

BB

exit

cyclic trace

stub stub

Page 5: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

5

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Background: Trace Selection and Performance

Steps of trace selection1. Identify a hot trace head

a target of a backward branch, a exit point of an existing trace2. Record next execution path starting from the trace head3. Stop recording when the trace being recorded

forms a cycle, reaches pre-defined maximum length

Generating longer traces (compilation scopes):☺ yields better steady-state performance

• more optimization opportunities for compilers• smaller trace transitioning overhead

but it hurts startup performance• longer compilation time• more duplicated code among traces

Page 6: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

6

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Our Approach: Adaptive multi-level compilation for trace-JIT

1. Interpreted execution

2. Initial compilation for faster startup– smaller compilation scope– lower optimization level

3. Upgrade recompilation for higher steady-state performance– larger compilation scope– higher optimization level

identify hot paths by counting the execution count for each potential trace head (e.g. loop backedge)

identify really hot paths by using timer-based sampling profiler

Our focusTrace selection for

upgrade recompilation

Page 7: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

7

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Steps of Our Multi-level Compilation

BB1BB1

BB2BB2

BB3BB3

BB4BB4

BB5BB5

BB6BB6

source program

cyclic tracecyclic trace

linear trace 1linear

trace 1

linear trace 2linear

trace 2

BB7BB7

initialcompilation

hot path

cold path

trace head

upgraderecompilation

recompiledcyclic trace

recompiledcyclic trace

recompiledlinear trace

recompiledlinear trace

executed by interpreter with monitoring to

identify hot pathexecuted as

compiled codewith timer-

based sampling profiler

Our focusTrace selection

for upgrade recompilation

We limit the max trace length for

faster startup(here 2 BBs)

Page 8: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

8

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

recompiledlinear trace

recompiledlinear trace

An Example of Inefficient Selection for Recompilation

linear trace 1linear

trace 1

linear trace 2linear

trace 2

linear trace 3linear

trace 3

linear trace 1linear

trace 1

recompiledlinear trace

recompiledlinear trace

upgraderecompilation

Probleminsufficient

coverage by recompiled traces

Badly formed case Desired case

We want to cover the entire hot pathby a recompiled

trace

a st

raig

ht-li

ne h

ot p

ath

Page 9: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

9

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Our Solution: Trace Transition Graph (TTgraph)

Trace-Transition Graph (TTgraph) is a weighted directed graph representing the control flow among traces

Node: compiled traceEdge: transition between two traces– weight represents relative frequency

We identify the hot paths to recompile using the TTgraph– to capture the entire hot path by

traversing the edges backwardlinear trace

3’

linear trace

3’

linear trace

1

linear trace

1

linear trace

2

linear trace

2

linear trace

1’

linear trace

1’

4

6

81

2

linear trace

3

linear trace

3

92

Page 10: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

10

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Steps to Build TTgraph

1. We add an node when a trace is compiled

2. We add an edge between two traces when we link them (trace linking)

3. We increment weight of an edge by timer-based sampling profiler

linear trace

3’

linear trace

3’

linear trace

1

linear trace

1

linear trace

2

linear trace

2

linear trace

1’

linear trace

1’

linear trace

3

linear trace

3

cyclic tracecyclic trace

1

1

11

1

11See the paper for more detail

(e.g. how we profile transition andhow we employ bursty tracing)

See the paper for more detail (e.g. how we profile transition and

how we employ bursty tracing)

Page 11: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

11

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

12

12

Steps to Build TTgraph

1. We add an node when a trace is compiled

2. We add an edge between two traces when we link them (trace linking)

3. We increment weight of an edge by timer-based sampling profiler

linear trace

3’

linear trace

3’

linear trace

1

linear trace

1

linear trace

2

linear trace

2

linear trace

1’

linear trace

1’

linear trace

3

linear trace

3

cyclic tracecyclic trace

1

1

1

11See the paper for more detail

(e.g. how we profile transition andhow we employ bursty tracing)

See the paper for more detail (e.g. how we profile transition and

how we employ bursty tracing)

Page 12: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

12

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Steps to Build TTgraph

1. We add an node when a trace is compiled

2. We add an edge between two traces when we link them (trace linking)

3. We increment weight of an edge by timer-based sampling profiler

linear trace

3’

linear trace

3’

linear trace

1

linear trace

1

linear trace

2

linear trace

2

linear trace

1’

linear trace

1’

linear trace

3

linear trace

3 See the paper for more detail (e.g. how we profile transition and

how we employ bursty tracing)

See the paper for more detail (e.g. how we profile transition and

how we employ bursty tracing)

cyclic tracecyclic trace

4

6

81

1

92

Page 13: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

13

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Recompilation Based on TTgraph

We traverse the hot edges backward to identify the new trace head for recompilation– we stop the backward traversal

when hitting a cyclic trace (our system does not capture cyclic and linear path in one trace)

– If there are multiple hot incoming edges for a node, we track both of themlinear

trace 3’

linear trace

3’

linear trace

1

linear trace

1

linear trace

2

linear trace

2

linear trace

1’

linear trace

1’

linear trace

3

linear trace

3

See the paper for more detail See the paper for more detail

cyclic tracecyclic trace

4

6

81

1

92

recompile

Page 14: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

14

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Performance Evaluation

Hardware: IBM BladeCenter JS22– 4 cores (8 SMT threads) of POWER6 4.0GHz – 16 GB system memory

Our trace-JIT:– Extended IBM JVM for Java 6 (32-bit) to support trace

compilation– Two optimization levels (both trace selection and code

generation), cold and hot– We compare cold only, hot only, and our adaptive multi-

level compilationBenchmark:

– DaCapo benchmark suite 9.12

Page 15: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

15

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

avror

a

batik

eclip

se fop h2

jytho

nlui

ndex

lusea

rch pmd

sunfl

ow

tomca

ttra

debe

ans

xalan

geom

ean

rela

tive

stea

dy-s

tate

per

form

ance

.

cold onlyhot onlyOur adaptive multi-level compilation

Steady-state Performance

high

er is

fast

er

always using hot level gives26% gain in steady state

Our technique achieved 22% gain

in steady state

Page 16: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

16

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

avror

a

batik

eclip

se fop h2

jytho

nlui

ndex

lusea

rch pmd

sunfl

owtom

cat

trade

bean

s

xalan

geom

ean

rela

tive

star

tup

time

.

cold onlyhot onlyOur adaptive multi-level compilation

Startup Time (Execution time of first iteration)

shor

ter i

s fa

ster

using hot level makes the startup 50% (up to 3x) slower!

Our technique does not hurt the startup

performance!

Page 17: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

17

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

98.0%

100.0%

102.0%

104.0%

106.0%

108.0%

110.0%

avror

a

batik

eclip

se fop h2

jytho

nlui

ndex

lusea

rch pmd

sunfl

owtom

cat

trade

bean

s

xalan

geom

ean

rela

tive

peak

per

form

ance

basic recompilationTTgraph-based recompilation

Improvement by TTgraph-Based Recompilation

high

er is

fast

er

Page 18: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

18

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Summary

We showed that adaptive multi-level compilation is a practical way to balance the startup time and steady-state performance in a trace-JIT

We developed an efficient technique based on TTgraph to identify hot paths. We described the trace selection engine, the timer-based sampling profiler, and the code generator work together for effective recompilation.

Page 19: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

19

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Thank you!

Page 20: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

20

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

Backup

Page 21: Adaptive Multi-Level Compilation in a Trace-based Java JIT ... · recom piled cyclic trace recom piled cyclic trace recom piled linear trace recom piled linear trace executed by interpreter

21

IBM Research - Tokyo

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler © 2012 IBM Corporation

CPU Time Breakdown

shor

ter i

s fa

ster

0.00.10.20.30.40.50.60.70.80.91.0

avror

a

batik

eclip

se fop h2

jytho

nluind

exluse

arch pmd

sunfl

owtomca

ttra

debe

ans

xalan

GEOMEAN

0.0

1.0

avror

a

batik

eclip

se fop h2

jytho

nluind

exluse

arch pmd

sunfl

owtomca

ttra

debe

ans

xalan

GEOMEAN

0.0

1.0

avror

a

batik

eclip

se fop h2

jytho

nluind

exluse

arch pmd

sunfl

owtomca

ttra

debe

ans

xalan

GEOMEAN

0.0

1.0

avror

a

batik

eclip

se fop h2

jytho

nluind

exluse

arch pmd

sunfl

owtomca

ttra

debe

ans

xalan

GEOMEAN

norm

aliz

ed C

PU

tim

e

single-level compilation (cold level)single-level compilation (hot level)

adaptive multi-level compilation (TTgraph-based recompilation)

OSnative

librariesJIT-compiled code

hot cold

Most of the execution time is spent in

recompiled code