Connectivity-Based Garbage Collection Presenter Feng Xian Author Martin Hirzel, et.al Published in OOPSLA’2003

Post on 20-Dec-2015

Page 1: Connectivity-Based Garbage Collection

Presenter: Feng Xian
Author: Martin Hirzel, et al.
Published in OOPSLA’2003

Page 2: Garbage Collection Benefits

Garbage collection leads to simpler
• Design: no complex deallocation protocols
• Implementation: automatic deallocation
• Maintenance: fewer bugs

Benefits are widely accepted
• Java, C#, Python, …

Page 3: Garbage Collection: Haven’t we solved this problem yet?

• For a state-of-the-art garbage collector:
– time: ~14% of execution time
– space: 3x high watermark
– pauses: 0.8 seconds
• Can reduce any one cost
• Challenge: reduce all three costs

Page 4: Example Heap

[Figure: heap diagram with objects o1 to o15, stack variables s1 and s2, and global variable g]

Boxes: heap objects
Arrows: pointers
Long box: stack + global variables

Page 5: Thesis

1. Objects form distinct data structures
2. Connected objects die together
3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently

[Figure: example heap with objects o1 to o15 and stack + globals]

Page 6: Experimental Infrastructure

JikesRVM Research Virtual Machine
– From IBM Research
– Written in Java
– Application and runtime system share heap

Good garbage collection even more important

Benchmarks
– SPECjvm98 suite and SPECjbb2000
– Java Olden suite
– xalan, ipsixql, nfc, jigsaw

Page 7: Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion

Page 8: Garbage Collector Design Principles

“Do partial collections.”

Don’t collect the full heap every time
Shorter pause times

[Figure: example heap with objects o1 to o15 and stack + globals]

Page 9: Garbage Collector Design Principles

“Predict lifetime based on age.”

Generational hypothesis: Most objects die young

Generational garbage collection:
– Partition by age
– Collect young objects most often

Low time overhead

That’s the state of the art.

[Figure: example heap with objects o1 to o15 and stack + globals, split into young generation and old generation]

Page 10: Garbage Collector Design Principles: Generational GC Problems

Regular full collections: Long peak pause
Old-to-young pointers: Need bookkeeping

[Figure: example heap with objects o1 to o15 and stack + globals, split into young generation and old generation]

Page 11: Garbage Collector Design Principles

“Collect connected objects together.”

Likelihood that two objects die at the same time:

Connectivity          Likelihood
Any pair              33.1%
Weakly connected      46.3%
Strongly connected    72.4%
Direct pointer        76.4%

[Figure: the Example column shows a pair of objects o1 and o2 for each kind of connectivity]
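The connectivity classes in the table can be made concrete with a small sketch. The slides do not define the terms, so this assumes the usual graph-theoretic meanings: a direct pointer is an edge in either direction, strongly connected means each object reaches the other, and weakly connected means they are linked once pointer direction is ignored. The toy heap and the function names are hypothetical.

```python
from collections import deque

def reachable(heap, start, undirected=False):
    """Return the set of objects reachable from start by following pointers."""
    edges = {o: set(targets) for o, targets in heap.items()}
    if undirected:
        # Add a back edge for every pointer (every pointee must be a key of heap).
        for o, targets in heap.items():
            for t in targets:
                edges[t].add(o)
    seen, queue = {start}, deque([start])
    while queue:
        for t in edges[queue.popleft()]:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def connectivity(heap, a, b):
    """Classify a pair, checking direct, then strong, then weak connectivity."""
    if b in heap[a] or a in heap[b]:
        return "direct pointer"
    if b in reachable(heap, a) and a in reachable(heap, b):
        return "strongly connected"
    if b in reachable(heap, a, undirected=True):
        return "weakly connected"
    return "unconnected"

# Hypothetical toy heap: a -> b -> c -> d -> a is a cycle, d -> e, f is isolated.
heap = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["a", "e"], "e": [], "f": []}
```

With this heap, `connectivity(heap, "a", "c")` falls into the strongly connected class because both objects lie on the same cycle without pointing at each other directly.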

Page 12: Garbage Collector Design Principles

“Focus on objects with few ancestors.”

Short-lived objects are easy to collect

Lifetime    Median number of ancestor objects
Short       2 objects
Long        83,324 objects
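As an illustration of what counting ancestors involves, here is a minimal sketch. It assumes that an ancestor of an object is any other object from which it can be reached through pointers; the slide does not spell out the definition. The heap contents and the name `ancestors` are hypothetical.

```python
def ancestors(heap, obj):
    """Return every other object from which obj is reachable."""
    # Invert the pointer graph so we can walk from obj back to its referrers.
    preds = {o: set() for o in heap}
    for o, targets in heap.items():
        for t in targets:
            preds[t].add(o)
    seen, stack = set(), [obj]
    while stack:
        for p in preds[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    seen.discard(obj)  # an object on a cycle is not counted as its own ancestor
    return seen

# Hypothetical toy heap: a list reachable from a root, plus a temporary referrer.
heap = {"root": ["list"], "list": ["n1"], "n1": ["n2"], "n2": [], "tmp": ["n2"]}
```

Here `n2` has four ancestors, while `root` has none, so a collector that must find every referrer of an object has far less work to do for `root`.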

Page 13: Garbage Collector Design Principles

“Predict lifetime based on roots.”

                              Lifetime
Objects reachable …           Short     Long
indirectly from stack         25.6%     16.2%
only directly from stack      32.9%     0.8%
from globals                  4.0%      20.5%
Total                         62.5%     37.5%

[Figure: objects o1 to o4 reachable from stack variable s and global variable g]

For details, see [ISMM’02] paper.
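The three rows of the table can be read as a classification of each reachable object. The sketch below is one possible reading, under assumptions that the slide does not confirm: reachability from globals takes precedence, and "only directly from stack" means that no reachable heap object points to the object. The heap, the root lists, and the function names are hypothetical and do not reproduce the figure.

```python
def reach(heap, starts):
    """Return all objects reachable from the given starting objects."""
    seen, stack = set(starts), list(starts)
    while stack:
        for t in heap[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def classify(heap, stack_roots, global_roots):
    """Map each reachable object to one of the three table categories."""
    from_globals = reach(heap, global_roots)
    from_stack = reach(heap, stack_roots)
    live = from_globals | from_stack
    # Objects that some reachable heap object points to.
    pointed_to = {t for o in live for t in heap[o]}
    result = {}
    for o in live:
        if o in from_globals:
            result[o] = "from globals"              # assumed to take precedence
        elif o in pointed_to:
            result[o] = "indirectly from stack"
        else:
            result[o] = "only directly from stack"
    return result

# Hypothetical heap: the stack refers to o1 and o2, a global refers to o4.
heap = {"o1": [], "o2": ["o3"], "o3": [], "o4": []}
categories = classify(heap, stack_roots=["o1", "o2"], global_roots=["o4"])
```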

Page 14: Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion

Page 15: CBGC Family of Garbage Collectors: Connectivity-Based Garbage Collection

• Do partial collections.
• Collect connected objects together.
• Predict lifetime based on age.
• Focus on objects with few ancestors.
• Predict lifetime based on roots.

[Figure: example heap with objects o1 to o15 and stack + globals, grouped into partitions p1 to p4]

Page 16: Family of Garbage Collectors: Components of CBGC

Before allocation:
1. Partitioning: Decide into which partition to put each object

Collection algorithm:
2. Estimator: Estimate dead + live objects for each partition
3. Chooser: Choose “good” set of partitions
4. Partial collection: Collect chosen partitions

Page 17: Family of Garbage Collectors: Partitioning Problem

Find fine-grained partitions, where
• Partition edges respect pointers
• Objects don’t move between partitions

[Figure: example heap with objects o1 to o15 and stack + globals, grouped into partitions p1 to p4]

Page 18: Family of Garbage Collectors: Partitioning Solutions

Pointer analysis
• Type-based [Harris]
– o1 may point to o2 if o1 has a field of a type compatible with o2
– Conservative: the analysis reports the absence of a pointer between two heaps only if it can prove that such a pointer cannot exist.

[Figure: example heap with objects o1 to o15 and stack + globals, grouped into partitions p1 to p4]
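The type-based rule above can be sketched in a few lines. This is an illustration only, not the analysis from [Harris]: it assumes that "compatible" means the pointee's class is the field's declared type or a subclass of it, that inherited fields count, and that each class forms its own partition. The class table `CLASSES` and all function names are hypothetical.

```python
# Hypothetical class table: class name -> (superclass or None, declared field types).
CLASSES = {
    "Object": (None, []),
    "Node": ("Object", ["Node", "Payload"]),
    "Payload": ("Object", []),
    "BigPayload": ("Payload", []),
    "Logger": ("Object", []),
}

def is_subtype(cls, ancestor):
    """True if cls is ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = CLASSES[cls][0]
    return False

def fields_of(cls):
    """Collect the field types declared by cls and by all of its superclasses."""
    result = []
    while cls is not None:
        parent, fields = CLASSES[cls]
        result.extend(fields)
        cls = parent
    return result

def may_point_to(src, dst):
    """Conservative test: src may point to dst if some field type admits dst."""
    return any(is_subtype(dst, field) for field in fields_of(src))

def partition_edges():
    """Edges between partitions, assuming one partition per class."""
    return {(s, d) for s in CLASSES for d in CLASSES if may_point_to(s, d)}
```

The result is conservative in the sense of the slide: `may_point_to("Node", "Logger")` is false only because no field of `Node` could ever hold a `Logger`, whereas a field declared as `Object` would create an edge to every class.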

Page 19: Family of Garbage Collectors: Estimator Problem

For each partition guess
dead
– Objects that can be reclaimed
– Pay-off
live
– Objects that must be traversed
– Cost

[Figure: partitions p1 to p4 and stack + globals, with per-partition estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]

Page 20: Family of Garbage Collectors: Estimator Solutions

Heuristics
• Connected objects die together
• Most objects die young
• Objects reachable from globals live long
• The past predicts the future

[Figure: partitions p1 to p4 and stack + globals, with per-partition estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]
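To show how such heuristics could combine into a number per partition, here is a minimal sketch. It is not the estimator from the talk: the `Partition` fields, the default rates, and the way the rules are combined are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    name: str
    objects: int                # objects currently allocated in the partition
    new_objects: int            # of those, allocated since its last collection
    global_reachable: bool      # may be reachable from a global variable
    last_survival: float = 0.5  # survival fraction seen last time (hypothetical prior)

def estimate(p, young_death_rate=0.9):
    """Return (dead, live) guesses for p. Every rate here is an illustrative assumption."""
    if p.global_reachable:
        # "Objects reachable from globals live long": guess that everything is live.
        return 0.0, float(p.objects)
    old = p.objects - p.new_objects
    # "Most objects die young": new objects die at young_death_rate.
    # "The past predicts the future": old objects survive at the last observed rate.
    live = p.new_objects * (1 - young_death_rate) + old * p.last_survival
    return p.objects - live, live

# Hypothetical partition: 10 objects, 8 of them new, not reachable from globals.
dead, live = estimate(Partition("p", objects=10, new_objects=8, global_reachable=False))
```

The dead guess is the pay-off and the live guess is the cost, which is exactly the pair the chooser needs for each partition.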

Page 21: Family of Garbage Collectors: Chooser Problem

Pick subset of partitions
• Maximize total dead
• Minimize total live
• Closed under predecessor relation

No bookkeeping for external pointers

[Figure: partitions p1 to p4 and stack + globals, with per-partition estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live; the chosen subset totals 7 dead + 5 live]

Page 22: Family of Garbage Collectors: Chooser Solutions

Optimal algorithm based on network flow [TR]

Simpler, greedy algorithm

[Figure: partitions p1 to p4 and stack + globals, with per-partition estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live; the chosen subset totals 7 dead + 5 live]
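One possible greedy chooser is sketched below; it is not claimed to be the greedy algorithm from the talk or the network-flow algorithm from [TR]. It grows a predecessor-closed set and keeps adding partitions while the added part has at least as many dead as live objects. The four estimates are the ones shown in the figure, but their assignment to p1 to p4 and the predecessor edges in `preds` are assumptions.

```python
def closure(preds, p):
    """Return p plus every partition that can, transitively, point into p."""
    seen, stack = {p}, [p]
    while stack:
        for q in preds[stack.pop()]:
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def choose(preds, est):
    """Greedily grow a predecessor-closed set while added dead >= added live."""
    chosen = set()
    while True:
        best, best_gain = None, None
        for p in sorted(preds):
            if p in chosen:
                continue
            added = closure(preds, p) - chosen
            gain = sum(est[q][0] - est[q][1] for q in added)
            if gain >= 0 and (best_gain is None or gain > best_gain):
                best, best_gain = added, gain
        if best is None:
            return chosen
        chosen |= best

# Assumed predecessor edges: preds[p] lists the partitions that may point into p.
preds = {"p1": [], "p2": ["p1"], "p3": ["p1"], "p4": ["p3"]}
# (dead, live) estimates from the figure; the mapping to partitions is assumed.
est = {"p1": (3, 3), "p2": (1, 2), "p3": (2, 0), "p4": (2, 2)}

chosen = choose(preds, est)
total_dead = sum(est[p][0] for p in chosen)
total_live = sum(est[p][1] for p in chosen)
```

Because every added group is itself a predecessor closure, no partition outside the result points into it, which is the property that removes the need to track external pointers. Under these assumed edges the result is p1, p3, and p4, matching the 7 dead + 5 live total in the figure.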

Page 23: Family of Garbage Collectors: Partial Collection Problem

Look only at chosen partitions
Traverse reachable objects
Reclaim unreachable objects

[Figure: partitions p2, p3, and p4 with their objects, the rest of the heap, and stack + globals]

Page 24: Family of Garbage Collectors: Partial Collection Solutions

[Figure: partitions p2, p3, and p4 with their objects, the rest of the heap, and stack + globals]

Generalize canonical full-heap algorithms
• Mark and sweep [McCarthy’60]
• Semi-space copying [Cheney’70]
• Treadmill [Baker’92]
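As a rough illustration of generalizing a full-heap algorithm, the sketch below restricts mark and sweep to a chosen set of partitions. It is a simplified model, not the implementation from the talk: the heap is a dictionary, `partition_of` and `chosen` are hypothetical inputs, and the chosen set is assumed to be closed under predecessors so that no object outside it points inside.

```python
def partial_mark_sweep(heap, roots, partition_of, chosen):
    """Collect only the chosen partitions and return the reclaimed objects.

    Assumes chosen is closed under predecessors, so no object outside the
    chosen partitions points to an object inside them.
    """
    def in_scope(o):
        return partition_of[o] in chosen

    # Mark phase: start at roots and never leave the chosen partitions.
    marked = {r for r in roots if in_scope(r)}
    stack = list(marked)
    while stack:
        for t in heap[stack.pop()]:
            if in_scope(t) and t not in marked:
                marked.add(t)
                stack.append(t)
    # Sweep phase: unmarked objects in chosen partitions are unreachable.
    dead = {o for o in heap if in_scope(o) and o not in marked}
    for o in dead:
        del heap[o]
    return dead

# Hypothetical heap: p1 = {x, y, z} may point into p2 = {a, b, c}, never the reverse.
heap = {"x": ["z", "a"], "y": ["z"], "z": [], "a": ["b"], "b": [], "c": ["b"]}
partition_of = {"x": "p1", "y": "p1", "z": "p1", "a": "p2", "b": "p2", "c": "p2"}
reclaimed = partial_mark_sweep(heap, roots=["x"], partition_of=partition_of, chosen={"p1"})
```

Only `y` is reclaimed. The unreachable object `c` sits in the unchosen partition p2, so it is neither traversed nor freed in this collection.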

Page 25: Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion

Page 26: Design Space Exploration: Questions

How good is a naïve CBGC?

How good could CBGC be in 20 years?

How well does CBGC do in a JVM?

Page 27: Design Space Exploration: Simulator Methodology

Garbage collection simulator (under GPL)
– Uses traces of allocations and pointer writes from our benchmark runs

Simulator advantages
– Easier to implement variety of collector algorithms
– Know entire trace beforehand: can use that for “in 20 years” experiments

Currently adding CBGC to JikesRVM

Page 28: Design Space Exploration: How good is a naïve CBGC?

[Figure: three charts (cost in time, cost in space, pause times), each covering the benchmarks jack, xalan, jbb, and javac]

Collectors compared
• Full-heap: semi-space copying
• CBGC-naïve: type-based partitioning [Harris], heuristics estimator
• Appel: copying generational

Page 29: Design Space Exploration: How good could CBGC be in 20 years?

[Figure: three charts (cost in time, cost in space, pause times), each covering the benchmarks jack, xalan, jbb, and javac]

Collectors compared
• Full-heap: semi-space copying
• CBGC-oracles: partitioning and estimator based on trace
• Appel: copying generational

Page 30: Design Space Exploration: How good could CBGC be in 20 years?

CBGC with oracles beats Appel
– We did not find a “performance wall”
– CBGC has potential

The performance gap between CBGC with oracles and naïve CBGC is large
– Research challenges

Page 31: How well does CBGC do in a Java virtual machine?

Implementation in progress

Need a pointer analysis for the partitioning

Page 32: Contributions presented in this talk

Connectivity-based GC design principles [ISMM’02]

CBGC, a new family of garbage collectors;

Design space exploration with simulator [OOPSLA’03]