Connectivity-Based Garbage Collection
Presenter: Feng Xian
Authors: Martin Hirzel et al.
Published in OOPSLA 2003
2
Garbage Collection Benefits
Garbage collection leads to simpler
• Design: no complex deallocation protocols
• Implementation: automatic deallocation
• Maintenance: fewer bugs
Benefits are widely accepted
• Java, C#, Python, …
3
Garbage Collection: Haven’t we solved this problem yet?
• For a state-of-the-art garbage collector:
– time: ~14% of execution time
– space: 3x high watermark
– pauses: 0.8 seconds
• Can reduce any one cost
• Challenge: reduce all three costs
4
Example Heap
[Figure: heap graph of objects o1 to o15, referenced from stack variables s1 and s2 and global variable g]
Boxes: heap objects
Arrows: pointers
Long box: stack + global variables
5
Thesis
1. Objects form distinct data structures
2. Connected objects die together
3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently
[Figure: example heap of objects o1 to o15 with stack + globals]
6
Experimental Infrastructure
JikesRVM Research Virtual Machine
– From IBM Research
– Written in Java
– Application and runtime system share heap
Good garbage collection even more important
Benchmarks
– SPECjvm98 suite and SPECjbb2000
– Java Olden suite
– xalan, ipsixql, nfc, jigsaw
7
Outline
• Garbage Collector Design Principles
• Family of Garbage Collectors
• Design Space Exploration
• Conclusion
8
Garbage Collector Design Principles
“Do partial collections.”
Don’t collect the full heap every time
Shorter pause times
[Figure: example heap of objects o1 to o15 with stack + globals]
9
Garbage Collector Design Principles
“Predict lifetime based on age.”
Generational hypothesis: Most objects die young
Generational garbage collection:
– Partition by age
– Collect young objects most often
Low time overhead
That’s the state of the art.
[Figure: example heap of objects o1 to o15 with stack + globals, split into a young generation and an old generation]
10
Garbage Collector Design Principles
Generational GC Problems
[Figure: example heap of objects o1 to o15 with stack + globals, split into a young generation and an old generation]
• Regular full collections: long peak pause
• Old-to-young pointers: need bookkeeping
11
Garbage Collector Design Principles
“Collect connected objects together.”
Likelihood that two objects die at the same time:
Connectivity          Likelihood
Any pair              33.1%
Weakly connected      46.3%
Strongly connected    72.4%
Direct pointer        76.4%
[Figure column “Example”: objects o1 and o2 drawn for each connectivity class]
12
Garbage Collector Design Principles
“Focus on objects with few ancestors.”
Short-lived objects are easy to collect
Lifetime    Median number of ancestor objects
Short       2 objects
Long        83,324 objects
13
Garbage Collector Design Principles
“Predict lifetime based on roots.”
[Figure: objects o1 to o4 referenced from stack variable s and global variable g]
                             Lifetime
Objects reachable …          Short    Long
indirectly from stack        25.6%    16.2%
only directly from stack     32.9%     0.8%
from globals                  4.0%    20.5%
Total                        62.5%    37.5%
For details, see [ISMM’02] paper.
14
Outline
• Garbage Collector Design Principles
• Family of Garbage Collectors
• Design Space Exploration
• Conclusion
15
Family of Garbage Collectors:
Connectivity-Based Garbage Collection (CBGC)
[Figure: example heap of objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
• Do partial collections.
• Collect connected objects together.
• Predict lifetime based on age.
• Focus on objects with few ancestors.
• Predict lifetime based on roots.
16
Family of Garbage Collectors
Components of CBGC
Before allocation:
1. Partitioning
Decide into which partition to put each object
Collection algorithm:
2. Estimator
Estimate dead + live objects for each partition
3. Chooser
Choose “good” set of partitions
4. Partial collection
Collect chosen partitions
17
Family of Garbage Collectors
Partitioning Problem
Find fine-grained partitions, where
• Partition edges respect pointers
• Objects don’t move between partitions
[Figure: example heap of objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
18
Family of Garbage Collectors
Partitioning Solutions
Pointer analysis
• Type-based [Harris]
– o1 may point to o2 if o1 has a field of a type compatible with o2
– Conservative: the analysis reports the absence of a pointer between two parts of the heap only if it can prove that no such pointer can exist
[Figure: example heap of objects o1 to o15 grouped into partitions p1 to p4, with stack + globals]
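The type-based rule above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the class table, the choice of one partition per class, and the helper names are hypothetical and are not taken from the talk or from [Harris]. It only shows how "may point to" can be derived conservatively from declared field types.

```python
from collections import defaultdict

# Hypothetical class table: class name -> (superclass or None, declared field types).
CLASSES = {
    "Object": (None, []),
    "Node":   ("Object", ["Node", "Item"]),
    "Item":   ("Object", ["Label"]),
    "Label":  ("Object", []),
}

def is_subtype(sub, sup):
    """True if class `sub` is `sup` or inherits from it."""
    while sub is not None:
        if sub == sup:
            return True
        sub = CLASSES[sub][0]
    return False

def all_fields(cls):
    """Declared field types of `cls`, including inherited ones."""
    fields = []
    while cls is not None:
        parent, declared = CLASSES[cls]
        fields.extend(declared)
        cls = parent
    return fields

def may_point_to(src, dst):
    """Type-based rule: src may point to dst if some field of src
    has a declared type that dst is compatible with."""
    return any(is_subtype(dst, field) for field in all_fields(src))

def partition_edges():
    """Assumption: one partition per class. An edge is recorded
    wherever a pointer cannot be ruled out by the types."""
    edges = defaultdict(set)
    for src in CLASSES:
        for dst in CLASSES:
            if may_point_to(src, dst):
                edges[src].add(dst)
    return dict(edges)

print(partition_edges())
```

The result is conservative in the sense of the slide: a missing edge means the types prove no pointer can exist, while a present edge only means a pointer is possible.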
19
Family of Garbage Collectors
Estimator Problem
For each partition, guess
• dead
– Objects that can be reclaimed
– Pay-off
• live
– Objects that must be traversed
– Cost
[Figure: partitions p1 to p4 with stack + globals, annotated with the estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]
20
Family of Garbage Collectors
Estimator Solutions
Heuristics
• Connected objects die together
• Most objects die young
• Objects reachable from globals live long
• The past predicts the future
[Figure: partitions p1 to p4 with stack + globals, annotated with the estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]
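As a rough illustration of how two of these heuristics could combine into a per-partition guess, here is a minimal sketch. The `PartitionStats` record, its fields, and the rounding rule are all hypothetical. The talk does not specify how its heuristic estimator is computed, so this is one possible reading, not the authors' estimator.

```python
from dataclasses import dataclass

@dataclass
class PartitionStats:
    objects: int                  # objects currently in the partition
    survival_rate: float          # fraction that survived the previous collection (0..1)
    reachable_from_globals: bool  # whether the partition is reachable from globals

def estimate(stats):
    """Return a (dead, live) guess for one partition."""
    if stats.reachable_from_globals:
        # Heuristic: objects reachable from globals live long.
        live = stats.objects
    else:
        # Heuristic: the past predicts the future.
        live = round(stats.objects * stats.survival_rate)
    return stats.objects - live, live

print(estimate(PartitionStats(objects=8, survival_rate=0.25, reachable_from_globals=False)))
```

The dead count plays the role of the pay-off and the live count the role of the cost.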
21
Family of Garbage Collectors
Chooser Problem
Pick subset of partitions
• Maximize total dead
• Minimize total live
• Closed under predecessor relation
No bookkeeping for external pointers
[Figure: partitions p1 to p4 with stack + globals, annotated with 7 dead + 5 live, 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]
22
Family of Garbage Collectors
Chooser Solutions
Optimal algorithm based on network flow [TR]
Simpler, greedy algorithm
[Figure: partitions p1 to p4 with stack + globals, annotated with 7 dead + 5 live, 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, 2 dead + 2 live]
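A greedy chooser might look like the sketch below. It is not the algorithm from the talk, whose details are not given here: the scoring function, the predecessor edges, and the assignment of estimates to p1 through p4 are assumptions for illustration. The one property it takes from the slides is that the chosen set is closed under the predecessor relation.

```python
def closure(p, preds):
    """p plus every partition that can reach it via predecessor edges."""
    seen, stack = set(), [p]
    while stack:
        q = stack.pop()
        if q not in seen:
            seen.add(q)
            stack.extend(preds.get(q, ()))
    return seen

def choose(est, preds):
    """Greedy: among the predecessor closures of single partitions,
    pick the one with the best dead-to-live score."""
    best, best_score = set(), 0.0
    for p in est:
        chosen = closure(p, preds)
        dead = sum(est[q][0] for q in chosen)
        live = sum(est[q][1] for q in chosen)
        score = dead / (live + 1)  # +1 avoids division by zero (arbitrary choice)
        if score > best_score:
            best, best_score = chosen, score
    return best

# Hypothetical estimates (dead, live) and hypothetical predecessor edges.
est = {"p1": (3, 3), "p2": (1, 2), "p3": (2, 0), "p4": (2, 2)}
preds = {"p2": ["p1"], "p3": ["p1"], "p4": ["p3"]}
print(sorted(choose(est, preds)))
```

Because every candidate is a full predecessor closure, no pointer from an unchosen partition can reach into the chosen set, which is why no bookkeeping for external pointers is needed.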
23
Family of Garbage Collectors
Partial Collection Problem
Look only at chosen partitions
Traverse reachable objects
Reclaim unreachable objects
[Figure: chosen partitions p2, p3, p4 with their objects, the rest of the heap, and stack + globals]
24
Family of Garbage Collectors
Partial Collection Solutions
Generalize canonical full-heap algorithms
• Mark and sweep [McCarthy’60]
• Semi-space copying [Cheney’70]
• Treadmill [Baker’92]
[Figure: chosen partitions p2, p3, p4 with their objects, the rest of the heap, and stack + globals]
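To make the generalization concrete, here is a minimal sketch of mark and sweep restricted to a chosen set of partitions. The heap representation (a dictionary from object name to pointees) and all names are hypothetical. The sketch assumes the chosen set is closed under the predecessor relation, so that marking can start from the stack and globals alone.

```python
def partial_collect(heap, partition_of, roots, chosen):
    """Mark and sweep limited to the chosen partitions.

    heap:         object -> list of objects it points to
    partition_of: object -> partition name
    roots:        objects referenced directly from stack + globals
    chosen:       set of partitions, assumed closed under predecessors
    Returns the set of reclaimed objects and removes them from heap.
    """
    marked = set()
    stack = [o for o in roots if partition_of[o] in chosen]
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        for child in heap[obj]:
            # Pointers leaving the chosen partitions are not followed.
            if partition_of[child] in chosen:
                stack.append(child)
    in_scope = {o for o in heap if partition_of[o] in chosen}
    garbage = in_scope - marked
    for o in garbage:
        del heap[o]
    return garbage

# Hypothetical heap: o3 is unreachable; o4 lives in an unchosen partition.
heap = {"o1": ["o2"], "o2": ["o4"], "o3": ["o2"], "o4": []}
partition_of = {"o1": "p1", "o2": "p1", "o3": "p1", "o4": "p2"}
print(partial_collect(heap, partition_of, roots=["o1"], chosen={"p1"}))
```

Objects outside the chosen partitions are neither traversed nor reclaimed, which is where the shorter pause comes from.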
25
Outline
• Garbage Collector Design Principles
• Family of Garbage Collectors
• Design Space Exploration
• Conclusion
26
Design Space Exploration
Questions
How good is a naïve CBGC?
How good could CBGC be in 20 years?
How well does CBGC do in a JVM?
27
Design Space Exploration
Simulator Methodology
Garbage collection simulator (under GPL)
– Uses traces of allocations and pointer writes from our benchmark runs
Simulator advantages
– Easier to implement variety of collector algorithms
– Know entire trace beforehand: can use that for “in 20 years” experiments
Currently adding CBGC to JikesRVM
28
Design Space Exploration
How good is a naïve CBGC?
[Figure: three bar charts (cost in time, cost in space, pause times) for the benchmarks jack, xalan, jbb, and javac]
Collectors compared:
• Full-heap: semi-space copying
• CBGC-naïve: type-based partitioning [Harris], heuristics estimator
• Appel: copying generational
29
Design Space Exploration
How good could CBGC be in 20 years?
[Figure: three bar charts (cost in time, cost in space, pause times) for the benchmarks jack, xalan, jbb, and javac]
Collectors compared:
• Full-heap: semi-space copying
• CBGC-oracles: partitioning and estimator based on trace
• Appel: copying generational
30
Design Space Exploration
How good could CBGC be in 20 years?
CBGC with oracles beats Appel
– We did not find a “performance wall”
– CBGC has potential
The performance gap between CBGC with oracles and naïve CBGC is large
– Research challenges
31
How well does CBGC do in a Java virtual machine?
Implementation in progress
Need a pointer analysis for the partitioning
32
Contributions presented in this talk
Connectivity-based GC design principles [ISMM’02]
CBGC, a new family of garbage collectors
Design space exploration with simulator [OOPSLA’03]