Quantifying the Performance of Garbage Collection vs. Explicit Memory Management
![Page 1: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/1.jpg)
University of Massachusetts Amherst • Department of Computer Science
Quantifying the Performance of Garbage Collection vs.
Explicit Memory Management
Matthew Hertz* & Emery Berger
University of Massachusetts Amherst
*now at Canisius College
![Page 2: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/2.jpg)
Explicit Memory Management
malloc / new allocates space for an object
free / delete returns memory to system
Simple, but tricky to get right
  Forget to free → memory leak
  Free too soon → “dangling pointer”
![Page 3: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/3.jpg)
Dangling Pointers
Node* x = new Node("happy");
Node* ptr = x;
delete x; // But I’m not dead yet!
Node* y = new Node("sad");
cout << ptr->data << endl; // may print: sad (ptr dangles; y can reuse x's memory)
Insidious, hard-to-track down bugs
![Page 4: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/4.jpg)
Solution: Garbage Collection
No need to free
Garbage collector periodically scans objects on heap
Reclaims non-reachable objects
Won’t reclaim objects until they’re dead (actually somewhat later)
![Page 5: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/5.jpg)
No More Dangling Pointers
Node* x = new Node("happy");
Node* ptr = x;
// x still live (reachable through ptr)
Node* y = new Node("sad");
cout << ptr->data << endl; // happy!
So why not use GC all the time?
![Page 6: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/6.jpg)
It’s The Performance…
There just aren’t all that many worse ways to f*** up your cache behavior than by using lots of allocations and lazy GC to manage your memory.
GC sucks donkey brains through a straw from a performance standpoint.
Linus Torvalds
![Page 7: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/7.jpg)
Slightly More Technically…
“GC impairs performance”
  Extra processing (collection, copying)
  Degrades cache performance (ibid)
  Degrades page locality (ibid)
  Increases memory needs (delayed reclamation)
![Page 8: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/8.jpg)
On the other hand…
No, “GC enhances performance!”
  Faster allocation (pointer-bumping vs. freelist)
  Improves cache performance (no need for headers)
  Better locality (can reduce fragmentation, compact data structures according to use)
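The "pointer-bumping vs. freelist" contrast can be illustrated with a minimal sketch. Both classes below are hypothetical teaching models, not the allocators measured in the talk: `BumpAllocator` shows why allocation in a copying collector's nursery is cheap (one bounds check and one add), while `FreeListAllocator` shows a naive first-fit search. Real free-list allocators such as the Lea allocator use size-segregated bins and are far more refined than this.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bump-pointer allocation: the fast path is a bounds check plus an add.
class BumpAllocator {
public:
    explicit BumpAllocator(std::size_t capacity) : buffer_(capacity), cursor_(0) {}

    // Returns nullptr when the region is exhausted (a collector would run here).
    void* allocate(std::size_t size) {
        if (size > buffer_.size()) return nullptr;
        std::size_t aligned = (size + kAlign - 1) & ~(kAlign - 1);
        if (aligned > buffer_.size() - cursor_) return nullptr;
        void* result = buffer_.data() + cursor_;
        cursor_ += aligned;
        return result;
    }

    std::size_t used() const { return cursor_; }

private:
    static constexpr std::size_t kAlign = 8;
    std::vector<std::uint8_t> buffer_;
    std::size_t cursor_;
};

// First-fit free-list allocation: each request walks the list of free blocks.
class FreeListAllocator {
public:
    struct Block { std::size_t offset; std::size_t size; };

    explicit FreeListAllocator(std::size_t capacity) { free_.push_back({0, capacity}); }

    // Returns the offset of the allocated block, or -1 if no block fits.
    long allocate(std::size_t size) {
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i].size >= size) {
                std::size_t offset = free_[i].offset;
                free_[i].offset += size;
                free_[i].size -= size;
                if (free_[i].size == 0) {
                    free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
                }
                return static_cast<long>(offset);
            }
        }
        return -1;
    }

    // Returns a block to the list (this sketch does no coalescing).
    void release(std::size_t offset, std::size_t size) { free_.push_back({offset, size}); }

    std::size_t freeBlocks() const { return free_.size(); }

private:
    std::vector<Block> free_;
};
```

The bump allocator cannot reuse an individual freed object, which is why it pairs naturally with a collector that evacuates or compacts the region wholesale.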
![Page 9: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/9.jpg)
Outline
Quantifying GC performance
A hard problem
Oracular memory management
Experimental methodology
Results
![Page 10: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/10.jpg)
Comparing Memory Managers
Node *v = malloc(sizeof(Node));
v->data = malloc(sizeof(NodeData));
memcpy(v->data, old->data, sizeof(NodeData));
free(old->data);
v->next = old->next;
v->next->prev = v;
v->prev = old->prev;
v->prev->next = v;
free(old);
Using GC in C/C++ is easy:
BDW Collector
![Page 11: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/11.jpg)
Comparing Memory Managers
…slide in BDW and ignore calls to free.
BDW Collector
![Page 12: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/12.jpg)
What About Other Garbage Collectors?
Compares malloc to GC, but only conservative, non-copying collectors (really = BDW)
  Can’t reduce fragmentation, reorder objects, etc.
But: faster precise, copying collectors
  Incompatible with C/C++
  Standard for Java…
![Page 13: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/13.jpg)
Comparing Memory Managers
Node node = new Node();
node.data = new NodeData();
useNode(node);
node = null;
...
node = new Node();
...
node.data = new NodeData();
...
Adding malloc/free to Java: not so easy…
Lea Allocator
![Page 14: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/14.jpg)
Comparing Memory Managers
... need to insert frees, but where?
free(node.data)?
free(node)?
Lea Allocator
![Page 15: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/15.jpg)
Oracular Memory Manager
[Diagram labels: Java; Simulator; C malloc/free; allocation; Oracle; “execute program here”; “perform actions at no cost below here”]
Consult oracle at each allocation
  Oracle does not disrupt hardware state
  Simulator invokes free()…
![Page 16: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/16.jpg)
Object Lifetime & Oracle Placement
Oracles bracket placement of frees
  Lifetime-based: most aggressive
  Reachability-based: most conservative
[Timeline diagram labels: obj = new Object; live / dead; reachable / unreachable; can be freed; can be collected; freed by lifetime-based oracle; freed by reachability-based oracle; free(obj); free(??)]
![Page 17: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/17.jpg)
Liveness Oracle Generation
[Diagram labels: Java; PowerPC Simulator; C malloc/free; “execute program here”; “perform actions at no cost below here”; trace file (allocation, mem access, prog. roots); Post-process; Oracle]
Liveness: record allocs, mem. accesses
  Preserve code, type objects, etc.
  May use objects without accessing them
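The post-processing step can be sketched as follows. This is an illustrative assumption about how a trace of allocations and memory accesses might be turned into a lifetime-based oracle, not the authors' tool: `TraceEvent`, `EventKind`, and `buildLivenessOracle` are hypothetical names, and the sketch ignores the caveats noted on the slide (preserved code and type objects, objects used without being accessed).

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

enum class EventKind { Alloc, Access };

struct TraceEvent {
    EventKind kind;
    int objectId;  // hypothetical unique id per allocated object
};

// Entry k lists the objects whose last recorded use precedes the k-th
// allocation (0-based), so they could be freed just before that allocation.
std::vector<std::vector<int>> buildLivenessOracle(const std::vector<TraceEvent>& trace) {
    std::map<int, std::size_t> lastUse;   // object id -> index of last event touching it
    std::vector<std::size_t> allocIndex;  // trace indices of allocation events, ascending
    for (std::size_t i = 0; i < trace.size(); ++i) {
        lastUse[trace[i].objectId] = i;
        if (trace[i].kind == EventKind::Alloc) allocIndex.push_back(i);
    }

    std::vector<std::vector<int>> oracle(allocIndex.size());
    for (const auto& entry : lastUse) {
        // First allocation that happens strictly after the object's last use.
        auto next = std::upper_bound(allocIndex.begin(), allocIndex.end(), entry.second);
        if (next != allocIndex.end()) {
            oracle[static_cast<std::size_t>(next - allocIndex.begin())].push_back(entry.first);
        }
        // Objects still in use after the final allocation are never listed.
    }
    return oracle;
}
```

Keying the oracle by allocation count fits the slides' statement that the oracle is consulted at each allocation.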
![Page 18: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/18.jpg)
Reachability Oracle Generation
[Diagram labels: Java; PowerPC Simulator; C malloc/free; “execute program here”; “perform actions at no cost below here”; trace file (allocations, ptr updates, prog. roots); Merlin analysis; Oracle]
Reachability:
  Illegal instructions mark heap events
  Simulated identically to legal instructions
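To make "reachable" concrete, here is a minimal sketch of computing the reachable set from program roots over a snapshot of the heap's pointer graph. It is a plain worklist traversal, not the Merlin analysis named on the slide (Merlin determines when each object became unreachable from a trace); `HeapGraph` and `reachableFrom` are hypothetical names introduced for illustration.

```cpp
#include <map>
#include <set>
#include <vector>

using ObjectId = int;
// Maps each object to the objects its pointer fields refer to.
using HeapGraph = std::map<ObjectId, std::vector<ObjectId>>;

// Returns every object reachable from the roots by following pointers.
std::set<ObjectId> reachableFrom(const std::vector<ObjectId>& roots, const HeapGraph& heap) {
    std::set<ObjectId> seen;
    std::vector<ObjectId> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        ObjectId current = worklist.back();
        worklist.pop_back();
        if (!seen.insert(current).second) continue;  // already visited
        auto it = heap.find(current);
        if (it == heap.end()) continue;              // object has no outgoing pointers
        for (ObjectId target : it->second) worklist.push_back(target);
    }
    return seen;
}
```

An object absent from the returned set is unreachable and is what a reachability-based oracle would be allowed to free, even if it still points at live objects.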
![Page 19: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/19.jpg)
Oracular Memory Manager
[Diagram labels: Java; PowerPC Simulator; C malloc/free; “execute program here”; “perform actions at no cost below here”; oracle; allocation]
Consult oracle before each allocation
  When needed, modify instruction to call free
  Extra costs (oracle access) hidden by simulator
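The control flow of "consult oracle before each allocation" can be sketched in ordinary code. This is a simplified model under stated assumptions: `OracularHeap` is a hypothetical class, the oracle is assumed to be a per-allocation list of object ids that are safe to free, and ids are assumed unique. The real system does this inside a simulator so that oracle lookups cost nothing in the measured run; this sketch does not model that.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

class OracularHeap {
public:
    // oracle[k] lists the objects to free just before the k-th allocation.
    explicit OracularHeap(std::vector<std::vector<int>> oracle) : oracle_(std::move(oracle)) {}
    OracularHeap(const OracularHeap&) = delete;
    OracularHeap& operator=(const OracularHeap&) = delete;

    ~OracularHeap() {
        for (auto& entry : live_) std::free(entry.second);
    }

    // Frees whatever the oracle lists for this allocation, then allocates.
    void* allocate(int id, std::size_t size) {
        if (allocCount_ < oracle_.size()) {
            for (int dead : oracle_[allocCount_]) release(dead);
        }
        ++allocCount_;
        void* p = std::malloc(size);
        if (p != nullptr) live_[id] = p;
        return p;
    }

    std::size_t liveObjects() const { return live_.size(); }
    bool isLive(int id) const { return live_.count(id) != 0; }

private:
    void release(int id) {
        auto it = live_.find(id);
        if (it == live_.end()) return;  // unknown or already freed
        std::free(it->second);
        live_.erase(it);
    }

    std::vector<std::vector<int>> oracle_;
    std::map<int, void*> live_;
    std::size_t allocCount_ = 0;
};
```

Because the program itself never calls free, the same unaltered Java-style allocation sequence can be replayed against different explicit allocators simply by swapping what `allocate` and `release` call.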
![Page 20: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/20.jpg)
Experimental Methodology
Java platform: MMTk/Jikes RVM (2.3.2)
Simulator: Dynamic SimpleScalar (DSS)
  Simulates 2 GHz PowerPC processor
  G5 cache configuration
Garbage collectors:
  GenMS, GenCopy, GenRC, SemiSpace, CopyMS, MarkSweep
Explicit memory managers:
  Lea, MSExplicit (MS + explicit deallocation)
![Page 21: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/21.jpg)
Experimental Methodology
Perfectly repeatable runs
  Pseudoadaptive compiler
    Same sequence of optimizations
    Compiler advice from average of 5 runs
  Deterministic thread switching
  Deterministic system clock
![Page 22: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/22.jpg)
Execution Time for pseudoJBB
GC performance can be competitive
[Chart: time relative to Lea (y-axis, 90% to 150%) vs. heap size relative to collector minimum (x-axis, 1.00 to 4.00). Series: GenMS, GenCopy, GenRC, Lea w/ Reach, Lea w/ Life, MSExplicit w/ Reach.]
![Page 23: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/23.jpg)
Geo. Mean of Execution Time
Garbage collection trades space for time
[Chart: execution time relative to Lea (y-axis, 90% to 130%) vs. heap size relative to collector minimum (x-axis, 1.00 to 4.00). Series: GenMS, GenCopy, GenRC, Lea w/ Reach, Lea w/ Life, MSExplicit w/ Reach.]
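For reference, a geometric mean over per-benchmark execution times relative to a baseline can be computed as below. This is a generic sketch of the statistic named in the slide title, with `geometricMean` as a hypothetical helper; it does not reproduce the talk's data.

```cpp
#include <cmath>
#include <vector>

// Geometric mean of positive ratios (e.g. collector time / Lea time per benchmark).
// Returns 0.0 for an empty input. Summing logs avoids overflow in the product.
double geometricMean(const std::vector<double>& ratios) {
    if (ratios.empty()) return 0.0;
    double logSum = 0.0;
    for (double r : ratios) logSum += std::log(r);
    return std::exp(logSum / static_cast<double>(ratios.size()));
}
```

The geometric mean is the usual choice for averaging ratios because a 2x slowdown on one benchmark and a 2x speedup on another cancel out to 1.0, which an arithmetic mean would not report.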
![Page 24: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/24.jpg)
Footprint at Quickest Run
GC uses much more memory
[Bar chart: y-axis 0% to 800%. Bars: Lea w/ Reach, Lea w/ Life, MMTk, Kingsley, GenMS, GenCopy, CopyMS, SemiSpace, MarkSweep.]
![Page 25: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/25.jpg)
Footprint at Quickest Run
GC uses much more memory
[Bar chart: y-axis 0% to 800%. Bars: Lea w/ Reach, Lea w/ Life, MMTk, Kingsley, GenMS, GenCopy, CopyMS, SemiSpace, MarkSweep. Bar value labels in extraction order (their assignment to bars is not recoverable): 1.00, 1.38, 1.61, 5.10, 5.66, 4.84, 7.69, 7.09, 0.63.]
![Page 26: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/26.jpg)
Avg. Relative Cycles and Footprint
GC always requires more space
![Page 27: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/27.jpg)
Javac Paging Performance
GC: poor paging performance
![Page 28: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/28.jpg)
pseudoJBB Paging Performance
Lifetime vs. reachability… a wash
![Page 29: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/29.jpg)
Summary of Results
Best collector equals Lea's performance…
  Up to 10% faster on some benchmarks
... but uses more memory
  Quickest runs require 5x or more memory
  GenMS at least doubles mean footprint
![Page 30: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/30.jpg)
Take-home: Practitioners
GC - ok if system has more than 3x needed RAM and no competition with other processes
Not so good:
  Limited RAM
  Competition for physical memory
  Depends on RAM for performance
    In-memory database
    Search engines, etc.
![Page 31: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/31.jpg)
Take-home: Researchers
GC performance already good enough with enough RAM
Problems:
  Paging is a killer
  Performance suffers for limited RAM
![Page 32: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/32.jpg)
Future Work
Obvious dimensions
  Other collectors:
    Bookmarking collector [PLDI 05]
    Parallel collectors
  Other allocators:
    New version of DLmalloc (2.8.2)
    Our locality-improving allocator [ISMM 05]
  Other architectures:
    Examine impact of different cache sizes
Other memory management methods
  Regions, reaps
![Page 33: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/33.jpg)
Thank you
![Page 34: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/34.jpg)
Execution Time for ipsixql
Object lifetimes can be very important
[Chart: time relative to Lea (y-axis, 80% to 170%) vs. heap size relative to collector minimum (x-axis, 1.00 to 4.00). Series: GenMS, GenCopy, GenRC, Lea w/ Reach, Lea w/ Life, MSExplicit w/ Reach.]
![Page 35: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/35.jpg)
What's the Catch?
There just aren’t all that many worse ways to f*ck up your cache behavior than by using lots of allocations and lazy GC to manage your memory.
GC sucks donkey brains through a straw from a performance standpoint.
Linus Torvalds, “famous computer scientist”
![Page 36: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/36.jpg)
Who Cares About Memory?
RAM is not cheap
  Already up to 25% of the cost of computer
  Percentage continues to rise
Sun E1000: 4GB costs $75,000
  Get additional CPU for free!
Upgrading laptops may require new machine
![Page 37: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management](https://reader036.vdocuments.site/reader036/viewer/2022062500/56815a6b550346895dc7c613/html5/thumbnails/37.jpg)
Quantifying GC Performance
Perform apples-to-apples comparison
  Examine unaltered applications
  Measurements differ only in memory manager
Consider range of metrics
  Both time and space measurements