operating systems - garbage collection

63
UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Garbage Collection

Upload: emery-berger

Post on 10-May-2015

2.938 views

Category:

Technology


5 download

DESCRIPTION

From the Operating Systems course (CMPSCI 377) at UMass Amherst, Fall 2007.

TRANSCRIPT

Page 1: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Emery Berger

University of Massachusetts Amherst

Operating SystemsCMPSCI 377

Garbage Collection

Page 2: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Questions To Come

Terms:

Tracing

Copying

Conservative

Parallel GC

Concurrent GC

Runtime & space costs

Live objects

Reachable objects

References2

Page 3: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Explicit Memory Management

malloc/new

allocates space for an object

free/delete

returns memory to system

Simple, but tricky to get right

Forget to free memory leak

free too soon “dangling pointer”

Double free, invalid free...

Page 4: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Dangling Pointers

Node x = new Node (“happy”);

Node ptr = x;

delete x; // But I’m not dead yet!

Node y = new Node (“sad”);

cout << ptr->data << endl; // sad

Insidious, hard-to-track down bugs

Page 5: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Solution: Garbage Collection

Garbage collector periodically scans objects on heap

No need to free

Automatic memory management

Reclaims non-reachable objects

Won’t reclaim objects until they’re dead(actually somewhat later)

Page 6: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

No More Dangling Pointers

Node x = new Node (“happy”);

Node ptr = x;

// x still live (reachable through ptr)Node y = new Node (“sad”);

cout << ptr->data << endl; // happy!

So why not use GCall the time?

Page 7: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Because This Guy Says Not To

There just aren’t all that many worse ways to f*** up your cache behavior than by using lots of allocations and lazy GC to manage your

memory.

GC sucks donkey brains through a straw from a performance standpoint.

LinusTorvalds

Page 8: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Slightly More Technically…

“GC impairs performance”

Extra processing

collection, copying

Degrades cache performance (ibid)

Degrades page locality (ibid)

Increases memory needs

delayed reclamation

Page 9: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

On the other hand…

No, “GC enhances performance!”

Faster allocation

pointer-bumping vs. freelists

Improves cache performance

no need for headers

Better locality

can reduce fragmentation, compact data structures according to use

Page 10: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Outline

Classical GC algorithms

Quantifying GC performance

A hard problem

Oracular memory management

GC vs. malloc/free bakeoff

Page 11: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Classical Algorithms

Three classical algorithms

Mark-sweep

Reference counting

Semispace

Tweaks

Generational garbage collection

Out of scope

Parallel – perform GC in parallel

Concurrent – run GC at same time as app

Real-time – ensure bounded pause times11

Page 12: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep

12

Start with roots

Global variables, variables on stack& in registers

Recursively visit every object through pointers

Mark every object we find (set mark bit)

Everything not marked = garbage

Can then sweep heap for unmarked objectsand free them

Page 13: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

13

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Page 14: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

14

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 15: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

15

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 16: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

16

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 17: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

17

global 1

global 2

global 3

roots

object 2

object 4

object 5

Initially,all objects white(garbage)

Visit objects, following object graph

Can sweep immediately or lazily

freelist

object 1

object 3

object 6

Page 18: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting

For every object, maintain reference count= number of incoming pointers

a->ptr = x refcount(x)++

a->ptr = y refcount(x)--; refcount(y)++

Reference count = 0 no more incoming pointers: garbage

18

Page 19: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

19

global 1

global 2

global 3

roots

New objects:ref count = 1

object 5

1

object 4

1

object 2

2

Page 20: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

20

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

object 5

1

object 4

1

object 2

1

Page 21: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

21

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

object 5

1

object 4

1

object 2

1

Page 22: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

22

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

And recursively delete pointers

object 5

0

object 4

1

object 2

1

Page 23: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

23

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

And recursively delete pointers

refcount == 0:put on freelist

object 5

0

object 4

0

object 2

1

Page 24: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Cycles & Reference Counting

24

global 1

global 2

global 3

roots

Big problem: cycles

object 5

2

object 4

1

object 2

2

Page 25: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

25

global 1

global 2

global 3

roots

Big problem: cycles

object 5

1

object 4

1

object 2

2

Page 26: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

26

global 1

global 2

global 3

roots

Big problem: cycles

Cycles lead to unreclaimablegarbage

Need to do periodic tracing collection (e.g., mark-sweep)

object 5

1

object 4

1

object 2

2

Page 27: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace

Divide heap in two semispaces:

Allocate objects from from-space

When from-space fills,

Scan from roots through live objects

Copy them into to-space

When done, switch the spaces

Allocate from leftover part of to-space

27

Page 28: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

28

from-space

to-space

Page 29: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

29

from-space

to-space

Page 30: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

30

from-space

to-space

Page 31: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

31

from-space

to-space

Page 32: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

32

from-space

to-space

Page 33: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

33

from-space

to-space

Page 34: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

34

from-space

to-space

Page 35: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

Flip spaces;allocate from end

35

to-space

from-space

Page 36: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC

Optimization for copying collectors

Generational hypothesis:“most objects die young”

Common-case optimization

Allocate into nursery

Small region

Collect frequently

Copy out survivors

Key idea: keep track of pointers frommature space into nursery

36

Page 37: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

37

nursery

mature space

Page 38: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

38

nursery

mature space

Page 39: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

39

nursery

mature space

Page 40: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

40

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 41: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

41

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 42: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

42

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 43: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

43

nursery

mature space Copy out survivors (via roots & mature space pointers)

Reset allocation pointer & continue

Page 44: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Conservative GC

Non-copying collectors for C & C++

Must identify pointers

“Duck test”: if it looks like a pointer, it’s a pointer

Trace through “pointers”, marking everything

Can link with Boehm-Demers-Weiser library (“libgc”)

44

Page 45: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

GC vs. malloc/free

45

Page 46: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node v = malloc(sizeof(Node));

v->data=malloc(sizeof(NodeData));

memcpy(v->data, old->data,

sizeof(NodeData));

free(old->data);

v->next = old->next;

v->next->prev = v;

v->prev = old->prev;

v->prev->next = v;

free(old);

Using GC in C/C++ is easy:

BDWCollector

Page 47: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node v = malloc(sizeof(Node));

v->data=malloc(sizeof(NodeData));

memcpy(v->data, old->data,

sizeof(NodeData));

free(old->data);

v->next = old->next;

v->next->prev = v;

v->prev = old->prev;

v->prev->next = v;

free(old);

…slide in BDW and ignore calls to free.

BDWCollector

Page 48: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

What About Other Garbage Collectors?

Compares malloc to GC, but only conservative, non-copying collectors

Can’t reduce fragmentation,reorder objects, etc.

But: faster precise, copying collectors

Incompatible with C/C++

Standard for Java…

Page 49: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node node = new Node();

node.data = new NodeData();

useNode(node);

node = null;

...

node = new Node();

...

node.data = new NodeData();

...

Adding malloc/free to Java:not so easy…

LeaAllocator

Page 50: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node node = new Node();

node.data = new NodeData();

useNode(node);

node = null;

...

node = new Node();

...

node.data = new NodeData();

...

... need to insert frees, but where?

free(node.data)?

free(node)

?

LeaAllocator

Page 51: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Oracular Memory Manager

Java

Simulator

C malloc/free

perform actions at no costbelow here

execute program here

allocation

Oracle

Consult oracle at each allocation

Oracle does not disrupt hardware state

Simulator invokes free()…

Page 52: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Object Lifetime & Oracle Placement

Oracles bracket placement of frees

Lifetime-based: most aggressive

Reachability-based: most conservative

unreachable

live dead

reachable

freed bylifetime-based oracle

freed byreachability-based oracle

can be collected

free(obj) free(??)

obj =

new Object; can be freed

free(obj)

Page 53: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Liveness Oracle Generation

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

tracefile

allocation, mem access, prog. roots

Post-process

Liveness: record allocs, memory accesses

Oracle

Page 54: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reachability Oracle Generation

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

tracefile

allocations,ptr updates,prog. roots

Merlin analysis

Reachability:

Illegal instructions mark heap events (especially pointer assignments)

Oracle

Page 55: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Oracular Memory Manager

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

oracle

allocation

Run & consult oracle before each allocation

When needed, modify instruction to call free

Extra costs (oracle access) hidden by simulator

Page 56: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Execution Time for pseudoJBB

GC can be faster than malloc/free

90%

100%

110%

120%

130%

140%

150%

1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00

Tim

e R

elat

ive

to L

ea

Heap Size Relative to Collector Minimum

GenMS

GenCopy

GenRC

Lea w/ Reach

Lea w/ Life

Page 57: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Geo. Mean of Execution Time

Trades space for time

90%

95%

100%

105%

110%

115%

120%

125%

130%

1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00

Heap Size Relative to Collector Minimum

Exe

cutio

n Ti

me

Rel

ativ

e to

Lea

GenMS

GenCopy

GenRC

Lea w/ Reach

Lea w/ Life

MSExplic it w/ Reach

Page 58: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Footprint at Quickest Run

GC uses much more memory for speed

Page 59: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Footprint at Quickest Run

GC uses much more memory for speed

1.001.38 1.61

5.105.66

4.84

7.697.09

0.63

Page 60: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Javac Paging Performance

GC: poor paging performance

Page 61: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Summary of Results

Best collector equals Lea's performance…

Up to 10% faster on some benchmarks

... but uses more memory

Quickest runs require 5x or more memory

GenMS at least doubles mean footprint

Page 62: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

When to Use Garbage Collection

Garbage collection fine if

system has more than 3x needed RAM

and no competition with other processes

or avoiding bugs / security more important

Not so good:

Limited RAM

Competition for physical memory

Depends on RAM for performance

In-memory database

Search engines, etc.

Page 63: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

The End

63