Post on 21-Dec-2015
Region-Based Memory Management
Regions represent areas of memory
Objects are allocated “in” a given region
Various deallocation options
Various safety (no access of freed objects) options

Region r = newregion();
for (int i = 0; i < 10; i++) {
    int *x = ralloc(r, (i + 1) * sizeof(int));  /* room for i + 1 ints */
    work(i, x);
}
deleteregion(r);  /* frees every x at once */
Policy choices
Deallocation:
  garbage collection (GC)
  per-object free (per-object)
  region deletion (all-at-once)
Safety:
  none (none)
  reachability (GC)
  per-region reference counting (RC)
  statically checked (static)
Some Existing Region Systems
System       Deallocation                 Safety
arenas       all-at-once                  none
apache       all-at-once                  none
zones        per-object                   none
Stoutamire   GC                           GC
vmalloc      all-at-once or per-object    none
TT           all-at-once                  static
CWM          all-at-once                  static
C@/RC        all-at-once                  RC
Why Regions?
per-region allocation/deallocation policies: zones (D.T. Ross, 1967), vmalloc (K. Vo, 1996)
performance: arenas (D. Hanson, 1990)
locality benefits: Stoutamire (1997)
expressiveness: apache, arenas, C@/RC
target for compiler-inferred memory management: Tofte & Talpin (1994), Crary, Walker, Morrisett (1999)
Why Regions? (more reasons)
statically guaranteed memory safety: CWM (1999)
target for garbage collection: Wang & Appel (2001)
Region Performance: Allocation and Deallocation
Applies to all-at-once only
Basic strategy:
  allocate a big block of memory
  individual allocation is:
    pointer increment
    overflow test
  deallocation frees the list of big blocks
  all operations are fast
[Figure: a region, with its alloc point and wastage marked]
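The basic strategy above can be sketched in C roughly as follows. This is a minimal illustration, not the allocator of any system named in these slides: `BLOCK_SIZE`, `struct block`, and the internals of `newregion`/`ralloc`/`deleteregion` are assumptions made for the example, and requests larger than one block are simply refused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096   /* assumed payload size of one big block */

/* One big block; the blocks of a region form a singly linked list. */
struct block {
    struct block *next;
    size_t used;          /* the "alloc point": offset of the next free byte */
    _Alignas(max_align_t) unsigned char data[BLOCK_SIZE];
};

typedef struct region {
    struct block *blocks; /* newest block first */
} *Region;

static struct block *new_block(struct block *next) {
    struct block *b = malloc(sizeof *b);
    if (b) { b->next = next; b->used = 0; }
    return b;
}

Region newregion(void) {
    Region r = malloc(sizeof *r);
    if (!r) return NULL;
    r->blocks = new_block(NULL);
    if (!r->blocks) { free(r); return NULL; }
    return r;
}

void *ralloc(Region r, size_t size) {
    size_t align = _Alignof(max_align_t);
    if (size > BLOCK_SIZE) return NULL;          /* sketch: no oversized objects */
    size = (size + align - 1) / align * align;   /* keep the alloc point aligned */
    struct block *b = r->blocks;
    if (b->used + size > BLOCK_SIZE) {           /* overflow test */
        b = new_block(r->blocks);                /* tail of the old block is wastage */
        if (!b) return NULL;
        r->blocks = b;
    }
    void *p = b->data + b->used;                 /* pointer increment */
    b->used += size;
    return p;
}

void deleteregion(Region r) {
    struct block *b = r->blocks;
    while (b) {                                  /* free the list of big blocks */
        struct block *next = b->next;
        free(b);
        b = next;
    }
    free(r);
}
```

In the common case `ralloc` is one comparison and one addition, and `deleteregion` costs one `free` per big block rather than one per object.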
Region Performance: Locality
Regions can express locality:
  Sequential allocs in a region can share a cache line
  Allocs in different regions are less likely to pollute the cache for each other
Example: moss is 24% faster when frequently accessed, small objects are placed in a different region than infrequently accessed, large objects
Locality: moss
1-region version: small & large objects in 1 region
2-region version: small & large objects in 2 regions
45% fewer cycles lost to r/w stalls in 2-region version
[Figure: moss - stalls, in megacycles (axis 0 to 1200), 1-region vs. 2-region version]
[Figure: moss - time, in seconds (axis 0 to 25), 1-region vs. 2-region version]
Region Expressiveness
Adds some structure to memory management
Few regions:
  easier to keep track of
  delay freeing to convenient "group" time (e.g., end of an iteration, closing a device, etc.)
No need to write "free this data structure" functions
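As an illustration of the last two points, the sketch below builds a binary search tree inside a region on every iteration and drops it with a single call at the end of that iteration. The region here is a deliberately tiny stand-in (one fixed 64 KB buffer, an assumption of this example), and `struct tree`, `insert`, `count`, and `run_iterations` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct tree { int key; struct tree *left, *right; };

/* Hypothetical minimal region: one fixed-capacity buffer, no growth. */
typedef struct region {
    size_t used;
    _Alignas(max_align_t) unsigned char mem[1 << 16];
} *Region;

Region newregion(void) { return calloc(1, sizeof(struct region)); }

void *ralloc(Region r, size_t n) {
    size_t a = _Alignof(max_align_t);
    if (n > sizeof r->mem) return NULL;
    n = (n + a - 1) / a * a;                    /* round up to the alignment */
    if (n > sizeof r->mem - r->used) return NULL;
    void *p = r->mem + r->used;
    r->used += n;
    return p;
}

void deleteregion(Region r) { free(r); }

/* Insert key into a binary search tree whose nodes all live in r. */
struct tree *insert(Region r, struct tree *t, int key) {
    if (!t) {
        t = ralloc(r, sizeof *t);
        if (t) { t->key = key; t->left = t->right = NULL; }
        return t;
    }
    if (key < t->key) t->left = insert(r, t->left, key);
    else t->right = insert(r, t->right, key);
    return t;
}

int count(const struct tree *t) {
    return t ? 1 + count(t->left) + count(t->right) : 0;
}

/* Each iteration owns one region; returns the total number of nodes built,
   or -1 if a region could not be created. */
int run_iterations(int iterations, int keys_per_tree) {
    int total = 0;
    for (int it = 0; it < iterations; it++) {
        Region r = newregion();
        if (!r) return -1;
        struct tree *t = NULL;
        for (int k = 0; k < keys_per_tree; k++)
            t = insert(r, t, (k * 7) % keys_per_tree);
        total += count(t);
        deleteregion(r);   /* "group" time: end of the iteration */
    }
    return total;
}
```

With malloc/free the same program would need a recursive tree-freeing function and a call to it on every exit path; here the tree's lifetime is tied to the region's.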
Region Static Checking: Region Type Systems
Basic idea: name regions in types
A simple region type system:
  τ = int | region @ ρ | <τ1, …, τn> @ ρ | τ → τ' | ∀ρ.τ
  ρ: region variables
Example: ∀ρ.(<int, int> @ ρ → int)
Region Static Checking: Tofte & Talpin
Regions follow stack discipline
letregion ρ in e:
  allocate a region named ρ
  evaluate e (can use ρ)
  delete region ρ
safe if:
  (region) type of e does not use ρ
  ρ is not free in the letregion's environment
deallocation of regions is required...
problem: pure stack discipline too restrictive ("leaks")
  Aiken, Fähndrich, Levien: allocate late, deallocate early
  Tofte & Talpin: other optimisations
Region Static Checking: Capabilities
Crary, Walker, Morrisett: capabilities available at each program point:
  ρ^1: read objects in ρ, allocate in ρ, freergn ρ
    guarantee: no other regions alias ρ (so freergn is safe)
  ρ^+: read objects in ρ, allocate in ρ
  ρ^1 < ρ^+ (capability "subtyping")
capabilities threaded through the program:
  newrgn adds ρ^1 to the current capabilities
  freergn removes ρ^1 from the current capabilities
  function calls can temporarily "lose" capabilities (but recoverable on return)
no capabilities allowed at exit: deallocation of regions is required
CWM Example
Static Checking Limitations
Some types are not expressible: list of regions
Ease of programming is unknown
No clear bounds on memory usage
Region Dynamic Checking: RC
Features of RC:
  region-based allocation: newregion/deleteregion/ralloc
  safety via reference-counting (RC):
    RC(region r) = number of references to objects in r from outside r
    deleteregion(r) fails if RC(r) > 0
  type annotations to describe program's region structure
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
[Figure: variables a and b; region r with RC = 0]
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
b = rcons(r, 77, null);
[Figure: variables a and b; region r holding cell 77, with RC = 1]
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
b = rcons(r, 77, null);
a = rcons(r, 23, b);
[Figure: variables a and b; region r holding cells 23 and 77, with RC = 2]
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
b = rcons(r, 77, null);
a = rcons(r, 23, b);
b->next = a;
[Figure: variables a and b; region r holding cells 23 and 77, with RC = 2]
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
b = rcons(r, 77, null);
a = rcons(r, 23, b);
b->next = a;
a = b = null;
[Figure: variables a and b; region r holding cells 23 and 77, with RC = 0]
Example
struct list { int i; struct list @next; } *a, *b;
Region r = newregion();
b = rcons(r, 77, null);
a = rcons(r, 23, b);
b->next = a;
a = b = null;
deleteregion(r);
[Figure: variables a and b]
Region advantages (over regular RC):
good for cyclic structures
space cost of RCs is negligible
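The walkthrough in the Example slides can be replayed in plain C with the count maintained by hand. This is only a simulation under stated assumptions: the region is a fixed pool of eight cells, `assign` is a hypothetical helper standing in for the count updates the RC compiler would generate, every non-null pointer is assumed to point into the single region `r`, and `walkthrough` is an illustrative driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct list { int i; struct list *next; };

/* Hypothetical region: a fixed pool of cells plus one reference count. */
typedef struct region {
    struct list cells[8];
    int ncells;
    int rc;   /* references to this region's objects from outside it */
} *Region;

Region newregion(void) { return calloc(1, sizeof(struct region)); }

/* Allocate a cons cell inside r. The next pointer stays within r,
   so storing it does not change the count. */
struct list *rcons(Region r, int i, struct list *next) {
    if (r->ncells == 8) return NULL;
    struct list *c = &r->cells[r->ncells++];
    c->i = i;
    c->next = next;
    return c;
}

/* Assignment to a variable that lives outside r: dropping a pointer into r
   decrements RC(r), storing one increments it. */
void assign(Region r, struct list **var, struct list *val) {
    if (*var) r->rc--;
    if (val) r->rc++;
    *var = val;
}

/* Fails (returns false) while outside references remain. */
bool deleteregion(Region r) {
    if (r->rc > 0) return false;
    free(r);
    return true;
}

/* Replays the slides; rc_after[k] records RC(r) after step k. */
bool walkthrough(int rc_after[5]) {
    struct list *a = NULL, *b = NULL;
    Region r = newregion();
    if (!r) return false;
    rc_after[0] = r->rc;
    assign(r, &b, rcons(r, 77, NULL));
    rc_after[1] = r->rc;
    assign(r, &a, rcons(r, 23, b));
    rc_after[2] = r->rc;
    b->next = a;                     /* cycle inside r: no count changes */
    rc_after[3] = r->rc;
    bool early = deleteregion(r);    /* refused: a and b still point into r */
    assign(r, &a, NULL);
    assign(r, &b, NULL);
    rc_after[4] = r->rc;
    return !early && deleteregion(r);
}
```

The cycle between the two cells never touches the count, which is why cyclic structures inside one region pose no problem, and the only space cost is one integer per region.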
RC: Type annotations
User view:
  int *traditional x: "traditional" C pointer (not to region)
  struct list { int i; struct list *sameregion next; }: pointer within same region
Abstract view (ignoring issues with null):
  region type system (like for static systems) with addition of existential types (τ = … | ∃ρ.τ) and runtime checks
  anylist = λρ.<int, ∃ρ1.anylist[ρ1]> @ ρ
  list = λρ.<int, list[ρ]> @ ρ
  runtime check that two region variables are identical: chk ρ1 = ρ2
RC: Implementation
Compiles to C
Most RC updates for local variables are avoided
Assignments to fields and globals produce obvious RC updates (16-23 inst. cost)
Deleting a region is expensive (scan)
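A rough sketch of what a field-assignment update and the deletion scan might look like is given below. It is not the code RC generates: here each object carries a header naming its region and is allocated with `calloc` (both assumptions of the example), only field writes are counted, and `struct node`, `rnode`, and `set_next` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct region *Region;

/* Every object records its region in a header (an assumption of this sketch). */
struct node {
    Region home;
    struct node *all;    /* links all nodes of a region, for the deletion scan */
    struct node *next;   /* the one pointer field subject to counting */
    int value;
};

struct region {
    int rc;              /* references held by objects in other regions */
    struct node *nodes;
};

Region newregion(void) { return calloc(1, sizeof(struct region)); }

struct node *rnode(Region r, int value) {
    struct node *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    n->home = r;
    n->value = value;
    n->all = r->nodes;
    r->nodes = n;
    return n;
}

/* Field write with count update: only pointers that cross a region
   boundary are counted, and nothing happens if the region is unchanged. */
void set_next(struct node *n, struct node *val) {
    Region oldr = n->next ? n->next->home : NULL;
    Region newr = val ? val->home : NULL;
    if (oldr != newr) {
        if (oldr && oldr != n->home) oldr->rc--;
        if (newr && newr != n->home) newr->rc++;
    }
    n->next = val;
}

bool deleteregion(Region r) {
    if (r->rc > 0) return false;
    /* The scan: drop every reference r's objects hold into other regions. */
    for (struct node *n = r->nodes; n; n = n->all)
        set_next(n, NULL);
    while (r->nodes) {
        struct node *dead = r->nodes;
        r->nodes = dead->all;
        free(dead);
    }
    free(r);
    return true;
}
```

The scan in `deleteregion` visits every object of the region, which is why deletion is the expensive operation, while a same-region store in `set_next` costs only the comparisons.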
RC: Experiments
Machine: 333 MHz UltraSparc I, Solaris 2.7
Benchmarks: 8 medium to large C programs
Regions vs malloc/free
C compiler: gcc 2.95
Measurements with UltraSparc internal counters
The Benchmarks
Eight C programs:
  cfrac: factorise large integers
  gröbner: find the Gröbner basis of a set of polynomials
  mudlle: byte-code compiler
  lcc: the lcc compiler
  tile: partitions text files based on word frequency
  moss: software plagiarism detector
  rc: RC compiler
  apache: apache web server
Results: Ease of Use (from old implementation)
Size of substantive changes:
  cfrac: 18 of 4203 lines
  gröbner: 111 of 3219 lines
  mudlle: 22 of 5078 lines
  lcc: 349 of 12430 lines
  tile: 10 of 926 lines
  moss: 4 of 2675 lines
Types of changes:
  extra copying
  clear unused references
  work around prototype limitations
Results: Execution Time
[Figure: execution time in seconds (axis 0 to 8), malloc vs. RC]
Results: Safety overhead
[Figure: RC overhead as a percentage of execution time (axis 0 to 25%)]
Dynamic Checking Limitations
Runtime overhead: 0-20%
Must clear dangling references
Small number of objects/region is bad:
  RC more painful
  space & time overhead