WORLDWARECONFERENCE
Talking Trash
Michael Labriola, Senior Consultant, Digital Primates
@mlabriola
Who am I?
Michael Labriola, Senior Consultant, Digital Primates
Client-side architect specializing in Adobe Flex
– Lead architect and developer of FlexUnit 4
– Benevolent Dictator of the Open Spoon Project
Team mentor
Architect of applications deployed worldwide
Fan of disassembling things
What are we going to cover?
Memory in Flash Player
Garbage Collection as it exists today
A tiny bit of coming soon
Some wild speculation on what’s next
Disclaimer
I am going to tell you lies.
Even at this depth, we will need to gloss over some details and treat them in a simplified manner to make it through.
Further, most of this has been learned by reading code from the Tamarin project. It is likely the same as in Flash Player, but I can’t know for sure.
Memories
You have some choices in (programming) life.
• You can take full responsibility for asking for memory and returning it when you are done.
• You can employ another entity to do that work for you.
• You can do a combination of those two, but seriously, let’s keep this simple.
What is GC
GC (Garbage Collection) is a way to automatically manage memory.
As an oversimplification let’s think about it this way —
GC proxies our access to memory:
[Diagram: Program Code ↔ GC ↔ Memory]
Why Proxy?
By proxying our access to the memory our program needs, GC can keep track of what objects are using memory. This removes the responsibility of remembering when and where to free (deallocate) memory.
To understand what GC is trying to do for you, let’s look at what can go wrong if you try to do it manually.
Dangling References
If you manually allocate and free memory you can cause dangling references. What happens if the memory below does not exist when it is accessed?
[Diagram: p1 and p2 both point at the same block of Memory]
    memory = allocate( size );
    p1 = memory;
    p2 = memory;
    free( memory );

    trace( p1.toString() );
    trace( p2.toString() );
Double Free
What happens if you accidentally try to free the same memory twice? What if it has already been reused?
[Diagram: the freed block has been reused as Memory2]
    memory1 = allocate( size );
    free( memory1 );

    memory2 = allocate( size );
    free( memory1 );
Real Leaks
What happens if you don’t free the memory before you lose access to it?
[Diagram: p1 now points at Memory2; nothing points at Memory1]
    p1 = allocate( size );
    p2 = allocate( size );
    p1 = p2;
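The three hazards above (dangling reference, double free, leak) can be reproduced in a toy model. This is only a sketch: `ToyMemory` and its methods are hypothetical names invented for illustration, and "memory" is just an array of slots, not how any real allocator works.

```typescript
// Toy manual allocator (hypothetical): a block is one slot in an array.
type Address = number;

class ToyMemory {
  private slots: (string | null)[] = []; // null means "free"
  private freeList: Address[] = [];

  allocate(contents: string): Address {
    const reused = this.freeList.pop();
    if (reused !== undefined) {
      this.slots[reused] = contents; // freed slots are handed out again
      return reused;
    }
    this.slots.push(contents);
    return this.slots.length - 1;
  }

  free(address: Address): void {
    this.slots[address] = null;
    this.freeList.push(address);
  }

  read(address: Address): string | null {
    return this.slots[address] ?? null;
  }

  liveCount(): number {
    return this.slots.filter((s) => s !== null).length;
  }
}

const mem = new ToyMemory();

// Dangling reference: p1 still holds the address after the block is freed.
const p1 = mem.allocate("first");
mem.free(p1);
const danglingRead = mem.read(p1); // the block is gone

// Double free: the slot is reused, then freed again through the stale address.
const memory2 = mem.allocate("second"); // reuses the slot p1 pointed at
mem.free(p1); // the stale free destroys "second"
const clobbered = mem.read(memory2);

// Leak: the only copy of an address is overwritten before free() is called.
let q1 = mem.allocate("leaked");
const q2 = mem.allocate("kept");
q1 = q2; // nothing can free "leaked" any more
const sameAddress = q1 === q2;
const stillLive = mem.liveCount(); // "leaked" is still counted as live
```

In every case the program logic, not the allocator, is at fault, which is exactly the bookkeeping GC takes over.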
GC Instead
GC takes care of these situations by determining when a chunk of memory is no longer accessible and freeing it.
This is helpful, but GC will never understand your program logic, meaning that it will need to do more work to determine things you already know (for example, that you no longer need a chunk of memory).
So, GC has advantages but will also necessarily be slower and more complicated.
GC How
The rest of this presentation is about that complication.
It is about how GC decides when and where to allocate and free memory and all of the overhead and processes that go into that decision.
Allocation
Flash Player uses a page allocator (GCHeap) to allocate large chunks from system memory (megabytes). In turn GCHeap gives 4k chunks (pages) to the GC memory manager as needed.
[Diagram: System Memory → GCHeap → 4k pages → GC]
Pages
Those 4k pages are then used by GC to provide memory for many objects in the system.
[Diagram: one 4k page held by GC, containing Object A, Object B, and Object C]
Much
The memory we are discussing right now is called heap memory. It is responsible for *much* but not all of the storage in the system.
For our purposes, there are two types of memory we care about: heap and stack.
Stack
A stack is a data structure (think of a pile of papers on your desk) that is managed by two operations: pushing and popping. Pushing means adding something to the top of the stack; popping means removing the top item.
[Diagram: a stack holding Item 1 and Item 2, then the same stack with Item 3 pushed on top]
In this data structure, the only way to get to something in the middle is to remove the things above it.
Altering the Stack
Now imagine removing (popping) item 3 and adding (pushing) a new item 4. In Flash Player we have a memory structure like this, affectionately called ‘the stack.’
[Diagram: three stacks in sequence: Items 1 and 2; Items 1, 2, and 3; Items 1, 2, and 4]
Local Variable Declaration
    function doThing() {
        var a:int;
        var b:int;
        var c:Number;
    }
[Diagram: stack holding a, b, and c]
As you declare local method variables in ActionScript, they are pushed onto the stack.
Method Calls
    function doThing() {
        var a:int;
        var b:int;
        var c:Number;

        someMethod();
    }

    function someMethod() {
        var x:int;
    }
[Diagram: stack holding a, b, c, and x]
As you call methods, each method’s local variables are pushed onto the stack as well.
Method Calls
    function doThing() {
        var a:int;
        var b:int;
        var c:Number;

        someMethod();
    }

    function someMethod() {
        var x:int;
    }
[Diagram: stack holding a, b, c, the return information for doThing, and x]
There is also other information contained on the stack about where to return when this method is complete.
Method Calls
    function doThing() {
        var a:int;
        var b:int;
        var c:Number;
        someMethod1();
        someMethod2();
    }

    function someMethod1() {
        var x:int;
    }

    function someMethod2() {
        var y:int;
    }
[Diagram: stack holding a, b, c, the return information for doThing, and y in the slot x used earlier]
Stack memory is frequently reused as items are pushed onto and popped off the stack.
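The reuse of stack memory across calls can be sketched with a single array. This is a simplified model, not Flash Player's actual stack layout: `stackSlots` and `callMethod` are hypothetical names, and each frame is modeled as a return-information entry followed by the method's locals.

```typescript
// Hypothetical model of stack memory: one array, a frame is a run of slots.
const stackSlots: string[] = [];
const snapshots: string[][] = [];

function callMethod(method: string, locals: string[], body: () => void): void {
  const frameStart = stackSlots.length;
  stackSlots.push("return info for " + method); // where to resume afterwards
  for (const local of locals) {
    stackSlots.push(method + "." + local); // locals are pushed in order
  }
  body();
  stackSlots.length = frameStart; // unwind: the slots become reusable
}

callMethod("doThing", ["a", "b", "c"], () => {
  callMethod("someMethod1", ["x"], () => {
    snapshots.push(stackSlots.slice()); // stack while someMethod1 runs
  });
  callMethod("someMethod2", ["y"], () => {
    snapshots.push(stackSlots.slice()); // stack while someMethod2 runs
  });
});

// The slot that held x during someMethod1 holds y during someMethod2.
const slotDuringFirstCall = snapshots[0][5];
const slotDuringSecondCall = snapshots[1][5];
```

Because unwinding is just moving the top of the stack back, freeing stack memory costs almost nothing, which matters later when reference counting enters the picture.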
Instances
    function doThing() {
        var a:int;
        var o:Object;

        o = new Object();
    }
[Diagram: Stack Memory holds a and o; o refers to Object A in Heap Memory]
References to complex objects may also exist on the stack, but memory allocated for those objects comes from the heap.
Instances
    function doThing() {
        var a:int;
        var o1:Object;

        o1 = new Object();
        doIt( o1 );
    }

    function doIt( o2 ) {
        var b:int;
    }
[Diagram: Stack Memory holds a and o1, then the doIt frame with o2 and b; Object A lives in Heap Memory]
Arguments passed to methods are also pushed into stack memory.
Instances
    function doThing() {
        var a:int;
        var o1:Object;

        o1 = new Object();
        doIt( o1 );
    }

    function doIt( o2 ) {
        var b:int;
    }
[Diagram: Stack Memory holds only a and o1 again; Object A remains in Heap Memory]
When a method returns, the stack is unwound so that the memory can be reused.
Freeing
Now that we have memory and it’s all in use, we need to find a way to deallocate (free) it.
The premise behind GC is that we only free memory that is no longer needed. So how do we figure out which things are no longer needed?
First, some terms.
Roots
In traditional programming models, you manage your own memory. Let’s call this unmanaged memory.
In a managed memory model, we ask the system to manage the lifetime (birth through death) of our memory.
The first chunk of managed memory asked for from unmanaged memory is called a root, or a GCRoot.
The garbage collector is aware of all roots. They serve as a starting point for things we are about to cover.
Hybrid
The current Flash Player uses a conservative mark and sweep garbage collector with deferred reference counting.
Let’s take those in reverse.
Reference Counting
Reference counting is a very easy-to-understand way of keeping track of objects that can be removed from the system.
Imagine each object has an extra property called refCount that keeps track of the number of references to that object.
Reference Counting
Looking at the following code, how many references does the object assigned to a have? How many does b have? For our purposes, assume a and b are global variables.
    a = new ObjectA();
    b = new ObjectB();
    b.prop = a;
[Diagram: a refers to ObjectA, b refers to ObjectB, and ObjectB refers to ObjectA]
ObjectA: 2
ObjectB: 1
Reference Counting
What if we set a to null? Can anything be collected?
    a = new ObjectA();
    b = new ObjectB();
    b.prop = a;
    a = null;
[Diagram: b refers to ObjectB, and ObjectB refers to ObjectA]
ObjectA: 1
ObjectB: 1
Reference Counting
What if instead we set b to null? Can anything be collected?
    a = new ObjectA();
    b = new ObjectB();
    b.prop = a;
    b = null;
[Diagram: a refers to ObjectA, and ObjectB refers to ObjectA; nothing refers to ObjectB]
ObjectA: 2
ObjectB: 0
Reference Counting
What if instead we set a and b to null? Notice how this will eventually cascade?
    a = new ObjectA();
    b = new ObjectB();
    b.prop = a;
    a = b = null;
[Diagram: ObjectB refers to ObjectA; nothing refers to ObjectB]
ObjectA: 1
ObjectB: 0
Circular Reference
What if a also references b? Can these ever be collected via this technique?
    a = new ObjectA();
    b = new ObjectB();
    b.prop = a;
    a.prop = b;
    a = b = null;
[Diagram: ObjectA and ObjectB refer to each other; nothing else refers to either]
ObjectA: 1
ObjectB: 1
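The counting in the last few slides can be checked with a small simulation. This is a sketch of plain (non-deferred) reference counting; `Counted`, `retain`, `release`, and `setProp` are hypothetical helpers, and setting a global to null is modeled as a `release` call.

```typescript
// Hypothetical reference-counted object; refCount and prop mirror the slides.
class Counted {
  label: string;
  refCount = 0;
  prop: Counted | null = null;
  destroyed = false;
  constructor(label: string) {
    this.label = label;
  }
}

function retain(target: Counted | null): void {
  if (target !== null) target.refCount++;
}

function release(target: Counted | null): void {
  if (target === null) return;
  target.refCount--;
  if (target.refCount === 0) {
    target.destroyed = true;
    release(target.prop); // cascade: a dying object drops what it held
    target.prop = null;
  }
}

function setProp(owner: Counted, value: Counted | null): void {
  retain(value);
  release(owner.prop);
  owner.prop = value;
}

// b.prop = a; a = b = null;  -> the release cascades through both objects.
const objA = new Counted("ObjectA");
const objB = new Counted("ObjectB");
retain(objA); // global a
retain(objB); // global b
setProp(objB, objA); // ObjectA: 2, ObjectB: 1
release(objA); // a = null -> ObjectA: 1
release(objB); // b = null -> ObjectB: 0, which then releases ObjectA
const cascadeFreedBoth = objA.destroyed && objB.destroyed;

// b.prop = a; a.prop = b; a = b = null;  -> the cycle keeps both counts at 1.
const cycA = new Counted("ObjectA");
const cycB = new Counted("ObjectB");
retain(cycA);
retain(cycB);
setProp(cycB, cycA);
setProp(cycA, cycB);
release(cycA);
release(cycB);
const cycleLeaked = !cycA.destroyed && !cycB.destroyed;
```

The second scenario ends with both counts stuck at 1 even though no variable can reach either object.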
Reference Counting Issues
Obviously the circular reference issue is a problem for reference counting, but there are some others too.
It takes storage to keep the reference count.
It takes time to keep updating the reference count.
This is particularly cumbersome and adds overhead to items that are created and destroyed quickly (i.e., items on the stack).
Frequent Allocations
To address the issue we simply don’t reference count items on the stack.
That reduces the amount of work significantly, but…
Missing Stack Count
If x doesn’t count as a reference to ObjectA, then how many references does ObjectA have? What would happen if ObjectA disappeared before the trace() statement?
    a = new ObjectA();

    var x = a;
    a = null;
    trace( x.someProp );
[Diagram: Stack Memory holds the method frame with x, which still refers to ObjectA]
ObjectA: 0
Zero Count Table
Instead of immediately destroying ObjectA when its reference count reaches 0, it is added to a Zero Count Table (ZCT).
    a = new ObjectA();

    var x = a;
    a = null;
    trace( x );
[Diagram: the Zero Count Table lists ObjectA; Stack Memory holds the method frame with x]
ObjectA: 0
Zero Count Table
When the ZCT is full, it can be reaped to destroy objects. If an object made it to the ZCT, then the only references that could remain are on the stack. So, the stack is scanned. Any objects without a stack reference are destroyed.
[Diagram: the Zero Count Table lists ObjectA, ObjectB, and ObjectC; Stack Memory holds the method frame with x]
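A rough sketch of the Zero Count Table idea follows. It is deliberately simplified and is not Tamarin's implementation: `Tracked`, `zeroCountTable`, `stackRefs`, `dropHeapRef`, and `reapZct` are hypothetical names, the stack is a plain array, and the table is reaped on demand rather than when it fills up.

```typescript
class Tracked {
  label: string;
  heapRefs = 0; // counts references from globals and heap objects only
  destroyed = false;
  constructor(label: string) {
    this.label = label;
  }
}

const zeroCountTable: Tracked[] = [];
const stackRefs: Tracked[] = []; // stack slots: never reference counted

function dropHeapRef(target: Tracked): void {
  target.heapRefs--;
  if (target.heapRefs === 0) zeroCountTable.push(target); // defer, don't destroy
}

function reapZct(): string[] {
  const reaped: string[] = [];
  const pending = zeroCountTable.splice(0, zeroCountTable.length);
  for (const candidate of pending) {
    if (candidate.heapRefs > 0) continue; // re-referenced since it was queued
    if (stackRefs.indexOf(candidate) !== -1) {
      zeroCountTable.push(candidate); // the stack still uses it: keep for later
      continue;
    }
    candidate.destroyed = true;
    reaped.push(candidate.label);
  }
  return reaped;
}

const tracked = new Tracked("ObjectA");
tracked.heapRefs = 1; // a = new ObjectA();
stackRefs.push(tracked); // var x = a;   (not counted)
dropHeapRef(tracked); // a = null;    count hits 0, object goes to the ZCT
const firstReap = reapZct(); // x still refers to ObjectA, so nothing is reaped
stackRefs.pop(); // the method returns and the stack is unwound
const secondReap = reapZct(); // now ObjectA can be destroyed
```

The trace() call in the slide is safe because the first reap finds x on the stack and leaves ObjectA alone.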
Back to Circles
The combination of those techniques is called Deferred Reference Counting.
Which is great, but we still have circular reference problems. This is a huge problem as we can never collect certain objects (think about parent/child relationships of XML).
So, we need another technique to handle these cases.
Mark and Sweep
The other technique used by the Flash Player garbage collector is called mark and sweep.
Each managed object in the system has an extra bit called a mark bit.
Marking
You start with a tree of Objects. In this case Object A is a GC root.
[Diagram: a tree of objects with Object A at the root]
Marking
From Object A all paths are followed, marking each child that is encountered.
[Diagram: the same tree, with each object reachable from Object A marked]
Remainder
Anything that is not marked is not reachable. That means it can be discarded.
[Diagram: the same tree, showing the objects that were never marked]
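Marking and sweeping can be sketched in a few lines. This is a minimal, non-incremental model with one root; `HeapObject`, `mark`, and `sweep` are hypothetical names, and the mark bit is an ordinary boolean field.

```typescript
class HeapObject {
  label: string;
  marked = false; // the extra mark bit
  children: HeapObject[] = [];
  constructor(label: string) {
    this.label = label;
  }
}

// Follow every path from the root, marking each object encountered.
function mark(start: HeapObject): void {
  const pending: HeapObject[] = [start];
  while (pending.length > 0) {
    const current = pending.pop() as HeapObject;
    if (current.marked) continue;
    current.marked = true;
    for (const child of current.children) pending.push(child);
  }
}

// Keep only marked objects, clearing the bit for the next cycle.
function sweep(heap: HeapObject[]): HeapObject[] {
  const survivors = heap.filter((o) => o.marked);
  for (const o of survivors) o.marked = false;
  return survivors;
}

const treeRoot = new HeapObject("Object A");
const childB = new HeapObject("B");
const childC = new HeapObject("C");
const grandD = new HeapObject("D");
const orphanE = new HeapObject("E");
const orphanF = new HeapObject("F");
treeRoot.children.push(childB, childC);
childB.children.push(grandD);
orphanE.children.push(orphanF); // E and F reference each other in a cycle
orphanF.children.push(orphanE); // but nothing reachable refers to them

const managedHeap = [treeRoot, childB, childC, grandD, orphanE, orphanF];
mark(treeRoot);
const survivorLabels = sweep(managedHeap).map((o) => o.label);
```

E and F are discarded even though their reference counts would never reach zero, which is why this technique complements reference counting.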
Tangent – Weak References
Many of you may have heard of weak references and have heard advice such as:
“When adding an event listener to an object, you should make the reference weak”
myObj.addEventListener(type, listener,
useCapture, priority, useWeakReference);
What does this argument actually do?
Tangent – Weak References
Additionally, the Dictionary class can be constructed to use weak keys.
var d:Dictionary = new Dictionary( true ); // constructor: Dictionary(weakKeys:Boolean = false)
d[ someObject ] = 5;
Note, only the keys can be made weak in the Dictionary.
So, again, what does this argument actually do?
Weak Reference
I like to think of it as adding a path that cannot be travelled by the marking. If the object has no other references, it can be collected.
[Diagram: the object tree, with a weak reference to one of the objects]
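The "path that cannot be travelled" picture can be modeled by giving each object two kinds of outgoing references and letting the marker follow only one of them. This is an illustrative sketch, not how Flash Player stores weak references; `GraphObject`, `markStrongOnly`, and `clearDeadWeakRefs` are hypothetical names.

```typescript
class GraphObject {
  label: string;
  marked = false;
  strong: GraphObject[] = [];
  weak: GraphObject[] = []; // paths the marker refuses to travel
  constructor(label: string) {
    this.label = label;
  }
}

function markStrongOnly(start: GraphObject): void {
  if (start.marked) return;
  start.marked = true;
  for (const next of start.strong) markStrongOnly(next);
}

// After marking, weak references to unmarked objects are simply dropped.
function clearDeadWeakRefs(holder: GraphObject): void {
  holder.weak = holder.weak.filter((target) => target.marked);
}

const rootObj = new GraphObject("root");
const dispatcher = new GraphObject("dispatcher");
const strongListener = new GraphObject("strongListener");
const weakListener = new GraphObject("weakListener");
const sharedListener = new GraphObject("sharedListener");

rootObj.strong.push(dispatcher, sharedListener);
dispatcher.strong.push(strongListener); // listener added with a strong reference
dispatcher.weak.push(weakListener, sharedListener); // added with weak references

markStrongOnly(rootObj);
clearDeadWeakRefs(dispatcher);
const remainingWeak = dispatcher.weak.map((o) => o.label);
```

`weakListener` has no other path from the root, so it is collectable; `sharedListener` survives because the root also holds it strongly.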
Tangent – Weak References
So, should you always use weak references?
Absolutely not.
Understand when and where these are needed, then use them appropriately.
What are some appropriate times? What are the alternatives?
Back on Track
For some strange reason, people don’t like their application to pause noticeably while garbage collection runs.
Unfortunately, the process of walking an entire application and marking all of the objects takes serious time.
So, how do you do that and not make things pause? Well, don’t do it all at once, of course.
Work Queue
So, we make a work queue. We push all of the roots onto the work queue and we process. When the queue is empty we are done.
[Diagram: the work queue holding Object A and Object B]
Hello Complexity
Not doing all the marking at once introduces serious complexity. Imagine your code adds a child to a parent after the parent has already been marked. The child would never be marked and would eventually be collected.
[Diagram: Object A at the root with a Parent object beneath it; Child is attached to Parent]
Tri-Color
To handle the complexity of marking incrementally, Flash Player uses something called the tri-color algorithm.
Every object has 3 possible states:
Black objects are marked and no longer in the work queue.
Gray objects are in the queue, but not yet marked.
White objects are not in the queue and not marked.
Work Queue Start
We start by putting all the roots on the queue. This makes them gray.
[Diagram: the work queue holding Object A and Object B, both gray]
Queue Progress
As the queue is processed, gray becomes black, white becomes gray.
[Diagram: the work queue now holds only Object B]
New Addition
In most cases, new objects can be added at will, but we have a problem when a white object is added to a black object. It will never be marked.
[Diagram: Object B and objects O1 through O4, with a white object attached to a black one; the work queue holds Object B]
Added to the Queue
So, whenever a white object is added to black, it immediately gets added to the work queue. If you are interested, this is accomplished via a write barrier.
[Diagram: the same objects; the newly attached object has been placed on the work queue]
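Here is a sketch of incremental tri-color marking with and without a write barrier. It is a toy model under stated assumptions: `Cell`, `shade`, `markStep`, `addChild`, and `lateChildColor` are hypothetical names, one `markStep` call stands for one slice of incremental work, and the barrier is an explicit flag rather than something the runtime inserts for you.

```typescript
type Color = "white" | "gray" | "black";

class Cell {
  label: string;
  color: Color = "white";
  children: Cell[] = [];
  constructor(label: string) {
    this.label = label;
  }
}

const workQueue: Cell[] = [];

// White -> gray: put the object on the work queue.
function shade(cell: Cell): void {
  if (cell.color === "white") {
    cell.color = "gray";
    workQueue.push(cell);
  }
}

// One increment of marking: gray becomes black, its white children become gray.
function markStep(): boolean {
  const cell = workQueue.shift();
  if (cell === undefined) return false;
  for (const child of cell.children) shade(child);
  cell.color = "black";
  return true;
}

// Program code storing a reference; the write barrier is the extra check.
function addChild(parentCell: Cell, child: Cell, useBarrier: boolean): void {
  parentCell.children.push(child);
  if (useBarrier && parentCell.color === "black") shade(child);
}

function lateChildColor(useBarrier: boolean): Color {
  const rootCell = new Cell("Object A");
  const parentCell = new Cell("Parent");
  const lateChild = new Cell("Child");
  rootCell.children.push(parentCell);
  shade(rootCell); // roots go on the queue first
  markStep(); // root turns black, Parent turns gray
  markStep(); // Parent turns black
  addChild(parentCell, lateChild, useBarrier); // program runs between steps
  while (markStep()) {
    // finish the remaining marking work
  }
  return lateChild.color;
}

const withoutBarrier = lateChildColor(false);
const withBarrier = lateChildColor(true);
```

Without the barrier the child finishes the cycle white and would be swept while still in use; with it, the child is queued and ends up black.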
Being Conservative
So far that explains most of the phrase conservative mark and sweep garbage collector with deferred reference counting.
What’s missing is the explanation of conservative.
In this case, conservative means we will not free something unless we can definitively tell it should be freed.
Pointer or Integer
When the garbage collector comes across certain values it is difficult to tell if the value is a memory address (a pointer to an object) or just a numeric value. So, GC will not collect an object that *may* be referenced.
For example:
0x00800A30 – Could represent 8,391,216
0x00801F37 – May represent an object in memory
0x00802FFE – Some bytes in a bitmap
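A conservative scan can be sketched as "any word that lands inside a live object keeps that object alive." The layout below is invented for illustration: the `Block` shape, the object addresses and sizes, and `conservativelyRetained` are all assumptions, with only the three scanned values taken from the example above.

```typescript
interface Block {
  start: number;
  size: number;
  label: string;
}

// Any scanned word that falls inside a block is treated as a possible pointer.
function conservativelyRetained(scannedWords: number[], blocks: Block[]): string[] {
  const retained: string[] = [];
  for (const block of blocks) {
    const hit = scannedWords.some(
      (word) => word >= block.start && word < block.start + block.size
    );
    if (hit) retained.push(block.label);
  }
  return retained;
}

// Hypothetical heap contents.
const liveBlocks: Block[] = [
  { start: 0x00800a00, size: 0x100, label: "ObjectA" },
  { start: 0x00801f00, size: 0x100, label: "ObjectB" },
  { start: 0x00803000, size: 0x100, label: "ObjectC" },
];

const scannedWords = [
  8391216, // really just an integer, but equal to 0x00800A30
  0x00801f37, // a genuine reference into ObjectB
  0x00802ffe, // bitmap bytes that happen to miss every object
];

const retainedLabels = conservativelyRetained(scannedWords, liveBlocks);
```

ObjectA is retained only because an unrelated integer happens to look like an address inside it, which is the small leak a conservative collector accepts.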
Minor Leaks
This can mean that, on occasion, an object may not be collected because there is an integer somewhere that may, possibly, be interpreted as pointing to it.
GC would rather take that risk than remove something you could be using.
This type of leak is minimal and not likely to cause much of an issue. The item to note is that we don’t know exactly where objects in memory may exist; we have to guess. More on that later.
Moving On
So, now you have a good sense of what GC is doing to decide when your memory is collected.
The two remaining questions that always come up are:
1. Can I control GC?
2. Since my system memory didn’t go down when I did X, Y or Z, I must have a leak, right?
Controlling GC
Leaving aside hacky approaches for the moment, you have some minimal ability to influence and control GC (depending on your runtime version).
class System {
public static function gc():void;
}
In Flash Player this does nothing. Absolutely nothing.
In AIR this forces either a mark or sweep to run. Not immediately, but later this frame.
Imminence
As you know, it takes a while to mark all of the objects in the system. This is done incrementally.
It also takes time to reclaim that memory. Doing so synchronously can make your application pause.
The term imminence is a measurement of how far the collector is through marking all of the objects, and hence, how close it is to creating that pause.
[Diagram: an axis labeled Imminence, increasing from 0 to 1; Marking runs along the axis and ends in the Pause]
Something Incubating
The Flash Player Incubator introduces a new function:
class System {
public static function pauseForGCIfCollectionImminent (imminence:Number=.75):void;
}
The value of imminence is clamped between .25 and 1.
Using this API you can advise Flash Player about good times to collect.
Pause Now
When you call pauseForGCIfCollectionImminent() the value you pass is compared against the collector’s current imminence.
Effectively, if the collector’s imminence is greater than the value you provide, the collector will finish marking and sweeping synchronously.
This will result in an application pause.
[Diagram: an axis labeled Imminence, increasing from 0 to 1; Marking runs along the axis and ends in the Pause]
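The comparison described above can be written down as a tiny model. This only restates the slides and is not the player's code: `clampImminence` and `shouldPauseNow` are hypothetical helpers, and the clamping range of .25 to 1 is the one quoted in this deck.

```typescript
// Clamp the requested value to the range described in the deck.
function clampImminence(requested: number): number {
  return Math.min(1, Math.max(0.25, requested));
}

// True when the collector would finish marking and sweeping synchronously.
function shouldPauseNow(collectorImminence: number, requested: number): boolean {
  return collectorImminence > clampImminence(requested);
}

// The collector is halfway through marking.
const lowThresholdPauses = shouldPauseNow(0.5, 0.25); // willing to wait longer
const highThresholdPauses = shouldPauseNow(0.5, 0.75); // only short pauses

// A request of 0 is treated as .25, so an early collector is left alone.
const earlyCollectorPauses = shouldPauseNow(0.1, 0);
```

A low argument accepts a pause even when a lot of marking work remains; a high argument only accepts one when the collector is nearly done.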
Wait, pause now?
Why in the world would you want to pause?
The player is going to pause at some point. Using this API you can effectively indicate how much of a pause you are willing to tolerate at that moment.
Low value means: I have lots of time and think I should GC now. I am willing to tolerate a long pause now.
Higher value means: I have a little time and think I should GC now. I am willing to tolerate a shorter pause.
Note: In both cases you are asking for GC if it can be done within your criteria.
Advice Examples
A bad time to collect:
function doIt() {
startComplexAnimation();
System.pauseForGCIfCollectionImminent( .25 );
}
A good time to collect:
function doIt() {
complexDataProvider = null;
asyncCallToGetMoreData();
System.pauseForGCIfCollectionImminent( .25 );
}
Giving Back
Way back in the beginning of this journey, we talked about how the GCHeap allocates megabytes and gives 4k chunks of memory to the GC so that new objects can be allocated.
[Diagram: System Memory → GCHeap → 4k pages → GC]
Fragmentation
Objects are allocated on those pages and some are collected. Over time we get fragmentation.
[Diagram: three snapshots of the same pages over time]
1. Pages are allocated and space for objects allocated.
2. Some objects collected, some new ones created.
3. Net result: even though we did collect objects, we can’t release any pages.
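The net result can be sketched with pages modeled as small arrays of slots. The page size of four slots, the object labels, and the helper names (`collect`, `countLive`, `releasablePages`) are all made up for illustration; real pages are 4k of bytes.

```typescript
type PageSlot = string | null; // null means the slot is free

const pageList: PageSlot[][] = [
  ["A", "B", "C", "D"],
  ["E", "F", "G", "H"],
  ["I", "J", "K", "L"],
];

// Collecting an object frees its slot but leaves the page where it is.
function collect(pagesToSearch: PageSlot[][], label: string): void {
  for (const page of pagesToSearch) {
    const index = page.indexOf(label);
    if (index !== -1) page[index] = null;
  }
}

function countLive(pagesToCount: PageSlot[][]): number {
  let live = 0;
  for (const page of pagesToCount) {
    live += page.filter((slot) => slot !== null).length;
  }
  return live;
}

// Only a completely empty page could be handed back.
function releasablePages(pagesToCheck: PageSlot[][]): number {
  return pagesToCheck.filter((page) => page.every((slot) => slot === null)).length;
}

// Nine of the twelve objects are collected, but one survivor sits on each page.
for (const label of ["A", "C", "D", "E", "F", "G", "J", "K", "L"]) {
  collect(pageList, label);
}

const liveAfterCollection = countLive(pageList);
const pagesWeCanRelease = releasablePages(pageList);
```

Three objects would fit on a single page, yet all three pages stay allocated, so system memory does not drop even though most objects were collected.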
Fragmentation
That fragmentation makes it almost impossible to judge if you have memory leaks by looking at system memory alone.
It’s why tools like memory profilers are so important. They are the only reliable way to gain insight into your situation from outside of the player.
Pure Speculation
Now, I would like to wildly speculate.
Well, okay, perhaps not wildly. If one were to pay attention to the Tamarin project, one might see a lot of work being committed by Adobe around GC.
One might conclude that this gives some insight into where player GC is going. Yes, one just might.
Fragmentation
Going back to fragmentation: why don’t we just collapse all of these objects into a single page and free the others?
[Diagram: three snapshots of the same pages over time]
1. Pages are allocated and space for objects allocated.
2. Some objects collected, some new ones created.
3. Objects are moved into a single page. Other pages could be freed.
Objects, where?
Well, it goes back to the point that Flash doesn’t really know where references to objects are. Back when we discussed conservative, we showed this example:
0x00800A30 – Could represent 8,391,216
0x00801F37 – May represent an object in memory
0x00802FFE – Some bytes in a bitmap
If you don’t know where your references are, it is impossible to move objects around.
Exact Tracing
There is a solution to this problem, though. We could keep better track of what is actually a pointer and what isn’t.
If we did that we might gain a few advantages.
1. No more guessing about what is a pointer and what is not: possible conservative leaks disappear.
2. We can likely be a bit faster doing our overall GC.
3. If we wanted to move an object, we could find and update all references to it.
Exact Tracing
And, for fun, here is a bunch of information about implementing exact tracing in Tamarin:
Exact tracing manual: http://hg.mozilla.org/tamarin-redux/raw-file/tip/doc/mmgc/exactgc-manual.html
Exact GC cookbook: http://hg.mozilla.org/tamarin-redux/raw-file/tip/doc/mmgc/exactgc-cookbook.html
Exact tracing profiler: http://hg.mozilla.org/tamarin-redux/raw-file/tip/doc/mmgc/exactgc-profilers.html
Why Else?
So, speculating just a bit more, why else might we want to be able to move objects?
Well, it turns out if we can move objects we can implement other types of garbage collection, some of which can be very, very efficient.
Enter the idea of ephemeral garbage collection.
Generational
Ephemeral, also known as generational, garbage collection is all based on a simple hypothesis:
The objects that were created most recently are the ones that are most likely to have a short life span and hence need collection sooner.
Generational garbage collectors use a heuristic approach. In this way they focus their work where it has the most impact.
Generational
In a generational garbage collection scheme, you keep objects in separate areas depending on how long they have been around.
[Diagram: regions of memory ordered by Age]
Youngest region: New objects live in this region. GC checks this region often for reaping.
Oldest region: The oldest objects in the system are checked infrequently.
Generational
When the youngest region is full, we promote objects still referenced in older memory and reuse the youngest area.
[Diagram: Movement of objects between regions]
Youngest region: This region is continually reused.
Promotion: Objects are copied if they are referenced from older memory.
Oldest region: The oldest objects change infrequently.
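Promotion out of the youngest region can be sketched as follows. Since this part of the deck is speculation, the model is equally speculative: `GenObject` and `minorCollect` are hypothetical, objects are shared rather than physically copied, and only references from old space are treated as roots (a real collector would also consider the stack).

```typescript
class GenObject {
  label: string;
  refs: GenObject[] = [];
  constructor(label: string) {
    this.label = label;
  }
}

// Promote young objects reachable from old space, then reuse the nursery.
function minorCollect(nursery: GenObject[], oldSpace: GenObject[]): GenObject[] {
  const promoted: GenObject[] = [];
  const pending: GenObject[] = [];
  for (const old of oldSpace) {
    for (const ref of old.refs) pending.push(ref);
  }
  while (pending.length > 0) {
    const candidate = pending.pop() as GenObject;
    if (nursery.indexOf(candidate) === -1) continue; // not a young object
    if (promoted.indexOf(candidate) !== -1) continue; // already promoted
    promoted.push(candidate);
    for (const ref of candidate.refs) pending.push(ref);
  }
  for (const survivor of promoted) oldSpace.push(survivor);
  nursery.length = 0; // the youngest region is emptied and reused
  return promoted;
}

const app = new GenObject("app");
const model = new GenObject("model");
const item = new GenObject("item");
const temp1 = new GenObject("temp1");
const temp2 = new GenObject("temp2");
app.refs.push(model); // old object refers to a young one
model.refs.push(item); // which refers to another young one
temp1.refs.push(temp2); // short-lived objects only refer to each other

const oldRegion = [app];
const youngRegion = [temp1, model, temp2, item];
const promotedLabels = minorCollect(youngRegion, oldRegion).map((o) => o.label);
```

Only the young region is examined, so the cost of this collection tracks the number of recent objects rather than the size of the whole heap.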
Reference Count No More
If generational garbage collection becomes a reality, then the need for reference-counted objects and the overhead that goes with that scheme could be eliminated.
This means fewer operations for GC, taking less time and less processor, in a smaller memory footprint, yielding more performant applications.
Remember, just speculation.
Contact Information
Michael Labriola
http://twitter.com/mlabriola