Custom STL Allocators
Pete Isensee
Xbox Advanced Technology [email protected]

Topics
• Allocators: What are They Good For?
• Writing Your First Allocator
• The Devil in the Details
• Allocator Pitfalls
  – State
  – Syntax
  – Testing
• Case Study

Containers and Allocators
• STL containers allocate memory
  – e.g. vector (contiguous), list (nodes)
  – string is a container, for this talk
• Allocators provide a standard interface for container memory use
• If you don’t provide an allocator, one is provided for you

Example
• Default Allocator
    list<int> b;
    // same as:
    list< int, allocator<int> > b;
• Custom Allocator
    #include "MyAlloc.h"
    list< int, MyAlloc<int> > c;

The Good
• Original idea: abstract the notion of near and far memory pointers
• Expanded idea: allow customization of container allocation
• Good for
  – Size: Optimizing memory usage (pools, fixed-size allocators)
  – Speed: Reducing allocation time (single-threaded, one-time free)

Example Allocators
• No heap locking (single thread)
• Avoiding fragmentation
• Aligned allocations (_aligned_malloc)
• Fixed-size allocations
• Custom free list
• Debugging
• Custom heap
• Specific memory type

The Bad
• No realloc()
• Requires advanced C++ compilers
• C++ Standard hand-waving
• Generally library-specific
  – If you change STL libraries you may need to rewrite allocators
• Generally not cross-platform
  – If you change compilers you may need to rewrite allocators

The Ugly
• Not quite real objects
  – Allocators with state may not work as expected
• Gnarly syntax
  – map<int,char> m;
  – map<int,char,less<int>, MyAlloc<pair<const int,char> > > m;
  – Note the const key: map stores pair<const Key, T>

Pause to Reflect
• “Premature optimization is the root of all evil” – Donald Knuth
• Allocators are a last-resort, low-level optimization
• Especially for games, allocators can be the perfect optimization
• Written correctly, they can be introduced without many code changes

Writing Your First Allocator
• Create MyAlloc.h
• #include <memory>
• Copy or derive from the default allocator
• Rename “allocator” to “MyAlloc”
• Resolve any helper functions
• Replace some code with your own

Writing Your First Allocator
• Demo
• Visual C++ Pro 7.0 (13.00.9466)
• Dinkumware STL (V3.10:0009)
• 933MHz PIII w/ 512MB
• Windows XP Pro 2002
• Launch Visual Studio

Two key functions
• Allocate
• Deallocate
• That’s all!

Conventions
    template< typename T >
    class allocator
    {
    public:   // containers must be able to see these typedefs
        typedef size_t size_type;
        typedef T* pointer;
        typedef const T* const_pointer;
        typedef T value_type;
    };

Allocate Function
• pointer allocate( size_type n, allocator<void>::const_pointer p = 0 )
  – n is the number of items of type T, NOT bytes
  – returns a pointer to enough memory to hold n * sizeof(T) bytes
  – returns raw bytes; NO construction
  – may throw an exception (std::bad_alloc)
  – default calls ::operator new
  – p is an optional hint; avoid relying on it

Deallocate function
• void deallocate( pointer p, size_type n )
  – p must come from allocate()
  – p must be raw bytes; already destroyed
  – n must match the n passed to allocate()
  – default calls ::operator delete(void*)
  – Most implementations allow and ignore NULL p; you should too
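The two functions above can be sketched as a small malloc-backed allocator. This is a minimal sketch, not the talk's demo code: the name MallocAlloc and the helper sum_with_malloc_alloc are made up for illustration, and it assumes a C++11 or later library, where std::allocator_traits supplies the remaining members (rebind, construct, destroy). The 2002-era libraries discussed in the talk need those members written out by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator; the name is illustrative only.
template <typename T>
struct MallocAlloc {
    typedef T value_type;

    MallocAlloc() {}
    template <typename U>
    MallocAlloc(const MallocAlloc<U>&) {}   // template copy constructor

    // n counts objects of type T, not bytes. Returns raw memory;
    // no constructors run here.
    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    // p must come from allocate(); its objects are already destroyed.
    void deallocate(T* p, std::size_t /*n*/) {
        std::free(p);   // free(NULL) is a no-op
    }
};

// Stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const MallocAlloc<T>&, const MallocAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const MallocAlloc<T>&, const MallocAlloc<U>&) { return false; }

// Usage: sum 0..99 held in a vector backed by MallocAlloc.
inline int sum_with_malloc_alloc() {
    std::vector<int, MallocAlloc<int> > v;
    for (int i = 0; i < 100; ++i) v.push_back(i);
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum;
}
```

The container code does not change at all; only the second template argument differs from the default.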

A Custom Allocator
• Demo
• That’s it!
• Not quite: the devil is in the details
  – Construction
  – Destruction
  – Example STL container code
  – Rebind

Construction
• Allocate() doesn’t call constructors
• Why? Performance
• Allocators provide a construct function
    void construct( pointer p, const T& t )
        { new( (void*)p ) T(t); }
• Placement new
  – Doesn’t allocate memory
  – Calls copy constructor

Destruction
• Deallocate() doesn’t call destructors
• Allocators provide a destroy function
    void destroy( pointer p )
        { ((T*)p)->~T(); }
• Direct destructor invocation
  – Doesn’t deallocate memory
  – Calls destructor
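The split between raw memory and object lifetime can be made visible with an instance counter. A minimal sketch: Tracked and lifecycle_demo are hypothetical names, and the calls to ::operator new and ::operator delete stand in for an allocator's allocate() and deallocate().

```cpp
#include <cassert>
#include <new>   // placement new

// Hypothetical type that counts how many instances are alive.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    Tracked(const Tracked& o) : value(o.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns true if each phase changes the live count as expected.
inline bool lifecycle_demo() {
    // 1. allocate: raw bytes only, no constructor runs
    void* raw = ::operator new(sizeof(Tracked));
    bool ok = (Tracked::live == 0);
    {
        Tracked prototype(7);                        // live == 1
        // 2. construct: placement new copy-constructs into the raw bytes
        Tracked* p = new (raw) Tracked(prototype);   // live == 2
        ok = ok && Tracked::live == 2 && p->value == 7;
        // 3. destroy: direct destructor call, memory stays allocated
        p->~Tracked();                               // live == 1
        ok = ok && Tracked::live == 1;
    }                                                // prototype destroyed
    // 4. deallocate: release the raw bytes
    ::operator delete(raw);
    return ok && Tracked::live == 0;
}
```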

Example: Vector
    template< typename T, typename A >
    class vector
    {
        A a;             // allocator
        pointer pFirst;  // first object
        pointer pEnd;    // 1 beyond end of allocated storage
        pointer pLast;   // 1 beyond last constructed object
    };

Example: Reserve
    void vector::reserve( size_type n )
    {
        size_type count = size();   // capture before pFirst changes
        pointer p = a.allocate( n, 0 );
        // loop on a.construct() to copy elements into p
        // loop on a.destroy() to tear down the old elements
        a.deallocate( pFirst, capacity() );
        pFirst = p;
        pLast = p + count;
        pEnd = p + n;
    }

Performance is paramount
• Reserve
  – Single allocation
  – Doesn’t default construct anything
  – Deals properly with real objects
    • No memcpy
    • Copy constructs new objects
    • Destroys old objects
  – Single deallocation
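The reserve sequence can be written out against any allocator. This is a rough sketch rather than a real library's implementation: regrow and regrow_demo are hypothetical helpers, the calls go through C++11 std::allocator_traits, and exception safety is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: moves `count` live objects from an old block of
// `old_cap` slots into a new block of `new_cap` slots.
// Not exception safe; a throwing copy constructor would leak the new block.
template <typename T, typename A>
T* regrow(A& a, T* old_first, std::size_t count,
          std::size_t old_cap, std::size_t new_cap) {
    typedef std::allocator_traits<A> Traits;
    T* p = Traits::allocate(a, new_cap);            // single allocation, raw bytes
    for (std::size_t i = 0; i < count; ++i)
        Traits::construct(a, p + i, old_first[i]);  // copy construct, no memcpy
    for (std::size_t i = 0; i < count; ++i)
        Traits::destroy(a, old_first + i);          // destroy the old objects
    if (old_first)
        Traits::deallocate(a, old_first, old_cap);  // single deallocation
    return p;
}

// Usage: grow a two-string block to eight slots and read the strings back.
inline std::string regrow_demo() {
    std::allocator<std::string> a;
    typedef std::allocator_traits<std::allocator<std::string> > Traits;
    std::string* s = Traits::allocate(a, 2);
    Traits::construct(a, s, "hello");
    Traits::construct(a, s + 1, "world");
    s = regrow(a, s, 2, 2, 8);
    std::string joined = s[0] + " " + s[1];
    Traits::destroy(a, s);
    Traits::destroy(a, s + 1);
    Traits::deallocate(a, s, 8);
    return joined;
}
```

Only the two live strings are ever constructed; the six spare slots stay raw.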

Rebind
• Allocators don’t always allocate T
    list<Obj> ObjList; // allocates nodes
• How? Rebind
    template< typename U > struct rebind
        { typedef allocator<U> other; };
• To allocate an N given type T
    Alloc<T> a;
    T* t = a.allocate(1); // allocs sizeof(T)
    Alloc<T>::rebind<N>::other na;
    N* n = na.allocate(1); // allocs sizeof(N)
    // inside a template, write:
    // typename Alloc<T>::template rebind<N>::other na;
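Rebinding can be observed by recording which element sizes an allocator is asked for. A minimal sketch under stated assumptions: SizeSpy, g_largest_element and list_node_size are made-up names, and the exact node size depends on the library, so the check only relies on a node being larger than the int it holds.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Largest sizeof(T) that any SizeSpy<T> has been asked to allocate.
// A plain global is used because a static member would exist once per T.
static std::size_t g_largest_element = 0;

// Hypothetical allocator that records the element size it is used for.
template <typename T>
struct SizeSpy {
    typedef T value_type;

    SizeSpy() {}
    template <typename U> SizeSpy(const SizeSpy<U>&) {}

    // Explicit rebind, as on the slide. C++11 libraries can also deduce it.
    template <typename U> struct rebind { typedef SizeSpy<U> other; };

    T* allocate(std::size_t n) {
        if (sizeof(T) > g_largest_element) g_largest_element = sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const SizeSpy<T>&, const SizeSpy<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const SizeSpy<T>&, const SizeSpy<U>&) { return false; }

// Usage: the list is declared with SizeSpy<int>, but it allocates nodes.
inline std::size_t list_node_size() {
    g_largest_element = 0;
    std::list<int, SizeSpy<int> > l;
    l.push_back(42);
    return g_largest_element;
}
```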

Allocator Pitfalls
• To Derive or Not to Derive
• State
  – Copy ctor and template copy ctor
  – Allocator comparison
• Syntax issues
• Testing
• Case Study

To Derive or Not To Derive
• Deriving from std::allocator
  – Dinkumware derives (see <xdebug>)
  – Must provide rebind, allocate, deallocate
  – Less code; easier to see differences
• Writing from scratch
  – Allocator not designed as base class
  – Josuttis and Austern write from scratch
  – Better understanding
• Personal preference

Allocators with State
• State = allocator member data
• Default allocator has no data
• C++ Std says (paraphrasing 20.1.5):
  – Vendors encouraged to support allocators with state
  – Containers may assume that allocators don’t have state

State Recommendations
• Be aware of compatibility issues across STL vendors
• list::splice() or C::swap() will indicate if your vendor supports stateful allocators
  – Dinkumware: yes
  – STLport: no
• Test carefully

State Implications
• Container size increase
• Must provide allocator:
  – Constructor(s)
    • Default may be private if parameters required
  – Copy constructor
  – Template copy constructor
  – Global comparison operators (==, !=)
• No assignment operators required
• Avoid static data; generates one instance per T

Heap Allocator Example
    template< typename T >
    class Halloc
    {
    public:
        Halloc();                          // could be private
        explicit Halloc( HANDLE hHeap );
        Halloc( const Halloc& );           // copy
        template< typename U >             // templatized
        Halloc( const Halloc<U>& );        // copy
    private:
        HANDLE m_hHeap;                    // the allocator's state
    };

Template Copy Constructor
• Can’t see private data
    template< typename U >
    Halloc( const Halloc<U>& a ) : m_hHeap( a.m_hHeap ) {} // error
• Solutions
  – Provide public data accessor function
  – Or allow access to other types U
    template< typename U > friend class Halloc;

Allocator comparison
• Example
    template< typename T, typename U >
    bool operator==( const Alloc<T>& a, const Alloc<U>& b )
        { return a.state == b.state; }
• Provide both == and !=
• Should be global functions, not members
• May require accessor functions
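The state requirements can be pulled together in one small class. This is a sketch, not the Halloc from the slides: TaggedAlloc is a hypothetical stand-in that carries an int tag instead of a heap HANDLE so that it compiles anywhere, and it assumes a C++11 or later library.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stateful allocator; an int tag replaces the heap HANDLE.
template <typename T>
class TaggedAlloc {
public:
    typedef T value_type;

    explicit TaggedAlloc(int tag) : m_tag(tag) {}
    TaggedAlloc(const TaggedAlloc& a) : m_tag(a.m_tag) {}      // copy
    template <typename U>
    TaggedAlloc(const TaggedAlloc<U>& a) : m_tag(a.m_tag) {}   // template copy

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    int tag() const { return m_tag; }   // public accessor for comparisons

private:
    // Lets TaggedAlloc<U> read m_tag in the template copy constructor.
    template <typename U> friend class TaggedAlloc;
    int m_tag;
};

// Global comparison operators: equal means "shares the same state".
template <typename T, typename U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return a.tag() == b.tag();
}
template <typename T, typename U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) {
    return !(a == b);
}

// Usage: the tag survives rebinding and is visible through the container.
inline bool tagged_demo() {
    TaggedAlloc<int> a(1), b(2);
    TaggedAlloc<double> c(a);   // template copy constructor keeps the tag
    std::vector<int, TaggedAlloc<int> > v(a);
    v.push_back(5);
    return c == a && a != b && v.get_allocator() == a;
}
```

Either the friend declaration or the accessor alone would be enough for the template copy constructor; both appear here only to show the two options side by side.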

Syntax: Typedefs
• Prefer typedefs
• Offensive
    list< int, Alloc< int > > b;
• Better
    // .h
    typedef Alloc< int > IAlloc;
    typedef list< int, IAlloc > IntList;
    // .cpp
    IntList v;

Syntax: Construction
• Containers accept allocators via ctors
    IntList b( IAlloc( x,y,z ) );
• If none specified, you get the default
    IntList b; // calls IAlloc()
• Map/multimap requires pairs (with a const key)
    Alloc< pair< const K,T > > a;
    map< K, T, less<K>, Alloc< pair< const K,T > > > m( less<K>(), a );

Syntax: Other Containers
• Container adaptors accept containers via constructors, not allocators
    Alloc<T> a;
    deque< T, Alloc<T> > d(a);
    stack< T, deque<T,Alloc<T> > > s(d);
• String example
    Alloc<T> a;
    basic_string< T, char_traits<T>, Alloc<T> > s(a);
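The construction syntax for map and string can be checked end to end with a tiny stateful allocator. A minimal sketch: PoolId, the typedef names and construction_demo are hypothetical, the id field exists only to show that the instance passed to the constructor is the one the container keeps, and a C++11 or later library is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <string>
#include <utility>

// Hypothetical allocator whose only state is an id.
template <typename T>
struct PoolId {
    typedef T value_type;
    int id;

    PoolId() : id(0) {}
    explicit PoolId(int i) : id(i) {}
    template <typename U> PoolId(const PoolId<U>& o) : id(o.id) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const PoolId<T>& a, const PoolId<U>& b) { return a.id == b.id; }
template <typename T, typename U>
bool operator!=(const PoolId<T>& a, const PoolId<U>& b) { return a.id != b.id; }

// Typedefs keep the declarations readable.
typedef PoolId<std::pair<const int, char> > MapAlloc;   // note the const key
typedef std::map<int, char, std::less<int>, MapAlloc> IntCharMap;
typedef PoolId<char> CharAlloc;
typedef std::basic_string<char, std::char_traits<char>, CharAlloc> PoolString;

// Usage: pass allocator instances through the container constructors.
inline int construction_demo() {
    std::less<int> cmp;
    MapAlloc ma(7);
    IntCharMap m(cmp, ma);   // map takes the comparator first, then the allocator
    m[1] = 'a';

    CharAlloc ca(9);
    PoolString s("hello", ca);

    return m.get_allocator().id + s.get_allocator().id;   // 7 + 9
}
```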

Testing
• Test the normal case
• Test with all containers (don’t forget string, hash containers, stack, etc.)
• Test with different objects T, particularly those with non-trivial dtors
• Test edge cases like list::splice
• Verify that your version is better!
• Allocator test framework: www.tantalon.com/pete.htm
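One simple check that fits this list is confirming that every block a container allocates is handed back. The sketch below is not the test framework referenced above; CountingAlloc, exercise and run_allocator_checks are hypothetical names for a much smaller harness, assuming a C++11 or later library.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <list>
#include <new>
#include <string>
#include <vector>

// Number of blocks allocated and not yet deallocated, across all T.
static long g_outstanding = 0;

// Hypothetical allocator that counts outstanding blocks.
template <typename T>
struct CountingAlloc {
    typedef T value_type;

    CountingAlloc() {}
    template <typename U> CountingAlloc(const CountingAlloc<U>&) {}

    T* allocate(std::size_t n) {
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++g_outstanding;
        return p;
    }
    void deallocate(T* p, std::size_t) {
        --g_outstanding;
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Fills a container, removes half, destroys it, and reports leaked blocks.
template <typename Container>
long exercise(const typename Container::value_type& sample) {
    long before = g_outstanding;
    {
        Container c;
        for (int i = 0; i < 50; ++i) c.push_back(sample);
        for (int i = 0; i < 25; ++i) c.pop_back();
    }
    return g_outstanding - before;   // 0 when every block was returned
}

// Usage: run the same scenario over several containers and element types.
inline long run_allocator_checks() {
    typedef std::basic_string<char, std::char_traits<char>,
                              CountingAlloc<char> > Str;
    long leaked = 0;
    leaked += exercise<std::vector<int, CountingAlloc<int> > >(1);
    leaked += exercise<std::deque<double, CountingAlloc<double> > >(2.5);
    // Str has a non-trivial destructor and allocates through CountingAlloc too.
    leaked += exercise<std::list<Str, CountingAlloc<Str> > >(
        Str("a string long enough to defeat any small-string buffer"));
    return leaked;
}
```

A fuller harness would also check that the n passed to deallocate() matches the n passed to allocate(), and would time the custom allocator against the default.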

Case Study
• In-place allocator
  – Hand off existing memory block
  – Dole out allocations from the block
  – Never free
• Example usage
    typedef InPlaceAlloc< int > IPA;
    void* p = malloc( 1024 );
    {
        list< int, IPA > x( IPA( p, 1024 ) );
        x.push_back( 1 );
    }   // x is destroyed here, before the block is released
    free( p );
• View code

In-Place Allocator
• Problems
  – Fails with multiple concurrent copies
  – No copy constructor
  – Didn’t support comparison
  – Didn’t handle containers of void*
• Correct implementation
  – Reference counted
  – Copy constructor implemented
  – Comparison operators
  – Void specialization
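The corrected design can be approximated in a few lines on a modern compiler. This is a sketch under stated assumptions, not the case-study code: InPlaceSketch, Block and in_place_demo are hypothetical names, std::shared_ptr stands in for the hand-written reference counting, std::align handles alignment, and the void specialization is omitted because C++11 allocator_traits does not need it.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <new>

// Hypothetical bookkeeping shared by every copy of the allocator.
struct Block {
    char* base;
    std::size_t size;
    std::size_t used;
    Block(void* p, std::size_t n)
        : base(static_cast<char*>(p)), size(n), used(0) {}
};

template <typename T>
class InPlaceSketch {
public:
    typedef T value_type;

    InPlaceSketch(void* p, std::size_t bytes)
        : m_block(std::make_shared<Block>(p, bytes)) {}
    template <typename U>
    InPlaceSketch(const InPlaceSketch<U>& o) : m_block(o.m_block) {}

    // Dole out the next suitably aligned chunk of the caller's block.
    T* allocate(std::size_t n) {
        if (n > m_block->size / sizeof(T)) throw std::bad_alloc();
        void* cur = m_block->base + m_block->used;
        std::size_t space = m_block->size - m_block->used;
        if (!std::align(alignof(T), n * sizeof(T), cur, space))
            throw std::bad_alloc();   // block exhausted
        m_block->used =
            static_cast<std::size_t>(static_cast<char*>(cur) - m_block->base)
            + n * sizeof(T);
        return static_cast<T*>(cur);
    }
    void deallocate(T*, std::size_t) {}   // never free

    const void* block() const { return m_block.get(); }

private:
    template <typename U> friend class InPlaceSketch;
    std::shared_ptr<Block> m_block;   // reference-counted shared state
};

// Two allocators are equal when they dole out from the same block.
template <typename T, typename U>
bool operator==(const InPlaceSketch<T>& a, const InPlaceSketch<U>& b) {
    return a.block() == b.block();
}
template <typename T, typename U>
bool operator!=(const InPlaceSketch<T>& a, const InPlaceSketch<U>& b) {
    return !(a == b);
}

// Usage: a list whose nodes all live inside a local buffer.
inline int in_place_demo() {
    char buffer[1024];
    int sum = 0;
    {
        typedef InPlaceSketch<int> IPA;
        IPA a(buffer, sizeof buffer);
        std::list<int, IPA> x(a);
        x.push_back(1);
        x.push_back(2);
        x.push_back(3);
        for (std::list<int, IPA>::iterator it = x.begin(); it != x.end(); ++it)
            sum += *it;
    }   // the list is destroyed before the buffer goes away
    return sum;
}
```

Because every copy shares one Block, the rebound node allocator inside the list and the original IPA advance the same cursor, which is the multiple-copies problem the first version failed on.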

In-Place Summary
• Speed
  – Scenario: add x elements, remove half
  – About 50x faster than default allocator!
• Advantages
  – Fast; no overhead; no fragmentation
  – Whatever memory you want
• Disadvantages
  – Proper implementation isn’t easy
  – Limited use

Recommendations
• Allocators: a last resort optimization
• Base your allocator on <memory>
• Beware porting issues (both compilers and STL vendor libraries)
• Beware allocators with state
• Test thoroughly
• Verify speed/size improvements

Recommendations part II
• Use typedefs to simplify life
• Don’t forget to write
  – Rebind
  – Copy constructor
  – Templatized copy constructor
  – Comparison operators
  – Void specialization

References
• C++ Standard section 20.1.5, 20.4.1
• Your STL implementation: <memory>
• GDC Proceedings: References section
• Game Gems III
• [email protected]
• www.tantalon.com/pete.htm