Innovation in the Real World: Making Generics Mainstream
1
Innovation in the Real World: Making Generics Mainstream
Jim Miller, Architect, Common Language Runtime
Andrew Kennedy, Microsoft Research
Don Syme, Microsoft Research
A remark of Archimedes quoted by Pappus of Alexandria, c. AD 340
2
Agenda
• Introduction (current section)
• What are Generics?
• Phase I: Research Prototype
• Phase II: Joint Development
• Phase III: New Feature!
• Phase IV: What’s Missing?
• Phase V: Moving the World
3
What This Talk Is About
One New Feature:
Generics
4
Agenda
• Introduction
• What are Generics? (current section)
• Phase I: Research Prototype
• Phase II: Joint Development
• Phase III: New Feature!
• Phase IV: What’s Missing?
• Phase V: Moving the World
5
In English . . .
Instead of defining StackOfInt, StackOfString, etc., use

    class Stack<T> {
        void Push(T item) { … }
        T Pop() { … }
        T TopOfStack() { … }
    }

    static Stack<int> IntStack;
    static Stack<string> StringStack;

• Type safe (compile and design time support)
• Shared code (better perf, easier maintenance)
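The slide elides the method bodies. One way they could be filled in is sketched below; the array backing store, the growth policy, and the StackDemo class are assumptions made for illustration and are not taken from the talk.

```csharp
using System;

// Hypothetical completion of the slide's Stack<T>; the array backing store is an assumption.
class Stack<T>
{
    private T[] items = new T[4];
    private int count;

    public void Push(T item)
    {
        // Double the array when it is full.
        if (count == items.Length) Array.Resize(ref items, count * 2);
        items[count++] = item;
    }

    public T Pop()
    {
        T top = TopOfStack();
        count--;
        items[count] = default(T);   // clear the slot so a reference can be collected
        return top;
    }

    public T TopOfStack()
    {
        if (count == 0) throw new InvalidOperationException("stack is empty");
        return items[count - 1];
    }
}

static class StackDemo
{
    public static int SumTopTwo()
    {
        Stack<int> ints = new Stack<int>();
        ints.Push(1);
        ints.Push(2);
        // ints.Push("three");   // would be rejected at compile time: string is not int
        return ints.Pop() + ints.Pop();
    }

    static void Main()
    {
        Stack<string> strings = new Stack<string>();
        strings.Push("hello");
        Console.WriteLine(strings.TopOfStack());   // hello
        Console.WriteLine(SumTopTwo());            // 3
    }
}
```

The same class text serves both Stack<int> and Stack<string>, and the commented-out Push shows the kind of mistake the compiler catches.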
6
Polymorphic Programming Languages
Standard ML
O’Caml
Eiffel
Ada
GJ
C++
Mercury
Miranda
Pizza
Haskell
Clu
7
Widely-used Polymorphic Programming Languages
C++
8
By 2005:
C#
Visual Basic
Java
Cobol, Fortran, …?
Managed C++
9
Design for multiple languages
ML: Functors are cool!
Visual Basic: Don’t confuse me!
C++: Give me template specialization
C++: And template meta-programming
Java: Run-time types please
Scheme: Why should I care?
C#: Just give me decent collection classes
Haskell: Rank-n types? Existentials? Kinds? Type classes?
Eiffel: All generic types covariant please
COBOL: Change my call syntax!?!?
C++: Can I write class C<T> : T
10
Simplicity => no odd restrictions

    interface IComparable<T> { int CompareTo(T other); }

    class Set<T> : IEnumerable<T> where T : IComparable<T>
    {
        private TreeNode<T> root;
        public static Set<T> empty = new Set<T>();
        public void Add(T x) { … }
        public bool HasMember(T x) { … }
    }

    Set<Set<int>> s = new Set<Set<int>>();

• Type arguments can be value or reference types
• Even statics can use type parameter
• Constraints can reference type parameter (“F-bounded polymorphism”)
• Interfaces and superclass can be instantiated
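Two of the points above can be tried out in a few lines. This is a minimal sketch, assuming the hypothetical names Ordering, Registry<T> and Slide10Demo: Max uses an F-bounded constraint, and Registry<T> has a static field whose type mentions T, so each instantiation gets its own copy.

```csharp
using System;
using System.Collections.Generic;

static class Ordering
{
    // F-bounded constraint: T must be comparable to itself.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

// Hypothetical class: a static whose type uses the type parameter.
class Registry<T>
{
    public static readonly List<T> Items = new List<T>();
}

static class Slide10Demo
{
    static void Main()
    {
        Console.WriteLine(Ordering.Max(3, 7));       // 7 (value type argument)
        Console.WriteLine(Ordering.Max(2.5, 1.5));   // 2.5

        Registry<int>.Items.Add(1);
        Registry<int>.Items.Add(2);
        Registry<string>.Items.Add("x");
        Console.WriteLine(Registry<int>.Items.Count);      // 2
        Console.WriteLine(Registry<string>.Items.Count);   // 1
    }
}
```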
11
Non-goals
• C++ style template meta-programming
    • Leave this to source-language compilers
• Higher-order polymorphism, existentials
    • Let’s get the basics right first!
12
Agenda
• Introduction
• What are Generics?
• Phase I: Research Prototype (current section)
• Phase II: Joint Development
• Phase III: New Feature!
• Phase IV: What’s Missing?
• Phase V: Moving the World
13
MSR Prototype
May 1999 – Feb. 2000
• Started with not-yet-completed V1 sources
    • Private copy, no reason to stay in sync
• Discussed approach with product team
• Modified key CLR data structures
    • Object layout
    • Runtime type representation
    • Virtual dispatch tables
• Modified the JIT compiler
• Modified the C# compiler
• ~9 months, 2 researchers
14
Compiling polymorphism, as was
Two main techniques:
• Specialize code for each instantiation
    • C++ templates, MLton & SML.NET monomorphization
    • good performance
    • code bloat (though not a problem with modern C++ impls)
• Share code for all instantiations
    • Either use a single representation for all types (ML, Haskell)
    • Or restrict instantiations to “pointer” types (Java)
    • no code bloat
    • poor performance (extra boxing operations required on primitive values)
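The boxing cost mentioned for the shared-code approach can be seen at source level in C#. The sketch below is an illustration only (BoxingDemo and its methods are hypothetical names): the object-based ArrayList boxes every int on the way in and unboxes it on the way out, while the generic List<int> stores the values directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    // Object-based collection: every element is a boxed int.
    public static int SumBoxed(ArrayList values)
    {
        int sum = 0;
        foreach (object o in values) sum += (int)o;   // unbox on each read
        return sum;
    }

    // Generic collection: elements are stored as plain ints.
    public static int SumGeneric(List<int> values)
    {
        int sum = 0;
        foreach (int v in values) sum += v;
        return sum;
    }

    static void Main()
    {
        ArrayList boxed = new ArrayList();
        List<int> generic = new List<int>();
        for (int i = 1; i <= 4; i++)
        {
            boxed.Add(i);     // the int is boxed to object here
            generic.Add(i);   // no boxing
        }
        Console.WriteLine(SumBoxed(boxed));       // 10
        Console.WriteLine(SumGeneric(generic));   // 10
    }
}
```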
15
Compiling polymorphism in the Common Language Runtime
• Polymorphism is built-in to the intermediate language (IL) and the execution engine
• CLR performs “just-in-time” type specialization
• Code sharing avoids bloat
• Performance is (almost) as good as hand-specialized code
16
Code sharing
• Rule:
    • share field layout and code if type arguments have same representation
• Examples:
    • Representation and code for methods in Set<string> can also be used for Set<object> (string and object are both 32-bit GC-traced pointers)
    • Representation and code for Set<long> is different from Set<int> (int uses 32 bits, long uses 64 bits)
17
Exact run-time types
• We want to support

    if (x is Set<string>) { ... }
    else if (x is Set<Component>) { ... }

• But representation and code are shared between compatible instantiations, e.g. Set<string> and Set<Component>
• So there’s a conflict to resolve…
• …and we don’t want to add lots of overhead to languages that don’t use run-time types (ML, Haskell)
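The type test on this slide can be tried with the library's List<T> standing in for Set<T>. In this minimal sketch, Component is an empty placeholder class and RuntimeTypeDemo is a hypothetical name; the point is only that the two instantiations stay distinguishable at run time.

```csharp
using System;
using System.Collections.Generic;

class Component { }   // placeholder for the slide's Component type

static class RuntimeTypeDemo
{
    public static string Describe(object x)
    {
        if (x is List<string>) return "list of string";
        else if (x is List<Component>) return "list of Component";
        return "something else";
    }

    static void Main()
    {
        Console.WriteLine(Describe(new List<string>()));      // list of string
        Console.WriteLine(Describe(new List<Component>()));   // list of Component
        // The instantiations are distinct run-time types.
        Console.WriteLine(typeof(List<string>) == typeof(List<Component>));   // False
    }
}
```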
18
Object representation in the CLR
[Diagram: normal object representation holds a vtable ptr followed by fields; type = vtable pointer. Array representation holds a vtable ptr, element type, no. of elements, and the elements; type is inside object.]
19
Object representation for generics
• Array-style: store the instantiation directly in the object?
    • extra word (possibly more for multi-parameter types) per object instance
    • e.g. every list cell in ML or Haskell would use an extra word
• Alternative: make vtable copies, store instantiation info in the vtable
    • extra space (vtable size) per type instantiation
    • expect no. of instantiations << no. of objects
    • so we chose this option
20
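Object representation for generics
[Diagram: x : Set<string> and y : Set<object> each hold a vtable ptr and fields. Each points to its own vtable, one marked string and one marked object, with slots Add, HasMember and ToArray. The slots of both vtables refer to the same code for Add, code for HasMember and code for ToArray.]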
21
Selling The Results
• Presented prototype to product teams
• Reviewed design with product teams
• Reviewed code with product teams
Sold! Provided
• researchers port their work to the active code base
• … and complete missing items
• … and train the product team on the new code
• … and remain on-board to answer questions
22
Agenda
• Introduction
• What are Generics?
• Phase I: Research Prototype
• Phase II: Joint Development (current section)
• Phase III: New Feature!
• Phase IV: What’s Missing?
• Phase V: Moving the World
23
What’s in the design?
Type parameterization for all declarations
• classes
    e.g. class Set<T>
• interfaces
    e.g. interface IComparable<T>
• structs
    e.g. struct HashBucket<K,D>
• methods
    e.g. static void Reverse<T>(T[] arr)
• delegates (“first-class methods”)
    e.g. delegate void Action<T>(T arg)
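The declaration kinds listed above can be put side by side in one small program. This is a sketch under assumptions: HashBucket's fields, the ForEach helper and the Visitor<T> delegate (named that way to avoid clashing with the library's System.Action<T>) are illustrative choices, not the talk's code.

```csharp
using System;

struct HashBucket<K, D>               // generic struct
{
    public K Key;
    public D Data;
}

delegate void Visitor<T>(T arg);      // generic delegate (hypothetical name)

static class ArrayUtil
{
    // Generic method, as in the slide's Reverse<T>.
    public static void Reverse<T>(T[] arr)
    {
        for (int i = 0, j = arr.Length - 1; i < j; i++, j--)
        {
            T tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }

    // Applies a generic delegate to every element.
    public static void ForEach<T>(T[] arr, Visitor<T> visit)
    {
        foreach (T item in arr) visit(item);
    }
}

static class DeclarationsDemo
{
    static void Main()
    {
        int[] xs = { 1, 2, 3 };
        ArrayUtil.Reverse(xs);   // T is inferred as int
        ArrayUtil.ForEach(xs, delegate(int x) { Console.Write(x + " "); });   // 3 2 1
        Console.WriteLine();

        HashBucket<string, int> bucket;
        bucket.Key = "answer";
        bucket.Data = 42;
        Console.WriteLine(bucket.Key + " = " + bucket.Data);   // answer = 42
    }
}
```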
24
Life Is Hell
Feb. 2000 – Nov. 2002
• In a live tree with 150 other developers!
    • Especially if you are 6000 miles away, in a time zone that’s off by 8 hours, and connected by a slow Internet connection
• There were “some issues” with the prototype
• Additional work to flesh out design
    • Reflection
    • Debugging
    • Performance
    • Pre-compilation (“Ngen”)
25
Precompilation (ngen)
• JIT compilation is flexible, but
    • can lead to slow startup times
    • increases working set (must load JIT compiler, code pages can’t be shared between processes)
• Instead, we can pre-compile
    • .NET CLR has “ngen” tool for native generation
    • IL is compiled to x86 up-front
    • runtime data structures (vtables etc) are persisted in native image
    • read-only pages (e.g. code) can be shared between processes
    • loader now responsible only for “link” step (cross-module fix-ups)
26
Ngen for generics
• For non-generic code, to ngen an assembly:
    • just compile every class and method in the assembly
    • perhaps inline a little across assemblies
• For generic code:
    • compile every generic class and method, but at what instantiations?
    • just reference types? (code is shared)
    • or some “commonly-used” types? (e.g. int)
    • we don’t know statically what instantiations will be used
    • it’s a “separate compilation” problem
27
Ngen all instantiations
• Our approach:
    • always compile generic code for reference-type instantiations
    • for value type instantiations, compute the transitive closure of instantiations used by the assembly
    • compile code for those instantiations not already present in other linked ngen images
• leads to code duplication
    • at load-time, just pick one
    • has some interesting interactions with app-domain code-sharing policy (see SPACE’04 paper on Don Syme’s home page)
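The "transitive closure, minus what is already compiled" step can be pictured with a toy worklist. This is not the CLR's algorithm: instantiations are modelled as plain strings, the uses map is supplied by hand, and NgenClosureSketch and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class NgenClosureSketch
{
    // roots: instantiations the assembly mentions directly.
    // uses:  toy map from an instantiation to the instantiations its code mentions.
    public static HashSet<string> Closure(IEnumerable<string> roots,
                                          IDictionary<string, string[]> uses)
    {
        HashSet<string> seen = new HashSet<string>();
        Queue<string> work = new Queue<string>(roots);
        while (work.Count > 0)
        {
            string inst = work.Dequeue();
            if (!seen.Add(inst)) continue;   // already visited
            string[] next;
            if (uses.TryGetValue(inst, out next))
                foreach (string n in next) work.Enqueue(n);
        }
        return seen;
    }

    // Keep only the instantiations that no linked image already provides.
    public static HashSet<string> ToCompile(IEnumerable<string> roots,
                                            IDictionary<string, string[]> uses,
                                            IEnumerable<string> alreadyCompiled)
    {
        HashSet<string> result = Closure(roots, uses);
        result.ExceptWith(alreadyCompiled);
        return result;
    }

    static void Main()
    {
        Dictionary<string, string[]> uses = new Dictionary<string, string[]>();
        uses["Set<int>"] = new[] { "TreeNode<int>" };

        HashSet<string> todo = ToCompile(
            new[] { "List<Point>", "Set<int>", "List<int>" },
            uses,
            new[] { "Set<int>", "TreeNode<int>" });   // assumed present in a linked image

        foreach (string inst in todo) Console.WriteLine(inst);
    }
}
```

With these inputs the set to compile is List<Point> and List<int>; Set<int> and the TreeNode<int> it pulls in are skipped because a linked image is assumed to provide them.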
28
NGen: example
[Diagram: ngen applied to three assemblies, with the x86 code produced for each]
• MyCollections: class List<T>, class Set<T>, …Set<int>…
    • x86 for List<object>, x86 for Set<object>, x86 for Set<int>
• Client1: struct Point, …List<Point>…Set<int>…List<int>…
    • x86 for List<int>, x86 for List<Point>
• Client2: class Window, …List<Window>……List<int>…
    • x86 for List<int>
29
NGen: when we can’t
JIT is still required for
• instantiations requested through reflection (“late-bound”)
    e.g. typeof(List<>).BindGenericParameters(typeof(int))
• generic virtual methods
    • double dispatch, on instantiation and class of object
• polymorphic recursion (unbounded number of instantiations)
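Two of these cases are easy to reproduce. The sketch below uses Type.MakeGenericType, the name under which late-bound instantiation shipped in .NET 2.0 (the BindGenericParameters call on the slide is the pre-release form given in the talk). LateBoundDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class LateBoundDemo
{
    // Late-bound instantiation: the type argument is only known at run time.
    public static object MakeList(Type elementType)
    {
        Type closed = typeof(List<>).MakeGenericType(elementType);
        return Activator.CreateInstance(closed);
    }

    // Polymorphic recursion: each call instantiates Depth at a new type
    // (T, List<T>, List<List<T>>, ...), so the set of instantiations
    // depends on the run-time value of n.
    public static int Depth<T>(T value, int n)
    {
        if (n == 0) return 0;
        List<T> wrapped = new List<T>();
        wrapped.Add(value);
        return 1 + Depth<List<T>>(wrapped, n - 1);
    }

    static void Main()
    {
        object list = MakeList(typeof(int));
        Console.WriteLine(list is List<int>);   // True
        Console.WriteLine(Depth(0, 5));         // 5
    }
}
```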
30
Issues and Resolutions
• Getting the Results Out
    • MSR wanted to share their work
    • CLR wouldn’t allow live source out
    ⇨ Port work to Rotor and release in source form
• Remote Development Issues
    ⇨ One researcher, two months in Redmond
    ⇨ Coordinate check-in times
• Transfer of Ownership
    ⇨ Code reviews
    ⇨ Phone calls, email lists, and accountability
31
Plan Of Record
• Generics will be in the CLR in “Whidbey”
• Generics will be in C# in Whidbey
• Class libraries will ship a separate “generic collections” class
    • Not part of mscorlib, the lowest-level library
• Generic interfaces to be added to a select few basic types (arrays implement IList<T>, etc.)
• Generics are not CLS compliant in Whidbey
    • Give time to other languages to implement them
    • Not required in base libraries for Whidbey
• Generics must be “forward compatible”
    • Old runtimes can execute new code, provided they don’t use generics
32
Agenda
• Introduction
• What are Generics?
• Phase I: Research Prototype
• Phase II: Joint Development
• Phase III: New Feature! (current section)
• Phase IV: What’s Missing?
• Phase V: Moving the World
33
Forward Compatibility
• Doesn’t seem too bad at first
• But what if a program uses Reflection?
    • If the underlying system uses generics, the application program will see them even if it doesn’t use them
• And what about debugging?
    • What if an old debugger tries to debug a program that uses generics?
• And what about serialization?
• It’s risky, and it’s fragile
• And for other reasons we abandoned it …
    • A security-related change to the metadata
• But it’s too late to change the basic design
34
Generics Are “In the Build”
Nov. 2002 – May 2004
• C# implements them fully
• VB does user acceptance testing
    • Users like the feature
    • But they find it confusing
    • VB reworks the language design, retests, and finds them “usable by Mort”
• Longhorn library developers start to use them
• Managed C++ provides support for them
35
Announcement!
• Anders Hejlsberg announces generics in C#
    • No backing off the feature now!
• Early customer feedback (inside and outside Microsoft) is very positive
    • But customers report bugs and design problems
    • Performance is a serious issue for internal users
• Product team takes primary ownership
    • But still needs support from MSR, especially on design issues
    • Like the constraint language
36
What’s in the design (2)?
Constraints on type parameters
• class constraint (“must extend”)
    e.g. class Grid<T> where T : Control
• interface constraints (“must implement”)
    e.g. class Set<T> where T : IComparable<T>
• type parameter constraints (“must subtype”)
    e.g. class List<T> { void AddList<U>(List<U> items) where U : T }
• 3 special cases
    • Can be instantiated (“new”)
    • Can be null (“nullable”)
    • Must be a value type (“struct”)
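Most of these constraint forms can be exercised in a short program. This is a sketch with assumed names: Control and Button are empty stand-ins rather than any UI library's types, and Bag<T> plays the role of the slide's List<T> to avoid clashing with the library class. The "new" and "struct" cases use the syntax that shipped in C# 2.0.

```csharp
using System;
using System.Collections.Generic;

class Control { public string Name = "control"; }                  // stand-in type
class Button : Control { public Button() { Name = "button"; } }   // stand-in type

// Class constraint ("must extend"): members of Control are usable on T.
class Grid<T> where T : Control
{
    private readonly List<T> cells = new List<T>();
    public void Add(T cell) { cells.Add(cell); }
    public string FirstName() { return cells[0].Name; }
}

// Type parameter constraint ("must subtype"): U must derive from T.
class Bag<T>
{
    public readonly List<T> Items = new List<T>();
    public void AddList<U>(List<U> items) where U : T
    {
        foreach (U item in items) Items.Add(item);
    }
}

static class Factory
{
    // "new" constraint: T has a public parameterless constructor.
    public static T Create<T>() where T : new() { return new T(); }

    // "struct" constraint: T is a non-nullable value type.
    public static T? Wrap<T>(T value) where T : struct { return value; }
}

static class ConstraintsDemo
{
    static void Main()
    {
        Grid<Button> grid = new Grid<Button>();
        grid.Add(new Button());
        Console.WriteLine(grid.FirstName());               // button

        Bag<Control> bag = new Bag<Control>();
        List<Button> buttons = new List<Button>();
        buttons.Add(new Button());
        bag.AddList(buttons);                              // U = Button, T = Control
        Console.WriteLine(bag.Items.Count);                // 1

        Console.WriteLine(Factory.Create<Button>().Name);  // button
        Console.WriteLine(Factory.Wrap(5).HasValue);       // True
    }
}
```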
37
And What About Perf?
• Do generics really provide performance?
• It depends on how you ask the question…
    • And who is asking the question
    • Or at least why they are really asking the question
38
MSR Perf Measurements
[Chart: Quicksort on 1,000,000 elements, times in seconds (axis 0 to 4), by element type: int, double, string (length). Series: Generic, Non-generic (object). Values not recoverable from the transcript.]
39
My Perf Measurements
QuickSort, 500 items (Shorter is better)
[Chart: seconds per 50,000 calls (axis 0 to 70), by data type: Integer, String, Double, shown as two groups of three columns. Series: Generic, Non-Generic, Specific. Values not recoverable from the transcript.]
Note:
• First three columns are based on my “natural” implementation of QuickSort(Array).
• Second three are based on Andrew Kennedy’s QuickSort(Array, ComparisonOperation)
40
What’s Our Recommendation?
• Performance numbers are never
    • Simple
    • Complete
    • Repeatable
• “Apples to apples” isn’t always the question
    • Sometimes absolute performance is paramount
    • Sometimes ease-of-use is paramount
    • Usually it’s a combination of both
• Guidelines differ based on the audience
41
Agenda
• Introduction
• What are Generics?
• Phase I: Research Prototype
• Phase II: Joint Development
• Phase III: New Feature!
• Phase IV: What’s Missing? (current section)
• Phase V: Moving the World
42
Early Adopters
• Pre-Beta releases circulated to select customers
• Feedback is very positive
• Lots of suggestions
    • Including a complete rewrite of collections
    • Including many previously requested features
• We’re ready for Beta 1
    • So we can only do a few items, and only the most important
43
Remember…
ML: Functors are cool!
Visual Basic: Don’t confuse me!
C++: Give me template specialization
C++: And template meta-programming
Java: Run-time types please
Scheme: Why should I care?
C#: Just give me decent collection classes
Haskell: Rank-n types? Existentials? Kinds? Type classes?
Eiffel: All generic types covariant please
COBOL: Change my call syntax!?!?
C++: Can I write class C<T> : T
44
What’s in the design (3)?
Variance annotations on type parameters (CLR only)
• covariant subtyping
    interface IEnumerator<+T> { T get_Current(); bool MoveNext(); }
    so IEnumerator<string> assignable to IEnumerator<object>
• contravariant subtyping
    interface IComparer<-T> { int Compare(T x, T y); }
    so IComparer<object> assignable to IComparer<string>
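The slide describes variance at the CLR level using +T and -T. C# exposed the same idea later, in version 4.0, as the out and in modifiers, and the library interfaces IEnumerator<out T> and IComparer<in T> are declared that way. The sketch below shows the two assignments from the slide in that later syntax; ByText and VarianceDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Compares any two objects by their text form.
class ByText : IComparer<object>
{
    public int Compare(object x, object y)
    {
        return string.CompareOrdinal(x.ToString(), y.ToString());
    }
}

static class VarianceDemo
{
    public static object FirstAsObject(List<string> items)
    {
        IEnumerator<string> strings = items.GetEnumerator();
        IEnumerator<object> objects = strings;   // covariant conversion
        objects.MoveNext();
        return objects.Current;
    }

    public static int CompareStrings(string a, string b)
    {
        IComparer<object> any = new ByText();
        IComparer<string> forStrings = any;      // contravariant conversion
        return forStrings.Compare(a, b);
    }

    static void Main()
    {
        List<string> items = new List<string>();
        items.Add("a");
        items.Add("b");
        Console.WriteLine(FirstAsObject(items));            // a
        Console.WriteLine(CompareStrings("a", "b") < 0);    // True
    }
}
```

Both conversions are reference conversions: no wrapper object is created, and they apply only when the type arguments are reference types.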
45
Agenda
• Introduction
• What are Generics?
• Phase I: Research Prototype
• Phase II: Joint Development
• Phase III: New Feature!
• Phase IV: What’s Missing?
• Phase V: Moving the World (current section)
46
Standards and the CLS
• All changes for generics submitted to ECMA
    • And later to ISO
• Common Language Specification
    • A “deal” between compiler writers and library designers
• Remember the plan of record?
    • No generics in the CLS
• Schedules are readjusted
    • Longhorn (OS) will ship Whidbey; a later version had been planned
    • Compilers enforce CLS rules
    • Windows API (WinFX) uses generics heavily
• Library teams want generics in the CLS
47
Remember…
ML: Functors are cool!
Visual Basic: Don’t confuse me!
C++: Give me template specialization
C++: And template meta-programming
Java: Run-time types please
Scheme: Why should I care?
C#: Just give me decent collection classes
Haskell: Rank-n types? Existentials? Kinds? Type classes?
Eiffel: All generic types covariant please
COBOL: Change my call syntax!?!?
C++: Can I write class C<T> : T
48
Moving the World
• The CLS is the lever . . .
    • Languages sign up to be “consumers” or “extenders”
    • Library designers sign up to live within the rules
• But it isn’t well placed
    • Nothing requires languages to live up to their part
    • Not all rules can be mechanically checked
    • And Microsoft doesn’t have central enforcement
    • And using it is painful
• Will languages move forward or pull out?
    • It depends on their customers (developers)
    • And the importance to them of the libraries
    • And the complexity/cost of implementation
• Will libraries stay within the bounds?
    • It depends on their customers (developers)
    • And the complexity/cost of implementation
49
Questions?
50
Pointers
• If you’re interested:
    • “Design and Implementation of Generics for .NET”, PLDI’01
    • “Formalization of Generics for .NET”, POPL’04
    • “Transposing F to C#”, CCPE, 2004
    • “Generics, Pre-compilation and Sharing”, SPACE’04
    • http://research.microsoft.com/~akenn
• Download Whidbey Beta1: http://msdn.microsoft.com/vs2005
• Download prototype generics implementation (Gyro) extending the Shared Source CLI: http://research.microsoft.com/projects/clrgen