design and implementation of object oriented dynamic programming languages jonathan bachrach greg...

72
Design and Implementation of Object Oriented Dynamic Programming Languages Jonathan Bachrach Greg Sullivan Kostas Arkoudous

Upload: magnus-sanders

Post on 23-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Design and Implementation of Object Oriented Dynamic Programming Languages

Jonathan Bachrach

Greg Sullivan

Kostas Arkoudous

Who am I?

• Jonathan Bachrach• PhD in CS: neural networks applied to

robotic control• Experience in real-time high performance

electronic music• Involved in design of Sather and Dylan• Five years experience writing Dylan’s

runtime and compiler at Harlequin

The Seminar

• 6.894, 26-311, 1-2.30, 3-0-9, 3 EDP’s• Theme• Proto running example• Lectures• Readings + Presentations• Panels• Assignments• Projects

Today

• Taste of the Seminar

• Evolutionary Programming

• Proto

• Language Design in the New Millenium

• Language Implementation Techniques

• Seminar Details

Tomorrow

• Motivation and overview of OODL

• Implementation perspective

• Greg’s favorite topics

Language Renaissance

• Perl• Python• MetaHTML• PHP• TCL• Cecil• C#• Java

• JavaScript• Sather• Rebol• Curl• Dylan• Isis• Limbo• …

The Stage

• Increasing Demands for Quick Time to Market• Moore’s Law for Hardware but is software

keeping up?• Most Time Spent after initial deployment• Complex environments

– Distributed– Agents– Real world– Continuous, non-deterministic, dynamic inputs

Conventional Programming

• Assume user knows exactly what they want before programming

• Assume that specifications don’t change over time

• Problem: Forces premature decisions and implementation effort

• Example: Java Class Centric Methods

Evolutionary Programming

• Need to play with prototypes before knowing what you really want– Support rapid prototyping– Smooth transition to delivery

• Requirements change all the time– Late binding allowing both developers and

program alike to update behavior

Language Design:User Oriented Goals

• Perlis: “A language that doesn't affect the way you think about programming, is not worth knowing.”

• What user-oriented goals would you suggest would result in the best language?

Proto

• Goals

• Examples

• Relation

• Definition

• State

Proto Hello World

(format out “hello world”)

Proto Goals

• Simple• Productive• Powerful• Extensible• Dynamic• Efficient• Real-time

• Teaching and research vehicle

• Electronic music is domain to keep it honest

Proto Ancestors

• Language Design is Difficult– Leverage proven ideas– Make progress in selective directions

• Ancestors– Scheme– Cecil– Dylan

Proto <=> Scheme

• Concise naming• Procedural macros• Objects all the way

• Long-winded naming• Rewrite rule only• Only records

Proto <=> Cecil

• Prefix syntax• Scheme inspired

special forms

• Infix syntax• Smalltalk inspired

special forms

Proto <=> Dylan

• Prefix syntax• Prototype-based• Procedural macros• Rationalized collection

protocol / hierarchy• Always open• Predicate types

• Infix syntax• Class-based• Rewrite-rule only …• Conflated collection

protocol / hierarchy• Sealing• Fixed set of types

Object Orientation

• Assume you know OO basics

• Motivations:– Abstraction– Reuse– Extensibility

Prototype-based OO

• Simplified object model

• No classes

• Cloning basic operation for instance and prototype creation

• Prototypes are special objects that serve as classes

• Inheritance follows cloned-from relation

Proto: OO & MM

(dv <point> (isa <any>))

(slot <point> (point-x <int>) 0)

(slot <point> (point-y <int>) 0)

(dv p1 (isa <point>))

(dm + ((p1 <point>) (p2 <point>) => <point>)

(isa <point>

(set point-x (+ (point-x p1) (point-x p2)))

(set point-y (+ (point-y p1) (point-y p2))))

Language Design:User Goals -- The “ilities”

• Learnability

• Understandability

• Writability

• Modifiability

• Runnability

• Interoperability

Learnability

• Simple

• Small

• Regular

• Gentle learning curve

• Perlis: “Symmetry is a complexity reducing concept…; seek it everywhere.”

Proto: Learnability

• Simple and Small: – 16 special forms: if, seq, set, fun, let, loc, lab,

fin, dv, dm, dg, isa, slot, ds, ct, quote

– 7 macros: try, rep, mif, and, or, select, case

• Gentle Learning Curve: – Graceful transition from functional to object-oriented

programming

– Perlis: “Purely applicative languages are poorly applicable.”

Proto: Special Forms

IF (IF ,test ,then ,else)SEQ (SEQ ,@forms)SET (SET ,name ,form) | (SET (,name ,@args) ,form)LET (LET ((,var ,init) …) ,@body)FUN (FUN ,sig ,@body)LOC (LOC ((,name ,sig ,@body) …) .@body)LAB (LAB ,name ,@body)FIN (FIN ,protected-form ,@cleanup-forms)DV (DV ,var ,form)DM (DM ,name ,sig ,@body)DG (DG ,name ,sig)ISA (ISA (,@parents) ,@slot-inits)SLOT (SLOT ,owner ,var ,init)

sig (,@vars) | (,@vars => ,var)var ,name | (,name ,type)slot-init (SET ,name ,value)

Understandability

• Natural notation

• Simple to predict behavior

• Modular

• Models application domain

• Concise

Proto: Understandability

• Describable by a small interpreter– Size of interpreter is a measure of complexity

of language

• Regular syntax– Debatable whether prefix is natural, but it’s

simple, regular and easy to implement

Writability

• Expressive features and abstraction mechanisms

• Concise notation

• Domain-specific features and support

• No error-prone features

• Internal correctness checks (e.g., typechecking) to avoid errors

Tradeoff One: Abstraction <=> Writability

• Abstraction can obscure code– Example: abuse of macros

• Concision can obscure code– Example: APL

Tradeoff Two:Domain Specific <=> Simplicity

• Challenge is to introduce simple domain specific features that don’t hair up language– CL has reader macros for introducing new

token types

Proto: Error Proneness

• No out of language errors– At worst all errors will be be caught in

language at runtime– At best potential errors such as “no applicable

methods” will be caught statically earlier and in batch

• Unbiased dispatching and inheritance– Example: Method selection not based on

lexicographical order as in CLOS

Design Principle Two:Planned Serendipity

• Serendipity:– M-W: the faculty or phenomenon of finding

valuable or agreeable things not sought for

• Orthogonality– Collection of few independent powerful

features combinable without restriction

• Consistency

Proto: Serendipity

• Objects all the way down• Slots accessed only through calls to

generic’s• Simple orthogonal special forms• Expression oriented• Example:

– Exception handling can be built out of a few special forms: lab, fin, loc, …

Modifiability

• Minimal redundancy

• Hooks for extensibility included automatically

• No features that make it hard to change code later

Proto: Extensible Syntax

• Syntactic Abstraction

• Procedural macros

• WSYWIG– Pattern matching– Code generation

• Example:(ds (unless ,test ,@body)

`(if (not ,test) (seq ,@body)))

Proto: Multimethods

• Can add methods outside original class definition:– (dm jb-print ((x <node>)) …)– (dm jb-print ((x <str>)) …)

Proto: Generic Accessors

• All slot access goes through generic function calls

• Can easily redefine these generic’s without affecting client code

Runnability

• Features for programmers to control efficiency

• Analyzable by compilers and other tools

• Note: this will be a running theme!

Tradeoff Three:Runnability <=> Simplicity

• Much of a language design can be dedicated to providing mechanisms to control efficiency (e.g., sealing in Dylan)

• Obscure algorithms

• Perlis: “A programming language is low level when its programs require attention to the irrelevant.”

Proto: Optional Types

• All bindings and parameters can take optional types

• Rapid prototype without types

• Add types for documentation and efficiency

• Example:(dm format (s msg (args …)) …)

(dm format ((s <stream>)(msg <str>) (args …)) …)

Proto: Pay as You Go

• Don’t charge for features not used• Pay more for features used in more complicated

ways• Examples:

– Dispatch• Just function call if method unambiguous from argument types• Otherwise require dynamic method lookup

– Proto’s bind-exit called “lab”• Local exits are set + goto• Non local exits must create a frame and stack alloc an exit

closure

Interoperability

• Portability

• Foreign Language Interface

• Directly or indirectly after mindshare

The Rub

• Support for evolutionary programming creates a serious challenge for implementors

• Straightforward implementations would exact a tremendous performance penalty

The Game

• Every Problem in CS can be Solved with another Level of Indirection -- ancient proverb

• Every Optimization Problem can be Viewed as Removing a Level of Indirection -- jb

• Example: Procedures <=> Inlining

Implementation Techniques

• System code techniques: – Often called runtime optimizations– Turbo charges user code– Example: caching

• User code rewriting techniques: – Often called compile-time optimizations– Example: inlining

Implementing Prototypes

• Problem– No classes only objects– But objects need descriptions of slots and state– Would be too expensive for each object to have a copy

of these descriptions

• A solution:– Create classes called traits on demand

• When you are cloned or• When slots are added to you

– Objects contain their values and share traits through a traits pointer

Proto: Traits on Demand

x: 89

y: 55

fo lks: x:

x: 33

y: 67

fo lks: x:

x: 0

y: 0

fo lks: x:

89

55

x:

33

67

x:

x: 0

y: 0

fo lks: x:

0

0

x:

< po in t>

p1 p2

p1 p2

< po in t>

< po in t>T ra its

Implementing Generic Dispatch

• Multimethod dispatch– Considers all arguments

– Orders methods based on specializer type specificity

• Naïve description:– Find applicable methods

– Sort applicable methods

– Call most specific method

• Problem: too expensive

Dispatch Technique One:Dispatch Cache

• Hierarchical cache – Each level discriminates one argument using

associative mechanism– Keys are concrete object-traits– Values are either

• Cache for remaining arguments or• Method to call with given arguments

• Problems– Too many indirections:

• generic call + method call• key and value lookups

Engine Node Dispatch

• Glenn Burke and myself at Harlequin, Inc.

circa 1996-– Partial Dispatch: Optimizing Dynamically-Dispatched Multimethod Calls

with Compile-Time Types and Runtime Feedback, 1998

• Shared decision tree built out of executable engine nodes

• Incrementally grows trees on demand upon miss

• Engine nodes are executed to perform some action typically tail calling another engine node eventually tail calling chosen method

• Appropriate engine nodes can be utilized to handle monomorphic, polymorphic, and megamorphic discrimination cases corresponding to single, linear, and table lookup

Engine Node Dispatch Picture

textcall

\+<i>,<i>m ethod

NAM

\+<f>,<f>m ethod

...discrim inator

...

<generic>

...

<f>

<i>

linear ep

<linear-engine>

...

< i>

m ono ep

<mono-engine>

...

<f>

m ono ep

<mono-engine>

...MEP

<method>

...MEP

<method>

...MEP

<method>

Define method \+ (x :: <i>, y :: <i>) … end;Define method \+ (x :: <f>, y :: <f>) … end;Seen (<i>, <i>) and (<f>, <f>) as inputs.

Dispatch Technique Two:Decision Tree

• Intelligently number traits with id’s– Children tend to be in contiguous id range

• Label all traits according to most applicable method• Construct decision tree constructing largest class id

ranges by merging• Implement with simple numeric operations and JIT

code generation• Problem:

– Generic call + method call– Lost inlining opportunities

Class Numbering

<object>

<a> <d>

<c><b> <e>

text0 1 2 3 4 5

Binary Search Tree Picture•From Chambers and Chen OOPSLA-99

Code Rewriting Technique One:Inlining Dispatch

• Inline when number of methods is small• When methods themselves are small• Example: List operations• Twist: Partially inline dispatch when there is a

spike in the method frequencies• Example: Numeric operations• Problem: Lose downstream type inference

possibilities because result of call gets merged back into all other possibilities

Inlining Dispatch Example One

(dm empty? ((x <lst>)) #f)

(dm empty? ((x nil)) #t)

(dm null? ((x <lst>))

(empty? X))

=>

(dm null? ((x <lst>))

(if (== x nil) #t

(if (isa? X <lst>) #f

(error …))))

Inlining Dispatch Example Two

(dm sum+1 (x y)

(+ (+ x y) 1))

=>

(dm sum+1 (x y)

(let ((t (if (and (isa? x <int>) (isa? y <int>))

(i+ x y)

(+ x y))))

(if (isa? t <int>)

(i+ t 1)

(+ t 1))))

Code Rewriting Technique Two:Code Splitting

• Clone code below typecase so each branch can run with tighter types

• Problem: potential code explosion

Code Splitting Example

(dm sum+1 (x y)

(+ (+ x y) 1))

=>

(dm sum+1 (x y)

(if (and (isa? x <int>) (isa? y <int>))

(i+ (i+ x y) 1)

(+ (+ x y) 1)))

Summary

• Touched on evolutionary programming

• Introduced Proto

• Discussed language design

• Sketched out several implementation techniques to gain back performance

Today’s Reading List

• Dijkstra: Goto harmful

• Hoare: Hints on PL design

• Steele: Growing a Language

• Gabriel: Worse is Better

• Chambers: Synergies Between Object-Oriented Programming Language Design and Implementation Research

• Chambers: The Cecil Language

• Norvig: Adaptive Software

• Cook: Interfaces and Specifications for the Smalltalk-80 Collection Classes

• XP: www.extremeprogramming.com

• *Lee: Advanced Language Implementation

• *Queinnec: Lisp In Small Pieces

Open Problems

• Extensible enough language to stem need for domain specific languages

• Best of both worlds of performance and dynamism

• Simplest combination of language + implementation

• …

Credits

• Craig Chambers’ notes on language design for UW CSE-505

Course

• Seminar style format• Encourage discussion• Identify and make progress towards the

solution of open problems• Course mostly not about language design

– Will use proto as a point of discussion– Will talk about language design implications

along the way

Prerequisites

• 6.001 (SICP)

• 6.035 (Computer Language Engineering)

• 6.821 (Programming Languages) preferred

• Or permission of instructor

Lectures

• Intro I & II• Interpreters and VM’s• Objects and Types• Runtime Object

Systems• Reflection• Memory Management

• Macros• Type Inference• OO Optimizations• Partial Evaluation• Dynamic Compilation• Proof-Based

Compilation• Practical Aspects

Presentation Talking Points

• Enumerate design parameters

• Identify open problems and make progress towards their solution

• Identify and understand tradeoffs

• Note language design implications

Paper Presentations

• Each student will present one or two papers judged important to the course

• Each presentation will be about a half hour

• A reading list is available on the course web site

• Other research can be presented at instructor’s discretion

• Three slots in the following categories:

• Language Design, Interpreters & VM’s

• Types and Object Systems, Reflection, and Memory Management

• Macros, Type Inference, and OO Optimizations, Partial Evaluation and Dynamic Compilation

Panels

• Local experts

• System, Compiler, Language Design

• Positions

• Open Problems

• Secret Tricks Revealed

• Q & A

Assignments

Small one week projects– Metacircular

interpreter

– Simple VM

– Fast isa?

– Fast method dispatch

– Slot layout

in Proto such as:– Simple GC

– Type inferencer

– Specialization

– Feedback guided dispatch

– Profile guided inlining

Project

• Substantial project involving original language design and/or implementation

• Produce a design and project plan

• Working implementation

• Present an overview of their project

• 10+ page write-up

Grades

• Will be based on the deliverables and class participation

First Assignment

• Survey

• Due by Thursday