
Arvind
Computer Science and Artificial Intelligence Laboratory
M.I.T.

September 26, 2006
http://www.csg.csail.mit.edu/6.827

Type Inference


Type Inference

• Type inference is typically presented in two different forms:

– Type inference rules: rules define the type of each expression

  • Clean and concise; needed to study the semantic properties, i.e., soundness, of the type system

– Type inference algorithm: Needed by the compiler writer to deduce the type of each subexpression or to deduce that the expression is ill typed.

• Often it is nontrivial to derive an inference algorithm for a given set of rules. There can be many different algorithms for a set of typing rules.


first ...

A Simple Type System (no polymorphism)

λ-calculus with Constants & Letrec

E ::= x | λx.E | E E
    | Cond (E, E, E)
    | PFk(E1,...,Ek)
    | CN0
    | CNk(E1,...,Ek)
    | CNk(SE1,...,SEk)        (not in initial terms)
    | let S in E

PF1 ::= negate | not | ... | Prj1 | Prj2 | ...
PF2 ::= + | ...
CN0 ::= Number | Boolean
CN2 ::= cons | ...

Statements S ::= ε | x = E | S; S

There are no types in the syntax of the language!


A Simple Type System

Types τ ::= ι          base types (int, bool, ...)
          | t          type variables
          | τ1 -> τ2   function types

Type Environments TE ::= Identifiers → Types


Simple Type Inference Rules

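Typing: TE ├ e : τ

(App)   TE ├ e1 : τ -> τ’     TE ├ e2 : τ
        -----------------------------------
        TE ├ (e1 e2) : τ’

(Abs)   TE + {x : τ} ├ e : τ’
        -----------------------
        TE ├ λx.e : τ -> τ’

(Var)   (x : τ) ∈ TE
        -------------
        TE ├ x : τ

(Const) typeof(c) = τ
        --------------
        TE ├ c : τ

(Let)   TE+{x : τ} ├ e1 : τ     TE+{x : τ} ├ e2 : τ’
        ---------------------------------------------
        TE ├ (let x = e1 in e2) : τ’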


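Simple Type Inference Rules (cont)

(Cond)  TE ├ eB : bool     TE ├ e1 : τ     TE ├ e2 : τ
        -----------------------------------------------
        TE ├ Cond(eB, e1, e2) : τ

(PF)    TE ├ e1 : int     TE ├ e2 : int
        --------------------------------
        TE ├ (e1 + e2) : int

        one such rule for each PF

(CN)    TE ├ e1 : τ1     TE ├ e2 : τ2
        ------------------------------
        TE ├ CN(e1, e2) : CN(τ1, τ2)

(Proj)  TE ├ e : CN(τ1, τ2)
        --------------------
        TE ├ Prj1(e) : τ1

        one set of such rules for each data structure type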


Simple Inference Algorithm

W(TE, e) returns (S, τ) such that S(TE) ├ e : τ

The type environment TE records the most general type of each identifier while the substitution S records the changes in the type variables
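The substitution bookkeeping described above can be sketched in Python. This is only an illustration under assumed names: the classes TVar and TCon and the helpers fn, apply, apply_env and compose are not from the lecture, and a substitution is simply modelled as a dict from type-variable names to types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str            # a type variable such as u1

@dataclass(frozen=True)
class TCon:
    name: str            # "Int", "Bool", or "->" for function types
    args: tuple = ()     # argument types; empty for base types

def fn(a, b):
    """Build the function type a -> b."""
    return TCon("->", (a, b))

def apply(s, t):
    """Apply substitution s (dict: variable name -> type) to type t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    return TCon(t.name, tuple(apply(s, a) for a in t.args))

def apply_env(s, te):
    """S(TE): apply s to every type recorded in the environment."""
    return {x: apply(s, t) for x, t in te.items()}

def compose(s2, s1):
    """The composition written 'S2 S1': apply s1 first, then s2."""
    out = {v: apply(s2, t) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)
    return out

INT = TCon("Int")
s1 = {"u1": fn(INT, TVar("u3"))}   # [Int -> u3 / u1]
s2 = {"u3": INT}                   # [Int / u3]
s = compose(s2, s1)
print(apply(s, TVar("u1")))        # the TCon value representing Int -> Int
```

Composing right to left mirrors the way the algorithm writes S3 S2 S1: the earliest substitution is applied first.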

Def W(TE, e) =
  Case e of
    x = ...
    λx.e = ...
    (e1 e2) = ...
    let x = e1 in e2 = ...


Simple Inference Algorithm (cont-1)

Def W(TE, e) =
  Case e of
    c = ({}, Typeof(c))

    x = if (x ∉ Dom(TE)) then Fail
        else let τ = TE(x);
             in ({}, τ)

    λx.e = let (S1, τ1) = W(TE + { x : u }, e);
           in (S1, S1(u) -> τ1)

    (e1 e2) = let (S1, τ1) = W(TE, e1);
                  (S2, τ2) = W(S1(TE), e2);
                  S3 = Unify(S2(τ1), τ2 -> u);
              in (S3 S2 S1, S3(u))

    let x = e1 in e2 =
              let (S1, τ1) = W(TE + {x : u}, e1);
                  S2 = Unify(S1(u), τ1);
                  (S3, τ2) = W(S2 S1(TE) + {x : S2(τ1)}, e2);
              in (S3 S2 S1, τ2)

u’s represent new type variables


Simple Inference Algorithm (cont-2)

    Cond(eB, e1, e2) =
              let (SB, τB) = W(TE, eB);
                  (S1, τ1) = W(SB(TE), e1);
                  (S2, τ2) = W(S1 SB(TE), e2);
                  S3 = Unify(τB, bool);
                  S4 = Unify(τ1, τ2);
              in (S4 S3 S2 S1 SB, S4 S3 τ2)

    (e1 + e2) =
              let (S1, τ1) = W(TE, e1);
                  (S2, τ2) = W(S1(TE), e2);
                  S3 = Unify(τ1, int);
                  S4 = Unify(τ2, int);
              in (S4 S3 S2 S1, int)

    CN(e1, e2) =
              let (S1, τ1) = W(TE, e1);
                  (S2, τ2) = W(S1(TE), e2);
              in (S2 S1, CN(τ1, τ2))

    Prj1(e) =
              let (S, τ) = W(TE, e);
                  S1 = Unify(τ, CN(u1, u2));
              in (S1 S, S1(u1))

These should probably be type constants


Type Inference : Example

let f = λn.cond ((eq0 n), 1, n*(f (pred n))) in f

A is the whole let expression; B is the body of λn, i.e., cond ((eq0 n), 1, n*(f (pred n)))

W(∅, A) = ( [ ] , Int -> Int )

  W({f : u1}, λn.B) = ( [Int -> Int / u1] , Int -> Int )

    W({f : u1, n : u2}, B) = ( [Int -> Int / u1, Int / u2] , Int )

      W({f : u1, n : u2}, (eq0 n)) = ( [Int / u2] , Bool )
        Unify(u2 , Int) = [ Int / u2 ]

      W({f : u1, n : Int}, 1) = ( [ ] , Int )

      W({f : u1, n : Int}, n*(f (pred n))) = ( [Int -> Int / u1] , Int )
        W({f : u1, n : Int}, n) = ( [ ] , Int )
        W({f : u1, n : Int}, f (pred n)) = ( [Int -> u3 / u1] , u3 )
          Unify(u1 , Int -> u3) = [ Int -> u3 / u1 ]
        Unify(Int -> Int -> Int , Int -> u3 -> Int) = [ Int / u3 ]

  W({f : Int -> Int}, f) = ( [ ] , Int -> Int )
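The slides use Unify without defining it. The following Python sketch shows one common way to write first-order unification with an occurs check, and replays the Unify calls from the example. It is an assumption-laden illustration, not the lecture's definition: the names TVar, TCon, fn, apply, occurs and unify are hypothetical, and substitutions are plain dicts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str            # "Int", "Bool", or "->" for function types
    args: tuple = ()

def fn(a, b):
    return TCon("->", (a, b))

def apply(s, t):
    """Apply substitution s (dict: variable name -> type) to type t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    return TCon(t.name, tuple(apply(s, a) for a in t.args))

def occurs(name, t):
    """True if the variable called name appears anywhere in t."""
    if isinstance(t, TVar):
        return t.name == name
    return any(occurs(name, a) for a in t.args)

def unify(t1, t2):
    """Return a substitution s with apply(s, t1) == apply(s, t2), or raise."""
    if isinstance(t1, TVar):
        if t1 == t2:
            return {}
        if occurs(t1.name, t2):
            raise TypeError(f"occurs check failed for {t1.name}")
        return {t1.name: t2}
    if isinstance(t2, TVar):
        return unify(t2, t1)
    if t1.name != t2.name or len(t1.args) != len(t2.args):
        raise TypeError(f"cannot unify {t1.name} with {t2.name}")
    s = {}
    for a, b in zip(t1.args, t2.args):
        s1 = unify(apply(s, a), apply(s, b))
        # Compose: update the bindings found so far, then add the new ones.
        s = {**{v: apply(s1, t) for v, t in s.items()}, **s1}
    return s

INT, BOOL = TCon("Int"), TCon("Bool")
u1, u2, u3 = TVar("u1"), TVar("u2"), TVar("u3")

step1 = unify(u2, INT)                                     # [Int / u2]
step2 = unify(u1, fn(INT, u3))                             # [Int -> u3 / u1]
step3 = unify(fn(INT, fn(INT, INT)), fn(INT, fn(u3, INT))) # [Int / u3]
```

Unifying Int with Bool raises TypeError in this sketch, which corresponds to the algorithm reporting that the expression is ill typed.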


Soundness

• The proposed type system is said to be sound if, whenever e : τ, e indeed evaluates to a value in τ

• A method of proving soundness:

– The semantics of the language is defined in terms of a value space that has integer values, Boolean values etc. as subspaces.

– Any expression that may evaluate to a special value “wrong” will not type check.

– There is no type expression that denotes the subspace “wrong”.

Such proofs tend to be difficult ...


Some observations

• A type system restricts the class of programs that are considered “legal”

• It is possible that a term in the untyped λ-calculus does not cause any errors and yet is not typeable in a particular type system

let id = λx. x in

... (id True) ... (id 1) ...

This term is not typeable in the simple type system we have discussed so far.


Polymorphic Types

let id = λx. x in

... (id True) ... (id 1) ...

Constraints:

  id :: t1 --> t1
  id :: Int --> t2
  id :: Bool --> t3

Does not unify!!

Solution: Generalize the type variable

  id :: ∀t1. t1 --> t1

Different uses of a generalized type variable may be instantiated differently

  id1 : Int --> Int
  id2 : Bool --> Bool

When can we generalize?


now...

The Hindley-Milner Type System


A mini Language to study Hindley-Milner Types

• There are no types in the syntax of the language!

• The type of each subexpression is derived by the Hindley-Milner type inference algorithm.

Expressions E ::= c                    constant
                | x                    variable
                | λx. E                abstraction
                | (E1 E2)              application
                | let x = E1 in E2     let-block


A Formal Type System

Types τ ::= ι          base types
          | t          type variables
          | τ1 -> τ2   function types

Type Schemes σ ::= τ | ∀t.σ

Type Environments TE ::= Identifiers → Type Schemes

Note, all the ∀’s occur in the beginning of a type scheme, i.e., a type τ cannot contain a type scheme σ


Instantiations

• Type scheme σ can be instantiated into a type τ’ by substituting types for the bound variables of σ, i.e.,

σ = ∀t1...tn.τ

τ’ = S τ for some S s.t. Dom(S) ⊆ BV(σ)

- τ’ is said to be an instance of σ (σ > τ’)

- τ’ is said to be a generic instance of σ when S maps variables to new variables.

Example: σ = ∀t1. t1 -> t2

t3 -> t2 is a generic instance of σ
Int -> t2 is a non generic instance of σ
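Instantiation can be sketched in Python as follows. The representation is an assumption made for illustration: Forall, TVar, TCon, fn, apply, fresh and instantiate are hypothetical names, and fresh variables are numbered u1, u2, ... by a module-level counter.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str            # "Int", "Bool", or "->" for function types
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    bound: tuple         # names of the quantified variables t1...tn
    body: object         # the type under the quantifiers

def fn(a, b):
    return TCon("->", (a, b))

def apply(s, t):
    """Apply substitution s (dict: variable name -> type) to type t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    return TCon(t.name, tuple(apply(s, a) for a in t.args))

_counter = itertools.count(1)

def fresh():
    """Return a type variable not used before (u1, u2, ...)."""
    return TVar(f"u{next(_counter)}")

def instantiate(sigma, s=None):
    """Instantiate sigma with s; with no s, build a generic instance."""
    if s is None:
        s = {t: fresh() for t in sigma.bound}
    if not set(s) <= set(sigma.bound):
        raise ValueError("Dom(S) must be contained in BV(sigma)")
    return apply(s, sigma.body)

sigma = Forall(("t1",), fn(TVar("t1"), TVar("t2")))     # ∀t1. t1 -> t2
generic = instantiate(sigma)                            # u1 -> t2
non_generic = instantiate(sigma, {"t1": TCon("Int")})   # Int -> t2
```

The free variable t2 is untouched in both instances because only bound variables may be in the domain of S.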


Generalization aka Closing

• Generalization introduces polymorphism

• Quantify type variables that are free in τ but not free in the type environment (TE)

• Captures the notion of new type variables of τ

Gen(TE, τ) = ∀t1...tn.τ where { t1...tn } = FV(τ) - FV(TE)
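The definition of Gen translates almost directly into code. The Python sketch below is illustrative only: the data types and the helper names ftv, ftv_scheme, ftv_env and gen are assumptions, and environments are dicts from identifiers to type schemes (a plain type is a scheme with no bound variables).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str            # "Int", "Bool", or "->" for function types
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    bound: tuple         # names of the quantified variables
    body: object         # the type under the quantifiers

def fn(a, b):
    return TCon("->", (a, b))

def ftv(t):
    """FV of a type: the set of variable names occurring in it."""
    if isinstance(t, TVar):
        return {t.name}
    return set().union(*(ftv(a) for a in t.args))

def ftv_scheme(sigma):
    """FV of a type scheme: variables of the body that are not bound."""
    return ftv(sigma.body) - set(sigma.bound)

def ftv_env(te):
    """FV(TE): union over every scheme recorded in the environment."""
    return set().union(*(ftv_scheme(sc) for sc in te.values()))

def gen(te, tau):
    """Gen(TE, tau): quantify the variables in FV(tau) - FV(TE)."""
    return Forall(tuple(sorted(ftv(tau) - ftv_env(te))), tau)

u1, u3 = TVar("u1"), TVar("u3")
te = {"x": Forall((), u1)}          # {x : u1}
sigma = gen(te, fn(u3, u1))         # ∀u3. u3 -> u1
```

Here u1 is not quantified because it is still free in the environment, while u3 is new and can be generalized.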


HM Type Inference Rules

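Typing: TE ├ e : τ

(App)   TE ├ e1 : τ -> τ’     TE ├ e2 : τ
        -----------------------------------
        TE ├ (e1 e2) : τ’

(Abs)   TE + {x : τ} ├ e : τ’
        -----------------------
        TE ├ λx.e : τ -> τ’

(Var)   (x : σ) ∈ TE     σ > τ
        -----------------------
        TE ├ x : τ

(Const) typeof(c) > τ
        --------------
        TE ├ c : τ

(Let)   TE+{x : τ} ├ e1 : τ     TE+{x : Gen(TE, τ)} ├ e2 : τ’
        ------------------------------------------------------
        TE ├ (let x = e1 in e2) : τ’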


HM Inference Algorithm

Def W(TE, e) =
  Case e of
    c = ({}, Typeof(c))

    x = if (x ∉ Dom(TE)) then Fail
        else let ∀t1...tn.τ = TE(x);
             in ( { }, [ui / ti] τ )

    λx.e = let (S1, τ1) = W(TE + { x : u }, e);
           in (S1, S1(u) -> τ1)

    (e1 e2) = let (S1, τ1) = W(TE, e1);
                  (S2, τ2) = W(S1(TE), e2);
                  S3 = Unify(S2(τ1), τ2 -> u);
              in (S3 S2 S1, S3(u))

    let x = e1 in e2 =
              let (S1, τ1) = W(TE + {x : u}, e1);
                  S2 = Unify(S1(u), τ1);
                  σ = Gen(S2 S1(TE), S2(τ1));
                  (S3, τ2) = W(S2 S1(TE) + {x : σ}, e2);
              in (S3 S2 S1, τ2)

u’s represent new type variables
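The five cases above can be transcribed into a small Python sketch. It is not the lecture's implementation: expressions are assumed to be tagged tuples ("const", v), ("var", x), ("lam", x, e), ("app", e1, e2), ("let", x, e1, e2), constants are limited to Python ints and bools, and all helper names are hypothetical.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:                 # type variable, e.g. u1
    name: str

@dataclass(frozen=True)
class TCon:                 # base type, "->" function type, or data type
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:               # type scheme: bound variables and a body type
    bound: tuple
    body: object

INT, BOOL = TCon("Int"), TCon("Bool")
_ids = itertools.count(1)

def fn(a, b):
    return TCon("->", (a, b))

def fresh():
    return TVar(f"u{next(_ids)}")

def apply(s, t):
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, Forall):   # bound variables are left alone
        inner = {v: u for v, u in s.items() if v not in t.bound}
        return Forall(t.bound, apply(inner, t.body))
    return TCon(t.name, tuple(apply(s, a) for a in t.args))

def apply_env(s, te):
    return {x: apply(s, sc) for x, sc in te.items()}

def compose(s2, s1):            # "S2 S1": s1 first, then s2
    return {**s2, **{v: apply(s2, t) for v, t in s1.items()}}

def ftv(t):
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, Forall):
        return ftv(t.body) - set(t.bound)
    return set().union(*(ftv(a) for a in t.args))

def unify(t1, t2):
    if isinstance(t1, TVar):
        if t1 == t2:
            return {}
        if t1.name in ftv(t2):
            raise TypeError("occurs check failed")
        return {t1.name: t2}
    if isinstance(t2, TVar):
        return unify(t2, t1)
    if t1.name != t2.name or len(t1.args) != len(t2.args):
        raise TypeError(f"cannot unify {t1.name} with {t2.name}")
    s = {}
    for a, b in zip(t1.args, t2.args):
        s = compose(unify(apply(s, a), apply(s, b)), s)
    return s

def gen(te, tau):
    env_fv = set().union(*(ftv(sc) for sc in te.values()))
    return Forall(tuple(sorted(ftv(tau) - env_fv)), tau)

def w(te, e):
    tag, *rest = e
    if tag == "const":
        return {}, (BOOL if isinstance(rest[0], bool) else INT)
    if tag == "var":
        if rest[0] not in te:
            raise TypeError(f"unbound identifier {rest[0]}")
        sc = te[rest[0]]
        return {}, apply({t: fresh() for t in sc.bound}, sc.body)
    u = fresh()                 # new type variable for the remaining cases
    if tag == "lam":
        x, body = rest
        s1, t1 = w({**te, x: Forall((), u)}, body)
        return s1, fn(apply(s1, u), t1)
    if tag == "app":
        e1, e2 = rest
        s1, t1 = w(te, e1)
        s2, t2 = w(apply_env(s1, te), e2)
        s3 = unify(apply(s2, t1), fn(t2, u))
        return compose(s3, compose(s2, s1)), apply(s3, u)
    x, e1, e2 = rest            # tag == "let"
    s1, t1 = w({**te, x: Forall((), u)}, e1)
    s2 = unify(apply(s1, u), t1)
    te2 = apply_env(compose(s2, s1), te)
    sigma = gen(te2, apply(s2, t1))
    s3, t2 = w({**te2, x: sigma}, e2)
    return compose(s3, compose(s2, s1)), t2

ident = ("lam", "x", ("var", "x"))
expr = ("let", "id", ident, ("app", ("var", "id"), ("const", True)))
print(w({}, expr)[1])           # the inferred type, Bool here
```

The only difference from the simple algorithm is in the var and let cases: let generalizes the type of the bound identifier, and each use of the identifier instantiates the scheme with new variables.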


Hindley-Milner: Example

λx. let f = λy.x in (f 1, f True)

A is the whole expression; B is its body, let f = λy.x in (f 1, f True)

W(∅, A) = ( [ ] , u1 -> (u1,u1) )

  W({x : u1}, B) = ( [ ] , (u1,u1) )

    W({x : u1, f : u2}, λy.x) = ( [ ] , u3 -> u1 )
      W({x : u1, f : u2, y : u3}, x) = ( [ ] , u1 )

    Unify(u2 , u3 -> u1) = [ (u3 -> u1) / u2 ]

    Gen({x : u1}, u3 -> u1) = ∀u3.u3 -> u1

    TE = {x : u1, f : ∀u3.u3 -> u1}

    W(TE, (f 1)) = ( [ ] , u1 )
      W(TE, f) = ( [ ] , u4 -> u1 )
      W(TE, 1) = ( [ ] , Int )
      Unify(u4 -> u1 , Int -> u5) = [ Int / u4 , u1 / u5 ]

    ...


Important Observations

• Do not generalize over type variables used elsewhere

• Let is the only way of defining polymorphic constructs

• Generalize the types of let-bound identifiers only after processing their definitions


Properties of HM Type Inference

• It is sound with respect to the type system. An inferred type is verifiable.

• It generates most general types of expressions. Any verifiable type is inferred.

• Complexity: PSPACE-Hard, DEXPTIME-Complete (nested let blocks)


Extensions

• Type Declarations: Sanity check; can relax restrictions

• Incremental Type checking: The whole program is not given at the same time; sound inferencing when types of some functions are not known

• Typing references to mutable objects: Hindley-Milner system is unsound for a language with refs (mutable locations)

• Overloading Resolution


HM Limitations: λ-bound vs Let-bound Variables

Only let-bound identifiers can be instantiated differently.

let twice f x = f (f x)
in twice twice succ 4

vs.

let twice f x = f (f x)
    foo g = (g g succ) 4
in foo twice

foo is not type correct !

Generic vs. Non-generic type variables


Puzzle: Another set of Inference rules

(Gen)   TE ├ e : σ     t ∉ FV(TE)
        ---------------------------
        TE ├ e : ∀t.σ

(Spec)  TE ├ e : ∀t.σ
        --------------------
        TE ├ e : [t’/t] σ

(Var)   (x : σ) ∈ TE
        -------------
        TE ├ x : σ

(Let)   TE+{x : σ} ├ e1 : σ     TE+{x : σ} ├ e2 : τ’
        ---------------------------------------------
        TE ├ (let x = e1 in e2) : τ’

(App) and (Abs) rules remain unchanged.

Sound but no inference algorithm !


next time --- Overloading
