
Towards a computational interpretation of parametricity

Jean-Philippe Bernardy and Guilhem Moulin

Chalmers University of Technology and University of Gothenburg
{bernardy,mouling}@chalmers.se

Abstract. Reynolds' abstraction theorem has recently been extended to lambda-calculi with dependent types. In this paper, we show how this theorem can be internalized. More precisely, we describe an extension of the Calculus of Constructions with a special parametricity rule (with computational content), and prove fundamental properties such as the Church-Rosser property and strong normalization. The instances of the abstraction theorem can be both expressed and proved in the calculus itself.

1 Introduction

In his seminal paper, Reynolds [24] showed that types can be interpreted as predicates, in such a way that all inhabitants of a given type satisfy its interpretation as a predicate. A simple example is that if f has type ∀a : type. a → a (the type of the polymorphic identity), then the following proposition holds

\[ \forall a : \mathsf{type}.\ \forall \dot{a} : a \to \mathsf{type}.\ \forall x : a.\ \dot{a}\,x \to \dot{a}\,(f\,a\,x) \]

(which implies that f must return exactly its argument).

The above result, abstraction, has proved useful for reasoning about functional programs [29, 11, 27]¹. It also has deep theoretical implications: for example, the correctness of the so-called Church encoding of data types [7, 30] depends on it.

Study of parametricity is typically semantic, including the seminal work of Reynolds. There, the concern is to construct a model that captures the polymorphic character of a λ-calculus. Mairson [14] pioneered a different angle of study, of a more syntactic nature: for each concrete term, a proof that it satisfies the relational interpretation of its type is constructed. That style has since been used by various authors, including Abadi et al. [1], Plotkin and Abadi [22] and Wadler [30]. Bernardy et al. [6] have also shown how terms, types, and their relational interpretation as proofs and propositions can all be expressed (but not proved) in a single calculus. The calculi where this is possible must however be rich enough: in particular they must support dependent types. Many systems turn out to be suitable: the Calculus of Constructions [9] and Martin-Löf's Intuitionistic Type Theory [15] both satisfy the requirements.

¹ Even though the abstraction theorem as stated by Reynolds is valid only for idealized languages (which, for example, are normalizing).


Still, even though we know that all terms satisfy the parametricity condition, and each of these conditions can be expressed in the same system as the terms they concern, the very fact that any given term satisfies the relational interpretation of its type is itself not provable in the system. For example, the parametricity condition arising from the type ∀a : type. a → a, namely

\[ \forall f : (\forall a : \mathsf{type}.\ a \to a).\ \forall a : \mathsf{type}.\ \forall \dot{a} : a \to \mathsf{type}.\ \forall x : a.\ \dot{a}\,x \to \dot{a}\,(f\,a\,x), \]

is not provable in type theory.

In proof assistants based on type theory, one would like to rely on parametricity conditions to prove certain theorems. Indeed, the correctness of numerous functional programming techniques relies on parametricity. Examples include program transformation [11, 13], semantic program inversion [27] and generic programming [28]. Proof assistants based on type theory can already describe these techniques, and it would be useful to take advantage of parametricity to mechanize their proofs.

We are not the first to recognize this need: it is for example a recurring topic in the field of mechanized metatheory, where precise encoding of variable bindings often makes use of polymorphism. For example, Pouillard [23] describes the following representation for terms, using the Agda [19] proof assistant:

    data Term (V : Set) : Set where
      var : V → Term V
      app : Term V → Term V → Term V
      abs : (Maybe V → Term V) → Term V

One can argue that terms defined as above must be well-formed as follows: because V is an abstract type variable, the only way to obtain a type-correct argument to the var constructor is from the higher-order binding in the abs constructor. Pouillard formalizes the above argument within Agda, using logical relations as defined by Bernardy et al. [6]. Only the parametricity axiom is missing to establish that all terms are well-formed. Chlipala [8] and Atkey et al. [3, p. 41] encounter the same type of situation.

In this paper, we aim to tackle the lack of support for parametricity in proof assistants. Technically, we propose to extend the Generalized Calculus of Constructions (CCω) to make all the parametricity propositions provable internally. The aim is to pave the way for native support of parametricity in tools based on dependent types, such as Agda or Coq [26]. The challenge is not merely to postulate the axiom and check its consistency with the rest of the system: because we want to retain the constructive character of CCω, we must provide a computation rule for parametricity.

Our technical contributions are as follows:

– We describe a dependently-typed λ-calculus with internalized parametricity. The calculus is summarized in Definitions 1, 2, and 3, and motivated in Section 2. In particular, we show that the calculus internalizes parametricity.

– We prove that it satisfies the properties of a well-behaved λ-calculus with types. In particular, we prove the Church-Rosser property, subject reduction, and strong normalization.

2 Design, step by step

In this section we describe and motivate our design step by step, starting from the generalized calculus of constructions (CCω). The full and final definition of the system we arrive at is given in Definitions 1, 2, and 3.

2.1 CCω, and our notation

Our starting point is CCω [18], but familiarity with the calculus of constructions [9] is all we assume: we use CCω only because it is technically more convenient to have a type for each sort in the system. The presentation is in the style of pure type systems (PTS). Readers not familiar with PTSs are referred to Barendregt [4], but we give a brief reminder in the following paragraphs, and introduce our notation along the way.

The system CCω features an infinite hierarchy of sorts, written here $S = \{\ast, \square_0, \square_1, \ldots\}$. The first sort, written $\ast$, is called the sort of propositions. Each sort inhabits its successor in the above list. Propositions are impredicative.

The various forms of quantification (and the corresponding abstraction and application) are syntactically unified, and in general one needs to inspect sorts to identify which form is meant. In this paper, we sometimes need special treatment of quantification over propositions (types of sort $\ast$). To support this, we tag variables with their sort, and applications with the sort of their argument. The concrete syntax is given in Definition 1. Sort annotations with $\ast$ are sometimes notationally inconvenient; in that case we use the following special syntax. Concretely, we write $A \mathbin{\bullet\!\to} B \triangleq \forall x^{\ast} : A.\ B$ when $x$ does not appear in $B$, and $F \bullet A \triangleq F_{\ast}\,A$.

We also restrict our raw syntax to well-sorted terms: $(\lambda x^{\square_0}.\ t)_{\ast}\,u$ and $x^{\square_0}[x^{\ast} \mapsto u]$ are for instance ill-sorted, but $(\lambda x^{\square_0}.\ t)_{\square_0}\,u$ and $x^{\ast}[x^{\ast} \mapsto u]$ are not. We omit the annotations when they do not play any role or can be inferred, e.g., for well-typed terms.

The full syntax and typing rules are given in the following definitions. The remainder of the section explains our special construction for parametricity ($\lfloor\lfloor\cdot\rfloor\rfloor$) and its typing rule, as well as the technical device of relational substitutions.


Definition 1 (Syntax).

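\[
\begin{array}{llcll}
\mathit{Var} & \ni x, y, z, \dot{x}, \dot{y}, \dot{z} & & & \\
\mathit{Sort} & \ni s & ::= & \ast \mid \square_0 \mid \square_1 \mid \ldots & \\
\mathit{Term} & \ni a, b, f, t, u, A, B, C, \ldots & ::= & s & \text{sort} \\
 & & \mid & x^{s} & \text{variable} \\
 & & \mid & \lfloor\lfloor x^{\square_0} \rfloor\rfloor & \text{parametric witness} \\
 & & \mid & A_{s}\,B & \text{application} \\
 & & \mid & \lambda x^{s} : A.\ B & \text{abstraction} \\
 & & \mid & \forall x^{s} : A.\ B & \text{function space} \\
\mathit{Context} & \ni \Gamma, \Delta & ::= & - & \text{empty context} \\
 & & \mid & \Gamma, x^{s} : A & \text{context extension} \\
\text{Relational substitution} & \ni \xi & ::= & \emptyset & \text{empty} \\
 & & \mid & \xi, x^{\square_0} \mapsto \dot{x} & \text{extension}
\end{array}
\]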

Definition 2 (Typing rules).

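\[ \dfrac{(s_1, s_2) \in \mathcal{A}}{\vdash s_1 : s_2}\ \textsc{Axiom} \]

\[ \dfrac{\Gamma \vdash A : B \qquad \Gamma \vdash C : s}{\Gamma, x^{s} : C \vdash A : B}\ \textsc{Weakening} \]

\[ \dfrac{\Gamma \vdash F : (\forall x^{s} : A.\ B) \qquad \Gamma \vdash a : A}{\Gamma \vdash F_{s}\,a : B[x^{s} \mapsto a]}\ \textsc{Application} \]

\[ \dfrac{\Gamma, x^{s} : A \vdash b : B \qquad \Gamma \vdash (\forall x^{s} : A.\ B) : s'}{\Gamma \vdash (\lambda x^{s} : A.\ b) : (\forall x^{s} : A.\ B)}\ \textsc{Abstraction} \]

\[ \dfrac{\Gamma \vdash A : s_1 \qquad \Gamma, x^{s_1} : A \vdash B : s_2}{\Gamma \vdash (\forall x^{s_1} : A.\ B) : s_3}\ \textsc{Product}\quad (s_1, s_2, s_3) \in \mathcal{R} \]

\[ \dfrac{\Gamma \vdash A : B \qquad \Gamma \vdash B' : s \qquad B =_{\beta} B'}{\Gamma \vdash A : B'}\ \textsc{Conversion} \]

\[ \dfrac{\Gamma \vdash A : s}{\Gamma, x^{s} : A \vdash x : A}\ \textsc{Start} \]

\[ \dfrac{\Gamma \vdash A : \square_0}{\Gamma, x^{\square_0} : A \vdash \lfloor\lfloor x \rfloor\rfloor : x \in \llbracket A \rrbracket_{\emptyset}}\ \textsc{Param} \]

with

\[ \mathcal{A} = \{(\ast, \square_0),\ (\square_i, \square_{1+i}) \mid i \in \mathbb{N}\} \]

\[ \mathcal{R} = \{(\ast, \ast, \ast),\ (\square_i, \ast, \ast),\ (\ast, \square_i, \square_i),\ (\square_i, \square_j, \square_{i \sqcup j}) \mid i, j \in \mathbb{N}\} \]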
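The sets of axioms and product rules can be read as two small functions on sorts. The following is a minimal sketch rather than part of the calculus: the encoding of sorts (the string "*" for the sort of propositions, the integer i for the i-th box sort) and the names `axiom` and `product_sort` are assumptions made for illustration.

```python
STAR = "*"  # the sort of propositions; box_i is encoded as the integer i


def axiom(s):
    """Return s2 such that (s, s2) is an axiom: * : box_0 and box_i : box_(i+1)."""
    return 0 if s == STAR else s + 1


def product_sort(s1, s2):
    """Return s3 such that (s1, s2, s3) is a product rule."""
    if s2 == STAR:
        return STAR      # (*, *, *) and (box_i, *, *): propositions are impredicative
    if s1 == STAR:
        return s2        # (*, box_i, box_i)
    return max(s1, s2)   # (box_i, box_j, box_max(i, j))


print(axiom(STAR), axiom(2))   # 0 3
print(product_sort(5, STAR))   # *
print(product_sort(1, 3))      # 3
```

Both relations are functional: a sort has exactly one type, and the sorts of the domain and codomain determine the sort of a product.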

2.2 Logical relations in CCω

In this section, we define a relational interpretation of terms and types of CCω. The material of this section is directly inspired by results of Bernardy et al. [6] and Bernardy and Lasson [5], which describe how to derive logical relations for any PTS [4]. In the following sections, we will show how it can be amended to suit our purposes.

In any PTS, types and terms can be interpreted as relations and proofs that the terms satisfy the relations. Each type can be interpreted as a predicate that its inhabitants satisfy, and each term can be turned into a proof that it satisfies the predicate of its type. Usual presentations of parametricity use binary relations, but for simplicity of notation we present here a unary version. The generalization to arbitrary arity is straightforward, and we refer the reader to [5] for details. Likewise, extending the theory to support inductive types is straightforward and detailed by Bernardy et al. [6].

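In the following we define what it means for a program $C$ to satisfy the predicate generated by a type $T$ (written $C \in \llbracket T \rrbracket$), and the translation from a program $C$ of type $T$ to a proof $\llbracket C \rrbracket$ that $C$ satisfies the predicate. An invariant of the translation is that whenever $x$ is free in $T$, there is another free variable $\dot{x}$ in $\llbracket T \rrbracket$, which witnesses that $x$ satisfies the parametricity condition of $T$ ($\dot{x} : x \in \llbracket T \rrbracket$). This means that the translation must extend contexts, as follows:

\begin{align*}
\llbracket - \rrbracket &= - \\
\llbracket \Gamma, x : A \rrbracket &= \llbracket \Gamma \rrbracket,\ x : A,\ \dot{x} : x \in \llbracket A \rrbracket
\end{align*}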

In fact, we will need to be explicit about the renaming relation existing between variables and their witnesses of parametricity, to ensure that the translations are well scoped². (From now on we call such renamings relational substitutions.) Hence we subscript the translation with it:

\begin{align*}
\llbracket - \rrbracket_{\xi} &= - \\
\llbracket \Gamma, x : A \rrbracket_{\xi, x \mapsto \dot{x}} &= \llbracket \Gamma \rrbracket_{\xi},\ x : A,\ \dot{x} : x \in \llbracket A \rrbracket_{\xi}
\end{align*}

Until the next section, we make the assumption that a relational substitution contains information about all free variables in the term or context it is applied to.

The relational interpretation of types and its underlying intuition follow. Recall that, to interpret a type $T$, it suffices to give a proposition $C \in \llbracket T \rrbracket$ stating that $C$ satisfies the interpretation of the type.

– Because types in a PTS are abstract (there is no pattern matching on types), any predicate over $C$ can be used to witness that $C$ satisfies the relational interpretation of a sort $s$.

\[ C \in \llbracket s \rrbracket_{\xi} = C \to s \]

– If the type is a product ($\forall x : A.\ B$), then $C$ is a function, and it satisfies the relational interpretation of its type iff it maps related inputs to related outputs.

\[ C \in \llbracket \forall x : A.\ B \rrbracket_{\xi} = \forall x : A.\ \forall \dot{x} : x \in \llbracket A \rrbracket_{\xi}.\ (C\,x) \in \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \]

– For any other syntactic form for a type $T$, which may be a variable or an application in a PTS, $T$ is interpreted as a predicate ($\llbracket T \rrbracket : T \to s$), and to check that $C$ satisfies it, one can use application.

\[ C \in \llbracket T \rrbracket_{\xi} = \llbracket T \rrbracket_{\xi}\ C \]

Finally we can give the translation from terms to proofs.

² If one were to always interpret $x$ by $\dot{x}$, the term $\llbracket\llbracket \lambda x.\ a \rrbracket\rrbracket$ would bind $\dot{x}$ twice.


– The translation of a variable is done by looking up the corresponding parametric witness in the context.

\[ \llbracket x \rrbracket_{\xi} = \xi(x) \]

– The case for abstraction adds a witness that the input satisfies the relational interpretation of its type, and returns the relational interpretation of the body, mirroring the interpretation of product:

\[ \llbracket \lambda x : A.\ B \rrbracket_{\xi} = \lambda x : A.\ \lambda \dot{x} : x \in \llbracket A \rrbracket_{\xi}.\ \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \]

– The application follows the same pattern: the function is passed a witness that the argument satisfies the interpretation of its type.

\[ \llbracket A\,B \rrbracket_{\xi} = \llbracket A \rrbracket_{\xi}\ B\ \llbracket B \rrbracket_{\xi} \]

– If the term has another syntactic form, then it is a type ($T$), thus we can use λ-abstraction to create a predicate and check that the abstracted variable $z$ satisfies the relational interpretation of the type in the body ($z \in \llbracket T \rrbracket_{\xi}$).

\[ \llbracket T \rrbracket_{\xi} = \lambda z : T.\ z \in \llbracket T \rrbracket_{\xi} \]

At this point the definition might appear circular; but in fact the form $\cdot \in \llbracket T \rrbracket$ invokes $\llbracket T \rrbracket$ only when $T$ is a variable or an application, which is processed structurally by $\llbracket\cdot\rrbracket$.
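As a sanity check, the clauses above can be unfolded by hand on the type of the polymorphic identity from the introduction. The derivation below is a sketch under two assumptions: the sort of $a$ is left as an arbitrary $s$, and $a \to a$ is read as the product $\forall x : a.\ a$.

```latex
\begin{align*}
f \in \llbracket \forall a : s.\ \forall x : a.\ a \rrbracket_{\emptyset}
 &= \forall a : s.\ \forall \dot{a} : a \in \llbracket s \rrbracket_{\emptyset}.\
    (f\,a) \in \llbracket \forall x : a.\ a \rrbracket_{a \mapsto \dot{a}} \\
 &= \forall a : s.\ \forall \dot{a} : a \to s.\ \forall x : a.\
    \forall \dot{x} : x \in \llbracket a \rrbracket_{a \mapsto \dot{a}}.\
    (f\,a\,x) \in \llbracket a \rrbracket_{a \mapsto \dot{a},\, x \mapsto \dot{x}} \\
 &= \forall a : s.\ \forall \dot{a} : a \to s.\ \forall x : a.\
    \forall \dot{x} : \llbracket a \rrbracket_{a \mapsto \dot{a}}\ x.\
    \llbracket a \rrbracket_{a \mapsto \dot{a},\, x \mapsto \dot{x}}\ (f\,a\,x) \\
 &= \forall a : s.\ \forall \dot{a} : a \to s.\ \forall x : a.\
    \forall \dot{x} : \dot{a}\,x.\ \dot{a}\,(f\,a\,x)
\end{align*}
```

Since $\dot{x}$ does not occur in $\dot{a}\,(f\,a\,x)$, the last quantifier is an ordinary arrow, and the result is the proposition stated in the introduction.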

Theorem 1 (Abstraction). If $\Gamma \vdash_{\mathrm{CC}_\omega} A : B : s$, then

\[ \llbracket \Gamma \rrbracket_{\xi} \vdash_{\mathrm{CC}_\omega} \llbracket A \rrbracket_{\xi} : (A \in \llbracket B \rrbracket_{\xi}) : s \]

where $\xi$ maps every variable in $\Gamma$ to a globally fresh variable.

Proof. The theorem is proved by induction on the derivation tree, similarly to [5]. Since we will revise the definition of $\llbracket\cdot\rrbracket$, we omit all details of the proof here.

A direct reading of the above result is as a typing judgement about translated terms: if $A$ has type $B$, then $\llbracket A \rrbracket_{\xi}$ has type $A \in \llbracket B \rrbracket_{\xi}$. However, it can also be understood as an abstraction theorem for CCω: if a program $A$ has type $B$ in $\Gamma$, then $A$ satisfies the relational interpretation of its type ($A \in \llbracket B \rrbracket_{\xi}$). For arity 2, $\llbracket \Gamma \rrbracket_{\xi}$ contains two related environments (and witnesses that they are properly related), and $\llbracket A \rrbracket_{\xi}$ is a proof that the two possible interpretations of $A$ (obtained by picking variables out of each environment in $\llbracket \Gamma \rrbracket_{\xi}$) are related.

2.3 Internalization

Both the source and target of $\llbracket\cdot\rrbracket$ are in the same system. Both the theorems and the proofs are expressible in that system. Therefore we can hope that for every term $A$ of type $B$, the user of the system can get a witness $\llbracket A \rrbracket_{\xi}$ that it is parametric ($A \in \llbracket B \rrbracket_{\xi}$). Even though this hope is fulfilled for closed terms, we run out of luck for open terms, because the context where $\llbracket A \rrbracket_{\xi}$ is meaningful is "bigger" than that where $A$ is: for each free variable $x : A$ in $\Gamma$, we need a variable $\dot{x} : x \in \llbracket A \rrbracket_{\xi}$ in $\llbracket \Gamma \rrbracket_{\xi}$.

However, we know that every closed term is parametric. Therefore, ultimately, we know that for each possible concrete term $a$ that can be substituted for a free variable $x$, it is possible to construct a concrete term $\llbracket a \rrbracket_{\emptyset}$ to substitute for $\dot{x}$. This means that the witness of parametricity for $x$ does not need to be given explicitly if $x$ is bound. Therefore we allow access to such a witness via the syntactic form $\lfloor\lfloor x \rfloor\rfloor$. This intuition justifies the addition of the substitution rule

\[ \lfloor\lfloor x \rfloor\rfloor[x \mapsto a] = \llbracket a \rrbracket_{\emptyset} \]

as well as the following typing rule, which expresses that if $x$ is found in the context, then it is valid to use $\lfloor\lfloor x \rfloor\rfloor$, which is the witness that $x$ satisfies the parametricity condition of its type.

\[ \dfrac{\Gamma \vdash A : s}{\Gamma, x : A \vdash \lfloor\lfloor x \rfloor\rfloor : x \in \llbracket A \rrbracket_{\emptyset}} \]

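Using the above, we can now lift the restriction that relational substitutions must give a mapping for all free variables and define³:

\begin{align*}
\llbracket x \rrbracket_{\xi} &= \xi(x) && \text{if } x \in \xi \\
\llbracket x \rrbracket_{\xi} &= \lfloor\lfloor x \rfloor\rfloor && \text{if } x \notin \xi
\end{align*}

Conversely, contexts are not extended if the variable is not present in the mapping.

\begin{align*}
\llbracket - \rrbracket_{\xi} &= - \\
\llbracket \Gamma, x : A \rrbracket_{\xi, x \mapsto \dot{x}} &= \llbracket \Gamma \rrbracket_{\xi},\ x : A,\ \dot{x} : x \in \llbracket A \rrbracket_{\xi} \\
\llbracket \Gamma, x : A \rrbracket_{\xi} &= \llbracket \Gamma \rrbracket_{\xi},\ x : A && \text{if } x \notin \xi
\end{align*}

(From here on, we may omit the relational substitution argument to $\llbracket A \rrbracket$ to mean $\llbracket A \rrbracket_{\emptyset}$.)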
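The term translation with its fallback to parametric witnesses, together with the new substitution rule, can be prototyped on untyped terms. The sketch below is illustrative only and makes several assumptions: type annotations on binders are dropped, the witness of a variable `x` is named `x_dot` and assumed fresh, bound names are assumed distinct from the free names of substituted terms, and all class and function names (`Var`, `Lam`, `App`, `Param`, `translate`, `subst`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:  # abstraction; the type annotation of the binder is omitted
    var: str
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Param:  # the parametric witness of a variable (double floor brackets)
    var: str


def translate(t, xi):
    """Unary translation of a term; xi maps variables to their witness names."""
    if isinstance(t, Var):
        # xi(x) when x is in xi, the parametric witness of x otherwise
        return Var(xi[t.name]) if t.name in xi else Param(t.name)
    if isinstance(t, Lam):
        w = t.var + "_dot"  # assumed not to clash with any other name
        return Lam(t.var, Lam(w, translate(t.body, {**xi, t.var: w})))
    if isinstance(t, App):
        return App(App(translate(t.fun, xi), t.arg), translate(t.arg, xi))
    raise ValueError("nested witnesses are not handled in this sketch")


def subst(t, x, a):
    """Compute t[x := a]; no renaming is done (see the assumptions above)."""
    if isinstance(t, Var):
        return a if t.name == x else t
    if isinstance(t, Param):
        # substitution rule: the witness of x becomes the translation of a
        return translate(a, {}) if t.var == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, a))
    return App(subst(t.fun, x, a), subst(t.arg, x, a))


identity = Lam("a", Lam("x", Var("x")))
witness = Lam("a", Lam("a_dot", Lam("x", Lam("x_dot", Var("x_dot")))))
print(translate(identity, {}) == witness)          # True
print(subst(Param("f"), "f", identity) == witness)  # True
```

The second check mirrors the substitution rule: replacing `f` by a closed term inside the witness of `f` yields the translation of that term under the empty relational substitution.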

The above construction is the only addition required to obtain full internalized parametricity. That is, assuming parametricity on variables, we have parametricity for all terms, as stated in Theorem 2.

Theorem 2 (Parametricity).

\[ \Gamma \vdash A : B \ \Rightarrow\ \Gamma \vdash \llbracket A \rrbracket : A \in \llbracket B \rrbracket \]

Proof. By induction on the derivation tree. The difference with the proof of the abstraction theorem is essentially that occurrences of Start for relevant variables are transformed to Param. (The definition of $\llbracket\cdot\rrbracket$ will be revised later, as well as this proof.)

³ Careful readers might worry that we forget the substitution in the second case. An informal justification is that if $x$ has no substitute in $\xi$, then the variables of its type do not either. Therefore, types are preserved when doing the substitution.


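The fact that all values are parametric is also captured by the following, which is a theorem (an inhabited type inside the calculus):

\begin{align*}
\mathsf{parametricity} &: \forall A : \mathsf{type}.\ \forall (a : A).\ a \in \llbracket A \rrbracket \\
\mathsf{parametricity} &= \lambda A.\ \lambda a : A.\ \lfloor\lfloor a \rfloor\rfloor
\end{align*}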

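Example 1. We can prove that any function of type $\forall a : \square_0.\ a \to a$ is an identity, as we hinted at in the introduction. The formulation of the theorem and its proof term are as follows.

\begin{align*}
\mathsf{identities} &: \forall f : (\forall a : \square_0.\ a \to a).\ \forall a : \square_0.\ \forall x : a.\ \mathsf{Eq}\ a\ (f\,a\,x)\ x \\
\mathsf{identities} &= \lambda f.\ \lambda a.\ \lambda x.\ \lfloor\lfloor f \rfloor\rfloor\ a\ (\mathsf{Eq}\ a\ x)\ x\ (\mathsf{refl}\ a\ x)
\end{align*}

When identities is used on a concrete identity function, say $i = \lambda a : \square_0.\ \lambda x : a.\ x$, the theorem specializes to reflexivity of equality:

\[ \mathsf{identities}\ i : \forall a : \square_0.\ \forall x : a.\ \mathsf{Eq}\ a\ x\ x \]

and, using our extended definition of substitution, after reduction the proof no longer mentions $\lfloor\lfloor\cdot\rfloor\rfloor$:

\begin{align*}
\mathsf{identities}\ i
 &\to_{\beta} \lambda a.\ \lambda x.\ \lfloor\lfloor f \rfloor\rfloor[f \mapsto \lambda a : \square_0.\ \lambda x : a.\ x]\ a\ (\mathsf{Eq}\ a\ x)\ x\ (\mathsf{refl}\ a\ x) \\
 &= \lambda a.\ \lambda x.\ \llbracket \lambda a : \square_0.\ \lambda x : a.\ x \rrbracket\ a\ (\mathsf{Eq}\ a\ x)\ x\ (\mathsf{refl}\ a\ x) \\
 &= \lambda a.\ \lambda x.\ (\lambda a.\ \lambda \dot{a}.\ \lambda x.\ \lambda \dot{x}.\ \dot{x})\ a\ (\mathsf{Eq}\ a\ x)\ x\ (\mathsf{refl}\ a\ x) \\
 &\to_{\beta} \lambda a.\ \lambda x.\ \mathsf{refl}\ a\ x
\end{align*}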

2.4 Nesting of $\llbracket\cdot\rrbracket$ and computational irrelevance

In this section we describe an issue with subject reduction in the system explained so far, and we tweak the system to fix the issue, by taking advantage of computational irrelevance in CCω.

Nesting $\llbracket\cdot\rrbracket$ and subject reduction. For some closed type $A$, consider the term $\llbracket \mathsf{parametricity}\ A \rrbracket$, which intuitively is a proof that parametricity itself satisfies the abstraction theorem. It is convertible to $\lambda a : A.\ \lambda \dot{a} : a \in \llbracket A \rrbracket.\ \llbracket\llbracket a \rrbracket\rrbracket_{\{a \mapsto \dot{a}\}}$.

So far, we have not specified how to reduce the subterm $\llbracket\llbracket x \rrbracket\rrbracket_{\{x \mapsto \dot{x}\}}$ (where $x$ is a free variable). Indeed, it is actually not possible to substitute a value for $x$ in it, because $x$ also appears as an index in a relational substitution, and only variables can appear there (not arbitrary terms). This means that $\llbracket\llbracket x \rrbracket\rrbracket_{\{x \mapsto \dot{x}\}}$ is not an acceptable normal form: either it must reduce to something else, or it should not be well-typed.

We first explore the possibility of reducing the term. A perhaps natural idea would be to modify the reduction rules so as to allow the following reduction, swapping the two occurrences of the parametric interpretation:

\[ \llbracket\llbracket x \rrbracket\rrbracket_{\{x \mapsto \dot{x}\}} \longrightarrow \llbracket\llbracket x \rrbracket_{\{x \mapsto \dot{x}\}}\rrbracket . \]


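In that case the expression would further reduce to $\lfloor\lfloor \dot{x} \rfloor\rfloor$, which is a proper normal form. Unfortunately, allowing this reduction would break subject reduction. This can be checked by assuming $x : A$ (where $A$ is any closed type; its actual form is irrelevant), and computing the types of the expression before and after reduction. By Theorem 2 we have $\llbracket x \rrbracket : \llbracket A \rrbracket\ x$. By a second application of Theorem 2, we get

\begin{align*}
\llbracket\llbracket x \rrbracket\rrbracket_{\{x \mapsto \dot{x}\}}
 &: \llbracket \llbracket A \rrbracket\ x \rrbracket_{\{x \mapsto \dot{x}\}}\ \llbracket x \rrbracket \\
 &: \llbracket\llbracket A \rrbracket\rrbracket_{\{x \mapsto \dot{x}\}}\ x\ \llbracket x \rrbracket_{\{x \mapsto \dot{x}\}}\ \llbracket x \rrbracket \\
 &: \llbracket\llbracket A \rrbracket\rrbracket\ x\ \llbracket x \rrbracket_{\{x \mapsto \dot{x}\}}\ \llbracket x \rrbracket \\
 &: \llbracket\llbracket A \rrbracket\rrbracket\ x\ \dot{x}\ \llbracket x \rrbracket
\end{align*}

On the other hand, $\llbracket x \rrbracket_{\{x \mapsto \dot{x}\}} : \llbracket A \rrbracket\ x$, and by another application of Theorem 2, we get

\begin{align*}
\llbracket\llbracket x \rrbracket_{\{x \mapsto \dot{x}\}}\rrbracket
 &: \llbracket \llbracket A \rrbracket\ x \rrbracket\ \llbracket x \rrbracket_{\{x \mapsto \dot{x}\}} \\
 &: \llbracket\llbracket A \rrbracket\rrbracket\ x\ \llbracket x \rrbracket\ \llbracket x \rrbracket_{\{x \mapsto \dot{x}\}} \\
 &: \llbracket\llbracket A \rrbracket\rrbracket\ x\ \llbracket x \rrbracket\ \dot{x}
\end{align*}

That is, in the above example, the suggested reduction rule has the effect of swapping the second and third arguments to $\llbracket\llbracket A \rrbracket\rrbracket$ in the type.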

We are then left with the alternative of preventing the above situation from occurring, making the above term ill-typed. The solution we choose is to forbid all nested uses of $\llbracket\cdot\rrbracket$. Concretely, we make a twofold change to the system:

1. we make sure that all parametricity conditions inhabit the sort $\ast$, and
2. we prevent parametricity from being used at sort $\ast$.

One can easily check that the above is an invariant of our parametric interpretation (Definition 3). For instance, we ensure that explicit witnesses $\dot{x}$ lie in sort $\ast$ for $x$ of sort $\square_0$.

A brief review of computational irrelevance. Essentially, the change we propose in the above section takes advantage of a form of computational irrelevance. This notion is found in various flavors in the literature, but the version we use here is most similar to [20]. Like Paulin-Mohring, we distinguish between informative (inhabiting $\square_i$) and non-informative (inhabiting $\ast$) propositions: non-informative fragments cannot influence the computation of informative ones; rather, they act as mere witnesses.

The above property of separation between $\square_i$ and $\ast$ is already ensured by the structure of CCω: by invoking a proof $a : A : \ast$, it is impossible to gain any information about its structure, beyond its mere existence. If inductive constructions were to be added to the system, as in the calculus of inductive constructions [31], informative elimination from $\ast$ to $\square_i$ would be forbidden, in order to avoid logical inconsistency. (We discuss the issue further in Section 3.1.)


Amending the relational interpretation. Even though the relational interpretation of terms as described in Section 2.2 works to represent terms of CCω inside CCω, it does not capture the irrelevance of $\ast$. To do so, we can proceed to change it as follows. (The end result is shown in Definition 3.) Because any function $F$ of type $\forall x^{\ast} : A.\ B$ is guaranteed not to depend computationally on its argument, the results of the function are related regardless of whether the inputs are related. Therefore, the translation of $\forall x : A.\ B$ can be changed to reflect this observation: when the sort of $A$ is $\ast$, we merely drop the parametricity witness $\dot{x}$.

\begin{align*}
C \in \llbracket \forall x^{\square_i} : A.\ B \rrbracket_{\xi} &= \forall x^{\square_i} : A.\ \forall \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\ C\,x \in \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \\
C \in \llbracket \forall x^{\ast} : A.\ B \rrbracket_{\xi} &= \forall x^{\ast} : A.\ C\,x \in \llbracket B \rrbracket_{\xi}
\end{align*}

The translation of abstraction, application and context formation must then be adapted accordingly (equations (5), (6) and (14) in Definition 3), in each case by removing the witness $\dot{x}$. (Dropping the witness from the translation roughly corresponds to having that all proofs are related: $C \in \llbracket A \rrbracket_{\xi} = \top$, where $\top$ represents truth.)

After this change, the relational interpretation never "reaches" a type of sort $\ast$ (nor an inhabitant of such a type), given that one starts with a type of sort $\square_i$ (or an inhabitant of such a type). Therefore, we now have more freedom in the choice of the relational interpretation of $\square_i$. As hinted in the previous section, we choose types of sort $\ast$:

\[ C \in \llbracket \square_i \rrbracket_{\xi} = C \to \ast \]

However, having the single sort $\ast$ for the relational interpretation of the whole hierarchy $\square_i$ leads to a technical issue (we discuss it in Section 3.2), hence we choose to only allow the relational interpretation for types of sort $\square_0$ (or inhabitants of such types).

The abstraction theorem remains valid with the above modifications, with no essential change to the proof (see Section 4.2 for details).

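Even though we forbid direct nesting of $\llbracket\cdot\rrbracket$, indirect nesting is still possible. For instance, parametricity can be used to prove a proposition $a \in \llbracket A \rrbracket$ that a program $F$ relies on. That is, if $F : (a \in \llbracket A \rrbracket) \to B$ with $a : A : \square_0$, then $\llbracket F\ \llbracket a \rrbracket \rrbracket$ is acceptable. Indeed, because the relational interpretation simply ignores proof arguments (see Definition 3), the above expression reduces to $\llbracket F \rrbracket\ \llbracket a \rrbracket$, and the nesting disappears.

Definition 3 (Relational interpretation).

\begin{align*}
\llbracket x \rrbracket_{\xi} &= \dot{x} && \text{if } x \in \xi \tag{1} \\
\llbracket x \rrbracket_{\xi} &= \lfloor\lfloor x \rfloor\rfloor && \text{otherwise} \tag{2} \\
\llbracket \lambda x^{\square_0} : A.\ B \rrbracket_{\xi} &= \lambda x^{\square_0} : A.\ \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\ \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} && (\dot{x} \notin \xi) \tag{3} \\
\llbracket A\,B \rrbracket_{\xi} &= \llbracket A \rrbracket_{\xi}\ B \bullet \llbracket B \rrbracket_{\xi} && \tag{4} \\
\llbracket \lambda x^{\ast} : A.\ B \rrbracket_{\xi} &= \lambda x^{\ast} : A.\ \llbracket B \rrbracket_{\xi} && (x \notin \xi) \tag{5} \\
\llbracket A \bullet B \rrbracket_{\xi} &= \llbracket A \rrbracket_{\xi} \bullet B && \tag{6} \\
\llbracket T \rrbracket_{\xi} &= \lambda z^{\square_0} : T.\ z \in \llbracket T \rrbracket_{\xi} && \text{if } T \text{ is } \forall \text{ or } \square_0 \tag{7} \\[1ex]
C \in \llbracket \square_0 \rrbracket_{\xi} &= C \to \ast && \tag{8} \\
C \in \llbracket \forall x^{\square_0} : A.\ B \rrbracket_{\xi} &= \forall x^{\square_0} : A.\ \forall \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\ (C\,x) \in \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} && (\dot{x} \notin \xi) \tag{9} \\
C \in \llbracket \forall x^{\ast} : A.\ B \rrbracket_{\xi} &= \forall x^{\ast} : A.\ (C \bullet x) \in \llbracket B \rrbracket_{\xi} && (x \notin \xi) \tag{10} \\
C \in \llbracket T \rrbracket_{\xi} &= \llbracket T \rrbracket_{\xi}\ C && \text{if } T \text{ is not } \forall \text{ nor a sort} \tag{11} \\[1ex]
\llbracket - \rrbracket_{\xi} &= - && \tag{12} \\
\llbracket \Gamma, x : A \rrbracket_{\xi, x \mapsto \dot{x}} &= \llbracket \Gamma \rrbracket_{\xi},\ x^{\square_0} : A,\ \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi} && \text{if } A : \square_0 \tag{13} \\
\llbracket \Gamma, x : A \rrbracket_{\xi, x \mapsto \dot{x}} &= \llbracket \Gamma \rrbracket_{\xi},\ x^{\ast} : A && \text{if } A : \ast \tag{14} \\
\llbracket \Gamma, x^{s} : A \rrbracket_{\xi} &= \llbracket \Gamma \rrbracket_{\xi},\ x^{s} : A && \text{if } x \notin \xi \tag{15}
\end{align*}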

2.5 Summary

The changes made to the relational interpretation do not compromise the abstraction theorem, nor parametricity. The proofs need only be amended to ignore irrelevant variables. (Full proofs are found in the appendix.) In particular, Example 1 remains valid.

This concludes the description of the design of our calculus, and its motivation. In summary, we have added a parametricity rule, whose computational content is given by the systematic construction of parametricity witnesses from terms. Proofs that are irrelevant (in the sense of [20]) are always known to satisfy the relational interpretation of their type: this is built into the definition of the relational interpretation.

We have proved confluence, subject reduction, and strong normalization for our calculus. Since parametricity acts as a typing rule for $\llbracket\cdot\rrbracket$, subject reduction for our calculus stems directly from it. Strong normalization is proved by modelling the system in CCω. This model is built by introducing explicit witnesses of parametricity for all relevant variables. Details of the proofs of these theorems and of parametricity are deferred to the appendix.

3 Discussion

3.1 Extension: inductive types

Even though we considered only PTSs (without inductive constructions), it is straightforward to support them, provided the usual precautions regarding computational relevance are respected. That is, if inductive constructions were to be added to the system, informative elimination from $\ast$ to $\square_0$ would be forbidden. This is consistent with what happens in the Calculus of Inductive Constructions [31], and its implementation in the Coq system.

Forbidding this kind of elimination is necessary for our abstraction theorem to hold. Indeed, if elimination were permitted from $\ast$ to $\square_0$, one could in particular lift proofs to programs. Showing that such a lifting satisfies the parametricity condition arising from its type would require a witness that the proof itself satisfies a similar condition, but no witness of this fact is available, given our interpretation of proofs.


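However, systems with computational irrelevance are often extended with the possibility to transform an (irrelevant) proof of falsity ($\bot : \ast$) to any type $\alpha : \square_0$.

\[ \mathsf{botElim} : \bot \mathbin{\bullet\!\to} \forall \alpha : \square_0.\ \alpha \]

Such an elimination is compatible with our theory of parametricity. Indeed, we can give a relational interpretation of botElim as follows:

\begin{align*}
\llbracket \mathsf{botElim} \rrbracket &: \forall b : \bot.\ \forall \alpha : \square_0.\ \forall \dot{\alpha} : \alpha \to \ast.\ \dot{\alpha}\ (\mathsf{botElim} \bullet b\ \alpha) \\
\llbracket \mathsf{botElim} \rrbracket &= \lambda b.\ \mathsf{botElim} \bullet b\ (\forall \alpha : \square_0.\ \forall \dot{\alpha} : \alpha \to \ast.\ \dot{\alpha}\ (\mathsf{botElim} \bullet b\ \alpha))
\end{align*}

The trick is that, even though there is no witness that $b$ satisfies any relational interpretation, it is itself a proof of falsity, and therefore no further condition is needed to prove its parametricity (or indeed anything else).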

3.2 Sort hierarchies

We have allowed parametricity to be used only at values of types inhabiting $\square_0$. We already justified in detail why parametricity at sort $\ast$ is disallowed, but what about $\square_1$, $\square_2$, and so on? For the axiom $\vdash \square_i : \square_{i+1}$, the abstraction theorem says $(\lambda z : \square_i.\ z \to \ast) : \square_i \to \ast$, which is only possible if $z : \square_i \vdash z \to \ast : \ast$, and this judgement does not hold.
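The failing side condition can be checked against the Product rule of Definition 2. The short derivation below is a sketch, assuming that the non-dependent arrow $z \to \ast$ abbreviates $\forall x : z.\ \ast$ for a variable $x$ that is not otherwise used.

```latex
\begin{align*}
z : \square_i &\vdash z : \square_i
  && \text{(Start)} \\
z : \square_i,\ x : z &\vdash \ast : \square_0
  && \text{(Axiom, Weakening)} \\
z : \square_i &\vdash (\forall x : z.\ \ast) : \square_{i \sqcup 0} = \square_i
  && \text{(Product, using } (\square_i, \square_0, \square_{i \sqcup 0}) \in \mathcal{R}\text{)}
\end{align*}
```

The only triple in $\mathcal{R}$ whose first two components are $\square_i$ and $\square_0$ has third component $\square_i$, so the product is assigned the sort $\square_i$ and not $\ast$.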

One way to allow parametricity at higher sorts would be to extend the division between relevant and irrelevant sorts. That is, have a parallel hierarchy of sorts for propositions $\ast_i : \ast_{i+1}$ with impredicative rules $(s, \ast_i, \ast_i)$. The parametricity conditions for types of $\square_i$ would live in $\ast_i$. In that case, the abstraction theorem would yield $z : \square_i \vdash z \to \ast_i : \ast_{i+1}$, which would hold by assumption.

3.3 Related Work

The relationship between logic and programming languages goes both ways. On the one hand, logical systems can be given meaning by assigning a computational interpretation to them. On the other hand, one would like to prove programs correct within formal logical systems. This means that one needs logical systems with enough power to support reasoning about programs. Seminal work featuring both sides of the connection includes [17] and [15].

The work described in this paper falls right into this tradition: we not only attempt to improve the support for parametric reasoning in a logical framework, but also give parametricity a computational meaning. To our knowledge this has not been attempted before. Some logical frameworks with computational meaning have features which are related to parametricity, without subsuming it. Some logical systems have featured parametricity, but not in a computationally meaningful way.


Inductive families. In the first category, we must first mention inductive families, studied in different contexts by Pfenning and Paulin-Mohring [21], and Dybjer [10].

The computational realization of the induction principle over an inductive definition is a recursive function over the data, but one which also carries information about the data being recursed over.

One can draw a parallel with our interpretation of terms: computational constructs such as application and abstraction are interpreted as application and abstraction, but with additional information about the original term carried as an index. In fact, it is possible to see induction principles over data as a special case of parametricity, as shown for example by Wadler [30].

Extensionality. The ability to reason about functions can be supported not just by parametricity, but also by the principle of extensionality, which says that two functions are equal if they map equal inputs to equal outputs.

Altenkirch et al. [2] show how to integrate extensionality in a computationally meaningful way. Many parallels can be drawn between the work of Altenkirch et al. and ours. Mainly, their equality relation depends on the structure of the types that it relates, in the same way as the interpretation of types we propose. Additionally, Altenkirch et al. also rely on a separation between proofs and programs, in a similar fashion to what we present here.

Meta-level reasoning. Miller and Tiu [16] propose a logical framework where one can reason precisely about terms and types of an object language. Their domain of application clearly overlaps with ours. For example, one can prove in their framework that the object type $\forall a : \star.\ a$ is uninhabited. A characteristic of [16] is the total separation between object and host language. Here, we are able to unify both languages, even though we must keep some amount of semantic separation between proofs and programs: programs cannot depend on the structure of proofs.

Parametricity. Plotkin and Abadi [22] have formulated a logic extended with axioms of parametricity. However, they are content with the consistency of parametricity, and do not give any computational interpretation for it.

As mentioned previously, it has been shown before, e.g., by Hasegawa [12], that parametricity is consistent with some models of System F. Consistency of parametricity with dependently-typed theories does not appear to have been proved previously.

In unpublished work, Takeuti [25] attempted to extend CC with parametricity. Takeuti asserted parametricity at all types, with a definition of the parametric interpretation similar to ours. His system also seems to feature a notion of irrelevance: two types in $\square_0$ appear to always be related. Takeuti does not attempt to assign a computational content to his parametricity axioms, and only conjectures consistency of his system.


3.4 Future Work

A natural extension of this work would be to integrate it in the Coq system. One could experiment and measure how much power parametricity gives on practical examples. The presentation we give here fits naturally with the existing Coq implementation, except for the fact that we allow parametricity only on the first level of the hierarchy of sorts. Lifting the limitation could be done as outlined in Section 3.2, but then it would not blend well with the current Coq type system.

Another limitation of this work is the impossibility of applying parametricity to proofs. That is, programs can depend on parametricity proofs, and these proofs have a computational meaning, but programs cannot depend on the structure of proofs (only their existence). The limitation does not appear to prevent the use of parametricity to prove correctness of functional programs, but it prevents deeper uses of parametricity, for example justifying impredicative Church-encodings of data (which necessarily live in the $\ast$ sort). Therefore, we think that lifting the restriction can bring fundamental benefits. For example, by nesting parametricity, one can generate parametricity of arbitrary arity [5]. Because implementing unary parametricity would then be enough, nesting has the potential to simplify the system. In the process, we hope to clarify design decisions and reveal more fundamental properties of the relational interpretation.

Bibliography

[1] M. Abadi, L. Cardelli, and P. Curien. Formal parametric polymorphism. In Proc. of POPL'93, pages 157–170. ACM, 1993.

[2] T. Altenkirch, C. McBride, and W. Swierstra. Observational equality, now! In Proc. of the 2007 workshop on Programming languages meets program verification, pages 57–68. ACM, 2007.

[3] R. Atkey, S. Lindley, and J. Yallop. Unembedding domain-specific languages. In Proc. of the 2nd ACM SIGPLAN symposium on Haskell, Haskell '09, pages 37–48. ACM, 2009.

[4] H. P. Barendregt. Lambda calculi with types. Handbook of logic in computer science, 2:117–309, 1992.

[5] J.-P. Bernardy and M. Lasson. Realizability and parametricity in pure type systems. In M. Hofmann, editor, Foundations Of Software Sci. And Computational Structures, volume 6604 of LNCS, pages 108–122. Springer, 2011.

[6] J.-P. Bernardy, P. Jansson, and R. Paterson. Parametricity and dependent types. In Proc. of the 15th ACM SIGPLAN international conference on Funct. programming, pages 345–356. ACM, 2010.

[7] C. Böhm and A. Berarducci. Automatic synthesis of typed lambda-programs on term algebras. Theor. Comp. Sci., 39(2-3):135–154, 1985.

[8] A. Chlipala. Parametric higher-order abstract syntax for mechanized semantics. In Proceeding of the 13th ACM SIGPLAN international conference on Funct. programming, ICFP '08, pages 143–156. ACM, 2008.

[9] T. Coquand and G. Huet. The calculus of constructions. Technical report, INRIA, 1986.

[10] P. Dybjer. Inductive families. Formal Aspects of Computing, 6(4):440–465, 1994.

[11] A. Gill, J. Launchbury, and S. Peyton Jones. A short cut to deforestation. In Proc. of FPCA, pages 223–232. ACM, 1993.

[12] R. Hasegawa. Categorical data types in parametric polymorphism. Mathematical Structures in Comp. Sci., 4(01):71–109, 1994.

[13] P. Johann. A generalization of short-cut fusion and its correctness proof. Higher-Order and Symbol. Comput., 15(4):273–300, 2002.

[14] H. Mairson. Outline of a proof theory of parametricity. In Proc. of FPCA 1991, volume 523 of LNCS, pages 313–327. Springer-Verlag, 1991.

[15] P. Martin-Löf. Intuitionistic type theory. Bibliopolis, 1984.

[16] D. A. Miller and A. F. Tiu. A proof theory for generic judgments: An extended abstract, 2003.

[17] R. Milner. Logic for Computable Functions: description of a machine implementation. Artificial Intelligence, 1972.

[18] A. Miquel. Le Calcul des Constructions implicite: syntaxe et sémantique. Thèse de doctorat, Université Paris 7, 2001.

[19] U. Norell. Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers Tekniska Högskola, 2007.

[20] C. Paulin-Mohring. Extracting Fω’s programs from proofs in the calculus of constructions. In POPL’89, pages 89–104. ACM, 1989.

[21] F. Pfenning and C. Paulin-Mohring. Inductively defined types in the calculus of constructions. In Mathematical Foundations of Programming Semantics, pages 209–228. 1990.

[22] G. Plotkin and M. Abadi. A logic for parametric polymorphism. In Proc. of the International Conference on Typed Lambda Calculi and Applications, volume 664 of LNCS, pages 361–375. Springer, 1993.

[23] N. Pouillard. Nameless, painless. In Proc. of the 16th ACM SIGPLAN international conference on Funct. programming, ICFP ’11. ACM, 2011. To appear.

[24] J. C. Reynolds. Types, abstraction and parametric polymorphism. Information processing, 83(1):513–523, 1983.

[25] I. Takeuti. The theory of parametricity in lambda cube. Manuscript, 2004.

[26] The Coq development team. The Coq proof assistant, 2011.

[27] J. Voigtländer. Bidirectionalization for free! (Pearl). In Proc. of POPL 2009, pages 165–176. ACM, 2009.

[28] D. Vytiniotis and S. Weirich. Parametricity, type equality, and higher-order polymorphism. J. Funct. Program., 20(02):175–210, 2010.

[29] P. Wadler. Theorems for free! In Proc. of FPCA 1989, pages 347–359. ACM, 1989.

[30] P. Wadler. The Girard–Reynolds isomorphism. Theor. Comp. Sci., 375(1–3):201–226, 2007.

[31] B. Werner. Une théorie des constructions inductives. PhD thesis, Université de Paris 7, 1994.

4 Appendix

This section contains details of our proofs.

4.1 Confluence

To begin, we generalize some basic properties possessed by well-behaved λ-calculi. In particular, we prove the substitution lemma and the confluence property. We limit the detail of the proofs to the cases relevant to the specifics of our calculus.

Remark 1. It follows immediately by induction on the structure of terms that substitution and reduction preserve sorts. This result will be silently used in the present section.

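Lemma 1. If $x$ does not occur free in $t$, then $\llbracket t \rrbracket_{\xi, x \mapsto \dot{x}} = \llbracket t \rrbracket_{\xi}$.

Proof. Induction on the structure of t.

Lemma 2 ($\llbracket\cdot\rrbracket$ and substitution, part 1). If $z^{\square_0}$ is not in $\xi$, then

$$\llbracket t[z^{\square_0} \mapsto u] \rrbracket_{\xi} = \llbracket t \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]$$

Proof. By induction on t. Only the variable case (t = x) will be explained here, as the other ones boil down to easy equational reasoning.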

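Sort: trivial.

Abstraction (Product is similar):

$$
\begin{aligned}
& \llbracket (\lambda x^{\square_0} : A.\, b)[z^{\square_0} \mapsto u] \rrbracket_{\xi} \\
={}& \llbracket \lambda x^{\square_0} : A[z^{\square_0} \mapsto u].\, b[z^{\square_0} \mapsto u] \rrbracket_{\xi} \\
={}& \lambda x^{\square_0} : A[z^{\square_0} \mapsto u].\, \lambda \dot{x}^{\ast} : x \in \llbracket A[z^{\square_0} \mapsto u] \rrbracket_{\xi}.\, \llbracket b[z^{\square_0} \mapsto u] \rrbracket_{\xi, x \mapsto \dot{x}} \\
={}& \lambda x^{\square_0} : A[z^{\square_0} \mapsto u].\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]. \\
& \quad \llbracket b \rrbracket_{\xi, z \mapsto \dot{z}, x \mapsto \dot{x}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi, x \mapsto \dot{x}}] \\
={}& \lambda x^{\square_0} : A[z^{\square_0} \mapsto u].\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]. \\
& \quad \llbracket b \rrbracket_{\xi, z \mapsto \dot{z}, x \mapsto \dot{x}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \\
={}& (\lambda x^{\square_0} : A.\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi, z \mapsto \dot{z}}.\, \llbracket b \rrbracket_{\xi, z \mapsto \dot{z}, x \mapsto \dot{x}})[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \\
={}& \llbracket \lambda x^{\square_0} : A.\, b \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]
\end{aligned}
$$

Application:

$$
\begin{aligned}
& \llbracket (F\, a)[z^{\square_0} \mapsto u] \rrbracket_{\xi} \\
={}& \llbracket F[z^{\square_0} \mapsto u]\; a[z^{\square_0} \mapsto u] \rrbracket_{\xi} \\
={}& \llbracket F[z^{\square_0} \mapsto u] \rrbracket_{\xi}\; a[z^{\square_0} \mapsto u] \bullet \llbracket a[z^{\square_0} \mapsto u] \rrbracket_{\xi} \\
={}& \llbracket F \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]\; a[z^{\square_0} \mapsto u] \bullet \llbracket a \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \\
={}& \llbracket F \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]\; a[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \bullet \llbracket a \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \\
={}& (\llbracket F \rrbracket_{\xi, z \mapsto \dot{z}}\; a \bullet \llbracket a \rrbracket_{\xi, z \mapsto \dot{z}})[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] \\
={}& \llbracket F\, a \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]
\end{aligned}
$$

Abstraction, Product and Application are straightforward for $(\ast, \square_0, \square_0)$.

Variable:

– If $x = z$:
$$\llbracket z[z^{\square_0} \mapsto u] \rrbracket_{\xi} = \llbracket u \rrbracket_{\xi} = \dot{z}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] = \llbracket z \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]$$

– If $x \neq z$ and $x \in \xi$:
$$\llbracket x[z^{\square_0} \mapsto u] \rrbracket_{\xi} = \dot{x} = \dot{x}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}] = \llbracket x \rrbracket_{\xi, z \mapsto \dot{z}}[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]$$

– If $x \neq z$ and $x \notin \xi$:
$$\llbracket x[z^{\square_0} \mapsto u] \rrbracket_{\xi} = \lfloor\!\lfloor x \rfloor\!\rfloor = \lfloor\!\lfloor x \rfloor\!\rfloor[z^{\square_0} \mapsto u][\dot{z}^{\ast} \mapsto \llbracket u \rrbracket_{\xi}]$$ ⊓⊔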

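Lemma 3 ($\llbracket\cdot\rrbracket$ and substitution, part 2). If ξ does not contain either z or any of the free variables of u, then

$$\llbracket t[z \mapsto u] \rrbracket_{\xi} = \llbracket t \rrbracket_{\xi}[z \mapsto u]$$

Proof. By induction on t. Only the variable case (t = x) will be proved here.

Abstraction, Product, Application and Sort: same as above.

Variable:

– If $x = z$:
$$\llbracket z[z^{\square_0} \mapsto u] \rrbracket_{\xi} = \llbracket u \rrbracket_{\xi} = \llbracket u \rrbracket_{\emptyset} = \lfloor\!\lfloor z \rfloor\!\rfloor[z^{\square_0} \mapsto u]$$
since $z \notin \xi$.

– If $x \neq z$ and $x \in \xi$:
$$\llbracket x[z^{\ast} \mapsto u] \rrbracket_{\xi} = \dot{x} = \llbracket x \rrbracket_{\xi}[z^{\ast} \mapsto u]$$

– If $x \neq z$ and $x \notin \xi$:
$$\llbracket x[z^{\ast} \mapsto u] \rrbracket_{\xi} = \lfloor\!\lfloor x \rfloor\!\rfloor = \lfloor\!\lfloor x \rfloor\!\rfloor[z^{\ast} \mapsto u]$$ ⊓⊔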

If z is a proof variable, the previous lemma can be slightly generalized:

Lemma 4 ($\llbracket\cdot\rrbracket$ and substitution, part 3).

$$\llbracket t[z^{\ast} \mapsto u] \rrbracket_{\xi} = \llbracket t \rrbracket_{\xi}[z^{\ast} \mapsto u]$$

Proof. Same as above (by induction on t), except for the variable case, where t cannot be the proof variable z.

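Lemma 5 (Substitution).

$$t[z \mapsto u][z' \mapsto u'] = t[z' \mapsto u'][z \mapsto u[z' \mapsto u']]$$

Proof. By induction on t; the only non-trivial case is for the parametric witnesses $\lfloor\!\lfloor z \rfloor\!\rfloor$:

$$
\begin{aligned}
\lfloor\!\lfloor z \rfloor\!\rfloor[z^{\square_0} \mapsto u][z' \mapsto u'] &= \llbracket u \rrbracket_{\emptyset}[z' \mapsto u'] \\
&= \llbracket u[z' \mapsto u'] \rrbracket_{\emptyset} \\
&= \lfloor\!\lfloor z \rfloor\!\rfloor[z' \mapsto u'][z^{\square_0} \mapsto u[z' \mapsto u']]
\end{aligned}
$$

by Lemma 3.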
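The interplay between the relational interpretation, substitution and parametric witnesses in Lemmas 2 and 5 can be replayed on small terms. The following is only a sketch under strong simplifying assumptions: types and sorts are erased, every variable is treated as a variable of the first sort, bound names are assumed distinct from free names (no renaming is performed), and the names `dot`, `rel` and `subst` as well as the tuple encoding are hypothetical choices made for this illustration.

```python
# Untyped skeleton of the calculus (type annotations and sorts are dropped).
# Terms are tuples:
#   ("var", x)      variable x
#   ("wit", x)      parametric witness of the variable x
#   ("lam", x, b)   abstraction
#   ("app", f, a)   application

def dot(x):
    """Hypothetical naming scheme for the relational variable paired with x."""
    return x + "_r"

def rel(t, xi):
    """Relational interpretation of t; xi is the set of variables that
    already have a relational counterpart."""
    tag = t[0]
    if tag == "var":
        x = t[1]
        return ("var", dot(x)) if x in xi else ("wit", x)
    if tag == "lam":
        _, x, b = t
        return ("lam", x, ("lam", dot(x), rel(b, xi | {x})))
    if tag == "app":
        _, f, a = t
        return ("app", ("app", rel(f, xi), a), rel(a, xi))
    raise ValueError("nested parametricity is forbidden")

def subst(t, z, u):
    """t[z -> u], assuming bound names never clash with free names."""
    tag = t[0]
    if tag == "var":
        return u if t[1] == z else t
    if tag == "wit":
        # Substituting under a witness computes the interpretation of u.
        return rel(u, frozenset()) if t[1] == z else t
    if tag == "lam":
        _, x, b = t
        return t if x == z else ("lam", x, subst(b, z, u))
    _, f, a = t
    return ("app", subst(f, z, u), subst(a, z, u))

# Lemma 2 on t = \x. x z, u = f y, xi = {y}.
t = ("lam", "x", ("app", ("var", "x"), ("var", "z")))
u = ("app", ("var", "f"), ("var", "y"))
xi = frozenset({"y"})
lhs2 = rel(subst(t, "z", u), xi)
rhs2 = subst(subst(rel(t, xi | {"z"}), "z", u), dot("z"), rel(u, xi))
lemma2_holds = lhs2 == rhs2

# Lemma 5 on the witness of z, with u = w c and u' = v substituted for w.
w_c = ("app", ("var", "w"), ("var", "c"))
v = ("var", "v")
lhs5 = subst(subst(("wit", "z"), "z", w_c), "w", v)
rhs5 = subst(subst(("wit", "z"), "w", v), "z", subst(w_c, "w", v))
lemma5_holds = lhs5 == rhs5
```

Both checks compare syntactic equality of the two sides on one instance each; they illustrate the statements and are not a substitute for the inductive proofs.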

We now check that the modifications made to CCω do not break the Church-Rosser property, that is, we verify that the order in which the reductions are performed does not matter. To prove this property, we define a parallel reduction (following the Tait/Martin-Löf technique), and show that the diamond property holds for this reduction. Note that the case for redex-elimination differs slightly from the usual one, in that it may perform multiple reductions in a row, if they occur at the head of the redex. We had to extend the formulation of the β rule since the parametric interpretation does not preserve the number of redexes: relevant redexes result in two redexes (see equations (3) and (4) in Definition 3). This multiplication of redexes does not compromise the soundness of the technique, because it occurs only at the leaves of the relation. (If we had Σ-types in the syntax, we could uncurry the term first, to keep only one redex.)

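Definition 4 (Parallel nested reduction).

$$
\textsc{Refl}\;\frac{}{A \triangleright A}
$$

$$
\beta\;\frac{t \triangleright t' \qquad u_i \triangleright u'_i}{(\lambda x_1^{s_1} : A_1. \ldots \lambda x_n^{s_n} : A_n.\, t) \cdot_{s_1} u_1 \cdots \cdot_{s_n} u_n \;\triangleright\; t'[x_1^{s_1} \mapsto u'_1] \cdots [x_n^{s_n} \mapsto u'_n]}
$$

$$
\textsc{App-Cong}\;\frac{t \triangleright t' \qquad u \triangleright u'}{t \cdot_s u \;\triangleright\; t' \cdot_s u'}
$$

$$
\textsc{Abs-Cong}\;\frac{A \triangleright A' \qquad t \triangleright t'}{\lambda x^{s} : A.\, t \;\triangleright\; \lambda x^{s} : A'.\, t'}
$$

$$
\textsc{All-Cong}\;\frac{A \triangleright A' \qquad B \triangleright B'}{\forall x^{s} : A.\, B \;\triangleright\; \forall x^{s} : A'.\, B'}
$$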

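Lemma 6 (Congruence of $\llbracket\cdot\rrbracket$). If $A \triangleright A'$, then $\llbracket A \rrbracket_{\xi} \triangleright \llbracket A' \rrbracket_{\xi}$ for all ξ.

Proof. By induction on $A \triangleright A'$:

– The case of Refl is trivial.

– For β, we proceed by induction on n (the number of β-reductions); for n = 1, one expects

$$\llbracket (\lambda x^{s} : A.\, t) \cdot_s u \rrbracket_{\xi} \;\triangleright\; \llbracket t'[x^{s} \mapsto u'] \rrbracket_{\xi},$$

knowing that $t \triangleright t'$ and $u \triangleright u'$. The sort s can either be $\ast$ or $\square_0$, as in the other cases our term would be ill-sorted.

• For $\ast$,

$$
\begin{aligned}
& \llbracket (\lambda x^{\ast} : A.\, t) \bullet u \rrbracket_{\xi} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket_{\xi}\} \\
& (\lambda x^{\ast} : A.\, \llbracket t \rrbracket_{\xi}) \bullet u \\
\triangleright{}& \{\text{by } \beta, \textsc{Refl} \text{ and IH}\} \\
& \llbracket t' \rrbracket_{\xi}[x^{\ast} \mapsto u'] \\
={}& \{\text{by Lemma 4}\} \\
& \llbracket t'[x^{\ast} \mapsto u'] \rrbracket_{\xi}
\end{aligned}
$$

• For $\square_0$,

$$
\begin{aligned}
& \llbracket (\lambda x^{\square_0} : A.\, t)\, u \rrbracket_{\xi} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket_{\xi}\} \\
& (\lambda x^{\square_0} : A.\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\, \llbracket t \rrbracket_{\xi, x \mapsto \dot{x}})\, u \bullet \llbracket u \rrbracket_{\xi} \\
\triangleright{}& \{\text{by } \beta, \textsc{Refl} \text{ and IH}\} \\
& \llbracket t' \rrbracket_{\xi, x \mapsto \dot{x}}[x^{\square_0} \mapsto u'][\dot{x}^{\ast} \mapsto \llbracket u' \rrbracket_{\xi}] \\
={}& \{\text{by Lemma 2}\} \\
& \llbracket t'[x^{\square_0} \mapsto u'] \rrbracket_{\xi}
\end{aligned}
$$

If n = 0 (no β reduction), the conclusion stems from Refl. The inductive case is similar to the one for n = 1.

– The cases of ⋆-Cong are straightforward using the definition of $\llbracket\cdot\rrbracket$. ⊓⊔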

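Lemma 7 (Congruence of substitution). If $t \triangleright t'$ and $u \triangleright u'$, then $t[z \mapsto u] \triangleright t'[z \mapsto u']$.

Proof. By induction on $t \triangleright t'$:

– For Refl, the expected result follows from an induction on t (using Lemma 6 for the case $\lfloor\!\lfloor z \rfloor\!\rfloor$).

– For β, one expects

$$((\lambda x : A.\, t)\, e)[z \mapsto u] \;\triangleright\; t'[x \mapsto e'][z \mapsto u'],$$

knowing that $t \triangleright t'$ and $e \triangleright e'$. We have

$$
\begin{aligned}
& ((\lambda x : A.\, t)\, e)[z \mapsto u] \\
={}& \{\text{by def. of the substitution}\} \\
& (\lambda x : A[z \mapsto u].\, t[z \mapsto u])\; e[z \mapsto u] \\
\triangleright{}& \{\text{by } \beta \text{ and IH}\} \\
& t'[z \mapsto u'][x \mapsto e'[z \mapsto u']] \\
={}& \{\text{by Lemma 5}\} \\
& t'[x \mapsto e'][z \mapsto u']
\end{aligned}
$$

– The cases of ⋆-Cong are straightforward. ⊓⊔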

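Theorem 3 (Diamond property). The rewriting system ($\triangleright$) has the diamond property. That is, for each x, y, y′ such that $y \triangleleft x \triangleright y'$, there exists z such that $y \triangleright z \triangleleft y'$.

Proof. By induction on the derivations:

– If one of the derivations ends with Refl, one has either x = y, or x = y′. We pick z = y′ in the former case and z = y in the latter.

– If one of the derivations ends with Abs-Cong or All-Cong, the other one has to end with the same rule, and the result is a straightforward use of the induction hypothesis.

– If one of the derivations ends with App-Cong, the other one has to end with App-Cong, or with β. The first case is straightforward; in the second one, one has

$$(\lambda x^{s} : A'.\, t') \cdot_s u' \;\triangleleft\; (\lambda x^{s} : A.\, t) \cdot_s u \;\triangleright\; t''[x^{s} \mapsto u'']$$

with $\lambda x^{s} : A'.\, t' \triangleleft \lambda x^{s} : A.\, t$, $t \triangleright t''$ and $u' \triangleleft u \triangleright u''$.

The situation is summarized in the diagram below. In more detail, the derivation of $\lambda x^{s} : A'.\, t' \triangleleft \lambda x^{s} : A.\, t$ can end with either Abs-Cong or Refl. In the first case (the second one is similar), one has $A' \triangleleft A$ and $t' \triangleleft t$. By induction hypothesis there exist $t'''$, $u'''$ such that $t' \triangleright t''' \triangleleft t''$ and $u' \triangleright u''' \triangleleft u''$. The result follows by β and Lemma 7:

$$(\lambda x^{s} : A'.\, t') \cdot_s u' \;\triangleright\; t'''[x^{s} \mapsto u'''] \;\triangleleft\; t''[x^{s} \mapsto u'']$$

[Diagram: a square with top corner $(\lambda x : A.\, t)\, u$, side corners $(\lambda x : A'.\, t')\, u'$ and $t''[x \mapsto u'']$, and bottom corner $t'''[x \mapsto u''']$. The upper edges are labelled Abs-Cong ($u' \triangleleft u$, $t' \triangleleft t$) and β ($t \triangleright t''$, $u \triangleright u''$); the lower edges are labelled β ($t' \triangleright t'''$, $u' \triangleright u'''$) and Lemma 7 ($t''' \triangleleft t''$, $u''' \triangleleft u''$).]

– If both derivations end with the same β rule, the result is a straightforward use of the induction hypothesis and Lemma 7. ⊓⊔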
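The diamond property can be observed on a concrete term by enumerating parallel reducts. The sketch below is deliberately simpler than Definition 4: it implements the standard single-redex Tait/Martin-Löf parallel reduction for the pure untyped λ-calculus with de Bruijn indices, without sorts, parametric witnesses or the nested β rule. The function names (`shift`, `subst`, `beta`, `par`, `diamond`) are hypothetical and chosen for this illustration only.

```python
# Terms with de Bruijn indices:
#   ("var", n), ("lam", body), ("app", f, a)

def shift(t, d, c=0):
    """Add d to every variable index >= c (the cutoff)."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1] + d) if t[1] >= c else t
    if tag == "lam":
        return ("lam", shift(t[1], d, c + 1))
    return ("app", shift(t[1], d, c), shift(t[2], d, c))

def subst(t, j, s):
    """Replace variable j by s in t."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == j else t
    if tag == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def beta(body, arg):
    """Contract the redex (lam body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)

def par(t):
    """Set of all terms reachable from t in one parallel step."""
    tag = t[0]
    if tag == "var":
        return {t}                                   # reflexivity
    if tag == "lam":
        return {("lam", b) for b in par(t[1])}       # congruence
    f, a = t[1], t[2]
    out = {("app", f2, a2) for f2 in par(f) for a2 in par(a)}
    if f[0] == "lam":                                # beta, after reducing inside
        out |= {beta(b2, a2) for b2 in par(f[1]) for a2 in par(a)}
    return out

def diamond(x):
    """Check that any two parallel reducts of x share a parallel reduct."""
    ys = par(x)
    return all(par(y) & par(y2) for y in ys for y2 in ys)

# x = (\a. a a) ((\b. b) c), where c is the free variable with index 0.
c = ("var", 0)
ident = ("lam", ("var", 0))
dup = ("lam", ("app", ("var", 0), ("var", 0)))
x = ("app", dup, ("app", ident, c))
reducts = par(x)
diamond_holds = diamond(x)
```

On this term the four parallel reducts all step to `c c`, which is the common reduct promised by the theorem for this instance.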

Corollary 1 (Church-Rosser property). Our calculus has the confluence (Church-Rosser) property, that is, for each x, y, y′ such that $y \longleftarrow^{*} x \longrightarrow^{*} y'$, there exists z such that $y \longrightarrow^{*} z \longleftarrow^{*} y'$.

Proof. Direct consequence of Theorem 3, noticing that $\triangleright^{*} = \longrightarrow^{*}$.

4.2 Abstraction

In this section we check that our main goal, the integration of parametricity, is achieved by the design that we propose. At the same time, we check that the abstraction theorem also holds. We do so by proving Lemma 8, which subsumes both theorems.

Theorem 4 (Abstraction).

1. $\Gamma \vdash A : B : \square_0 \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi} \vdash \llbracket A \rrbracket_{\xi} : (A \in \llbracket B \rrbracket_{\xi}) : \ast$, where ξ maps all the variables in Γ to fresh ones.

2. Furthermore, if the original judgement makes no use of Param, neither does the resulting judgement.

Proof.

1. Direct consequence of Lemma 8.1.

2. In the proof of Lemma 8.1, if ξ is full, then the target derivation tree contains Param iff Param occurs in the derivation tree for $\Gamma \vdash A : B$. ⊓⊔

Theorem 5 (Parametricity).

$$\Gamma \vdash A : B : \square_0 \;\Rightarrow\; \Gamma \vdash \llbracket A \rrbracket : (A \in \llbracket B \rrbracket) : \ast$$

This theorem means that every term satisfies the parametricity condition of its type, even if it contains free variables.

Proof. Take ξ empty in Lemma 8.1.

Definition 5. ξ conforms to Γ when ξ maps exactly a suffix of Γ (restricted to $\square_0$-variables) to relational variables.

Remark 2. $\llbracket\cdot\rrbracket$ preserves conforming substitutions.

Proof (sketch). By induction on the typing derivation. In the definition of $\llbracket\cdot\rrbracket$, every bound variable in a term is assigned an explicit relational interpretation.

Lemma 8 (Generalized abstraction). Assuming that ξ conforms to Γ,

1. $\Gamma \vdash A : B : \square_0$ and $B \neq \ast \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi} \vdash \llbracket A \rrbracket_{\xi} : A \in \llbracket B \rrbracket_{\xi} : \ast$
2. $\Gamma \vdash A : B : s \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi} \vdash A : B : s$
3. $\Gamma \vdash B : \square_0$ and $B \neq \ast \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi}, x : B \vdash x \in \llbracket B \rrbracket_{\xi} : \ast$

(where ξ contains all the variables of sort $\square_0$ bound in Γ).

Proof. The proof is done by simultaneous induction on the derivation tree, and is very similar to the proofs done by Bernardy et al. [6] and Bernardy and Lasson [5]. The new parts occur in the special handling of $\ast$ and of the Start rule. The proof of each sub-lemma can be sketched as follows:

1. The cases of abstraction and application stem from the fact that their respective relational interpretation follows the same pattern as the relational interpretation of the product. The case of a variable x (Start) is more tricky: if $x \in \xi$, then the context contains $\dot{x}$, and looking up this second variable in the context justifies the translated judgement. If $x \notin \xi$, then we can use the parametricity rule on x to translate the typing judgement. Note that using parametricity is possible because, by assumption, x cannot be a proof. Param can be ignored, because direct nesting of parametricity is forbidden.

2. This sub-lemma is used to justify weakening of contexts in the other sub-lemmas. It is a consequence of the thinning lemma and the fact that the interpretation of types is always well-typed (see the third item below).

3. This sub-lemma expresses that if T is a well-sorted type, then $x \in \llbracket T \rrbracket$ is also well-sorted. It is easy to convince oneself of that result by checking that the translation of a type always yields a predicate. ⊓⊔

Since this result is the cornerstone of our development, we give yet more detail (full construction of the target derivation tree) in the second appendix.

Remark 3. In summary, and roughly speaking, Lemma 8 replaces the occurrences of Start for variables not in ξ by Param. Occurrences of Start for variables in ξ are preserved.

4.3 Subject reduction

In this section we prove subject reduction (preservation of types). We start by discussing basic properties generally attributed to PTSs.

The weakening of contexts behaves in our calculus exactly in the same way as in all PTSs. Indeed, the usual thinning lemma holds.

Lemma 9 (Thinning). Let Γ and ∆ be legal contexts such that $\Gamma \subseteq \Delta$. Then $\Gamma \vdash A : B \implies \Delta \vdash A : B$.

Proof. As in [4, lem. 5.2.12].

The generation lemma for our calculus must account for the new parametricity construct.

Lemma 10 (Generation). The statement of the lemma is the same as that of the generation lemma for PTS [4, lem. 5.2.13], but with the additional case for the Param rule:

– If $\Gamma \vdash \lfloor\!\lfloor x \rfloor\!\rfloor : C$ then there exists B such that $\Gamma \vdash B : \square_0$, $(x : B) \in \Gamma$, and $C =_{\beta} x \in \llbracket B \rrbracket$.

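Proof. As in [4]:

– we follow the derivation $\Gamma \vdash \lfloor\!\lfloor x \rfloor\!\rfloor : C$ until $\lfloor\!\lfloor x \rfloor\!\rfloor$ is introduced. It can only be done by the following rule

$$
\textsc{Param}\;\frac{\Delta \vdash B : \square_0}{\Delta, x : B \vdash \lfloor\!\lfloor x \rfloor\!\rfloor : x \in \llbracket B \rrbracket}
$$

with $C =_{\beta} x \in \llbracket B \rrbracket$, and $(\Delta, x^{\square_0} : B) \subseteq \Gamma$. The conclusion stems from Lemma 9. ⊓⊔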

Theorem 6 (Subject reduction). If $A \longrightarrow A'$ and $\Gamma \vdash A : T$, then $\Gamma \vdash A' : T$.

Proof. Most of the technicalities of the proof by Barendregt [4] concern β-reduction, and are not changed by our addition of parametricity.

Hence we discuss here only the handling of the parametricity construct: our task is to check that substituting a concrete term a for x in $\lfloor\!\lfloor x \rfloor\!\rfloor$ preserves the type of the expression.

Facing a term such as $\lfloor\!\lfloor x \rfloor\!\rfloor$ in context Γ, we know by generation that it must have type $x \in \llbracket B \rrbracket$ (for some type B valid in Γ, and x : B). We can then prove that substituting a term a of type B′ (where B′ is convertible to B) for x preserves the type of the expression. Indeed, the expression then reduces to $\llbracket a \rrbracket$, which has type $a \in \llbracket B' \rrbracket$ by Theorem 5. In turn, $a \in \llbracket B' \rrbracket$ is convertible to $x \in \llbracket B \rrbracket$ by Lemma 6.

4.4 Strong normalization

In this section we present a formalization of the intuitive model presented in Section 2.3. We do this via a transformation, from the system extended with parametricity to the naked CCω system.

Each term is mapped to a term where parametricity witnesses are passed explicitly. Simultaneously, contexts are extended with explicit witnesses: each binding $x : A : \square_0$ is replaced by a multiple binding $x : A,\ \dot{x} : x \in \llbracket A \rrbracket$. This means that $\lfloor\!\lfloor x \rfloor\!\rfloor$ can be interpreted by the corresponding variable $\dot{x}$ in the context. The following table shows how some example terms can be interpreted (for the sake of readability we omit type annotations in the abstractions, since they play no role in these examples):

original term $A$                                    its interpretation $\langle|A|\rangle$
$\lambda x.\, \lfloor\!\lfloor x \rfloor\!\rfloor$               $\lambda x.\, \lambda \dot{x}.\, \dot{x}$
$(\lambda x.\, \lfloor\!\lfloor x \rfloor\!\rfloor)\, (y\, z)$     $(\lambda x.\, \lambda \dot{x}.\, \dot{x})\, (y\, z)\, (\dot{y}\, z\, \dot{z})$

Given that the interpretation ($\langle|\cdot|\rangle$) is sound with respect to CCω (Lemma 14) and that it preserves reductions (Lemma 15), we obtain strong normalization (Theorem 7). The rest of the section is devoted to defining the model formally, and arguing for its soundness.
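As a worked instance of how reductions are preserved, consider the second example term of the table. The derivation below is a sketch that omits type annotations, writes $\dot{x}$ for the explicit witness variable paired with $x$, and uses only two facts stated elsewhere in the paper: substituting $u$ for $x$ in $\lfloor\!\lfloor x \rfloor\!\rfloor$ yields $\llbracket u \rrbracket_{\emptyset}$, and $\llbracket F\, a \rrbracket = \llbracket F \rrbracket\, a \bullet \llbracket a \rrbracket$.

```latex
% One step in the system with parametricity:
(\lambda x.\, \lfloor\!\lfloor x \rfloor\!\rfloor)\,(y\,z)
  \;\longrightarrow\; \llbracket y\,z \rrbracket_{\emptyset}
  \;=\; \lfloor\!\lfloor y \rfloor\!\rfloor\; z \bullet \lfloor\!\lfloor z \rfloor\!\rfloor

% Two steps on the interpreted term:
(\lambda x.\, \lambda \dot{x}.\, \dot{x})\,(y\,z)\,(\dot{y}\,z\,\dot{z})
  \;\longrightarrow\; (\lambda \dot{x}.\, \dot{x})\,(\dot{y}\,z\,\dot{z})
  \;\longrightarrow\; \dot{y}\,z\,\dot{z}
```

Interpreting each witness $\lfloor\!\lfloor y \rfloor\!\rfloor$, $\lfloor\!\lfloor z \rfloor\!\rfloor$ by the variable $\dot{y}$, $\dot{z}$ maps the reduct on the first line to the reduct on the second line, and the single source step is matched by at least one target step, in line with the statement of Lemma 15.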

The situation is that we have two transformations ($\llbracket\cdot\rrbracket$ and $\langle|\cdot|\rangle_{\Gamma}$) which can both introduce parametricity witnesses. Because $\llbracket\cdot\rrbracket$ will be applied to the result of $\langle|\cdot|\rangle_{\Gamma}$, there is a danger that a single variable may end up being given two parametricity witnesses. We want to avoid such an occurrence, otherwise we would have to prove these two witnesses equal, which is technically cumbersome. Therefore, to define our interpretation, we need to identify the bindings $x : A : \square_0$ for which an explicit witness $\dot{x} : x \in \llbracket A \rrbracket$ has already been introduced. We do so via a special syntactic marking on the bindings of variables which are given an explicit witness of parametricity by either transformation. This marking is syntactically realized by underlining binders of abstractions and products, as well as the argument of a corresponding application.

Concretely, we slightly modify $\llbracket\cdot\rrbracket$ by marking the duplicated binding (cases for abstraction and product); correspondingly the first argument is marked as well, in the case of the application. Similarly, we mark the new binding in the case for $\llbracket T \rrbracket_{\xi}$, and the case for $C \in \llbracket T \rrbracket_{\xi}$ has to be amended accordingly.

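$$
\begin{aligned}
\llbracket \lambda x^{\square_0} : A.\, B \rrbracket_{\xi} &= \lambda \underline{x} : A.\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\, \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \\
\llbracket A\, B \rrbracket_{\xi} &= \llbracket A \rrbracket_{\xi}\; \underline{B} \bullet \llbracket B \rrbracket_{\xi} \\
\llbracket T \rrbracket_{\xi} &= \lambda \underline{z} : T.\, z \in \llbracket T \rrbracket_{\xi} \\[4pt]
C \in \llbracket \square_0 \rrbracket_{\xi} &= C \to \ast \\
C \in \llbracket \forall x^{\square_0} : A.\, B \rrbracket_{\xi} &= \forall \underline{x} : A.\, \forall \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}.\, (C\, x) \in \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \\
C \in \llbracket T \rrbracket_{\xi} &= \llbracket T \rrbracket_{\xi}\; \underline{C} \\[4pt]
\llbracket \Gamma, x^{\square_0} : A \rrbracket_{\xi, x \mapsto \dot{x}} &= \llbracket \Gamma \rrbracket_{\xi},\ \underline{x} : A,\ \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\xi}
\end{aligned}
$$

Additionally, since nesting $\llbracket\cdot\rrbracket$ is forbidden (it yields ill-typed terms), we are free to choose the behavior of $\llbracket\cdot\rrbracket$ on the new marked bindings, avoiding the duplication of parametricity witnesses. The result is as follows:

$$
\begin{aligned}
\llbracket \lambda \underline{x} : A.\, \lambda \dot{x}^{\ast} : A'.\, B \rrbracket_{\xi} &= \lambda \underline{x} : A.\, \lambda \dot{x}^{\ast} : A'.\, \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \\
\llbracket A\; \underline{B} \bullet B' \rrbracket_{\xi} &= \llbracket A \rrbracket_{\xi}\; \underline{B} \bullet B' \\
C \in \llbracket \forall \underline{x} : A.\, \forall \dot{x}^{\ast} : A'.\, B \rrbracket_{\xi} &= \forall \underline{x} : A.\, \forall \dot{x}^{\ast} : A'.\, (C\, x) \in \llbracket B \rrbracket_{\xi, x \mapsto \dot{x}} \\
\llbracket \Gamma, \underline{x} : A, \dot{x}^{\ast} : A' \rrbracket_{\xi} &= \llbracket \Gamma \rrbracket_{\xi},\ \underline{x} : A,\ \dot{x}^{\ast} : A'
\end{aligned}
$$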

It is useful to stress that we only change the behavior of $\llbracket\cdot\rrbracket$ on cases that were forbidden in our extended system. This means that the newly introduced binding, as well as the changes, are only relevant in this proof of normalization, and that all the previous results remain valid. In fact, the users of the system are free to remain entirely oblivious to the special marking, and the presentation of the interpretation of the system done in the previous sections need not be amended at all. On the other hand, one may wonder why we change the behavior of $\llbracket\cdot\rrbracket$ only in cases that are forbidden. The answer is that we are going to apply it to terms in CCω, which do not contain any occurrence of $\lfloor\!\lfloor\cdot\rfloor\!\rfloor$, and therefore nested uses of $\llbracket\cdot\rrbracket$ can safely occur.

We can define our interpretation function as follows, adding explicit witnesses only if necessary, that is, only for non-marked bindings (resp. arguments of applications) of sort $\square_0$.

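Definition 6 (Explicit binding of witnesses).

$$
\begin{aligned}
\langle|s|\rangle_{\Gamma} &= s \\
\langle|x|\rangle_{\Gamma} &= x \\
\langle|\lfloor\!\lfloor x \rfloor\!\rfloor|\rangle_{\Gamma} &= \dot{x} \\[4pt]
\langle|\lambda x^{\square_0} : A.\, B|\rangle_{\Gamma} &= \lambda \underline{x} : \langle|A|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}.\, \langle|B|\rangle_{\Gamma, x} \\
\langle|\lambda x^{\ast} : A.\, B|\rangle_{\Gamma} &= \lambda x^{\ast} : \langle|A|\rangle_{\Gamma}.\, \langle|B|\rangle_{\Gamma} \\
\langle|\lambda \underline{x} : A.\, \lambda \dot{x}^{\ast} : A'.\, B|\rangle_{\Gamma} &= \lambda \underline{x} : \langle|A|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : \langle|A'|\rangle_{\Gamma}.\, \langle|B|\rangle_{\Gamma, x} \\[4pt]
\langle|A\, B|\rangle_{\Gamma} &= \langle|A|\rangle_{\Gamma}\; \underline{\langle|B|\rangle_{\Gamma}} \bullet \llbracket \langle|B|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} \\
\langle|A \bullet B|\rangle_{\Gamma} &= \langle|A|\rangle_{\Gamma} \bullet \langle|B|\rangle_{\Gamma} \\
\langle|A\; \underline{B} \bullet B'|\rangle_{\Gamma} &= \langle|A|\rangle_{\Gamma}\; \underline{\langle|B|\rangle_{\Gamma}} \bullet \langle|B'|\rangle_{\Gamma} \\[4pt]
\langle|\forall x^{\square_0} : A.\, B|\rangle_{\Gamma} &= \forall \underline{x} : \langle|A|\rangle_{\Gamma}.\, \forall \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}.\, \langle|B|\rangle_{\Gamma, x} \\
\langle|\forall x^{\ast} : A.\, B|\rangle_{\Gamma} &= \forall x^{\ast} : \langle|A|\rangle_{\Gamma}.\, \langle|B|\rangle_{\Gamma} \\
\langle|\forall \underline{x} : A.\, \forall \dot{x}^{\ast} : A'.\, B|\rangle_{\Gamma} &= \forall \underline{x} : \langle|A|\rangle_{\Gamma}.\, \forall \dot{x}^{\ast} : \langle|A'|\rangle_{\Gamma}.\, \langle|B|\rangle_{\Gamma, x} \\[4pt]
\langle| - |\rangle &= - \\
\langle|\Gamma, x^{\square_0} : A|\rangle &= \langle|\Gamma|\rangle,\ \underline{x} : A,\ \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} \\
\langle|\Gamma, x^{\ast} : A|\rangle &= \langle|\Gamma|\rangle,\ x^{\ast} : \langle|A|\rangle_{\Gamma} \\
\langle|\Gamma, \underline{x} : A, \dot{x}^{\ast} : A'|\rangle &= \langle|\Gamma|\rangle,\ \underline{x} : \langle|A|\rangle_{\Gamma},\ \dot{x}^{\ast} : \langle|A'|\rangle_{\Gamma}
\end{aligned}
$$

The above assumes that for each variable $x$, there is a globally fresh variable $\dot{x}$. The relational substitution mapping each $x$ to $\dot{x}$ is written χ.

Definition 7. $\chi(\Gamma) = \{ z \mapsto \dot{z} \mid z \in \Gamma \}$

The essence of the model defined by $\langle|\cdot|\rangle_{\Gamma}$ is that a parametricity witness $\lfloor\!\lfloor x \rfloor\!\rfloor$ is adequately modeled by the variable $\dot{x}$, that is, if $x$ has type $A$, then $\dot{x} : x \in \llbracket A \rrbracket$. The situation is somewhat more complex though, since the type of $x$ will itself undergo translation by $\langle|\cdot|\rangle_{\Gamma}$, as well as the type of the witness. Hence, we must ultimately prove that $x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} = \langle|x \in \llbracket A \rrbracket|\rangle_{\Gamma}$. In fact, the lemma should be further generalized before it can be proven: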

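Lemma 11 ($\langle|\cdot|\rangle_{\Gamma}$ and $\llbracket\cdot\rrbracket$ commute). If

– $\Gamma \vdash A : B$,
– $\Delta \subseteq \Gamma$

then $\llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} = \langle|\llbracket A \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma}$.

Proof. By induction on A.

– The base case on variables shows how we rely on consistently having $\dot{x}$ as a witness for $x$.

If $x \in \Delta$:
$$\langle|\llbracket x \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma} = \langle|\dot{x}|\rangle_{\Gamma} = \dot{x} = \llbracket x \rrbracket_{\chi(\Gamma)} = \llbracket \langle|x|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}$$

If $x \notin \Delta$:
$$\langle|\llbracket x \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma} = \langle|\lfloor\!\lfloor x \rfloor\!\rfloor|\rangle_{\Gamma} = \dot{x} = \llbracket x \rrbracket_{\chi(\Gamma)} = \llbracket \langle|x|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}$$

– Application: case $F\, a$, the other ones being trivial.

$$
\begin{aligned}
& \llbracket \langle|F\, a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} \\
={}& \{\text{by def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \llbracket \langle|F|\rangle_{\Gamma}\; \underline{\langle|a|\rangle_{\Gamma}} \bullet \llbracket \langle|a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} \rrbracket_{\chi(\Gamma)} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket\} \\
& \llbracket \langle|F|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}\; \underline{\langle|a|\rangle_{\Gamma}} \bullet \llbracket \langle|a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)} \\
={}& \{\text{by IH}\} \\
& \langle|\llbracket F \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma}\; \underline{\langle|a|\rangle_{\Gamma}} \bullet \langle|\llbracket a \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma} \\
={}& \{\text{by def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \langle|\llbracket F \rrbracket_{\chi(\Delta)}\; \underline{a} \bullet \llbracket a \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket\} \\
& \langle|\llbracket F\, a \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma}
\end{aligned}
$$

– The case of the abstraction $\lambda x^{\square_0} : A.\, b$ shows how bindings are handled, and illustrates why the non-duplication of parametricity witnesses is important.

$$
\begin{aligned}
& \langle|\llbracket \lambda x^{\square_0} : A.\, b \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket\} \\
& \langle|\lambda \underline{x} : A.\, \lambda \dot{x}^{\ast} : x \in \llbracket A \rrbracket_{\chi(\Delta)}.\, \llbracket b \rrbracket_{\chi(\Delta), x \mapsto \dot{x}}|\rangle_{\Gamma} \\
={}& \{\text{by } \alpha\text{-renaming, and def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \lambda \underline{x} : \langle|A|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : \langle|x \in \llbracket A \rrbracket_{\chi(\Delta)}|\rangle_{\Gamma}.\, \langle|\llbracket b \rrbracket_{\chi(\Delta, x)}|\rangle_{\Gamma, x} \\
={}& \{\text{by IH}\} \\
& \lambda \underline{x} : \langle|A|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}.\, \llbracket \langle|b|\rangle_{\Gamma, x} \rrbracket_{\chi(\Gamma, x)} \\
={}& \{\text{by def. of } \llbracket\cdot\rrbracket\} \\
& \llbracket \lambda \underline{x} : \langle|A|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}.\, \langle|b|\rangle_{\Gamma, x} \rrbracket_{\chi(\Gamma)} \\
={}& \{\text{by def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \llbracket \langle|\lambda x^{\square_0} : A.\, b|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}
\end{aligned}
$$

– Product: same as above.

– Otherwise: straightforward. ⊓⊔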
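The commutation stated in Lemma 11 can be replayed on small terms with an untyped skeleton of the two transformations. This is only a sketch under simplifying assumptions: types and sorts are erased, every variable is treated as a variable of the first sort, marked abstractions and applications are represented by dedicated constructors, and all names (`dot`, `rel`, `trans`, the tuple tags) are hypothetical choices made for this illustration.

```python
# Terms are tuples:
#   ("var", x)            variable
#   ("wit", x)            parametric witness of x
#   ("lam", x, b)         unmarked abstraction
#   ("app", f, a)         unmarked application
#   ("mlam", x, xd, b)    marked abstraction, binding x and its witness xd
#   ("mapp", f, a, w)     marked application of f to a with explicit witness w

def dot(x):
    """Hypothetical naming scheme for the witness variable paired with x."""
    return x + "_r"

def rel(t, xi):
    """Relational interpretation, including its behavior on marked terms;
    xi is the set of variables that have a relational counterpart."""
    tag = t[0]
    if tag == "var":
        return ("var", dot(t[1])) if t[1] in xi else ("wit", t[1])
    if tag == "lam":
        _, x, b = t
        return ("mlam", x, dot(x), rel(b, xi | {x}))
    if tag == "app":
        _, f, a = t
        return ("mapp", rel(f, xi), a, rel(a, xi))
    if tag == "mlam":              # witness already bound: do not duplicate
        _, x, xd, b = t
        return ("mlam", x, xd, rel(b, xi | {x}))
    if tag == "mapp":              # witness already passed: keep it
        _, f, a, w = t
        return ("mapp", rel(f, xi), a, w)
    raise ValueError("nested parametricity is forbidden")

def trans(t, gamma):
    """Explicit binding of witnesses; gamma is the set of context variables."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "wit":
        return ("var", dot(t[1]))
    if tag == "lam":
        _, x, b = t
        return ("mlam", x, dot(x), trans(b, gamma | {x}))
    if tag == "app":
        _, f, a = t
        a2 = trans(a, gamma)
        return ("mapp", trans(f, gamma), a2, rel(a2, gamma))
    if tag == "mlam":
        _, x, xd, b = t
        return ("mlam", x, xd, trans(b, gamma | {x}))
    _, f, a, w = t
    return ("mapp", trans(f, gamma), trans(a, gamma), trans(w, gamma))

# First example of the table: \x. witness(x) becomes \x. \x_r. x_r.
row1 = trans(("lam", "x", ("wit", "x")), frozenset())

# Lemma 11 on A = \x. x z, with Gamma = {z} and Delta ranging over {} and {z}.
A = ("lam", "x", ("app", ("var", "x"), ("var", "z")))
gamma = frozenset({"z"})
commutes = all(
    rel(trans(A, gamma), gamma) == trans(rel(A, delta), gamma)
    for delta in (frozenset(), gamma)
)
```

When `z` is not in Δ, the relational interpretation produces the witness of `z`, which the translation then replaces by the variable `z_r`; when `z` is in Δ, the variable `z_r` is produced directly. Both routes meet, which is the content of the variable case of the proof.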

Another important lemma is that $\langle|\cdot|\rangle$ applied to a context adds all necessary witnesses. Formally:

Lemma 12. $\llbracket \langle|\Gamma|\rangle \rrbracket_{\xi} = \langle|\Gamma|\rangle$

Proof. By induction on Γ. An easy consequence of the fact that we leave unchanged the bindings already equipped with an explicit parametricity witness.

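Lemma 13 ($\langle|\cdot|\rangle_{\Gamma}$ and substitution).

– $\langle|A[z^{\ast} \mapsto a]|\rangle_{\Gamma} = \langle|A|\rangle_{\Gamma}[z \mapsto \langle|a|\rangle_{\Gamma}]$
– $\langle|A[z^{\square_0} \mapsto a]|\rangle_{\Gamma} = \langle|A|\rangle_{\Gamma, z}[z \mapsto \langle|a|\rangle_{\Gamma}][\dot{z} \mapsto \llbracket \langle|a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}]$

Proof. By (simultaneous) induction on A. We illustrate how the proof proceeds by showing (only) the case for $\lambda x^{\square_0} : A.\, b$, with z of sort $\square_0$ (the second item of the lemma).

– Abstraction: case $\lambda x^{\square_0} : A.\, b$, the other ones being trivial.

$$
\begin{aligned}
& \langle|(\lambda x^{\square_0} : A.\, b)[z^{\square_0} \mapsto a]|\rangle_{\Gamma} \\
={}& \{\text{by def. of the substitution}\} \\
& \langle|\lambda x^{\square_0} : A[z \mapsto a].\, b[z \mapsto a]|\rangle_{\Gamma} \\
={}& \{\text{by def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \lambda \underline{x} : \langle|A[z \mapsto a]|\rangle_{\Gamma}.\, \lambda \dot{x}^{\ast} : x \in \llbracket \langle|A[z \mapsto a]|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}.\, \langle|b[z \mapsto a]|\rangle_{\Gamma, x} \\
={}& \{\text{by IH and Lemmas 2 and 4}\} \\
& (\lambda \underline{x} : \langle|A|\rangle_{\Gamma, z}.\, \lambda \dot{x}^{\ast} : x \in \llbracket \langle|A|\rangle_{\Gamma, z} \rrbracket_{\chi(\Gamma, z)}.\, \langle|b|\rangle_{\Gamma, z, x})[z \mapsto \langle|a|\rangle_{\Gamma}][\dot{z} \mapsto \llbracket \langle|a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}] \\
={}& \{\text{by def. of } \langle|\cdot|\rangle_{\Gamma}\} \\
& \langle|\lambda x^{\square_0} : A.\, b|\rangle_{\Gamma, z}[z \mapsto \langle|a|\rangle_{\Gamma}][\dot{z} \mapsto \llbracket \langle|a|\rangle_{\Gamma} \rrbracket_{\chi(\Gamma)}]
\end{aligned}
$$

– Product: same as above.

– Otherwise: straightforward. ⊓⊔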

Finally we can show the soundness of our model, by proving that the transformation yields well-typed terms in CCω.

Lemma 14. $\Gamma \vdash A : B \implies \langle|\Gamma|\rangle \vdash_{CC_\omega} \langle|A|\rangle_{\Gamma} : \langle|B|\rangle_{\Gamma}$

Proof. By induction on the derivation. Most cases are straightforward given the above definition of $\langle|\cdot|\rangle_{\Gamma}$, and use the abstraction theorem wherever $\langle|\cdot|\rangle_{\Gamma}$ uses $\llbracket\cdot\rrbracket$. Three cases merit further mention:

– The case of application essentially relies on the fact that if $\langle|a|\rangle_{\Gamma}$ is typeable in $\langle|\Gamma|\rangle$, then the explicit witness of parametricity, $\llbracket \langle|a|\rangle_{\Gamma} \rrbracket$, is typeable in the same context. This is a consequence of Lemma 12.

– The conversion case relies on Lemma 13.

– The base case of parametricity relies on Lemma 11, and is informally explained above.

Details of the application and parametricity cases are given in Figure 1. ⊓⊔

[Figure: derivation trees for the cases Axiom, Weakening, Application, Abstraction, Product, Conversion, Start and Param of the proof of Lemma 14. The Axiom case is trivial (the tree is untouched) and the Start case is easy (one weakening is added for $\Gamma \vdash x : A : \square_0$). The Weakening and Product cases use Lemma 8.3; the Application case uses Lemma 8.1, Lemma 12 and Lemma 13; the Conversion case uses Corollary 1 and Lemma 15; the Param case uses Lemma 8.3, Start and Lemma 11.]

Fig. 1. Soundness of model: proof details.


Lemma 15 (Congruence of $\langle|\cdot|\rangle_{\Gamma}$). If $\Gamma \vdash A : B$ with $A \longrightarrow A'$, then

$$\langle|A|\rangle_{\Gamma} \longrightarrow^{+} \langle|A'|\rangle_{\Gamma}.$$

Proof. Each occurrence of the App rule is translated to at least one occurrence of App in the typing derivation.

Theorem 7 (Strong normalization). If CCω is strongly normalizing, then so is the system extended with Param.

Proof. Assume $\Gamma \vdash A : B$ and consider a chain of reductions $A \longrightarrow^{n} A'$. We have $\langle|A|\rangle_{\Gamma} \longrightarrow^{m} \langle|A'|\rangle_{\Gamma}$, and $m \geq n$ by Lemma 15. We also have that $\langle|A|\rangle_{\Gamma}$ is typeable in CCω, by Lemma 14. Therefore, only finite chains of reductions are possible.

5 Generalized abstraction: Proof details

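The proof of Lemma 8 depends on the following lemmas:

1. $\Gamma \vdash A : B : \square_0$ and $B \neq \ast \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi} \vdash \llbracket A \rrbracket_{\xi} : A \in \llbracket B \rrbracket_{\xi} : \ast$
2. $\Gamma \vdash A : B : s \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi} \vdash A : B : s$
3. $\Gamma \vdash B : \square_0$ and $B \neq \ast \;\Rightarrow\; \llbracket \Gamma \rrbracket_{\xi}, x : B \vdash x \in \llbracket B \rrbracket_{\xi} : \ast$

The lemmas are proved by transforming derivation trees. They mutually depend on each other (but only for structurally smaller statements: the recursion is sound). In the proofs, the trees generated by each sub-lemma are written as follows:

– Sub-lemma 1: $\llbracket \Gamma \vdash A : B \rrbracket_{\xi}$,
– Sub-lemma 2: $|\Gamma \vdash A : B|$,
– Sub-lemma 3: $\{\Gamma \vdash B : \square_0\}_{\xi}$.

We only give further detail for 1. and 3.

Note that proofs never need to handle the Param case, because nesting of parametricity is forbidden. (The premise of the theorems is always invalidated.)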
