incremental reduction in the lambda calculus*...tality: demers, reps, and teitelbaum [drt81,rep82]...

16
Incremental Reduction in the Lambda Calculus* John Field Tim Teitelbaum Cornell University+ Abstract An incremental algorithm is one that takes advantage of the fact that the function it computes is to be evaluated repeat- edly on inputs that differ only slightly from one another, avoiding unnecessary duplication of common computations. We define here a new notion of incrementality for reduc- tion in the untyped X-calculus and describe an incremental reduction algorithm, nine. We show that A”” has the desir- able property of performing non-overlapping reductions on related terms, yet is simple enough to allow a practical im- plementation. The algorithm is based on a novel X-reduction strategy that may prove useful in a non-incremental setting as well. Incremental X-reduction can be used to advantage in any setting where an algorithm is specified in a functional or apphcative manner. 1 Introduction 1.1 Incremental Computation Applications that require computations on a sequence of slightly differing inputs occur with surprising frequency, and researchers have investigated numerous notions of incremen- tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute grammars. Alpern, et al. [ACR+87] describe a generalization of incremental attribute gram- mar evaluation for arbitrary graphs. Earley and Paige and Koenig [Ear76,PK82] have investigated what are in essence incremental methods for updating aggregate data structures (such as sets) in loops in very high-level programming lan- guages. Yellin and Strom pS87] describe a special pro- gramming language where networks of functions on aggre- gate data structures can be defined. Small changes to such aggregates are reflected in efficient propagation of updates throughout the network. Pugh and Teitelbaum [PTSS] dis- ‘This research was supported by NSF and ONR under NSF grant CCFG3514862. tAuthors’ Address: Department of Computer Science, Cor- nell University, Upson Hall, Ithaca, NY 14853. Electronic mail: fieldQcs.cornell.edu, ttQcs.corneIl.edu. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy other- wise, or to republish, requires a fee and/or specific permission. cuss incremental evaluation of functions using caching tech- niques. Berman, Paull, and Ryder [BPR85] have investi- gated the limitations of incremental algorithms from the abstract point of view of computational complexity. Ben- gelloun [Ben821 describes an incremental evaluator for the language SCHEME in which list structures manipulated by functions may be.incrementally updated. There are count- less more specific (or ad-hoc) examples of incremental com- putation in other domains. The advantage of investigating incremental reduction in the X-calculus is that it is more general than the approaches discussed above, yet it is also simple-one can describe changes to functions (including higher-order functions) and to data structures (encoded as X-terms) in the same formal- ism. The X-calculus also has a rich reduction theory that can be exploited to analyze the limits of incremental techniques. 
1.2 Prerequisites The reader is referred to Appendix A for a brief review of X-calculus terminology we will use here, some of which is slightly nonstandard. An acquaintance with the work on Categorical Combinators of Curien [Cur86a,Cur86b], or the related systems of [ACCLSO] and [FieSOb] would be useful. The latter is used extensively in the sequel, and a review of it may be found in Appendix B. The formal treatment of incremental reduction in Section 6 requires some familiar- ity with the reduction theory of the A-calculus covered in [Bar84,BKKS87,Klo80,L&78,LCv80]. 2 Motivation As an example of an application for which incremental X- reduction is particularly appropriate, consider the abstract syntax for a small expression language in Figure 1, which we will call 1;. A program of L consists of a list of declara- tions and an expression. The expressions of L are built from natural numbers, identifiers, and sums. A declaration is a binding of an identifier to a number. Though patently use- less, the language serves to illustrate some important issues. A typical expression of L is the following: leta=2;b=5in((3+4)+a) (where we add appropriate keywords for the sake of read- ability). Along with each grammar rule in L, we define a deno- tation for the rule’s left-hand nonterminal (e.g., (rl’rogl), using a notational variant of the lambda calculus in which @ 1990 ACM 089791-368-X/90/0006/0307 $1.50

Upload: others

Post on 02-Oct-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Incremental Reduction in the Lambda Calculus*

John Field Tim Teitelbaum

Cornell University+

Abstract

An incremental algorithm is one that takes advantage of the fact that the function it computes is to be evaluated repeat- edly on inputs that differ only slightly from one another, avoiding unnecessary duplication of common computations.

We define here a new notion of incrementality for reduc- tion in the untyped X-calculus and describe an incremental reduction algorithm, nine. We show that A”” has the desir- able property of performing non-overlapping reductions on related terms, yet is simple enough to allow a practical im- plementation. The algorithm is based on a novel X-reduction strategy that may prove useful in a non-incremental setting as well.

Incremental X-reduction can be used to advantage in any setting where an algorithm is specified in a functional or apphcative manner.

1 Introduction

1.1 Incremental Computation

Applications that require computations on a sequence of slightly differing inputs occur with surprising frequency, and researchers have investigated numerous notions of incremen- tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute grammars. Alpern, et al. [ACR+87] describe a generalization of incremental attribute gram- mar evaluation for arbitrary graphs. Earley and Paige and Koenig [Ear76,PK82] have investigated what are in essence incremental methods for updating aggregate data structures (such as sets) in loops in very high-level programming lan- guages. Yellin and Strom pS87] describe a special pro- gramming language where networks of functions on aggre- gate data structures can be defined. Small changes to such aggregates are reflected in efficient propagation of updates throughout the network. Pugh and Teitelbaum [PTSS] dis-

‘This research was supported by NSF and ONR under NSF grant CCFG3514862.

tAuthors’ Address: Department of Computer Science, Cor- nell University, Upson Hall, Ithaca, NY 14853. Electronic mail: fieldQcs.cornell.edu, ttQcs.corneIl.edu.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy other- wise, or to republish, requires a fee and/or specific permission.

cuss incremental evaluation of functions using caching tech- niques. Berman, Paull, and Ryder [BPR85] have investi- gated the limitations of incremental algorithms from the abstract point of view of computational complexity. Ben- gelloun [Ben821 describes an incremental evaluator for the language SCHEME in which list structures manipulated by functions may be.incrementally updated. There are count- less more specific (or ad-hoc) examples of incremental com- putation in other domains.

The advantage of investigating incremental reduction in the X-calculus is that it is more general than the approaches discussed above, yet it is also simple-one can describe changes to functions (including higher-order functions) and to data structures (encoded as X-terms) in the same formal- ism. The X-calculus also has a rich reduction theory that can be exploited to analyze the limits of incremental techniques.

1.2 Prerequisites

The reader is referred to Appendix A for a brief review of X-calculus terminology we will use here, some of which is slightly nonstandard. An acquaintance with the work on Categorical Combinators of Curien [Cur86a,Cur86b], or the related systems of [ACCLSO] and [FieSOb] would be useful. The latter is used extensively in the sequel, and a review of it may be found in Appendix B. The formal treatment of incremental reduction in Section 6 requires some familiar- ity with the reduction theory of the A-calculus covered in [Bar84,BKKS87,Klo80,L&78,LCv80].

2 Motivation

As an example of an application for which incremental X- reduction is particularly appropriate, consider the abstract syntax for a small expression language in Figure 1, which we will call 1;. A program of L consists of a list of declara- tions and an expression. The expressions of L are built from natural numbers, identifiers, and sums. A declaration is a binding of an identifier to a number. Though patently use- less, the language serves to illustrate some important issues. A typical expression of L is the following:

leta=2;b=5in((3+4)+a)

(where we add appropriate keywords for the sake of read- ability).

Along with each grammar rule in L, we define a deno- tation for the rule’s left-hand nonterminal (e.g., (rl’rogl), using a notational variant of the lambda calculus in which

@ 1990 ACM 089791-368-X/90/0006/0307 $1.50

Page 2: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Grammar

Prog ::= De& Expr (program)

Expr ::= Nat (constant) I Id (variable). 1 Expr, Exprs (sum)

Decls ::= Nulldecls (empty decl.) 1 BindDecJs (decl. list)

Denotation

[Prog] I ([Expr][Decls])

[Expr] E Aenv.Nat E A em .( env Id) E A env .((I[Expr,] env) + ( [Exp~2] env))

[Deck] G Ai. E A i .if (([Bina i) # (-1))

then ([IBindjj i) else ([Deck] i)

Bind ::= IdNat (binding) [Bind E A i . if (i = Id) then Nat else (-1)

Figure 1: The Language L

integer operations, conditionals, etc., are predefined. By re- ducing the X-expression denoting a program of L to weak head normal form, we obtain its meaning. Thus the mean- ing of the expression above is 9, obtained by reduction of the X-expression that denotes it. Most of the complications in the denotational definitions of L arise from the “lookup” mechanism used for determining values to which variables are bound. The abstract syntax of L may be viewed as defin- ing a term algebra of the “parse trees” for the language in the usual way. Editing an expression of L thus amounts to performing a subtree replacement in a term of its algebra, for which there is a corresponding subtree replacement in the X-expression denoting the term.

Consider the possible consequences of changing part of the program above (by making a subtree replacement in its corresponding term), e.g.,

1. Changing the binding for b to 7.

2. Changing the binding for a to 3.

3. Adding another binding for a.

If we perform (l), there is no need to re-evaluate the sum since it contains no instance of variable b. (2) requires re- evaluating the outer sum to determine its new meaning, but not the inner expression (3+4). (3) may invalidate the use of the previous binding for a, as well as require re-evaluation of the outer sum. Multiple changes would have further- reaching consequences.

We would like it to be the case that the new dens tational X-expression can be re-evaluated without repeat- ing any of the work required to evaluate the denotation of the old term-in other words, we want to reduce the new X-expression incrementally. Note that the X-expression defining a denotation is not subject to completely arbitrary changes after a subtree replacement; only A-terms denoting corresponding subterms in the language’s term algebra are subject to alteration. For instance, in the X-term denoting a sum,

A env .((uExpr,] env) + (UExprJ env))

the only X-subterms that can be replaced are UExpr,jj and Ubdl.

3 The Problem

Moving away from our motivating example, let us examine the problem of incremental X-reduction in the abstract.

3.1 Reduction of Similar Terms

Consider the following A-terms (where I E A2.z):

ML

M2

M3

( XXY.XY

( XXY.X (

E ( XXY.Y (

(II) (II)) II) (II)) II) (II))

Ml, M2, and MS are identical except for the boxed terms. The following set of (outermost) reductions suffice to reduce these terms to normal form:

~1: (p((IV)(I*I)) E Ml - (PI) (PI) s, I (PI) - IbI

b - I E M;

p3: (I~I(IV) (I*I)) = MJ rbr

- 1-1

b - I

Certain redexes above have been labeled by attachment of a superscript letter to their abstraction term. Contraction of such a redex is indicated by the correspondingly labeled ‘ - ‘.

The redexes contracted in pi, ps, and ps overlap, in the sense that pi and pz both contract the redex labeled a, and pi and p3 both contract the redex labeled b.

If we reduce Mr and subsequently are faced with the task of reducing M2 and Ma, it would be useful to take advantage of the fact that certain redexes in Mi have already been contracted, avoiding contracting the same redexes in the reductions of MZ and Ms.

308

Page 3: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

The similarity among Ml, Mz, and Ms is made more apparent if we note that they can be represented as substi- tution instances of the same term:

Ml E N[” := Xzy.zy] Mz E N[z := Azy.z] Ms 3 N[z := Xzy.y]

where N E (z (II) (II)). We will refer to such terms as similar by substitution

(or, in the context of thii paper, simply “similar”), since they are the same modulo differing substitutions for some

free variable in a term. We will use the notation ML z MZ if Ml and M2 can be rewritten ss N[z := PI] and N[z := Pz] for some z, PI, and P2. The terms A and P2 are deemed the substituands of the similar terms, and N will be called the common derm (of Ml and Mz). A set of similar terms, e.g., SN will refer to a set of pairwise similar terms all having the same common term (e.g., N).

Note that the common term is not necessarily a subterm of any of the set of similar terms-the appearance of N in N[z := Pi] is merely a notational convenience. For instance, (z (II)( is not a subterm of Ml. However, the com- mon term of a set of similar terms always can be explicitly “factored out” using P-expansion:

((s4z.N) Xzy.zy) -ep Ml

((Xz.N) Xzy.z) -@ Mz

(0z.N) 1zy.y) -,s Ma

3.2 Reduction in Common Subterms

Our goal is to reduce a sequence of similar terms in such a way as to avoid the contraction of common redexes, with the aim of enabling shorter (thus faster) reductions of later terms in the sequence than would be possible were each term reduced ab initio. We defer a more formal definition of what constitutes a Yommon contractior? or “common computa- tion” to Section 6, appealing here to intuition.

Our approach to the problem is as follows: each mem- ber of a set of similar terms is reduced in such a way as to preserve in the common term the relevant computations performed in the reduction of the entire term. The idea is illustrated in the following reduction of Ml, where we let Ml E N[z := ~zy.zy] and N E (t (PI) (1”1)):

6):: N[z := Xzy.zy] G Ml s, (z I (PI))[z := Xzy.zy] E (Xzy.zy) I (PI)

2. p.aJz := Azy.zy] z (Azy.zy) I I

- II” - I E M:

While pi is not a outermost reduction, it nonetheless reaches a normal form, and has the pleasing property of producing a very useful intermediate subterm, N’[z := Xzy.zy]. This term results from the contraction of redexes labeled a and b, which are in a sense relevant only to the common term N. We can summarize pi as follows:

p;: N[z := Xzy.zy] - N’[t := hy.zy] --H I

If we were able to extract the reduced common term N’ from its substitution instance Ml and preserve it in the

course of the reduction, we could use it to reduce M2 and Ms as follows: Let N’ s (211). We then have:

4: N’[z := Azy.z] E Xzy.z I I -I E M;

Pi: N’[z := Xzy.y] z Azy.y I I -I I M;

We also know that since N--H N’, and Mi - N’[z := Pi], pi and pi yield correct normal forms for M2 and Ma, respectively, though their initial terms are not M2 and Mz.

Unlike their counterparts p2 and ps, pi and pj avoid con- tracting redexes labeled u and b, since these computations were previously carried out in reduction pi, and are em- bodied in N’. In the sequel, we will demonstrate that the common term and its reduction can indeed be effectively preserved in the course of reducing one of its substitution inst antes .

Given a set of similar terms, it should be clear that in general it is not possible simply to extract and reduce the common term to a normal form (as we did with N’), as none may exist. Let R z (z (II) S2), where R EE (Xz.zz)(Xz.zz). We then have the following outermost reductions:

rl: R[z := Xzy.z] c Ql - xzy.z (II) Cl - II - I Q: R[z := Xzy.a] 3 Q2 - Xzy.a (II) R - a rj: R[z := Xzy.y] I Q3 - Xzy.y (II) il - s2 - . . .

It should be evident from the example above that the redexes contracted in R depend on R’s substituand. In rl, Xzy.z %aused” redex (II) in R to be contracted. In ~2, however, the (II) redex is not contracted at all. TJ never reaches a normal form (and thus none exists for Qa). In short, whether or not a redex in common term R is needed to yield a normal form depends on the value of the substituand. The notion of “neededness” is formalized and discussed in some detail in [BKKS87].

The possibility of contracting redexes in the common term in advance of the reduction of the entire term is lim- ited by the fact that knowing which redexes in the common term will be subsequently needed is undecidable. At best, the common term can only be reduced to some “safe” form (such as head normal form), but this would not necessarily capture all the contractions performed in the common term during the reduction of one of its substitution instances.

Such safe pre-computation falls under the general rubric of partial evaluation, a technique that has been proposed for the implementation of compiler construction systems, among other things (see, e.g., [JSSSS]). The goal here is rather different: to preserve computations in the common term that are dependent on the substitutable part of an ex- pression in addition to those that are independent.’

We are thus led to investigate techniques that allow the preservation of the result of contractions in the common term as a side-efiect of reducing one of the members of a set of similar terms.

3.3 Incremental Lambda-Reduction

In general, we assume we are given terms Ml and MS such

that Ml c M2, that is, terms having the forms N[z := PI]

‘We have recently become aware of work on incremental computa- tion using partial evaluation [SunSO], but were unable to obtain infor- mation on its details in time to permit comparison with the methods used here.

309

Page 4: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

and N[z := Pz]. We would then like to perform reduction

~1: N[z := PI] - M:

while also computing a term N’ such that ri”: N --H N’, where (informally) N’ embodies all of the contractions per- formed in N during rr. ri N will be called a projection of rl on (common) term N. We can then use N’ in a reduction of Mz:

~2: N’[z := Pz] - M;

This process can easily be extended to a sequence of reduc- tions of an arbitrarily large set of similar terms; e.g., from rz, we can compute an N”, the final term of a projection rzN’: N’ - N” of r2 on N’, etc.

Note that depending on N’, it is generally not the case that N[z := Pz] - Q implies N’[z := Pz] - Q (although N[z := P2]=oN’[x := Pz]). We will, however, be concerned here with applications in which the desired final term of a reduction is some sort of normal form (e.g., weak head normal form), where any final term of the desired form is acceptable. Our goal will be to minimize as much ss possible the number of /%contractions required to yield such a normal form by taking advantage of previous computations. The process described above constitutes incremental reduction in the ~-calculus.

We have heretofore described only terms that differ in a single subterm. However, the notion of “similarity” can be extended easily to terms that may differ in arbitrar- ily many subterms (e.g., let Ml 3 N[zi := Pl1 , z2 := Pi21 and Mz E N[zi := 91 , 22 := Pzz]). We will also consider terms whose substituands may be nested, i.e., terms in which substituands contain other substituands. The language de- notation discussed in Section 2 has substituands of this form induced by the term structure of its grammar. However, we will defer discussion of these generalizations, (which are of far greater practical value than the case of single substi- tuands), to Section 8. The special case of similar terms with single substituands suffices here for the purposes of exposi- tion.

4 Informal Presentation of Incremental Reduction

4.1 Preliminaries

4.1.1 Lambda Terms as Graphs

Our incremental X-reduction algorithm will use the common device of representing X-terms by (directed acyclic) graphs, an idea pioneered by Wadsworth [WadU]. In such a rep resentation, subgraphs may be shared, enabling simultane- ous contraction of the various redexes represented by the shared term. In addition, implementations will generally al- low nodes in the graph to be overwritten in place by terms to which they reduce. For more details on the use of graphs in X-reduction, see [Pey87]. L&y [Ldv78,LCvSO] observed that such shared contractions can be formally described as “parallel” reductions. Such reductions are briefly reviewed in Appendix A.

In the sequel, we will feel free to intermix conventional text and graphs, e.g.,

0: (b(arV))(~~)-+ ( I- ( ) = z* U(I,,U.

represents a reduction where two (1%) redexes are shared.

4.1.2 Closures and Environments

Also useful will be the notion of environment, used in many practical reduction schemes, e.g., those of [FW87,CCM87]. An environment consists of sets of mappings between vari- able names and values, or bindings. The result of a p- contraction is then a closure consisting of the body of the abstraction part of the redex, paired with an environment updated to contain the binding of the abstraction’s bound variable to the argument of the redex. The idea is illustrated below:

(Xz.(xx))N - [(xx), ((z := N))]

We will use the notation [T, E] to represent a closure con- sisting of term T and environment E. An environment is denoted thus:

((S,Bz,...))

where RI, 92, etc. are bindings.

4.1.3 The Fork

Finally, we introduce a new data structure, the fork node: A. Forks allow certain sets of X-terms to be represented by a single graph. For example, let:

3 Aq/.(I I 2) :; E Axy.Ix Q3 z Xzy.z

These terms could be represented simultaneously as follows:

Depending on whether the right or left branches of the two A nodes are inspected, different terms may be read off. In the sequel, the two branches will always represent P-equivalent terms (as is the case above).

Note that the arguments of an application whose func- tion part is a A node can be “distributed” through the A to yield an alternate representation of the same set of terms, e.g.:

4.2 Example

Let us return to the example of Section 3 to illustrate how the reduction algorithm Ainc works, where, as before, we are given the mutually similar terms Ml i N[z := Xzy.zy], Mz z N[z := Xzy.~], and Ms z N[z := Xzp.y].

We begin by reducing Ml to normal form, using the fol- lowing graph to represent the initial term:

Ml f (( (II))(

8 xxy.xy

310

Page 5: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Note that the two terms represented by the graph above are the same.

We then reduce MI as shown in Figure 2, using a variant of the outermost graph-reduction strategy of Wadsworth, where all steps are either @-contractions (on possibly shared terms) or distribution of arguments of applications through A nodes. The node being reduced at each step is boxed. At all times, the graph represents a pair of P-equivalent terms, depending on whether only left branches or only right branches are inspected. The pair of terms represented by the graph at each stage of the reduction process is listed next to each graph.

When reduction of Ml is complete, we are left with a term headed by a A node whose left subtree is the normal form of Ml and whose right subtree is the term

(% II)[% := A] G N’[Z := PI]

If we disregard the substituand PI, we can regard the right subtree as representing N’, the desired final term of the projection of the reduction on N.

In general, A’“’ reduces an initial term whose substi- tuand is replaced by a A node (the branches of which both point to the substituand). When a A node is encountered in the function part of an application during reduction, its argument is distributed into each branch of the A, effec- tively creating two copies of the application. We can view the A node as a sort of “checkpoint” in the reduction pro- cess, which preserves the redex whose contraction depends on the value of the substituand (i.e., a redex whose function part is the substituand). Reduction then proceeds only on the left branch of the A. However, contraction of a redex in the left subterm of a A may have the beneficent side-effect of contracting a redex in the right subterm, if the redex is shared. The trick is to perform the reduction in such a way that the maximal number of such shared contractions take place.

The combination of graph reduction and appropriate use of the A node is crucial to performing incremental reduction: The A node enables a subterm whose reduction is dependent upon the substituand to be preserved, while graph reduction enables redex contractions that are independent of the sub- stituand to affect both terms reachable from a A.

Continuing with the example above, if we remove the topmost A node, discard its left subterm, and replace Pi by

8 xxy.x

in the (former) right subterm of the topmost A, we form the new term (disregarding the A node)

N’[z := Pz]

We then reduce the new term, taking advantage of the fact that N’ embodies the reduction already performed in N above, and avoiding (as was our aim) contraction of over- lapping redexes. The new reduction is given in Figure 3. We can repeat the process as required for additional sub stituands Ps, Pt.. . . Note that in the example above, the final graph is the same as that of the reduction in Figure

2. In general, however, each substituand may induce addi- tional reductions in the common term (which wss not pos- sible above since the common term N’ E (Z I I) was already reduced to normal form by the first reduction).

Each time reduction is to be performed in the presence of a new substituand, the initial term is reconstructed by replacing each reference to a fork node with a reference to its right subterm, thus preserving the “checkpoints” on which the substituand depended. A new substituand (along with an appropriately configured A node) replaces the old one and the reduction process begins anew.

Unfortunately, Wadsworth-styIe graph reduction is not completely satisfactory in the case illustrated in Figure 4. There we are unable to take advantage of the fact that the redex la is essentially a copy (i.e., a residual) of the ly redex in Xy.1~. The 1~ redex is left uncontracted in the final term. Thii phenomenon is related to the issue of “full-laziness” in functional language implementations. Although we could have contracted the ly redex before applying Xy.ly to a, we wish to retain an outermost reduction strategy. By using shared closures and environments, the (1~) redex can be contracted independently of the substitution of a for y, even though the reduction is nominally outermost, as we see in Figure 5.

In order to preserve residuals of inner redexes, Ain’ will use shared environments and closures, as well as shared X- terms. In addition, by using environments, substitution can be implemented efficiently.

5 The Incremental Reduction Algorithm

5.1 AACCL

Ainc is specified using an augmented version of the ACCL formalism described in PieSOb]. The system defined in [ACCLSOJ is q ui e t similar, and would also suffice for this purpose. ACCL is intended to formalize the notions of en- vironment and closure used to implement the operation of substitution in many A-reduction algorithms. ACCL is a conservative extension of the A/3-calculus, that is, for every P-reduction, there is a corresponding reduction in ACCL. This close connection between ACCL and the X-calculus makes it easy to show that our algorithm is correct.

By manipulating environments and closures explicitly, ACCL makes possible X-reduction strategies in which cer- tain substitution operations may be deferred until the need for a term arises, or in which a term may be reduced indepen- dently of several different substitutions for its free variables. Furthermore, as a left-linear term rewriting system, ACCL reduction can be implemented in a straightforward way us- ing graph-reduction on shared terms (see e.g., [BvEG+87] for formalizations of this idea). AlI of these properties of ACCL will be exploited in nine. The details of ACCL are reviewed in Appendix B.

To implement incremental reduction, we will extend ACCL with a new constructor, A(., .), the “fork” node used above informally, and add several new axioms defining its behavior. The resulting system is called AACCL:

Definition 5.1 The terms of AACCL are those ofACCL, along with those built from the following new constructor:

A(C, L) : C (fork)

311

Page 6: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Graph

(

xxy.xy

Corresponding Terms Operation Left A-Subterms Right A-Subterms

distrib ((XXY-XY)PW)) @Y-XYW)W))

P ((~XY.XY)WWN (PxY*xYww~N

XY.‘, Y) (PXY.XY) 1’

(II) distrib ((Xy.(ll) y)(ll)) ((kcy.~y)(ll)(~~))

J

>

cc P @Y-W) YW)) (WY.XYWW))

f=i%, p (II) ((Xxy.xy)l(ll))

@XY.XY) 14

Figure 2: Reduction of MI

312

Page 7: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

(

VP

I 1) -

Ax y.x

P

XYJY)

Xx.xa

Figure 3: Contraction of N’[z := Pz]

A ?

(3 ((Xx.xa) Ay.B

Figure 4: Shortcomings of Outermost Graph Reduction

The axioms of AACCL am those of ACCL, augmented with the following:

RightInterp(T) js defined in the analogous manner for right subterms of As.

[E) [A(A, B), El = Wk El, VT 4) NW& B)) =

(AApply) APP~Y(A(A, B)> C) = A@(A), A(B))

AtApply (A, C), APP~Y(& C))

Given an arbitrary AACCL reduction, we can construct its image on the ACCL terms formed by following only left or only right branches of As:

Proposition 5.2 Given AACCL reduction

These new axioms formalize the distribution rules for A used earlier.

P: A -AACCL B

Treated as a term-rewriting system, AACCL is well be- haved, i.e., we have the following:

there exists an interpretation ofp, Leftlnterp(p), such that

Leftlnterp(p): Leftlnterp(A) -ACCL Leftlnterp(B) Proposition 5.1 AACCL is confluent.

We can interpret a AACCL term as a set of related ACCL terms (or A-terms). A particular term of the set is determined by specifying a path to be taken through the A nodes in the term (i.e., for each A, whether to inspect the right or the left subterm). Two particularly useful terms to consider are those designated by inspecting only left sub- terms of A nodes, or only right subterms:

where the number of(Beta) contractions in LeftInterp(p) is less than or equal to the number of (Beta) contractions in P-

There also exists a symmetric interpretation, Rightlnterp(p).

5.2 Reduction Strategy

Definition 5.2 Let T be a AACCL term. Leftlnterp(T) is defined as follows:

Then Figure 6 consists of two versions of an algorithm that reduces a term of AACCL to the ACCL-equivalent of weak head normal form, WHNF.

LeftInterp(A(A, B)) = Leftlnterp(A)

LeftInterp(Apply(A, I?)) =

Apply(LeftInterp(A), LeftInterp(B))

Leftlnterp(h(A)) = A(LeftInterp(A))

LeftZnterp(Var) = Var

The first algorithm, rwhnf() is the starting point for Ai”‘. It is obtained by omitting the lines preceded by ‘*‘s in Figure 6, and operates only on terms of ACCL. rwhnf() is weak-head normalizing, and reduces only needed redexes (see Section 6.1). Its reduction strategy nevertheless differs from that normally used by environment-based X-reduction algorithms, e.g., that of [HM76] (see below).

Arwhnf() (the full algorithm in Figure 6) is an extension of rwhnf() to terms of AACCL. Most of the properties of

(similarly for other AACCL operators)

313

Page 8: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

P XYJY)

Xx.xa

- A a ((X-4 CAY.YN

Figure 5: Reduction using Closures and Environments

ArwJmf() are straightforward consequences of the proper- ties of rwhnf(). However, by inserting A nodes at apprG priate points in the initial term, Arwhnf() will be used to perform incremental A-reduction.

rwhnf() and Arwhnf() are specified using the rewrite rules of AACCL, a recursive redex selection strategy, and shared terms. The functional notation used in the algorithm should be reasonably self explanatory for someone familiar with a language such as ML or Miranda. However, the algo- rithm should be considered a recursively specified sequence of transformations on the term given as argument, not a true function, since no value is to be returned.

The case statement executes various statements depend- ing on a pattern to be matched. Subpatterns within larger patterns are named using the notation subpo&A. Pattern variables represent pointers to terms, and if a pattern vari- able appears on the right side of a pattern, the pointer to the term represented by the variable, not the term itself, is copied. “:=” causes a term to be overwritten accord- ing to some rule of AACCL; only those parts of the over- writing term not named by pattern variables are newly al- located. Statements inside *seq.. . endseq” are executed in sequence. CopyTop copies the topmost operator of A, while retaining the original references (pointers) to A’s subterms, if any. rpenf () reduces an environment to a simple normal form, PEiVF.

5.2.1 Behavior of rwhnf()

Let us examine some of the salient features of rwhnf(). Our goal is to ensure that no redex shared between different sub- terms of a A is ever copied, thus ensuring that reduction of a redex in a left A subterm is always shared with its coun- terpart in the right subterm, .if any exists. This property is crucial for ensuring that Alnc does indeed perform non- overlapping reductions

A ACCL term not in whnf (thus a candidate for reduc- tion by rwhnf()) must have an outer constructor that is ei- ther an application or a closure. In the former case, rwhnf() reduces the function part of the application to whnf (in line

Ll of the algorithm), simulating the usual outermost strat- egy. More unusual is the latter case (line L2), in which the term part of a closure is reduced before the closure itself is reduced. This is done to prevent reduction of the closure from causing a redex in its term part to be copied.

In line L3, we see that an environment must be “pushed inside” an abstraction before the abstraction is considered to be in whnf. Most other reduction schemes that use en- vironments eschew this step, regarding a term of the form [A(A), E] as a whnf. This requires an alternate (Beta’) axiom that operates on an application whose function part is a closure (see, e.g., [ACCLSO]). Once again, the rationale for our alternate strategy is to maximize sharing.

In Line L4, we see that a term in an environment is reduced to whnf before its top constructor is copied by CopyTop(), which could otherwise copy a shared redexes.

Taken together, the properties of rwhnf() outlined above make it particularly suitable as the basis for incremental reduction.

5.3 Plan of the Algorithm

The general plan of the algorithm is as follows: Given a sequence of similar A-terms

Ml E iv[z := PI] M2 E N[% := Pi]

Ainc effectively computes

M{,M;,... E whnf

as well a.9 Nl[% := P:], Aqr := Pi],. . .

2 We assume without loss of generality that t is the only free vari- able in N.

314

Page 9: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

such that for all i,

Mi -p M:

P; -p P;’

Ni-1 -j3 Ni

(where No E N). F or each i, rather than reduce Mi directly, the algorithm reduces Ni-l[z := Pi] to MI (in the process yielding Ni[z := P/l as well).

Note that while N[z := PI] and Ml refer to the same X-term, the substitution notation used in the former case has an ezplicit counterpart in the ACCL term

[UNDACCL, (@, up~nAccL)l

which differs from [[M~lncc~, although the two are inter- convertible. In Ain’, we will use the former as the initial term on which the algorithm operates to ensure that PI is represented by a single node in the graph.

5.4 Ain=

The algorithm Ainc is given in Figure 7. We assume we are given the ACCL translation of the common term, [NJaoo~, after which the algorithm iteratively processes a sequence of substituands translated into AACCL:

[PI~ACCL, UP2DACCL,. . .

At each iteration, the algorithm constructs a AACCL term Ai, and applies Arwhnf() to Ai to yield a new term Ai.

Note that Step (2) of hint depends on the fact that there is only a single instance of [[Pi--l’I]AccL (perhaps with multi- ple references) in A,-,, and that each phase of the algorithm causes

A(UP&ccL, UpiDnccL)

to be overwritten by [Pi-&CCL. Note also that in the examples considered thus far, the

reduced term has only one A node. In general, however, multiple A nodes may appear in the final term as a result of self-application.

We now claim that Ainc is correct, that is:

Proposition 5.3 For all i = 1.. ., the terms Ai and Af constructed by Ainc have the following properties:

[Leftlnterp(A;)Jx = I[Rightlnterp(Ai)Jx

= Ni-l[Z := Pi]

ULeftInterp(A:)ljx = M,f E whnf

[Rightlnterp(A& = Ni[z := P;‘]

and

Mi -,g M;

Pi -p Pi’

Ni-1 -p Ni

(where NO z N)

Proposition 5.3 follows from the properties of AACCL dis- cussed in Section 5.1, as well as the following fact: Every contraction performed by Arwhnf () preserves the invariant

I[RightInterp(B)jx = $[z := Pi]

where N -B N and B is the the root of the graph on which rwhnf() operates. We can also show that the substituand is always represented by a single term in its graph (i.e., is never copied in the course of the reduction), and thus that Step (2) of the algorithm is well-defined.

However, we also want to show that Ainc is indeed incre- mental, in that it avoids contracting redexes unnecessarily at each step, relative to the reductions it has performed earlier. To do this requires an excursion into the reduction theory of the ,&alculus, which we undertake below.

6 Theoretical Machinery

In order to give a rigorous treatment of incremental re- duction, we will need to refer to the reduction theory of the *\-calculus developed in Barendregt [Bar84], Barendregt, et al. [BKKS87], Klop [Klo80], and particularly, in Levy [LCv78,Lkv80], with which we must assume the reader is fa- miliar, as space precludes a complete review.

6.1 Preliminaries

The discussion to follow will depend on the following notions from one of the above sources:

- The classical notion of the set of residuals of a term R in M relative to a reduction p: M - M’, notated R/p (described in [Bar84]).

- L&y’s notion of reduction residual, notated o/p, along with the related concepts of reduction prejiz, notated p 5 u, and &strong” equivalence of reduction (or equiv- alence modulo permutation of redexes), notated p N u (described in [LCv78,LCv80,Klo80]).

- L&y’s idea of family class and relative reduction (dis- cussed in [L&SO]).

- The concept of neededness and needed reduction, (de- scribed in [Ldv78,Lkv80,BKKS87]).

In the sequel, both &reductions and AACCL reductions are assumed (in general) to be parallel-i.e., those in which multiple redexes in a given term may be contracted in a single step via shared redexes. (See Appendix A).

6.2 New Definitions

6.2.1 Relative Prefixes of Reductions

The following strengthening of L&y’s notion of reduction prefix will be used:

Definition 6.1 Let p and u be coinitial reductions. Then p is a relative prefix ofo, notation p 5~ 0 if:

I. psu

2. p is relative to 6.

Definition 6.2 Let p and 0 be coinitial reductions. Then p is relatively equivalent to u, notation p CXR u if p 2~ u and u <R P.

Proposition 6.1 zR i3 an eqUiVaknce rda&m.

315

Page 10: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Arwhnf(T) I case T of * A(T{, Ti): rwhnf(T{);

A(A): skip; Var: skip;

APP~YV, B): 5eq WI rwhnf (A);

case A of * A(Ai, AS): seq * T := A(Apply(A;, B): rf, Apply(A;, B)); * rwhnf (p) * endseq

A(A’): seq T := [A’, (0, B)]; z:pfebT)

; else: skip

endseq; [L, E]: seq

rwhnf (L); wj

* 1: * *

case i df A(L:, L:): seq

T := A([L:, E]:?‘, [L;, El); rwhnf (T) endseq;

APP~Y(A, 8): seq T := APPLY 6% El, P, El); z-fs;7-’

$(A);; := $[A, (E o 0, Var)]); I, I : se

T:=[L1, El 04; rwhnf (T) endseq;

Var: seq

F33

W41

rpeif (E); case E of

0: T := Var; q “: skip; (E, A): seq

rwhnf(A); T := CopyTop

endseq endseq

endseq

Figure 6: Algorithms Arwhnf() and rwhnf() (without ‘*‘s)

let A;o= [~'QAccL, (0, URlhcc~>l, =

in for i = 1. . . do

1) Recursively traverse Ai-1, replacing all references to A nodes with references to the A’s right child, yielding Ai- ;

2) Overwrite [P~-JJ~ccL in Ai- with A([PilaccL, UP&C&, yielding Ai; 3) Apply Arwhnf() to Ai, yielding Ai

{T E WNF) {T f WHIVF)

{rule AApply}

{rule Beta}

{T E WHNF}

{rule AC}

{rule DApply}

{rule DA; T E WHNF}

{rule AssC}

{rule NullC} {T E n!; T E WHNF}

{rule VarRef}

Figure 7: Algorithm Ainc

316

Page 11: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

6.2.2 Overlapping Reductions

Definition 6.3 Let p and u be coinitial. Then p and u overlap, notation p =: u, if there ezists a non-null common relative prefix r and reductions 61 and 62 such that r+& ?fR p andr+& ERU.

Proposition 6.2 Let R(M) be the set of reductions begin- ning with term M. Let R(M)/zR be the set of equivalence classes of R(M) defined by the relation ?R, whose members are ordered by:

(where the meaning of <R is here overloaded). Then for all reductions p: Ml -n M; and u: M2 - Mi, there exists a greatest lower bound [p] tl [o] in R(M)/%R.

Thus we see that for any pair of overlapping reductions, there exists a “maximal overlap” (modulo NR). L&y [L&78] shows that such a greatest lower bound does not in general exist for reductions ordered by 5.

6.2.3 Projection of Reduction

We are now in a position to give a more formal definition of projection.

Definition 6.4 Let M E C[N] (i.e., a term containingsub- term N), and p: M - M’. Then pN is a projection of p on N if:

[pN] = I-I{+ N I N’, and for all r E R(N’), 7 $ (p/a)}

The idea of this definition is that the projection of

p: M-M’

on N should be the least reduction entirely in N which is needed to compute M’ or some term to which M’ reduces.

We can construct the projection of a reduction on a given term using a method similar to the Klop’s standardization construction [Klo80].

We can extend the definition above in the obvious way to A-terms of the form N[.z := P]:

Definition 6.5 Let M E N[z := P] andp: M - M’. Let u be the following reduction.: -

u: ((Az.N)P) - M

Then the projection pN is defined by:

PN=((O+p)N

6.2.4 Overlapping Reductions on Similar Terms

We can now define overlapping reductions on similar terms:

Definition 6.6 Let Ml 2 M2 be similar terms; i.e., let MI s N[z := Pl] and Mz s N[z := Pz]. Then

pl: MI-M;

and ~2: M2--++M;

overlap modulo N, notation MI g Mz, if

PY ,’ Pz”

Definition 6.6 gives us the formal counterpart of our require- ment that incremental reduction not “repeat any work”- such reductions must be non-overlapping.

6.2.5 Overlapping Non-Coinitial Reductions

Finally, we will need a notion of overlapping for certain non- coinitial reductions:

Definition 6.7 Let p1 and pz be reductions of terms MI I Nl[z := PI] and Mz z Nz[z := 41, and 71 and yz be reduc- tions of N such that:

p1: NI [z := A] - M: p2: Nz[z := Pz] - M: ~1: iv - Nl 72: N - Nz

Let t be any reduction in the equivalence class of yl II 72

(which exists by Proposition 6.2) such that

‘fl =R (r + 61)

7'2 NR (r + 62)

Then p1 and p2 overlap modulo (N1 , Nz), notation

if

NI J’Ja Pl =: P2

(61 $PIN’)=: (62 +f7zN3)

Given Definition 6.7, the following is immediate:

Proposition 6.3 Given terms Ml E N[z := PI], M2 s N[z := Pz], N’ and reductions

PI: Ml - M; pz: N’[z := Pz] - M;

such that pfy: N ---H N’, then

N,N’

PI $ PZ

Note that

(P? + ~2): Mz ---H N'[z := Pz] - M;

Proposition 6.3 states that one can effectively reduce similar terms with non-overlapping reductions by using the final term of the projection of the first reduction on the common term as the starting point for the second reduction.

7 Properties of Ainc

We now return to Ainc with the intention of showing that the reductions used to compute each term are formally non- overlapping:

At each iteration, Ainc applies Arwhnf() to Ai, yielding A:. Let pi be the AACCL reduction performed by Ai”‘, such that

pi:Ai --+*AAccLA~

Define the P-reductions corresponding to pi as follows:

e7i = fl(LeftInterp(pi)) c = /3(RightInterp(pi))

By Proposition 5.3, we have

ui: Ni-l[z := Pi] -p MI ri: Ni-l[z := Pi] -p Ni[z := Pi]

We can now show that hint lives up to its name:

317

Page 12: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Proposition 7.1 Let p;, Ui, and ri be a$ defined above. Then

Ni-1 =i : Ni-1 -@ Ni

pi . ui . Pi -fl Pi

and

The proof of Proposition 7.1 requires showing that any con- traction of a redex R in t[LeftInterp(B)~x by Arwhnf() also has the effect of contracting every redex in [RightInterp(B)~x of which R is a residual. This property is a consequence of the sharing properties of the reduction strategy used by rwhnf () outlined in 5.2.1, and can be shown formally through the use of a labeled version of ACCL, which is described in FieSOb] .

We see then that Ainc simultaneously computes a weak normal form of its initial term, and the projection of the reduction thus performed on the common term of the ini- tial term (as well as he projection on the substituand). By Proposition 6.3, we can conclude that the 0; are pairwise non-overlapping, and thus that Ainc is incremental in a well- defined sense.

8 Multiple Substituands

The discussion above has been concerned with incremental reduction of terms differing in a single substituand. How- ever, as the example of Section 2 illustrates, it is much more useful to consider the possibility of multiple, and perhaps nested, substituands. It turns out that Zinc can be extended almost trivially to deal with this new model. Here we will not attempt to cover all the formal details required to treat the extension, which are given in [FieSOa], and instead pro- vide an overview of the concepts involved.

We will now assume that we are given a X-term M such that

M 3 N[zl :=Ml,...,z,,:=M,,], n>O

where each Mi has the form

Mi s Ni[zil := Mil,. . . , Zing, I= Mini], ni 2 0

and each of the Mij are assumed have a similar form, etc. After reducing M to M’ E whnf, we would like to replace

some set of disjoint substituands

Mu,, . . . , Ma,

in M, and reduce the resulting new term. Consider, for example, the term M E (Xy.yy (I(la))),

in which we designate two substituands zi and zz as follows:

MS (a (1 z2))Ia := AY.YY, ~2 := (Ia)] z N[Zl := A, 74 := 91

To reduce this term incrementally, we start by replac- ing the substituands zl and 22 by terms with A nodes, as before. However, we now indez the A nodes by a number corresponding to the index of the substituands, as shown by the term A in Figure 8.

Let A be the indexed AACCL term representing M (i.e., where [LeftInterp(A)jjx = [Rightlnter~(A)~x =-M). We then proceed to reduce it using Arwhnf() as before. Arwhnf() is oblivious to the values of indices used for A

nodes, although rules (AApply) and (AC) are expected to preserve the values of indices of A operators on each side of their respective equations. Arwhnf() performs the reduction p: A - A’, summarized in Figure 8 (using our informal notation for environments and closures). Though the final term of the reduction, A’, is a bit complicated, its interpretation is straightforward:

[LeftInterp(A’)J x E (ua)

[Rightlnterp(A’)jx I (21 22)[z1 := xy.yy, %2 := a]

E N’[zl := Pi, z2 := P;]

Note that the reduction above preserves the contraction of the (1~1) redex in N and the (Ia) redex in P2 (there are no redexes in Pr to preserve).

The indices in A nodes only come into play when a sub- term is replaced and reduction is to be perfprmed anew. The only alterations required for algorithm A’“’ to behave correctly on multiple substituands (other than changing the initializations appropriately) are these:

-

-

Step (1) of algorithm Ai”’ removes all A nodes having the same index as the substituands being replaced.

In Step (2), more than one substituand may be re- placed by appropriately configured and indexed A nodes.

We can now interpret terms of AACCL by specifying for each index found in a A node whether its right or left subterm is to be examined; i.e., instead of Leftlnterp(A) or Rightlnterp(A), we have indexed interpretations of the form Interp,(A):

8.1 Indexed Interpretations

Let ForkIndices be the set of indices of A nodes found in term A E AACCL. Let S be a subset of ForkIndices( Then (informally) Interps(A) is the ACCL term gotten by following right branches of A nodes with indices i E S, and left branches otherwise.

8.2 Overlapping Contexts

Consider the term

M G (21 ((A2.m) %Z))[Zl := I, z2 := I]

s N[Zl := A,zz := Pz]

where we reduce M by p: M -H a. In this case, not only do we want Ainc to compute the projection of p on N, PI, and P2 as before, but since we are now allowed to replace arbitrary combinations of substituands, we need to compute the projection of p on every context that could contain some subset of the substituands. For example, if we replace only substituand PI, we need to ensure that we compute the pro jection of p on Q, where we write M as

M E (~1 ((Xz.za) I))[21 := I] s Q[ZI := PI]

Note that the projection p Q is not the concatenation of projections pN and ps, since together, N and P2 create a redex, (Ia), that does not exist separately in either term.

It turns out that by choosing the indexed interpretation of a term reduced by Arwhnf() according to the indices

318

Page 13: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

Figure 8: Reduction with indexed A nodes

of the substituands being replaced, we can reconstruct the (reduced) context containing that set of substituands.

That is, if A represents a X-term M with substituands Pi,, ij E S such that:

[LeftIn terp( A)] x I M

E &[Zi, :=Pil,...,%i, :=Pi,], ij ES

and A --H A’, then

[Interp&A)lx 3 Q’[Zil := P;‘, , . . . , Zi, := P;‘n], ij E S

is a substitution instance of the original context Q. Further- more, A’“’ computes the projection of a reduction on every such context simultaneously.

9 Other Issues

rwhnf() and Ai”’ have a number of additional interesting properties that we are able to mention here only in passing:

- rwhnf() performs reductions that are strongly equiv- alent to O’Donnell’s noncopying reductions [O’D77], which are an efficient (though not optimal) class of parallel X-reductions. This leads us to conclude that rwhnf() performs well (in terms of total number of redexes contracted) even in a non-incremental setting.

- We can show that Ainc is relatively optimal, i.e., that a non-overlapping reduction of a common term of a set of similar terms yields the shortest (in terms of (Beta) contractions) subsequent reduction of any new term resulting from substituand replacement, relative to the reduction strategy used by rwhnf().

- We can extend the model of allowable subtree replace- ments to allow susbstituands in completely arbitrary contexts, not just those representable by substitution. For example, let N[z] E A2y.z. Then we might wish to replace z with y to yield N{y] 3 Xzy.y, which is not representable by substitution since the substituand is a bound variable. We can perform reduction on contexts such as N without losing track of the proper bindings since ACCL allows “checkpoints” to be made in the substitution process, not just before P-reduction.

E A’

- There are some interesting connections between the idea of projection of a reduction on a term and the semantic notion of stability.

We also note that a number of enhancements can be made to the implementation of Ain’:

-

-

-

We can extend the reduction system to allow strict predefined functions and atomic values (e.g., addition and integers).

The Y combinator can be implemented using a cyclic graph.

Numerous enhancements can be made to the imple- mentation of Arwhnf() to avoid such anomalous over- head as repeated examination of terms already in whnf, and chains of A nodes with the same index.

The issues mentioned here are all covered in greater de- tail in [FieSOa].

10 Applications and Conclusions

Incremental X-reduction can be used in any situation where an algorithm expressed functionally is applied to (or trans- formed to) a set of similar terms. Our model of incremental- ity requires that the possible points of X-term replacement be designated in advance. While at first glance this might seem a severe restriction, we note that one can, if desired, specify every subterm of a A-term as a possible point of replacement, (given the remark above about extensions of the replacement model to arbitrary contexts). However, the overhead required to maintain the attendant A node “check- points” is increased accordingly.

We envision incremental reduction ss most useful in ap plications similar to that of Section 2. In such cases, the X-terms themselves are not the primary objects of interest, but instead are used to specify the “semantics” (in an op erational way) of other languages, systems, or structured data, and are thus manipulated indirectly. We also feel that the generality of incremental ;\-reduction makes it a good starting point for devising techniques for incremental com- putation in more restricted domains.

To summarize, we have defined a notion of incremen- tal reduction in the X-calculus. The concept of overlapping reductions on similar terms is presented, and a practical

319

Page 14: Incremental Reduction in the Lambda Calculus*...tality: Demers, Reps, and Teitelbaum [DRT81,Rep82] pro- posed algorithms to update functions on syntax trees de- scribed by attribute

incremental evaluator, Ainc is described. The reductions performed by Ainc on a sequence of terms are guaranteed to be non-overlapping, which we take as our definition of a (relatively) optimal incremental evaluator.

A The Lambda Calculus and Term Rewriting Sys- tems

The following is a brief review of some of the notation for the lambda-calculus used herein. The conventions used here will generally follow those of [Bar84], to which the reader is referred for details, although a few are taken from [Klo80] or [BKKS87].

A.1 Notation

C[M] denotes a context containing M, i.e., C[M] is a λ-term with designated subterm M. M need not be a proper subterm of C[M]. Contexts may be defined similarly for other rewriting systems.

M[x := N] denotes the result of substituting N for all free occurrences of x in M.

β-contraction is denoted by →β. The reflexive, transitive closure of →β, β-reduction, is denoted by ↠β. β-equality will be denoted by '=', syntactic identity of λ-terms (modulo renaming of bound variables) by '≡'.

M is a normal form (or M ∈ nf) if and only if it contains no redexes.

M is a head normal form (M ∈ hnf) if and only if it has the form

λx₁ ⋯ xₙ.(y P₁ ⋯ Pₘ),   n, m ≥ 0

where y and the xᵢ are arbitrary variables and the Pⱼ are arbitrary λ-terms.

M is a weak head normal form (M ∈ whnf) if and only if it has either of the forms

λx.N

or

(x P₁ ⋯ Pₙ),   n ≥ 0

for arbitrary variable x, arbitrary λ-term N, and terms Pᵢ.

Reductions, sequences of β-contractions, will be denoted as follows:

σ: M₀ →_{R₁} M₁ →_{R₂} M₂ →_{R₃} ⋯ →_{Rₙ} Mₙ

σ designates the entire reduction sequence. The concatenation of two reductions σ and ρ is denoted by σ + ρ, and is defined only if the final term of σ is the initial term of ρ. ∅ denotes the null reduction (the reduction of no terms).

One can define a notion of parallel β-contraction, notated M ⇉ N, in which sets of redexes are contracted in a single step; see [Lév78,Lév80] for the formal details. Parallel reductions are represented thus:

σ: M₀ ⇉_{C₀} M₁ ⇉_{C₁} M₂ ⋯ ⇉_{Cₙ₋₁} Mₙ

where the Cᵢ are the sets of redexes in Mᵢ contracted in parallel at each step.

The de Bruijn λ-calculus [dB72] is a variant of the λ-calculus in which variables are replaced by de Bruijn numbers denoting their binding depth in the term in which they are contained.

Definition A.1  The set of terms of the de Bruijn λ-calculus, designated Ter(Λ_DB), is defined inductively as follows:

n ∈ ℕ ⟹ n ∈ Ter(Λ_DB)
M, N ∈ Ter(Λ_DB) ⟹ (M N) ∈ Ter(Λ_DB)
M ∈ Ter(Λ_DB) ⟹ λM ∈ Ter(Λ_DB)

where ℕ is the set of natural numbers.

By providing a variable substitution mechanism that appropriately adjusts the de Bruijn numbers of substituted terms, the de Bruijn λ-calculus eliminates the need for α-conversion. Substitution and β-reduction can be suitably redefined on Λ_DB to mimic the analogous operations in the λ-calculus. See [Cur86a] for a concise summary of the operational behavior of the de Bruijn λ-calculus.
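As a concrete (if naive) illustration of this index adjustment, the following Haskell sketch implements de Bruijn terms with shifting, substitution, and a single β-step. It is not the representation used by λinc; the 0-based indexing and all function names are our own:

    -- A sketch of Definition A.1 with index-adjusting substitution and
    -- one-step beta-reduction; indices are 0-based here (an assumption).
    data Term
      = Idx Int          -- de Bruijn number: binding depth of the variable
      | App Term Term    -- application (M N)
      | Lam Term         -- abstraction lambda M
      deriving (Show, Eq)

    -- shift d c t: add d to every index in t that is >= the cutoff c,
    -- i.e. to every variable bound outside the term being relocated.
    shift :: Int -> Int -> Term -> Term
    shift d c (Idx n) | n >= c    = Idx (n + d)
                      | otherwise = Idx n
    shift d c (App m n) = App (shift d c m) (shift d c n)
    shift d c (Lam m)   = Lam (shift d (c + 1) m)

    -- subst j s t: replace index j in t by s, shifting s under binders.
    subst :: Int -> Term -> Term -> Term
    subst j s (Idx n) | n == j    = s
                      | otherwise = Idx n
    subst j s (App m n) = App (subst j s m) (subst j s n)
    subst j s (Lam m)   = Lam (subst (j + 1) (shift 1 0 s) m)

    -- One beta-step at the leftmost redex, if any exists.
    betaStep :: Term -> Maybe Term
    betaStep (App (Lam m) n) =
      Just (shift (-1) 0 (subst 0 (shift 1 0 n) m))
    betaStep (App m n) = case betaStep m of
      Just m' -> Just (App m' n)
      Nothing -> App m <$> betaStep n
    betaStep (Lam m) = Lam <$> betaStep m
    betaStep _       = Nothing

For instance, betaStep (App (Lam (Idx 0)) (Lam (App (Idx 0) (Idx 0)))) yields Just (Lam (App (Idx 0) (Idx 0))), with no renaming needed at any point.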

In the sequel, we will assume that any λ-terms under consideration are actually terms of the de Bruijn λ-calculus, although we will feel free to give examples using named variables.

B ACCL

B.1 Term Structure

Definition B.1  The terms of ACCL are built from a set of variables and constructors over a two-sorted signature. The sorts are as follows:

- L, the sort of lambda-like expressions

- E, the sort of environments

The constructors are listed below. Each constructor is given with the sort of the term constructed and the sorts of its argument(s) specified in the corresponding argument positions.

    Var : L                (variable reference)
    Apply(L, L) : L        (application)
    Λ(L) : L               (abstraction)
    [L, E] : L             (closure)
    ◊ : E                  (null environment)
    ↑ : E                  (shift)
    (E, L) : E             (expression list)
    E ∘ E : E              (environment composition)

The terms of ACCL will be denoted by Ter(ACCL) and the closed terms, those terms containing no variables, by Ter_c(ACCL).
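A small Haskell rendering of the two-sorted signature may help make the constructor table concrete. The Haskell constructor names, and the concrete symbols chosen above for the null environment and the shift, are assumptions of this reconstruction rather than the paper's notation:

    -- Sort L: lambda-like expressions.
    data AcclL
      = Var                   -- variable reference
      | Apply AcclL AcclL     -- application
      | Abs AcclL             -- abstraction   Lambda(A)
      | Clo AcclL AcclE       -- closure       [A, E]
      deriving (Show, Eq)

    -- Sort E: environments.
    data AcclE
      = NilE                  -- null environment
      | Shift                 -- shift
      | Cons AcclE AcclL      -- expression list         (E, A)
      | Comp AcclE AcclE      -- environment composition E1 o E2
      deriving (Show, Eq)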

The following notation (for “de Bruijn” numbers) will be used:

Definition B.2

n! ≡ Var                n = 0
n! ≡ [Var, ↑ⁿ]          n > 0

where

↑ⁿ ≡ ↑                                  n = 1
↑ⁿ ≡ ↑ ∘ (↑ ∘ (⋯(↑ ∘ ↑)⋯))              n > 1  (n occurrences of ↑)


The intuition behind the term structure of ACCL is fairly straightforward: Terms of sort L are analogous to terms in the de Bruijn λ-calculus, after variable numbers are encoded as above. Closures are created by the ACCL equivalent of β-contraction. Environments are essentially lists of terms, the association between bound variables and the terms to which they are bound being represented implicitly by position in the list. An environment informally presented as

{x₁ := M₁, x₂ := M₂, ..., xₙ := Mₙ}

is represented in ACCL as

(((⋯(◊, Mₙ)⋯), M₂), M₁).

"∘" allows separate environments to be merged. The only perhaps mysterious term present is "↑", which when composed on the left with an arbitrary environment effects the "shifting" of de Bruijn numbers required when environments are moved inside abstractions, and when composed on the right with an environment causes the outermost piece of the list to be stripped away in the course of variable lookup. All these operations are embodied in the axioms below:

B.2 Axioms

Definition B.3  The axioms of ACCL are as follows:

(Beta)   Apply(Λ(A), B) = [A, (◊, B)]
         [[A, E₁], E₂] = [A, E₁ ∘ E₂]
         [Apply(A, B), E] = Apply([A, E], [B, E])
         [Λ(A), E] = Λ([A, (E ∘ ↑, Var)])
         [Var, (E, A)] = A
         [A, ◊] = A
         ↑ ∘ (E, A) = E
         (E₁, A) ∘ E₂ = (E₁ ∘ E₂, [A, E₂])
(AssC)   (E₁ ∘ E₂) ∘ E₃ = E₁ ∘ (E₂ ∘ E₃)
         E ∘ ◊ = E

B.3 ACCL as Rewriting System on Closed Terms

By orienting the equations of ACCL from left to right, they can be treated as a term rewriting system. The notation →_ACCL will be used to denote the application of a rule of ACCL in some context, i.e., A →_ACCL B if and only if A ≡ C[X], X may be rewritten to Y using one of the oriented axioms of ACCL, and B ≡ C[Y].
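The following Haskell sketch orients the axioms of Definition B.3 as root-level rewrite rules. The term types repeat the earlier sketch so that the fragment stands alone, and the rule set reflects our reconstruction of the axioms rather than the paper's exact presentation:

    -- ACCL terms, as in the earlier sketch.
    data AcclL = Var | Apply AcclL AcclL | Abs AcclL | Clo AcclL AcclE
      deriving (Show, Eq)
    data AcclE = NilE | Shift | Cons AcclE AcclL | Comp AcclE AcclE
      deriving (Show, Eq)

    -- One rewrite at the root of a lambda-like term, if some oriented
    -- axiom applies there.
    stepL :: AcclL -> Maybe AcclL
    stepL (Apply (Abs a) b)    = Just (Clo a (Cons NilE b))                     -- (Beta)
    stepL (Clo (Clo a e1) e2)  = Just (Clo a (Comp e1 e2))                      -- merge closures
    stepL (Clo (Apply a b) e)  = Just (Apply (Clo a e) (Clo b e))               -- push env into application
    stepL (Clo (Abs a) e)      = Just (Abs (Clo a (Cons (Comp e Shift) Var)))   -- push env under a binder
    stepL (Clo Var (Cons _ a)) = Just a                                         -- variable lookup
    stepL (Clo a NilE)         = Just a                                         -- null environment
    stepL _                    = Nothing

    -- One rewrite at the root of an environment.
    stepE :: AcclE -> Maybe AcclE
    stepE (Comp Shift (Cons e _)) = Just e                                      -- shift strips a binding
    stepE (Comp (Cons e1 a) e2)   = Just (Cons (Comp e1 e2) (Clo a e2))         -- distribute composition
    stepE (Comp (Comp e1 e2) e3)  = Just (Comp e1 (Comp e2 e3))                 -- (AssC)
    stepE (Comp e NilE)           = Just e                                      -- null environment
    stepE _                       = Nothing

For any lambda-like term b, stepL (Apply (Abs Var) b) yields Just (Clo Var (Cons NilE b)) by (Beta), and a second application of stepL yields Just b, mirroring the β-contraction (λx.x) b →β b.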

B.4 Translation

We can now state the translation between terms of the de Bruijn λ-calculus and terms of ACCL.

Definition B.4  For any term M ∈ Ter(Λ_DB), we can define a corresponding term ⟦M⟧_ACCL ∈ Ter(ACCL) inductively as follows:

⟦i⟧_ACCL = i!
⟦(λN)⟧_ACCL = Λ(⟦N⟧_ACCL)
⟦(N₁ N₂)⟧_ACCL = Apply(⟦N₁⟧_ACCL, ⟦N₂⟧_ACCL)
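A Haskell sketch of this translation, together with the de Bruijn-number encoding of Definition B.2, follows; the types repeat the earlier sketches, and the 0-based indexing is again an assumption of the reconstruction:

    -- de Bruijn terms and ACCL terms, as in the earlier sketches.
    data DBTerm = Idx Int | AppD DBTerm DBTerm | LamD DBTerm
      deriving (Show, Eq)
    data AcclL  = Var | Apply AcclL AcclL | Abs AcclL | Clo AcclL AcclE
      deriving (Show, Eq)
    data AcclE  = NilE | Shift | Cons AcclE AcclL | Comp AcclE AcclE
      deriving (Show, Eq)

    -- n!  (Definition B.2): 0 is Var itself; n > 0 closes Var over a
    -- chain of n shifts.  Assumes n >= 0.
    encodeIdx :: Int -> AcclL
    encodeIdx 0 = Var
    encodeIdx n = Clo Var (shifts n)
      where shifts 1 = Shift
            shifts k = Comp Shift (shifts (k - 1))

    -- [[M]]_ACCL  (Definition B.4).
    toACCL :: DBTerm -> AcclL
    toACCL (Idx i)    = encodeIdx i
    toACCL (LamD m)   = Abs (toACCL m)
    toACCL (AppD m n) = Apply (toACCL m) (toACCL n)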

B.5 Equivalence

There is an equivalence between β-reduction and reduction of terms of sort L in ACCL:

Theorem B.1  Given M ∈ Ter(Λ_DB),

M ↠β N ⟺ ⟦M⟧_ACCL ↠_ACCL ⟦N⟧_ACCL

This result shows that any reduction of an ACCL term A ∈ LNF simulates a reduction in the λ-calculus. The result above can be strengthened as follows:

Theorem B.2  For all A : L, B : L, such that

ρ: A ↠_ACCL B,

there exists

P(ρ): ⟦A⟧_Λ ⇉ ⟦B⟧_Λ

such that the number of (parallel) β-contractions in P(ρ) equals the number of (Beta) contractions in ρ.

In other words, an arbitrary ACCL reduction can be simulated by a parallel β-reduction of the same length (using the number of (Beta) contractions as a length measure in the former case).

B.6 Normal Forms

Definition B.5  The set of lambda normal forms (LNF) is a subset of the terms of ACCL, defined inductively as follows:

n! ∈ LNF
A ∈ LNF ⟹ Λ(A) ∈ LNF
A ∈ LNF, B ∈ LNF ⟹ Apply(A, B) ∈ LNF

Lambda normal forms are intuitively those terms that "look like" terms of the (de Bruijn) λ-calculus.

The following fact makes possible a straightforward translation from terms in LNF to terms of the de Bruijn λ-calculus:

Theorem B.3  All lambda-like expressions (terms of sort L) of ACCL are reducible to a lambda normal form using the rules of ACCL; i.e., for all A : L, there exists B ∈ LNF such that A ↠_ACCL B.

The lambda normal form of A is given by lnf(A). The translation ⟦A⟧_Λ is then defined as the inverse of ⟦·⟧_ACCL on lnf(A).
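The following Haskell sketch spells out the LNF predicate of Definition B.5 and the readback ⟦·⟧_Λ; as before, the types and the treatment of encoded de Bruijn numbers follow our reconstruction and should be read as assumptions:

    -- Types as in the earlier sketches.
    data AcclL  = Var | Apply AcclL AcclL | Abs AcclL | Clo AcclL AcclE
      deriving (Show, Eq)
    data AcclE  = NilE | Shift | Cons AcclE AcclL | Comp AcclE AcclE
      deriving (Show, Eq)
    data DBTerm = Idx Int | AppD DBTerm DBTerm | LamD DBTerm
      deriving (Show, Eq)

    -- Count a chain of composed shifts, if the environment is one.
    shiftChain :: AcclE -> Maybe Int
    shiftChain Shift          = Just 1
    shiftChain (Comp Shift e) = fmap (+ 1) (shiftChain e)
    shiftChain _              = Nothing

    -- Definition B.5: terms built from encoded numbers, Abs, and Apply.
    isLNF :: AcclL -> Bool
    isLNF Var         = True
    isLNF (Clo Var e) = shiftChain e /= Nothing
    isLNF (Abs a)     = isLNF a
    isLNF (Apply a b) = isLNF a && isLNF b
    isLNF _           = False

    -- Readback to a de Bruijn term, defined only on lambda normal forms.
    fromLNF :: AcclL -> Maybe DBTerm
    fromLNF Var         = Just (Idx 0)
    fromLNF (Clo Var e) = fmap Idx (shiftChain e)
    fromLNF (Abs a)     = fmap LamD (fromLNF a)
    fromLNF (Apply a b) = AppD <$> fromLNF a <*> fromLNF b
    fromLNF _           = Nothing

On translated terms, fromLNF inverts the toACCL of the earlier sketch: fromLNF (toACCL m) returns Just m.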

References

[ACCL90] M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Lévy. Explicit substitutions. In Proc. Seventeenth ACM Symposium on Principles of Programming Languages, San Francisco, 1990.

[ACR+87] B. Alpern, A. Carle, B. Rosen, P. Sweeney, and K. Zadeck. Incremental evaluation of attributed graphs. Technical Report RC 13205, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598, October 1987.

[Bar84] H.P. Barendregt. The Lambda Calculus, volume 103 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1984.


[Ben82] S. A. Bengelloun. Aspects of Incremental Computing. PhD thesis, Yale University, 1982.

[BKKS87] H.P. Barendregt, J.R. Kennaway, J.W. Klop, and M.R. Sleep. Needed reduction and spine strategies for the lambda calculus. Information and Computation, 75:191-231, 1987.

[BPR85] A. M. Berman, M. C. Paull, and B. G. Ryder. Proving relative lower bounds for incremental algorithms. Technical Report DCS-TR-154, Department of Computer Science, Rutgers University, New Brunswick, New Jersey 08903, 1985.

[BvEG+87] H.P. Barendregt, M.C.J.D. van Eekelen, J.R.W. Glauert, J.R. Kennaway, M.J. Plasmeijer, and M.R. Sleep. Term graph rewriting. In Proc. PARLE Conference, Vol. II: Parallel Languages, pages 141-158, Eindhoven, The Netherlands, 1987. Springer-Verlag. Lecture Notes in Computer Science 259.

[CCM87] G. Cousineau, P.-L. Curien, and M. Mauny. The categorical abstract machine. Science of Computer Programming, 8:173-202, 1987.

[Cur86a] P.-L. Curien. Categorical combinators. Information and Control, 69:188-254, 1986.

[Cur86b] P.-L. Curien. Categorical Combinators, Sequential Algorithms, and Functional Programming. Research Notes in Theoretical Computer Science. Pitman, London, 1986.

[dB72] N.G. de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem. Proc. of the Koninklijke Nederlandse Akademie van Wetenschappen, 75(5):381-392, 1972.

[DRT81] A. Demers, T. Reps, and T. Teitelbaum. Incremental evaluation for attribute grammars with application to syntax-directed editors. Proc. Eighth ACM Symp. on Principles of Programming Languages, pages 105-116, 1981.

[Ear76] J. Earley. High level iterators and a method for automatically designing data structure representation. Journal of Computer Languages, 1:321-342, 1976.

[Fie90a] John Field. Incremental Reduction and its Applications. PhD thesis, Cornell University, 1990. (Forthcoming).

[Fie90b] John Field. On laziness and optimality in lambda interpreters: Tools for specification and analysis. In Proc. Seventeenth ACM Symposium on Principles of Programming Languages, San Francisco, January 1990.

[FW87] Jon Fairbairn and Stuart Wray. TIM: A simple, lazy abstract machine to execute supercombinators. In Proc. Conference on Functional Programming Languages and Computer Architecture, pages 34-45, Portland, 1987. Springer-Verlag. Lecture Notes in Computer Science 274.

[HM76] P. Henderson and J.H. Morris. A lazy evaluator. In Proc. Third ACM Symposium on Principles of Programming Languages, pages 95-103, 1976.

[JSS89] Neil D. Jones, Peter Sestoft, and Harald Sondergaard. Mix: A self-applicable partial evaluator for experiments in compiler generation. Lisp and Symbolic Computation, 2, 1989.

[Klo80] J.W. Klop. Combinatory Reduction Systems, volume 127 of Mathematical Centre Tracts. Mathematical Centre, Kruislaan 413, Amsterdam 1098SJ, The Netherlands, 1980.

[Lév78] Jean-Jacques Lévy. Réductions correctes et optimales dans le lambda-calcul. PhD thesis, Université de Paris VII, 1978. (Thèse d'Etat).

[Lév80] Jean-Jacques Lévy. Optimal reductions in the lambda-calculus. In J.P. Seldin and J.R. Hindley, editors, To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus, and Formalism. Academic Press, London, 1980.

[O'D77] Michael J. O'Donnell. Computing in Systems Described by Equations. Springer-Verlag, 1977. Lecture Notes in Computer Science 58.

[Pey87] S. L. Peyton Jones. The Implementation of Functional Programming Languages. Prentice Hall International, Englewood Cliffs, New Jersey, 1987.

[PK82] R. Paige and S. Koenig. Finite differencing of computable expressions. Transactions on Programming Languages and Systems, 4(3):402-454, July 1982.

[PT89] W. Pugh and T. Teitelbaum. Incremental computation by function caching. In Proc. Sixteenth ACM Symp. on Principles of Programming Languages, 1989.

[Rep82] T. Reps. Optimal-time incremental semantic analysis for syntax-directed editors. Proc. Ninth ACM Symp. on Principles of Programming Languages, pages 169-176, 1982.

[Sun90] R.S. Sundaresh. Technical report. Yale University, 1990.

[Wad71] C.P. Wadsworth. Semantics and Pragmatics of the Lambda-Calculus. PhD thesis, Oxford University, 1971.

[YS87] D. Yellin and R. Strom. "INC": A language for incremental computations. Technical Report RC 13327, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598, December 1987.
