semantic specification using two-level grammars: blocks, procedures and parameters

('ompmel Lan~Cuages, Vok 4. pp. 171 to 185 0096-05~ 1 79 I 2(H-(H 7150200'0 ,f Pergamon Press Lid 1979 Printed in Great Britain

SEMANTIC SPECIFICATION USING TWO-LEVEL GRAMMARS: BLOCKS, PROCEDURES

AND PARAMETERS

FRANK G. PAGAN

Department of Computer Science, Southern Illinois University at Carbondale, Carbondale, IL 62901, U.S.A.

(Received 5 January 1979)

Abstract--Formal specifications are presented for the complete syntax and semantics of an ALGOL-like language fragment, using a recently introduced definitional technique employing two-level grammars (W-grammars). The fragment contains several important features whose dynamic semantics have not previously been treated by means of this technique : block structure, (recursive) procedures, and parameters passed by value, by reference, and by name. The degree of conciseness, clarity, etc., of the specifications is comparable to that obtainable with other approaches to formal seamantics, and it is concluded that two-level grammars must currently be regarded as a competitive approach for progress in language specification.

Formal semantics Semantic specification Formal grammars Two-level grammars Parameter- passing mechanisms

INTRODUCTION

THE CONCEPT of a two-level g rammar (van Wijngaarden grammar, W-grammar) first became widely known as a powerful syntactic formalism with the publication of the defining document for the original version of A L G O L 68 I l l . In the revised definition [2] of that language, a single two-level g rammar was used to define all aspects of the syntax, including all the context-dependent conditions ("static semantics"), man), of which are very intricate and subtle; as a result, the set of terminal strings generated by the g rammar is precisely the set of programs that should be accepted as free of all compile-time errors by any good A L G O L 68 processor. More recently [3, 4], an inge- nious scheme for extending the use of two-level g rammars to include the specification of dynamic semantics was devised and applied to the miniature language ASPLE, which includes boolean and integer data types and operations, pointers, simple assignment and input/output statements, and the if-then, if-then-else, and while-do control structures.

In the belief that this new approach to the formal definition of programming language semantics is worthy of further exploration and comparat ive evaluation as regards qualities of specification such as completeness, conciseness, and understandability, this paper presents a two-level g rammar defining the complete syntax and complete semantics of an ALGOL-l ike language fragment incorporating block structure (recursive) procedures, and the three standard parameter-passing mechanisms known as call-by-value, call-by- reference, and call-by-name. A special effort has been made to "program" the g rammar in a clear and readable style, for, other things being equal, it is qualities such as these which will ultimately determine whether this formalism gains wide acceptance as a semantic specification method.

The basic idea underlying the approach is that the terminal strings derivable from the g rammar are of the form

P eof F1 eof F2 eof

where P is a syntactically valid program, F1 is a representation of an input file, and F2 is a representation of an output file, such that the execution of P with the input file represented by F1 terminates normally and results in the creation of the output file represented by F2. The g rammar must generate all and only the strings (here termed "programmes" to distinguish them from the proper "programs" P contained in them) satisfying these constraints. The semantics of each valid program P is then complclelv

171

172 FRANK G. PAGAN

defined by the set of all pairs of files (F1,F2) occurring in those programmes containing P.

AN ALGOL-LIKE LANGUAGE FRAGMENT The language fragment to be used for illustrative purposes has a conventional syntax,

the context-free aspects of which are given by the following BNF production rules:

( p r o g r a m ) : : = (b lock) (b lock) :: = begin (declaration sequence); (statement sequence) end (declaration sequence) :: = (declarat ion) I

(declaration sequence); (declarat ion) (statement sequence) :: = ( s t a tement ) [

(statement sequence); (s ta tement) (declara t ion) : : = in t (identifier) I

proe (identifier) = ((parameter list)): ( s ta tement) (parameter list) :: = (parameter ) I (parameter list), (parameter ) (pa ramete r ) :: = val (identifier) I ref (identifier) I name (identifier) ( s ta tement) :: = (identifier) : = (expression) I read (identifier) [

write (expression) I (b lock) I ( identif ier)((argument list)) I

(argument list) :: = (expression) I (argument list), (expression) (expression) :: . . . .

The context-dependent conditions are what one would expect: every identifier must be suitably declared (or appear in a parameter list), and references to multiply-declared identifiers are resolved in accordance with the usual ALGOL-like scope rules; no identifier may be declared more than once at the same level or appear more than once in the same parameter list; an identifier in an assignment statement, read statement, or argument list must be a variable (local or non-local int variable, or a val, ref, o r name parameter) as opposed to a procedure name, and the identifier at the beginning of a call statement must be a procedure name; the number of arguments in a call must equal the number of parameters in the declaration of the procedure being called; an argument corresponding to a ref parameter must be a variable (but not a name parameter of the current procedure)•

The following sample program P will serve as a source of examples in subsequent parts of the paper:

begin 1 int x; 2 proc p = (val a): 3

write a; 4 read x; 5 p (x) 6

end 7

The language fragment can easily be extended to a complete, realistic programming language by adding expressions, other types of statements and control structures, and other data types and data structures. The formal definition of these additions would be handled in much the same way as in previous publications [3, 4] and would not mater- ially affect the complexity of the two-level grammar for the fragment given here.

The remainder of the paper assumes a knowledge of the two-level grammar formalism as described in earlier literature• In particular, a knowledge of the following metalinguistic concepts and their use is assumed: protonotion, notion, metanotion, hypernotion, hyper- rule, metarule, predicate• In order to enhance the readability of the hyper-rules, two special devices are used.

(1) Parentheses are used to enclose the "noun phrases" which constitute the significant parts of predicates; formally, the parentheses are just "'small syntactic marks" which extend the alphabet of lower-case letters [2].

Semantic specification using two-level grammars 173

(2) Hyphens are frequently used to join together the words in certain logical segments of long hypernotions; formally, the hyphens are like blanks and add nothing to the meaning of the rules.

THE T W O - L E V E L G R A M M A R

The grammar for the language fragment is here presented m eight parts, with the hyper-rules numbered and cross-referenced according to a self-exlblanatory scheme [2]. A parse tree for a programme consists of two main par ts- - the "syntactic" subtrees for the visible programme itself, as defined by the hyper-rules in parts 5 and 6 of the grammar, and an invisible "semantic" subtree, as defined by the hyper-rules in parts 7 and 8. (An "invisible" tree is one in which all the leaves are associated with the ~'empty symbol".) The semantic subtree guarantees that the program portion of the programme will pro- duce the given output file from the given input file, The syntactic part also contains some invisible subtrees, defined by the hyper-rules in part 6, which enforce the various context- dependent conditions stated earlier. A detailed understanding of all these hyper-rules can only be attained after gaining a thorough familiarity with the grammar's metanotions, as defined by the metarules in parts 1-4.

1. Basic metarules

(a) ALPHA : :a ; b; c; d; e; f; g; h; i; j; k; l; m; n; o; p; q; r; s; t; u; v; w; x; y; z. (b) LETTER :: letter ALPHA. (c) EMPTY ::. (d) N O T I O N :: ALPHA; N O T I O N ALPHA. (e) NOTETY :: N O T I O N ; EMPTY. (f) TAG :: LETTER; TAG LETTER.

Metarules similar to l(a)-(f) are standard in most two-level grammars. They imply that the metanotion EMPTY stands for the empty protonotion, that N O T I O N stands for any non-empty protonotion, that NOTETY stands for any protonotion, and that TAG stands for protonotions which are metalinguistic analogs of identifiers (e.g. letter i letter d would correspond to id).

2. Metarules for environments

(a) ENV :: new; new LAYER; ENV new LAYER. (b) ENVETY :: ENV; EMPTY.

An ENV protonotion models the concept of an environment, i.e. it serves as a record of the meanings of all currently accessible identifiers. It consists of a number of LAYERs corresponding to dynamic allocation levels.

(c) LAYER :: ITEM; LAYER ITEM. (d) LAYETY :: LAYER; EMPTY.

Each LAYER is a sequence of ITEMs, each of which records the type of, and the storage "'address" (NUMBER--see 4(c, d) allocated to, some identifier (TAG):

(e) ITEM : : TYPE type T A G allocated NUMBER.

The manner in which types are recorded is as follows:

(fl TYPE :: N O N P R O C ; proc with parameters and STMT. (g) N O N P R O C :: int; K I N D . (h) P A R A M E T E R S :: P A R A M E T E R ; P A R A M E T E R S P A R A M E T E R . (i) P A R A M E T E R :: K I N D type T A G allocated N U M B E R . (j) K I N D :: val; ref; name.

The parameters and body (STMT, defined in part 3) of a procedure are treated as part of the type information for the procedure identifier, and such an identifier will always be allocated the address 0 (EMPTY) instead of an actual location. Note thal every PAR-

174 FRANK G. PAGAN

AMETER protonotion is also an ITEM protonotion; only integer-valued items can be used as parameters, and their types (KIND) are recorded as vai, ref, or name according to the argument transfer mechanism.

(k) VAR:: int; vai; ref.

VAR is that subset of TYPE which excludes procedure identifiers and name parameters.

As an example of a complete ENV protonotion, the two-layer current environment at the time line 4 of the sample program P is executed is given by new new int-type-letter-x- allocated-i proc-with-val-type-letter-a-allocated-EMPTY-and-STMT-type-letter-p-allocated-EMPTY new val-type-letter-a-ailocated-ii, where STMT is, according to the rules below, write var-letter-a.

3. Metarules for abstract syntax

The basic strategy for defining semantics will be to give hyper-rules for predicates of the form where (STMTS) transforms (STATEI) into (STATE2), where STATE is defined in the next part and STMTS is a metalinguistic analog of a program or part of a program:

(a) STMTS :: STMT; STMTS STMT. (b) STMT : : set TAG to EXPR;

read TAG; write EXPR; begin LAYER do STMTS; call TAG with A R G U M E N T S ; . . . {other statement forms}.

(c) A R G U M E N T S :: EXPR; A R G U M E N T S EXPR. (d) EXPR : : . . . ; var TAG; . . . .

These metarules in effect specify an abstract syntax for the language. Short protonotions such as set, to, and read act as "metadelimiters" to ensure that the encoding of abstract programs as STMTS protonotions is unambiguous.

The STMTS protonotion corresponding to lines 5 and 6 of P is read-letter-x call-letter- p-with-var-letter-x.

4. Metarules for states A STATE protonotion models an execution state and contains an environment por-

tion (ENV), a s tore portion (STORE), an input file portion (infile FILE) and an output file portion (ontfile FILE):

(a) STATE :: state ENV STORE infile FILE outfile FILE.

A STORE protonotion consists of the word store, followed by a string of zero or more i's (NUMBER), followed by that number of "locations" (LOCSETY), where each location (LOC) is of the form (Ioc VALUE):

(b) STORE :: store N U M B E R LOCSETY. (c) N U M B E R :: EMPTY; TALLY; negative TALLY. (c) TALLY :: i; TALLY i. (e) LOCSETY :: LOCS; EMPTY. (f) LOCS :: LOC; LOCS LOC. (g) LOC :: loc VALUE. (h) VALUE :: NUMBER; undefined; EXPR requiring ENV.

The VALUE protonotion in a location corresponding to a non-parameter value will be a NUMB ER or undefined, that for a val or ref parameter will be a NUMBER, and that for a name parameter will be of the form EXPR requiring ENV, where the EXPR part will be


the abstract form of the corresponding argument and the ENV part will record the environment that existed before the procedure was called.

A FILE protonotion is a simple metalinguistic analog of an input or output file:

(i) FILE :: DATETY eof. (j) DATETY :: DATA; EMPTY. (k) DATA :: datum NUMBER; datum NUMBER DATA.

For the program P, an initial input file

3.4. eof

would correspond to infile datum iii datum iiii eof in the initial state. The complete current state after the execution of line 5 would then be given by state ENVI store-i-loc- iii infile-datum-iiii-eof outfile-eof, where ENVI is new new int-type-letter-x-allocated-i proc-with-vai-type-letter-a-allocated-EMPTY-and-write-var-letter-a-type-letter- p-allocated-EMPTY.

Turning now to the grammar's hyper-rules, the manner in which the rules of parts 5 and 6 completely define the language's syntax, both the context-free and the context- sensitive aspects, is fairly standard in two-level grammars and will not be explained at length here. The rules themselves, presented with a minimum of commentary, should suffice.

5. Main hyper-rules for syntax

(a) programme: new begin-LAYER-do-STMTS block {d},

eof symbol, FILEI file {b,c}, FILE2 file {b,c}, where (LAYER) consistent {6d,e}, where (begin LAYER do STMTS) transforms (state new-EMPTY

store-EMPTY infile-FILEl outfile-eof) into (state new-EMPTY STORE infile-FILE3 outfile-FILE2) {Ta}.

Here the protonotion of the form begin LAYER do STMTS is guaranteed to be the correct abstract encoding of the program in question (with all storage allocations EMPTY) and FILEI and FILE2 to be the metalinguistic analogs of the two file portions of the programme. The last part of the rule is of the form where (STMT) transforms (STATEI) into (STATE2) and "corresponds to the root of the semantic subtree. It states that the program must have the effect of transforming an initial state containing an empty environment, an empty store, an input file consisting of FILEI , and an empty (eof) output file into a final state consisting of an empty environment, some final store STORE, some final input file FILE3. and the output file FILE2.

The next two rules define FILE file:

(b) datum-NUMBER-FILE file: NUMBER constant {. . .}, point symbol, FILE file {b,c}.

(c) eof file: eof symbol.

Rule 5d gives the syntax of blocks:

(d) ENV begin-LAYER-do-STMTS block: begin symbol,

ENV-new-LAYER declaration sequence for LAYER {e,f}, ENV-new-LAYER STMTS statement sequence { 1}, end symbol.

176 FRANK G. PAGAN

Rules 5(e) and (f) define ENV declaration-sequence-for-LAYER:

(e) ENV declaration sequence for LAYER-ITEM: ENV declaration sequence for LAYER {e,f},

ENV declaration of ITEM {g,h}. (f) ENV declaration sequence for ITEM:

ENV declaration of ITEM {g,h}.

Rules 5(g) and (h) define ENV declaration-of-iTEM:

(g) ENV declaration of int-type-TAG-allocated-EMPTY: int symbol, TAG symbol.

(h) ENV declaration of proc-with-PARAMETERS-and-STMT-allocated-EMPTY proc symbol,

TAG symbol, equals symbol, Ipar symbol, definition part for PARAMETERS {i,j}, rpar symbol, colon symbol, ENV-new-PARAMETERS STMT statement {m-q}, where (PARAMETERS) consistent {6d,e}.

Rules 50) and (j) define definition part for PARAMETERS:

(i) definition part for PARAMETERS-PARAMETER: definition part for PARAMETERS {i,j},

comma symbol, definition of PARAMETER {k}.

(j) definition part for PARAMETER: definition of PARAMETER {k}.

Rule 5(k) defines definition of PARAMETER:

(k) definition of KIND-type-TAG-allocated-EMPTY: KIND symbol, TAG symbol.

Rule 5(1) defines ENV STMTS statement-sequence:

(1) ENV STMTS-STMT statement sequence: ENV STMTS statement sequence {1},

ENV STMT statement {m-q}.

Rules 5(m) through (q) define ENV STMT statement:

(m) ENV set-TAG-to-EXPR statement: NONPROC ENV identifier with TAG {w},

becomes symbol, ENV EXPR expression {...}.

(n) ENV read-TAG statement: read symbol, NONPROC ENV identifier with TAG {w}.

(o) ENV write-EXPR statement: write symbol, ENV EXPR expression {...}.

(p) ENV begin-LAYER-do-STMTS statement: ENV begin-LAYER-do-STMTS block {d},

where (LAYER) consistent {6d,e}. (q) ENV call-TAG-with-ARGUMENTS statement:

proc-with-PARAMETERS-and-STMT ENV identifier with TAG {w}, Ipar symbol, ENV ARGUMENTS arglist for PARAMETERS {r,s}, rpar symbol.


Rules 5(r) and (s) define ENV ARGUMENTS arglist-for-PARAMETERS:

(r) ENV ARGUMENTS-EXPR arglist for PARAMETERS-PARAMETER: ENV ARGUMENTS arglist for PARAMETERS {r,s},

comma symbol, ENV EXPR arg for PARAMETER {t,u,v}.

(s) ENV EXPR arglist for PARAMETER: ENV EXPR arg for PARAMETER {t,u,v}.

Rules 5(t v) define ENV EXPR arg-for-PARAMETER:

(t) ENV EXPR arg for vai-type-TAG-allocated-EMPTY: ENV EXPR expression { . . . }.

(u) ENV EXPR arg for ref-type-TAG-allocated-EMPTY: VAR ENV identifier with TAG2 {w},

where (EXPR) is (var TAG2) {61}. (v) ENV EXPR arg for name-type-TAG-allocated-EMPTY:

ENV EXPR expression { . . . }.

The rules for ENV EXPR expression are omitted. These would also refer to 5(w):

(w) TYPE ENV identifier with TAG: TAG symbol,

where (TYPE type TAG) found in (ENV) {6a}.

6. Hyper-rules for syntactic predicates

Some of the rules in this part are referenced by the semantic hyper-rules as well. Rule 6(a) defines where (TYPE type TAG) found in (ENV):

(a) where (TYPE type TAG) found in (ENV new LAYER): where (TYPE type TAG) one of (LAYER) {b,c/; where (TAG) not in (LAYER) {f,g},

where (TYPE type TAG) found in (ENV) {a}.

Rules 6(b) and (c) define where (TYPE type TAG) one of (LAYER):

(b) where (TYPE type TAG) one of (LAYER ITEM): where (TYPE type TAG) one of )ITEM) {c}; where (TYPE type TAG) one of (LAYER) {b,c}.

(c) where (TYPE type TAG) one of (TYPE type TAG allocated NUMBER): EMPTY.

Rules 6(d) and (e) define where (LAYER) consistent and serve to enforce the context condition banning duplicate declarations and duplicate parameters:

(d) where (ITEM) consistent: EMPTY. (c) where (LAYER TYPE-type-TAG-allocated-NUMBER) consistent:

where (LAYER) consistent {d,e}, where (TAG) not in (LAYER) {f, gl.

Rules 6(f) and (g) define where (TAG) not in (LAYER):

(f) where (TAG) not in (LAYER ITEM): where (TAG) not in (LAYER) {f,g},

where (TAG) not in (ITEM {g/. (g) where (TAG1) not in (TYPE type TAG2 allocated NUMBER):

where (TAGI) is not (TAG2) {h,i,j}.

Rules 6(h-j) define where (NOTETYI) is not (NOTETY2):

(h) where (NOTETYI ALPHAI) is not (NOTETY2 ALPHA2): where (NOTETYI) is not (NOTETY2) {h,i,j};

178 FRANK G. PAGAN

where (ALPHAI) precedes (ALPHA2) in (abedefghijklmnopq- rstuvwxyz) {k};

where (ALPHA2) precedes (ALPHA1) in (abedefghijkimnopq- rstuvwxyz) {k}.

(i) where (NOTION) is not (EMPTY): EMPTY. (j) where (EMPTY) is not (NOTION): EMPTY. (k) where (ALPHAI) precedes (ALPHA2) in (NOTETY1 ALPHAI NOTETY2

ALPHA2 NOTETY3): EMPTY.

Rule 6(1) defines where (NOTETYI) is (NOTETY2) and is very frequently referenced by the semantic rules:

(l) where (NOTETY) is (NOTETY): EMPTY.

7. Main hyper-rules for semantics

The rules in this part, though rather numerous and lengthy, basically define only four predicate forms:

(1) where (STMTS) transforms (STATEI) into (STATE2) (7a,e-k) (2) where (LAYER2) and (STORE2) built from (LAYERI)

and (STOREI) (7b,c,d) (3) where (STATE) implies (LAYER) and (STORE) derived

from (ARGUMENTS) and (PARAMETERS) (71-q) (4) where (EXPR) yields (NUMBER) given (STATE) (Tr,s . . . . )

Form (1) serves to define the semantics of statements in terms of state transformations, and the other forms are subsidiary to it. Form (2) is used to define the augmentation of the current state that occurs when a block is entered; LAYERI represents the block's local declarations and LAYER2 the corresponding new level of the environment, while STOREI represents the store that exists just prior to block entry and STORE2 the larger store obtained by adding locations for the local variables. Form (3) deals with the augmentation of the current state (STATE) that occurs when a procedure is called; LAYER is the new level of the environment corresponding to the procedure's parameters

(PARAMETERS), and STORE is the modified store that results from transfer of the arguments in the call statement (ARGUMENTS). Form (4) defines the result (NUMBER) of evaluating an expression (EXPR) in a given state (STATE).

Rule 7(a) states that the execution of a block with local declarations LAYERI and body STMTS, with an environment ENVI and a store STOREI, consists of executing STMTS with an augmented environment ENVI new LAYER2 and an extended store STORE2:

(a) where (begin LAYERI do STMTS) transforms (state ENVI STOREI infile-FILEl outfile-FILE2) into (state ENVI STORE3 infile-FILE3 outfile-FILE4):

where (STMTS) transforms (state ENVI-new-LAYER2 STORE2 infile-FILEl outfile-FILE2) into (state ENVI-new-LAYER2 STORE3 infile-FILE3 outfile-FILE4) {a,e-k . . . . },

where (LAYER2) and (STORE2) built from (LAYERI) and (STOREI) {b,c,d}.

It can be seen that, While the original environment ENV1 is restored upon completion of the block, any changes to the store (STORE2--*STORE3) and the files (FILEI-- . FILE3, FILE2--~ FILE4) are permanent.

The specification of where (LAYER2) and (STORE2) built from (LAYERI) and (STORE1) begins with a rule to recursively decompose the problem into subproblems


dealing with the individual items in the layers:

(b) where (LAYER2 ITEM2) and (STORE2) built from (LAYERI ITEMI) and (STOREI):

where (LAYER2) and (STORE3) built from (LAYERI) and (STOREI) {b,c,d},

where (ITEM2) and (STORE2) built from (ITEM1) and (STORE3) {c,d}.

A declaration of a local variable gives rise to a new environment item which is allocated a new storage location, thus increasing the length of the store from NUMBER to NUMBER i, and that location is initialized with the value undefined:

(c) where (ITEM) and (STORE2) built from (int type TAG allocated EMPTY) and (store NUMBER LOCSETY):

where (ITEM) is (int type TAG allocated NUMBER-i) {61}, where (STORE2) is (store NUMBER-i LOCSETY Ioc undefined) {61},

A declaration of a local procedure does not alter the store:

(d) where (ITEM) and (STORE2) built from (proe-with-PARAMETERS-and- STMT type TAG allocated EMPTY) and (STORE1):

where (ITEM) is (proc-with-PARAMETERS-and STMT type TAG allocated EMPTY) {61},

where (STORE2) is (STOREI) {61}.

Rule 7(e) recursively defines the semantics of a sequence of statements in terms of the individual statements and intermediate states:

(e) where (STMTS STMT) transforms (STATE1) into (STATE3): where (STMTS) transforms (STATEI) into (STATE2) {a,e-k . . . . },

where (STMT) transforms (STATE2) into (STATE3) {a,f-k . . . . }.

The next two rules define the semantics of assignment to (f) variables which are not name parameters of an enclosing procedure (VAR-type-...) and to (g) variables which are name parameters. The significance of many of the metanotions in these rules is as follows:

TAG (TAGI)-- the variable being assigned to EXPR-- the expression whose value is being assigned STATEI--the state existing before the assignment STATE2--the state existing after the assignment ENV--the environment ENVl--the layers of ENV older than that containing TAG (or TAGI) LAYETYI--the items in the relevant layer preceding the item

for TAG LAYETY2--the items in the relevant layer following the item

for TAG ENVETY--the layers of ENV newer than that containing TAG NUMBERI--the address of the location allocated to TAG NUMBER2--the length of the store NUMBER3--the value (of EXPR) to be assigned LOCSETYI--the locations in the store preceding the one

allocated to TAG LOCSETY2--the locations in the store following the one

allocated to TAG FILEl-- the input file FILE2--the output file

( .L . 43 - , I I)

180 FRANK G. PAGAN

If TAG is not a name parameter, the only effect of the assignment is that the location (LOC) allocated to TAG receives the new value NUMBER3:

(f) where (set TAG to EXPR) transforms (STATE1) into (STATE2): where (STATEI) is (state ENV store-NUMBER2-LOCSETY1-LOC-

LOCSETY2 inlile-FILEl outfile-FILE2) {61}, where (STATE2) is (state ENV store-NUMBER2-LOCSETYI-loc-

NUMBER3-LOCSETY2 infile-FILEl outlile-FILE2) {61}, where (ENV) is (ENVI new LAYETYI VAR-type-TAG-allocated-

NUMBER1 LAYETY2 ENVETY) {61}, where (TAG) not found in (ENVETY) {8a, b}, where (LOCSETYI LOC) has length (NUMBER1) {8c}, where (EXPR) yields (NUMBER3) given (STATE1) {r,s, . . .}.

If the assignment is to a name parameter TAGI, the corresponding argument must have been a variable, and the value for TAG1 in the store must be of the form var TAG2 requiring ENV2; the semantic error that would otherwise be present is disallowed by the rule:

(g) where (set TAGI to EXPR) transforms (STATE1) into (STATE2): where (STATEI) is (state ENV store-NUMBER2-LOCSETYI-loc-var-

TAG2-requiring-ENV2-LOCSETY2 infile-FILE1 outtile-FILE2 {61},

where (STATE2) is (state ENV STORE infile-FILEl ontfile-HLE2 {61 },

where (ENV) is (ENVl new LAYETY1 name-type-TAGl-allocated- NUMBER1 LAYETY2 ENVETY) {61},

where (TAG1) not found in (ENVETY) {8a,b}, where (LOCSETY1 Ioc var-TAG2-requiring-ENV2) has length

(NUMBERI) {8c}, where (STORE) updated from (store NUMBER2 LOCSETY1 loc

var-TAG2-requiring-ENV2 LOCSETY2) by (NUMBER3) in location (NUMBER4) {8e},

where (EXPR) yields (NUMBER3) given (STATEI) {r,s . . . . }, where (NUMBER4) location for (TAG2) in (ENV 2 store-

NUMBER2-LOCSETYI-loc-var-TAG2-requiring-ENV2- LOCSETY2) {8f, g}.

The address (NUMBER4) of the location in which NUMBER3 is placed is determined by the last predicate in the rule, which is of the form where (NUMBER4) location for (TAG2) in (ENV2 STORE1), where STORE1 is the initial store. This predicate, together with where (STORE) updated from (STORE1) by (NUMBER3) in location (NUMBER4), is defined in part 8.

Rules 7(h) and (i) for input statements are analogous to 7(f) and (g), the difference being that the value assigned (NUMBER3) is obtained from and removed from the input file:

(h) where (read TAG) transforms (STATE1) into (STATE2): where (STATE1) is (state ENV store-NUMBER2-LOCSETYI-LOC-

LOCSETY2-infile-datum-NUMBER3-FILE1 outfile-FILE2 {61}, where (STATE2) is (state ENV store-NUMBER2-LOCSETYI-ioc-

NUMBER3-LOCSETY2 infile-FILE1 outfile-FILE2) {61}, where (ENV) is (ENVI new LAYETY1 VAR-type-TAG-ailocated-

NUMBER1 LAYETY2 ENVETY) {61}, where (TAG) not found in (ENVETY) {8a, b}, where (LOCSETY1 LOC) has length (NUMBERI) {8c}.

(i) where (read TAG1) transforms (STATE1) into (STATE2): where (STATEI) is (state ENV store-NUMBER2-LOCSETYI-loc-var-

Semantic specification using two-level grammars lgl

TAG2-requiring-ENV2-LOCSETY2 infile-datum-NUMBER3-FILE1 outfile-FILE2) {61},

where (STATE2) is (state ENV STORE infile-FILEl outfile-FILE2) {61},

where (ENV) is (ENV1 new LAYETYI name-type-TAGl-allocated- NUMBER1 LAYETY2 ENVETY) {61},

where (TAGI) not found in (ENVETY) {Sa,b}, where (LOCSETY1 loc var-TAG2-requiring-ENV2) has length

(NUMBER1) {8c}, where (STORE) updated from (store NUMBER2 LOCSETY1 ioc

var-TAG2-requiring-ENV2 LOCSETY2) by (NUMBER3) in location (NUMBER4) {8e},

where (NUMBER4) location for (TAG2) in (ENV2 store-NUMBER2- LOCSETY l-loc-var-TAG2-requiring-ENV2-LOCSETY2) { 8f, g}.

Note that the grammar does not permit reading from an empty file. The effect of an output statement is, of course, to append a value to the end of the output file:

(j) where (write EXPR) transforms (STATE1) into (START2): where (STATEl) is (state ENV STORE infile-FILEl outfile-

DATETY-eof) {61}, where (STATE2) is (state ENV STORE infile-FILEl outfile-

DATETY-datum-NUMBER-eof) {61}, where (EXPR) yields (NUMBER) given (STATE1) {r,s . . . . },

The next rule defines the effect of calling a procedure with arguments ARGUMENTS, given an environment ENVI new LAYER1 ENVETY (where LAYER1 is the layer containing the information about the procedure) and a store STORE1, as the effect of executing the procedure body (STMT) in a first-contracted-then-extended environment ENVl new LAYER1 new LAYER, where LAYER is derived from the procedure's parameters, with a modified store STORE2 incorporating the argument values:

(k) where (call TAG with ARGUMENTS) transforms (STATE1) into (STATE2): where (STATED is (state ENV STOREI infile-FILEl outfile-

FILE2) {61}, where (STATE2) is (state ENV STORE3 infile-FILE3

outfile-FILE4) {61}, where (ENV) is (ENVI new LAYETYI proc-with-PARAMETERS-and-

STMT-type-TAG-aliocated-EMPTY LAYETY2 ENVETY {61}, where (TAG) not found in (ENVETY) {8a,b}, where (STMT) transforms (state ENV2 STORE2 infile-FILEl

outfile-FILE2) into (state ENV2 STORE3 infile-FILE3 outfile-FILE4) {a,f-k . . . . },

where (ENV2) is (ENV1 new LAYETYI proc-with-PARAMETERS- and-STMT-type-TAG-allocated-EMPTY LAYETY2 new LAYER) {61},

where (STATE1) implies (LAYER) and (STORE2) derived from (ARGUMENTS) and (PARAMETERS) {l-q}.

The protonotions for LAYER and STORE2, i.e. the semantics of the three parameter mechanisms, are defined by 7(l-q), all of which are rules for predicates of the form where (STATE) implies (LAYER) and (STORE) derived from (ARGUMENTS) and (PARA- METERS).

All three types of parameters extend the environment for the procedure body with items of the general form KIND type TAG allocated NUMBER. If KIND is val, a new location is allocated to the parameter, thus extending the store, and this location is initialized with the value [NUMBER3 in 70), NUMBER2 in 7(m)] obtained by evaluating

182 FRANK G. PAGAN

the argument (EXPR) in the original state:

(1) where (STATE) implies (LAYER ITEM) and (STORE) derived from (ARGUMENTS EXPR) and (PARAMETERS PARAMETER):

where (STATE) is (state ENV store-NUMBER1-LOCSETY1 infile-FILEl outfile-FILE2) {61},

where (PARAMETER) is (val type TAG allocated EMPTY) {61}, where (ITEM) is (val type TAG allocated NUMBER1-NUMBER2-i)

{61}, where (STORE) is (store NUMBER1-NUMBER2-i LOCSETYI

LOCSETY2 Ioc NUMBER3) {61}, where (EXPR) yields (NUMBER3) given (STATE) {r,s . . . . }, where (STATE) implies (LAYER) and (store NUMBER1-NUMBER2

LOCSETY1 LOCSETY2) derived from (ARGUMENTS) and (PARAMETERS) {l-q},

where (LOCSETY1) has length (NUMBERI) {8c,d}, where (LOCSETY2) has length (NUMBER2) {8c, d}.

(m) where (STATE) implies (ITEM) and (STORE) derived from (EXPR) and (PARAMETER):

where (STATE) is (state ENV store-NUMBER1-LOCSETY infile-FILE1 outfile-FILE2) {61},

where (PARAMETER) is (val type TAG allocated EMPTY) {61}, where (ITEM) is (vai type TAG allocated NUMBER1-i) {61}, where (STORE) is (store NUMBERI-i LOCSETY loc NUMBER2) {61}, where (EXPR) yields (NUMBER2) given (STATE) {r,s . . . . }, where (LOCSETY) has length (NUMBER1) {8c,d}.

If KIND is ref, the parameter is allocated the (existing) location allocated to the argument, which can only be a variable:

(n) where (STATE) implies (LAYER ITEM) and (STORE2) derived from (ARGUMENTS var-TAGl) and (PARAMETERS PARAMETER):

where (STATE) is (state ENV-new-LAYETYI-VAR-type-TAG1- ailocated-NUMBER-LAYETY2-ENVETY sTOi~EI infile-FILEl outfile-FILE2) {61},

where (PARAMETER) is (ref type TAG2 allocated EMPTY) {61}, where (ITEM) is (ref type TAG2 allocated NUMBER) {61}, where (TAG1) not found in (ENVETY) {8a,b}, where (STATE) implies (LAYER) and (STORE2) derived from

(ARGUMENTS) and (PARAMETERS) {l-q}. (o) where (STATE) implies (ITEM) and (STORE) derived from (var TAG1)

and (PARAMETER): where (STATE) is (state ENV-new-LAYETYI-VAR-type-TAG-

ailocated-NUMBER-LAYETY2-ENVETY STORE infile-FILE1 outfile-FILE2) {61},

where (PARAMETER) is (ref type TAG2 allocated EMPTY) {61}, where (ITEM) is (ref type TAG2 allocated NUMBER) {61}, where (TAGI) not found in (ENVETY) {8a, b},

If KIND is name, the parameter is allocated a new location which is given a value of the form EXPR requiring ENV, where EXPR is the unevaluated argument and ENV is the original environment:

(p) where (STATE) implies" (LAYER ITEM) and (STORE) derived from (ARGUMENTS EXPR) and (PARAMETERS PARAMETER):

where (STATE) is (state ENV store-NUMBERI-LOCSETYI infile-FILEl outfile-FILE2) {61},


where (PARAMETER) is (name type TAG allocated EMPTY) {61}, where (ITEM) is (name type TAG allocated NUMBERI-NUMBER2-i

{61}, where (STORE) is (store NUMBER1-NUMBER2-i LOCSETY1

LOCSETY2 lop EXPR-requiring-ENV) {61}, where (STATE) implies (LAYER) and (store NUMBER1-NUMBER2

LOCSETYI LOCSETY2) derived from (ARGUMENTS) and (PARAMETERS) {l-q},

where (LOCSETY1) has length (NUMBER1) {8c,d}, where (LOCSETY2) has length (NUMBER2) {8c,d}.

(q) where (STATE) implies (ITEM) and (STORE) derived from (EXPR) and (PARAMETER):

where (STATE) is (state ENV store-NUMBER-LOCSETY infile-FILEl outfile-FILE2) {61},

where (PARAMETER) is (name type TAG allocated EMPTY) {61}, where (ITEM) is (name type TAG allocated NUMBER-i) {61}, where (STORE) is (store NUMBER-i LOCSETY Ioc EXPR-

requiring-ENV) {61}, where (LOCSETY) has length (NUMBER) {Sc,d}.

The only specific rules given here for expression evaluation, i.e., for where (EXPR) yields (NUMBER) given (STATE), are 7(r) and (s), which define the evaluation of variables. Evaluation of a non-name variable is very straightforward:

(r) where (vat TAG) yields (NUMBER1) given (STATE): where (STATE) is (state ENV-new-LAYETY1-VAR-type-TAG-aIiocated-

NUMBER2-LAYETY2-ENVETY store-NUMBER3-LOCSETY 1-1OC- NUMBERI-LOCSETY2 infile-FILE1 ontfile-FILE2 {61},

where (TAG) not found in (ENVETY) {Sa,b}, where (LOCSETYI ioc NUMBER1) has length (NUMBER2) {8c}.

There is no provision for evaluation of a variable whose value is still undefined, so that this semantic error is also ruled out. Evaluation of a name parameter involves expression evaluation in a different environment (ENV2) and may require a series of back references (realized by recursion in the hyper-rule) if the argument has been passed by name through a chain of calls:

(s) where (var TAG)yields (NUMBER1) given (STATE): where (STATE) is (state ENVl-new-LAYETYI-name-type-TAG-

allocated-NUMBER2-LAYETY2-ENVETY STORE infile-FILE1 outfile-FILE2) {61},

where (STORE) is (store NUMBER3 LOCSETYI ioc EXPR- requiring-ENV2 LOCSETY2) { 61},

where (TAG) not found in (ENVETY) {8a, b}, where (LOCSETY1 Ioc EXPR-requiring-ENV2) has length

(NUMBER2) {8c}, where (EXPR) yields (NUMBERI) given (state ENV2 STORE

infile-FILEl outfile-FILE2) {r,s . . . . }.

8. Auxiliary hyper-rules for semantics

Rules 8(a) and (b) define where (TAG) not found in (ENVETY) and are used by the earlier semantic rules to ensure access of the most locally-declared item with a" given identifier:

(a) where (TAG) not found in (ENVETY new LAYER): where (TAG) not found in (ENVETY) {a,b},

where (TAG) not in (LAYER) {6f, g}. (b) where (TAG) not found in (EMPTY): EMPTY.

184 FRANK G. PAGAN

Rules 8(c) and (d) define where (LOCSETY) has length (NUMBER) and facilitate access to a specific storage location:

(c) where (LOCSETY LOC) has length (NUMBER i): where (LOCSETY) has length (NUMBER) {c,d}.

(d) where (EMPTY) has length (EMPTY): EMPTY.

Rule 8(e) specifies how a store is updated by changing the value in a given location:

(e) where (STORE2) updated from (STORE D by (NUMBER2) in location (NUMBER3):

where (STORED is (store NUMBER1 LOCSETY1 LOC LOCSETY2) {61}, where (STORE2) is (store NUMBER1 LOCSETYI loc NUMBER2

LOCSETY2) {61}, where (LOCSETY1 LOC) has length (NUMBER3) {c}.

Rules 8(0 and (g) have the function of determining what location must be altered in order to change the value of a name parameter. This requires knowledge of both the environment and the store:

(f) where (NUMBER) location for (TAG) in (ENV STORE): where (ENV) is (ENV1 new LAYETYI VAR-type-TAG-allocated-

NUMBER LAYETY2 ENVETY) {61}, where (TAG) not found in (ENVETY) {a,b}.

If an argument has been passed by name through a chain of procedure calls, it is necessary to search back through a series of earlier environments:

(g) where (NUMBERI) location for (TAG1) in ENV STORE): where (ENV) is (ENV1 new LAYETY1 name-type-TAGl-allocated-

NUMBER2 LAYETY2 ENVETY) {61}, where (STORE) is (store NUMBER3 LOCSETY1 Ioc TAG2-

requiring-ENV2 LOCSETY2) {61}, where (TAG1) not found in (ENVETY) {a,b}, where (LOCSETYI ioc TAG2-requiring-ENV2) has length

(NUMBER2) {c}, where (NUMBER1) location for (TAG2) in (ENV2 STORE) {f,g}.

C O N C L U S I O N

This two-level grammar completely and precisely defines the syntax and semantics of the language fragment in the sense of "define" described earlier. While it is not at present clear that two-level grammars will play a permanent role in semantic specification, neither is it clear that they will not. The construction and consideration of diverse examples and case studies is an essential prelude to a final judgment. This paper has shown that the approach can be successfully applied to a fairly difficult and previously untreated set of common language features without the relative complexity of the specifications reaching unreasonable proportions. Analogies to semantic concepts, such as environments, stores, and interpretive execution, found in other definition methods (e.g. Vienna Definition Language, the denotational approach) are implicit in the style used for "programming" the grammar, and the overall readability, clarity, etc. are not inferior to those of other approaches. Assuming a judicious choice of vocabulary and style on the part of the grammar-writer, a two-level grammar is largely self-documenting, and the rules can be read like text, with little additional commentary, by anyone familiar with the basic mechanisms and conventions of the formalism. Although there are other things to consider, such as comparative amenability to compiler generation and pi'ogram proving, it is concluded that two-level grammars must currently be regarded as a competitive approach for progress in formal language specification.

Semantic specification using two-level grammars 18~

R E F E R E N C E S

1. A. van Wijngaarden et al., Report on the Algorithmic Language ALGOL 68. Mathematisch Centrum, Amsterdam, MR 101 (1969). Also in Numerische Mathematik 14, 79-218 (1969).

2. A van Wijngaarden et al., Revised Report on the Algorithmic Language ALGOL 68. Springer-Verlag, Berlin, (1976). Also in Acta Informatica 5, 1-236 (1975) and SIGPLAN Notices 12, 1-70 (1977).

3. J. C. Cleaveland and R. C. Uzgalis, Grammars for Programming Languages. Elsevier North-Holland, Inc., New York (1977).

4. M. Marcotty, H. F. Ledgard and G. V. Bochmann, A sampler of formal definitions, Computing Surveys 8, 191 276 (1976).

About the Author--Frank G. Pagan received his Ph.D. in computer science from the University of Toronto in 1972. He has taught at the University of Aston in Birmingham (England), Memorial University of Newfoundland (Canada), and Southern Illinois University at Carbondale (U.S.A.), where he is presently an Associate Professor in the Department of Computer Science. He has carried out research in computational linguistics and in the design, implementation, and formal definition of programming languages, and is the author of an introductory book on ALGOL 68.

semantic specification using two-level grammars: blocks, procedures and parameters

Documents