
INVESTIGATION INTO THE COMPILATION

OF THE REGULAR OPERATIONS OF SequenceL

by

JOSEPH WILLIAM PIZZI, B.S.

A THESIS

IN

COMPUTER SCIENCE

Submitted to the Graduate Faculty

of Texas Tech University in Partial Fulfillment of the Requirements for

the Degree of

MASTER OF SCIENCE

Approved

May, 2001

ACKNOWLEDGEMENTS

I would like to thank my children, Ethan, Evan, and Jared, for tolerating me through this busy time. I would also like to thank my committee members, Dr. Daniel Cooke, Dr. William Marcy, and Dr. Lloyd Heinze, without whom I could not have completed this work. Last, but by no means least, I would like to thank my wife, Tracy, whose perseverance, persistence, confidence, and love made this project possible.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS

ABSTRACT

LIST OF FIGURES

CHAPTER

I. INTRODUCTION

    A Brief Introduction to SequenceL

    Motivation for a SequenceL Compiler

    The Compiler Development Process

    Work on Parallelization of SequenceL Code

II. PROJECT EXTENT

    SequenceL Grammar

    Grammar Modification

    Projected Results

III. CONSTRUCTION OF THE COMPILER

    Lexical Analysis

    Syntax Analysis

    Semantic Analysis

    Intermediate Code Generation

    Optimization and Code Generation

IV. DESIGN AND CONSTRUCTION OF THIS COMPILER

    Design of the Symbol Table

    Design of the Semantic Action Stack

    Design of the Run-time Data Structure

V. RESULTS

    The Compiler

    The Run-time Data Structure

    Language Insights

VI. DISCUSSION AND FURTHER RESEARCH

    The Run-time Data Structure

    Presentation of Results

    Parallelisms

REFERENCES

APPENDICES

    A. SEQUENCEL GRAMMAR

    B. COMPILER SOURCE CODE

    C. RUN-TIME SUPPORT FILES

ABSTRACT

SequenceL is a language developed by Dr. Daniel E. Cooke. It is intended for experimentation with declarative constructs for nonscalar processing. Presently, the only implementation of the language is an interpreter written in PROLOG. The purpose of this research is to investigate a compiled implementation of the "regular" portion of the language. The construction of this compiler was successful. The language has three major constructs: regular, irregular, and generative. It is believed that compilation of the regular construct accounts for the majority of the work involved in a compiled version of the complete language, due to the amount of derivable algorithmic content contained therein. Additionally, recent research has shown that SequenceL implies parallel control structures. The consistency of this compiled implementation with that observation is also discussed.

LIST OF FIGURES

1.1 Matrix Multiplication Example

1.2 Matrix from Figure 1.1

1.3 Trace of Matrix Multiply

2.1 Initial SequenceL Grammar Used for Compiler

2.2 Final SequenceL Grammar Used for Compiler

2.3 Projected Compiler Output

3.1 Compiler Phases and Their Relationship

3.2 Grammar for Syntax Analysis Parsing Examples

3.3 Top-down Parsing Sequence for 5 + 3 - 1

3.4 Top-down Parsing Tree for 5 + 3 - 1

3.5 Bottom-up Parsing Sequence for 5 + 3 - 1

3.6 Bottom-up Parsing Tree for 5 + 3 - 1

3.7 Right-most Derivation of 5 + 3 - 1

3.8 Grammar with Embedded Semantic Actions

4.1 SequenceL Example Data Structure

4.2 Run-time Representation of Figure 4.1

6.1 Add Method Code

6.2 Involved SequenceL Example for Parallelization

6.3 Compiler Output from Code in Figure 6.2

6.4 Data Dependency Directed Acyclic Graph for Code in Figure 6.3

6.5 Single Sequence Mathematical Operation

6.6 Compiler Output for Equation in Figure 6.5

6.7 DAG for Equation in Figure 6.5

CHAPTER I

INTRODUCTION

A Brief Introduction to SequenceL

SequenceL is a relatively new computer language designed for the processing of data structures. It differs from traditional languages in that in a traditional language, the programmer describes the algorithm to transform the input data structure into the output data structure, while in SequenceL, the programmer's task is simply to describe the input and output data structures. The transformation algorithm is implied by the SequenceL computational and structural operators applied to data structures. This is a new approach to programming. No longer is the programmer required to learn or specify complex algorithms. Now, the programmer is only required to learn SequenceL's relatively small language of data structure description.

In [5], one finds a thorough description of the SequenceL language. The remainder of this section is a summary of that description. SequenceL is based upon a strategy for problem solving wherein one solves a problem by describing data structures strictly in terms of their form and content, rather than also having to describe the iterative/recursive detail to produce and/or process the data structures algorithmically [3]. SequenceL's data elements are sequences. A sequence is a collection of elements wherein each element may occur more than one time (as opposed to a mathematical set) and where each occurrence of an element possesses an ordinal position (as opposed to a bag or multiset) [3]. Each element can be a singleton (e.g., [99]), or a sequence (e.g., [[[1],[2]],[[3],[4]]]). This allows the construction of complex structures containing sequences of sequences.

SequenceL's operations fall into one of three categories: regular, irregular, or generative. All SequenceL operations perform a mapping from a domain sequence to a range sequence. Regular and irregular operations reduce a sequence, either in dimensionality (dimensionality refers to the level of nesting of sequences) or cardinality. Generative operations expand a sequence, either in dimensionality or cardinality. Reduction or expansion may sometimes leave the range sequence equal in dimensionality and/or cardinality to the domain sequence.

The regular construct applies an operation to corresponding elements of normalized operand sequences [5]. The operation includes all forms of arithmetic processing. For example:

    One singleton:   +([7]) = [7]

    Two singletons:  +([[2],[9]]) = [11]

    Non-scalars:     +([[6],[2],[5],[7]]) = [20]

    Nested:          +([[[20],[30],[10],[50],[60]],[[2],[6],[1],[9],[2]]])
                     = [[22],[36],[11],[59],[62]]
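The flat cases above can be sketched in C++. This is only an illustration and not the run-time data structure developed in this thesis: sequences of singletons are modelled as std::vector<int>, and the names Seq, reduce_add, and add_corresponding are hypothetical. The first function models + applied to a single sequence of singletons; the second models + applied to two operand sequences of equal length, as in the Nested example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model: a flat sequence of singletons, e.g. [[6],[2],[5],[7]].
using Seq = std::vector<int>;

// "+" over one sequence of singletons: +([[6],[2],[5],[7]]) = [20].
int reduce_add(const Seq& s) {
    return std::accumulate(s.begin(), s.end(), 0);
}

// "+" over two operand sequences: corresponding elements are added.
Seq add_corresponding(const Seq& a, const Seq& b) {
    assert(a.size() == b.size());  // operands are assumed to be normalized already
    Seq out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}
```

Calling add_corresponding on {20, 30, 10, 50, 60} and {2, 6, 1, 9, 2} reproduces the Nested result, and every element-wise addition in the loop is independent of the others.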

None of these examples utilize the normalization operation. Normalization occurs when there are two or more sequences, and they differ in dimensionality or cardinality. In the following example, there are two operands. One has dimensionality of three with cardinalities of three, two and three. The other has dimensionality two, with cardinalities of four and two.

+([[[[[3],[2],[1]],[[4],[6],[3]]],[[[5],[7],[2]],[[6],[8],[9]]],[[[1],[3],[2]],[[4],[2],[4]]]],
[[[1],[2]],[[3],[4]],[[5],[6]],[[7],[8]]]])

Notice that the deeper the nesting gets, the more difficult it is to read the sequence. For that reason, all future equations with more than two dimensions will be represented graphically instead of in the actual syntax of SequenceL. The previous equation (graphically) is:

    + applied to the first operand (three 2 x 3 planes)

        3 2 1      5 7 2      1 3 2
        4 6 3      6 8 9      4 2 4

    and the second operand (4 x 2)

        1 2
        3 4
        5 6
        7 8

After normalization, the equation is (line break for readability):

        3 2 1      5 7 2      1 3 2
        4 6 3      6 8 9      4 2 4
        3 2 1      5 7 2      1 3 2
        4 6 3      6 8 9      4 2 4

    +

        1 2 1      1 2 1      1 2 1
        3 4 3      3 4 3      3 4 3
        5 6 5      5 6 5      5 6 5
        7 8 7      7 8 7      7 8 7

Notice that the sequences are repeated to equalize the cardinality and the dimensionality. This final equation gives the result of:

         4  4  2       6  9  3       2  5  3
         7 10  6       9 12 12       7  6  7
         8  8  6      10 13  7       6  9  7
        11 14 10      13 16 16      11 10 11
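The repetition used in this example can be sketched for a single plane as follows. The sketch assumes that normalization repeats elements cyclically until the target cardinality is reached, which is what the worked example shows; the names Row, Plane, cycle_row, normalize, and add are hypothetical, and the third dimension (repeating the second operand once per plane of the first) would be handled by applying the same steps to each plane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;
using Plane = std::vector<Row>;

// Extend a row to `cols` entries by repeating its elements cyclically.
Row cycle_row(const Row& r, std::size_t cols) {
    assert(!r.empty());
    Row out(cols);
    for (std::size_t j = 0; j < cols; ++j) {
        out[j] = r[j % r.size()];
    }
    return out;
}

// Extend a plane to rows x cols by cycling its rows and, within each row, its elements.
Plane normalize(const Plane& p, std::size_t rows, std::size_t cols) {
    assert(!p.empty());
    Plane out;
    out.reserve(rows);
    for (std::size_t i = 0; i < rows; ++i) {
        out.push_back(cycle_row(p[i % p.size()], cols));
    }
    return out;
}

// Add two planes of identical shape element by element.
Plane add(const Plane& a, const Plane& b) {
    assert(a.size() == b.size());
    Plane out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].size() == b[i].size());
        out[i].resize(a[i].size());
        for (std::size_t j = 0; j < a[i].size(); ++j) {
            out[i][j] = a[i][j] + b[i][j];
        }
    }
    return out;
}
```

Normalizing the first plane of the first operand and the whole second operand to 4 x 3 and adding them yields the first plane of the result shown above.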

The irregular construct applies an operation to selected corresponding elements of operand sequences. The selection is based upon either value or position, and is defined by the when clause. The when clause contains a conditional. When the conditional evaluates to true, that element is selected for the operation. For example:

Function raise(consume(salary(n),evaluation(n)),produce(salary))
where salary = {*([salary(i),1.1]) when evaluation(i) > 5
else [salary(i)]} taking i from gen([1,...,n]).

The operation will multiply each salary(i) by 1.1 when evaluation(i) is greater than 5. i comes from the generative operation (see below for a description of generative operations). n is obtained when the function begins execution [5].
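An imperative rendering of this function makes the selection explicit. The following is a sketch under stated assumptions, not compiler output: salaries are modelled as double, evaluations as int, and the name apply_raise is hypothetical. The loop index plays the role of i taken from gen([1,...,n]).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiply salary[i] by 1.1 when evaluation[i] > 5; otherwise keep it unchanged.
std::vector<double> apply_raise(const std::vector<double>& salary,
                                const std::vector<int>& evaluation) {
    assert(salary.size() == evaluation.size());  // both operands have cardinality n
    std::vector<double> out(salary.size());
    for (std::size_t i = 0; i < salary.size(); ++i) {
        out[i] = (evaluation[i] > 5) ? salary[i] * 1.1 : salary[i];
    }
    return out;
}
```

The SequenceL version states only the condition and the two result forms; the loop, the index, and the output buffer shown here are exactly the iterative detail that the language leaves implied.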

The generative construct allows for the expansion of sequences. The simple form generates integers within some bounds:

gen([[1],...,[5]]) = [[1],[2],[3],[4],[5]].

The complex version generates values within bounds when those values satisfy some constraint. For example:

gen([[0],[1],...,+([pred(pred(succ())),pred(succ())]),[5]])

which yields [[0],[1],[1],[2],[3],[5]], the Fibonacci series through 5.

Motivation for a SequenceL Compiler

When using computer languages, there are two ways of converting a program from the language that the programmer uses (high-level language) into the language that the computer uses (machine language): interpretation and compilation.

In an interpreter, each statement coded by the programmer is converted, one at a time, into machine language for the computer to process. This method is often preferred during program development, because there is no separate compilation step, and it is therefore faster to make a change and test that change. In fact, it is quite common to be able to make a change while the program is running, say to correct an error detected while debugging. After the program is developed, however, it is generally accepted that a compiled version of the program is better than an interpreted version. This is because the compilation step can be done before distributing the program for general use, thereby eliminating the time-consuming translation process every time the user would like to run the program. For example, during the execution of an interpreted program, each and every time a statement is executed, it must be translated and executed. If the program contains a loop that is executed 1000 times, the statements inside that loop are re-interpreted 1000 times each. In a compiler, the loop statements are translated to machine code only once, and that one time is before the user ever sees the program.

During the development of a language, an interpreter is usually its first implementation. While a language is being developed, changes to the language's grammar and semantic definitions are inevitable, and those changes are typically easier to make in an interpreter. Once a language is sufficiently developed, it is common for there to be efforts to develop a compiler for the language. This is the stage that SequenceL has reached.

However, SequenceL is such a different language that traditional compiler construction methods have to be modified to apply to the unique constructs of the language. By investigating different compiler construction methods to determine the best ways to compile one of the constructs, the regular construct, further insights into the language may be revealed. The method for compilation of the remaining constructs can be guided by the method developed here.

The Compiler Development Process

A compiler consists of six phases: a lexical analyzer, a syntax analyzer, a semantic analyzer, an intermediate code generator, a code optimizer, and a code generator [1]. Often, some phases are combined. The compiler in this project produces C++ code, so the code optimizer and code generator phases are omitted, and C++ is used as the intermediate (and final) code. The generated C++ code is then run through a commercially available C++ compiler (Microsoft Visual C++ was used in this thesis) to produce the machine-executable version. There are multiple methods available for solving each phase's tasks.

Lexical analysis consists of reading the input code and splitting it into the tokens of the language. One of the most common methods of developing the lexical analyzer is to use a program designed for the task, such as lex. Another method is to analyze the input code with hand-written code.

Syntax analysis consists of taking the tokens from lexical analysis and comparing them to the grammar of the language to see if they form valid statements. There are two major approaches for syntax analysis: top-down and bottom-up. A top-down analyzer is also known as a recursive-descent analyzer. A bottom-up analyzer is table-driven. There are tools available for generating a table-driven analyzer; yacc is an example of such a tool, commonly called a compiler compiler. Recursive-descent compilers are usually constructed manually.

Semantic analysis is usually done along with syntax analysis, as there is a close relationship between the two. Semantic analysis consists of analyzing the statements to make sure they fit together meaningfully [1]. The semantic analysis code is always written manually (i.e., without the use of tools beyond an online editor). Some of the syntax analysis tools allow semantic actions to be added by hand to the input files (which contain the grammar for the language) for easier integration into a complete compiler (as opposed to just syntax analysis).

The generation of an intermediate code representation of a program written in the source language is the product of semantic analysis and is done as a step to translate the source language into machine code. This step consists of taking the meaning of the processed statement from the source language and converting it into intermediate code with the same meaning. Intermediate code is a pseudo, machine-independent object language that contains primitives, or non-primitives that can be easily converted into the machine language statements available on any von Neumann target platform. Java's byte code and early PASCAL's p-code are well-known examples of intermediate code languages.

The optimization phase of the compiler consists of an analysis of the intermediate code. There are numerous levels of optimization, from peephole optimization, the examination of short sequences of intermediate code, to global optimization, the examination of the entire program for optimizations that cannot be seen through smaller-scope investigation.

The code generation phase of the compiler takes the optimized intermediate code and generates the actual executable code for the target platform. This is usually a fairly straightforward conversion (similar to how an assembler works) from the intermediate code to machine code. Code generation is complicated by efforts to optimize register and memory usage as well as other efforts to provide machine-dependent optimizations.

Work on Parallelization of SequenceL Code

Recent improvements in SequenceL have indicated that parallel control structures are implied by the SequenceL problem solutions [5]. As computing moves towards a more distributed environment, it becomes increasingly necessary for computing systems to be able to find every type of parallelism possible in an application [8]. There are (at least) two levels of parallelism revealed in SequenceL. One is "flat parallelism." The other is "nested parallelism." "Flat parallelism" is where one can identify that an operation is independent of another, for example, adding two pairs of independent values. Adding e = a + b is independent of adding f = c + d. Those two operations can be done in parallel, with no loss of generality or corruption of data. "Nested parallelism" is where there is an operation that can be broken up into multiple steps, each of which may have a "flat parallelism" inside of it. An example is summing the values in a list. Instead of adding element one to element two and adding that to element three, etc., elements one and two can be added while elements three and four are added, and those intermediate results can then be added (in parallel, too) to obtain the original desired result [2].
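The list-summing example can be sketched as a pairwise reduction. The code below is sequential and only illustrates the structure: within one level, every addition is independent of the others and could be assigned to a separate process. The name pairwise_sum and the optional levels output parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sum a list by combining adjacent pairs level by level.
// Returns 0 for an empty list; optionally reports how many levels were needed.
int pairwise_sum(std::vector<int> values, int* levels = nullptr) {
    int depth = 0;
    while (values.size() > 1) {
        std::vector<int> next;
        next.reserve((values.size() + 1) / 2);
        for (std::size_t i = 0; i + 1 < values.size(); i += 2) {
            next.push_back(values[i] + values[i + 1]);  // independent of every other pair
        }
        if (values.size() % 2 == 1) {
            next.push_back(values.back());  // an unpaired element carries over unchanged
        }
        assert(next.size() < values.size());  // each level shrinks the list
        values = std::move(next);
        ++depth;
    }
    if (levels != nullptr) {
        *levels = depth;
    }
    return values.empty() ? 0 : values.front();
}
```

For four elements the reduction finishes in two levels instead of three sequential additions, which is the saving the nested form exposes.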

SequenceL is very good for this type of parallel identification because the operations themselves can be nested, so that there is an inherent parallelism from that, and additionally, each operation, since it operates on a list, can be parallelized at that lower level. In SequenceL, the expression +[[3,2],[4,5]] yields the independent operations 3 + 4 and 2 + 5, giving the result of [7,7]. Both operations can be executed in parallel. This is "flat parallelism." In SequenceL, the expression

*[+[[3,2],[4,5]],+[[5,6],[7,8]]]

gives us the independent operations 3 + 4 and 2 + 5, along with the independent operations 5 + 7 and 6 + 8. The results from both of those operations are then multiplied together (*[[7,7],[12,14]]) to give us the final result of [84,98]. Notice that 3 + 4 and 2 + 5 can be done concurrently, and concurrently with 5 + 7 and 6 + 8, which can themselves be done concurrently. Then, 7*12 and 7*14 are done concurrently. This is "nested parallelism."

As an example of the revealed, nested parallelisms, examine the matrix multiply in SequenceL [5]. The program is given in Figure 1.1. This program squares the matrix in Figure 1.2.

Function matmul(consume(s_1(n,*),s_2(*,m)),produce(next)) where next(i,j) =
{compose([+([*(s_1(i,*),s_2(*,j))])])} taking [i,j]
from cartesian_product([gen([1,...,n]),gen([1,...,m])])

([[2,4,6],[3,5,7],[1,1,1]],[[2,4,6],[3,5,7],[1,1,1]])

Figure 1.1. Matrix Multiplication Example.

    2 4 6
    3 5 7
    1 1 1

Figure 1.2. Matrix from Figure 1.1.

To trace execution of this example, the first step is to instantiate all variables with their values based upon the input data. This evaluates the compose to

{compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*),[[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
taking [i,j] from cartesian_product([gen([1,...,3]),gen([1,...,3])])

Next, the generative constructs revise the taking clause:

{compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*),[[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
taking [i,j] from cartesian_product([[1,2,3],[1,2,3]])

Next, the cartesian_product operation generates the values for i and j:

{compose([+([*([[2,4,6],[3,5,7],[1,1,1]](i,*),[[2,4,6],[3,5,7],[1,1,1]](*,j))])])}
taking [i,j] from [ [[1],[1]], [[1],[2]], [[1],[3]],
                    [[2],[1]], [[2],[2]], [[2],[3]],
                    [[3],[1]], [[3],[2]], [[3],[3]] ]

Next, the expansion of i and j into the compose operation:

[
[ +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](1,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]
[ +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](2,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]
[ +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,1))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,2))])
  +([*([[2,4,6],[3,5,7],[1,1,1]](3,*),[[2,4,6],[3,5,7],[1,1,1]](*,3))]) ]
]

Next, the selection for the irregular operation (the multiply) is performed. This can be done in parallel (one process for each line).

[
[ +([*([[2,4,6],[2,3,1]])])
  +([*([[2,4,6],[4,5,1]])])
  +([*([[2,4,6],[6,7,1]])]) ]
[ +([*([[3,5,7],[2,3,1]])])
  +([*([[3,5,7],[4,5,1]])])
  +([*([[3,5,7],[6,7,1]])]) ]
[ +([*([[1,1,1],[2,3,1]])])
  +([*([[1,1,1],[4,5,1]])])
  +([*([[1,1,1],[6,7,1]])]) ]
]

Next, the multiplication operation (now a regular operation) is performed in parallel (27 independent operations).

[
[ +([4,12,6])
  +([8,20,6])
  +([12,28,6]) ]
[ +([6,15,7])
  +([12,25,7])
  +([18,35,7]) ]
[ +([2,3,1])
  +([4,5,1])
  +([6,7,1]) ]
]

Then, finally, the addition is performed (again, a regular operation), giving us the final result [[22,34,46],[28,44,60],[6,10,14]], graphically shown below.

    22 34 46
    28 44 60
     6 10 14

This example shows several levels of parallelization. It starts with one equation, separates into 27 independent multiplication operations, then those 27 results combine into nine independent addition operations, whose results are collected to form the final result.

    mm
    * * * * * * * * * * * * * * * * * * * * * * * * * * *
    + + + + + + + + +
    [[[22],[34],[46]],[[28],[44],[60]],[[6],[10],[14]]]

Figure 1.3. Trace of Matrix Multiply.
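The counts in this trace can be checked with a short sketch. The code is a conventional sequential matrix multiply, not the SequenceL evaluation strategy; it simply tallies one multiplication per product term and one + reduction per result element. The names Matrix, Trace, and matmul_trace are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

struct Trace {
    Matrix result;
    int multiplications = 0;  // independent "*" operations
    int additions = 0;        // independent "+" reductions, one per result element
};

Trace matmul_trace(const Matrix& a, const Matrix& b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    Trace t;
    t.result.assign(a.size(), std::vector<int>(b[0].size(), 0));
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b[0].size(); ++j) {
            std::vector<int> products;  // *([row i of a, column j of b])
            for (std::size_t k = 0; k < b.size(); ++k) {
                products.push_back(a[i][k] * b[k][j]);
                ++t.multiplications;
            }
            int sum = 0;  // +(products): one reduction per (i, j) pair
            for (int p : products) {
                sum += p;
            }
            t.result[i][j] = sum;
            ++t.additions;
        }
    }
    return t;
}
```

For the 3 x 3 matrix of Figure 1.2 multiplied by itself, the tallies are 27 multiplications and nine additions, matching the two rows of operators in Figure 1.3.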

As compilation is investigated, additional parallelization opportunities will be revealed. One such optimization is discussed in the section entitled Parallelisms, in Chapter VI.

CHAPTER II

PROJECT EXTENT

SequenceL Grammar

One of the first steps in defining the scope of the project was to identify which of the grammar productions are included in regular operations (the initial grammar is included in Appendix A). It was decided that addition, subtraction, multiplication, and division (the standard arithmetic operators) would be sufficient to demonstrate the viability of the hypothesized approach. The standard arithmetic operators are part of the O production. The O production is included as part of the T production, which is included as part of the M, B, and F productions. The start symbol of the grammar is E. So, it appeared that the portion of the grammar that would have to be implemented was productions E, F, B, T, M, O, S, and C. It was decided, however, that a subset of the grammar, equivalent to just the relevant portions of the whole grammar relating to regular operations, could be extracted. This is the grammar given in Figure 2.1.

    C -> [Integer] | [Real] | [Character] | [String]
    S -> C | [C+]
    O -> + | - | * | /
    T -> S | [T*]
    E -> T

Figure 2.1. Initial SequenceL Grammar Used for Compiler.

After starting with the grammar in Figure 2.1, it was discovered that there were ambiguities involved, so the grammar was altered. These ambiguities were overcome in the interpreted version through the addition of enveloping processing to account for ambiguity, but could not be permitted in the compiled version. Additionally, while interpreters can easily deal with an untyped language, compilers cannot. Therefore, further modification to the language was discussed.

Grammar Modification

In discussions with Dr. Cooke [6], it was discovered that the type of the operands (e.g., integer, real, character, or string) was not the main confounding issue; rather, the dimensionality of the operand (whether it has one, two, three, etc., dimensions, and the sizes of each of those dimensions) was the more complex issue. Dr. Cooke had previously discussed this with some colleagues at NASA Ames, and they had come to the agreement that including the dimensionality information was not a significant impairment to the language. As a result of these discussions, the final grammar that was implemented is given in Figure 2.2.

    S  -> [T]
    T  -> O([L]) T1
    T1 -> ;L | ε
    L  -> T | <int> L1
    L1 -> ,L | :L | ε
    O  -> + | - | * | /

Figure 2.2. Final SequenceL Grammar Used for Compiler.

Note that this is a significant deviation from the original grammar. In the original grammar, the actual values for the list elements are specified; in this grammar, the dimension information for the list is specified, and the values are read in at run-time.

One side effect of this change is that normalization does not occur within an operand, only between operands. Since normalization is one of the most significant helper functions in SequenceL, this was examined closely. Normalization does occur at the inter-operand level, so it was decided that this change does not affect the usability of the language (or the applicability of the project solution).

Projected Results

The expected input to this compiler is a file containing the SequenceL source code. For example, the following would be a valid input: [-([*([3,2,2:4,3]);+([3,3;4,4])])]. This equation says to multiply the contents of a 3x2x2 matrix by the contents of a 4x3 matrix (yielding a 3x4x3 matrix). From that, subtract the result of adding a 3x3 matrix to a 4x4 matrix (which is a 4x4 matrix). The final result would be a 3x4x4 matrix. The compiler would need to convert this to something like the following:

    int main() {
        new matrix0(3,2,2); matrix0.read();
        new matrix1(4,3); matrix1.read();
        matrix0 = matrix0 * matrix1;
        new matrix2(3,3); matrix2.read();
        new matrix3(4,4); matrix3.read();
        matrix2 = matrix2 + matrix3;
        matrix0 = matrix0 - matrix2;
        matrix0.print();
        return 0;
    }

Figure 2.3. Projected Compiler Output.
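The result shapes quoted for this example (3x2x2 with 4x3 giving 3x4x3, 3x3 with 4x4 giving 4x4, and 3x4x3 with 4x4 giving 3x4x4) are consistent with a simple rule: align the dimension lists on the right and take the larger size in each position. The following sketch encodes that rule. It is an assumption inferred from these three cases, not a statement of the run-time library's actual algorithm, and the names Shape and result_shape are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Shape of a regular operation's result, assuming right-aligned dimensions
// and a per-dimension maximum (inferred from the example in the text).
Shape result_shape(const Shape& a, const Shape& b) {
    assert(!a.empty() && !b.empty());
    const Shape& longer = (a.size() >= b.size()) ? a : b;
    const Shape& shorter = (a.size() >= b.size()) ? b : a;
    Shape out = longer;
    const std::size_t offset = longer.size() - shorter.size();
    for (std::size_t i = 0; i < shorter.size(); ++i) {
        out[offset + i] = std::max(longer[offset + i], shorter[i]);
    }
    return out;
}
```

Applied in the order of Figure 2.3, the three operations produce the shapes 3x4x3, 4x4, and finally 3x4x4.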

CHAPTER III

CONSTRUCTION OF THE COMPILER

As stated in Chapter I, there are six phases to a compiler: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation. The relationship of these phases is shown in Figure 3.1.

[Figure: block diagram of the compiler phases. Recoverable labels: Source Program (ASCII, vocabulary for lexical analysis); Tokens (vocabulary for syntax analysis); Semantic Analysis; Quads (intermediate code); Code Generator; Executable Machine Code.]

Figure 3.1. Compiler Phases and Their Relationship [7].

Lexical Analysis

Lexical analysis is the first step in compilation. The lexical analyzer reads the input file, character by character (conceptually, at least), forming the individual items that the language recognizes (tokens). These tokens represent each element of the language, including reserved words, operators, numeric constants, identifiers (variable names, function names, etc.), and even punctuation (parentheses, etc.).

This process is accomplished by examining a character, after eliminating leading white space (blanks, tabs, and new line characters) if necessary, and determining if it is a valid start character for any token or the end-of-file marker. If it is the end-of-file marker, a special value indicating the end of the input file is returned, and any further call to the lexical analyzer is reported as an error. If it is a valid start character, it is accepted (read) and the next character is then examined to see if it is a valid character to follow the first. If it is, it is accepted (read) and the next character is examined, and so on. This continues until a delimiter or an invalid follow-on character is found. Delimiters are characters that can follow a valid token. These are usually white space, punctuation, operators, etc. If an invalid follow-on character is found, an error is reported. If a delimiter is found, the sequence of characters read is examined to see if it is a valid token. If it is, that token is returned for the next step in compilation. If it is not, an error is reported.

The lexical analyzer is a finite state automaton (FSA). This automaton consists of four types of states. The first is the start state. This is the state that the FSA starts in each time it is called. The second is acceptance. These are states reached when a valid sequence has been read. The third is error. These are the states reached when a character is read that is not a valid character at that point. The fourth is pending. These are the states reached during reading, when neither a delimiter has been read nor an error has been found. Each state will have exactly one path that leads to it (except, possibly, error states), but may have one or more paths leading out of it. This path represents the characters read at that point. At each acceptance state, that path forms a valid token for the language. There will likely be more than one state of each type. Additionally, a state can be an accepting state and a pending state if, for example, there is a token that is a prefix of another. In all such cases, the next character is examined to see if it is a valid follow-on, or a delimiter. If it is a delimiter, the token recognized is returned to the routine that called the lexical analyzer. If it is a valid follow-on, the FSA proceeds to the next state.
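The state types described above can be seen in a small hand-written scanner. The sketch below is illustrative only and is not the analyzer used in this project: the token set (brackets, parentheses, separators, the four arithmetic operators, and unsigned integers) is an assumption based on the example input in Chapter II, and the names TokenKind, Token, and tokenize are hypothetical. The integer branch is both a pending and an accepting state, since any prefix of an integer is itself an integer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class TokenKind { Punct, Operator, Integer };

struct Token {
    TokenKind kind;
    std::string text;
};

std::vector<Token> tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {  // start state: skip leading white space
            ++i;
            continue;
        }
        if (std::isdigit(c)) {  // pending/accepting state: inside an integer
            const std::size_t start = i;
            while (i < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[i]))) {
                ++i;
            }
            assert(i > start);  // at least one digit was consumed
            tokens.push_back({TokenKind::Integer, input.substr(start, i - start)});
        } else if (std::string("+-*/").find(input[i]) != std::string::npos) {
            tokens.push_back({TokenKind::Operator, std::string(1, input[i])});
            ++i;
        } else if (std::string("[](),;:").find(input[i]) != std::string::npos) {
            tokens.push_back({TokenKind::Punct, std::string(1, input[i])});
            ++i;
        } else {
            throw std::invalid_argument("invalid start character");  // error state
        }
    }
    return tokens;
}
```

Reaching the end of the string plays the role of the end-of-file marker; a generated scanner would instead return a distinguished end-of-input value to its caller.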

The lexical analyzer can either be constructed by hand, or a tool can be used, a common one being lex. For this project, an open source replacement for lex, called flex, was used. This tool takes as input a file containing the regular expressions specifying the tokens of the language. It produces as output C code that recognizes the tokens of the language specified. The use of this tool eliminates the necessity of explicitly generating an FSA for the lexical analyzer. The syntax for the input data to the tool was straightforward enough that the lexical analyzer generated for this project generates all of the tokens for the language, not just the subset needed for this project.

Syntax Analysis

The token returned by the lexical analyzer is used as the input for the syntax analyzer. The syntax analyzer performs the function of parsing the input. Parsing is the action of determining whether or not an input sequence is a valid sequence for the language. An FSA is an inadequate representation of syntax analysis. The reason an FSA is inadequate can be demonstrated with the task of matching symbols, for example square brackets. In SequenceL, there are numerous instances where items are enclosed in square brackets. In every case, for every left bracket there must be a corresponding right bracket. The syntax analyzer must keep track of the number of left brackets in order to ensure the same number of right brackets. Since the number of left brackets is unbounded, it would be impossible to keep track of it with the finite number of states of an FSA. Therefore, an additional structure is needed. This structure is a stack, referred to as the semantic action stack (SAS). Adding a stack to an FSA creates a pushdown automaton (PDA) [11].
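The bracket-matching task shows why the stack is needed. The following sketch checks only that square brackets and parentheses are properly nested; it is not the semantic action stack of this compiler, and the name brackets_balanced is hypothetical.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Returns true when every '[' and '(' is closed by the matching ']' or ')'
// in the correct order. All other characters are ignored.
bool brackets_balanced(const std::string& input) {
    std::stack<char> open;  // unbounded memory: this is what an FSA lacks
    for (char c : input) {
        if (c == '[' || c == '(') {
            open.push(c);
        } else if (c == ']' || c == ')') {
            if (open.empty()) {
                return false;  // a closer with no matching opener
            }
            const char expected = (c == ']') ? '[' : '(';
            if (open.top() != expected) {
                return false;  // wrong kind of closer
            }
            open.pop();
        }
    }
    return open.empty();  // leftover openers mean unmatched brackets
}
```

The depth of the stack at any point equals the current nesting level, so inputs of any nesting depth are handled by the same code.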

Most parsing methods fall into one of two classes, called the top-down and

bottom-up methods [1]. Both methods build a syntax tree (at least implicitly). In the top-

down method, construction starts with the "start symbol" of the grammar as the root of

the tree, and proceeds to break down each symbol by a production in a left-most manner,

in an attempt to process every token in the input sequence. For example, given the grammar in Figure 3.2, the expression 5 + 3 - 1 can be parsed as shown in Figure 3.3, which is graphically shown in Figure 3.4.

S -> E
E -> E + T
E -> E - T
E -> T
T -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Figure 3.2. Grammar for Syntax Analysis Parsing Examples.

S => E => E - T => E + T - T => T + T - T => 5 + T - T => 5 + 3 - T => 5 + 3 - 1

Figure 3.3. Top-down Parsing Sequence for 5 + 3 - 1.

            S
            |
            E
          / | \
         E  -  T
       / | \   |
      E  +  T  1
      |     |
      T     3
      |
      5

Figure 3.4. Top-down Parsing Tree for 5 + 3 - 1.

In the bottom-up method, construction starts at the leaves and combines them to form the root. Using the same example, a bottom-up parser would parse it as shown in Figure 3.5, which is graphically shown in Figure 3.6.

5 + 3 - 1 => T + 3 - 1 => E + 3 - 1 => E + T - 1 => E - 1 => E - T => E => S

Figure 3.5. Bottom-up Parsing Sequence for 5 + 3 - 1.

      5
      |
      T     3
      |     |
      E  +  T  1
       \ | /   |
         E  -  T
          \ | /
            E
            |
            S

Figure 3.6. Bottom-up Parsing Tree for 5 + 3 - 1.

Notice that the right-most derivation, shown in Figure 3.7, is the reverse of this process.

S => E => E - T => E - 1 => E + T - 1 => E + 3 - 1 => T + 3 - 1 => 5 + 3 - 1

Figure 3.7. Right-most Derivation of 5 + 3 - 1.

Top-down parsers are nearly always written as recursive-descent parsers. In a recursive-descent parser, there is a function for each production in the grammar. Each function in turn calls the other functions as needed. Each function returns true or false, indicating whether or not the input sequence is valid for that production. This is a straightforward method for syntax analysis, and it is relatively easy to create a recursive-descent parser by hand. The drawback is that there are grammars for which a recursive-descent parser cannot be written. However, most of these grammars have equivalent forms that can be parsed by a recursive-descent parser. Truth-preserving algorithms that provide these conversions were the subjects of many of the early research projects focused on compiler theory [9].
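As an illustration of the recursive-descent approach, the following C++ sketch recognizes the language of the grammar in Figure 3.2. Because that grammar is left-recursive, the sketch assumes an equivalent form without left recursion (E -> T followed by any number of "+ T" or "- T"). The class name Recognizer and its member names are hypothetical, and the sketch groups the productions by nonterminal rather than using one function per production.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical recognizer for the grammar of Figure 3.2, rewritten without
// left recursion:  S -> E,  E -> T { ('+' | '-') T },  T -> digit.
class Recognizer {
public:
    explicit Recognizer(const std::string &text) : text_(text), pos_(0) {}

    // S -> E, and the whole input must be consumed.
    bool parseS() {
        if (!parseE()) return false;
        skipSpaces();
        return pos_ == text_.size();
    }

private:
    // E -> T { ('+' | '-') T }
    bool parseE() {
        if (!parseT()) return false;
        for (;;) {
            skipSpaces();
            if (pos_ < text_.size() &&
                (text_[pos_] == '+' || text_[pos_] == '-')) {
                ++pos_;                       // consume the operator
                if (!parseT()) return false;  // an operand must follow
            } else {
                return true;                  // no more operators
            }
        }
    }

    // T -> 0 | 1 | ... | 9
    bool parseT() {
        skipSpaces();
        if (pos_ < text_.size() &&
            std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
            return true;
        }
        return false;
    }

    void skipSpaces() {
        while (pos_ < text_.size() && text_[pos_] == ' ') ++pos_;
    }

    std::string text_;
    std::size_t pos_;
};
```

Each function returns true or false for its part of the input, as described above; a call such as Recognizer("5 + 3 - 1").parseS() reports whether the whole input is valid.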

Bottom-up parsers are typically done as shift-reduce parsers. In a shift-reduce

parser, the logic running the parser is static. The parser is driven by a table, and the

values in that table (and its size) are the items that change from language to language.

Generating the tables for a language is usually non-trivial, however, so the most common way to create a shift-reduce parser is to use a compiler compiler, such as yacc.

A compiler compiler takes as input a file containing the grammar to be recognized, and produces as output C (or other high-level language) code that accurately parses and recognizes the grammar given. Additionally, a compiler compiler can point out potential ambiguities in a grammar.

For this compiler, bison, an open source replacement for yacc, was used for the initial prototype. However, the difficulty of embedding the semantic actions in the grammar in the input file for bison caused abandonment of that method. Instead, a recursive-descent parser was constructed, which was facilitated by the simplicity of the grammar subset that was decided upon for this project.

Semantic Analysis

The third step in compilation is semantic analysis. Semantic analysis in a one-pass compiler is nearly always done during syntax analysis. One of the primary purposes for semantic analysis is type-checking. However, since SequenceL is not a typed language, this portion of the analysis was not required. Assuming a meaningful definition for an operation between types, there is no reason that any type could not be mixed with any other type in the resulting code. In this implementation, the only types that the compiler sees are integers, and those are used for specifying the dimensionality of the sequences. Since the dimensionality of a sequence is always a list of integers, there is no specificity in this implementation. The run-time support files generated by the compiler assume that input data will be integers, too, but that is easily changed; and with the proper overloading of operators, could be made to be completely type transparent.

During semantic analysis, information is pushed onto the SAS for retrieval during intermediate code generation. There are several places in the grammar in which this occurs. The augmented grammar in Figure 3.8 contains the semantic actions.

S -> [ T ] print_res
T -> O ( [ L gen_term ] ) perform_op T1
T1 -> ; L
T1 -> ε
L -> T
L -> int push_int L1
L1 -> , L
L1 -> ; gen_term L
L1 -> ε
O -> + push_op
O -> - push_op
O -> * push_op
O -> / push_op

Figure 3.8. Grammar with Embedded Semantic Actions.

The bolded items represent the semantic actions. The functions push_int and

push_op are functions that implement purely semantic actions. They push the integer just

found or the operation just found, respectively, onto the SAS. The functions print_res,

gen_term, and perform_op represent combination semantic action/intermediate code

generation actions. Normal semantic analysis procedures and practices do not apply

directly to a compiler for SequenceL, as the language implies the algorithms, rather than

the language specifying the algorithms. This can lead to unwieldy results from the

semantic action methods used in traditional compiler construction. Therefore, a new

approach to semantic analysis was also needed.

Intermediate Code Generation

The fourth step of compilation is the generation of intermediate code. This is also done while syntax checking. This was the majority of the work in this project. Several months were spent developing a data structure to support the sequences during compilation. As the development of the data structure progressed, however, it was observed that the compiler does not need to maintain a complex structure at all. The complex data structure is needed at run-time, however, so the effort was not wasted. As the conclusion of a statement is processed, the information from that statement is removed from the stack, and the generation of intermediate code takes place.

The function print_res checks the semantic action stack for data structures that have not been created yet (as would happen if the code [+([2,2]); 2,2] was found). If it finds some, it calls gen_term to create them. Then it generates the code to print all data structures on the stack.

The function gen_term pops the integers off the SAS into another stack. It then pops them off that stack (this restores the original ordering) for code generation. It then generates the code to create the run-time structure and populate it. The population of the run-time structure is done by generating reads from (the run-time) standard input.
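The two-stack reordering performed by gen_term can be sketched as follows. This is a simplified illustration, not the compiler's code: the helper name popInSourceOrder is hypothetical, and the sketch assumes the stack holds only the integers of a single term.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical helper: drains the integers on a stack and returns them in
// the order in which they were originally pushed (source order).
std::vector<long> popInSourceOrder(std::stack<long> &sas) {
    std::stack<long> reversed;
    while (!sas.empty()) {               // first pass: order is reversed
        reversed.push(sas.top());
        sas.pop();
    }
    std::vector<long> dims;
    while (!reversed.empty()) {          // second pass: order is restored
        dims.push_back(reversed.top());
        reversed.pop();
    }
    return dims;
}
```

For the dimension list 3,2,4 the integers are pushed as 3, 2, 4, popped as 4, 2, 3 into the second stack, and popped from that stack as 3, 2, 4 again.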

The semantic action perform_op performs most of the significant activity of the compiler. It first checks to see if there are any structures that need to be created. If there are, it calls gen_term to create them. Then, it pops off all data structures from the SAS and places them into a double-ended queue. Since SequenceL's operators are right-associative, not left as is the norm, the double-ended queue allows the utilization of the

data structures in the reverse order from their order in the source code. This is the same order in which they are found on the SAS. However, the operator is as yet unknown, as it is buried on the SAS beneath the operands, so the operands must be removed and their ordering kept while the operator is retrieved. Once the operator is retrieved, code can then be generated for the operation in a right-to-left manner. The code for the operation is contained in the run-time support files that are generated by the compiler (see Appendix C). This code includes the normalization routine that is necessary for the operation.
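The right-to-left ordering can be sketched with a double-ended queue as follows. The function emitRightToLeft is a hypothetical illustration, not the compiler's perform_op: it assumes the operands are named data objects already on the stack in source order, and that the operator (here passed in as a method name) has already been retrieved.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical sketch: 'sas' holds operand names in source order (last
// pushed = right-most operand). Returns the statements that apply 'method'
// from right to left, in the style of the compiler output.
std::vector<std::string> emitRightToLeft(std::vector<std::string> sas,
                                         const std::string &method) {
    std::deque<std::string> operands;
    while (!sas.empty()) {               // popping yields reverse source order
        operands.push_back(sas.back());
        sas.pop_back();
    }
    std::vector<std::string> code;
    while (operands.size() > 1) {
        const std::string right = operands.front();  // right-most remaining
        operands.pop_front();
        const std::string &left = operands.front();  // its left neighbour
        code.push_back(left + "->" + method + "( *" + right + " );");
        code.push_back("delete " + right + ";");
    }
    return code;
}
```

With the operands data0, data3, data5 and the method multiply, the sketch yields data3->multiply( *data5 ) followed by data0->multiply( *data3 ), which is the order of the last two multiplications in Figure 6.3.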

Additionally, there is code to handle the special case of only one operand for the

operation. In common mathematics, the expression *5 makes no sense. But in

SequenceL, this operation makes sense, because the operand is a sequence, not just a

single value. For completeness, though, SequenceL defines the operation on a single

value as the value itself. Since these are valid operations in SequenceL, they must be

handled by the compiler. The code to handle this is also included in the run-time support files. The operand, if it is not a single value (singleton), is broken up into multiple structures along the first non-unit dimension. The operation is then performed on those operands (from right to left). If the operand is a singleton, it is returned as the result.
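The single-operand case can be sketched as a right-to-left fold over the slices of the operand. The sketch below is an assumption-laden illustration rather than the code of the run-time support files: the type Seq and the function reduceSingleOperand are hypothetical names, and the element list is assumed to be stored so that each slice along the first dimension is contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the run-time object: dims plus a flat list.
struct Seq {
    std::vector<int> dims;
    std::vector<int> elements;
};

// Splits 's' along its first non-unit dimension and folds the slices from
// right to left with 'op'. A singleton is returned unchanged.
Seq reduceSingleOperand(const Seq &s, const std::function<int(int, int)> &op) {
    std::size_t d = 0;
    while (d < s.dims.size() && s.dims[d] == 1) ++d;   // skip unit dimensions
    if (d == s.dims.size()) return s;                  // singleton

    assert(s.dims[d] > 1 && !s.elements.empty());
    const std::size_t count = static_cast<std::size_t>(s.dims[d]);
    const std::size_t sliceSize = s.elements.size() / count;
    const std::ptrdiff_t tail = static_cast<std::ptrdiff_t>(sliceSize);

    Seq result;
    result.dims.assign(s.dims.begin() + static_cast<std::ptrdiff_t>(d) + 1,
                       s.dims.end());
    // Start with the right-most slice.
    result.elements.assign(s.elements.end() - tail, s.elements.end());

    // Combine with each remaining slice, moving toward the left.
    for (std::size_t k = count - 1; k-- > 0;) {
        for (std::size_t i = 0; i < sliceSize; ++i) {
            result.elements[i] =
                op(s.elements[k * sliceSize + i], result.elements[i]);
        }
    }
    return result;
}
```

For a structure with dimensions 4,2,2 this produces a 2,2 structure whose elements combine the corresponding elements of the four slices. For a non-commutative operator such as subtraction, the right-to-left order matters: the list 5, 3, 1 gives 5 - (3 - 1).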

Optimization and Code Generation

Since the output of this compiler is C++, C++ is used as the intermediate code product, and hence the next two steps, optimization and code generation, are left to the C++ compiler. It is an assertion that since C++ can be compiled to machine language, being able to produce C++ code from SequenceL code is sufficient to demonstrate that

SequenceL can be compiled. There has been some research into the use of Java as an intermediate language [10]. A review of the literature, however, shows that the reasons given in support of this, namely the portability of Java, the power of the language, and its automated garbage collection, were not strong enough to outweigh the reasons for using C++. The C++ code generated by this compiler is compliant with the new C++ standard, so portability is not an issue. C++ is arguably more powerful than Java and most implementations are much faster, so that removes that argument. Garbage collection is addressed by the code produced by this compiler. Arguably, explicit garbage collection is more efficient in terms of CPU overhead. If done properly, it is also more efficient in terms of memory usage, as the code to implement the garbage collection does not have to maintain the extra information on usage of the program's objects.

CHAPTER IV

DESIGN AND CONSTRUCTION OF THIS COMPILER

Design of the Symbol Table

SequenceL is such a different language from traditional programming languages

that design of the compiler involved some unusual choices. One task that is normally addressed during compiler construction is the design of the symbol table, and the definition of its necessary contents. Initial attempts at this compiler contained a symbol table, and determining the contents of that symbol table involved designing the data structure to support the SequenceL sequence (see Design of the Run-time Data Structure, below) in order to know what information was needed at a minimum to support the data structure. Further refinement of the compiler showed, however, that all live data is contained on the stack, so there is no need to include a symbol table in the compiler. All live data is contained on the stack because SequenceL has no assignment operation. Therefore it has no identifiers per se.

Design of the Semantic Action Stack

The design of that stack was another issue to be addressed. In compilers, it is

quite common for the SAS to be a heterogeneous stack. Initial versions of the SAS in this

compiler had to contain dimensionality information, data structures, and sentinels for left

brackets, commas and semicolons. To help distinguish the type of data each element of the stack represented, it was decided to have each stack entry contain a type field and a value field. The type field contains such items as INTNUM, DATA, COMMA, SEMICOLON,

LBRACKET, and the operators, PLUS, MINUS, TIMES, DIVIDE (each represented by an integer). The value field would contain the value of the item; for example, the value of the integer would be stored in the value field of an INTNUM entry, and a counter would be stored in the value field of a DATA entry. Of course, the value field had no meaning for the sentinel entries.

During refinement of the compiler, however, it was discovered that there was no need for the sentinels; the context of the current operation and the type of the data on the stack included all the necessary information for the current activity. Therefore, the COMMA, SEMICOLON, and LBRACKET entry types were removed. There was still the need to distinguish between an INTNUM entry, a DATA entry, and the operators, however, so the design of the SAS did not change as a result of removing the sentinel entries.
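A stack entry with a type field and a value field might be declared along the following lines. The entry type names INTNUM, DATA, PLUS, MINUS, TIMES and DIVIDE come from the text; everything else (the struct and class names, the member functions, and the numeric values of the enumeration) is an assumption made for this sketch and is not the contents of mystack.h.

```cpp
#include <cassert>
#include <vector>

// Entry kinds named in the text; the numeric values are arbitrary here.
enum EntryType { INTNUM, DATA, PLUS, MINUS, TIMES, DIVIDE };

// One heterogeneous stack entry: a type tag plus a value.
struct StackEntry {
    EntryType type;
    long value;   // the integer for INTNUM, a counter for DATA,
                  // unused for the operator entries
};

// Hypothetical minimal semantic action stack.
class SemanticActionStack {
public:
    void push(long value, EntryType type) {
        entries_.push_back(StackEntry{type, value});
    }
    StackEntry pop() {
        assert(!entries_.empty());
        StackEntry top = entries_.back();
        entries_.pop_back();
        return top;
    }
    bool empty() const { return entries_.empty(); }

private:
    std::vector<StackEntry> entries_;
};

// True for the four operator entry types.
bool isOperator(EntryType type) {
    return type == PLUS || type == MINUS || type == TIMES || type == DIVIDE;
}
```

Because the type tag travels with each entry, a routine popping the stack can tell an operand from the operator beneath it without any sentinel entries.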

Design of the Run-time Data Structure

The most significant amount of time in the development of this compiler was spent designing the data structure to support the SequenceL sequence in a manner that was easily used during run-time.

One of the first designs was to simply duplicate the conceptual representation in code. The conceptual representation of a sequence is a list of items, each of which could possibly be a list. In pursuing this representation, however, it was discovered that the vagaries of manipulating the levels of nesting caused more effort to be expended than it was deemed worth.

The next design was a structure that contained an integer representing the level of nesting, a list of integers representing the size of each level of nesting, and a list of items representing the actual data contained in the sequence. The pursuit of this design revealed potential problems with keeping the integer representing the level of nesting consistent with the actual size of the list of sizes. Since the list structure used easily allows the size to be retrieved, the integer representing the level of nesting was removed from the data structure. The list of integers representing the size of each level of nesting, however, could not be removed, as the actual data is stored as a single list, so the dimension information would be lost if it was not retained. The action of normalizing two structures during a mathematical operation necessitated keeping the nesting information. An example of a SequenceL data structure and its run-time representation are shown in Figure 4.1 and Figure 4.2, respectively.

1 2
3 4

5 6
7 8

9 0
1 2

Figure 4.1. SequenceL Example Data Structure.

dims:     3 2 2
elements: 1 2 3 4 5 6 7 8 9 0 1 2

Figure 4.2. Run-time Representation of Figure 4.1.
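The representation in Figure 4.2 can be sketched as a pair of vectors. The attribute names dims and elements follow the figure; the struct name Sequence and the helper functions are hypothetical and are not taken from MyDataType.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the run-time representation: a dimension list plus a flat
// element list.
struct Sequence {
    std::vector<int> dims;
    std::vector<int> elements;
};

// Number of elements implied by the dimension list (product of the sizes).
std::size_t impliedSize(const Sequence &s) {
    std::size_t n = 1;
    for (int d : s.dims) {
        n *= static_cast<std::size_t>(d);
    }
    return n;
}

// The two lists agree when the element count matches the dimensions.
bool isConsistent(const Sequence &s) {
    return impliedSize(s) == s.elements.size();
}
```

The level of nesting does not need to be stored separately, since it is simply dims.size(); the only invariant left to maintain is that the product of the dimensions matches the number of stored elements.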

Once the data structure was designed, it was decided that the cleanest implementation was as a C++ object. The programmer cannot be expected to create the

code to support the object. He or she should be able to use the compiler and the resulting program without knowing the definition (or even the existence) of the object. It was therefore decided that the compiler would have to be able to generate the support files. Since those files do not change, it was further decided to have the compiler create them optionally, as they may already exist.

The object is implemented in the code contained in MyDataType.cpp (see Appendix C), and is supported by the header file, MyDataType.h. That code is static: no matter what program the user enters for the compiler to handle, those files do not change. The only code that changes is the actual output file generated by the compiler (for an example, see Figure 6.3). The name of this file can be specified via a command-line switch to the compiler, and the default for the filename, if none is specified, is output.cpp.

The dimension information for the structure is stored in the dims attribute of the object. This attribute is an instantiation of the standard C++ vector. The C++ vector class is an abstract data structure that can be used in place of an array. One difference between a vector and an array, however, is that a vector automatically resizes itself to accommodate more elements, while the size of an array is static. This allows the number of dimensions of the structure to be limited only by the memory of the system upon which the code will be run. The elements of the structure are also stored in a C++ vector. Again, this allows the structure to be as large as will fit in the memory of the destination machine. Since all data for the structure is stored in a linear list, the actual number of dimensions is irrelevant to the C++ object. When that information is needed (for

example, when normalizing), the original structure is re-created virtually using the

dimension information stored inside the object.
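One way to re-create the structure virtually is to compute, from the dimension list, where a given coordinate falls in the linear element list. The sketch below assumes a row-major layout (the last dimension varies fastest); that layout and the function name flatOffset are assumptions for illustration, not details confirmed by the support files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of a coordinate list within the flat element list.
// Assumes one coordinate per dimension, each within its dimension's range.
std::size_t flatOffset(const std::vector<int> &dims,
                       const std::vector<int> &coords) {
    assert(dims.size() == coords.size());
    std::size_t offset = 0;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        assert(coords[i] >= 0 && coords[i] < dims[i]);
        offset = offset * static_cast<std::size_t>(dims[i]) +
                 static_cast<std::size_t>(coords[i]);
    }
    return offset;
}
```

Under this assumption, with dimensions 3,2,2 the coordinate (1,0,1) maps to offset 5 in the element list, so no nested storage is needed to answer questions about the conceptual structure.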

CHAPTER V

RESULTS

The Compiler

This project has several tangible results. The most obvious is the compiler itself.

The compiler can take a regular operation from SequenceL and create C++ code that can then be compiled to execute the original code as specified in the SequenceL language documentation. Since the compilation of regular constructs represents a significant amount of the work involved in compilation of the complete language, this project represents a very significant step towards the implementation of a compiler for the complete language. Much, if not all, of the work in the implementation of the remaining constructs of the language can be extended from this work. For example, irregular construct processing is simply selective application of the same techniques used in regular construct processing. The processing of generative constructs is straightforward; and the processing of eventive constructs can be handled easily, once the nature of the event is defined.

The Run-time Data Structure

Another result, which is not so obvious, is the run-time data structure. Just as the compilation of regular constructs represents the majority of the work in a complete compiler, the run-time data structure represents the majority of the work in a complete implementation via C++ objects. This could be used, as it is now, in support files for the output from a compiler. It could also be used for the implementation of a C++

infrastructure supporting the SequenceL development "paradigm." In this capacity, it

may be possible to leverage the power and flexibility of C++ with the ease of

implementation of SequenceL.

Language Insights

Investigating compilation has given insights into the language. One insight was discussed in the modification of the grammar for compilation (see Chapter II, Grammar Modification). A second insight is the additional parallelisms that have been found (see Chapter VI, Parallelisms). Other insights are anticipated.

CHAPTER VI

DISCUSSION AND FURTHER RESEARCH

The Run-time Data Structure

During the design and implementation of this project, the decision was made to

produce a fairly complex run-time data structure. All information to implement that data

structure, however, is known at compile time. It should therefore be possible to

incorporate the data structure construction into the compiler itself, so that the resulting

output from the compiler uses more fundamental constructs and operations. This would

more easily facilitate the inevitable step of compiling to native machine code.

Presentation of Results

Currently, the run-time print routine displays the results in a linear format. Since the conceptual representation of the structure is potentially a multi-dimensional list, the print routine could be modified to better present the conceptual layout of the data. Having only a two-dimensional surface on which to display those results, however, will prove a challenge for structures of more than three dimensions.

Parallelisms

This compiler implementation was created without regard to the inherent parallelisms of SequenceL. A truer compiler implementation would eliminate the necessity of the following analysis, but the following analysis shows how even this implementation can be utilized to produce parallel code.

There are no loops, no conditionals, and no branches in the code produced by this compiler, other than inside the object's methods. Inside the object's methods, loops are used to iterate through the lists, as demonstrated by the code in Figure 6.1. Since the size of the list is statically known (at compile-time), those loops could easily be unrolled to produce straight-line code. This would allow for parallel implementation of the operations of the object's methods. The remainder of the code, being straight-line (for an example, see Figure 6.3), yields trivial parallelization. Data dependencies are manifest; therefore nested parallelisms are trivially realized, as well.

void CMyDataType::add( CMyDataType &data2 )
{
    int index;
    int accum = 1;

    normalize( data2 );
    data2.normalize( *this );
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    for ( index = 0; index < accum; ++index )
        elements[index] += data2.elements[index];
}

Figure 6.1. Add Method Code.
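The unrolling mentioned above can be sketched as a small generator that, given the statically known dimensions, emits one straight-line statement per element in place of the second loop of Figure 6.1. The function name unrollAdd is hypothetical, and the sketch assumes both operands have already been normalized to the same dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generator: returns the unrolled form of the element-wise
// addition loop for a structure with the given (known) dimensions.
std::vector<std::string> unrollAdd(const std::vector<int> &dims) {
    int accum = 1;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        accum *= dims[i];                // total number of elements
    }
    std::vector<std::string> lines;
    for (int index = 0; index < accum; ++index) {
        const std::string n = std::to_string(index);
        lines.push_back("elements[" + n + "] += data2.elements[" + n + "];");
    }
    return lines;
}
```

Each emitted statement touches a different element, so no statement depends on any other; that independence is what would make the unrolled form straightforward to distribute.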

One method for data dependency analysis is utilizing a directed acyclic graph (DAG). As an example of the ease with which the output of this compiler can be parallelized, take the program in Figure 6.2. The code produced by the compiler is shown in Figure 6.3. A DAG of data dependencies is shown in Figure 6.4. This example is not pedantic, either.

[*([+([+([3,2;3,2,4]);4,5,2,2]);*([3;3,2]);/([3,4;3,4,2])])]

Figure 6.2. Involved SequenceL Example for Parallelization.

#include <iostream>
#include <vector>
#include "MyDataType.h"

int main()
{
    vector<int> one;

    one.clear();
    one.push_back( 3 );
    one.push_back( 2 );
    cout << "New data0." << endl;
    CMyDataType *data0 = new CMyDataType( one );

    one.clear();
    one.push_back( 3 );
    one.push_back( 2 );
    one.push_back( 4 );
    cout << "New data1." << endl;
    CMyDataType *data1 = new CMyDataType( one );

    data0->add( *data1 );
    delete data1;

    one.clear();
    one.push_back( 4 );
    one.push_back( 5 );
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New data2." << endl;
    CMyDataType *data2 = new CMyDataType( one );

    data0->add( *data2 );
    delete data2;

    one.clear();
    one.push_back( 3 );
    cout << "New data3." << endl;
    CMyDataType *data3 = new CMyDataType( one );

    one.clear();
    one.push_back( 3 );

Figure 6.3. Compiler Output from Code in Figure 6.2.

    one.push_back( 2 );
    cout << "New data4." << endl;
    CMyDataType *data4 = new CMyDataType( one );

    data3->multiply( *data4 );
    delete data4;

    one.clear();
    one.push_back( 3 );
    one.push_back( 4 );
    cout << "New data5." << endl;
    CMyDataType *data5 = new CMyDataType( one );

    one.clear();
    one.push_back( 3 );
    one.push_back( 4 );
    one.push_back( 2 );
    cout << "New data6." << endl;
    CMyDataType *data6 = new CMyDataType( one );

    data5->divide( *data6 );
    delete data6;

    data3->multiply( *data5 );
    delete data5;

    data0->multiply( *data3 );
    delete data3;

    data0->print();

    return 0;
}

Figure 6.3. Continued, (b) Compiler Output from Code in Figure 6.2.

Figure 6.4. Data Dependency Directed Acyclic Graph for code in Figure 6.3.

At each level breadth-wise, every operation is independent of the others, so they

can be done in parallel. In addition, at each node save the leaf nodes, there are

opportunities for parallelism within the object, since the mathematical operation is the

potentially parallel operation on n individual elements (where n is the number of

elements in the sequence). Each of those n operations is independent of the others, as

they share no common data. It would be easy to construct this DAG automatically inside

the compiler, so the compiler could be extended to produce explicitly parallel code,

eliminating the need to have a programmer manually perform the parallelization.

However, because the compiler can produce straight-line code, explicit construction of a

DAG is not necessary.
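The level-by-level view of the DAG can be sketched as follows. Each node is given a level one greater than the deepest node it depends on; nodes that share a level do not depend on one another and could therefore be executed in parallel. The function name dagLevels and the numbering convention (every dependency has a smaller index than the node that uses it) are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// deps[i] lists the nodes whose results node i needs. Assumes the nodes are
// numbered so that every dependency has a smaller index than its user.
std::vector<int> dagLevels(const std::vector<std::vector<std::size_t>> &deps) {
    std::vector<int> level(deps.size(), 0);   // leaves stay at level 0
    for (std::size_t node = 0; node < deps.size(); ++node) {
        for (std::size_t dep : deps[node]) {
            assert(dep < node);
            level[node] = std::max(level[node], level[dep] + 1);
        }
    }
    return level;
}
```

Applied to the operations of Figure 6.3, with the seven data structures as leaves, the first add, the first multiply and the divide all land on level 1, the second add and second multiply on level 2, and the final multiply on level 3.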

An example of an unrealized parallelism is the SequenceL code in Figure 6.5. After incorporating the run-time data structure into the compiler, the output would look something like the code in Figure 6.6. The DAG for the equation in Figure 6.5 is in Figure 6.7.

[+([4,2,2])]

Figure 6.5. Single Sequence Mathematical Operation.

#include <iostream>
#include <vector>
#include "MyDataType.h"

int main()
{
    vector<int> one;

    one.clear();
    one.push_back( 4 );
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New data0." << endl;
    CMyDataType *data0 = new CMyDataType( one );

    one.clear();
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New new0." << endl;
    CMyDataType *new0 = new CMyDataType( data0(2) );

    one.clear();
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New new1." << endl;
    CMyDataType *new1 = new CMyDataType( data0(3) );

    new0->add( *new1 );
    *new1 = *new0;
    delete new0;

    one.clear();
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New new0." << endl;
    CMyDataType *new0 = new CMyDataType( data0(1) );

    new0->add( *new1 );
    *new1 = *new0;
    delete new0;

Figure 6.6. Compiler Output for Equation in Figure 6.5.

    one.clear();
    one.push_back( 2 );
    one.push_back( 2 );
    cout << "New new0." << endl;
    CMyDataType *new0 = new CMyDataType( data0(0) );

    new0->add( *new1 );
    *new1 = *new0;
    delete new0;

    *data0 = *new1;
    delete new1;
    data0->print();
    return 0;
}

Figure 6.6. Continued (b). Compiler Output for Equation in Figure 6.5.

[DAG diagram; surviving node labels: new0, data0.]

Figure 6.7. DAG for Equation in Figure 6.5.

As can be seen, there is no parallelism revealed with the current implementation. However, this example can be parallelized utilizing the tree addition method described in Chapter I, section Work on Parallelization of SequenceL Code. After splitting up data0 into the four structures, new0, new1, new2, and new3, new0 and new1 could be added in parallel with the addition of new2 and new3, and then those two intermediate sums added to form the overall result. As this kind of organization is operator specific, this

parallelism is an optimization, not an inherent parallelism. As compilation is investigated, similar additional opportunities will be revealed.
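The tree addition idea can be sketched as a pairwise reduction. The function treeReduce below is a hypothetical illustration on plain integers rather than on run-time structures; it combines neighbouring values level by level, and it is only valid for associative operators such as addition or multiplication, which is why this form of parallelism is operator specific.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Combines the values pairwise, level by level. Every pair within one level
// is independent of the others, so each level could run in parallel.
// If 'levels' is non-null, it receives the number of levels used.
int treeReduce(std::vector<int> values,
               const std::function<int(int, int)> &op,
               int *levels = nullptr) {
    assert(!values.empty());
    int depth = 0;
    while (values.size() > 1) {
        std::vector<int> next;
        for (std::size_t i = 0; i + 1 < values.size(); i += 2) {
            next.push_back(op(values[i], values[i + 1]));
        }
        if (values.size() % 2 == 1) {
            next.push_back(values.back());   // odd one out moves up a level
        }
        values.swap(next);
        ++depth;
    }
    if (levels != nullptr) {
        *levels = depth;
    }
    return values.front();
}
```

Four operands are combined in two levels instead of three sequential additions: the first level forms two independent partial results, and the second level combines them.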

REFERENCES

[1] Aho, A., Sethi, R., Ullman, J. Compilers: Principles, Techniques, and Tools. Reading, MA: Addison-Wesley, 1986.

[2] Blelloch, G. "Programming Parallel Algorithms," Communications of the ACM 1996; 39(3):85-97.

[3] Cooke, D. "An Introduction to SequenceL: A Language to Experiment with Constructs for Processing Nonscalars," Software Practice and Experience 1996; 26(11):1205-1246.

[4] Cooke, D. "SequenceL Provides a Different Way to View Programming," Computer Languages 1998; 24:1-32.

[5] Cooke, D., Andersen, P. "Automatic Parallel Control Structures in SequenceL," Software Practice and Experience 2000; 30:1541-1570.

[6] Cooke, D. 2000. Personal Communication. Department of Computer Science, Texas Tech University, Lubbock, TX.

[7] Cooke, D. A Concise Introduction to Computer Languages: Design, Experimentation, and Paradigms. Pacific Grove, CA: Brooks-Cole, to be published.

[8] Foster, I., Kesselman, C., editors. The Grid: Blueprint for a New Computing Infrastructure. San Francisco, CA: Morgan Kaufmann, 1999.

[9] Foster, J. "A Syntax Improving Program," Computer Journal 1968; 11(1):31-34.

[10] Hardwick, J., Sipelstein, J. Java as an Intermediate Language. Technical Report CMU-CS-96-161, School of Computer Science, Carnegie Mellon University, August 1996.

[11] Lewis, P., Rosenkrantz, D., Stearns, R. Compiler Design Theory. Reading, MA: Addison-Wesley, 1976.

APPENDIX A

SEQUENCEL GRAMMAR

C ::= [Integer] | [Real] | [Character] | [String]

S ::= C | [ C 1 | [ B T

V ::= IDENTIFIER | FUNCTION_REFERENCE

0 ::= asc | desc | abs | sqrt | cos | sin | tan | log | ^ | + | - | * | / | div | mod | reverse |

size | transpose | rotate_right | rotate_left | compose | cartesian_product

M ::= *,M | T,M | * | T

T ::= S | [T*] | V | pred(T) | succ(T) | 0(T) | V(map(M)) | T(M) | gen([T,...,T]) |

gen([T,...,T,...,T])

O ::= < | <= | = | >= | > | <> | in | integer | real | var | operator

R ::= 0(T) | and(R+) | or(R+) | not(R)

B ::= T+ | T+ when R else B

F ::= V(Consume(successor(V+),PRODUCE(next)) where next = {B} |

V(Consume(successor(V+),PRODUCE(next)) where next = {B} taking

[V^] from T

E ::= F(E)E | SE | F(E) | S

U ::= { E^ }

+, as a superscript, indicates the element is repeated one-to-many times. When elements are repeated more than once, the repeated elements are separated by commas.

APPENDIX B

COMPILER SOURCE CODE

Seql.cpp. The Main Code File

//******************************************************************
// Program seql.cpp
//
// Joseph Pizzi
// Computer Science
// Texas Tech University
//
// December 22, 2000
//
// This program implements a compiler for regular operations
// (plus, minus, times and divide) of SequenceL.
//
// The usage of this program is:
//    seql [-h] [-o output] [-d debug] [-i input]
//
// -h is to generate the header files (only needed if not present)
// -o is to specify the name of the output file, default is output.cpp
// -d is to specify the name of the debug file, default is seql.debug.txt
// -i is to specify the name of the input file, default is input.seql
//
// This compiler uses the parser generated by flex, a lex replacement.
// The parser will parse the entire vocabulary of SequenceL.
// The remainder of this program is a recursive descent compiler, which
// generates C++ code instead of machine language.
//
//******************************************************************

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <deque>

using namespace std;

#include "mystack.h"
#include "seql.yy.c"
//#include "seql.h"

// default debug filename
#define MY_DEBUG_FILE "seql.debug.txt"

// default output filename
#define MY_OUTPUT_FILE "output.cpp"

// default input filename
#define MY_INPUT_FILE "input.seql"

//******************************************************************
// Function: stackpush
// Inputs: the value of the item to push, if applicable
//         an integer representing the type of the item to push
// Output: none
//
// This function pushes the specified item onto the global stack
// and generates the debug code telling what it pushed
//******************************************************************
void stackpush( long item, int type );

//******************************************************************
// Function: gettype
// Input: an integer representing the type
// Output: a string containing the plain-text value of the type
//
// This function decodes the integer value to the actual type it
// represents, for output in debug mode.
//******************************************************************
char *gettype( int type );

//******************************************************************
// Function: performoper
// Input: none
// Output: none
//
// This function manipulates the global stack. It gets the
// operands off the stack, places them into a double-ended
// queue, then gets the operation off the stack. It then
// generates the code for the run-time computation requested
// on the operators given.


//****************************************************************

void performoper( void);

//****************************************************************
// Function: printresult
// Input: none
// Output: none
//
// This function manipulates the global stack.  It pops off all
// operators from the stack, and generates the code to print
// them at run-time.
//****************************************************************

void printresult( void);

//****************************************************************
// Function: genterm
// Input: none
// Output: none
//
// This function manipulates the global stack.  It pops off all
// integers from the stack, and generates the code to build the
// objects and read the data at run-time.

void genterm( void);

//****************************************************************
// Function: yyparse
// Input: none
// Output: flag indicating EOF
//
// This is the parser generated by flex.
//****************************************************************

int yyparse( void);

//****************************************************************
// Function: printerror
// Input: character string representing the expected token(s)
// Output: none
//
// This function prints error messages indicating the expected
// tokens at the point of error.  It then increments the global
// error count.
//****************************************************************

void printerror( char *message );


//****************************************************************
// Function: code_gen
// Input: none
// Output: none
//
// This function finishes up the generated file by placing the
// return statement and closing main() in the generated code.
//****************************************************************

void code_gen( void);

//****************************************************************
// Function: gen_headers
// Input: none
// Output: none
//
// This function generates the headers and code for
// the run-time objects.
//****************************************************************

int gen_headers( void);

//****************************************************************
// Function: parse_prep
// Input: none
// Output: none
//
// This function generates the preamble to the main output
// C++ code.

void parse_prep( void);

//****************************************************************
// Function: usage
// Input: none
// Output: none
//
// This function prints the syntax and options for the program.
//****************************************************************

void usage( void);

//****************************************************************
// Function: S
// Input: none
// Output: true for success, false for error


//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     S -> [ T ]
// Upon finding the complete sentence, it then calls printresult
//****************************************************************

int S( void);

//****************************************************************
// Function: T
// Input: none
// Output: true for success, false for error
//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     T -> theta ( [ L ] ) Tl
// Upon getting a successful return from L, it calls genterm.
// Upon finding the right parenthesis, it calls performoper.
//****************************************************************

int T( void);

// Function: Tl
// Input: none
// Output: true for success, false for error
//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     Tl -> ; L
//        -> /* empty */
//****************************************************************

int Tl( void);

//****************************************************************
// Function: L
// Input: none
// Output: true for success, false for error
//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     L -> T
//       -> <int> Ll
// Upon finding an integer, it pushes it on the stack
//****************************************************************


int L( void);

//****************************************************************
// Function: Ll
// Input: none
// Output: true for success, false for error
//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     Ll -> , L
//        -> ; L
//        -> /* empty */
// Upon finding a semicolon, it calls genterm
//****************************************************************

int Ll( void);

//****************************************************************
// Function: theta
// Input: none
// Output: true for success, false for error
//
// This function manipulates the global stack.  It checks for
// the portion of the grammar:
//     theta -> +
//           -> -
//           -> *
//           -> /
// Upon finding an operation, it pushes it on the stack
//****************************************************************

int theta( void );

extern int yylex();

FILE *myerror, *myformula; FILE *myoutput; int errors; int DataCount;

CMyStackType mystack, dimstack;

int main( int argc, char *argv[] )
{
    char *debug = NULL;
    char *input = NULL;
    char *output = NULL;
    int i = 1;
    bool runtime = false;
    DataCount = 0;

    while ( i + 1 < argc )
    {
        // Check for the -o option
        if ( strcmp( argv[i], "-o" ) == 0 )
        {
            output = argv[++i];
            printf( "output file: %s\n", output );
        }
        // Check for the -d option
        else if ( strcmp( argv[i], "-d" ) == 0 )
        {
            debug = argv[++i];
            printf( "debug file: %s\n", debug );
        }
        // Check for the -i option
        else if ( strcmp( argv[i], "-i" ) == 0 )
        {
            input = argv[++i];
            printf( "input file: %s\n", input );
        }
        // Check for the -h option
        else if ( strcmp( argv[i], "-h" ) == 0 )
        {
            runtime = true;
        }
        // Must have found an invalid option
        else
        {
            usage();
            return EXIT_FAILURE;
        }
        ++i;
    }

    // if we were not given a debug filename, use the default
    if ( debug == NULL )
    {
        debug = (char *) malloc( strlen( MY_DEBUG_FILE ) + 1 );
        strcpy( debug, MY_DEBUG_FILE );
        fprintf( stderr, "default debug file: %s\n", debug );
    }

    // if we were not given an input filename, use the default
    if ( input == NULL )
    {
        input = (char *) malloc( strlen( MY_INPUT_FILE ) + 1 );
        strcpy( input, MY_INPUT_FILE );
        fprintf( stderr, "default input file: %s\n", input );
    }

    // if we were not given an output filename, use the default
    if ( output == NULL )
    {
        output = (char *) malloc( strlen( MY_OUTPUT_FILE ) + 1 );
        strcpy( output, MY_OUTPUT_FILE );
        fprintf( stderr, "default output file: %s\n", output );
    }

    // If asked for, generate the run-time support files
    if ( runtime )
    {
        printf( "Generating run-time support files.\n" );
        if ( gen_headers() == EXIT_FAILURE )
            return EXIT_FAILURE;
    }

    // open the input file as stdin
    if ( ( myformula = freopen( input, "r", stdin ) ) == 0 )
    {
        fprintf( stderr, "seql: could not open %s for input", input );
        return EXIT_FAILURE;
    }

    // open the output file as stdout
    if ( ( myoutput = freopen( output, "w", stdout ) ) == 0 )
    {
        fprintf( stderr, "seql: could not open %s for output", output );
        return EXIT_FAILURE;
    }

    // open the debug file as stderr
    if ( ( myerror = freopen( debug, "w", stderr ) ) == 0 )
    {
        fprintf( stderr, "seql: could not open %s for output", debug );
        return EXIT_FAILURE;
    }

    errors = 0;
    parse_prep();

    S();

    if ( errors > 0 )
        return EXIT_FAILURE;

    code_gen();

    return EXIT_SUCCESS;
}

void usage( void )
{
    fprintf( stderr, "Usage: seql [-h] [-o output] [-i input] [-d debug]\n" );
    fprintf( stderr, "\n" );
    fprintf( stderr, "Where\n" );
    fprintf( stderr, " -h Generate run-time support files\n" );
    fprintf( stderr, " -o output write <output> instead of %s\n", MY_OUTPUT_FILE );
    fprintf( stderr, " -d debug write <debug> instead of %s\n", MY_DEBUG_FILE );
    fprintf( stderr, " -i input use <input> instead of %s\n", MY_INPUT_FILE );
}

char *gettype( int type )
{
    char *name;
    name = (char *) malloc( 30 * sizeof( char ) );
    switch ( type )
    {
        case INTNUM:
            strcpy( name, "INTEGER" );
            break;
        case DATA:
            strcpy( name, "DATA" );
            break;
        case COMMA:
            strcpy( name, "COMMA" );
            break;
        case SEMICOLON:
            strcpy( name, "SEMICOLON" );
            break;
        case LBRACKET:
            strcpy( name, "LBRACKET" );
            break;
        case PLUS:
            strcpy( name, "PLUS" );
            break;
        case MINUS:
            strcpy( name, "MINUS" );
            break;
        case TIMES:
            strcpy( name, "TIMES" );
            break;
        case DIVIDE:
            strcpy( name, "DIVIDE" );
            break;
        case DIV:
            strcpy( name, "DIV" );
            break;
        case MOD:
            strcpy( name, "MOD" );
            break;
        default:
            strcpy( name, "INVALID" );
            break;
    }
    return name;
}

void stackpush( long item, int type )
{
    char pushed[30];
    char number[20];

    MyStackElem element;
    element.value = 0;
    element.type = type;

    strcpy( pushed, gettype( type ) );
    if ( ( type == DATA ) || ( type == INTNUM ) )
    {
        strcat( pushed, ", " );
        strcat( pushed, ltoa( item, number, 10 ) );

        element.value = item;
    }

    mystack.push( element );
    fprintf( myerror, "Pushing %s\n", pushed );
}

void genterm( void )
{
    int dimension;
    MyStackElem element;

    // get all integers off stack and place them in new stack
    while ( mystack.top().type == INTNUM )
    {
        element = mystack.top();
        dimstack.push( element );
        mystack.pop();
    }

    // clear run-time list
    printf( "\n" );
    printf( "    one.clear();\n" );

    // for every element in new stack, place it in run-time list
    while ( !dimstack.empty() )
    {
        element = dimstack.top();
        dimension = element.value;
        dimstack.pop();
        printf( "    one.push_back( %d );\n", dimension );
    }

    // Now, generate code to generate run-time data structure
    printf( "    cout << \"New data%d.\" << endl;\n", DataCount );
    printf( "    CMyDataType *data%d = new CMyDataType( one );\n", DataCount );

    // push the item onto the stack
    stackpush( DataCount, DATA );

    // keep track of how many run-time structures we have generated
    ++DataCount;
    fflush( stdout );
}
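Editorial illustration (not part of the original listing): genterm moves the integers from the main stack onto a second stack before emitting them, which restores the order in which they were pushed. The sketch below isolates that idea using std::stack; the helper name drainInPushOrder is hypothetical, and unlike genterm it drains the whole stack rather than stopping at the first non-integer entry.

```cpp
#include <stack>
#include <vector>

// Hypothetical helper, assumed for illustration only: moves every value
// from `source` through a second stack and returns the values in the
// order in which they were originally pushed.
std::vector<long> drainInPushOrder( std::stack<long> &source )
{
    std::stack<long> reversed;

    // First pass: popping yields last-pushed first, so the second
    // stack ends up with the first-pushed value on top.
    while ( !source.empty() )
    {
        reversed.push( source.top() );
        source.pop();
    }

    // Second pass: popping the second stack yields push order.
    std::vector<long> ordered;
    while ( !reversed.empty() )
    {
        ordered.push_back( reversed.top() );
        reversed.pop();
    }
    return ordered;
}
```

Pushing 2, 3, 4 and then calling drainInPushOrder returns the values as 2, 3, 4, which is the order genterm needs when it emits the push_back calls for the dimension list.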

void performoper( void )
{
    MyStackElem element;
    deque<int> operands;
    int oper1, oper2;

    // if we have any integers on the stack, we need to generate a
    // run-time data structure
    if ( mystack.top().type == INTNUM )
        genterm();

    fprintf( myerror, "performoper: performing operation.\n" );

    // get operands off stack, place into queue
    // language is right-associative!!
    while ( mystack.top().type == DATA )
    {
        element = mystack.top();
        fprintf( myerror, "Popping from stack %s, %d\n",
                 gettype( element.type ), element.value );
        mystack.pop();
        operands.push_front( element.value );
    }

    // get operation off stack
    element = mystack.top();
    fprintf( myerror, "Popping from stack %s, %d\n",
             gettype( element.type ), element.value );
    mystack.pop();

    // check if we have only one operand
    if ( operands.size() == 1 )
    {
        oper1 = operands.back();
        operands.pop_back();
        printf( "\n" );
        switch ( element.type )
        {
            case PLUS:
                printf( "    data%d->add();\n", oper1 );
                break;
            case MINUS:
                printf( "    data%d->subtract();\n", oper1 );
                break;
            case TIMES:
                printf( "    data%d->multiply();\n", oper1 );
                break;
            case DIVIDE:
                printf( "    data%d->divide();\n", oper1 );
                break;
        }
        operands.push_back( oper1 );
    }

    // get last two operands off list, then perform operation.
    // then, push the result onto the back-end of the list
    // as the right-most operand for the next iteration
    while ( operands.size() > 1 )
    {
        // language is right-associative
        oper2 = operands.back();
        operands.pop_back();
        oper1 = operands.back();
        operands.pop_back();

        printf( "\n" );

        switch ( element.type )
        {
            case PLUS:
                printf( "    data%d->add( *data%d );\n", oper1, oper2 );
                break;
            case MINUS:
                printf( "    data%d->subtract( *data%d );\n", oper1, oper2 );
                break;
            case TIMES:
                printf( "    data%d->multiply( *data%d );\n", oper1, oper2 );
                break;
            case DIVIDE:
                printf( "    data%d->divide( *data%d );\n", oper1, oper2 );
                break;
        }

        printf( "    delete data%d;\n", oper2 );
        operands.push_back( oper1 );
    }

    // no more operands, just the result.  Take it off the list, and put
    // it on the stack
    stackpush( operands.back(), DATA );
    operands.pop_back();
}
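Editorial illustration (not part of the original listing): the folding loop in performoper combines the two right-most operands, keeps the left one as the running result, and repeats until one operand remains. A minimal sketch of that right-associative fold follows. The helper name foldRight and the decision to return the emitted lines as strings instead of printing them are assumptions made for the example.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical helper, assumed for illustration only: mirrors the
// right-to-left folding loop and returns the statements it would emit.
std::vector<std::string> foldRight( std::deque<int> operands,
                                    const std::string &method )
{
    std::vector<std::string> emitted;

    while ( operands.size() > 1 )
    {
        // Take the two right-most operand ids.
        int right = operands.back();
        operands.pop_back();
        int left = operands.back();
        operands.pop_back();

        // The left operand is updated in place and becomes the
        // right-most operand for the next iteration.
        emitted.push_back( "data" + std::to_string( left ) + "->" + method +
                           "( *data" + std::to_string( right ) + " );" );
        operands.push_back( left );
    }
    return emitted;
}
```

With operands 0, 1, 2 and the method name subtract, the sketch emits the call on data1 with data2 first and the call on data0 with data1 second, which corresponds to data0 - (data1 - data2).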

void printresult( void )
{
    MyStackElem element;
    stack<int> operands;

    fprintf( myerror, "printresult: printing results.\n" );

    // if we have integers on the stack, we need to generate a run-time
    // data structure
    if ( mystack.top().type == INTNUM )
        genterm();

    // Now, get all the "operands" off the stack and put them into a list
    while ( !mystack.empty() )
    {
        element = mystack.top();
        mystack.pop();
        fprintf( myerror, "Popping from stack %s, %d\n",
                 gettype( element.type ), element.value );
        operands.push( element.value );
    }

    // Now, generate the code to print the run-time data structures (from
    // front to back).
    printf( "\n" );
    while ( !operands.empty() )
    {
        printf( "    data%d->print();\n", operands.top() );
        operands.pop();
    }
}

// Grammar parsed by these routines:
//
// S     : LBRACKET T RBRACKET { printresult(); }
// T     : TH LPAREN LBRACKET L { generate term }
//         RBRACKET RPAREN { performoper(); } Tl
// Tl    : SEMICOLON L
//       | /* empty */
// L     : T
//       | INTNUM { stackpush( $1 ); } Ll
// Ll    : COMMA L
//       | SEMICOLON { generate term } L
//       | /* empty */
// theta : PLUS   { stackpush( PLUS ); }
//       | MINUS  { stackpush( MINUS ); }
//       | TIMES  { stackpush( TIMES ); }
//       | DIVIDE { stackpush( DIVIDE ); }
//       | DIV    { stackpush( DIV ); }
//       | MOD    { stackpush( MOD ); }

int S( void )
{
    ThisToken = yylex();
    if ( ThisToken == LBRACKET )
    {
        ThisToken = yylex();
        if ( T() )
        {
            if ( ThisToken == RBRACKET )
            {
                printresult();
                return true;
            }
            else
                printerror( "']'" );
        }
        else
            printerror( "T" );
    }
    else
        printerror( "'['" );

    return false;
}

int T( void )
{
    if ( theta() )
    {
        if ( ThisToken == LPAREN )
        {
            ThisToken = yylex();
            if ( ThisToken == LBRACKET )
            {
                ThisToken = yylex();
                if ( L() )
                {
                    if ( ThisToken == RBRACKET )
                    {
                        ThisToken = yylex();
                        if ( ThisToken == RPAREN )
                        {
                            performoper();
                            ThisToken = yylex();
                            return Tl();
                        }
                        else
                            printerror( "')'" );
                    }
                    else
                        printerror( "']'" );
                }
                else
                    printerror( "L" );
            }
            else
                printerror( "'['" );
        }
        else
            printerror( "'('" );
    }
    else
        printerror( "theta" );

    return false;
}

int Tl( void )
{
    if ( ThisToken == SEMICOLON )
    {
        ThisToken = yylex();
        return L();
    }
    else if ( ThisToken == RBRACKET )
        return true;
    else
        printerror( "';' or ']'" );

    return false;
}

int L( void )
{
    if ( ThisToken == INTNUM )
    {
        stackpush( yyval, INTNUM );
        ThisToken = yylex();
        return Ll();
    }
    else
        return T();
}

int Ll( void )
{
    if ( ThisToken == COMMA )
    {
        ThisToken = yylex();
        return L();
    }
    else if ( ThisToken == SEMICOLON )
    {
        genterm();
        ThisToken = yylex();
        return L();
    }
    else if ( ThisToken == RBRACKET )
        return true;
    else
        printerror( "',', ';' or ']'" );

    return false;
}

int theta( void) {

stackpush( 0, ThisToken ); ThisToken = yylex(); return true;

}

void printerror( char *message )
{
    fprintf( stderr, "Parse error: expected %s.\n", message );
    ++errors;
}
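Editorial illustration (not part of the original listing): the routines S, T, Tl, L, Ll and theta above form a recursive descent parser, with one function per nonterminal. The sketch below is a stand-alone recognizer for the same grammar shape that works directly on a character string, with no flex scanner, no stack and no code generation. The class name Recognizer, the function recognizes, and the names T1 and L1 (standing for the listing's Tl and Ll) are hypothetical. It accepts only the four operators listed in the theta comment block, whereas the listing's theta pushes whatever token it is given.

```cpp
#include <cctype>
#include <string>

// Illustrative recognizer; all names here are assumptions for the example.
class Recognizer
{
public:
    explicit Recognizer( const std::string &text ) : src( text ), pos( 0 ) {}

    // S -> [ T ]   (the sketch also requires the whole input to be consumed)
    bool parse()
    {
        return eat( '[' ) && T() && eat( ']' ) && peek() == '\0';
    }

private:
    std::string src;
    std::string::size_type pos;

    static bool isDigit( char c )
    {
        return std::isdigit( static_cast<unsigned char>( c ) ) != 0;
    }

    // Skip blanks and return the next character, or '\0' at end of input.
    char peek()
    {
        while ( pos < src.size() &&
                std::isspace( static_cast<unsigned char>( src[pos] ) ) )
            ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    bool eat( char expected )
    {
        if ( peek() != expected )
            return false;
        ++pos;
        return true;
    }

    // theta -> + | - | * | /
    bool theta()
    {
        char c = peek();
        if ( c == '+' || c == '-' || c == '*' || c == '/' )
        {
            ++pos;
            return true;
        }
        return false;
    }

    bool integer()
    {
        if ( !isDigit( peek() ) )
            return false;
        while ( pos < src.size() && isDigit( src[pos] ) )
            ++pos;
        return true;
    }

    // T -> theta ( [ L ] ) T1
    bool T()
    {
        return theta() && eat( '(' ) && eat( '[' ) && L() &&
               eat( ']' ) && eat( ')' ) && T1();
    }

    // T1 -> ; L | empty
    bool T1()
    {
        return peek() == ';' ? ( eat( ';' ) && L() ) : true;
    }

    // L -> T | <int> L1
    bool L()
    {
        return isDigit( peek() ) ? ( integer() && L1() ) : T();
    }

    // L1 -> , L | ; L | empty
    bool L1()
    {
        char c = peek();
        if ( c == ',' || c == ';' )
        {
            ++pos;
            return L();
        }
        return true;
    }
};

bool recognizes( const std::string &text )
{
    return Recognizer( text ).parse();
}
```

Under these assumptions, an input shaped like [+([1,2;3,4])] is accepted, while an input with a missing closing bracket such as [+([1,2)] is rejected. Each function consumes exactly the tokens of its own production and relies on one character of lookahead, which is the same structure the listing uses with ThisToken.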

int gen_headers( void )
{
    FILE *myheader;

    myheader = fopen( "MyDataType.h", "w" );
    if ( !myheader )
    {
        fprintf( stderr, "Error writing MyDataType.h\n" );
        return EXIT_FAILURE;
    }
    else
    {
        fprintf( myheader, "#include <vector>\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "\nusing namespace std;\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "typedef double CMyElemType;\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "class CMyDataType\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "protected:\n" );
        fprintf( myheader, "    vector<int> dims;\n" );
        fprintf( myheader, "    vector<CMyElemType> elements;\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "public:\n" );
        fprintf( myheader, "    CMyDataType( );\n" );
        fprintf( myheader, "    CMyDataType( vector<int> sizes, bool DoRead = true );\n" );
        fprintf( myheader, "    void add( CMyDataType &data2 );\n" );
        fprintf( myheader, "    void subtract( CMyDataType &data2 );\n" );
        fprintf( myheader, "    void multiply( CMyDataType &data2 );\n" );
        fprintf( myheader, "    void divide( CMyDataType &data2 );\n" );
        fprintf( myheader, "    void add( void );\n" );
        fprintf( myheader, "    void subtract( void );\n" );
        fprintf( myheader, "    void multiply( void );\n" );
        fprintf( myheader, "    void divide( void );\n" );
        fprintf( myheader, "    void print( void );\n" );
        fprintf( myheader, "    CMyDataType &operator =( CMyDataType &data2 );\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "protected:\n" );
        fprintf( myheader, "    void normalize( const CMyDataType &data2 );\n" );
        fprintf( myheader, "    int MapDimension( int index, const CMyDataType newone );\n" );
        fprintf( myheader, "};\n" );

        fclose( myheader );
    }

    myheader = fopen( "MyDataType.cpp", "w" );
    if ( !myheader )
    {
        fprintf( stderr, "Error writing MyDataType.cpp\n" );
        return EXIT_FAILURE;
    }
    else
    {
        fprintf( myheader, "#include <vector>\n" );
        fprintf( myheader, "#include <iostream>\n" );
        fprintf( myheader, "#include \"MyDataType.h\"\n" );
        fprintf( myheader, "\nusing namespace std;\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader, "// Data stored in two parts: part one, dims, is a list\n" );
        fprintf( myheader, "// of the dimensions of the structure.  Part two, elements\n" );
        fprintf( myheader, "// is a (virtually) one-dimensional storage of the elements\n" );
        fprintf( myheader, "// of the structure.  The actual structure can be reconstructed\n" );
        fprintf( myheader, "// at any time from this information.\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "//Function: CMyDataType::CMyDataType\n" );
        fprintf( myheader, "//Inputs: a STL vector containing the dimensions (integers)\n" );
        fprintf( myheader, "//        a boolean telling whether or not to generate\n" );
        fprintf( myheader, "//        reads to fill-in the elements with values.\n" );
        fprintf( myheader, "//Outputs: none\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function is the constructor for the data structure.\n" );
        fprintf( myheader, "//It creates the structure, filling in the dimension list\n" );
        fprintf( myheader, "//from the input list, and allocating enough memory for\n" );
        fprintf( myheader, "//the element list.  Note that the elements are initialized\n" );
        fprintf( myheader, "//to zero by the STL vector constructor, if DoRead is false.\n" );
        fprintf( myheader, "//In any case, elements.size() will return the product of \n" );
        fprintf( myheader, "//the dimensions after this constructor.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "CMyDataType::CMyDataType( vector<int> sizes, bool DoRead )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    int index;\n" );
        fprintf( myheader, "    dims = sizes;\n" );
        fprintf( myheader, "    for ( index = 0; index < dims.size(); ++index )\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    elements.resize( accum );\n" );
        fprintf( myheader, "    if ( DoRead )\n" );
        fprintf( myheader, "        for ( index = 0; index < accum; ++index )\n" );
        fprintf( myheader, "            cin >> elements[index];\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );

        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "//Function: CMyDataType::add\n" );
        fprintf( myheader, "//Inputs: data2, another CMyDataType structure to \n" );
        fprintf( myheader, "//        add to this one\n" );
        fprintf( myheader, "//Outputs: This, data2\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function adds the elements of data2 to this one,\n" );
        fprintf( myheader, "//first having normalized both elements.  NOTE that BOTH\n" );
        fprintf( myheader, "//elements will (potentially) change as a result of this\n" );
        fprintf( myheader, "//function.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::add( CMyDataType &data2 )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    normalize( data2 );\n" );
        fprintf( myheader, "    data2.normalize( *this );\n" );
        fprintf( myheader, "    for ( index = 0; index < dims.size(); ++index )\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    for ( index = 0; index < accum; ++index )\n" );
        fprintf( myheader, "        elements[index] += data2.elements[index];\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );

        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "//Function: CMyDataType::subtract\n" );
        fprintf( myheader, "//Inputs: data2, another CMyDataType structure to \n" );
        fprintf( myheader, "//        subtract from this one\n" );
        fprintf( myheader, "//Outputs: This, data2\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function subtracts the elements of data2 from this one,\n" );
        fprintf( myheader, "//first having normalized both elements.  NOTE that BOTH\n" );
        fprintf( myheader, "//elements will (potentially) change as a result of this\n" );
        fprintf( myheader, "//function.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::subtract( CMyDataType &data2 )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    normalize( data2 );\n" );
        fprintf( myheader, "    data2.normalize( *this );\n" );
        fprintf( myheader, "    for ( index = 0; index < dims.size(); ++index )\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    for ( index = 0; index < accum; ++index )\n" );
        fprintf( myheader, "        elements[index] -= data2.elements[index];\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );

        fprintf( myheader, "//Function: CMyDataType::multiply\n" );
        fprintf( myheader, "//Inputs: data2, another CMyDataType structure to \n" );
        fprintf( myheader, "//        multiply this one by\n" );
        fprintf( myheader, "//Outputs: This, data2\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function multiplies the elements of this one by data2,\n" );
        fprintf( myheader, "//first having normalized both elements.  NOTE that BOTH\n" );
        fprintf( myheader, "//elements will (potentially) change as a result of this\n" );
        fprintf( myheader, "//function.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::multiply( CMyDataType &data2 )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    normalize( data2 );\n" );
        fprintf( myheader, "    data2.normalize( *this );\n" );
        fprintf( myheader, "    for ( index = 0; index < dims.size(); ++index )\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    for ( index = 0; index < accum; ++index )\n" );
        fprintf( myheader, "        elements[index] *= data2.elements[index];\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );

        fprintf( myheader, "//Function: CMyDataType::divide\n" );
        fprintf( myheader, "//Inputs: data2, another CMyDataType structure to \n" );
        fprintf( myheader, "//        divide this one by\n" );
        fprintf( myheader, "//Outputs: This, data2\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function divides the elements of this one by data2,\n" );
        fprintf( myheader, "//first having normalized both elements.  NOTE that BOTH\n" );
        fprintf( myheader, "//elements will (potentially) change as a result of this\n" );
        fprintf( myheader, "//function.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::divide( CMyDataType &data2 )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    normalize( data2 );\n" );
        fprintf( myheader, "    data2.normalize( *this );\n" );
        fprintf( myheader, "    for ( index = 0; index < dims.size(); ++index )\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    for ( index = 0; index < accum; ++index )\n" );
        fprintf( myheader, "        elements[index] /= data2.elements[index];\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );

        fprintf( myheader, "//Function: CMyDataType::add\n" );
        fprintf( myheader, "//Inputs: none\n" );
        fprintf( myheader, "//Outputs: This\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function breaks up this structure into multiple\n" );
        fprintf( myheader, "//structures, dim[0] in quantity, each having the dimensions\n" );
        fprintf( myheader, "//of this one, less the first dimension.\n" );
        fprintf( myheader, "//It then adds the resulting structures, as if they were\n" );
        fprintf( myheader, "//independent structures.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::add( void )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index, index2;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    vector<int> newdims;\n" );
        fprintf( myheader, "    while (( dims.size() > 2 ) && ( dims[0] == 1 ))\n" );
        fprintf( myheader, "        dims.erase( dims.begin() );\n" );
        fprintf( myheader, "    if (( dims.size() == 1 ) && ( dims[0] == 1 ))\n" );
        fprintf( myheader, "        return;\n" );
        fprintf( myheader, "    for ( index = 1; index < dims.size(); ++index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        newdims.push_back( dims[index] );\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    CMyDataType *new0 = new CMyDataType( newdims, false );\n" );
        fprintf( myheader, "    CMyDataType *new1 = new CMyDataType( newdims, false );\n" );
        fprintf( myheader, "    for ( index = accum - 1; index >= 0; --index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        new1->elements[index] = elements.back();\n" );
        fprintf( myheader, "        elements.pop_back();\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    for ( index = dims[0] - 2; index >= 0; --index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        for ( index2 = accum - 1; index2 >= 0; --index2 )\n" );
        fprintf( myheader, "        {\n" );
        fprintf( myheader, "            new0->elements[index2] = elements.back();\n" );
        fprintf( myheader, "            elements.pop_back();\n" );
        fprintf( myheader, "        }\n" );
        fprintf( myheader, "        new0->add( *new1 );\n" );
        fprintf( myheader, "        *new1 = *new0;\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    *this = *new1;\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );

        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "//Function: CMyDataType::subtract\n" );
        fprintf( myheader, "//Inputs: none\n" );
        fprintf( myheader, "//Outputs: This\n" );
        fprintf( myheader, "//\n" );
        fprintf( myheader, "//This function breaks up this structure into multiple\n" );
        fprintf( myheader, "//structures, dim[0] in quantity, each having the dimensions\n" );
        fprintf( myheader, "//of this one, less the first dimension.\n" );
        fprintf( myheader, "//It then subtracts the resulting structures, as if they\n" );
        fprintf( myheader, "//were independent structures.\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );
        fprintf( myheader, "void CMyDataType::subtract( void )\n" );
        fprintf( myheader, "{\n" );
        fprintf( myheader, "    int index, index2;\n" );
        fprintf( myheader, "    int accum = 1;\n" );
        fprintf( myheader, "    vector<int> newdims;\n" );
        fprintf( myheader, "    while (( dims.size() > 2 ) && ( dims[0] == 1 ))\n" );
        fprintf( myheader, "        dims.erase( dims.begin() );\n" );
        fprintf( myheader, "    if (( dims.size() == 1 ) && ( dims[0] == 1 ))\n" );
        fprintf( myheader, "        return;\n" );
        fprintf( myheader, "    for ( index = 1; index < dims.size(); ++index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        newdims.push_back( dims[index] );\n" );
        fprintf( myheader, "        accum *= dims[index];\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    CMyDataType *new0 = new CMyDataType( newdims, false );\n" );
        fprintf( myheader, "    CMyDataType *new1 = new CMyDataType( newdims, false );\n" );
        fprintf( myheader, "    for ( index = accum - 1; index >= 0; --index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        new1->elements[index] = elements.back();\n" );
        fprintf( myheader, "        elements.pop_back();\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    for ( index = dims[0] - 2; index >= 0; --index )\n" );
        fprintf( myheader, "    {\n" );
        fprintf( myheader, "        for ( index2 = accum - 1; index2 >= 0; --index2 )\n" );
        fprintf( myheader, "        {\n" );
        fprintf( myheader, "            new0->elements[index2] = elements.back();\n" );
        fprintf( myheader, "            elements.pop_back();\n" );
        fprintf( myheader, "        }\n" );
        fprintf( myheader, "        new0->subtract( *new1 );\n" );
        fprintf( myheader, "        *new1 = *new0;\n" );
        fprintf( myheader, "    }\n" );
        fprintf( myheader, "    *this = *new1;\n" );
        fprintf( myheader, "}\n" );
        fprintf( myheader, "\n" );
        fprintf( myheader,
                 "//***********************************************************\n" );

fprintf( myheader, "//Function: CMyDataType::multiply\n" );
fprintf( myheader, "//Inputs: none\n" );
fprintf( myheader, "//Outputs: This\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function breaks up this structure into multiple\n" );
fprintf( myheader, "//structures, dim[0] in quantity, each having the dimensions\n" );
fprintf( myheader, "//of this one, less the first dimension.\n" );
fprintf( myheader, "//It then multiplies the resulting structures, as if they\n" );
fprintf( myheader, "//were independent structures.\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "void CMyDataType::multiply( void )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   int index, index2;\n" );
fprintf( myheader, "   int accum = 1;\n" );
fprintf( myheader, "   vector<int> newdims;\n" );
fprintf( myheader, "   while (( dims.size() > 2 ) && ( dims[0] == 1 ))\n" );
fprintf( myheader, "      dims.erase( dims.begin() );\n" );
fprintf( myheader, "   if (( dims.size() == 1 ) && ( dims[0] == 1 ))\n" );
fprintf( myheader, "      return;\n" );
fprintf( myheader, "   for ( index = 1; index < dims.size(); ++index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      newdims.push_back( dims[index] );\n" );
fprintf( myheader, "      accum *= dims[index];\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   CMyDataType *new0 = new CMyDataType( newdims, false );\n" );
fprintf( myheader, "   CMyDataType *new1 = new CMyDataType( newdims, false );\n" );
fprintf( myheader, "   for ( index = accum - 1; index >= 0; --index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      new1->elements[index] = elements.back();\n" );
fprintf( myheader, "      elements.pop_back();\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   for ( index = dims[0] - 2; index >= 0; --index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      for ( index2 = accum - 1; index2 >= 0; --index2 )\n" );
fprintf( myheader, "      {\n" );
fprintf( myheader, "         new0->elements[index2] = elements.back();\n" );
fprintf( myheader, "         elements.pop_back();\n" );
fprintf( myheader, "      }\n" );
fprintf( myheader, "      new0->multiply( *new1 );\n" );
fprintf( myheader, "      *new1 = *new0;\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   *this = *new1;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );
fprintf( myheader, "//***********************************************************\n" );

fprintf( myheader, "//Function: CMyDataType::divide\n" );
fprintf( myheader, "//Inputs: none\n" );
fprintf( myheader, "//Outputs: This\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function breaks up this structure into multiple\n" );
fprintf( myheader, "//structures, dim[0] in quantity, each having the dimensions\n" );
fprintf( myheader, "//of this one, less the first dimension.\n" );
fprintf( myheader, "//It then divides the resulting structures, as if they were\n" );
fprintf( myheader, "//independent structures.\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "void CMyDataType::divide( void )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   int index, index2;\n" );
fprintf( myheader, "   int accum = 1;\n" );
fprintf( myheader, "   vector<int> newdims;\n" );
fprintf( myheader, "   while (( dims.size() > 2 ) && ( dims[0] == 1 ))\n" );
fprintf( myheader, "      dims.erase( dims.begin() );\n" );
fprintf( myheader, "   if (( dims.size() == 1 ) && ( dims[0] == 1 ))\n" );
fprintf( myheader, "      return;\n" );
fprintf( myheader, "   for ( index = 1; index < dims.size(); ++index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      newdims.push_back( dims[index] );\n" );
fprintf( myheader, "      accum *= dims[index];\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   CMyDataType *new0 = new CMyDataType( newdims, false );\n" );
fprintf( myheader, "   CMyDataType *new1 = new CMyDataType( newdims, false );\n" );
fprintf( myheader, "   for ( index = accum - 1; index >= 0; --index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      new1->elements[index] = elements.back();\n" );
fprintf( myheader, "      elements.pop_back();\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   for ( index = dims[0] - 2; index >= 0; --index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      for ( index2 = accum - 1; index2 >= 0; --index2 )\n" );
fprintf( myheader, "      {\n" );
fprintf( myheader, "         new0->elements[index2] = elements.back();\n" );
fprintf( myheader, "         elements.pop_back();\n" );
fprintf( myheader, "      }\n" );
fprintf( myheader, "      new0->divide( *new1 );\n" );
fprintf( myheader, "      *new1 = *new0;\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   *this = *new1;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );
fprintf( myheader, "//***********************************************************\n" );

fprintf( myheader, "//Function: CMyDataType::print\n" );
fprintf( myheader, "//Inputs: none\n" );
fprintf( myheader, "//Outputs: none\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function prints this structure\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "void CMyDataType::print( void )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   int index;\n" );
fprintf( myheader, "   int accum = 1;\n" );
fprintf( myheader, "   for ( index = 0; index < dims.size(); ++index )\n" );
fprintf( myheader, "      accum *= dims[index];\n" );
fprintf( myheader, "   cout << \"Result:\" << endl;\n" );
fprintf( myheader, "   for ( index = 0; index < accum; ++index )\n" );
fprintf( myheader, "      cout << elements[index] << endl;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );
fprintf( myheader, "//***********************************************************\n" );

fprintf( myheader, "//Function: CMyDataType::normalize\n" );
fprintf( myheader, "//Inputs: data2\n" );
fprintf( myheader, "//Outputs: this\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function normalizes this structure with respect to\n" );
fprintf( myheader, "//data2. It does this by creating a new structure consisting\n" );
fprintf( myheader, "//of the maximum dimensions of either this or data2, both in\n" );
fprintf( myheader, "//the number of dimensions and the size of each dimension.\n" );
fprintf( myheader, "//It then uses the mapdimension function to determine which\n" );
fprintf( myheader, "//of the current elements to use to fill in the new \n" );
fprintf( myheader, "//structure. Note that ALL of the current elements will be\n" );
fprintf( myheader, "//replicated in the new structure. Some, however, may be\n" );
fprintf( myheader, "//repeated, if the new structure is larger than this one.\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "void CMyDataType::normalize( const CMyDataType &data2 )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   CMyDataType *data3;\n" );
fprintf( myheader, "   vector<int> newsizelist;\n" );
fprintf( myheader, "   int newsize, size1, size2, index, remainder;\n" );
fprintf( myheader, "   vector<int>::const_iterator iter1, iter2;\n" );
fprintf( myheader, "   size1 = dims.size();\n" );
fprintf( myheader, "   size2 = data2.dims.size();\n" );
fprintf( myheader, "   iter1 = dims.begin();\n" );
fprintf( myheader, "   iter2 = data2.dims.begin();\n" );
fprintf( myheader, "   // the new structure has a number of dimensions\n" );
fprintf( myheader, "   // which is equal to the maximum number found\n" );
fprintf( myheader, "   // between this and data2\n" );
fprintf( myheader, "   newsize = max( size1, size2 );\n" );
fprintf( myheader, "   newsizelist.reserve( newsize );\n" );
fprintf( myheader, "   // now, copy the excess dimensions from the larger\n" );
fprintf( myheader, "   // structure to the new one\n" );
fprintf( myheader, "   if ( size1 > size2 )\n" );
fprintf( myheader, "      for ( index = 0; index < size1 - size2; ++index )\n" );
fprintf( myheader, "         newsizelist.push_back( *iter1++ );\n" );
fprintf( myheader, "   else\n" );
fprintf( myheader, "      for ( index = 0; index < size2 - size1; ++index )\n" );
fprintf( myheader, "         newsizelist.push_back( *iter2++ );\n" );
fprintf( myheader, "   // now, copy the maximum of each of the remaining\n" );
fprintf( myheader, "   // dimensions to the new structure.\n" );
fprintf( myheader, "   remainder = min( size1, size2 );\n" );
fprintf( myheader, "   for ( index = 0; index < remainder; ++index )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      newsizelist.push_back( max( *iter1, *iter2 ) );\n" );
fprintf( myheader, "      ++iter1;\n" );
fprintf( myheader, "      ++iter2;\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   // now, we have the correct size for the new structure,\n" );
fprintf( myheader, "   // create it\n" );
fprintf( myheader, "   data3 = new CMyDataType( newsizelist, false );\n" );
fprintf( myheader, "   newsize = 1;\n" );
fprintf( myheader, "   // now, calculate how many elements (total) are in this\n" );
fprintf( myheader, "   // new structure\n" );
fprintf( myheader, "   for ( index = 0; index < newsizelist.size(); ++index )\n" );
fprintf( myheader, "      newsize *= newsizelist[index];\n" );
fprintf( myheader, "   // and copy the appropriate elements from the old one\n" );
fprintf( myheader, "   // to the new one\n" );
fprintf( myheader, "   for ( index = 0; index < newsize; ++index )\n" );
fprintf( myheader, "      data3->elements[index] = elements[MapDimension( index, *data3 )];\n" );
fprintf( myheader, "   // now, make the new structure this one\n" );
fprintf( myheader, "   *this = *data3;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );
fprintf( myheader, "//***********************************************************\n" );

fprintf( myheader, "//Function: CMyDataType::MapDimension\n" );
fprintf( myheader, "//Inputs: index, newone\n" );
fprintf( myheader, "//Outputs: integer\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function takes the index relative to the new \n" );
fprintf( myheader, "//structure, and finds the dimensions. It then calculates\n" );
fprintf( myheader, "//the appropriate dimension for the old structure. Then\n" );
fprintf( myheader, "//it calculates the index of that element in the old\n" );
fprintf( myheader, "//structure. It then returns that number.\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "int CMyDataType::MapDimension( int index, const CMyDataType newone )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   int counter, accum, position;\n" );
fprintf( myheader, "   vector<int> newdimension, oldoffset, newoffset;\n" );
fprintf( myheader, "   vector<int>::const_iterator iter1, iter2;\n" );
fprintf( myheader, "   accum = 1;\n" );
fprintf( myheader, "   position = 0;\n" );
fprintf( myheader, "   // calculate the offsets for each dimension of the new structure\n" );
fprintf( myheader, "   for ( counter = newone.dims.size() - 1; counter >= 0; --counter )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      newoffset.insert( newoffset.begin(), accum );\n" );
fprintf( myheader, "      accum *= newone.dims[counter];\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   // find each dimension in the new structure, so that\n" );
fprintf( myheader, "   // we know the \"real\" position of index in the new\n" );
fprintf( myheader, "   // structure\n" );
fprintf( myheader, "   for ( counter = 0; counter < newoffset.size(); ++counter )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      newdimension.push_back( index / newoffset[position] );\n" );
fprintf( myheader, "      index %%= newoffset[position];\n" );
fprintf( myheader, "      ++position;\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   // take those dimensions and convert them, by modding, to the\n" );
fprintf( myheader, "   // dimensions in this structure\n" );
fprintf( myheader, "   for ( counter = 0; counter < newdimension.size(); ++counter )\n" );
fprintf( myheader, "      newdimension[counter] %%= dims[counter%%dims.size()];\n" );
fprintf( myheader, "   accum = 1;\n" );
fprintf( myheader, "   // calculate the offsets for each dimension of this structure\n" );
fprintf( myheader, "   for ( counter = dims.size() - 1; counter >= 0; --counter )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      oldoffset.insert( oldoffset.begin(), accum );\n" );
fprintf( myheader, "      accum *= dims[counter];\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   // now, find the index into this structure for the\n" );
fprintf( myheader, "   // modified dimensions\n" );
fprintf( myheader, "   accum = 0;\n" );
fprintf( myheader, "   iter1 = newdimension.end();\n" );
fprintf( myheader, "   iter2 = oldoffset.end();\n" );
fprintf( myheader, "   for ( counter = 0; counter < dims.size(); ++counter )\n" );
fprintf( myheader, "   {\n" );
fprintf( myheader, "      --iter1;\n" );
fprintf( myheader, "      --iter2;\n" );
fprintf( myheader, "      accum += *iter1 * *iter2;\n" );
fprintf( myheader, "   }\n" );
fprintf( myheader, "   return accum;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "//Function: CMyDataType::operator =\n" );
fprintf( myheader, "//Inputs: this, data2\n" );

fprintf( myheader, "//Outputs: none\n" );
fprintf( myheader, "//\n" );
fprintf( myheader, "//This function copies the elements from data2 into this\n" );
fprintf( myheader, "//structure, making them equal.\n" );
fprintf( myheader, "//***********************************************************\n" );
fprintf( myheader, "CMyDataType &CMyDataType::operator =( CMyDataType &data2 )\n" );
fprintf( myheader, "{\n" );
fprintf( myheader, "   dims = data2.dims;\n" );
fprintf( myheader, "   elements = data2.elements;\n" );
fprintf( myheader, "   return *this;\n" );
fprintf( myheader, "}\n" );
fprintf( myheader, "\n" );

fclose( myheader );
}

return EXIT_SUCCESS;
}

void parse_prep( void )
{
    // print includes needed for run-time compilation
    printf( "#include <iostream>\n" );
    printf( "#include <vector>\n" );
    printf( "#include \"MyDataType.h\"\n" );
    printf( "\n" );
    printf( "int main()\n" );
    printf( "{\n" );

    // declare the one run-time static structure needed.
    printf( "   vector<int> one;\n" );
    printf( "\n" );
}

void code_gen( void )
{
    // close out main() routine
    printf( "\n" );
    printf( "   return 0;\n" );
    printf( "}\n" );
}
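
The emitted reduction members (add, subtract, multiply, and divide taking no argument) all follow one pattern: the flat element list is cut into dims[0] equally sized blocks, and the blocks are combined element by element, starting from the last block. The sketch below illustrates that pattern in isolation. It is not part of the thesis code: the name reduceFirstDim and the use of std::function are assumptions made for this example, and it assumes dims is non-empty with dims[0] >= 1.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative helper (hypothetical name): fold the dims[0] blocks of a flat,
// row-major element list into a single block, working from the last block
// toward the first, as the listing above appears to do.
std::vector<double> reduceFirstDim(const std::vector<double>& elems,
                                   const std::vector<int>& dims,
                                   const std::function<double(double, double)>& op) {
    // Size of one block: the product of every dimension after the first.
    int block = 1;
    for (std::size_t k = 1; k < dims.size(); ++k) block *= dims[k];

    // Start with a copy of the last block.
    std::vector<double> acc(elems.end() - block, elems.end());

    // Combine each earlier block with the running result: acc = block op acc.
    for (int b = dims[0] - 2; b >= 0; --b)
        for (int i = 0; i < block; ++i)
            acc[i] = op(elems[b * block + i], acc[i]);
    return acc;
}
```

With dims {3, 1} and elements {10, 3, 2}, passing std::minus<double>() yields 10 - (3 - 2) = 9, which reflects the right-to-left order of combination; with addition the order makes no difference.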


MyStack.h, The Stack Implementation

#ifndef _MYSTACK_H
#define _MYSTACK_H

#include <stack>

using namespace std;

typedef struct MyStackElem
{
    long value;
    int type;
} MyStackElem;

typedef stack<MyStackElem> CMyStackType;

#endif
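
As a brief illustration of how a stack of this element type behaves, the sketch below pushes two tagged values and pops them in last-in, first-out order. The tag constants TAG_INT and TAG_OPERATOR and the helper popValue are hypothetical; the listing does not show which values the compiler stores in the type field.

```cpp
#include <stack>

// Same layout as the element type in MyStack.h.
struct MyStackElem {
    long value;
    int type;
};
typedef std::stack<MyStackElem> CMyStackType;

// Hypothetical tags, chosen only for this example.
enum { TAG_INT = 0, TAG_OPERATOR = 1 };

// Illustrative helper: remove the top element and return its value.
// The caller must ensure the stack is not empty.
long popValue(CMyStackType& s) {
    long v = s.top().value;
    s.pop();
    return v;
}
```

Because std::stack::pop() returns nothing, reading top() before calling pop() is the usual idiom.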

Seql.l, The flex Source Code

%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "seql.h"
#define true 1
#define false 0
%}

delim   [ \t\n]
ws      {delim}+
letter  [a-z]
digit   [0-9]
idchar  [{letter}{digit}.]
id      {idchar}+
int     [+\-]?{digit}+
real    [+\-]?{digit}*\.{digit}*(e[+\-]?{digit}+)?
char    \'.\'
string  \".*\"

%%

{ws}                { ; }
in                  { return(IN); }
integer             { return(INTEGER); }
real                { return(REAL); }
var                 { return(VAR); }
operator            { return(OPERATOR); }
next                { return(NEXT); }
consume             { return(CONSUME); }
produce             { return(PRODUCE); }
where               { return(WHERE); }
taking              { return(TAKING); }
from                { return(FROM); }
and                 { return(AND); }
or                  { return(OR); }
not                 { return(NOT); }
when                { return(WHEN); }
else                { return(ELSE); }
pred                { return(PRED); }
succ                { return(SUCC); }
map                 { return(MAP); }
gen                 { return(GEN); }
>                   { return(GREATERTHAN); }
>=                  { return(GREATEREQUAL); }
=                   { return(EQUAL); }
\<                  { return(LESSTHAN); }
\<=                 { return(LESSEQUAL); }
\<>                 { return(NOTEQUAL); }
,                   { return(COMMA); }
;                   { return(SEMICOLON); }
\.\.\.              { return(ELIPSES); }
\(                  { return(LPAREN); }
\)                  { return(RPAREN); }
\[                  { return(LBRACKET); }
\]                  { return(RBRACKET); }
asc                 { return(ASC); }
desc                { return(DESC); }
abs                 { return(ABS); }
sqrt                { return(SQRT); }
cos                 { return(COS); }
sin                 { return(SIN); }
tan                 { return(TAN); }
log                 { return(LOG); }
\^                  { return(POWER); }
\+                  { return(PLUS); }
\-                  { return(MINUS); }
\*                  { return(TIMES); }
\/                  { return(DIVIDE); }
div                 { return(DIV); }
mod                 { return(MOD); }
reverse             { return(REVERSE); }
size                { return(SIZE); }
transpose           { return(TRANSPOSE); }
rotate_right        { return(RROTATE); }
rotate_left         { return(LROTATE); }
compose             { return(COMPOSE); }
cartesian_product   { return(CARTPROD); }
\{                  { return(LBRACE); }
\}                  { return(RBRACE); }
{int}               { yyval = atol( yytext );
                      return(INTNUM); }
{real}              { return(REALNUM); }
{id}                { return(IDENTIFIER); }
{id}\(              { return(FUNCTION); }
\'.\'               { return(CHAR); }
\".*\"              { return(STRING); }
.                   { printf( "Error: Unrecognized character \"%s\"\n", yytext ); }

%%

int yywrap( void )
{
    return 1;
}

int insert_table( int tokentype, char *text )
{
    if ( ++NumberOfTokens > SYMBOL_TABLE_SIZE )
        return false;

    switch ( tokentype )
    {
        case INTNUM:
            symbol_table[NumberOfTokens].intnum = atol( text );
            break;

        case REALNUM:
            symbol_table[NumberOfTokens].realnum = strtod( text, NULL );
            break;

        case CHAR:
            symbol_table[NumberOfTokens].character = text[0];
            break;

        case STRING:
            strcpy( symbol_table[NumberOfTokens].string, text );
            break;
    }
    symbol_table[NumberOfTokens].tokentype = tokentype;
    tokenval = NumberOfTokens;

    return true;
}


APPENDIX C

RUN-TIME SUPPORT FILES

MyDataType.h, The Run-time Header File

#include <vector>

using namespace std;

typedef double CMyElemType;

class CMyDataType
{
protected:
    vector<int> dims;
    vector<CMyElemType> elements;

public:
    CMyDataType();
    CMyDataType( vector<int> sizes, bool DoRead = true );
    void add( CMyDataType &data2 );
    void subtract( CMyDataType &data2 );
    void multiply( CMyDataType &data2 );
    void divide( CMyDataType &data2 );
    void add( void );
    void subtract( void );
    void multiply( void );
    void divide( void );
    void print( void );
    CMyDataType &operator =( CMyDataType &data2 );

protected:
    void normalize( const CMyDataType &data2 );
    int MapDimension( int index, const CMyDataType newone );
};
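
The class stores a structure as a dimension list plus a flat, row-major element list, and normalization enlarges a structure by repeating its elements. The sketch below shows one way to read that idea: a flat index in the larger structure is split into per-dimension coordinates, each trailing coordinate is wrapped with modulo into the smaller structure, and the result is re-flattened. It is a simplified illustration, not the thesis implementation; the names strides, mapIndex, and expand are assumptions, and it assumes the new dimension list is at least as long as the old one and that dimensions are matched from the last one backward.

```cpp
#include <cstddef>
#include <vector>

// Row-major strides for a dimension list, e.g. {2, 3} -> {3, 1}.
std::vector<int> strides(const std::vector<int>& dims) {
    std::vector<int> s(dims.size(), 1);
    for (int k = static_cast<int>(dims.size()) - 2; k >= 0; --k)
        s[k] = s[k + 1] * dims[k + 1];
    return s;
}

// Map a flat index in the larger (new) structure to a flat index in the
// smaller (old) one by wrapping each trailing coordinate with modulo.
int mapIndex(int index, const std::vector<int>& newDims,
             const std::vector<int>& oldDims) {
    const std::vector<int> ns = strides(newDims);
    const std::vector<int> os = strides(oldDims);
    // Leading dimensions of the new structure that have no old counterpart.
    const std::size_t skip = newDims.size() - oldDims.size();
    int result = 0;
    for (std::size_t k = 0; k < newDims.size(); ++k) {
        const int coord = index / ns[k];  // coordinate along dimension k
        index %= ns[k];
        if (k >= skip) {
            const std::size_t j = k - skip;
            result += (coord % oldDims[j]) * os[j];
        }
    }
    return result;
}

// Build the element list of the enlarged structure.
std::vector<double> expand(const std::vector<double>& elems,
                           const std::vector<int>& oldDims,
                           const std::vector<int>& newDims) {
    int total = 1;
    for (std::size_t k = 0; k < newDims.size(); ++k) total *= newDims[k];
    std::vector<double> out(total);
    for (int i = 0; i < total; ++i)
        out[i] = elems[mapIndex(i, newDims, oldDims)];
    return out;
}
```

For example, expanding the elements {10, 20, 30} with dimensions {3} to dimensions {2, 3} gives {10, 20, 30, 10, 20, 30}: every original element is kept, and the row is repeated to fill the new leading dimension.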

MyDataType.cpp, The Run-time Support File

#include <vector>
#include <iostream>
#include "MyDataType.h"

using namespace std;

// Data stored in two parts: part one, dims, is a list
// of the dimensions of the structure. Part two, elements
// is a (virtually) one-dimensional storage of the elements
// of the structure. The actual structure can be reconstructed
// at any time from this information.
//
//***********************************************************
//Function: CMyDataType::CMyDataType
//Inputs: a STL vector containing the dimensions (integers)
//        a boolean telling whether or not to generate
//        reads to fill-in the elements with values.
//Outputs: none
//
//This function is the constructor for the data structure.
//It creates the structure, filling in the dimension list
//from the input list, and allocating enough memory for
//the element list. Note that the elements are initialized
//to zero by the STL vector constructor, if DoRead is false.
//In any case, elements.size() will return the product of
//the dimensions after this constructor.
CMyDataType::CMyDataType( vector<int> sizes, bool DoRead )
{
    int accum = 1;
    int index;
    dims = sizes;
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    elements.resize( accum );
    if ( DoRead )
        for ( index = 0; index < accum; ++index )
            cin >> elements[index];
}

//***********************************************************
//Function: CMyDataType::add
//Inputs: data2, another CMyDataType structure to
//        add to this one
//Outputs: This, data2
//
//This function adds the elements of data2 to this one,
//first having normalized both elements. NOTE that BOTH
//elements will (potentially) change as a result of this
//function.
//***********************************************************
void CMyDataType::add( CMyDataType &data2 )
{
    int index;
    int accum = 1;
    normalize( data2 );
    data2.normalize( *this );
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    for ( index = 0; index < accum; ++index )
        elements[index] += data2.elements[index];
}

//***********************************************************
//Function: CMyDataType::subtract
//Inputs: data2, another CMyDataType structure to
//        subtract from this one
//Outputs: This, data2
//
//This function subtracts the elements of data2 from this one,
//first having normalized both elements. NOTE that BOTH
//elements will (potentially) change as a result of this
//function.
//***********************************************************
void CMyDataType::subtract( CMyDataType &data2 )
{
    int index;
    int accum = 1;
    normalize( data2 );
    data2.normalize( *this );
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    for ( index = 0; index < accum; ++index )
        elements[index] -= data2.elements[index];
}

//***********************************************************
//Function: CMyDataType::multiply
//Inputs: data2, another CMyDataType structure to
//        multiply this one by
//Outputs: This, data2
//
//This function multiplies the elements of this one by data2,
//first having normalized both elements. NOTE that BOTH
//elements will (potentially) change as a result of this
//function.
//***********************************************************
void CMyDataType::multiply( CMyDataType &data2 )
{
    int index;
    int accum = 1;
    normalize( data2 );
    data2.normalize( *this );
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    for ( index = 0; index < accum; ++index )
        elements[index] *= data2.elements[index];
}

//***********************************************************
//Function: CMyDataType::divide
//Inputs: data2, another CMyDataType structure to
//        divide this one by
//Outputs: This, data2
//
//This function divides the elements of this one by data2,
//first having normalized both elements. NOTE that BOTH
//elements will (potentially) change as a result of this
//function.
//***********************************************************
void CMyDataType::divide( CMyDataType &data2 )
{
    int index;
    int accum = 1;
    normalize( data2 );
    data2.normalize( *this );
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    for ( index = 0; index < accum; ++index )
        elements[index] /= data2.elements[index];
}

//***********************************************************
//Function: CMyDataType::add
//Inputs: none
//Outputs: This
//
//This function breaks up this structure into multiple
//structures, dim[0] in quantity, each having the dimensions
//of this one, less the first dimension.
//It then adds the resulting structures, as if they were
//independent structures.
//***********************************************************
void CMyDataType::add( void )
{
    int index, index2;
    int accum = 1;
    vector<int> newdims;
    while (( dims.size() > 2 ) && ( dims[0] == 1 ))
        dims.erase( dims.begin() );
    if (( dims.size() == 1 ) && ( dims[0] == 1 ))
        return;
    for ( index = 1; index < dims.size(); ++index )
    {
        newdims.push_back( dims[index] );
        accum *= dims[index];
    }
    CMyDataType *new0 = new CMyDataType( newdims, false );
    CMyDataType *new1 = new CMyDataType( newdims, false );
    for ( index = accum - 1; index >= 0; --index )
    {
        new1->elements[index] = elements.back();
        elements.pop_back();
    }
    for ( index = dims[0] - 2; index >= 0; --index )
    {
        for ( index2 = accum - 1; index2 >= 0; --index2 )
        {
            new0->elements[index2] = elements.back();
            elements.pop_back();
        }
        new0->add( *new1 );
        *new1 = *new0;
    }
    *this = *new1;
}

//***********************************************************
//Function: CMyDataType::subtract
//Inputs: none
//Outputs: This
//
//This function breaks up this structure into multiple
//structures, dim[0] in quantity, each having the dimensions
//of this one, less the first dimension.
//It then subtracts the resulting structures, as if they
//were independent structures.
//***********************************************************
void CMyDataType::subtract( void )
{
    int index, index2;
    int accum = 1;
    vector<int> newdims;
    while (( dims.size() > 2 ) && ( dims[0] == 1 ))
        dims.erase( dims.begin() );
    if (( dims.size() == 1 ) && ( dims[0] == 1 ))
        return;
    for ( index = 1; index < dims.size(); ++index )
    {
        newdims.push_back( dims[index] );
        accum *= dims[index];
    }
    CMyDataType *new0 = new CMyDataType( newdims, false );
    CMyDataType *new1 = new CMyDataType( newdims, false );
    for ( index = accum - 1; index >= 0; --index )
    {
        new1->elements[index] = elements.back();
        elements.pop_back();
    }
    for ( index = dims[0] - 2; index >= 0; --index )
    {
        for ( index2 = accum - 1; index2 >= 0; --index2 )
        {
            new0->elements[index2] = elements.back();
            elements.pop_back();
        }
        new0->subtract( *new1 );
        *new1 = *new0;
    }
    *this = *new1;
}


//***********************************************************
//Function: CMyDataType::multiply
//Inputs: none
//Outputs: This
//
//This function breaks up this structure into multiple
//structures, dim[0] in quantity, each having the dimensions
//of this one, less the first dimension.
//It then multiplies the resulting structures, as if they
//were independent structures.
//***********************************************************
void CMyDataType::multiply( void )
{
    int index, index2;
    int accum = 1;
    vector<int> newdims;
    while (( dims.size() > 2 ) && ( dims[0] == 1 ))
        dims.erase( dims.begin() );
    if (( dims.size() == 1 ) && ( dims[0] == 1 ))
        return;
    for ( index = 1; index < dims.size(); ++index )
    {
        newdims.push_back( dims[index] );
        accum *= dims[index];
    }
    CMyDataType *new0 = new CMyDataType( newdims, false );
    CMyDataType *new1 = new CMyDataType( newdims, false );
    for ( index = accum - 1; index >= 0; --index )
    {
        new1->elements[index] = elements.back();
        elements.pop_back();
    }
    for ( index = dims[0] - 2; index >= 0; --index )
    {
        for ( index2 = accum - 1; index2 >= 0; --index2 )
        {
            new0->elements[index2] = elements.back();
            elements.pop_back();
        }
        new0->multiply( *new1 );
        *new1 = *new0;
    }
    *this = *new1;
}


/ / T^ t ^ ?[* ^i^ '[^ »i^ ^i^ 't^ "1^ *i» ?i^ 'i^ T^ ^i^ 'TT ^^ *Ti^ ^^ T^ ^^ ^^ T^ ^^ ^^ ' ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ ^^ '(^ i 1 'T" 'T' 'i^ "i^ ^K t* "^ "i* '.^ ^t* T* ^i^ ^i^ ^^ ^^

//Function: CMyDataType::divide //Inputs: none //Outputs: This / /

//This function breaks up this structure into multiple //structures, dim[0] in quanfity, each having the dimensions //of this one, less the first dimension. //It then divides the resulfing structures, as if they were //independent structures. / / 5(C ^JC ?J* 3K ?JC ?K ? K 5JC 3|C 3K ?JC 5|C 3 K f* ?|C ?TC 3 ( C 5 | C JJC 3 | S 5J^ 5 K 5lC ?K ?K ?K ?iC ?|C 5jC ?jC ?JS 3jC PK PfC JjC ?JC 5jC ^ s 5K 3JC >jC JJC ?j^ ?|s 3J^ ?Jw PJC ^|C 5Jw JJ^ JC ^ ^ ^j"^ JC ^ ^ ^^ * ^ ^^ ^ ^

void CMyDataType::divide( void )
{
    int index, index2;
    int accum = 1;
    vector<int> newdims;

    // drop leading dimensions of size 1
    while (( dims.size() > 2 ) && ( dims[0] == 1 ))
        dims.erase( dims.begin());
    if (( dims.size() == 1 ) && ( dims[0] == 1 ))
        return;
    // newdims is this structure's shape without the first dimension;
    // accum is the number of elements in one such slice
    for ( index = 1; index < dims.size(); ++index )
    {
        newdims.push_back( dims[index] );
        accum *= dims[index];
    }
    CMyDataType *new0 = new CMyDataType( newdims, false );
    CMyDataType *new1 = new CMyDataType( newdims, false );
    // new1 starts as the last slice
    for ( index = accum - 1; index >= 0; --index )
    {
        new1->elements[index] = elements.back();
        elements.pop_back();
    }
    // combine each earlier slice, back to front, with the running result
    for ( index = dims[0] - 2; index >= 0; --index )
    {
        for ( index2 = accum - 1; index2 >= 0; --index2 )
        {
            new0->elements[index2] = elements.back();
            elements.pop_back();
        }
        new0->divide( *new1 );
        *new1 = *new0;
    }
    *this = *new1;
}
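Because division is not associative, the order in which the slices are combined affects the result. The sketch below is an illustration under stated assumptions rather than the thesis implementation: each slice is reduced to a single `int`, the binary `divide` is assumed to compute "this divided by its argument", and both function names are hypothetical. Under those assumptions the listing groups from the right, since each earlier slice is divided by the result accumulated so far.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of the grouping used by the listing:
// slice[k] / accumulated, walking from the last slice to the first.
int foldRightDivide(const std::vector<int>& slices)
{
    assert(!slices.empty());
    int acc = slices.back();
    for (std::size_t k = slices.size() - 1; k-- > 0; ) {
        assert(acc != 0);  // integer division by zero is undefined
        acc = slices[k] / acc;
    }
    return acc;
}

// For comparison only: the conventional left-to-right grouping.
int foldLeftDivide(const std::vector<int>& slices)
{
    assert(!slices.empty());
    int acc = slices.front();
    for (std::size_t k = 1; k < slices.size(); ++k) {
        assert(slices[k] != 0);
        acc = acc / slices[k];
    }
    return acc;
}
```

With the slices 8, 4, and 2, the right grouping evaluates 8 / (4 / 2) = 4, while the left grouping evaluates (8 / 4) / 2 = 1.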

//***********************************************************
//Function: CMyDataType::print
//Inputs: none
//Outputs: none
//
//This function prints this structure
//***********************************************************

void CMyDataType::print( void )
{
    int index;
    int accum = 1;

    // the element count is the product of all dimensions
    for ( index = 0; index < dims.size(); ++index )
        accum *= dims[index];
    cout << "Result:" << endl;
    for ( index = 0; index < accum; ++index )
        cout << elements[index] << endl;
}

//***********************************************************
//Function: CMyDataType::normalize
//Inputs: data2
//Outputs: this
//
//This function normalizes this structure with respect to
//data2. It does this by creating a new structure consisting
//of the maximum dimensions of either this or data2, both in
//the number of dimensions and the size of each dimension.
//It then uses the MapDimension function to determine which
//of the current elements to use to fill in the new
//structure. Note that ALL of the current elements will be
//replicated in the new structure. Some, however, may be
//repeated, if the new structure is larger than this one.

void CMyDataType::normalize( const CMyDataType &data2 )
{
    CMyDataType *data3;
    vector<int> newsizelist;
    int newsize, size1, size2, index, remainder;
    vector<int>::const_iterator iter1, iter2;

    size1 = dims.size();
    size2 = data2.dims.size();
    iter1 = dims.begin();
    iter2 = data2.dims.begin();
    // the new structure has a number of dimensions
    // which is equal to the maximum number found
    // between this and data2
    newsize = max( size1, size2 );
    newsizelist.reserve( newsize );
    // now, copy the excess dimensions from the larger
    // structure to the new one
    if ( size1 > size2 )
        for ( index = 0; index < size1 - size2; ++index )
            newsizelist.push_back( *iter1++ );
    else
        for ( index = 0; index < size2 - size1; ++index )
            newsizelist.push_back( *iter2++ );
    // now, copy the maximum of each of the remaining
    // dimensions to the new structure.
    remainder = min( size1, size2 );
    for ( index = 0; index < remainder; ++index )
    {
        newsizelist.push_back( max( *iter1, *iter2 ));
        ++iter1;
        ++iter2;
    }
    // now, we have the correct size for the new structure,
    // create it
    data3 = new CMyDataType( newsizelist, false );
    newsize = 1;
    // now, calculate how many elements (total) are in this
    // new structure
    for ( index = 0; index < newsizelist.size(); ++index )
        newsize *= newsizelist[index];
    // and copy the appropriate elements from the old one
    // to the new one
    for ( index = 0; index < newsize; ++index )
        data3->elements[index] = elements[MapDimension( index, *data3 )];
    // now, make the new structure this one
    *this = *data3;
}
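The way normalize chooses the shape of the new structure can be shown in isolation. This is a minimal sketch, assuming only what the listing shows: `normalizedDims` is a hypothetical free function that takes the two dimension lists directly instead of two `CMyDataType` objects, and it does not copy any elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the thesis listing): returns the dimension
// list that the normalization step would build for two structures.
std::vector<int> normalizedDims(const std::vector<int>& a,
                                const std::vector<int>& b)
{
    const std::vector<int>& longer  = (a.size() >= b.size()) ? a : b;
    const std::vector<int>& shorter = (a.size() >= b.size()) ? b : a;
    const std::size_t excess = longer.size() - shorter.size();

    std::vector<int> result;
    result.reserve(longer.size());
    // Leading dimensions that only the higher-rank structure has.
    for (std::size_t i = 0; i < excess; ++i)
        result.push_back(longer[i]);
    // Remaining dimensions: the larger extent of the two operands.
    for (std::size_t i = 0; i < shorter.size(); ++i)
        result.push_back(std::max(longer[excess + i], shorter[i]));
    return result;
}
```

For example, dims [3] against [2, 3] yields [2, 3], and dims [2, 1] against [1, 4] yields [2, 4].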

//***********************************************************
//Function: CMyDataType::MapDimension
//Inputs: index, newone
//Outputs: integer
//
//This function takes the index relative to the new
//structure, and finds the dimensions. It then calculates
//the appropriate dimension for the old structure. Then
//it calculates the index of that element in the old
//structure. It then returns that number.
//***********************************************************

int CMyDataType::MapDimension( int index, const CMyDataType newone )
{
    int counter, accum, position;
    vector<int> newdimension, oldoffset, newoffset;
    vector<int>::const_iterator iter1, iter2;

    accum = 1;
    position = 0;
    // calculate the offsets for each dimension of the new structure
    for ( counter = newone.dims.size() - 1; counter >= 0; --counter )
    {
        newoffset.insert( newoffset.begin(), accum );
        accum *= newone.dims[counter];
    }
    // find each dimension in the new structure, so that
    // we know the "real" position of index in the new
    // structure
    for ( counter = 0; counter < newoffset.size(); ++counter )
    {
        newdimension.push_back( index / newoffset[position] );
        index %= newoffset[position];
        ++position;
    }
    // take those dimensions and convert them, by modding, to the
    // dimensions in this structure
    for ( counter = 0; counter < newdimension.size(); ++counter )
        newdimension[counter] %= dims[counter % dims.size()];
    accum = 1;
    // calculate the offsets for each dimension of this structure
    for ( counter = dims.size() - 1; counter >= 0; --counter )
    {
        oldoffset.insert( oldoffset.begin(), accum );
        accum *= dims[counter];
    }
    // now, find the index into this structure for the
    // modified dimensions
    accum = 0;
    iter1 = newdimension.end();
    iter2 = oldoffset.end();
    for ( counter = 0; counter < dims.size(); ++counter )
    {
        --iter1;
        --iter2;
        accum += *iter1 * *iter2;
    }
    return accum;
}
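The index arithmetic above can be condensed into a single pass over the dimensions. The following sketch is an illustration, not the thesis code: `mapIndex` is a hypothetical free function, storage is assumed to be row-major, and the old dimensions are assumed to line up with the trailing dimensions of the new structure. The listing instead wraps each coordinate with `dims[counter % dims.size()]`; the two approaches agree when both structures have the same number of dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function: maps a flat row-major index in a structure
// shaped `newDims` to the flat index of the source element in a structure
// shaped `oldDims`. Assumes oldDims.size() <= newDims.size() and that the
// old dimensions align with the trailing new dimensions.
int mapIndex(int index, const std::vector<int>& newDims,
             const std::vector<int>& oldDims)
{
    assert(!oldDims.empty() && oldDims.size() <= newDims.size());
    const std::size_t shift = newDims.size() - oldDims.size();

    int result = 0;
    int oldStride = 1;
    // Peel coordinates off the flat index, last dimension first.
    for (std::size_t d = newDims.size(); d-- > 0; ) {
        const int coord = index % newDims[d];
        index /= newDims[d];
        if (d >= shift) {
            // Wrap the coordinate into the old extent and accumulate
            // its contribution using the old structure's stride.
            const int oldExtent = oldDims[d - shift];
            result += (coord % oldExtent) * oldStride;
            oldStride *= oldExtent;
        }
    }
    return result;
}
```

For a 1 x 3 structure expanded to 2 x 3, new index 4 has coordinates (1, 1); wrapping gives (0, 1), so the source element is at old index 1.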

//***********************************************************
//Function: CMyDataType::operator =
//Inputs: this, data2
//Outputs: none
//
//This function copies the elements from data2 into this
//structure, making them equal.

CMyDataType &CMyDataType::operator =( CMyDataType &data2 )
{
    dims = data2.dims;
    elements = data2.elements;
    return *this;
}


PERMISSION TO COPY

In presenting this thesis in partial fulfillment of the requirements for a master's

degree at Texas Tech University or Texas Tech University Health Sciences Center, I

agree that the Library and my major department shall make it freely available for

research purposes. Permission to copy this thesis for scholarly purposes may be

granted by the Director of the Library or my major professor. It is understood that

any copying or publication of this thesis for financial gain shall not be allowed

without my further written permission and that any user may be liable for copyright

infringement.

Agree (Permission is granted.)

Student Signature Date

Disagree (Permission is not granted.)

Student Signature Date